Introduction to Docker: Part 3
Adding the secrets to make your Dockerized conversational app run
Secret values play an important role in conversational AI applications, facilitating connections between services, protecting sensitive information, and keeping an application secure. SSH keys, API keys, authentication tokens, passwords, and other sensitive values all play a part in making an application work as intended. When sharing an application, you need to make sure its secrets are kept safe. Docker provides a variety of ways to work with your application’s secrets.
Updating the App with Secrets
In the previous articles (Part 1, Part 2), we’ve worked with a simple FastAPI application. This article assumes you have set up this application with the associated Dockerfile.
Our API has one endpoint that returns a simple “hello” message. However, the API could benefit from connecting to external services for more complex functionality. For example, it might call an LLM provider to power chats. Adding this connection, though, means managing a secret value: LLM providers require that a service pass along a valid key to use their models.
Let’s add a new endpoint to the API. It will accept a user query, send it to OpenAI, and return the result. In this simple app, we make the request with the generic Python requests package rather than a provider-specific package; you can substitute your preferred LLM provider or a general service like OpenRouter. Add the requests package to requirements.txt and make sure it’s installed.
fastapi[standard]
python-dotenv
requests
The API key for the LLM provider should not live in your application code. If you shared that code publicly, anyone could use your key. Instead, create an .env file, load it with python-dotenv, and read the values with “os.environ.get()”. Below shows the updated API code.
import os

import requests
from dotenv import load_dotenv
from fastapi import FastAPI
from pydantic import BaseModel

load_dotenv()
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")

app = FastAPI()


class QueryRequest(BaseModel):
    user_query: str


@app.get("/")
async def root():
    return {"message": "Hello World!"}


@app.post("/chat")
async def chat(request: QueryRequest):
    openai_endpoint = "https://api.openai.com/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {OPENAI_API_KEY}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": "gpt-4o-mini",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": request.user_query},
        ],
        "max_tokens": 100,
    }
    response = requests.post(openai_endpoint, headers=headers, json=payload).json()
    completion_text = response["choices"][0]["message"]["content"]
    return {"response": completion_text}
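For reference, the .env file at the project root might contain a single line like the following. The value is a placeholder, not a real key; note that it is written with surrounding quotation marks, a detail that will matter later.

```
OPENAI_API_KEY="sk-placeholder-not-a-real-key"
```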
Just as LLM providers restrict access only to those who provide a valid key, you need to make sure that those who access your API are authorized to do so. You can limit access to your own API by requiring an API token. For this API, we’ll restrict the /chat endpoint with a token requirement and leave the other endpoint open. Add the secret “MY_API_TOKEN” to your .env file with a value of your choice, such as “password123”.
import os

import requests
from dotenv import load_dotenv
from fastapi import FastAPI, Depends, HTTPException
from fastapi.security import OAuth2PasswordBearer
from pydantic import BaseModel

load_dotenv()
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")

app = FastAPI()
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token")


class QueryRequest(BaseModel):
    user_query: str


@app.get("/")
async def root():
    return {"message": "Hello World!"}


@app.post("/chat")
async def chat(request: QueryRequest, token: str = Depends(oauth2_scheme)):
    VALID_TOKEN = os.environ.get("MY_API_TOKEN")
    if token != VALID_TOKEN:
        raise HTTPException(status_code=401)
    openai_endpoint = "https://api.openai.com/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {OPENAI_API_KEY}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": "gpt-4o-mini",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": request.user_query},
        ],
        "max_tokens": 100,
    }
    response = requests.post(openai_endpoint, headers=headers, json=payload).json()
    completion_text = response["choices"][0]["message"]["content"]
    return {"response": completion_text}
With that, the application has two secrets that each serve an important purpose in the app. One connects your application to another service; the other controls access to your service. Let’s test that the app works.
Testing the Dockerized Application
Use Docker to build an image with the changes. Then, run the application in a Docker container.
>>> docker build -t simple_fastapi_app:latest .
>>> docker run -p 8000:8000 simple_fastapi_app:latest
Send a request to the API. For this test, leave off the headers so we can verify that the request fails when MY_API_TOKEN isn’t provided.
>>> resp = requests.post(url='http://localhost:8000/chat', json={'user_query':'hi'})
>>> resp.content
b'{"detail":"Unauthorized"}'
Now add the headers to make sure that it recognizes MY_API_TOKEN.
>>> MY_API_TOKEN = "your-api-token"
>>> headers = {"Authorization": f"Bearer {MY_API_TOKEN}", "Content-Type": "application/json"}
>>> resp = requests.post(url='http://localhost:8000/chat', json={'user_query':'hi'}, headers=headers)
>>> resp.content
b'{"response":"Hello! How can I assist you today?"}'
The application still works. It fails when it does not receive the MY_API_TOKEN, and it calls the LLM when it does.
However, there is a problem with the way we Dockerized this application. The Dockerfile tells Docker to copy every file from the project root into the image’s /app directory, and that includes the .env file.
COPY . /app
Let’s check inside the Docker container. If you print the copied .env file, you’ll see your secrets, including OPENAI_API_KEY and MY_API_TOKEN.
>>> docker exec -it {container-id} cat .env
OPENAI_API_KEY="djikhlasdgjklasdgjklagsdjkl" # Not real
MY_API_TOKEN="password123"
Because the Docker image contains the secrets, anyone you shared it with would have access to the secrets, too. This could be a security issue if you pushed it up to a public place like DockerHub.
Protecting Your .env File
When writing a Dockerfile, one good practice is to add only what you need. Instead of copying everything with “COPY .”, it’s better to copy just the files necessary for deployment. For our simple API, that’s only requirements.txt and app.py.
FROM python:3.11-alpine
WORKDIR /app
COPY requirements.txt /app/requirements.txt
RUN pip install -r requirements.txt
# Updated line: copy only app.py
COPY app.py /app
EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
That will take care of the .env file. It will not be copied over. To confirm that, rebuild your Docker image from scratch.
>>> docker build -t simple_fastapi_app:latest . --no-cache
>>> docker run -p 8000:8000 simple_fastapi_app:latest
>>> docker exec -it {container-id} cat .env
cat: can't open '.env': No such file or directory
This solution works for our .env file. However, as your API grows in complexity, there may be many files or even subdirectories that you’d like to exclude from the Docker image. For example, you may want to exclude documentation, experiments, or even other kinds of secrets. It would be tedious and brittle to write COPY statements for each thing you want to include.
To easily ignore these files, you can make a .dockerignore file. Each line of the .dockerignore file is a different pattern that should be excluded from the build process.
.env
env
With this .dockerignore in place, Docker would not copy the .env file even if we hadn’t adjusted “COPY .”. As it stands, we’ve secured the .env file in multiple ways:
We copied in only app.py by specifying that file directly.
We told Docker to ignore the .env file.
We listed other forms of “env” so the file stays excluded even if someone renames it in their local repository.
Following these strategies not only makes the application more secure but also keeps the application smaller. Smaller applications build faster, upload faster, and take up less space - all of which would ultimately help with deployment.
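As the project grows, the .dockerignore can cover more than secrets. A fuller version might look like the sketch below; the docs/ and experiments/ directories are hypothetical examples, not files from our project.

```
# Secrets, in any common spelling or location
.env
*.env
env

# Development artifacts that don't belong in the image
__pycache__/
*.pyc
.git/
docs/
experiments/
```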
Injecting Secrets into Docker
You often don’t need secret values to build a Docker image. No one makes requests to an image; the API isn’t running, so it won’t try to connect to the LLM provider. The image just contains what the application needs to run. When the application is ready to run, via a container, you can inject the secrets into that container.
Injecting secrets has some benefits. Your secrets stay safe while you remain free to share your application. Injecting secrets instead of hard-coding them also gives you the flexibility to change settings between running instances: you can configure your application to run differently for testing or other unique situations. For example, you may have staging and production instances that require different settings to use different versions of resources. Or, in the case of this API, you may want to use different keys for your LLM calls.
Send an .env file
One easy way to inject secrets is to tell the container to use an .env file. The following tells Docker to use the .env in the folder where we run the command.
>>> docker run --env-file .env -p 8000:8000 simple_fastapi_app
This command puts the variables in your Docker container’s environment. It doesn’t copy the .env file into the project directory in the image.
>>> docker exec -it {container-id} cat .env
cat: can't open '.env': No such file or directory
>>> docker exec -it {container-id} env
<your env variables>
OPENAI_API_KEY="djikhlasdgjklasdgjklagsdjkl" # Not real
MY_API_TOKEN="password123"
If you make a request to the Docker container now, the application will still fail with an authentication error. That’s odd, because we just saw that the environment has the right variables. (You can also verify that the application still works locally; we didn’t change any code.) Do you notice anything unusual about the environment variables in the Docker container?
>>> docker exec -it {container-id} env
OPENAI_API_KEY="djikhlasdgjklasdgjklagsdjkl" # Not real
MY_API_TOKEN="password123"
The token has been copied to the Docker container’s environment exactly as it is written, including the quotation marks. The API is rightfully saying that password123 does not match “password123”.
Your local system probably stripped those quotation marks when loading the variables. Different systems handle secret files and values in different ways, so it can be useful to leave off quotation marks where they are unnecessary. Let’s update the .env file.
OPENAI_API_KEY=djikhlasdgjklasdgjklagsdjkl # Not real
API_TOKEN=password123
You don’t need to rebuild the Docker image. Just rerun the run command with the updated .env file. Now things should work.
>>> docker run --env-file .env -p 8000:8000 simple_fastapi_app
Unfortunately, formatting issues like this come up whenever values move across platforms, especially with complex values. Cloud platforms often use a credentials file that holds a nested JSON object, and different systems (a Git repo, your local IDE, Docker, etc.) may represent that value in different ways, which can complicate the standardized build process you laid out in the Dockerfile. You may even need other techniques to get a value, such as a whole file, into the system. Although Dockerfiles are supposed to behave consistently across systems, it’s worth experimenting with how each platform handles variables before settling on a method for passing them in.
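To see why the quotes mattered, here is a small Python sketch contrasting the two behaviors. parse_like_dotenv mimics python-dotenv, which strips a matching pair of surrounding quotes; parse_like_docker mimics docker run --env-file, which passes the raw text through unchanged. These are simplified illustrations I wrote for this article, not the real parsers.

```python
def parse_like_docker(line):
    """Split KEY=VALUE, keeping the value text as-is (quotes included)."""
    key, _, value = line.partition("=")
    return key.strip(), value.strip()


def parse_like_dotenv(line):
    """Split KEY=VALUE and strip a matching pair of surrounding quotes."""
    key, value = parse_like_docker(line)
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
        value = value[1:-1]  # drop the surrounding quotes
    return key, value


line = 'MY_API_TOKEN="password123"'

# Loaded locally via python-dotenv: quotes are stripped.
print(parse_like_dotenv(line))   # ('MY_API_TOKEN', 'password123')

# Injected via --env-file: the quotes become part of the value.
print(parse_like_docker(line))   # ('MY_API_TOKEN', '"password123"')
```

This is why the app worked locally but rejected the very same token inside the container.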
Set individual .env variables
Another way to get secrets into your Dockerized application is to specify the variables individually. To do so, pass the “-e” flag with each variable name and value. Note that you still do not need to rebuild the Docker image.
docker run -e OPENAI_API_KEY=your-openai-api-key -e MY_API_TOKEN=your-api-token -p 8000:8000 simple_fastapi_app
This can become cumbersome when you have many environment variables, because you need to prefix each with “-e” and write them all out. It’s also easy to make mistakes as the list of variables grows.
Use build arguments
The previous methods supplied variables needed at runtime, but you may also need a secret value to successfully build your application. Perhaps you need a token to download some resource, or perhaps the app needs a placeholder value to get through its build. In this case, you can pass “--build-arg” flags to the docker build command, instead of the run command.
docker build --build-arg OPENAI_API_KEY=$OPENAI_API_KEY \
--build-arg OTHER_ENV_VAR=$OTHER_ENV_VAR \
-t my-fastapi-app .
To reference these “--build-arg” values in the Dockerfile, use ARG followed by the variable name. Build arguments are available during the build process, i.e., while Docker runs through your Dockerfile steps. However, the resulting image will not retain those values at runtime (unlike “-e” variables).
ARG OPENAI_API_KEY
To persist the values as ENV variables, you can assign the ARG value to ENV. For values that truly need to be kept secret, you may not want to do this or you may want to pass in dummy values.
ARG OPENAI_API_KEY
ENV OPENAI_API_KEY=$OPENAI_API_KEY
Note that ARG does not automatically pick up values from an .env file or from your local shell environment. Once you add ARG to the Dockerfile, you must pass the value in the build command with “--build-arg”; otherwise, the argument will be empty during the build.
Unlike the previous two methods, using build arguments requires rebuilding the image whenever the Dockerfile changes. This method also has the disadvantage of cluttering both the Dockerfile and the build command.
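Putting these pieces together, a Dockerfile that accepts a build argument and persists it as an environment variable might look like the sketch below. For values that truly need to stay secret, consider omitting the ENV line or passing a dummy value instead.

```
FROM python:3.11-alpine
WORKDIR /app

# Build-time argument; must be supplied with --build-arg
ARG OPENAI_API_KEY
# Persist the build argument into the runtime environment
ENV OPENAI_API_KEY=$OPENAI_API_KEY

COPY requirements.txt /app/requirements.txt
RUN pip install -r requirements.txt
COPY app.py /app

EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```

Built with docker build --build-arg OPENAI_API_KEY=$OPENAI_API_KEY -t simple_fastapi_app ., the resulting containers would have OPENAI_API_KEY set at runtime without you passing it to docker run.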
Conclusion
Docker provides many ways of handling secrets that allow you to flexibly choose how you inject them into your application. Sometimes, you may want to directly enter secret values into the Dockerfile. For example, you may want to create a default user/password using dummy info like “test_user” / “password”. A user could adjust these to something more secure, or they could just use them to get the application running quickly (for testing). It’s worth considering how best to apply the techniques to achieve your particular goals. The techniques we covered so far are basic ones, but other features of Docker provide even more flexibility.
Managing secrets is an important part of working with Docker. The tools Docker provides can help you devise a strategy suited to your needs and goals. When working with team members, you may also want a central place to store these values so that everyone has access to the most up-to-date secrets. Credential management services, like Bitwarden, Infisical, and Doppler, can store them for your team. With the application built, Dockerized, and secured, you’re ready to share your application!