Cloning private dependencies in Docker and Go
One topic that comes up repeatedly on Stack Overflow and other online forums is how to go get private dependencies. Specifically, if I have a private Git repository on GitHub or Bitbucket, how do I bring that code down locally with the go get tool so that automated builds can produce a clean, consistent build without interaction from a user? This problem is largely solved for public GitHub dependencies but continues to be a challenge for private ones. If you're only cloning publicly available dependencies, you probably don't need this post. It is specifically about resolving private dependencies that require some form of authentication to access.
Docker
Docker provides a great environment for building software by isolating it from all outside dependencies. It definitely helps reduce the "It Works on My Machine!" phenomenon. This level of isolation brings its own set of challenges as well. How do you clone private dependencies with Git and Go when inside a Docker container?
There are at least five ways to create Docker images from Go sources that contain private dependencies. Each has benefits and drawbacks:
- Compile the Go binary on the host and copy the resulting artifact into a container.
- Resolve the dependencies on the host, copy the source into a container and then compile it. (2 options)
- Copy an SSH key (id_rsa) into the container and compile the binary in a container.
- Copy .git-credentials into the container and compile the binary in a container.
- Copy .netrc into the container and compile the binary in a container.
Option 1: Compiling on the host
This is probably one of the simplest methods. Because of Go's cross-compilation capabilities, it's often easiest to produce an artifact by running:
GOOS=linux GOARCH=amd64 CGO_ENABLED=0 go build -o app
From there, a full Dockerfile can often be as simple as:
FROM scratch
COPY app /
ENTRYPOINT ["/app"]
Granted, with scratch you'll typically want to ensure you have SSL root certificates installed, among other things.
The downside to this is juggling all of the build flags. For example, if you're developing on the host, you'll run go build by itself. But when compiling for a container, you'll typically want to specify GOOS=linux GOARCH=amd64 CGO_ENABLED=0, etc. As long as you embed that logic into a Makefile or similar tool it shouldn't be too complicated.
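As a minimal sketch of keeping those flags in one place, a small shell wrapper could select them by target. The function names go_build_flags and build and the target names host and container are hypothetical, chosen only for illustration; a Makefile would work just as well.

```shell
#!/bin/sh
# Hypothetical build wrapper: keeps the host and container flag sets in one place.

# go_build_flags prints the environment assignments for the given target.
go_build_flags() {
  case "$1" in
    container) echo "GOOS=linux GOARCH=amd64 CGO_ENABLED=0" ;;
    host)      echo "" ;;
    *)         echo "unknown target: $1" >&2; return 1 ;;
  esac
}

# build compiles the package in the current directory for the given target.
build() {
  flags=$(go_build_flags "${1:-host}") || return 1
  # $flags is intentionally unquoted so each assignment becomes its own argument to env.
  env $flags go build -o app
}
```

After sourcing the script, build container would run the cross-compiling command shown earlier, while build host would compile with the workstation's defaults.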
One other caveat is that Go still prefers to download over HTTPS. If you're running ssh-agent, you'll want to force Git (and therefore Go) to use SSH when cloning dependencies, and you'll want to be sure your ~/.ssh/known_hosts file contains the host keys of your source control providers; otherwise go get will fail:
$ git config --global url."git@github.com:".insteadOf https://github.com/
$ git config --global url."git@bitbucket.org:".insteadOf https://bitbucket.org/
$ ssh-keyscan github.com >> ~/.ssh/known_hosts
$ ssh-keyscan bitbucket.org >> ~/.ssh/known_hosts
From there, go get github.com/joliver/sample (change the name to your private dependency) will work.
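To see what the insteadOf rule actually stores without touching your real Git configuration, the sketch below writes the same setting to a throwaway config file and reads it back. The temporary file is an assumption made for the demonstration; on a real workstation you would inspect the global configuration instead, for example with git config --global --get-regexp '^url\.'.

```shell
#!/bin/sh
# Use a scratch config file so the real ~/.gitconfig is left untouched.
cfg=$(mktemp)

# Same rule as above, written to the scratch file instead of the global config.
git config --file "$cfg" url."git@github.com:".insteadOf https://github.com/

# Read the rule back: the value is the HTTPS prefix that Git rewrites to SSH.
rewritten=$(git config --file "$cfg" --get url."git@github.com:".insteadOf)
echo "$rewritten"   # prints https://github.com/

rm -f "$cfg"
```

On Go 1.13 and later, which are newer than the versions discussed in this post, private module paths usually also need to be listed in the GOPRIVATE environment variable so that the go command fetches them directly rather than through the public module proxy and checksum database.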
Option 2a: Resolve dependencies on the host and copy them into a container
UPDATE: This is a new option I discovered after first publishing this post. I had a situation where I needed a Go binary that interacted with C-based dependencies.
The new go mod dependency resolution mechanism is great...except it downloads all dependencies into a shared, system-level directory at $GOPATH/pkg/mod (unlike Node and the default node_modules directory in the local workspace). While having a central directory is great for option #1 above, it adds some complications for this option because we typically only want to copy our immediate dependency source files into the container rather than the entire $GOPATH/pkg/mod directory, which contains all versions of all dependencies for every project on the host machine. Fortunately, we can fake the GOPATH during compilation to resolve dependencies just for the immediate compilation being performed.
Here's the associated Dockerfile:
FROM golang:1.12 as builder
WORKDIR /builder
ADD . /builder
# The go command rejects relative GOPATH entries, so use the absolute path of the copied directory.
RUN GOPATH=/builder/.dependencies CGO_ENABLED=0 go build -o app
FROM scratch
COPY --from=builder /builder/app /app
ENTRYPOINT ["/app"]
To build the image, run:
GOPATH="$(pwd)/.dependencies" go mod download # resolve dependencies on the host
docker build . -t my-image
Aside: one reason to use a "dot" directory (e.g. .dependencies) is that the Go build chain ignores such directories when running commands such as go test ./..., etc.
By resolving dependencies on the host and then compiling in the container, we actually get the best of both worlds. We avoid all the complexity of trying to authenticate within a Docker container while at the same time we get clean, isolated, and fully repeatable builds without environmental factors influencing the build.
Option 2b: Resolve dependencies on the host and copy them into a container (again)
UPDATE: This is yet another option I discovered since publishing this post. Like option 2a above, this allows you to get the dependencies on the host where you have some kind of security context such as SSH keys or other authentication information available. But unlike the option above, the Go tooling can help simplify things.
The trick is to use Go's built-in vendoring capabilities without checking in the vendor/ folder. Here are the associated commands you'd run on the host:
go mod vendor
docker build . -t my-image
Here's the corresponding Dockerfile:
FROM golang:1.12 as builder
WORKDIR /builder
ADD . /builder
RUN CGO_ENABLED=0 go build -mod=vendor -o app
FROM scratch
COPY --from=builder /builder/app /app
ENTRYPOINT ["/app"]
We can prevent committing the entire vendor/ tree by adding it as an entry in the .gitignore file.
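As a quick sketch of that ignore rule, the snippet below creates a throwaway repository, adds vendor/ to .gitignore, and asks Git whether a vendored file would be ignored. The repository location and the path vendor/example.com/dep/dep.go are made up for illustration.

```shell
#!/bin/sh
# Throwaway repository so the check does not touch a real project.
repo=$(mktemp -d)
git init -q "$repo"

# Ignore the vendor/ tree that "go mod vendor" generates.
echo 'vendor/' >> "$repo/.gitignore"

# Simulate a vendored file (hypothetical path).
mkdir -p "$repo/vendor/example.com/dep"
touch "$repo/vendor/example.com/dep/dep.go"

# git check-ignore exits 0 when the path matches an ignore rule.
if git -C "$repo" check-ignore -q vendor/example.com/dep/dep.go; then
  vendor_ignored=yes
else
  vendor_ignored=no
fi
echo "vendor ignored: $vendor_ignored"

rm -rf "$repo"
```

In a real project only the echo line is needed; the rest is there to confirm the rule behaves as expected.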
Until the concept of vendoring is no longer a thing in Go, this seems like the best option to get private dependencies into the container because it has the full support of the Go build toolchain and doesn't require any special magic to make it work. It simply uses standard Go command-line flags. Additionally, when running commands on the host such as go build or go test, the vendor/ directory is ignored because of the go.mod and go.sum files (this holds for the Go 1.12 toolchain used here; Go 1.14 and later use vendor/ automatically when go.mod declares go 1.14 or higher). This means your host-based developer workflows are blissfully unaffected by the existence of a vendor/ directory.
Option 3: Compiling in Docker: SSH authentication
For this to work, you need to get the environment set up to compile in the container AND have an SSH key in the container. The SSH key cannot have a passphrase. There are several moving parts:
The Dockerfile:
FROM golang:latest as builder
# ARG security: https://bit.ly/2oY3pCn
ARG SSH_PRIVATE_KEY
WORKDIR /builder/
ADD . /builder/
RUN mkdir -p ~/.ssh && umask 0077 && echo "${SSH_PRIVATE_KEY}" > ~/.ssh/id_rsa \
&& git config --global url."git@bitbucket.org:".insteadOf https://bitbucket.org/ \
&& git config --global url."git@github.com:".insteadOf https://github.com/ \
&& ssh-keyscan bitbucket.org >> ~/.ssh/known_hosts \
&& ssh-keyscan github.com >> ~/.ssh/known_hosts
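# Build flags must come before the package argument; CGO_ENABLED=0 keeps the binary static so it runs on scratch.
RUN CGO_ENABLED=0 go build -o app .
FROM scratch
COPY --from=builder /builder/app /app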
ENTRYPOINT ["/app"]
Running from the command line on the host:
$ export SSH_PRIVATE_KEY="$(cat ~/.ssh/id_rsa)"
$ docker build . --build-arg SSH_PRIVATE_KEY -t my-image
Notice in the Dockerfile above we need to perform the following steps:
- Copy our SSH key to ~/.ssh/id_rsa and configure it with correct permissions (0600).
- Configure Git to use SSH instead of HTTPS to clone dependencies.
- Ensure we have the SSH host keys for our source control providers.
- Compile the Go binary.
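Because the key handed to the build must not have a passphrase, it can help to check that on the host before building. The sketch below shows one way to do so with ssh-keygen. The helper name key_has_no_passphrase and the throwaway keys are assumptions for the demonstration; in practice you would point the helper at ~/.ssh/id_rsa.

```shell
#!/bin/sh
# key_has_no_passphrase succeeds when the private key at $1 loads with an empty passphrase.
key_has_no_passphrase() {
  # -y derives the public key from the private key; -P "" supplies an empty passphrase
  # so ssh-keygen fails instead of prompting when the key is protected.
  ssh-keygen -y -P "" -f "$1" >/dev/null 2>&1
}

# Demonstrate on throwaway keys rather than a real one.
tmp=$(mktemp -d)
ssh-keygen -q -t ed25519 -N "" -f "$tmp/plain_key"
ssh-keygen -q -t ed25519 -N "secret" -f "$tmp/locked_key"

if key_has_no_passphrase "$tmp/plain_key"; then plain=ok; else plain=needs-passphrase; fi
if key_has_no_passphrase "$tmp/locked_key"; then locked=ok; else locked=needs-passphrase; fi
echo "plain_key: $plain, locked_key: $locked"

rm -rf "$tmp"
```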
You may also notice that there are actually two sections in the Dockerfile. The first (builder) is used solely to compile and assemble the Go binary, while the second is very lightweight and is designed for production runtime. Splitting the Dockerfile into these two phases allows us to separate the compilation needs from the production runtime needs.
Optionally, you can also have a docker-compose.yml file:
version: '3.7'
services:
  app:
    build:
      context: .
      args:
        - SSH_PRIVATE_KEY
and from the host run:
$ export SSH_PRIVATE_KEY="$(cat ~/.ssh/id_rsa)"
$ docker-compose up
Option 4: Compiling in Docker: ~/.git-credentials authentication
Much like the SSH solution above, this solution involves copying a file into a container through a build argument. For this particular file, we will copy a .git-credentials file located somewhere on your host machine. The Git credentials file looks something like this:
https://joliver:my-random-github-token-here@github.com
https://joliver:my-random-bitbucket-token-here@bitbucket.org
Tokens can be generated for:
- Github
- Bitbucket (https://bitbucket.org/account/settings/app-passwords/new)
Once you have the .git-credentials file created and saved on your host machine, we're ready to copy it in.
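As a minimal sketch, the file could be generated from tokens held in shell variables. The variable names GIT_USER, GITHUB_TOKEN and BITBUCKET_TOKEN are hypothetical, the output goes to a scratch directory rather than your real home directory, and the tokens are assumed to contain no characters that would need URL encoding.

```shell
#!/bin/sh
# Hypothetical inputs; in practice these would come from a secret store or CI settings.
GIT_USER="joliver"
GITHUB_TOKEN="my-random-github-token-here"
BITBUCKET_TOKEN="my-random-bitbucket-token-here"

# Write to a scratch directory here; the real file usually lives at ~/.git-credentials.
out_dir=$(mktemp -d)
cred_file="$out_dir/.git-credentials"

# umask 077 inside the subshell makes the new file readable and writable by its owner only.
(
  umask 077
  printf 'https://%s:%s@github.com\n'    "$GIT_USER" "$GITHUB_TOKEN"     > "$cred_file"
  printf 'https://%s:%s@bitbucket.org\n' "$GIT_USER" "$BITBUCKET_TOKEN" >> "$cred_file"
)

cat "$cred_file"
```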
Here's the Dockerfile:
FROM golang:latest as builder
# ARG security: https://bit.ly/2oY3pCn
ARG DOCKER_GIT_CREDENTIALS
WORKDIR /builder/
ADD . /builder/
RUN git config --global credential.helper store && echo "${DOCKER_GIT_CREDENTIALS}" > ~/.git-credentials
RUN CGO_ENABLED=0 go build -o app .
FROM scratch
COPY --from=builder /builder/app .
ENTRYPOINT ["/app"]
Running from the command line on the host:
$ export DOCKER_GIT_CREDENTIALS="$(cat ~/.git-credentials)"
$ docker build . --build-arg DOCKER_GIT_CREDENTIALS -t my-image
Within the Dockerfile above there are several steps:
- Set up the Git configuration to use the store credential helper.
- Create the .git-credentials file in the proper location.
- Build the app.
Again, just like the SSH-based solution above, we split the Dockerfile into two parts: build/compile and runtime.
We can also have a docker-compose.yml as follows:
version: '3.7'
services:
  app:
    build:
      context: .
      args:
        - DOCKER_GIT_CREDENTIALS
and from the host run:
$ export DOCKER_GIT_CREDENTIALS="$(cat ~/.git-credentials)"
$ docker-compose up
Option 5: Compiling in Docker: ~/.netrc authentication
Finally, there's yet another method that doesn't involve any Git configuration at all. This is a mechanism called .netrc. I first came across it as I was looking at Go modules, available since Go v1.11. The file lives on your host machine and is structured as follows (using the same tokens used to generate .git-credentials above):
machine github.com login joliver password my-random-github-token-here
machine bitbucket.org login joliver password my-random-bitbucket-token-here
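A sketch of producing that file from a script follows. The helper name add_machine is hypothetical, and the file is written to a scratch directory instead of the real ~/.netrc so the example is safe to run.

```shell
#!/bin/sh
netrc_dir=$(mktemp -d)
netrc_file="$netrc_dir/.netrc"

# Create the file with owner-only permissions before writing secrets into it.
( umask 077; : > "$netrc_file" )

# add_machine appends one "machine ... login ... password ..." entry.
add_machine() {
  printf 'machine %s login %s password %s\n' "$1" "$2" "$3" >> "$netrc_file"
}

add_machine github.com    joliver my-random-github-token-here
add_machine bitbucket.org joliver my-random-bitbucket-token-here

# Count the entries and list the hosts that now have credentials.
netrc_entries=$(grep -c '^machine ' "$netrc_file")
netrc_hosts=$(awk '$1 == "machine" { print $2 }' "$netrc_file" | tr '\n' ' ')
echo "entries: $netrc_entries, hosts: $netrc_hosts"
```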
The Dockerfile:
FROM golang:latest as builder
# ARG security: https://bit.ly/2oY3pCn
ARG DOCKER_NETRC
WORKDIR /builder/
ADD . /builder/
RUN echo "${DOCKER_NETRC}" > ~/.netrc
RUN CGO_ENABLED=0 go build -o app .
FROM scratch as deploy
COPY --from=builder /builder/app .
ENTRYPOINT ["/app"]
Running from the command line on the host:
$ export DOCKER_NETRC="$(cat ~/.netrc)"
$ docker build . --build-arg DOCKER_NETRC -t my-image
Here's docker-compose.yml:
version: '3.7'
services:
  app:
    build:
      context: .
      args:
        - DOCKER_NETRC
and from the host run:
$ export DOCKER_NETRC="$(cat ~/.netrc)"
$ docker-compose up
Pros and cons
Each solution has strengths and drawbacks. It's a matter of finding the right balance for your use case. For example, if you're running on a CI/CD server such as Circle CI, Travis CI, or Bitbucket Pipelines, what does your build server provide? Can you feed it environment variables? Can it have an SSH key? Furthermore, what caching is available for repeating builds quickly? Each solution requires a different amount of setup within the container.
Most CI/CD environments provide some SSH-based mechanism when executing the build process. As such, the SSH solution can be the most attractive. It has a bit more overhead than some of the other solutions because it requires you to run several initializing instructions.
Obviously the one with the least amount of overhead is compiling on the host, but it's also the one with the most potential for external interference. Fortunately, with the introduction of Go modules in Go v1.11, much of the GOPATH complexity will disappear and will allow for cleaner builds, ensuring an unmodified, clean state of the dependency tree. (In the GOPATH era it was easy to tweak a dependency, not commit that modification, and then get different results between workstation invocations and production environments.)
With the advent of Go modules, internally we've started to re-evaluate which solution might be best. We're leaning heavily toward compiling on the host because our workstations are already configured with SSH and already have a list of trusted known_hosts. In our CI/CD environment, the "host" environment provides SSH and also trusts GitHub and Bitbucket host keys by default. The only thing we need to execute is the "insteadOf" commands above to force Go to use SSH instead of HTTPS when cloning dependencies.