Keeping Docker images small can be a daunting task. You might have seen your Docker image grow very large once you start adding the artifacts needed in production. However, most of the time you do not need the build environment as part of the final container.

That is where Docker multi-stage builds come into play.

Let’s understand it with the help of an example.

We’ll deploy a simple hello-world application in Go, first using a single-stage Dockerfile, which is the more traditional approach when you first step into the world of containers. Then we’ll convert it into a multi-stage Dockerfile and see what difference it makes.

Our hello world program is stored as a main.go file.

package main

import "fmt"

func main() {
    fmt.Println("hello world")
}

The single-stage Dockerfile, named SingleStageDockerfile, looks like this:

FROM golang:1.11.1
RUN mkdir /app
ADD . /app
WORKDIR /app
# Create executable for Go code
RUN CGO_ENABLED=0 GOOS=linux go build -o main ./...
CMD ["./main"]

We can build the image with:

$ docker build . -f SingleStageDockerfile -t single-stage-golang
[+] Building 4.3s (11/11) FINISHED
=> [internal] load build definition from SingleStageDockerfile            0.0s
=> => transferring dockerfile: 213B                                       0.0s
=> [internal] load .dockerignore                                          0.0s
=> => transferring context: 2B                                            0.0s
=> [internal] load metadata for docker.io/library/golang:1.11.1           3.0s
=> [auth] library/golang:pull token for registry-1.docker.io              0.0s
=> [internal] load build context                                          0.1s
=> => transferring context: 2.03MB                                        0.1s
=> [1/5] FROM docker.io/library/golang:1.11.1@sha256:63ec0e29aeba39c0fe2  0.0s
=> CACHED [2/5] RUN mkdir /app                                            0.0s
=> [3/5] ADD . /app                                                       0.0s
=> [4/5] WORKDIR /app                                                     0.0s
=> [5/5] RUN CGO_ENABLED=0 GOOS=linux go build -o main ./...              1.0s
=> exporting to image                                                     0.1s
=> => exporting layers                                                    0.0s
=> => writing image sha256:5033a0baf065a64bd7a4b9c275b2b0c182d4dc7ab6189  0.0s
=> => naming to docker.io/library/single-stage-golang                     0.0s

We can now run it as a container:

$ docker run single-stage-golang
hello world

Let’s check the size of the image:

$ docker images
REPOSITORY                    TAG        IMAGE ID       CREATED         SIZE
single-stage-golang           latest     5033a0baf065   9 seconds ago   780MB

Now, the multi-stage Dockerfile for this example, named Dockerfile, looks like this:

FROM golang:1.11.1 AS builder
RUN mkdir /app
ADD . /app
WORKDIR /app
# Create executable for Go code
RUN CGO_ENABLED=0 GOOS=linux go build -o main ./...

# Second stage build which will execute the binary
FROM alpine:latest AS production
COPY --from=builder /app .
CMD ["./main"]

Let’s break the multi-stage build into sections:

FROM golang:1.11.1 AS builder

We name the first stage builder and then refer to that name in the second stage to copy our build output (the /app directory, which contains the Go executable):

COPY --from=builder /app .

In the first stage, we build an executable from the Go code. This stage needs the Go toolchain, so its base image is golang. In the second stage, we use alpine as the base image, which is very small and does not contain the Go toolchain. We simply copy the build output from the first stage into the final stage. Because the binary is built with CGO_ENABLED=0, it is statically linked and does not depend on C libraries from the build image.

Let’s make a docker image:

$ docker build . -f Dockerfile -t multi-stage-golang
[+] Building 3.8s (14/14) FINISHED
=> [internal] load build definition from Dockerfile                         0.0s
=> => transferring dockerfile: 324B                                         0.0s
=> [internal] load .dockerignore                                            0.0s
=> => transferring context: 2B                                              0.0s
=> [internal] load metadata for docker.io/library/alpine:latest             2.5s
=> [internal] load metadata for docker.io/library/golang:1.11.1             1.0s
=> [auth] library/alpine:pull token for registry-1.docker.io                0.0s
=> CACHED [production 1/2] FROM docker.io/library/alpine:latest@sha256:69e70a79f2d41ab5d637de98c1e0b055206ba40a8145e7bddb55ccc04e13cf8f  0.0s
=> [builder 1/5] FROM docker.io/library/golang:1.11.1@sha256:63ec0e29aeba39c0fe2fc6551c9ca7fa16ddf95394d77ccee75bc7062526a96c  0.0s
=> [internal] load build context                                            0.0s
=> => transferring context: 124B                                            0.0s
=> CACHED [builder 2/5] RUN mkdir /app                                      0.0s
=> CACHED [builder 3/5] ADD . /app                                          0.0s
=> CACHED [builder 4/5] WORKDIR /app                                        0.0s
=> CACHED [builder 5/5] RUN CGO_ENABLED=0 GOOS=linux go build -o main ./... 0.0s
=> [production 2/2] COPY --from=builder /app .                              0.0s
=> exporting to image                                                       0.0s
=> => exporting layers                                                      0.0s
=> => writing image sha256:ed5e9282d8d5f4d195584da215dd93017be48e8602a347f9226045ed8e11b759  0.0s
=> => naming to docker.io/library/multi-stage-golang                        0.0s

Let’s check the newly created image:

$ docker images
REPOSITORY                    TAG                     IMAGE ID       CREATED          SIZE
multi-stage-golang            latest                  ed5e9282d8d5   56 seconds ago   7.52MB

We can run this as a container by executing the following:

$ docker run multi-stage-golang
hello world

The container spawned from this multi-stage Dockerfile is based on the alpine image and simply runs the executable. It does not need the Go toolchain, nor does it care how the binary was produced or which build-time dependencies that required.
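One detail worth noting: `COPY --from=builder /app .` copies the whole /app directory from the builder stage, so main.go travels along with the binary. A minimal sketch of a variant that copies only the executable follows. The /app/main path follows from `WORKDIR /app` and the `-o main` flag in the first stage; the `WORKDIR /app` in the second stage is an assumption added here for tidiness and is not part of the original Dockerfile.

```dockerfile
# First stage: identical to the builder stage in the original Dockerfile
FROM golang:1.11.1 AS builder
RUN mkdir /app
ADD . /app
WORKDIR /app
# Create executable for Go code
RUN CGO_ENABLED=0 GOOS=linux go build -o main ./...

# Second stage: copy only the compiled binary, not the source tree
FROM alpine:latest AS production
# Assumption: keep the binary in its own directory instead of /
WORKDIR /app
COPY --from=builder /app/main .
CMD ["./main"]
```

For this hello-world program the size difference is negligible, but in a larger project this keeps the source code and intermediate build files out of the production image.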

Thus, the size of the final image reflects only the last stage of the build: its base image plus whatever artifacts are copied in from earlier stages. The difference in image size is considerable even for this simple hello-world program, where we don’t want the entire Go toolchain to be part of our container. The image built from the single-stage Dockerfile is 780 MB, while the image built from the multi-stage Dockerfile is just 7.52 MB.

The same approach works for Java programs, other Go programs, and so on, and it offers a considerable advantage in keeping your image size small and your Dockerfile more maintainable.
