In case anyone doesn't know what that means, it's basically this kind of dockerfi...

jchw · on Jan 26, 2023

Typically I wind up using a different source image for the builder, ideally one that has (most of) the toolchain bits needed but shares the same runtime base as the final image. (For Go, golang:alpine and alpine work well. I'm aware alpine/musl is not technically supported by Go, but I have yet to hit issues in prod with it, so I guess I'll keep taking that gamble.)
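
A minimal sketch of that builder/runtime split, assuming a hypothetical Go module with go.mod and go.sum at the repository root and a made-up binary name "app" (neither comes from the comment above):

  # Build stage: the official Go image on Alpine already ships the toolchain.
  FROM golang:alpine AS builder
  WORKDIR /src
  # Copy the module files first so the dependency download is cached as its own layer.
  COPY go.mod go.sum ./
  RUN go mod download
  COPY . .
  # CGO_ENABLED=0 is an assumption here; it avoids needing a C toolchain in the builder.
  RUN CGO_ENABLED=0 go build -o /out/app .

  # Runtime stage: same Alpine base as the builder, but without the Go toolchain.
  FROM alpine
  COPY --from=builder /out/app /usr/local/bin/app
  ENTRYPOINT ["/usr/local/bin/app"]

Only the last stage ends up in the shipped image, so the compiler and module cache never reach production.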

KronisLV · on Jan 26, 2023

I take advantage of multi-stage builds, but I still think the layer system could use some improvements.

For example, say I have my own Ubuntu image that is based on one of the official ones but adds some common configuration and tools. On top of that I build my own Java image using the package manager (not unlike what Bitnami does with minideb, on which they base their PostgreSQL and most other container images).

So I might have something like the following in the Ubuntu image Dockerfile:

  RUN apt-get update && apt-get install -y \
    curl wget \
    net-tools inetutils-ping dnsutils \
    supervisor \
    && apt-get clean && rm -rf /var/lib/apt/lists /var/cache/apt/*

But then, if I want to install additional software, I need to fetch the package list anew downstream:

  FROM my-own-repo/ubuntu
  
  RUN apt-get update && apt-get install -y \
    openjdk-17-jdk-headless \
    && apt-get clean && rm -rf /var/lib/apt/lists /var/cache/apt/*

I would rather be able to leave the cache files in the previous layers/images, remove them in a later layer, and then do something like this (both the flag and the command are made up):

  docker build -t my_optimized_java_image -f java.Dockerfile --purge-deleted-files .
  
  or maybe
  
  docker build -t my_regular_java_image -f java.Dockerfile .
  purge-deleted-files -t my_regular_java_image -o my_optimized_java_image

This would work backwards from the last layer and create copies of every layer in which files have been removed or masked by later layers, then use those copies instead of the originals. So if I had 10 different images that need apt to install things at build time, I could leave the cache in my own Ubuntu image and only remove it in whatever I consider the "final" images that I'll ship. Purging would then rewrite the contents of the included layers so the deleted files are really gone.
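
For context on why this matters today, here is a small sketch (the base tag and package are just illustrative) of a Dockerfile where the cleanup happens in a later layer. Layers are additive, so the second RUN only hides the files; the first layer still carries them:

  FROM ubuntu:22.04
  # Layer A: the package lists fetched by apt-get update are stored in this layer.
  RUN apt-get update && apt-get install -y curl
  # Layer B: only records that the files were deleted. Layer A is unchanged,
  # so the total image size does not shrink.
  RUN apt-get clean && rm -rf /var/lib/apt/lists/*

Running docker history on the resulting image shows the size of each layer, which makes the leftover data in Layer A visible.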

There's little reason why these optimized layers couldn't be shared across all 10 of those "final" images either: "Hey, there are these optimized Ubuntu image layers without the package caches, so we'll use them for our .NET, Java, Node and other images". Compare that with --squash, which puts everything into a single large layer and so throws away the benefit of sharing the base Ubuntu image's layers.

Who knows, maybe someone will write a tool like that some day.

Too · on Jan 26, 2023

You will be happy to hear that this already exists. Read up on Docker BuildKit and the RUN --mount option.
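
A rough sketch of what that could look like for the Java image above, assuming my-own-repo/ubuntu is Debian/Ubuntu-based. A cache mount keeps the apt data on the build host between builds without it ever being written into an image layer, so there is nothing to delete afterwards:

  # syntax=docker/dockerfile:1
  FROM my-own-repo/ubuntu

  # Official Ubuntu images ship a docker-clean hook that deletes downloaded
  # packages after every install; remove it so the cache mount can keep them.
  RUN rm -f /etc/apt/apt.conf.d/docker-clean; \
      echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache

  # Both apt directories are mounted from a BuildKit-managed cache for this
  # RUN only; their contents are not part of the resulting layer.
  RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
      --mount=type=cache,target=/var/lib/apt,sharing=locked \
      apt-get update && apt-get install -y openjdk-17-jdk-headless

This needs BuildKit, which on older Docker versions has to be enabled explicitly (for example by setting DOCKER_BUILDKIT=1). Note that the cache lives on the build host rather than in the base image, so it is not quite the same mechanism as the purge idea, but it removes the need to re-download and then clean up in every downstream Dockerfile.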