Infrastructure Base Images

The final step of our Infrastructure Seed process creates the "infrastructure grade" Docker images that are based on assets coming from our own DML. Our DML should be running and accessible at http://localhost:8081 from the previous step, allowing us to start using artifacts we manage ourselves. The following images will now be created:

  1. Ubuntu Base Image with APT template
  2. OpenJDK8 with our own copy of the JDK and APT proxy

Ubuntu 20.04 Base Image

In principle, the Dockerfile remains the same; however, the APT sources have been adapted by adding a template file for our new sources and removing the existing ones.

lab/images/ubuntu-focal
version: '3.3'

services:
  ubuntu-focal:
    image: infra/ubuntu/focal:1.0.0
    build: .

FROM scratch
# download externally first
ADD local/ubuntu-focal-oci-amd64-root.tar.gz /
CMD ["bash"]

##################################
# ensure we only get apt from us #
##################################

COPY sources.list.base /etc/apt/sources.list.base
RUN mv /etc/apt/sources.list /etc/apt/sources.list.bak

deb APT_URL/repository/apt-proxy-ubuntu-focal/ focal main restricted
deb APT_URL/repository/apt-proxy-ubuntu-focal/ focal-updates main restricted
deb APT_URL/repository/apt-proxy-ubuntu-focal/ focal universe
deb APT_URL/repository/apt-proxy-ubuntu-focal/ focal-updates universe
deb APT_URL/repository/apt-proxy-ubuntu-focal/ focal multiverse
deb APT_URL/repository/apt-proxy-ubuntu-focal/ focal-updates multiverse
deb APT_URL/repository/apt-proxy-ubuntu-focal/ focal-backports main restricted universe multiverse
deb APT_URL/repository/apt-proxy-ubuntu-focal/ focal-security main restricted
deb APT_URL/repository/apt-proxy-ubuntu-focal/ focal-security universe
deb APT_URL/repository/apt-proxy-ubuntu-focal/ focal-security multiverse

The listing above shows the content of the sources.list.base template. The important sources.list change is the block appended at the end of the Dockerfile:

##################################
# ensure we only get apt from us #
##################################

COPY sources.list.base /etc/apt/sources.list.base
RUN mv /etc/apt/sources.list /etc/apt/sources.list.bak

Building the Ubuntu Image

Since the Ubuntu image is built FROM scratch, we do not have curl available during the build, so we first download the distribution from our own DML before running the Dockerfile build. This manual "external to docker" download is only required for base FROM scratch images. All other images will use the same Ubuntu base image and therefore have access to package managers and, of course, curl for downloads.

mkdir images/ubuntu-focal/local/
curl http://admin:admin123@localhost:8081/repository/dml/docker/ubuntu/focal/20210827/ubuntu-focal-core-cloudimg-amd64-root.tar.gz -o images/ubuntu-focal/local/ubuntu-focal-oci-amd64-root.tar.gz
docker-compose -f images/ubuntu-focal/docker-compose.yml build
rm -R images/ubuntu-focal/local/
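
To sanity-check the result, we can confirm the template is in place and the original sources were moved aside (a quick check, assuming the build above succeeded):

docker run --rm infra/ubuntu/focal:1.0.0 ls /etc/apt/
docker run --rm infra/ubuntu/focal:1.0.0 head -1 /etc/apt/sources.list.base

The listing should show sources.list.base and sources.list.bak but no active sources.list, and the first template line should contain the APT_URL placeholder.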

Nexus3 and Docker Builds

Using Nexus3 for all artifacts during docker builds is handled by a build argument called ARG_ART_URL (Argument Artifact URL), which is part of the docker-compose files used for building images. This allows us to properly isolate the various environments for development, testing, and production.

build:
  args:
    ARG_ART_URL: [nexus-root-url]
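
As a sketch of how this could be parameterized per environment (the ART_URL variable name is just an example, not part of the lab), docker-compose variable substitution lets the host environment drive the value while keeping a default:

build:
  args:
    ARG_ART_URL: ${ART_URL:-http://localhost:8081}

Running ART_URL=http://nexus.example:8081 docker-compose build would then target a different Nexus3 instance without touching the compose file.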

APT Proxy and Dockerfile builds

The primary reason for creating a sources template inside our base Ubuntu image is the ability to specify the repository location at build time. Let's compare sources.list.base (ours) with sources.list.bak (the original) in our Ubuntu image.

#before (sources.list.bak)
deb http://archive.ubuntu.com/ubuntu/ focal main restricted

#after (sources.list.base)
deb APT_URL/repository/apt-proxy-ubuntu-focal/ focal main restricted

APT_URL acts as a placeholder that lets us inject a specific location at build time. As we now create images that need to use the package manager, we will wrap the apt-get command chains with an additional command before and after. Previously, a typical apt-get install may have looked like this:

RUN apt-get update \
  && apt-get install -y curl \
  && apt-get clean

Now we wrap the following additional commands around the main body:

RUN sed -e "s|APT_URL|${ARG_ART_URL}|" /etc/apt/sources.list.base > /etc/apt/sources.list \
  && apt-get update \
  && apt-get install -y curl \
  && apt-get clean \
  && rm /etc/apt/sources.list

ARG_ART_URL is the build argument passed into the docker build process to replace the placeholder inside our sources.list.base file. The end result is a new sources.list created on the fly before package installation and removed again afterwards, ensuring that the docker image layer contains no reference to the DML after command execution. Alternatively, if you want to bypass the DML altogether, you can replicate the prior behavior via:

RUN cp /etc/apt/sources.list.bak /etc/apt/sources.list  \
  && apt-get update \
  && apt-get install -y curl \
  && apt-get clean \
  && rm /etc/apt/sources.list
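
One detail that is easy to miss: the Dockerfile must declare the build argument before the RUN instruction that references it, otherwise ${ARG_ART_URL} expands to an empty string and the sed produces broken URLs. A minimal sketch, matching what the OpenJDK Dockerfile further below does:

ARG ARG_ART_URL

RUN sed -e "s|APT_URL|${ARG_ART_URL}|" /etc/apt/sources.list.base > /etc/apt/sources.list \
  && apt-get update \
  && apt-get install -y curl \
  && apt-get clean \
  && rm /etc/apt/sources.list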

A nice side benefit, besides improving our availability guideline, is a significant speed-up of docker image build times. For example, a Dockerfile installing nginx 1.18 and openssl has the following build times:

  • apt-get "online" 72.7 seconds
  • apt-get "Nexus3" first run 79.9 seconds (+7.2 seconds)
  • apt-get "Nexus3" cached run 16.1 seconds (-56.6 seconds)

It is also worth noting that the cached run does not need any outside network access and is fully isolated to the local network.

DML Downloads and Dockerfile builds

Besides the APT proxy, our docker build curl downloads will now also rely as much as possible on the Nexus3 DML. For example, the Nexus3 download we had in our seed image:

RUN curl -fsSL https://download.sonatype.com/nexus/3/nexus-${NEXUS_VERSION}-unix.tar.gz -o /tmp/nexus-${NEXUS_VERSION}-unix.tar.gz \
  && echo "${NEXUS_CHECKSUM}  /tmp/nexus-${NEXUS_VERSION}-unix.tar.gz" | sha256sum -c - \
  && tar -xvzf /tmp/nexus-${NEXUS_VERSION}-unix.tar.gz --strip-components=1 -C /opt/sonatype/nexus \
  && rm -rf /tmp/*

This now turns into:

RUN curl $ARG_ART_URL/repository/dml/docker/nexus/nexus-${NEXUS_VERSION}-unix.tar.gz -o /tmp/nexus-${NEXUS_VERSION}-unix.tar.gz \
  && tar -xvzf /tmp/nexus-${NEXUS_VERSION}-unix.tar.gz --strip-components=1 -C /opt/sonatype/nexus \
  && rm -rf /tmp/*

Note: sha256 integrity verification is skipped as the file comes from our own source and was verified before being uploaded. This helps when building new images or updating existing ones, as it saves you some time hunting down and updating SHAs for version updates. The file and SHA integrity check is only done once, during the upload to the DML.
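
For reference, that verify-then-upload step could look roughly like the following, assuming the dml repository is a raw hosted repository that accepts HTTP PUT uploads (a sketch, not the canonical seed procedure):

# download from the vendor once, verify the published checksum, then push the verified file into the DML
curl -fsSL https://download.sonatype.com/nexus/3/nexus-${NEXUS_VERSION}-unix.tar.gz -o nexus-${NEXUS_VERSION}-unix.tar.gz
echo "${NEXUS_CHECKSUM}  nexus-${NEXUS_VERSION}-unix.tar.gz" | sha256sum -c -
curl -u admin:admin123 --upload-file nexus-${NEXUS_VERSION}-unix.tar.gz \
  http://localhost:8081/repository/dml/docker/nexus/nexus-${NEXUS_VERSION}-unix.tar.gz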

Security Considerations for Artifact Build Access

The next challenge we encounter is our password-protected Nexus3 instance, from which we need to download artifacts. Unfortunately, as of this point, secrets during builds are not fully supported. While the new BuildKit allows for secrets, it is not yet compatible with docker-compose (pending PR: https://github.com/docker/compose/pull/7046). Once it becomes available, we will update the lab after security verification.
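
For reference, once that support lands, the BuildKit pattern would look roughly like this (a sketch only; the secret id and file name are examples, not part of the lab):

# syntax=docker/dockerfile:1
FROM infra/ubuntu/focal:1.0.0
ARG ARG_ART_URL

# the credentials are only mounted for this RUN step and never stored in an image layer
RUN --mount=type=secret,id=art_credentials \
    curl -u "$(cat /run/secrets/art_credentials)" \
      $ARG_ART_URL/repository/dml/docker/jdk/jdk8u292-b10/OpenJDK8U-jdk_x64_linux_hotspot_8u292b10.tar.gz \
      -o /tmp/openjdk.tar.gz

The secret itself would then be passed at build time, e.g. DOCKER_BUILDKIT=1 docker build --secret id=art_credentials,src=./art_credentials . with the file containing admin:admin123.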

If we were to configure our ARG_ART_URL as http://admin:admin123@localhost:8081 to allow easy curl access, we would be exposing our credentials as part of the docker image, meaning anyone who has access to the image has access to all commands executed as part of the build. While we are dealing with our own images only, in our own user/password protected environments, we still need to minimize any risk of leaking credentials. Below is an example from one of our images that uses ARG_ART_URL as a build argument with credentials included:

>docker history infra/[image-using-our-artifact-arg]
IMAGE          CREATED         CREATED BY
1ceae2e6296f   x minutes ago   |1 ARG_ART_URL=http://admin:admin123@localhost:8081

The way to solve this problem is through another reverse proxy using nginx with the following rules:

  • NGINX installed without the Nexus3 APT proxy (since we cannot yet access it without leaking credentials)
  • Implement Nexus3 authentication by configuring nginx to pass an authorization header for each request
  • Ensure the nginx config is git-ignored so credentials are never checked in
  • Connect to the container that runs Nexus3 (either through networks or host)

If you have a gut feeling that this appears hacky, rest assured, it 100% is. However, as of today (Apr 2021), there is no other approach to handle this if we want to keep the assets needed for the build inside the Dockerfile and use docker-compose.

lab/images/nginx-build
version: '3.3'

services:
  nginx:
    image: infra/nginx/build:1.0.0
    build:
      context: .
        
FROM infra/ubuntu/focal:1.0.0

RUN cp /etc/apt/sources.list.bak /etc/apt/sources.list  \
    && apt-get update \
    && apt-get install -y nginx=1.18.* \
    && apt-get clean \
    && rm /etc/apt/sources.list

CMD ["nginx"]

We will be adding the nginx config at compose time via mounts to ensure our credentials can remain dynamic (and, of course, are never checked into git). This is done via the build.conf.base file, which injects our Base64-encoded credentials as an Authorization HTTP header. The Base64 encoding of admin:admin123 is YWRtaW46YWRtaW4xMjM=, which we will add to the base config.
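
If you prefer to generate that value yourself instead of copying it, a quick shell one-liner does the job (the -n matters, so the trailing newline is not encoded):

echo -n 'admin:admin123' | base64
YWRtaW46YWRtaW4xMjM=

With that value in hand, the placeholder in build.conf.base is replaced as follows: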

sed -e "s|BASE64_USER_PASSWORD|YWRtaW46YWRtaW4xMjM=|" composed/nginx/build.conf.base > composed/nginx/build.conf

The final compose for this container is below, with the updated nginx config.

lab/composed
daemon off;
worker_processes  1;

#error_log  /var/log/nginx/error.log warn;
#pid        /var/run/nginx.pid;


events {
    worker_connections  1024;
}


http {

    proxy_send_timeout 120;
    proxy_read_timeout 300;
    proxy_buffering    off;
    keepalive_timeout  5 5;
    tcp_nodelay        on;

    server {
        listen       *:3001;

        client_max_body_size 1G;

        location / {
            proxy_pass         http://d1i-doc-nexus01:8081;

            proxy_set_header   Host             $host;
            proxy_set_header   X-Real-IP        $remote_addr;
            proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;
            proxy_set_header   Authorization "Basic BASE64_USER_PASSWORD";
        }
    }
}

version: '3.3'

services:
  nginx-build:
    image: infra/nginx/build:1.0.0
    volumes:
      - ./nginx/build.conf:/etc/nginx/nginx.conf
    ports:
      - 3001:3001
    hostname: d1i-doc-ngbuild
    networks:
      ops-network:
        ipv4_address: 172.22.90.2
networks:
  ops-network:
    name: ops-network
    driver: bridge
    ipam:
      driver: default
      config:
        - subnet: 172.22.90.0/16

Building the NGINX Build Image

Now build it and start it.

docker-compose -f images/nginx-build/docker-compose.yml build
docker-compose -f composed/docker-compose-nginx-build.yml up --no-start
docker-compose -f composed/docker-compose-nginx-build.yml start

We now have the Nexus3 application up and running at http://localhost:8081, and an nginx reverse proxy to Nexus3 at http://localhost:3001 with authentication passed as a header.
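
A quick way to confirm the proxy is injecting credentials correctly is to request the same DML artifact through both ports. Assuming anonymous access is disabled on Nexus3 and the JDK artifact from the earlier uploads is present, the direct request is rejected while the proxied one succeeds:

# direct, without credentials: expect 401 Unauthorized
curl -sI http://localhost:8081/repository/dml/docker/jdk/jdk8u292-b10/OpenJDK8U-jdk_x64_linux_hotspot_8u292b10.tar.gz | head -1
# via the nginx build proxy, which adds the Authorization header: expect 200 OK
curl -sI http://localhost:3001/repository/dml/docker/jdk/jdk8u292-b10/OpenJDK8U-jdk_x64_linux_hotspot_8u292b10.tar.gz | head -1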

OpenJDK 8

The OpenJDK Dockerfile required a few more changes, primarily limiting execution to our architecture only, proxying APT commands, and using our own DML to download the JDK.

lab/images/openjdk-8
version: '3.3'

services:
    openjdk-8:
      image: infra/java/openjdk-8:1.0.0
      build:
        context: .
        network: host
        args:
          ARG_ART_URL: http://d1i-doc-ngbuild:3001
        extra_hosts:
          - "d1i-doc-ngbuild:172.22.90.2"
FROM infra/ubuntu/focal:1.0.0

ENV LANG='en_US.UTF-8' LANGUAGE='en_US:en' LC_ALL='en_US.UTF-8'

ARG ARG_ART_URL

RUN sed -e "s|APT_URL|${ARG_ART_URL}|" /etc/apt/sources.list.base > /etc/apt/sources.list \
    && apt-get update \
    && apt-get install -y --no-install-recommends tzdata curl ca-certificates fontconfig locales \
    && echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen \
    && locale-gen en_US.UTF-8 \
    && rm -rf /var/lib/apt/lists/* \
    && rm /etc/apt/sources.list

ENV JAVA_VERSION jdk8u292-b10


#RUN set -eux; \
RUN set -eu; \
    curl $ARG_ART_URL/repository/dml/docker/jdk/jdk8u292-b10/OpenJDK8U-jdk_x64_linux_hotspot_8u292b10.tar.gz -o /tmp/openjdk.tar.gz; \
    mkdir -p /opt/java/openjdk; \
    cd /opt/java/openjdk; \
    tar -xf /tmp/openjdk.tar.gz --strip-components=1; \
    rm -rf /tmp/openjdk.tar.gz;

ENV JAVA_HOME=/opt/java/openjdk \
    PATH="/opt/java/openjdk/bin:$PATH"

Blue Box of Information: Logs and Leaking information

Another very subtle change made to the public JDK Dockerfile for security reasons is adjusting the shell options to stop printing commands and their arguments. If we had passed the DML user/password as an argument, it would have shown up in the logs. While we have guarded ourselves via the nginx build container, it may still be a risk in other scenarios.

#RUN set -eux; \
RUN set -eu; \

Where the -x specifically is described as:

-x  Print commands and their arguments as they are executed.

Dockerfile build log before, with set -eux and credentials:

Step 6/7 : RUN set -eux;     curl $ARG_ART_URL/repository/dml/docker/jdk/jdk8u275-b01/OpenJDK8U-jdk_x64_linux_hotspot_8u275b01.tar.gz -o /tmp/openjdk.tar.gz;     mkdir -p /opt/java/openjdk;     cd /opt/java/openjdk;     tar -xf /tmp/openjdk.tar.gz --strip-components=1;     rm -rf /tmp/openjdk.tar.gz;
---> Running in 8c1e908d8fd9
+ curl http://admin:admin123@host.docker.internal:8081/repository/dml/docker/jdk/jdk8u275-b01/OpenJDK8U-jdk_x64_linux_hotspot_8u275b01.tar.gz -o /tmp/openjdk.tar.gz
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                Dload  Upload   Total   Spent    Left  Speed
100 99.2M  100 99.2M    0     0  13.3M      0  0:00:07  0:00:07 --:--:-- 13.4M
+ mkdir -p /opt/java/openjdk
+ cd /opt/java/openjdk
+ tar -xf /tmp/openjdk.tar.gz --strip-components=1
+ rm -rf /tmp/openjdk.tar.gz

And the Dockerfile build log after. While we did lose the log output of the other commands, we minimized the chance of a user/password showing up somewhere.

Step 6/7 : RUN set -eu;     curl $ARG_ART_URL/repository/dml/docker/jdk/jdk8u275-b01/OpenJDK8U-jdk_x64_linux_hotspot_8u275b01.tar.gz -o /tmp/openjdk.tar.gz;     mkdir -p /opt/java/openjdk;     cd /opt/java/openjdk;     tar -xf /tmp/openjdk.tar.gz --strip-components=1;     rm -rf /tmp/openjdk.tar.gz;
---> Running in 314ad0fdf007
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                Dload  Upload   Total   Spent    Left  Speed
100 99.2M  100 99.2M    0     0  14.3M      0  0:00:06  0:00:06 --:--:-- 14.6M

And yes, someone could potentially look at the running processes to grab the command currently being executed, but that would require host access, whereas the output of a docker build may end up being logged and shipped to another location.

Building the OpenJDK Image

Build the image using the NGINX build proxy to keep credentials out of the image at build time:

docker-compose -f images/openjdk-8/docker-compose.yml build
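
As a quick smoke test of the resulting image (assuming the build completed), java should be on the PATH and report the jdk8u292-b10 build:

docker run --rm infra/java/openjdk-8:1.0.0 java -version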