1

When I rebuild the image, Docker uses the cache for some layers, but not for others:

Step 1/10 : FROM ubuntu:16.04
 ---> 6a2f32de169d
Step 2/10 : MAINTAINER User R "mail@gmail.com"
 ---> Using cache
 ---> c80135951886
Step 3/10 : RUN apt-get update && apt-get install -y python3 python3-pip
 ---> Using cache
 ---> e2fb88917cc1
Step 4/10 : ADD . /srv/dataset-service
 ---> 9504708a88ae
Removing intermediate container 76532d1a35a9
Step 5/10 : WORKDIR /srv/dataset-service
 ---> 4e94e0b03138
Removing intermediate container 71b7acc78bd5
Step 6/10 : RUN pip3 install -r requirements.txt && pip3 install grpcio-tools && pip3 install .
 ---> Running in 7356d49ae7a5
Collecting psycopg2==2.7.1 (from -r requirements.txt (line 1))  
...............................................................
...............................................................

Layers 1 to 5 were built from the cache, but the sixth layer was rebuilt from scratch. Why doesn't Docker use the cache for the sixth layer?

Kenenbek Arzymatov
  • please have a look at a similar question http://stackoverflow.com/questions/38655630/how-does-docker-know-when-to-use-the-cache-during-a-build-and-when-not – Dezigo May 16 '17 at 14:02

1 Answer

1

Docker uses the instruction itself — and, for ADD and COPY, a checksum of the files being copied — to determine whether the cached layer for that step can be reused.

Any time the result of a given instruction is determined to be different from the current cache layer, that layer is invalidated.

Once a single layer is invalidated, every layer after it is invalidated as well.

In your case, the layer at step 4 is considered different from the previously built / cached layer. Since ADD checksums the contents of the files it copies, this is most likely because you changed code or configuration in your application. Once layer 4 is determined to be different, all layer caches after it are considered invalid and must be rebuilt.

A common workaround for the constant re-installation of modules from pip, Node.js's npm, Ruby's gem, etc., is to copy in only the dependency manifest and install those modules before copying the rest of the code. That way the module layer stays cached while you are still free to modify your code.

In a node.js Dockerfile, it would look like this:


FROM node:6.9.5

RUN mkdir -p /var/app
WORKDIR /var/app

COPY ./package.json /var/app
RUN npm install --production

COPY . /var/app

# ...

This will create the project folder, copy only the package.json file with the dependency list, and then install the needed modules and libraries. After that is done, the rest of the code will be copied.

In your example, you would copy over the requirements.txt and other files that determine which pip modules are needed.
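Applied to the Dockerfile in the question, a reordered version might look roughly like this (a sketch — it assumes requirements.txt sits at the root of your build context, and that only the final pip3 install . step needs the full source tree):

```dockerfile
FROM ubuntu:16.04
MAINTAINER User R "mail@gmail.com"

RUN apt-get update && apt-get install -y python3 python3-pip

WORKDIR /srv/dataset-service

# Copy only the dependency list first, so this layer's cache
# survives changes to the rest of your code
COPY requirements.txt /srv/dataset-service/
RUN pip3 install -r requirements.txt && pip3 install grpcio-tools

# Now copy the rest of the code; only layers from this point on
# are rebuilt when a source file changes
COPY . /srv/dataset-service
RUN pip3 install .
```

With this ordering, editing a source file only invalidates the final COPY and pip3 install . layers; the expensive dependency installation stays cached until requirements.txt itself changes.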

Derick Bailey