Source: http://kimh.github.io/blog/en/docker/gotchas-in-writing-dockerfile-en/
Docker creates a commit for each instruction in a Dockerfile. As long as an instruction has not changed, Docker assumes the resulting image has not changed either, so it reuses the cached image as the parent image for the next instruction. This is why docker build takes a long time on the first run but finishes almost immediately on the second.
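As a rough illustration of this commit-per-instruction model, consider a small Dockerfile like the one below. The base image tag and the installed package are assumptions chosen for the example.

```dockerfile
# Hypothetical example: the image tag and package are illustrative assumptions.
FROM ubuntu:22.04

# Each instruction below produces its own commit.
# On the first build, every step is actually executed.
RUN apt-get update
RUN apt-get install -y openssh-server

# On a second build with an unchanged Dockerfile, Docker can reuse the
# cached image for each step above, so the build finishes almost immediately.
```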
However, it is not always clear when the cache is used and what invalidates it. Here are a few cases that I found worth noting.
Cache invalidation at one instruction invalidates the cache of all subsequent instructions
This is the basic rule of caching. If you invalidate the cache at one instruction, none of the subsequent instructions use the cache.
Once you add a RUN apt-get update instruction, every instruction after it has to be executed from scratch, even if it has not changed. This is inevitable because Docker uses the image built by the previous instruction as the parent image for the next one. If you insert an instruction that creates a new parent image, no subsequent instruction can use the cache because its parent image is now different.
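The rule can be sketched with a Dockerfile like this one, where a RUN apt-get update line has been inserted above instructions that were already built once. The image tag and the packages are assumptions for illustration only.

```dockerfile
# Hypothetical example: the image tag and packages are illustrative assumptions.
FROM ubuntu:22.04

# Newly inserted instruction: it creates a new parent image for everything below.
RUN apt-get update

# These lines are unchanged from the previous build, but they can no longer
# use the cache because their parent image is now different.
RUN apt-get install -y openssh-server
RUN apt-get install -y curl
```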
Cache is invalidated even when you add commands that do nothing
Appending a command that has no effect still invalidates the cache.
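A minimal sketch of this case, assuming the no-op is appended to an existing RUN line (the image tag is also an assumption):

```dockerfile
# Hypothetical example: the image tag is an illustrative assumption.
FROM ubuntu:22.04

# Before the edit this line read:
#   RUN apt-get update
# Appending "&& true" does not change the resulting filesystem, but the
# instruction text differs, so the cached image is not reused.
RUN apt-get update && true
```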
Even though the true command changes nothing in the image, Docker invalidates the cache.
Cache is invalidated when you add spaces between a command and its arguments inside an instruction
Adding whitespace between a command and its arguments invalidates the cache.
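One way to picture this, as a sketch only: the same command with an extra space between apt-get and update. The image tag is an assumption, and the exact behavior may depend on the Docker version and builder in use.

```dockerfile
# Hypothetical example: the image tag is an illustrative assumption.
FROM ubuntu:22.04

# Before the edit (one space between the command and its argument):
#   RUN apt-get update
# After the edit (two spaces): the command string is different, so the
# cached image for this step is not reused.
RUN apt-get  update
```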
Cache is used when you add spaces around commands inside an instruction
The cache stays valid even if you add spaces around the commands.
Cache is used for non-idempotent instructions
This is something of a pitfall in build caching. By non-idempotent instructions I mean commands that may return a different result each time they run. For example, apt-get update is not idempotent because the package lists it fetches change over time.
Suppose you wrote a Dockerfile that runs apt-get update and built an image from it. Three months later, Ubuntu publishes security updates to its repositories, so you rebuild the image from the same Dockerfile, hoping the new image includes them. It does not. Since no instructions or files have changed, Docker uses the cache and skips apt-get update.
If you don’t want to use the cache, just pass the --no-cache option to docker build.
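The scenario can be sketched as follows. The image tag and the my-image name are hypothetical, and the build commands are shown as comments because they run on the host rather than inside the Dockerfile.

```dockerfile
# Hypothetical two-line Dockerfile; the image tag is an illustrative assumption.
FROM ubuntu:22.04
RUN apt-get update

# A plain rebuild may reuse the cached result of the RUN step above,
# even though the package lists have changed upstream:
#   docker build -t my-image .
# Passing --no-cache forces every instruction to be executed again:
#   docker build --no-cache -t my-image .
```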
Instructions after ADD are never cached (only versions prior to 0.7.3)
If you use a Docker version older than 0.7.3, watch out!
If you have a Dockerfile in which RUN apt-get update and RUN apt-get install openssh-server follow an ADD instruction, those two instructions will never be cached.
This behavior changed in 0.7.3. Instructions after ADD are now cached, but the cache is invalidated if the content of the added file changes.
If you change the rock.you file, the instructions after ADD do not use the cache.
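A Dockerfile of the kind described here might look like the sketch below. rock.you is the file named in the text; the image tag and the destination path are assumptions.

```dockerfile
# Hypothetical example: the image tag and destination path are assumptions.
FROM ubuntu:22.04

# rock.you is the file mentioned in the text.
ADD rock.you /tmp/rock.you

# From 0.7.3 on, these steps are cached as long as the content of rock.you
# is unchanged. Editing the file invalidates the cache from ADD onward.
RUN apt-get update
RUN apt-get install -y openssh-server
```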
Hack to run a container in the background
If you want to simplify the way you run containers, run them in the background with docker run -d image your-command. This is recommended over docker run -i -t image your-command because you can start the container with a single command and you don’t need to detach from the container’s terminal by pressing Ctrl+P followed by Ctrl+Q.
However, there is a problem with the -d option. Your container stops immediately unless its command keeps running in the foreground.
Let me explain this with a case where you want to run the Apache service in a container. The intuitive way of doing this is to run apachectl as the container’s command.
However, the container stops immediately after it starts. This is because apachectl exits once it has detached the Apache daemon.
Docker doesn’t like this. Docker requires your command to keep running in the foreground. Otherwise, it thinks that your application has stopped and shuts down the container.
You can solve this by running the Apache executable directly with the foreground option.
This means manually doing what apachectl does for us and running the Apache executable ourselves. With this approach, Apache keeps running in the foreground.
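As a sketch of the foreground approach, the command can be baked into the image with CMD instead of being passed to docker run. Everything here is an assumption for a Debian/Ubuntu-style apache2 package: the image tag, the binary path, and the environment variable values, which mirror what apachectl would typically export on such systems.

```dockerfile
# Hypothetical sketch for a Debian/Ubuntu-style Apache package.
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y apache2

# Variables that apachectl would normally export for us. The values are
# assumptions based on typical Debian/Ubuntu defaults; adjust for your image.
ENV APACHE_RUN_USER=www-data \
    APACHE_RUN_GROUP=www-data \
    APACHE_LOG_DIR=/var/log/apache2 \
    APACHE_RUN_DIR=/var/run/apache2 \
    APACHE_LOCK_DIR=/var/lock/apache2 \
    APACHE_PID_FILE=/var/run/apache2/apache2.pid

# Make sure the runtime directories exist before Apache starts.
RUN mkdir -p /var/run/apache2 /var/lock/apache2

# Run the Apache executable directly and keep it in the foreground,
# so the container's main process does not exit.
CMD ["/usr/sbin/apache2", "-D", "FOREGROUND"]
```

With an image built this way, docker run -d followed by the image name should be enough to keep the container running.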
The problem is that some applications cannot run in the foreground. Also, we need to do extra work, such as exporting environment variables, by ourselves. How can we make it easier?
In this situation, you can append tail -f /dev/null to your command. Even if your main command runs in the background, your container doesn’t stop because tail keeps running in the foreground. We can use this technique in the Apache case.
Much better, right? Since tail -f /dev/null doesn’t do any harm, you can use this hack with any application.
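Expressed as a Dockerfile, the hack might look like the sketch below. The image tag and package are assumptions, and the command is placed in CMD rather than passed on the docker run command line.

```dockerfile
# Hypothetical example: the image tag and package are illustrative assumptions.
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y apache2

# Shell-form CMD runs under /bin/sh -c, so "&&" works here.
# apachectl starts the daemon and exits; tail then keeps running in the
# foreground, so the container stays up.
CMD apachectl start && tail -f /dev/null
```

One trade-off of this design is that tail, not Apache, becomes the process Docker watches, so the container keeps running even if Apache itself exits later.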