How I barely got my first Ceph monitor running in Docker
Docker is definitely the new trend, so I wanted to quickly try putting a Ceph monitor inside a Docker container. Story of a tough journey…
First, let’s start with the Dockerfile, which makes the setup easy and repeatable by anybody:
FROM ubuntu:latest
MAINTAINER Sebastien Han <[email protected]>

# Hack for initctl not being available in Ubuntu
RUN dpkg-divert --local --rename --add /sbin/initctl
RUN ln -s /bin/true /sbin/initctl

# Repo and packages
RUN echo deb http://archive.ubuntu.com/ubuntu precise main | tee /etc/apt/sources.list
RUN echo deb http://archive.ubuntu.com/ubuntu precise-updates main | tee -a /etc/apt/sources.list
RUN echo deb http://archive.ubuntu.com/ubuntu precise universe | tee -a /etc/apt/sources.list
RUN echo deb http://archive.ubuntu.com/ubuntu precise-updates universe | tee -a /etc/apt/sources.list
RUN apt-get update
RUN apt-get install -y --force-yes wget lsb-release sudo

# Fake a fuse install otherwise ceph won't get installed
RUN apt-get install -y libfuse2
RUN cd /tmp ; apt-get download fuse
RUN cd /tmp ; dpkg-deb -x fuse_* .
RUN cd /tmp ; dpkg-deb -e fuse_*
RUN cd /tmp ; rm fuse_*.deb
# Replace the post-install script with a no-op (printf behaves the same in sh and bash)
RUN cd /tmp ; printf '#!/bin/bash\nexit 0\n' > DEBIAN/postinst
RUN cd /tmp ; dpkg-deb -b . /fuse.deb
RUN cd /tmp ; dpkg -i /fuse.deb

# Install Ceph
RUN wget -q -O- 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc' | apt-key add -
RUN echo deb http://ceph.com/debian-dumpling/ $(lsb_release -sc) main | tee /etc/apt/sources.list.d/ceph-dumpling.list
RUN apt-get update
RUN apt-get install -y --force-yes ceph ceph-deploy

# Avoid host resolution error from ceph-deploy
RUN echo ::1 ceph-mon | tee /etc/hosts

# Deploy the monitor
RUN ceph-deploy new ceph-mon

EXPOSE 6789
Then build the image:
$ sudo docker build -t leseb/ceph-mon .
Now we almost have the full image; we just need to instruct Docker to install the monitor. To do this, we simply run the image we just created and pass the command that creates the monitor:
$ docker run -d -h="ceph-mon" leseb/ceph-mon ceph-deploy --overwrite-conf mon create ceph-mon
Check if it works properly:
$ docker logs e2f48f3cca26
Then commit the last version of your image to save the latest change:
$ docker commit e2f48f3cca26 leseb/ceph-mon
Finally run the monitor in a new container:
$ docker run -d -p 6789 -h="ceph-mon" leseb/ceph-mon ceph-mon --conf /ceph.conf --cluster=ceph -i ceph-mon -f
Now the tough part: because of the use of ceph-deploy, the monitor listens on the IPv6 loopback address (::1).
Under normal circumstances this is not a problem, since we can reach the monitor through either its local IP (lo) or its private address (eth0 or something else).
However, with Docker things are a little different: the monitor is only accessible from within its namespace, so even exposing a port won’t work.
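To make the problem concrete, here is a minimal sketch of how one could flag a loopback-only bind address. The function name `is_loopback_addr` and the sample address are hypothetical; the point is only that anything bound to 127.x or ::1 can be reached solely from inside the same network namespace.

```shell
#!/bin/sh
# Sketch: return 0 if the given address (optionally with a port suffix)
# is a loopback address, 1 otherwise. Purely string-based, no network calls.
is_loopback_addr() {
    case "$1" in
        127.*|::1|"[::1]"*|localhost|localhost:*) return 0 ;;
        *) return 1 ;;
    esac
}

# Sample call with an address in the bracketed IPv6 form
if is_loopback_addr "[::1]:6789"; then
    echo "loopback only: unreachable from outside the namespace"
fi
```

An address such as 172.17.0.8:6789 would not match, which is the kind of address the monitor would need to bind to in order to be reachable through an exposed port.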
Basically, exposing a port creates an iptables DNAT rule that says: everything going from anywhere to the host IP address on a specific port is redirected to the IP address within the container namespace.
In the end, if you try to access the monitor using the IP address of the host plus the exposed port, you will get something like this:
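As a rough illustration of what such a rule looks like, the sketch below only prints an iptables command instead of running it. The `DOCKER` chain name, the host port 49153, and the helper name `dnat_rule` are assumptions made for the example; the exact rule Docker installs depends on its version.

```shell
#!/bin/sh
# Sketch: print (not execute) a DNAT rule that forwards a host port
# to a port on the container's IP address.
dnat_rule() {
    # $1: port on the host, $2: container IP, $3: port inside the container
    [ "$#" -eq 3 ] || {
        echo "usage: dnat_rule HOST_PORT CONTAINER_IP CONTAINER_PORT" >&2
        return 1
    }
    printf 'iptables -t nat -A DOCKER -p tcp --dport %s -j DNAT --to-destination %s:%s\n' \
        "$1" "$2" "$3"
}

# Hypothetical host port 49153 mapped to the monitor port in the container
dnat_rule 49153 172.17.0.8 6789
```

The destination here is the container's eth0 address, so a daemon that only listens on ::1 inside the container never sees the redirected traffic.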
.connect claims to be [::1]:6804/1031425 not [::1]:6804/31537 - wrong node!
There is a way to access the monitor, though! We need to access it from the host directly, through the namespace.
First grab your container’s ID:
$ docker ps
Use this script, stolen and adapted from Jérôme Petazzoni here. This script creates the entry point on the host to access the namespace of the container.
$ ./pipework.sh 9cfa541f6be9
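The script itself is not reproduced here, but the usual trick behind this kind of helper can be sketched as follows: `ip netns` looks for named namespaces under /var/run/netns, so linking a process's /proc/PID/ns/net entry there makes the container's namespace usable by name. The function name `link_netns` and the overridable `NETNS_DIR` variable are hypothetical, and this is only an assumption about the approach, not the actual script.

```shell
#!/bin/sh
# Sketch: expose the network namespace of a running process to `ip netns`.
# NETNS_DIR is overridable only so the function can be tried without root.
NETNS_DIR="${NETNS_DIR:-/var/run/netns}"

link_netns() {
    # $1: PID of a process running inside the container
    [ -n "$1" ] || { echo "usage: link_netns PID" >&2; return 1; }
    mkdir -p "$NETNS_DIR" || return 1
    # Name the namespace after the PID, e.g. /var/run/netns/10660
    ln -sfn "/proc/$1/ns/net" "$NETNS_DIR/$1" || return 1
    echo "$1"
}

# Example (needs root and a real container PID):
#   link_netns 10660
#   ip netns exec 10660 ip addr
```

On recent Docker versions the PID can probably be read with `docker inspect --format '{{.State.Pid}}' <container>`; older releases may need another way to find it.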
Now, get the monitor’s key:
$ cp /var/lib/docker/containers/9cfa541f6be97821131355b4005bc24b509baf3028759f0f871bf43840399f96/rootfs/ceph.mon.keyring ceph.mon.docker.keyring
$ sudo ip netns exec 10660 ceph -k ceph.mon.docker.keyring -n mon. -m 172.17.0.8 -s
I’m not really convinced by this first shot. The biggest issue here is that the monitor needs to be known.
Wow, that was a hell of a job to get this working. In the end, the effort is quite useless since nothing can reach the monitor except the host itself. Thus, other Ceph components will only work if they share the same network namespace as the monitor. Merging all the containers’ namespaces into one could be quite difficult as well. But what’s the point of having a Ceph cluster stuck within some namespaces, without any clients accessing it?
I have to admit that this was pretty fun to hack, although in practice it’s not usable at all. So consider this an experiment and a way to get into Docker ;-).