How to use RBD volumes (pod fails to start with error "rbd: failed to modprobe rbd") #23924
Comments
@jperville have you tried apt-get installing the ceph-common package?
Hi @maclof, I have installed the ceph-common package from ceph.org, version 0.94 (hammer), because the documentation recommends version >= 0.87 and Ubuntu 14.04 only packages version 0.80 out of the box. I have created the rbd volume from the host with no problem, so I think that the ceph-common package on the host is compatible with the ceph inside the ceph/demo image.
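For context, creating and formatting the image from the host looked roughly like this (the size and exact flags here are illustrative, not the commands actually used):

```
# Create a 1 GiB image named 'foo' in the default 'rbd' pool (size is illustrative).
rbd create foo --size 1024
# Map it on the host, format it as ext4, then unmap it so Kubernetes can map it later.
sudo rbd map foo --pool rbd    # prints the device, typically /dev/rbd0
sudo mkfs.ext4 /dev/rbd0
sudo rbd unmap /dev/rbd0
```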
CC @rootfs @kubernetes/rh-storage
Hi @rootfs, as explained above I am using gcr.io/google_containers/hyperkube-amd64:v1.2.1 as the kubelet image and there was no modprobe installed inside, so I manually ran apt-get install kmod inside it. I'm not sure if /sbin is in the path, so I will try symlinking in a few hours when I wake up.
After installing modprobe, the container creation is now stuck at another step: Unable to mount volumes for pod "rbd2_default(199392ab-fc91-11e5-8533-28d2444cbe8c)": rbd: map failed, executable file not found in $PATH. This time it is the rbd executable that is missing. After redeploying the kubelet container with the rbd tool available, every 25 seconds or so the foo rbd device is mapped again.
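A simple way to observe this remapping from the host (assuming ceph-common is installed there) is to poll the kernel's mapping table:

```
# Show the currently mapped rbd images every few seconds; a changing device id
# for 'foo' indicates the kubelet keeps re-mapping it.
watch -n 5 rbd showmapped
```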
I finally made it work by applying several hacks:

1. installing ceph-common inside the kubelet (hyperkube) image;
2. bind-mounting /sbin/modprobe, /lib/modules and /etc/ceph from the host into the kubelet container;
3. bind-mounting the mapped /dev/rbd0 device from the host into the kubelet container.
Workaround 3 is necessary because the getDevFromImageAndPool helper returns a device path such as /dev/rbd0, which exists on the host but not inside the kubelet container. I traced the algorithm of the getDevFromImageAndPool helper using a shell in the kubelet container:

```
# cd /sys/bus/rbd/devices/
/sys/bus/rbd/devices# ls
0
/sys/bus/rbd/devices# name=0
/sys/bus/rbd/devices# cd ${name}
/sys/bus/rbd/devices/0# cat pool name
rbd
foo
/sys/bus/rbd/devices/0# devicePath="/dev/rbd${name}"   # pool 'rbd' and image 'foo' match
/sys/bus/rbd/devices/0# echo ${devicePath}
/dev/rbd0
```

For the moment I will prepare a custom hyperkube image with my workarounds, but this is hackish.
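In other words, the lookup effectively amounts to something like the following shell sketch (the pool and image names are taken from the transcript above; this is a paraphrase of the helper's behaviour, not the actual Kubernetes code):

```
#!/bin/sh
# Scan /sys/bus/rbd/devices for an entry whose pool and image match, and
# derive the device path from the entry's id, exactly as traced above.
pool_wanted="rbd"
image_wanted="foo"
for dir in /sys/bus/rbd/devices/*; do
    [ -e "$dir" ] || continue
    id=$(basename "$dir")
    pool=$(cat "$dir/pool")
    image=$(cat "$dir/name")
    if [ "$pool" = "$pool_wanted" ] && [ "$image" = "$image_wanted" ]; then
        # This path is valid on the host; inside the kubelet container it only
        # exists if something bind-mounts it (hence workaround 3).
        echo "/dev/rbd${id}"
        break
    fi
done
```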
Nice hack to get it running. For step 3, what if you bind-mount the host /dev (i.e. -v /dev:/dev), or is there an issue with that?
If I bind-mount host /dev into the kubelet container, something else breaks. EDIT: after checking the issue tracker, the breakage resulting from bind-mounting /dev from the host into the kubelet container is documented in #18230.

What I finally did was to wrap the hyperkube image like this:

```
# Apply hacks from https://github.com/kubernetes/kubernetes/issues/23924#issuecomment-206803980
# so that pods that use rbd persistent resources work in the single-node docker setup.
# Build with the following command: `docker build -t custom/hyperkube-amd64:v1.2.1 .`
FROM gcr.io/google_containers/hyperkube-amd64:v1.2.1
RUN curl https://raw.githubusercontent.com/ceph/ceph/master/keys/release.asc | apt-key add - && \
    echo deb http://download.ceph.com/debian-hammer/ jessie main | tee /etc/apt/sources.list.d/ceph.list && \
    apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -q -y ceph-common && \
    apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
```

And then run the kubelet container like this:

```
# /sys is mounted rw so that mounts can be made from inside the container;
# /sbin/modprobe and /lib/modules are mounted so that `modprobe rbd` works
# without installing anything in the container; /etc/ceph provides the ceph
# config from the host; /dev/rbd0 is the workaround for point 3 above;
# custom/hyperkube-amd64 is the image with ceph-common vendored in.
docker run \
  --volume=/:/rootfs:ro \
  --volume=/sys:/sys:rw \
  --volume=/var/lib/docker/:/var/lib/docker:rw \
  --volume=/var/lib/kubelet/:/var/lib/kubelet:rw \
  --volume=/var/run:/var/run:rw \
  --volume=/sbin/modprobe:/sbin/modprobe:ro \
  --volume=/lib/modules:/lib/modules:ro \
  --volume=/etc/ceph:/etc/ceph:ro \
  --volume=/dev/rbd0:/rootfs/dev/rbd0:ro \
  --net=host \
  --pid=host \
  --privileged=true \
  --name=kubelet \
  -d \
  custom/hyperkube-amd64:v${K8S_VERSION} \
  /hyperkube kubelet \
  --containerized \
  --hostname-override="127.0.0.1" \
  --address="0.0.0.0" \
  --api-servers=http://localhost:8080 \
  --config=/etc/kubernetes/manifests \
  --cluster-dns=10.0.0.10 \
  --cluster-domain=cluster.local \
  --allow-privileged=true --v=2
```

Then I can use rbd persistent volumes from my dockerized kubernetes setup.
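With those mounts in place, a quick sanity check from the host looks something like this (the container name and device path follow the command above; the exact checks are only a suggestion):

```
# modprobe and the rbd CLI should now resolve inside the kubelet container...
docker exec kubelet /sbin/modprobe rbd
docker exec kubelet rbd --version
# ...and the host's mapped device should be visible under the container's /rootfs.
docker exec kubelet ls -l /rootfs/dev/rbd0
```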
I hope that ptmx bug is fixed. Is binding …
@jperville what docker version are you using?
@pmorie: Using docker version 1.9.1 (cannot use 1.10 because of layer format changes that are incompatible with a tool we are using). @rootfs: the ptmx bug is still here (but there is a workaround); regarding …
@jperville OK, the issue with pseudoterminals is supposed to be resolved in 1.10.
I ended up adding a wrapper into my custom hyperkube image for now. Would adding some code to the getDevFromImageAndPool helper to understand the …
@jperville In containerized OpenShift, the node (kubelet) bind-mounts /sys; rbd map works in this environment.
Hi guys, I am also having this issue, which I can work around by adding args to kubelet to mount /sbin/modprobe and /lib/modules. Is there any plan to fix this so that we don't have to manually feed this workaround to kubelet?
@kokhang which kubelet distribution are you using?
@rootfs I'm using the kubelet that comes with CoreOS 4.7.3.
I am not quite familiar with that setup. Do you use docker or rkt? Can you post …
Taking this conversation with rootfs offline. But I'm still curious whether there are any plans to enable these modules by default for RBD block storage.
@jperville I still got stuck in …
Also dealing with "Could not map image: Timeout after 10s". Is there a solution?
Hello kubernetes,
I am trying to follow the instructions from the rbd example. After successfully booting a ceph demo cluster (sudo ceph -s on the host displays HEALTH_OK) and manually creating a foo rbd volume formatted in ext4, I cannot start any pod that uses rbd volumes. The rbd2 pod never starts; it stays in ContainerCreating state, as shown by the kubectl get pod output below. I am using kubernetes 1.2.1 with docker 1.9.1 on an ubuntu 14.04 amd64 host using the single-node docker cluster.

The output of kubectl describe pods rbd2 is the following:

…

In the kubelet docker log, I can see the following trace, repeated multiple times:

…

As I understand the above logs, the kubelet container is trying to run something like modprobe rbd inside itself (or somewhere else?) and that fails; I noticed that there is no modprobe command inside the kubelet container (image: gcr.io/google_containers/hyperkube-amd64:v1.2.1), so I manually ran apt-get update && apt-get install kmod to make that command appear inside the container, but without success.

My files look like this:

…

I have checked that 172.17.42.1:6789 is reachable from the kubernetes cluster (because of using --net=host when booting the kubelet container).

How can I mount RBD volumes inside container as of kubernetes 1.2.1?
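For readers following along, a pod using an rbd volume along the lines of the rbd example looks roughly like this. The monitor address, pool and image name follow the report above; everything else (container image, mount path, keyring location) is an illustrative sketch, not the author's actual manifest:

```
# Illustrative only: a minimal pod mounting the 'foo' rbd image.
cat <<'EOF' | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: rbd2
spec:
  containers:
  - name: rbd-rw
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: rbdpd
      mountPath: /mnt/rbd
  volumes:
  - name: rbdpd
    rbd:
      monitors:
      - 172.17.42.1:6789
      pool: rbd
      image: foo
      user: admin
      keyring: /etc/ceph/keyring
      fsType: ext4
EOF
```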