Closed
Description
Hi @kubernetes/sig-cluster-lifecycle
I tried to follow the docs for kubeadm on CentOS 7.1.
It seems that kubeadm init blocks waiting for the 'control plane to become ready' even though all the containers are running.
# kubeadm init --token foobar.1234
<util/tokens> validating provided token
<master/tokens> accepted provided token
<master/pki> created keys and certificates in "/etc/kubernetes/pki"
<util/kubeconfig> created "/etc/kubernetes/kubelet.conf"
<util/kubeconfig> created "/etc/kubernetes/admin.conf"
<master/apiclient> created API client configuration
<master/apiclient> created API client, waiting for the control plane to become ready
Here are the running containers on the same master machine:
$ sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
30aff4f98753 gcr.io/google_containers/kube-apiserver-amd64:v1.4.0 "/usr/local/bin/kube-" 3 minutes ago Up 3 minutes k8s_kube-apiserver.c44dda3f_kube-apiserver-k8ss-head_kube-system_6b83c87a9bf5c380c6f948f428b23dd1_408af885
8fd1842776ab gcr.io/google_containers/kube-controller-manager-amd64:v1.4.0 "/usr/local/bin/kube-" 3 minutes ago Up 3 minutes k8s_kube-controller-manager.a2978680_kube-controller-manager-k8ss-head_kube-system_5f805ed49f6fd9f0640be470e3dea2a2_7ac41d83
32b7bfb55dc0 gcr.io/google_containers/kube-scheduler-amd64:v1.4.0 "/usr/local/bin/kube-" 3 minutes ago Up 3 minutes k8s_kube-scheduler.1b5cde04_kube-scheduler-k8ss-head_kube-system_586d16be4ecaac95b0162c5d11921019_0ca14012
8a1797fdb1df gcr.io/google_containers/etcd-amd64:2.2.5 "etcd --listen-client" 8 minutes ago Up 8 minutes k8s_etcd.4ffa9846_etcd-k8ss-head_kube-system_42857e4bd57d261fc438bcb2a87572b9_f1b219d3
292bcafb3316 gcr.io/google_containers/pause-amd64:3.0 "/pause" 8 minutes ago Up 8 minutes k8s_POD.d8dbe16c_kube-controller-manager-k8ss-head_kube-system_5f805ed49f6fd9f0640be470e3dea2a2_fe9592ab
ab929dd920a2 gcr.io/google_containers/pause-amd64:3.0 "/pause" 8 minutes ago Up 8 minutes k8s_POD.d8dbe16c_kube-apiserver-k8ss-head_kube-system_6b83c87a9bf5c380c6f948f428b23dd1_c93e3a3b
71c28763aeab gcr.io/google_containers/pause-amd64:3.0 "/pause" 8 minutes ago Up 8 minutes k8s_POD.d8dbe16c_kube-scheduler-k8ss-head_kube-system_586d16be4ecaac95b0162c5d11921019_eb12a865
615cb42e0108 gcr.io/google_containers/pause-amd64:3.0 "/pause" 8 minutes ago Up 8 minutes k8s_POD.d8dbe16c_etcd-k8ss-head_kube-system_42857e4bd57d261fc438bcb2a87572b9_891fc5db
I tried to join a node but I get a connection refused error, even though there is no firewall...
# kubeadm join --token foobar.1234 <master_ip>
<util/tokens> validating provided token
<node/discovery> created cluster info discovery client, requesting info from "http://185.19.30.178:9898/cluster-info/v1/?token-id=foobar"
error: <node/discovery> failed to request cluster info [Get http://MASTER_IP:9898/cluster-info/v1/?token-id=foobar: dial tcp MASTER_IP:9898: getsockopt: connection refused]
and now I am actually wondering if the init is blocking, waiting for nodes to join. According to the docs it is not blocking, but the logs of kubeadm seem to indicate that it is.
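For anyone hitting the same join failure: a quick way to check from the node whether anything is answering on the discovery port is a plain TCP probe. This is only a debugging sketch, not part of kubeadm; MASTER_IP is the placeholder from the output above, and it assumes bash and coreutils `timeout` are available:

```shell
#!/bin/bash
# Probe the cluster-info discovery port that `kubeadm join` contacts.
# MASTER_IP below is a placeholder -- substitute the real master address.
MASTER_IP=127.0.0.1
PORT=9898

# bash's /dev/tcp pseudo-device attempts a TCP connect; the
# "connection refused" from kubeadm join corresponds to this failing.
if timeout 5 bash -c "exec 3<>/dev/tcp/$MASTER_IP/$PORT" 2>/dev/null; then
    echo "port $PORT is open on $MASTER_IP"
else
    echo "port $PORT is NOT reachable on $MASTER_IP"
fi
```

If the probe also fails when run on the master itself, nothing is bound to the hostPort there; if it succeeds locally but not from the node, look at the network path between the machines instead.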
Activity
DaspawnW commented on Sep 27, 2016
Same issue for me on an AWS installation, but I can't see any Docker containers running. Some information is here:
Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-38-generic x86_64)
Using http_proxy and https_proxy
Looked at the logs of the API server. It returns with an exception:
yoojinl commented on Sep 27, 2016
@sebgoa Looks similar to #33541, do you have SELinux enabled?
Try running docker ps -a | grep discovery, get the id of the kube-discovery container, then run docker logs <id> to see if there is a permission denied error for the /tmp/secret directory.
oz123 commented on Sep 27, 2016
@RustyRobot, disabling SELinux on Ubuntu 16.04 does solve the hanging problem.
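For reference, the workaround looks roughly like this on a CentOS-style system. A sketch only: it assumes the stock /etc/selinux/config layout, and on Ubuntu the SELinux tooling is optional, so adjust accordingly:

```shell
# Put SELinux into permissive mode for the current boot only.
sudo setenforce 0

# Persist the change across reboots (assumes the stock config layout).
sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config

# Verify the current mode.
getenforce
```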
document problem with SELINUX
sebgoa commented on Sep 27, 2016
OK, disabling SELinux got me further; the kubeadm init finished. But now there is nothing listening on 9898. What component is supposed to be listening on that port for cluster joins?
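One way to answer that on the master is to list the listening sockets and filter for the port. A sketch, not kubeadm tooling: ss is from iproute2, and netstat -tlnp is the older equivalent:

```shell
# List listening TCP sockets and filter for the discovery hostPort;
# the || branch makes the empty case explicit instead of silent.
sudo ss -tlnp | grep 9898 || echo "nothing listening on 9898"
```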
sebgoa commented on Sep 27, 2016
OK, so the discovery port is using a hostPort on 9898.
The logs on that pod return this:
I am following the docs
The DNS pod is not starting:
sebgoa commented on Sep 27, 2016
@errordeveloper looks like this might be right up your alley...
lukemarsden commented on Sep 27, 2016
@sebgoa can you try starting from scratch, following the instructions at http://deploy-preview-1321.kubernetes-io-vnext-staging.netlify.com/docs/getting-started-guides/kubeadm/ please?
oz123 commented on Sep 27, 2016
@lukemarsden, I followed the instructions you posted, and it seems that systemd is immediately starting kubelet:
yoojinl commented on Sep 27, 2016
@oz123 If you already ran kubeadm init, you need to start from scratch, i.e. remove the /etc/kubernetes and /var/lib/etcd directories. We have plans to introduce a --reset flag in the future, in order to do it automatically.
benmathews commented on Sep 27, 2016
There is an uninstall script referenced at http://deploy-preview-1321.kubernetes-io-vnext-staging.netlify.com/docs/getting-started-guides/kubeadm/. After running it, my init ran correctly again.
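For those without the script handy, the manual cleanup described in this thread amounts to something like the following sketch. The paths are the ones named above, and this predates any built-in kubeadm reset:

```shell
# Stop the kubelet so it does not keep restarting the static pods.
sudo systemctl stop kubelet

# Remove the state left behind by a previous `kubeadm init`.
sudo rm -rf /etc/kubernetes /var/lib/etcd

# Bring the kubelet back before re-running `kubeadm init`.
sudo systemctl start kubelet
```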
errordeveloper commented on Sep 27, 2016
I think this can be closed now, as soon as new packages become available.
52 remaining items
kenzhaoyihui commented on Jan 3, 2017
@Dmitry1987 Yeah, thanks for your help. I will check the log again.
krishvoor commented on Jan 14, 2017
Facing the same issue as well.
SELinux is disabled.
OS: Ubuntu 16.04
ARCH: ppc64le
iptables/firewalld: disabled
Natively compiled Kubernetes (release 1.5) and tried kubeadm init. It hangs here:
[kubeadm] WARNING: kubeadm is in alpha, please do not use it for production clusters.
[preflight] Running pre-flight checks
[preflight] WARNING: kubelet service does not exist
[init] Using Kubernetes version: v1.5.2
[certificates] Generated Certificate Authority key and certificate.
[certificates] Generated API Server key and certificate
[certificates] Generated Service Account signing keys
[certificates] Created keys and certificates in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[apiclient] Created API client, waiting for the control plane to become ready
kenzhaoyihui commented on Jan 14, 2017
@harsha544 Could you attach the log from /var/log/messages and the docker images output?
krishvoor commented on Jan 14, 2017
@kenzhaoyihui Nothing in /var/log/syslog. Enclosing the docker images output:
docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
kenzhaoyihui commented on Jan 14, 2017
@harsha544 https://github.com/kenzhaoyihui/kubeadm-images-gcr.io/blob/master/pull_kubernetes_images.sh
The shell script pulls all the Docker images that are needed; could you pull all the images and then execute kubeadm init?
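As an alternative to any third-party script, the images can be pre-pulled straight from gcr.io. A sketch, using the v1.4-era amd64 names visible in the docker ps output earlier in this thread; substitute the tags and arch (e.g. ppc64le) for your build:

```shell
#!/bin/sh
# Pre-pull the control-plane images so `kubeadm init` does not stall
# waiting on slow or blocked downloads. Image names/tags here are the
# v1.4-era ones from earlier in the thread -- adjust for your version.
for img in \
    kube-apiserver-amd64:v1.4.0 \
    kube-controller-manager-amd64:v1.4.0 \
    kube-scheduler-amd64:v1.4.0 \
    etcd-amd64:2.2.5 \
    pause-amd64:3.0
do
    docker pull "gcr.io/google_containers/$img"
done
```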
krishvoor commented on Jan 15, 2017
@kenzhaoyihui Thanks for the URL. I tweaked the script to pull ppc64le Docker images; however, not all of the images appear to be present.
ozbillwang commented on Jan 15, 2017
@harsha544
Be careful with the link and script @kenzhaoyihui provided; that script tries to pass off his own images as Google's.
You'd better not run it.
In fact the solution has already been provided in this ticket, and it fixed my issue: it is in the comment @benmathews made on Sep 28, 2016. If you missed that comment, you should give it a try.
krishvoor commented on Jan 17, 2017
@SydOps I was cautious enough to pull Docker images from gcr.io/google_containers/ppc64le...
Given that this is the ppc64le ARCH, I built the binaries from the GitHub source, and hence I can't restart via systemctl. However, I'm following the Kubernetes Ansible approach (https://github.com/kubernetes/contrib/tree/master/ansible) to deploy the Kubernetes cluster among my nodes.
luxas commented on Jan 17, 2017
@harsha544 Please open a new issue in kubernetes/kubeadm about this.
It's fully possible to solve it, but requires some manual hacking until we've got ppc64le mainline again, see: #38926
mohamedbouchriha commented on Feb 20, 2017
Thanks @saidiahd, it works for me.
shufanhao commented on Apr 30, 2017
I also hit this issue, and I have disabled SELinux.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[apiclient] Created API client, waiting for the control plane to become ready