The pods of kube-flannel is in ERROR state in agent node after installing cluster by kubeadm #39701

Closed
maweina opened this issue Jan 11, 2017 · 9 comments

Comments

@maweina

maweina commented Jan 11, 2017

Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see http://kubernetes.io/docs/troubleshooting/.):
request

What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.):
kubeadm, flannel

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT

Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1", GitCommit:"82450d03cb057bab0950214ef122b67c83fb11df", GitTreeState:"clean", BuildDate:"2016-12-14T00:57:05Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1", GitCommit:"82450d03cb057bab0950214ef122b67c83fb11df", GitTreeState:"clean", BuildDate:"2016-12-14T00:52:01Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: Vagrant + VirtualBox on Windows 10
  • OS (e.g. from /etc/os-release): CentOS 7
  • Kernel (e.g. uname -a): 3.10.0-327.4.5.el7.x86_64
  • Install tools: kubeadm
  • Others:

What happened:
In step (4/4), "Joining your nodes", of http://kubernetes.io/docs/getting-started-guides/kubeadm/, after the node joined the cluster, the kube-flannel pod on the newly added agent node is in the ERROR state.

What you expected to happen:
The kube-flannel pod is in the Running state.

How to reproduce it (as minimally and precisely as possible):
Follow the instructions at http://kubernetes.io/docs/getting-started-guides/kubeadm/ and choose flannel as the pod network. In step 2, specify both --api-advertise-addresses= and --pod-network-cidr=10.244.0.0/16 when running "kubeadm init". In step 4, edit kube-flannel.yml and change the flannel command to " command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr", "--iface=enp0s8" ]", because enp0s8 is the correct network interface in the Vagrant environment.

Anything else we need to know:
The docker logs of the failed container show:

E0110 23:48:52.242726 1 main.go:127] Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-ds-xbbqs': Get https://10.96.0.1:443/api/v1/namespaces/kube-system/pods/kube-flannel-ds-xbbqs: dial tcp 10.96.0.1:443: i/o timeout

10.96.0.1 is the service IP of the Kubernetes API server.
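For context, here is a minimal sketch of where an address like 10.96.0.1 typically comes from. It assumes the cluster uses kubeadm's default service CIDR of 10.96.0.0/12 (the report does not state the CIDR), and that the `kubernetes` service is given the first usable address of that range. The helper name `first_service_ip` is hypothetical.

```python
import ipaddress


def first_service_ip(service_cidr: str) -> str:
    """Return the first usable address of a service CIDR."""
    network = ipaddress.ip_network(service_cidr)
    # Index 0 is the network address itself; index 1 is the first usable host.
    return str(network[1])


if __name__ == "__main__":
    # Assumed CIDR: kubeadm's default service subnet (not stated in the report).
    print(first_service_ip("10.96.0.0/12"))
```

This address is virtual: no interface owns it, and kube-proxy's iptables rules rewrite traffic sent to it so that it reaches the real API server endpoint.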

On the newly added agent node, the related iptables rules are configured as follows:

Chain KUBE-SEP-MUTNCW54Z2U4XKXF (2 references)
num  target          prot opt source    destination
1    KUBE-MARK-MASQ  all  --  k8snode1  anywhere     /* default/kubernetes:https */
2    DNAT            tcp  --  anywhere  anywhere     /* default/kubernetes:https */ recent: SET name: KUBE-SEP-MUTNCW54Z2U4XKXF side: source mask: 255.255.255.255 tcp to:10.211.56.101:6443

k8snode1 is the hostname of the master node and 10.211.56.101 is its IP address.

On the newly added agent node, I tried to connect to 10.96.0.1:443 with nc:

"nc -4 -v 10.96.0.1 443" fails.
"nc -4 -v -s 10.211.56.102 10.96.0.1 443" works. 10.211.56.102 is the IP address of the enp0s8 interface on the newly added agent node.

I am just wondering why flanneld does not bind to 10.211.56.102 when communicating with kube-apiserver after "--iface=enp0s8" is specified on the command line.
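The two nc runs differ only in the source address, so a small probe can help narrow this down. The sketch below is illustrative and not part of flannel. `default_source_ip` reports which local address the kernel would choose for a destination, and `can_connect` repeats the nc test with or without an explicit source address. Both function names are hypothetical. One plausible explanation, not confirmed in this thread, is that the default route on a Vagrant box points at the NAT interface, so unbound connections leave with that interface's address.

```python
import socket
from typing import Optional


def default_source_ip(dest_ip: str, dest_port: int = 443) -> str:
    """Return the local address the kernel would pick to reach dest_ip."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # connect() on a UDP socket sends no packets; it only asks the
        # kernel to select a route and a source address.
        sock.connect((dest_ip, dest_port))
        return sock.getsockname()[0]


def can_connect(host: str, port: int, source_ip: Optional[str] = None,
                timeout: float = 3.0) -> bool:
    """Try a TCP connection, optionally binding the local end to source_ip."""
    # Port 0 lets the kernel choose an ephemeral local port.
    source = (source_ip, 0) if source_ip else None
    try:
        with socket.create_connection((host, port), timeout=timeout,
                                      source_address=source):
            return True
    except OSError:
        # Covers timeouts, refused connections and unusable source addresses.
        return False
```

On the agent node, `default_source_ip("10.96.0.1")` could be compared with 10.211.56.102, while `can_connect("10.96.0.1", 443)` and `can_connect("10.96.0.1", 443, "10.211.56.102")` mirror the two nc commands above.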

I suspect that if, in chain KUBE-SEP-MUTNCW54Z2U4XKXF, the rule "KUBE-MARK-MASQ all -- k8snode1 anywhere" were changed to "KUBE-MARK-MASQ all -- anywhere anywhere", the issue might be resolved. However, I have not tried it yet.

@maweina
Author

maweina commented Jan 11, 2017

After checking the code of flannel and kubeclient, I found that "--iface" is not used by flanneld to communicate with kube-apiserver. In fact, the caller of kubeclient cannot specify a source IP address.

After hardcoding the IP address of kube-apiserver in kube-flannel.yml as follows, the issue is worked around. However, we still need an official solution for installing Kubernetes with kubeadm in a Vagrant + VirtualBox environment.

...
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel-git:v0.6.1-28-g5dde68d-amd64
        command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr", "--iface=enp0s8" ]
        securityContext:
          privileged: true
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: KUBERNETES_SERVICE_HOST
          value: "10.211.56.101" # IP address of the host where kube-apiserver is running
        - name: KUBERNETES_SERVICE_PORT
          value: "6443"
...
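The workaround relies on how in-cluster Kubernetes clients conventionally locate the API server: they read KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT from the container environment. The sketch below only illustrates that lookup and is not flannel's or client-go's actual code. The function name `in_cluster_api_url` is hypothetical.

```python
import os
from typing import Mapping


def in_cluster_api_url(env: Mapping[str, str] = os.environ) -> str:
    """Build the API server URL from the standard in-cluster variables."""
    host = env.get("KUBERNETES_SERVICE_HOST")
    port = env.get("KUBERNETES_SERVICE_PORT")
    if not host or not port:
        raise RuntimeError("KUBERNETES_SERVICE_HOST/PORT are not set")
    # Bracket IPv6 literals so the resulting URL stays valid.
    if ":" in host:
        host = f"[{host}]"
    return f"https://{host}:{port}"


if __name__ == "__main__":
    # Values injected by default, as seen in the failing log line.
    injected = {"KUBERNETES_SERVICE_HOST": "10.96.0.1",
                "KUBERNETES_SERVICE_PORT": "443"}
    # Values set explicitly in the edited kube-flannel.yml.
    overridden = {"KUBERNETES_SERVICE_HOST": "10.211.56.101",
                  "KUBERNETES_SERVICE_PORT": "6443"}
    print(in_cluster_api_url(injected))
    print(in_cluster_api_url(overridden))
```

With the override, the client talks to the master's real address directly and no longer depends on the service IP being reachable from the node's default source address. The trade-off is that the manifest is tied to one master IP.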

@snavien

snavien commented Feb 8, 2017

Was that the only line you changed? Since then the image and the pod name have been updated, and there are no KUBERNETES_SERVICE_HOST settings available. The current version is giving me issues, and just adding the iface is not sufficient.

@luxas
Member

luxas commented May 30, 2017

This is a flannel issue, not a Kubernetes one. Please reopen it in the flannel or kubeadm repo if you can still reproduce it.

@0xmichalis
Contributor

Reopened on the kubeadm repo: kubernetes/kubeadm#287

@ReSearchITEng

I am facing the same issue: I cannot connect to the API server at the exported address:
KUBERNETES_PORT=tcp://10.96.0.1:443

@funky81

funky81 commented Mar 4, 2018

Any solutions for this?

@huangjiasingle

@funky81 if firewalld, SELinux and swap are all disabled and the same error still reproduces, please check whether kube-proxy is running on the node.

@pouryameh

I'm having the same issue
kube-proxy is working properly
iptables is on.
I have no idea

@pouryameh

After checking the code of flannel and kubeclient, I found that "--iface" is not used by flanneld to communicate with kube-apiserver. In fact, the caller of kubeclient cannot specify a source IP address.

After hardcoding the IP address of kube-apiserver in kube-flannel.yml as follows, the issue is worked around. However, we still need an official solution for installing Kubernetes with kubeadm in a Vagrant + VirtualBox environment.

...
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel-git:v0.6.1-28-g5dde68d-amd64
        command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr", "--iface=enp0s8" ]
        securityContext:
          privileged: true
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: KUBERNETES_SERVICE_HOST
          value: "10.211.56.101" # IP address of the host where kube-apiserver is running
        - name: KUBERNETES_SERVICE_PORT
          value: "6443"
...

Actually, adding those lines to the .yaml file worked.
