A question about k8s evicting images and pods on a node
From the official docs I learned that you can set eviction thresholds, and once a threshold is reached the kubelet starts evicting pods and reclaiming images on the node.
The problem: I set the eviction thresholds as follows, in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf.
The file contents are below; the main addition is the eviction-threshold line:
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
#threshold of eviction
Environment="KUBELET_EVICTION_ARGS=--eviction-hard=memory.available<100Mi,nodefs.available<1Gi,imagefs.available<1Gi"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/default/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS $KUBERLET_EVICTION_ARGS
The thresholds I set mean: evict when available memory falls below 100Mi, or node filesystem space below 1Gi, or image filesystem space below 1Gi. My node's disk usage is as follows:
Filesystem Size Used Avail Use% Mounted on
udev 63G 0 63G 0% /dev
tmpfs 13G 91M 13G 1% /run
/dev/sdc2 439G 384G 34G 93% /
tmpfs 63G 23G 41G 36% /dev/shm
tmpfs 5.0M 4.0K 5.0M 1% /run/lock
tmpfs 63G 0 63G 0% /sys/fs/cgroup
/dev/sdf 917G 389G 482G 45% /public
/dev/sdd 917G 828G 43G 96% /data
/dev/sde 917G 269G 603G 31% /weights
/dev/sdb 917G 9.0G 862G 2% /data2
/dev/sdc1 511M 3.7M 508M 1% /boot/efi
/dev/sda1 1.9T 484G 1.3T 28% /data1
tmpfs 13G 32K 13G 1% /run/user/108
tmpfs 13G 0 13G 0% /run/user/1009
tmpfs 13G 0 13G 0% /run/user/1005
tmpfs 13G 0 13G 0% /run/user/1008
tmpfs 13G 0 13G 0% /run/user/1002
The root filesystem is /dev/sdc2 at 93% used, but its 34G of free space is obviously far more than 1Gi, so why do my pods keep getting evicted?
On the master node, kubectl get pod --all-namespaces shows:
kube-system coredns-5644d7b6d9-d6xgf 1/1 Running 0 20d
kube-system coredns-5644d7b6d9-w76xz 1/1 Running 0 25d
kube-system etcd-vrlab 1/1 Running 0 25d
kube-system kube-apiserver-vrlab 1/1 Running 0 8d
kube-system kube-controller-manager-vrlab 1/1 Running 0 8d
kube-system kube-flannel-ds-amd64-9wzrb 0/1 Evicted 0 7s
kube-system kube-flannel-ds-amd64-c46qs 1/1 Running 0 25d
kube-system kube-flannel-ds-amd64-rvlzg 1/1 Running 0 3d8h
kube-system kube-proxy-7mz45 0/1 Evicted 0 11m
kube-system kube-proxy-csglg 1/1 Running 0 24h
kube-system kube-proxy-wpcl7 1/1 Running 0 19d
kube-system kube-scheduler-vrlab 1/1 Running 4 11d
kube-system metrics-server-6fd69cd864-2gqxb 1/1 Running 0 12d
kube-system nvidia-device-plugin-daemonset-9x97f 0/1 Evicted 0 13m
kube-system nvidia-device-plugin-daemonset-vlbqv 1/1 Running 0 3d8h
kubernetes-dashboard dashboard-metrics-scraper-7cf5979dbb-27vf9 1/1 Running 0 30h
kubernetes-dashboard kubernetes-dashboard-594549879-w7b2z 1/1 Running 0 30h
Running kubectl describe on one of the evicted pods gives these events:
node.kubernetes.io/unschedulable:NoSchedule
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Evicted 4m36s kubelet, amax-server The node had condition: [DiskPressure].
Normal Scheduled 4m36s default-scheduler Successfully assigned kube-system/kube-flannel-ds-amd64-2svn9 to amax-server
So it really is DiskPressure. But I can't see where the problem is: as shown above, flannel and kube-proxy keep getting evicted. Why? Is something configured wrong?
Comments (1)
Solution found. What I reconfigured is /var/lib/kubelet/config.yaml; the following four fields control whether pods get evicted:
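(A sketch of the evictionHard block; the values below are the kubelet defaults rather than my exact settings, so substitute whatever limits fit your node.)

# excerpt from /var/lib/kubelet/config.yaml
evictionHard:
  memory.available: "100Mi"   # evict pods once free memory drops below 100Mi
  nodefs.available: "10%"     # evict pods once the node filesystem has less than 10% free
  nodefs.inodesFree: "5%"     # evict pods once node-filesystem inodes run low
  imagefs.available: "15%"    # evict pods once the image filesystem has less than 15% free

These defaults also explain the original symptom: with / at 93% used, only about 34G of 439G (roughly 7.7%) is free, which is below the default nodefs.available<10% threshold, so the node reports DiskPressure. Note as well that the ExecStart line in the dropin passes $KUBERLET_EVICTION_ARGS while the Environment line defines KUBELET_EVICTION_ARGS, so the custom --eviction-hard values were most likely never reaching the kubelet at all.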
The above is after my changes. Pods were no longer being evicted, but image files were still being deleted for no obvious reason. Reading the yaml carefully, I found it comes down to these two fields:
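(Again a sketch, shown with the kubelet's default values; these are the image garbage-collection thresholds, expressed as percent disk usage on the image filesystem.)

# excerpt from /var/lib/kubelet/config.yaml
imageGCHighThresholdPercent: 85   # image GC starts once disk usage rises above 85%
imageGCLowThresholdPercent: 80    # GC deletes unused images until usage falls back below 80%

With / already at 93% used, well past the default 85% high-water mark, the kubelet kept garbage-collecting unused images, which is why image files seemed to vanish on their own.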
At first I assumed the imagefs entry among the pod-eviction fields was the reason images were being removed; in fact it is these GC percentages. Just set them to whatever your disk needs.
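One last note: changes to /var/lib/kubelet/config.yaml only take effect after restarting the kubelet (systemctl restart kubelet); if you also edited the systemd dropin, run systemctl daemon-reload first.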