Kubernetes: taints on the master nodes, but the pod is not scheduled on the worker node
I have an issue on my Kubernetes (K3s) cluster:
0/4 nodes are available: 2 node(s) didn't match Pod's node affinity/selector, 2 node(s) had taint {k3s-controlplane: true}, that the pod didn't tolerate.
To describe how that happened: I have 4 K3s servers, with 3 control-plane nodes and 1 worker.
Initially, no node had any taint, so every pod could be scheduled on any node.
I wanted to change that and taint my master nodes, so I added:
Taints: k3s-controlplane=true:NoSchedule on 2 nodes
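For reference, a taint like this is usually set with kubectl along these lines (the node name below is a placeholder, not from the original post):

kubectl taint nodes <master-node-name> k3s-controlplane=true:NoSchedule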
To test it, I restarted one deployment, and now that pod won't schedule.
As I understand it, the pod should be scheduled on the non-tainted nodes by default, but that does not seem to be the case.
For new deployments, it works great.
So I guess there is some history in my deployment that creates the issue. The deployment is fairly simple:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test
  labels:
    app: test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: test
    spec:
      nodeSelector:
        type: "slow"
      containers:
      - env:
        - name: PUID
          value: "1000"
        - name: GUID
          value: "1000"
        - name: TZ
          value: Europe/Paris
        - name: AUTO_UPDATE
          value: "true"
        image: test/test
        imagePullPolicy: Always
        name: test
        volumeMounts:
        - mountPath: /config
          name: vol0
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
          limits:
            memory: "256Mi"
            cpu: "500m"
      volumes:
      - name: vol0
        persistentVolumeClaim:
          claimName: test-config-lh
Comments (1)
Well, this particular deployment had a nodeSelector, "type=slow", which is the label carried by those two nodes...
If I check the node labels and taints:
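A standard way to check this (not necessarily the exact command from the original post; the node names come from the text below) is:

kubectl get nodes --show-labels
kubectl describe nodes baal-01 baal-02 | grep -i taints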
You can notice the label "type=slow" on the two nodes "baal-01" and "baal-02", and those two nodes also carry the NoSchedule taint.
So the deployment was trying to schedule its pod onto a node with the label "type=slow", and none of the schedulable nodes had that label.
Sorry, I missed it...
So no issue there...
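If the pod really is meant to run on those "type=slow" control-plane nodes, one way forward would be to add a toleration for that taint to the pod template, sketched here with the key and value from the question:

    spec:
      nodeSelector:
        type: "slow"
      tolerations:
      - key: "k3s-controlplane"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"

Alternatively, labelling the worker node with type=slow, or dropping the nodeSelector, would let the pod land on an untainted node.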