Difference between K8s Jobs and Pods when using hostname + subdomain
I have a K8s cluster where everything is deployed with Helm 3.
- I need to access a K8s Job, defined in a YAML file created by Helm, while it is running.
The kubectl version:
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.6", GitCommit:"d921bc6d1810da51177fbd0ed61dc811c5228097", GitTreeState:"clean", BuildDate:"2021-10-27T17:50:34Z", GoVersion:"go1.16.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.6", GitCommit:"d921bc6d1810da51177fbd0ed61dc811c5228097", GitTreeState:"clean", BuildDate:"2021-10-27T17:44:26Z", GoVersion:"go1.16.9", Compiler:"gc", Platform:"linux/amd64"}
Helm version:
version.BuildInfo{Version:"v3.3.4", GitCommit:"a61ce5633af99708171414353ed49547cf05013d", GitTreeState:"clean", GoVersion:"go1.14.9"}
I followed the Kubernetes documentation on DNS for Pods and Services (https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/).
It works fine for a Pod, but not for a Job.
As explained there, you put hostname and subdomain in the Pod's YAML and add a Service that holds the domain...
- I also need to check whether it is running. For a Pod there is a Ready condition:
kubectl wait pod/pod-name --for=condition=ready ...
For a Job there is no Ready condition (even while the Pod behind it is running).
How can I check the state of the Pod behind the Job (i.e. that the Job is effectively running), and how can I use hostname + subdomain for Jobs?
My code follows (I removed some security-related fields, but it is otherwise the same; it is a bit involved).
I create a listener Job that runs and listens, plus a tester Job that needs to run a curl command against it; that only works if the tester can reach the Pod behind the listener Job.
Listener (this is the Job whose Pod I need to reach):
What I added is the hostname and subdomain, which work for a Pod but not for a Job. If this were a plain Pod there would be no problem.
I also realized that the name of the Pod created by the Job gets an automatic hash suffix.
apiVersion: batch/v1
kind: Job
metadata:
name: {{ include "my-project.fullname" . }}-listener
namespace: {{ .Release.Namespace }}
labels:
name: {{ include "my-project.fullname" . }}-listener
app: {{ include "my-project.fullname" . }}-listener
component: {{ .Chart.Name }}
subcomponent: {{ .Chart.Name }}-listener
annotations:
"prometheus.io/scrape": {{ .Values.prometheus.scrape | quote }}
"prometheus.io/path": {{ .Values.prometheus.path }}
"prometheus.io/port": {{ .Values.ports.api.container | quote }}
spec:
template: #PodTemplateSpec (Core/V1)
spec: #PodSpec (core/v1)
hostname: {{ include "my-project.fullname" . }}-listener
subdomain: {{ include "my-project.fullname" . }}-listener-dmn
initContainers:
# appears twice - could be factored into helpers.tpl
- name: wait-mysql-exist-pod
image: {{ .Values.global.registry }}/{{ .Values.global.k8s.image }}:{{ .Values.global.k8s.tag | default "latest" }}
imagePullPolicy: IfNotPresent
env:
- name: MYSQL_POD_NAME
value: {{ .Release.Name }}-mysql
- name: COMPONENT_NAME
value: {{ .Values.global.mysql.database.name }}
command:
- /bin/sh
args:
- -c
- |-
while [ "$(kubectl get pod $MYSQL_POD_NAME 2>/dev/null | grep $MYSQL_POD_NAME | awk '{print $1;}')" \!= "$MYSQL_POD_NAME" ];do
echo 'Waiting for mysql pod to exist...';
sleep 5;
done
- name: wait-mysql-ready
image: {{ .Values.global.registry }}/{{ .Values.global.k8s.image }}:{{ .Values.global.k8s.tag | default "latest" }}
imagePullPolicy: IfNotPresent
env:
- name: MYSQL_POD_NAME
value: {{ .Release.Name }}-mysql
command:
- kubectl
args:
- wait
- pod/$(MYSQL_POD_NAME)
- --for=condition=ready
- --timeout=120s
- name: wait-mysql-has-db
image: {{ .Values.global.registry }}/{{ .Values.global.k8s.image }}:{{ .Values.global.k8s.tag | default "latest" }}
imagePullPolicy: IfNotPresent
env:
{{- include "k8s.db.env" . | nindent 12 }}
- name: MYSQL_POD_NAME
value: {{ .Release.Name }}-mysql
command:
- /bin/sh
args:
- -c
- |-
while [ "$(kubectl exec $MYSQL_POD_NAME -- mysql -uroot -p$MYSQL_ROOT_PASSWORD -e 'show databases' 2>/dev/null | grep $MYSQL_DATABASE | awk '{print $1;}')" \!= "$MYSQL_DATABASE" ]; do
echo 'Waiting for mysql database up...';
sleep 5;
done
containers:
- name: {{ include "my-project.fullname" . }}-listener
image: {{ .Values.global.registry }}/{{ .Values.image.repository }}:{{ .Values.image.tag | default "latest" }}
imagePullPolicy: {{ .Values.image.pullPolicy }}
env:
{{- include "k8s.db.env" . | nindent 12 }}
- name: SCHEDULER_DB
value: $(CONNECTION_STRING)
command: {{- toYaml .Values.image.entrypoint | nindent 12 }}
args: # some args ...
ports:
- name: api
containerPort: 8081
resources:
limits:
cpu: 1
memory: 1024Mi
requests:
cpu: 100m
memory: 50Mi
readinessProbe:
httpGet:
path: /api/scheduler/healthcheck
port: api
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 5
timeoutSeconds: 1
livenessProbe:
tcpSocket:
port: api
initialDelaySeconds: 120
periodSeconds: 10
timeoutSeconds: 5
volumeMounts:
- name: {{ include "my-project.fullname" . }}-volume
mountPath: /etc/test/scheduler.yaml
subPath: scheduler.yaml
readOnly: true
volumes:
- name: {{ include "my-project.fullname" . }}-volume
configMap:
name: {{ include "my-project.fullname" . }}-config
restartPolicy: Never
The service (for the subdomain):
apiVersion: v1
kind: Service
metadata:
name: {{ include "my-project.fullname" . }}-listener-dmn
spec:
selector:
name: {{ include "my-project.fullname" . }}-listener
ports:
- name: api
port: 8081
targetPort: 8081
type: ClusterIP
Role + RoleBinding (to enable access for the kubectl and curl commands):
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: {{ include "my-project.fullname" . }}-role
rules:
- apiGroups: [""] # "" indicates the core API group
resources: ["pods"]
verbs: ["get", "watch", "list", "update"]
- apiGroups: [""] # "" indicates the core API group
resources: ["pods/exec"]
verbs: ["create", "delete", "deletecollection", "get", "list", "patch", "update", "watch"]
- apiGroups: ["", "app", "batch"] # "" indicates the core API group
resources: ["jobs"]
verbs: ["get", "watch", "list"]
Role-Binding:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: {{ include "go-scheduler.fullname" . }}-rolebinding
subjects:
- kind: ServiceAccount
name: default
roleRef:
kind: Role
name: {{ include "go-scheduler.fullname" . }}-role
apiGroup: rbac.authorization.k8s.io
And finally, a tester Job that runs a curl command.
(For checking, I put tail -f instead and exec into the pod.)
apiVersion: batch/v1
kind: Job
metadata:
name: {{ include "my-project.fullname" . }}-test
namespace: {{ .Release.Namespace }}
labels:
name: {{ include "my-project.fullname" . }}-test
app: {{ include "my-project.fullname" . }}-test
annotations:
"prometheus.io/scrape": {{ .Values.prometheus.scrape | quote }}
"prometheus.io/path": {{ .Values.prometheus.path }}
"prometheus.io/port": {{ .Values.ports.api.container | quote }}
spec:
template: #PodTemplateSpec (Core/V1)
spec: #PodSpec (core/v1)
initContainers:
# appears twice - could be factored into helpers.tpl
- name: wait-sched-listener-exists
image: {{ .Values.global.registry }}/{{ .Values.global.k8s.image }}:{{ .Values.global.k8s.tag | default "latest" }}
imagePullPolicy: IfNotPresent
env:
- name: POD_NAME
value: {{ include "my-project.fullname" . }}-listener
command:
- /bin/sh
args:
- -c
- |-
while [ "$(kubectl get job $POD_NAME 2>/dev/null | grep $POD_NAME | awk '{print $1;}')" \!= "$POD_NAME" ];do
echo 'Waiting for scheduler pod to exist ...';
sleep 5;
done
- name: wait-listener-running
image: {{ .Values.global.registry }}/{{ .Values.global.k8s.image }}:{{ .Values.global.k8s.tag | default "latest" }}
imagePullPolicy: IfNotPresent
env:
- name: POD_NAME
value: {{ include "my-project.fullname" . }}-listener
command:
- /bin/sh
args:
- -c
- |-
while [ "$(kubectl get pods 2>/dev/null | grep $POD_NAME | awk '{print $3;}')" \!= "Running" ];do
echo 'Waiting for scheduler pod to run ...';
sleep 5;
done
containers:
- name: {{ include "my-project.fullname" . }}-test
image: {{ .Values.global.registry }}/{{ .Values.global.k8s.image }}:{{ .Values.global.k8s.tag | default "latest" }}
imagePullPolicy: {{ .Values.image.pullPolicy }}
command:
- /bin/sh
args:
- -c
- "tail -f"
# instead of the above, the real command would be: "curl -H 'Accept: application/json' -X GET my-project-listener.my-project-listener-dmn:8081/api/scheduler/jobs"
restartPolicy: Never
I exec into the test pod:
kubectl exec -it my-tester-<hash> -- /bin/sh
... and run the command:
ping my-project-listener.my-project-listener-dmn
I get:
ping: bad address 'my-project-listener.my-project-listener-dmn'
When I do the same for a plain Pod, it works:
PING pod-hostname.pod-subdomain (): ... data bytes
Answer:
There's a lot here, but I think you should be able to resolve all of this with a couple of small changes. In summary, I'd suggest changing:
- delete the Role and RoleBinding objects;
- connect to the Service (http://my-project-listener-dmn:8081) and not an individual Pod; and
- you can kubectl wait --for=condition=available on the Deployment.

Connect to Services, not individual Pods (or Jobs or Deployments). The Service is named
{{ include "my-project.fullname" . }}-listener-dmn
and that is the host name you should connect to. The Service acts as a very lightweight in-cluster load balancer, and will forward requests on to one of the pods identified by its selector.So in this example you'd connect to the Service's name and port,
http://my-project-listener-dmn:8081
. Your application doesn't answer the very-low-level ICMP protocol and I'd avoid ping(1) in favor of a more useful diagnostic. Also consider setting the Service's port to the default HTTP port 80; it doesn't necessarily need to match the Pod's port.The Service selector needs to match the Pod labels (and not the Job's or Deployment's labels). A Service attaches to Pods; a Job or a Deployment has a template to create Pods; and it's those labels that need to match up. You need to add labels to the Pod template:
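For example, a minimal sketch of the listener's pod template (assuming you keep the Service selector name: {{ include "my-project.fullname" . }}-listener shown above):

spec:
  template: #PodTemplateSpec (core/v1)
    metadata:
      labels:
        # must match the Service's spec.selector exactly
        name: {{ include "my-project.fullname" . }}-listener
    spec:
      # ... containers, probes, volumes as before ...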
Or, in a Helm chart where you have a helper to generate these labels:
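(The helper name my-project.selectorLabels below is hypothetical; use whatever your chart defines, as long as it renders the same key/values the Service selects on.)

spec:
  template:
    metadata:
      labels:
        # hypothetical helper; it must render the same labels the Service selector uses
        {{- include "my-project.selectorLabels" . | nindent 8 }}
    spec:
      # ... containers, probes, volumes as before ...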
The thing to check here is kubectl describe service my-project-listener-dmn. There should be a line at the bottom that says Endpoints: with some IP addresses (technically some individual Pod IP addresses, but you don't usually need to know that). If it says Endpoints: <none>, that's usually a sign that the labels don't match up.
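A shorter variant of the same check (assuming the Service name renders to my-project-listener-dmn, as in the ping example above):

kubectl get endpoints my-project-listener-dmn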
You probably want some level of automatic restarts. A Pod can fail for lots of reasons, including code bugs and network hiccups. If you set restartPolicy: Never then you'll have a Failed Pod, and requests to the Service will fail until you take manual intervention of some sort. I'd suggest setting this to at least restartPolicy: OnFailure, or (for a Deployment) leaving it at its default value of Always. (There is more discussion on Job restart policies in the Kubernetes documentation.)

You probably want a Deployment here. A Job is meant for a case where you do some set of batch processing and then the job completes; that's part of why kubectl wait doesn't have the lifecycle option you're looking for. I'm guessing you want a Deployment instead. With what you've shown here I don't think you need to make any changes at all besides swapping the Job for a Deployment, for example:
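(A rough sketch only, not your exact chart; it reuses the pod template you already have, and the labels are assumed to match the Service selector.)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "my-project.fullname" . }}-listener
spec:
  replicas: 1
  selector:
    matchLabels:
      name: {{ include "my-project.fullname" . }}-listener
  template:
    metadata:
      labels:
        name: {{ include "my-project.fullname" . }}-listener
    spec:
      # ... the same containers, probes, and volumes as in the Job's pod template ...
      # note: a Deployment's pod template must use restartPolicy: Always (the default)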
Everything so far about Services and DNS and labels still applies.
You can kubectl wait for a Deployment to be available. Since a Job is expected to run to completion and exit, that's the state kubectl wait allows. A Deployment is "available" if there is at least a minimum number of managed Pods running that pass their health checks, which I think is the state you're after.
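For example, assuming the Deployment name renders to my-project-listener:

kubectl wait deployment/my-project-listener --for=condition=Available --timeout=120s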
There are simpler ways to check for database liveness. A huge fraction of what you show here is an involved sequence with special permissions to see if the database is running before the pod starts up.
What happens if the database fails while the pod is running? One common thing that will happen is you'll get a cascading sequence of exceptions and your pod will crash. Then, with restartPolicy: Always, Kubernetes will try to restart it; but if the database still isn't available, it will crash again, and you'll get to a CrashLoopBackOff state. If the database does become available again, then eventually Kubernetes will try to restart the Pod and it will succeed.

This same logic can apply at startup time. If the Pod tries to start up, and the database isn't ready yet, and it crashes, Kubernetes will by default restart it, adding some delays after the first couple of attempts. If the database starts up within 30 seconds or so, then the application will be up within a minute or so. The restart count will be greater than 0, but kubectl logs --previous will hopefully have a clear exception.

This will let you delete about half of what you show here. Delete all of the initContainers: blocks; then, since you're not doing any Kubernetes API operations, delete the Role and RoleBinding objects too.

If you really do want to force the Pod to wait for the database and treat startup as a special case, I'd suggest a simpler shell script using the mysql client tool, or even the wait-for script that makes basic TCP calls (the mechanism described in "Docker Compose wait for container X before starting Y"). This still lets you avoid all of the Kubernetes RBAC setup.
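A rough sketch of that simpler approach (assumptions: a Service named {{ .Release.Name }}-mysql fronts the database, and the k8s.db.env helper provides MYSQL_ROOT_PASSWORD and MYSQL_DATABASE as it does elsewhere in your chart):

initContainers:
  - name: wait-mysql
    image: mysql:8.0   # any image that ships the mysql client
    env:
      # same helper your chart already uses; adjust nindent to your actual indentation
      {{- include "k8s.db.env" . | nindent 6 }}
    command:
      - /bin/sh
      - -c
      - |-
        until mysql -h {{ .Release.Name }}-mysql -uroot -p"$MYSQL_ROOT_PASSWORD" -e 'SELECT 1' "$MYSQL_DATABASE" >/dev/null 2>&1; do
          echo 'Waiting for mysql database...';
          sleep 5;
        done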