We have a dozen services exposed through an ingress-nginx controller in GKE.
In order to route the traffic correctly on the same domain name, we need to use a rewrite-target rule.
The services had worked well without any maintenance since their launch in 2019, until recently. When cert-manager suddenly stopped renewing the Let's Encrypt certificates, we "resolved" this by temporarily removing the "tls" section from the ingress definition, forcing our clients to use the HTTP version.
After that we removed all traces of cert-manager and attempted to set it up from scratch.
Now cert-manager creates the certificate signing request, spawns an ACME HTTP solver pod and adds it to the ingress. However, when I access its URL, it returns an empty response instead of the expected token.
This has to do with the rewrite-target annotation, which messes up the routing of the ACME challenge.
What puzzles me most is that this used to work before. (It was set up by a former employee.)
Disabling rewrite-target is unfortunately not an option, because it would stop the routing from working correctly.
Using dns01 won't work because our ISP does not support programmatic changes to the DNS records.
Is there a way to make this work without disabling rewrite-target?
P.S.
Here are a number of similar cases reported on GitHub:
None of them help.
Here's the definition of my ClusterIssuer:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    # The ACME server URL
    server: https://acme-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: [email protected]
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-prod
    # Enable the HTTP-01 challenge provider
    solvers:
    - http01:
        ingress:
          class: nginx
Comments (3)
Please share the ClusterIssuer or Issuer you are using.
ingressClass
Ref: https://cert-manager.io/v0.12-docs/configuration/acme/http01/#ingressclass
Most of the time you won't see the HTTP solver challenge at all: it is created and then removed once the DNS or HTTP validation succeeds.
Also, make sure your ingress doesn't have an SSL-redirect annotation; that could also be one reason the certs are not getting generated.
Did you try checking the other cert-manager objects, such as the Order and the CertificateRequest status?
kubectl describe challenge
Are you getting a 404 there? If you keep retrying, there is a chance you will hit the Let's Encrypt rate limit for certificate requests.
Troubleshooting: https://cert-manager.io/docs/faq/troubleshooting/#troubleshooting-a-failed-certificate-request
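To illustrate the SSL-redirect point above, here is a minimal sketch of an Ingress that turns off ingress-nginx's automatic HTTP-to-HTTPS redirect. The resource name, host, and backend service are hypothetical placeholders, not values from the question; only the annotation is the part being demonstrated.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-app                # hypothetical name
  annotations:
    # Ask ingress-nginx not to redirect plain HTTP requests to HTTPS
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
spec:
  ingressClassName: nginx
  rules:
  - host: example.com              # hypothetical host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: example-service  # hypothetical backend service
            port:
              number: 80
```

Whether this is needed depends on the setup, since the ACME server can follow a redirect from HTTP to HTTPS.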
If it helps anybody, I solved this issue after pulling my hair out for a day.
The solution was to create an alternate ingress without the rewrite. The original ingress was like this.
To make sure it didn't interfere with the request Let's Encrypt would make, I created this other ingress:
When you configure an Issuer with http01, the default serviceType is NodePort. This means it won't even go through the ingress controller. From the docs:
I'm not sure what the rest of your setup looks like, but http01 causes the ACME server to make HTTP requests (not HTTPS). You need to make sure your nginx has a listener for HTTP (port 80). The ACME server does follow redirects, so you can listen on HTTP and redirect all traffic to HTTPS; this is legitimate and works.
cert-manager creates an ingress resource for validation. It directs traffic to the temporary pod. This ingress has its own set of rules, and you can control it using this setting. You can try to disable or modify the rewrite-target on this resource.
Another thing I would try is to access this URL from inside the cluster (bypassing ingress-nginx). If it works directly, then it's an ingress/networking problem; otherwise it's something else.
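As a rough sketch of how the validation ingress can be tuned: cert-manager's HTTP-01 ingress solver accepts a serviceType and an ingressTemplate, and annotations placed in the template are added to the solver ingress that cert-manager creates. The issuer name and server URL below come from the question; the serviceType value and the annotation are assumptions chosen for illustration and may not be what this particular cluster needs.

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    # email omitted here; reuse the address from the existing ClusterIssuer
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - http01:
        ingress:
          class: nginx
          # Assumption: expose the solver pod through a ClusterIP service
          # instead of the default NodePort
          serviceType: ClusterIP
          ingressTemplate:
            metadata:
              annotations:
                # Assumption: example annotation applied only to the
                # temporary solver ingress, not to the application ingress
                nginx.ingress.kubernetes.io/ssl-redirect: "false"
```

Because these annotations apply only to the temporary solver ingress, the rewrite-target annotation on the application ingress can stay as it is.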
Please share the relevant nginx and cert-manager logs; they might be useful for debugging or understanding where your problem is.