Issue
Communication between controllers, Operations Center, or agents fails with DNS resolution errors, surfaced as an UnknownHostException pointing to an internal DNS name. For example:
    java.net.UnknownHostException: cjoc.cloudbees-core.svc.cluster.local
Explanation
Based on various investigations, Alpine images suffer from DNS resolution problems that can be encountered when running in Kubernetes. One reason is that Kubernetes by default configures the pod's DNS resolver (through the /etc/resolv.conf it generates for the pod) with ndots set to 5. Another is that Alpine uses a specific library for DNS resolution: musl.
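As an illustration of how ndots interacts with the DNS search list, the following Python sketch lists the names a stub resolver would typically try, in order, following the rule described in resolv.conf(5). The function name `candidate_names` and the search list used in the demo are assumptions made for this example (the search list mirrors what a pod in a `cloudbees-core` namespace commonly receives). It is a simplified model, not the actual musl or glibc implementation.

```python
def candidate_names(name, search, ndots):
    """Return the lookup order a stub resolver would typically use.

    Simplified model of the resolv.conf(5) rule: a name with fewer than
    `ndots` dots is tried with each search domain first and as-is last;
    otherwise it is tried as-is first.
    """
    if name.endswith("."):
        # A trailing dot marks the name as fully qualified: no search list.
        return [name]

    literal = name + "."
    expanded = [f"{name}.{domain}." for domain in search]

    if name.count(".") < ndots:
        return expanded + [literal]
    return [literal] + expanded


if __name__ == "__main__":
    # Assumed search list for a pod in the "cloudbees-core" namespace.
    search = [
        "cloudbees-core.svc.cluster.local",
        "svc.cluster.local",
        "cluster.local",
    ]
    host = "cjoc.cloudbees-core.svc.cluster.local"
    for ndots in (5, 1):
        print(f"ndots={ndots}")
        for candidate in candidate_names(host, search, ndots):
            print("  ", candidate)
```

With `ndots=5`, `cjoc.cloudbees-core.svc.cluster.local` contains only four dots, so every search-list variant is queried before the name itself. With `ndots=1`, the name is queried as-is first.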
With ndots set to 5 and the musl library, if the DNS client's lookup of the first search path fails with an unexpected response (that is, anything other than NXDOMAIN), it does not try the remaining paths. This behavior has a negative impact in some environments and causes host resolution to fail, even for Kubernetes internal endpoints. In many cases, the problem is caused by DNS servers returning incorrect answers (NOERROR instead of NXDOMAIN): the DNS server effectively says that the domain exists but has no record of the requested type (A). As a result, the musl DNS client (used by Alpine) stops resolution, whereas the glibc library (used by most other Linux distributions) continues. For more information have a look at the following:
Resolution
Starting from version 2.204.1.3, CloudBees Core supports a new variant of Docker images based on UBI, which becomes the default variant. Those images do not use the musl library and are therefore not affected by these DNS resolution problems.
The recommended solution is to upgrade to version 2.204.1.3 or later.
Workaround
If you still use Alpine images (for some agents, for example), the workaround is to customize the dnsConfig of the pods and set ndots to 1:
    dnsConfig:
      options:
        - name: ndots
          value: "1"
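One way to confirm that the setting took effect is to inspect the pod's /etc/resolv.conf, which is where Kubernetes writes the dnsConfig options. The following is a minimal sketch, assuming Python is available in the container. `read_ndots` is a hypothetical helper written for this example, not part of any CloudBees or Kubernetes tooling.

```python
def read_ndots(resolv_conf_text, default=1):
    """Return the effective ndots value found in resolv.conf content.

    Falls back to `default` (the resolver default of 1) when no
    "options ndots:N" entry is present.
    """
    ndots = default
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines.
        if not stripped or stripped.startswith(("#", ";")):
            continue
        fields = stripped.split()
        if fields[0] != "options":
            continue
        for option in fields[1:]:
            key, _, value = option.partition(":")
            if key == "ndots" and value.isdigit():
                # A later entry overrides an earlier one.
                ndots = int(value)
    return ndots
```

Inside the pod, the function can be called with the file contents, for example `read_ndots(open("/etc/resolv.conf").read())`. A result of 1 indicates that the workaround is active.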
Agents
For agents, add the following snippet to the YAML configuration of the Pod template:
    apiVersion: "v1"
    kind: "Pod"
    spec:
      dnsConfig:
        options:
          - name: ndots
            value: "1"
Operations Center
Add the following snippet to the cjoc StatefulSet:
    apiVersion: "apps/v1"
    kind: "StatefulSet"
    spec:
      template:
        spec:
          dnsConfig:
            options:
              - name: ndots
                value: "1"
Managed controllers
Add the following snippet to the managed controller item configuration and restart the controller from CJOC:
    apiVersion: "apps/v1"
    kind: "StatefulSet"
    spec:
      template:
        spec:
          dnsConfig:
            options:
              - name: ndots
                value: "1"
Note: To have this applied to any newly created controller, add this snippet to .