Issue
- My Kubernetes Cloud configuration does not work.
- I am having issues spinning up an agent using a Kubernetes Pod Template.
Quick check
Many issues are caused by the image selected for the Pod Template. Before continuing, please verify that your Pod Template can spin up an agent using the jenkins/inbound-agent image, as stated in the description of the plugin:
Tested with jenkins/inbound-agent, see the Docker image source code.
Required Data Kubernetes Cloud
This article describes how to collect the minimum required information about a Kubernetes Cloud on a Client/Managed controller so that the issue can be troubleshot efficiently.
If the required data is larger than 50 MB, you will not be able to upload it through ZenDesk. In this case, please use our upload service to attach all the required information.
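As a quick check before uploading, the size of an archive can be compared against the 50 MB limit. The following is a minimal sketch: the function name and the file name in the usage comment are hypothetical, and it assumes the limit is counted as 50 * 1024 * 1024 bytes.

```shell
# Sketch: report whether a file is over the 50 MB attachment limit.
# Assumption: 1 MB is counted as 1024 * 1024 bytes.
LIMIT_BYTES=$((50 * 1024 * 1024))

exceeds_upload_limit() {
  # $1: path to the archive you plan to attach
  # The arithmetic expansion strips any padding that wc may add.
  size_bytes=$(( $(wc -c < "$1") ))
  [ "$size_bytes" -gt "$LIMIT_BYTES" ]
}

# Example usage (hypothetical file name):
# if exceeds_upload_limit support-bundle.zip; then
#   echo "Too large for ZenDesk, use the upload service"
# fi
```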
Environment
- CloudBees CI (CloudBees Core) on modern cloud platforms - Managed controller
- CloudBees CI (CloudBees Core) on modern cloud platforms - Operations Center
- CloudBees CI (CloudBees Core) on traditional platforms - Client controller
- CloudBees CI (CloudBees Core) on traditional platforms - Operations Center
Required Data check list
- From CloudBees Jenkins
  - Jenkins log recorder for Kubernetes Cloud Plugin
  - Jenkins Kubernetes Cloud description
  - Jenkins Kubernetes Pod Template description
  - (Optional) Items required from "An issue with a Build of a Job"
- From Kubernetes
  - Kubernetes Cluster Description
  - Agent Events
  - Agent Container logs
  - CNCF validation tool output
From CloudBees Jenkins
Jenkins log recorder for Kubernetes Cloud Plugin
Please follow "How do I create a logger in Jenkins for troubleshooting and diagnostic information?" to increase verbosity of the following loggers:
- org.csanchez.jenkins.plugins.kubernetes at ALL level
- com.cloudbees.jenkins.plugins.kube at ALL level
- okhttp3 at ALL level
When you generate the support bundle, make sure to select Controller Custom Log Recorders.
Important:
- Reproduce the issue to populate those logs before producing the support bundle.
- Once you have verified that the logs were populated and have collected them, remove these loggers. Do not leave them enabled in a production environment; they are intended for troubleshooting only.
Jenkins Kubernetes Cloud description
The Jenkins Kubernetes Cloud configuration is saved in $JENKINS_HOME/config.xml. You have 2 options here:
- When you generate the support bundle, make sure to select the Jenkins Global Configuration File (Encrypted secrets are redacted) option.
- Send $JENKINS_HOME/config.xml directly.
Jenkins Kubernetes Pod Template description
Provide the Jenkins Kubernetes Pod Template description of the agent you are having issues with. Two options:
- For Pod Templates defined in the Jenkins UI, provide the name of the template.
- For Pod Templates defined in the Pipeline code, attach the Jenkinsfile.
Additionally:
- If the agent is getting provisioned, the Console Output of the job displays the YAML description of the pod.
- If you are not using jenkins/inbound-agent, attach the Dockerfile.
From the Kubernetes Cluster
Not all the items are needed; it depends on the situation. For instance, if the pod is not in Running state, you cannot get the container logs.
Kubernetes Cluster Description
Kubernetes Cloud description including:
- The Cloud provider where the cluster is hosted (OpenShift, AWS, etc.)
- The cloudbees-cluster-details.txt file produced by:
$> kubectl get node,statefulset,pod,svc,ingress,endpoints,cm,pvc,pv -o wide -n <yournamespace> > cloudbees-cluster-details.txt
Notes:
1.- Replace <yournamespace> with the namespace where you have deployed the CJE cluster (normally cje). If your applications are distributed across more than one namespace, please include all of them.
2.- For OpenShift installations, replace the ingress object with route.
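When several namespaces are involved, the command above can be repeated per namespace. The following is a minimal sketch under stated assumptions: the function name and the per-namespace output file names are hypothetical, and the first argument selects ingress or route as described in note 2.

```shell
# Sketch: collect the resource listing for every namespace passed in.
# First argument: "ingress", or "route" on OpenShift.
# Remaining arguments: the namespaces to inspect.
collect_cluster_details() {
  ingress_kind=$1
  shift
  for ns in "$@"; do
    # One output file per namespace (hypothetical naming scheme),
    # for example cloudbees-cluster-details-cje.txt
    kubectl get "node,statefulset,pod,svc,${ingress_kind},endpoints,cm,pvc,pv" \
      -o wide -n "$ns" > "cloudbees-cluster-details-${ns}.txt"
  done
}

# Example usage:
# collect_cluster_details ingress cje my-agents-namespace
```

Node and pv are cluster-scoped, so they will be listed again in each file; that duplication is harmless for troubleshooting.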
Agent Events
You have a couple of options to fetch that information:
Via events
- From the Jenkins Console logs, get the agent-UID-example for the build:
...
[Pipeline] { (hide)
[Pipeline] node
Still waiting to schedule task
‘agent-UID-example’ is offline
...
Then, search the events for that agent UID. Make sure you are in the correct cluster and namespace.
$> kubectl get events -n <agent-namespace> | grep agent-UID-example > agent-events.txt
The resulting agent-events.txt contains lines such as:
89s Normal Scheduled pod/agent-UID-example Successfully assigned cje-support-general/agent-UID-example to gke-cluster-support-gene-default-pool-33d2ba61-t4dl
87s Normal Pulling pod/agent-UID-example Pulling image "gcr.io/image2/executor:debug"
87s Normal Pulled pod/agent-UID-example Successfully pulled image "example.io/image2/executor:debug"
86s Normal Created pod/agent-UID-example Created container image2
85s Normal Started pod/agent-UID-example Started container image2
85s Normal Pulled pod/agent-UID-example Container image "maven:3.3.9-jdk-8-alpine" already present on machine
84s Normal Created pod/agent-UID-example Created container image1
84s Normal Started pod/agent-UID-example Started container image1
84s Normal Pulled pod/agent-UID-example Container image "cloudbees/cloudbees-core-agent:2.204.2.2" already present on machine
83s Normal Created pod/agent-UID-example Created container jnlp
82s Normal Started pod/agent-UID-example Started container jnlp
- If agent-UID-example is not displayed, you have the following alternatives:
$> kubectl get events -n <agent-namespace> --watch
$> kubectl get events -n <agent-namespace> --sort-by=.metadata.creationTimestamp > agent-events.txt
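As an alternative to piping through grep, the events can be filtered on the server side with a field selector. The following is a minimal sketch: the function name is hypothetical, and it assumes the pod name matches the agent name shown in the Console Output.

```shell
# Sketch: list only the events whose involved object is the given pod,
# ordered by creation time.
agent_events() {
  # $1: namespace of the agent, $2: pod name (for example agent-UID-example)
  kubectl get events -n "$1" \
    --field-selector "involvedObject.name=$2" \
    --sort-by=.metadata.creationTimestamp
}

# Example usage:
# agent_events my-agent-namespace agent-UID-example > agent-events.txt
```

Unlike grep, the field selector matches the object name exactly, so events of other pods that merely mention the agent name in their message are not included.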
Agent container logs
If the agent pod is in Running state, use kubectl logs to get the log output for each of the containers (e.g. example-container) running in the pod, including jnlp. Note that -f specifies that the logs should be streamed.
kubectl logs -f my-jenkins-agent -c example-container | tee jenkins-agent-example-container-logs.txt
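To avoid running the command once per container by hand, the container names can be read from the pod spec and looped over. The following is a minimal sketch: the function name and the output file naming are hypothetical, and it omits -f so that each call returns instead of streaming.

```shell
# Sketch: save the logs of every container in a pod to separate files.
save_agent_logs() {
  # $1: namespace of the agent, $2: pod name
  # Read the container names (jnlp and any others) from the pod spec.
  containers=$(kubectl get pod "$2" -n "$1" \
    -o jsonpath='{.spec.containers[*].name}')
  for c in $containers; do
    # Without -f the command prints the current log and exits.
    kubectl logs "$2" -n "$1" -c "$c" > "$2-$c-logs.txt"
  done
}

# Example usage:
# save_agent_logs my-agent-namespace my-jenkins-agent
```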
If you are trying to find out why the Pod was Killed or Terminated before you could get the logs, you can adjust the following properties while troubleshooting (Important: revert to your standard setup after troubleshooting):
- Increase Timeout in seconds for Jenkins connection. 2 options here:
  - Add the Java property -Dorg.csanchez.jenkins.plugins.kubernetes.PodTemplate.connectionTimeout=60000 to the affected controller
  - Set the value to 60000 in the Kubernetes Cloud Templates UI
- Set podRetention to always()
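Once the pod is retained, its container statuses can be inspected to see why each container stopped. The following is a minimal sketch: the function name is hypothetical, and it assumes the containers have reached a terminated state (the reason and exit code columns are empty otherwise).

```shell
# Sketch: print name, termination reason and exit code for each
# container of a retained pod, one container per line.
terminated_reasons() {
  # $1: namespace of the agent, $2: pod name
  kubectl get pod "$2" -n "$1" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.state.terminated.reason}{"\t"}{.state.terminated.exitCode}{"\n"}{end}'
}

# Example usage:
# terminated_reasons my-agent-namespace agent-UID-example
```

Running kubectl describe pod on the retained pod shows the same information in a more verbose form, together with the recent events.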