Issue
- In a CloudBees Jenkins Enterprise cluster, the Marathon service fails to start.
If you run this command on the controllers:
systemctl status marathon.service
you get this log:
marathon.service - Scheduler for Apache Mesos
   Loaded: loaded (/usr/lib/systemd/system/marathon.service; enabled; vendor preset: disabled)
   Active: inactive (dead) (Result: exit-code) since .....
  Process: .... ExecStartPre=/bin/mkdir -p /run/marathon (code=exited, status=217/USER)
Resolution
This error usually means that the Marathon package installed on the OS is at a version that CloudBees Jenkins Enterprise does not support.
It typically happens after an OS update, which can also update other packages. CloudBees Jenkins Enterprise requires a specific set of packages at specific versions, depending on the version of CloudBees Jenkins Enterprise.
When this happens, other packages are usually at unexpected versions as well.
How to check it:
- From a PSE support bundle, get the "project.config" file. Inside that file, find the "[tiger]" section and review the package versions listed after the "version" key.
...
[tiger]
...
version = 1.11.7
ami_version = 1.11.7
mesos_version = 0.28.2*
zookeeper_version = 3.4.6*
marathon_version = 0.15.3*
docker_version = 1.13.1*
topbeat_version = 1.1.0*
pythonpip_version = 1.5.4*
nvme_version = 0.2.19*
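As a rough aid for reading the file, the expected versions can be listed with a small shell function. This is only a sketch: the function name `tiger_versions` is hypothetical, and it assumes the file holds one `key = value` pair per line with section headers such as `[tiger]` at the start of a line. Note that it also prints `ami_version`, which is not an OS package and can be ignored.

```shell
#!/bin/sh
# Hypothetical helper: print "<name> <version>" for every *_version key
# found in the [tiger] section of the given project.config file.
# Assumes one "key = value" pair per line (INI-style layout).
tiger_versions() {
    awk '
        # Track which section we are in; only [tiger] is of interest.
        /^\[/ { in_tiger = ($0 ~ /^\[tiger\]/); next }
        in_tiger && /_version[ \t]*=/ {
            split($0, kv, "=")
            name = kv[1]
            value = kv[2]
            gsub(/[ \t]/, "", name)     # strip spaces around the key
            gsub(/[ \t]/, "", value)    # strip spaces around the value
            sub(/_version$/, "", name)  # mesos_version -> mesos
            print name, value
        }
    ' "$1"
}
```

For example, `tiger_versions project.config` run in the directory where the support bundle was unpacked prints one line per key, which can be compared by eye against the output of the `yum` commands.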
Once we have the version of the packages, we need to check the values on the different servers of the cluster.
For this specific "project.config":
yum list installed | grep -i mesos
yum list installed | grep -i zookeeper
yum list installed | grep -i marathon
yum list installed | grep -i docker
yum list installed | grep -i topbeat
yum list installed | grep -i pythonpip
yum list installed | grep -i nvme
If the versions reported by these commands differ from the ones listed in "project.config", you will need to downgrade the affected packages to the versions specified in "project.config".
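The comparison can also be scripted. The sketch below is not an official CloudBees tool: the function names `version_matches` and `check_package` are hypothetical, and it assumes that the RPM package name is known for each key (the names in "project.config", such as `pythonpip`, may not match the installed package names exactly). It treats a trailing `*` in the expected version as a shell glob.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the installed version matches the
# expected pattern from project.config (for example "0.15.3*").
version_matches() {
    expected=$1
    installed=$2
    case "$installed" in
        $expected) return 0 ;;   # unquoted on purpose: "*" acts as a glob
        *) return 1 ;;
    esac
}

# Hypothetical helper: report OK or MISMATCH for one installed RPM package.
# Assumption: the package name passed in is the real RPM name on the host.
check_package() {
    pkg=$1
    expected=$2
    installed=$(rpm -q --queryformat '%{VERSION}\n' "$pkg" 2>/dev/null) \
        || installed="not-installed"
    if version_matches "$expected" "$installed"; then
        echo "OK       $pkg $installed"
    else
        echo "MISMATCH $pkg installed=$installed expected=$expected"
    fi
}
```

On a cluster node this might be called as `check_package marathon '0.15.3*'`. A reported mismatch could then be addressed with `yum downgrade` and the expected version, provided that version is still available in the configured repositories.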