KBEC-00295 – Cleaning up un-abortable jobs

Article ID:360033190931

Problem

There is a known issue where aborting a job through the UI or the command line does not actually abort the job. Most customers rarely encounter it, but for those who do, the CleanupStalledJob plugin utility, available on our GitHub site, helps clean up these jobs.

Solution

Version 10.8 and above

The cleanupStalledJob API is built into ectool and ec-perl, so you can clean up a stalled job by running:

ectool cleanupStalledJob <jobId>
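If several jobs are stuck, the same command can be repeated for each one. The following is a minimal sketch, not part of the product: the function name `cleanup_jobs` and the `ECTOOL` override variable are hypothetical, and it assumes `ectool` is on the PATH with an active session.

```shell
#!/bin/sh
# Sketch: run "ectool cleanupStalledJob" once per job ID passed as an argument.
# ECTOOL is a hypothetical override; set ECTOOL="echo ectool" to print the
# commands instead of running them.
ECTOOL="${ECTOOL:-ectool}"

cleanup_jobs() {
    rc=0
    for job_id in "$@"; do
        # $ECTOOL is intentionally unquoted so an override such as "echo ectool" splits into words.
        if ! $ECTOOL cleanupStalledJob "$job_id"; then
            echo "cleanup failed for $job_id" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

For example, `cleanup_jobs <jobId1> <jobId2>` attempts each job in turn and returns a non-zero status if any attempt fails.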

Pre 10.8

For Java version 8, please use the CleanupStalledJob-jar-with-dependencies_v.10.5.0.jar that you can download from this repository.

In pre-5.x versions, this utility identifies the job to be cleaned by its jobId. In 5.x and later, you must use the job's UUID instead. Follow these steps to find the UUID for your job:

  1. In the CloudBees CD (CloudBees Flow)/Commander UI, click on the "Jobs" tab and locate the Job for which you want to run the CleanupStalledJob utility.

In this example, the job is "build2_OnelPlatform".

  2. Click on the job you are interested in; its UUID appears in the browser's address bar. In my case, the UUID for my job is "de9547fe-a596-11e4-9e3b-080027a41600".
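If you copy the whole address from the browser, a small helper can pull the UUID out of it. This is a sketch only: the function name `extract_uuid` and the example URL are hypothetical, and the exact URL layout in your installation may differ. It simply returns the first UUID-shaped token in the string.

```shell
#!/bin/sh
# Sketch: print the first UUID-shaped token (8-4-4-4-12 hex digits) found in a string.
# The function name is hypothetical; it is not part of CloudBees CD.
extract_uuid() {
    printf '%s\n' "$1" \
        | grep -oiE '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}' \
        | head -n 1
}

# Example with a made-up URL; only the UUID part comes from this article.
extract_uuid 'https://commander.example.com/jobs/de9547fe-a596-11e4-9e3b-080027a41600'
```

The printed value is what you pass to the utility's `--jobId` option.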

Now I can run the utility as:

vagrant@commander53:/tmp$ /opt/electriccloud/electriccommander/jre/bin/java -jar /vagrant/CleanupStalledJob-jar-with-dependencies.jar --database-properties /opt/electriccloud/electriccommander/conf/database.properties --passkey /opt/electriccloud/electriccommander/conf/passkey --jobId "de9547fe-a596-11e4-9e3b-080027a41600" --output "jobid123"

The "cleanupStalledJob.log" file, written to the directory where you ran CleanupStalledJob, should contain lines indicating that the utility was able to find the job.

Microsoft SQL Server with integrated authentication

If you are using SQL Server with integrated security, sqljdbc_auth.dll must be in the Java library path when you run this utility, and you must run the jar as a user that has full access permissions to the database. You can download this DLL as part of the SQL Server JDBC driver here. To use the Java installed with CloudBees CD (CloudBees Flow), place this DLL in \jre\bin before running the CleanupStalledJob jar.

Alternate Solution

If the CleanupStalledJob utility does not work, you can delete the unabortable job by marking it for deletion in the database. The statements below mark an example job, '9a86944d-c36d-11e4-8b95-0800279076c5', that is stuck aborting as deleted; substitute the UUID of your own job.
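Each of the database statements that follow matches the job by its UUID with the hyphens removed (and upper-cased for SQL Server and Oracle). As a quick sanity check before touching the database, a sketch like the following shows the 32-character hex form those statements compare against. The function name `uuid_to_hex` is hypothetical.

```shell
#!/bin/sh
# Sketch: strip hyphens from a UUID and upper-case the hex digits,
# mirroring UPPER(REPLACE(uuid, '-', '')) in the SQL statements.
uuid_to_hex() {
    printf '%s\n' "$1" | sed 's/-//g' | tr 'abcdef' 'ABCDEF'
}

uuid_to_hex '9a86944d-c36d-11e4-8b95-0800279076c5'
# prints 9A86944DC36D11E48B950800279076C5
```

If the result is not exactly 32 hex characters, the UUID was probably copied incorrectly.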

MySQL

update ec_job
set deleted = 1
where id = UNHEX(REPLACE('9a86944d-c36d-11e4-8b95-0800279076c5', '-', ''))
and deleted is null;

SQL Server

update ec_job
set deleted = 1
where CONVERT([varchar](36), id,2) =  CONVERT(CHAR(36),UPPER(REPLACE('9a86944d-c36d-11e4-8b95-0800279076c5', '-', '')))
and deleted is null;

Oracle

update ec_job
set deleted = 1
where id = UPPER(REPLACE('9a86944d-c36d-11e4-8b95-0800279076c5', '-', ''))
and deleted is null;

In MySQL, you should see output similar to this, confirming that exactly one row was changed:

Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

Then trigger the background deleter. You can do this from the UI by deleting any other completed job or a property. Alternatively, create and delete something temporary, for example: a) create a dummy project and then delete it, or b) create a dummy step and then delete it.
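The dummy-project approach can also be scripted from the command line. The sketch below assumes `ectool` is on the PATH with an active session and uses its `createProject` and `deleteProject` commands; the function name, the `ECTOOL` override variable, and the default project name are hypothetical.

```shell
#!/bin/sh
# Sketch: create a throwaway project and delete it again so the
# background deleter has something to process.
# Set ECTOOL="echo ectool" to print the commands instead of running them.
ECTOOL="${ECTOOL:-ectool}"

trigger_background_deleter() {
    # Hypothetical default name; pick one that cannot clash with a real project.
    project="${1:-tmp-trigger-deleter}"
    # Only delete if the create succeeded, so an existing project is never removed by mistake.
    $ECTOOL createProject "$project" && $ECTOOL deleteProject "$project"
}
```

Calling `trigger_background_deleter` with no arguments uses the default project name; pass a different name as the first argument if that one is already taken.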