KBEA-00039 - Fixing builds that the Cluster Manager reports as timed out

Article ID: 360032828092

Summary

The Cluster Manager reports that the build timed out. Electric Make (eMake) either appears to have run normally or appears to have been killed by the Cluster Manager.

Solution

There are several possible reasons for this:

  • The network became unavailable for over a minute. At that point, the Cluster Manager times out the build. Check the event log for errors that indicate network trouble.

  • The eMake process was forcefully terminated (SIGKILL or the equivalent). Do not terminate eMake this way.

  • The eMake process crashed unexpectedly. In this case, a core file or a dump file is expected. Ensure that the core file size limit (ulimit) allows core files to be written, or that the Windows machine is set up to create dump files. Send the core or dump file to Electric Cloud.

  • The eMake heartbeat thread is unable to send the heartbeat, either because it crashed (this happened in the past because of an uncaught exception) or because it is stuck in other operations (if debugging is enabled, the thread can slow to the point where it cannot send heartbeats fast enough). Disable debug logging, or increase the Cluster Manager timeout for heartbeats.
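
For the crash case above, it can help to confirm that the environment that launches eMake is actually allowed to write core files. The following is a minimal sketch for Linux and other UNIX-like hosts, using only the Python standard library `resource` module. The helper names `core_dumps_enabled` and `describe_core_limit` are illustrative and are not part of any Electric Cloud tooling. Resource limits are inherited per process, so the check is meaningful only when run from the same shell or service environment that starts eMake.

```python
import resource  # Unix-only standard library module


def core_dumps_enabled(soft_limit: int) -> bool:
    """Return True if a soft RLIMIT_CORE value lets the kernel write a core file."""
    # RLIM_INFINITY means "unlimited"; 0 disables core files entirely.
    return soft_limit == resource.RLIM_INFINITY or soft_limit > 0


def describe_core_limit() -> str:
    """Report the core file size limit of the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)

    def fmt(value: int) -> str:
        # Print the unlimited marker in a readable form.
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    state = "enabled" if core_dumps_enabled(soft) else "disabled"
    return f"core files {state} (soft={fmt(soft)}, hard={fmt(hard)})"


if __name__ == "__main__":
    print(describe_core_limit())
```

If the script reports that core files are disabled, raise the soft limit in that environment before reproducing the crash. The soft limit cannot exceed the hard limit unless the process is privileged.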

Applies to

  • Product versions: All

  • OS versions: All