Resolution
Review the memory options for your operations center installation. Insufficient settings can impact CloudBees Jenkins Operations Center stability.
As a starting point, we recommend setting Xmx and MaxPermSize to the following values:
-Xmx2048m -XX:MaxPermSize=384m
Additionally, we recommend setting up alerts using our alerting functionality to send you an email when permgen usage grows above the 65% and 80% levels (see our documentation on alerts). The metric to alert on is
vm.memory.non-heap.usage
with thresholds of 0.65 (after 15 min) and 0.8 (after 5 min) respectively. The 0.65 alert lets you know that an issue may be approaching and can be used as a canary. The 0.8 alert is an indication that you need to put operations center into 'prepare for shutdown' mode (if you have any cluster operations in progress) and do a safe restart. If you repeatedly need to restart because of permgen, your system has likely grown to a scale where permgen should be increased to 512m. A permgen requirement above 512m points to some other, unexpected usage, so if you find you need to go past 512m for operations center, open a support ticket so that we can investigate what is causing such large permgen usage.
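The two-level permgen alert above can be illustrated with a minimal Python sketch. It is not how the product's alerting is implemented; the function names and the sample format (timestamp in seconds, usage ratio) are hypothetical and only show what "above 0.65 for 15 minutes" and "above 0.8 for 5 minutes" mean when applied to a series of vm.memory.non-heap.usage readings.

```python
from typing import List, Tuple

# Hypothetical sample format: (timestamp in seconds, usage ratio between 0.0 and 1.0)
Sample = Tuple[float, float]


def sustained_above(samples: List[Sample], threshold: float, duration_s: float) -> bool:
    """Return True if usage has stayed above threshold for at least duration_s,
    measured back from the most recent sample. Samples must be in time order."""
    if not samples:
        return False
    latest_ts = samples[-1][0]
    breach_start = None
    # Walk backwards to find where the current unbroken breach began.
    for ts, usage in reversed(samples):
        if usage <= threshold:
            break
        breach_start = ts
    if breach_start is None:
        return False
    return latest_ts - breach_start >= duration_s


def permgen_alert_level(samples: List[Sample]) -> str:
    """Classify non-heap usage using the thresholds from the text above."""
    if sustained_above(samples, 0.80, 5 * 60):
        return "critical"  # prepare for shutdown and do a safe restart
    if sustained_above(samples, 0.65, 15 * 60):
        return "warning"   # canary: an issue may be approaching
    return "ok"
```

For example, readings of 0.70 taken once a minute for 16 minutes would be classified as "warning", while a single spike above 0.8 would not trigger anything because it has not been sustained.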
We also recommend setting up an alert, using our alerting functionality, to watch overall heap usage. The metric to alert on is
vm.memory.heap.usage
over 0.85 for more than 5 minutes: if GC cannot bring usage below 0.85 within 5 minutes, there is certainly a shortage of heap memory allocated to operations center. Unless you have hundreds of agents or client controllers, a heap allocation of up to 4096m should be considered normal. Do not increase the heap by more than 512-1024m at a time. If you have several hundred shared agents (or agents connected by a shared cloud) and hundreds of client controllers, larger heaps may be required.
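The incremental sizing advice above can be sketched as a small Python helper. The function name, its parameters, and the returned flag are hypothetical and are not part of any CloudBees tooling; the sketch only encodes the guidance of raising the heap by no more than 1024m per step and treating 4096m as the usual ceiling for installations without hundreds of agents or client controllers.

```python
from typing import Tuple


def next_heap_setting(current_mb: int, step_mb: int = 512,
                      normal_ceiling_mb: int = 4096) -> Tuple[str, bool]:
    """Propose the next -Xmx value after a sustained heap alert.

    Returns the JVM flag and whether the proposal goes beyond the size
    considered normal for smaller installations."""
    if not 0 < step_mb <= 1024:
        # Assumption taken from the text: never grow by more than 1024m at a time.
        raise ValueError("increase the heap by at most 1024m per step")
    proposed_mb = current_mb + step_mb
    exceeds_normal = proposed_mb > normal_ceiling_mb
    return f"-Xmx{proposed_mb}m", exceeds_normal
```

Starting from the recommended 2048m, one default step yields -Xmx2560m; a proposal above 4096m is flagged so that it can be checked against the number of agents and client controllers before it is applied.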