Summary
The default settings of the CloudBees CD (CloudBees Flow) repository and the CloudBees CD (CloudBees Flow) agent work well in most cases. However, in some extreme cases, some settings need to be changed.
Problem
When multiple artifacts, especially large files, are published in parallel, publishing may fail after about 180 seconds (the default timeout of an ectool command). Lines like the following may appear in jagent.log:
2018-04-19T21:18:10.600 | DEBUG | AgentHttp-458 | | | | LogThreadCallback | Exiting Thread[AgentHttp-458,5,main]
2018-04-19T21:18:10.610 | DEBUG | AgentHttp-461 | | | | LogThreadCallback | Exiting Thread[AgentHttp-461,5,main]
2018-04-19T21:18:39.468 | DEBUG | AgentHttp-460 | | | | LogThreadCallback | Exiting Thread[AgentHttp-460,5,main]
2018-04-19T21:18:40.899 | DEBUG | AgentHttp-459 | | | | LogThreadCallback | Exiting Thread[AgentHttp-459,5,main]
2018-04-19T21:20:28.630 | DEBUG | AgentHttp-462 | | | | LogThreadCallback | Starting Thread[AgentHttp-462,5,main]
2018-04-19T21:20:30.126 | DEBUG | nio-014 | | | | ForwardingNIOResponseHandler | onResponseReceived with status: 400
2018-04-19T21:20:30.134 | DEBUG | nio-014 | | | | ForwardingNIOResponseHandler | onEntityEnclosed with length: -1
2018-04-19T21:20:30.168 | DEBUG | nio-014 | | | artifactRequest-190 | NIORequestImpl | onResponse: request[id=artifactRequest-190,submitted]
2018-04-19T21:20:30.168 | DEBUG | nio-014 | | | artifactRequest-190 | UnauthenticatedClientHandler | request-handling-state updated: done
2018-04-19T21:20:30.169 | DEBUG | AgentHttp-463 | | | | LogThreadCallback | Starting Thread[AgentHttp-463,5,main]
2018-04-19T21:20:30.169 | DEBUG | AgentHttp-463 | | | Request-190 | AgentArtifactsHandler | Continuing a suspended op for /artifacts/AVN_WIDE/AVN_WIDE_CPU/AVN_WIDE_CPU_AVN_WIDE_SK3EV_US_Build_908096_20180419_205201_artifact
2018-04-19T21:20:30.169 | DEBUG | AgentHttp-463 | | | Request-190 | UnauthenticatedClientHandler | Finished processing request
2018-04-19T21:20:30.169 | DEBUG | AgentHttp-463 | | | Request-190 | UnauthenticatedClientHandler | ArtifactsHandler elapsed time: 1,800,180.650ms
2018-04-19T21:20:32.366 | DEBUG | AgentHttp-464 | | | | LogThreadCallback | Starting Thread[AgentHttp-464,5,main]
2018-04-19T21:20:32.477 | DEBUG | AgentHttp-465 | | | | LogThreadCallback | Starting Thread[AgentHttp-465,5,main]
2018-04-19T21:20:32.480 | DEBUG | AgentHttp-462 | | | commanderRequest-9334 | HttpUtil | POST on port 6800 from 127.0.0.1:37461: /commanderRequest
2018-04-19T21:20:32.481 | DEBUG | AgentHttp-462 | | | commanderRequest-9334 | CommanderProtocolHandler | Forwarding commander request from 127.0.0.1:37461 to 10.230.9.251:8443:
You can increase the ectool command timeout and try again. However, if there are more publishing requests than the repository server can handle, publishing may still fail after a longer period (probably more than 30 minutes, because one setting defaults to 30 minutes).
Solution
The following settings can help in such cases.
Settings in the repository server, in the file "/conf/repository/server.properties":
Variable | Default Value | Note |
---|---|---|
REPOSITORY_REQUEST_TIMEOUT | 1800000 | Async request timeout, in milliseconds. A larger value gives parallel steps that started earlier more time to succeed. |
Changing this setting requires the repository server to be restarted.
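As a rough sketch of what the change might look like, the fragment below raises the repository request timeout from the default 30 minutes to 60 minutes. It assumes that server.properties uses the usual `KEY=value` properties syntax and that `#` starts a comment; the value 3600000 is only an illustrative choice, not a recommendation from this article.

```properties
# Illustrative value only: 60 minutes expressed in milliseconds.
# The default is 1800000 (30 minutes).
REPOSITORY_REQUEST_TIMEOUT=3600000
```

Pick a value that comfortably exceeds the longest time a single publishing step is expected to wait, then restart the repository server.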
Settings on the agent side, in the file "/conf/agent/agent.properties":
Variable | Default Value | Note |
---|---|---|
MAX_CONNECTIONS_PER_ROUTE | 20 | A larger value supports more parallel publishing. |
OUTBOUND_CONNECT_TIMEOUT | 30000 | A larger value gives parallel steps that started earlier more time to succeed. |
Changing these settings requires the agent to be restarted.
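A possible agent-side change is sketched below. As with the repository file, this assumes agent.properties uses `KEY=value` properties syntax with `#` comments. Both values simply double the defaults for illustration; suitable numbers depend on how many artifacts are published in parallel and on the capacity of the repository server.

```properties
# Illustrative values only; the defaults are 20 and 30000 respectively.
MAX_CONNECTIONS_PER_ROUTE=40
OUTBOUND_CONNECT_TIMEOUT=60000
```

It may be worth changing one setting at a time and re-running the parallel publishing job, so that the effect of each change can be observed in jagent.log.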