I want to understand how the passwords used to log in to Jenkins, as well as other credentials that can be used in Jenkins jobs, are kept secure.
In order to accurately answer this question, we must first define some terms.
A secure hashing algorithm is a mapping function that takes one input and provides one output. The output is generally of a fixed size. This is a many-to-one relation, because multiple inputs can produce the same output. In other words, the function is not injective, and it is designed to be one-way: given only the output, it is computationally infeasible to find an input that generates it. This is generally used to verify that a particular input "seems" to be what we expected. In simpler terms, a hash is somewhat like an advanced checksum. Common use case: password storage. We do not need to know the password value; we just need to know whether the provided password (input) is the same as the one the user set for themselves.
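As a minimal illustration of these properties, the following Python sketch uses SHA-256 from the standard library. The `digest` helper and the sample inputs are hypothetical names chosen for this example; they are not part of Jenkins.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


short_input = digest(b"a")
long_input = digest(b"a" * 10_000)

# The output size is fixed (64 hex characters, i.e. 256 bits),
# whatever the size of the input.
print(len(short_input), len(long_input))

# The same input always produces the same output, so two values can be
# compared through their digests without keeping the values themselves.
print(digest(b"hunter2") == digest(b"hunter2"))

# A slightly different input produces a different digest.
print(digest(b"hunter2") == digest(b"hunter3"))
```

Nothing in the digest lets us compute the original input directly; the only way to test a candidate input is to hash it and compare the result.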
An encryption algorithm is also a mapping function between one input and one output. But as the output size depends on the input size, the relation is one-to-one for a given key: with the same key and parameters (such as the initialization vector), a given input always produces the same output, and each output can be reversed to the initial input. That’s an invertible, or two-way, function. This mechanism is mainly used for data that we need to protect while it is stored, but whose original value we will need to retrieve later. In simpler terms, encryption is like a safe that you can hand to a courier. Without the key, the safe cannot be opened. Common use case: secure storage of confidential data. When only some of the users should have access to this information, we want to ensure that only this subset of users has access to the "key" used to decrypt the data.
To apply these terms to Jenkins: we use hashing algorithms to check the passwords and API tokens that users use to log in to Jenkins, because those secrets are not used for any other purpose. We don’t need to store the passwords themselves in any recoverable form. In contrast, we use an encryption algorithm to securely store credentials that can be used by builds (for example, GitHub access credentials), because we need to be able to decrypt those credentials in order to send them to the external service. If we used a hashing algorithm for these job credentials, it would prevent us from using them for their intended purpose.
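To make the login case concrete, here is a hedged sketch of hash-based password verification. It is not Jenkins code: it uses PBKDF2-HMAC-SHA256 from the Python standard library as a stand-in for a dedicated password hash, and the function names, salt size, and iteration count are assumptions made for illustration only.

```python
import hashlib
import hmac
import os

# Illustrative work factor only; this is an assumption, not a recommendation.
ITERATIONS = 100_000


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_hash); only these two values are stored."""
    salt = os.urandom(16)  # a random salt makes equal passwords hash differently
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, derived


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the hash from the submitted password and compare it."""
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(derived, expected)


salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

The password itself is never written anywhere. That is sufficient for logging in, and it is exactly why the same approach cannot work for a credential that a build has to send to an external service.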
Specifically, we use BCrypt (a strong hash designed for password validation) to store user login passwords in the embedded Jenkins security realm. The code for that is available here. For the user API token, we store only a SHA-256 hash of the token’s random bytes. There is no need for a stronger hash, because the input is random, unlike a user-entered password with low entropy. The code for that is available here. Until Jenkins 2.128, the API token was stored using an encryption mechanism instead of a hash, but we corrected that behavior. See this blog post.
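The API token approach can be sketched as follows. This is an illustration of the general technique, not the Jenkins implementation: the function names, the 16-byte token length, and the hex encoding are assumptions.

```python
import hashlib
import hmac
import secrets


def new_api_token() -> tuple[str, str]:
    """Generate a random token and the SHA-256 hash that would be stored."""
    token = secrets.token_hex(16)  # 16 random bytes (length is an assumption)
    stored_hash = hashlib.sha256(token.encode("ascii")).hexdigest()
    # The token is shown to the user once; only stored_hash is persisted.
    return token, stored_hash


def token_matches(presented: str, stored_hash: str) -> bool:
    """Hash the presented token and compare it with the stored hash."""
    presented_hash = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(presented_hash, stored_hash)


token, stored_hash = new_api_token()
print(token_matches(token, stored_hash))
print(token_matches("not-the-token", stored_hash))
```

Because the token comes from a cryptographically secure random generator, guessing it means searching the full 128-bit space, so a fast hash is acceptable here in a way it would not be for a human-chosen password.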
Credentials are stored on disk using AES/CBC/PKCS5Padding with a 128-bit AES key. Because the master.key file in JENKINS_HOME is used to derive the other encryption keys, if its content leaks, all of the instance’s secrets and credentials must be considered compromised.