It seemed reasonable when I heard that pairing an authenticator with an identificator is a good security practice, but I could not give a concrete explanation of why. This post is an attempt to provide one.
For starters, let's agree on a few terms:
An authenticator is something used to prove a caller's or callee's identity, for example a secret or a signature.
An identificator is a non-secret reference to either a caller's or callee's identity (e.g. a username) or a key (public or secret) associated with the caller's identity.
Here are a few instances of identificators and authenticators to make this tangible.
Identificator & Authenticator pairs
- Username & Password, where username is a reference to the caller's identity (identificator) and password is the authenticator. Used in client-to-server and sometimes server-to-server authentication.
- API Key Id & API Secret Key, where API Key Id is a reference (identificator) to the API Secret Key (authenticator). Sometimes a caller can have more than one pair of these. Used in client-to-server and server-to-server authentication.
- Host name & server TLS certificate, where the host name is a reference to the callee's identity (identificator) and the signature in the TLS certificate is the authenticator. Used in server-to-client authentication.
- Sender's email address & signature generated with sender's private key, where email address is the identificator, and signature is the authenticator. Used in client-to-client authentication.
Identificator-less authenticator (or is it authenticator-less identificator?)
- API Key without Key Id. It must be included in requests in plaintext. For example, a NY Times API Key currently looks like a UUID with the dashes removed: 4dd3f7b027074ff89961cfa321a2dfd2
- SNMP v1/v2 community string, used to "authenticate" connections to network devices.
If the id-less API Key is transmitted as-is, but over a secure channel (e.g. HTTPS), one could claim that it remains secret and bears some authentication power. After all, your average webapp transmits your username and password in cleartext inside that channel when you first authenticate to get your session token. You lose your application-level accountability, but hey, if the outer layer can guarantee that this API key will only be used by a single identity (human or server), then why not?
On the other hand, because the SNMP v1/v2 protocol is cleartext, I am leaning towards calling the SNMP community string an identificator, as it neither remains secret on the wire nor can certify the authenticity of the caller. However, if you can guarantee your channel security without resorting to encryption... well, let's not delude ourselves.
- SSN is a number that, in the US, uniquely identifies citizens and residents (something a first and last name combination cannot do, as names are not unique across the population). As a low-entropy number with an identification role, it clearly should not be treated as a secret. In practice, however, many private and government establishments treat it as an authenticator and use it to confirm a person's identity.
So why should they come in pairs?
What follows is a list of issues that arise when an authenticator is not paired with an identificator. The first three apply to client-to-server and server-to-server authentication. The fourth and fifth apply to server-to-client and client-to-client authentication:
- The authenticator cannot be stored securely in the back end using the traditional salted-hash approach. If all authenticators are stored hashed with a per-record salt (as they should be) but without any identificators to reference them, then the authenticating party has to hash the presented authenticator with every stored salt before it can find a match. Authentication slows down for all callers as their number grows.
- Every new authenticator issued increases the attacker's chances of recovering a valid authenticator and compromising one of the callers. For example, if you randomly generate 128-bit authenticators and issue only 1 of them, it will take an attacker 2^128/2 = 2^127 guesses on average to recover it by brute-forcing the key space. With 1024 (2^10) active authenticators, the effort to recover a valid key drops to 2^127/2^10 = 2^117. With 32768 (2^15) active authenticators, it goes down to 2^112. To be equally secure, an authenticator has to employ more bits (become longer) compared to the case where it is accompanied by an identificator.
- With a non-local communication channel, the authenticator has to be transmitted on the wire at least once, with the first authentication call where it is exchanged for a session token. This is how webapp authentication works today, and it is deemed acceptable. However, if the authenticator is paired with an identificator, the security bar can be raised for authenticated calls by employing HMAC signing, where the authenticator is not sent over the wire at all (this is how AWS SigV4 works).
- The authenticating party may not be capable of storing authenticators for all callees and may rely on the presence of an identificator to retrieve the artifacts required to authenticate the callee. This is the case in client-to-client authentication between an email sender and receiver, where the receiver may have to retrieve the sender's public key before they can authenticate the message.
- The authenticating party may be required to communicate only with a particular callee holding a valid authenticator, and not with just any callee holding a valid authenticator. This may not matter for client-to-server authentication, but it is the case for server-to-client HTTPS authentication, where the web server has to prove its identity to the user agent. Had a server certificate contained only a validly signed public key, without a hostname, the user agent could not be convinced that the certificate was issued for the web site it was trying to talk to.
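The brute-force arithmetic in the second item can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it uses the same approximation as the text (half the key space divided by the number of valid authenticators).

```python
import math


def expected_guesses_log2(key_bits: int, active_keys: int) -> float:
    """Return log2 of the average number of guesses needed to hit
    any one of `active_keys` valid authenticators of `key_bits` bits."""
    # Half the key space on average, divided by the number of valid targets.
    return (key_bits - 1) - math.log2(active_keys)


def extra_bits_to_compensate(active_keys: int) -> int:
    """Extra key bits needed to get back to the single-key security level."""
    return math.ceil(math.log2(active_keys))


for n in (1, 1024, 32768):
    print(n, expected_guesses_log2(128, n), extra_bits_to_compensate(n))
```

With an identificator in the picture, the attacker has to target one specific secret, so the number of other issued keys stops mattering.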
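To illustrate the HMAC signing mentioned in the third item, here is a minimal sketch in which the caller sends a key id and a signature while the secret itself never goes on the wire. It is not AWS SigV4, which is considerably more involved; the `SECRETS` table, the key id, and the message format are all assumptions made up for the example.

```python
import hashlib
import hmac

# Hypothetical server-side table: key id (identificator) -> secret key (authenticator).
SECRETS = {"key-id-123": b"example-secret-do-not-use"}


def sign(secret: bytes, message: bytes) -> str:
    """Caller side: compute an HMAC-SHA256 signature over the request."""
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(key_id: str, message: bytes, signature: str) -> bool:
    """Server side: look up the secret by key id and recompute the signature."""
    secret = SECRETS.get(key_id)
    if secret is None:
        return False
    expected = sign(secret, message)
    # Constant-time comparison to avoid leaking how many characters matched.
    return hmac.compare_digest(expected, signature)


request = b"GET /orders"
signature = sign(b"example-secret-do-not-use", request)
print(verify("key-id-123", request, signature))
```

Without the key id, the server would have no way of knowing which secret to recompute the signature with, short of trying them all. Note that in this sketch the server must be able to read the secret itself in order to recompute the HMAC.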
Can identificator-less authenticators be stored securely?
The simple solution to storing identificator-less authenticators securely is to treat the first part of the authenticator as an identificator and the second part as the actual secret. When storing them, only the second part needs to be hashed. You'll need to ensure that the second part is long enough to withstand brute-force attacks. Also, unless a special separator is used between the two parts, the first part needs to be long enough that you don't risk running out of ids. With all these tricks we arrive at the secret key + key id scheme anyway, so why not do it properly from the very beginning?
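A minimal sketch of this split, assuming a "." separator, an in-memory `STORE` dictionary, and salted SHA-256 (all of which are illustrative choices, not a prescription). Salted SHA-256 is only reasonable here because the secret part is a random 128-bit value. A human-chosen password would call for a deliberately slow key derivation function instead.

```python
import hashlib
import hmac
import secrets

# Hypothetical table: key id -> (salt, salted hash of the secret part).
STORE = {}


def _hash(salt: bytes, secret: str) -> bytes:
    return hashlib.sha256(salt + secret.encode()).digest()


def issue_key() -> str:
    """Generate '<key id>.<secret>' and store only the salted hash of the secret."""
    key_id = secrets.token_hex(8)    # non-secret identificator part
    secret = secrets.token_hex(16)   # 128-bit secret part
    salt = secrets.token_bytes(16)
    STORE[key_id] = (salt, _hash(salt, secret))
    return f"{key_id}.{secret}"


def authenticate(api_key: str) -> bool:
    """Split the key, find the record by id, and hash with a single salt."""
    key_id, sep, secret = api_key.partition(".")
    record = STORE.get(key_id)
    if not sep or record is None:
        return False
    salt, stored_hash = record
    return hmac.compare_digest(_hash(salt, secret), stored_hash)


key = issue_key()
print(authenticate(key))
```

The lookup by key id means exactly one hash is computed per authentication attempt, no matter how many keys have been issued.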
You can also fall back to encrypting your authenticators with a secret stored separately, and indexing on the encrypted value. This has the obvious drawback of needing to bruteforce only one secret to recover all authenticators once the table with encrypted values is leaked.