CS253 Lecture Summaries: Part XIII Authentication

The question for authentication - how can we build systems that are secure even when the attacker has the user’s password?

The key thing is defence in depth.

What is authentication? We want to verify that the user is who they say they are.

Authentication systems classically use three factors:

Something you know (eg a password)
Something you have (eg a phone, badge, or key)
Something you are (eg a fingerprint or biometric data)

The more factors used, the more sure we are that the user is who they say they are.

So for an ATM, you’ll provide two factors - something you have (the bank card), and something you know (the pin).

Remember the distinction between authentication and authorization:

Authentication - verify the user is who they say they are - login form, ambient authority, http auth.

Authorization - decide if a user has permission to access a resource (Access Control Lists, Capability URLs).

Here’s how the NIST Guidelines put it:

Authentication establishes confidence that the claimant has possession of the authenticator(s) bound to the credential, and in some cases in the attribute values of the subscriber (e.g. if the subscriber is a US citizen, is a student at a particular university, or is assigned a particular number or code by an agency or organization). Authentication does not determine the claimant’s authorizations or access privileges; this is a separate decision, and is out of these guidelines’ scope. RPs can use a subscriber’s authenticated identity and attributes with other factors to make authorization decisions…

E.g. if you give a user a session token, you then look up that session and check its permissions on each request. You don’t give them a token that contains all their permissions.
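A minimal sketch of that split, assuming in-memory stores: the session token only says who the user is, and permissions are looked up on every request. The names sessions, acl, createSession and canAccess are hypothetical, not from the lecture.

```javascript
const crypto = require('crypto')

// Hypothetical in-memory stores; a real service would use a database.
const sessions = new Map() // session token -> user id
const acl = new Map([['report-1', new Set(['alice'])]]) // resource -> user ids allowed

// Authentication happened at login; the token only identifies the user.
function createSession(userId) {
  const token = crypto.randomBytes(32).toString('hex')
  sessions.set(token, userId)
  return token
}

// Authorization: checked on every request, never baked into the token.
function canAccess(token, resource) {
  const userId = sessions.get(token)
  if (userId === undefined) return false // unknown or expired token
  const allowed = acl.get(resource)
  return allowed !== undefined && allowed.has(userId)
}
```

Because permissions live on the server, removing a user from the access control list takes effect on their next request, without reissuing any token.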

Usernames

Usernames should be stored case insensitively (alex is the same user as Alex). Make sure you store them case insensitively in the db (eg by forcing them to lower case). If you want to remember user display preference on casing, store that in another column.
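As a rough sketch of this, assuming an in-memory map in place of a real table: the lower-cased name is the unique key, and the user's preferred casing is kept in a separate field. The names canonicalUsername, users and register are hypothetical.

```javascript
// Hypothetical helper: the canonical form used for lookups and uniqueness checks.
function canonicalUsername(input) {
  return input.toLowerCase()
}

// Hypothetical in-memory "table": canonical name -> user record.
const users = new Map()

function register(displayName) {
  const key = canonicalUsername(displayName)
  if (users.has(key)) return false // already taken, whatever the casing
  users.set(key, { username: key, displayName }) // casing preference kept separately
  return true
}
```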

Usernames should be unique - two users should not share the same username.

Passwords

Users choose weak passwords - most popular passwords are password 123456 qwerty abc123 monkey letmein etc.

So we enforce password requirements. But what should they be?

Here’s some typical outdated advice:

ensure they’re “complex” ie numeric, alphabetic, and special symbol characters
force users to change them regularly
require new passwords not previously used
These rules produce passwords such as P@ssw0rd1

Even worse practices you see often:

Maximum length of 8-10 chars
Minimum password age policy (to prevent password requirement dodging)
disable cut and paste
password hints that lack sufficient entropy
show on screen keyboard and make user click to enter password

What we’ve learned since the early advice:

Complex isn’t necessarily strong. Mixing character types doesn’t lead to stronger passwords: Choosing multiple words from a suitably large dictionary may result in stronger passwords even if all the words appear in dictionaries, are lower case, and no punctuation is used.
Instead check passwords against known leaked breach data
Changing passwords regularly leads to weak passwords
Length is the most important factor
User passwords are way too short. Cracking a 9 character password takes 2 minutes, 10 characters takes 2 hours, 11 characters 6 days, 12 characters 1 year, 13 characters 64 years.

Current best practices:

Minimum pw length should be at least 8 characters (8 is still low though)
Maximum pw length should be at least 64 characters. Do not allow unlimited length to avoid password denial of service attacks.
common gotcha - bcrypt only uses the first 72 bytes of its input (72 ASCII characters, fewer if the password contains multi-byte characters).
check passwords against known breach data
rate-limit authentication attempts
encourage/require use of second factor
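A minimal sketch of the length and breach-data checks, under stated assumptions: breachedPasswords is a tiny hypothetical stand-in for a real breach corpus, and the limits are illustrative values rather than ones fixed by the lecture.

```javascript
// Hypothetical stand-in for breach data; a real check would use a far larger dataset.
const breachedPasswords = new Set(['password', '123456', 'qwerty', 'abc123', 'monkey', 'letmein'])

const MIN_LENGTH = 8  // illustrative minimum
const MAX_LENGTH = 64 // illustrative maximum

// Returns a list of problems; an empty list means the password is acceptable.
function checkPassword(password) {
  const problems = []
  const length = [...password].length // count Unicode code points, not UTF-16 units
  if (length < MIN_LENGTH) problems.push('too short')
  if (length > MAX_LENGTH) problems.push('too long') // reject, never silently truncate
  if (breachedPasswords.has(password)) problems.push('found in breach data')
  return problems
}
```

Note that there is no character-class rule here: a long lower-case passphrase passes, while a short "complex" password does not.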

Common implementation mistakes:

Do not silently truncate long passwords
Do not restrict characters (unicode and whitespace should be allowed)
Do not include passwords in plaintext log files (surprisingly common)
Use TLS for all traffic

Network Based Guessing Attacks

Whereas systems choose keys at random, users choosing a password will select from a very small subset of possible passwords of a given length. Many will choose very similar values.

This makes them vulnerable to guessing attacks. There are three primary types:

Brute force (testing multiple passwords from dictionary or other source against a single account)
Credential stuffing (testing username/password pairs obtained from the breach of another site)
Password spraying (testing a single weak password against a large number of accounts)

Defences: rate limit authentication routes (e.g. with the express-rate-limit package).

Keep track of IP addresses and limit the number of unsuccessful attempts.

Temporarily ban the user after too many unsuccessful attempts.
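The idea of counting failed attempts can be sketched without any framework. This is a simplified in-memory version, not how any particular rate-limiting package works; the threshold, window and function names are assumptions.

```javascript
const MAX_FAILURES = 5           // illustrative threshold
const WINDOW_MS = 15 * 60 * 1000 // illustrative 15 minute window

// key (an IP address or a username) -> timestamps of recent failed attempts
const failures = new Map()

// Drop failures that have aged out of the window and return the rest.
function recentFailures(key, now) {
  const recent = (failures.get(key) || []).filter(t => now - t < WINDOW_MS)
  failures.set(key, recent)
  return recent
}

// Call before checking credentials; false means "reject without checking".
function isAllowed(key, now = Date.now()) {
  return recentFailures(key, now).length < MAX_FAILURES
}

// Call after each failed login attempt.
function recordFailure(key, now = Date.now()) {
  recentFailures(key, now).push(now)
}
```

Keying by IP address slows down spraying and stuffing from one source, while keying by username gives the temporary ban on a single account; a service may want both.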

Another defence is captcha - reverse Turing test. CAPTCHA stands for ‘Completely Automated Public Turing test to tell Computers and Humans Apart’.

Problems with CAPTCHAs: they take an average person 10 seconds (wasted time), they are difficult for users with visual impairments, and the security of a CAPTCHA is only as strong as its weakest form.

Attackers can proxy captcha to another user in real time.

Dark market services offer captcha solving services powered by humans.

reCAPTCHA builds up a trust score across sites over time. If your trust score is high enough you don’t get the captcha test.

Another defence is reauthentication for sensitive features. Before actions such as changing the password, changing the email, or adding a new shipping address, you can add a re-authentication step.

This is a defence in depth technique against XSS and CSRF.

Response Discrepancy Information Exposure

Authentication systems are vulnerable to leaking information that the attacker should not know.

Happens when the software provides different responses to incoming requests in a way that allows an attacker to determine system state information that is outside their control sphere.

Defence: Respond with a generic error message regardless of whether the username/pw was incorrect, the account does not exist, the account is locked or disabled.

Don’t forget password reset and account creation!

So don’t do this:

‘Login for User foo: Invalid Password’
‘Login failed, invalid user id’
‘Login failed, account disabled’
‘Login failed, user is not active’

Do this:

‘Login failed, invalid user id or password’

More annoying for the user, but more secure.

Don’t do this:

We just sent you a password reset link!
This email address doesn’t exist in our database.

Do this:

If this email address is in our database, we will send you a reset link.

Pay attention to HTTP status codes - always send generic 403 if unsuccessful.

Timing can also be an issue - if, for example, we check first for a user, then do a bunch of work only if they exist, an attacker can time the response to see if the user exists.

The defence is to do the same operations regardless. Avoid early returns in authentication code.
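One way to sketch "same operations regardless": hash against a dummy record when the username is unknown, and return one generic failure for every reason. This assumes Node's built-in scrypt with a per-user salt for password hashing; the users store, addUser and login are hypothetical names.

```javascript
const crypto = require('crypto')

const hashPassword = (password, salt) => crypto.scryptSync(password, salt, 32)

// Hypothetical user store: username -> { salt, hash, disabled }
const users = new Map()
function addUser(username, password, disabled = false) {
  const salt = crypto.randomBytes(16)
  users.set(username, { salt, hash: hashPassword(password, salt), disabled })
}

// Dummy record so unknown usernames cost the same hashing work as known ones.
const dummySalt = crypto.randomBytes(16)
const dummy = { salt: dummySalt, hash: hashPassword('not a real password', dummySalt), disabled: true }

const GENERIC_FAILURE = { ok: false, message: 'Login failed, invalid user id or password' }

function login(username, password) {
  const user = users.get(username) || dummy                    // no early return for unknown users
  const candidate = hashPassword(password, user.salt)          // always do the hashing work
  const matches = crypto.timingSafeEqual(candidate, user.hash) // constant-time comparison
  if (matches && !user.disabled) return { ok: true }
  return GENERIC_FAILURE // same response for every failure reason
}
```

This only evens out the obvious differences; as the notes say, whether the remaining timing gap matters has to be measured on the real service.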

If this really matters to your service, empirically test it.

Storing Passwords

Never, ever, ever store passwords in plain text.

Hash the plaintext password, and store the hash in the database.

We need to use a cryptographic hash function (see earlier lectures for properties).

We don’t necessarily want the function to be quick (attackers should find it slow).

We want the avalanche effect - small change to the original message should produce large change to the hash.

Here’s an example in node:

const crypto = require('crypto')
const sha256 = s => crypto.createHash('sha256').update(s).digest('hex')

const passwordHash = sha256('some password')

//later
const isValid = sha256('other password') === passwordHash

But there’s an issue. The hash function is deterministic, which means if we just store the hash we can see if users have the same password.

We can also do pre-computed lookup attacks.

Let’s say I know that a site uses sha256 as its hash function. I can then put a load of common passwords through the function and save the outputs. Then, when I get the victim’s db, I just compare each user’s password hash to my set. If I find a match I know the password.

My precomputed table is often loosely called a rainbow table (strictly, a rainbow table is a space-efficient variant of such a lookup table).
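A small sketch of the lookup attack against unsalted sha256 hashes. The password list and the leaked rows are made up for illustration.

```javascript
const crypto = require('crypto')
const sha256 = s => crypto.createHash('sha256').update(s).digest('hex')

// Attacker's preparation: done once, reusable against any site using unsalted sha256.
const commonPasswords = ['password', '123456', 'qwerty', 'abc123', 'monkey', 'letmein']
const lookup = new Map(commonPasswords.map(p => [sha256(p), p]))

// Hypothetical leaked table of unsalted hashes.
const leaked = [
  { username: 'alice', passwordHash: sha256('monkey') },
  { username: 'bob', passwordHash: sha256('monkey') },
  { username: 'carol', passwordHash: sha256('a long unusual passphrase') },
]

// Recover every password whose hash appears in the precomputed table.
const cracked = leaked
  .filter(row => lookup.has(row.passwordHash))
  .map(row => ({ username: row.username, password: lookup.get(row.passwordHash) }))
```

Here alice and bob share a hash, so the leak also reveals that they chose the same password, and both fall to a single table lookup. carol's passphrase is not in the table and is not recovered.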

The solution is password salts. The goals here are:

Prevent revealing that two users have chosen identical passwords.
Add entropy to weak passwords to make pre-computed lookup attacks intractable.

Solution:

A salt is a fixed-length cryptographically-strong random value.
No need to keep the salt secret, it can be stored alongside the password (salt is usually 16, 32, or 64 bytes)
Concatenate the salt and password before hashing.

You can use the bcrypt library. bcrypt is a very popular password hashing function with an expensive key setup algorithm. It automatically handles all the password salting complexity and includes the salt in the hash output.
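To show the salting steps without a third-party dependency, here is a sketch that swaps in Node's built-in scrypt as the slow hash instead of bcrypt. The "salt:hash" storage format and the function names are assumptions made for this example, not bcrypt's actual output format.

```javascript
const crypto = require('crypto')

// Returns "salt:hash" in hex so the salt is stored alongside the hash.
function hashPassword(password) {
  const salt = crypto.randomBytes(16)                // fresh random salt per password
  const hash = crypto.scryptSync(password, salt, 32) // deliberately slow key derivation
  return `${salt.toString('hex')}:${hash.toString('hex')}`
}

function verifyPassword(password, stored) {
  const [saltHex, hashHex] = stored.split(':')
  const expected = Buffer.from(hashHex, 'hex')
  // Re-derive with the stored salt, then compare in constant time.
  const actual = crypto.scryptSync(password, Buffer.from(saltHex, 'hex'), expected.length)
  return crypto.timingSafeEqual(actual, expected)
}
```

Hashing the same password twice now gives two different stored values, so identical passwords are no longer visible in the database and a table precomputed without the salt is useless.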

Multi-Factor Authentication

Even if we’ve done everything right, an attacker can breach our secure database and get many users’ passwords very easily.

But if we use MFA, accounts are 99.9% less likely to be compromised (according to a Microsoft blog post).

Strong passwords don’t protect against credential stuffing, phishing, keystroke logging, local discovery, extortion.

They only protect against password spraying and brute force.

So we want to require MFA, but how?

To preserve user experience we only require MFA when we see:

A new browser/device or IP address.
An unusual country or location
An IP address on a known blocklist
An IP address that has tried to login to multiple accounts
A login attempt that appears to be scripted rather than manual
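The conditions above can be sketched as a single predicate. Every field name here is hypothetical, and how each signal is computed (device fingerprinting, blocklists, bot detection) is outside this sketch.

```javascript
// All field names are hypothetical; each line mirrors one condition from the list.
function shouldRequireMfa(attempt, account) {
  return Boolean(
    !account.knownDevices.has(attempt.deviceId) ||  // new browser/device
    !account.knownIps.has(attempt.ip) ||            // new IP address
    !account.usualCountries.has(attempt.country) || // unusual country or location
    attempt.ipOnBlocklist ||                        // IP address on a known blocklist
    attempt.accountsTriedFromIp > 1 ||              // IP has tried multiple accounts
    attempt.looksScripted                           // appears scripted rather than manual
  )
}
```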

How do we implement it?

We can try TOTP (time-based one time passwords) - authenticator app, or text.

Server creates a secret key for a specific user
Server shares secret key with the user’s phone app (eg in the form of a QR code)
Phone app initializes a counter
Phone app generates a one time password using secret key and counter
Phone app changes the counter after a certain interval and regenerates the one time password

Note the phone app doesn’t require a server connection - there’s no network communication here.

Then the server does the same thing - it sets the same counter and can then verify that the user has the device with the key.

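So the server just takes the current time and divides it by 30 seconds, rounding down (Math.floor(Date.now() / (30 * 1000))), to produce the counter. It then computes an HMAC of the counter, encoded as an 8-byte big-endian buffer (counterBuffer here), keyed with the shared secret:

const hash = crypto.createHmac('sha1', secretKey).update(counterBuffer).digest()

Then the hash is truncated to get the final six digit number: the low four bits of the last byte select an offset, the four bytes at that offset are read as a 31-bit integer, and that integer is reduced modulo 1,000,000.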
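Putting the steps together, here is a sketch of the whole computation following RFC 4226 (HOTP) and RFC 6238 (TOTP). The function names and default parameters are choices made for this example.

```javascript
const crypto = require('crypto')

// HOTP (RFC 4226): HMAC-SHA1 over an 8-byte big-endian counter, then dynamic truncation.
function hotp(secretKey, counter, digits = 6) {
  const counterBuffer = Buffer.alloc(8)
  counterBuffer.writeBigUInt64BE(BigInt(counter))
  const hash = crypto.createHmac('sha1', secretKey).update(counterBuffer).digest()

  const offset = hash[hash.length - 1] & 0x0f           // low 4 bits of the last byte
  const binary = hash.readUInt32BE(offset) & 0x7fffffff // 31-bit integer at that offset
  return String(binary % 10 ** digits).padStart(digits, '0')
}

// TOTP (RFC 6238): the counter is the number of 30 second steps since the Unix epoch.
function totp(secretKey, nowMs = Date.now(), stepSeconds = 30) {
  return hotp(secretKey, Math.floor(nowMs / (stepSeconds * 1000)))
}
```

With the RFC test secret, totp(Buffer.from('12345678901234567890'), 59000) gives '287082'. To verify a login, the server computes the same value and compares it with what the user typed; servers commonly also accept the adjacent time step to tolerate clock drift.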

Alex's Notes
