The scrypt parameters

The recommended scrypt parameters in the Go docs were recently brought up for discussion given they haven't changed since 2009.

Even if at this point I memorized the three numbers (N=16384, r=8, p=1) I only have a vague understanding of their meaning, so I took some time to read the scrypt paper.

It's an enjoyable and witty read, even if mathy at times, with lots of future predictions and reality modeling. Really drives across how scrypt is a fine piece of engineering. Also, it's single column with numbered pages, which earns it 100 points in my book.

The definitions are nested, each building on top of the previous one. In this post I summed up how each parameter impacts the whole scrypt algorithm. Finally, I had a look at what parameters you should use in 2017.

𝑟

𝑟 is the second parameter, but we start with it because it's used by the deepest nested function, BlockMix.

BlockMix turns a hash function with 𝑘-bit long inputs and outputs into a hash function with 2𝑟𝑘-bit long inputs and outputs. That is, it makes the core hash function in scrypt 2𝑟 wider.

It does that by iterating the hash function 2𝑟 times, so both memory usage (to store the hash values) and CPU time scale linearly with it. That is, if 𝑟 doubles the resources double.

That's useful because scrypt applies the hash to "random" memory positions. CPUs load memory in fixed-size blocks called cache lines. If the hash block size is smaller than the cache line, all the rest of the loaded line will be wasted memory bandwidth. Also, it dilutes the memory latency cost. Percival predicted both cache line sizes and memory latencies would increase over time, so made the hash size tunable to prevent scrypt from becoming latency-bound.

I have read that 𝑟 tunes memory usage, and believed it meant it is a memory-only work factor. That is incorrect because both CPU and memory scale with 𝑟. Also, while 𝑟 acts as a work factor, it's unclear increasing it provides the same security as 𝑁 (since there is no added randomization in memory accesses, see below), so it shouldn't be used as one.

𝑁

𝑁 is the one and only work factor.

Memory and CPU usage scale linearly with 𝑁. The mixing function, ROMix, stores 𝑁 sequential hash results in RAM, to then load them in a random order and sequentially xor and hash them.

The reason 𝑁 must be a power of two is that to randomly select one of the 𝑁 memory slots at each iteration, scrypt converts the hash output to an integer and reduces it mod 𝑁. If 𝑁 is a power of two, that operation can be optimized into simple (and fast) binary masking.

Estimating scrypt memory usage

scrypt requires 𝑁 times the hash block size memory. Because of BlockMix, the hash block size is 2𝑟 the underlying hash output size. In scrypt, that hash is the Salsa20 core, which operates on 64-bytes blocks.

So the minimum memory requirement of scrypt is:

𝑁 × 2𝑟 × 64 = 128 × 𝑁 × 𝑟 bytes

For 𝑁 = 16384 and 𝑟 = 8 that would be 16 MiB. It scales linearly with 𝑁 and 𝑟, and some implementations or APIs might cause internal copying doubling the requirement.

𝑝

𝑝 is used in the outmost function, MFcrypt. It is a parallelization parameter. 𝑝 instances of the mixing function are run independently and their outputs concatenated as salt for the final PBKDF2.

𝑝 > 1 can be handled in two ways: sequentially, which does not increase memory usage but requires 𝑝 times the CPU and wall clock time; or parallelly, which requires 𝑝 times the memory and effective CPU time, but does not increase wall clock time.

So 𝑝 can be used to increase CPU time without affecting memory requirements when handled sequentially, or without affecting wall clock time when handled parallelly. However, it offers attackers the same opportunity to optimize for processing or memory.

Parameters for 2017

We apply the same methodology of the paper to pick recommended 𝑁 values for interactive logins and file encryption: the biggest power of two that will run in less than 100ms and 5s respectively on "the CPU in the author's laptop" (a 3.1 GHz Intel Core i5).

func main() {  
    for n := uint8(14); n < 22; n++ {
        b := testing.Benchmark(func(b *testing.B) {
            for i := 0; i < b.N; i++ {
                scrypt.Key([]byte("password"), []byte("salt"), 1<<n, 8, 1, 32)
            }
        })
        t := b.T / time.Duration(b.N)
        fmt.Printf("N = 2^%d\t%dms\n", n, t/time.Millisecond)
    }
}
  • interactive logins: 2^15 — 1 << 15 — 32 768 — 86ms
  • file encryption: 2^20 — 1 << 20 — 1 048 576 — 3802ms

Curiously enough, the execution time of 𝑁 = 2^20 is exactly the same as in the paper's Table 1, while the sub-100ms value went from 2^14 to 2^15.

Cache line sizes have not significantly increased since 2009, so 8 should still be optimal for 𝑟.

If we really wanted to insist that CPUs have changed in 10 years we could say that more cores are now available, and increase the 𝑝 factor. However, common implementations don't spread the load of 𝑝 and instead compute each instance sequentially. Also, many use cases involve processing multiple parallel requests, so the available cores are not idle. So it seems ok to leave 𝑝 at 1.

Final miscellaneous notes

Colin Percival seems to agree [1] [2] on the new parameters.

Since the final output of scrypt is generated by PBKDF2(HMAC‑SHA256, Password, MixingOutput, 1), even if everything about scrypt were broken, it would still be a secure KDF as long as PBKDF2 with 1 iterations is. (While scrypt uses PBKDF2, it doesn't use it for its work factor.)

Best quote from the paper:

those few organizations which have the resources and inclination to design and fabricate custom circuits for password-cracking tend to be somewhat secretive

If you like digging into a cryptography paper now and then, you might enjoy following me on Twitter.

This post features my favourite Unicode points: U+2011 NON-BREAKING HYPHEN, U+202F NARROW NO-BREAK SPACE and the Mathematical Alphanumeric Symbols.