Thar Be Dragons!

The Maths

Before examining the password strength of passwords generated with this module we need to lay out the relatively simple maths underlying it all.

Maths Primer

A coin could be used as a very simple password generator. Each character in the password would be the result of a single coin toss. If the coin lands heads up, we add a 'H' to our password, if it lands tails up, we add a 'T'.

If you made a one-letter password in this way there would only be two possibilities, 'H', or 'T', or two permutations. If you made a two-letter password in this way there would be four possible combinations, or permutations, 'HH', 'HT', 'TH', and 'TT'. If you made a three-character password in this way there would be 16 permutations, a five character one would have 32 permutations, and so forth.

So, for a coin toss, which has two possible values for each character, the formula for the number of permutations 'P' for a given length of password 'L' is:

P = 2^L

Or, two to the power of the length of the password.

If we now swapped our coin for a dice, we would go from two possible values per letter, to six possible values per letter. For one dice roll there would be six permutations, for two there would be 36, for three there would be 108 and so on.

This means that for a dice, the number of permutations can be calculated with the formula:

P = 6^L

When talking about passwords, the set of possible symbols used for each character in the password is referred to as the password's alphabet. So, for the coin toss the alphabet was just H and T, and for the dice, it was 1, 2, 3, 4, 5, and 6. The actual characters used in the alphabet make no difference to the strength of the password, all that matters is the size of the alphabet, which we'll call A.

As you can probably infer from the two examples above, the formula for the number of possible permutations P for a password of length L created from an alphabet of size A is:

P = A^L

In the real world, our passwords are generally made up of a mix of letters, digits, and symbols. If we use mixed case that gives us 52 letters alone, then add in the ten digits from O to 9 and we're already up to 62 possible characters before we even start on the array of symbols and punctuation characters on our keyboards. It's generally accepted that if you include symbols and punctuation, there are 95 characters available for use in randomly generated passwords. Hence, in the real world, the value for A is assumed to be 95. When you start raising a number as big as 95 to even low powers the number of permutations quickly rises.

A two-character password with an alphabet of 95 has 9025 permutations, increasing the length to three characters brings that up to 857,375, and so on. These numbers very quickly become too big to handle. For just an 8-character password we are talking about 6,634, 204, 312,890,625 permutations, which is a number so big most people couldn't say it (what do you call something a thousand times bigger than a trillion?)

Because the numbers get so astronomically big so quickly, computer scientists use bits of entropy to measure password strength rather than the number of permutations. The formula to turn permutations into bits of entropy E is very simple:

E = Log(2)P

In other words, the entropy is the log to base two of the permutations. For our eight-character example that equates to about 52 bits.

There are two approaches to increasing the number of permutations, and hence the entropy, you can choose more characters, or, you can make the alphabet you are choosing from bigger.

The Entropy of XKPASSWD Passwords

Exactly how much entropy does a password need? That's the subject of much debate, and the answer ultimately depends on the value of the assets being protected by the password.

Two common recommendations you hear are 8 characters containing a mix of upper and lower case letters, digits, and symbols, or 12 characters with the same composition. These evaluate to approximately 52 bits of entropy and 78 bits of entropy respectively.

When evaluating the entropy of passwords generated by this module, it has to be done from two points of view for the answer to be meaningful. Firstly, a best-case scenario - the attacker has absolutely no knowledge of how the password was generated, and hence must mount a brute-force attack. Then, secondly from the point of view of an attacker with full knowledge of how the password was generated. Not just the knowledge that this module was used, but a copy of the dictionary file used, and, a copy of the configuration settings used.

For the purpose of this documentation, the entropy in the first scenario, the brute force attack, will be referred to as the blind entropy, and the entropy in the second scenario the seen entropy.

The blind entropy is solely determined by the configuration settings, the seen entropy depends on both the settings and the dictionary file used.

Calculating the bind entropy Eb is quite straightforward, we just need to know the size of the alphabet resulting from the configuration A, and the minimum length of passwords generated with the configuration L, and plug those values into this formula:

Eb = Log(2)(A^L)

Calculating A simply involves determining whether or not the configuration results in a mix of letter cases (26 or 52 characters), the inclusion of at least one symbol (if any one is present, assume the industry standard of a 33-character search space), and the inclusion of at least one digit (10 characters). This will result in a value between 26 and 95.

Calculating L is also straightforward. The one minor complication is that some configurations result in a variable-length password. In this case, assume the shortest possible length the configuration could produce.

The example password from the Philosophy section (!15.play-MAJOR.fresh.FLAT.23!) was generated using the preset WEB32. This preset uses four words of between four and five letters long, with the case of each word randomly set to all lower or all upper as the basis for the password, it then chooses two pairs of random digits as extra words to go front and back, before separating each word with a copy of a randomly chosen symbol, and padding the front and back of the password with a copy of a different randomly chosen symbol. This results in passwords that contain a mix of cases, digits, and symbols, and are between 27 and 31 characters long. If we add these values into the formula we find that the blind entropy for passwords created with this preset is:

Eb = Log(2)(95^27) = 163 bits

This is spectacularly secure! And, this is the most likely kind of attack for a password to face. However, to have confidence in the password we must also now calculate the entropy when the attacker knows everything about how the password was generated.

We will calculate the entropy resulting from the same WEB32 config being used to generate a password using the sample library file that ships with the XKPASSWD.

The number of permutations the attacker needs to check is purely the product of possible results for each random choice made during the assembly of the password.

Let's start with the words that will form the core of the password. The configuration chooses four words of between four and five letters long from the dictionary, and then randomises their case, effectively making it a choice from twice as many words (each word in each case).

The sample dictionary file contains 698 words of the configured length, which doubles to 1396. Choosing four words from that very large alphabet gives a starting point of '1396^4, or 3,797,883,801,856 permutations.

Next we need to calculate the permutations for the separator character. The configuration specifies just nine permitted characters, and we choose just one, so that equates to 9 permutations.

Similarly, the padding character on the end is chosen from 13 permitted symbols giving 13 more permutations.

Finally, there are four randomly chosen digits, giving 10^4, or 10,000 permutations.

The total number of permutations is the product of all these permutations:

Pseen = 3,797,883,801,856 * 9 * 13 * 10,000 = 2.77x10^17

Finally, we convert this to entropy by taking the base 2 log:

Eseen = Log(2)2.77x10^17 = ~57bits

What this means is that most probably, passwords generated with this preset using the sample dictionary file are spectacularly more secure than even 12 randomly chosen characters, and, that in the very unlikely event that an attacker knows absolutely everything about how the password was generated, it is still significantly more secure than 8 randomly chosen characters.

Because the exact strength of the passwords produced by this module depends on the configuration and dictionary file used, the constructor does the above math when creating an XPasswd object, and throws a warning if either the blind entropy falls below 78bits, or the seen entropy falls below 52 bits.