Human Readable GUID


I'm writing a small system that will allow me to sell my band's music at gigs by generating vouchers that can be redeemed for MP3s at our website.

The vouchers will need a code that the user types in. The code needs to have the following qualities:

  1. Some level of human readability in terms of length and content, to prevent user frustration and data entry error.
  2. Given one voucher code, not trivial to guess another voucher code.

If I use GUIDs I'm concerned about point 1. If I use an incrementing integer I'm concerned about point 2. There has to be some happy medium in between, right? I thought perhaps this work has already been done and there's an ideal solution waiting out there for me. In the absence of that, I'm thinking I'll go with a random alphanumeric string, or possibly letters only (excluding I and O for clarity), and have the application block IP addresses that fail X number of times, which would indicate a possible brute force attack. If I went with that, how long of a string and what value of X would work, and why?

Thanks for your help!


Update: I wasn't totally explicit about the method: I will generate lists of voucher codes for printing, then enter the "sold" codes after a gig. Therefore I think elements like a checksum are not necessary like they are in software keys that don't use validation servers.

13

There are 13 best solutions below

0
On BEST ANSWER

Just 8 alphanumeric characters (excluding I and O) give 34^8 = 1,785,793,904,896 possible combinations. That's for all intents and purposes unguessable as long as you don't have 5 billion vouchers.
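
As a rough sketch of this approach, the snippet below draws 8 characters from the 34-symbol alphabet using Python's `secrets` module and estimates how many random guesses an attacker would need, on average, to hit one issued code. The function names and the figure of 10,000 issued vouchers are assumptions for illustration only.

```python
import secrets
import string

# Uppercase letters and digits, minus I and O: 36 - 2 = 34 symbols.
ALPHABET = "".join(c for c in string.ascii_uppercase + string.digits if c not in "IO")

def make_code(length: int = 8) -> str:
    """Return one random voucher code drawn from a cryptographic RNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def guesses_per_hit(issued: int, length: int = 8) -> float:
    """Average number of random guesses needed to hit one issued code."""
    return len(ALPHABET) ** length / issued

if __name__ == "__main__":
    print(make_code())
    print(len(ALPHABET) ** 8)        # 1785793904896
    print(guesses_per_hit(10_000))   # roughly 1.8e8 guesses per valid code
```

With numbers like these, even a generous lockout threshold (the X in the question) leaves an attacker very far from a lucky hit.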

2
On

5 blocks of 5 characters each should be sufficient - four blocks for the "key", the fifth as a checksum to ensure validity. And of course, don't use the whole keyspace.

That's roughly how software serial numbers appear to be laid out, anyway.
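
One possible reading of this layout, sketched in Python: 20 random characters form the key and a fifth block is derived from them as a check. The 32-symbol alphabet, the HMAC-based check block and the `SECRET` value are all assumptions of mine, not something the answer specifies.

```python
import hashlib
import hmac
import secrets

ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"  # 32 symbols, no I, O, 0, 1
SECRET = b"replace-with-your-own-secret"        # hypothetical server-side key

def _check_block(body: str) -> str:
    """Derive a 5-character check block from the 20-character body."""
    digest = hmac.new(SECRET, body.encode("ascii"), hashlib.sha256).digest()
    return "".join(ALPHABET[b % 32] for b in digest[:5])

def make_key() -> str:
    """Four random blocks plus one check block, joined with dashes."""
    body = "".join(secrets.choice(ALPHABET) for _ in range(20))
    full = body + _check_block(body)
    return "-".join(full[i:i + 5] for i in range(0, 25, 5))

def is_well_formed(key: str) -> bool:
    """True if the fifth block matches the first four (typo detection)."""
    raw = key.replace("-", "").upper()
    if len(raw) != 25 or any(c not in ALPHABET for c in raw):
        return False
    return hmac.compare_digest(_check_block(raw[:20]), raw[20:])
```

The check block only catches typos and casual guessing; whether a code has actually been sold would still be looked up on the server.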

3
On

You could use a Markov Chain trained on English syllables to create a sentence composed of pronounceable-gibberish words. Just add the generated sentence to a database of valid vouchers when you print them (and invalidate them when they're redeemed, of course).
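
A minimal sketch of the idea in Python, with two simplifications that are my assumptions: the chain works on letters rather than syllables, and it is trained on a tiny stand-in word list. A corpus this small yields very little randomness, so treat it as an illustration of the mechanics only.

```python
import secrets
from collections import defaultdict

# Tiny stand-in corpus (an assumption); a real one would be far larger.
CORPUS = ["guitar", "melody", "rhythm", "chorus", "singer", "tempo",
          "harmony", "ballad", "encore", "record", "stereo", "treble"]

def train(words, order=2):
    """Map each `order`-letter context to the letters seen after it."""
    model = defaultdict(list)
    for w in words:
        padded = "^" * order + w + "$"   # ^ = start marker, $ = end marker
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model

def make_word(model, order=2, max_len=8):
    """Walk the chain from the start context until the end marker."""
    context, out = "^" * order, []
    while len(out) < max_len:
        nxt = secrets.choice(model[context])
        if nxt == "$":
            break
        out.append(nxt)
        context = context[1:] + nxt
    return "".join(out)

def make_phrase(model, words=4):
    """Join several generated words into one voucher phrase."""
    return "-".join(make_word(model) for _ in range(words))

MODEL = train(CORPUS)
```

Each phrase from `make_phrase(MODEL)` would then be stored in the voucher database at print time, as described above.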

1
On

Hmm, I don't know how most systems work, but I think it would be neat and simple to define a static number and multiply it by another, random number. Then if the big GUID is a multiple of your static number, you are good.

Easy to produce, not easy to guess a new one (short-term use only).

const long long i = 61234;     // secret static multiplier
long long j = rand() % 99999;  // random factor in 0..99998 (needs <stdlib.h>)
long long GUID = i * j;        // 64-bit product; valid codes are multiples of i

This will give you a phone-number-length GUID.

Only 99999 uses though! Doh.

1
On

AOL used to use a random combination of two words for the CDs they sent out. You can take the same approach, and just increase the number of words to get the odds that you require.

0
On

You can try something like a random letter sequence generator. You can mix and match letters and numbers as well.

0
On

One simple solution is to call the getHashCode method that most languages have on their string types. Set the string to some word from your list of approved words, then call getHashCode on it, and the result will be your key. To verify a key, compare it against your list of existing word hashes, and maybe delete it from the list so it can't be used again.

1
On

I would use your own encoding scheme. In addition to omitting I and O, for optimal readability it's also a good idea to omit all but one letter out of near-homonym sets (C/E, M/N) and multisyllabic letters, such as W, and of course stick to one case.

As for length, you could use 60 bits plus a 4-bit checksum. 64 bits is enough to store the time at millisecond granularity for hundreds of millions of years, so it's for all practical purposes unguessable. At, say, 4 bits per letter, that's 16 letters long. Even half that length is probably plenty.

Another way to think of this is in the form of automobile license plates: 3 letters and 3 numbers (26^3 × 10^3 = 17,576,000 combinations) is enough to cover a pretty large state, and tends to be very readable. Unless you provide a way for someone to hack codes at high speed, they certainly won't be guessable at human time scales.

3
On

Probably best to avoid all the vowels[*], thus avoiding all the swearwords.

[*] Including W if you're Welsh!

2
On

I'm assuming you're getting an email address when they purchase the voucher (you should). If so, why not just email them a single-use GUID? That way both you and they have a record of it, you can track redemptions, you don't run the risk of guessing (or at least not one worth bothering with), the user doesn't have to remember anything because it's there in the email, and you don't have to code anything.

They give you email address. You email GUID (with link). They click link and get song. GUID use is registered in system and will no longer work.
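
The flow above might look roughly like this in Python. The `VoucherStore` class is a hypothetical in-memory stand-in for a real database table, and sending the email itself is left out.

```python
import uuid

class VoucherStore:
    """In-memory stand-in for the database table of issued vouchers."""

    def __init__(self):
        self._unused = {}   # token -> email address it was sent to

    def issue(self, email: str) -> str:
        """Create a single-use token for `email`; the caller emails the link."""
        token = str(uuid.uuid4())   # random (version 4) GUID
        self._unused[token] = email
        return token

    def redeem(self, token: str) -> bool:
        """Return True the first time a valid token is used, False afterwards."""
        return self._unused.pop(token, None) is not None
```

A call such as `store.issue("fan@example.com")` returns the token to embed in the emailed link, and `store.redeem(token)` succeeds exactly once.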

0
On

Why not just go with the GUID and then replace any questionable characters with a different letter (so 0 becomes 'h', 1 becomes 'q', and so forth)?
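
A small Python sketch of that substitution, using only the two replacements named above (any further substitutions would be your own choice). Since h and q never occur in hexadecimal digits, the mapping can be reversed when the user types the code back in.

```python
import uuid

# Only the two substitutions named in the answer; extend as you see fit.
TO_FRIENDLY = str.maketrans({"0": "h", "1": "q"})
FROM_FRIENDLY = str.maketrans({"h": "0", "q": "1"})

def friendly(guid: uuid.UUID) -> str:
    """Render a GUID as 32 characters with 0 and 1 swapped out."""
    return guid.hex.translate(TO_FRIENDLY)

def parse(code: str) -> uuid.UUID:
    """Undo the substitution and rebuild the original GUID."""
    return uuid.UUID(hex=code.lower().translate(FROM_FRIENDLY))
```

Note that this changes only which characters appear; the code is still 32 characters long.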

0
On

Well, if you really want human readable, you can use BubbleBabble. Create a Perl script like the following:

#!/usr/bin/perl
use Digest::BubbleBabble qw(bubblebabble);
use Digest::SHA1 qw(sha1);
print bubblebabble(Digest => sha1(join(' ', @ARGV))), "\n";

Then feed it any command line argument you want to get output like the following:

xogan-nydut-zogiv-kotyn-ledah-taseb-gyhib-tucel-vudul-mykom-mexax

Or if Perl's not your preference, you can use PWGen (also available online) to get output like this:

aiCee5om Ohxai2is tae3Gael Gaeth7ei ooCh0ish

Honestly, this level of human readability is overkill; RickNZ's answer should work just fine (and is pretty close to what we did for some software keys). But BubbleBabble is kind of fun.

0
On

Context

  • human-readable UUID
  • language-independent algorithm

Problem

  • devise an algorithm for generating "human readable" UUIDs (HR-UUID)
  • HR-UUID should be robust against brute-force guesses
  • entry and recall by a human being should be straightforward and not error-prone
  • having 1 or more known valid HR-UUID should not be statistically relevant for guessing other valid HR-UUIDs

Solution

  • Use the Diceware password algorithm.
  • In contrast to the other solutions offered in this thread, this approach solves the human-readable UUID problem by re-casting the problem to that of password generation.
  • In contrast to the BubbleBabble solution offered elsewhere in this thread, Diceware allows you to choose how many elements are included in each UUID, depending on how many times you wish to "roll the die" ... this means you get to choose the entropy per UUID.
  • The Diceware password algorithm solves the problem of generating high-entropy passphrases that are nonetheless easy for humans to both enter and remember.
  • Below is a sampling of Diceware "UUIDs" consisting of six elements each:

    crabmeat-coach-properly-driving-yoga-ferret
    edition-mousy-fabric-budding-book-mortuary
    rickety-uncrown-earful-majority-sublet-evade
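
A minimal Python sketch of the selection step, assuming a word list is already loaded. The short `WORDS` list here is only a stand-in; a full Diceware list has 6^5 = 7776 entries, which gives about 12.9 bits of entropy per word.

```python
import math
import secrets

# Stand-in list (an assumption); a full Diceware list has 6**5 = 7776 words.
WORDS = ["crabmeat", "coach", "properly", "driving", "yoga", "ferret",
         "edition", "mousy", "fabric", "budding", "book", "mortuary"]

def make_phrase(count: int = 6, words=WORDS) -> str:
    """Pick `count` words independently and uniformly at random."""
    return "-".join(secrets.choice(words) for _ in range(count))

def entropy_bits(count: int, list_size: int) -> float:
    """Entropy of `count` uniform picks from a list of `list_size` words."""
    return count * math.log2(list_size)

if __name__ == "__main__":
    print(make_phrase())
    print(round(entropy_bits(6, 7776), 1))   # 77.5
```

Lowering `count` trades entropy for shorter codes, which is the tuning knob described above.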
    
