Random Numbers

"Everything we do to achieve privacy and security in the computer age depends on random numbers."

Random number sequences have been around for 4,000 years, but never have they been in such demand. That's because they're the building blocks of cryptography. Every time you establish an SSL connection to, say, E*Trade, there's a string of digits working hard behind the scenes. As many as 368 bits of random data go into creating the connection - 128 bits to make encryption keys, the rest for authentication codes and the prevention of replay attacks - that's necessary whenever you send your credit card information over an ecommerce site's "secure server" or exchange medical records with your insurance company online. Even the secrecy of the messages whizzing between military commanders in the Middle East depends on random numbers.

A sequence is considered random if no patterns can be recognized in it - the longer the string, the stronger the encryption.

Computers make lousy RNGs (Random Number Generators). Computer programs are, by design, deterministic; they merely follow set procedures. Start a new process from the same place, and a pattern emerges. Such systems are often referred to as pseudo-random number generators(PRNG's). PRNG's can be easily generated using algorithms. A good example is the Yarrow algorithm.

The problem with PRNG's is that it needs an initial seed(value) to generate the random number. The same random number is always generated for the given same seed. The seed should be large, uniform and unpredictable, just like random numbers.

For example,In 1995, two U.C. Berkeley researchers, Ian Goldberg and David Wagner, discovered that Netscape was using an easily guessed PRNG seed to generate the SSL keys. Netscape's RNG seed was derived from just three quantities: the time of day, the process ID, and the parent process ID. The seed was just a standard MD5 hash of all 3 values. Their program was able to find the correct encryption key in less than half a min, if the attacker could guess the second in which the PRNG was seeded -- something that wasn't hard if you could watch the network traffic

Most of the time it is very easy for attackers to guess the sequence of pseudo-random number generators, and for some applications this is not acceptable. So instead, we must try to gather "environmental noise" from the computer's environment, which must be hard for outside attackers to observe, and use that to generate random numbers.

Sources of randomness can include

Inter-keyboard timings

Inter-interrupt timings

Mouse movement

rate of packets coming in and out of the system

Other non-deterministic sources

Random Numbers in Linux/Unix :

In Linux or Unix, this is collected in an entropy pool and is passed though a CRC like function to randomize the pool. When random bytes are desired, they are obtained by taking the SHA hash of the contents of the "entropy pool".

This data is avilble in device named /dev/random and /dev/urandom.

Thought this data is not truly random, it is however good enough to be used as a seed for other random number functions.

/dev/random is high quality entropy, generated from events which are non-deterministic. It blocks until enough bits of random data are available. Note that on some systems, it can block for a long time waiting for new user-generated entry to be entered into the system. So you have to use care before using /dev/random.

/dev/urandom is similar, but when the store of entropy is running low, it'll return a cryptographically strong hash of what there is. This isn't as secure, but it's enough for most applications.

If you need lot of random numbers, then reading from /dev/urandom could be slow. You are probably better off using /dev/urandom to choose a random seed, and then using a good, fast RNG.
Here is a sample C program to read random bytes from the device

#include <stdio.h>
#include <fcntl.h>
#include <time.h>

main (void) {
        int randnum;
	    int fd = open ("/dev/urandom", O_RDONLY);
        
        if (fd != -1) {
            read( fd, &randnum, 4 );
            randnum = abs( randnum );   /* If you only want positive values */
            printf("Random number : %d\n",randnum);
            close(fd);
         } else {
             /* Please don't do this if you need cryptographic security! */
             randnum = time(NULL) + getpid();  // + other weird stuff;
         }
         //seed_the_rng_function(randnum);
}

If you have the OpenSSL library installed on your machine, then you could also use the RAND_bytes function in OpenSSL to get random bytes from /dev/(u)random device. The function will return 1 if it is able to fetch random bytes from /dev/urandom. Else it returns 0, in which case you will have to resort to random numbers based on time/pid etc.

#include <stdio.h>
#include <openssl/rand.h>

// you will need to compile it with openssl lib
// $ gcc <filename.c> -lcrypto
main (void) {
        unsigned char buf[10];
        if (!RAND_bytes(buf, 10)) {
            // the usual md5(time+pid)
        }
        printf("Random : %s\n", buf);
}