Actually, it looks like TrueCrypt uses a similar system (I have now read the whole thing; before, I had only glanced at it), but I don't know why they made it so much more complicated.
Random Number Generator
The random number generator (RNG) is used to generate the master encryption key, the secondary key (XTS mode), salt, and keyfiles. It creates a pool of random values in RAM (memory). The pool, which is 320 bytes long, is filled with data from the following sources:
Mouse movements
Keystrokes
Mac OS X and Linux: Values generated by the built-in RNG (both /dev/random and /dev/urandom)
MS Windows: Windows CryptoAPI (collected regularly at 500-ms interval)
MS Windows: Network interface statistics (NETAPI32)
MS Windows: Various Win32 handles, time variables, and counters (collected regularly at 500-ms interval)
So essentially Truecrypt is describing the buffer from the scheme I mentioned, where raw input data is added (from mouse position, etc).
Before a value obtained from any of the above-mentioned sources is written to the pool, it is divided into individual bytes (e.g., a 32-bit number is divided into four bytes). These bytes are then individually written to the pool with the modulo 2^8 addition operation (not by replacing the old values in the pool) at the position of the pool cursor. After a byte is written, the pool cursor position is advanced by one byte. When the cursor reaches the end of the pool, its position is set to the beginning of the pool. After every 16th byte written to the pool, the pool mixing function is applied to the entire pool (see below).
TrueCrypt diverges here: after the buffer fills up, they never replace it but instead keep modifying it. When new input comes in (including after the buffer is full), each input byte is added to the byte at the current cursor position modulo 2^8, that is, new pool byte = (old pool byte + input byte) mod 256, and the cursor then moves on by one byte.
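To make the pool write concrete, here is a minimal Python sketch of how I read that paragraph. The names add_to_pool and POOL_SIZE are my own, not TrueCrypt's, and the "mix after every 16th byte" step is left out to keep it short.

```python
POOL_SIZE = 320  # bytes, per the quoted documentation

def add_to_pool(pool: bytearray, cursor: int, value: bytes) -> int:
    """Add each input byte into the pool modulo 2^8; return the new cursor."""
    for b in value:
        pool[cursor] = (pool[cursor] + b) % 256  # add to the old byte, do not overwrite it
        cursor = (cursor + 1) % len(pool)         # wrap to the start at the end of the pool
    return cursor

pool = bytearray(POOL_SIZE)
pool[0] = 250
# A 32-bit value is split into four bytes: 0x0A, 0x01, 0x02, 0x03.
cursor = add_to_pool(pool, 0, (0x0A010203).to_bytes(4, "big"))
print(pool[0], cursor)  # 4 4, because (250 + 10) mod 256 = 4
```

The point of adding rather than overwriting is that old pool content still influences the new byte.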
Pool Mixing Function
The purpose of this function is to perform diffusion [2]. Diffusion spreads the influence of individual "raw" input bits over as much of the pool state as possible, which also hides statistical relationships. After every 16th byte written to the pool, this function is automatically applied to the entire pool.
Description of the pool mixing function:
Let R be the randomness pool.
Let H be the hash function selected by the user (SHA-512, RIPEMD-160, or Whirlpool).
l = byte size of the output of the hash function H (i.e., if H is RIPEMD-160, then l = 20; if H is SHA-512, l = 64)
z = byte size of the randomness pool R (320 bytes)
q = z / l – 1 (e.g., if H is Whirlpool, then q = 4)
R is divided into l-byte blocks B0...Bq.
For 0 ≤ i ≤ q (i.e., for each block B) the following steps are performed:
M = H (B0 || B1 || ... || Bq) [i.e., the randomness pool is hashed using the hash function H, which produces a hash M]
Bi = Bi ^ M
R = B0 || B1 || ... || Bq
Now TrueCrypt divides the buffer into sections the size of the output of the hash function selected by the user. For each section in turn, it hashes the entire buffer (every section concatenated) and XORs that hash value into the current section; the ^ above is bitwise XOR, not exponentiation. It does this for every section in the randomness pool.
For example, if q = 1, the randomness pool would be mixed as follows:
(B0 || B1) = R
B0 = B0 ^ H(B0 || B1)
B1 = B1 ^ H(B0 || B1)
R = B0 || B1
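Here is a rough Python sketch of the mixing function as described in the quoted text. I am assuming SHA-512 (l = 64, so q = 4) because Python's hashlib always ships it; RIPEMD-160 and Whirlpool may not be available in every build. The name mix_pool is mine, and I am assuming each hash is taken over the pool as it stands at that moment, which is how I read the q = 1 example.

```python
import hashlib

def mix_pool(pool: bytearray, hash_name: str = "sha512") -> None:
    """XOR a hash of the whole pool into each l-byte block, one block at a time."""
    l = hashlib.new(hash_name).digest_size
    if len(pool) % l != 0:
        raise ValueError("pool size must be a multiple of the digest size")
    for start in range(0, len(pool), l):
        # M = H(B0 || B1 || ... || Bq), computed over the current pool contents
        m = hashlib.new(hash_name, bytes(pool)).digest()
        for j in range(l):
            pool[start + j] ^= m[j]  # Bi = Bi XOR M

pool = bytearray(320)
mix_pool(pool)
print(pool[:8].hex())
```

Since block 0 starts out as all zeros here, after mixing it is simply the SHA-512 hash of the all-zero pool.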
Generated Values
The content of the RNG pool is never directly exported (even when TrueCrypt instructs the RNG to generate and export a value). Thus, even if the attacker obtains a value generated by the RNG, it is infeasible for him to determine or predict (using the obtained value) any other values generated by the RNG during the session (it is infeasible to determine the content of the pool from a value generated by the RNG).
The RNG ensures this by performing the following steps whenever TrueCrypt instructs it to generate and export a value:
1. Data obtained from some of the sources listed above is added to the pool as described above.
2. The requested number of bytes is copied from the pool to the output buffer (the copying starts from the position of the pool cursor; when the end of the pool is reached, the copying continues from the beginning of the pool; if the requested number of bytes is greater than the size of the pool, no value is generated and an error is returned).
3. The state of each bit in the pool is inverted (i.e., 0 is changed to 1, and 1 is changed to 0).
4. Data obtained from some of the sources listed above is added to the pool as described above.
5. The content of the pool is transformed using the pool mixing function. Note: The function uses a cryptographically secure one-way hash function selected by the user (for more information, see the section Pool Mixing Function above).
6. The transformed content of the pool is XORed into the output buffer as follows:
   a. The output buffer write cursor is set to 0 (the first byte of the buffer).
   b. The byte at the position of the pool cursor is read from the pool and XORed into the byte in the output buffer at the position of the output buffer write cursor.
   c. The pool cursor position is advanced by one byte. If the end of the pool is reached, the cursor position is set to 0 (the first byte of the pool).
   d. The position of the output buffer write cursor is advanced by one byte.
   e. Steps b-d are repeated for each remaining byte of the output buffer (whose length is equal to the requested number of bytes).
7. The content of the output buffer, which is the final value generated by the RNG, is exported.
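Putting the export steps together, here is how I would sketch them in Python. Everything named here (PoolRng, add, generate) is hypothetical. I use os.urandom as a stand-in for the mouse, keyboard, and OS sources, fix the hash to SHA-512, and assume the copy step does not move the pool cursor, since the text does not say either way.

```python
import hashlib
import os

POOL_SIZE = 320

def mix_pool(pool: bytearray) -> None:
    l = hashlib.sha512().digest_size
    for start in range(0, len(pool), l):
        m = hashlib.sha512(bytes(pool)).digest()
        for j in range(l):
            pool[start + j] ^= m[j]

class PoolRng:
    def __init__(self) -> None:
        self.pool = bytearray(POOL_SIZE)
        self.cursor = 0
        self.written = 0

    def add(self, data: bytes) -> None:
        for b in data:
            self.pool[self.cursor] = (self.pool[self.cursor] + b) % 256
            self.cursor = (self.cursor + 1) % POOL_SIZE
            self.written += 1
            if self.written % 16 == 0:  # mix after every 16th byte written
                mix_pool(self.pool)

    def generate(self, n: int) -> bytes:
        if n > POOL_SIZE:
            raise ValueError("request exceeds pool size")
        self.add(os.urandom(32))                 # add fresh data (stand-in source)
        out = bytearray(self.pool[(self.cursor + k) % POOL_SIZE] for k in range(n))
        for k in range(POOL_SIZE):               # invert every bit of the pool
            self.pool[k] ^= 0xFF
        self.add(os.urandom(32))                 # add fresh data again
        mix_pool(self.pool)                      # transform the pool
        for k in range(n):                       # XOR the transformed pool into the output
            out[k] ^= self.pool[self.cursor]
            self.cursor = (self.cursor + 1) % POOL_SIZE
        return bytes(out)

rng = PoolRng()
key = rng.generate(64)
print(len(key))  # 64
```

Note that the exported value is the XOR of two different pool states, so the raw pool content never leaves the RNG directly.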
Truecrypt then uses this randomness pool to generate keying material, first going through some steps to transform it a final time.
Personally, I think this is an overcomplicated method. I would do something more like this:
Keep a buffer that raw input is added to. Every time the buffer fills (or, failing that, at a fixed interval), take the SHA-256 hash of the buffer, clear the buffer completely, and write the hash back into it. As more input arrives, append it after the hash, and when the buffer fills again, hash it again. Repeat this for some number of rounds, or for as long as the user wants. At the end, take a final SHA-256 hash of the buffer. If I were only generating an AES-256 key, I would use that final hash as the key. However, TrueCrypt uses its PRNG to generate more than the key, so I would need more than 256 bits. In that case, I would do the following:
Using the final SHA-256 output as a seed, I would then hash it again to produce output blocks. Let S be the seed and let O1 = H(S) be the hash of the seed. Then, for x > 1:
Ox = H(S || O(x-1))
O1..Oz would be my pseudorandom material to use for keying and other operations, where Oz is the final Ox.
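In Python, the expansion I have in mind would look roughly like this. The function name expand is made up for illustration, and this is just a sketch of my scheme, not a vetted construction.

```python
import hashlib

def expand(seed: bytes, n_bytes: int) -> bytes:
    """Produce n_bytes of output as O1 || O2 || ..., with O1 = H(S), Ox = H(S || O(x-1))."""
    out = b""
    prev = b""  # empty for the first block, so the first hash is H(S)
    while len(out) < n_bytes:
        prev = hashlib.sha256(seed + prev).digest()
        out += prev
    return out[:n_bytes]

seed = hashlib.sha256(b"example raw input, not real entropy").digest()
material = expand(seed, 96)  # e.g. a 64-byte salt plus a 32-byte key
print(len(material))  # 96
```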
The original seed is generated securely because:
A. Cryptographic hash functions preserve entropy, meaning the output of H(X) contains (ideally) as much entropy as X, up to the output length of H.
Therefore, with R = raw input
H(R1) = A
H(A || R2) = B
H(B || R3) = D
D must contain as much entropy as R1, R2, and R3 combined, capped at the 256 bits that H can output.
B. Cryptographic hash functions distill entropy
If H produces 256 bits of output, then H(1000 non-random input bits + 1 random input bit) yields a 256-bit output that carries 1 bit of entropy: by preservation of entropy, the single random bit survives, distilled from 1001 input bits down to 256 output bits.
C. Cryptographic hash functions distribute entropy
If H produces 256 bits of output and its input carries 1 bit of entropy, that bit is spread across the whole output: roughly speaking, each of the 256 output bits carries 1/256 of a bit of entropy.
Therefore, if a total of 256 bits of entropy is fed to H during the seed generation process, the final value will be 256 random bits.
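The seed collection I described earlier (hash the buffer whenever it fills, then keep the hash as the start of the new buffer) could be sketched like this. SeedCollector and BUFFER_SIZE are names I made up, 64 bytes is an arbitrary buffer size, and the "number of rounds" bookkeeping is left to the caller.

```python
import hashlib

BUFFER_SIZE = 64  # hypothetical buffer size in bytes

class SeedCollector:
    def __init__(self) -> None:
        self.buf = bytearray()

    def add(self, raw: bytes) -> None:
        """Append raw input; whenever the buffer fills, replace it with its SHA-256 hash."""
        for b in raw:
            self.buf.append(b)
            if len(self.buf) >= BUFFER_SIZE:
                # H(previous hash || new raw input) becomes the start of the next buffer
                self.buf = bytearray(hashlib.sha256(self.buf).digest())

    def finish(self) -> bytes:
        """Final SHA-256 hash of whatever is in the buffer: the 256-bit seed."""
        return hashlib.sha256(self.buf).digest()

collector = SeedCollector()
collector.add(b"mouse:120,48;key:a;mouse:121,50;" * 4)  # placeholder input, not real entropy
seed = collector.finish()
print(len(seed))  # 32
```

This is exactly the H(R1) = A, H(A || R2) = B chain from point A, just with the buffer size deciding where each R ends.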
The final output material is secure because:
A. The seed is used to generate every 32 bytes of output
Ox = H(S || O(x-1))
Without knowing the seed value, an attacker cannot use Ox to determine O(x-1) or O(x+1), because a cryptographic hash algorithm is (ideally) a one-way function.
H(S) = O1 (the attacker cannot determine the seed from the first output, because doing so would mean H is not a one-way function)
H(S || O1) = O2 (same as above, plus the attacker cannot guess O2 from knowledge of O1 unless they also know S, and S is 256 bits of randomness)
H(S || O2) = O3
etc.
It should be safe to use arbitrarily sized sections of the output as well, since the entropy is evenly distributed by the hash function. For example, if the hash outputs A9, with A used for the publicly viewable salt and 9 used for the secret key, knowing A should not help the attacker determine that the key starts with 9, since the entropy of A is the same as the entropy of 9 and the two are probabilistically independent of each other.