How to generate a verification code/number? - algorithm

I'm working on an application where users have to make a call and type a verification number with the keypad of their phone.
I would like to be able to detect if the number they type is correct or not. The phone system does not have access to a list of valid numbers, but instead, it will validate the number against an algorithm (like a credit card number).
Here are some of the requirements:
It must be difficult to type a valid random code
It must be difficult to have a valid code if I make a typo (transposition of digits, wrong digit)
I must have a reasonable number of possible combinations (let's say 1M)
The code must be as short as possible, to avoid errors from the user
Given these requirements, how would you generate such a number?
EDIT :
@Haaked: The code has to be numeric because the user types it with their phone.
@matt b: In the first step, the code is displayed on a Web page; the second step is to call and type in the code. I don't know the user's phone number.
Follow-up: I've found several algorithms to check the validity of numbers (see this interesting Google Code project: checkDigits).

After some research, I think I'll go with the ISO 7064 Mod 97,10 formula. It seems pretty solid as it is used to validate IBAN (International Bank Account Number).
The formula is very simple:
Take a number: 123456
Apply the following formula to obtain the 2-digit checksum: mod(98 - mod(number * 100, 97), 97) => 76
Concat the number and the checksum to obtain the code => 12345676
To validate a code, verify that mod(code, 97) == 1
Test:
mod(12345676, 97) = 1 => GOOD
mod(21345676, 97) = 50 => BAD!
mod(12345678, 97) = 3 => BAD!
Apparently, this algorithm catches most of the errors.
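A minimal sketch of that formula in Python (the function names are my own, not from any library):
    def append_checksum(number: int) -> int:
        """Append the two ISO 7064 Mod 97,10 check digits to a number."""
        checksum = (98 - (number * 100) % 97) % 97
        return number * 100 + checksum

    def is_valid(code: int) -> bool:
        """A code is valid when it is congruent to 1 modulo 97."""
        return code % 97 == 1

    print(append_checksum(123456))  # 12345676
    print(is_valid(12345676))       # True
    print(is_valid(21345676))       # False (transposed digits are caught)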
Another interesting option was the Verhoeff algorithm. It has only one verification digit and is more difficult to implement (compared to the simple formula above).

For 1M combinations you'll need 6 digits. To make sure that there aren't any accidentally valid codes, I suggest 9 digits with a 1/1000 chance that a random code works. I'd also suggest using another digit (10 total) to perform an integrity check. As far as distribution patterns go, random will suffice and the check digit will ensure that a single error will not result in a correct code.
Edit: Apparently I didn't fully read your request. Using a credit card number, you could perform a hash on it (MD5 or SHA1 or something similar). You then truncate at an appropriate spot (for example 9 characters) and convert to base 10. Then you add the check digit(s) and this should more or less work for your purposes.
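A rough sketch of that idea, assuming SHA-1 for the hash and a Luhn digit for the integrity check (both are just example choices, not something the answer prescribes):
    import hashlib

    def luhn_digit(payload: str) -> int:
        """Standard Luhn check digit for a string of decimal digits."""
        total = 0
        for i, ch in enumerate(reversed(payload)):
            d = int(ch)
            if i % 2 == 0:        # double every second digit, starting from the right
                d *= 2
                if d > 9:
                    d -= 9
            total += d
        return (10 - total % 10) % 10

    def make_code(internal_id: str) -> str:
        """Hash an internal value, truncate, convert to 9 decimal digits, add a check digit."""
        digest = hashlib.sha1(internal_id.encode()).hexdigest()
        body = str(int(digest[:9], 16) % 10**9).zfill(9)
        return body + str(luhn_digit(body))

    def is_valid(code: str) -> bool:
        return code.isdigit() and luhn_digit(code[:-1]) == int(code[-1])

    print(make_code("customer-42"))   # a 10-digit code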

You want to segment your code. Part of it should be a 16-bit CRC of the rest of the code.
If all you want is a verification number then just use a sequence number (assuming you have a single point of generation). That way you know you are not getting duplicates.
Then you prefix the sequence with a CRC-16 of that sequence number AND some private key. You can use anything for the private key, as long as you keep it private. Make it something big, at least a GUID, but it could be the text of War and Peace from Project Gutenberg. It just needs to be secret and constant. Having a private key prevents people from being able to forge a key, but using a 16-bit CRC makes it easier to break.
To validate you just split the number into its two parts, and then take a CRC-16 of the sequence number and the private key.
If you want to obscure the sequential portion more, then split the CRC in two parts. Put 3 digits at the front and 2 at the back of the sequence (zero pad so the length of the CRC is consistent).
This method allows you to start with smaller keys too. The first 10 keys will be 6 digits.
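A sketch of the simple prefix variant, using Python's binascii.crc_hqx (one CRC-16 flavor) and a made-up private key:
    import binascii

    PRIVATE_KEY = b"anything long, constant and secret"   # hypothetical

    def make_code(sequence: int) -> str:
        crc = binascii.crc_hqx(str(sequence).encode() + PRIVATE_KEY, 0)  # 16-bit CRC
        return f"{crc:05d}{sequence}"     # zero-padded CRC prefix + sequence number

    def is_valid(code: str) -> bool:
        crc, sequence = int(code[:5]), int(code[5:])
        return binascii.crc_hqx(str(sequence).encode() + PRIVATE_KEY, 0) == crc

    print(make_code(7))             # a 6-digit code for an early sequence number
    print(is_valid(make_code(7)))   # True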

Does it have to be only numbers? You could create a random number between 1 and 1M (I'd suggest even higher though) and then Base32 encode it. The next thing you need to do is hash that value (using a secret salt value) and Base32 encode the hash. Then append the two strings together, perhaps separated by a dash.
That way, you can verify the incoming code algorithmically. You just take the left side of the code, hash it using your secret salt, and compare that value to the right side of the code.
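A sketch along those lines, with a hypothetical salt, SHA-256 as the hash, and a dash separator (note that Base32 output contains letters, so this only works if the user can type them):
    import base64
    import hashlib
    import secrets

    SALT = b"hypothetical secret salt"

    def b32(data: bytes) -> str:
        return base64.b32encode(data).decode().rstrip("=")

    def make_code() -> str:
        value = secrets.randbelow(2**25)                    # random left-hand side
        left = b32(value.to_bytes(4, "big"))
        right = b32(hashlib.sha256(SALT + left.encode()).digest()[:4])
        return f"{left}-{right}"

    def is_valid(code: str) -> bool:
        left, _, right = code.partition("-")
        return b32(hashlib.sha256(SALT + left.encode()).digest()[:4]) == right

    code = make_code()
    print(code, is_valid(code))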

I must have a reasonable number of possible combinations (let's say 1M)
The code must be as short as possible, to avoid errors from the user
Well, if you want it to have at least one million combinations, then you need at least six digits. Is that short enough?

When you are creating the verification code, do you have access to the caller's phone number?
If so I would use the caller's phone number and run it through some sort of hashing function so that you can guarantee that the verification code you gave to the caller in step 1 is the same one that they are entering in step 2 (to make sure they aren't using a friend's validation code or they simply made a very lucky guess).
About the hashing, I'm not sure if it's possible to take a 10-digit number and come out with a hash result that would be < 10 digits (I guess you'd have to live with a certain number of collisions), but I think this would help ensure the user is who they say they are.
Of course this won't work if the phone number used in step 1 is different than the one they are calling from in step 2.

Assuming you already know how to detect which key the user hit, this should be doable reasonably easily. In the security world, there is the notion of a "one time" password. This is sometimes referred to as a "disposable password." Normally these are restricted to the (easily typable) ASCII values: [a-zA-Z0-9] and a bunch of easily typable symbols, like comma, period, semicolon, and parentheses. In your case, though, you'd probably want to limit the range to [0-9] and possibly include * and #.
I am unable to explain all the technical details of how these one-time codes are generated (or work) adequately. There is some intermediate math behind it, which I'd butcher without first reviewing it myself. Suffice it to say that you use an algorithm to generate a stream of one-time passwords. No matter how many previous codes you know, the subsequent one should be impossible to guess! In your case, you'll simply use each password on the list as the user's random code.
Rather than fail at explaining the details of the implementation myself, I'll direct you to a 9-page article where you can read up on it yourself: https://www.grc.com/ppp.htm
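For reference, one common construction of algorithmic one-time codes (not the PPP system linked above) is RFC 4226's HOTP; a minimal sketch with a made-up shared secret:
    import hashlib
    import hmac
    import struct

    SECRET = b"hypothetical shared secret"

    def one_time_code(counter: int, digits: int = 6) -> str:
        """HMAC the counter, dynamically truncate, keep the last `digits` decimal digits."""
        mac = hmac.new(SECRET, struct.pack(">Q", counter), hashlib.sha1).digest()
        offset = mac[-1] & 0x0F
        value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
        return str(value % 10**digits).zfill(digits)

    print([one_time_code(c) for c in range(3)])   # three successive codes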

It sounds like you have the unspoken requirement that it must be quickly determined, via algorithm, that the code is valid. This would rule out you simply handing out a list of one time pad numbers.
There are several ways people have done this in the past.
Make a public key and private key. Encode the numbers 0-999,999 using the private key, and hand out the results. You'll need to throw in some random digits to pad the result out to the longer length, and you'll have to convert the result from base 64 to base 10. When you get a number entered, convert it back to base 64, apply the private key, and see if the interesting numbers are under 1,000,000 (discard the random digits).
Use a reversible hash function
Use the first million numbers from a PRNG seeded with a specific value. The "checking" function can get the seed, and knows that the next million values are good. It can either generate them each time and check one by one when a code is received, or on program startup store them all in a table, sorted, and then use binary search (at most about 20 comparisons) since one million integers is not a whole lot of space. See the sketch at the end of this answer.
There are a bunch of other options, but these are common and easy to implement.
-Adam
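A minimal sketch of the third option above (PRNG-seeded codes, stored sorted and checked with a binary search); the seed, code length, and count are all placeholder values:
    import bisect
    import random

    SEED = 1234          # hypothetical secret seed shared by generator and checker
    COUNT = 1_000_000    # how many codes are considered valid

    def generate_codes(seed: int = SEED, count: int = COUNT) -> list:
        rng = random.Random(seed)        # note: not a cryptographically strong PRNG
        return sorted(rng.randrange(10**9) for _ in range(count))

    CODES = generate_codes()

    def is_valid(code: int) -> bool:
        i = bisect.bisect_left(CODES, code)
        return i < len(CODES) and CODES[i] == code

    print(is_valid(CODES[12345]))   # True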

You linked to the check digits project, and using the "encode" function seems like a good solution. It says:
encode may throw an exception if 'bad' data (e.g. non-numeric) is passed to it, while verify only returns true or false. The idea here is that encode normally gets its data from 'trusted' internal sources (a database key for instance), so it should be pretty unusual, in fact exceptional, that bad data is being passed in.
So it sounds like you could pass the encode function a database key (5 digits, for instance) and you could get a number out that would meet your requirements.

Related

Chicken/Egg problem: Hash of file (including hash) inside file! Possible?

Thing is I have a file that has room for metadata. I want to store a hash for integrity verification in it. Problem is, once I store the hash, the file and the hash along with it changes.
I perfectly understand that this is by definition impossible with one way cryptographic hash methods like md5/sha.
I am also aware of the possibility of containers that store verification data separated from the content as zip & co do.
I am also aware of the possibility to calculate the hash separately and send it along with the file or to append it at the end or somewhere where the client, when calculating the hash, ignores it.
This is not what I want.
I want to know whether there is an algorithm where it's possible to get the resulting hash from data where the very result of the hash itself is included.
It doesn't need to be cryptographic or fulfill a lot of criteria. It can also be based on some heuristics that, after a realistic amount of time, deliver the desired result.
I am really not so into mathematics, but couldn't there be some really advanced exponential modulo polynomial cyclic back-reference division stuff that makes this possible?
And if not, what's (if there is) the proof against it?
The reason why I need this is because I want (ultimately) to store a hash along with MP4 files. It's complicated, but other solutions are not easy to implement as the file walks through a badly designed production pipeline...
It's possible to do this with a CRC, in a way. What I've done in the past is to set aside 4 bytes in a file as a placeholder for a CRC32, filling them with zeros. Then I calculate the CRC of the file.
It is then possible to fill the placeholder bytes to make the CRC of the file equal to an arbitrary fixed constant, by computing numbers in the Galois field of the CRC polynomial.
(Further details possible but not right at this moment. You basically need to compute (CRC_desired - CRC_initial) * 2^(-8*byte_offset) in the Galois field, where byte_offset is the number of bytes between the placeholder bytes and the end of the file.)
Note: as per @KeithS's comments, this solution is not intended to protect against intentional tampering. We used it on one project as a means to tie metadata within an embedded system to the executable used to program it -- the embedded system itself does not have direct knowledge of the file(s) used to program it, and therefore cannot calculate a CRC or hash itself -- to detect inadvertent mismatch between an embedded system and the file used to program it. (In later systems I've just used UUIDs.)
Of course this is possible, in a multitude of ways. However, it cannot prevent intentional tampering.
For example, let
hash(X) = sum of all 32-bit (non-overlapping) blocks of X modulo 65521.
Let
Z = X followed by the 32-bit unsigned integer (hash(X) * 65521)
Then
hash(Z) == hash(X), and hash(X) can be read back out of the last 32 bits of Z (just divide by 65521)
The idea here is just that any 32-bit integer congruent to 0 modulo 65521 will have no effect on the hash of X. Then, since 65521 < 2^16, hash has a range less than 2^16, and there are at least 2^16 values less than 2^32 congruent to 0 modulo 65521. And so we can encode the hash into a 32-bit integer that will not affect the hash. You could actually use any number less than 2^16; 65521 just happens to be the largest such prime number.
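A small Python sketch of that construction (the message is kept to a multiple of 4 bytes so the appended block lines up):
    M = 65521  # largest prime below 2**16

    def toy_hash(data: bytes) -> int:
        """Sum of all non-overlapping 32-bit blocks, modulo 65521."""
        padded = data + b"\x00" * (-len(data) % 4)
        return sum(int.from_bytes(padded[i:i + 4], "big")
                   for i in range(0, len(padded), 4)) % M

    def embed(data: bytes) -> bytes:
        assert len(data) % 4 == 0, "keep the message a multiple of 4 bytes for this demo"
        h = toy_hash(data)
        return data + (h * M).to_bytes(4, "big")   # h*M is 0 mod M, so the hash is unchanged

    original = b"0123456789abcdef" * 3                      # 48 bytes
    stamped = embed(original)
    recovered = int.from_bytes(stamped[-4:], "big") // M    # read the embedded hash back out
    assert toy_hash(stamped) == toy_hash(original) == recovered
    print(toy_hash(original), recovered)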
I remember an old DOS program that was able to embed in a text file the CRC value of that file. However, this is possible only with simple hash functions.
Although in theory you could create such a file for any kind of hash function (given enough time or the right algorithm), the attacker would be able to use exactly the same approach. Even more, he would have a choice: to use exactly your approach to obtain such a file, or just to get rid of the check.
It means that now you have two problems instead of one, and both should be implemented with the same complexity. It's up to you to decide if it's worth it.
EDIT: you could consider hashing some intermediary results (like RAW decoded output, or something specific to your codec). In this way the decoder would have it anyway, but for another program it would be more difficult to compute.
No, not possible. You either use a separate file for hashes (a la md5sum), or the embedded hash is only for the "data" portion of the file.
The way the Nix package manager does this: when calculating the hash, you pretend the contents of the hash field in the file are some fixed value, like 20 x's, rather than the actual hash. Then you write the real hash over those 20 x's. When checking, you read the stored hash out and again pretend the field holds the fixed value of 20 x's while hashing.
They do this because the path at which a package is installed depends on the hash of the whole package. Since the hash has a fixed length, they set the field to a fixed value, then replace it with the real hash, and when verifying they ignore the stored value and pretend it's the fixed value.
But if you don't use such a method, it is impossible.
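A sketch of that trick, assuming a hex SHA-1 digest stored at the end of the file (the layout is made up; Nix's actual format differs):
    import hashlib

    PLACEHOLDER = b"x" * 40   # same width as a hex SHA-1 digest

    def stamp(body: bytes) -> bytes:
        """Hash with the placeholder in place, then overwrite it with the real digest."""
        digest = hashlib.sha1(body + PLACEHOLDER).hexdigest().encode()
        return body + digest

    def verify(blob: bytes) -> bool:
        body, stored = blob[:-40], blob[-40:]
        return hashlib.sha1(body + PLACEHOLDER).hexdigest().encode() == stored

    blob = stamp(b"package contents go here")
    print(verify(blob))   # True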
It depends on your definition of "hash". As you state, obviously with any pseudo-random hash this would be impossible (in a reasonable amount of time).
Equally obvious, there are of course trivial "hashes" where you can do this. Data with an odd number of bits set to 1 hash to 00 and an even number of 1s hash to 11, for example. The hash doesn't modify the odd/evenness of the 1 bits, so files hash the same when their hash is included.

Are fragments of hashes collision-resistant?

If you only use the first 4 bytes of an MD5 hash, would that mean theoretically only 1 in 255^4 chance of collision? That is, are hashes designed such that you only have to use a small portion of the returned hash (say the hash is of a file of some size)?
Remember that, even without considering a smart attacker deliberately trying to cause collisions, you need to start worrying about accidental collisions once the number of objects you're hashing gets comparable to the square root of the hash space... just a few tens of thousands of objects for a 32-bit hash key. This comes from the so-called birthday paradox.
It is 256, not 255.
Assuming that MD5 is a secure hash function (it turns out it is not secure, but, for the sake of the discussion, let's suppose that it is secure), then it should behave like a random oracle, a mythical object which outputs uniformly random values, under the sole constraint that it "remembers" its previous outputs and returns the same value again, given the same input.
Truncating the output of a random oracle yields another random oracle. Thus, if you keep 32 bits, then the probability of a collision with two distinct input messages is 1 in 2^32 (i.e. 1 in 256^4).
Now there is a thing known as the birthday paradox which says that, with about 2^16 distinct inputs, there are good chances that two of the 2^16 corresponding outputs collide.
MD5 has been shown to be insecure for some purposes -- in particular anything which is related to collisions. The current default recommendation is SHA-2 (a family of four functions, with output sizes 224, 256, 384 and 512 bits, respectively). A new (American) standard is currently being defined, through an open competition, under the code name SHA-3. This is a long process; the new function shall be chosen by mid-2012. Some of the remaining candidates (currently 14, out of an initial 51) are substantially faster than SHA-2, some approaching MD5 in performance, while being considerably more secure. But this is a bit new, so right now you should use SHA-2 by default.
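To make the truncation and birthday numbers above concrete, a small sketch (MD5 is used here only because the question asks about it):
    import hashlib
    import math

    def fragment(data: bytes) -> bytes:
        """Keep only the first 4 bytes (32 bits) of the digest."""
        return hashlib.md5(data).digest()[:4]

    def collision_probability(n: int, space: int = 2**32) -> float:
        """Birthday approximation: chance of at least one collision among n random inputs."""
        return 1 - math.exp(-n * (n - 1) / (2 * space))

    print(fragment(b"hello").hex())
    print(collision_probability(65_536))   # roughly 0.39 -- collisions are already likely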
Assume we have a pre-determined message1. hash1 = md5(message1)
Now choose a message2 randomly, and set hash2 = md5(message2).
In theory there is a 1/256^4 chance that the first four bytes of hash2 match the first four bytes of the pre-determined hash1.
It is also supposed to be very hard for an attacker that knows message1 to come up with a different message2 that has the same hash. This is called second pre-image resistance. However, even with the full MD5, the known pre-image attacks are still only theoretical.
MD5 is completely broken for collisions. This means it is quite feasible for an attacker (in a few hours) to come up with two messages with the same hash (let alone the same first four bytes). The attacker gets to choose both messages, but this can still cause major damage. See for instance the poisoned message example.
If you're generating unique identifiers, you might want to use a UUID instead. These are designed to minimize the chance of collisions so that in practice they should never occur.
If you're worried about filenames being too long, which is a peculiar thing to be concerned about when most operating systems support names as long as 255 characters, you can always split the filename into a path and filename component. This has the advantage of splitting up the files into different directories:
fdadda221fd71619e6c0139730b012577dd4de90
fdadda221fd71619e6c/0139730b012577dd4de90
fdad/da22/1fd7/1619/e6c0/1397/30b0/1257/7dd4/de90
Depends on the purpose of the hash.
Hash functions for use in hash tables tend to have more "randomness" in the lower bits (which are used to find the array index) than in the higher bits. Checksum and cryptographic hash functions are more evenly distributed.

Generating unique N-valued key

I want to generate unique random, N-valued key.
This key can contain numbers and latin characters, i.e. A-Za-z0-9.
The only solution I am thinking about is something like this (pseudocode):
key = "";
smb = "ABC…abc…0123456789"; // allowed symbols
for (i = 0; i < N; i++) {
key += smb[rnd(0, smb.length() - 1)]; // select symbol at random position
}
Is there any better solution? What can you suggest?
I would look into GUIDs. From the Wikipedia entry, "the primary purpose of the GUID is to have a totally unique number," which sounds exactly like what you are looking for. There are several implementations out there that generate GUIDs, so it's likely you will not have to reinvent the wheel.
Keep in mind that the whole field of cryptography relies on, amongst other things, making random numbers. Therefore the NSA, the CIA, and some of the best mathematicians in the world are working on this, so I guarantee you that there are better ideas.
Me? I'd just do what fbrereto suggests, and just get a GUID. Or look into cryptographic key generators, or y'know, some lava lamps and a camera.
Oh, and as to the code you have; depending on the language, you may need to seed the RNG, or it'll generate the same key every time.
Whatever you do, if you wind up generating a key that uses all numbers and all letters, and if a person is ever going to see that key (which is likely if you are using numbers and letters), omit the characters l, I, 1, O, and 0. People get them confused.
Nothing in your post addresses the question of uniqueness. You're going to have to have some way of not generating the same key twice. Usually, when I need a unique key, I have some unique information to start with. I usually take a one-way hash like MD5, then there are ways to convert that to a key with varying degrees of readability:
Convert to hex
Base64 encode it
Use bits of the key to index into a list of words.
Example: the unique string computed by hashing the part of this answer above the horizontal line is
abduction's brogue's melted bragger's
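For illustration, a sketch of those three conversions applied to a hash of some unique source; the 16-entry word list here is invented and far too small for real use:
    import base64
    import hashlib

    WORDS = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot", "golf", "hotel",
             "india", "juliet", "kilo", "lima", "mike", "november", "oscar", "papa"]

    def readable_keys(unique_source: str):
        digest = hashlib.md5(unique_source.encode()).digest()
        as_hex = digest.hex()
        as_base64 = base64.urlsafe_b64encode(digest).decode().rstrip("=")
        as_words = " ".join(WORDS[b >> 4] for b in digest[:4])   # top 4 bits of each byte -> a word
        return as_hex, as_base64, as_words

    print(readable_keys("row 42, created 2009-01-01"))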
You could do a base64 encoding of some random data and remove the +, /, and = characters from the result? I don't know if this would make a predictable distribution. Also, it seems like more work than what you're doing now, which is a fine solution.
Assuming you're using a language/library without an utterly pathetic random number generator, what you've got looks pretty good. N symbols uniformly distributed over a reasonable alphabet works for me, and no amount of applying fancier code is likely to make it more random (just slower).
(For the record, pathetic would include ditching the high-order bits of the underlying random numbers when choosing a value from the given range. While ideally all RNGs would make every bit equally random, in practice that's not so; the higher-order bits tend to be more random. This means that the modulus operator is totally the wrong thing to use when clamping to a restricted range.)
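For what it's worth, here is the original pseudocode rewritten to sidestep both issues (predictable seeding and modulo bias) using Python's secrets module:
    import secrets
    import string

    ALPHABET = string.ascii_letters + string.digits   # A-Za-z0-9

    def random_key(n: int) -> str:
        # secrets.choice draws from the OS CSPRNG and picks indices without modulo bias
        return "".join(secrets.choice(ALPHABET) for _ in range(n))

    print(random_key(16))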

Guessing the hash function?

I'd like to know which algorithm is employed. I strongly assume it's something simple and hopefully common. There's no lag in generating the results, for instance.
Input: any string
Output: 5 hex characters (0-F)
I have access to as many keys and results as I wish, but I don't know how exactly I could harness this to attack the function. Is there any method? If I knew any functions that converted to 5-chars to start with then I might be able to brute force for a salt or something.
I know for example that:
a=06a07
b=bfbb5
c=63447
(in case you have something in mind)
In normal use it converts random 32-char strings into 5-char strings.
The only way to derive a hash function from data is through brute force, perhaps combined with some cleverness. There are an infinite number of hash functions, and the good ones perform what is essentially one-way encryption, so it's a question of trial and error.
It's practically irrelevant that your function converts 32-character strings into 5-character hashes; the output is probably truncated. For fun, here are some perfectly legitimate examples, the last 3 of which are cryptographically terrible:
Use the MD5 hashing algorithm, which generates a 32-character hex hash, and use the 10th through the 14th characters.
Use the SHA-1 algorithm and take the last 5 characters.
If the input string is alphabetic, use the simple substitution A=1, B=2, C=3, ... and take the first 5 digits.
Find each character on your keyboard, measure its distance from the left edge in millimeters, and use every other digit, in reverse order, starting with the last one.
Create a stackoverflow user whose name is the 32-character string, divide 113 by the corresponding user ID number, and take the first 5 digits after the decimal. (But don't tell 'em I told you to do it!)
Depending on what you need this for, if you have access to as many keys and results as you wish, you might want to try a rainbow table approach. 5 hex chars is only about 1 million combinations. You should be able to brute-force generate a map of strings that match all of the resulting hashes in no time. Then you don't need to know the original string, just an equivalent string that generates the same hash, or brute-force entry by iterating over the 1 million input strings.
Following on from a comment I just made to Pontus Gagge, suppose the hash algorithm is as follows:
Append some long, constant string to the input
Compute the SHA-256 hash of the result
Output the last 5 chars of the hash.
Then I'm pretty sure there's no computationally feasible way from your chosen-plaintext attack to figure out what the hashing function is. To even prove that SHA-256 is in use (assuming it's a good hash function, which as far as we currently know it is), I think you'd need to know the long string, which is only stored inside the "black box".
That said, if I knew any published 20-bit hash functions, then I'd be checking those first. But I don't know any: all the usual non-crypto string hashing functions are 32 bit, because that's the expected size of an integer type. You should perhaps compare your results to those of CRC, PJW, and BUZ hash on the same strings, as well as some variants of DJB hash with different primes, and any string hash functions built in to well-known programming languages, like java.lang.String.hashCode. It could be that the 5 output chars are selected from the 8 hex chars generated by one of those.
Beyond that (and any other well-known string hashes you can find), I'm out of ideas. To cryptanalyse a black box hash, you start by looking for correlations between the bits of the input and the bits of the output. This gives you clues what functions might be involved in the hash. But that's a huge subject and not one I'm familiar with.
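In that spirit, a rough sketch of how you might mechanically test a few candidate constructions against the known pairs from the question (quite possibly none will match):
    import hashlib
    import zlib

    SAMPLES = {"a": "06a07", "b": "bfbb5", "c": "63447"}   # pairs given in the question

    def candidates(data: bytes) -> dict:
        md5 = hashlib.md5(data).hexdigest()
        sha1 = hashlib.sha1(data).hexdigest()
        crc = format(zlib.crc32(data) & 0xFFFFFFFF, "08x")
        return {"md5[:5]": md5[:5], "md5[-5:]": md5[-5:],
                "sha1[:5]": sha1[:5], "sha1[-5:]": sha1[-5:],
                "crc32[:5]": crc[:5], "crc32[-5:]": crc[-5:]}

    for text, expected in SAMPLES.items():
        hits = [name for name, value in candidates(text.encode()).items() if value == expected]
        print(text, expected, hits or "no match")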
This sounds mildly illicit.
Not to rain on your parade or anything, but if the implementors have done their work right, you wouldn't notice lags beyond a few tens of milliseconds on modern CPUs even with strong cryptographic hashes, and knowing the algorithm won't help you if they have used salt correctly. If you don't have access to the code or binaries, your only hope is a trivial mistake, whether caused by technical limitations or carelessness.
There is an uncountable infinity of potential (hash) functions for any given set of inputs and outputs, and if you have no clue better than an upper bound on their computational complexity (from the lag you detect), you have a very long search ahead of you...

two-way keyed encryption/hash algorithm

I am no way experienced in this type of thing so I am not even sure of the keywords (hence the title).
Basically I need a two way function
encrypt(w,x,y) = z
decrypt(z) = w, x, y
Where w = integer
x = string (username)
y = unix timestamp
and z is an 8-digit number (possibly including letters; the spec isn't final yet).
I would like z to be not easily guessable and easily verifiable. Speed isn't a huge concern, security isn't either. Tracking one-to-one relationship is the main requirement.
Any resources or direction would be appreciated.
EDIT
Thanks for the answers, learning a lot. So to clarify, 8 characters is the only hard requirement, along with the ability to link w <-> z. The username (x) and timestamp (y) would be considered icing on the cake.
I would like to do this mathematically rather than doing some database lookups, if possible.
If I had to finish this tonight, I could just find a fitting hash algorithm and use a lookup table. I am simply trying to expand my understanding of this type of thing and see if I could do it mathematically.
Encryption vs. Hashing
This is an encryption problem, since the original information needs to be recovered. The quality of a cryptographic hash is judged by how difficult it is to reverse the hash and recover the original information, so hashing is not applicable here.
To perform encryption, some key material is needed. There are many encryption algorithms, but they fall into two main groups: symmetric and asymmetric.
Application
The application here isn't clear. But if you are "encrypting" some information and sending it somewhere, then later getting it back and doing something with it, symmetric encryption is the way to go. For example, say you want to encode a user name, an IP address, and some identifier from your application in a parameter that you include in a link in some HTML. When the user clicks the link, that parameter is passed back to your application and you decode it to recover the original information. That's a great fit for symmetric encryption, because the sender and the recipient are the same party, and key exchange is a no-op.
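As a concrete illustration of that round trip, a sketch using the Fernet recipe from the third-party cryptography package (a symmetric, authenticated scheme); note that the tokens it produces are far longer than 8 characters:
    # pip install cryptography
    from cryptography.fernet import Fernet

    key = Fernet.generate_key()    # the shared secret; generated once and kept server-side
    box = Fernet(key)

    token = box.encrypt(b"alice|192.0.2.7|order-1234")   # goes out in the link
    print(token)
    print(box.decrypt(token))      # b'alice|192.0.2.7|order-1234' when the link comes back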
Background
In symmetric encryption, the sender and recipient need to know the same key, but keep it secret from everyone else. As a simple example, two people could meet in person, and decide on a password. Later on, they could use that password to keep their email to each other private. However, anyone who overhears the password exchange will be able to spy on them; the exchange has to happen over a secure channel... but if you had a secure channel to begin with, you wouldn't need to exchange a new password.
In asymmetric encryption, each party creates a pair of keys. One is public, and can be freely distributed to anyone who wants to send a private message. The other is private. Only the message recipient knows that private key.
A big advantage to symmetric encryption is that it is fast. All well-designed protocols use a symmetric algorithm to encrypt large amounts of data. The downside is that it can be difficult to exchange keys securely—what if you can't "meet up" (virtually or physically) in a secure place to agree on a password?
Since public keys can be freely shared, two people can exchange a private message over an insecure channel without having previously agreed on a key. However, asymmetric encryption is much slower, so it's usually used to encrypt a symmetric key or perform "key agreement" for a symmetric cipher. SSL and most cryptographic protocols go through a handshake where asymmetric encryption is used to set up a symmetric key, which is used to protect the rest of the conversation.
You just need to encrypt a serialization of (w, x, y) with a private key. Use the same private key to decrypt it.
In any case, you can't simply bound the size of z like that, since it depends on the size of the serialization (and since it needs to be two-way, there's a limit on how much you can compress, depending on the entropy).
And you are not looking for a hash function, since it would obviously lose some information and you wouldn't be able to reverse it.
EDIT: Since the size of z is a hard limit, you need to restrict the input to 8 bytes and choose an encryption technique that uses a 64-bit (or smaller) block size. Blowfish and Triple DES use 64-bit blocks, but remember that those algorithms haven't received the same scrutiny as AES.
If you want something really simple and quite insecure, just XOR your input with a secret key.
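For completeness, that XOR idea in a few lines (again: trivially breakable, shown only because the answer mentions it; the key is made up):
    from itertools import cycle

    KEY = b"hypothetical secret"

    def xor_mask(data: bytes) -> bytes:
        # Applying the same function twice restores the input (x ^ k ^ k == x).
        return bytes(a ^ b for a, b in zip(data, cycle(KEY)))

    scrambled = xor_mask(b"42|bob|1257894000")
    print(xor_mask(scrambled))   # b'42|bob|1257894000'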
You probably can't.
Let's say that w is 32 bits, x supports at least 8 case-insensitive ASCII chars, so at least 37 bits, and y is 32 bits (gets you to 2038, and 31 bits doesn't even get you to now).
So, that's a total of at least 101 bits of data. You're trying to store it in an 8 digit number. It's mathematically impossible to create an invertible function from a larger set to a smaller set, so you'd need to store more than 12.5 bits per "digit".
Of course if you go to more than 8 characters, or if your characters are 16 bit unicode, then you're at least in with a chance.
Let's formalize your problem, to better study it.
Let k be a key from the set K of possible keys, and (w, x, y) a piece of information, from a set I, that we need to crypt. Let's define the set of "crypted messages" as A^8, where A is the alphabet from which we extract the characters for our crypted message (A = {0, 1, ..., 9, a, b, ..., z, ... }, depending on your specs, as you said).
We define the two functions:
crypt: I * K --> A^8
decrypt: A^8 * K --> I
The problem here is that the size of the set A^8, of crypted-messages, might be smaller than the set of pieces of information (w, x, y). If this is so, it is simply impossible to achieve what you are looking for, unless we try something different...
Let's say that only YOU (or your server, or your application on your server) have to be able to calculate (w, x, y) from z. That is, you might send z to someone, and you don't care that they will not be able to decrypt it.
In this case, what you can do is use a database on your server. You will crypt the information using a well-known algorithm, then you generate a random number z. You define the table:
Id: char[8]
CryptedInformation: byte[]
You will then store z on the Id column, and the crypted information on the corresponding column.
When you need to decrypt the information, someone will give you z, the index of the crypted information, and then you can proceed to decryption.
However, if this works for you, you might not even need to crypt the information, you could have a table:
Id: char[8]
Integer: int
Username: char[]
Timestamp: DateTime
And use the same method, without crypting anything.
This can be applied to an "e-mail verification system" on a subscription process, for example. The link you would send to the user by mail would contain z.
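A minimal sketch of that lookup-table variant, using SQLite in memory and a random 8-character id (the column names are my own shorthand for the table above):
    import secrets
    import sqlite3
    import string

    ALPHABET = string.ascii_lowercase + string.digits

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE tokens (id TEXT PRIMARY KEY, w INT, username TEXT, ts INT)")

    def issue(w: int, username: str, ts: int) -> str:
        z = "".join(secrets.choice(ALPHABET) for _ in range(8))   # the 8-character handle
        db.execute("INSERT INTO tokens VALUES (?, ?, ?, ?)", (z, w, username, ts))
        return z

    def redeem(z: str):
        return db.execute("SELECT w, username, ts FROM tokens WHERE id = ?",
                          (z,)).fetchone()   # (w, x, y), or None if z was never issued

    z = issue(42, "bob", 1257894000)
    print(z, redeem(z))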
Hope this helps.
I can't tell if you are trying to set this up as a way to store passwords, but if you are, you should not use a two-way hash function.
If you really want to do what you described, you should just concatenate the string and the timestamp (fill in extra spaces with underscores or something). Take that resulting string, convert it to ASCII or UTF-8 or something, and find its value modulo the largest prime less than 10^8.
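A literal reading of that recipe as a sketch (note that the result is one-way: you cannot get the username and timestamp back out of it):
    def largest_prime_below(limit: int) -> int:
        def is_prime(k: int) -> bool:
            if k < 2:
                return False
            d = 2
            while d * d <= k:
                if k % d == 0:
                    return False
                d += 1
            return True
        k = limit - 1
        while not is_prime(k):
            k -= 1
        return k

    P = largest_prime_below(10**8)

    def code_for(username: str, timestamp: int) -> int:
        blob = f"{username}_{timestamp}".encode("utf-8")
        return int.from_bytes(blob, "big") % P

    print(P, code_for("bob", 1257894000))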
Encryption or no encryption, I do not think it is possible to pack that much information into an 8 digit number in such a way that you will ever be able to get it out again.
An integer is 4 bytes. Let's assume your username is limited to 8 characters, and that characters are bytes. Then the timestamp is at least another 4 bytes. That's 16 bytes right there. In hex, that will take 32 digits. Base36 or something will be less, but it's not going to be anywhere near 8.
Hashes are by definition one-way: once hashed, it is very difficult to get the original value back again.
For two-way encryption I would look at TripleDES, which .NET has baked right in with TripleDESCryptoServiceProvider.
A fairly straightforward implementation article.
EDIT
It has been mentioned below that you cannot cram a lot of information into a small encrypted value. However, for many (not all) situations this is exactly what bit masks exist to solve.
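For illustration, a tiny sketch of packing and unpacking two fields with shifts and masks; the field widths are arbitrary, and note that 8 decimal digits only hold about 26.5 bits, so real widths have to be chosen to fit:
    W_BITS, TS_BITS = 10, 16     # hypothetical field widths

    def pack(w: int, ts: int) -> int:
        assert w < 2**W_BITS and ts < 2**TS_BITS
        return (w << TS_BITS) | ts

    def unpack(z: int) -> tuple:
        return z >> TS_BITS, z & (2**TS_BITS - 1)

    z = pack(42, 12345)
    print(z, unpack(z))    # 2764857 (42, 12345)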
