Secure Random hex digits only - ruby

I'm trying to generate random digits with the SecureRandom class in Rails. Can SecureRandom.hex produce a random string that contains only digits and no letters?
For example:
Instead of
SecureRandom.hex(4)
=> "95bf7267"
It should give
SecureRandom.hex(4)
=> "95237267"

Check out the api for SecureRandom: http://rails.rubyonrails.org/classes/ActiveSupport/SecureRandom.html
I believe you're looking for a different method: #random_number.
SecureRandom.random_number(a_big_number)
Since #hex returns a hexadecimal number, it would be unusual to ask for a random result that contained only numerical characters.
For basic use cases, it's simple enough to use #rand.
rand(9999)
Edited:
I'm not aware of a library that generates a random number of specified length, but it seems simple enough to write one. Here's my pass at it:
def rand_by_length(length)
  rand((9.to_s * length).to_i).to_s.center(length, rand(9).to_s).to_i
end
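The method #rand_by_length takes an integer specifying the length as a param and tries to generate a random number with at most that many digits. String#center pads any missing positions with a single randomly chosen digit (the same digit is repeated on both sides), so the method calls #rand exactly twice. Note that rand(9) never yields 9, and a leading 0 is dropped by the final #to_i, so the result can occasionally be shorter than length. That may serve your need.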

Numeric IDs are good because they are easier to read over the phone (no "c for charlie").
Try this
length = 20
id = (SecureRandom.random_number * (10**length)).round.to_s # => "98075825200269950976"
and for bonus points break it up for easier reading
id.split(//).each_slice(4).to_a.map(&:join).join('-') # => "9807-5825-2002-6995-0976"
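One caveat with the float-based version: a Float carries only about 15 to 17 significant decimal digits, so for length = 20 the trailing digits are not fully random, and the string comes out shorter whenever the leading digits happen to be zero. A minimal sketch that avoids both issues, assuming it is acceptable to pass an Integer bound to SecureRandom.random_number; the helper names random_digits and group_digits are hypothetical:

```ruby
require 'securerandom'

# Hypothetical helper: a string of exactly `length` random decimal digits.
def random_digits(length)
  # With an Integer argument n, random_number returns an Integer in 0...n,
  # so every digit comes from the secure generator. rjust restores leading zeros.
  SecureRandom.random_number(10**length).to_s.rjust(length, '0')
end

# Hypothetical helper: insert a dash after every `size` digits.
def group_digits(digits, size = 4)
  digits.scan(/\d{1,#{size}}/).join('-')
end

id = random_digits(20)
puts id
puts group_digits(id)
```

Because the value is kept as a string, leading zeros survive; converting it back with to_i would drop them.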

This will create a number of the desired length.
length = 11
rand(10**(length - 1)..(10**length) - 1).to_s

length = 4
[*'0'..'9'].sample(length).join
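As simple as that :) Note that Array#sample never picks the same element twice, so this yields distinct digits only and cannot produce more than 10 of them.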

Related

`sum` function not working as expected in case of large strings

In Ruby, #sum is used to calculate
Sum of array
Sum of an array based on a function or condition
Sum of ASCII codepoints (ord) in a string (not char array) i.e. 'abcd'.sum # => 394
The problem with the third one is the following
For the string below,
AwotIJHOAIJSRoieJHOjasOIADaoiHAOHJAOIJGOIajdOIQWJTOIGJDOINCOIASORIOGIMAOIMEORIQEMOIGMEOIFMASKDJQOWJGOJOASJOIQWOGIMASOIDMOQWIROQIGJOIAMSFOAIJGIHIWUNVNZMXCNXCKJQOWRIEOGSDGSPOKSDLAMKMROQIJRDFLKMZXOIAJSQPIRKLMAdglkaSFAJOIAJFOIQWJEOIQJKAMCLKACMALKSDLAKWEQANLEIRJRQFIJAOIVAWOTIJHOAIJSROIEJHOJASOIADAOIHAOHJAOIJGOIAJDOIQWJTOIGJDOINCOIASORIOGIMAOIMEORIQEMOIGMASODLQWKEJOIFJLKMALSKQIOWELKMZLXKMFALSFJQOIWEAOISFWIDHGPSODRJAWOPIJHOIDJOIAJTGIOJAORAJWOIJHOFMAOIFMOIPDMOAIPWJTOPIJDOIFjawoiRJOIpjmaioGJIGHAIJRHQHQIUEIvnaksJDNWIORQIOPEGHIDVNAJKNASIPHRQEUITHIUHDNAJSNWIHJQIWJQEOIGOIDVNAKOSDNAOPWPJQOPIWTJQEOIPGDPJFNASPJNQWOIRQWIOTOIVNAKSFNAIOAWOTIJHOAIJSROIEJHOJASOIADAOIHAOHJAOIJGOIAJDOIQWJTOIGJDOINCOIASORIOGIMAOIMEORIQEMOIGMASODLQWKEJOIFJLKMALSKQIOWELKMZLXKMFALSFJQOIWEAOISFWIDHGPSODRJAWOPIJHOIDJoiajTGIOJAORAJWOIJHOFMAOIFMOIPDMOAIPWJTOPIJDOIFJAWOIRJOIPJMAIOGJIGHAIJRHQHQIUEIVNAKSJDNWIORQIOPEGHIDVIPNWIHJQIWJQEOIGOIDVNAKOSDNAOPWPJQOPIWTJqeoIPGDPJFNASPJNQWJQWOIRJgonasKFAWOEJQWOIJOGALKFNASLFKqeqOFIJAOISFJAOISFJAWOI
which is large, (of 1000 characters), the following program doesn't work
putc gets.upcase.sum/~/$/
It works for all other strings of lesser size. The output of the above must be K. But it shows \9
But if I do this
putc gets.upcase.chars.sum(&:ord)/~/$/
It shows K. But the former gives the correct output for all other strings, except large ones like this.
What is wrong here?
Sum of ASCII codepoints (ord) in a string (not char array) i.e. 'abcd'.sum # => 394
I've actually never heard of String#sum before, despite being fairly knowledgeable in the language. So I looked it up:
Returns a basic n-bit checksum of the characters in str, where n is the optional Integer parameter, defaulting to 16. The result is simply the sum of the binary value of each byte in str modulo 2**n - 1. This is not a particularly good checksum.
And sure enough, using your example input string, that's why we get:
str.chars.map(&:ord).sum
# => 77090
str.sum
# => 11554
The values differ because String#sum defaults to a 16-bit checksum, so the total wraps around once it reaches 2**16 (65536): 77090 % 2**16 == 11554.
If you use a larger value for n, the (check)sum is what you expected:
str.sum(100)
#=> 77090
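The wrap-around is easy to reproduce with a synthetic input. The sketch below uses a made-up string of 1000 "a" characters (not the string from the question) and compares the plain byte total with String#sum at two checksum widths:

```ruby
text = 'a' * 1000           # 1000 bytes, each with value 97

full_sum = text.bytes.sum   # plain integer total, no wrap-around
checksum = text.sum         # 16-bit checksum (default n = 16)

puts full_sum               # 97000
puts checksum               # 31464, i.e. 97000 % 2**16
puts text.sum(32)           # 97000, a 32-bit checksum is wide enough here
```

For short strings the total stays below 65536, which is why the one-liner appeared to work on smaller inputs.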

Generate increasing random number rails

I have this random number generator code:
5.times.map { [*0..9].sample }.join.to_i
It gives me random numbers like 63832, 42337, 34998. As you can see, they are completely random, but how can I make them come out in increasing order? Not 63832, 42337, 34998, but 34998, 42337, 63832. (This is just an example; ideally I would get something like 00[number] => 0025, where 25 is the random number that was generated.)
Hope my explanation is understandable :)
If you have the current / last random number, you can generate a larger one by simply adding a random number to it, e.g:
def generate(base = 0)
  base + rand(1_000..10_000)
end
number = generate #=> 9635
number = generate(number) #=> 17761
number = generate(number) #=> 22082
number = generate(number) #=> 31061
Each number is 1,000 to 10,000 larger than its predecessor.
An alternative approach, if you want to generate all random numbers within a known range:
[*1..10000].sample(5).sort
# => [602, 5608, 7912, 8384, 8714]
However, this only works if you want to fetch all random numbers upfront, rather than continuously being able to generate new ones which are larger.
It's also not a good approach if your upper limit is very big - e.g. this will freeze your system and need to be cancelled:
[*1..10000000000].sample(5).sort
...But in that case, since the numbers are so huge, you can surely get away with the tiny risk of having a collision:
5.times.map{ rand(1..10000000000) }.sort
# => [460188573, 555213355, 3576967759, 3994239233, 9570165205]
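The question also asks for zero-padded output such as 0025. A minimal sketch that combines an incrementing generator with fixed-width padding; the helper names next_number and padded, the step size of 100, and the width of 4 are all assumptions chosen for illustration:

```ruby
# Hypothetical helper: the next value is 1..step larger than the previous one.
def next_number(previous = 0, step = 100)
  previous + rand(1..step)
end

# Hypothetical helper: left-pad with zeros to a fixed width, e.g. 25 -> "0025".
def padded(number, width = 4)
  number.to_s.rjust(width, '0')
end

numbers = []
current = 0
5.times do
  current = next_number(current)
  numbers << padded(current)
end
p numbers
```

Padding only affects the display; the comparison that keeps the sequence increasing is still done on the integer values.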

Hashing a long integer ID into a smaller string

Here is the problem: I need to transform an ID (defined as a long integer) into a shorter alphanumeric identifier. The details are the following:
Each individual in the problem has a unique ID, a long integer of size 13 (something like 123123412341234).
I need to generate a shorter representation of this unique ID, an alphanumeric string, something like A1CB3X. The problem is that a length of 5 or 6 characters is not enough to represent such a large integer.
The new ID (eg A1CB3X) should be valid in a context where we know that only a small number of individuals are present (less than 500). The new ID should be unique within that small set of individuals.
The new ID (eg A1CB3X) should be the result of a calculation made over the original ID. This means that taking the original ID elsewhere and applying the same calculation, we should get the same new ID (eg A1CB3X).
This calculation should occur when the individual is added to the set, meaning that not all individuals belonging to that set will be known at that time.
Any directions on how to solve such a problem?
Assuming that you don't need a formula that goes in both directions (which is impossible if you are reducing a 13-digit number to a 5 or 6-character alphanum string):
If you can have up to 6 alphanumeric characters, that gives you 36^6 = 2,176,782,336 possibilities, assuming only numbers and uppercase letters.
To map your larger 13-digit number onto this space, you can take it modulo some prime number slightly smaller than that, for example 2,176,782,317, then encode the result with base-36 encoding.
alphanum_id = base36encode(longnumber_id % 2176782317)
For a set of 500, this gives you a
1 - (2176782317 P 500) / 2176782317^500 chance of a collision
(where n P k is the number of k-permutations of n)
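To put a number on that, here is a minimal Ruby sketch of the modulo-then-base-36 idea together with the collision estimate. It assumes the reduced IDs are spread uniformly over the modulus, which is an idealization; the names MODULUS, short_id, and collision_probability are hypothetical.

```ruby
MODULUS = 2_176_782_317 # the value suggested above, just below 36**6

# Hypothetical helper: reduce the long ID and write the remainder in base 36.
def short_id(long_id)
  (long_id % MODULUS).to_s(36).upcase
end

# Probability that at least two of `count` uniformly distributed values
# coincide in a space of `space` possibilities (the birthday problem).
def collision_probability(count, space)
  no_collision = (0...count).reduce(1.0) { |acc, i| acc * (space - i) / space }
  1.0 - no_collision
end

puts short_id(123123412341234)
puts collision_probability(500, MODULUS)
```

Under that uniformity assumption the estimate for 500 individuals comes out at roughly 6 in 100,000, so collisions are unlikely but not impossible, and the mapping cannot be reversed.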
Best option is to change the base to 62 using case sensitive characters
If you want it to be shorter, you can add unicode characters. See below.
Here is javascript code for you: https://jsfiddle.net/vewmdt85/1/
function compress(n) {
  var symbols = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïð'.split('');
  var d = n;
  var compressed = '';
  while (d >= 1) {
    compressed = symbols[(d - (symbols.length * Math.floor(d / symbols.length)))] + compressed;
    d = Math.floor(d / symbols.length);
  }
  return compressed;
}
$('input').keyup(function() {
  $('span').html(compress($(this).val()))
})
$('span').html(compress($('input').val()))
How about using some base-X conversion, for example 123123412341234 becomes 17N644R7CI in base-36 and 9999999999999 becomes 3JLXPT2PR?
If you need a mapping that works both directions, you can simply go for a larger base.
Meaning: using base 16, you can represent each of the values 0 to 15 with a single character.
So, base36 is the "maximum" that allows for shorter strings (when 1-1 mapping is required)!
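In Ruby, a reversible base-36 conversion needs no extra library, since Integer#to_s and String#to_i both accept a radix of up to 36. A minimal round-trip sketch using the sample ID from the question:

```ruby
id = 123123412341234

encoded = id.to_s(36).upcase   # Integer#to_s accepts a radix from 2 to 36
decoded = encoded.to_i(36)     # String#to_i reverses it (case-insensitive)

puts encoded
puts decoded == id             # true
```

The encoded form is shorter than the decimal one (10 characters instead of 15), but still longer than the 5 or 6 characters asked for, which is why a lossless mapping cannot meet that length.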

Generating an Instagram- or Youtube-like unguessable string ID in ruby/ActiveRecord

Upon creating an instance of a given ActiveRecord model object, I need to generate a shortish (6-8 characters) unique string to use as an identifier in URLs, in the style of Instagram's photo URLs (like http://instagram.com/p/P541i4ErdL/, which I just scrambled to be a 404) or Youtube's video URLs (like http://www.youtube.com/watch?v=oHg5SJYRHA0).
What's the best way to go about doing this? Is it easiest to just create a random string repeatedly until it's unique? Is there a way to hash/shuffle the integer id in such a way that users can't hack the URL by changing one character (like I did with the 404'd Instagram link above) and end up at a new record?
Here's a good method with no collision already implemented in plpgsql.
First step: consider the pseudo_encrypt function from the PG wiki.
This function takes a 32-bit integer as its argument and returns a 32-bit integer that looks random to the human eye but uniquely corresponds to its argument (so that's encryption, not hashing). Inside the function, you may replace the formula (((1366.0 * r1 + 150889) % 714025) / 714025.0) with another function, known only to you, that produces a result in the [0..1] range (just tweaking the constants will probably be good enough; see my attempt at doing just that below). Refer to the Wikipedia article on the Feistel cipher for a more theoretical explanation.
Second step: encode the output number in the alphabet of your choice. Here's a function that does it in base 62 with all alphanumeric characters.
CREATE OR REPLACE FUNCTION stringify_bigint(n bigint) RETURNS text
    LANGUAGE plpgsql IMMUTABLE STRICT AS $$
DECLARE
    alphabet text:='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
    base int:=length(alphabet);
    _n bigint:=abs(n);
    output text:='';
BEGIN
    LOOP
        -- append the next digit, least significant first
        output := output || substr(alphabet, 1+(_n%base)::int, 1);
        _n := _n / base;
        EXIT WHEN _n=0;
    END LOOP;
    RETURN output;
END $$;
Now here's what we'd get for the first 10 URLs corresponding to a monotonic sequence:
select stringify_bigint(pseudo_encrypt(i)) from generate_series(1,10) as i;
stringify_bigint
------------------
tWJbwb
eDUHNb
0k3W4b
w9dtmc
wWoCi
2hVQz
PyOoR
cjzW8
bIGoqb
A5tDHb
The results look random and are guaranteed to be unique in the entire output space (2^32 or about 4 billion values if you use the entire input space with negative integers as well).
If 4 billion values are not enough, you may carefully combine two 32-bit results to get to 64 bits without losing uniqueness in the outputs. The tricky parts are dealing correctly with the sign bit and avoiding overflows.
About modifying the function to generate your own unique results: let's change the constant from 1366.0 to 1367.0 in the function body, and retry the test above. See how the results are completely different:
NprBxb
sY38Ob
urrF6b
OjKVnc
vdS7j
uEfEB
3zuaT
0fjsab
j7OYrb
PYiwJb
Update: For those who can compile a C extension, a good replacement for pseudo_encrypt() is range_encrypt_element() from the permuteseq extension, which has the following advantages:
works with any output space up to 64 bits, and it doesn't have to be a power of 2.
uses a secret 64-bit key for unguessable sequences.
is much faster, if that matters.
You could do something like this:
random_attribute.rb
module RandomAttribute
  def generate_unique_random_base64(attribute, n)
    until random_is_unique?(attribute)
      self.send(:"#{attribute}=", random_base64(n))
    end
  end

  def generate_unique_random_hex(attribute, n)
    until random_is_unique?(attribute)
      self.send(:"#{attribute}=", SecureRandom.hex(n/2))
    end
  end

  private

  def random_is_unique?(attribute)
    val = self.send(:"#{attribute}")
    val && !self.class.send(:"find_by_#{attribute}", val)
  end

  def random_base64(n)
    val = base64_url
    val += base64_url while val.length < n
    val.slice(0..(n-1))
  end

  def base64_url
    SecureRandom.base64(60).downcase.gsub(/\W/, '')
  end
end
user.rb
class Post < ActiveRecord::Base
  include RandomAttribute
  before_validation :generate_key, on: :create

  private

  def generate_key
    generate_unique_random_hex(:key, 32)
  end
end
You can hash the id:
Digest::MD5.hexdigest('1')[0..9]
=> "c4ca4238a0"
Digest::MD5.hexdigest('2')[0..9]
=> "c81e728d9d"
But somebody can still guess what you're doing and iterate that way. It's probably better to hash on the content
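One way to make a hashed id harder to iterate is to mix in a secret, for example with an HMAC from Ruby's standard openssl library. This is a sketch under assumptions, not part of the answer above: SECRET_KEY and keyed_token are hypothetical names, and a real key would come from configuration rather than source code.

```ruby
require 'openssl'

# Assumption: a hypothetical application secret, shown inline only for the demo.
SECRET_KEY = 'replace-with-a-real-secret'

# Hypothetical helper: keyed digest of the record id, cut to `length` hex characters.
def keyed_token(id, length = 10)
  digest = OpenSSL::Digest.new('SHA256')
  OpenSSL::HMAC.hexdigest(digest, SECRET_KEY, id.to_s)[0, length]
end

puts keyed_token(1)
puts keyed_token(2)
```

Truncating the digest means two ids can still map to the same token, so a uniqueness check on the stored column remains necessary.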

Generating integer within range from unique string in ruby

I have code that should take a unique string (for example, "d86c52ec8b7e8a2ea315109627888fe6228d") from the client and return an integer greater than 2200000000 and less than 5800000000. It's important that the generated integer is not random: each unique string must always map to the same integer. What is the best way to generate it without using a DB?
Now it looks like this:
did = "d86c52ec8b7e8a2ea315109627888fe6228d"
min_cid = 2200000000
max_cid = 5800000000
cid = did.hash.abs.to_s.split.last(10).to_s.to_i
if cid < min_cid
cid += min_cid
else
while cid > max_cid
cid -= 1000000000
end
end
Here's the problem: your range of numbers has only 3.6x10^9 possible values, whereas your sample unique string (which looks like a hex integer with 36 digits) has 16^36 possible values (i.e. many more). So when mapping your string into your integer range there will be collisions.
The mapping function itself can be pretty straightforward, I would do something such as below (also, consider using only a part of the input string for integer conversion, e.g. the first seven digits, if performance becomes critical):
def my_hash(str, min, max)
  range = (max - min).abs
  (str.to_i(16) % range) + min
end
my_hash(did, min_cid, max_cid) # => 2461595789
[Edit] If you are using Ruby 1.8 and your adjusted range can be represented as a Fixnum, just use the hash value of the input string object instead of parsing it as a big integer. Note that this strategy might not be safe in Ruby 1.9 (per the comment by @DataWraith), as object hash values may be randomized between invocations of the interpreter, so you would not get the same hash number for the same input string when you restart your application:
def hash_range(obj, min, max)
  (obj.hash % (max-min).abs) + [min, max].min
end
hash_range(did, min_cid, max_cid) # => 3886226395
And, of course, you'll have to decide what to do about collisions. You'll likely have to persist a bucket of input strings which map to the same value and decide how to resolve the conflicts if you are looking up by the mapped value.
You could generate a 32-bit CRC, drop one bit, and add the result to 2.2 billion. That gives you a maximum value of about 4.35 billion.
Alternately you could use all 32 bits of the CRC, but when the result is too large, append a zero to the input string and recalculate, repeating until you get a value in range.
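Both CRC suggestions can be sketched with Zlib.crc32 from Ruby's standard library. The method names cid_from_crc31 and cid_from_crc32 are hypothetical, and the lower bound is treated as inclusive here for simplicity:

```ruby
require 'zlib'

MIN_CID = 2_200_000_000
MAX_CID = 5_800_000_000

# First suggestion: keep 31 bits of the CRC and shift the result into the range.
def cid_from_crc31(str)
  (Zlib.crc32(str) & 0x7FFF_FFFF) + MIN_CID
end

# Second suggestion: use all 32 bits and, while the shifted value is too
# large, append a "0" to a copy of the input and recalculate.
def cid_from_crc32(str)
  input = str.dup
  loop do
    cid = Zlib.crc32(input) + MIN_CID
    return cid if cid < MAX_CID
    input << '0'
  end
end

did = 'd86c52ec8b7e8a2ea315109627888fe6228d'
puts cid_from_crc31(did)
puts cid_from_crc32(did)
```

Unlike String#hash, the CRC does not change between interpreter runs, so the same string keeps the same integer after a restart. Collisions between different strings are still possible, as with any mapping into a smaller range.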

Resources