Solidity - Generate unpredictable random number that does not depend on input

I know that "how to generate a random number" in Solidity is a very common question. However, after reading the great majority of the answers, I did not find one that fits my case.
A short description of what I want to do: I have a list of objects that each have a unique id, a number. I need to produce a list that contains 25% of those objects, randomly selected each time the function is called. The caller cannot be trusted to provide input, because that input could influence the resulting list in a predictable way.
The only answer I found that gives a secure random number was here. However, it depends on input coming from the participants and it is meant to address a gambling scenario. I cannot use it in my implementation.
All other answers mention that the generated number is going to be predictable, and even some of those depend on a single input to produce a single random number. Once again, that does not help me.
To summarise, I need a function that will give me multiple, non-predictable random numbers.
Thanks for any help.

Here is an option:
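function rand()
    public
    view
    returns (uint256)
{
    // Mix block-level values and the caller's address into a single hash.
    // Note: `now` is an alias for block.timestamp (removed in Solidity 0.7).
    uint256 seed = uint256(keccak256(abi.encodePacked(
        block.timestamp + block.difficulty +
        ((uint256(keccak256(abi.encodePacked(block.coinbase)))) / (now)) +
        block.gaslimit +
        ((uint256(keccak256(abi.encodePacked(msg.sender)))) / (now)) +
        block.number
    )));
    // Equivalent to seed % 1000: keeps the result in the range 0-999.
    return (seed - ((seed / 1000) * 1000));
}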
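It generates a pseudo-random number between 0 and 999. It is hard to guess from outside the chain, but every input is a block variable or the sender's address, so a miner, or another contract calling it in the same transaction, can compute the result in advance. (This pattern has been used by some famous Dapps like Fomo3D.)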

Smart contracts are deterministic, so basically every function is predictable: if we know the input, we can (and should be able to) know the output. And you cannot get a random number without any input; almost every language generates its "pseudo-random" numbers from some seed such as the clock. This means you will not get a random number on the blockchain using a simple method.
There are many interesting methods for generating random numbers in a smart contract (using a DAO, an oracle, etc.), but they all have trade-offs.
So, in conclusion, the method you are looking for does not exist. You need to sacrifice something.
:(

100% randomness is definitely impossible on Ethereum. The reason is that when distributed nodes build the blockchain from scratch, they reconstruct the state by running every single transaction ever recorded on the chain, and all of them have to arrive at exactly the same final state. For that to work, randomness is completely excluded from the Ethereum Virtual Machine; otherwise each execution of the same code could yield a different result, which would make it impossible for all participants of the network to reach a common final state.
That being said, there are projects like RanDAO that aim to create trustworthy pseudorandomness on the blockchain.
In any case, there are approaches to achieving pseudorandomness, two of the most important being commit-reveal techniques and using an oracle (or a combination of both).
As an example that just occurred to me: you could use Oraclize to periodically call a trusted external JSON API that returns pseudorandom numbers, and verify in the contract that the call has really been performed.
Of course, the downside of these methods is that you and/or your users will have to spend more gas executing the smart contracts, but in my opinion that is a fair price for the huge security benefits.
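To make the commit-reveal idea mentioned above a bit more concrete, here is a minimal off-chain sketch in Python (not Solidity, and not RanDAO's actual protocol). The names `commit` and `combine` are made up for illustration, and SHA-256 simply stands in for whatever hash a contract would use. Each participant first publishes a hash of a secret and only later reveals the secret; the seed is derived from all reveals, so nobody can pick their contribution after seeing the others.

```python
import hashlib
import secrets


def commit(secret):
    """Phase 1: publish only the hash of the secret (hypothetical helper)."""
    return hashlib.sha256(secret).digest()


def combine(commitments, reveals):
    """Phase 2: check each reveal against its commitment, then mix them all."""
    if len(commitments) != len(reveals):
        raise ValueError("every commitment needs a reveal")
    mixer = hashlib.sha256()
    for commitment, reveal in zip(commitments, reveals):
        if commit(reveal) != commitment:
            raise ValueError("reveal does not match commitment")
        mixer.update(reveal)
    return int.from_bytes(mixer.digest(), "big")


# Three participants, each with a private 32-byte secret.
participant_secrets = [secrets.token_bytes(32) for _ in range(3)]
commitments = [commit(s) for s in participant_secrets]
seed = combine(commitments, participant_secrets)
```

A known weakness of the plain scheme is that the last participant to reveal can refuse to do so after seeing everyone else's values, which is why real deployments usually add deposits or penalties.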

Related

Can I rely on Go's `crypto/rand` package to give me unique random string?

I want to generate 32 characters long unique unguessable alphanumeric secret keys. The secret key will be an identifier for my system and will be used to look up information.
While searching the web I stumbled upon Go's crypto/rand package, which can generate random alphanumerics with the help of underlying system calls. But I am concerned that the values returned by the crypto/rand package might produce a non-unique string down the line.
Can anyone clarify if I can rely on the crypto/rand package for the job?
Of course with randomly generated tokens, there is always the possibility of generating a duplicate token. There are standards such as UUID (excluding v4) that use other methods to try to "guarantee" uniqueness of each identifier. These methods do not truly obviate the possibility of collisions, they just shift the failure modes. For example, UUID1 relies on uniqueness of MAC addresses, which is a whole issue of its own.
If you are not limited by the size of your tokens, you can easily pick a sufficiently large number of bits that the probability of collisions becomes so small that it is completely dwarfed by countless other failure modes (such as programmer error, cosmic rays, a mass global extinction event, etc.).
Very approximately, if you have a true random key length of N bits, you can generate 2^(N/2) keys before having a 50% chance of seeing collisions. See the Wikipedia page for UUID#Collisions for a more general formula.
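As a rough illustration of that rule of thumb, the sketch below uses the standard birthday approximation p ≈ 1 - exp(-n² / 2^(N+1)). It is written in Python rather than Go purely for brevity, the function names are made up, and it assumes the keys are uniformly random.

```python
import math


def collision_probability(n_keys, bits):
    """Approximate chance of at least one collision among n_keys
    uniformly random values of `bits` bits (birthday approximation)."""
    return -math.expm1(-(n_keys * n_keys) / 2.0 ** (bits + 1))


def keys_for_probability(p, bits):
    """Approximate number of keys at which the collision chance reaches p."""
    return math.sqrt(2.0 ** (bits + 1) * math.log(1.0 / (1.0 - p)))


# A 32-character alphanumeric key carries about 32 * log2(62) bits.
token_bits = 32 * math.log2(62)
p_trillion = collision_probability(10**12, token_bits)
```

With exactly 2^(N/2) keys the approximation gives a probability somewhat below one half, which is consistent with the "very approximately" above.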
With crypto/rand there is no guarantee that an individual random number will not occur more than once. The probability of this happening is very low, however, and it may be good enough for your use case. In many cases UUID will be good enough. If you are curious about the probability of duplicate UUIDs, see Wikipedia for example.
If you really need true uniqueness you may want to combine random numbers with a map to record them, where the number serves as key and the value is a "don't care". While recording the numbers, duplicates can be detected and a new random number can be requested in that case. However, this approach may introduce a new challenge depending on your setting, as the numbers are now kept in memory, which is insecure per se. It will also be challenging in terms of complexity if your use case does not determine the quantity of secrets required during the lifetime of the system.
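The record-and-retry idea could look roughly like the following. This is a Python sketch rather than Go, using the standard `secrets` module as the cryptographic source and a set in place of the map; `new_token` and `new_unique_token` are hypothetical names, and in a real system the "seen" record would more likely be a unique index in the database than an in-memory set.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # 62 alphanumeric symbols


def new_token(length=32):
    """Draw each character from the OS-backed CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def new_unique_token(seen, length=32):
    """Retry until the token is not already recorded, then record it."""
    while True:
        token = new_token(length)
        if token not in seen:
            seen.add(token)
            return token


issued = set()
key = new_unique_token(issued)
```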
For me, it really boils down to the question whether the identifiers for your system you use for info lookups are really secrets or you just want unique identifiers which are hard to predict before they occur in the system. Maybe you can elaborate on your use case to clarify your requirements.
I think, for this type of thing, you should use UUID:
package main

import (
	"fmt"

	"github.com/google/uuid"
)

func main() {
	// uuid.New returns a random (version 4) UUID.
	id := uuid.New()
	fmt.Println(id.String())
}

Comparing secret data without giving away source

Issue:
Company A has secret data they don't want to give away to company B.
Company B has secret data they don't want to give away to company A.
The secret data is IP addresses on both sides.
But the two companies want to know the number of overlapping IPs they have (IP addresses that both companies have in the database).
Without using a third party I can't think of a way to solve this issue without one party compromising their secret data set. Is there any type of hashing algo written to solve this problem?
First I'll describe a simple but not very secure idea. Then I'll describe a way that I think it can be easily made much more secure. The basic idea is to have each company send an encoding of a one-way function to the other company.
Sending Programs
As a warm-up, let's first suppose that one company (let's say A) develops an ordinary computer program in some language and sends it to B; B will then run it, supplying its own list of email addresses as input, and the program will report how many of them are also used by A. At this point, B knows how many email addresses it shares with A. Then the process can be repeated, but with the roles of A and B reversed.
Sending SAT Instances
Implementing this program straightforwardly in a normal programming language would yield a program that is almost trivially easy to reverse-engineer. To mitigate this, first, instead of having the program report the count directly, let's reformulate the problem as a decision problem: Does the other company have at least k of the emails in the input? (This involves choosing some value k to test for; of course, if both parties agree then the whole procedure can be performed for many different values of k. (But see the last section for possible ramifications.)) Now the program can be represented instead as a SAT instance that takes as input (some bitstring encoding of) a list of email addresses, and outputs a single bit that indicates whether k or more of them also belong to the company that created the instance.
It's computationally easy to supply inputs to a SAT instance and read off the output bit, but when the instance is large, it's (in principle) very difficult to go in "the other direction" -- that is, to find a satisfying assignment of inputs, i.e., a list of email addresses that will drive the output bit to 1: SAT being an NP-hard problem, all known exact techniques take time exponential in the problem size.
Making it Harder with Hashing
[EDIT: Actually there are many more than (n choose k) possible hashes to be ORed together, since any valid subsequence (with gaps allowed) in the list of email addresses that contains at least k shared ones needs to turn the output bit on. If each email address takes at most b bits, then there are much more than 2^((n-k)b)*(n choose k) possibilities. It's probably only feasible to sample a small fraction of them, and I don't know if unsampled ones can be somehow turned into "don't-cares"...]
The SAT instance I propose here would certainly be very large, as it would have to be a disjunction (OR) of all (n choose k) possible allowed bitstrings. (Let's assume that email addresses are required to be listed in some particular order, to wipe off an n-factorial factor.) However it has a very regular structure that might make it amenable to analysis that could dramatically reduce the time required to solve it. To get around this, all we need to do is to require the receiver to hash the original input and supply this hash value as input instead. The resulting SAT instance will still look like the disjunction (OR) of (n choose k) possible valid bitstrings (which now represent hashes of lists of strings, rather than raw lists of strings) -- but, by choosing a hash size large enough and applying some logic minimisation to the resulting instance, I'm confident that any remaining telltale patterns can be removed. (If anyone with more knowledge in the area can confirm or deny this, please edit or comment.)
Possible Attacks
One weakness of this approach is that nothing stops the receiver from "running" (supplying inputs to) the SAT instance many times. So, choosing k too low allows the receiver to easily isolate the email addresses shared with the sender by rerunning the SAT instance many times using different k-combinations of their own addresses, and dummy values (e.g. invalid email addresses) for the remaining input bits. E.g. if k=2, then the receiver can simply try running all n^2 pairs of its own email addresses and invalid email addresses for the rest until a pair is found that turns the output bit on; either of these email addresses can then be paired with all remaining email addresses to detect them in linear time.
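The k=2 attack described above can be sketched as follows. The SAT instance is replaced by a plain Python function (`make_oracle`, a stand-in invented for this example) that answers "are at least k of these inputs in the sender's set?"; everything else about the scheme is ignored. The point is only to show how few queries the receiver needs.

```python
from itertools import combinations


def make_oracle(secret_set, k):
    """Stand-in for the SAT instance: True if at least k inputs are shared."""
    def oracle(inputs):
        return sum(1 for x in set(inputs) if x in secret_set) >= k
    return oracle


def recover_shared(own, oracle):
    """Recover every shared entry using only yes/no answers for k=2."""
    for a, b in combinations(own, 2):
        if oracle([a, b]):           # both a and b must be shared
            shared = {a, b}
            for c in own:            # linear pass: pair a with everything else
                if c not in shared and oracle([a, c]):
                    shared.add(c)
            return shared
    return set()  # fewer than two shared entries: this attack learns nothing


sender_data = {"10.0.0.1", "10.0.0.2", "10.0.0.3", "10.9.9.9"}
receiver_data = ["10.0.0.2", "172.16.0.1", "10.0.0.3", "192.168.1.1", "10.0.0.1"]
leaked = recover_shared(receiver_data, make_oracle(sender_data, k=2))
```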
You should be able to use homomorphic encryption to carry out the computation. I imagine creating something like bitmasks on both sides, encrypting them, and then performing an XOR of the results. I think this source points to some information on which encryption schemes support XOR.

Save/Restore Ruby's Random

I'm trying to create a game that always runs the same way given the same seed. That means that random events, whatever they may be, will always be the same for two players using the same seed.
However, given the user's ability to save and load the game, Ruby's Random would reset every time the save loaded, making the whole principle void if two players save and load at different points.
The only solution I have imagined for this is, whenever a save file is loaded, to generate the same number of points as before, and thus getting Ruby's Random to the same state as it was before load. However, to do that I'd need to extend it so a counter is updated every time a random number is generated.
Does anyone know how to do that or has a better way to restore the state of Ruby's Random?
PS: I cannot use an instance of Random (Random.new) and Marshal it. I have to use Ruby's default.
Sounds like Marshal.dump/Marshal.load may be exactly what you want. The Random class documentation explicitly states "Random objects can be marshaled, allowing sequences to be saved and resumed."
You may still have problems with synchronization across games, since different user-based decisions can take you through different logic paths and thus use the sequence of random numbers in entirely different ways.
I'd suggest maybe saving the 'current' data to a file when the user decides to save (or when the program closes) depending on what you prefer.
This can be done using the File class in ruby.
This would mean you'd need to keep track of turns and pass that along with the save data. Or you could loop through the data in the file and find out how many turns have occurred that way I suppose.
So you'd have something like:
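def load_game(load_path)
  # Read the whole save file into a string.
  data = File.read(load_path)
  # What you do below here depends on how you decide to store the data in save_game.
  data
end

def save_game(save_path, data)
  # Overwrite the save file with the current game data.
  File.open(save_path, "w") { |file| file.puts data }
end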
Haven't really tried the above code, so it could have bad syntax or such. It's mainly just the concept I'm trying to get across.
Hopefully that helps?
There are many generators that compute each random number in the sequence from the previous value alone, so if you used one of those you need only save the last random number as part of the state of the game. An example is a basic linear congruential generator, which has the form:
z(n+1) = (az(n) + b) mod c
where a, b and c are typically large (known) constants, and z(0) is the seed.
An arguably better one is the so-called "multiply-with-carry" method.
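A minimal sketch of the linear congruential idea, in Python rather than Ruby: the whole generator state is the last value, so saving the game only needs to store that one integer. The class name `Lcg` is made up, and the default constants are the widely published "Numerical Recipes" parameters, used here only as an example choice.

```python
class Lcg:
    """z(n+1) = (a*z(n) + b) mod c; the state is just the last value."""

    def __init__(self, seed, a=1664525, b=1013904223, c=2**32):
        self.a, self.b, self.c = a, b, c
        self.state = seed % c

    def next(self):
        self.state = (self.a * self.state + self.b) % self.c
        return self.state


game_rng = Lcg(seed=42)
before_save = [game_rng.next() for _ in range(3)]

saved_state = game_rng.state          # this integer goes into the save file
expected = [game_rng.next() for _ in range(5)]

restored_rng = Lcg(seed=saved_state)  # after loading, resume from the saved value
resumed = [restored_rng.next() for _ in range(5)]
```

After loading, `resumed` continues the sequence exactly where the saved game left off.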

Random Number Generator that Allows "Indexing"

I hope it's not too obvious a question: is there a random number generation algorithm that doesn't depend on previously returned values, so that I can get (for example) the 50th number in the sequence, without computing the previous 49?
The reason is that I am making a roguelike that will be persistent (so that I can recreate the exact same level from the same seed), but to compute certain features of each level, I don't want to have to "compute" all previous features just to get the random number generator to the correct "state" of having been used, for example, 100 times so far. I would like to be able to query the 101st random number without determining previous values so that the program can create level features separately.
You can encrypt an ordinary sequence number [1..N] with any cipher,
and in this way generate a unique pseudorandom value for each SeqNo.
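A sketch of this indexing idea in Python. Note the swap: since the standard library has no block cipher, this uses a keyed hash (BLAKE2b from `hashlib`) of the sequence number instead of encryption. That gives the same "jump straight to the n-th value" property, but unlike a real cipher it does not guarantee that every value is unique. The function name `indexed_random` and the seed are illustrative only.

```python
import hashlib


def indexed_random(seed, index):
    """Return the pseudorandom 64-bit value for position `index`,
    without computing any earlier positions."""
    digest = hashlib.blake2b(
        index.to_bytes(8, "big"),  # the sequence number is the message
        key=seed,                  # the game seed acts as the key
        digest_size=8,
    ).digest()
    return int.from_bytes(digest, "big")


level_seed = b"dungeon-seed-1"
feature_50 = indexed_random(level_seed, 50)
feature_101 = indexed_random(level_seed, 101)
```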
If you use a linear congruential random number generator, it is trivial to compute the n-th element generated from a given seed. But it is probably easier just to stash away the state at the "interesting" points of the game.
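One way to do that jump, sketched in Python under the assumption of a generator of the form x -> (a*x + c) mod m: each step is an affine map, and composing the map with itself by repeated squaring reaches the n-th element in O(log n) steps. The function name `lcg_jump` and the constants in the example are illustrative choices, not part of the original answer.

```python
def lcg_jump(seed, n, a, c, m):
    """Return the n-th output of x -> (a*x + c) % m starting from seed,
    without iterating n times."""
    acc_a, acc_c = 1, 0          # accumulated map, starts as the identity
    cur_a, cur_c = a, c          # map for 2^k steps, starts as one step
    while n > 0:
        if n & 1:
            # Apply the current 2^k-step map on top of the accumulated one.
            acc_a, acc_c = (cur_a * acc_a) % m, (cur_a * acc_c + cur_c) % m
        # Square the current map: 2^k steps become 2^(k+1) steps.
        cur_a, cur_c = (cur_a * cur_a) % m, (cur_a * cur_c + cur_c) % m
        n >>= 1
    return (acc_a * seed + acc_c) % m


A, C, M = 1664525, 1013904223, 2**32
hundred_first = lcg_jump(12345, 101, A, C, M)
```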
OTOH, if you want to "restart" the game at a certain point, you'll presumably want to be able to recreate the dungeon's features, but (due to different player actions) the RNG usage will be different from then on. I.e., if started at the same point, if I shoot twice at a monster the RNG will be used more times than if I just run away; the next item generated will get different values. Perhaps what you really want is several independent random number streams, and saving the states as needed?
There are lots of roguelike games around, mostly open source. Some are limited/small (from "build a game in a day" sort of competitions), and might make a good starting point for you. Why start your own, and not hack on an existing one?

A good algorithm for generating an order number

As much as I like using GUIDs as the unique identifiers in my system, it is not very user-friendly for fields like an order number where a customer may have to repeat that to a customer service representative.
What's a good algorithm to use to generate order number so that it is:
Unique
Not sequential (purely for optics)
Numeric values only (so it can be easily read to a CSR over phone or keyed in)
< 10 digits
Can be generated in the middle tier without doing a round trip to the database.
UPDATE (12/05/2009)
After carefully reviewing each of the answers posted, we decided to randomize a 9-digit number in the middle tier to be saved in the DB. In the case of a collision, we'll regenerate a new number.
If the middle tier cannot check what "order numbers" already exist in the database, the best it can do will be the equivalent of generating a random number. However, if you generate a random number that's constrained to be less than 1 billion, you should start worrying about accidental collisions at around sqrt(1 billion), i.e., after a few tens of thousands of entries generated this way, the risk of collisions is material. What if the order number is sequential but in a disguised way, i.e. the next multiple of some large prime number modulo 1 billion -- would that meet your requirements?
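The disguised-sequence suggestion could be sketched like this in Python. The multiplier below is an arbitrary example value chosen only because it shares no factor with 10^9 (any prime other than 2 and 5 would also qualify); multiplication by such a number is a one-to-one mapping on 0..999,999,999, so distinct sequence values never collide. All names here are hypothetical, and this is obfuscation for optics only: anyone who learns the multiplier can undo it.

```python
import math

MODULUS = 10**9
MULTIPLIER = 618_033_989   # example value; must be coprime to MODULUS
assert math.gcd(MULTIPLIER, MODULUS) == 1
INVERSE = pow(MULTIPLIER, -1, MODULUS)  # modular inverse (Python 3.8+)


def order_number(sequence):
    """Map a sequential counter to a scrambled-looking 9-digit string."""
    return f"{(sequence * MULTIPLIER) % MODULUS:09d}"


def sequence_of(order):
    """Recover the underlying counter from an order number."""
    return (int(order) * INVERSE) % MODULUS


first_orders = [order_number(n) for n in range(1, 6)]
```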
<moan>OK, sounds like a classic case of premature optimisation. You imagine a performance problem (Oh my god, I have to access the - horror - database to get an order number! My, that might be slow) and end up with a convoluted mess of pseudo-random generators and a ton of duplicate-handling code.</moan>
One simple practical answer is to run a sequence per customer, the real order number being a composite of customer number and order number. You can easily retrieve the last sequence used when retrieving other stuff about your customer.
One simple option is to use the date and time, eg. 0912012359, and if two orders are received in the same minute, simply increment the second order by a minute (it doesn't matter if the time is out, it's just an order number).
If you don't want the date to be visible, then calculate it as the number of minutes since a fixed point in time, e.g. when you started taking orders or some other arbitrary date. Again, with the duplicate check/increment.
Your competitors will glean nothing from this, and it's easy to implement.
Maybe you could try generating some unique text using a Markov chain - see here for an example implementation in Python. Maybe use sequential numbers (rather than random ones) to generate the chain, so that (hopefully) each order number is unique.
Just a warning, though - see here for what can possibly happen if you aren't careful with your settings.
One solution would be to take the hash of some field of the order. This will not guarantee that it is unique from the order numbers of all of the other orders, but the likelihood of a collision is very low. I would imagine that without "doing a round trip to the database" it would be challenging to make sure that the order number is unique.
In case you are not familiar with hash functions, the Wikipedia page is pretty good.
You could base64-encode a guid. This will meet all your criteria except the "numeric values only" requirement.
Really, though, the correct thing to do here is let the database generate the order number. That may mean creating an order template record that doesn't actually have an order number until the user saves it, or it might be adding the ability to create empty (but perhaps uncommitted) orders.
Use primitive polynomials as finite field generator.
Your 10 digit requirement is a huge limitation. Consider a two stage approach.
Use a GUID
Prefix the GUID with a 10 digit (or 5 or 4 digit) hash of the GUID.
You will have multiple hits on the hash value. But not that many. The customer service people will very easily be able to figure out which order is in question based on additional information from the customer.
The straightforward answer to most of your bullet points:
Make the first six digits a sequentially-increasing field, and append three digits of hash to the end. Or seven and two, or eight and one, depending on how many orders you envision having to support.
However, you'll still have to call a function on the back-end to reserve a new order number; otherwise, it's impossible to guarantee a non-collision, since there are so few digits.
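A small Python sketch of the "sequence plus a few hash digits" layout described above. The choice of SHA-256 and the function name are assumptions made for the example; the split between sequence digits and hash digits is a parameter, matching the six-and-three, seven-and-two, or eight-and-one options. The sequence value itself still has to be reserved on the back end, as noted.

```python
import hashlib


def order_number(sequence, seq_digits=6, hash_digits=3):
    """Zero-padded sequence followed by digits derived from its hash."""
    if not 0 <= sequence < 10**seq_digits:
        raise ValueError("sequence does not fit in the allotted digits")
    digest = hashlib.sha256(str(sequence).encode("ascii")).digest()
    suffix = int.from_bytes(digest, "big") % 10**hash_digits
    return f"{sequence:0{seq_digits}d}{suffix:0{hash_digits}d}"


example = order_number(42)
wide = order_number(1234567, seq_digits=7, hash_digits=2)
```

Because the leading digits are the sequence itself, two orders can never collide; the trailing digits only make consecutive numbers look less alike and act as a simple check against mistyped digits.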
We do TTT-CCCCCC-1A-N1.
T = Circuit type (D1E=DS1 EEL, D1U=DS1 UNE, etc.)
C = 6 Digit Customer ID
1 = The customer's first location
A = The first circuit (A=1, B=2, etc) at this location
N = Order type (N=New, X=Disconnect, etc)
1 = The first order of this kind for this circuit

Resources