Find the reverse algorithm to go back to initial value - algorithm

I have a problem that I have been trying to solve for hours. Here is the pseudocode:
x = 30
if x > 100 then max(function_1(x), function_2(x))
elseif x > 50 then max(function_3(x), function_4(x))
elseif x > 20 then max(function_5(x), function_6(x))
elseif x < 10 then function_7(x)
else function_8(x)
This is code that I ran with different values of x. The functions are mathematical formulas. I now have the result of the above for each x, and I want to reverse the process and get back to x.
I found the inverse formulas for all of the functions. For example, for function_1(x) I have a rev_function_1(y) that takes the result and gives me the initial x.
But since the original code has a lot of cases, plus the max, I am not sure how to write one piece of code that takes any result and returns the original x.
Edit: All the functions are one-to-one.
Edit2: It seems that the whole function is not one-to-one, even though each sub-function individually is. As a result, I have two x values for every y and I cannot invert it.

You need to study the result space (the range) of your functions.
An inverse exists only if each x results in a unique f(x) that cannot be obtained for any other value of x. This property is called one-to-one.
Let me give you an example:
Let's say that f(1) == 8 and that also f(10) == 8.
Then you don't know if the inverse of 8 is 1 or 10.
If the function is one-to-one the inverse will be a unique value. If it is not one-to-one the inverse may be more than one value.
The next step is to figure out which inverse to call.
One way to do it is to call the inverse of all subfunctions.
For each x value you get, calculate f(x). If f(x) gives back the value you wanted to invert, then keep that x; otherwise throw it away.
When you have gone through all values you will have one (or more) matching x value.
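The try-every-inverse approach described above can be sketched in Python. This is only an illustration: `forward` and the formulas in `inverses` are hypothetical stand-ins for the question's function_1 ... function_8 and their inverses, which are not given.

```python
def forward(x):
    """Hypothetical piecewise function standing in for the question's code."""
    if x > 100:
        return max(2 * x, x + 50)
    elif x > 50:
        return max(x - 10, 0.5 * x)
    else:
        return x + 1000

# One inverse per sub-function (each sub-function is assumed one-to-one).
inverses = [
    lambda y: y / 2,      # inverse of 2 * x
    lambda y: y - 50,     # inverse of x + 50
    lambda y: y + 10,     # inverse of x - 10
    lambda y: 2 * y,      # inverse of 0.5 * x
    lambda y: y - 1000,   # inverse of x + 1000
]

def invert(y, tol=1e-9):
    """Return every candidate x for which forward(x) reproduces y."""
    candidates = {inv(y) for inv in inverses}
    return sorted(x for x in candidates if abs(forward(x) - y) <= tol)
```

With these made-up formulas, `invert(forward(30))` returns both 30 and 515, because `forward(515)` is also 1030. That is the situation described in Edit2 of the question: the check filters out wrong candidates, but it cannot pick a single x when the combined function is not one-to-one.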
Edit:
Another way is to pre-compute which function corresponds to each interval of output values. You can store these in a database as tuples:
lowerbound, upperbound, inverse_function
You can then find which function to use (assuming SQL):
SELECT inverse_function FROM lookup_table
WHERE :fx > lowerbound and :fx < upperbound
:fx is the value you want to invert.
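The same lookup idea works without a database. The sketch below assumes a hypothetical forward function whose output intervals do not overlap. The table entries and the formulas are made up for illustration.

```python
import math

# (lower bound, upper bound, inverse function) for each output interval.
# Hypothetical forward function assumed here:
#   f(x) = 2 * x    for x < 10   -> outputs below 20
#   f(x) = x + 100  for x >= 10  -> outputs from 110 upward
lookup_table = [
    (-math.inf, 20, lambda y: y / 2),
    (110, math.inf, lambda y: y - 100),
]

def invert(fx):
    """Find the interval that contains fx and apply its inverse."""
    for lower, upper, inverse in lookup_table:
        if lower <= fx < upper:
            return inverse(fx)
    raise ValueError(f"{fx} is not an output of the forward function")
```

If two intervals overlap, more than one row matches, and you are back to the non-unique case discussed above.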

You have an output y for each x. If two xs produce the same y then you can't undo the mapping since y could have come from either x. If no two xs produce the same y then you know it came from that y's x.

NOTE: Since a reverse algorithm is required, and the OP has not made it mandatory to use the original functions or their corresponding inverse functions, the following method can be used.
"Now, I have the result of the above for each x and I want to revert and go back to x again." It seems that this is a case of [Key] => [Value].
//many-to-one case.
if x > 100 then max(function_1(x), function_2(x))
elseif x > 50 then max(function_3(x), function_4(x))
The above piece of code shows that multiple different inputs "x" can produce the same output "y".
So you can use std::multimap if you are using C++.
The multimap can be used directly at the input/output level: if a given input "x" produces an output "y" after running all the formulas, then call multimap.insert(std::pair<int,int>(y,x));
Then, given an output "y", you can find all the candidate inputs "x" that could have produced it as follows:
std::pair <std::multimap<int,int>::iterator, std::multimap<int,int>::iterator> ret;
ret = multimap.equal_range(y);
for (std::multimap<int,int>::iterator it=ret.first; it!=ret.second; ++it)
std::cout << ' ' << it->second;
If the relation between input "x" and its corresponding "y" is one-to-one then, std::map can be used.
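For readers not using C++, here is a rough Python equivalent of the multimap idea, using a dictionary of lists. The `forward` function is a hypothetical many-to-one stand-in for the question's code.

```python
from collections import defaultdict

def forward(x):
    """Hypothetical stand-in for the question's piecewise code."""
    return x * x  # many-to-one: x and -x give the same y

# Record every (y -> x) pair while running the forward computation.
inputs_by_output = defaultdict(list)
for x in range(-3, 4):
    inputs_by_output[forward(x)].append(x)

def candidates(y):
    """Return all recorded inputs that produced y (possibly none)."""
    return inputs_by_output.get(y, [])
```

Here `candidates(4)` gives both -2 and 2, which mirrors what `equal_range` returns in the C++ version.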

I think that this is not possible. Take this example:
function_1(x) = x - 200
function_2(x) = x - 201
function_7(x) = x - 5
then for x = 200 => y = 0 and for x = 5 => y = 0.
So for a given value of y we can have multiple values of x.

There is no solution in the general case. Think about this set of formulas:
function_1(x) { return x }
function_2(x) { return x }
function_3(x) { return x }
....
I guess it's obvious why it can't work.

Related

Function including random number that can be inverted without the random number

Given a number x and a random number n, I am looking for two functions F and G so that:
y = F(x, n) where y is different for different values of n
x = G(y)
all numbers are (large, e.g. 256 bit) integers
For instance, given a list of numbers k1, k2, k3, k4 generated by applying F multiple times, it is possible to calculate k3 from k4 but not k4 from k3 (the random number prevents the inversion).
The problem is easy if we allow G to use n (or something derived from it), since it is then basically asymmetric encryption, but this is not the target.
Any idea?
Update
I found a function that works with infinite precision: F = x * pow(coprime(x), n)
x = 29
p = 5
n = 20
def f(x, n):
    return x * pow(p, n)
f(x, n) => 2765655517578125
and G becomes
def g(y):
    x = y
    while x % p == 0:
        x = x // p  # integer division keeps x an exact integer
    return x
g(y) = 29
Unfortunately this fails with overflow as soon as the numbers become big (limited precision).
Second update: the problem has no solution
In fact, let's start from a situation where the problem has a solution, which is when the domain of G and F is R.
In that case, choosing a random output from any multivalued function F' will work.
For instance, if F(x, n) = acos(x) + 2nπ, where n is a random integer,
then G(y) = cos(y). From y it is always possible to go back to x, but not the other way around without knowing n.
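A quick numerical check of this acos/cos pair, as a sketch only. Floating-point numbers merely approximate R, so the round trip is accurate up to rounding error rather than exact.

```python
import math
import random

def F(x, n):
    """Map x in [-1, 1] to one of infinitely many angles whose cosine is x."""
    return math.acos(x) + 2 * n * math.pi

def G(y):
    """Recover x without knowing which n was used."""
    return math.cos(y)

x = 0.3
n = random.randint(-1000, 1000)  # the random part, unknown to G
y = F(x, n)
recovered = G(y)  # approximately 0.3 for any n
```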
A similar example can be built with modulo operations, which works on integer domains without the need for real numbers.
However, this fails when F and G share the same finite domain (as in physical memory). This can be proved by contradiction.
Let's assume that for finite domains D1=D2 of size N, a function F:D1->D2 exists that produces M outputs for each x, where M > 1.
Assuming that the function produces at least one output for each x in D1,
1. either D2 > D1
2. or outputs from F are the same for different values of x (some overlap must exist)
Now 1 is against the requirement that D1=D2, while 2 is against the requirement that G(y) has a single output value.
If we relax 1 and allow D2 > D1, then we can solve the problem. This can be done by adding n (or something derived from it) to the output, as suggested in some comments. For my specific scenario it probably makes more sense to use an EC public/private key, but that is another story.
Many Thanks
Based on your requirements, the following should work. If there is some other requirement that I did not understand from your question, please clarify, because this seems to suffice based on your definition. In that case, I will change or delete this answer.
f(x, n) = x | n;
g(y | n) = y;
where | means concatenation of bits. We can assign a fixed (maximum) number of bits for n and pad with zeros.
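A minimal Python sketch of this bit-concatenation scheme. The width `N_BITS` is an assumption chosen for illustration. Note that the output is `N_BITS` wider than x, which matches the observation in the question's second update that the output domain must be larger than the input domain.

```python
N_BITS = 64  # assumed fixed width reserved for the random part n

def f(x, n):
    """Concatenate the bits of x with the zero-padded N_BITS bits of n."""
    if not 0 <= n < (1 << N_BITS):
        raise ValueError("n does not fit in N_BITS bits")
    return (x << N_BITS) | n

def g(y):
    """Drop the low N_BITS bits (the random part) to recover x."""
    return y >> N_BITS
```

Python integers are arbitrary precision, so this works unchanged for 256-bit values of x.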
There can be no solution for this problem, because:
for a constant x1 and variable r, you would have an output set with all integers in it;
for a constant x2 and variable r, you would again have an output set with all integers in it.
So at best you can have a function g that takes a number from the output set of function f and returns all possible answers, which are infinite.
This is similar to writing a reverse hashing function, which defies logic.

Check whether A and B exist for the given values AorB and APlusB

Problem Statement:
You are given two non-negative integers: the longs pairOr and pairSum.
Determine whether it is possible that for two non-negative integers A and B we have both:
A or B = pairOr
A + B = pairSum
Above, "or" denotes the bitwise-or operator.
Return True if we can find such A and B, and False if not.
My Algorithm goes like this:
I've taken the equations A | B = X and A + B = Y.
Substituting A's value from the 2nd equation gives (Y-B) | B = X.
I'm going to iterate B from 0 to Y to check whether the above equation is true.
Code Snippet:
boolean isPossible(long orAandB, long plusAandB) {
    for (long i = 0; i <= plusAandB; i++) {
        if (((plusAandB - i) | i) == orAandB) {
            return true;
        }
    }
    return false;
}
It gives TLE (time limit exceeded) when the value of plusAandB is on the order of 10^18. Could you please help me optimize it?
You don't need the full iteration, giving O(N). There's a way to do it in O(logN).
But completely solving the problem for you takes away most of the fun... ;-), so here's the main clue:
Your equation (Y-B) | B= X is one great observation, and the second is to have a look at this equation bit by bit, starting from the right (so you don't have to worry about borrow-bits in the first place). Which last-bit combinations of Y, X, and B can make your equation true? And if you found a B bit, how do you continue recursively with the higher bits (don't forget that subtraction may need a borrow)? I hope you remember the rules for subtracting binary numbers.
And keeping in mind that the problem only asks for true or false, not for any specific A or B value, can save you exponential complexity.
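For reference, here is one way to avoid the loop entirely. It takes a different route from the bit-by-bit recursion hinted at above: it relies on the identity A + B = (A | B) + (A & B), so A & B must equal pairSum - pairOr, and every bit of A & B must also be set in A | B. The function name `is_possible` is just an illustrative choice.

```python
def is_possible(pair_or, pair_sum):
    """Check whether non-negative A, B exist with A | B == pair_or and A + B == pair_sum."""
    pair_and = pair_sum - pair_or  # forced by A + B == (A | B) + (A & B)
    if pair_and < 0:
        return False
    # A & B may only contain bits that are also present in A | B.
    # If so, A = pair_or and B = pair_and is one valid pair.
    return (pair_and & pair_or) == pair_and
```

For example, `is_possible(7, 11)` is true (A = 7, B = 4), while `is_possible(5, 7)` is false because the required A & B = 2 has a bit that is not in 5.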

How to map/tween a number based on a dynamic curve

I am really lacking terminology here, so any help with that is appreciated. Even if it doesn't answer the question, it can hopefully get me closer to an answer.
How can I get y from a function of p where the curviness is also a variable (possibly between 0 and 1, or whatever is best)?
I am presuming p is always between 0 and 1, as is the output y.
The graphic is just an illustration, I don't need that exact curve but something close to this idea.
Pseudo code is good enough as an answer or something c-style (c, javascript, etc).
To give a little context, I have a mapping function where one parameter can be what I have called the easing function. These are based on the Penner equations. So, for example, if I wanted to do an easeIn I would provide:
function (p) { return p * p; };
But I would love to be able to do what is in the images: varying the ease dynamically. With a function like:
function (p, curviness) { return /* something */; }
You might try messing around with a Superellipse, it seems to have the shape malleability you're looking for. (Special case: Squircle)
Update
Ok, so the equation for the superellipse is as follows:
abs(x/a)^n + abs(y/b)^n = 1
You're going to be working in the range from [0,1] in both so we can discard the absolute values.
The a and b are for the major and minor ellipse axes; we're going to set those to 1 (so that the superellipse only stretches to +/-1 in either direction) and only look at the first quadrant ([0, 1], again).
This leaves us with:
x^n + y^n = 1
You want your end function to look something like:
y = f(p, n)
so we need to get things into that form (solve for y).
Your initial thought on what to do next was correct (but the variables were switched):
y^n = 1 - p^n
substituting your variable p for x.
Now, initially I'd thought of trying to use a log to isolate y, but that would mean we'd have to take log_y on both sides which would not isolate it. Instead, we can take the nth root to cancel the n, thus isolating y:
y = nthRoot(n, 1 - p^n)
If this is confusing, then this might help: square rooting is just raising to a power of 1/2, so if you took a square root of x you'd have:
sqrt(x) == x^(1/2)
and what we did was take the nth root, meaning that we raised things to the 1/n power, which cancels the nth power the y had since you'd be multiplying them:
(y^n)^(1/n) == y^(n * 1/n) == y^1 == y
Thus we can write things as
y = (1 - p^n)^(1/n)
to make things look better.
So, now we have an equation in the form
y = f(p, n)
but we're not done yet: this equation was working with values in the first quadrant of the superellipse; this quadrant's graph looks different from what you wanted -- you wanted what appeared in the second quadrant, only shifted over.
We can rectify this by inverting the graph in the first quadrant. We'll do this by subtracting it from 1. Thus, the final equation will be:
y = 1 - (1 - p^n)^(1/n)
which works just fine by my TI-83's reckoning.
Note: In the Wikipedia article, they mention that when n is between 0 and 1 then the curve will be bowed down/in, when n is equal to 1 you get a straight line, and when n is greater than 1 then it will be bowed out. However, since we're subtracting things from 1, this behavior is reversed! (So 0 thru 1 means it's bowed out, and greater than 1 means it's bowed in).
And there you have it -- I hope that's what you were looking for :)
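As a sketch, the final equation y = 1 - (1 - p^n)^(1/n) translates directly into code. The function name `superellipse_ease` and the input checks are illustrative additions, not part of the derivation.

```python
def superellipse_ease(p, n):
    """Evaluate y = 1 - (1 - p**n) ** (1 / n) for p in [0, 1] and n > 0."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if n <= 0:
        raise ValueError("n must be positive")
    return 1.0 - (1.0 - p ** n) ** (1.0 / n)
```

With n = 1 this returns p unchanged (the straight line). With n = 2 the midpoint p = 0.5 maps to a value below 0.5 (bowed in), and with n = 0.5 it maps to a value above 0.5 (bowed out), in line with the note above.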
Your curviness property is the exponent.
function(p, exp) { return Math.pow(p, exp); }
exp = 1 gives you the straight line
exp > 1 gives you the exponential lines (bottom two)
0 < exp < 1 gives you the logarithmic lines (top two)
To get "matching" curviness above and below, an exp = 2 would match an exp = 1/2 across the linear dividing line, so you could define a "curviness" function that makes it more intuitive for you.
function curvyInterpolator(p, curviness) {
    curviness = curviness > 0 ? curviness : 1/(-curviness);
    return Math.pow(p, curviness);
}

OpenCV: parabola detection using Hough Transform

I want to detect parabola(s) of the type y^2 = 4a*x in an image [size: 512 x 512]. I prepared an accumulator array, acc [size: 512 x 512 x 512], and a matrix corresponding to the image. I used the Hough transform. This is how I did it:
for x = 1 to 512
    for y = 1 to 512
        if image_matrix(x,y) > 245 // almost white value, so probably on a parabola
        {
            for x1 = 1 to 512
                for y1 = 1 to 512
                {
                    calculate 'a' from (y-y1)^2 = 4*a*(x-x1).
                    increment acc(i,j,k) by 1
                }
        }
if acc(i,j,k) has a maximum value.
{
    x1=i, y1=j, a=k
}
I faced the following problems:
1) acc[512][512][512] takes a lot of memory and needs a huge amount of computation. How can I decrease the array size and thus minimize the computation?
2) The maximum-valued entry of acc(i,j,k) does not always give the intended output. Sometimes the second or third maximum, or even the 10th maximum, gives the intended output. I only need approximate values of 'a', 'x1', 'y1' (not exact values).
Please help me. Is there anything wrong with my approach?
What I'm going to say may only partly answer your question, but it should work.
If you want to find parabolas of this type:
y^2 = 4a*x
then they are parametrized by only one parameter, which is 'a'. Therefore, I don't really understand why you use a 3-dimensional accumulator.
Of course, if you want to find a parabola with a more general equation like:
y = ax^2 + bx + c
or the same in the y direction (replacing x by y), you will need a 3-dimensional accumulator like in your example.
I think in your case the problem can be solved easily, since you only need one accumulator (you have only one parameter to accumulate: a).
This is what I would suggest:
for every point (x,y) of your image (x=0 exclusive) {
    calculate (a = y^2 / 4x)
    add + 1 in the corresponding 'a' cell of your accumulator
    (eg: a = index of a simple table)
}
for all the cells of your accumulator {
    if (cell[idx] > a certain threshold) there is a certain parabola with a = idx
}
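A minimal Python sketch of this one-parameter accumulator. It assumes the bright pixels have already been extracted as (x, y) coordinates relative to the parabola's vertex, and it rounds 'a' to the nearest integer to pick the accumulator cell. The function name and the `min_votes` threshold are illustrative choices.

```python
from collections import Counter

def detect_parabola_a(points, min_votes):
    """Vote for 'a' in y^2 = 4*a*x and return the values with enough votes."""
    acc = Counter()
    for x, y in points:
        if x == 0:
            continue  # a is undefined for x = 0
        a = round(y * y / (4 * x))  # quantize a to an integer cell
        acc[a] += 1
    return [a for a, votes in acc.items() if votes >= min_votes]
```

For a real image you would probably want finer or coarser cells than whole integers, depending on how precisely 'a' needs to be estimated.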
I hope it can help you,
This is as well an interesting thing to look at :
Julien,

How can I randomly iterate through a large Range?

I would like to randomly iterate through a range. Each value will be visited only once and all values will eventually be visited. For example:
class Array
  def shuffle
    ret = dup
    j = length
    i = 0
    while j > 1
      r = i + rand(j)
      ret[i], ret[r] = ret[r], ret[i]
      i += 1
      j -= 1
    end
    ret
  end
end
(0..9).to_a.shuffle.each{|x| f(x)}
where f(x) is some function that operates on each value. A Fisher-Yates shuffle is used to efficiently provide random ordering.
My problem is that shuffle needs to operate on an array, which is not cool because I am working with astronomically large numbers. Ruby will quickly consume a large amount of RAM trying to create a monstrous array. Imagine replacing (0..9) with (0..99**99). This is also why the following code will not work:
tried = {} # store previous attempts
bigint = 99**99
bigint.times {
  x = rand(bigint)
  redo if tried[x]
  tried[x] = true
  f(x) # some function
}
This code is very naive and quickly runs out of memory as tried obtains more entries.
What sort of algorithm can accomplish what I am trying to do?
[Edit1]: Why do I want to do this? I'm trying to exhaust the search space of a hash algorithm for an N-length input string, looking for partial collisions. Each number I generate is equivalent to a unique input string, entropy and all. Basically, I'm "counting" using a custom alphabet.
[Edit2]: This means that f(x) in the above examples is a method that generates a hash and compares it to a constant, target hash for partial collisions. I do not need to store the value of x after I call f(x) so memory should remain constant over time.
[Edit3/4/5/6]: Further clarification/fixes.
[Solution]: The following code is based on @bta's solution. For the sake of conciseness, next_prime is not shown. It produces acceptable randomness and only visits each number once. See the actual post for more details.
N = size_of_range
Q = ( 2 * N / (1 + Math.sqrt(5)) ).to_i.next_prime
START = rand(N)
x = START
nil until f( x = (x + Q) % N ) == START # assuming f(x) returns x
I just remembered a similar problem from a class I took years ago; that is, iterating (relatively) randomly through a set (completely exhausting it) given extremely tight memory constraints. If I'm remembering this correctly, our solution algorithm was something like this:
1. Define the range to be from 0 to some number N.
2. Generate a random starting point x[0] inside N.
3. Generate an iterator Q less than N.
4. Generate successive points x[n] by adding Q to the previous point and wrapping around if needed. That is, x[n+1] = (x[n] + Q) % N.
5. Repeat until you generate a new point equal to the starting point.
The trick is to find an iterator that will let you traverse the entire range without generating the same value twice. If I'm remembering correctly, any relatively prime N and Q will work (the closer the number to the bounds of the range the less 'random' the input). In that case, a prime number that is not a factor of N should work. You can also swap bytes/nibbles in the resulting number to change the pattern with which the generated points "jump around" in N.
This algorithm only requires the starting point (x[0]), the current point (x[n]), the iterator value (Q), and the range limit (N) to be stored.
Perhaps someone else remembers this algorithm and can verify if I'm remembering it correctly?
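The procedure can be sketched as a Python generator. The relatively-prime condition is checked with `math.gcd`. The name `full_cycle` is illustrative, and unlike the description above the sketch yields the starting point first rather than last, which does not change the set of values visited.

```python
from math import gcd

def full_cycle(n, q, start):
    """Yield every integer in range(n) exactly once, stepping by q modulo n."""
    if gcd(n, q) != 1:
        raise ValueError("q must be relatively prime to n")
    x = start % n
    for _ in range(n):
        yield x
        x = (x + q) % n  # wrap around at the end of the range
```

For example, `list(full_cycle(10, 7, 3))` is `[3, 0, 7, 4, 1, 8, 5, 2, 9, 6]`. Only the current value is held in memory, so the same generator can be started on a range as large as 99**99, though of course it will never finish.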
As @Turtle answered, your problem doesn't have a solution. @KandadaBoggu's and @bta's solutions give you numbers in some ranges, which may or may not be random. You get clusters of numbers.
But I don't know why you care about a repeated occurrence of the same number. If (0..99**99) is your range, and you could generate 10^10 random numbers per second (say a 3 GHz processor with about 4 cores, each generating one random number per CPU cycle, which is impossible, and Ruby will slow it down a lot), then it would take about 10^180 years to exhaust all the numbers. By the birthday bound, the probability that any two identical numbers are generated during a whole year is only about 10^-163. Our universe is about 1.4 × 10^10 years old, so if your computer had started calculating when time began, the probability that two identical numbers had been generated by now would still be only about 10^-143. In other words, it is practically impossible and you don't have to care about it.
Even if you used Jaguar (number 1 on the www.top500.org supercomputer list) for only this one task, you would still need about 10^174 years to get all the numbers.
If you don't believe me, try
tried = {} # store previous attempts
bigint = 99**99
bigint.times {
  x = rand(bigint)
  puts "Oh, no!" if tried[x]
  tried[x] = true
}
I'll buy you a beer if you see "Oh, no!" on your screen even once during your lifetime :)
I could be wrong, but I don't think this is doable without storing some state. At the very least, you're going to need some state.
Even if you only use one bit per value (has this value been tried, yes or no), you will need X/8 bytes of memory to store the result (where X is the largest number). Assuming that you have 2GB of free memory, this would let you track only about 16 billion numbers.
Break the range into manageable batches as shown below:
def range_walker range, batch_size = 100
  size = (range.end - range.begin) + 1
  n = size/batch_size
  n.times do |i|
    x = i * batch_size + range.begin
    y = x + batch_size
    (x...y).sort_by{rand}.each{|z| p z}
  end
  d = (range.end - size%batch_size + 1)
  (d..range.end).sort_by{rand}.each{|z| p z }
end
You can further randomize solution by randomly choosing the batch for processing.
PS: This is a good problem for map-reduce. Each batch can be worked by independent nodes.
Reference:
Map-reduce in Ruby
you can randomly iterate an array with shuffle method
a = [1,2,3,4,5,6,7,8,9]
a.shuffle!
=> [5, 2, 8, 7, 3, 1, 6, 4, 9]
You want what's called a "full cycle iterator"...
Here is pseudocode for the simplest version, which is perfect for most uses...
function fullCycleStep(sample_size, last_value, random_seed = 31337, prime_number = 32452843) {
    if last_value = null then last_value = random_seed % sample_size
    return (last_value + prime_number) % sample_size
}
If you call this like so:
sample = 10
For i = 1 to sample
    last_value = fullCycleStep(sample, last_value)
    print last_value
next
It would generate random-looking numbers, looping through all 10 and never repeating. If you change random_seed, which can be anything, or prime_number, which must be greater than sample_size (so that sample_size is not evenly divisible by it), you will get a new random order, but you will still never get a duplicate.
Database systems and other large-scale systems do this by writing the intermediate results of recursive sorts to a temp database file. That way, they can sort massive numbers of records while only keeping limited numbers of records in memory at any one time. This tends to be complicated in practice.
How "random" does your order have to be? If you don't need a specific input distribution, you could try a recursive scheme like this to minimize memory usage:
def gen_random_indices
  # Assume your input range is (0...(10**3)), i.e. three decimal digits
  (0..9).sort_by{rand}.each do |a|
    (0..9).sort_by{rand}.each do |b|
      (0..9).sort_by{rand}.each do |c|
        yield "#{a}#{b}#{c}".to_i
      end
    end
  end
end
gen_random_indices do |idx|
  run_test_with_index(idx)
end
Essentially, you are constructing the index by randomly generating one digit at a time. In the worst-case scenario, this will require enough memory to store 10 * (number of digits). You will encounter every number in the range (0...(10**3)) exactly once, but the order is only pseudo-random. That is, if the first loop sets a=1, then you will encounter all three-digit numbers of the form 1xx before you see the hundreds digit change.
The other downside is the need to manually construct the function to a specified depth. In your (0..(99**99)) case, this would likely be a problem (although I suppose you could write a script to generate the code for you). I'm sure there's probably a way to re-write this in a state-ful, recursive manner, but I can't think of it off the top of my head (ideas, anyone?).
[Edit]: Taking into account @klew and @Turtle's answers, the best I can hope for is batches of random (or close to random) numbers.
This is a recursive implementation of something similar to KandadaBoggu's solution. Basically, the search space (as a range) is partitioned into an array containing N equal-sized ranges. Each range is fed back in a random order as a new search space. This continues until the size of the range hits a lower bound. At this point the range is small enough to be converted into an array, shuffled, and checked.
Even though it is recursive, I haven't blown the stack yet. Instead, it errors out when attempting to partition a search space larger than about 10^19 keys. It has to do with the numbers being too large to convert to a long. It can probably be fixed:
# partition a range into an array of N equal-sized ranges
def partition(range, n)
  ranges = []
  first = range.first
  last = range.last
  length = last - first + 1
  step = length / n # integer division
  ((first + step - 1)..last).step(step) { |i|
    ranges << (first..i)
    first = i + 1
  }
  # append any extra onto the last element
  ranges[-1] = (ranges[-1].first)..last if last > step * ranges.length
  ranges
end
I hope the code comments help shed some light on my original question.
pastebin: full source
Note: PW_LEN under # options can be changed to a lower number in order to get quicker results.
For a prohibitively large space, like
space = -10..1000000000000000000000
You can add this method to Range.
class Range
  M127 = 170_141_183_460_469_231_731_687_303_715_884_105_727
  def each_random(seed = 0)
    return to_enum(__method__) { size } unless block_given?
    unless first.kind_of? Integer
      raise TypeError, "can't randomly iterate from #{first.class}"
    end
    sample_size = self.end - first + 1
    sample_size -= 1 if exclude_end?
    j = coprime sample_size
    v = seed % sample_size
    each do
      v = (v + j) % sample_size
      yield first + v
    end
  end
  protected
  def gcd(a,b)
    b == 0 ? a : gcd(b, a % b)
  end
  def coprime(a, z = M127)
    gcd(a, z) == 1 ? z : coprime(a, z + 1)
  end
end
You could then
space.each_random { |i| puts i }
729815750697818944176
459631501395637888351
189447252093456832526
919263002791275776712
649078753489094720887
378894504186913665062
108710254884732609237
838526005582551553423
568341756280370497598
298157506978189441773
27973257676008385948
757789008373827330134
487604759071646274309
217420509769465218484
947236260467284162670
677052011165103106845
406867761862922051020
136683512560740995195
866499263258559939381
596315013956378883556
326130764654197827731
55946515352016771906
785762266049835716092
515578016747654660267
...
This gives a good amount of randomness so long as your space is a few orders of magnitude smaller than M127.
Credit to @nick-steele and @bta for the approach.
This isn't really a Ruby-specific answer but I hope it's permitted. Andrew Kensler gives a C++ "permute()" function that does exactly this in his "Correlated Multi-Jittered Sampling" report.
As I understand it, the exact function he provides really only works if your "array" is up to size 2^27, but the general idea could be used for arrays of any size.
I'll do my best to explain it. First, you need a hash that is reversible "for any power-of-two sized domain". Consider x = i + 1. No matter what x is, even if your integer overflows, you can determine what i was. More specifically, you can always determine the bottom n bits of i from the bottom n bits of x. Addition is a reversible hash operation, as is multiplication by an odd number, as is a bitwise xor with a constant. If you know a specific power-of-two domain, you can scramble bits within that domain. E.g. x ^= (x & 0xFFFF) >> 5 is valid for the 16-bit domain. You can specify that domain with a mask, e.g. mask = 0xFFFF, and your hash function becomes x = hash(i, mask). Of course you can add a "seed" value to that hash function to get different randomizations. Kensler lays out more valid operations in the paper.
So you have a reversible function x = hash(i, mask, seed). The problem is that if you hash your index, you might end up with a value that is larger than your array size, i.e. your "domain". You can't just modulo this or you'll get collisions.
The reversible hash is the key to using a technique called "cycle walking", introduced in "Ciphers with Arbitrary Finite Domains". Because the hash is reversible (i.e. 1-to-1), you can just repeatedly apply the same hash until your hashed value is smaller than your array! Because you're applying the same hash, and the mapping is one-to-one, whatever value you end up on will map back to exactly one index, so you don't have collisions. So your function could look something like this for 16-bit integers (pseudocode):
fun permute(i, length, seed) {
    i = hash(i, 0xFFFF, seed)
    while(i >= length): i = hash(i, 0xFFFF, seed)
    return i
}
It could take a lot of hashes to get to your domain, so Kensler does a simple trick: he keeps the hash within the domain of the next power of two, which makes it require very few iterations (~2 on average), by masking out the unnecessary bits. The final algorithm looks like this:
fun next_pow_2(length) {
    # This implementation is for clarity.
    # See Kensler's paper for one way to do it fast.
    p = 1
    while (p < length): p *= 2
    return p
}
permute(i, length, seed) {
    mask = next_pow_2(length)-1
    i = hash(i, mask, seed) & mask
    while(i >= length): i = hash(i, mask, seed) & mask
    return i
}
And that's it! Obviously the important thing here is choosing a good hash function, which Kensler provides in the paper but I wanted to break down the explanation. If you want to have different random permutations each time, you can add a "seed" value to the permute function which then gets passed to the hash function.
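To make the pseudocode concrete, here is a runnable Python sketch of cycle walking. The `mix` function is a toy reversible hash invented for this example, not the one from Kensler's paper. It only uses operations named above (addition, multiplication by an odd constant, xor with a right shift), each masked to the power-of-two domain.

```python
def next_pow_2(length):
    """Smallest power of two that is >= length (written for clarity, not speed)."""
    p = 1
    while p < length:
        p *= 2
    return p

def mix(i, mask, seed):
    """Toy reversible hash on [0, mask]; every step is a bijection on that domain."""
    i = (i + seed) & mask          # addition modulo a power of two
    i = (i * 0x9E3779B1) & mask    # multiplication by an odd constant
    i ^= i >> 3                    # xor with a right-shifted copy of itself
    i = (i * 0x85EBCA6B) & mask    # another odd multiplier
    return i

def permute(i, length, seed):
    """Map index i in range(length) to a unique index in range(length)."""
    mask = next_pow_2(length) - 1
    i = mix(i, mask, seed)
    while i >= length:             # cycle walking: re-hash until back inside
        i = mix(i, mask, seed)
    return i
```

Calling `permute(i, length, seed)` for every i in `range(length)` visits each index exactly once. The quality of the shuffle depends entirely on the hash, and this toy one is not meant to be statistically good.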

Resources