Algorithm to find words spelled out by a number - algorithm

I'm trying to find a way to determine all possible words that can be spelled out by a given number, given a mapping of letters to values.
I eventually want to find a solution that works for any 1- or 2- digit value mapping for a letter, but for illustration, assume A=1, B=2, ... Z=26.
Example: 12322 can be equal to abcbb (1,2,3,2,2), lcbb (12,3,2,2), awbb (1,23,2,2), abcv (1,2,3,22), awv (1,23,22), or lcv (12,3,22).
Here's what I have thought of so far:
I will build a tree of all possible words using the number.
To do this, I will start out with a tree with one root node with dummy data.
I will then parse the number digit-by-digit, starting from the least significant digit.
At each step, I will take the last digit of the remaining part of the number and insert it into the left subtree of the current node, and remove that digit from the number for that node's left subtree. For the same node, I will then check if the last TWO digits together form a valid letter, and if so, I will put them into the right subtree (and remove the 2 digits from the number for that node's right subtree).
I will then repeat the above steps recursively for each node, using the part of the number that's left, until there are no more digits left.
To illustrate, for 12322 my tree will look something like this:
*
/ \
/ \
2 22
/ / \
2 3 23
/ \ / \ /
3 23 2 12 1
/ \ / /
2 12 1 1
/
1
To get the words then, I will traverse all possible paths from the leaves to the root.
This seems to be an overly complex solution for what I thought would be a fairly simple problem, and I'm trying to find if there's a simpler way to solve this.

You need not actually construct a tree - just recurse:
Take a single digit. See if we can form a word considering it as a letter in itself, and recurse.
When we return from the recursion, try adding another digit (if we were 1 or 2 previously), and re-recursing.

Suppose you already have all the possible combinations of [2, 3, 2, 2]. What would the combinations of [1, 2, 3, 2, 2] (adding [1] to the head) be? It is not difficult to deduce that they should be:
A1: put [1] at the head of all_the_combinations_of[2,3,2,2], and
A2: put [1*10 + 2] at the head of all_the_combinations_of[3,2,2], if [1*10 + 2 <= 26]
Once we have this, the rest should be easy. I implemented a Ruby version with the recursion trace for your reference.
def comb a
  c = []
  puts a.inspect
  return [a] if a.length <= 1
  c = comb(a[1..-1]).map {|e| [a[0]] + e}
  if a[0] * 10 + a[1] <= 26
    c += comb(a[2..-1]).map { |f| [a[0] * 10 + a[1]] + f }
  end
  c
end
h = Hash[*(1..26).to_a.zip(('A'..'Z').to_a).flatten]
#h.keys.sort.each {|k| puts "#{k}=>#{h[k]}"}
comb([1,2,3,2,2]).each do |comb|
  puts comb.map {|k| h[k]}.join
end
[1, 2, 3, 2, 2]
A1 [2, 3, 2, 2]
[3, 2, 2]
[2, 2]
[2]
[]
[2, 2]
[2]
[]
A2 [3, 2, 2]
[2, 2]
[2]
[]
ABCBB
ABCV
AWBB
AWV
LCBB
LCV

A brute-force solution would be to dynamically fill an array from 1 to N, where element a[i] contains the set of strings that the first i digits s[1]s[2]...s[i] can expand to. You can fill a[1] from scratch, then fill a[2] based on the a[1] set and the second character in the string. Then you fill a[3], etc. At each step you only have to look back to a[i-1] and a[i-2] (and to s[i-1] and s[i], where s is your number sequence).
Finally, after you fill a[n], it will contain the answer.
For the example '12322', the sequence becomes:
a[1] = { "a" }
a[2] = { a + 'b' | a in a[1] } union { "l" } = { "ab", "l" }
a[3] = { a + 'c' | a in a[2] } union { a + 'w' | a in a[1] } = { "abc", "lc", "aw" }
a[4] = { a + 'b' | a in a[3] } union { } = { "abcb", "lcb", "awb" }
a[5] = { a + 'b' | a in a[4] } union { a + 'v' | a in a[3] } = { "abcbb", "lcbb", "awbb", "abcv", "lcv", "awv" }
This is essentially the dynamic programming version of the recursive solution above.
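The table-filling idea above could be sketched in Ruby roughly as follows. This is only a sketch, assuming the A=1 ... Z=26 mapping from the question; the method name `decodings` is made up for illustration, and the table is 0-based, so `table[i]` holds the words for the first i digits.

```ruby
# Hypothetical helper: all words spelled by a digit string, assuming a=1 ... z=26.
def decodings(digits)
  letter = ->(n) { (n + 96).chr }   # 1 -> "a", 26 -> "z"
  table = [[""]]                    # table[0]: exactly one way to spell nothing
  1.upto(digits.length) do |i|
    words = []
    one = digits[i - 1].to_i
    # Extend every word for the first i-1 digits with a one-digit letter.
    words += table[i - 1].map { |w| w + letter.(one) } if one >= 1
    if i >= 2
      two = digits[i - 2, 2].to_i
      # Extend every word for the first i-2 digits with a two-digit letter (10..26).
      words += table[i - 2].map { |w| w + letter.(two) } if two.between?(10, 26)
    end
    table << words
  end
  table.last
end

p decodings("12322").sort
# => ["abcbb", "abcv", "awbb", "awv", "lcbb", "lcv"]
```

Only the last two table entries are ever read, so a version that just counts the words could keep two numbers instead of the whole table.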

An alternative way to do this would be to reverse the problem:
Given a dictionary of words, calculate the numeric strings that would be generated, and store this data into a map/dictionary structure, i.e. table['89'] = 'hi' (with A=1, ..., Z=26, h is 8 and i is 9)
For each of the first x digits of the number you are looking up, see if it's in the table, i.e. table.ContainsKey('1'), table.ContainsKey('12'), ...
If you're trying to find the word sequences, generate the words that start at each location in the numeric string, and then do a recursive lookup to find all phrases from that.
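As a rough illustration of this reversed approach, here is a small Ruby sketch. The word list and the names `encode`, `build_table` and `phrases` are all hypothetical, and the a=1 ... z=26 mapping is assumed; a real application would load a proper dictionary.

```ruby
# Encode a lowercase word as its digit string, assuming a=1 ... z=26.
def encode(word)
  word.each_char.map { |ch| (ch.ord - 96).to_s }.join
end

# Map each digit string to every word that produces it.
def build_table(words)
  words.group_by { |w| encode(w) }
end

# All word sequences whose concatenated encodings equal the digit string.
def phrases(digits, table)
  return [[]] if digits.empty?
  (1..digits.length).flat_map do |len|
    prefix = digits[0, len]
    (table[prefix] || []).flat_map do |word|
      phrases(digits[len..-1], table).map { |rest| [word] + rest }
    end
  end
end

table = build_table(%w[hi he a be bad])   # made-up word list
p table["89"]               # => ["hi"]
p phrases("8985", table)    # => [["hi", "he"]]
```

Because the dictionary drives the search, only real words are produced, at the cost of building the table up front.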

Related

Min Path Sum in Matrix- Brute Force

I'm working on the following problem:
Given a m x n grid filled with non-negative numbers, find a path from top left to bottom right which minimizes the sum of all numbers along its path.
Note: You can only move either down or right at any point in time.
My initial impression here was to, from each position in the grid, get the min length of going to the right vs going downward. However, this gives me the incorrect answer for the following:
Input:
[[1,2],[1,1]]
Output:
2
Expected:
3
Intuitively, not sure what I'm doing wrong. It's also very simple code (I know it's not memoized--was planning on that for the next step) but intuitively not sure what's going wrong. The recursive base case makes sense, and each number is being taken into consideration.
def min_path_sum(grid)
smallest_path(0, 0, grid)
end
def smallest_path(row, col, grid)
return 0 if (row == grid.length || col == grid.first.length)
current_val = grid[row][col]
[current_val + smallest_path(row+1, col, grid), current_val + smallest_path(row, col+1, grid)].min #memoize here
end
You didn't make a proper termination condition. You check only until you hit either the right column or bottom row. You need to stay within bounds, but continue until you hit the lower-right corner. You need to recur within bounds until you hit both limits.
Given that, your code does work okay: it finds the path of 2 to the bottom row, rather than the path of 3 to the right edge. You just have to teach it to finish the job.
Is that enough to move you to a solution?
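For reference, one way to act on that hint is sketched below. It keeps the question's method names, but the bounds handling is just one possible choice: the recursion stops only at the lower-right corner and never steps outside the grid.

```ruby
def min_path_sum(grid)
  smallest_path(0, 0, grid)
end

def smallest_path(row, col, grid)
  last_row = grid.length - 1
  last_col = grid.first.length - 1
  current_val = grid[row][col]
  # Only the lower-right corner ends a path.
  return current_val if row == last_row && col == last_col
  options = []
  options << smallest_path(row + 1, col, grid) if row < last_row  # move down
  options << smallest_path(row, col + 1, grid) if col < last_col  # move right
  current_val + options.min
end

p min_path_sum([[1, 2], [1, 1]])  # => 3
```

This is still exponential without memoization, so it is only suitable for small grids.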
As this is a shortest path problem on an acyclic directed graph, you could use a standard shortest path algorithm.
You could also use dynamic programming ("DP"), which may be the most efficient optimization technique. My answer implements a DP algorithm.
A shortest-path or DP algorithm would be vastly superior to enumerating all paths from top-left to bottom-right. As the number of paths increases exponentially with the size of the array, simple enumeration could only be used on modest-sized arrays.
The idea of the DP algorithm is as follows. Let m and n be the numbers of rows and columns, respectively. First compute the shortest path from each column in the last row to the last column in the last row, [m-1, n-1]. This is an easy calculation because there is only one path to [m-1, n-1] from each of these elements. Starting with [m-1, n-2] we simply work back to [m-1, 0].
Next we compute the shortest paths from each element in each of the other rows to [m-1, n-1], starting with the penultimate row (m-2) and working back to the first row (0). The last element in each row, [i, n-1], is an easy calculation because one can only go down (to [i+1, n-1]). Therefore, the shortest path from [i, n-1] to [m-1, n-1] is first going to [i+1, n-1] and then following the shortest path from [i+1, n-1], which we've already computed (including its length, of course). The length of the shortest path from [i, n-1] is the "down" distance for [i, n-1] plus the length of the shortest path from [i+1, n-1].
For the remaining elements [i, j], 0 ≤ i < m-1 and 0 ≤ j < n-1, we calculate the shortest paths if we go right and if we go down, and select the shorter of the two.
We can implement this as follows.
Code
def shortest_path(distance)
  last_row, last_col = distance.size-1, distance.first.size-1
  h = {}
  last_row.downto(0) do |i|
    last_col.downto(0) do |j|
      h_right = { min_path_len: distance[i][j][:r] + h[[i,j+1]][:min_path_len],
                  next_node: [i,j+1] } if j < last_col
      h_down = { min_path_len: distance[i][j][:d] + h[[i+1,j]][:min_path_len],
                 next_node: [i+1,j] } if i < last_row
      g =
        case
        when i == last_row && j == last_col
          { min_path_len: 0, next_node: nil }
        when i == last_row
          h_right
        when j == last_col
          h_down
        else
          [h_right, h_down].min_by { |f| f[:min_path_len] }
        end
      h[[i,j]] = g
    end
  end
  build_path(h)
end
def build_path(h)
  node = [0, 0]
  len = h[node][:min_path_len]
  arr = []
  while h[node][:next_node]
    arr << node
    node = h[node][:next_node]
  end
  [len, arr]
end
Example
Suppose these are the distances between adjacent nodes.
● 4 ● 3 ● 1 ● 2 ●
6   2   5   4   5
● 3 ● 4 ● 6 ● 3 ●
1   3   4   2   3
● 6 ● 3 ● 1 ● 2 ●
It's convenient to provide this information in the form of an array of hashes.
distance = [
[{ r: 4, d: 6 }, { r: 3, d: 2 }, { r: 1, d: 5 }, { r: 2, d: 4 }, { d: 5 }],
[{ r: 3, d: 1 }, { r: 4, d: 3 }, { r: 6, d: 4 }, { r: 3, d: 2 }, { d: 3 }],
[{ r: 6 }, { r: 3 }, { r: 1 }, { r: 2 }]
]
We may now compute a shortest path.
p shortest_path distance
#=> [15, [[0, 0], [0, 1], [1, 1], [2, 1], [2, 2], [2, 3]]]
A shortest path is given by the second element of the array that is returned. 15 is the length of that path.
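The same backward-filling idea can also be applied directly to the question's grid, where the cost sits on the cells rather than on the edges. The following is a minimal sketch under that assumption; `min_path_sum_dp` is an illustrative name, not part of the answer's code, and it returns only the length, not the path.

```ruby
def min_path_sum_dp(grid)
  rows = grid.length
  cols = grid.first.length
  # best[i][j] = cheapest sum from cell [i, j] to the lower-right corner.
  best = Array.new(rows) { Array.new(cols, 0) }
  (rows - 1).downto(0) do |i|
    (cols - 1).downto(0) do |j|
      candidates = []
      candidates << best[i + 1][j] if i < rows - 1   # go down
      candidates << best[i][j + 1] if j < cols - 1   # go right
      # The corner has no candidates, so it contributes only its own value.
      best[i][j] = grid[i][j] + (candidates.min || 0)
    end
  end
  best[0][0]
end

p min_path_sum_dp([[1, 2], [1, 1]])  # => 3
```

Each cell is visited once, so the running time is proportional to the number of cells.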

Minimum number of special moves to sort number

Given the list of numbers
1 15 2 5 10
I need to obtain
1 2 5 10 15
The only operation I can do is "move the number X at position Y".
In the above example I only need to do "move the number 15 at position 5".
I would like to minimize the number of operations but I can't find/remember a classical algorithm for that, given the operation available.
Some background :
I'm interacting with an API for a kanban-like service.
I have about 600 cards and some actions on our bug-tracker can imply a reordering of these 600 cards in the kanban (multiple cards can move at the same time if the priority of a project is changed)
I can do it in 600 calls to the API but I'm trying to reduce that number as much as possible.
Lemma: The minimum number of (delete element, insert element) pairs you can perform to sort a list L (in increasing order) is:
Smin(L) = |L| - |LIC(L)|
Where LIC(L) is the Longest Increasing Subsequence.
Thus, you have to:
Establish the LIC of your list.
Remove the elements not in it and insert them back at the appropriate position (using binary search).
Proof:
By induction.
For a list of size 1, the longest increasing subsequence is of length... 1! The list is already sorted so the number of (del,ins) pairs required is
|L| - |LIC(L)| = 1 - 1 = 0
Now let L_n be a list of length n, n ≥ 1. Let L_{n+1} be the list obtained by adding an element e_{n+1} to the left of L_n.
This element may or may not influence the Longest Increasing Subsequence. Let's try to see how...
Let i_{n,1} and i_{n,2} be the first two elements of LIC(L_n) (*):
If e_{n+1} > i_{n,2}, then LIC(L_{n+1}) = LIC(L_n)
If e_{n+1} ≤ i_{n,1}, then LIC(L_{n+1}) = e_{n+1} || LIC(L_n)
Else, LIC(L_{n+1}) = LIC(L_n) - i_{n,1} + e_{n+1}. We keep the LIC with the highest first element. This is done by removing i_{n,1} from the LIC and replacing it with e_{n+1}.
In the first case, we delete e_{n+1}, so we are left with sorting L_n. By the induction hypothesis, this requires S_min(L_n) = n - |LIC(L_n)| (deletion, insertion) pairs. We then have to insert e_{n+1} at the appropriate position. Thus:
S_min(L_{n+1}) = 1 + S_min(L_n)
S_min(L_{n+1}) = 1 + n - |LIC(L_n)|
S_min(L_{n+1}) = |L_{n+1}| - |LIC(L_{n+1})|
In the second case, we ignore e_{n+1}. We begin by deleting elements not in LIC(L_n). These elements have to be inserted again! There are
S_min(L_n) = |L_n| - |LIC(L_n)|
such elements.
Now, we just have to take care to insert them in the right order (relative to e_{n+1}). In the end, it requires:
S_min(L_{n+1}) = |L_n| - |LIC(L_n)|
S_min(L_{n+1}) = |L_n| + 1 - (|LIC(L_n)| + 1)
Since we have |LIC(L_{n+1})| = |LIC(L_n)| + 1 and |L_{n+1}| = |L_n| + 1, we have in the end:
S_min(L_{n+1}) = |L_{n+1}| - |LIC(L_{n+1})|
The last case can be proved by considering the list L'_n obtained by removing i_{n,1} from L_{n+1}. In that case LIC(L'_n) = LIC(L_{n+1}) and thus:
|LIC(L'_n)| = |LIC(L_n)| (1)
From there, we can sort L'_n (which takes |L'_n| - |LIC(L'_n)| pairs by the induction hypothesis). The previous equality (1) leads to the result.
(*): If |LIC(L_n)| < 2, then i_{n,2} doesn't exist. Just ignore the comparisons with it. In that case, only case 2 and case 3 apply... The result is still valid
One possible solution is to find the longest increasing subsequence and move only elements that aren't inside it.
I can't prove it's optimal, but it is easy to prove it is correct and better than N swaps.
Here is a proof-of-concept in Python 2. I implemented it as an O(n^2) algorithm, but I'm pretty sure it can be reduced to O(n log n).
from operator import itemgetter

def LIS(V):
    T = [1]*(len(V))
    P = [-1]*(len(V))
    for i, v in enumerate(V):
        for j in xrange(i-1, -1, -1):
            if T[j]+1 > T[i] and V[j] <= V[i]:
                T[i] = T[j] + 1
                P[i] = j
    i, _ = max(enumerate(T), key=itemgetter(1))
    while i != -1:
        yield i
        i = P[i]

def complement(L, n):
    for a, b in zip(L, L[1:]+[n]):
        for i in range(a+1, b):
            yield i

def find_moves(V):
    n = len(V)
    L = list(LIS(V))[::-1]
    SV = sorted(range(n), key=lambda i:V[i])
    moves = [(x, SV.index(x)) for x in complement(L, n)]
    while len(moves):
        a, b = moves.pop()
        yield a, b
        moves = [(x-(x>a)+(x>b), y) for x, y in moves]

def make_and_print_moves(V):
    print 'Initial array:', V
    for a, b in find_moves(V):
        x = V.pop(a)
        V.insert(b, x)
        print 'Move {} to {}. Result: {}'.format(a, b, V)
    print '***'

make_and_print_moves([1, 15, 2, 5, 10])
make_and_print_moves([4, 3, 2, 1])
make_and_print_moves([1, 2, 4, 3])
It outputs something like:
Initial array: [1, 15, 2, 5, 10]
Move 1 to 4. Result: [1, 2, 5, 10, 15]
***
Initial array: [4, 3, 2, 1]
Move 3 to 0. Result: [1, 4, 3, 2]
Move 3 to 1. Result: [1, 2, 4, 3]
Move 3 to 2. Result: [1, 2, 3, 4]
***
Initial array: [1, 2, 4, 3]
Move 3 to 2. Result: [1, 2, 3, 4]
***
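If only the number of moves is needed (for example, to estimate how many API calls a reorder will cost), the O(n log n) variant hinted at above could look roughly like this in Ruby. It is a sketch that assumes all values are distinct and uses the usual "smallest tail per length" technique; `lis_length` and `min_moves` are illustrative names.

```ruby
# Length of the longest strictly increasing subsequence, in O(n log n).
def lis_length(values)
  # tails[k] = smallest possible last value of an increasing subsequence of length k + 1
  tails = []
  values.each do |v|
    # First slot whose tail is >= v, or a new slot at the end.
    pos = tails.bsearch_index { |t| t >= v } || tails.length
    tails[pos] = v
  end
  tails.length
end

# Elements outside a longest increasing subsequence each need one move.
def min_moves(values)
  values.length - lis_length(values)
end

p min_moves([1, 15, 2, 5, 10])  # => 1
p min_moves([4, 3, 2, 1])       # => 3
p min_moves([1, 2, 4, 3])       # => 1
```

These counts match the number of moves printed by the proof-of-concept for the same three inputs.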

How do I get a random value from an array so that objects at the beginning are returned more often?

Given an array like [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], I want to get a random value that takes into consideration the position.
I want the likelihood of 1 popping up to be way bigger than 10.
Is something like this possible?
For the sake of simplicity let's assume an array arr = [x, y, z] from which we will be sampling values. We'd like to see following relative frequencies of x, y and z:
frequencies = [5, 2, 1]
Preprocess these frequencies to calculate margins for our subsequent dice roll:
thresholds = frequencies.clone
1.upto(frequencies.count - 1).each { |i| thresholds[i] += thresholds[i - 1] }
Let's sum them up.
max = frequencies.reduce :+
Now choose a random number
roll = 1 + rand max
index = thresholds.find_index { |x| roll <= x }
Return arr[index] as a result. To sum up:
def sample arr, frequencies
  # assert arr.count == frequencies.count
  thresholds = frequencies.clone
  1.upto(frequencies.count - 1).each { |i| thresholds[i] += thresholds[i - 1] }
  max = frequencies.reduce :+
  roll = 1 + rand(max)
  index = thresholds.find_index { |x| roll <= x }
  arr[index]
end
Let's see how it works.
data = 80_000.times.map { sample [:x, :y, :z], [5, 2, 1] }
A histogram for data shows that sample works as we've intended.
def coin_toss( arr )
arr.detect{ rand(2) == 0 } || arr.last
end
a = (1..10).to_a
10.times{ print coin_toss( a ), ' ' } #=> 1 1 1 9 1 5 4 1 1 3
This takes the first element of the array, flips a coin, returns the element and stops if the coinflip is 'tails'; the same with the next element otherwise. If it is 'heads' all the way, return the last element.
A simple way to implement this with a geometrically decreasing probability of being selected is to simulate coin flips. Generate random integers that are each 0 or 1; the index into the array is the number of consecutive 1s you get before the first 0. With this method, 2 is 1/2 as likely to be selected as 1, 3 is 1/4 as likely, etc. You can vary the probability, say by generating random integers from 0 to 4 and counting the number of consecutive rounds above 0, which makes each number in the array 4/5 as likely to appear as the one before.
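A minimal Ruby sketch of this coin-flip idea follows. The method name `geometric_pick` and the parameters `keep` and `rng` are assumptions introduced for illustration; `keep` is the chance of moving on to the next element, so 0.5 corresponds to a fair coin and 0.8 to the 4/5 variant.

```ruby
# Pick an element so that each one is roughly `keep` times as likely as the
# one before it (the last element absorbs whatever probability is left over).
def geometric_pick(arr, keep: 0.5, rng: Random.new)
  index = 0
  # Keep advancing while the "coin" says continue and there is somewhere to go.
  index += 1 while index < arr.length - 1 && rng.rand < keep
  arr[index]
end

rng = Random.new(42)   # seeded only to make the demo repeatable
counts = Hash.new(0)
10_000.times { counts[geometric_pick((1..10).to_a, rng: rng)] += 1 }
p counts.sort.to_h
```

With a fair coin, roughly half of the picks should be the first element, a quarter the second, and so on.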
A better and more general way to solve this problem is to use the alias method. See the answer to this question for more information:
Data structure for loaded dice?

Need an algorithm to split a series of numbers

After a few busy nights my head isn't working so well, but this needs to be fixed yesterday, so I'm asking the more refreshed community of SO.
I've got a series of numbers. For example:
1, 5, 7, 13, 3, 3, 4, 1, 8, 6, 6, 6
I need to split this series into three parts so the sum of the numbers in all parts is as close as possible. The order of the numbers needs to be maintained, so the first part must consist of the first X numbers, the second - of the next Y numbers, and the third - of whatever is left.
What would be the algorithm to do this?
(Note: the actual problem is to arrange text paragraphs of differing heights into three columns. Paragraphs must maintain order (of course) and they may not be split in half. The columns should be as equal of height as possible.)
First, we'll need to define the goal better:
Suppose the partial sums are A1,A2,A3, We are trying to minimize |A-A1|+|A-A2|+|A-A3|. A is the average: A=(A1+A2+A3)/3.
Therefore, we are trying to minimize |A2+A3-2A1|+|A1+A3-2A2|+|A1+A2-2A3|.
Let S denote the sum (which is constant): S=A1+A2+A3, so A3=S-A1-A2.
We're trying to minimize:
|A2+S-A1-A2-2A1|+|A1+S-A1-A2-2A2|+|A1+A2-2S+2A1+2A2|=|S-3A1|+|S-3A2|+|3A1+3A2-2S|
Denoting this function as f, we can do two loops O(n^2) and keep track of the minimum:
Something like:
for (x=1; x<items; x++)
{
A1= sum(Item[0]..Item[x-1])
for (y=x; y<items; y++)
{
A2= sum(Item[x]..Item[y-1])
calc f, if new minimum found -keep x,y
}
}
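A runnable Ruby version of the two loops might look like the sketch below. It assumes at least three items and non-empty parts, and it uses prefix sums so each part sum is a constant-time lookup; `split_in_three` is an illustrative name.

```ruby
# Try every pair of cut points and keep the pair that minimises
# |S - 3*A1| + |S - 3*A2| + |S - 3*A3|, where S is the total sum.
def split_in_three(items)
  prefix = [0]
  items.each { |v| prefix << prefix.last + v }   # prefix[k] = sum of the first k items
  total = prefix.last
  best = nil
  (1...items.length).each do |x|                  # first part is items[0...x]
    ((x + 1)...items.length).each do |y|          # second part is items[x...y]
      sums = [prefix[x], prefix[y] - prefix[x], total - prefix[y]]
      cost = sums.sum { |a| (total - 3 * a).abs }
      best = [cost, x, y] if best.nil? || cost < best[0]
    end
  end
  _, x, y = best
  [items[0...x], items[x...y], items[y..-1]]
end

p split_in_three([1, 5, 7, 13, 3, 3, 4, 1, 8, 6, 6, 6])
# => [[1, 5, 7, 13], [3, 3, 4, 1, 8], [6, 6, 6]]
```

For the question's series the part sums are 26, 19 and 18 against an average of 21.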
find sum and cumulative sum of series.
get a= sum/3
then locate nearest a, 2*a in the cumulative sum which divides your list into three equal parts.
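These three steps could be sketched in Ruby as follows. This is a greedy heuristic rather than a guaranteed optimum, and it can produce an empty part on awkward inputs; `split_near_thirds` is a hypothetical name.

```ruby
def split_near_thirds(items)
  prefix = []
  items.each { |v| prefix << (prefix.last || 0) + v }   # running totals
  third = prefix.last / 3.0
  # Index of the running total closest to sum/3 and to 2*sum/3.
  a, b = [third, 2 * third].map do |target|
    prefix.each_index.min_by { |i| (prefix[i] - target).abs }
  end
  [items[0..a], items[(a + 1)..b], items[(b + 1)..-1]]
end

p split_near_thirds([1, 5, 7, 13, 3, 3, 4, 1, 8, 6, 6, 6])
# => [[1, 5, 7, 13], [3, 3, 4, 1, 8], [6, 6, 6]]
```

Here the running totals closest to 21 and 42 are 26 and 45, which gives the cuts after the 4th and 9th numbers.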
Lets say p is your array of paragraph heights;
int len= p.sum()/3; //it is the average value
int currlen=0;
int templen=0;
int indexes[2];
int j = 0;
for (i=0;i<p.length;i++)
{
currlen = currlen + p[i];
if (currlen>len)
{
if ((currlen-len)<(abs((currlen-p[i])-len)))
{ //check which one is closer to the average value
indexes[j++] = i;
len=(p.sum()-currlen)/2 //optional: compute a new average height from the remaining lengths
currlen = 0;
}
else
{
indexes[j++] = i-1;
len=(p.sum()-currlen)/2
currlen = p[i];
}
}
if (j>2)
break;
}
You will get the starting indexes of the 2nd and 3rd sequences. Note it's kind of pseudo code :)
I believe that this can be solved with a dynamic programming algorithm for line breaking invented by Donald Knuth for use in TeX.
Following Aasmund Eldhuset's answer: I previously answered a question like this on SO.
Word wrap to X lines instead of maximum width (Least raggedness)
This algo doesn't rely on the max line size but just gives an optimal cut.
I modified it to work with your problem :
L=[1,5,7,13,3,3,4,1,8,6,6,6]
def minragged(words, n=3):
    P=2
    cumwordwidth = [0]
    # cumwordwidth[-1] is the last element
    for word in words:
        cumwordwidth.append(cumwordwidth[-1] + word)
    totalwidth = cumwordwidth[-1] + len(words) - 1 # len(words) - 1 spaces
    linewidth = float(totalwidth - (n - 1)) / float(n) # n - 1 line breaks
    print "number of words:", len(words)
    def cost(i, j):
        """
        cost of a line words[i], ..., words[j - 1] (words[i:j])
        """
        actuallinewidth = max(j - i - 1, 0) + (cumwordwidth[j] - cumwordwidth[i])
        return (linewidth - float(actuallinewidth)) ** P
    """
    printing the reasoning and reversing the return list
    """
    F={} # Total cost function
    for stage in range(n):
        print "------------------------------------"
        print "stage :",stage
        print "------------------------------------"
        print "word i to j in line",stage,"\t\tTotalCost (f(j))"
        print "------------------------------------"
        if stage==0:
            F[stage]=[]
            i=0
            for j in range(i,len(words)+1):
                print "i=",i,"j=",j,"\t\t\t",cost(i,j)
                F[stage].append([cost(i,j),0])
        elif stage==(n-1):
            F[stage]=[[float('inf'),0] for i in range(len(words)+1)]
            for i in range(len(words)+1):
                j=len(words)
                if F[stage-1][i][0]+cost(i,j)<F[stage][j][0]: #calculating min cost (cf f formula)
                    F[stage][j][0]=F[stage-1][i][0]+cost(i,j)
                    F[stage][j][1]=i
                    print "i=",i,"j=",j,"\t\t\t",F[stage][j][0]
        else:
            F[stage]=[[float('inf'),0] for i in range(len(words)+1)]
            for i in range(len(words)+1):
                for j in range(i,len(words)+1):
                    if F[stage-1][i][0]+cost(i,j)<F[stage][j][0]:
                        F[stage][j][0]=F[stage-1][i][0]+cost(i,j)
                        F[stage][j][1]=i
                        print "i=",i,"j=",j,"\t\t\t",F[stage][j][0]
    print 'reversing list'
    print "------------------------------------"
    listWords=[]
    a=len(words)
    for k in xrange(n-1,0,-1):#reverse loop from n-1 to 1
        listWords.append(words[F[k][a][1]:a])
        a=F[k][a][1]
    listWords.append(words[0:a])
    listWords.reverse()
    for line in listWords:
        print line, '\t\t',sum(line)
    return listWords
The result I get is:
[1, 5, 7, 13] 26
[3, 3, 4, 1, 8] 19
[6, 6, 6] 18
[[1, 5, 7, 13], [3, 3, 4, 1, 8], [6, 6, 6]]
Hope it helps

Need a simple explanation of the inject method

[1, 2, 3, 4].inject(0) { |result, element| result + element } # => 10
I'm looking at this code but my brain is not registering how the number 10 can become the result. Would someone mind explaining what's happening here?
You can think of the first block argument as an accumulator: the result of each run of the block is stored in the accumulator and then passed to the next execution of the block. In the case of the code shown above, you are defaulting the accumulator, result, to 0. Each run of the block adds the given number to the current total and then stores the result back into the accumulator. The next block call has this new value, adds to it, stores it again, and repeats.
At the end of the process, inject returns the accumulator, which in this case is the sum of all the values in the array, or 10.
Here's another simple example to create a hash from an array of objects, keyed by their string representation:
[1,"a",Object.new,:hi].inject({}) do |hash, item|
hash[item.to_s] = item
hash
end
In this case, we are defaulting our accumulator to an empty hash, then populating it each time the block executes. Notice we must return the hash as the last line of the block, because the result of the block will be stored back in the accumulator.
inject takes a value to start with (the 0 in your example), and a block, and it runs that block once for each element of the list.
On the first iteration, it passes in the value you provided as the starting value, and the first element of the list, and it saves the value that your block returned (in this case result + element).
It then runs the block again, passing in the result from the first iteration as the first argument, and the second element from the list as the second argument, again saving the result.
It continues this way until it has consumed all elements of the list.
The easiest way to explain this may be to show how each step works, for your example; this is an imaginary set of steps showing how this result could be evaluated:
[1, 2, 3, 4].inject(0) { |result, element| result + element }
[2, 3, 4].inject(0 + 1) { |result, element| result + element }
[3, 4].inject((0 + 1) + 2) { |result, element| result + element }
[4].inject(((0 + 1) + 2) + 3) { |result, element| result + element }
[].inject((((0 + 1) + 2) + 3) + 4) { |result, element| result + element }
(((0 + 1) + 2) + 3) + 4
10
The syntax for the inject method is as follows:
inject (value_initial) { |result_memo, object| block }
Let's solve the above example i.e.
[1, 2, 3, 4].inject(0) { |result, element| result + element }
which gives the 10 as the output.
So, before starting let's see what are the values stored in each variables:
result = 0 The zero came from inject(value) which is 0
element = 1 It is first element of the array.
Okey!!! So, let's start understanding the above example
Step :1 [1, 2, 3, 4].inject(0) { |0, 1| 0 + 1 }
Step :2 [1, 2, 3, 4].inject(0) { |1, 2| 1 + 2 }
Step :3 [1, 2, 3, 4].inject(0) { |3, 3| 3 + 3 }
Step :4 [1, 2, 3, 4].inject(0) { |6, 4| 6 + 4 }
Step :5 [1, 2, 3, 4].inject(0) { |10, Now no elements left in the array, so it'll return 10 from this step| }
In each step, the first value between the pipes is the running result and the second is the element fetched from the array.
I hope that you understand the working of the #inject method of the #ruby.
The code iterates over the four elements within the array and adds the previous result to the current element:
1 + 2 = 3
3 + 3 = 6
6 + 4 = 10
What they said, but note also that you do not always need to provide a "starting value":
[1, 2, 3, 4].inject(0) { |result, element| result + element } # => 10
is the same as
[1, 2, 3, 4].inject { |result, element| result + element } # => 10
Try it, I'll wait.
When no argument is passed to inject, the first two elements are passed into the first iteration. In the example above, result is 1 and element is 2 the first time around, so one less call is made to the block.
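A quick way to see the difference is to count how often the block runs. The counter variables below are just for illustration:

```ruby
calls_with_start = 0
[1, 2, 3, 4].inject(0) { |result, element| calls_with_start += 1; result + element }

calls_without_start = 0
[1, 2, 3, 4].inject { |result, element| calls_without_start += 1; result + element }

p calls_with_start     # => 4
p calls_without_start  # => 3

# The starting value also decides what an empty array gives back.
p([].inject(0) { |result, element| result + element })  # => 0
p([].inject { |result, element| result + element })     # => nil
```

So the starting value matters most when the collection may be empty, or when the result should be of a different kind than the elements.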
The number you put inside your () of inject represents a starting place, it could be 0 or 1000.
Inside the pipes you have two placeholders, |x, y|. On the first iteration x is whatever number you passed to .inject(), and after that it is the running result; the second placeholder, y, takes each element of your object in turn.
[1, 2, 3, 4].inject(5) { |result, element| result + element } # => 15
1 + 5 = 6
2 + 6 = 8
3 + 8 = 11
11 + 4 = 15
Inject applies the block
result + element
to each item in the array. For the next item ("element"), the value returned from the block is "result". The way you've called it (with a parameter), "result" starts with the value of that parameter. So the effect is adding the elements up.
tldr; inject differs from map in one important way: inject returns the value of the last execution of the block, whereas map returns a new array holding the value of every execution of the block.
More than that the value of each block execution passed into the next execution via the first parameter (result in this case) and you can initialize that value (the (0) part).
Your above example could be written using map like this:
result = 0 # initialize result
[1, 2, 3, 4].map { |element| result += element }
# result => 10
Same effect but inject is more concise here.
You'll often find an assignment happens in the map block, whereas an evaluation happens in the inject block.
Which method you choose depends on the scope you want for result. When to not use it would be something like this:
result = [1, 2, 3, 4].inject(0) { |x, element| x + element }
You might be like all, "Lookie me, I just combined that all into one line," but you also temporarily allocated memory for x as a scratch variable that wasn't necessary since you already had result to work with.
[1, 2, 3, 4].inject(0) { |result, element| result + element } # => 10
is equivalent to the following:
def my_function(r, e)
r+e
end
a = [1, 2, 3, 4]
result = 0
a.each do |value|
result = my_function(result, value)
end
[1, 2, 3, 4].inject(0) { |result, element| result + element } # => 10
In plain English, you are going through (iterating) through this array ([1,2,3,4]). You will iterate through this array 4 times, because there are 4 elements (1, 2, 3, and 4). The inject method has 1 argument (the number 0), and you will add that argument to the 1st element (0 + 1. This equals 1). 1 is saved in the "result". Then you add that result (which is 1) to the next element (1 + 2. This is 3). This will now be saved as the result. Keep going: 3 + 3 equals 6. And finally, 6 + 4 equals 10.
This is a simple and fairly easy to understand explanation:
Forget about the "initial value" as it is somewhat confusing at the beginning.
> [1,2,3,4].inject{|a,b| a+b}
=> 10
You can understand the above as: I am injecting an "adding machine" in between 1,2,3,4. Meaning, it is 1 ♫ 2 ♫ 3 ♫ 4 and ♫ is an adding machine, so it is the same as 1 + 2 + 3 + 4, and it is 10.
You can actually inject a + in between them:
> [1,2,3,4].inject(:+)
=> 10
and it is like, inject a + in between 1,2,3,4, making it 1 + 2 + 3 + 4 and it is 10. The :+ is Ruby's way of specifying + in the form of a symbol.
This is quite easy to understand and intuitive. And if you want to analyze how it works step by step, it is like: taking 1 and 2, and now add them, and when you have a result, store it first (which is 3), and now, next is the stored value 3 and the array element 3 going through the a + b process, which is 6, and now store this value, and now 6 and 4 go through the a + b process, and is 10. You are essentially doing
((1 + 2) + 3) + 4
and is 10. The "initial value" 0 is just a "base" to begin with. In many cases, you don't need it. Imagine if you need 1 * 2 * 3 * 4 and it is
[1,2,3,4].inject(:*)
=> 24
and it is done. You don't need an "initial value" of 1 to multiply the whole thing with 1. This time, it is doing
(((1 * 2) * 3) * 4)
and you get the same result as
1 * 2 * 3 * 4
This code doesn't allow the possibility of not passing a starting value, but may help explain what's going on.
def incomplete_inject(enumerable, result)
  enumerable.each do |item|
    result = yield(result, item)
  end
  result
end
incomplete_inject([1,2,3,4], 0) {|result, item| result + item} # => 10
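As a possible extension, the sketch below also handles the case where no starting value is passed, by using the first element as the initial result. The name `sketch_inject` is made up, and this only mimics the block form of inject; it is not how Ruby implements it.

```ruby
def sketch_inject(enumerable, *start)
  items = enumerable.to_a.dup   # copy so the caller's array is not modified
  # Without a starting value, the first element becomes the initial result.
  result = start.empty? ? items.shift : start.first
  items.each do |item|
    result = yield(result, item)
  end
  result
end

p sketch_inject([1, 2, 3, 4], 0) { |result, item| result + item }  # => 10
p sketch_inject([1, 2, 3, 4]) { |result, item| result * item }     # => 24
```

Like the real method, it returns nil for an empty collection when no starting value is given.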
One day, I was also banging my head against the default values in Ruby inject/reduce methods, so I've tried to visualize my issue:
Start here and then review all methods that take blocks.
http://ruby-doc.org/core-2.3.3/Enumerable.html#method-i-inject
Is it the block that confuses you or why you have a value in the method?
Good question though. What is the operator method there?
result.+
What does it start out as?
#inject(0)
Can we do this?
[1, 2, 3, 4].inject(0) { |result, element| result.+ element }
Does this work?
[1, 2, 3, 4].inject() { |result = 0, element| result.+ element }
You see I'm building on to the idea that it simply sums all the elements of the array and yields a number in the memo you see in the docs.
You can always do this
[1, 2, 3, 4].each { |element| p element }
to see the enumerable of the array get iterated through. That's the basic idea.
It's just that inject or reduce give you a memo or an accumulator that gets sent out.
We could try to get a result
[1, 2, 3, 4].each { |result = 0, element| result + element }
but nothing comes back so this just acts the same as before
[1, 2, 3, 4].each { |result = 0, element| p result + element }
in the element inspector block.
There is another form of .inject() method That is very helpful
[4,5].inject(&:+) That will add up all the elements of the array
A common scenario that arises when using a collection of any sort is to perform a single type of operation on all the elements and collect the result.
For example, a sum(array) function might wish to add all the elements passed as the array and return the result.
A generalized abstraction of same functionality is provided in Ruby in the name of reduce (inject is an alias). That is, these methods iterate over a collection and accumulate the value of an operation on elements in a base value using an operator and return that base value in the end.
Let's take an example for better understanding.
>>> (5..10).inject(1) {|product, n| product * n }
=> 151200
In above example, we have the following elements: a base value 1, an enumerable (5..10), and a block with expressions instructing how to add the calculated value to base value (i.e., multiply the array element to product, where product is initialized with base value)
So the execution follows something like this:
# loop 1
n = 5
product = 1
return value = 1*5
# loop 2
n = 6
product = 5
return value = 5*6
# loop 3
n = 7
product = 30
return value = 30*7
..
As you can notice, the base value is continually updated as the expression loops through the element of container, thus returning the final value of base value as result.
Is the same as this:
[1,2,3,4].inject(:+)
=> 10

Resources