Object and Pointers Graph Representation in Ruby

I am studying how to represent a graph in memory with objects and pointers in Ruby and cannot find an example of this anywhere. Can anyone point me in the right direction on how to build this data structure?
Edit: Thanks for the answers, they are great. Either a directed graph or a connected graph works. I appreciate the help!

There are all sorts of ways you can do this, and the best implementation will depend on the nature of the nodes and edges in your graph. One way to do it would be like this:
# Make some objects. Let's just use the Object class for
# now, but they could be a custom class later.
a = Object.new
b = Object.new
c = Object.new
# Make a hash that stores edges of the graph.
edges = {}
edges[a] = [b, c] # Edge from a to b and another from a to c.
edges[b] = [c]
edges[c] = []
# Iterate over each object that a points to.
edges[a].each do |obj|
end
The implementation above uses a separate hash to store the edges of the graph, but you could certainly store the edges inside instance variables in the objects themselves. However, that could get messy because a.inspect would then print out the entire contents of all the objects that a points to, and then it would recursively print out the contents of all of those objects, and so on.
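One way to keep the edges inside the objects while keeping the inspect output short is to override inspect. The following is only a minimal sketch: the class name Vertex and the connect method are made up for illustration and are not part of the answer above.

```ruby
# Hypothetical class: each vertex stores its outgoing edges in an
# instance variable and overrides inspect so that printing one vertex
# does not dump every vertex reachable from it.
class Vertex
  attr_reader :name, :neighbors

  def initialize(name)
    @name = name
    @neighbors = []
  end

  # Adds a directed edge from self to other; returns self for chaining.
  def connect(other)
    @neighbors << other
    self
  end

  # Show only the names of the direct neighbors.
  def inspect
    "#<Vertex #{name} -> [#{neighbors.map(&:name).join(', ')}]>"
  end
end

a = Vertex.new(:a)
b = Vertex.new(:b)
c = Vertex.new(:c)
a.connect(b).connect(c)
b.connect(c)
c.connect(a) # a cycle is fine, the references are just stored

puts a.inspect # #<Vertex a -> [b, c]>
```

The trade-off is that the edges now live in the vertices, so a vertex cannot easily belong to two different graphs with different edge sets, which the separate-hash approach allows.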

class Node
  attr_accessor :id, :neighbors
  def initialize(id, neighbors = [])
    @id = id
    @neighbors = neighbors
  end
end
# Directed Graph
class Graph
  attr_accessor :nodes
  def initialize(nodes)
    @nodes = nodes
  end
  def adjecency_list
    @nodes.map do |node|
      [node.id, node.neighbors.map(&:id)]
    end.to_h
  end
end
n1 = Node.new(1)
n2 = Node.new(2, [n1])
n3 = Node.new(3, [n1, n2])
graph = Graph.new([n1, n2, n3])
graph.adjecency_list
# {
# 1 => [],
# 2 => [
# [0] 1
# ],
# 3 => [
# [0] 1,
# [1] 2
# ]
# }
n4 = Node.new(4, [n1])
graph.nodes << n4
graph.adjecency_list
# {
# 1 => [],
# 2 => [
# [0] 1
# ],
# 3 => [
# [0] 1,
# [1] 2
# ],
# 4 => [
# [0] 1
# ]
# }

Related

What's a proper algorithm to count all possible paths from any given vertex to another in a graph?

I'm working with graphs and I'd like to count all possible paths from given vertex X to given vertex Y.
This is the algorithm I've come up with:
class Graph
  def paths_from(vertex_a, vertex_b, count = 0, visited = Array.new(@vertices.length, false))
    return count if @vertices.none?(vertex_b) || @vertices.none?(vertex_a)
    count += 1 if vertex_a == vertex_b
    visited[@vertices.index(vertex_a)] = true
    @net[@vertices.index(vertex_a)].each do |vertex|
      paths_from(vertex, vertex_b, count, visited) unless visited[@vertices.index(vertex)]
    end
    count
  end
end
Using recursion, I expect to traverse the graph depth-first. However, I keep getting 0 instead of the expected value for the graph below:
describe Graph do
context 'can output all possible from vertex a to vertex b.' do
let(:subject) { Graph.new(%w[a b c d e]) }
before(:each) do
subject.add_edge(0, 1)
subject.add_edge(0, 2)
subject.add_edge(0, 4)
subject.add_edge(1, 2)
subject.add_edge(1, 4)
subject.add_edge(2, 3)
subject.add_edge(3, 1)
end
it 'example #1' do
expect(subject.paths_from('a', 'f')).to eql 0 # => should output 0 and it does.
end
it 'example #2' do
expect(subject.paths_from('f', 'a')).to eql 0 # => should output 0 and it does.
end
it 'example #3' do
expect(subject.paths_from('a', 'b')).to eql 2
end
end
end
Doubt #1: I've checked the GeeksforGeeks tips on this algorithm, which state that I should backtrack. What is that, and how do I do it? I guess they're referring to the visited variable, but I have no clue how to do that.
I'll drop the class definition just in case.
class Graph
  attr_accessor :net, :vertices
  def initialize(vertices = [])
    @net = Array.new(vertices.length) { [] }
    @vertices = vertices
  end
  def add_edge(vertex_a, vertex_b)
    return if @net[vertex_a].nil? || @vertices[vertex_b].nil?
    @net[vertex_a] << @vertices[vertex_b]
  end
end
Doubt #2: If I print the count variable right before the @net loop, it prints 0, then '1' four times, yet the method returns 0. Why is that? I suppose it's because it's returning the value from the first call to #paths_from... if that's the case, how can I return the count from the last call to #paths_from?
I assume the graph is directed and contains no cycles.
Suppose the graph is described by the following hash.
graph = { :A=>[:C, :D], :B=> [:D], :C=>[:D, :E, :F], :D=>[:G], :E=>[:F, :H],
:F=>[:D, :G, :I, :H], :G=>[:H, :I], :H=>[], :I=>[] }
The nodes are the keys, and the arcs run from each key to each element of that key's value. There are, for example, arcs :A->:C, :A->:D, :B->:D and so on.
We can display this graph as follows.
Given two of the nodes, designated as the origin and terminus, the problem is to determine the number of paths from origin to terminus.
Suppose
origin = :A
terminus = :H
It is seen that there are nine paths from A to H:
A-C-E-H
A-C-E-F-H
A-C-E-F-G-H
A-C-E-F-D-G-H
A-C-F-H
A-C-F-D-G-H
A-C-F-G-H
A-C-D-G-H
A-D-G-H
I will give two solutions. The first is a recursion that requires little code but enumerates all paths. The number of such paths, however, can grow exponentially with the number of nodes. The second is more complex but is much faster for larger graphs. Its computational complexity appears to be only O(n^2), where n is the number of nodes.
Enumerate paths from origin to destination
def tot_paths(graph, terminus, node)
graph[node].reduce(0) do |tot, n|
tot + ((n == terminus) ? 1 : tot_paths(graph, terminus, n))
end
end
tot_paths(graph, :H, :A)
#=> 9
More complex, but a much more efficient solution
The second approach requires two steps. The first is to perform a topological sort of the nodes of the graph.
Any array sorted that is an array of topologically-sorted nodes has the property that, for any pair of nodes ni = sorted[i] and nj = sorted[j] there is no path from nj to ni if j > i. A directed graph with no cycles is guaranteed to have at least one topological sort of nodes.
I have used Kahn's algorithm (described at the above link) to produce the topological sort given by the following array:
[:B, :A, :C, :E, :F, :D, :G, :I, :H]
As shown below, if these nodes are viewed as being on a line, all arcs are directed from left to right. (For now disregard the numbers shown above the nodes.)
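The left-to-right property can be checked mechanically. The sketch below is an illustration only (the method name topological_order? is made up here): it verifies that every arc goes from an earlier position to a later one. That is enough, because any path is a chain of arcs, so no path can then lead from a later node back to an earlier one.

```ruby
# Returns true when every arc u->v in graph has u positioned before v
# in order. graph maps each node to the array of its successors.
def topological_order?(graph, order)
  position = order.each_with_index.to_h # node => index in order
  graph.all? do |node, successors|
    successors.all? { |succ| position[node] < position[succ] }
  end
end

graph = { :A=>[:C, :D], :B=>[:D], :C=>[:D, :E, :F], :D=>[:G], :E=>[:F, :H],
          :F=>[:D, :G, :I, :H], :G=>[:H, :I], :H=>[], :I=>[] }

puts topological_order?(graph, [:B, :A, :C, :E, :F, :D, :G, :I, :H]) # true
puts topological_order?(graph, graph.keys)                           # false, :D precedes :F
```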
My implementation of Kahn's algorithm is as follows.
nodes = graph.keys
#=> [:A, :B, :C, :D, :E, :F, :G, :H, :I]
incoming = graph.each_with_object(nodes.map { |n| [n, []] }.to_h) do |(k,v),h|
v.each { |n| h[n] << k }
end
#=> {:A=>[], :B=>[], :C=>[:A], :D=>[:A, :B, :C, :F], :E=>[:C], :F=>[:C, :E],
# :G=>[:D, :F], :H=>[:E, :F, :G], :I=>[:F, :G]}
incoming[:H] #=> [:E, :F, :G], for example, shows that the arcs directed into node :H are :E->:H, :F->:H and :G->:H.
no_incoming_nodes = incoming.select { |k,v| v.empty? }.keys
#=> [:A, :B]
sorted_nodes = []
until no_incoming_nodes.empty?
n = no_incoming_nodes.pop
sorted_nodes << n
graph[n].each do |next_node|
incoming[next_node].delete(n)
no_incoming_nodes << next_node if incoming[next_node].empty?
end
end
sorted_nodes
#=> [:B, :A, :C, :E, :F, :D, :G, :I, :H]
The second step is to implement a dynamic-programming algorithm to count the number of paths from node :A to node :H. I will explain how it works by explaining the meaning of the numbers above each node in the diagram immediately above.
The number of paths from node I (the element of sorted_nodes immediately before the terminus) to the terminus is 0 (the number above I) because I is not the terminus and has no outgoing arcs.
Going back one node in sorted_nodes, the number of paths from G to the terminus is 1 as it is followed by the terminus (1) and node I, which has 0 paths to the terminus.
The number of paths from node D to the terminus is 1 because D is followed by only one node, G, and G has 1 path to the terminus.
Node F has 3 paths to the terminus, 1 that goes directly to the terminus, 1 that passes through D, 0 that pass through I and 1 that passes through G.
Similarly, there are 4 paths from node E to the terminus, 8 paths from node C to the terminus and, our objective, 9 paths from A, the origin, to the terminus.
The computation can be implemented as follows (using sorted_nodes computed earlier).
origin = :A
terminus = :H
tot_to = graph.each_with_object({}) do |(k,v),h|
(h[k] = k == terminus ? 1 : 0) if v.empty?
end
#=> {:H=>1, :I=>0}
(sorted_nodes.index(terminus) - 1).downto(sorted_nodes.index(origin)).each do |i|
n = sorted_nodes[i]
tot_to[n] = graph[n].sum { |m| tot_to[m] }
end
tot_to[origin]
#=> 9
Lastly, I would like to mention that the dynamic programming algorithm could have been organised differently, with roughly equal computational efficiency. Rather than beginning at the terminus and working backward, we could have started at the origin and worked forward until the terminus is reached, at each node computing the number of paths from A to the given node.
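As a rough sketch of that forward variant (the method name count_paths_forward and the paths_to hash are illustrative names, not part of the code above), each node pushes its own path count onto its successors as the sorted nodes are scanned from left to right:

```ruby
# Counts paths from origin to terminus in a directed acyclic graph.
# sorted_nodes must be a topological ordering of graph's nodes.
def count_paths_forward(graph, sorted_nodes, origin, terminus)
  paths_to = Hash.new(0) # number of paths from origin to each node
  paths_to[origin] = 1   # the empty path from origin to itself
  sorted_nodes.each do |node|
    break if node == terminus # everything feeding the terminus is done
    graph[node].each { |succ| paths_to[succ] += paths_to[node] }
  end
  paths_to[terminus]
end

graph = { :A=>[:C, :D], :B=>[:D], :C=>[:D, :E, :F], :D=>[:G], :E=>[:F, :H],
          :F=>[:D, :G, :I, :H], :G=>[:H, :I], :H=>[], :I=>[] }
sorted_nodes = [:B, :A, :C, :E, :F, :D, :G, :I, :H]

puts count_paths_forward(graph, sorted_nodes, :A, :H) # 9
```

Nodes that precede the origin in the ordering contribute 0, so they do not need to be skipped explicitly.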
You don't appear to be recording the output of the recursion.
Can you try something like:
count += paths_from(vertex, vertex_b, count, visited) unless visited[@vertices.index(vertex)]
Or avoid passing in a count at all:
class Graph
  attr_accessor :net, :vertices, :visited
  def initialize(vertices = [])
    @net = Array.new(vertices.length) { [] }
    @visited = Array.new(vertices.length, false)
    @vertices = vertices
  end
  def add_edge(vertex_a, vertex_b)
    return if @net[vertex_a].nil? || @vertices[vertex_b].nil?
    @net[vertex_a] << @vertices[vertex_b]
  end
  def paths_from(vertex_a, vertex_b)
    return 0 if @vertices.none?(vertex_b) || @vertices.none?(vertex_a)
    count = 0
    count += 1 if vertex_a == vertex_b
    @visited[@vertices.index(vertex_a)] = true
    @net[@vertices.index(vertex_a)].each do |vertex|
      count += paths_from(vertex, vertex_b) unless @visited[@vertices.index(vertex)]
    end
    count
  end
end
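On the backtracking question (Doubt #1): backtracking here means un-marking a vertex as visited once every path through it has been explored, so that a different route may pass through it again. Without that step, a vertex reached on the first route stays blocked for all later routes and some paths are missed. The following is only a sketch under assumptions: the class name BacktrackingGraph and the method name count_paths are made up, while the constructor and add_edge mirror the question's class.

```ruby
# Hypothetical variant of the question's Graph class that counts simple
# paths (no vertex repeated) using depth-first search with backtracking.
class BacktrackingGraph
  def initialize(vertices = [])
    @vertices = vertices
    @net = Array.new(vertices.length) { [] }
  end

  # Adds a directed edge between two vertex indices, as in the question.
  def add_edge(from_index, to_index)
    return if @net[from_index].nil? || @vertices[to_index].nil?
    @net[from_index] << @vertices[to_index]
  end

  def count_paths(source, target, visited = {})
    return 0 unless @vertices.include?(source) && @vertices.include?(target)
    return 1 if source == target # reached the target: one complete path

    visited[source] = true
    total = @net[@vertices.index(source)].sum do |neighbor|
      visited[neighbor] ? 0 : count_paths(neighbor, target, visited)
    end
    visited.delete(source) # backtrack: free this vertex for other routes
    total
  end
end

graph = BacktrackingGraph.new(%w[a b c d e])
[[0, 1], [0, 2], [0, 4], [1, 2], [1, 4], [2, 3], [3, 1]].each do |from, to|
  graph.add_edge(from, to)
end

puts graph.count_paths('a', 'b') # 2 (a-b and a-c-d-b)
puts graph.count_paths('a', 'f') # 0 ('f' is not a vertex)
```

The count is returned rather than threaded through as an argument, which also addresses Doubt #2: integers are passed by value, so incrementing count inside a recursive call never changes the caller's copy.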

In Ruby, can you store a pointer to a node in a hash for later replacement?

I have to assemble some massive JSON payloads, and I want to avoid massively duplicated nested leaves. What I'd like to do is something like this:
tree = {}
tree[0] = {}
tree[0][1] = "stub"
# now save this pointer for later
stub = &tree[0][1]
...
# now go get the leaf
leaf = {0 => ["a","b","c"]}
# now without having to search the entire tree, just use the old stub pointer
stub = leaf
Can this be done in Ruby?
Thanks for any help,
kevin
We don't have pointers (at least not at the Ruby level) but we do have references. So you could use a real hash instead of your "stub" string:
tree = {}
tree[0] = {}
tree[0][1] = {}
then stash that reference in leaf:
leaf = tree[0][1]
and modify the content of leaf without assigning anything new to leaf:
leaf[0] = %w[a b c]
That would leave you with tree[0][1] being {0 => ['a', 'b', 'c']} as desired. Of course, if you say leaf = {0 => %w[a b c]} then you'll have a new reference and you'll break the connection with tree[0][1].
Usually this goes in the other direction. When you need a new leaf, you create it:
leaf = {0 => %w[a b c]}
and then you put that leaf in the tree:
tree[0][1] = leaf
tree[0][6] = leaf # Possibly in multiple places
Then you could say leaf[11] = %w[x y z] and tree[0][1][11] and tree[0][6][11] would also be ['x', 'y', 'z'] because leaf, tree[0][1], and tree[0][6] would all refer to the same underlying hash.
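The difference between mutating the shared hash and rebinding the variable can be seen in a short, self-contained run. This is just an illustration of the points above, using equal? to test object identity:

```ruby
tree = { 0 => {} }
leaf = { 0 => %w[a b c] }
tree[0][1] = leaf
tree[0][6] = leaf # the same hash object, stored in two places

# Mutating the shared hash is visible through every reference to it.
leaf[11] = %w[x y z]
puts tree[0][1].equal?(tree[0][6]) # true: one underlying object
puts tree[0][6][11].inspect        # ["x", "y", "z"]

# Rebinding the local variable creates a new hash and leaves the tree alone.
leaf = { 0 => %w[d e f] }
puts tree[0][1][0].inspect         # ["a", "b", "c"]
puts leaf.equal?(tree[0][1])       # false
```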

How to reference a part of a 3D array?

So say I have a 3D array and I push values like so:
@h = [[[]]]
@h.push(["cat", "mammal", 100.0])
@h.push(["lizard", "reptile", 300.0])
@h.push(["dog", "mammal", 200.0])
How would I find the entry whose 3rd element (the float) is the largest, and then reference the individual elements of that entry, to output in this example just the value "reptile"?
I've tried:
@h.each do |(x,y,z)|
  if ( [x,y,z] == @h.max_by{|(x,y,z)| z} )
    puts y
  end
end
But it doesn't give me only the "y" value.
Thanks!
@h = [] # not [[[]]]
@h.push(["cat", "mammal", 100.0])
@h.push(["lizard", "reptile", 300.0])
@h.push(["dog", "mammal", 200.0])
max_ar = @h.max_by{|ar| ar[2]}
p max_ar[1] #=> "reptile"
# Your way is less usual. It works fine, but you're working too hard:
max_ar = @h.max_by{|(x,y,z)| z}
p max_ar[1] #=> "reptile"

Ruby: How to get the indices of the first x array elements that match a condition?

I have two huge arrays of sentences, one in German and one in English. I will search through the German sentences for sentences that contain a certain word and if they do, I will check if there is an equivalent English sentence (using a hash with connection information). However, if the user is looking for a very common word, I don't want to return every single sentence that contains it but only the first x matches and stop searching then.
If I do german_sentences.index { |sentence| sentence.include?(word) } I get only one match at a time.
If I use german_sentences.keep_if { |sentence| sentence.include?(word) } I get all matches, but also lose the index information, which is really critical for this.
I am now using a custom loop with each_with_index and break once the maximum has been reached, but I really feel that I must be missing some existing solution, at least something that gives a limited number of matches (even if not their indices)...
german_sentences
.each_index
.lazy
.select{|i| german_sentences[i].include?(word)}
.first(n)
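The lazy chain above can be wrapped in a small helper to try it out. The method name first_matching_indices and the sample sentences are made up for this sketch. Because the enumerator is lazy, scanning stops as soon as the requested number of matches has been collected.

```ruby
# Returns the indices of the first `limit` sentences that contain `word`.
def first_matching_indices(sentences, word, limit)
  sentences
    .each_index
    .lazy
    .select { |i| sentences[i].include?(word) }
    .first(limit)
end

german_sentences = ["Der Hund rennt", "Die Katze", "Der Hund bellt", "Ein Hund"]

p first_matching_indices(german_sentences, "Hund", 2)  # [0, 2]
p first_matching_indices(german_sentences, "Hund", 99) # [0, 2, 3]
```

The returned indices can then be used to look up the matching English sentences.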
If your need is not a one-off, you could use Module#refine rather than monkeypatching Array. refine was added experimentally in v2.0, then changed considerably in v2.1. One of the restrictions on the use of refine is: "You may only activate refinements at top-level...", which evidently prevents testing in Pry and IRB.
module M
refine Array do
def select_indices_first(n)
i = 0
k = 0
a = []
return a if n == 0
each { |x| (a << i; k += 1) if yield(x); break if k == n; i += 1 }
a
end
def select_first(n) # if you wanted this also...
k = 0
a = []
return a if n == 0
each { |x| (a << x; k += 1) if yield(x); break if k == n }
a
end
end
end
using M
sentences = ["How now brown", "Cat", "How to guide", "How to shop"]
sentences.select_indices_first(0) {|s| s.include?("How")} # => []
sentences.select_indices_first(1) {|s| s.include?("How")} # => [0]
sentences.select_indices_first(2) {|s| s.include?("How")} # => [0, 2]
sentences.select_indices_first(3) {|s| s.include?("How")} # => [0, 2, 3]
sentences.select_indices_first(99) {|s| s.include?("How")} # => [0, 2, 3]
sentences.select_first(2) {|s| s.include?("How")}
# => ["How now brown", "How to guide"]

Measure the distance between two strings with Ruby?

Can I measure the distance between two strings with Ruby?
I.e.:
compare('Test', 'est') # Returns 1
compare('Test', 'Tes') # Returns 1
compare('Test', 'Tast') # Returns 1
compare('Test', 'Taste') # Returns 2
compare('Test', 'tazT') # Returns 5
Much easier, and fast thanks to the native C binding:
gem install levenshtein-ffi
gem install levenshtein
require 'levenshtein'
Levenshtein.normalized_distance string1, string2, threshold
http://rubygems.org/gems/levenshtein
http://rubydoc.info/gems/levenshtein/0.2.2/frames
I found this for you:
def levenshtein_distance(s, t)
m = s.length
n = t.length
return m if n == 0
return n if m == 0
d = Array.new(m+1) {Array.new(n+1)}
(0..m).each {|i| d[i][0] = i}
(0..n).each {|j| d[0][j] = j}
(1..n).each do |j|
(1..m).each do |i|
d[i][j] = if s[i-1] == t[j-1] # adjust index into string
d[i-1][j-1] # no operation required
else
[ d[i-1][j]+1, # deletion
d[i][j-1]+1, # insertion
d[i-1][j-1]+1, # substitution
].min
end
end
end
d[m][n]
end
[ ['fire','water'], ['amazing','horse'], ["bamerindos", "giromba"] ].each do |s,t|
puts "levenshtein_distance('#{s}', '#{t}') = #{levenshtein_distance(s, t)}"
end
That's awesome output: =)
levenshtein_distance('fire', 'water') = 4
levenshtein_distance('amazing', 'horse') = 7
levenshtein_distance('bamerindos', 'giromba') = 9
Source: http://rosettacode.org/wiki/Levenshtein_distance#Ruby
There is a utility method in RubyGems that really should be public but isn't. Anyway:
require "rubygems/text"
ld = Class.new.extend(Gem::Text).method(:levenshtein_distance)
p ld.call("asd", "sdf") # => 2
Much simpler, I'm a Ruby show-off at times...
# Levenshtein distance, translated from wikipedia pseudocode by ross
def lev s, t
return t.size if s.empty?
return s.size if t.empty?
return [ (lev s.chop, t) + 1,
(lev s, t.chop) + 1,
(lev s.chop, t.chop) + (s[-1, 1] == t[-1, 1] ? 0 : 1)
].min
end
Ruby 2.3 and later ship with the did_you_mean gem which includes DidYouMean::Levenshtein.distance. Fit for most cases and available by default.
DidYouMean::Levenshtein.distance("Test", "est") # => 1
I made a damerau-levenshtein gem where algorithms are implemented in C
require "damerau-levenshtein"
dl = DamerauLevenshtein
dl.distance("Something", "Smoething") #returns 1
I like DigitalRoss' solution above. However, as pointed out by dawg, its runtime grows on the order O(3^n), which is no good for longer strings.
That solution can be sped up significantly using memoization, or 'dynamic programming':
def lev(string1, string2, memo={})
return memo[[string1, string2]] if memo[[string1, string2]]
return string2.size if string1.empty?
return string1.size if string2.empty?
min = [ lev(string1.chop, string2, memo) + 1,
lev(string1, string2.chop, memo) + 1,
lev(string1.chop, string2.chop, memo) + (string1[-1] == string2[-1] ? 0 : 1)
].min
memo[[string1, string2]] = min
min
end
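We then have a much better runtime. With memoization, each distinct pair of prefixes is solved at most once, so for strings of lengths m and n the number of subproblems is bounded by (m+1)(n+1) instead of growing exponentially.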
[9] pry(main)> require 'benchmark'
=> true
[10] pry(main)> @memo = {}
=> {}
[11] pry(main)> Benchmark.realtime{puts lev("Hello darkness my old friend", "I've come to talk with you again")}
26
=> 0.007071999832987785
