Which data structure to use to implement family tree in ruby? - ruby

I am trying to create a simple family tree in ruby where I could add children through the mother nodes. Also when I give a name and a relation as input I should be able to get the output as names of people related to the given person name.
For example, I should be able to do operations like
add_child('Tina', 'bob') // which will add bob as a child node to Tina
get_relation(bob, maternal_uncles) // which should output all the siblings of Tina in this case.
Which data structure is best suited to this, and how would I implement it in Ruby? From my research a graph seems like a good approach, and I have been looking into implementations for two days without finding a solution.
I tried the following libraries
RubyTree https://github.com/evolve75/RubyTree - This helped me to get parents, siblings, grandparents relations but I could not think of how I can use this to get relations like father's brothers(paternal uncle), wife's sisters(sister in law) etc
weighted graph https://github.com/msayson/weighted_graph - I used 0 to represent spouse and 1 to represent children. I could not go anywhere from here. I got confused on how to even get parents and children of a given person.
I explored a little bit about ruby prefix trees and rgl gem but I could not apply them to my application.
Please help. Thanks in advance!

I figured out a way to get the basic relations using RubyTree itself. RubyTree has built-in methods such as parent, siblings and children, and each node can also carry content.
So I used these to get what I want. For example, to create a spouse I added a child node to the root node and passed a hash like {relation: spouse} as the content. This way I can apply logic by putting conditions on this hash and get the relations that I want.
example:
tina = Tree::TreeNode.new('Tina', {gender: 'female', relation: 'root'})
mike = tina << Tree::TreeNode.new('Mike', {gender: 'male', relation: 'spouse'})
sofi = tina << Tree::TreeNode.new('sofi', {gender: 'female', relation: 'child'})
...... #add all children and their children like this
puts "--------siblings of sofi--------"
sofi.siblings.each do |sib|
  # spouses hang off the same parent node, so keep only real children
  if sib.content[:relation] == 'child'
    puts sib.name
  end
end
# assume tina has 4 sons and one of them is bob and alice is daughter of bob.
puts "--------alice paternal uncles--------"
puts alice.parent.name
puts alice.parent.content
if alice.parent.content[:relation] == 'spouse'
  # as per the question a child is added through the mother only, so alice is
  # a child of bob's wife, and bob's wife is a child of bob with {relation: spouse}
  father = alice.parent.parent
  uncles = father.siblings
  uncles.each do |uncle|
    puts uncle.name if uncle.content[:gender] == 'male' && uncle.content[:relation] == 'child'
  end
end
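For comparison, here is a minimal sketch that skips the tree library and stores the family as a small graph of linked person objects, which makes relations through a spouse or a father easier to reach. The Person and FamilyTree classes, the marry method and the relation symbols are all hypothetical names made up for this illustration, not RubyTree's API, and the sketch assumes every child is added through a mother whose spouse (if any) is the father.

```ruby
# Hypothetical sketch: a family stored as linked Person objects (not RubyTree).
class Person
  attr_reader :name, :gender, :children
  attr_accessor :mother, :father, :spouse

  def initialize(name, gender)
    @name = name
    @gender = gender
    @children = []
  end

  # Everyone else born to the same mother.
  def siblings
    mother ? mother.children - [self] : []
  end
end

class FamilyTree
  def initialize
    @people = {}
  end

  def add_person(name, gender)
    @people[name] = Person.new(name, gender)
  end

  def marry(wife_name, husband_name)
    wife = @people.fetch(wife_name)
    husband = @people.fetch(husband_name)
    wife.spouse = husband
    husband.spouse = wife
  end

  # Children are added through the mother only; the father is her spouse, if any.
  def add_child(mother_name, child_name, gender)
    mother = @people.fetch(mother_name)
    child = add_person(child_name, gender)
    child.mother = mother
    child.father = mother.spouse
    mother.children << child
    child
  end

  def get_relation(name, relation)
    person = @people.fetch(name)
    related =
      case relation
      when :siblings        then person.siblings
      when :maternal_uncles then siblings_of(person.mother, :male)
      when :paternal_uncles then siblings_of(person.father, :male)
      when :sisters_in_law  then siblings_of(person.spouse, :female) # spouse's sisters only
      else raise ArgumentError, "unknown relation: #{relation}"
      end
    related.map(&:name)
  end

  private

  def siblings_of(person, gender)
    return [] unless person
    person.siblings.select { |sibling| sibling.gender == gender }
  end
end

family = FamilyTree.new
family.add_person('Grandma', :female)
family.add_child('Grandma', 'Tina', :female)
family.add_child('Grandma', 'Tom', :male)
family.add_child('Grandma', 'Ann', :female)
family.add_person('Mike', :male)
family.marry('Tina', 'Mike')
family.add_child('Tina', 'Bob', :male)
p family.get_relation('Bob', :maternal_uncles)  # ["Tom"]
p family.get_relation('Mike', :sisters_in_law)  # ["Ann"]
```

Each new relation is then one more `when` branch that walks the mother, father or spouse links, which is the part that was awkward to express with a plain tree.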

Related

What does this Ruby code do? unvisited_cities.min do |city|

I am working through a book on algorithms and it contains a description of Dijkstra's algorithm with a code sample in Ruby. The code solves a hypothetical problem of finding the cheapest flight path from city A to city B. You cannot necessarily fly directly from A to B, and so there may be multiple flights required to complete your trip.
A City object has a #name that is a String and #routes that is a hash with a City object as the key and ticket price as the value.
For example, the city of Atlanta is defined by
atlanta = City.new("Atlanta")
atlanta.add_route(boston, 100)
atlanta.add_route(denver, 160)
# where the add_route method does
@routes[city] = price
Further, unvisited_cities is an array of City objects, and cheapest_prices_table is a hash with a city #name as the key and ticket price as the value.
My question is, what does the following code, which is part of the Dijkstra algorithm code sample, do? I am unfamiliar with the syntax unvisited_cities.min do |city|. I placed a puts statement just above the code in question (p unvisited_cities.min), and I received an error: ArgumentError (comparison of City with City failed). I could not find a description of this syntax anywhere to understand exactly what the lines below do. That is, how does the code below set current_city to be the least expensive city to fly to in unvisited_cities?
# We visit our next unvisited city. We choose the one that is cheapest
# to get to from the STARTING city:
current_city = unvisited_cities.min do |city|
cheapest_prices_table[city.name]
end
Thank you!
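For what it's worth, here is a small sketch of how the two relevant Enumerable methods behave. Without a block, min compares the elements themselves with <=>, which City does not define, hence the ArgumentError. With a block, min expects a two-argument comparator, so a one-argument block that returns a price has its return value read as a comparison result rather than as a sort key. The method that takes a one-argument "key" block is min_by, which looks like what the snippet intends. The City struct and both helper method names below are stand-ins invented for this example, not the book's code.

```ruby
City = Struct.new(:name)  # hypothetical stand-in for the book's City class

# min_by evaluates the block once per city and returns the city whose
# block value (here, the cheapest known price) is smallest.
def cheapest_unvisited(unvisited_cities, cheapest_prices_table)
  unvisited_cities.min_by { |city| cheapest_prices_table[city.name] }
end

# The same choice written with min and an explicit two-argument comparator.
def cheapest_unvisited_with_min(unvisited_cities, cheapest_prices_table)
  unvisited_cities.min do |a, b|
    cheapest_prices_table[a.name] <=> cheapest_prices_table[b.name]
  end
end

cities = [City.new("Boston"), City.new("Chicago"), City.new("Denver")]
prices = { "Boston" => 100, "Chicago" => 40, "Denver" => 160 }
puts cheapest_unvisited(cities, prices).name           # Chicago
puts cheapest_unvisited_with_min(cities, prices).name  # Chicago
```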

matching array items in rails

I have two arrays and I want to count how many individual items they have in common.
For example arrays with:
1 -- House, Dog, Cat, Car
2 -- Cat, Book, Box, Car
Would return 2.
Any ideas? Thanks!
EDIT/
Basically I have two forms (for two different types of users) that uses nested attributes to store the number of skills they have. I can print out the skills via
current_user.skills.each do |skill| skill.name
other_user.skills.each do |skill| skill.name
When I print out the array, I get: #<Skill:0x1037e4948>#<Skill:0x1037e2800>#<Skill:0x1037e21e8>#<Skill:0x1037e1090>#<Skill:0x1037e0848>
So, yes, I want to compare the two users skills and return the number that match. Thanks for your help.
This works:
a = %w{house dog cat car}
b = %w{cat book box car}
(a & b).size
Documentation: http://www.ruby-doc.org/core/classes/Array.html#M000274
To convert classes to an array using the name, try something like:
class X
def name
"name"
end
end
a = [X.new]
b = [X.new]
(a.map{|x| x.name} & b.map{|x| x.name}).size
In your example, a is current_user.skills and b is other_user.skills. x is simply a reference to the current element of the array as map loops through it. The method is documented at the link I provided.

What would you use for `n to n` relations in python?

After fiddling around with dictionaries, I came to the conclusion that I need a data structure that allows an n-to-n lookup. One example would be: a course can be attended by several students, and each student can attend several courses.
What would be the most pythonic way to achieve this? It won't be more than 500 students and 100 courses, to stay with the example, so I would like to avoid using real database software.
Thanks!
Since your working set is small, I don't think it is a problem to just store the student IDs as lists in the Course class. Finding students in a class would be as simple as doing
course.studentIDs
To find courses a student is in, just iterate over the courses and find the ID:
studentIDToGet = "johnsmith001"
studentsCourses = list()
for course in courses:
    if studentIDToGet in course.studentIDs:
        studentsCourses.append(course.id)
There are other ways you could do it. You could have a dictionary of studentIDs mapped to courseIDs, or two dictionaries (one mapping studentIDs to courseIDs and another mapping courseIDs to studentIDs) that are kept in sync whenever either is updated.
The implementation I wrote out the code for would probably be the slowest, which is why I mentioned that your working set is small enough that it would not be a problem. The other implementations I mentioned but did not show would need more code to make them work, which just isn't worth the effort here.
It depends completely on what operations you want the structure to be able to carry out quickly.
If you want to be able to quickly look up properties related to both a course and a student, for example how many hours a student has spent on studies for a specific course, or what grade the student has in the course if he has finished it, and if he has finished it etc. a vector containing n*m elements is probably what you need, where n is the number of students and m is the number of courses.
If, on the other hand, the average number of courses a student has taken is much smaller than the total number of courses (which it probably is in a real-world scenario), and you want to be able to quickly look up all the courses a student has taken, you probably want an array of n lists, implemented as linked lists, resizable vectors or similar, depending on what you want to be able to do with the lists: quickly remove elements in the middle, or quickly access an element at a random location. If you want both quick removal from the middle and quick random access, then some kind of tree structure may be the most suitable.
Most tree data structures carry out all basic operations in time logarithmic in the number of elements in the tree. Beware that some tree data structures have a worst-case time for these operations that is linear in the number of elements, even though the average time for a randomly constructed tree is logarithmic. A typical example is an unbalanced binary search tree built from elements inserted in increasing order. Don't do that. Either scramble the elements before building the tree, or use a divide-and-conquer method: split the sorted list into a left part, a pivot element and a right part, create the root from the pivot, then recursively build trees from the left and right parts and attach them to the root as its left and right children.
I'm sorry, I don't know Python, so I don't know which data structures are part of the language and which you have to create yourself.
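The divide-and-conquer construction described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the Node struct and the build_balanced and height helpers are names invented here, the input is assumed to be already sorted, and the middle element is used as the pivot.

```ruby
Node = Struct.new(:value, :left, :right)  # hypothetical tree node

# Build a balanced binary search tree from an already sorted array by using
# the middle element as the root and recursing on the two halves.
def build_balanced(sorted, lo = 0, hi = sorted.length - 1)
  return nil if lo > hi
  mid = (lo + hi) / 2
  Node.new(sorted[mid],
           build_balanced(sorted, lo, mid - 1),
           build_balanced(sorted, mid + 1, hi))
end

# Number of nodes on the longest path from this node down to a leaf.
def height(node)
  node.nil? ? 0 : 1 + [height(node.left), height(node.right)].max
end

root = build_balanced((1..15).to_a)
puts root.value    # 8
puts height(root)  # 4
```

Inserting the same 15 values one by one in increasing order into a naive binary search tree would instead give a chain of height 15, which is the degenerate case the answer warns about.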
I assume you want to index both the Students and Courses. Otherwise you can easily make a list of tuples to store all (Student, Course) combinations: [ (St1, Crs1), (St1, Crs2) .. (St2, Crs1) ... (Sti, Crsi) ... ] and then do a linear lookup every time you need to. For up to 500 students this isn't bad either.
However, if you'd like a quick lookup either way, there is no built-in data structure. You can simply use two dictionaries:
courses = { crs1: [ st1, st2, st3 ], crs2: [ st_i, st_j, st_k] ... }
students = { st1: [ crs1, crs2, crs3 ], st2: [ crs_i, crs_j, crs_k] ... }
For a given student s, looking up courses is now students[s]; and for a given course c, looking up students is courses[c].
For something simple like what you want to do, you could create a simple class with data members and methods to maintain them and keep them consistent with each other. For this problem two dictionaries would be needed. One keyed by student name (or id) that keeps track of the courses each is taking, and another that keeps track of which students are in each class.
defaultdicts from the 'collections' module could be used instead of plain dicts to make things more convenient. Here's what I mean:
from collections import defaultdict
class Enrollment(object):
    def __init__(self):
        self.students = defaultdict(set)
        self.courses = defaultdict(set)
    def clear(self):
        self.students.clear()
        self.courses.clear()
    def enroll(self, student, course):
        if student not in self.courses[course]:
            self.students[student].add(course)
            self.courses[course].add(student)
    def drop(self, course, student):
        if student in self.courses[course]:
            self.students[student].remove(course)
            self.courses[course].remove(student)
            # remove student if they are not taking any other courses
            if len(self.students[student]) == 0:
                del self.students[student]
    def display_course_enrollments(self):
        print("Class Enrollments:")
        for course in self.courses:
            print(' course:', course, end=' ')
            print(' ', [student for student in self.courses[course]])
    def display_student_enrollments(self):
        print("Student Enrollments:")
        for student in self.students:
            print(' student', student, end=' ')
            print(' ', [course for course in self.students[student]])
if __name__=='__main__':
    school = Enrollment()
    school.enroll('john smith', 'biology 101')
    school.enroll('mary brown', 'biology 101')
    school.enroll('bob jones', 'calculus 202')
    school.display_course_enrollments()
    print()
    school.display_student_enrollments()
    # note the argument order: drop takes (course, student)
    school.drop('biology 101', 'mary brown')
    print()
    print('After mary brown drops biology 101:')
    print()
    school.display_course_enrollments()
    print()
    school.display_student_enrollments()
Which when run produces the following output:
Class Enrollments:
course: calculus 202 ['bob jones']
course: biology 101 ['mary brown', 'john smith']
Student Enrollments:
student bob jones ['calculus 202']
student mary brown ['biology 101']
student john smith ['biology 101']
After mary brown drops biology 101:
Class Enrollments:
course: calculus 202 ['bob jones']
course: biology 101 ['john smith']
Student Enrollments:
student bob jones ['calculus 202']
student john smith ['biology 101']

Algorithm for converting hierarchical flat data (w/ ParentID) into sorted flat list w/ indentation levels

I have the following structure:
MyClass {
guid ID
guid ParentID
string Name
}
I'd like to create an array which contains the elements in the order they should be displayed in a hierarchy (e.g. according to their "left" values), as well as a hash which maps the guid to the indentation level.
For example:
ID  Name      ParentID
------------------------
1   Cats      2
2   Animal    NULL
3   Tiger     1
4   Book      NULL
5   Airplane  NULL
This would essentially produce the following objects:
// Array is an array of all the elements sorted by the way you would see them in a fully expanded tree
Array[0] = "Airplane"
Array[1] = "Animal"
Array[2] = "Cats"
Array[3] = "Tiger"
Array[4] = "Book"
// IndentationLevel is a hash of GUIDs to IndentationLevels.
IndentationLevel["1"] = 1
IndentationLevel["2"] = 0
IndentationLevel["3"] = 2
IndentationLevel["4"] = 0
IndentationLevel["5"] = 0
For clarity, this is what the hierarchy looks like:
Airplane
Animal
  Cats
    Tiger
Book
I'd like to iterate through the items the least amount of times possible. I also don't want to create a hierarchical data structure. I'd prefer to use arrays, hashes, stacks, or queues.
The two objectives are:
Store a hash of the ID to the indentation level.
Sort the list that holds all the objects according to their left values.
When I get the list of elements, they are in no particular order. Siblings should be ordered by their Name property.
Update: This may seem like I haven't tried coming up with a solution myself and simply want others to do the work for me. However, I have tried coming up with three different solutions, and I've gotten stuck on each. One reason might be that I've tried to avoid recursion (maybe wrongly so). I'm not posting the partial solutions I have so far since they are incorrect and may badly influence the solutions of others.
I needed a similar algorithm to sort tasks with dependencies (each task could have a parent task that needed to be done first). I found topological sort. Here is an iterative implementation in Python with very detailed comments.
The indent level may be calculated while doing the topological sort. Simply set a node's indent level to its parent node's indent level + 1 as it is added to the topological ordering.
Note there can exist many valid topological orderings. To ensure the resulting topological order groups parent nodes with child nodes, select a topological sort algorithm based on depth-first traversal of the graph produced by the partial ordering information.
Wikipedia gives two more algorithms for topological sort. Note these algorithms aren't as good because the first one is breadth-first traversal, and the second one is recursive.
For hierarchical structures you almost certainly will need recursion (if you allow for arbitrary depth). I quickly hacked up some ruby code to illustrate how you might achieve this (though I haven't done the indentation):
# setup the data structure
class S < Struct.new(:id, :name, :parent_id); end
class HierarchySorter
  def initialize(il)
    @initial_list = il
    first_level = @initial_list.select{|a| a.parent_id == nil}.sort_by{|a| a.name }
    @final_array = subsort(first_level, 0)
  end
  # recursive function
  def subsort(list, indent_level)
    result = []
    list.each do |item|
      result << [item, indent_level]
      result += subsort(@initial_list.select{|a| a.parent_id == item.id}.sort_by{|a| a.name }, indent_level + 1)
    end
    result
  end
  def sorted_array
    @final_array.map(&:first)
  end
  def indent_hash
    # magick to transform array of structs into hash
    Hash[*@final_array.map{|a| [a.first.id, a.last]}.flatten]
  end
end
hs = HierarchySorter.new [S.new(1, "Cats", 2), S.new(2, "Animal", nil), S.new(3, "Tiger", 1), S.new(4, "Book", nil),
                          S.new(5, "Airplane", nil)]
puts "Array:"
puts hs.sorted_array.inspect
puts "\nIndentation hash:"
puts hs.indent_hash.inspect
If you don't speak ruby I can re-craft it in something else.
Edit: I updated the code above to output both data-structures.
Outputs:
Array:
[#<struct S id=5, name="Airplane", parent_id=nil>, #<struct S id=2, name="Animal", parent_id=nil>, #<struct S id=1, name="Cats", parent_id=2>, #<struct S id=3, name="Tiger", parent_id=1>, #<struct S id=4, name="Book", parent_id=nil>]
Indentation hash:
{5=>0, 1=>1, 2=>0, 3=>2, 4=>0}
Wonsungi's post helped a lot, however that is for a generic graph rather than a tree. So I modified it quite a bit to create an algorithm designed specifically for a tree:
// Data structures:
nodeChildren: Dictionary['nodeID'] = List<Children>;
indentLevel: Dictionary['nodeID'] = Integer;
roots: Array of nodes;
sorted: Array of nodes;
nodes: all nodes
// Step #1: Prepare the data structures for building the tree
for each node in nodes
    if node.parentID == NULL
        roots.Append(node);
        indentLevel[node.ID] = 0;
    else
        nodeChildren[node.parentID].Append(node);
// Step #2: Add elements to the sorted list
roots.SortByABC();
while roots.IsNotEmpty()
    root = roots.Remove(0);
    rootIndentLevel = indentLevel[root.ID];
    sorted.Append(root);
    children = nodeChildren[root.ID];
    children.SortByABC();
    for each child in children (loop backwards)
        indentLevel[child.ID] = rootIndentLevel + 1
        roots.Prepend(child)
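The pseudocode above translates fairly directly into a non-recursive Ruby sketch that uses only hashes, arrays and an explicit stack. The Item struct and the flatten_hierarchy name are assumptions made for this illustration, and siblings are ordered by name as the question asks.

```ruby
Item = Struct.new(:id, :name, :parent_id)  # hypothetical record type

# Returns [sorted_items, indent_by_id] without building a tree or recursing.
def flatten_hierarchy(items)
  # Step 1: group items by parent id (roots are grouped under nil).
  children = Hash.new { |hash, key| hash[key] = [] }
  items.each { |item| children[item.parent_id] << item }

  sorted = []
  indent = {}
  # Step 2: depth-first walk with an explicit stack of [item, level] pairs.
  # Items are pushed in reverse name order so the first name is popped first.
  stack = children[nil].sort_by(&:name).reverse.map { |item| [item, 0] }
  until stack.empty?
    item, level = stack.pop
    sorted << item
    indent[item.id] = level
    children[item.id].sort_by(&:name).reverse_each do |child|
      stack.push([child, level + 1])
    end
  end
  [sorted, indent]
end

items = [Item.new(1, "Cats", 2), Item.new(2, "Animal", nil), Item.new(3, "Tiger", 1),
         Item.new(4, "Book", nil), Item.new(5, "Airplane", nil)]
sorted, indent = flatten_hierarchy(items)
sorted.each { |item| puts "#{'  ' * indent[item.id]}#{item.name}" }
```

Each item is grouped once and pushed and popped once, so apart from the per-sibling sorting the list is only walked a constant number of times.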

Secret santa algorithm

Every Christmas we draw names for gift exchanges in my family. This usually involves multiple redraws until no one has pulled their spouse. So this year I coded up my own name drawing app that takes in a bunch of names, a bunch of disallowed pairings, and sends off an email to everyone with their chosen giftee.
Right now, the algorithm works like this (in pseudocode):
function DrawNames(list allPeople, map disallowedPairs) returns map
    matches = empty map
    // Make a list of potential candidates
    foreach person in allPeople
        person.potentialGiftees = copy of allPeople
        person.potentialGiftees.Remove(person)
        foreach pair in disallowedPairs
            if pair.first == person
                person.potentialGiftees.Remove(pair.second)
    // Loop through everyone and draw names
    while allPeople.count > 0
        currentPerson = allPeople.findPersonWithLeastPotentialGiftees
        giftee = pickRandomPersonFrom(currentPerson.potentialGiftees)
        matches[currentPerson] = giftee
        allPeople.Remove(currentPerson)
        foreach person in allPeople
            person.potentialGiftees.RemoveIfExists(giftee)
    return matches
Does anyone who knows more about graph theory know some kind of algorithm that would work better here? For my purposes, this works, but I'm curious.
EDIT: Since the emails went out a while ago, and I'm just hoping to learn something I'll rephrase this as a graph theory question. I'm not so interested in the special cases where the exclusions are all pairs (as in spouses not getting each other). I'm more interested in the cases where there are enough exclusions that finding any solution becomes the hard part. My algorithm above is just a simple greedy algorithm that I'm not sure would succeed in all cases.
Start with a complete directed graph and a list of vertex pairs. For each vertex pair, remove the edge from the first vertex to the second.
The goal is to get a graph where each vertex has one edge coming in, and one edge leaving.
Just make a graph with edges connecting two people if they are allowed to share gifts and then use a perfect matching algorithm. (Look for "Paths, Trees, and Flowers" for the (clever) algorithm)
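As a rough sketch of the matching idea: treat givers and receivers as the two sides of a bipartite graph and look for a perfect matching. Note that this swaps in the simpler augmenting-path method for bipartite graphs (often called Kuhn's algorithm) rather than the general-graph algorithm from "Paths, Trees, and Flowers". The method names and the shape of the disallowed hash (giver mapped to an array of forbidden receivers) are assumptions made for this example. Unlike the greedy draw, it returns nil only when no valid assignment exists at all, though the result is not guaranteed to be uniformly random.

```ruby
# Try to give `giver` a receiver, re-routing earlier givers if necessary.
def try_assign(giver, allowed, giver_of, seen)
  allowed[giver].each do |receiver|
    next if seen.include?(receiver)
    seen << receiver
    # Take a free receiver, or move that receiver's current giver elsewhere.
    if !giver_of.key?(receiver) || try_assign(giver_of[receiver], allowed, giver_of, seen)
      giver_of[receiver] = giver
      return true
    end
  end
  false
end

# people: array of names; disallowed: hash of giver => array of forbidden receivers.
# Returns a hash of giver => receiver, or nil if the exclusions make it impossible.
def draw_names(people, disallowed)
  allowed = people.map do |giver|
    [giver, (people - [giver] - disallowed.fetch(giver, [])).shuffle]
  end.to_h
  giver_of = {}  # receiver => giver
  people.shuffle.each do |giver|
    return nil unless try_assign(giver, allowed, giver_of, [])
  end
  giver_of.invert
end

spouses = { "Ann" => ["Bob"], "Bob" => ["Ann"], "Cat" => ["Dan"], "Dan" => ["Cat"] }
p draw_names(%w[Ann Bob Cat Dan], spouses)
```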
I was just doing this myself, in the end the algorithm I used doesn't exactly model drawing names out of a hat, but it's pretty damn close. Basically shuffle the list, and then pair each person with the next person in the list. The only difference with drawing names out of a hat is that you get one cycle instead of potentially getting mini subgroups of people who only exchange gifts with each other. If anything that might be a feature.
Implementation in Python:
import random
from collections import deque
def pairup(people):
    """ Given a list of people, assign each one a secret santa partner
    from the list and return the pairings as a dict. Implemented to always
    create a perfect cycle"""
    random.shuffle(people)
    partners = deque(people)
    partners.rotate()
    return dict(zip(people,partners))
I wouldn't use disallowed pairings, since that greatly increases the complexity of the problem. Just enter everyone's name and address into a list. Create a copy of the list and keep shuffling it until the addresses in each position of the two lists don't match. This will ensure that no one gets themselves, or their spouse.
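A minimal sketch of this shuffle-until-valid idea, assuming each person maps to a household key such as an address. The shuffle_until_valid name, the households hash and the rng keyword are invented for this example. Be aware that the loop never terminates if no valid arrangement exists, for instance when one household contains more than half of the participants.

```ruby
# people: array of names; households: hash of name => address (or any key).
# Reshuffles the receivers until nobody is paired with their own household,
# which also rules out drawing yourself.
def shuffle_until_valid(people, households, rng: Random.new)
  loop do
    receivers = people.shuffle(random: rng)
    pairs = people.zip(receivers)
    valid = pairs.all? { |giver, receiver| households[giver] != households[receiver] }
    return pairs.to_h if valid
  end
end

households = { "Ann" => "1 Elm St", "Bob" => "1 Elm St",
               "Cat" => "9 Oak Ave", "Dan" => "9 Oak Ave" }
p shuffle_until_valid(households.keys, households)
```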
As a bonus, if you want to do this secret-ballot-style, print envelopes from the first list and names from the second list. Don't peek while stuffing the envelopes. (Or you could just automate emailing everyone their pick.)
There are even more solutions to this problem on this thread.
Hmmm. I took a course in graph theory, but simpler is to just randomly permute your list, pair each consecutive group, then swap any element that is disallowed with another. Since there's no disallowed person in any given pair, the swap will always succeed if you don't allow swaps with the group selected. Your algorithm is too complex.
Create a graph where each edge represents "giftability". Vertices that represent spouses will NOT be adjacent. Select an edge at random (that is a gift assignment). Delete all edges coming from the gifter and all edges going to the receiver, and repeat.
There is a concept in graph theory called a Hamiltonian Circuit that describes the "goal" you describe. One tip for anybody who finds this is to tell users which "seed" was used to generate the graph. This way if you have to re-generate the graph you can. The "seed" is also useful if you have to add or remove a person. In that case simply choose a new "seed" and generate a new graph, making sure to tell participants which "seed" is the current/latest one.
I just created a web app that will do exactly this - http://www.secretsantaswap.com/
My algorithm allows for subgroups. It's not pretty, but it works.
Operates as follows:
1. assign a unique identifier to all participants, remember which subgroup they're in
2. duplicate and shuffle that list (the targets)
3. create an array of the number of participants in each subgroup
4. duplicate array from [3] for targets
5. create a new array to hold the final matches
6. iterate through participants assigning the first target that doesn't match any of the following criteria:
A. participant == target
B. participant.Subgroup == target.Subgroup
C. choosing the target will cause a subgroup to fail in the future (e.g. subgroup 1 must always have at least as many non-subgroup 1 targets remaining as subgroup 1 participants remaining)
D. participant(n+1) == target (n +1)
If we assign the target we also decrement the arrays from 3 and 4
So, not pretty (at all) but it works. Hope it helps,
Dan Carlson
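Here is a simple implementation in Java for the secret santa problem.
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
public class SecretSanta {
    public static void main(String[] args) {
        List<String> donor = new ArrayList<>();
        donor.add("Micha");
        donor.add("Christoph");
        donor.add("Benj");
        donor.add("Andi");
        donor.add("Test");
        // Receivers start out as a copy of the donor list.
        List<String> receiver = new ArrayList<>(donor);
        Collections.shuffle(donor);
        for (int i = 0; i < donor.size(); i++) {
            Collections.shuffle(receiver);
            int target = 0;
            // Skip the first candidate if the donor would draw themselves.
            // Caveat: if the last donor's own name is the only one left,
            // target points past the end of the list and get() throws.
            if (receiver.get(target).equals(donor.get(i))) {
                target++;
            }
            System.out.println(donor.get(i) + " => " + receiver.get(target));
            receiver.remove(target);
        }
    }
}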
Python solution here.
Given a sequence of (person, tags), where tags is itself a (possibly empty) sequence of strings, my algorithm suggests a chain of persons where each person gives a present to the next in the chain (the last person obviously is paired with the first one).
The tags exist so that the persons can be grouped and every time the next person is chosen from the group most dis-joined to the last person chosen. The initial person is chosen by an empty set of tags, so it will be picked from the longest group.
So, given an input sequence of:
example_sequence= [
("person1", ("male", "company1")),
("person2", ("female", "company2")),
("person3", ("male", "company1")),
("husband1", ("male", "company2", "marriage1")),
("wife1", ("female", "company1", "marriage1")),
("husband2", ("male", "company3", "marriage2")),
("wife2", ("female", "company2", "marriage2")),
]
a suggestion is:
['person1 [male,company1]',
'person2 [female,company2]',
'person3 [male,company1]',
'wife2 [female,marriage2,company2]',
'husband1 [male,marriage1,company2]',
'husband2 [male,marriage2,company3]',
'wife1 [female,marriage1,company1]']
Of course, if all persons have no tags (e.g. an empty tuple) then there is only one group to choose from.
There isn't always an optimal solution (think of an input sequence of 10 women and 2 men, with gender being the only tag given), but it does as good a job as it can.
Py2/3 compatible.
import random, collections
class Statistics(object):
    def __init__(self):
        self.tags = collections.defaultdict(int)
    def account(self, tags):
        for tag in tags:
            self.tags[tag] += 1
    def tags_value(self, tags):
        return sum(1./self.tags[tag] for tag in tags)
    def most_disjoined(self, tags, groups):
        return max(
            groups.items(),
            key=lambda kv: (
                -self.tags_value(kv[0] & tags),
                len(kv[1]),
                self.tags_value(tags - kv[0]) - self.tags_value(kv[0] - tags),
            )
        )
def secret_santa(people_and_their_tags):
    """Secret santa algorithm.
The lottery function expects a sequence of:
(name, tags)
For example:
[
("person1", ("male", "company1")),
("person2", ("female", "company2")),
("person3", ("male", "company1")),
("husband1", ("male", "company2", "marriage1")),
("wife1", ("female", "company1", "marriage1")),
("husband2", ("male", "company3", "marriage2")),
("wife2", ("female", "company2", "marriage2")),
]
husband1 is married to wife1 as seen by the common marriage1 tag
person1, person3 and wife1 work at the same company.
…
The algorithm will try to match people with the least common characteristics
between them, to maximize entrop— ehm, mingling!
Have fun."""
    # let's split the persons into groups
    groups = collections.defaultdict(list)
    stats = Statistics()
    for person, tags in people_and_their_tags:
        tags = frozenset(tag.lower() for tag in tags)
        stats.account(tags)
        person = "%s [%s]" % (person, ",".join(tags))
        groups[tags].append(person)
    # shuffle all lists
    for group in groups.values():
        random.shuffle(group)
    output_chain = []
    prev_tags = frozenset()
    while 1:
        next_tags, next_group = stats.most_disjoined(prev_tags, groups)
        output_chain.append(next_group.pop())
        if not next_group:  # it just got empty
            del groups[next_tags]
            if not groups: break
        prev_tags = next_tags
    return output_chain
if __name__ == "__main__":
    example_sequence = [
        ("person1", ("male", "company1")),
        ("person2", ("female", "company2")),
        ("person3", ("male", "company1")),
        ("husband1", ("male", "company2", "marriage1")),
        ("wife1", ("female", "company1", "marriage1")),
        ("husband2", ("male", "company3", "marriage2")),
        ("wife2", ("female", "company2", "marriage2")),
    ]
    print("suggested chain (each person gives present to next person)")
    import pprint
    pprint.pprint(secret_santa(example_sequence))
