Using a tree in python to get values - algorithm

I am trying to build a tree in Python that reads a text file containing repeated sentences, tallies how often each one occurs, and returns the sentences with the top 3 counts (explained in more detail below).
I first read up on how a tree is built on Wikipedia and looked at earlier examples on Stack Overflow, but this is as far as I have got with the code:
import fileinput
setPhrasesTree = 0
class Branch():
    def __init__(self, value):
        self.left = None
        self.right = None
        self.value = value
class Tree():
    def __init__(self):
        self.root = None
        self.found = False
    #lessThan function needed to compare strings
    def lessThan(self, a, b):
        if len(a) < len(b):
            loopCount = len(a)
        else:
            loopCount = len(b)
        for pos in range(0, loopCount):
            if a[pos] > b[pos]:
                return False
        return True
    def insert(self, value):
        self.root = self.insertAtBranch(self.root, value)
    def exists(self, value):
        #set the class variable found to False to assume it is not there
        self.found = False
        self.findAtBranch(self.root, value)
        return self.found
    #Used to find a value in a tree
    def findAtBranch(self, branch, value):
        if branch == None:
            pass
        else:
            if branch.value == value:
                self.found = True
            else:
                self.findAtBranch(branch.left, value)
                self.findAtBranch(branch.right, value)
    def insertAtBranch(self, branch, value):
        if branch == None:
            return Branch(value)
        else:
            if self.lessThan(branch.value, value):
                branch.right = self.insertAtBranch(branch.right, value)
            else:
                branch.left = self.insertAtBranch(branch.left, value)
        return branch
def loadTree(filename, treeType):
    if treeType == setPhrasesTree:
        for sentence in fileinput.input("setPhrases.txt"):
            print(sentence)
            setPhrases.insert(sentence[:-1])
def findSentenceType(sentence):
    if sentence.exists(sentence):
        return setPhrasesTree
Here is what the text file looks like. Bear in mind that it is purposely laid out like this, without a quantity next to each sentence (file name = setPhrases.txt):
Hi my name is Dave.
Thank-You.
What is your name?
I have done all my homework.
What time is dinner?
What is your name?
Thank-You.
Hi my name is Dave.
What is your name?
I have done all my homework.
What is your name?
Can you bring me a drink Please?
Can you bring me a drink Please?
What is your name?
Hi my name is Dave.
What is your name?
Can you bring me a drink Please?
Here is what I am trying to get my code to do. It should treat the first sentence in the file as the starting (root) node, then tally every later sentence that matches one already seen by incrementing a count stored with that sentence, using only the tree to do so. (I originally did this another way, but I need to use a tree for the tallying and everything else.)
I then want to return the top 3 phrases with the highest frequencies. In this case the system would return these sentences, in this order:
What is your name?
Hi my name is Dave.
Can you bring me a drink Please?
Any help is much appreciated. Also thank-you for your time.

Here you go, an implementation using a dictionary. Is this what you want?
import collections
def count_lines():
    d = collections.defaultdict(int)
    for line in open( "phrases.txt" ):
        d[ line.strip() ] += 1
    # we use the negative count as sort key, so the biggest ends up first
    a = sorted( d.items(), key=lambda x : -x[1] )
    for n, u in enumerate( a[:3] ):
        print( u[0], "# count=", u[1] )
count_lines()
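The dictionary version above does not use a tree, which the question asks for. As a rough sketch only, the same tally can be kept in a binary search tree keyed on the sentence text, where inserting a sentence that is already present bumps a counter on its node instead of creating a new one. The names CountNode, CountTree and top_phrases are made up for this illustration, and it assumes that ties should be broken by which sentence appeared first in the file, which matches the order the question expects.

```python
class CountNode:
    """One node of a binary search tree keyed on the phrase text."""

    def __init__(self, phrase, order):
        self.phrase = phrase
        self.count = 1
        self.order = order  # index of first appearance, used to break ties
        self.left = None
        self.right = None


class CountTree:
    def __init__(self):
        self.root = None
        self.size = 0  # number of distinct phrases stored so far

    def add(self, phrase):
        """Insert phrase, or increment its counter if it is already stored."""
        if self.root is None:
            self.root = CountNode(phrase, self.size)
            self.size += 1
            return
        node = self.root
        while True:
            if phrase == node.phrase:
                node.count += 1
                return
            side = "left" if phrase < node.phrase else "right"
            child = getattr(node, side)
            if child is None:
                setattr(node, side, CountNode(phrase, self.size))
                self.size += 1
                return
            node = child

    def nodes(self):
        """Yield every node using an iterative in-order walk."""
        stack, node = [], self.root
        while stack or node:
            while node:
                stack.append(node)
                node = node.left
            node = stack.pop()
            yield node
            node = node.right

    def top(self, n=3):
        """Return the n most frequent (phrase, count) pairs."""
        ranked = sorted(self.nodes(), key=lambda nd: (-nd.count, nd.order))
        return [(nd.phrase, nd.count) for nd in ranked[:n]]


def top_phrases(lines, n=3):
    tree = CountTree()
    for line in lines:
        phrase = line.strip()
        if phrase:  # skip blank lines
            tree.add(phrase)
    return tree.top(n)


sample = [
    "Hi my name is Dave.",
    "Thank-You.",
    "What is your name?",
    "What is your name?",
    "Hi my name is Dave.",
    "What is your name?",
]
for phrase, count in top_phrases(sample):
    print(phrase, "# count=", count)
```

To run it against the real file, pass the open file object instead of the sample list, for example top_phrases(open("setPhrases.txt")). The tree is not balanced, so for input that arrives already sorted it degrades to a linked list; for a file of this size that should not matter.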

Related

How to get my method to return its resulting values to a 3rd method

I'm not sure if the topic title is specific enough but here goes. I have two methods - one that iterates through some arrays along with conditionals in the block to push the correct data out.
Here is that code
def iterate_lines
  WIN_COMBINATIONS.each_with_index do |line,index|
    lines = @board[line[0]] + @board[line[1]] + @board[line[2]]
    if lines.include?("X") && !lines.include?("O")
      scores = tally_scores(lines.count("X"),"X",index)
    elsif lines.include?("O") && !lines.include?("X")
      scores = tally_scores(lines.count("O"),"O",index)
    elsif lines.include?("X") && lines.include?("O")
      scores = tally_scores(0,"",index)
    elsif !lines.include?("X") && !lines.include?("O")
      scores = tally_scores(0,"",index)
    end
    p scores
  end
end
The other method is the one calculating those scores according to my chosen heuristics.
def tally_scores(score,player,index)
  score = 1 if score == 1 && player == "X"
  score = -1 if score == 1 && player == "O"
  score = 10 if score == 2 && player == "X"
  score = -10 if score == 2 && player == "O"
  score = 100 if score == 3 && player == "X"
  score = -100 if score == 3 && player == "O"
  score
end
Calling 'iterate_lines' I can print the correct values either from 'tally_scores' or, as I have shown here, by assigning the result of each call to the variable 'scores' inside 'iterate_lines', which lets me print them from 'iterate_lines'.
Naturally the return value of 'iterate_lines' is the array (WIN_COMBINATIONS). Hard-coding a return scores would obviously give me just the last value.
My problem is that I have a 3rd method that needs what comes out of 'tally_scores', yet I can't pass it in as a regular argument, e.g. my_method(scores), because the 3rd method has its own list of arguments that it gets passed for other reasons. Plus it would be nil until the method was called.
def get_scores
  # other code
  #: something like this:
  score = iterate_lines
  # or
  score = tally_scores
  # or
  # ?
end
So I feel like maybe I backed myself into a corner and should trash what I have and restart. I will say I tried taking 'tally_scores' and putting the scores into an instance variable array. I found though when I passed it, all but the last value remained.
There are a couple of issues here. First of all, as you've seen, when you use each_with_index nothing that happens in the block has an effect outside it unless you rely on side effects. If you set a variable in that block, it will be reset on each iteration.
You can change it to map.with_index so that the result is an array of the results produced by the iterations.
Also it seems like scores should be score here and in lines similar to it, since tally_scores returns a single score:
scores = tally_scores(lines.count("X"),"X",index)
If you're using map.with_index, then the return value of the block should be score, that way the result will be an array of scores. However you can't use return score from the block, which will return from the parent method and not the single iteration of the block. You can use next score instead or simply score as the last line.
After making these changes, you can say scores = iterate_lines.
It would look something like this:
def iterate_lines
  WIN_COMBINATIONS.map.with_index do |line, index|
    # set score according to your conditional logic
    score # or alternatively, "next score"
  end
end
It's better to extract the printing logic to elsewhere, for example:
scores = iterate_lines
scores.each { |score| p score }

Using Strings as Variable/Object Names in Ruby

I am dealing with fractals. You start with a rectangle, and that shape is decreased by a given decay rate. I have it set up to do the first 10 iterations of the given scenario, and each scenario looks like this:
y_1 = dec_y(y_1)
y_2 = dec_y(y_2)
a_y = [y_1, y_2]
rect_1 = TkcRectangle.new(canvas, [0,0], a_y)
where dec_y is defined as the following:
def dec_y(y)
  to_ret = y / $rate
  return to_ret
end
I want to turn the first snippet into a function/method (not exactly sure what the Ruby term is...), so that each iteration will just be a single line referencing a method, which makes the problem more extensible. But, I need each TkcRectangle to have a different name. The way I want to set it up, each TkcRectangle will have the same name. But, if I can set the name of the object to a string passed as an argument, then I should not have a problem.
How do I define the name of an object with a given string?
Edit : Code has not been tested, but will give you the idea.
Instead of naming each element, you can use an array and use the index instead
rectangles_array = Array.new
# for each iteration of your loop:
rectangles_array << create_rectangle_object(y_1, y_2, canvas)
# end of loop
def dec_y(y)
  to_ret = y / $rate
  return to_ret
end
def create_rectangle_object(y_1, y_2, canvas)
  return TkcRectangle.new(canvas, [0,0], [dec_y(y_1), dec_y(y_2)])
end
If you really want to name it, read about structs. Something like:
MyRectangleStruct = Struct.new(:obj_name, :x1, :y1, :x2, :y2)
puts MyRectangleStruct.new(:obj_name => 'First_rec', .....)
define_method(method_name, &block)
with method_name being any string and &block being a block of ruby code; usually it looks something like this:
define_method(method_name) do
  # your code goes here
end

How do I make multiple combinations with a string in ruby?

Input should be a string:
"abcd@gmail.com"
Output should be an Array of strings:
["abcd@gmail.com",
"a.bcd@gmail.com",
"ab.cd@gmail.com",
"abc.d@gmail.com",
"a.b.cd@gmail.com",
"a.bc.d@gmail.com",
"ab.c.d@gmail.com",
"a.b.c.d@gmail.com"]
The idea: "Make every possible combination in the first string part ("abcd") with a dot. Consecutive dots are not allowed. There are no dots allowed in the beginning and in the end of the first string part ("abcd")"
This is what I've come up with so far:
text,s = "abcd".split""
i=0
def first_dot(text)
  text.insert 1,"."
end
def set_next_dot(text)
  i = text.rindex(".")
  text.delete_at i
  text.insert(i+1,".")
end
My approach was
write a function, that sets the first dot
write a function that sets the next dot
...(magic)
I do not know how to put the pieces together. Any ideas? Or perhaps a better way?
Thanks in advance.
edit:
I think I found the solution :)
I will post it in about one hour (it's brilliant -> truth tables, binary numbers, transposition)
...and here the solution
s = "abc"
states = s.length
possibilites = 2**states
def set_space_or_dot(value)
  value.gsub("0","").gsub("1",".")
end
def fill_with_leading_zeros(val, states)
  if val.length < states
    "0"*(states-val.length)+val
  else
    val
  end
end
a = Array.new(possibilites,s)
a = a.map{|x| x.split ""}
b = [*0...possibilites].map{|x| x.to_s(2).to_s}
b = b.map{|x| fill_with_leading_zeros x,states}
b = b.map{|x| x.split ""}
c = []
for i in 0 ... a.size
  c[i] = (set_space_or_dot (a[i].zip b[i]).join).strip
end
Changing pduersteler's answer a little bit:
possibilities = []
string = "abcd@example.com"
(string.split('@')[0].size-1).times do |pos|
  possibility = string.dup
  possibilities << possibility.insert(pos+1, '.')
end
How about this (probably needs a bit more fine-tuning to suit your needs):
s = "abcd"
(0..s.size-1).map do |i|
  start, rest = [s[0..i], s[(i+1)..-1]]
  (0..rest.size-1).map { |j| rest.dup.insert(j, '.') }.map { |s| "#{start}#{s}"}
end.flatten.compact
#=> ["a.bcd", "ab.cd", "abc.d", "ab.cd", "abc.d", "abc.d"]
An option would be to iterate n times through your string moving the dot, where n is the amount of chars minus 1. This is what you're doing right now, but without defining two methods.
Something like this:
possibilities = []
string = "abcd@example.com"
(string.split('@')[0].size-1).times do |pos|
  possibilities << string.dup.insert(pos+1, '.')
end
edit
Now tested. Thanks to the comments: you need to call .dup on the string before the insert. Otherwise the dot gets inserted into the original string and stays there on every later iteration, causing a mess. Calling .dup on the string copies it, so the insert works on the copy and leaves the original untouched.

Lychrel numbers

First of all, for those of you, who don't know (or forgot) about Lychrel numbers, here is an entry from Wikipedia: http://en.wikipedia.org/wiki/Lychrel_number.
I want to implement the Lychrel number detector in the range from 0 to 10_000. Here is my solution:
class Integer
  # Return a reversed integer number, e.g.:
  #
  # 1632.reverse #=> 2361
  #
  def reverse
    self.to_s.reverse.to_i
  end
  # Check, whether given number
  # is the Lychrel number or not.
  #
  def lychrel?(depth=30)
    if depth == 0
      return true
    elsif self == self.reverse and depth != 30 # [1]
      return false
    end
    # In case both statements are false, try
    # recursive "reverse and add" again.
    (self + self.reverse).lychrel?(depth-1)
  end
end
puts (0..10000).find_all(&:lychrel?)
The issue with this code is the depth value [1]. Basically, depth defines how many iterations we run before concluding that the current number really is a Lychrel number. The default is 30 iterations, but I want more latitude, so that the programmer can specify their own depth through the method's parameter. 30 iterations is fine for a range as small as the one I need, but if I want to cover all natural numbers I have to be more flexible.
Because of the recursion that takes place in Integer#lychrel?, I can't be flexible. If I passed an argument to lychrel?, the check at [1] would still compare depth against the hard-coded 30, so it would no longer do what it is meant to.
So, my question sounds like this: "How do I refactor my method, so it will accept parameters correctly?".
What you currently have is known as tail recursion. This can usually be re-written as a loop to get rid of the recursive call and eliminate the risk of running out of stack space. Try something more like this:
def lychrel?(depth=30)
  val = self
  first_iteration = true
  while depth > 0 do
    # Return false if the number has become a palindrome,
    # but allow a palindrome as input
    if first_iteration
      first_iteration = false
    else
      if val == val.reverse
        return false
      end
    end
    # Perform next iteration
    val = (val + val.reverse)
    depth = depth - 1
  end
  return true
end
I don't have Ruby installed on this machine so I can't verify whether that's 100% correct, but you get the idea. Also, I'm assuming that the purpose of the and depth != 30 bit is to allow a palindrome to be provided as input without immediately returning false.
By looping, you can use a state variable like first_iteration to keep track of whether or not you need to do the val == val.reverse check. With the recursive solution, scoping limitations prevent you from tracking this easily (you'd have to add another function parameter and pass the state variable to each recursive call in turn).
A cleaner, more Ruby-like solution:
class Integer
  def reverse
    self.to_s.reverse.to_i
  end
  def lychrel?(depth=50)
    n = self
    depth.times do |i|
      r = n.reverse
      return false if i > 0 and n == r
      n += r
    end
    true
  end
end
puts (0...10000).find_all(&:lychrel?) #=> 249 numbers
bta's solution with some corrections:
class Integer
  def reverse
    self.to_s.reverse.to_i
  end
  def lychrel?(depth=30)
    this = self
    first_iteration = true
    begin
      if first_iteration
        first_iteration = false
      elsif this == this.reverse
        return false
      end
      this += this.reverse
      depth -= 1
    end while depth > 0
    return true
  end
end
puts (1..10000).find_all { |num| num.lychrel?(255) }
Not so fast, but it works:
code/practice/ruby% time ruby lychrel.rb > /dev/null
ruby lychrel.rb > /dev/null 1.14s user 0.00s system 99% cpu 1.150 total

Class Objects and comparing specific attributes

I have the following code.
class person(object):
    def __init__(self, keys):
        for item in keys:
            setattr(self, item, None)
    def __str__(self):
        return str(self.__dict__)
    def __eq__(self, other) :
        return self.__dict__ == other.__dict__
Now I want to take this code and only do __eq__ on a specific set of attrs ("keys"). So I changed it to do this:
class person(object):
    def __init__(self, keys):
        self.valid_keys = keys
        for item in keys:
            setattr(self, item, None)
    def __str__(self):
        return dict([(i, getattr(self, i)) for i in self.valid_keys ])
    def __eq__(self, other) :
        assert isinstance(other, person)
        self_vals = [ getattr(self, i) for i in self.valid_keys ]
        other_vals = [ getattr(other, i) for i in self.valid_keys ]
        return self_vals == other_vals
I have read the following two awesome posts (here and here) and my fundamental question is:
Is this the right approach or is there a better way to do this in python?
Obviously TMTOWTDI - but I'd like to keep and follow a standard pythonic approach. Thanks!!
Updates
I was asked why I do not fix the attrs in my class. This is a great question and here's why. The purpose of this is to take several disjointed employee records and build a complete picture of an employee. For example, I get my data from ldap, lotus notes, unix passwd files, bugzilla data, etc. Each of those has unique attrs, so I generalized them into a person. This gives me a quick, consistent way to compare old records to new records. HTH. Thanks
** Updates Pt.2 **
Here is what I ended up with:
class personObj(object):
    def __init__(self, keys):
        self.__dict__ = dict.fromkeys(keys)
        self.valid_keys = keys
    def __str__(self):
        return str([(i, getattr(self, i)) for i in self.valid_keys ])
    def __eq__(self, other):
        return isinstance(other, personObj) and all(getattr(self, i) == getattr(other, i) for i in self.valid_keys)
Thanks to both gents for reviewing!
There are minor enhancements (bug fixes) I'd definitely do.
In particular, getattr called with two arguments raises an AttributeError if the attribute's not present, so you could get that exception if you were comparing two instances with different keys. You could just call it with three args instead (the third one is returned as the default value when the attribute is not present) -- just don't use None as the third arg in this case since it's what you normally have as the value (use a sentinel value as the third arg).
__str__ is not allowed to return a dict: it must return a string.
__eq__ between non-comparable objects should not raise -- it should return False.
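To make the three points above concrete, here is a minimal sketch of the question's class with only those fixes applied. The sentinel name _MISSING and the capitalised class name Person are choices made for this illustration, not something from the original code.

```python
_MISSING = object()  # illustrative sentinel: never equal to a real attribute value


class Person(object):
    def __init__(self, keys):
        self.valid_keys = list(keys)
        for key in self.valid_keys:
            setattr(self, key, None)

    def __str__(self):
        # __str__ must return a string, so format the dict rather than return it
        return str({key: getattr(self, key, None) for key in self.valid_keys})

    def __eq__(self, other):
        # Non-comparable objects compare unequal instead of raising
        if not isinstance(other, Person):
            return False
        # Three-argument getattr: a missing attribute yields the sentinel,
        # which cannot be confused with a stored None
        return all(
            getattr(self, key, _MISSING) == getattr(other, key, _MISSING)
            for key in self.valid_keys
        )

    def __ne__(self, other):
        return not self.__eq__(other)


a = Person(["name", "uid"])
b = Person(["name"])
a.name = b.name = "dave"
print(a == b)       # False: b has no uid attribute at all
print(a == "dave")  # False: no exception for a foreign type
print(b)            # {'name': 'dave'}
```

Note that the comparison is driven by the left operand's valid_keys, so with different key sets it is not symmetric: here b == a is True while a == b is False. If that matters for your records, compare over the union of both key lists instead.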
Bugs apart, you can get the object's state very compactly with self.__dict__, or more elegantly with vars(self) (you can't reassign the whole dict with the latter syntax, though). This bit of knowledge lets you redo your class entirely, in a higher-level-of-abstraction way -- more compact and expeditious:
class person(object):
    def __init__(self, keys):
        self.__dict__ = dict.fromkeys(keys)
    def __str__(self):
        return str(vars(self))
    def __eq__(self, other):
        return isinstance(other, person) and vars(self) == vars(other)
You can simplify your comparison from:
self_vals = [ getattr(self, i) for i in self.valid_keys ]
other_vals = [ getattr(other, i) for i in self.valid_keys ]
return self_vals == other_vals
to:
return all(getattr(self, i) == getattr(other, i) for i in self.valid_keys)
