How to compute "shortest distance" between two words? - algorithm

Recently I had an interview in which I was asked to write an algorithm to find the minimum number of one-letter changes needed to get from one word to another, e.g. Cat->Cot->Cog->Dog.
I don't want the solution to the problem; just guide me through how I can use BFS in this algorithm.

according to this scrabble list, the shortest path between cat and dog is:
['CAT', 'COT', 'COG', 'DOG']
from urllib import urlopen
def get_words():
    try:
        html = open('three_letter_words.txt').read()
    except IOError:
        html = urlopen('http://www.yak.net/kablooey/scrabble/3letterwords.html').read()
        with open('three_letter_words.txt', 'w') as f:
            f.write(html)
    b = html.find('<PRE>') #ignore the html before the <pre>
    while True:
        a = html.find("<B>", b) + 3
        b = html.find("</B>", a)
        word = html[a: b]
        if word == "ZZZ":
            break
        assert(len(word) == 3)
        yield word
words = list(get_words())
def get_template(word):
    c1, c2, c3 = word[0], word[1], word[2]
    t1 = 1, c1, c2
    t2 = 2, c1, c3
    t3 = 3, c2, c3
    return t1, t2, t3
d = {}
for word in words:
    template = get_template(word)
    for ti in template:
        d[ti] = d.get(ti, []) + [word] #add the word to the list of words with that template
for ti in get_template('COG'):
    print d[ti]
#['COB', 'COD', 'COG', 'COL', 'CON', 'COO', 'COO', 'COP', 'COR', 'COS', 'COT', 'COW', 'COX', 'COY', 'COZ']
#['CIG', 'COG']
# ['BOG', 'COG', 'DOG', 'FOG', 'HOG', 'JOG', 'LOG', 'MOG', 'NOG', 'TOG', 'WOG']
import networkx
G = networkx.Graph()
for word_list in d.values():
    for word1 in word_list:
        for word2 in word_list:
            if word1 != word2:
                G.add_edge(word1, word2)
print G['COG']
#{'COP': {}, 'COS': {}, 'COR': {}, 'CIG': {}, 'COT': {}, 'COW': {}, 'COY': {}, 'COX': {}, 'COZ': {}, 'DOG': {}, 'CON': {}, 'COB': {}, 'COD': {}, 'COL': {}, 'COO': {}, 'LOG': {}, 'TOG': {}, 'JOG': {}, 'BOG': {}, 'HOG': {}, 'FOG': {}, 'WOG': {}, 'NOG': {}, 'MOG': {}}
print networkx.shortest_path(G, 'CAT', 'DOG')
#['CAT', 'OCA', 'DOC', 'DOG']
As a bonus we can get the farthest:
print max(networkx.all_pairs_shortest_path(G, 'CAT')['CAT'].values(), key=len)
#['CAT', 'CAP', 'YAP', 'YUP', 'YUK']

At first sight I thought of Levenshtein distance, but you need to use BFS here. Start by building a tree: the given word is the root, and its children are the valid words that differ from it by one letter. The next level holds the words that differ by one more letter, and so on. Traverse this graph with BFS, and whenever you reach a new word, store the length of the path to it. At the end of the algorithm, choose the minimal distance to the target word.

1. Begin with just the starting word in your path set.
2. If the ending word of any path in your path set is the desired word, stop; that path is the desired path.
3. Replace each path in your path set with every possible path that starts with that path but is one word longer.
4. Go to step 2.
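The steps above can be sketched in Python roughly as follows. This is only a minimal sketch: the word set `WORDS` and the helpers `neighbors` and `shortest_ladder` are hypothetical names made up for illustration, and the sketch also skips words that have already been reached, which the steps do not mention but which stops the path set from growing around cycles.

```python
import string

# Hypothetical mini dictionary, for illustration only.
WORDS = {"CAT", "COT", "COG", "DOG", "CAP", "COP"}

def neighbors(word, words):
    """Return the words in `words` that differ from `word` in exactly one letter."""
    result = []
    for i in range(len(word)):
        for letter in string.ascii_uppercase:
            candidate = word[:i] + letter + word[i + 1:]
            if candidate != word and candidate in words:
                result.append(candidate)
    return result

def shortest_ladder(start, goal, words):
    paths = [[start]]            # step 1: the path set holds only the start word
    seen = {start}               # assumption: never revisit a word
    while paths:
        for path in paths:       # step 2: stop as soon as a path ends at the goal
            if path[-1] == goal:
                return path
        longer = []              # step 3: extend every path by one word
        for path in paths:
            for nxt in neighbors(path[-1], words):
                if nxt not in seen:
                    seen.add(nxt)
                    longer.append(path + [nxt])
        paths = longer           # step 4: go back to step 2
    return None                  # the goal cannot be reached

ladder = shortest_ladder("CAT", "DOG", WORDS)
```

Because every path in the set has the same length at each round, the first path that ends at the goal is a shortest one.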

If we build the search tree breadth-first from the destination word toward the source word, and use a dictionary look-up to check whether a word has already been seen before adding it, then the first occurrence of the source word gives the shortest path, in reverse, from the 'target word' to the 'source word'.
From this we can print the path from the 'source' to the 'target'.
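One way this reverse search might look in Python is sketched below. The word set and the names `one_letter_apart` and `ladder_from_target` are assumptions for illustration; the `parent` dictionary plays the role of the "have we seen this word" look-up, and the source word is assumed to be in the word set.

```python
from collections import deque

# Hypothetical word set, for illustration only.
WORDS = {"CAT", "COT", "COG", "DOG", "CAP", "COP"}

def one_letter_apart(a, b):
    """True if a and b have equal length and differ in exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1

def ladder_from_target(source, target, words):
    parent = {target: None}      # word -> the word one step closer to the target
    queue = deque([target])
    while queue:
        word = queue.popleft()
        if word == source:
            # Following parent links from the source yields source ... target.
            path = []
            while word is not None:
                path.append(word)
                word = parent[word]
            return path
        for other in sorted(words):
            if other not in parent and one_letter_apart(word, other):
                parent[other] = word
                queue.append(other)
    return None                  # source is not reachable from target

path = ladder_from_target("CAT", "DOG", WORDS)
```

Searching from the target means no list reversal is needed at the end: the parent links already point in the source-to-target direction.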

Related

traversing graph and creating dynamic variable

I have a simple graph with few nodes and these nodes have attributes such as "type" and "demand".
import networkx as nx
def mygraph():
    G = nx.Graph()
    G.add_nodes_from([("N1", {"type":"parent","demand": 10}),
                      ("N2", {"type":"parent","demand": 12}),
                      ("N3", {"type":"parent","demand": 25}),
                      ("S1", {"type":"server","demand": 12}),
                      ("S2", {"type":"server","demand": 20})])
    return G
I am passing this graph to another function in pyomo library. The dummy pyomo function is as follows:
def mymodel():
    g=mygraph()
    **VARIABLES**
    model.const1 = Constraint(my constraint1)
    model.const2 = Constraint(my constraint2)
    model.obj1 = Objective(my objective)
    status = SolverFactory('glpk')
    results = status.solve(model)
    assert_optimal_termination(results)
    model.display()
mymodel()
I am trying to:
1. In the graph function mygraph(), find the total number of nodes in graph G with attribute type==parent.
2. In the pyomo function mymodel(), create a number of new VARIABLES equal to the number of nodes with attribute type==parent. In the case above, my program must create 3 new variables, since 3 nodes have attribute type==parent in my graph function. The values of these newly created variables should come from the demand attribute of the same node, so it should be something like this:
new_var1=demand of node1 (i.e., node1_demand=10 in this case)
new_var2=demand of node2 (i.e., node2_demand=12)
new_var3=demand of node3 (i.e., node3_demand=25)
For the first part you can loop over the nodes:
sum(1 for n,attr in G.nodes(data=True) if attr['type']=='parent')
# 3
# or to get all types
from collections import Counter
c = Counter(attr['type'] for n,attr in G.nodes(data=True))
# {'parent': 3, 'server': 2}
c['parent']
# 3
c['server']
# 2
For the second part (which also gives you the answer to the first part if you check the length):
{n: attr['demand'] for n,attr in G.nodes(data=True) if attr['type']=='parent'}
# or
[attr['demand'] for n,attr in G.nodes(data=True) if attr['type']=='parent']
Output:
{'N1': 10, 'N2': 12, 'N3': 25}
# or
[10, 12, 25]
Instantiating attributes:
def mymodel():
    G = mygraph()
    nodes = [attr['demand']
             for n,attr in G.nodes(data=True)
             if attr['type']=='parent']
    # initialize model?
    for i,n in enumerate(nodes, start=1):
        setattr(model, f'const{i}', Constraint(something with n))
    # ...

Breadth-first algorithm implementation

I am trying to implement a "Breadth-First" Algorithm as a variation of something I've seen in a book.
My issue is that the algorithm is not adding the elements of every node into the queue.
For instance, if I search for "black lab" under the name 'mariela' in the "search()" function, I will get the correct output: "simon is a black lab"
However, I ought to be able to look for "black lab" in "walter", which is connected to "mariela", which is connected to "simon", who is a "black lab". This is not working.
Have I made a rookie mistake in my implementation of this algorithm, or have I set up my graph wrong?
As always, any/all help is much appreciated!
from collections import deque
# TEST GRAPH -------------
graph = {}
graph['walter'] = ['luci', 'kaiser', 'andrea', 'mariela']
graph['andrea'] = ['echo', 'dante', 'walter', 'mariela']
graph['mariela'] = ['ginger', 'simon', 'walter', 'andrea']
graph['kaiser'] = 'german shepherd'
graph['luci'] = 'black cat'
graph['echo'] = 'pitbull'
graph['dante'] = 'pitbull'
graph['ginger'] = 'orange cat'
graph['simon'] = 'black lab'
def condition_met(name):
    if graph[name] == 'black lab':
        return name
def search(name):
    search_queue = deque()
    search_queue += graph[name] # add all elements of "name" to queue
    searchedAlready = [] # holding array for people already searched through
    while search_queue: # while queue not empty...
        person = search_queue.popleft() # pull 1st person from queue
        if person not in searchedAlready: # if person hasn't been searched through yet...
            if condition_met(person):
                print person + ' is a black labrador'
                return True
            else:
                search_queue += graph[person]
                searchedAlready.append(person)
    return False
search('walter')
#search('mariela')
You have lots of problems in your implementation, both Python-wise and algorithm-wise.
Rewrite as:
# @param graph graph to search
# @param start the node to start at
# @param value the value to search for
def search(graph, start, value):
    explored = []
    queue = [start]
    while len(queue) > 0:
        # next node to explore (pop from the front so the search is breadth-first)
        node = queue.pop(0)
        # only explore if not already explored
        if node not in explored:
            # node found, search complete
            if node == value:
                return True
            # add children of node to queue
            else:
                explored.append(node)
                queue.extend(graph.get(node, [])) # leaf values have no entry of their own in graph
    return False
graph = {}
graph['walter'] = ['luci', 'kaiser', 'andrea', 'mariela']
graph['andrea'] = ['echo', 'dante', 'walter', 'mariela']
graph['mariela'] = ['ginger', 'simon', 'walter', 'andrea']
# children should be a list
graph['kaiser'] = ['german shepherd']
graph['luci'] = ['black cat']
graph['echo'] = ['pitbull']
graph['dante'] = ['pitbull']
graph['ginger'] = ['orange cat']
graph['simon'] = ['black lab']
print search(graph, 'mariela', 'walter')
Here is a demo https://repl.it/IkRA/0
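If you would rather keep the graph layout from the question (people map to lists of names, pets map to a description string), a breadth-first search could look like the sketch below. This is a hedged sketch rather than the answer's code: the name `find_pet` is made up, and it assumes that only list values have neighbours to expand. Strings are never added to the queue, which avoids the problem in the question where `search_queue += graph[person]` pushes the individual characters of a description onto the queue.

```python
from collections import deque

graph = {
    'walter': ['luci', 'kaiser', 'andrea', 'mariela'],
    'andrea': ['echo', 'dante', 'walter', 'mariela'],
    'mariela': ['ginger', 'simon', 'walter', 'andrea'],
    'kaiser': 'german shepherd',
    'luci': 'black cat',
    'echo': 'pitbull',
    'dante': 'pitbull',
    'ginger': 'orange cat',
    'simon': 'black lab',
}

def find_pet(graph, start, description):
    """Breadth-first search from `start`; return the first name whose value equals `description`."""
    queue = deque([start])
    visited = {start}
    while queue:
        name = queue.popleft()
        value = graph[name]
        if value == description:
            return name
        if isinstance(value, list):          # only people have neighbours to expand
            for neighbour in value:
                if neighbour not in visited:
                    visited.add(neighbour)
                    queue.append(neighbour)
    return None                              # nothing matched

found = find_pet(graph, 'walter', 'black lab')
```

Marking names as visited when they are enqueued, rather than when they are dequeued, keeps each name in the queue at most once.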

creating nested dictionary from flat list with python

I have a list of files in this form:
base/images/graphs/one.png
base/images/tikz/two.png
base/refs/images/three.png
base/one.txt
base/chapters/two.txt
I would like to convert them to a nested dictionary of this sort:
{ "name": "base", "contents":
  [{"name": "images", "contents":
     [{"name": "graphs", "contents": [{"name": "one.png"}]},
      {"name": "tikz", "contents": [{"name": "two.png"}]}
     ]
   },
   {"name": "refs", "contents":
     [{"name": "images", "contents": [{"name": "three.png"}]}]
   },
   {"name": "one.txt"},
   {"name": "chapters", "contents": [{"name": "two.txt"}]}
  ]
}
The trouble with my attempted solution is that, given input like "images/datasetone/grapha.png" and "images/datasetone/graphb.png", each file ends up in a different dictionary named "datasetone". I'd like both to be in the same parent dictionary, since they are in the same directory. How do I create this nested structure without duplicating parent dictionaries when more than one file shares a common path?
Here is what I came up with, which fails:
def path_to_tree(params):
    start = {}
    for item in params:
        parts = item.split('/')
        depth = len(parts)
        if depth > 1:
            if "contents" in start.keys():
                start["contents"].append(create_base_dir(parts[0],parts[1:]))
            else:
                start ["contents"] = [create_base_dir(parts[0],parts[1:]) ]
        else:
            if "contents" in start.keys():
                start["contents"].append(create_leaf(parts[0]))
            else:
                start["contents"] =[ create_leaf(parts[0]) ]
    return start
def create_base_dir(base, parts):
    l={}
    if len(parts) >=1:
        l["name"] = base
        l["contents"] = [ create_base_dir(parts[0],parts[1:]) ]
    elif len(parts)==0:
        l = create_leaf(base)
    return l
def create_leaf(base):
    l={}
    l["name"] = base
    return l
b=["base/images/graphs/one.png","base/images/graphs/oneb.png","base/images/tikz/two.png","base/refs/images/three.png","base/one.txt","base/chapters/two.txt"]
d =path_to_tree(b)
from pprint import pprint
pprint(d)
In this example you can see we end up with as many dictionaries named "base" as there are files in the list, but only one is necessary, the subdirectories should be listed in the "contents" array.
This does not assume that all paths start with the same thing, so we need a list for it:
from pprint import pprint
def addBits2Tree( bits, tree ):
    if len(bits) == 1:
        tree.append( {'name':bits[0]} )
    else:
        for t in tree:
            if t['name']==bits[0]:
                addBits2Tree( bits[1:], t['contents'] )
                return
        newTree = []
        addBits2Tree( bits[1:], newTree )
        t = {'name':bits[0], 'contents':newTree}
        tree.append( t )
def addPath2Tree( path, tree ):
    bits = path.split("/")
    addBits2Tree( bits, tree )
tree = []
for p in b:
    print p
    addPath2Tree( p, tree )
pprint(tree)
Which produces the following for your example path list:
[{'contents': [{'contents': [{'contents': [{'name': 'one.png'},
{'name': 'oneb.png'}],
'name': 'graphs'},
{'contents': [{'name': 'two.png'}],
'name': 'tikz'}],
'name': 'images'},
{'contents': [{'contents': [{'name': 'three.png'}],
'name': 'images'}],
'name': 'refs'},
{'name': 'one.txt'},
{'contents': [{'name': 'two.txt'}], 'name': 'chapters'}],
'name': 'base'}]
Omitting the redundant name tags, you can go on with :
import json
result = {}
records = ["base/images/graphs/one.png", "base/images/tikz/two.png",
"base/refs/images/three.png", "base/one.txt", "base/chapters/two.txt"]
recordsSplit = map(lambda x: x.split("/"), records)
for record in recordsSplit:
    here = result
    for item in record[:-1]:
        if not item in here:
            here[item] = {}
        here = here[item]
    if "###content###" not in here:
        here["###content###"] = []
    here["###content###"].append(record[-1])
print json.dumps(result, indent=4)
The # characters are used for uniqueness (there could be a folder named content somewhere in the hierarchy). Just run it and see the result.
EDIT : Fixed a few typos, added the output.
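If the name/contents layout from the question is needed, one possible variation is to remember every directory dictionary that has already been created, keyed by its full path, so that a second file in the same directory reuses it. This is a minimal sketch, not either answer's code: `paths_to_tree` and `index` are hypothetical names, and it assumes a file and a directory never share the same path.

```python
def paths_to_tree(paths):
    """Build a list of {'name': ..., 'contents': [...]} dicts from slash-separated paths."""
    roots = []
    index = {}   # tuple of path parts -> directory dict already created
    for path in paths:
        parts = path.split('/')
        siblings = roots
        for depth, part in enumerate(parts[:-1], start=1):
            key = tuple(parts[:depth])
            if key not in index:
                # first time this directory is seen: create it once
                index[key] = {'name': part, 'contents': []}
                siblings.append(index[key])
            siblings = index[key]['contents']
        siblings.append({'name': parts[-1]})   # the file itself
    return roots

tree = paths_to_tree([
    "base/images/graphs/one.png",
    "base/images/graphs/oneb.png",
    "base/one.txt",
])
```

Keying on the full path rather than on the directory name alone keeps "base/images" and "base/refs/images" apart.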

Positional Argument Undefined

I am working on a larger project to write code so the user can play Connect 4 against the computer. Right now, the user can choose whether or not to go first, and the board is drawn. While trying to make sure that the user can only enter legal moves, I have run into a problem where my function legal_moves() takes 1 positional argument and 0 are given, but I do not understand what I need to do to make everything agree.
#connect 4
#using my own formating
import random
#define global variables
X = "X"
O = "O"
EMPTY = "_"
TIE = "TIE"
NUM_ROWS = 6
NUM_COLS = 8
def display_instruct():
    """Display game instructions."""
    print(
    """
Welcome to the second greatest intellectual challenge of all time: Connect4.
This will be a showdown between your human brain and my silicon processor.
You will make your move known by entering a column number, 1 - 7. Your move
(if that column isn't already filled) will move to the lowest available position.
Prepare yourself, human. May the Schwartz be with you! \n
    """
    )
def ask_yes_no(question):
    """Ask a yes or no question."""
    response = None
    while response not in ("y", "n"):
        response = input(question).lower()
    return response
def ask_number(question,low,high):
    """Ask for a number within range."""
    #using range in Python sense-i.e., to ask for
    #a number between 1 and 7, call ask_number with low=1, high=8
    low=1
    high=NUM_COLS
    response = None
    while response not in range (low,high):
        response=int(input(question))
    return response
def pieces():
    """Determine if player or computer goes first."""
    go_first = ask_yes_no("Do you require the first move? (y/n): ")
    if go_first == "y":
        print("\nThen take the first move. You will need it.")
        human = X
        computer = O
    else:
        print("\nYour bravery will be your undoing... I will go first.")
        computer = X
        human = O
    return computer, human
def new_board():
    board = []
    for x in range (NUM_COLS):
        board.append([" "]*NUM_ROWS)
    return board
def display_board(board):
    """Display game board on screen."""
    for r in range(NUM_ROWS):
        print_row(board,r)
    print("\n")
def print_row(board, num):
    """Print specified row from current board"""
    this_row = board[num]
    print("\n\t| ", this_row[num], "|", this_row[num], "|", this_row[num], "|", this_row[num], "|", this_row[num], "|", this_row[num], "|", this_row[num],"|")
    print("\t", "|---|---|---|---|---|---|---|")
# everything works up to here!
def legal_moves(board):
    """Create list of column numbers where a player can drop piece"""
    legals = []
    if move < NUM_COLS: # make sure this is a legal column
        for r in range(NUM_ROWS):
            legals.append(board[move])
    return legals #returns a list of legal columns
    #in human_move function, move input must be in legal_moves list
    print (legals)
def human_move(board,human):
    """Get human move"""
    legals = legal_moves(board)
    print("LEGALS:", legals)
    move = None
    while move not in legals:
        move = ask_number("Which column will you move to? (1-7):", 1, NUM_COLS)
        if move not in legals:
            print("\nThat column is already full, nerdling. Choose another.\n")
    print("Human moving to column", move)
    return move #return the column number chosen by user
def get_move_row(turn,move):
    move=ask_number("Which column would you like to drop a piece?")
    for m in range (NUM_COLS):
        place_piece(turn,move)
    display_board()
def place_piece(turn,move):
    if this_row[m[move]]==" ":
        this_row.append[m[move]]=turn
display_instruct()
computer,human=pieces()
board=new_board()
display_board(board)
move= int(input("Move?"))
legal_moves()
print ("Human:", human, "\nComputer:", computer)
Right down the bottom of the script, you call:
move= int(input("Move?"))
legal_moves()
# ^ no arguments
This does not supply the necessary board argument, hence the error message. Pass the board you created earlier, i.e. call legal_moves(board).
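For illustration, here is a minimal sketch of a legal_moves that depends only on the board it is given. It is an assumption about what the function is meant to do, not the asker's code: it treats the board as one list per column (as new_board() builds it), uses 0-based column indexes, and the constant `BLANK` is a made-up name for the single-space empty cell.

```python
NUM_ROWS = 6
NUM_COLS = 8
BLANK = " "   # new_board() in the question fills every cell with a single space

def new_board():
    """One list per column, as in the question's new_board()."""
    return [[BLANK] * NUM_ROWS for _ in range(NUM_COLS)]

def legal_moves(board):
    """Return the indexes of the columns that still have at least one empty cell."""
    return [col for col in range(len(board)) if BLANK in board[col]]

board = new_board()
board[2] = ["X"] * NUM_ROWS      # pretend column 2 is already full
moves = legal_moves(board)       # the board is passed explicitly
```

Because the function takes everything it needs as a parameter, it no longer relies on a global `move` being defined before it is called.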

Join array of strings into 1 or more strings each within a certain char limit (+ prepend and append texts)

Let's say I have an array of Twitter account names:
string = %w[example1 example2 example3 example4 example5 example6 example7 example8 example9 example10 example11 example12 example13 example14 example15 example16 example17 example18 example19 example20]
And a prepend and append variable:
prepend = 'Check out these cool people: '
append = ' #FollowFriday'
How can I turn this into an array of as few strings as possible each with a maximum length of 140 characters, starting with the prepend text, ending with the append text, and in between the Twitter account names all starting with an #-sign and separated with a space. Like this:
tweets = ['Check out these cool people: #example1 #example2 #example3 #example4 #example5 #example6 #example7 #example8 #example9 #FollowFriday', 'Check out these cool people: #example10 #example11 #example12 #example13 #example14 #example15 #example16 #example17 #FollowFriday', 'Check out these cool people: #example18 #example19 #example20 #FollowFriday']
(The order of the accounts isn't important so theoretically you could try and find the best order to make the most use of the available space, but that's not required.)
Any suggestions? I'm thinking I should use the scan method, but haven't figured out the right way yet.
It's pretty easy using a bunch of loops, but I'm guessing that won't be necessary when using the right Ruby methods. Here's what I came up with so far:
# Create one long string of #usernames separated by a space
tmp = twitter_accounts.map!{|a| a.insert(0, '#')}.join(' ')
# alternative: tmp = '#' + twitter_accounts.join(' #')
# Number of characters left for mentioning the Twitter accounts
length = 140 - (prepend + append).length
# This method would split a string into multiple strings
# each with a maximum length of 'length' and it will only split on empty spaces (' ')
# ideally strip that space as well (although .map(&:strip) could be use too)
tweets = tmp.some_method(' ', length)
# Prepend and append
tweets.map!{|t| prepend + t + append}
P.S.
If anyone has a suggestion for a better title let me know. I had a difficult time summarizing my question.
The String rindex method has an optional parameter where you can specify where to start searching backwards in a string:
arr = %w[example1 example2 example3 example4 example5 example6 example7 example8 example9 example10 example11 example12 example13 example14 example15 example16 example17 example18 example19 example20]
str = arr.map{|name|"##{name}"}.join(' ')
prepend = 'Check out these cool people: '
append = ' #FollowFriday'
max_chars = 140 - prepend.size - append.size
until str.size <= max_chars do
  p str.slice!(0, str.rindex(" ", max_chars))
  str.lstrip! #get rid of the leading space
end
p str unless str.empty?
I'd make use of reduce for this:
string = %w[example1 example2 example3 example4 example5 example6 example7 example8 example9 example10 example11 example12 example13 example14 example15 example16 example17 example18 example19 example20]
prepend = 'Check out these cool people:'
append = '#FollowFriday'
# Extra -1 is for the space before `append`
max_content_length = 140 - prepend.length - append.length - 1
content_strings = string.reduce([""]) { |result, target|
  result.push("") if result[-1].length + target.length + 2 > max_content_length
  result[-1] += " ##{target}"
  result
}
tweets = content_strings.map { |s| "#{prepend}#{s} #{append}" }
Which would yield:
"Check out these cool people: #example1 #example2 #example3 #example4 #example5 #example6 #example7 #example8 #example9 #FollowFriday"
"Check out these cool people: #example10 #example11 #example12 #example13 #example14 #example15 #example16 #example17 #FollowFriday"
"Check out these cool people: #example18 #example19 #example20 #FollowFriday"

Resources