How to iterate only through unique combinations of multiple objects? - ruby

The title is a bit of a doozy.
I'm working on a project where users can make bids. The resulting items can be won exclusively or split between up to 3 users. One user can put in an exclusive bid of $20, while 3 other users can agree to a 3-way split and each pay only $10, totaling $30 and beating the first bidder.
I need to run through a list of possibly a dozen different bidders who agreed to the 3-way split to determine the winning trio:
Rza => $20 # loses
ODB + Gza => $25 # loses
InspectahDeck + Ghostface + ODB => $50 # wins
Alternatively
Rza => $100,000 # wins
ODB + Gza => $25 # loses
InspectahDeck + Ghostface + ODB => $50 # loses
All I have is an array of Bid objects, belonging to a variety of users. My goal is to see all possible combinations of those who wish to split with others and see who comes out on top.
I tried to do something like:
bids.each do |bid1|
  bids.each do |bid2|
    bids.each do |bid3|
      # Fill a hash here, but only if the permutation of the bids is unique
    end
  end
end
I'm having a hard time with this since it seems horribly inefficient and produces tons of duplicates, sometimes with the same bid appearing twice. I'd like some help or tips to point me in the right direction.
I'm really stumped.
Thanks in advance.
PS: Another tricky detail: Each bidder can have multiple bids set. So the same guy can have 1 exclusive, 1 2-way and 1 3-way.

Suppose you have something like this:
class Bid
  attr_accessor :user # link to the user
  attr_accessor :price # dollar amount
  attr_accessor :way # 1 means 1-way, 2 means 2-way, 3 means 3-way
end
Get the highest bids of each kind (comparing by price, since Bid does not define <=>):
best_1_way = bids.select{|bid| bid.way == 1}.max_by(&:price)
best_2_ways = bids.select{|bid| bid.way == 2}.max_by(2, &:price)
best_3_ways = bids.select{|bid| bid.way == 3}.max_by(3, &:price)
Get the total prices:
total_1_way_price = best_1_way.price
total_2_ways_price = best_2_ways.map(&:price).inject(&:+)
total_3_ways_price = best_3_ways.map(&:price).inject(&:+)
Compare these three items, and you get your winner.
If you have a lot of bids and want to optimize:
all_1_ways, all_2_ways, all_3_ways =
bids.group_by{|bid| bid.way }.values_at(1,2,3)
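To iterate only through unique combinations, as the title asks, Array#combination yields each unordered group exactly once, so no duplicate checking is needed. Below is a minimal sketch of that idea. The Struct stands in for the Bid class above, and best_group is a hypothetical helper name. It also assumes a group is only valid when every bid in it comes from a different user.

```ruby
# Hypothetical stand-in for the Bid class above.
Bid = Struct.new(:user, :price, :way)

# Return the highest-paying group of bids across 1-, 2- and 3-way splits.
# Assumption: a group is valid only if all of its bids come from different users.
def best_group(bids)
  candidates = (1..3).flat_map do |n|
    bids.select { |bid| bid.way == n }
        .combination(n) # each unordered set of n bids, exactly once
        .select { |group| group.map(&:user).uniq.size == n }
  end
  candidates.max_by { |group| group.sum(&:price) }
end

bids = [
  Bid.new("Rza", 20, 1),
  Bid.new("ODB", 10, 2), Bid.new("Gza", 15, 2),
  Bid.new("InspectahDeck", 20, 3), Bid.new("Ghostface", 20, 3), Bid.new("ODB", 10, 3)
]
winner = best_group(bids)
puts winner.map(&:user).join(" + ") # InspectahDeck + Ghostface + ODB
puts winner.sum(&:price)            # 50
```

With a dozen 3-way bids that is only 220 combinations, so brute force is fine at this size.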

Related

How to implement semi-random groups with Ruby?

suppose I've got groups:
{1=>[1,1,1,1,1,1], 2=>[2,2,2], 3=>[3,3,3,3,3,3], 4=>[4,4,4,4,4,4]}
the keys represent teams, and the values within the arrays represent employees. Imagine I wish to match employees in a semi-random way. I want to make groups of 3's-5's like this:
[1,1,2,3,5], [1,2,3,4], [1,2,3,3], [1,3,4,1,4]
I have the wish to create groups and have a bias for matching team members of opposite teams, but not an absolute bias. Also you must match every member of each team with a group.
How would you solve this?
This is how I've done it:
group_by_team = records.group_by {|x| x.team_id}.values
mixed_groups = group_by_team.each{|x| x.shuffle!}
# take 1 element from each team and mix
# the number of teams is defined as a constant so we don't have to hit the db with a count
for index in (1..(TEAMS-1))
  zipped_groups ||= mixed_groups[0]
  zipped_groups = zipped_groups.zip(mixed_groups[index])
end
# flatten the arrays to produce one large Array
# remove nil values from the zip steps with compact
# (flatten! and compact! return nil when nothing changes, so chaining them is unsafe)
zipped_groups = zipped_groups.flatten.compact
lunch_groups = zipped_groups.each_slice(3)
# we can no longer reduce table size, so let's join two small lunch groups
if lunch_groups.any?{|x| x.size<3}
  lunch_groups = self.merge_last_two(lunch_groups)
end
But the problems with my implementation are vast. Group size is fixed at 3. And it's not exactly elegant or efficient.
How would you make semi-random groups happen?
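One possible direction, offered only as a sketch: shuffle each team, deal the members out round-robin so that neighbours in the resulting line usually come from different teams, then cut the line into groups of 3 to 5. The helper names interleave and lunch_groups are made up for illustration, and the sizing rule (the fewest groups that keeps every group at 5 or fewer) is an assumption rather than anything from the question.

```ruby
# Hypothetical helper: shuffle each team, then deal members round-robin so
# neighbours in the resulting line usually come from different teams.
def interleave(teams, rng: Random.new)
  queues = teams.map { |members| members.shuffle(random: rng) }
  line = []
  until queues.all?(&:empty?)
    # Visit the teams in a fresh random order on every pass.
    queues.shuffle(random: rng).each { |q| line << q.shift unless q.empty? }
  end
  line
end

# Hypothetical helper: cut the line into groups of 3 to 5 (needs at least 3 people).
def lunch_groups(teams, rng: Random.new)
  line = interleave(teams, rng: rng)
  count = (line.size / 5.0).ceil        # fewest groups that keeps size <= 5
  base, extra = line.size.divmod(count) # the first `extra` groups get one more
  Array.new(count) { |i| line.shift(i < extra ? base + 1 : base) }
end

teams = { 1 => %w[a1 a2 a3 a4 a5 a6], 2 => %w[b1 b2 b3],
          3 => %w[c1 c2 c3 c4 c5 c6], 4 => %w[d1 d2 d3 d4 d5 d6] }
groups = lunch_groups(teams.values)
groups.each { |group| puts group.join(", ") }
```

The bias towards mixed teams is not absolute: once the smaller teams run out, the tail of the line can contain neighbours from the same team.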

Separate characters and numbers following specific rules

I am trying to distinguish flight numbers.
Example:
flightno = "FR556"
split_data = flightno.upcase.match(/([A-Za-z]+)(\d+)/)
first = split_data[1] # FR
second = split_data[2] # 556
I then go on to query the database to find an airline based on the FR in this example and apply some logic with the result which is Ryanair.
My problem is when the flight number might be:
flightno = "U21920"
split_data = flightno.upcase.match(/([A-Za-z]+)(\d+)/)
first = split_data[1] # U
second = split_data[2] # 21920
I basically want first to be U2, not just U. This is used to search the database of airlines by their IATA code, which in this case is U2.
EDIT
In the interest of clarity: I made some mistakes in terminology when asking my question. Due to the complexities of booking reference numbers, the input is taken from whatever the passenger provides. For an easyJet flight, for example, the passenger may input EZY1920 or U21920. The airline may provide either, so the passenger can't really know which one they have.
"EZY" = ICAO
"U2" = IATA
I take the input from the user and try to separate the ICAO or IATA code from the flight number "1920", but there is no way of determining that without searching the database or separating the input, which I feel is cumbersome from a user experience point of view.
Using a regex to separate characters from numbers works until the user inputs an IATA code as part of their flight number (the passenger won't know the difference), and as you can see in the example above this confuses the regex.
The trouble is I can't think of any other pattern with flight numbers. They always start with at least two characters, made up of just letters or a mixture of a letter and a number, and this part can be 3 characters in length. The numbers part can be as short as 1 digit but can also be as long as 4, and is always numeric.
EDIT
As has been mentioned in the comments, there is no fixed size. However, one thing that is always true (at least so far) is that the first character will always be a letter, regardless of whether it is ICAO or IATA.
After considering everybody's input so far, I'm wondering if searching the database and returning airlines with an IATA or ICAO code that matches the first two letters provided by the user (U2), (FR), (EZ) might be one way to go. However, this is subject to obvious problems should an ICAO or IATA code be released that matches another airline, for example "EZY" and "EZT". This is not future-proof and I'm looking for better Ruby or regex solutions.
Appreciate your input.
EDIT
I have answered my own question below. While other answers provide a solution for handling some conditions, they would fall down if the flight number began with a number, so I worked out a crass but so far stable way to analyse the string for digits and then work out from that whether it is an ICAO or IATA code.
One solution I can think of is to match your given flight number against a complete list of ICAO/IATA codes: https://raw.githubusercontent.com/datasets/airport-codes/master/data/airport-codes.csv
Spending some time with Google might give you a more appropriate list.
Then use the first three characters (if that is the maximum) of your flight number to find a match within the ICAO codes. If you find one, you will know where to separate your string.
Here is a minimal, ugly example that should set you on the right track. Feel free to update!
ICAOCODES = %w(FR DEU U21) # grab your data here
def retrieve_flight_information(flightnumber)
  ICAOCODES.each do |icao|
    co = flightnumber.match(icao).to_s
    if co.length > 0
      # airline
      puts co
      # flight number
      puts flightnumber.gsub(co,'')
    end
  end
end
retrieve_flight_information("FR556")
#=> FR
#=> 556
retrieve_flight_information("U21214123")
#=> U21
#=> 214123
The biggest flaw lies in using .gsub(), which replaces every occurrence of the code and so might mess up your flight number if it looks like this: "FR21413FR2"
However, you will find plenty of solutions to this problem on Stack Overflow.
As mentioned in the comments, a list of icao codes is not what you are looking for. But what is relevant here, is that you somehow need a list of strings that you can securely compare against.
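One way to avoid the .gsub() problem is to test each known code only as a prefix and slice the remainder off by length. This is a rough sketch, not a tested production approach: AIRLINE_CODES is a made-up list standing in for whatever your database holds, split_flight_number is a hypothetical name, and the 1 to 4 digit rule is taken from the question.

```ruby
# Hypothetical list; in practice this would come from your airline table.
AIRLINE_CODES = %w[FR U2 EZY EZ].freeze

# Split "U21920" into ["U2", "1920"] by trying known codes as prefixes,
# longest first, and requiring the remainder to be 1 to 4 digits.
def split_flight_number(input, codes = AIRLINE_CODES)
  flight = input.strip.upcase
  codes.sort_by { |code| -code.length }.each do |code|
    next unless flight.start_with?(code)
    number = flight[code.length..-1]
    return [code, number] if number.match?(/\A\d{1,4}\z/)
  end
  nil # no known code matched
end

p split_flight_number("FR556")      # ["FR", "556"]
p split_flight_number("U21920")     # ["U2", "1920"]
p split_flight_number("ezy1920")    # ["EZY", "1920"]
p split_flight_number("FR21413FR2") # nil
```

Trying the longest codes first means "EZY1920" resolves to EZY rather than to a shorter code that happens to share the same first letters.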
I have a fairly crass solution that seems to be working in all scenarios I can throw at it to date. I wanted to make this available to anybody else who might find it useful.
The general rule of thumb for flight codes/numbers seems to be:
IATA: two characters made up of any combination of letters and digits
ICAO: three characters made up of letters only (to date)
With that in mind we should be able to work out if we need to search the database by IATA or ICAO depending on the condition of the first three characters.
First we take the flight number and convert to uppercase
string = "U21920".upcase
Next we analyse the first three characters to check for any numbers.
first_three = string[0,3] # => U21
Is there a digit in first_three?
if first_three =~ /\d/ # => true
iata = first_three[0,2] # => If true lets get rid of the last character
# Now we go to the database searching IATA (U2)
search = Airline.where('iata LIKE ?', "#{iata}%") # => Starts with search, just in case
Otherwise, if there isn't a digit found in the string
else
icao = string.match(/([A-Za-z]+)(\d+)/)
search = Airline.where('icao LIKE ?', "#{icao[1]}%")
This seems to work for the random flight numbers I've tested it with today from a few of the major airport live departure/arrival boards. It's an interesting problem because some airlines issue tickets with either an ICAO or IATA code as part of the flight number, which means passengers won't know any different. On top of that, some airports provide flight information in their own format. So, assuming there isn't a change to how ICAO and IATA codes are built, the above should work.
Here is an example script you can run
test.rb
puts "What is your flight number?"
string = gets.chomp.upcase # chomp drops the trailing newline from gets
first_three = string[0,3]
puts "Taking first three from #{string} is #{first_three}"
if first_three =~ /\d/ # Calling String's =~ method.
  puts "The String #{first_three} DOES have a number in it."
  iata = first_three[0,2]
  search = Airline.where('iata LIKE ?', "#{iata}%")
  puts "Searching Airlines starting with IATA #{iata} = #{search.count}"
  puts "Found #{search.first.name} from IATA #{iata}"
else
  puts "The String #{first_three} does not have a number in it."
  icao = string.match(/([A-Za-z]+)(\d+)/)
  search = Airline.where('icao LIKE ?', "#{icao[1]}%")
  puts "Searching Airlines starting with ICAO #{icao[1]} = #{search.count}"
  puts "Found #{search.first.name} from ICAO #{icao[1]}"
end
Airline
Airline(id: integer, name: string, iata: string, icao: string, created_at: datetime, updated_at: datetime )
Stick this in your lib folder and run:
rails runner lib/test.rb
Obviously you can remove all of the puts statements to get straight to the result. I'm using rails runner to include access to my Airline model when running the script.

How avoid interval with Mechanize

I'm trying to scrape Craigslist with Mechanize. I wrote this:
require 'mechanize'
a = Mechanize.new
page = a.get("http://paris.craigslist.fr/search/apa")
i = 0
list_per_page = 99
while i <= list_per_page do
  title = page.search(".hdrlnk")[i].text
  price = page.search(".price")[i].text
  puts title
  puts price
  puts "-----------"
  i+=1
end
It works, but when a listing doesn't have a price there is an interval. I think it's because I use search()[i], but I don't know what to do to avoid the interval. Any ideas?
Edit:
On Craigslist there is:
listing_title1 -> $100
listing_title2 -> $200
listing_title3 ->
listing_title4 -> $60
listing_title5 -> $150
My output CSV displays:
listing_title1 -> $100
listing_title2 -> $200
listing_title3 -> $60
listing_title4 -> $150
listing_title5 -> $300
$300 is listing_title6
If by 'interval' you mean the blank line that is printed when the listing doesn't have a price, you could fix this by making the puts conditional:
puts price unless price.empty?
Edit
If I understand right, your hdrlnk and price entries are getting out of sync with each other. This happens because your current loop is skipping entries with blank price fields and going straight to the next one.
The best way to get around this is to find a container that includes both price and hdrlnk and iterate over those instead of over the hdrlnk and price entries separately. On this page that would be the .row which contains all the info for each search result. So something like this would work:
page.search(".row").each do |row|
  title = row.search(".hdrlnk").first
  price = row.search(".price").first
  puts title.text if title
  puts price.text if price
  puts "------------"
end
I know you've already accepted an answer and that's fine, but I wanted to introduce the concept of next, which is a more powerful solution than putting if <thing> checks all over.
Your method could look like this:
while <condition> do
  # fetch the nodes without calling .text yet, since either may be nil
  title = page.search(".hdrlnk")[i]
  price = page.search(".price")[i]
  i += 1 # increment before next, otherwise the loop never advances
  # skip to the next iteration if any of the vars are nil
  next unless [title, price].all?
  # ... the rest of code
end
By the way, I think your usage of the term 'interval' is a bit misleading. I think of an interval as a special kind of loop which runs on a specified time interval, e.g. every second or minute. It's probably clearer to use the terms loop or iteration in this case.

How to match between two arrays and update one based on criteria

I'm trying to match two supplier CSVs and update one based on the results of the other: for example, if the price is different, update one file with the matching item from the other. If the product is in the first CSV but not in the other, add it. Once the data set is adjusted, I'll write it back to the CSV, which I'm OK with. Each supplier file is about 9000 lines long. Sample data from the two puts lines in the code are:
#<struct RecordBUY item_type=nil, buy_product_id="1000", product_name="Plastic Jeweled Crown", product_type=nil, product_code_SKU="105238", option_set=nil, duplicate={"1000"=>["105238"]}, brand_name="Rubies Costumes", prod_desc="This plastic crown has six large jewel stones accross the top. Adjustable headband. (Colors of the jewel stones may vary, our choice please.)", cost_price="$3.76", prod_weight="00.14", prod_width="5.75", prod_height="0.5", prod_depth="23.5", prod_category="Hats, Wigs & Masks", prod_upn="082686025935", prod_size="One Size", prod_color="Gold">
#<struct BCRecord item_type="Product", bc_product_id="620", product_name="Dollar Ring", product_type=nil, product_code_SKU="109624", option_set=nil, duplicate=nil, brand_name="Rubies Costumes", prod_desc="Ring has three large glittery Dollar Signs '$' that extend over your fingers.", cost_price="3.20", prod_weight="0.7200", prod_width="4.0000", prod_height="1.0000", prod_depth="7.0000", prod_category="Accessories & Makeup", prod_upn="82686006996", prod_size=nil, prod_color=nil, option_set=nil, price="5.60", allow_purchases=[21]>
I read the csv data into arrays against respective objects, but don't know how to do searching and updating efficiently. I did not come across concepts to avoid the bad ones (or whether doing a bad one on 9k lines is actually bad or just frowned upon). What I have is:
puts records[0]
puts recordsBC[1]
#start script
records.each do | buyline |
  recordsBC.each do | bcline |
    if bcline.product_code_SKU == buyline.product_code_SKU
      ##update pricing (brute force);
      #bcline.price = buyline.cost_price * 1.75 #this fails with undefined method `price=' for #<Record:0x007fbb9088b960>
      bcline.cost_price = buyline.cost_price
    end
    ##if product is in BC currently, but not in buy - needs to be marked as inactive in BC
    if bcline.product_code_SKU.include? buyline.product_code_SKU
      #bcline.allow_purchases = "N" # this fails with undefined method `allow_purchases=' for #<Record:0x007fb2878822c8>
    end
    #if product is in Buy but not in BC then add it into BC
    if buyline.product_code_SKU.include? bcline.product_code_SKU
      recordsBC.push buyline
    end
  end
end
I can't figure out a better way, nor understand why I'm getting the undefined method errors on some but not all lines. I'm not after complete answers, just enough to figure out the rest of the solution.
I'd start by reducing the number of iterations. At the moment you are iterating through all of recordsBC for each buyline. So I'd start with:
records.each do | buyline |
  record_subset = recordsBC.select{|r|!(r.product_code_SKU.split & buyline.product_code_SKU.split).empty?}
  record_subset.each do |bcline|
    .....
  end
end
That should mean you only iterate through bcline items that have a matching product_code_SKU. You may have to modify the split as your example doesn't show how multiple SKUs are separated (e.g. '123 456', '123,456', or '123/456')
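If each record carries a single SKU, another option is to index one file by SKU in a Hash, so each lookup is a constant-time hash access instead of a scan of all 9000 rows. The sketch below makes several assumptions: the two Structs are trimmed-down, hypothetical stand-ins for the real records, sync and to_amount are made-up names, and the 1.75 markup is taken from the commented-out line in the question. Note that a setter such as price= only exists if the struct (or class) actually declares that member, which is one likely source of the undefined method errors.

```ruby
# Hypothetical, trimmed-down stand-ins for the two record structs.
BuyRecord = Struct.new(:product_code_SKU, :cost_price)
BCRecord  = Struct.new(:product_code_SKU, :cost_price, :price, :allow_purchases)

MARKUP = 1.75 # assumed from the commented-out line in the question

def to_amount(text)
  text.delete("$").to_f # "$3.76" -> 3.76
end

def sync(buy_records, bc_records)
  # One pass to index the BC records by SKU.
  bc_by_sku = bc_records.each_with_object({}) { |bc, index| index[bc.product_code_SKU] = bc }
  buy_skus = {}
  buy_records.each do |buy|
    sku = buy.product_code_SKU
    buy_skus[sku] = true
    cost = to_amount(buy.cost_price)
    price = (cost * MARKUP).round(2)
    if (bc = bc_by_sku[sku])
      # In both files: refresh the pricing.
      bc.cost_price = format("%.2f", cost)
      bc.price = price
    else
      # In Buy but not in BC: add it.
      bc_records << BCRecord.new(sku, format("%.2f", cost), price, "Y")
    end
  end
  # In BC but not in Buy: mark as inactive.
  bc_records.each { |bc| bc.allow_purchases = "N" unless buy_skus.key?(bc.product_code_SKU) }
  bc_records
end

buy = [BuyRecord.new("105238", "$3.76"), BuyRecord.new("109624", "$3.50")]
bc  = [BCRecord.new("109624", "3.20", "5.60", "Y"), BCRecord.new("999999", "1.00", "2.00", "Y")]
sync(buy, bc)
bc.each { |record| puts record.to_a.inspect }
```

This makes two linear passes in total, so the nested loop over both files goes away.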

Speed dating algorithm

I work in a consulting organization and am most of the time at customer locations. Because of that I rarely meet my colleagues. To get to know each other better we are going to arrange a dinner party. There will be many small tables so people can have a chat. In order to talk to as many different people as possible during the party, everybody has to switch tables at some interval, say every hour.
How do I write a program that creates the table switching schedule? Just to give you some numbers: in this case there will be around 40 people and there can be at most 8 people at each table. But the algorithm needs to be generic, of course.
Here's an idea.
First, work from the perspective of the first person. Let's call him X.
X has to meet all the other people in the room, so we should divide the remaining people into n groups (where n = #_of_people/capacity_per_table) and make him sit with one of these groups per iteration.
Now that X has been taken care of, we will consider the next person, Y.
Without loss of generality, let Y be a person X sat with in the first iteration, so we already know Y's table group for that time frame. We should then divide the remaining people into groups such that each group sits with Y in one of the subsequent iterations, and in each iteration X's group and Y's group have no person in common.
I guess if you keep doing something like this, you will get an optimal solution (if one exists).
Alternatively, you could crowdsource the problem by giving each person a card on which they write down the names of all the people they got to dine with, and at the end of the event present some kind of prize to the person with the most names on their card.
This sounds like an application for a genetic algorithm:
Select a random permutation of the 40 guests - this is one seating arrangement
Repeat the random permutation N times (N is how many times you are to switch seats in the night)
Combine the permutations together - this is the chromosome for one organism
Repeat for how ever many organisms you want to breed in one generation
The fitness score is the number of people each person got to see in one night (or alternatively - the inverse of the number of people they did not see)
Breed, mutate and introduce new organisms using the normal method and repeat until you get a satisfactory answer
You can add in any other factors you like into the fitness, such as male/female ratio and so on without greatly changing the underlying method.
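The fitness score in the steps above can be sketched as follows. This version counts distinct pairs of guests who shared a table at least once, which is the per-person count summed over everyone and divided by two. The schedule layout (rounds, then tables, then names) and the helper names pairs_met, fitness and random_round are illustrative assumptions, not part of the answer.

```ruby
require "set"

# A schedule is an array of rounds; each round is an array of tables;
# each table is an array of guest names.
def pairs_met(schedule)
  met = Set.new
  schedule.each do |round|
    round.each do |table|
      # Sort each pair so that [a, b] and [b, a] count as the same meeting.
      table.combination(2) { |a, b| met << [a, b].sort }
    end
  end
  met
end

# Fitness: the number of distinct pairs that sat together at least once.
def fitness(schedule)
  pairs_met(schedule).size
end

# Hypothetical helper: one random seating round (one "gene" of the chromosome).
def random_round(guests, table_size, rng = Random.new)
  guests.shuffle(random: rng).each_slice(table_size).to_a
end

schedule = [
  [%w[a b c d], %w[e f g h]],
  [%w[a b e f], %w[c d g h]]
]
puts fitness(schedule) # 20 of the 28 possible pairs
```

A perfect schedule for 8 guests would score 28, since that is the number of ways to choose 2 people out of 8.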
Why not imitate real world?
class Person {
    void doPeriodically() {
        do {
            newTable = random (numberOfTables);
        } while (tableBusy(newTable))
        switchTable (newTable)
    }
}
Oh, and note that there is a similar algorithm for finding a mating partner and it's rumored to be effective for those 99% of people who don't spend all of their free time answering programming questions...
Perfect Table Plan
You might want to have a look at combinatorial design theory.
Intuitively I don't think you can do better than a perfect shuffle, but it's beyond my pre-coffee cognition to prove it.
This one was very funny! :D
I tried different methods, but the logic suggested by adi92 (card + prize) is the one that works better than any other I tried.
It works like this:
a guy arrives and examines all the tables
for each table with free seats he counts how many people there he has yet to meet, then chooses the one with the most unknown people
if two tables have an equal number of unknown people, the guy chooses the one with more free seats, so that he is more likely to meet more new people
at each turn the order in which people take their seats is random (this avoids possible infinite loops)
This is a "demo" of the working algorithm in Python:
import random
class Person(object):
    def __init__(self, name):
        self.name = name
        self.known_people = dict()
    def meets(self, a_guy, propagation = True):
        "self meets a_guy, and a_guy meets self"
        if a_guy not in self.known_people:
            self.known_people[a_guy] = 1
        else:
            self.known_people[a_guy] += 1
        if propagation: a_guy.meets(self, False)
    def points(self, table):
        "Calculates how many new guys self will meet at table"
        return len([p for p in table if p not in self.known_people])
    def chooses(self, tables, n_seats):
        "Calculate what is the best table to sit at, and return it"
        points = 0
        free_seats = 0
        ret = random.choice([t for t in tables if len(t)<n_seats])
        for table in tables:
            tmp_p = self.points(table)
            tmp_s = n_seats - len(table)
            if tmp_s == 0: continue
            if tmp_p > points or (tmp_p == points and tmp_s > free_seats):
                ret = table
                points = tmp_p
                free_seats = tmp_s
        return ret
    def __str__(self):
        return self.name
    def __repr__(self):
        return self.name
def Switcher(n_seats, people):
    """calculate how many tables and what switches you need
    assuming each table has n_seats seats"""
    n_people = len(people)
    n_tables = n_people // n_seats  # integer division (Python 3)
    switches = []
    while not all(len(g.known_people) == n_people-1 for g in people):
        tables = [[] for t in range(n_tables)]
        random.shuffle(people) # need to change "starter"
        for the_guy in people:
            table = the_guy.chooses(tables, n_seats)
            tables.remove(table)
            for guy in table:
                the_guy.meets(guy)
            table += [the_guy]
            tables += [table]
        switches += [tables]
    return switches
lst_people = [Person('Hallis'),
              Person('adi92'),
              Person('ilya n.'),
              Person('m_oLogin'),
              Person('Andrea'),
              Person('1800 INFORMATION'),
              Person('starblue'),
              Person('regularfry')]
s = Switcher(4, lst_people)
print("You need %d tables and %d turns" % (len(s[0]), len(s)))
turn = 1
for tables in s:
    print('Turn #%d' % turn)
    turn += 1
    tbl = 1
    for table in tables:
        print(' Table #%d - ' % tbl, table)
        tbl += 1
    print('\n')
This will output something like:
You need 2 tables and 3 turns
Turn #1
Table #1 - [1800 INFORMATION, Hallis, m_oLogin, Andrea]
Table #2 - [adi92, starblue, ilya n., regularfry]
Turn #2
Table #1 - [regularfry, starblue, Hallis, m_oLogin]
Table #2 - [adi92, 1800 INFORMATION, Andrea, ilya n.]
Turn #3
Table #1 - [m_oLogin, Hallis, adi92, ilya n.]
Table #2 - [Andrea, regularfry, starblue, 1800 INFORMATION]
Because of the randomness it won't always come up with the minimum number of switches, especially with larger sets of people. You should then run it a couple of times and keep the result with the fewest turns (so you do not stress all the people at the party :P ), and it is an easy thing to code :P
PS:
Yes, you can save the prize money :P
You can also take a look at the stable matching problem. The classic solution to that problem is the Gale-Shapley algorithm. http://en.wikipedia.org/wiki/Stable_marriage_problem
I wouldn't bother with genetic algorithms. Instead, I would do the following, which is a slight refinement on repeated perfect shuffles.
While (there are two people who haven't met):
Consider the graph where each node is a guest and edge (A, B) exists if A and B have NOT sat at the same table. Find all the connected components of this graph. If there are any connected components of size < tablesize, schedule those connected components at tables. Note that even this is actually an instance of a hard problem known as bin packing, but first fit decreasing will probably be fine. That can be accomplished by sorting the connected components from biggest to smallest, and then putting each of them in turn at the first table where it fits.
Perform a random permutation of the remaining elements. (In other words, seat the remaining people randomly, which at first will be everyone.)
Increment counter indicating number of rounds.
Repeat the above for a while until the number of rounds seems to converge.
