How to Use hash table to solve questions? - data-structures

How to use hashtable to solve the given problem?
I was asked question like given a string of 0 and 1 like "001100" where 1 indicates system is updated and 0 indicate unupdated system. updated system(1) can update adjacent computer in one day. In one day only adjacent PC will be updated. So for above for day one string will become "011110" then day 2 it will be "111111" so total 2 days. How can i use hashtable for this problem?

Related

Maximise subset selection under constraints

I am working on a scheduling problem for a team of volunteers. I have boiled my problem down to the following algorithmic problem:
I have a matrix with ~60 rows representing volunteers and ~14 columns representing days. Each entry is an integer in the range 0 to 3 inclusive representing how free the volunteer is on that day. I want to choose exactly 4 entries from each column (4 volunteers a day) such that (in order of importance)
A 0-entry is never chosen.
The workload is as spread out as possible (first give everyone one shift, then start giving out second shifts, etc. We can expect that most volunteers will only have one shift per 2-week period, and some may even have none.)
The sum over selected entries is maximised (volunteers get days that they prefer).
I want to output a decision matrix that has a 1 whenever a volunteer is chosen for a day, and 0 otherwise. I believe this is an instance of the nurse-rostering problem, so I'm not expecting a fast solution, but I just want to make a brute force algorithm that will work in a reasonable time for my ~60 person team. I'm just really not sure how to start tackling this problem. Is it suited for backtracking, or is there some way to calculate the best placement of each volunteer based on the distribution of his/her day-scores?

How to design an algorithm to put elements into groups with constraints?

I was given a task of putting students into groups (to prepare a coding camp), but with several constraints. Though I've finished the task by hand, I'd like to know is there already exist some algorithms for tasks like this, or how can I design such an algorithm.
Background: 40 students in total, with these attributes:
gender: F/M
grade: Year 1/2
school: School 1/School 2/...
early assessment result: Rank from 1 to 40
Constraints: All of them needs to be satisfied.
Exactly 4 people per group
Each group needs to have at least a girl
Each group needs to have at least a Year 2 student
4 group members needs to come from 4 different schools
Each group needs to have at least a student who ranked top 10 in early assessment
What I'm expecting:
The Best: An existing algorithm/program for these kind of problems
Or, An algorithm for this specific problem
Or at least, Some ideas of creating an algorithm for this specific problem
My thoughts:
Since I've successed in making groups by hand, I know that such a solution indeed exists for my current dataset. But if I need an algorithm to find a solution for me, it should first try to check whether a solution even exists, by check if the number of girl / Year 2 students is greater than 10 (with pigeonhole principle), and some other conditions. And obviously, Constraint 5 is the easiest, and can provide a base solution for the rest. However, I still can not find a systematic way of doing it. Perhaps bruteforce and randomization can help? I'm not sure.
And sorry, since the data is confidential, I can not post it.
Update: After consulting a friend, here is a possible method:
First put the top 1 to 10 into 10 different groups.
Then iterate through groups. If the only person in the group is a boy/girl, try to add a girl/boy from a different school.
Then the problem size is reduced from 2^40 to 2^20, making bruthforce a viable solution.

Algorithm to find the best possible available times

Here is my scenario,
I run a Massage Place which offers various type of massages. Say 30 min Massage, 45 min massage, 1 hour massage, etc. I have 50 rooms, 100 employees and 30 pieces of equipment.When a customer books a massage appointment, the appointment requires 1 room, 1 employee and 1 piece of equipment to be available.
What is a good algorithm to find available resources for 10 guests for a given day
Resources:
Room – 50
Staff – 100
Equipment – 30
Business Hours : 9AM - 6PM
Staff Hours: 9AM- 6PM
No of guests: 10
Services
5 Guests- (1 hour massages)
3 Guests - (45mins massages)
2 Guests - (1 hour massage).
They are coming around the same time. Assume there are no other appointment on that day
What is the best way to get ::
Top 10 result - Fastest search which meets all conditions gets the top 10 result set. Top ten is defined by earliest available time. 9 – 11AM is best result set. 9 – 5pm is not that good.
Exhaustive search (Find all combinations) - all sets – Every possible combination
First available met (Only return the first match) – stop after one of the conditions have been met
I would appreciate your help.
Thanks
Nick
First, it seems the number of employees, rooms, and equipment are irrelevant. It seems like you only care about which of those is the lowest number. That is your inventory. So in your case, inventory = 30.
Next, it sounds like you can service all 10 people at the same time within the first hour of business. In fact, you can service 30 people at the same time.
So, no algorithm is necessary to figure that out, it's a static solution. If you take #Mario The Spoon's advice and weight the different duration massages with their corresponding profits, then you can start optimizing when you have more than 30 customers at a time.
Looks like you are trying to solve a problem for which there are quite specialized software applications. If your problem is small enough, you could try to do a brute force approach using some looping and backtracking, but as soon as the problem becomes too big, it will take too much time to iterate through all possibilities.
If the problem starts to get big, look for more specialized software. Things to look for are "constraint based optimization" and "constraint programming".
E.g. the ECLIPSe tool is an open-source constraint programming environment. You can find some examples on http://eclipseclp.org/examples/index.html. One nice example you can find there is the SEND+MORE=MONEY problem. In this problem you have the following equation:
S E N D
+ M O R E
-----------
= M O N E Y
Replace every letter by a digit so that the sum is correct.
This also illustrates that although you can solve this brute-force, there are more intelligent ways to solve this (see http://eclipseclp.org/examples/sendmore.pl.txt).
Just an idea to find a solution:
You might want to try to solve it with a constraint satisfaction problem (CSP) algorithm. That's what some people do if they have to solve timetable problems in general (e.g. room reservation at the University).
There are several tricks to improve CSP performance like forward checking, building a DAG and then do a topological sort and so on...
Just let me know, if you need more information about CSP :)

What class of algorithms can be used to solve this?

EDIT: Just to make sure someone is not breaking their head on the problem... I am not looking for the best optimal algorithm. Some heuristic that makes sense is fine.
I made a previous attempt at formulating this and realized I did not do a great job at it so I removed that question. I have taken another shot at formulating my problem. Please feel free to provide any constructive criticism that can help me improve this.
Input:
N people
k announcements that I can make
Distance that my voice can be heard (say 5 meters) i.e. I may decide to announce or not depending on the number of people within these 5 meters
Goal:
Maximize the total number of people who have heard my k announcements and (optionally) minimize the time in which I can finish announcing all k announcements
Constraints:
Once a person hears my announcement, he is be removed from the total i.e. if he had heard my first announcement, I do not count him even if he hears my second announcement
I can see the same person as well as the same set of people within my proximity
Example:
Let us consider 10 people numbered from 1 to 10 and the following pattern of arrival:
Time slot 1: 1 (payoff = 1)
Time slot 2: 2 3 4 5 (payoff = 4)
Time slot 3: 5 6 7 8 (payoff = 4 if no announcement was made previously in time slot 2, 3 if an announcement was made in time slot 2)
Time slot 4: 9 10 (payoff = 2)
and I am given 2 announcements to make. Now if I were an oracle, I would choose time slots 2 and time slots 3 because then 7 people would have heard (because 5 already heard my announcement in Time slot 2, I do not consider him anymore). I am looking for an online algorithm that will help me make these decisions on whether or not to make an announcement and if so based on what factors. Does anyone have any ideas on what algorithms can be used to solve this or a simpler version of this problem?
There should be an approach relying upon a max-flow algorithm. In essence, you're trying to push the maximum amount of messages from start->end. Though it would be multidimensional, you could have a super-sink, which connects to each value of t, then have each value of t connect to the people you can reach at this time and then have a super-sink. This way, you simply have to compute a max-flow (with the added constraint of no more than k shouts, which should be solvable with a bit of dynamic programming). It's a terrifically dirty way to solve it, but it should get the job done deterministically and without the use of heuristics.
I don't know that there is really a way to solve this or an algorithm to do it the way you have formulated it.
It seems like basically you are trying to reach the maximum number of people with exactly 2 announcements. But without knowing any information about the groups of people in advance, you can't really make any kind of intelligent decision about whether or not to use your first announcement. Your second one at least has the benefit of knowing when not to be used (i.e. if the group has no new members then you can know its not worth wasting the announcement). But it still has basically the same problem.
The only real way to solve this is to use knowledge about the type of data or the desired outcome to make guesses. If you know that groups average 100 people with a standard deviation of 10, then you could just refuse to announce if less than 90 people are present. Or, if you know you need to reach at least 100 people with two announcements, you could choose never to announce to less than 50 at once. Obviously those approaches risk never announcing at all if the actual data does not meet what you would expect. But that's always going to be a risk, since you could get 1 person in the first group and then 0 in all of the rest, no matter what you do.
Or, you could try more clearly defining the problem, I have a hard time figuring out how to relate this to computers.
Lets start my trying to solve the simplest possible variant of the problem: Lets assume N people and K timeslots, but only one possible announcement. Lets also assume that each person will only ever stay for one timeslot and that each person who hasn't yet shown up has an equally probable chance of showing up at any future timeslot.
Given these simplifications, at each timeslot you look at the payoff of announcing at the current timeslot and compare to the chance of a future timeslot having a higher payoff, eg, lets assume 4 people 3 timeslots:
Timeslot 1: Person 1 shows up, so you know you could get a payoff of 1 by announcing, but then you have 3 people to show up in 2 remaining timeslots, so at least one of those timeslots is guaranteed to have 2 people, so don't announce..
So at each timeslot, you can calculate the chance that a later timeslot will have a higher payoff than the current by treating the remaining (N) people and (K) timeslots as being N independent random numbers each from 1..k, and calculate the chance of at least one value k being hit more than or equal to the current-payoff times. (Similar to the Birthday problem, but for more than 1 collision) and then you need to decide hwo much to discount based on expected variances. (bird in the hand, etc)
Generalization of this solution to the original problem is left as an exercise for the reader.

Best way to generate order numbers for an online store?

Every order in my online store has a user-facing order number. I'm wondering the best way to generate them. Criteria include:
Short
Easy to say over the phone (e.g., "m" and "n" are ambiguous)
Unique
Checksum (overkill? Useful?)
Edit: Doesn't reveal how many total orders there have been (a customer might find it unnerving to make your 3rd order)
Right now I'm using the following method (no checksum):
def generate_number
possible_values = 'abfhijlqrstuxy'.upcase.split('') | '123456789'.split('')
record = true
while record
random = Array.new(5){possible_values[rand(possible_values.size)]}.join
record = Order.find(:first, :conditions => ["number = ?", random])
end
self.number = random
end
As a customer I would be happy with:
year-month-day/short_uid
for example:
2009-07-27/KT1E
It gives room for about 33^4 ~ 1mln orders a day.
Here is an implementation for a system I proposed in an earlier question:
MAGIC = [];
29.downto(0) {|i| MAGIC << 839712541[i]}
def convert(num)
order = 0
0.upto(MAGIC.length - 1) {|i| order = order << 1 | (num[i] ^ MAGIC[i]) }
order
end
It's just a cheap hash function, but it makes it difficult for an average user to determine how many orders have been processed, or a number for another order. It won't run out of space until you've done 230 orders, which you won't hit any time soon.
Here are the results of convert(x) for 1 through 10:
1: 302841629
2: 571277085
3: 34406173
4: 973930269
5: 437059357
6: 705494813
7: 168623901
8: 906821405
9: 369950493
10: 638385949
At my old place it was the following:
The customer ID (which started at 1001), the sequence of the order they made then the unique ID from the Orders table. That gave us a nice long number of at least 6 digits and it was unique because of the two primary keys.
I suppose if you put dashes or spaces in you could even get us a little insight into the customer's purchasing habits. It isn't mind boggling secure and I guess a order ID would be guessable but I am not sure if there is security risk in that or not.
Ok, how about this one?
Sequentially, starting at some number (2468) and add some other number to it, say the day of the month that the order was placed.
The number always increases (until you exceed the capacity of the integer type, but by then you probably don't care, as you will be incredibly successful and will be sipping margaritas in some far-off island paradise). It's simple enough to implement, and it mixes things up enough to throw off any guessing as to how many orders you have.
I'd rather submit the number 347 and get great customer service at a smaller personable website than: G-84e38wRD-45OM at the mega-site and be ignored for a week.
You would not want this as a system id or part of a link, but as a user-friendly number it works.
Douglas Crockford's Base32 Encoding works superbly well for this.
http://www.crockford.com/wrmg/base32.html
Store the ID itself in your database as an auto-incrementing integer, starting at something suitably large like 100000, and simply expose the encoded value to the customer/interface.
5 characters will see you through your first ~32 million orders, whilst performing very well and satisfying most of these requirements. It doesn't allow for the exclusion of similar sounding characters though.
Rather than generating and storing a number, you might try creating an encrypted version that would not reveal the number of orders in the system. Here's an article on exactly that.
Something like this:
Get sequential order number. Or, maybe, an UNIX timestamp plus two or three random digits (when two orders are placed at the same moment) is fine too.
Bitwise-XOR it with some semi-secret value to make number appear "pseudo-random". This is primitive and won't stop those who really want to investigate how many orders you have, but for true "randomness" you need to keep a (large) permutation table. Or you'll need to have large random numbers, so you won't be hit by the birthday paradox.
Add checkdigit using Verhoeff algorithm (I'm not sure it will have such a good properties for base33, but it shouldn't be bad).
Convert the number to - for example - base 33 ("0-9A-Z", except for "O", "Q" and "L" which can be mistaken with "0" and "1") or something like that. Ease of pronouncation means excluding more letters.
Group the result in some visually readable pattern, like XXX-XXX-XX, so users won't have to track the position with their fingers or mouse pointers.
Sequentially, starting at 1? What's wrong with that?
(Note: This answer was given before the OP edited the question.)
Just one Rube Goldberg-style idea:
You could generate a table with a random set of numbers that is tied to a random period of time:
Time Period Interleaver
next 2 weeks: 442
following 8 days: 142
following 3 weeks: 580
and so on... this gives you an unlimited number of Interleavers, and doesn't let anyone know your rate of orders because your time periods could be on the order of days and your interleaver is doing a lot of low-tech "mashing" for you.
You can generate this table once, and simply ensure that all Interleavers are unique.
You can ensure you don't run out of Interleavers by simply adding more characters into the set, or start by defining longer Interleavers.
So you generate an order ID by getting a sequential number, and using today's Interleaver value, interleave its digits (hence, the name) in between each sequential number's digits. Guaranteed unique - guaranteed confusing.
Example:
Today I have a sequential number 1, so I will generate the order ID: 4412
The next order will be 4422
The next order will be 4432
The 10th order will be 41402
In two weeks my interleaver will change to 142,
The 200th order will be 210402
The 201th order will be 210412
Eight days later, my interleaver changes to 580:
The 292th order will be 259820
This will be completely confusing but completely deterministic. You can just remove every other digit starting at the 1's place. (except when your order id is only one digit longer than your interleaver)
I didn't say this was the best way - just a Friday idea.
You could do it like a postal code:
2b2 b2b
That way it has some kind of checksum (not really, but at least you know it's wrong if there are 2 consecutive numbers or letters). It's easy to read out over the phone, and it doesn't give an indication of how many orders are in the system.
http://blog.logeek.fr/2009/7/2/creating-small-unique-tokens-in-ruby
>> rand(36**8).to_s(36)
=> "uur0cj2h"
How about getting the current time in miliseconds and using that as your order ID?

Resources