Student Election Results Tallying - algorithm

This is a coding interview question:
Your school is having an election and you are tasked with coding a program that tallies the results.
You are given a set of votes, each containing a candidate and a timestamp. Given a timestamp, return the top N candidates with the most votes at that time. (Only votes cast at or before the given timestamp are tallied.)

Use a min-heap and a hash map to solve this problem.
1. Tally each vote cast at or before the given timestamp in a HashMap(Candidate, Votes).
2. To find the top N candidates, add every HashMap entry (candidate and vote count) to a min-heap keyed on vote count and capped at size N: once the heap holds N entries, a new entry replaces the minimum only if it has more votes.
3. Return all the items left in the min-heap. These are the top N candidates with their vote counts, because the size cap evicts every candidate with fewer votes.
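The three steps above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: it assumes each vote is a (candidate, timestamp) tuple with comparable timestamps, and the function name top_n_candidates is made up for illustration. Ties are broken by candidate name, which the question does not specify.

```python
import heapq
from collections import Counter


def top_n_candidates(votes, timestamp, n):
    """Return up to n (candidate, count) pairs for votes cast at or before timestamp."""
    if n <= 0:
        return []

    # Step 1: tally only the votes that fall at or before the timestamp.
    tally = Counter(candidate for candidate, cast_at in votes if cast_at <= timestamp)

    # Step 2: keep a min-heap of at most n (count, candidate) pairs.
    heap = []
    for candidate, count in tally.items():
        if len(heap) < n:
            heapq.heappush(heap, (count, candidate))
        elif (count, candidate) > heap[0]:
            # The new entry beats the current minimum, so it replaces it.
            heapq.heapreplace(heap, (count, candidate))

    # Step 3: whatever is left in the heap is the top n; list it highest first.
    return [(candidate, count) for count, candidate in sorted(heap, reverse=True)]


# Hypothetical sample data: (candidate, timestamp) pairs.
votes = [("alice", 1), ("bob", 2), ("alice", 3), ("carol", 4), ("bob", 5), ("bob", 9)]
print(top_n_candidates(votes, 5, 2))
```

With C distinct candidates the heap step costs O(C log N), on top of the O(V) pass over the V votes.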

This is probably far from the most efficient way, but I would:
Create a list candidateList containing each candidate and their respective number of votes (initially 0)
Go through the set of votes, and if a vote meets the time stamp requirement, add 1 to the candidate's votes in candidateList.
After you've gone through the set of votes, find the nth most popular candidate in candidateList (using a selection algorithm), and then iterate over the list to find the candidates more popular than them.

I would do it per array
For every new date you read, you create a new subarray. Let's say you get a vote for John Doe from the 9th of August 2016, a date for which you don't have any votes registered yet.
Your array should then be constructed like this:
array ->0->date: 09/08/2016
         ->John Doe: 1
Since I assume that all the names are known in an election, we can simply save all the candidates' names in another array, which we can use when we loop through this one.
In case a new vote for John Doe gets registered on another date, your array would look like this:
array ->0->date: 09/08/2016
         ->John Doe: 1
      ->1->date: 11/08/2016
         ->John Doe: 1
If someone votes for another person on an already known date, it should look like this:
array ->0->date: 09/08/2016
         ->John Doe: 1
         ->Jane Doe: 1
Hope this helps. If you want help looping through this array-structure-thingy, don't be afraid to ask :)

Related

Algorithm for heavily restricted Knapsack problem

I have a following problem to solve:
We have 53 weeks in a year, for each week we need to choose one model from the list: [A1, A2,....,F149, F150]. In total around 750 models in 6 classes: A,B,C,D,E,F.
Models can repeat, and each has a specific value (from around 3 to 10) and a weight. The goal is to reach a target total value of 280 ± 5% with minimal total weight by the end of the year.
However there are a ton of restrictions. For example:
Models must be held for at least 4 weeks in a row. If we have chosen A1 for week 1, then we need to choose A1 for weeks 2,3,4;
If we have chosen model classes E,F then, after they end, we cannot choose E, F for another 4 weeks.
Throughout the year we can only choose 23 models of class D.
and so on.
What I've tried so far:
Based on a target value create a corridor of allowed values throughout the year:
(The original post illustrated the corridor with a figure here.)
Starting at week 1, choose a random model for the week from the list of allowed models -> Based on the choice modify the list of allowed models for next weeks
If our choice satisfies the criterion (also lies within a corridor), then week+=1. If not, delete this possibility.
If there are no more models for this week, go back one week, delete the possibility we chose before, and choose at random from what's left.
Pictorially the algorithm is like following the branches of a tree. If the branch is bad, return back to the fork and cut off the bad branch.
This algorithm can generate a random valid solution (in about 5 to 80 minutes with a mean time of 25 minutes). Then I need to generate more of those and choose one that has the least weight. Which is not a very good approach, I presume.
Question
The question is: what is the optimal way to solve the problem? The priority is finding a solution with minimal weight that hits the target value, not having the fastest algorithm. But it should at least finish in a finite amount of time =)
The problem statement above is a bit oversimplified and due to the complexity of calculations and the amount of combinations, there is no way to consider and compare all combinations.

Probability - expectation Puzzle: 1000 persons and a door

You stand in an office by a door, with a measuring tape. Every time a person walks in you measure him or her and only keep a tally of the “record” tallest. If the new person is taller than everyone who came before, you count a record. If later another person is taller still, you have another record, etc.
1000 persons pass through the door. How many records do you expect to have?
(Assume independence of height/arrival. Also note that the answer does not depend on any assumption about the probability distribution other than independence.)
PS - I'm able to come up with an answer (~7.5) with a brute-force approach (running this scenario over 1,000,000 times and taking the average). But here I'm looking for a theoretical approach.
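Consider x_1 to x_1000 as the heights in order of arrival, and max(i) as the maximum of the first i heights (with max(0) taken as minus infinity). The question reduces to finding the expected number of times max(i) changes.
For i = 0 to 999:
if x_{i+1} > max(i), then max(i) changes (a new record).
Also, P(x_{i+1} > max(i)) = 1/(i+1), because each of the first i+1 people is equally likely to be the tallest among them.
By linearity of expectation, the answer is the sum of 1/(i+1) for i from 0 to 999, which is approx. 7.49.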
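The sum above is the 1000th harmonic number, and it can be checked numerically. The sketch below computes it directly and compares it with a small simulation. The function names, the uniform height distribution, and the fixed seed are assumptions made for illustration; any continuous distribution should give the same expectation.

```python
import random


def expected_records(n):
    """Exact expectation: the n-th harmonic number 1/1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))


def simulate_records(n, trials, seed=0):
    """Average number of records over several random arrival orders."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        tallest = float("-inf")
        for _ in range(n):
            height = rng.random()  # assumed: independent uniform heights
            if height > tallest:
                tallest = height
                total += 1  # a new record
    return total / trials


print(expected_records(1000))        # theoretical value, about 7.49
print(simulate_records(1000, 2000))  # should land close to the value above
```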

Minimum sets to cover all sub arrays

I am explaining this question with a little modification so that it becomes easier for me to explain.
There are n employees and I need to arrange an outing for them on such a day of a month on which all (or max) employees would be available for outing.
Each employee is asked to fill up an online survey stating his availability e.g. 1-31 or 15-17 etc. etc. And some might not be available for even a single day too.
There is no restriction on the number of trips I can arrange to cover all employees (ignoring those who aren't available for the whole month), but I want to find the minimum set of dates that covers all the employees. So in the worst case I will have to arrange a trip 31 times.
Question: what is the best data structure I can use to run the best fitting algorithm on this data structure? What is the best possible way to solve this problem?
By best of course I mean time and space efficient way but I am also looking for other options to solve it.
My idea is to maintain an array of 31 ints initialized to 0. Run over each employee and, based on their available dates, increment the corresponding array entries. At the end, sort the array of 31. The maximum value marks the date on which the most employees are available. Then apply the same logic to the employees who are left out. The problem is removing the employees who are already covered: I have to run over the whole list of employees once more to find out which ones can be removed, and form a new list of left-out employees to which I can apply the previous logic. Running over the list twice like this to remove the employees isn't ideal in my opinion. Any ideas?
As a first step, you should exclude employees with no available dates.
Then your problem becomes a variant of the Set Cover Problem.
Your universe U is the set of all employees, and the collection of sets S is indexed by day: employee j is in set S[i] iff that employee is available on day i.
That problem is NP-hard. So, unless you are happy with an approximate solution, you must in the worst case check every one of the 2^31 subsets of days, probably with some pruning.
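An exact search along these lines can be sketched in Python by trying subsets of days in order of increasing size, so that the first subset covering everyone is a minimum one. This is only an illustration: the function name minimum_days and the input shape (a dict from employee to a set of available days) are assumptions, and the search is exponential in the number of distinct days in the worst case.

```python
from itertools import combinations


def minimum_days(availability):
    """Return a smallest list of days that covers every employee who has a free day."""
    # Employees with no free day can never be covered, so drop them first.
    people = {emp: days for emp, days in availability.items() if days}
    if not people:
        return []

    # Build the set-cover instance: one set of employees per day.
    by_day = {}
    for emp, days in people.items():
        for day in days:
            by_day.setdefault(day, set()).add(emp)

    everyone = set(people)
    candidate_days = sorted(by_day)

    # Smallest subsets first, so the first cover found is a minimum cover.
    for size in range(1, len(candidate_days) + 1):
        for chosen in combinations(candidate_days, size):
            covered = set().union(*(by_day[d] for d in chosen))
            if covered == everyone:
                return list(chosen)
    return []  # not reached: taking every candidate day always covers everyone


# Hypothetical survey data; "dan" is not available at all.
survey = {"ann": {1, 2}, "bob": {2, 3}, "cat": {3}, "dan": set()}
print(minimum_days(survey))
```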
Take an array indexed from 1 to 31 (each index represents a date of the month). For each date, create a doubly linked list containing the emp_id of every employee who is available on that day. You can build these lists in a single pass, keep them sorted by emp_id, and keep track of each list's size and of the array index with the most employees.
Take the largest list as the first date. (This is a greedy choice: it gives an approximate solution, and the largest list is not guaranteed to be part of a minimum solution.)
Now compare every other list with the largest list and remove from it the employees who are already present in the selected list.
Repeat the same procedure to find the second date, and so on.
The whole procedure runs in O(n^2) (because 31 is a constant value),
and the space is O(n).
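The greedy procedure described above can be sketched with Python sets standing in for the linked lists. The name greedy_days and the dict-of-sets input are assumptions for illustration. Being greedy, it may return more days than the true minimum.

```python
def greedy_days(availability):
    """Repeatedly pick the day that covers the most still-uncovered employees."""
    # One set of employee ids per day; employees with no free day never appear.
    by_day = {}
    for emp, days in availability.items():
        for day in days:
            by_day.setdefault(day, set()).add(emp)

    uncovered = set().union(*by_day.values())
    chosen = []
    while uncovered:
        # Ties go to the earliest day because the days are scanned in sorted order.
        best = max(sorted(by_day), key=lambda d: len(by_day[d] & uncovered))
        chosen.append(best)
        uncovered -= by_day[best]  # these employees no longer need a date
    return chosen


# Hypothetical survey data; "dan" is not available at all.
responses = {"ann": {1, 2}, "bob": {2, 3}, "cat": {3}, "dan": set()}
print(greedy_days(responses))
```

Each round scans at most 31 days, and there are at most 31 rounds, so the work per employee stays bounded by a constant factor.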

If you know the future prices of a stock, what's the best time to buy and sell?

Interview Question by a financial software company for a Programmer position
Q1) Say you have an array for which the ith element is the price of a given stock on day i. If you were only permitted to buy one share of the stock and sell one share of the stock, design an algorithm to find the best times to buy and sell.
My Solution :
My solution was to make an array of the differences in stock price between day i and day i+1 (arraysize-1 entries) and then use Kadane's algorithm to find the contiguous subarray with the largest sum. I would then buy at the start of that subarray and sell at its end.
I am wondering if my solution is correct and whether there are any better solutions out there?
After answering, I was asked a follow-up question, which I answered in exactly the same way.
Q2) Given that you know the future closing price of Company x for the next 10 days, design an algorithm to determine whether you should BUY, SELL or HOLD on every single day (you are only allowed to make 1 decision every day) with the aim of maximizing profit.
Eg: Day 1 closing price :2.24
Day 2 closing price :2.11
...
Day 10 closing price : 3.00
My Solution: Same as above
I would like to know if there's any better algorithm out there to maximise profit, given that I can make a decision every single day.
Q1 If you were only permitted to buy one share of the stock and sell one share of the stock, design an algorithm to find the best times to buy and sell.
In a single pass through the array, determine the index i with the lowest price and the index j with the highest price. You buy at i and sell at j (selling before you buy, by borrowing stock, is in general allowed in finance, so it is okay if j < i). If all prices are the same you don't do anything.
Q2 Given that you know the future closing price of Company x for the next 10 days, design an algorithm to determine whether you should BUY, SELL or HOLD on every single day (you are only allowed to make 1 decision every day) with the aim of maximizing profit.
There are only 10 days, and hence there are only 3^10 = 59049 different possibilities. Hence it is perfectly possible to use brute force. I.e., try every possibility and simply select the one which gives the greatest profit. (Even if a more efficient algorithm were found, this would remain a useful way to test the more efficient algorithm.)
Some of the solutions produced by the brute force approach may be invalid, e.g. it might not be possible to own (or owe) more than one share at once. Moreover, do you need to end up owning 0 stocks at the end of the 10 days, or are any positions automatically liquidated at the end of the 10 days? Also, I would want to clarify the assumption that I made in Q1, namely that it is possible to sell before buying to take advantage of falls in stock prices. Finally, there may be trading fees to be taken into consideration, including payments to be made if you borrow a stock in order to sell it before you buy it.
Once these assumptions are clarified it could well be possible to design a more efficient algorithm. E.g., in the simplest case, if you can only own one share and you have to buy before you sell, then you would have a "buy" at the first minimum in the series, a "sell" at the last maximum, and buys and sells at any minima and maxima in between.
The more I think about it, the more I think these interview questions are as much about seeing how and whether a candidate clarifies a problem as they are about the solution to the problem.
Here are some alternative answers:
Q1) Work from left to right in the array provided. Keep track of the lowest price seen so far. At each element, note the difference between its price and the lowest price so far, update the lowest price so far, and keep track of the highest difference seen. The answer is the trade that realises the highest difference: sell on the day it occurs, having bought on the day of the lowest price seen up to then.
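A minimal Python sketch of this single pass is below. It assumes you must buy before you sell and that no trade is made when no profit is possible. The function name and the 0-based day indices are choices made for illustration.

```python
def best_single_trade(prices):
    """Return (buy_day, sell_day, profit) for the best single trade, or None."""
    if len(prices) < 2:
        return None

    best = None
    lowest_day = 0  # index of the lowest price seen so far
    for day in range(1, len(prices)):
        profit = prices[day] - prices[lowest_day]
        if profit > 0 and (best is None or profit > best[2]):
            best = (lowest_day, day, profit)
        if prices[day] < prices[lowest_day]:
            lowest_day = day
    return best


print(best_single_trade([7, 1, 5, 3, 6, 4]))  # (1, 4, 5): buy at 1, sell at 6
```

The pass is O(n) in time and O(1) in extra space.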
Q2) Treat this as a dynamic programming problem, where the state at any point in time is whether you own a share or not. Work from left to right again. At each point find the highest possible profit given that you own a share at the end of that point in time, and the highest possible profit given that you do not. You can work both out from the results of the previous time step. In the first case, compare buying a share (subtracting its price from the profit given that you did not own at the end of the previous point) with holding a share that you already owned at the previous point. In the second case, compare selling a share (adding its price to the profit given that you owned at the previous time) with staying pat on the profit given that you did not own at the previous time. As is standard with dynamic programming, you keep the decisions made at each point in time and recover the correct list of decisions at the end by working backwards.
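The two-state recurrence can be sketched in Python as follows. The sketch rests on assumptions the question leaves open: you start with no share, can hold at most one, cannot sell short, pay no fees, and a share still held on the last day counts for nothing. The function name plan_trades is hypothetical.

```python
def plan_trades(prices):
    """Return (max_profit, decisions), one of BUY / SELL / HOLD per day."""
    n = len(prices)
    if n == 0:
        return 0.0, []

    # cash[t]: best profit at the end of day t when holding no share.
    # hold[t]: best profit at the end of day t when holding one share.
    cash = [0.0] * n
    hold = [0.0] * n
    cash_from = ["HOLD"] * n  # action on day t that leads to cash[t]
    hold_from = ["HOLD"] * n  # action on day t that leads to hold[t]
    hold[0], hold_from[0] = -prices[0], "BUY"

    for t in range(1, n):
        # End the day without a share: sell the one we held, or stay out.
        if hold[t - 1] + prices[t] > cash[t - 1]:
            cash[t], cash_from[t] = hold[t - 1] + prices[t], "SELL"
        else:
            cash[t], cash_from[t] = cash[t - 1], "HOLD"
        # End the day with a share: buy today, or keep the one we held.
        if cash[t - 1] - prices[t] > hold[t - 1]:
            hold[t], hold_from[t] = cash[t - 1] - prices[t], "BUY"
        else:
            hold[t], hold_from[t] = hold[t - 1], "HOLD"

    # Recover the decisions by walking backwards from "no share at the end".
    decisions = []
    owning = False
    for t in range(n - 1, -1, -1):
        action = hold_from[t] if owning else cash_from[t]
        decisions.append(action)
        if action == "SELL":
            owning = True   # we must have owned a share before selling
        elif action == "BUY":
            owning = False  # we had no share before buying
    decisions.reverse()
    return cash[-1], decisions


print(plan_trades([1, 3, 5, 4, 6]))
```

Both arrays are filled in one pass, so the running time is O(n) rather than the 3^n of brute force.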
Your answer to question 1 was correct.
Your answer to question 2 was not correct. To solve this problem you work backwards from the end, choosing the best option at each step. For example, given the sequence { 1, 3, 5, 4, 6 }, since 4 < 6 your last move is to sell. Since 5 > 4, the previous move to that is buy. Since 3 < 5, the move on 5 is sell. Continuing in the same way, the move on 3 is to hold and the move on 1 is to buy.
Your solution for the first problem is correct. Kadane's algorithm runs in O(n), which is optimal for the maximum subarray problem, and it has the benefit of being easy to implement.
Your solution for the second problem is wrong, in my opinion. What you can do is store the left and right indices of the maximum-sum subarray you find. Once you have the maximum-sum subarray and its left and right indices, you can call the same function again on the left part, i.e. 0 to left - 1, and on the right part, i.e. right + 1 to Array.size - 1. So this is basically a recursive process, and you can design the structure of the recursion, with a base case, to solve this problem. By following this process you can maximize profit.
Suppose the prices are the array P = [p_1, p_2, ..., p_n]
Construct a new array A = [p_2 - p_1, p_3 - p_2, ..., p_n - p_{n-1}]
i.e. A[i] = p_{i+1} - p_i for i = 1, ..., n-1.
Now go find the maximum sum sub-array in this.
OR
Find a different algorithm, and solve the maximum sub-array problem!
The problems are equivalent.
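To illustrate the equivalence, here is a short Python sketch that runs Kadane's algorithm over the day-to-day price differences only. The function name is made up, and the empty subarray (no trade, profit 0) is allowed as an assumption.

```python
def max_profit_via_differences(prices):
    """Best single buy-then-sell profit, as a maximum-sum subarray of differences."""
    # diffs[i] is the price change from day i to day i + 1.
    diffs = [prices[i + 1] - prices[i] for i in range(len(prices) - 1)]

    best = 0
    running = 0
    for change in diffs:
        # Kadane's step: extend the current run, or restart when it drops below zero.
        running = max(0, running + change)
        best = max(best, running)
    return best


print(max_profit_via_differences([7, 1, 5, 3, 6, 4]))  # 5 (buy at 1, sell at 6)
```

A run of consecutive differences telescopes to the sell price minus the buy price, which is why the two problems give the same answer.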

How do I pick the most beneficial combination of items from a set of items?

I'm designing a piece of a game where the AI needs to determine which combination of armor will give the best overall stat bonus to the character. Each character will have about 10 stats, of which only 3-4 are important, and of those important ones, a few will be more important than the others.
Armor will also give a boost to 1 or all stats. For example, a shirt might give +4 to the character's int and +2 stamina while at the same time, a pair of pants may have +7 strength and nothing else.
So let's say that a character has a healthy choice of armor to use (5 pairs of pants, 5 pairs of gloves, etc.) We've designated that Int and Perception are the most important stats for this character. How could I write an algorithm that would determine which combination of armor and items would result in the highest of any given stat (say in this example Int and Perception)?
Targeting one statistic
This is pretty straightforward. First, a few assumptions:
You didn't mention this, but presumably one can only wear at most one kind of armor for a particular slot. That is, you can't wear two pairs of pants, or two shirts.
Presumably, also, the choice of one piece of gear does not affect or conflict with others (other than the constraint of not having more than one piece of clothing in the same slot). That is, if you wear pants, this in no way precludes you from wearing a shirt. But notice, more subtly, that we're assuming you don't get some sort of synergy effect from wearing two related items.
Suppose that you want to target statistic X. Then the algorithm is as follows:
Group all the items by slot.
Within each group, sort the potential items in that group by how much they boost X, in descending order.
Pick the first item in each group and wear it.
The set of items chosen is the optimal loadout.
Proof: The only way to get a higher X stat would be if there was an item A which provided more X than some other in its group. But we already sorted all the items in each group in descending order, so there can be no such A.
What happens if the assumptions are violated?
If assumption one isn't true -- that is, you can wear multiple items in each slot -- then instead of picking the first item from each group, pick the first Q(s) items from each group, where Q(s) is the number of items that can go in slot s.
If assumption two isn't true -- that is, items do affect each other -- then we don't have enough information to solve the problem. We'd need to know specifically how items can affect each other, or else be forced to try every possible combination of items through brute force and see which ones have the best overall results.
Targeting N statistics
If you want to target multiple stats at once, you need a way to tell "how good" something is. This is called a fitness function. You'll need to decide how important the N statistics are, relative to each other. For example, you might decide that every +1 to Perception is worth 10 points, while every +1 to Intelligence is only worth 6 points. You now have a way to evaluate the "goodness" of items relative to each other.
Once you have that, instead of optimizing for X, you instead optimize for F, the fitness function. The process is then the same as the above for one statistic.
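As a rough Python sketch of this idea, the code below scores every item with a weighted fitness function and keeps the best item per slot. The item layout (dicts with "name", "slot" and "stats"), the function name, and the sample items are assumptions for illustration. The weights 10 and 6 are the example values from above.

```python
def best_loadout(items, weights):
    """Pick, for every slot, the item with the highest weighted fitness."""
    def fitness(item):
        # Stats without a weight count as unimportant (weight 0).
        return sum(weights.get(stat, 0) * boost for stat, boost in item["stats"].items())

    best_by_slot = {}
    for item in items:
        current = best_by_slot.get(item["slot"])
        if current is None or fitness(item) > fitness(current):
            best_by_slot[item["slot"]] = item
    return {slot: item["name"] for slot, item in best_by_slot.items()}


# Hypothetical inventory.
weights = {"perception": 10, "intelligence": 6}
items = [
    {"name": "Shirt A", "slot": "shirt", "stats": {"intelligence": 4, "stamina": 2}},
    {"name": "Shirt B", "slot": "shirt", "stats": {"perception": 3}},
    {"name": "Pants A", "slot": "pants", "stats": {"strength": 7}},
    {"name": "Pants B", "slot": "pants", "stats": {"intelligence": 1}},
]
print(best_loadout(items, weights))
```

Because the slots are assumed independent, picking the best item per slot also gives the best total.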
If there is no restriction on the number of items per category, the following will work for multiple statistics and multiple items.
Data preparation:
Give each statistic (Int, Perception) a weight, according to how important you determine it is
Store this as a 1-D array statImportance
Give each item-statistic combination a value, according to how much said item boosts said statistic for the player
Store this as a 2-D array itemStatBoost
Algorithm:
In pseudocode. Here assume that itemScore is a sortable Map with Item as the key and a numeric value as the value, and that values are initialised to 0.
Assume that the sort method is able to sort this Map by values (not keys), highest first.
//Score each item and rank them
for each statistic as S
    for each item as I
        score = itemScore.get(I) + (statImportance[S] * itemStatBoost[I,S])
        itemScore.put(I, score)
sort(itemScore)
//Decide which items to use
maxEquippableItems = 10 //use the appropriate value
selectedItems = new array[maxEquippableItems]
for 0 <= idx < maxEquippableItems
    selectedItems[idx] = itemScore.getByIndex(idx)
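The pseudocode could be written in Python roughly as below. As an assumption, nested dicts stand in for the 1-D and 2-D arrays, and the function name select_items is hypothetical.

```python
def select_items(stat_importance, item_stat_boost, max_equippable_items):
    """Score every item by its weighted stat boosts and keep the highest scorers."""
    scores = {}
    for item, boosts in item_stat_boost.items():
        scores[item] = sum(
            stat_importance.get(stat, 0) * boost for stat, boost in boosts.items()
        )

    # Rank items by score, highest first, then take as many as can be equipped.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:max_equippable_items]


# Hypothetical weights and items.
importance = {"int": 6, "perception": 10}
boosts = {
    "shirt": {"int": 4, "stamina": 2},
    "pants": {"strength": 7},
    "hat": {"perception": 3},
    "gloves": {"int": 1, "perception": 1},
}
print(select_items(importance, boosts, 2))
```

Slicing the ranked list also handles the case where there are fewer items than equippable slots.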
