Reducing time complexity from n^2 - algorithm

There are n students in a class. Soon, there's going to be an inter-school science quiz, where there'll be questions on 4 subjects, call them 1, 2, 3 and 4 for simplicity. There should be 2 students in a team. Any student is equally likely to be good or bad at a particular subject.
We are given as input n rows, each with 4 entries in it. The student is good at i-th subject if the i-th column is equal to 1. I'm supposed to find out the total number of ways in which the school can send a team so that the two students together know all the 4 subjects.
For example,
S1: 1 1 1 1
S2: 1 1 0 1
S3: 1 0 1 1
S4: 0 0 1 1
S5: 0 0 0 0
Student 1 can go with any student since all the subjects are his strength. => 4
Student 2 can go with S3 and S4, since S2 is good in subject 1,2 and 4 and S3 and S4 are good in subject 3. => 2 (Note that (S1,S2) was already counted)
S3 will go with the one good at subject 2=> none
S4: Similarly, none.
Hence, ans=4+2=6
My solution:-
int ans = 0;
// arr is the array containing subject-wise "strength" of students
for (int i = 0; i < n; i++) {
    // collect the subjects student i is missing
    ArrayList<Integer> a = new ArrayList<>();
    for (int j = 0; j < 4; j++)
        if (arr[i][j] == 0)
            a.add(j);
    if (a.size() == 0)
        ans += n - i - 1;
    else
        for (int j = i + 1; j < n; j++) {
            // student j must cover every subject student i is missing
            boolean covers = true;
            for (int k = 0; k < a.size(); k++) {
                if (arr[j][a.get(k)] == 0) {
                    covers = false;
                    break;
                }
            }
            if (covers)
                ans++;
        }
}
System.out.println(ans);
Now I know my solution is correct, but its time complexity is O(n^2), and I'm looking for a better solution for the same. Thanks!

You can reduce the complexity in the number of students by spending memory on the combinations of subjects.
Build a list of 2^s elements, where s is the quantity of subjects. Index the list by the combinations of subjects, interpreted as a binary number. For instance, S2 has a value of 13, and is missing a value of 2.
First tally the "needs" that each student can fill. For each combination of 1 bits in the student's known subjects, increment that index of the tally list.
tally = [0]*16
for student in student_list:
    score = # student's covered subjects as an int
    for idx in range(16):
        if idx & score == idx:
            tally[idx] += 1
You now have a list of how many students can cover each combination of needed subjects. This is O(n * 2^s)
For each student, find the needed score, the 1's complement of the student score. It is now a simple matter to add all of the tallies for the needed scores.
team_ct = 0
for student in student_list:
    needed = # student's needed subjects as an int; this is 15-score (above)
    team_ct += tally[needed]
    if needed == 0:
        team_ct -= 1   # a student who knows everything was counted as his own partner
Now, every pairing has been counted twice, so divide team_ct by 2 (integer division).
Yes, this last bit of code can be dropped into a one-liner:
team_ct = sum(tally[15-score] - (score == 15) for score in foo) // 2
Where foo is a construct of all student scores. This part is O(n).
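The tally approach above can be sketched end to end in Python. This is only an illustration: the function name count_teams and the input format (a list of 0/1 rows, one per student) are assumptions, not part of the original answer. The sketch subtracts one for every student who knows all the subjects, because such a student is included in tally[0] and would otherwise be paired with himself.

```python
def count_teams(rows):
    """Count pairs of students who together cover every subject.

    rows: list of equal-length 0/1 lists (assumed input format).
    """
    subjects = len(rows[0])
    full = (1 << subjects) - 1  # 15 for 4 subjects
    # Interpret each row as a binary number, first column = highest bit.
    scores = [int("".join(str(bit) for bit in row), 2) for row in rows]

    # tally[idx] = number of students whose known subjects include idx.
    tally = [0] * (full + 1)
    for score in scores:
        for idx in range(full + 1):
            if idx & score == idx:
                tally[idx] += 1

    # Ordered pairs (student, partner covering what the student lacks).
    ordered = sum(tally[full - score] for score in scores)
    # A student with every subject needs nothing and matched himself.
    ordered -= scores.count(full)
    return ordered // 2


example = [
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
print(count_teams(example))
```

For the example from the question this gives 6, matching the hand count.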

Related

Algorithm for grouping train trips

Imagine you have a full calendar year in front of you. On some days you take the train, potentially even a few times in a single day, and each trip could be to a different location (i.e. the amount you pay for the ticket can be different for each trip).
So you would have data that looked like this:
Date: 2018-01-01, Amount: $5
Date: 2018-01-01, Amount: $6
Date: 2018-01-04, Amount: $2
Date: 2018-01-06, Amount: $4
...
Now you have to group this data into buckets. A bucket can span up to 31 consecutive days (no gaps) and cannot overlap another bucket.
If a bucket has fewer than 32 train trips it will be blue. If it has 32 or more train trips in it, it will be red. Each bucket also gets a value based on the sum of its ticket costs.
After you group all the trips, the blue buckets get thrown out, and the values of all the red buckets are summed up; we will call this the prize.
The goal is to get the highest value for the prize.
This is the problem I have. I can't think of a good algorithm to do this. If anyone knows a good way to approach this, or knows of anywhere else that can help with designing algorithms like this, I would like to hear it.
This can be solved by dynamic programming.
First, sort the records by date, and consider them in that order.
Let day (1), day (2), ..., day (n) be the days where the tickets were bought.
Let cost (1), cost (2), ..., cost (n) be the respective ticket costs.
Let fun (k) be the best prize if we consider only the first k records.
Our dynamic programming solution will calculate fun (0), fun (1), fun (2), ..., fun (n-1), fun (n), using the previous values to calculate the next one.
Base:
fun (0) = 0.
Transition:
What is the optimal solution, fun (k), if we consider only the first k records?
There are two possibilities: either the k-th record is dropped, then the solution is the same as fun (k-1), or the k-th record is the last record of a bucket.
Let us then consider all possible buckets ending with the k-th record in a loop, as explained below.
Look at records k, k-1, k-2, ..., down to the very first record.
Let the current index be i.
If the records from i to k span more than 31 consecutive days, break from the loop.
Otherwise, if the number of records, k-i+1, is at least 32, we can solve the subproblem fun (i-1) and then add the records from i to k, getting a prize of cost (i) + cost (i+1) + ... + cost (k).
The value fun (k) is the maximum of these possibilities, along with the possibility to drop the k-th record.
Answer: it is just fun (n), the case where we considered all the records.
In pseudocode:
fun[0] = 0
for k = 1, 2, ..., n:
    fun[k] = fun[k-1]
    cost_i_to_k = 0
    for i = k, k-1, ..., 1:
        if day[k] - day[i] >= 31:   // days day[i]..day[k] inclusive would span more than 31 days
            break
        cost_i_to_k += cost[i]
        if k-i+1 >= 32:
            fun[k] = max (fun[k], fun[i-1] + cost_i_to_k)
return fun[n]
It is not clear whether we are allowed to split records on a single day into different buckets.
If the answer is no, we will have to enforce it by not considering buckets starting or ending between records in a single day.
Technically, it can be done by a couple of if statements.
Another way is to consider days instead of records: instead of tickets which have day and cost, we will work with days.
Each day will have cost, the total cost of tickets on that day, and quantity, the number of tickets.
Edit: as per comment, we indeed can not split any single day.
Then, after some preprocessing to get days records instead of tickets records, we can go as follows, in pseudocode:
fun[0] = 0
for k = 1, 2, ..., n:
    fun[k] = fun[k-1]
    cost_i_to_k = 0
    quantity_i_to_k = 0
    for i = k, k-1, ..., 1:
        if k-i+1 > 31:
            break
        cost_i_to_k += cost[i]
        quantity_i_to_k += quantity[i]
        if quantity_i_to_k >= 32:
            fun[k] = max (fun[k], fun[i-1] + cost_i_to_k)
return fun[n]
Here, i and k are numbers of days.
Note that we consider all possible days in the range: if there are no tickets for a particular day, we just use zeroes as its cost and quantity values.
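As a minimal sketch of the day-based recurrence above, here is one possible Python version. The function name best_prize and the parameters max_span and min_trips are assumptions added so that small hand-checkable cases can be tried; with the defaults they match the 31-day and 32-trip limits from the question. The lists cost and quantity are assumed to hold one entry per calendar day, with zeroes for days without tickets.

```python
def best_prize(cost, quantity, max_span=31, min_trips=32):
    """Maximum total value of non-overlapping red buckets.

    cost[d], quantity[d]: total ticket cost and ticket count on day d
    (0-based lists, one entry per calendar day, zeroes for empty days).
    """
    n = len(cost)
    fun = [0] * (n + 1)  # fun[k]: best prize using only the first k days
    for k in range(1, n + 1):
        fun[k] = fun[k - 1]  # option: day k is in no red bucket
        cost_i_to_k = 0
        quantity_i_to_k = 0
        # Try every bucket of days i..k (1-based) of length <= max_span.
        for i in range(k, max(0, k - max_span), -1):
            cost_i_to_k += cost[i - 1]
            quantity_i_to_k += quantity[i - 1]
            if quantity_i_to_k >= min_trips:
                fun[k] = max(fun[k], fun[i - 1] + cost_i_to_k)
    return fun[n]


# Tiny case: buckets of at most 3 days, red from 4 tickets, $1 per ticket.
tickets = [2, 0, 2, 1, 1, 3]
print(best_prize(tickets, tickets, max_span=3, min_trips=4))
```

In the tiny case, days 1 to 3 and days 4 to 6 both form red buckets, so every ticket counts and the prize is 9.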
Edit2:
The above allows us to calculate the maximum total prize, but what about the actual configuration of buckets which gets us there?
The general method will be backtracking: at position k, we will want to know how we got fun (k), and transition either to k-1 if the optimal way was to skip the k-th record, or from k to i-1 for such i that the equation fun[k] = fun[i-1] + cost_i_to_k holds.
We proceed until k goes down to zero.
One of the two usual implementation approaches is to store par (k), a "parent", along with fun (k), which encodes how exactly we got the maximum.
Say, if par (k) = -1, the optimal solution skips k-th record.
Otherwise, we store the optimal index i in par (k), so that the optimal solution takes a bucket of records i to k inclusive.
The other approach is to store nothing extra.
Rather, we run a slightly modified version of the code which calculates fun (k).
But instead of assigning things to fun (k), we compare the right part of the assignment to the final value fun (k) we already got.
As soon as they are equal, we found the right transition.
In pseudocode, using the second approach, and days instead of individual records:
k = n
while k > 0:
    k = prev (k)

function prev (k):
    if fun[k] == fun[k-1]:
        return k-1
    cost_i_to_k = 0
    quantity_i_to_k = 0
    for i = k, k-1, ..., 1:
        if k-i+1 > 31:
            break
        cost_i_to_k += cost[i]
        quantity_i_to_k += quantity[i]
        if quantity_i_to_k >= 32:
            if fun[k] == fun[i-1] + cost_i_to_k:
                writeln ("bucket from $ to $: cost $, quantity $",
                    i, k, cost_i_to_k, quantity_i_to_k)
                return i-1
    assert (false, "can't happen")
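For comparison, the first approach described above (storing a parent par (k) next to fun (k)) might look like the following in Python. This is a sketch under the same assumptions as the day-based pseudocode; the name best_buckets and the parameters max_span and min_trips are hypothetical and only there to allow small test cases.

```python
def best_buckets(cost, quantity, max_span=31, min_trips=32):
    """Return (prize, buckets); buckets are 1-based inclusive day ranges."""
    n = len(cost)
    fun = [0] * (n + 1)
    par = [-1] * (n + 1)  # -1: day k skipped; otherwise first day of bucket
    for k in range(1, n + 1):
        fun[k] = fun[k - 1]
        cost_i_to_k = 0
        quantity_i_to_k = 0
        for i in range(k, max(0, k - max_span), -1):
            cost_i_to_k += cost[i - 1]
            quantity_i_to_k += quantity[i - 1]
            candidate = fun[i - 1] + cost_i_to_k
            if quantity_i_to_k >= min_trips and candidate > fun[k]:
                fun[k] = candidate
                par[k] = i

    # Walk the parents back from day n to recover the chosen buckets.
    buckets = []
    k = n
    while k > 0:
        if par[k] == -1:
            k -= 1
        else:
            buckets.append((par[k], k))
            k = par[k] - 1
    buckets.reverse()
    return fun[n], buckets


tickets = [2, 0, 2, 1, 1, 3]
print(best_buckets(tickets, tickets, max_span=3, min_trips=4))
```

The parent array costs O(n) extra memory but avoids re-running the inner loop during reconstruction.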
Simplify the challenge, but not too much, to make a manageable example which can be solved by hand.
That helps a lot in finding the right questions.
For example, take only 10 days, and buckets with a maximum length of 3:
For building buckets and coloring them, we need only the ticket count per day, here 0, 1, 2 or 3.
On average, we need more than one ticket per day; for example, 2-0-2 is 4 tickets in 3 days. Or 1-1-3, 1-3, 1-3-1, 3-1-2, 1-2.
But we can only choose 2 red buckets: 2-0-2 and (1-1-3 or 1-3-3 or 3-1-2), since 1-2 at the end is only 3 tickets, but we need at least 4 (one more ticket than the maximum day span per bucket).
But while 3-1-2 is obviously more tickets than 1-1-3, the value of fewer tickets might be higher.
The blue colored area is the less interesting one, because it doesn't feed itself, by ticket count.

Maximum Value taken by thief

Consider that we have sacks of gold and a thief who wants to get the maximum amount of gold. The thief must obey two rules:
1) The gold must be taken from contiguous sacks.
2) The same amount of gold must be taken from every one of those sacks.
N Sacks 1 <= N <= 1000
M quantity of Gold 0 <= M <= 100
Sample Input1:
3 0 5 4 4 4
Output:
16
Explanation:
4 is the minimum amount he can take from the sacks 3 to 6 to get the maximum value of 16.
Sample Input2:
2 4 3 2 1
Output:
8
Explanation:
2 is the minimum amount he can take from the sacks 1 to 4 to get the maximum value of 8.
I approached the problem by subtracting the values in the array from each other and looking for the transition point from negative to positive, but this doesn't solve the problem.
EDIT: code provided by OP to find the index:
int temp[6];
for(i=1;i<6;i++){
    for(j=i-1; j>=0;j--) {
        temp[j] = a[j] - a[i];
    }
}
for(i=0;i<6;i++){
    if(temp[i]>=0) {
        index =i;
        break;
    }
}
The best amount of gold (TBAG) taken from every sack is equal to the weight of some sack. Let's put the indexes of candidates on a stack, in order.
When we meet a heavier weight (than anything the stack contains), it definitely continues a "good sequence", so we just add its index to the stack.
When we meet a lighter weight (than the stack top), it breaks some "good sequences" and we can remove heavier candidates from the stack: they will not have a chance to be TBAG later. Remove the stack top until a lighter weight is met, and calculate the potentially stolen sum during this process.
Note that the stack always contains indexes of a strictly increasing sequence of weights, so we don't need to consider items before the index at the stack top (intermediate AG) in the calculation of the stolen sum (they will be considered later with another AG value).
for idx in Range(Sacks):
    while (not Stack.Empty) and (Sacks[Stack.Peek] >= Sacks[idx]): //smaller sack is met
        AG = Sacks[Stack.Pop]
        if Stack.Empty then
            firstidx = 0
        else
            firstidx = Stack.Peek + 1
        //range_length * smallest_weight_in_range
        BestSUM = MaxValue(BestSUM, AG * (idx - firstidx))
    Stack.Push(idx)
now check the rest:
    repeat the while loop with idx = number of sacks and without the >= condition, until the stack is empty
Every item is pushed and popped once, so linear time and space complexity.
P.S. I have a feeling that I've seen this problem before in another formulation...
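The stack procedure above can be sketched in Python as follows. The function name max_gold is an assumption, and instead of repeating the while loop after the main pass, this sketch runs one extra iteration with a sentinel weight of -1 (valid here because amounts are non-negative), which empties the stack in the same way. The same technique is commonly used for the "largest rectangle in a histogram" problem.

```python
def max_gold(sacks):
    """Best total when the same amount is taken from contiguous sacks."""
    best = 0
    stack = []  # indexes of sacks with strictly increasing weights
    for idx in range(len(sacks) + 1):
        # Sentinel -1 after the last sack forces every candidate to be popped.
        current = sacks[idx] if idx < len(sacks) else -1
        while stack and sacks[stack[-1]] >= current:
            amount = sacks[stack.pop()]
            first = stack[-1] + 1 if stack else 0
            # range length * smallest weight in that range
            best = max(best, amount * (idx - first))
        stack.append(idx)
    return best


print(max_gold([3, 0, 5, 4, 4, 4]))
print(max_gold([2, 4, 3, 2, 1]))
```

For the two sample inputs from the question this prints 16 and 8.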
I see two different approaches for the moment:
Naive approach: For each pair of indices (i,j) in the array, compute the minimum value m(i,j) of the array in the interval (i,j) and then compute score(i,j) = |j-i+1|*m(i,j). Then take the maximum score over all the pairs (i,j).
-> Complexity of O(n^3).
Less naive approach:
1) Compute the set of values of the array.
2) For each value, compute the maximum score it can get. For that, you just have to iterate once over all the values of the array. For example, when your sample input is [3 0 5 4 4 4] and the current value you are looking at is 3, it will give you a score of 12. (You'll first find a score of 3 thanks to the first index, and then a score of 12 due to indices 2 to 5.)
3) Take the maximum over all values found at step 2.
-> Complexity here is O(n*m), since step 2 is done at most m times and each pass can be done in O(n).
Maybe there is a better complexity, but I don't have a clue yet.
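The less naive approach might be written like this in Python; the function name max_gold_by_value is made up for the illustration. For each distinct value it tracks the length of the current run of sacks holding at least that much gold.

```python
def max_gold_by_value(sacks):
    """O(n*m) scan: try every distinct amount as the amount taken per sack."""
    best = 0
    for value in set(sacks):
        run = 0  # length of the current run of sacks with >= value gold
        for amount in sacks:
            run = run + 1 if amount >= value else 0
            best = max(best, value * run)
    return best


print(max_gold_by_value([3, 0, 5, 4, 4, 4]))
```

With M limited to 100 there are at most 101 distinct values, so this stays cheap for N up to 1000.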

Algorithm: Explanation of a graph based

In Futaba Kindergarten, where Shinchan studies, there are N students, s_0, s_1...s_(N-1), including Shinchan. Every student knows every other student directly or indirectly. Two students know each other directly if they are friends. Knowing each other indirectly means there is a third student who knows both of them. Knowing each other is a symmetric relation, i.e., if student s_a knows student s_b then student s_b also knows student s_a.
Ai-chan is a new admission in the class. She wants to be friends with all of them. But it would be very cumbersome to befriend each of the N students there. So she decided to befriend some of them such that every student in the class is either a friend of hers or a friend of a friend of hers.
Help her select those students such that befriending them will complete her objective. The fewer students, the better.
Input
The first line of input will contain two space-separated integers, N M: the number of students at Futaba Kindergarten, excluding Ai-chan, and the number of pairs of students who are friends with each other, i.e. who know each other directly. Then M lines follow. Each line contains two space-separated integers, s_u s_v, such that students s_u and s_v are friends with each other.
Output
In the first line print the total number, P, of such students. Then in the next line print P space-separated indices of students such that befriending them will help Ai-chan achieve her objective.
Constraints:
1 <= N <= 10^5
1 <= M <= min(10^5, N*(N-1)/2)
0 <= s_u, s_v <= N-1
s_u != s_v
Each pair of students (s_u, s_v) knows each other, directly or indirectly.
Score: ((N-P)/N)*200
**Sample Input**
6 7
0 1
0 2
1 2
1 3
2 4
3 4
3 5
**Sample Output**
4
0 2 3 5
In my opinion, befriending only 1 and 3 will do the job. Am I missing something?
I am not looking for the solution, just the explanation of the sample input and output.
A simple greedy algorithm gives a valid answer, although not necessarily the smallest one. Suppose that C is the set of students.
S = {}
R = {}
while (C != {}) {
    - sort the students in C based on their number of friends
    - pick the student s with the highest number of friends
    - add s to R: R = R + {s}
    - add s and the friends of s to the set S and remove them from C
}
print(R)
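A rough Python sketch of this greedy loop is given below. Several details are assumptions, since the pseudocode leaves them open: "number of friends" is taken to mean friends that are still in C, ties go to the lowest index, and the name greedy_friends is invented. As written it is quadratic in the worst case, so it illustrates the idea rather than meeting the N = 10^5 limit.

```python
def greedy_friends(n, pairs):
    """Greedily pick students so everyone is chosen or a friend of a chosen one."""
    friends = [set() for _ in range(n)]
    for u, v in pairs:
        friends[u].add(v)
        friends[v].add(u)

    remaining = set(range(n))  # the set C: students not yet covered
    chosen = []                # the set R
    while remaining:
        # Student in C with the most friends still in C (lowest index on ties).
        s = max(sorted(remaining), key=lambda x: len(friends[x] & remaining))
        chosen.append(s)
        remaining -= friends[s] | {s}
    return chosen


sample_pairs = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 4), (3, 4), (3, 5)]
print(greedy_friends(6, sample_pairs))
```

On the sample input this picks three students, which is valid but larger than the pair {1, 3} mentioned in the question, so the greedy choice is only a heuristic.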

Algorithm for random numbers

I have to implement an algorithm for a raffle. The problem is that I would like some of the participants to have more chances, because they have more points. How can I do that?
I thought of simply putting them in the raffle many times, but that doesn't seem legit.
Do you know any algorithms that can do that?
Thanks
Pseudo algorithm:
winnerTicket <- a random number between zero and sum ticket count - 1
currentTicket <- 0
For each participant in participants ordered by id
    If winnerTicket - currentTicket >= participant.ticketCount
        currentTicket += participant.ticketCount
    Else
        return participant
Why wouldn't that be "legit"? If you base the chance of winning on a number of points, you add each person to the raffle X times based on their points, and that person's chances increase.
I would solve it in this way.
You have a mapping: participant => number of chances. In many programming languages you can declare a mapping or dictionary like this:
{"player1": 2, "player2": 5, ... many more like these}
so you can iterate like this:
accumulatedMap = {} #an empty map
total = 0
for each pair of key:count in the mapping:
    total = total + count
    accumulatedMap[key] = total
#now, get random and calculate
element = random between 1 and total, inclusive.
for each pair of key:accumulated in accumulatedMap:
    if element <= accumulated:
        return key
#at this point, in the worst case the last key was returned.
This code is just an example. Remember that mappings don't always keep an insertion order when iterating.
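As one possible concrete version of the accumulated-totals idea, here is a Python sketch. The helper names pick_winner and draw are hypothetical; the ticket number is passed in separately so that the mapping from ticket to winner can be checked without randomness. It relies on Python dictionaries keeping insertion order (guaranteed since Python 3.7).

```python
import bisect
import itertools
import random


def pick_winner(chances, ticket):
    """Map a ticket number in [0, total) to the participant who holds it."""
    names = list(chances)
    # Running totals, e.g. {"a": 2, "b": 5} -> [2, 7].
    cumulative = list(itertools.accumulate(chances[name] for name in names))
    # First participant whose running total exceeds the ticket number.
    return names[bisect.bisect_right(cumulative, ticket)]


def draw(chances, rng=random):
    """Draw one winner with probability proportional to the chances."""
    total = sum(chances.values())
    return pick_winner(chances, rng.randrange(total))


chances = {"player1": 2, "player2": 5}
print([pick_winner(chances, t) for t in range(7)])
print(draw(chances))
```

If only the random draw itself is needed, the standard library's random.choices accepts a weights argument and does the same job in one call.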

Flipping coins using Scala

I am trying to solve the Flipping coins problem from codechef in scala. The problem statement is as follows:
There are N coins kept on the table, numbered from 0 to N - 1. Initially, each coin is kept tails up. You have to perform two types of operations:
1) Flip all coins numbered between A and B. This is represented by the command "0 A B".
2) Answer how many coins numbered between A and B are heads up. This is represented by the command "1 A B".
Input : The first line contains two integers, N and Q. Each of the next Q lines is either of the form "0 A B" or "1 A B" as mentioned above.
Output : Output 1 line for each of the queries of the form "1 A B" containing the required answer for the corresponding query.
Sample Input :
4 7
1 0 3
0 1 2
1 0 1
1 0 0
0 0 3
1 0 3
1 3 3
Sample Output :
0
1
0
2
1
Constraints :
1 <= N <= 100000
1 <= Q <= 100000
0 <= A <= B <= N - 1
In the most simplistic way, I was thinking of initializing an Array of Ints in scala as follows:
var coins = new Array[Int](1000)
If I encounter the command 0 A B, I will simply set the elements from index A until B+1 to 1 as follows:
for(i <- 5 until 8){
  coins(i) = 1
}
If I encounter the command 1 A B, I will take a slice of the array from A until B+1 and count the number of 1's in that given slice and I will do it as follows:
val headCount = coins.slice(5,8).count(x => x == 1)
It seems like this operation takes O(n) in the worst case, and apparently this can be optimized to be solved in logarithmic time.
Can somebody point out what I might be doing wrong here and how this problem can be solved in the most optimal manner?
Thanks
i don't know much about scala these days, but i can suggest an answer for the more general question about O(log(n)). typically such algorithms use trees, and i think you could do so here.
if you construct a balanced tree, with the coins as leaves, then you could store in each node the total number of coins and the number of heads in the leaves below that node. you could imagine code that flips coins working out which leaves to visit from the node information, and working in O(n) time (you still need to flip coins). but if the flipping code also updated the node data then the number of heads would be O(log(n)) because you can use the node info - you don't need to go to the leaves.
so that gives you O(n) for one command and O(log(n)) for the other.
but you can go better than that. you can make the flip operation O(log(n)) too, i think. to do this you would add to each node a "flipped" flag. if set then all the nodes below that point are flipped. there are some book-keeping details, but the general idea is there.
and if you take this to its logical conclusion, you don't actually need to store the leaves, at least at the start. you just add nodes with the level of detail required as you process the commands. at this point you basically have the interval tree mentioned in the comments.
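The tree with a "flipped" flag described above is usually called a segment tree with lazy propagation. A minimal Python sketch follows; the class name FlipTree and its method names are invented for this example, and it stores all nodes up front in arrays rather than creating them on demand.

```python
class FlipTree:
    """Range flip / range head-count over n coins, all tails initially."""

    def __init__(self, n):
        self.n = n
        self.heads = [0] * (4 * n)    # heads in the segment of each node
        self.lazy = [False] * (4 * n)  # pending flip for the node's children

    def _apply(self, node, lo, hi):
        # Flipping a whole segment turns every tail into a head and back.
        self.heads[node] = (hi - lo + 1) - self.heads[node]
        self.lazy[node] = not self.lazy[node]

    def _push(self, node, lo, hi):
        if self.lazy[node]:
            mid = (lo + hi) // 2
            self._apply(2 * node, lo, mid)
            self._apply(2 * node + 1, mid + 1, hi)
            self.lazy[node] = False

    def flip(self, a, b, node=1, lo=0, hi=None):
        if hi is None:
            hi = self.n - 1
        if b < lo or hi < a:
            return
        if a <= lo and hi <= b:
            self._apply(node, lo, hi)
            return
        self._push(node, lo, hi)
        mid = (lo + hi) // 2
        self.flip(a, b, 2 * node, lo, mid)
        self.flip(a, b, 2 * node + 1, mid + 1, hi)
        self.heads[node] = self.heads[2 * node] + self.heads[2 * node + 1]

    def count(self, a, b, node=1, lo=0, hi=None):
        if hi is None:
            hi = self.n - 1
        if b < lo or hi < a:
            return 0
        if a <= lo and hi <= b:
            return self.heads[node]
        self._push(node, lo, hi)
        mid = (lo + hi) // 2
        return (self.count(a, b, 2 * node, lo, mid)
                + self.count(a, b, 2 * node + 1, mid + 1, hi))


# The sample input from the question: (command, A, B)
commands = [(1, 0, 3), (0, 1, 2), (1, 0, 1), (1, 0, 0),
            (0, 0, 3), (1, 0, 3), (1, 3, 3)]
tree = FlipTree(4)
answers = []
for op, a, b in commands:
    if op == 0:
        tree.flip(a, b)
    else:
        answers.append(tree.count(a, b))
print(answers)
```

Both operations touch O(log n) nodes, and on the sample input the answers are 0, 1, 0, 2, 1 as in the expected output.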
One clean way to model this is as a BitSet, where the integer values in the set represent the indices of the heads on the board. Then you can flip the coins in a range like this:
def flip(coins: BitSet, a: Int, b: Int) = coins ^ BitSet(a to b: _*)
You can count the heads in a range similarly:
def heads(coins: BitSet, a: Int, b: Int) = (coins & BitSet(a to b: _*)).size
Or the (probably faster) mutable java.util.BitSet versions:
import java.util.BitSet
def flip(coins: BitSet, a: Int, b: Int): Unit = coins.flip(a, b + 1)
def heads(coins: BitSet, a: Int, b: Int) = coins.get(a, b + 1).cardinality
This isn't necessarily optimal, but it's a fairly efficient approach, since you're just flipping bits.

Resources