Probabilistic algorithm - algorithm

We are given a binary array that has either n zeros or floor(n/2) zeros and ceiling(n/2) ones.
We want to decide whether the array includes ones.
Q. Suggest a randomized algorithm that runs in O(1) time and gives the correct answer with a probability of at least 3/4. The algorithm can give a wrong answer, but for no more than 1/4 of the possible inputs.
I would like to get some direction on how to solve this question.

Check a random item in the array:
If item == 0 return first possibility (n zeroes)
If item == 1 return second possibility (n/2 zeroes and n/2 ones)
Let's look at what's going on: the only way to give an incorrect answer is when the array is of the second kind,
but we draw item == 0 and answer with the first possibility. The conditional probability of that (given the second kind) is
p = 1/2
If we check two random items
p = 1/4 (two items are zeroes)
If we check three random items
p = 1/8 (three items are zeroes)
Now, let's compute the Bayesian probability of an incorrect answer. Let
P0 - probability of the 1st (all zeroes) outcome
P1 - probability of the 2nd (half zeroes, half ones) outcome
Perror = P1 * p / (P0 + P1) <= 1/4
Or
P1 * p / (P0 + P1) <= 1/4
p <= (P0 + P1) / 4 / P1
p <= P0 / (4 * P1) + 1/4
In the worst case, P0 = 0 (P1 = 1), we get the condition for p:
p <= 1/4
So we should check two random items of the array and then:
If both items are 0, we answer "All zeroes case"
If any item is 1, we answer "Half zeroes, half ones case"
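A minimal Python sketch of the two-probe test described above. The function name `contains_ones`, the `samples` parameter and the optional `rng` argument are illustrative assumptions, not part of the original answer. Probes are drawn with replacement, which matches the p = 1/2 per-probe analysis.

```python
import random

def contains_ones(arr, samples=2, rng=random):
    """Probe `samples` random positions; return True if any probe sees a 1."""
    for _ in range(samples):
        if arr[rng.randrange(len(arr))] == 1:
            return True   # a 1 is conclusive: the array is the half/half kind
    return False          # every probe was 0: guess "all zeroes"

print(contains_ones([0] * 8))       # an all-zero array is never misclassified
print(contains_ones([0, 1] * 4))    # wrong (False) with probability about 1/4
```

Each call does a constant number of index lookups, so the running time does not depend on n. Raising `samples` to 3 would lower the error bound to 1/8.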

Related

In which order should the woman bring the cats back in order to minimize the time?

A woman watches her cats leave one by one with different speeds in different directions. She takes a motorcycle with one extra seat, follows the cats, picks up one cat at a time and brings it back home. Each cat moves with a constant individual speed Vi and left home at time Ti. In which order should the woman bring the cats back in order to minimize the time?
I am trying to solve this problem but do not know how to begin.
Summary:
Sort the cats according to the metric v / x in descending order, where v is the cat's constant speed and x is the cat's initial displacement at time t = 0. It doesn't matter how you break ties. Once the order is initially established, it will remain the most efficient order in which to get cats as long as it is followed; so follow it.
Candidates debunked:
In both cases, allow the motorcycle speed to be w = 20.
It is proposed that you get cats in order from fastest to slowest. Counterexample: Cat #1 (x, v) = (1, 9) and Cat #2 (x, v) = (100, 10).
It is proposed that you get cats in order from closest to farthest. Counterexample: Cat #1 (x, v) = (1, 1) and Cat #2 (x, v) = (2, 19).
Detailed Derivation:
Let c(k) refer to the kth cat the lady picks up, v(k) refer to the speed of that cat and x(k) to the cat's initial displacement (at time t = 0, which we set as the time the lady starts her motorcycle in pursuit of the first cat).
The total time taken to get the first cat is:
t(1) = 2 * x(1) / (w - v(1))
where w is the constant speed of the motorcycle. Since this expression is going to be important we can motivate every part of it:
2 * comes from the fact that the lady must catch the cat, and then spend the same amount of time to return the cat home;
x(1) / (w - v(1)) is the time taken to reach the cat, that is, close the distance x(1) by traveling w - v(1) faster than the cat's v(1).
The time to get the first two cats is:
t(2) = t(1) + 2 * (x(2) + v(2)t(1)) / (w - v(2))
That is, it takes time equal to the time to get the first cat plus the time to get the second cat. The extra v(2)t(1) term accounts for the fact that the second cat moves while the lady is getting the first cat; otherwise, this part is the same.
Rearranging this expression, we get:
t(2) = t(1)(1 + 2 * v(2) / (w - v(2))) + 2 * x(2) / (w - v(2))
We define the following derived terms:
T(k) = 2 * x(k) / (w - v(k))
s(k) = 2 * v(k) / (w - v(k)) + 1
Now we rewrite:
t(1) = T(1)
t(2) = s(2)T(1) + T(2)
and continue
t(1) = T(1)
t(2) = s(2)T(1) + T(2)
t(3) = s(3)s(2)T(1) + s(3)T(2) + T(3)
...
t(n) = s(n)...s(2)T(1) + s(n)...s(3)T(2) + ... + T(n)
This last expression gives us the total time to get all n cats:
s(n)...s(2)T(1) + s(n)...s(3)T(2) + ... + T(n)
Now we assume that we have an optimal solution in that the cats are picked up in the most efficient order possible. To derive useful properties about this hypothetical optimal solution, we can use the supposed optimality to infer that swapping cats produces a solution that is no better. Imagine swapping cats j and j+1:
... + s(n)...s(j+1)T(j) + s(n)...s(j+2)T(j+1) + ...
<= ... + s(n)...s(j+2)s(j)T(j+1) + s(n)...s(j+2)T(j) + ...
Terms involving T(k) for k < j have both s(j) and s(j+1) and by the commutativity of multiplication they are unaffected by the swap. Terms involving T(k) for k > j + 1 have neither s(j) nor s(j+1) and so cannot be affected by the swap. Only the terms with T(k) such that j <= k <= j + 1 are affected by the swap, so we can remove like terms:
s(n)...s(j+2)s(j+1)T(j) + s(n)...s(j+2)T(j+1)
<= s(n)...s(j+2)s(j)T(j+1) + s(n)...s(j+2)T(j)
The partial product s(n)...s(j+2) is common to all remaining terms and must be positive, so we can remove this like term by dividing both sides of the inequality:
s(j+1)T(j) + T(j+1) <= s(j)T(j+1) + T(j)
Rearrange this as follows:
(s(j+1) - 1)T(j) <= (s(j) - 1)T(j+1)
Finally:
(s(j+1) - 1) / T(j+1) <= (s(j) - 1) / T(j)
Recalling our definitions of s(k) and T(k), simplify to put this in terms of v and x:
v(j+1) / x(j+1) <= v(j) / x(j)
That is: if we have an optimal solution, it must be the case that the ratio of cats' speeds to initial displacements must be in descending order. This is a necessary, but perhaps not sufficient, condition.
Note that this result agrees with intuition:
Get still cats last (v = 0)
Get cats that haven't left yet first (x = 0)
Get cats approaching the motorcycle's speed first (or never)
Get cats that are really far away last (x -> +inf)
It also gives the correct result for the two-cat case; and in that case, if the ratios of speed to displacement are equal, then it can be easily shown that it doesn't matter which order you get the cats in (if they are unequal, you must get the cat with the higher ratio first).
Now - I have not addressed the case where cats may have the same ratio. It's not immediately obvious to me that the order in which you get cats with the same ratio doesn't matter.
However, suppose you have chosen optimally up until some point k < n. Now you need to decide which of two cats with the same ratio to go after. As we already mentioned, for the two-cat problem, it's a wash: so I think the answer is that it can't matter which one you choose, as either order among the two will take the same time and "look" the same afterwards. To see that two cats that start with the same ratio keep the same ratio:
v(i) / x(i) = c; X(i) = x(i) + v(i)t = x(i) + x(i)ct = x(i)(1 + ct)
v(j) / x(j) = c; X(j) = x(j) + v(j)t = x(j) + x(j)ct = x(j)(1 + ct)
So the ratio changes over time (if you take X as the new initial displacement) but two cats that start out with the same ratio will keep it. The new ratio will be:
v / x = c; v / X = v / (x(1 + ct)) = c / (1 + ct)
It is important to note that these ratios don't "cross over" each other either; if you start out with a higher or lower ratio, it will change over time, but it will not become higher or lower than other cats' ratios:
c(i) / (1 + c(i)t) > c(j) / (1 + c(j)t)
<=> c(i) + c(i)c(j)t > c(j) + c(i)c(j)t
<=> c(i) > c(j)
Based on all of these considerations, my best answer is:
Sort the cats according to the metric v / x in descending order. It doesn't matter how you break ties. Get the cats in that order.
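To sanity-check the ordering rule on small inputs, the following Python sketch evaluates the total time t(n) from the derivation for a given pick-up order and compares the v / x ordering against brute force over all orders. The names `total_time` and `order_by_ratio` are hypothetical, cats are (x, v) pairs with x > 0 and v < w, and this is only a numerical check, not a proof.

```python
from itertools import permutations

def total_time(cats, w):
    """Total time to fetch the cats in the given order; each cat is (x, v)."""
    t = 0.0
    for x, v in cats:
        # By the time the chase starts the cat is at x + v*t; the gap closes
        # at speed w - v, and the trip home takes just as long as the chase.
        t += 2 * (x + v * t) / (w - v)
    return t

def order_by_ratio(cats):
    """Sort cats by v / x, largest ratio first (assumes x > 0)."""
    return sorted(cats, key=lambda cat: cat[1] / cat[0], reverse=True)

cats = [(1, 9), (100, 10), (5, 3), (2, 1)]
w = 20
best = min(permutations(cats), key=lambda order: total_time(order, w))
print("ratio order:", order_by_ratio(cats), total_time(order_by_ratio(cats), w))
print("brute force:", list(best), total_time(best, w))
```

On this sample the two lines report the same order and the same total time, which is what the swap argument predicts when all ratios are distinct.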
Update
Thanks to @Patrick87's counterexample, I know my solution does not work in the general case. However, I'm going to leave it here because it provides a simpler solution under the extra assumption that all cats start moving from home at time 0. Please see @Patrick87's solution for the general case.
Short Answer:
She must start with the fastest cat, i.e., order the cats by velocity (in decreasing order).
Simplify the problem: assume there are only two cats, one running and the other walking very slowly. Which one will you go after first?
Detailed answer:
The total distance of all cats from home at time 0 is 0:
X(0) = 0
Therefore, if we assume that the woman catches the last cat at time Tn then the total distance the woman has traveled at Tn is:
X(Tn) = (V1 * T1) + ... + (Vn * Tn)
Where Ti is the time she catches the i-th cat. The Vi values are predetermined, so we need to minimize this expression over the values of Ti.
We have n POSITIVE V values with n positive T coefficients to assign to them. Minimizing it under these conditions is easy:
Give the largest V, the smallest coefficient T and so on.
Which means start with the fastest cat (largest V) and bring it back first (multiply by smallest T) and continue.

Pyramids dynamic programming

I encountered this question in an interview and could not figure it out. I believe it has a dynamic programming solution but it eludes me.
Given a number of bricks, output the total number of 2d pyramids possible, where a pyramid is defined as any structure where a row of bricks has strictly fewer bricks than the row below it. You do not have to use all the bricks.
A brick is simply a square, the number of bricks in a row is the only important bit of information.
Really stuck with this one, I thought it would be easy to solve each problem 1...n iteratively and sum. But coming up with the number of pyramids possible with exactly i bricks is evading me.
example, n = 6 (pyramids grouped by the number of bricks used, 1 through 6):

X

XX

X
XX   XXX

X
XXX   XXXX

XX    X
XXX   XXXX   XXXXX

X
XX    XX     X
XXX   XXXX   XXXXX   XXXXXX
So the answer is 13 possible pyramids from 6 bricks.
edit
I am positive this is a dynamic programming problem, because it makes sense to (once you've determined the first row) simply look to the index in your memorized array of your remainder of bricks to see how many pyramids fit atop.
It also makes sense to consider only bottom rows of width at least n/2, because we can't have more bricks atop than on the bottom row. EXCEPT, and this is where I lose it and my mind falls apart, in a few cases you can, e.g. N = 10:
X
XX
XXX
XXXX
Now the bottom row has 4 bricks but there are 6 left to place on top.
But with n = 11 we cannot have a bottom row with fewer than n/2 bricks. There is another weird inconsistency like that with n = 4, where we cannot have a bottom row of n/2 = 2 bricks.
Let's choose a suitable definition:
f(n, m) = # pyramids out of n bricks with base of size < m
The answer you are looking for now is (given that N is your input number of bricks):
f(N, N+1) - 1
Let's break that down:
The first N is obvious: that's your number of bricks.
Your bottom row will contain at most N bricks (because that's all you have), so N+1 is a sufficient strict upper bound.
Finally, the - 1 is there because technically the empty pyramid is also a pyramid (and will thus be counted) but you exclude that from your solutions.
The base cases are simple:
f(n, 0) = 1 for any n >= 0
f(0, m) = 1 for any m >= 0
In both cases, it's the empty pyramid that we are counting here.
Now, all we need still is a recursive formula for the general case.
Let's assume we are given n and m and choose to have i bricks on the bottom layer. What can we place on top of this layer? A smaller pyramid, for which we have n - i bricks left and whose base has size < i. This is exactly f(n - i, i).
What is the range for i? We can choose an empty row so i >= 0. Obviously, i <= n because we have only n bricks. But also, i <= m - 1, by definition of m.
This leads to the recursive expression:
f(n, m) = sum f(n - i, i) for 0 <= i <= min(n, m - 1)
You can compute f recursively, but using dynamic programming it will be faster of course. Storing the results matrix is straightforward though, so I leave that up to you.
Coming back to the original claim that f(N, N+1)-1 is the answer you are looking for, it doesn't really matter which value to choose for m as long as it is > N. Based on the recursive formula it's easy to show that f(N, N + 1) = f(N, N + k) for every k >= 1:
f(N, N + k) = sum f(N - i, i) for 0 <= i <= min(N, N + k - 1)
= sum f(N - i, i) for 0 <= i <= N
= sum f(N - i, i) for 0 <= i <= min(N, N + 1 - 1)
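The recurrence for f(n, m) can be sketched in Python with memoization. This is one possible way to store the results, not necessarily what the answer had in mind; the wrapper name `count_pyramids` is a hypothetical helper, and `functools.lru_cache` stands in for the results matrix.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def f(n, m):
    """Pyramids (the empty one included) from at most n bricks, base size < m."""
    if n == 0 or m == 0:
        return 1  # only the empty pyramid fits
    # Choose i bricks for the bottom row, then stack a pyramid with base < i.
    return sum(f(n - i, i) for i in range(0, min(n, m - 1) + 1))

def count_pyramids(total_bricks):
    """Non-empty pyramids buildable from at most total_bricks bricks."""
    return f(total_bricks, total_bricks + 1) - 1

print(count_pyramids(6))  # 13, matching the example in the question
```

The second argument strictly decreases on every recursive call, so the recursion always terminates.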
In how many ways can you build a pyramid of width n? By putting any pyramid of width n-1 or less anywhere atop the layer of n bricks. So if p(n) is the number of pyramids of width n, then p(n) = sum [m=1 to n-1] (p(m) * c(n, m)), where c(n, m) is the number of ways you can place a layer of width m atop a layer of width n (I trust that you can work that one out yourself).
This, however, doesn't place a limitation on the number of bricks. Generally, in DP, any resource limitation must be modeled as a separate dimension. So your problem is now p(n, b): "How many pyramids can you build of width n with a total of b bricks"? In the recursive formula, for each possible way of building a smaller pyramid atop your current one, you need to refer to the correct amount of remaining bricks. I leave it as a challenge for you to work out the recursive formula; let me know if you need any hints.
You can think of your recursion as: given x bricks left where you used n bricks on last row, how many pyramids can you build. Now you can fill up rows from either top to bottom row or bottom to top row. I will explain the former case.
Here the recursion might look something like this (left is number of bricks left and last is number of bricks used on last row)
f(left,last)=sum (1+f(left-i,i)) for i in range [last+1,left] inclusive.
Since when you use i bricks on current row you will have left-i bricks left and i will be number of bricks used on this row.
Code:
int calc(int left, int last) {
    int total = 0;
    if (left <= 0) return 0; // terminal case, no pyramid with no brick
    for (int i = last + 1; i <= left; i++) {
        // 1 counts the pyramid that stops at this row; the call counts its extensions
        total += 1 + calc(left - i, i);
    }
    return total;
}
I will leave it to you to implement memoized or bottom-up dp version. Also you may want to start from bottom row and fill up upper rows in pyramid.
Since we are asked to count pyramids of any cardinality less than or equal to n, we may consider each cardinality in turn (pyramids of 1 element, 2 elements, 3...etc.) and sum them up. But in how many different ways can we compose a pyramid from k elements? The same number as the count of distinct partitions of k (for example, for k = 6, we can have (6), (1,5), (2,4), and (1,2,3)). A generating function/recurrence for the count of distinct partitions is described in Wikipedia and a sequence at OEIS.
Recurrence, based on the Pentagonal number Theorem:
q(k) = ak + q(k − 1) + q(k − 2) − q(k − 5) − q(k − 7) + q(k − 12) + q(k − 15) − q(k − 22)...
where ak is (−1)^(abs(m)) if k = 3*m^2 − m for some integer m and is 0 otherwise.
(The subtracted coefficients are generalized pentagonal numbers.)
Since the recurrence described in Wikipedia obliges the calculation of all preceding q(n)'s to arrive at a larger q(n), we can simply sum the results along the way to obtain our result.
JavaScript code:
function numPyramids(n){
    var distinctPartitions = [1,1],
        pentagonals = {},
        m = 1,
        _m = 1,   // declared explicitly to avoid an implicit global
        pentagonal_m = 2,
        result = 1;

    // keys are twice the generalized pentagonal numbers, values are |m|
    while (pentagonal_m / 2 <= n){
        pentagonals[pentagonal_m] = Math.abs(_m);
        m++;
        _m = m % 2 == 0 ? -m / 2 : Math.ceil(m / 2);
        pentagonal_m = _m * (3 * _m - 1);
    }

    for (var k=2; k<=n; k++){
        // the a_k term of the recurrence
        distinctPartitions[k] = pentagonals[k] ? Math.pow(-1,pentagonals[k]) : 0;
        var cs = [1,1,-1,-1],
            c = 0;
        for (var i in pentagonals){
            if (i / 2 > k)
                break;
            distinctPartitions[k] += cs[c]*distinctPartitions[k - i / 2];
            c = c == 3 ? 0 : c + 1;
        }
        // running total of q(1) + ... + q(k)
        result += distinctPartitions[k];
    }
    return result;
}
console.log(numPyramids(6)); // 13
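As a cross-check on the pentagonal recurrence, the same counts can be obtained with a plain subset-sum style table. This is a different and simpler technique than the one in the answer (O(n^2) time), sketched here in Python; the name `num_pyramids` and the table `ways` are illustrative.

```python
def num_pyramids(n):
    """Sum of distinct-partition counts q(1) + ... + q(n)."""
    # ways[s] = number of sets of distinct row sizes that sum to exactly s
    ways = [0] * (n + 1)
    ways[0] = 1
    for size in range(1, n + 1):
        # Walk totals downwards so each row size is used at most once.
        for s in range(n, size - 1, -1):
            ways[s] += ways[s - size]
    return sum(ways[1:])

print(num_pyramids(6))  # 13
```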

Probability based on quicksort partition

I have come across this question:
Let 0<α<.5 be some constant (independent of the input array length n). Recall the Partition subroutine employed by the QuickSort algorithm, as explained in lecture. What is the probability that, with a randomly chosen pivot element, the Partition subroutine produces a split in which the size of the smaller of the two subarrays is ≥α times the size of the original array?
Its answer is 1-2*α.
Can anyone explain how this answer is obtained? Please help.
The choice of the pivot element is random, with uniform distribution.
There are N elements in the array, and we will assume that N is large (or we won't get the answer we want).
If 0≤α≤1, the probability that the number of elements smaller than the pivot is less than αN is α. The probability that the number of elements greater than the pivot is less than αN is the same. If α≤ 1/2, then these two possibilities are exclusive.
To say that the smaller subarray is of length ≥αN, is to say that neither of these conditions holds, therefore the probability is 1-2α.
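The large-N claim can be checked numerically. The Python sketch below counts, for every possible pivot rank, whether the smaller side of the split has at least αn elements. It assumes distinct elements so that each rank is equally likely, and the function name `balanced_split_probability` is hypothetical.

```python
def balanced_split_probability(n, alpha):
    """Exact probability, over a uniformly random pivot rank, that the
    smaller side of the partition has at least alpha * n elements."""
    good = 0
    for rank in range(n):                  # rank = number of elements below the pivot
        smaller = min(rank, n - 1 - rank)  # the pivot itself is on neither side
        if smaller >= alpha * n:
            good += 1
    return good / n

for alpha in (0.1, 0.25, 0.4):
    print(alpha, balanced_split_probability(100_000, alpha), 1 - 2 * alpha)
```

For large n the two printed values agree up to a term of order 1/n, which is why the argument assumes N is large.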
The other answers didn't quite click with me so here's another take:
If the smaller of the 2 subarrays must have size >= αn, you can deduce that the pivot must also be in position >= αn. This is obvious by contradiction: if the pivot is in a position < αn, then there is a subarray smaller than αn. By the same reasoning the pivot must also be in position <= (1 - α)n. Any larger value for the pivot will yield a subarray smaller than αn on the "right hand side".
This means that αn <= pivot position <= (1 - α)n.
What we want to calculate then is the probability of that event (call it A), i.e. Pr(A) = Pr(αn <= pivot position <= (1 - α)n).
The way we calculate the probability of an event is to sum the probabilities of the constituent outcomes, i.e. that the pivot lands at each position from αn to (1 - α)n, each of which has probability 1/n.
That sum is expressed as:
Pr(A) = sum of 1/n over all positions i with αn <= i <= (1 - α)n
Which easily simplifies to:
Pr(A) = ((1 - α)n - αn) / n
With some cancellation we get:
Pr(A) = 1 - 2α
Just one more approach for solving the problem (for those who have an uneasy time understanding it, like I have).
First.
Since we are talking about "the smaller of the two subarrays", its length is at most 1/2 * n (n being the number of elements in the original array).
Second.
If 0 < a < 0.5, then a * n is less than 1/2 * n as well.
Thus from now on we are talking about two numbers, a * n and the length of the smaller subarray, both bounded by 0 at the lowest and 1/2 * n at the highest.
Third.
Let's imagine a die with the numbers from 1 to 6 on its sides. Let's choose a number from 1 to 6, for example 4. Now roll the die. Each number has probability 1/6 of being the outcome of this roll. Thus for the event "outcome is less than or equal to 4" the probability equals the sum of the probabilities of each of these outcomes, namely the numbers 1, 2, 3 and 4. Altogether p(x <= 4) = 4 * 1/6 = 4/6 = 2/3. So the probability of the event "outcome is bigger than 4" is p(x > 4) = 1 - p(x <= 4) = 1 - 2/3 = 1/3.
Fourth.
Let's go back to our problem. The "chosen number" is now a * n. And we are going to roll a die with the numbers from 0 to (1/2 * n) on it to get k, the number of elements in the smaller of the subarrays. The probability that the outcome is at most (a * n) equals the sum of the probabilities of all outcomes from 0 to (a * n). And the probability of any particular outcome k is p(k) = 1 / (1/2 * n).
Therefore p(k <= a * n) = (a * n) * (1 / (1/2 * n)) = 2 * a.
From this we can easily conclude that p(k > a * n) = 1 - p(k <= a * n) = 1 - 2 * a.
Array length is n.
For the smaller array to have length >= αn, the pivot should be greater than at least αn elements. At the same time, the pivot should be smaller than at least αn elements (else the smaller array's size will be less than required).
So out of n elements we have to select one among (1-2α)n elements.
required probability is n(1-2α)/n.
Hence 1-2α
The probability would be, the number of desired elements/Total number of elements.
In this case, ((1-α)n-(αn))/n
Since α lies between 0 and 0.5, (1-α) must be bigger than α. Hence the number of elements contained between them would be,
(1-α-α)n=(1-2α)n
and so,the probability would be,
(1-2α)n/n=1-2α
Another approach:
List the "more balanced" options:
αn + 1 to (1 - α)n - 1
αn + 2 to (1 - α)n - 2
...
αn + k to (1 - α)n - k
So k in total. We know that the most balanced is n / 2 to n / 2, so:
αn + k = n / 2 => k = n(1/2 - α)
Similarly, list the "less balanced" options:
αn - 1 to (1 - α)n + 1
αn - 2 to (1 - α)n + 2
...
αn - m to (1 - α)n + m
So m in total. We know that the least balanced is 0 to n so:
αn - m = 0 => m = αn
Since all these options happen with equal probability we can use the frequency definition of probability so:
Pr{More balanced} = (total # of more balanced) / (total # of options) =>
Pr{More balanced} = k / (k + m) = n(1/2 - α) / (n(1/2 - α) + αn) = 1 - 2α

Maximum of sums of unsorted array and each of a number of sorted arrays

Given an unsorted array
A = a_1 ... a_n
And a set of sorted Arrays
B_i = b_i_1 ... b_i_n # for i from 1 to $large_number
I would like to find the maximums from the (not yet calculated) sum arrays
C_i = (a_1 + b_i_1) ... (a_n + b_i_n)
for each i.
Is there a trick to do better than just calculating all the C_i and finding their maximums in O($large_number * n)?
Can we do better when we know that the B arrays are just shifts from an endless sequence,
e.g.
S = 0 1 4 9 16 ...
B_i = S[i:i+n]
(The above sequence has the maybe advantageous property that (S_i - S_i-1 > S_i-1 - S_i-2))
There are $large_number * n data in your first problem, so there can't be any such trick.
You can prove this with an adversary argument. Suppose you have an algorithm that solves your problem without looking at all n * $large_number entries of b. I'm going to pick a fixed a, namely (-10, -20, -30, ..., -10n). The first $large_number * n - 1 times the algorithm looks at an entry b_(i,j), I'll answer that it's 10j, for a sum of zero. The last time it looks at an entry, I'll answer that it's 10j+1, for a sum of 1.
If $large_number is Omega(n), your second problem requires you to look at n * $large_number entries of S, so it also can't have any such trick.
However, if you specify S, there may be something. And if $large_number <= n/2 (or whatever it is), then, all of the entries of S must be sorted, so you only have to look at the last B.
If we don't know anything, I don't think it's possible to do better than O($large_number * n).
However, if it's just shifts of an endless sequence, we can do it in O($large_number + n):
We calculate B_0 (here meaning the sum of the array's entries) in O(n).
Then B_1 = (B_0 - S[0]) + S[n]
And in general: B_i = (B_i-1 - S[i-1]) + S[i-1+n].
So we can calculate all the other entries and the max in O($large_number).
This is for a general sequence - if we have some info about it, it might be possible to do better.
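The update rule above can be sketched in Python. Following this answer, B_i is read as the sum of the window S[i:i+n]; the function name `window_sums` and the parameter `count` (standing in for $large_number) are hypothetical, and the sequence must hold at least n + count - 1 entries.

```python
def window_sums(seq, n, count):
    """Sums of seq[i:i+n] for i = 0 .. count-1, each derived from the previous one."""
    current = sum(seq[:n])              # O(n) for the first window
    sums = [current]
    for i in range(1, count):
        # Drop the entry that left the window, add the one that entered it.
        current += seq[i - 1 + n] - seq[i - 1]
        sums.append(current)
    return sums

squares = [k * k for k in range(12)]    # S = 0 1 4 9 16 ...
sums = window_sums(squares, 4, 5)
print(sums, max(sums))
```

After the first window each further sum costs O(1), which is where the O($large_number + n) total comes from.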
we know that the B arrays are just shifts from an endless sequence,
e.g.
S = 0 1 4 9 16 ...
B_i = S[i:i+n]
You can easily calculate S[i:i+n] as (sum of squares from 1 to i+n) - (sum of squares from 1 to i-1)
See https://math.stackexchange.com/questions/183316/how-to-get-to-the-formula-for-the-sum-of-squares-of-first-n-numbers
With the provided example, S1 = 0, S2 = 1, S3 = 4...
Let f(n) = SUM of Si for i=1 to n = (n-1)(n)(2n-1)/6
B_i = f(i+n) - f(i-1)
You then add SUM(A) to each sum.
Another approach is to calculate the difference between B_i and B_(i-1):
That would be: S[i:i+n] - S[i-1:i+n-1] = S(i+n) - S(i-1)
That way, you can just calculate the difference of the sums of each array with the previous one. In my understanding, since Ci = SUM(Bi)+SUM(A), SUM(A) becomes a constant that is irrelevant in finding the maximum.

Algorithm to compute k fractions of form 1/r summing up to 1

Given k, we need to write 1 as a sum of k fractions of the form 1/r.
For example,
For k=2, 1 can uniquely be written as 1/2 + 1/2.
For k=3, 1 can be written as 1/3 + 1/3 + 1/3 or 1/2 + 1/4 + 1/4 or 1/6 + 1/3 + 1/2
Now, we need to consider all such sets of k fractions that sum up to 1 and return the highest denominator among all such sets; for instance, for the k=3 case above, our algorithm should return 6.
I came across this problem in a coding competition and couldn't come up with an algorithm for it. A bit of Google searching later revealed that such fractions are called Egyptian fractions, but those are probably sets of distinct fractions summing up to a particular value (not like 1/2 + 1/2). Also, I couldn't find an algorithm to compute Egyptian fractions (if they are at all helpful for this problem) when their number is restricted to k.
If all you want to do is find the largest denominator, there's no reason to find all the possibilities. You can do this very simply:
public long largestDenominator(int k){
    long denominator = 1;
    for(int i=1;i<k;i++){
        denominator *= denominator + 1;
    }
    return denominator;
}
For you recursive types:
public long largestDenominator(int k){
    if(k == 1)
        return 1;
    long last = largestDenominator(k-1);
    return last * (last + 1); // or (last * last) + last
}
Why is it that simple?
To create the set, you need to insert the largest fraction that will keep it under 1 at each step(except the last). By "largest fraction", I mean by value, meaning the smallest denominator.
For the simple case k=3, that means you start with 1/2. You can't fit another half, so you go with 1/3. Then 1/6 is left over, giving you three terms.
For the next case k=4, you take that 1/6 off the end, since it won't fit under one, and we need room for another term. Replace it with 1/7, since that's the biggest value that fits. The remainder is 1/42.
Repeat as needed.
For example:
2 : [2,2]
3 : [2,3,6]
4 : [2,3,7,42]
5 : [2,3,7,43,1806]
6 : [2,3,7,43,1807,3263442]
As you can see, it rapidly becomes very large. Rapidly enough that you'll overflow a long if k>7. If you need larger values of k, you'll need an appropriate container (e.g. BigInteger in Java or C#).
It maps perfectly to this sequence:
a(n) = a(n-1)^2 + a(n-1), a(0)=1.
You can also see the relationship to Sylvester's sequence:
a(n+1) = a(n)^2 - a(n) + 1, a(0) = 2
Wikipedia has a very nice article explaining the relationship between the two, as pointed out by Peter in the comments.
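For larger k, a language with arbitrary-precision integers sidesteps the overflow. The Python sketch below builds the greedy set of denominators via Sylvester's sequence and uses exact fractions to confirm that it sums to 1. The function name `greedy_unit_fractions` is hypothetical, and the sketch only illustrates the construction described above; it does not prove that the last denominator is the largest possible.

```python
from fractions import Fraction

def greedy_unit_fractions(k):
    """Denominators of the greedy k-term decomposition of 1, e.g. [2, 3, 6] for k = 3."""
    denominators = []
    d = 2                        # Sylvester's sequence: 2, 3, 7, 43, ...
    for _ in range(k - 1):
        denominators.append(d)
        d = d * (d - 1) + 1      # a(n+1) = a(n)^2 - a(n) + 1
    denominators.append(d - 1)   # the last term closes the remaining gap exactly
    return denominators

for k in range(2, 7):
    terms = greedy_unit_fractions(k)
    print(k, terms, sum(Fraction(1, d) for d in terms) == 1)
```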
I never heard of Egyptian fractions before but here are some thoughts:
Idea
You can think of them geometrically:
Start with a unit square (1x1)
Draw either vertical or horizontal lines dividing the square into equal parts.
Repeat optionally the drawing of lines inside any of the sub-boxes evenly.
Stop any time you want.
The rectangles present will form a set of fractions of the form 1/n that add to 1.
You can count them and they might equal your 'k'.
Depending on how many equal sections you divided a rectangle into, it will tell whether you have 1/2 or 1/3 or whatever. 1/6 is 1/2 of 1/3 or 1/3 of 1/2. (i.e. You divided by 2 and then one of the sub-boxes by 3 OR the other way around.)
Idea 2
You start with 1 box. This is the fraction 1/1 with k=1.
When you sub-divide by n you add n to the count of boxes (k or of fractions summed) and subtract 1.
When you sub-divide any of those boxes, again, subtract 1 and add n, the number of divisions. Note that n-1 is the number of lines you drew to divide them.
More
You are going to start searching for the answer with k. Obviously k * 1/k = 1 so you have one solution right there.
How about k-1?
There's a solution there: (k-2) * 1/(k-1) + 2 * (1/((k-1)*2))
How did I get that? I made k-1 equal sections (with k-2 vertical lines) and then divided the last one in half horizontally.
Each solution is going to consist of:
taking a prior solution
using j fewer lines at some stage and dividing one of the boxes or sub-boxes into j+1 equal sections.
I don't know if all solutions can be formed by repeating this rule starting from k * 1/k
I do know you can get effective duplicates this way. For example: k * 1/k with j = 1 => (k-2) * 1/(k-1) + 2 * (1/((k-1)*2)) [from above] but k * 1/k with j = (k-2) => 2 * (1/((k-1)*2)) + (k-2) * 1/(k-1) [which just reverses the order of the parts]
Interesting
k = 7 can be represented by 1/2 + 1/4 + 1/8 + ... + 1/(2^6) + 1/(2^6) and the general case is 1/2 + ... + 1/(2^(k-1)) + 1/(2^(k-1)).
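Similarly, any odd k can be represented by 2 * (1/3) + 2 * (1/9) + ... + 3 * [1/(3^((k-1)/2))], that is, two copies of each smaller power of 1/3 followed by three copies of the last one.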
I suspect there are similar patterns for all integers up to k.
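Both patterns can be verified with exact arithmetic. In the Python sketch below, the helper names `halving_pattern` and `thirds_pattern` are hypothetical, and the odd-k pattern is read as two copies of each smaller power of 1/3 followed by three copies of 1/3^((k-1)/2), which is an interpretation of the formula above.

```python
from fractions import Fraction

def halving_pattern(k):
    """Denominators 2, 4, ..., 2^(k-1), 2^(k-1): k terms, assumes k >= 2."""
    return [2 ** e for e in range(1, k)] + [2 ** (k - 1)]

def thirds_pattern(k):
    """Denominators 3, 3, 9, 9, ..., then 3^((k-1)/2) three times: odd k >= 3."""
    j = (k - 1) // 2
    denominators = []
    for e in range(1, j):
        denominators += [3 ** e, 3 ** e]
    return denominators + [3 ** j] * 3

def sums_to_one(denominators):
    return sum(Fraction(1, d) for d in denominators) == 1

print(halving_pattern(7), sums_to_one(halving_pattern(7)))
print(thirds_pattern(7), sums_to_one(thirds_pattern(7)))
```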

Resources