A sequence A=[a1, a2,...,an] is a valley sequence, if there's an index i with 1 < i < n such that:
a1 > a2 > .... > ai
and
ai < ai+1 < .... < an.
It is given that a valley sequence must contain at least three elements in it.
What I'm really confused about is how to find an algorithm that finds the element ai, as described above, in O(log n) time.
Will it be similar to an O(log n) binary search?
And if we do have a binary search algorithm that finds an element of an array in O(log n) time, can we improve the runtime to O(log log n)?
To have an O(log n) algorithm, we have to halve the problem size in constant time at each step.
In this problem specifically, we can select a mid-point, and check if its slope is increasing, decreasing or a bottom.
If the slope is increasing, the part after the mid-point could be ignored
else if the slope is decreasing, the part before the mid-point could be ignored
else the mid-point should be the bottom, hence we find our target.
Java code example:
Input: [99, 97, 89, 1, 2, 4, 6], output: 1
public int findBottomValley(int[] valleySequence) {
    int start = 0, end = valleySequence.length - 1, mid;
    while (start + 1 < end) {
        mid = start + (end - start) / 2;
        if (checkSlope(mid, valleySequence) < 0) {
            // case decreasing
            start = mid;
        } else if (checkSlope(mid, valleySequence) > 0) {
            // case increasing
            end = mid;
        } else {
            // find our target
            return valleySequence[mid];
        }
    }
    // left over with two points
    if (valleySequence[start] < valleySequence[end]) {
        return valleySequence[start];
    }
    return valleySequence[end];
}
The helper function checkSlope(index, list) checks the slope at the given index of the list by comparing three points: index - 1, index and index + 1. It returns a negative number if the slope is decreasing, a positive number if the slope is increasing, and 0 if the numbers at index - 1 and index + 1 are both larger than the number at index.
Note: the algorithm makes assumptions that:
the list has at least three items
the slope between adjacent elements is never flat. If adjacent numbers could be equal, we would be unable to decide which side the bottom is on: it could lie to the left of such a flat stretch or to the right, so we would have to fall back to a linear search.
Since random access of an array is already constant O(1), having an O(logn) access time may not help the algorithm.
There is a solution that works a lot like binary search. Set a = 2 and b = n - 1. At each step, we will only need to consider candidates with index k such that a <= k <= b. Compute m = (a + b) / 2 (integer divide, so round down) and then consider array elements at indices m - 1, m and m + 1. If these elements are decreasing, then set a = m and keep searching. If these elements are increasing, then set b = m and keep searching. If these elements form a valley sequence, then return m as the answer. If b - a < 2, then there is no valley.
Since we halve the search space each time, the complexity is logarithmic. Yes, we access three elements and perform two comparisons at each stage, but calculation will show that just affects constant factors.
Note that this answer depends on these sequences being strictly decreasing and then increasing. If consecutive elements can repeat, the best solution is linear in the worst case.
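The binary-search-like procedure described above can be sketched in Python as follows. This is only a sketch under stated assumptions, not the answerer's exact code: it uses 0-based indices, the name `find_valley_index` is made up for illustration, and it narrows the range to `mid + 1` or `mid - 1` (rather than `mid`) so that the loop always makes progress. The input is assumed to be strictly decreasing and then strictly increasing, with at least three elements.

```python
def find_valley_index(seq):
    """Return the 0-based index of the bottom of a valley sequence.

    Assumes seq is strictly decreasing, then strictly increasing,
    and has at least three elements (illustrative sketch).
    """
    lo, hi = 1, len(seq) - 2  # the bottom is never the first or last element
    while lo <= hi:
        mid = (lo + hi) // 2
        if seq[mid - 1] > seq[mid] < seq[mid + 1]:
            return mid                # both neighbours are larger: bottom found
        if seq[mid - 1] > seq[mid]:
            lo = mid + 1              # still descending: bottom is to the right
        else:
            hi = mid - 1              # already ascending: bottom is to the left
    raise ValueError("input is not a valley sequence")


print(find_valley_index([99, 97, 89, 1, 2, 4, 6]))
```

Each iteration does a constant number of comparisons and discards about half of the remaining candidates, which is where the logarithmic bound comes from.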
Just saw the second part. In general, no: a way to find specific elements in logarithmic time, or even constant time, does not help here. The problem is that we really have no useful idea what to look for. If the spacing of all elements' values is greater than their spacing in the array (this isn't hard to arrange), then I can't see how you'd pick something to search for.
Consider a binary sequence b of length N. Initially, all the bits are set to 0. We define a flip operation with 2 arguments, flip(L,R), such that:
All bits with indices between L and R are "flipped", meaning a bit with value 1 becomes a bit with value 0 and vice-versa. More exactly, for all i in range [L,R]: b[i] = !b[i].
Nothing happens to bits outside the specified range.
You are asked to determine the number of possible different sequences that can be obtained using exactly K flip operations modulo an arbitrary given number, let's call it MOD.
More specifically, each test contains on the first line a number T, the number of queries to be given. Then there are T queries, each one being of the form N, K, MOD with the meaning from above.
1 ≤ N, K ≤ 300 000
T ≤ 250
2 ≤ MOD ≤ 1 000 000 007
Sum of all N-s in a test is ≤ 600 000
time limit: 2 seconds
memory limit: 65536 kbytes
Example :
Input :
1
2 1 1000
Output :
3
Explanation :
There is a single query. The initial sequence is 00. We can do the following operations :
flip(1,1) ⇒ 10
flip(2,2) ⇒ 01
flip(1,2) ⇒ 11
So there are 3 possible sequences that can be generated using exactly 1 flip.
Some quick observations that I've made, although I'm not sure they are totally correct :
If K is big enough, that is if we have a big enough number of flips at our disposal, we should be able to obtain 2^n sequences.
If K=1, then the result we're looking for is N(N+1)/2. It's also C(n,1)+C(n,2), where C is the binomial coefficient.
Currently trying a brute force approach to see if I can spot a rule of some kind. I think this is a sum of some binomial coefficients, but I'm not sure.
I've also come across a somewhat simpler variant of this problem, where the flip operation only flips a single specified bit. In that case, the result is
C(n,k)+C(n,k-2)+C(n,k-4)+...+C(n,(1 or 0)). Of course, there's the special case where k > n, but it's not a huge difference. Anyway, it's pretty easy to understand why that happens. I guess it's worth noting.
Here are a few ideas:
We may assume that no flip operation occurs twice (otherwise, we can assume that it did not happen). It does affect the number of operations, but I'll talk about it later.
We may assume that no two segments intersect. Indeed, if L1 < L2 < R1 < R2, we can just do the (L1, L2 - 1) and (R1 + 1, R2) flips instead. The case when one segment is inside the other is handled similarly.
We may also assume that no two segments touch each other. Otherwise, we can glue them together and reduce the number of operations.
These observations give the following formula for the number of different sequences one can obtain by flipping exactly k segments without "redundant" flips: C(n + 1, 2 * k) (we choose 2 * k ends of segments. They are always different. The left end is exclusive).
If we had to perform no more than K flips, the answer would be the sum for k = 0...K of C(n + 1, 2 * k).
Intuitively, it seems that it's possible to transform a sequence of no more than K flips into a sequence of exactly K flips (for instance, we can flip the same segment two more times and add 2 operations. We can also split a segment of more than two elements into two segments and add one operation).
Running the brute force search (I know that it's not a real proof, but it looks correct combined with the observations mentioned above) shows that the answer is this sum minus 1 if n or k is equal to 1, and exactly the sum otherwise.
That is, the result is C(n + 1, 0) + C(n + 1, 2) + ... + C(n + 1, 2 * K) - d, where d = 1 if n = 1 or k = 1 and 0 otherwise.
Here is code I used to look for patterns running a brute force search and to verify that the formula is correct for small n and k:
reachable = set()
was = set()

def other(c):
    """
    returns '1' if c == '0' and '0' otherwise
    """
    return '0' if c == '1' else '1'

def flipped(s, l, r):
    """
    Flips the [l, r] segment of the string s and returns the result
    """
    res = s[:l]
    for i in range(l, r + 1):
        res += other(s[i])
    res += s[r + 1:]
    return res

def go(xs, k):
    """
    Exhaustive search. was is used to speed up the search to avoid checking the
    same string with the same number of remaining operations twice.
    """
    p = (xs, k)
    if p in was:
        return
    was.add(p)
    if k == 0:
        reachable.add(xs)
        return
    for l in range(len(xs)):
        for r in range(l, len(xs)):
            go(flipped(xs, l, r), k - 1)

def calc_naive(n, k):
    """
    Counts the number of reachable sequences by running an exhaustive search
    """
    xs = '0' * n
    global reachable
    global was
    was = set()
    reachable = set()
    go(xs, k)
    return len(reachable)

def fact(n):
    return 1 if n == 0 else n * fact(n - 1)

def cnk(n, k):
    if k > n:
        return 0
    return fact(n) // fact(k) // fact(n - k)

def solve(n, k):
    """
    Uses the formula shown above to compute the answer
    """
    res = 0
    for i in range(k + 1):
        res += cnk(n + 1, 2 * i)
    if k == 1 or n == 1:
        res -= 1
    return res

if __name__ == '__main__':
    # Checks that the formula gives the right answer for small values of n and k
    for n in range(1, 11):
        for k in range(1, 11):
            assert calc_naive(n, k) == solve(n, k)
This solution is much better than the exhaustive search. For instance, it can run in O(N * K) time per test case if we compute the coefficients using Pascal's triangle. Unfortunately, it is not fast enough. I know how to solve it more efficiently for prime MOD (using Lucas' theorem), but I do not have a solution in the general case.
Multiplicative modular inverses can't solve this problem immediately as k! or (n - k)! may not have an inverse modulo MOD.
Note: I assumed that C(n, m) is defined for all non-negative n and m and is equal to 0 if n < m.
I think I know how to solve it for an arbitrary MOD now.
Let's factorize the MOD into prime factors p1^a1 * p2^a2 * ... * pn^an. Now we can solve this problem for each prime factor independently and combine the results using the Chinese remainder theorem.
Let's fix a prime p. Let's assume that p^a|MOD (that is, we need to get the result modulo p^a). We can precompute all p-free parts of the factorial and the maximum power of p that divides the factorial for all 0 <= n <= N in linear time using something like this:
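# N (the largest factorial argument) and p (the prime) are given
powers = [0] * (N + 1)
p_free = list(range(N + 1))
p_free[0] = 1
cur_p = p
while cur_p <= N:  # cur_p runs over p, p^2, p^3, ... up to N
    for i in range(cur_p, N + 1, cur_p):
        powers[i] += 1
        p_free[i] //= p
    cur_p *= p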
Now the p-free part of the factorial is the product of p_free[i] for all i <= n and the power of p that divides n! is the prefix sum of the powers.
Now we can divide two factorials: the p-free part is coprime with p^a so it always has an inverse. The powers of p are just subtracted.
We're almost there. One more observation: we can precompute the inverses of the p-free parts in linear time. Let's compute the inverse of the p-free part of N! using Euclid's algorithm. Now we can iterate over all i from N - 1 down to 0. The inverse of the p-free part of i! is the inverse for (i + 1)! times p_free[i + 1] (it's easy to prove if we rewrite the inverse of the p-free part as a product, using the fact that elements coprime with p^a form an abelian group under multiplication).
This algorithm runs in O(N * number_of_prime_factors + the time to solve the system using the Chinese remainder theorem + sqrt(MOD)) time per test case. Now it looks good enough.
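To make the precomputation and the division of factorials concrete, here is a minimal Python sketch for a single prime power p^a. The function name `make_binomial_mod_prime_power` and its parameters are illustrative assumptions, and the built-in `pow(x, -1, mod)` (Python 3.8+) stands in for the Euclid-based inverse. Combining several prime powers with the Chinese remainder theorem is not shown.

```python
def make_binomial_mod_prime_power(n_max, p, a):
    """Return a function c(n, k) computing C(n, k) mod p**a for n <= n_max."""
    mod = p ** a
    powers = [0] * (n_max + 1)          # exponent of p in each i
    p_free = list(range(n_max + 1))     # i with all factors of p removed
    p_free[0] = 1
    cur_p = p
    while cur_p <= n_max:
        for i in range(cur_p, n_max + 1, cur_p):
            powers[i] += 1
            p_free[i] //= p
        cur_p *= p

    # Prefix products (p-free part of i!) and prefix sums (power of p in i!).
    free_fact = [1] * (n_max + 1)
    exponent = [0] * (n_max + 1)
    for i in range(1, n_max + 1):
        free_fact[i] = free_fact[i - 1] * p_free[i] % mod
        exponent[i] = exponent[i - 1] + powers[i]

    # Inverses of the p-free parts, filled from n_max downwards.
    inv_free = [1] * (n_max + 1)
    inv_free[n_max] = pow(free_fact[n_max], -1, mod)
    for i in range(n_max - 1, -1, -1):
        inv_free[i] = inv_free[i + 1] * p_free[i + 1] % mod

    def c(n, k):
        if k < 0 or k > n:
            return 0
        e = exponent[n] - exponent[k] - exponent[n - k]
        if e >= a:
            return 0                    # divisible by p**a
        res = free_fact[n] * inv_free[k] % mod * inv_free[n - k] % mod
        return res * pow(p, e, mod) % mod

    return c


c8 = make_binomial_mod_prime_power(30, 2, 3)
print(c8(10, 5))  # C(10, 5) = 252, reduced modulo 8
```

After the linear-time setup, each coefficient costs a constant number of multiplications plus one modular power of p.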
You're on a good path with binomial-coefficients already. There are several factors to consider:
Think of your number as a binary-string of length n. Now we can create another array counting the number of times a bit will be flipped:
[0, 1, 0, 0, 1] number
[a, b, c, d, e] number of flips.
But all even numbers of flips lead to the same result, and so do all odd numbers of flips. So basically the relevant part of the distribution can be represented mod 2.
Logical next question: how many different combinations of even and odd values are available? We'll take care of the ordering later on; for now just assume the flipping-array is ordered descending for simplicity. We start off with k as the only flipping-number in the array. Now we want to add a flip. Since the whole flipping-array is used mod 2, we need to remove two from the value of k to achieve this and insert them into the array separately. E.g.:
[5, 0, 0, 0]  mod 2  [1, 0, 0, 0]
[3, 1, 1, 0]         [1, 1, 1, 0]
[4, 1, 0, 0]         [0, 1, 0, 0]
As the last example shows (remember we're operating modulo 2 in the final result), moving a single 1 doesn't change the number of flips in the final outcome. Thus we always have to flip an even number of bits in the flipping-array. If k is even, so is the number of flipped bits, and if k is odd, the number of flipped bits is odd too, no matter what the value of n is.
So now the question is of course how many different ways of filling the array are available? For simplicity we'll start with mod 2 right away.
Obviously we start with 1 flipped bit if k is odd, otherwise with 0, and we always add 2 flipped bits at a time. We can continue with this until we have either flipped all n bits (or at least as many as we can flip)
v = (k % 2 == n % 2) ? n : n - 1
or we can't spread k further over the array.
v = k
Putting this together:
noOfAvailableFlips:
    if k < n:
        return k
    else:
        return (k % 2 == n % 2) ? n : n - 1
So far, so good: there are always v / 2 flipping-arrays (mod 2) that differ by the number of flipped bits. Now we come to the next part: permuting these arrays. This is just a simple permutation function (permutation with repetition, to be precise):
flipArrayNo(flippedbits):
    return factorial(n) / (factorial(flippedbits) * factorial(n - flippedbits))
Putting it all together:
solutionsByFlipping(n, k):
    res = 0
    for i in [k % 2, noOfAvailableFlips(), step=2]:
        res += flipArrayNo(i)
    return res
This also shows that for sufficiently large numbers we can't obtain 2^n sequences, for the simple reason that we cannot arrange operations as we please. The number of flips that actually affect the outcome will always be either even or odd depending upon k. There's no way around this. The best result one can get is 2^(n-1) sequences.
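For readers who want to run the pseudocode from this answer, here is a direct Python transcription. It is only a sketch: the snake_case names mirror the pseudocode, n and k are passed explicitly (an assumption, since the pseudocode leaves them implicit), and `math.comb` replaces the factorial quotient.

```python
from math import comb


def no_of_available_flips(n, k):
    # Largest number of flipped bits that k operations can produce.
    if k < n:
        return k
    return n if k % 2 == n % 2 else n - 1


def flip_array_no(n, flipped_bits):
    # Number of arrays of length n with exactly flipped_bits ones.
    return comb(n, flipped_bits)


def solutions_by_flipping(n, k):
    top = no_of_available_flips(n, k)
    # Start at k % 2 and add two flipped bits at a time.
    return sum(flip_array_no(n, i) for i in range(k % 2, top + 1, 2))


print(solutions_by_flipping(3, 5))  # C(3, 1) + C(3, 3)
```

For n = 2 and k = 1 this transcription returns C(2, 1) = 2, and for k >= n it returns 2^(n-1), in line with the bound stated above.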
For completeness, here's a dynamic program. It can deal easily with arbitrary modulo since it is based on sums, but unfortunately I haven't found a way to speed it beyond O(n * k).
Let a[n][k] be the number of binary strings of length n with k non-adjacent blocks of contiguous 1s that end in 1. Let b[n][k] be the number of binary strings of length n with k non-adjacent blocks of contiguous 1s that end in 0.
Then:
# we can append 1 to any arrangement of k non-adjacent blocks of contiguous 1's
# that ends in 1, or to any arrangement of (k-1) non-adjacent blocks of contiguous
# 1's that ends in 0:
a[n][k] = a[n - 1][k] + b[n - 1][k - 1]
# we can append 0 to any arrangement of k non-adjacent blocks of contiguous 1's
# that ends in either 0 or 1:
b[n][k] = b[n - 1][k] + a[n - 1][k]
# complete answer would be sum (a[n][i] + b[n][i]) for i = 0 to k
I wonder if the following observations might be useful: (1) a[n][k] and b[n][k] are zero when n < 2*k - 1, and (2) on the flip side, for values of k greater than ⌊(n + 1) / 2⌋ the overall answer seems to be identical.
Python code (full matrices are defined for simplicity, but I think only one row of each would actually be needed, space-wise, for a bottom-up method):
a = [[0] * 11 for i in range(0,11)]
b = [([1] + [0] * 10) for i in range(0,11)]

def f(n,k):
    return fa(n,k) + fb(n,k)

def fa(n,k):
    global a
    if a[n][k] or n == 0 or k == 0:
        return a[n][k]
    elif n == 2*k - 1:
        a[n][k] = 1
        return 1
    else:
        a[n][k] = fb(n-1,k-1) + fa(n-1,k)
        return a[n][k]

def fb(n,k):
    global b
    if b[n][k] or n == 0 or n == 2*k - 1:
        return b[n][k]
    else:
        b[n][k] = fb(n-1,k) + fa(n-1,k)
        return b[n][k]

def g(n,k):
    return sum([f(n,i) for i in range(0,k+1)])

# example
print(g(10,10))
for i in range(0,11):
    print(a[i])
print()
for i in range(0,11):
    print(b[i])
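The remark about needing only one row of each matrix can be sketched as a bottom-up loop. This is an illustrative variant, not the code above: the name `count_up_to_k_blocks` and the `mod` parameter are assumptions, the empty string is treated as "ending in 0" with zero blocks, and k is capped at ⌊(n + 1) / 2⌋ because more blocks cannot fit.

```python
def count_up_to_k_blocks(n, k, mod):
    """Count binary strings of length n with at most k blocks of 1s, modulo mod."""
    k = min(k, (n + 1) // 2)
    a = [0] * (k + 1)        # a[j]: strings with j blocks that end in 1
    b = [0] * (k + 1)        # b[j]: strings with j blocks that end in 0
    b[0] = 1 % mod           # the empty string
    for _ in range(n):
        new_a = [0] * (k + 1)
        new_b = [0] * (k + 1)
        for j in range(k + 1):
            # Appending 0 keeps the number of blocks unchanged.
            new_b[j] = (a[j] + b[j]) % mod
            # Appending 1 extends a block, or opens a new one after a 0.
            new_a[j] = (a[j] + (b[j - 1] if j > 0 else 0)) % mod
        a, b = new_a, new_b
    return (sum(a) + sum(b)) % mod


print(count_up_to_k_blocks(10, 10, 10**9 + 7))
```

Only the previous row of each table is kept, so memory is O(k) while the running time stays O(n * k).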
Suppose I have a collection of words with a predefined binary prefix code. Given a very large random binary chunk of data, I can parse this chunk into words using the prefix code.
I want to determine, at least approximately (for random chunks of very large length), the expected number of hits for each word (how many times it appears in the decoded text).
At first glance, the problem appears trivial - the probability of each word being scanned from the random pool of bits is completely determined by its length (since each bit can be either 0 or 1). But I suspect this to be an incorrect answer to the problem above since words have different lengths and thus this probability is not the same as the expected number of hits (divided by the length of the data chunk).
UPD: I was asked (in comments below) to state this problem mathematically, so here it goes.
Let w be a list of words written with only zeros and ones (our alphabet consists of only two letters). Furthermore, no word in w is a prefix of any other word. Thus w forms a legitimate binary prefix code. I want to know (at least approximately) the mean value of hits, for each word in w, averaged over all possible binary chunks of data with fixed size n. n can be taken very large, much much larger than any of the lengths of our words. However, words have different lengths and this can not be neglected.
I would appreciate any references to attempts to solve this.
My brief answer: the expected number of hits (or rather the expected proportion of hits) can be calculated for every given list of words.
I will not describe the full algorithm, but just do the following example in detail for illustration: let us fix the following very simple list of three words: 0, 10, 11.
For every n, there are 2^n different data chunks of length n (I mean n bits), each occurring with the same probability 2^(-n).
The first observation is that not all the data chunks can be decoded exactly: e.g. when you decode the data 0101, a single 1 remains at the end.
Let us write U(n) for the number of length n data chunks that CAN be decoded exactly, and write V(n) for the others (i.e. those with an extra 1 in the end). The following recurrence relations are clear:
U(n) + V(n) = 2^n
V(n) = U(n - 1)
with the initial values U(0) = 1 and V(0) = 0.
A simple calculation then yields:
U(n) = (2^(n + 1) + (- 1)^n) / 3.
Now let A(n) (resp. B(n), C(n)) be the sum of the number of hits on the word 0 (resp. 10, 11) for all the U(n) exact data chunks, and let a(n) (resp. b(n), c(n)) be the same sum for all the V(n) inexact data chunks (the last 1 does not count in this case).
Then we have the following relations:
a(n) = A(n - 1), b(n) = B(n - 1), c(n) = C(n - 1)
A(n) = A(n - 1) + U(n - 1) + A(n - 2) + A(n - 2)
B(n) = B(n - 1) + B(n - 2) + U(n - 2) + B(n - 2)
C(n) = C(n - 1) + C(n - 2) + C(n - 2) + U(n - 2)
Explanation of relations 2, 3 and 4 (those for A(n), B(n) and C(n)):
If D is an exact data chunk of length n, then there are three possibilities:
D ends with 0, and deleting this 0 yields an exact data chunk of length n - 1;
D ends with 10, and deleting this 10 yields an exact data chunk of length n - 2;
D ends with 11, and deleting this 11 yields an exact data chunk of length n - 2.
Thus, for example, when we sum up all the hit numbers for 0 in all exact data chunks of length n, the contributions of the three cases are respectively A(n - 1) + U(n - 1), A(n - 2), A(n - 2). Similarly for the other two equalities.
Now, solving these recurrence relations, we get:
A(n) = 2/9 * n * 2^n + (smaller terms)
B(n) = C(n) = 1/9 * n * 2^n + (smaller terms)
Since U(n) = 2/3 * 2^n + (smaller terms), our conclusion is that there are approximately n/3 hits on 0, n/6 hits on 10, n/6 hits on 11.
Note that the same proportions hold if we take also the V(n) inexact data chunks into account, because of the relations between A(n), B(n), C(n), U(n) and a(n), b(n), c(n), V(n).
This method generalizes to any list of words. It's the same idea as solving this problem with dynamic programming: define the states, find the recurrence relations, and set up the transition matrix.
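As a numerical check of the recurrences for the words 0, 10, 11, the following Python sketch iterates them with exact integers and divides by n * U(n). The function name `hit_rates` is an illustrative assumption. It covers only the exactly decodable chunks and requires n >= 1.

```python
def hit_rates(n):
    """Average hits per bit for the words 0, 10, 11 over exact chunks of length n."""
    # Index m holds the value for length m; lengths 0 and 1 are base cases.
    U = [1, 1]
    A = [0, 1]   # the only exact chunk of length 1 is "0": one hit on the word 0
    B = [0, 0]
    C = [0, 0]
    for m in range(2, n + 1):
        U.append(2 ** m - U[m - 1])                    # U(m) + V(m) = 2^m, V(m) = U(m - 1)
        A.append(A[m - 1] + U[m - 1] + 2 * A[m - 2])
        B.append(B[m - 1] + 2 * B[m - 2] + U[m - 2])
        C.append(C[m - 1] + 2 * C[m - 2] + U[m - 2])
    return tuple(x[n] / (n * U[n]) for x in (A, B, C))


print(hit_rates(300))
```

For large n the three values approach 1/3, 1/6 and 1/6, with a deviation that shrinks like 1/n.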
To go further
I think the following might also be true, which will simplify the answer further.
Let w_1, ..., w_k be the words in the list, and let l_1, ..., l_k be their lengths.
For every i = 1, ..., k, let a_i be the proportion of hits of w_i, i.e. for length n data chunks the expected number of hits for w_i is a_i * n + (smaller terms).
Then, my feeling (conjecture) is that a_i * 2^(l_i) is the same for all i, i.e. if one word is one bit longer than another, then its hit number is a half of that of the other.
This conjecture, if correct, is probably not very difficult to prove. But I'm too lazy to think now...
If this is true, then we can calculate those a_i very easily, because we have the identity:
sum (a_i * l_i) = 1.
Let me illustrate this with the above example.
We have w_1 = 0, w_2 = 10, w_3 = 11, hence l_1 = 1, l_2 = l_3 = 2.
According to the conjecture, we should have a_1 = 2 * a_2 = 2 * a_3. Thus a_2 = a_3 = x and a_1 = 2x. The above equality becomes:
2x * 1 + x * 2 + x * 2 = 1
Hence x = 1 / 6, and we have a_1 = 1 / 3, a_2 = a_3 = 1 / 6, as can be verified by the above calculation.
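If the conjecture holds, the proportions follow directly from the word lengths. The sketch below encodes exactly that assumption and nothing more: a_i is taken proportional to 2^(-l_i) and scaled so that sum (a_i * l_i) = 1. The function name `conjectured_hit_rates` is made up for illustration.

```python
from fractions import Fraction


def conjectured_hit_rates(lengths):
    """Proportions a_i, assuming a_i * 2^(l_i) is the same for every word."""
    weights = [Fraction(1, 2 ** length) for length in lengths]
    # Scale so that sum(a_i * l_i) == 1.
    norm = sum(length * w for length, w in zip(lengths, weights))
    return [w / norm for w in weights]


print(conjectured_hit_rates([1, 2, 2]))
```

For the lengths 1, 2, 2 of the example this gives 1/3, 1/6 and 1/6.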
Let's make a simple machine that can recognize words: a DFA with an accepting state for each word. To construct this DFA, start with a binary tree with each left-child-edge labeled 0 and each right-child-edge labeled 1. Each leaf is either a word-accepter (if the path to that leaf down the tree is the word's spelling) or is garbage (a string of letters that isn't a prefix for any valid word). We wire up "restart" edges from the leaves back to the root of the tree*.
Let's find out what the frequency of matching each word would be, if we had a string of infinite length. To do this, treat the graph of the DFA as a Markov state transition diagram, initialize the starting state to be at the root with probability 1 and all other states 0, and find the steady state distribution (by finding the dominant eigenvector of the transition diagram's corresponding matrix).
Our string is not of infinite length. But since n is large, I expect "edge effects" to not matter so much. We can approximate the matching frequency by word by taking the matching rate by word and multiplying by n. If we want to be more precise, instead of taking the eigenvector we could just take the transition matrix to the nth power and multiply that with the starting distribution to get the resulting distribution after n letters.
*This isn't quite precise, because this Markov system would spend some nonzero amount of time at the root, when after recognizing a word or skipping garbage it should immediately go to the 0-child or 1-child depending. So we don't actually wire up our "restart" edges to a root: from a word-accepting node we wire up two restart edges (one to the 0-child and one to the 1-child of the root); we replace garbage nodes that are left-children with an edge to the 0-child; and we replace garbage nodes that are right-children with an edge to the 1-child. In fact, if we set our initial state to 0 with probability 0.5 and 1 with probability 0.5, we don't even need the root.
EDIT: To use @WhatsUp's example, we start with a DFA that looks like this:
We rewire it a little bit to restart after a word is accepted and get rid of the root node:
The corresponding Markov transition matrix is:
0.5 0 0.5 0.5
0.5 0 0.5 0.5
0 0.5 0 0
0 0.5 0 0
whose dominant eigenvector (scaled so that its entries sum to 1) is:
0.333
0.333
0.167
0.167
Which is to say that it spends 1/3 of its time in the 0 node, 1/3 in 1, 1/6 in 10, and 1/6 in 11. This is in agreement with @WhatsUp's results for that example.
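The steady state can also be approximated without an eigenvector routine, by repeatedly applying the transition matrix to the starting distribution. The sketch below uses plain Python lists. The function name `steady_state` and the step count are illustrative assumptions, and the state order 0, 1, 10, 11 is the one used above.

```python
def steady_state(matrix, start, steps=200):
    """Apply a column-stochastic matrix to a distribution `steps` times."""
    dist = list(start)
    for _ in range(steps):
        dist = [sum(row[j] * dist[j] for j in range(len(dist))) for row in matrix]
    return dist


# Entry [i][j] is the probability of moving from state j to state i.
M = [
    [0.5, 0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5, 0.5],
    [0.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.0],
]

# Start in node 0 or node 1 with probability 0.5 each (no root node needed).
pi = steady_state(M, [0.5, 0.5, 0.0, 0.0])
print([round(x, 3) for x in pi])
```

Stopping after exactly n steps instead of a large fixed number corresponds to the more precise "transition matrix to the nth power" variant.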
I am reading "Insertion Sort is O(n log n)" by Michael A. Bender, Martín Farach-Colton and Miguel Mosteiro, and I don't quite understand how the algorithm works or how to implement it, even with the help of Wikipedia. The following is the description of the algorithm, extracted from the original article.
1) Let A be an n-element array to be sorted. These elements are inserted one at
a time in random order into a sorting array S of size (1 + ε)n.
So the first step is creating array of size (1 + ε)n. Let ε = 1, then I need to create an array with twice the size of the original array.
2) The insertions proceed in log(n) rounds as follows.
Each round doubles the number of elements inserted into S and doubles the prefix of S where elements reside.
I understand that there will be an outer loop that runs log(n) times. In each round, I need to double the number of elements moved from A (the original array) into the S array. What I don't really understand is "double the prefix of S".
3) Specifically, round ith ends when element 2^i is inserted and the elements are rebalanced. Before the rebalance, the 2^i elements are in the first (1 + ε)2^i positions.
A rebalance moves them into the first (2 + 2ε)2^i positions, spreading
the elements as evenly as possible. We call 2 + 2ε the spreading factor.
From what I understand, every round ends with a "rebalance". The rebalance spreads the existing elements uniformly in the S array so that it leaves some gaps between the elements. The formula to spread the element is: k = i * (1 + ε) where i is old index, and k is a new index.
4) The insertion of 2^(i−1) intercalated elements within round ith is performed the
brute force way: search for the target position of the element to be inserted by
binary search (amongst the 2^(i−1) support positions in S), and move elements
of higher rank to make room for the new element. Not all elements of higher
rank need to be moved, only those in adjacent array positions until the nearest
gap is found.
This part shows how to insert each element into the S array. First, use binary search to find where the element belongs. Then, shift the elements of higher rank until a gap is reached.
This is the translation of the algorithm from what I understand (where A is array to sort and array start with index of 1):
def LibrarySort(A)
    n ← length(A)
    S ← array of size (1 + ε) * n
    for i ← 1 to n
        S[i] = null
    for i ← 1 to floor(log(n) + 1)
        for j ← 2^(i - 1) to 2^i
            index = binarysearch(S, A[j])
            insert(S, A[j], index)
        rebalance()
The insert() function then takes three parameters: the array, the item to insert, and the location.
def insert(S, item, index)
    if S[index] != null
        tmp ← S[index]
        i ← index + 1
        while i <= length(S) and S[i] != null
            swap(tmp, S[i])
            i++
    S[index] ← item
Questions
Is what I understand correct?
What is "double the prefix of S"?
Ad "double the prefix of S": The array (memory) is allocated once at the beginning to the size of (1 + ε) n, where n is total number of elements to be sorted. But the elements are added gradually and as they are added, they are not spread across the whole array, but only some prefix of it. When m elements are rebalanced, they are spread across first (1 + ε) m elements of the array. This is the prefix. When m doubles, so does (1 + ε) m.
Ad correctness: I see one slight mistake:
The formula to spread the element is: k = i * (1 + ε) where i is old index, and k is a new index.
The quoted description does not say what the formula is, but it can't be this one, because it would map an array of length m to length (1 + ε) m, whereas the description says you are mapping an array of length (1 + ε) m to an array of length 2 (1 + ε) m.
A simple expression would be k = 2 i, where i is the old index, but that would not spread the elements evenly. To spread the elements evenly, the formula is k = (2 + 2 ε) i, where i is the index excluding any gaps.
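A rebalance using the formula k = (2 + 2 ε) i can be sketched in Python as follows. This is an illustration under stated assumptions rather than the paper's procedure: indices are 0-based (the question uses 1-based arrays), gaps are represented by `None`, the target index is rounded down, and the function name `rebalance` with its parameters is hypothetical.

```python
def rebalance(S, eps):
    """Spread the non-empty entries of S evenly, using spreading factor 2 + 2*eps."""
    items = [x for x in S if x is not None]   # elements in order, gaps excluded
    factor = 2 + 2 * eps
    if items and int((len(items) - 1) * factor) >= len(S):
        raise ValueError("S is too small for this spreading factor")
    S[:] = [None] * len(S)                    # clear the array in place
    for rank, item in enumerate(items):
        # rank is the index excluding gaps; k = (2 + 2*eps) * rank, rounded down
        S[int(rank * factor)] = item
    return S


S = [1, 2, 3] + [None] * 9
print(rebalance(S, 0.5))
```

With ε = 0.5 the spreading factor is 3, so three packed elements land in slots 0, 3 and 6, leaving two gaps after each.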
I was studying for my final when I ran into this problem.
For 1a, I think the amortized complexity is O(1), because it hashes with x mod N, which is sparse enough, and falls back to linear probing in case of a collision.
However, I'm not sure how to state or prove that exactly.
For 1b, every key would hash into the same place, so each insert would linearly probe further than the previous one, but I'm not sure how to derive a runtime from that either.
1a: there will be no collisions at all except on the last insert. The key N collides with every stored value: it first collides with 0, then you advance by one slot and it collides with 1, and so on. The total cost is 1 + 1 + ... + 1 (n - 1 times) + n = 2n - 1, so the amortized cost is (2n - 1)/n, which is O(1).
1b: there will be (i - 1) collisions on the i-th insert, plus the insert operation itself, so the cost of the i-th operation is i. The total cost is 1 + 2 + ... + (n - 1) + n = (n + 1) * n / 2. Since you have inserted n times, the amortized cost is (n + 1)/2.
[edited, my original analysis was for open hashing not open addressing] For 1a) h(x) = x mod N, n < N, so the hash values will be 0, 1, ..., n - 2, 0. All insertions will be collision-free, apart from the last one. The last insertion will use a linear probe. First probe goes to bucket 0, but it is taken and the key is different. The next probe is at slot 1, with same result, until it reaches the first empty bucket at (n - 1). Hence you need (n - 1) extra operations for total of (2n - 1). The amortized cost is (2n - 1)/n per insertion.
For 1b) the hash table degenerates into a linked list. Insertion is linear in the size and there are n insertions, hence (n + 1) * n / 2 operations in total. That is (n + 1)/2 per insertion.
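Both totals can be checked with a small simulation of open addressing with linear probing. The key sequences below are assumptions reconstructed from the answers (keys 0, 1, ..., n - 2 followed by N for 1a, and multiples of N for 1b). The function name `total_probes` is made up, and the table is assumed to have more slots than keys.

```python
def total_probes(keys, table_size):
    """Insert keys with h(x) = x mod table_size and linear probing; count slot probes."""
    table = [None] * table_size
    probes = 0
    for key in keys:
        slot = key % table_size
        while True:
            probes += 1
            if table[slot] is None:
                table[slot] = key
                break
            slot = (slot + 1) % table_size   # collision: try the next slot
    return probes


n, N = 5, 8
print(total_probes(list(range(n - 1)) + [N], N))    # 1a: 2n - 1 probes
print(total_probes([i * N for i in range(n)], N))   # 1b: n(n + 1)/2 probes
```

With n = 5 and N = 8, the first call counts 2n - 1 = 9 probes and the second n(n + 1)/2 = 15, matching the two analyses.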