In-place transposition of a matrix - algorithm

Is it possible to transpose an (m,n) matrix in-place, given that the matrix is represented as a single array of size m*n?
The usual algorithm
transpose(Matrix mat, int rows, int cols) {
    // construction step
    Matrix tmat;
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            tmat[j][i] = mat[i][j];
        }
    }
}
doesn't apply to a single array unless the matrix is a square matrix.
If not, what is the minimum amount of additional memory needed?
EDIT:
I have already tried all flavors of
for (int i = 0; i < n; ++i) {
    for (int j = 0; j < i; ++j) {
        var swap = m[i][j];
        m[i][j] = m[j][i];
        m[j][i] = swap;
    }
}
And it is not correct. In this specific example, m doesn't even exist. In the single-array (linear) representation, mat[i][j] = mat[i*m + j], and correspondingly trans[j][i] = trans[i*n + j].

Inspired by the Wikipedia - Following the cycles algorithm description, I came up with following C++ implementation:
#include <iostream> // std::cout
#include <iterator> // std::ostream_iterator
#include <algorithm> // std::swap (until C++11)
#include <vector>
template<class RandomIterator>
void transpose(RandomIterator first, RandomIterator last, int m)
{
    const int mn1 = (last - first - 1);
    const int n = (last - first) / m;
    std::vector<bool> visited(last - first);
    RandomIterator cycle = first;
    while (++cycle != last) {
        if (visited[cycle - first])
            continue;
        int a = cycle - first;
        do {
            a = a == mn1 ? mn1 : (n * a) % mn1;
            std::swap(*(first + a), *cycle);
            visited[a] = true;
        } while ((first + a) != cycle);
    }
}

int main()
{
    int a[] = { 0, 1, 2, 3, 4, 5, 6, 7 };
    transpose(a, a + 8, 4);
    std::copy(a, a + 8, std::ostream_iterator<int>(std::cout, " "));
}
The program makes the in-place matrix transposition of the 2 × 4 matrix
0 1 2 3
4 5 6 7
represented in row-major ordering {0, 1, 2, 3, 4, 5, 6, 7} into the 4 × 2 matrix
0 4
1 5
2 6
3 7
represented by the row-major ordering {0, 4, 1, 5, 2, 6, 3, 7}.
The argument m of transpose is the row size; the column size n is determined from the row size and the sequence length. The algorithm needs m × n bits of auxiliary storage to record which elements have already been swapped. The indexes of the sequence are mapped with the following scheme:
0 → 0
1 → 2
2 → 4
3 → 6
4 → 1
5 → 3
6 → 5
7 → 7
The mapping function in general is:
idx → (idx × n) mod (m × n − 1) for 0 ≤ idx < m × n − 1, and idx → idx for idx = m × n − 1
We can identify four cycles within this sequence: { 0 }, { 1, 2, 4 }, { 3, 5, 6 } and { 7 }. Each cycle can be transposed independently of the other cycles. The variable cycle initially points to the second element (the first does not need to be moved because 0 → 0). The bit array visited records the already transposed elements and indicates that index 1 (the second element) still needs to be moved. Index 1 gets swapped with index 2 (mapping function). Now index 1 holds the element of index 2, and this element gets swapped with the element of index 4. Now index 1 holds the element of index 4. The element of index 4 should go to index 1, so it is in the right place; transposing of this cycle has finished, and all touched indexes have been marked visited. The variable cycle gets incremented till the first unvisited index, which is 3. The procedure continues with this cycle till all cycles have been transposed.

The problem is that the task is stated imprecisely. If by "the same place" you mean reusing the same matrix object, the task is well defined. But if you mean writing into the same area of memory ("the matrix is represented as a single array of size m*n"), you have to specify how it is represented there; otherwise it would be enough to change nothing except the function that reads the matrix, simply swapping the indexes in it.
Presumably you want to transpose the matrix representation in memory so that the reading/setting function for this matrix by indexes remains the same. Don't you?
Also, we can't write down the algorithm without knowing whether the matrix is stored in memory by rows or by columns. OK, let's say it is stored by rows.
If we fix these two missing conditions, the task becomes well defined and is not hard to solve.
Simply take every element of the matrix by its linear index, find its row/column pair, transpose that pair, compute the resulting linear index, and put the value into the new place. The catch is that this index transformation is its own inverse only for square matrices, so in the general case it cannot be done by plain pairwise swaps. It can be done, however, if we first build the whole index transformation map and then follow its cycles through the matrix.
Starting matrix A:
m- number of rows
n- number of columns
nm - number of elements
li - linear index
i - column number
j - row number
resulting matrix B:
lir - resulting linear index
Transforming array trans
// preparation
for (li = 0; li < nm; li++) {
    j = li / n;
    i = li - j * n;
    lir = i * m + j;
    trans[li] = lir;
}
// transposition: follow the cycles of trans[];
// check[] (initially all zero) marks elements that are already in place
for (li = 0; li < nm; li++) {
    if (check[li]) continue;
    cur = li;
    temp2 = a[cur];
    do {
        lir = trans[cur];
        temp1 = a[lir];      // remember the value we are about to overwrite
        a[lir] = temp2;      // drop the carried value into its destination
        temp2 = temp1;       // and carry the displaced value onwards
        check[lir] = 1;
        cur = lir;
    } while (cur != li);
}
Such in-place transposing only makes sense if the elements stored in the cells are heavy.
It is also possible to realize the trans[] array as a function instead of storing it.

Doing this efficiently in the general case requires some effort. The non-square and in- versus out-of-place algorithms differ. Save yourself much effort and just use FFTW. I previously prepared a more complete write up, including sample code, on the matter.


Number of ways to pick the elements of an array?

How to formulate this problem in code?
Problem Statement:
UPDATED:
Find the number of ways to pick the elements of the array which are
not yet visited.
We start with the array 1, 2, ..., n, in which some x (1 <= x <= n) elements are already picked/visited; these are given in the input.
Now, we need to find the number of ways we can pick the remaining (n - x) elements of the array, and the way we pick an element is defined as:
On every turn, we can only pick an element which is adjacent (either left or right) to some visited element, i.e.
in an array of elements
1, 2, 3, 4, 5, 6, let's say we have visited 3 and 6; then we can now pick
2 or 4 or 5, as they are unvisited and adjacent to visited elements. Now say we pick 2, so next we can pick 1 or 4 or 5, and so on.
example:
input: N = 6(number of elements: 1, 2, 3, 4, 5, 6)
M = 2(number of visited elements)
visited elements are = 1, 5
Output: 16(number of ways we can pick the unvisited elements)
ways: 4, 6, 2, 3
4, 6, 3, 2
4, 2, 3, 6
4, 2, 6, 3
4, 3, 2, 6
4, 3, 6, 2
6, 4, 2, 3
6, 4, 3, 2
6, 2, 3, 4
6, 2, 4, 3
2, 6, 4, 3
2, 6, 3, 4
2, 4, 6, 3
2, 4, 3, 6
2, 3, 4, 6
2, 3, 6, 4.
Some analysis of the problem:
The actual values in the input array are assumed to be 1...n, but these values do not really play a role. They just represent indexes that are referenced by the other input array, which lists the visited indexes (1-based).
The list of visited indexes actually cuts the main array into subarrays with smaller sizes. So for example, when n=6 and visited=[1,5], then the original array [1,2,3,4,5,6] is cut into [2,3,4] and [6]. So it cuts it into sizes 3 and 1. At this point the index numbering loses its purpose, so the problem really is fully described with those two sizes: 3 and 1. To illustrate, the solution for (n=6, visited=[1,5]) is necessarily the same as for (n=7, visited=[1,2,6]): the sizes into which the original array is cut are the same in both cases (in a different order, but that doesn't influence the result).
Algorithm, based on a list of sizes of subarrays (see above):
The number of ways that one such subarray can be visited is not that difficult: if the subarray's size is 1, there is just one way. If it is greater, then at each pick there are two possibilities: either you pick from the left side or from the right side. So you get 2*2*...*2*1 possibilities, i.e. 2^(size-1) possibilities.
The two outer subarrays are an exception to this, as you can only pick items from the inside-out, so for those the number of ways to visit such a subarray is just 1.
The number of ways that you can pick items from two subarrays can be determined as follows: count the number of ways to pick from just one of those subarrays, and the number of ways to pick from the other one. Then consider that you can alternate when to pick from one subarray or from the other. This comes down to interweaving the two subarrays. Let's say the larger of the two subarrays has j elements, and the smaller k; then there are j+1 positions where an element from the smaller subarray can be injected (merged) into the larger array, and there are "(j+1) multichoose k" ways to inject all elements from the smaller subarray.
When you have counted the number of ways to merge two subarrays, you actually have an array with a size that is the sum of those two sizes. The above logic can then be applied with this array and the next subarray in the problem specification. The number of ways just multiplies as you merge more subarrays into this growing array. Of course, you don't really deal with the arrays, just with sizes.
Here is an implementation in JavaScript, which applies the above algorithm:
function getSubArraySizes(n, visited) {
    // Translate the problem into a list of sizes (of subarrays).
    // The first and last entries are the outer subarrays (possibly of size 0),
    // so that getPickCount can recognise them; empty gaps between two
    // visited elements carry no information and are dropped.
    let j = 0;
    let sizes = [];
    let first = true;
    for (let i of visited) {
        let size = i - j - 1;
        if (size > 0 || first) sizes.push(size);
        first = false;
        j = i;
    }
    sizes.push(n - j);
    return sizes;
}

function Combi(n, k) {
    // Count combinations: "from n, take k"
    // See Wikipedia on "Combination"
    let c = 1;
    let end = Math.min(k, n - k);
    for (let i = 0; i < end; i++) {
        c = c * (n-i) / (end-i); // This is floating point
    }
    return c; // ... but result is integer
}

function getPickCount(sizes) {
    // Main function, based on a list of sizes of subarrays
    let count = 0;
    let result = 1;
    for (let i = 0; i < sizes.length; i++) {
        let size = sizes[i];
        // Number of ways to take items from this chunk:
        // - when items can only be taken from one side: 1
        // - otherwise: every time we have a choice between 2, except for the last remaining item
        let pickCount = i == 0 || i == sizes.length-1 ? 1 : 2 ** (size-1);
        // Number of ways to merge/weave two arrays, where relative order of elements is not changed
        // = a "k multichoice from n". See
        // https://en.wikipedia.org/wiki/Combination#Number_of_combinations_with_repetition
        let weaveCount = count == 0 ? 1 // First time only
            : Combi(size+count, Math.min(count, size));
        // Number of possibilities:
        result *= pickCount * weaveCount;
        // Update the size to be the size of the merged/woven array
        count += size;
    }
    return result;
}
// Demo with the example input (n = 6, visited = 1 and 5)
let result = getPickCount(getSubArraySizes(6, [1, 5]));
console.log(result);

Arranging the number 1 in a 2d matrix

Given the number of rows and columns of a 2d matrix
Initially all elements of matrix are 0
Given the number of 1's that should be present in each row
Given the number of 1's that should be present in each column
Determine if it is possible to form such a matrix.
Example:
Input: r=3 c=2 (no. of rows and columns)
2 1 0 (number of 1's that should be present in each row respectively)
1 2 (number of 1's that should be present in each column respectively)
Output: Possible
Explanation:
1 1
0 1
0 0
I tried solving this problem for about 12 hours by checking whether the summation of the Ri equals the summation of the Ci,
but then I realized that this check alone is not sufficient: for cases like
3 3
1 3 0
0 2 2
the sums match, yet no such matrix is possible.
r and c can be up to 10^5.
Any ideas how I should move further?
Edit: Constraints added, and the output should only be "possible" or "impossible"; the possible matrix need not be displayed.
Can anyone help me now?
Hint: one possible solution utilizes Maximum Flow Problem by creating a special graph and running the standard maximum flow algorithm on it.
If you're not familiar with the above problem, you may start reading about it e.g. here https://en.wikipedia.org/wiki/Maximum_flow_problem
If you're interested in the full solution please comment and I'll update the answer. But it requires understanding the above algorithm.
Solution as requested:
Create a graph of r+c+2 nodes.
Node 0 is the source, node r+c+1 is the sink. Nodes 1..r represent the rows, while r+1..r+c the columns.
Create following edges:
from source to nodes i=1..r of capacity r_i
from nodes i=r+1..r+c to sink of capacity c_i
between all the nodes i=1..r and j=r+1..r+c of capacity 1
Run a maximum flow algorithm; the saturated edges between row nodes and column nodes define where you should put the 1s.
If it is not possible, the maximum flow value will be less than the number of expected ones in the matrix.
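As a rough illustration of this construction (my own sketch, not the answerer's code), here is a dense Edmonds-Karp max flow in Python over exactly that graph:

from collections import deque

def feasible(row_sums, col_sums):
    # Max-flow feasibility check for the graph described above.
    r, c = len(row_sums), len(col_sums)
    n = r + c + 2                          # node 0 = source, node n-1 = sink
    cap = [[0] * n for _ in range(n)]
    for i, ri in enumerate(row_sums):
        cap[0][1 + i] = ri                 # source -> row node i
    for j, cj in enumerate(col_sums):
        cap[1 + r + j][n - 1] = cj         # column node j -> sink
    for i in range(r):
        for j in range(c):
            cap[1 + i][1 + r + j] = 1      # each cell may hold at most one 1
    flow = 0
    while True:
        # BFS for an augmenting path in the residual graph
        parent = [-1] * n
        parent[0] = 0
        q = deque([0])
        while q and parent[n - 1] == -1:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[n - 1] == -1:
            break                          # no more augmenting paths
        # find the bottleneck capacity along the path, then augment
        v, push = n - 1, float('inf')
        while v != 0:
            u = parent[v]
            push = min(push, cap[u][v])
            v = u
        v = n - 1
        while v != 0:
            u = parent[v]
            cap[u][v] -= push
            cap[v][u] += push
            v = u
        flow += push
    return flow == sum(row_sums) == sum(col_sums)

print(feasible([2, 1, 0], [1, 2]))         # example from the question: True

Note that this dense version only illustrates the idea; for r and c up to 10^5 a max flow over r*c unit edges is not practical, and the counting approaches described below scale much better.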
I will illustrate the algorithm with an example.
Assume we have m rows and n columns. Let rows[i] be the number of 1s in row i, for 0 <= i < m,
and cols[j] be the number of 1s in column j, for 0 <= j < n.
For example, for m = 3, and n = 4, we could have: rows = {4 2 3}, cols = {1 3 2 3}, and
the solution array would be:
1 3 2 3
+--------
4 | 1 1 1 1
2 | 0 1 0 1
3 | 0 1 1 1
Because we only want to know whether a solution exists, the values in rows and cols may be permuted in any order. The solution of each permutation is just a permutation of the rows and columns of the above solution.
So, given rows and cols, sort cols in decreasing order, and rows in increasing order. For our example, we have cols = {3 3 2 1} and rows = {2 3 4}, and the equivalent problem.
3 3 2 1
+--------
2 | 1 1 0 0
3 | 1 1 1 0
4 | 1 1 1 1
We transform cols into a form that is better suited for the algorithm. What cols tells us is that we have two series of 1s of length 3, one series of 1s of length 2, and one series of 1s of length 1, that are to be distributed among the rows of the array. We rewrite cols to capture just that, that is COLS = {2/3 1/2 1/1}, 2 series of length 3, 1 series of length 2, and 1 series of length 1.
Because we have 2 series of length 3, a solution exists only if we can put two 1s in the first row. This is possible because rows[0] = 2. We do not actually put any 1 in the first row, but record the fact that 1s have been placed there by decrementing the length of the series of length 3. So COLS becomes:
COLS = {2/2 1/2 1/1}
and we combine our two counts for series of length 2, yielding:
COLS = {3/2 1/1}
We now have the reduced problem:
3 | 1 1 1 0
4 | 1 1 1 1
Again we need to place 1s from our series of length 2 to have a solution. Fortunately, rows[1] = 3 and we can do this. We decrement the length of 3/2 and get:
COLS = {3/1 1/1} = {4/1}
We have the reduced problem:
4 | 1 1 1 1
Which is solved by 4 series of length 1, just what we have left. If at any step, the series in COLS cannot be used to satisfy a row count, then no solution is possible.
The general processing for each row may be stated as follows. For each row r, starting from the first element in COLS, decrement the lengths of as many elements count[k]/length[k] of COLS as needed, so that the sum of the count[k]'s equals rows[r]. Eliminate series of length 0 in COLS and combine series of same length.
Note that because elements of COLS are in decreasing order of lengths, the length of the last element decremented is always less than or equal to the next element in COLS (if there is a next element).
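The COLS bookkeeping is essentially an optimization of this greedy; as a rough illustration (my own unoptimized sketch, which simply re-sorts instead of maintaining COLS), the row-by-row step looks like this in Python:

def solvable(rows, cols):
    # Greedy check: each row places its 1s in the columns with the largest remaining demand.
    if sum(rows) != sum(cols):
        return False
    cols = sorted(cols, reverse=True)
    for need in rows:
        if need > len(cols):
            return False
        for j in range(need):
            if cols[j] == 0:
                return False           # not enough columns still needing a 1
            cols[j] -= 1
        cols.sort(reverse=True)        # restore the decreasing order that COLS keeps implicitly
    return all(c == 0 for c in cols)

print(solvable([2, 1, 0], [1, 2]))     # True  (example from the question)
print(solvable([1, 3, 0], [0, 2, 2]))  # False (sums match, but no matrix exists)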
EXAMPLE 2 : Solution exists.
rows = {1 3 3}, cols = {2 2 2 1} => COLS = {3/2 1/1}
1 series of length 2 is decremented to satisfy rows[0] = 1, and the 2 other series of length 2 remain at length 2.
rows[0] = 1
COLS = {2/2 1/1 1/1} = {2/2 2/1}
The 2 series of length 2 are decremented, and 1 of the series of length 1.
The series whose length has become 0 is deleted, and the series of length 1 are combined.
rows[1] = 3
COLS = {2/1 1/0 1/1} = {2/1 1/1} = {3/1}
A solution exists, since rows[2] can be satisfied.
rows[2] = 3
COLS = {3/0} = {}
EXAMPLE 3: Solution does not exist.
rows = {0 2 3}, cols = {3 2 0 0} => COLS = {1/3 1/2}
rows[0] = 0
COLS = {1/3 1/2}
rows[1] = 2
COLS = {1/2 1/1}
rows[2] = 3 => impossible to satisfy; no solution.
SPACE COMPLEXITY
It is easy to see that it is O(m + n).
TIME COMPLEXITY
We iterate over each row only once. For each row i, we need to iterate over at most
rows[i] <= n elements of COLS. Time complexity is O(m x n).
After finding this algorithm, I found the following theorem:
The Havel-Hakimi theorem (Havel 1955, Hakimi 1962) states that there exists a matrix X_{n,m} of 0's and 1's with row totals a(0) = (a_1, a_2, ..., a_n) and column totals b(0) = (b_1, b_2, ..., b_m), where b_i >= b_{i+1} for every 0 < i < m, if and only if another matrix X_{n-1,m} of 0's and 1's with row totals a(1) = (a_2, a_3, ..., a_n) and column totals b(1) = (b_1 - 1, b_2 - 1, ..., b_{a_1} - 1, b_{a_1 + 1}, ..., b_m) also exists.
from the post Finding if binary matrix exists given the row and column sums.
This is basically what my algorithm does, while trying to optimize the decrementing part, i.e., all the -1's in the above theorem. Now that I see the above theorem, I know my algorithm is correct. Nevertheless, I checked the correctness of my algorithm by comparing it with a brute-force algorithm for arrays of up to 50 cells.
Here is the C# implementation.
public class Pair
{
    public int Count;
    public int Length;
}

public class PairsList
{
    public LinkedList<Pair> Pairs;
    public int TotalCount;
}

class Program
{
    static void Main(string[] args)
    {
        int[] rows = new int[] { 0, 0, 1, 1, 2, 2 };
        int[] cols = new int[] { 2, 2, 0 };
        bool success = Solve(cols, rows);
    }

    static bool Solve(int[] cols, int[] rows)
    {
        PairsList pairs = new PairsList() { Pairs = new LinkedList<Pair>(), TotalCount = 0 };
        FillAllPairs(pairs, cols);
        for (int r = 0; r < rows.Length; r++)
        {
            if (rows[r] > 0)
            {
                if (pairs.TotalCount < rows[r])
                    return false;
                if (pairs.Pairs.First != null && pairs.Pairs.First.Value.Length > rows.Length - r)
                    return false;
                DecrementPairs(pairs, rows[r]);
            }
        }
        return pairs.Pairs.Count == 0 || pairs.Pairs.Count == 1 && pairs.Pairs.First.Value.Length == 0;
    }

    static void DecrementPairs(PairsList pairs, int count)
    {
        LinkedListNode<Pair> pair = pairs.Pairs.First;
        while (count > 0 && pair != null)
        {
            LinkedListNode<Pair> next = pair.Next;
            if (pair.Value.Count == count)
            {
                pair.Value.Length--;
                if (pair.Value.Length == 0)
                {
                    pairs.Pairs.Remove(pair);
                    pairs.TotalCount -= count;
                }
                else if (pair.Next != null && pair.Next.Value.Length == pair.Value.Length)
                {
                    pair.Value.Count += pair.Next.Value.Count;
                    pairs.Pairs.Remove(pair.Next);
                    next = pair;
                }
                count = 0;
            }
            else if (pair.Value.Count < count)
            {
                count -= pair.Value.Count;
                pair.Value.Length--;
                if (pair.Value.Length == 0)
                {
                    pairs.Pairs.Remove(pair);
                    pairs.TotalCount -= pair.Value.Count;
                }
                else if (pair.Next != null && pair.Next.Value.Length == pair.Value.Length)
                {
                    pair.Value.Count += pair.Next.Value.Count;
                    pairs.Pairs.Remove(pair.Next);
                    next = pair;
                }
            }
            else // pair.Value.Count > count
            {
                Pair p = new Pair() { Count = count, Length = pair.Value.Length - 1 };
                pair.Value.Count -= count;
                if (p.Length > 0)
                {
                    if (pair.Next != null && pair.Next.Value.Length == p.Length)
                        pair.Next.Value.Count += p.Count;
                    else
                        pairs.Pairs.AddAfter(pair, p);
                }
                else
                    pairs.TotalCount -= count;
                count = 0;
            }
            pair = next;
        }
    }

    static int FillAllPairs(PairsList pairs, int[] cols)
    {
        List<Pair> newPairs = new List<Pair>();
        int c = 0;
        while (c < cols.Length && cols[c] > 0)
        {
            int k = c++;
            if (cols[k] > 0)
                pairs.TotalCount++;
            while (c < cols.Length && cols[c] == cols[k])
            {
                if (cols[k] > 0) pairs.TotalCount++;
                c++;
            }
            newPairs.Add(new Pair() { Count = c - k, Length = cols[k] });
        }

        LinkedListNode<Pair> pair = pairs.Pairs.First;
        foreach (Pair p in newPairs)
        {
            while (pair != null && p.Length < pair.Value.Length)
                pair = pair.Next;
            if (pair == null)
            {
                pairs.Pairs.AddLast(p);
            }
            else if (p.Length == pair.Value.Length)
            {
                pair.Value.Count += p.Count;
                pair = pair.Next;
            }
            else // p.Length > pair.Value.Length
            {
                pairs.Pairs.AddBefore(pair, p);
            }
        }
        return c;
    }
}
(Note: to avoid confusion between when I'm talking about the actual numbers in the problem vs. when I'm talking about the zeros and ones in the matrix, I'm going to instead fill the matrix with spaces and X's. This obviously doesn't change the problem.)
Some observations:
If you're filling in a row, and there's (for example) one column needing 10 more X's and another column needing 5 more X's, then you're sometimes better off putting the X in the "10" column and saving the "5" column for later (because you might later run into 5 rows that each need 2 X's), but you're never better off putting the X in the "5" column and saving the "10" column for later (because even if you later run into 10 rows that all need an X, they won't mind if they don't all go in the same column). So we can use a somewhat "greedy" algorithm: always put an X in the column still needing the most X's. (Of course, we'll need to make sure that we don't greedily put an X in the same column multiple times for the same row!)
Since you don't need to actually output a possible matrix, the rows are all interchangeable and the columns are all interchangeable; all that matter is how many rows still need 1 X, how many still need 2 X's, etc., and likewise for columns.
With that in mind, here's one fairly simple approach:
(Optimization.) Add up the counts for all the rows, add up the counts for all the columns, and return "impossible" if the sums don't match.
Create an array of length r+1 and populate it with how many columns need 1 X, how many need 2 X's, etc. (You can ignore any columns needing 0 X's.)
(Optimization.) To help access the array efficiently, build a stack/linked-list/etc. of the indices of nonzero array elements, in decreasing order (e.g., starting at index r if it's nonzero, then index r−1 if it's nonzero, etc.), so that you can easily find the elements representing columns to put X's in.
(Optimization.) To help determine when there'll be a row that can't be satisfied, also make note of the total number of columns needing any X's, and make note of the largest number of X's needed by any row. If the former is less than the latter, return "impossible".
(Optimization.) Sort the rows by the number of X's they need.
Iterate over the rows, starting with the one needing the fewest X's and ending with the one needing the most X's, and for each one:
Update the array accordingly. For example, if a row needs 12 X's, and the array looks like [..., 3, 8, 5], then you'll update the array to look like [..., 3+7 = 10, 8+5−7 = 6, 5−5 = 0]. If it's not possible to update the array because you run out of columns to put X's in, return "impossible". (Note: this part should never actually return "impossible", because we're keeping count of the number of columns left and the max number of columns we'll need, so we should have already returned "impossible" if this was going to happen. I mention this check only for clarity.)
Update the stack/linked-list of indices of nonzero array elements.
Update the total number of columns needing any X's. If it's now less than the greatest number of X's needed by any row, return "impossible".
(Optimization.) If the first nonzero array element has an index greater than the number of rows left, return "impossible".
If we complete our iteration without having returned "impossible", return "possible".
(Note: the reason I say to start with the row needing the fewest X's, and work your way to the row with the most X's, is that a row needing more X's may involve examining updating more elements of the array and of the stack, so the rows needing fewer X's are cheaper. This isn't just a matter of postponing the work: the rows needing fewer X's can help "consolidate" the array, so that there will be fewer distinct column-counts, making the later rows cheaper than they would otherwise be. In a very-bad-case scenario, such as the case of a square matrix where every single row needs a distinct positive number of X's and every single column needs a distinct positive number of X's, the fewest-to-most order means you can handle each row in O(1) time, for linear time overall, whereas the most-to-fewest order would mean that each row would take time proportional to the number of X's it needs, for quadratic time overall.)
Overall, this takes no worse than O(r+c+n) time (where n is the number of X's); I think that the optimizations I've listed are enough to ensure that it's closer to O(r+c) time, but it's hard to be 100% sure. I recommend trying it to see if it's fast enough for your purposes.
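For concreteness, here is a rough Python sketch of the count-array update described in the iteration step above (my own illustration; the stack-of-indices and early-exit optimizations are omitted). need[k] holds how many columns still need exactly k X's, and each row takes its X's from the columns that currently need the most:

def place_row(need, q):
    # Give one row its q X's, always to the columns that still need the most X's.
    spill = 0                             # columns pushed down from the bucket above
    for k in range(len(need) - 1, 0, -1):
        take = min(need[k], q)            # columns chosen from this bucket (before any spill arrives)
        need[k] -= take
        q -= take
        need[k] += spill                  # columns that dropped here from bucket k+1
        spill = take                      # the columns just used now need one X fewer
    need[0] += spill
    return q == 0                         # False means we ran out of columns to put X's in

def possible(row_counts, col_counts):
    r = len(row_counts)
    if sum(row_counts) != sum(col_counts) or any(c > r for c in col_counts):
        return "impossible"
    need = [0] * (r + 1)
    for c in col_counts:
        need[c] += 1
    for q in sorted(row_counts):          # fewest X's first, as suggested above
        if not place_row(need, q):
            return "impossible"
    return "possible"

print(possible([2, 1, 0], [1, 2]))        # possible   (example from the question)
print(possible([1, 3, 0], [0, 2, 2]))     # impossible (the asker's counter-example)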
You can use brute force (iterating through all 2^(r * c) possibilities) to solve it, but that will take a long time. If r * c is under 64, you can accelerate it to a certain extent using bit-wise operations on 64-bit integers; however, even then, iterating through all 64-bit possibilities would take, at 1 try per ms, over 500M years.
A wiser choice is to add bits one by one, and only continue placing bits if no constraints are broken. This will eliminate the vast majority of possibilities, greatly speeding up the process. Look up backtracking for the general idea. It is not unlike solving sudokus through guesswork: once it becomes obvious that your guess was wrong, you erase it and try guessing a different digit.
As with sudokus, there are certain strategies that can be written into code and will result in speedups when they apply. For example, if the sum of 1s in rows is different from the sum of 1s in columns, then there are no solutions.
If over 50% of the bits will be on, you can instead work on the complementary problem (transform all ones to zeroes and vice-versa, while updating row and column counts). Both problems are equivalent, because any answer for one is also valid for the complementary.
This problem can be solved in O(n log n) using Gale-Ryser Theorem. (where n is the maximum of lengths of the two degree sequences).
First, make both sequences of equal length by adding 0's to the smaller sequence, and let this length be n.
Let A and B be the two sequences (one of row sums, the other of column sums; the roles are interchangeable). Sort A in non-decreasing order and B in non-increasing order, and build prefix-sum arrays for both, so that in particular P[k] is the sum of the first k elements of B (i.e., of the k largest of them).
Now, iterate over k from 1 to n and check, in addition to sum(A) = sum(B), the Gale-Ryser condition
b_1 + b_2 + ... + b_k <= sum over all j of min(a_j, k).
The left side is just P[k]. The right side can be calculated in O(log n): binary search in the sorted A for the last number smaller than k; those elements contribute their own values (read off A's prefix sums), and each of the remaining elements contributes k.
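A small sketch of that check (my own rendering of the condition above, with prefix sums over both sorted sequences):

from bisect import bisect_left
from itertools import accumulate

def gale_ryser(row_sums, col_sums):
    # True iff a 0/1 matrix with these row and column sums exists.
    if sum(row_sums) != sum(col_sums):
        return False
    a = sorted(row_sums)                    # non-decreasing
    b = sorted(col_sums, reverse=True)      # non-increasing
    pa = [0] + list(accumulate(a))          # pa[i] = sum of the i smallest row sums
    pb = [0] + list(accumulate(b))          # pb[k] = sum of the k largest column sums
    for k in range(1, len(b) + 1):
        idx = bisect_left(a, k)             # how many row sums are strictly below k
        rhs = pa[idx] + k * (len(a) - idx)  # sum of min(a_j, k) over all rows
        if pb[k] > rhs:
            return False
    return True

print(gale_ryser([2, 1, 0], [1, 2]))        # True  (example from the question)
print(gale_ryser([1, 3, 0], [0, 2, 2]))     # False (the asker's counter-example)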
Inspired by the solution given by RobertBaron, I have tried to build a new algorithm.
rows = [int(x) for x in input().split()]
cols = [int(ss) for ss in input().split()]
rows.sort()
for i in range(len(rows)):
    # re-sort so that this row takes its 1s from the columns with the largest remaining counts
    cols.sort(reverse=True)
    for j in range(len(cols)):
        if rows[i] != 0 and cols[j] != 0:
            rows[i] = rows[i] - 1
            cols[j] = cols[j] - 1
print("rows: ", rows)
print("cols: ", cols)
# if there is any non-zero value left, print NO, else print YES
flag = True
for i in range(len(rows)):
    if rows[i] != 0:
        flag = False
        break
for j in range(len(cols)):
    if cols[j] != 0:
        flag = False
if flag:
    print("YES")
else:
    print("NO")
Here I have sorted the rows in ascending order, and before each row I re-sort the columns in descending order of their remaining counts, decrementing a particular row and column count whenever a 1 needs to be placed.
It is working for all the test cases posted here! The rest, GOD knows.

Minimum common remainder of division

I have n pairs of numbers: ( p[1], s[1] ), ( p[2], s[2] ), ... , ( p[n], s[n] )
Where p[i] is integer greater than 1; s[i] is integer : 0 <= s[i] < p[i]
Is there any way to determine minimum positive integer a , such that for each pair :
( s[i] + a ) mod p[i] != 0
Anything better than brute force ?
It is possible to do better than brute force. Brute force would be O(A·n), where A is the minimum valid value for a that we are looking for.
The approach described below uses a min-heap and achieves O(n·log(n) + A·log(n)) time complexity.
First, notice that replacing a with a value of the form (p[i] - s[i]) + k * p[i] leads to a remainder equal to zero in the ith pair, for any non-negative integer k. Thus, the numbers of that form are invalid a values (the solution that we are looking for is different from all of them).
The proposed algorithm is an efficient way to generate the numbers of that form (for all i and k), i.e. the invalid values for a, in increasing order. As soon as the current value differs from the previous one by more than 1, it means that there was a valid a in-between.
The pseudocode below details this approach.
1. construct a min-heap from all the following pairs (p[i] - s[i], p[i]),
where the heap comparator is based on the first element of the pairs.
2. a0 = -1; maxA = lcm(p[i])
3. Repeat
3a. Retrieve and remove the root of the heap, (a, p[i]).
3b. If a - a0 > 1 then the result is a0 + 1. Exit.
3c. if a is at least maxA, then no solution exists. Exit.
3d. Insert into the heap the value (a + p[i], p[i]).
3e. a0 = a
Remark: it is possible for such an a to not exist. If a valid a is not found below LCM(p[1], p[2], ... p[n]), then it is guaranteed that no valid a exists.
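A minimal Python sketch of that pseudocode, using heapq (the names are mine; prev starts at 0 so that the returned a is strictly positive, whereas the pseudocode's a0 = -1 would also allow 0):

import heapq
from math import gcd
from functools import reduce

def min_valid_a(pairs):
    # pairs: list of (p, s) with p > 1 and 0 <= s < p.
    # Returns the smallest a > 0 with (s + a) % p != 0 for every pair, or None.
    max_a = reduce(lambda x, y: x * y // gcd(x, y), (p for p, _ in pairs))  # lcm of all p
    heap = [(p - s, p) for p, s in pairs]   # first invalid value contributed by each pair
    heapq.heapify(heap)
    prev = 0
    while heap:
        a, p = heapq.heappop(heap)
        if a - prev > 1:                    # gap between consecutive invalid values: valid a found
            return prev + 1
        if a >= max_a:                      # scanned a whole period without finding a gap
            return None
        heapq.heappush(heap, (a + p, p))    # next invalid value generated by this pair
        prev = a
    return None

print(min_valid_a([(2, 1), (5, 3)]))        # 4, as in the example below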
I'll show below an example of how this algorithm works.
Consider the following (p, s) pairs: { (2, 1), (5, 3) }.
The first pair indicates that a should avoid values like 1, 3, 5, 7, ..., whereas the second pair indicates that we should avoid values like 2, 7, 12, 17, ... .
The min-heap initially contains the first element of each sequence (step 1 of the pseudocode) -- shown in brackets below:
[1], 3, 5, 7, ...
[2], 7, 12, 17, ...
We retrieve and remove the root of the heap, i.e., the minimum of the two bracketed values, which is 1, and we add into the heap the next element from that sequence, so the heap now contains the elements 2 and 3:
1, [3], 5, 7, ...
[2], 7, 12, 17, ...
We again retrieve the root of the heap, this time the value 2, and add the next element of that sequence into the heap:
1, [3], 5, 7, ...
2, [7], 12, 17, ...
The algorithm continues: we next retrieve the value 3, and add 5 into the heap:
1, 3, [5], 7, ...
2, [7], 12, 17, ...
Finally, we retrieve the value 5. At this point we realize that the value 4 is not among the invalid values for a, so that is the solution we are looking for.
I can think of two different solutions. First:
p_max = lcm (p[0],p[1],...,p[n]) - 1;
for a = 0 to p_max:
    zero_found = false;
    for i = 0 to n:
        if (s[i] + a) mod p[i] == 0:
            zero_found = true;
            break;
    if !zero_found:
        return a;
return -1;
I suppose this is the one you call "brute force". Notice that p_max represents Least Common Multiple of p[i]s - 1 (solution is either in the closed interval [0, p_max], or it does not exist). Complexity of this solution is O(n * p_max) in the worst case (plus the running time for calculating lcm!). There is a better solution regarding the time complexity, but it uses an additional binary array - classical time-space tradeoff. Its idea is similar to the Sieve of Eratosthenes, but for remainders instead of primes :)
p_max = lcm (p[0],p[1],...,p[n]) - 1;
int remainders[p_max + 1] = {0};
for i = 0 to n:
    int rem = s[i] - p[i];
    while rem >= -p_max:
        remainders[-rem] = 1;
        rem -= p[i];
for i = 0 to p_max:
    if !remainders[i]:
        return i;
return -1;
Explanation of the algorithm: first, we create an array remainders that will indicate whether certain negative remainder exists in the whole set. What is a negative remainder? It's simple, notice that 6 = 2 mod 4 is equivalent to 6 = -2 mod 4. If remainders[i] == 1, it means that if we add i to one of the s[j], we will get p[j] (which is 0, and that is what we want to avoid). Array is populated with all possible negative remainders, up to -p_max. Now all we have to do is search for the first i, such that remainder[i] == 0 and return it, if it exists - notice that the solution does not have to exists. In the problem text, you have indicated that you are searching for the minimum positive integer, I don't see why zero would not fit (if all s[i] are positive). However, if that is a strong requirement, just change the for loop to start from 1 instead of 0, and increment p_max.
The complexity of this algorithm is n + sum (p_max / p[i]) = n + p_max * sum (1 / p[i]), where i goes from to 0 to n. Since all p[i]s are at least 2, that is asymptotically better than the brute force solution.
An example for better understanding: suppose that the input is (5,4), (5,1), (2,0). p_max is lcm(5,5,2) - 1 = 10 - 1 = 9, so we create array with 10 elements, initially filled with zeros. Now let's proceed pair by pair:
from the first pair, we have remainders[1] = 1 and remainders[6] = 1
second pair gives remainders[4] = 1 and remainders[9] = 1
last pair gives remainders[0] = 1, remainders[2] = 1, remainders[4] = 1, remainders[6] = 1 and remainders[8] = 1.
Therefore, first index with zero value in the array is 3, which is a desired solution.
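The sieve variant above, as a small Python sketch (again my own rendering; the final scan starts at 1 so that the returned a is positive, as discussed):

from math import gcd
from functools import reduce

def min_valid_a_sieve(pairs):
    # pairs: list of (p, s) with p > 1 and 0 <= s < p.
    p_max = reduce(lambda x, y: x * y // gcd(x, y), (p for p, _ in pairs)) - 1
    forbidden = [False] * (p_max + 1)
    for p, s in pairs:
        for a in range(p - s, p_max + 1, p):   # p - s, p - s + p, ... are the invalid values
            forbidden[a] = True
    for a in range(1, p_max + 1):
        if not forbidden[a]:
            return a
    return -1                                  # no valid positive a exists

print(min_valid_a_sieve([(5, 4), (5, 1), (2, 0)]))   # 3, matching the example above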

Find the sum of least common multiples of all subsets of a given set

Given: set A = {a0, a1, ..., aN-1} (1 ≤ N ≤ 100), with 2 ≤ ai ≤ 500.
Asked: Find the sum of all least common multiples (LCM) of all subsets of A of size at least 2.
The LCM of a set B = {b0, b1, ..., bk-1} is defined as the minimum integer Bmin such that bi | Bmin, for all 0 ≤ i < k.
Example:
Let N = 3 and A = {2, 6, 7}, then:
LCM({2, 6}) = 6
LCM({2, 7}) = 14
LCM({6, 7}) = 42
LCM({2, 6, 7}) = 42
----------------------- +
answer 104
The naive approach would be to simply calculate the LCM for all O(2N) subsets, which is not feasible for reasonably large N.
Solution sketch:
The problem is obtained from a competition*, which also provided a solution sketch. This is where my problem comes in: I do not understand the hinted approach.
The solution reads (modulo some small fixed grammar issues):
The solution is a bit tricky. If we observe carefully we see that the integers are between 2 and 500. So, if we prime factorize the numbers, we get the following maximum powers:
prime  max power
2      8
3      5
5      3
7      3
11     2
13     2
17     2
19     2
Other than this, all primes have power 1. So, we can easily calculate all possible states, using these integers, leaving 9 * 6 * 4 * 4 * 3 * 3 * 3 * 3 states, which is nearly 70000. For other integers we can make a dp like the following: dp[70000][i], where i can be 0 to 100. However, as dp[i] is dependent on dp[i-1], so dp[70000][2] is enough. This leaves the complexity to n * 70000 which is feasible.
I have the following concrete questions:
What is meant by these states?
Does dp stand for dynamic programming and if so, what recurrence relation is being solved?
How is dp[i] computed from dp[i-1]?
Why do the big primes not contribute to the number of states? Each of them occurs either 0 or 1 times. Should the number of states not be multiplied by 2 for each of these primes (leading to a non-feasible state space again)?
*The original problem description can be found from this source (problem F). This question is a simplified version of that description.
Discussion
After reading the actual contest description (page 10 or 11) and the solution sketch, I have to conclude the author of the solution sketch is quite imprecise in their writing.
The high level problem is to calculate an expected lifetime if components are chosen randomly by fair coin toss. This is what's leading to computing the LCM of all subsets -- all subsets effectively represent the sample space. You could end up with any possible set of components. The failure time for the device is based on the LCM of the set. The expected lifetime is therefore the average of the LCM of all sets.
Note that this ought to include the LCM of sets with only one item (in which case we'd take the LCM to be the element itself). The solution sketch seems to skip over this, perhaps because they handled it in a less elegant manner.
What is meant by these states?
The sketch author only uses the word state twice, but apparently manages to switch meanings. In the first use of the word state it appears they're talking about a possible selection of components. In the second use they're likely talking about possible failure times. They could be muddling this terminology because their dynamic programming solution initializes values from one use of the word and the recurrence relation stems from the other.
Does dp stand for dynamic programming?
I would say either it does or it's a coincidence as the solution sketch seems to heavily imply dynamic programming.
If so, what recurrence relation is being solved? How is dp[i] computed from dp[i-1]?
All I can think is that in their solution, state i represents a time to failure, T(i), with the number of times this time to failure has been counted, dp[i]. The resulting answer would be the sum of all dp[i] * T(i).
dp[i][0] would then be the failure times counted for only the first component. dp[i][1] would then be the failure times counted for the first and second component. dp[i][2] would be for the first, second, and third. Etc..
Initialize dp[i][0] with zeroes except for dp[T(c)][0] (where c is the first component considered) which should be 1 (since this component's failure time has been counted once so far).
To populate dp[i][n] from dp[i][n-1] for each component c:
For each i, copy dp[i][n-1] into dp[i][n].
Add 1 to dp[T(c)][n].
For each i, add dp[i][n-1] to dp[LCM(T(i), T(c))][n].
What is this doing? Suppose you knew that you had a time to failure of j, but you added a component with a time to failure of k. Regardless of what components you had before, your new time to fail is LCM(j, k). This follows from the fact that for two sets A and B, LCM(A union B) = LCM(LCM(A), LCM(B)).
Similarly, if we're considering a time to failure of T(i) and our new component's time to failure of T(c), the resultant time to failure is LCM(T(i), T(c)). Note that we recorded this time to failure for dp[i][n-1] configurations, so we should record that many new times to failure once the new component is introduced.
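That recurrence is perhaps easiest to see with a plain map from LCM value to the number of subsets producing it, ignoring the 70000-state encoding; a small sketch (mine, not the contest solution), with the size-1 subsets subtracted at the end to match the original question:

from math import gcd
from collections import Counter

def sum_of_subset_lcms(a):
    # Sum of LCMs over all subsets of size >= 2, via the "new LCM = LCM(old, x)" recurrence.
    counts = Counter()                      # counts[v] = number of nonempty subsets with LCM v
    for x in a:
        new = counts.copy()
        for v, c in counts.items():
            new[v * x // gcd(v, x)] += c    # extend every earlier subset with x
        new[x] += 1                         # the singleton {x}
        counts = new
    total = sum(v * c for v, c in counts.items())
    return total - sum(a)                   # drop the size-1 subsets

print(sum_of_subset_lcms([2, 6, 7]))        # 104, as in the example at the top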
Why do the big primes not contribute to the number of states?
Each of them occurs either 0 or 1 times. Should the number of states not be multiplied by 2 for each of these primes (leading to a non-feasible state space again)?
You're right, of course. However, the solution sketch states that numbers with large primes are handled in another (unspecified) fashion.
What would happen if we did include them? The number of states we would need to represent would explode into an impractical number. Hence the author accounts for such numbers differently. Note that if a number less than or equal to 500 includes a prime larger than 19 the other factors multiply to 21 or less. This makes such numbers amenable for brute forcing, no tables necessary.
The first part of the editorial seems useful, but the second part is rather vague (and perhaps unhelpful; I'd rather finish this answer than figure it out).
Let's suppose for the moment that the input consists of pairwise distinct primes, e.g., 2, 3, 5, and 7. Then the answer (for summing all sets, where the LCM of 0 integers is 1) is
(1 + 2) (1 + 3) (1 + 5) (1 + 7),
because the LCM of a subset is exactly equal to the product here, so just multiply it out.
Let's relax the restriction that the primes be pairwise distinct. If we have an input like 2, 2, 3, 3, 3, and 5, then the multiplication looks like
(1 + (2^2 - 1) 2) (1 + (2^3 - 1) 3) (1 + (2^1 - 1) 5),
because 2 appears with multiplicity 2, and 3 appears with multiplicity 3, and 5 appears with multiplicity 1. With respect to, e.g., just the set of 3s, there are 2^3 - 1 ways to choose a subset that includes a 3, and 1 way to choose the empty set.
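A quick brute-force check of that product (my own verification snippet; it sums the LCMs over all subsets, counting the empty subset as LCM 1):

from math import gcd
from itertools import combinations

def lcm_of(xs):
    out = 1
    for x in xs:
        out = out * x // gcd(out, x)
    return out

nums = [2, 2, 3, 3, 3, 5]
brute = sum(lcm_of(s) for r in range(len(nums) + 1) for s in combinations(nums, r))
formula = (1 + (2**2 - 1) * 2) * (1 + (2**3 - 1) * 3) * (1 + (2**1 - 1) * 5)
print(brute, formula)                       # 924 924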
Call a prime small if it's 19 or less and large otherwise. Note that integers 500 or less are divisible by at most one large prime (with multiplicity). The small primes are more problematic. What we're going to do is to compute, for each possible small portion of the prime factorization of the LCM (i.e., one of the ~70,000 states), the sum of LCMs for the problem derived by discarding the integers that could not divide such an LCM and leaving only the large prime factor (or 1) for the other integers.
For example, if the input is 2, 30, 41, 46, and 51, and the state is 2, then we retain 2 as 1, discard 30 (= 2 * 3 * 5; 3 and 5 are small), retain 41 as 41 (41 is large), retain 46 as 23 (= 2 * 23; 23 is large), and discard 51 (= 3 * 17; 3 and 17 are small). Now, we compute the sum of LCMs using the previously described technique. Use inclusion-exclusion to get rid of the subsets whose LCM's small portion properly divides the state instead of being exactly equal to it. Maybe I'll work a complete example later.
What is meant by these states?
I think here, states refer to whether the number is in the set B = {b0, b1, ..., bk-1} of LCMs of set A.
Does dp stand for dynamic programming and if so, what recurrence relation is being solved?
dp in the solution sketch stands for dynamic programming, I believe.
How is dp[i] computed from dp[i-1]?
We can figure out the states of the next group of LCMs from the previous states, so we only need an array of size 2 and can toggle back and forth.
Why do the big primes not contribute to the number of states? Each of them occurs either 0 or 1 times. Should the number of states not be multiplied by 2 for each of these primes (leading to a non-feasible state space again)?
We can use prime factorization, keeping only the exponents, to represent a number.
Here is one example.
6 = (2^1)(3^1)(5^0) -> state "1 1 0" to represent 6
18 = (2^1)(3^2)(5^0) -> state "1 2 0" to represent 18
Here is how we can get the LCM of 6 and 18 using prime factorization:
LCM(6, 18) = (2^max(1,1)) (3^max(1,2)) (5^max(0,0)) = (2^1)(3^2)(5^0) = 18
2^9 > 500, 3^6 > 500, 5^4 > 500, 7^4>500, 11^3 > 500, 13^3 > 500, 17^3 > 500, 19^3 > 500
so we can use just the exponents of the primes 2, 3, 5, 7, 11, 13, 17, 19 to represent the LCMs in the set B = {b0, b1, ..., bk-1}
for the given set A = {a0, a1, ..., aN-1} (1 ≤ N ≤ 100), with 2 ≤ ai ≤ 500.
9 * 6 * 4 * 4 * 3 * 3 * 3 * 3 <= 70000, so we only need two copies of dp[9][6][4][4][3][3][3][3] to keep track of all the LCMs' states. So, dp[70000][2] is enough.
I put together a small C++ program to illustrate how we can get the sum of LCMs of the given set A = {a0, a1, ..., aN-1} (1 ≤ N ≤ 100), with 2 ≤ ai ≤ 500. As in the solution sketch, we need to loop through at most 70000 possible LCM states.
#include <iostream>
#include <cstring>   // memset
using namespace std;

int gcd(int a, int b) {
    int remainder = 0;
    do {
        remainder = a % b;
        a = b;
        b = remainder;
    } while (b != 0);
    return a;
}

int lcm(int a, int b) {
    if (a == 0 || b == 0) {
        return 0;
    }
    return (a * b) / gcd(a, b);
}

int sum_of_lcm(int A[], int N) {
    // get the max LCM from the array
    int max = A[0];
    for (int i = 1; i < N; i++) {
        max = lcm(max, A[i]);
    }
    max++;
    //
    int dp[max][2];
    memset(dp, 0, sizeof(dp));
    int pri = 0;
    int cur = 1;
    // loop through n x 70000
    for (int i = 0; i < N; i++) {
        for (int v = 1; v < max; v++) {
            int x = A[i];
            if (dp[v][pri] > 0) {
                x = lcm(A[i], v);
                dp[v][cur] = (dp[v][cur] == 0) ? dp[v][pri] : dp[v][cur];
                if ( x % A[i] != 0 ) {
                    dp[x][cur] += dp[v][pri] + dp[A[i]][pri];
                } else {
                    dp[x][cur] += ( x==v ) ? ( dp[v][pri] + dp[v][pri] ) : ( dp[v][pri] ) ;
                }
            }
        }
        dp[A[i]][cur]++;
        pri = cur;
        cur = (pri + 1) % 2;
    }
    for (int i = 0; i < N; i++) {
        dp[A[i]][pri] -= 1;
    }
    long total = 0;
    for (int j = 0; j < max; j++) {
        if (dp[j][pri] > 0) {
            total += dp[j][pri] * j;
        }
    }
    cout << "total:" << total << endl;
    return total;
}

int test() {
    int a[] = { 2, 6, 7 };
    int n = sizeof(a)/sizeof(a[0]);
    int total = sum_of_lcm(a, n);
    return 0;
}
Output
total:104
The states are one more than the powers of primes. You have numbers up to 2^8, so the power of 2 is in [0..8], which is 9 states. Similarly for the other states.
"dp" could well stand for dynamic programming, I'm not sure.
The recurrence relation is the heart of the problem, so you will learn more by solving it yourself. Start with some small, simple examples.
For the large primes, try solving a reduced problem without using them (or their equivalents) and then add them back in to see their effect on the final result.

How can you compare to what extent two lists are in the same order?

I have two arrays containing the same elements, but in different orders, and I want to know the extent to which their orders differ.
The method I tried didn't work. It was as follows:
For each list I built a matrix which recorded, for each pair of elements, whether they were above or below each other in the list. I then calculated a Pearson correlation coefficient of these two matrices. This worked extremely badly. Here's a trivial example:
list 1:
1
2
3
4
list 2:
1
3
2
4
The method I described above produced matrices like this (where a 1 means the row element appears above the column element in the list, and 0 vice-versa; cells below the diagonal are not recorded):
list 1:
  | 1 2 3 4
1 | - 1 1 1
2 | - - 1 1
3 | - - - 1
4 | - - - -
list 2:
  | 1 2 3 4
1 | - 1 1 1
2 | - - 0 1
3 | - - - 1
4 | - - - -
Since the only difference is the order of elements 2 and 3, these should be deemed to be very similar. The Pearson Correlation Coefficient for those two matrices is 0, suggesting they are not correlated at all. I guess the problem is that what I'm looking for is not really a correlation coefficient, but some other kind of similarity measure. Edit distance, perhaps?
Can anyone suggest anything better?
Mean square of differences of indices of each element.
List 1: A B C D E
List 2: A D C B E
Indices of each element of List 1 in List 2 (zero based)
A B C D E
0 3 2 1 4
Indices of each element of List 1 in List 1 (zero based)
A B C D E
0 1 2 3 4
Differences:
A B C D E
0 -2 0 2 0
Square of differences:
A B C D E
0 4 0 4 0
Average differentness = 8 / 5.
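A tiny sketch of this measure (my own; it assumes the elements are unique):

def average_differentness(list1, list2):
    # Mean squared difference between each element's position in the two lists.
    pos2 = {v: i for i, v in enumerate(list2)}
    return sum((i - pos2[v]) ** 2 for i, v in enumerate(list1)) / len(list1)

print(average_differentness(list("ABCDE"), list("ADCBE")))   # 1.6, i.e. 8 / 5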
Just an idea, but is there any mileage in adapting a standard sort algorithm to count the number of swap operations needed to transform list1 into list2?
I think that defining the compare function may be difficult though (perhaps even just as difficult as the original problem!), and this may be inefficient.
edit: thinking about this a bit more, the compare function would essentially be defined by the target list itself. So for example if list 2 is:
1 4 6 5 3
...then the compare function should result in 1 < 4 < 6 < 5 < 3 (and return equality where entries are equal).
Then the swap function just needs to be extended to count the swap operations.
A bit late for the party here, but just for the record, I think Ben almost had it... if you'd looked further into correlation coefficients, I think you'd have found that Spearman's rank correlation coefficient might have been the way to go.
Interestingly, jamesh seems to have derived a similar measure, but not normalized.
See this recent SO answer.
You might consider how many changes it takes to transform one string into another (which I guess it was you were getting at when you mentioned edit distance).
See: http://en.wikipedia.org/wiki/Levenshtein_distance
Although I don't think l-distance takes into account rotation. If you allow rotation as an operation then:
1, 2, 3, 4
and
2, 3, 4, 1
Are pretty similar.
There is a branch-and-bound algorithm that should work for any set of operators you like. It may not be real fast. The pseudocode goes something like this:
bool bounded_recursive_compare_routine(int* a, int* b, int level, int bound){
    if (level > bound) return false;
    // if at end of a and b, return true
    // apply rule 0, like no-change
    if (*a == *b){
        bounded_recursive_compare_routine(a+1, b+1, level+0, bound);
        // if it returns true, return true;
    }
    // if can apply rule 1, like rotation, to b, try that and recur
    bounded_recursive_compare_routine(a+1, b+1, level+cost_of_rotation, bound);
    // if it returns true, return true;
    ...
    return false;
}

int get_minimum_cost(int* a, int* b){
    int bound;
    for (bound=0; ; bound++){
        if (bounded_recursive_compare_routine(a, b, 0, bound)) break;
    }
    return bound;
}
The time it takes is roughly exponential in the answer, because it is dominated by the last bound that works.
Added: This can be extended to find the nearest-matching string stored in a trie. I did that years ago in a spelling-correction algorithm.
I'm not sure exactly what formula it uses under the hood, but difflib.SequenceMatcher.ratio() does exactly this:
ratio(self) method of difflib.SequenceMatcher instance:
Return a measure of the sequences' similarity (float in [0,1]).
Code example:
from difflib import SequenceMatcher
sm = SequenceMatcher(None, '1234', '1324')
print sm.ratio()
>>> 0.75
Another approach that is based on a little bit of mathematics is to count the number of inversions needed to convert one of the arrays into the other one. An inversion is the exchange of two neighboring array elements. In Ruby it is done like this:
# extend class Array by a new method
class Array
  def dist(other)
    raise 'can calculate distance only to array with same length' if length != other.length
    # initialize count of inversions to 0
    count = 0
    # loop over all pairs of indices i, j with i<j
    length.times do |i|
      (i+1).upto(length - 1) do |j|
        # increase count if i-th and j-th element have different order
        count += 1 if (self[i] <=> self[j]) != (other[i] <=> other[j])
      end
    end
    return count
  end
end

l1 = [1, 2, 3, 4]
l2 = [1, 3, 2, 4]
# try an example (prints 1)
puts l1.dist(l2)
The distance between two arrays of length n can be between 0 (they are the same) and n*(n-1)/2 (reversing the first array gives the second). If you prefer to have distances always between 0 and 1, to be able to compare distances of pairs of arrays of different length, just divide by n*(n-1)/2.
A disadvantage of this algorithm is its running time of O(n^2). It also assumes that the arrays don't have duplicate entries, but it could be adapted.
A remark about the code line "count += 1 if ...": the count is increased only if either the i-th element of the first list is smaller than its j-th element and the i-th element of the second list is bigger than its j-th element or vice versa (meaning that the i-th element of the first list is bigger than its j-th element and the i-th element of the second list is smaller than its j-th element). In short: (l1[i] < l1[j] and l2[i] > l2[j]) or (l1[i] > l1[j] and l2[i] < l2[j])
If one has two orderings, one should look at two important rank correlation coefficients:
Spearman's rank correlation coefficient: https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient
This is almost the same as jamesh's answer but scaled to the range -1 to 1.
It is defined as:
1 - (6 * sum_of_squared_distances) / (n_samples * (n_samples**2 - 1))
Kendall's tau: https://nl.wikipedia.org/wiki/Kendalls_tau
When using python one could use:
from scipy import stats
order1 = [ 1, 2, 3, 4]
order2 = [ 1, 3, 2, 4]
print stats.spearmanr(order1, order2)[0]
>> 0.8000
print stats.kendalltau(order1, order2)[0]
>> 0.6667
If anyone is using the R language, I've implemented a function that computes the Spearman rank correlation coefficient using the method described above by bubake:
get_spearman_coef <- function(objectA, objectB) {
  # getting the spearman rho rank test
  spearman_data <- data.frame(listA = objectA, listB = objectB)
  spearman_data$rankA <- 1:nrow(spearman_data)
  rankB <- c()
  for (index_valueA in 1:nrow(spearman_data)) {
    for (index_valueB in 1:nrow(spearman_data)) {
      if (spearman_data$listA[index_valueA] == spearman_data$listB[index_valueB]) {
        rankB <- append(rankB, index_valueB)
      }
    }
  }
  spearman_data$rankB <- rankB
  spearman_data$distance <- (spearman_data$rankA - spearman_data$rankB)**2
  spearman <- 1 - ( (6 * sum(spearman_data$distance)) / (nrow(spearman_data) * ( nrow(spearman_data)**2 - 1) ) )
  print(paste("spearman's rank correlation coefficient:", spearman))
  return(spearman)
}
results :
get_spearman_coef(c("a","b","c","d","e"), c("a","b","c","d","e"))
spearman's rank correlation coefficient: 1
get_spearman_coef(c("a","b","c","d","e"), c("b","a","d","c","e"))
spearman's rank correlation coefficient: 0.9
