My problem is the following. I have the adjacency matrix Mat for a neural network. I want to randomize this network in the sense that I want to choose 4 nodes randomly (say i,j,p,q) such that i and p are connected (which means Mat[p,i] = 1) and j and q are connected, AND i and q are not connected (Mat[q,i] = 0) and j and p are not connected. I then connect i and q, and j and p, and disconnect the previously connected pairs. In one run, I want to do this 10^6 times.
So far I have two versions, one using a for loop and one recursively.
using StatsBase  # provides `sample`
newmat = copy(Mat)
for trial in 1:Niter
    count = 0
    while count < 1
        i, j, p, q = sample(Nodes, 4, replace = false)  # choose 4 distinct nodes at random
        if (newmat[p,i] == 1 && newmat[q,j] == 1) && (newmat[p,j] == 0 && newmat[q,i] == 0)
            newmat[p,i] = 0
            newmat[q,j] = 0
            newmat[p,j] = 1
            newmat[q,i] = 1
            count += 1
        end
    end
end
Doing this recursively runs about as fast up to Niter = 10^4, after which I get a stack overflow error. How can I improve this?
I assume you are talking about a recursive variant of the for trial in 1:Niter.
To avoid stack overflows like this, a general rule of thumb (in languages without tail recursion elimination) is to not use recursion unless you know the recursion depth will not scale more than logarithmically.
The cases where this is applicable are mostly algorithms that resemble tree traversals, with a "naturally occurring" recursive structure. Your case of a simple for loop can be viewed as the degenerate variant of that, with a "linked list" tree, but it is not at all natural.
Just don't do it. There's nothing bad about a loop for some sequential processing like this. Julia is an imperative language, after all.
(If you want to do this with a recursive structure for fun or exercise: look up trampolines. They allow you to write code structured as tail recursive, but with the allocation happening by mutation and on the heap.)
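To give an idea of what a trampoline looks like, here is a minimal sketch in Python (chosen only for illustration; the names trampoline and count_up are made up). Each "recursive call" is returned as a zero-argument function instead of being executed, and a plain loop keeps calling until a non-callable result comes back, so the call stack never grows:

```python
def trampoline(step, *args):
    """Run `step` and keep calling the thunks it returns until a value appears."""
    result = step(*args)
    while callable(result):
        result = result()
    return result


def count_up(n, acc=0):
    """Written in tail-recursive style, but returns a thunk instead of recursing."""
    if n == 0:
        return acc
    return lambda: count_up(n - 1, acc + 1)


# Far deeper than Python's default recursion limit, yet no stack overflow.
print(trampoline(count_up, 100_000))
```

The pending state lives in the closure on the heap, which is the trade-off mentioned above. This only works as written when the final result is not itself callable.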
Instead of sampling 4 random nodes and hoping they happen to be connected, you can sample the starting nodes p and q, and look for i and j within the nodes that these are connected to. Here's an implementation of that:
using StatsBase  # provides `sample`
function randomizeconnections(adjmatin)
    adjmat = copy(adjmatin)
    nodes = axes(adjmat, 2)
    niter = 10
    for trial in 1:niter
        p, q = sample(nodes, 2, replace = false)
        @views prow, qrow = findall(adjmat[p, :]), findall(adjmat[q, :])
        # filter against the unfiltered neighbour lists, so that shared targets are excluded from both
        plist = filter(i -> !in(i, qrow) && i != q, prow)
        qlist = filter(j -> !in(j, prow) && j != p, qrow)
        if isempty(plist) || isempty(qlist)
            @debug "No swappable exclusive target nodes for source nodes $p and $q, skipping trial $trial..."
            continue
        end
        i = rand(plist)
        j = rand(qlist)
        adjmat[p, i] = adjmat[q, j] = false
        adjmat[p, j] = adjmat[q, i] = true
    end
    adjmat
end
Through the course of randomization, it may happen that two nodes don't have any swappable connections i.e. they may share all their end points or one's ending nodes are a subset of the other's. So there's a check for that in the above code, and the loop moves on to the next iteration in that case.
The line with the findalls in the above code effectively creates adjacency lists from the adjacency matrix on the fly. You can instead do that in one go at the beginning, and work with that adjacency list vector instead.
using StatsBase  # provides `sample`
function randomizeconnections2(adjmatin)
    adjlist = [findall(r) for r in eachrow(adjmatin)]
    nodes = axes(adjlist, 1)
    niter = 10
    for trial in 1:niter
        p, q = sample(nodes, 2, replace = false)
        plist = filter(i -> !in(i, adjlist[q]) && i != q, adjlist[p])
        qlist = filter(j -> !in(j, adjlist[p]) && j != p, adjlist[q])
        if isempty(plist) || isempty(qlist)
            @debug "No swappable exclusive target nodes for source nodes $p and $q, skipping trial $trial..."
            continue
        end
        i = rand(plist)
        j = rand(qlist)
        replace!(adjlist[p], i => j)
        replace!(adjlist[q], j => i)
    end
    create_adjmat(adjlist)
end
function create_adjmat(adjlist::Vector{Vector{Int}})
    adjmat = falses(length(adjlist), length(adjlist))
    for (i, l) in pairs(adjlist)
        adjmat[i, l] .= true
    end
    adjmat
end
With the small matrices I tried locally, randomizeconnections2 seems about twice as fast as randomizeconnections, but you may want to confirm whether that's the case with your matrix sizes and values.
Both of these accept (and were tested with) BitMatrix type values as input, which should be more efficient than an ordinary matrix of booleans or integers.
This was a question asked in the coding round for NASDAQ internship.
Program description:
The program takes a binary string as input. We have to successively remove sub-sequences having all characters alternating, till the string is empty. The task was to find the minimum number of steps required to do so.
Example1:
let the string be : 0111001
Removed-0101, Remaining-110
Removed-10 , Remaining-1
Removed-1
No of steps = 3
Example2:
let the string be : 111000111
Removed-101, Remaining-110011
Removed-101, Remaining-101
Removed-101
No of steps = 3
Example3:
let the string be : 11011
Removed-101, Remaining-11
Removed-1 , Remaining-1
Removed-1
No of steps = 3
Example4:
let the string be : 10101
Removed-10101
No of steps = 1
The solution I tried, considered the first character of the binary string as first character for my sub-sequence. Then created a new string, where the next character would be appended if it wasn't part of the alternating sequence. The new string becomes our binary string. In this way, a loop continues till the new string is empty. (somewhat an O(n^2) algorithm). As expected, it gave me a timeout error. Adding a somewhat similar code in C++ to the one I had tried, which was in Java.
#include<bits/stdc++.h>
using namespace std;
int main() {
    string str, newStr;
    int len;
    char c;
    int count = 0;
    getline(cin, str);
    len = str.length();
    //continue till string is empty
    while(len > 0) {
        len = 0;
        c = str[0];
        for(int i=1; str[i] != '\0';i++) {
            //if alternative characters are found, set as c and avoid that character
            if(c != str[i])
                c = str[i];
            //if next character is not alternate, add the character to newStr
            else {
                newStr.push_back(str[i]);
                len++;
            }
        }
        str = newStr;
        newStr = "";
        count++;
    }
    cout<<count<<endl;
    return 0;
}
I also tried methods like finding the length of the largest sub sequence of same consecutive characters which obviously didn't satisfy every case, like that of example3.
Hope somebody could help me with the most optimized solution for this question. Preferably a code in C, C++ or python. Even the algorithm would do.
I found a more optimal O(N log N) solution by maintaining a min-heap and a look-up hash map.
We start with the initial array as alternating counts of 0s and 1s.
That is, for string = 0111001, let's assume our input array S = [1,3,2,1].
Basic idea:
1. Heapify the count array.
2. Extract the minimum-count node => add to num_steps.
3. Now extract both its neighbours (maintained in the Node class) from the heap using the lookup map.
4. Merge both these neighbours and insert the result into the heap.
5. Repeat steps 2-4 until no entries remain in the heap.
Code implementation in Python
class Node:
    def __init__(self, node_type: int, count: int):
        self.prev = None
        self.next = None
        self.node_type = node_type
        self.node_count = count

    @staticmethod
    def compare(node1, node2) -> bool:
        return node1.node_count < node2.node_count

def get_num_steps(S: list):  ## Example: S = [2, 1, 2, 3]
    heap = []
    node_heap_position_map = {}  ## Map[Node] -> Heap-index
    prev = None
    type = 0
    for s in S:
        node: Node = Node(type, s)
        node.prev = prev
        if prev is not None:
            prev.next = node
        prev = node
        type = 1 - type
        # Add element to the map and also maintain the updated positions of the elements for easy lookup
        addElementToHeap(heap, node_heap_position_map, node)
    num_steps = 0
    last_val = 0
    while len(heap) > 0:
        # Extract top-element and also update the positions in the lookup-map
        top_heap_val: Node = extractMinFromHeap(heap, node_heap_position_map)
        num_steps += top_heap_val.node_count - last_val
        last_val = top_heap_val.node_count
        # If its the corner element, no merging is required
        if top_heap_val.prev is None or top_heap_val.next is None:
            continue
        # Merge the nodes adjacent to the extracted-min-node:
        prev_node = top_heap_val.prev
        next_node = top_heap_val.next
        removeNodeFromHeap(prev_node, node_heap_position_map)
        removeNodeFromHeap(next_node, node_heap_position_map)
        del node_heap_position_map[prev_node]
        del node_heap_position_map[next_node]
        # Created the merged-node for neighbours and add to the Heap; and update the lookup-map
        merged_node = Node(prev_node.node_type, prev_node.node_count + next_node.node_count)
        merged_node.prev = prev_node.prev
        merged_node.next = next_node.next
        addElementToHeap(heap, node_heap_position_map, merged_node)
    return num_steps
PS: I haven't implemented the min-heap operations above (addElementToHeap, extractMinFromHeap, removeNodeFromHeap), but the function names should be self-explanatory.
We can solve this in O(n) time and O(1) space.
This isn't about order at all. The actual task, when you think about it, is how to divide the string into the least number of subsequences that consist of alternating characters (where a single is allowed). Just maintain two queues or stacks; one for 1s, the other for 0s, where characters pop their immediate alternate predecessors. Keep a record of how long the queue is at any one time during the iteration (not including the replacement moves).
Examples:
(1)
0111001
queues
1 1 -
0 - 0
0 - 00
1 1 0
1 11 -
1 111 - <- max 3
0 11 0
For O(1) space, the queues can just be two numbers representing the current counts.
(2)
111000111
queues (count of 1s and count of 0s)
1 1 0
1 2 0
1 3 0 <- max 3
0 2 1
0 1 2
0 0 3 <- max 3
1 1 2
1 2 1
1 3 0 <- max 3
(3)
11011
queues
1 1 0
1 2 0
0 1 1
1 2 0
1 3 0 <- max 3
(4)
10101
queues
1 1 0 <- max 1
0 0 1 <- max 1
1 1 0 <- max 1
0 0 1 <- max 1
1 1 0 <- max 1
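The two-counter idea above can be sketched in a few lines of Python. This is only an illustration under the assumption that the input contains nothing but '0' and '1'; the function name is made up. Each character takes over a waiting subsequence that ends in the opposite character if there is one, and the answer is the largest value either counter reaches:

```python
def min_steps_counters(s: str) -> int:
    """Largest count of subsequences ending in the same character at any time."""
    ones = zeros = longest = 0
    for ch in s:
        if ch == "1":
            if zeros:          # extend a subsequence that currently ends in 0
                zeros -= 1
            ones += 1
        else:
            if ones:           # extend a subsequence that currently ends in 1
                ones -= 1
            zeros += 1
        longest = max(longest, ones, zeros)
    return longest


for example in ("0111001", "111000111", "11011", "10101"):
    print(example, min_steps_counters(example))
```

It scans left to right, whereas example (1) above is traced from the right; the maximum comes out the same for the four examples.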
I won't write the full code. But I have an idea of an approach that will probably be fast enough (certainly faster than building all of the intermediate strings).
Read the input and change it to a representation that consists of the lengths of sequences of the same character. So 11011 is represented with a structure that specifies it something like [{length: 2, value: 1}, {length: 1, value: 0}, {length: 2, value: 1}]. With some cleverness you can drop the values entirely and represent it as [2, 1, 2] - I'll leave that as an exercise for the reader.
With that representation you know that you can remove one value from each of the identified sequences of the same character in each "step". You can do this a number of times equal to the smallest length of any of those sequences.
So you identify the minimum sequence length, add that to a total number of operations that you're tracking, then subtract that from every sequence's length.
After doing that, you need to deal with sequences of 0 length. - Remove them, then if there are any adjacent sequences of the same value, merge those (add together the lengths, remove one). This merging step is the one that requires some care if you're going for the representation that forgets the values.
Keep repeating this until there's nothing left. It should run somewhat faster than dealing with string manipulations.
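A rough Python sketch of the procedure described so far, keeping the values alongside the lengths so that the merge step stays simple (the function name is made up, and this has only been checked against the four examples from the question):

```python
from itertools import groupby


def steps_by_run_lengths(s: str) -> int:
    # Run-length encode: "11011" -> [["1", 2], ["0", 1], ["1", 2]]
    runs = [[ch, len(list(group))] for ch, group in groupby(s)]
    steps = 0
    while runs:
        shortest = min(length for _, length in runs)
        steps += shortest
        merged = []
        for ch, length in runs:
            length -= shortest
            if length == 0:
                continue                      # this run has been used up
            if merged and merged[-1][0] == ch:
                merged[-1][1] += length       # neighbours with the same value merge
            else:
                merged.append([ch, length])
        runs = merged
    return steps


print(steps_by_run_lengths("11011"))
```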
There's probably an even better approach that doesn't iterate through the steps at all after making this representation, and instead examines the lengths of the sequences in one pass from start to end. I haven't worked out exactly what that approach is, but I'm reasonably confident that it exists. After trying what I've outlined above, working that out is a good idea. I have a feeling it's something like this: start a total at 0 and keep track of the minimum and maximum values the total reaches. Scan each character from the start of the string, adding 1 to the total for each 1 encountered and subtracting 1 for each 0 encountered. The answer is the greater of the absolute values of the minimum and maximum reached by the total. I haven't verified that, it's just a hunch. Comments have led to further speculation that adding together the maximum and the absolute value of the minimum may be more realistic.
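Here is a small Python sketch of the second variant (maximum plus absolute value of the minimum of the running total, with the total starting at 0). The function name is made up. It reproduces the step counts of the four examples in the question, which is a check against examples and not a proof. The first variant (the greater of the two absolute values) would give 2 for 0111001, where the expected answer is 3.

```python
def min_steps_prefix(s: str) -> int:
    total = lowest = highest = 0
    for ch in s:
        total += 1 if ch == "1" else -1
        lowest = min(lowest, total)
        highest = max(highest, total)
    # lowest is <= 0, so this equals highest + abs(lowest)
    return highest - lowest


for example in ("0111001", "111000111", "11011", "10101"):
    print(example, min_steps_prefix(example))
```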
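Time complexity - O(n)
#include <iostream>
#include <string>
using namespace std;
void solve(string s) {
    int n = s.size();
    // zero / One: number of subsequences currently ending in '0' / '1'
    int zero = 0, One = 0, res = 0;
    for (int i = 0; i < n; i++)
    {
        if (s[i] == '1')
        {
            if (zero > 0)
                zero--;   // extend a subsequence that ends in '0'
            else
                res++;    // otherwise a new subsequence has to start
            One++;
        }
        else
        {
            if (One > 0)
                One--;
            else
                res++;
            zero++;
        }
    }
    cout << res << endl;
}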
We can calculate the minimum cost using, say, this recurrence relation:
min(mat[i-1][j],mat[i][j-1])+mat[i][j];
0 1 2 3
4 5 6 7
8 9 10 11
Calculating the minimum cost with the above recurrence relation, we get min-cost(1,2) = 0+1+2+6 = 9.
I am getting the minimum cost sum, that's not the problem. Now I want to print the elements 0, 1, 2, 6, because these elements make up the minimum-cost path.
Any help is really appreciated.
Suppose, your endpoint is [x, y] and start-point is [a, b]. After the recursion step, now start from the endpoint and crawl-back/backtrack to start point.
Here is the pseudocode:
# Assuming grid is the given input 2D grid and mat is the cost matrix
output = []
p = x, q = y
while(p != a || q != b):
    output.add(grid[p][q])
    min = infinity
    newP = -1, newQ = -1
    if(p - 1 >= a && mat[p - 1][q] < min):
        min = mat[p - 1][q]
        newP = p - 1
        newQ = q
    if(q - 1 >= b && mat[p][q - 1] < min):
        min = mat[p][q - 1]
        newP = p
        newQ = q - 1
    p = newP, q = newQ
end
output.add(grid[a][b])
# print output (it lists the path from the endpoint back to the start point)
Notice that we used both mat and grid here: two 2D matrices, where grid is the given input and mat is the matrix generated by the recursion step mat[i][j] = min(mat[i - 1][j], mat[i][j - 1]) + grid[i][j]
Hope it helps!
Besides computing the min cost matrix using the relation that you mentioned, you can also create a predecessor matrix.
For each cell (i, j), you should also store the information about who was the "min" in the relation that you mentioned (was it the left element, or is it the element above?). In this way, you will know for each cell, which is its preceding cell in an optimal path.
Afterwards, you can generate the path by starting from the final cell and moving backwards according to the "predecessor" matrix, until you reach the top-left cell.
Note that the going backwards idea can be applied also without explicitly constructing a predecessor matrix. At each point, you would need to look which of the candidate predecessors has a lower total cost.
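As a rough illustration of the predecessor-matrix idea, here is a Python sketch. The function name, the (row, column) target convention and the tie-breaking rule (prefer the cell above) are my own assumptions, not part of the question:

```python
def min_cost_path(grid, target):
    """Return (cost, path values) from (0, 0) to target, moving only right or down."""
    ti, tj = target
    inf = float("inf")
    cost = [[inf] * (tj + 1) for _ in range(ti + 1)]
    pred = [[None] * (tj + 1) for _ in range(ti + 1)]   # predecessor of each cell
    for i in range(ti + 1):
        for j in range(tj + 1):
            if i == 0 and j == 0:
                cost[0][0] = grid[0][0]
                continue
            up = cost[i - 1][j] if i > 0 else inf
            left = cost[i][j - 1] if j > 0 else inf
            if up <= left:
                cost[i][j], pred[i][j] = up + grid[i][j], (i - 1, j)
            else:
                cost[i][j], pred[i][j] = left + grid[i][j], (i, j - 1)
    # Walk backwards from the target along the stored predecessors.
    path, cell = [], (ti, tj)
    while cell is not None:
        path.append(grid[cell[0]][cell[1]])
        cell = pred[cell[0]][cell[1]]
    path.reverse()
    return cost[ti][tj], path


grid = [[0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11]]
print(min_cost_path(grid, (1, 2)))   # (9, [0, 1, 2, 6])
```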
Hello all, I have a problem computing the rank of a binary matrix, i.e. one containing only 1s and 0s. The rank should be based on row reduction using the Boolean XOR operation. The XOR operation is:
1 xor 1 =0
1 xor 0= 1
0 xor 0= 0
0 xor 1= 1
Given a binary matrix as
A =
1 1 0 0 0 0
1 0 0 0 0 1
0 1 0 0 0 1
We can see that the third row equals the first row XORed with the second row. Hence the rank of matrix A is only 2, instead of the 3 returned by MATLAB's rank function.
I have one way to compute the exact rank of a binary matrix, using this code:
B=gf(A)
rank(B)
It returns 2. However, when I compute it for a large matrix, for example 400 by 400, it does not return the rank (it never seems to finish). Could you suggest a good way to find the rank of a large binary matrix? Thanks, all.
UPDATE: this is computation time using tic toc
N=50; Elapsed time is=0.646823 seconds
N=100;Elapsed time is 3.123573 seconds.
N=150;Elapsed time is 7.438541 seconds.
N=200;Elapsed time is 11.349964 seconds.
N=400;Elapsed time is 66.815286 seconds.
Note that the rank check is only one condition in my algorithm. However, it takes a very long time, which affects my method.
Based on the suggestion of R, I will use Gaussian elimination to find the rank. This is my code. However, it calls the rank function (which costs some computation time). Could you help me modify it so that it does not use the rank function?
function rankA=GaussEliRank(A)
    mat = A;
    [m, n] = size(A);                  % read the size of the original matrix A
    for i = 1 : n
        j = find(mat(i:m, i), 1);      % finds the FIRST 1 in i-th column starting at i
        if isempty(j)
            mat = mat( sum(mat,2)>0 ,:);
            rankA=rank(mat); %%Here
            return;
        else
            j = j + i - 1;             % we need to add i-1 since j starts at i
            temp = mat(j, :);          % swap rows
            mat(j, :) = mat(i, :);
            mat(i, :) = temp;
            % add i-th row to all rows that contain 1 in i-th column
            % starting at j+1 - remember up to j are zeros
            for k = find(mat( (j+1):m, i ))'
                mat(j + k, :) = bitxor(mat(j + k, :), mat(i, :));
            end
        end
    end
    %remove all-zero rows if there are some
    mat = mat( sum(mat,2)>0 ,:);
    if any(sum( mat(:,1:n) ,2)==0)     % no solution because matrix A contains
        error('No solution.');         % all-zero row, but with nonzero RHS
    end
    rankA=rank(mat); %%Here
end
You can check with the matrix A here. The correct answer for the rank of A is 393.
Once you get the matrix into row echelon form with Gaussian elimination, the rank is the number of nonzero rows. You should be able to replace the code after the loop with something like rankA=sum(sum(mat,2)>0);.
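For comparison, the same idea (eliminate with XOR, then count the rows that survive as pivots) can be sketched in Python by packing each row into one integer, so that a row XOR is a single operation. This is an illustration of the technique, not a port of the MATLAB code above; the function name is made up.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of rows of 0/1 values."""
    # Pack every row into an integer bitmask (column k -> bit k).
    packed = [sum(bit << k for k, bit in enumerate(row)) for row in rows]
    rank = 0
    while packed:
        pivot = packed.pop()
        if pivot == 0:
            continue                      # all-zero row contributes nothing
        rank += 1
        lowest = pivot & -pivot           # lowest set bit acts as the pivot column
        # Clear that column in every remaining row by XORing with the pivot row.
        packed = [row ^ pivot if row & lowest else row for row in packed]
    return rank


A = [[1, 1, 0, 0, 0, 0],
     [1, 0, 0, 0, 0, 1],
     [0, 1, 0, 0, 0, 1]]
print(gf2_rank(A))   # 2
```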
I am looking for a constant-time implementation of lowest common ancestor for two given nodes in a full binary tree (the children of node x are 2*x and 2*x+1).
My problem is that there is a large number of nodes in the tree and there are many queries. Is there an algorithm that preprocesses the tree so that queries can be answered in constant time?
I looked into LCA using RMQ, but I can't use that technique because I can't allocate an array for this many nodes.
Can someone give me an efficient implementation of an algorithm for answering many queries quickly, knowing that it is a full binary tree and the relation between nodes is as given above?
What I did was start with the two given nodes and successively find their parents (node/2), keeping a hash list of visited nodes. Whenever we reach a node that is already in the hash list, that node is the lowest common ancestor.
But when there are many queries this algorithm is very time-consuming, as in the worst case I may have to traverse a height of 30 (the maximum height of the tree) to reach the root.
If you represent the two indices in binary, then the LCA can be found in two steps:
1. Shift right the larger number until the leading 1 bit is in the same place as the other number.
2. Shift right both numbers until they are the same.
The first step can be done by getting log base 2 (rounded down) of the numbers and shifting the larger number right by the difference:
if a>b:
    a = shift_right(a,log2(a)-log2(b))
else:
    b = shift_right(b,log2(b)-log2(a))
The second step can be done by XORing the resulting two numbers and shifting right by the log base 2 of the result (plus 1):
if a==b:
    return a
else:
    return shift_right(a,log2(xor(a,b))+1)
Log base 2 can be found in O(log(word_size)) time, so as long as you are using integer indices with a fixed number of bits, this is effectively constant.
See this question for information on fast ways to compute log base 2:
Fast computing of log2 for 64-bit integers
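As a sketch of the two steps in Python, where int.bit_length() plays the role of floor(log2(x)) + 1 (the function name lca is just for illustration, and node indices are assumed to start at 1 for the root):

```python
def lca(a: int, b: int) -> int:
    """Lowest common ancestor in a tree where node x has children 2*x and 2*x+1."""
    if a < 1 or b < 1:
        raise ValueError("node indices start at 1")
    # Step 1: bring both nodes to the same depth.
    depth_a, depth_b = a.bit_length(), b.bit_length()
    if depth_a > depth_b:
        a >>= depth_a - depth_b
    else:
        b >>= depth_b - depth_a
    # Step 2: drop the bits from the highest differing position downwards.
    # (a ^ b).bit_length() is 0 when a == b, so nothing is shifted in that case.
    return a >> (a ^ b).bit_length()


print(lca(12, 8))   # 1
print(lca(4, 5))    # 2
```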
Edit :-
Faster way to get the common_ancestor in O(log(logn)) :-
/* Number of bits needed to represent x (0 for x == 0), found by binary search. */
int get_bits(unsigned int x) {
    int high = 31;
    int low = 0, mid;
    while (high >= low) {
        mid = (high + low) / 2;
        /* 1u keeps the shift unsigned, so mid == 31 does not overflow */
        if ((1u << mid) == x)
            return mid + 1;
        if ((1u << mid) < x) {
            low = mid + 1;
        }
        else {
            high = mid - 1;
        }
    }
    if ((1u << mid) > x)
        return mid;
    return mid + 1;
}
unsigned int Common_Ancestor(unsigned int x, unsigned int y) {
    int xbits = get_bits(x);
    int ybits = get_bits(y);
    int diff, kbits;
    unsigned int k;
    if (xbits > ybits) {
        diff = xbits - ybits;
        x = x >> diff;
    }
    else if (xbits < ybits) {
        diff = ybits - xbits;
        y = y >> diff;
    }
    k = x ^ y;
    kbits = get_bits(k);
    return y >> kbits;
}
Explanation :-
Get the number of bits needed to represent x and y; with binary search this takes O(log(32)) steps.
The common prefix of the binary notation of x and y is the common ancestor.
Whichever is represented by the larger number of bits is shifted right by diff, so that both have the same number of bits.
k = x^y erases the common prefix of x and y.
Find the number of bits representing the remaining suffix.
Shift x or y right by the suffix bits to get the common prefix, which is the common ancestor.
Example :-
x = 12 = b1100
y = 8 = b1000
xbits = 4
ybits = 4
diff = 0
k = x^y = 4 = b0100
kbits = 3
res = x >> kbits = x >> 3 = 1
ans : 1