n nested for loops in Fortran 90

I have read some topics on this, but I don't think they quite answer my question. If they do, then please direct me to the correct topic and I will definitely look again.
Here is my problem:
I want to write a loop which will cycle through every possible combination of indices for an array of length 'n'.
That is, if n = 2 then my loop nest would be
do i1 = 1,2
  do i2 = 1,2
    ! do stuff here
  enddo
enddo
while if n = 3 then my loop nest would look like
do i1 = 1,3
  do i2 = 1,3
    do i3 = 1,3
      ! do stuff here
    enddo
  enddo
enddo
And so on. How would I go about writing a routine which does this automatically, given simply an input variable 'n'?

If you write out the indices, what you have is an n-digit number in base n (almost - there's an offset of 1 because you are using 1-based indices in Fortran), and what you are asking for is every possible value that number can take.
In other words, if we use 0-based indices for a moment for simplicity, you have:
n=2, values=00,01,10,11 (binary counting from 0 to 3)
n=3, values=000,001,002,010,011,012,020,021,022, 100,101,102,110,111,112,120,121,122, 200,201,202,210,211,212,220,221,222 (ternary(?) counting from 0 to 26)
So what you are asking is how to do this in the general case.
You can do that by using an array to hold the n digits, starting at [0,0,...,0]. Then, within a "while" loop (which replaces your n nested for loops), try to increment the right-most entry (digit). If it reaches n, set it back to zero and increment the digit to its left. Once you manage to increment a digit without reaching n, you are "done" and can use the digits as your indices.
It's pretty simple - you're just adding 1 each time.
Then, for Fortran's 1-based indexing, add 1 to each digit. In other words, change the above to start with 1s and carry to the left at n+1.
For example, for n=4:
start with [1,1,1,1]
do your inner loop action
increment the rightmost digit to [1,1,1,2]
do your inner loop action
increment the rightmost digit to [1,1,1,3]
do your inner loop action
increment the rightmost digit to [1,1,1,4]
do your inner loop action
increment the rightmost digit to [1,1,1,5]
the digit you incremented is now at n+1, so set it back to 1 and increment the digit to its left: [1,1,2,1]
do your inner loop action
increment the rightmost digit to [1,1,2,2]
do your inner loop action
etc.
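A minimal sketch of that odometer-style loop, written in Python purely for readability (the function name is mine; the same array-plus-while structure carries straight over to Fortran with an integer array and a do while loop):
def visit_all_index_combinations(n):
    # Visit every combination of n indices, each running from 1 to n.
    idx = [1] * n                   # start at [1, 1, ..., 1]
    while True:
        print(idx)                  # "do stuff here" with the current indices
        pos = n - 1                 # try to increment the rightmost digit first
        while pos >= 0:
            idx[pos] += 1
            if idx[pos] <= n:       # no carry needed, this combination is ready
                break
            idx[pos] = 1            # digit passed n: reset it and carry to the left
            pos -= 1
        if pos < 0:                 # carried off the left end: all n**n combinations visited
            return

visit_all_index_combinations(2)     # prints [1, 1], [1, 2], [2, 1], [2, 2]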

I guess you can only do this by collapsing the loops into a single loop of n**n iterations and computing the individual n indices from the collapsed global index (or simply counting them up with different strides).
Edit: An attempt to put this into sample code:
do i = 1, n**n
  do j = 1, n
    ! extract the j-th base-n digit of the collapsed index (1-based)
    ind(j) = mod((i-1)/n**(j-1), n) + 1
  end do
  ! Some code using ind(1:n)
end do

Related

Counting Substrings: In a given text, find the number of substrings that start with an A and end with a B

For example, there are four such substrings in CABAAXBYA.
The original brute-force algorithm that I used was: using an outer for loop, whenever I encounter an A, I go inside another for loop to check whether there's a B present. If a B is found, I increment the count. Finally, the value stored in the count variable yields the required result.
I came across a point while reading about string-matching algorithms: when you traverse right to left rather than left to right, your algorithm can be more efficient. But here the substring isn't given as a parameter to the function that you would be using to compute the required value.
My question is if I traverse the string from right to left instead of left to right, will it make my algorithm more efficient in any case?
Here is one way in which iterating backwards through the string could result in O(n) computation instead of your original O(n^2) work:
A = "CABAAXBYA"
count = 0 # Number of B's seen
total = 0
for a in reversed(A):
if a=='B':
count += 1
elif a=='A':
total += count
print total
This works by keeping track in count of the number of B's to the right of the current point.
(Of course, you could also get the same result with forwards iteration by counting the number of A's to the left instead:
count = 0  # number of A's seen so far (to the left of the current character)
total = 0
for a in A:
    if a == 'A':
        count += 1
    elif a == 'B':
        total += count
print(total)  # also prints 4
)

How to find the longest set of consecutive increasing (ascending) elements in an array?

One straightforward way to find it is to just check each element against its following element and collect the run in another array, but that's not very clean. Another way would be a divide-and-conquer approach based on the merge-sort algorithm: in this case it is not supposed to sort the numbers, but to divide them recursively into subarrays and merge only those numbers which satisfy the condition a[n] < a[n+1]. But I'm not sure how to implement the merging and checking part.
It can be done with a single-pass scan over the candidate array, tracking the length of the current consecutive increasing sequence. So the pseudocode can be like the following:
a = [5 1 3 10 5 15 25 35 45 3 4 5];
longest_seq = 1;
temp_seq = 1;
for i = 2:length(a)
    if a(i-1) < a(i)
        temp_seq = temp_seq + 1;
    else
        if temp_seq > longest_seq
            longest_seq = temp_seq;
        end
        temp_seq = 1;
    end
end
if temp_seq > longest_seq   % handle a run that reaches the end of the array
    longest_seq = temp_seq;
end
longest_seq is the number you are looking for (To my understanding).
Forgive me if I misunderstand your problem, but why use separate arrays, merging, sorting, etc.? If, as I understand it, you're just looking for the longest sequence of increasing elements, recursion could be your friend: pass a pointer to element N to func(N); if element N+1 is greater than element N, increment N and let func(N) call itself again (making sure you don't exceed the bounds of your array).
A simplistic explanation, but I think you see what I'm driving at.
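A minimal sketch of that recursive idea in Python (the function names are mine; note that calling it from every start index is O(n^2) in the worst case, whereas the single-pass scan above stays O(n)):
def run_length(a, n):
    # Length of the increasing run starting at index n (the func(N) idea above).
    if n + 1 < len(a) and a[n + 1] > a[n]:
        return 1 + run_length(a, n + 1)
    return 1

def longest_increasing_run(a):
    return max(run_length(a, i) for i in range(len(a)))

print(longest_increasing_run([5, 1, 3, 10, 5, 15, 25, 35, 45, 3, 4, 5]))  # prints 5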

Judgecode -- Sort with swap (2)

The problem I've seen is as below; does anyone have some idea on it?
http://judgecode.com/problems/1011
Given a permutation of integers from 0 to n - 1, sorting them is easy. But what if you can only swap a pair of integers every time?
Please calculate the minimal number of swaps needed to sort the permutation.
One classic algorithm seems to be permutation cycles (https://en.wikipedia.org/wiki/Cycle_notation#Cycle_notation). The number of swaps needed equals the total number of elements minus the number of cycles.
For example:
1 2 3 4 5
2 5 4 3 1
Start with 1 and follow the cycle:
1 down to 2, 2 down to 5, 5 down to 1.
1 -> 2 -> 5 -> 1
3 -> 4 -> 3
We would need to swap index 1 with 5, then index 5 with 2, as well as index 3 with index 4. Altogether 3 swaps, or n - 2. We subtract the number of cycles from n since the cycle elements together total n and each cycle needs one swap fewer than the number of elements in it.
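For reference, here is a small Python sketch of that cycle count (the function name and the 0-based form of the example are mine):
def min_swaps(perm):
    # Minimal swaps to sort a permutation of 0..n-1: n minus the number of cycles.
    n = len(perm)
    seen = [False] * n
    cycles = 0
    for start in range(n):
        if seen[start]:
            continue
        cycles += 1
        i = start
        while not seen[i]:          # follow the cycle starting at `start`
            seen[i] = True
            i = perm[i]
    return n - cycles

print(min_swaps([1, 4, 3, 2, 0]))   # the example above in 0-based form: 5 elements, 2 cycles -> 3 swaps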
Here is a simple implementation in C for the above problem. The algorithm is similar to user גלעד ברקן's:
Store the position of every element of a[] in b[]. So, b[a[i]] = i
Iterate over the initial array a[] from left to right.
At position i, check if a[i] is equal to i. If yes, then keep iterating.
If no, then it's time to swap. Look closely at the logic in the code to see how the swapping takes place; this is the most important step, as both arrays a[] and b[] need to be modified. Increase the count of swaps.
Here is the implementation:
#include <stdlib.h>

long long sortWithSwap(int n, int *a) {
    // temporary array keeping track of the position of every value: b[a[i]] = i
    int *b = (int*) malloc(sizeof(int) * n);
    int i, valai, posi;
    long long ans = 0;
    for (i = 0; i < n; i++) {
        b[a[i]] = i;
    }
    for (i = 0; i < n; i++) {
        if (a[i] != i) {
            valai = a[i];      // value currently sitting at position i
            posi = b[i];       // position where the value i currently sits
            a[b[i]] = a[i];    // move a[i] to where i was found
            a[i] = i;          // put i into its final place
            b[i] = i;          // update the positions of both moved values
            b[valai] = posi;
            ans++;             // one swap performed
        }
    }
    free(b);
    return ans;
}
The essence of solving this problem lies in the following observations:
1. The elements in the array do not repeat.
2. The range of elements is from 0 to n-1, where n is the size of the array.
The way to approach it:
Once you have understood the way to approach the problem, you can solve it in linear time.
Imagine how the array would look after sorting all the entries.
It will satisfy arr[i] == i for all entries. Is that convincing?
First create a bool array named FIX, where FIX[i] == true if the ith location is fixed; initialize this array to false.
Start checking the original array for the match arr[i] == i. As long as this condition holds, everything is okay; while traversing the array, also update FIX[i] = true. The moment you find that arr[i] != i you need to do something: arr[i] must hold some value x such that x > i. How do we guarantee that? The guarantee comes from the fact that the elements in the array do not repeat, so if the array is sorted up to index i then the element at position i cannot come from the left, only from the right.
Now the value x essentially names some index. Why? Because the array only has elements from 0 to n-1, and in the sorted array every element i must be at location i.
What arr[i] == x means is that not only is element i not at its correct position, but element x is also missing from its place.
Now, to fix the ith location you need to look at the xth location, because maybe the xth location holds i, in which case you swap the elements at indices i and x and get the job done. But wait, it's not guaranteed that index x holds i (so that you finish fixing these locations in just one swap). It may instead be that index x holds some value y, which again will be greater than i, because the array is only sorted up to location i.
Now, before you can fix position i, you need to fix x. Why? We will see later.
So now you try to fix position x in the same way, and you keep fixing in this fashion until you see element i at some location.
The fashion is to follow the links from arr[i] until you hit element i at some index.
It is guaranteed that you will hit i at some location by following links in this way. Why? Try proving it; work through some examples and you will see it.
Now you start fixing all the indices you saw on the path from index i to this index (call it j). What you see is that the path you have followed is a cycle, and for every index on it, the value it needs is stored at the previous index (the index from which you reached it). Once you see that, you can fix the indices and mark all of them as true in the FIX array. Then go ahead with the next index of the array and do the same thing until the whole array is fixed.
That was the complete idea. But to only count the number of swaps, note that once you have found a cycle of k elements you need k - 1 swaps to fix it; after doing that you fix those positions and continue. That's how you count the number of swaps.
Please let me know if you have any doubts about the approach.
You may also ask for C/C++ code help. Happy to help :-)

Verilog: Minimal (hardware) algorithm for multiplying a binary input to its delayed form

I have a binary input in (1 bit serial input) which I want to delay by M clock pulses and then multiply (AND) the 2 signals. In other words, I want to evaluate the sum:
sum(in[n]*in[n+M])
where n is expressed in terms of number of clock pulses.
The most straightforward way is to store in a memory buffer in_dly the latest M samples of in. In Verilog, this would be something like:
always @(posedge clock ...)
  ...
  in_dly[M-1:0] <= {in_dly[M-2:0], in};
  if (in_dly[M-1] & in)
    sum <= sum + 'd1;
  ...
While this works in theory, with large values of M (can be ~2000), the size of the buffer is not practical. However, I was thinking to take advantage of the fact that the input signal is 1 bit and it is expected to toggle only a few times (~1-10) during M samples.
This made me think of storing the toggle times from 2k*M to (2k+1)*M in an array a and from (2k+1)*M to (2k+2)*M in an array b (k is just an integer used to generalize the idea):
reg [10:0] a[0:9]; //2^11 > max(M)=2000 and "a" has max 10 elements
reg [10:0] b[0:9]; //same as "a"
Therefore, during M samples, in = 'b1 during intervals [a[1],a[2]], [a[3],a[4]], etc. Similarly, during the next M samples, the input is high during [b[1],b[2]], [b[3],b[4]], etc. Now, the sum is the "overlapping" of these intervals:
min(b[2],a[2])-max(b[1],a[1]), if b[2]>a[1] and b[1]<a[2]; 0 otherwise
Finally, the array b becomes the new array a and the next M samples are evaluated and stored into b. The process is repeated until the end of in.
Comparing this "optimized" method to the initial one, there is a significant gain in hardware: initially 2000 bits were stored, and now 220 bits are stored (for this example). However, the number is still large and not very practical.
I would greatly appreciate if somebody could suggest a more optimal (hardware-wise) way or a simpler way (algorithm-wise) of doing this operation. Thank you in advance!
Edit:
Thanks to Alexey's idea, I optimized the algorithm as follows:
Given a set of delays M[i] for i=1 to 10 with M[1]<M[2]<..<M[10], and an input binary array in, we need to compute the outputs:
y[i] = sum(in[n]*in[n+M[i]]) for n=1 to length(in).
We then define 2 empty arrays a[j] and b[j] with j = 1 to ~5. Whenever in has a 0->1 transition, the lowest-index empty element a[j] is "activated" and will increment at each clock cycle. The same goes for b[j] at 1->0 transitions. Basically, the pairs (a[j],b[j]) represent the portions of in equal to 1.
Whenever a[j] equals M[i], the sum y[i] will increment by 1 at each cycle while in = 1, until b[j] equals M[i]. Once a[j] equals M[10], a[j] is cleared. Same goes for b[j]. This is repeated until the end of in.
Based on the same numerical assumptions as the initial question, a total of 10 arrays (a and b) of 11 bits allow the computation of the 10 sums, corresponding to 10 different delays M[i]. This is almost 20 times better (in terms of resources used) than my initial approach. Any further optimization or idea is welcomed!
Try this:
make an array A,
every time in==1, get a free A element and write M to it,
every clock, decrement all non-zero A elements,
once any decremented element becomes zero, test in; if in==1, sum++.
Edit: the algorithm above is intended for input like
- 00000000000010000000100010000000, while LLDinu really needs
- 11111111111110000000011111000000, so here is the modified algorithm:
make an array (ring buffer) A,
every time in toggles, get a free A element and write M to it,
every clock, decrement all non-zero A elements,
every clock, test in; if in==1 and the number of non-zero A elements is even, sum++.
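This counting scheme is easy to sanity-check with a small software model; here is a Python sketch (purely behavioural, not synthesizable; it assumes the input idles at 0 before the first sample, and a plain list stands in for the hardware ring buffer):
def direct(in_bits, M):
    # Reference computation: sum over n of in[n] * in[n+M].
    return sum(in_bits[n] & in_bits[n + M] for n in range(len(in_bits) - M))

def toggle_count_model(in_bits, M):
    # Keep a countdown (initialised to M) for every toggle seen in the last M cycles;
    # add 1 whenever in == 1 and an even number of countdowns is still running,
    # because then the delayed sample in[t-M] must also be 1.
    active = []          # countdowns for toggles seen during the last M cycles
    prev = 0             # assumed idle level before the first sample
    total = 0
    for t, bit in enumerate(in_bits):
        active = [c - 1 for c in active if c > 1]   # age the countdowns, drop expired ones
        if bit != prev:                             # a 0->1 or 1->0 toggle starts a new countdown
            active.append(M)
            prev = bit
        if t >= M and bit == 1 and len(active) % 2 == 0:
            total += 1
    return total

bits = [int(c) for c in "11111111111110000000011111000000"]
print(direct(bits, 7), toggle_count_model(bits, 7))   # the two values should agree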

Algorithm for calculating all difference values from a large list

I have a list of 3 million records and I want to get all difference values between every two records. A simple nested loop may take forever. Could anyone suggest an algorithm which is capable of handling this problem?
If you want to calculate the mean of all absolute differences and your timestamps are sorted, you just need one loop:
t[i] <= t[i + 1] --> abs(t[i] - t[j]) = t[j] - t[i] for i < j
That is, each difference contributes one timestamp with a positive sign and one with a negative sign. Let's look at an example with 4 timestamps:
sum = (t[3] - t[2]) + (t[3] - t[1]) + (t[3] - t[0])
+ (t[2] - t[1]) + (t[2] - t[0])
+ (t[1] - t[0])
Here, t[3] is always added, t[2] is added twice and subtracted once, t[1] is added once and subtracted twice, and finally the lowest value, t[0], is always subtracted.
Or, more generally: the first timestamp, i.e. the one with the lowest value, always has a negative sign, N - 1 times. The second has a negative sign N - 2 times and a positive sign once, namely when compared with the first timestamp. The third has a negative sign N - 3 times and a positive sign twice.
So your loop goes like this:
sum = 0;
for i = 0 to N:
    sum = sum + (2*i - N + 1) * t[i]
where i is a zero-based index and N an exclusive upper bound, C-style. To get the average, divide by (N - 1) * N / 2.
If your array isn't sorted, you must sort it first, which usually has better performance than quadratic time, so you should be better off than with a nested loop.
One thing that might occur is that by summing up large values, you hit the limits of your data type. You could try to fix that by halving your loop and summing from both ends, in the hope that the differences roughly cancel each other out. Alternatively you could already divide by the total number of differences inside the loop, possibly introducing some nasty floating-point round-off errors.
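If it helps, here is a small Python sketch that checks the single-loop formula against the brute-force double loop (the function names are mine):
import itertools

def mean_abs_diff_sorted(t):
    # Mean absolute pairwise difference of a sorted list, using the single-loop formula.
    n = len(t)
    s = sum((2 * i - n + 1) * t[i] for i in range(n))
    return s / (n * (n - 1) / 2)

def mean_abs_diff_brute(t):
    n = len(t)
    return sum(abs(x - y) for x, y in itertools.combinations(t, 2)) / (n * (n - 1) / 2)

ts = sorted([3, 14, 159, 26, 535])
print(mean_abs_diff_sorted(ts), mean_abs_diff_brute(ts))   # both print 241.8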
You could parallelise the problem by splitting the file into, say, 8 chunks and processing them all at the same time, making the most of those expensive Intel iCores you paid for....
Use the split command to generate the lists.
#!/bin/bash
split -l 375000 yourfile sublist   # split into chunks of 375,000 lines in subfiles called sublist*
for f in sublist*                  # for all sublist* files
do
    # Start a background process to work on one list
    echo start processing file $f in background &
done
wait                               # till all are finished
