USACO number triangle - Execution error - algorithm

The question is as follows:
Consider the number triangle shown below. Write a program that calculates the highest sum of numbers that can be passed on a route that starts at the top and ends somewhere on the base. Each step can go either diagonally down to the left or diagonally down to the right.
7
3 8
8 1 0
2 7 4 4
4 5 2 6 5
In the sample above, the route from 7 to 3 to 8 to 7 to 5 produces the highest sum: 30.
I got the following error:
Your program had this runtime error: Bad
syscall #32000175 (RT_SIGPROCMASK) [email kolstad if you think
this is wrong]. The program ran for 0.259 CPU seconds before the
error. It used 16328 KB of memory.
The code is as follows.
int arr[1500][1500];
map < int,map < int,int> >dp;
int main()
{
// ofstream fout ("numtri.out");
// ifstream fin ("numtri.in");
int n;
// fin>>n;
freopen ("numtri.in", "r", stdin);
freopen ("numtri.out", "w", stdout);
scanf ("%d", &n);
int ct = 1;
int gaga = -100;
for (int i=0; i<n; i++)
{
for (int j=0; j<ct; j++)
{
scanf ("%d", &arr[i][j]);
if(i>0)
dp[i][j] = maxi (dp[i-1][j-1] + arr[i][j], dp[i-1][j] + arr[i][j]);
else
dp[0][0]=arr[0][0];
if (i == n-1)
{
if (dp[i][j] > gaga)
gaga=dp[i][j];
}
}
ct++;
}
printf ("%d\n", gaga);
return 0;
}
It works fine on my laptop. On the website it passes the first 8 test cases and fails on the 9th with this error.
Thanks for the help!

if(i>0)
dp[i][j]=maxi(dp[i-1][j-1]+arr[i][j],dp[i-1][j]+arr[i][j]);
You check that i > 0, which ensures you never use a negative row index. You never do the same for j, however, so on the first iteration of the inner (j) loop, when j is 0, dp[i-1][j-1] becomes dp[i-1][-1]. I'm pretty sure this is what causes the error.
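A guarded version of the recurrence might look like the sketch below. It is an illustration rather than the asker's program: `max_route_sum` and its `rows` parameter are hypothetical names, the triangle is assumed to be already in memory with row i holding i + 1 numbers, and a single rolling row replaces the nested `std::map`. Note that with a `std::map`, `dp[i-1][-1]` does not index out of bounds; `operator[]` inserts a zero-valued entry for the missing key, so the map gains an extra entry on every row.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: returns the highest top-to-base route sum.
int max_route_sum(const std::vector<std::vector<int>>& rows) {
    std::vector<int> best;  // best[j]: highest sum of a route ending at column j of the previous row
    for (std::size_t i = 0; i < rows.size(); ++i) {
        assert(rows[i].size() == i + 1);  // assumption: row i has exactly i + 1 entries
        std::vector<int> next(i + 1);
        for (std::size_t j = 0; j <= i; ++j) {
            int parent = 0;  // row 0 has no parent
            if (i > 0) {
                const bool has_left = j > 0;   // dp[i-1][j-1] exists
                const bool has_right = j < i;  // dp[i-1][j] exists
                if (has_left && has_right) parent = std::max(best[j - 1], best[j]);
                else parent = has_left ? best[j - 1] : best[j];
            }
            next[j] = parent + rows[i][j];
        }
        best = next;
    }
    return best.empty() ? 0 : *std::max_element(best.begin(), best.end());
}
```

Keeping only the previous row means memory grows with the row length rather than with the number of cells, which may matter under a tight memory limit.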

Related

How do I wait for child kernels to finish in a parent kernel before executing the rest of the parent kernel in CUDA dynamic parallelism?

So I need the runParatron children to fully finish before the next iteration of the for loop happens. Based on the results I am getting, I'm pretty sure that's not happening. For example, I have a print statement in runParatron that executes AFTER the first "[" is printed outside the for loop.
I tried to run cudaDeviceSynchronize, but it wouldn't compile stating that host code can't be executed on device code, and that cudaDeviceSynchronize is undefined in device code. Is there any way to wait until the children kernels are done for this?
I see other posts, examples, and tutorials using cudaDeviceSynchronize within kernels, so perhaps I am missing something basic? Help would be thoroughly appreciated.
__global__ void runMLP(double* x, double* outputs, double* weights, activation_function* A_Fs, int* CIL, int layers, int bias, int* WLO, int* OLO) {
if (CIL[0] > 511) {
copyElements << <CIL[0] / 32, 32 >> > (outputs, x, CIL[0]);
//I WOULD ALSO LIKE TO WAIT HERE
}
else
for (int i = 0;i < CIL[0];i++) {
outputs[i] = x[i];
}
for (int i = 1;i < layers;i++) {
printf("----------------------Layer %d :: InputSize %d :: Layer weight offset %d :: Layer output offset %d----------------------\n", i, CIL[i-1], WLO[i-1], OLO[i]);
runParatron << < (CIL[i] / 32) + 1, 32 >> > (outputs + OLO[i - 1], outputs + OLO[i], weights + WLO[i - 1], A_Fs[i], CIL[i - 1], CIL[i], bias);
//cudaDeviceSynchronize(); //THIS IS WHERE I NEED TO WAIT UNTIL NEXT ITERATION
}
if (A_Fs[layers - 1] == SOFTMAX) {
double* temp = outputs + OLO[layers - 1];
printf("[");
for (int i = 0;i < CIL[layers-1];i++) {
printf("% d, ", temp[i]);
}
printf("]\n");
double denom = 0;
for (int i = 0;i < CIL[layers - 1];i++) {
denom += temp[i];
}
if (denom < DBL_MIN)
denom = DBL_MIN;
for (int i = 0;i < CIL[layers - 1];i++) {
temp[i] /= denom;
}
}
}
For example, here is the output where the "[" comes before the child kernel output:
//All Cell: starting lines are produced from child kernel
[Cell: 0 :: weightOffset 0 :: AF 2 //As you can see, there is the "[" here when it should be
Cell: 1 :: weightOffset 6 :: AF 2
Cell: 2 :: weightOffset 12 :: AF 2
Cell: 3 :: weightOffset 18 :: AF 2
-502657059, 2118981138, 1645236453, ] //Down here!
So I added an atomic counter and incremented it by one at the end of each child kernel. Then I put a while loop after the child kernel launch that checks whether the counter has reached the number of child completions I am waiting for. This fixed it. Let me know if anyone needs code or clarification.

Algorithm: Print the nth consecutive prime number

I'm currently learning algorithms and have come across a code challenge from an interviewer: write a function that prints the first n prime numbers in order. So it would be something like:
getPrimeNth(10) will print 1 2 3 5 7 11 13 17 19 23
but most of the solutions I found print only the nth prime (23 here), or only test whether a given number is prime. I am going to risk getting downvoted for this, but I can't seem to find the right solution.
One is not a prime, for starters.
Second, your question needs more clarification....
Primes are not challenging - there is a lot of information available.
The simplest solution for you would be to test every number by trying each divisor from 2 up to the square root of that number. If any remainder is zero, the number is not prime. Store the primes in an array one after another. I'm not going to straight up give you the answer, but read more about the Sieve of Eratosthenes - which is highly inefficient IMO, but where you must start.
Therefore, the first prime would be in slot 0, second in slot 1, etc, etc.
The code below finds and stores the first N primes (N is defined by the macro). It calls the utility function is_prime(), which checks whether a given number is prime.
#include <stdio.h>
#define TRUE 1
#define FALSE 0
#define N 10 /* how many primes to print */
typedef short int bool;
/* Returns TRUE when num (>= 2) has no divisor in [2, num - 1]. */
bool is_prime(int num)
{
    int i;
    for (i = 2; i <= (num - 1); i++) {
        if ((num % i) == 0) {
            return FALSE;
        }
    }
    return TRUE;
}
int main()
{
    int primes[N];
    int num_primes = 0;
    int num = 2; /* start with 2 */
    int i;
    /* keep testing successive integers until N primes are stored */
    while (num_primes != N) {
        if (is_prime (num) == TRUE) {
            primes[num_primes] = num;
            num_primes++;
        }
        num++;
    }
    for (i = 0; i < N; i++) {
        printf ("%d ", primes[i]);
    }
    printf ("\n");
    return 0;
}
Output: 2 3 5 7 11 13 17 19 23 29
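The square-root bound suggested in the earlier answer can be combined with the stored primes. The following is a minimal sketch under stated assumptions, not the interviewer's expected solution: `first_primes` is a hypothetical name, and each candidate is divided only by primes already found whose square does not exceed it.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns the first `count` primes in increasing order.
std::vector<int> first_primes(int count) {
    assert(count >= 0);  // assumption: a negative count is a caller error
    std::vector<int> primes;
    for (int candidate = 2; static_cast<int>(primes.size()) < count; ++candidate) {
        bool is_prime = true;
        for (int p : primes) {
            if (p * p > candidate) break;  // no divisor up to sqrt(candidate) was found
            if (candidate % p == 0) {      // divisible by a smaller prime: composite
                is_prime = false;
                break;
            }
        }
        if (is_prime) primes.push_back(candidate);
    }
    return primes;
}
```

Printing the returned vector for a count of 10 gives the same ten numbers as the output line above, with far fewer divisions per candidate than checking every value up to num - 1.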

Explanation of iterative algorithm to print power set

I found this iterative algorithm that prints the power set for a given set:
void PrintSubsets()
{
int source[3] = {1,2,3};
int currentSubset = 7;
int tmp;
while(currentSubset)
{
printf("(");
tmp = currentSubset;
for(int i = 0; i<3; i++)
{
if (tmp & 1)
printf("%d ", source[i]);
tmp >>= 1;
}
printf(")\n");
currentSubset--;
}
}
However, I am not sure why it works. Is it similar to a solution where you use a set of n bits and, on each step, add 1 with carry, using the resulting pattern of zeros and ones to determine which elements belong?
Write out the integers in binary and it should become clear:
{abc}
7 xxx
6 xx-
5 x-x
4 x--
3 -xx
2 -x-
1 --x
0 --- (omitted)
The order to enumerate the integers does not matter provided you list them all. Incrementing or decrementing are the most natural ways.
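The same idea can be written so that the subsets are collected rather than printed. This is a sketch only: `subsets_by_mask` is a hypothetical name, the source is assumed to have fewer than 32 elements so a mask fits in an `unsigned`, and, unlike the loop in the question, mask 0 (the empty set) is included.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: bit i of the mask decides whether source[i] is in the subset.
std::vector<std::vector<int>> subsets_by_mask(const std::vector<int>& source) {
    assert(source.size() < 32);  // assumption: one bit per element must fit in an unsigned
    const unsigned n = static_cast<unsigned>(source.size());
    std::vector<std::vector<int>> result;
    // Count down from 2^n - 1 to 0, mirroring the decrementing loop in the question.
    for (unsigned mask = (1u << n); mask-- > 0; ) {
        std::vector<int> subset;
        for (unsigned i = 0; i < n; ++i) {
            if (mask & (1u << i)) subset.push_back(source[i]);
        }
        result.push_back(subset);
    }
    return result;
}
```

For {1, 2, 3} this yields 8 subsets, starting with {1, 2, 3} for mask 7 and ending with the empty set for mask 0.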

Graph visit every node once and reach exit

I had a test right now and this was one of the questions:
Input
The places to visit in the labyrinth are numbered from 1 to n. The entry and
the exit correspond to number 1 and number n, respectively; the remaining
numbers correspond to crossings. Note that there are no dead ends and
there is no more than one connection linking a pair of crossings.
For each test case, the first line gives n and the number of connections
between crossings (m). Then, in each of the following m lines, you find a pair
of integers corresponding to the connection between two crossings.
Output
For each test case, your implementation should output one single line
containing "Found!", if it is possible to reach the exit by visiting every
crossing once or "Damn!", otherwise. Other test cases may follow.
Constraints
m < 32
n < 21
Example input:
8 13
1 2
1 3
2 3
2 4
3 4
3 5
4 5
4 6
5 6
5 7
6 7
6 8
7 8
8 8
1 2
1 3
2 4
3 5
4 6
5 7
6 8
7 8
Example output:
Found!
Damn!
I solved the problem using a sort of DFS algorithm, but I have a few questions.
Using DFS, I implemented a recursive function that starts at the given node and tries to visit every node once, with the exit node as the last node visited. I don't have the full code right now, but it was something like this:
findPath(int currentNode, int numVisitedNodes, int *visited){
int *tmpVisited = copyArray(visited); //copies the visited array to tmpVisited
//DFS algo here
}
Every recursive call copies the visited-nodes array. I'm doing this so that when a call finds an invalid path and the recursion returns to the caller, the caller can continue with its own visited list, which the failed branch did not overwrite.
Is there any better way to do this?
How would you solve it? (you can provide code if you want)
Read each crossing.
If either endpoint of the crossing already belongs to a reachable set, add both endpoints to that set; otherwise create a new reachable set.
When the input has finished, check whether any of the reachable sets contains both the entrance and the exit.
HashSet operations are O(1). If all crossings are distinct, the complexity is O(n^2), which is the worst case for this algorithm. Space complexity is O(n), and since there is no recursion there is no recursion overhead in memory.
Roughly speaking, every node is visited only once. Note that this tests only whether the exit is reachable from the entrance; it does not check that every crossing is visited exactly once.
Java code using reachable sets is as follows.
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
public class ZeManel {
    public static void main(String[] args) {
        Integer[][] input = {{1,2},{2,3},{4,6}};
        zeManel(input);
    }
    public static void zeManel(Integer[][] input){
        List<Set<Integer>> paths = new ArrayList<Set<Integer>>();
        int max = 0; // the exit is taken to be the largest node number seen
        for(int i = 0;i < input.length;i++) {
            max = input[i][0] > max ? input[i][0] : max;
            max = input[i][1] > max ? input[i][1] : max;
            // Start a set from this crossing, then absorb every existing set
            // that shares an endpoint, so two sets joined by this crossing merge.
            Set<Integer> merged = new HashSet<Integer>();
            merged.add(input[i][0]);
            merged.add(input[i][1]);
            Iterator<Set<Integer>> it = paths.iterator();
            while (it.hasNext()) {
                Set<Integer> set = it.next();
                if(set.contains(input[i][0]) || set.contains(input[i][1])) {
                    merged.addAll(set);
                    it.remove();
                }
            }
            paths.add(merged);
        }
        // Entrance (1) and exit (max) must end up in the same reachable set
        for (Set<Integer> path : paths) {
            if(path.contains(1) && path.contains(max)) {
                System.out.println("Found!");
                return;
            }
        }
        System.out.println("Damn!");
    }
}
This was my implementation during the test:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
# define N 21
# define M 32
int i;
int adj[N][N];
int count = 0;
int findPath(int numNodes, int currentNode, int depth, int *visited){
visited[currentNode] = 1;
if(currentNode == numNodes - 1 && depth == numNodes){
return 1;
}
if(depth > numNodes)
return -1;
int r = -1;
if(depth < numNodes){
count++;
int *tmp = (int*) malloc(numNodes*sizeof(int));
for(i = 0; i < numNodes; i++)
tmp[i] = visited[i];
for(i = 0; i < numNodes; i++){
if(adj[currentNode][i] == 1 && tmp[i] == 0 && r == -1){
if(findPath(numNodes, i, depth + 1, tmp) == 1)
r = 1;
}
}
free(tmp);
}
return r;
}
int main(){
int numLigacoes, a, b, numNodes;
int *visited;
while (scanf("%d %d", &numNodes, &numLigacoes) != EOF){
visited = (int*) malloc(numNodes*sizeof(int));
count = 0;
memset(adj, 0, N*N*sizeof(int));
memset(visited, 0, numNodes*sizeof(int));
for (i = 0; i < numLigacoes; i++){
scanf("%d %d", &a, &b);
adj[a - 1][b - 1] = 1;
adj[b - 1][a - 1] = 1;
}
if(findPath(numNodes, 0, 1, visited) == 1)
printf("Found! (%d)\n", count);
else
printf("Damn! (%d)\n", count);
free(visited);
}
return 0;
}
What do you think about that?
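On the question of copying the visited array at every call: one common alternative is to share a single array and undo the mark when a branch fails. The sketch below is an assumption-laden illustration, not a drop-in replacement for the code above: `extend` and `has_full_route` are hypothetical names, and edges are assumed to use the 1-based numbering of the input with node 1 as the entry and node n as the exit.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: tries to extend a route that currently ends at `current`
// and already contains `depth` nodes (0-based node indices).
static bool extend(const std::vector<std::vector<int>>& adj,
                   std::vector<bool>& visited, int current, int depth) {
    const int n = static_cast<int>(adj.size());
    if (depth == n) return current == n - 1;  // every node used: succeed only at the exit
    if (current == n - 1) return false;       // reached the exit too early
    for (int next : adj[current]) {
        if (visited[next]) continue;
        visited[next] = true;                 // mark before descending
        if (extend(adj, visited, next, depth + 1)) return true;
        visited[next] = false;                // unmark so sibling branches see a clean state
    }
    return false;
}

// Hypothetical entry point: true if a route from node 1 to node n visits every node once.
bool has_full_route(int n, const std::vector<std::pair<int, int>>& edges) {
    if (n <= 0) return false;
    std::vector<std::vector<int>> adj(n);
    for (const auto& e : edges) {
        assert(e.first >= 1 && e.first <= n && e.second >= 1 && e.second <= n);
        adj[e.first - 1].push_back(e.second - 1);
        adj[e.second - 1].push_back(e.first - 1);
    }
    std::vector<bool> visited(n, false);
    visited[0] = true;                        // the route starts at the entry
    return extend(adj, visited, 0, 1);
}
```

Because each mark is undone on the way back, no allocation or copy is needed per call, and the loop variable is local to each call, so a nested call cannot disturb the caller's iteration.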

Big-O algorithmic analysis

This is not a homework problem. It comes from an online tutorial on the USACO website for learning dynamic programming concepts.
In the tutorial, a problem is given as follows.
Question:
Given a sequence of as many as 10,000 integers (0 < integer < 100,000), what is the longest decreasing subsequence?
The following recursive-descent approach was given:
#include <stdio.h>
#include <stdlib.h>
long n, sequence[10000];
int check (int start, int nmatches, long smallest);
int main () {
    FILE *in, *out;
    int i;
    in = fopen ("input.txt", "r");
    out = fopen ("output.txt", "w");
    fscanf(in, "%ld", &n);
    for (i = 0; i < n; i++) fscanf(in, "%ld", &sequence[i]);
    /* every value is below 100,000, so that bound admits any first element */
    fprintf (out, "%d\n", check (0, 0, 100000));
    exit (0);
}
/* Returns the longest decreasing subsequence length reachable from `start`,
   given nmatches elements already chosen and `smallest` as the last one. */
int check (int start, int nmatches, long smallest) {
    int better, i, best=nmatches;
    for (i = start; i < n; i++) {
        if (sequence[i] < smallest) {
            better = check (i, nmatches+1, sequence[i]);
            if (better > best) best = better;
        }
    }
    return best;
}
I am not good at algorithmic analysis. Could you please tell me the tightest worst-case Big-O bound for this recursive enumeration solution? My guess would be O(N^N), but I am not confident, because the runtime is still acceptable for N <= 100, so something must be wrong with my estimate. Please help. Thank you.
On the USACO website, the dynamic programming approach in O(n^2) is given as follows.
#include <stdio.h>
#include <stdlib.h>
#define MAXN 10000
int main () {
    long num[MAXN], bestsofar[MAXN];
    FILE *in, *out;
    long n, i, j, longest = 1; /* a single element already has length 1 */
    in = fopen ("input.txt", "r");
    out = fopen ("output.txt", "w");
    fscanf(in, "%ld", &n);
    for (i = 0; i < n; i++) fscanf(in, "%ld", &num[i]);
    bestsofar[n-1] = 1;
    /* bestsofar[i]: longest decreasing subsequence starting at index i */
    for (i = n-1-1; i >= 0; i--) {
        bestsofar[i] = 1;
        for (j = i+1; j < n; j++) {
            if (num[j] < num[i] && bestsofar[j] >= bestsofar[i]) {
                bestsofar[i] = bestsofar[j] + 1;
                if (bestsofar[i] > longest) longest = bestsofar[i];
            }
        }
    }
    fprintf(out, "bestsofar is %ld\n", longest);
    exit(0);
}
Just look at the parameters with which the function is called. The first determines the third (which, by the way, means you do not need the third parameter). The first ranges between 0 and n. The second is never larger than the first. This means there are at most n^2 distinct calls to the function.
Now comes the question of how many times the function is called with the same parameters. The answer is simple: you actually generate every single decreasing subsequence. This means that for the sequence N, N-1, N-2, ... you will generate 2^N sequences. Pretty poor, right? (Experiment with the sequence I have given if you want.)
However, if you use memoization, which you should have already read about, you can improve the complexity to O(N^3): each call does at most N operations, there are at most N^2 distinct calls, and memoization means you pay only once for each distinct call.
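A memoized form of that recursion could be sketched as follows. The names are hypothetical and the layout is an assumption: the state is the pair (index of the last chosen element, number of matches so far), with -1 standing for "nothing chosen yet", which gives roughly N^2 states with at most N work each.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: best total length reachable from the state (last, nmatches).
static int check(const std::vector<long>& seq, int last, int nmatches,
                 std::vector<std::vector<int>>& memo) {
    assert(nmatches <= static_cast<int>(seq.size()));
    int& cached = memo[last + 1][nmatches];  // shift by one so that last == -1 maps to row 0
    if (cached != -1) return cached;         // each distinct state is computed only once
    int best = nmatches;
    for (int i = last + 1; i < static_cast<int>(seq.size()); ++i) {
        if (last == -1 || seq[i] < seq[last]) {
            best = std::max(best, check(seq, i, nmatches + 1, memo));
        }
    }
    cached = best;
    return best;
}

// Hypothetical entry point: length of the longest strictly decreasing subsequence.
int longest_decreasing(const std::vector<long>& seq) {
    const int n = static_cast<int>(seq.size());
    std::vector<std::vector<int>> memo(n + 1, std::vector<int>(n + 1, -1));
    return check(seq, -1, 0, memo);
}
```

The table has (N + 1)^2 entries, so this form is only practical for small N; it is meant to show where the N^3 bound comes from, while the O(n^2) dynamic programming version above remains the better choice for the full input size.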
