Is the second of these solutions more efficient? - prolog

I am wondering whether, as a practical matter, the compiled form of the second solution below is more efficient.
first solution :
write_data([]).
write_data([_]) :-
    write('LAST!'), nl.
write_data([X|Rest]) :-
    Rest = [_|_],
    write(X), nl,
    write_data(Rest).
second solution :
write_data([]).
write_data([H|T]) :-
    write_data(T, H).
write_data([], _) :-
    write('LAST!'), nl.
write_data([H|T], X) :-
    write(X), nl,
    write_data(T, H).
some swipl magic :
?- [library(vm)] .
?- vm_list(write_data) , halt.
first solution swipl magic ::
========================================================================
write_data/1
========================================================================
0 s_virgin
1 i_exit
----------------------------------------
clause 1 (<clause>(0xb002c300)):
----------------------------------------
0 h_nil
1 i_exitfact
----------------------------------------
clause 2 (<clause>(0xb017da50)):
----------------------------------------
0 h_list
1 h_void
2 h_nil
3 h_pop
4 i_enter
5 b_atom('LAST!')
7 i_call(write/1)
9 i_depart(nl/0)
11 i_exit
----------------------------------------
clause 3 (<clause>(0xb002bc80)):
----------------------------------------
0 h_list_ff(1,2)
3 i_enter
4 b_unify_var(2)
6 h_list
7 h_pop
8 b_unify_exit
9 b_var1
10 i_call(write/1)
12 i_call(nl/0)
14 b_var2
15 i_depart(write_data/1)
17 i_exit
second solution swipl magic ::
========================================================================
write_data/1
========================================================================
0 s_virgin
1 i_exit
----------------------------------------
clause 1 (<clause>(0xb272c300)):
----------------------------------------
0 h_nil
1 i_exitfact
----------------------------------------
clause 2 (<clause>(0xb27288c0)):
----------------------------------------
0 h_list_ff(1,2)
3 i_enter
4 b_var2
5 b_var1
6 i_depart(write_data/2)
8 i_exit
========================================================================
write_data/2
========================================================================
0 s_virgin
1 i_exit
----------------------------------------
clause 1 (<clause>(0xb2728920)):
----------------------------------------
0 h_nil
1 i_enter
2 b_atom('LAST!')
4 i_call(write/1)
6 i_depart(nl/0)
8 i_exit
----------------------------------------
clause 2 (<clause>(0xb272bc80)):
----------------------------------------
0 h_list_ff(2,3)
3 i_enter
4 b_var1
5 i_call(write/1)
7 i_call(nl/0)
9 b_var(3)
11 b_var2
12 i_depart(write_data/2)
14 i_exit
Upon submitting the above, Stack Overflow blocked the post with the message "It looks like your post is mostly code; please add some more details." Hmm. Too much code in a question about computer programming? (scratches virtual head) Should I add some verbiage about the new revolution in "Bunch-Oriented" programming that I heard about around the water cooler?

Related

How to find the index of the first row of a matrix that satisfies two conditions in APL Language?

One more question to learn how to use APL Language.
Suppose you have an array, as an example:
c1  c2   c3   c4  c5  c6
3   123  0    4   5   6
3   134  0    2   3   4
3   231  180  1   2   5
4   121  0    3   2   4
4   124  120  4   6   3
4   222  222  5   3   5
So, how do I find the first row that has a value of 4 in the 1st column and a value greater than 0 in the 3rd column?
The expected answer is the 5th row, i.e. just 5.
When you want to make such "queries", think Boolean masks.
table ← 6 6⍴3 123 0 4 5 6 3 134 0 2 3 4 3 231 180 1 2 5 4 121 0 3 2 4 4 124 120 4 6 3 4 222 222 5 3 5
Let's extract the first column:
table[;1]
3 3 3 4 4 4
And indicate which elements have a value of 4:
table[;1] = 4
0 0 0 1 1 1
Similarly, we can indicate which elements of column 3 have value greater than 0:
table[;3] > 0
0 0 1 0 1 1
Their intersection (logical AND) indicates all rows that fulfil your criteria:
(table[;1] = 4) ∧ (table[;3] > 0)
0 0 0 0 1 1
The index of the first 1 is the row number for the first row that fulfils your criteria:
((table[;1] = 4) ∧ (table[;3] > 0)) ⍳ 1
5
Alternatively, we can use the final mask to filter the table and obtain all rows that fulfil your criteria:
((table[;1] = 4) ∧ (table[;3] > 0)) ⌿ table
4 124 120 4 6 3
4 222 222 5 3 5
Or we can generate all the row numbers:
⍳ 1 ↑ ⍴ table
1 2 3 4 5 6
Then use our Boolean mask to filter that, finding the row numbers of all the rows that fulfil your criteria:
((table[;1] = 4) ∧ (table[;3] > 0)) ⌿ ⍳ 1 ↑ ⍴ table
5 6
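For readers less at home in APL, the same Boolean-mask idea can be sketched in plain Python. This is only an illustration of the technique, not a translation the answer itself offers; the names `mask`, `first_row`, `matching_rows` and `matching_numbers` are made up for this sketch, and row numbers are kept 1-based to match the APL results.

```python
# Same table as in the question, one inner list per row.
table = [
    [3, 123, 0, 4, 5, 6],
    [3, 134, 0, 2, 3, 4],
    [3, 231, 180, 1, 2, 5],
    [4, 121, 0, 3, 2, 4],
    [4, 124, 120, 4, 6, 3],
    [4, 222, 222, 5, 3, 5],
]

# One Boolean per row: column 1 equals 4 AND column 3 is greater than 0.
mask = [row[0] == 4 and row[2] > 0 for row in table]

# 1-based number of the first matching row.
# Note: list.index raises ValueError when nothing matches,
# whereas the APL expression would return one more than the row count.
first_row = mask.index(True) + 1

# All matching rows, and all matching row numbers.
matching_rows = [row for row, keep in zip(table, mask) if keep]
matching_numbers = [i for i, keep in enumerate(mask, start=1) if keep]

print(first_row)
print(matching_numbers)
```

Building the mask once and reusing it for the three queries mirrors how the APL expressions all share the same parenthesised condition.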

plotting multiple graphs and animation from a data file in gnuplot

Suppose I have the following sample data file.
0 1 2
0 3 4
0 1 9
0 9 2
0 19 0
0 6 1
0 11 0
1 3 2
1 3 4
1 1 6
1 9 2
1 15 0
1 6 6
1 11 1
2 3 2
2 4 4
2 1 6
2 9 6
2 15 0
2 6 6
2 11 1
The first column gives the time, the second the x values and the third the y values. I wish to plot y as a function of x from this data file at different times,
i.e. for t=0 I plot using 2:3 with lines over the rows where t=0, and then do the same for the rows where t=1.
In the end I want a gif, i.e. an animation of how the y vs x graph changes shape as time goes on. How can I do this in gnuplot?
What have you tried so far? (Check help ternary and help gif)
You need to filter your data with the ternary operator and then create the animation.
Code:
### plot filtered data and animate
reset session
$Data <<EOD
0 1 2
0 3 4
0 1 9
0 9 2
0 19 0
0 6 1
0 11 0
1 3 2
1 3 4
1 1 6
1 9 2
1 15 0
1 6 6
1 11 1
2 3 2
2 4 4
2 1 6
2 9 6
2 15 0
2 6 6
2 11 1
EOD
set terminal gif animate delay 50 optimize
set output "myAnimation.gif"
set xrange[0:20]
set yrange[0:10]
do for [i=0:2] {
plot $Data u 2:($1==i?$3:NaN) w lp pt 7 ti sprintf("Time: %g",i)
}
set output
### end of code
Result: (animated GIF; image not reproduced here)
Addition:
The meaning of $1==i?$3:NaN in words:
If the value in the first column is equal to i then the result is the value in the third column else it will be NaN ("Not a Number").
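To make the effect of the ternary filter concrete, here is a small Python sketch of the same per-row decision. It is not gnuplot and not part of the answer's script; `filter_column` and the sample rows are hypothetical names and data chosen only for illustration.

```python
import math

def filter_column(rows, i):
    """Mimic ($1==i ? $3 : NaN): keep y where the time equals i, else NaN."""
    return [(x, y if t == i else math.nan) for t, x, y in rows]

# A few (time, x, y) rows taken from the sample data file.
rows = [(0, 1, 2), (0, 3, 4), (1, 3, 2), (1, 1, 6), (2, 4, 4)]

# One "frame" per time value, as in the do-for loop.
for i in range(3):
    frame = filter_column(rows, i)
    kept = [(x, y) for x, y in frame if not math.isnan(y)]
    print(i, kept)
```

Every row is still visited for every frame; rows from other times simply yield NaN for y, which is why they do not show up in that frame.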

How can you improve computation time when predicting KNN Imputation?

I feel like my run time is extremely slow for my data set, this is the code:
library(caret)
library(data.table)
knnImputeValues <- preProcess(mainData[trainingRows, imputeColumns], method = c("zv", "knnImpute"))
knnTransformed <- predict(knnImputeValues, mainData[ 1:1000, imputeColumns])
The preProcess call that builds knnImputeValues runs fairly quickly, but the predict call takes a tremendous amount of time. When I timed it on a subset of the data, this was the result:
testtime <- system.time(knnTransformed <- predict(knnImputeValues, mainData[ 1:15000, imputeColumns]))
testtime
   user  system elapsed
 969.78   38.70 1010.72
Additionally, it should be noted that caret preprocess uses "RANN".
Now my full dataset is:
str(mainData[ , imputeColumns])
'data.frame': 1809032 obs. of 16 variables:
$ V1: int 3 5 5 4 4 4 3 4 3 3 ...
$ V2: Factor w/ 3 levels "1000000","1500000",..: 1 1 3 1 1 1 1 3 1 1 ...
$ V3: Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
$ V4: int 2 5 5 12 4 5 11 8 7 8 ...
$ V5: int 2 0 0 2 0 0 1 3 2 8 ...
$ V6: int 648 489 489 472 472 472 497 642 696 696 ...
$ V7: Factor w/ 4 levels "","N","U","Y": 4 1 1 1 1 1 1 1 1 1 ...
$ V8: int 0 0 0 0 0 0 0 1 1 1 ...
$ V9: num 0 0 0 0 0 ...
$ V10: Factor w/ 56 levels "1","2","3","4",..: 45 19 19 19 19 19 19 46 46 46 ...
$ V11: Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
$ V12: num 2 5 5 12 4 5 11 8 7 8 ...
$ V13: num 2 0 0 2 0 0 1 3 2 8 ...
$ V14: Factor w/ 4 levels "1","2","3","4": 2 2 2 2 2 2 2 2 3 3 ...
$ V15: Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 2 2 2 ...
$ V16: num 657 756 756 756 756 ...
So is there something I'm doing wrong, or is this typical of how long this takes to run? A back-of-the-envelope extrapolation (which I know isn't entirely accurate) gives what, 33 days?
Also, it looks like system time is very low and user time is very high. Is that normal?
My computer is a laptop with an Intel(R) Core(TM) i5-6300U CPU @ 2.40GHz processor.
Additionally, would this improve the runtime of the predict function?
cl <- makeCluster(4)
registerDoParallel()
I tried it, and it didn't seem to make a difference, other than all the processors looking more active in my task manager.
FOCUSED QUESTION: I'm using the caret package to do KNN imputation on 1.8 million rows. The way I'm currently doing it will take over a month to run. How do I write this so that it runs in much less time (if possible)?
Thank you for any help provided. The answer might very well be "that's how long it takes, don't bother"; I just want to rule out any possible mistakes.
You can speed this up with the imputation package, which can be installed from GitHub, and its use of canopies:
Sys.setenv("PKG_CXXFLAGS"="-std=c++0x")
devtools::install_github("alexwhitworth/imputation")
Canopies use a cheap distance metric (in this case, distance from the data mean vector) to get approximate neighbors. In general, we wish to keep each canopy under 100k rows, so for 1.8M rows we'll use 20 canopies:
library("imputation")
to_impute <- mainData[trainingRows, imputeColumns] ## OP undefined
imputed <- kNN_impute(to_impute, k= 10, q= 2, verbose= TRUE,
parallel= TRUE, n_canopies= 20)
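As a rough illustration of the canopy idea, the Python sketch below groups rows by their distance from the mean vector. It is not the imputation package's code: the function name `assign_canopies`, the equal-sized split by distance rank, and the assumption of complete numeric rows are all choices made for this sketch only.

```python
import math

def assign_canopies(rows, n_canopies):
    """Split row indices into n_canopies groups by distance from the mean vector.

    Assumes every row is fully numeric with no missing values (a
    simplification; real data to be imputed would need extra handling).
    """
    n_cols = len(rows[0])
    mean = [sum(r[j] for r in rows) / len(rows) for j in range(n_cols)]
    # Cheap metric: Euclidean distance of each row from the mean vector.
    dist = [math.dist(r, mean) for r in rows]
    # Rank rows by that distance and cut the ranking into equal-sized groups.
    order = sorted(range(len(rows)), key=dist.__getitem__)
    canopies = [[] for _ in range(n_canopies)]
    for rank, idx in enumerate(order):
        canopies[rank * n_canopies // len(rows)].append(idx)
    return canopies

rows = [[0.0], [1.0], [2.0], [10.0]]
print(assign_canopies(rows, 2))
```

The expensive nearest-neighbour search would then only compare rows inside the same group, which is where the speed-up would come from under these assumptions.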
NOTE:
The imputation package requires numeric data inputs. You have several factor variables in your str output; they will cause this to fail.
You'll also get some mean vector imputation if you have fully missing rows.
# note this example data is too small for canopies to be useful
# meant solely to illustrate
set.seed(2143L)
x1 <- matrix(rnorm(1000), 100, 10)
x1[sample(1:1000, size= 50, replace= FALSE)] <- NA
x_imp <- kNN_impute(x1, k=5, q=2, n_canopies= 10)
sum(is.na(x_imp[[1]])) # 0
# with fully missing rows
x2 <- x1; x2[5,] <- NA
x_imp <- kNN_impute(x2, k=5, q=2, n_canopies= 10)
[1] "Computing canopies kNN solution provided within canopies"
[1] "Canopies complete... calculating kNN."
row(s) 1 are entirely missing.
These row(s)' values will be imputed to column means.
Warning message:
In FUN(X[[i]], ...) :
Rows with entirely missing values imputed to column means.

I can't find a pattern of this 6*6 matrix

It's a kind of programming practice problem.
The question is, "Print this matrix".
0 1 2 3 4 5
1 2 3 4 5 0
0 1 2 3 0 1
5 0 5 4 1 2
4 5 4 3 2 3
3 2 1 0 5 4
=========================
Well, I could call 'printf' 16 times, but I don't want to do that.
There must be some pattern.
But really, I couldn't figure it out. I struggled with it for a week!
It is a clockwise spiral starting at the top left, with the values counting 0 to 5 over and over along the spiral path.
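One way to generate the matrix from that observation is sketched below in Python (the question mentions 'printf', but the idea carries over to C directly). The function name `spiral_matrix` and its parameters are made up for this sketch; it walks the grid clockwise from the top left, turning whenever the next cell is outside the grid or already filled, and writes the step counter modulo 6.

```python
def spiral_matrix(size=6, modulus=6):
    """Fill a size x size grid along a clockwise spiral with step % modulus."""
    grid = [[None] * size for _ in range(size)]
    row, col = 0, 0
    d_row, d_col = 0, 1  # start by moving right along the top row
    for step in range(size * size):
        grid[row][col] = step % modulus
        next_row, next_col = row + d_row, col + d_col
        blocked = (
            not (0 <= next_row < size and 0 <= next_col < size)
            or grid[next_row][next_col] is not None
        )
        if blocked:
            # Turn clockwise: right -> down -> left -> up -> right.
            d_row, d_col = d_col, -d_row
            next_row, next_col = row + d_row, col + d_col
        row, col = next_row, next_col
    return grid

for line in spiral_matrix():
    print(" ".join(str(value) for value in line))
```

Tracking a direction vector and rotating it on collision avoids writing four separate loops for the four sides of each ring.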

Dynamic Programming - Two spies at the river

I think this is a very complicated dynamic programming problem.
Two spies each have a secret number in [1..m]. To exchange numbers they agree to meet at the river and "innocently" take turns throwing stones: from a pile of n=26 identical stones, each spy in turn throws at least one stone in the river.
The only information is in the number of stones each spy throws in each turn. What is the largest m can be so that they are sure they can complete the exchange?
Develop a recursive formula to count. Here is the start of the table; complete it to n=26. (You should not expect a closed form.)
n 1 2 3 4 5 6 7 8 9 10 11 12
m 1 1 1 2 2 3 4 6 8 12 16 23
Here are some hints from our professor: I suggest changing the problem to making the following table: Let R(n,m) be the range of numbers [1..R(n,m)] that A can indicate to B if they start with n stones, and both know that A has to also receive a number in [1..m] from B.
For example, if A needs no more information, R(n,1) can be computed by considering how many stones A could throw (one to n), then B throws 1 (if any remain) and A gets to decide again. The base cases are R(0,1) = R(1,1) = 1, and you can write a recursive rule if you are careful at the boundaries. (You should find the Fibonacci numbers for R(n,1).)
If A needs information, then B has to send it by his or her choices, so things are a little more complicated. Here is the start of the table:
n\ m 1 2 3 4 5
0 1 0 0 0 0
1 1 0 0 0 0
2 2 0 0 0 0
3 3 1 0 0 0
4 5 2 1 0 0
5 8 4 2 1 1
6 13 7 4 3 2
7 21 12 8 6 4
8 34 20 15 11 8
9 55 33 27 19 16
From the R(n,m) table, how would you recover the entries of the earlier table (the table showing m as a function of n)?
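As a starting point, the hint for the first column, R(n,1), can be sketched in Python as follows. This covers only the m = 1 case described in the hint (A throws k stones, B throws one if any remain, A decides again); it does not attempt the general R(n,m) rule, and the function name `r_column_one` is hypothetical.

```python
def r_column_one(max_n):
    """Return [R(0,1), ..., R(max_n,1)] following the hinted recursion."""
    r = [1, 1]  # base cases R(0,1) = R(1,1) = 1
    for n in range(2, max_n + 1):
        total = 0
        for k in range(1, n + 1):  # A throws k stones
            left = n - k
            if left > 0:
                left -= 1  # B throws a single stone, if any remain
            total += r[left]  # A continues with the remaining stones
        r.append(total)
    return r[: max_n + 1]

print(r_column_one(9))
```

The values produced for n = 0..9 can be checked against the m = 1 column of the table above; the boundary case where A throws all remaining stones is the part the hint warns about.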
