Why does Frame.ofRecords garble its results when fed a sequence generated by a parallel calculation? - parallel-processing

I am running some code that calculates a sequence of records and calls Frame.ofRecords with that sequence as its argument. The records are calculated using PSeq.map from the library FSharp.Collections.ParallelSeq.
If I convert the sequence into a list then the output is OK. Here is the code and the output:
let summaryReport path (writeOpenPolicy: WriteOpenPolicy) (outputs: Output seq) =
    let foo (output: Output) =
        let temp =
            { Name = output.Name
              Strategy = string output.Strategy
              SharpeRatio = (fst output.PandLStats).SharpeRatio
              CalmarRatio = (fst output.PandLStats).CalmarRatio }
        printfn "************************************* %A" temp
        temp
    outputs
    |> Seq.map foo
    |> List.ofSeq // this is the line that makes a difference
    |> Frame.ofRecords
    |> frameToCsv path writeOpenPolicy ["Name"] "Summary_Statistics"
Name Name Strategy SharpeRatio CalmarRatio
0 Singleton_AAPL MyStrategy 0.317372564 0.103940018
1 Singleton_MSFT MyStrategy 0.372516931 0.130150478
2 Singleton_IBM MyStrategy Infinity
The printfn command let me verify by inspection that in each case the variable temp was calculated correctly.
The last code line is just a wrapper around FrameExtensions.SaveCsv.
If I remove the |> List.ofSeq line then what comes out is garbled:
Name Name Strategy SharpeRatio CalmarRatio
0 Singleton_IBM MyStrategy 0.317372564 0.130150478
1 Singleton_MSFT MyStrategy 0.103940018
2 Singleton_AAPL MyStrategy 0.372516931 Infinity
Notice that the empty (corresponding to NaN) and Infinity items are now in different lines and other things are also mixed up.
Why is this happening?

The Frame.ofRecords function iterates over the sequence multiple times, so if your sequence returns different data when called repeatedly, you will get inconsistent data into the frame.
Here is a minimal example:
let mutable n = 0.
let nums = seq { for i in 0 .. 10 do n <- n + 1.; yield n, n }
Frame.ofRecords nums
This returns:
Item1 Item2
0 -> 1 12
1 -> 2 13
2 -> 3 14
3 -> 4 15
4 -> 5 16
5 -> 6 17
6 -> 7 18
7 -> 8 19
8 -> 9 20
9 -> 10 21
10 -> 11 22
As you can see, the first item of each row is obtained during the first iteration of the sequence, while the second item is obtained during the second iteration.
This should probably be better documented, but it makes the performance better in typical scenarios - if you can send a PR to the docs, that would be useful.
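The same effect can be reproduced outside F#. The following is a Python analogy only, not Deedle's implementation: `CountingSequence` and `build_columns` are hypothetical names, and the column builder simply assumes one pass over the input per column, as described above.

```python
class CountingSequence:
    """A lazy sequence whose contents depend on a side effect,
    so every new iteration produces different values."""

    def __init__(self, length):
        self.length = length
        self.n = 0

    def __iter__(self):
        for _ in range(self.length):
            self.n += 1
            yield (self.n, self.n)


def build_columns(records):
    """Hypothetical frame builder: walks the input once per column."""
    first = [r[0] for r in records]   # first pass over the input
    second = [r[1] for r in records]  # second pass over the input
    return list(zip(first, second))


# Passing the lazy sequence directly: the two passes see different data.
lazy_rows = build_columns(CountingSequence(3))

# Materialising first (the List.ofSeq step): both passes see the same data.
stable_rows = build_columns(list(CountingSequence(3)))

print(lazy_rows)    # [(1, 4), (2, 5), (3, 6)]
print(stable_rows)  # [(1, 1), (2, 2), (3, 3)]
```

Materialising the sequence once fixes the contents before any column is read, which is why adding List.ofSeq makes the output consistent.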

Parallel sequences are processed in arbitrary order because the work is split across many processors, so the result set comes back in an unpredictable order. You can always sort the results afterwards, or not process your data in parallel.

Related

How to extract vectors from a given condition matrix in Octave

I'm trying to extract a matrix with two columns. The first column is the data that I want to group into a vector, while the second column is information about the group.
A =
1 1
2 1
7 2
9 2
7 3
10 3
13 3
1 4
5 4
17 4
1 5
6 5
The results I am looking for are:
A1 =
1
2
A2 =
7
9
A3 =
7
10
13
A4=
1
5
17
A5 =
1
6
As an illustration, I used the eval function, but it didn't give the results I wanted.
Assuming that you don't actually need individually named separate variables, the following will put the values into separate cells of a cell array, each of which can be an arbitrary size and can then be retrieved using cell index syntax. It makes use of logical indexing, so that each iteration of the for loop assigns to that cell in B just the values from the first column of A that have the correct number in the second column of A.
num_cells = max (A(:,2));
B = cell (num_cells,1);
for idx = 1:num_cells
  % curly braces store the vector as the contents of cell idx
  B{idx} = A((A(:,2)==idx),1);
end
B =
{
[1,1] =
1
2
[2,1] =
7
9
[3,1] =
7
10
13
[4,1] =
1
5
17
[5,1] =
1
6
}
Cell arrays are accessed a bit differently than normal numeric arrays. Array indexing (with ()) will return another cell, e.g.:
>> B(1)
ans =
{
[1,1] =
1
2
}
To get the contents of the cell so that you can work with them like any other variable, index them using {}.
>> B{1}
ans =
1
2
How it works:
Use max(A(:,2)) to find out how many array elements are going to be needed. A(:,2) uses subscript notation to indicate every value of A in column 2.
Create an empty cell array B with the right number of cells to contain the separated parts of A. This isn't strictly necessary, but with large amounts of data, things can slow down a lot if you keep adding on to the end of an array. Pre-allocating is usually better.
For each iteration of the for loop, it determines which elements in the 2nd column of A have the value matching the value of idx. This returns a logical array. For example, for the third time through the for loop, idx = 3, and:
>> A_index3 = A(:,2)==3
A_index3 =
0
0
0
0
1
1
1
0
0
0
0
0
That is a logical array of trues/falses indicating which elements equal 3. You are allowed to mix logical and subscript indexing in the same expression. So using this we can retrieve just those values from the first column:
A(A_index3, 1)
ans =
7
10
13
We get the same result if we do it in a single line without the A_index3 intermediate placeholder:
>> A(A(:,2)==3, 1)
ans =
7
10
13
Putting it in a for loop where 3 is replaced by the loop variable idx, and we assign the answer to the idx location in B, we get all of the values separated into different cells.
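For comparison, the same grouping idea can be sketched in Python. This is an illustration under assumptions, not a translation of the Octave code: `group_by_label` is a hypothetical helper, and a dictionary keyed by group number stands in for the cell array.

```python
def group_by_label(rows):
    """Collect the first-column values under their second-column label."""
    groups = {}
    for value, label in rows:
        groups.setdefault(label, []).append(value)
    return groups


A = [(1, 1), (2, 1), (7, 2), (9, 2), (7, 3), (10, 3), (13, 3),
     (1, 4), (5, 4), (17, 4), (1, 5), (6, 5)]

B = group_by_label(A)

# The counterpart of the logical index A(:,2)==3 is a list of booleans.
mask3 = [label == 3 for _, label in A]
picked = [value for (value, _), keep in zip(A, mask3) if keep]

print(B[3])    # [7, 10, 13]
print(picked)  # [7, 10, 13]
```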

How to implement least cost path through matrix in Haskell

Hello, I have a particular question that I can't find any resources on for Haskell. I'm looking to create a function that takes a matrix as a parameter and returns an array. Something like:
returnPossiblePaths :: [[Int]] -> [Int]
The condition, though, is that I return the array with the 'least cost path', that is, the path that has the lowest sum. So if I have the matrix:
[6 9 3
2 5 7]
I want to iterate from the head to the tail, add the numbers up in that path and return the array with the smallest sum. e.g:
6 -> 9 -> 3 -> 7 = 25
6 -> 9 -> 5 -> 7 = 27
6 -> 2 -> 5 -> 7 = 20
6 -> 2 -> 5 -> 9 -> 3 -> 7 = 32
So here my result array would be: [6, 2, 5, 7]. I need help on how to go about doing this. I have no idea how I would go about iterating from head to tail in different 'paths' without going through all the elements. My general plan was to get all the paths into arrays, map sum over all of them, then compare the results and return the array with the smallest sum. So I would first get all the arrays (paths) from the matrix, then apply this function to them:
addm::[Int]->Int
addm (x:xs) = sum(x:xs)
store those values in a variable, compare them then return the lowest one. I know haskell has amazing functions that make this way easier and I was wondering if I could get help on how to go about doing this. Any advice is greatly appreciated, thanks!
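The plan described in the question (collect every path, sum each, keep the smallest) can be sketched as follows. This is a brute-force illustration in Python rather than Haskell, and it rests on an assumption read off the four example paths: a path may step up, down, left or right and never revisits a cell. The names `all_paths` and `least_cost_path` are hypothetical.

```python
def all_paths(grid):
    """Yield every path from the top-left to the bottom-right cell,
    moving one step horizontally or vertically without revisiting a cell."""
    rows, cols = len(grid), len(grid[0])
    goal = (rows - 1, cols - 1)

    def walk(cell, visited, path):
        r, c = cell
        path = path + [grid[r][c]]
        if cell == goal:
            yield path
            return
        for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
            nxt = (r + dr, c + dc)
            inside = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
            if inside and nxt not in visited:
                yield from walk(nxt, visited | {nxt}, path)

    yield from walk((0, 0), {(0, 0)}, [])


def least_cost_path(grid):
    """Return the path with the smallest sum of values."""
    return min(all_paths(grid), key=sum)


matrix = [[6, 9, 3],
          [2, 5, 7]]

print(sorted(sum(p) for p in all_paths(matrix)))  # [20, 25, 27, 32]
print(least_cost_path(matrix))                    # [6, 2, 5, 7]
```

Enumerating every path grows exponentially with the size of the matrix, so this only suits small inputs; a shortest-path algorithm would avoid visiting all of them.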

Find the number of substrings in a string containing equal numbers of a, b, c

I'm trying to solve this problem. Now, I was able to get a recursive solution:
If DP[n] gives the number of beautiful substrings (defined in problem) ending at the nth character of the string, then to find DP[n+1], we scan the input string backward from the (n+1)th character until we find an ith character such that the substring beginning at the ith character and ending at the (n+1)th character is beautiful. If no such i can be found, DP[n+1] = 0.
If such a string is found then, DP[n+1] = 1 + DP[i-1].
The trouble is, this solution gives a timeout on one testcase. I suspect it is the scanning backward part that is problematic. The overall time complexity for my solution seems to be O(N^2). The size of the input data seems to indicate that the problem expects an O(NlogN) solution.
You don't really need dynamic programming for this; you can do it by iterating over the string once and, after each character, storing the state (the relative number of a's, b's and c's that were encountered so far) in a dictionary. This dictionary has maximum size N+1, so the overall time complexity is O(N).
If you find that at a certain point in the string there are e.g. 5 more a's than b's and 7 more c's than b's, and you find the same situation at another point in the string, then you know that the substring between those two points contains an equal number of a's, b's and c's.
Let's walk through an example with the input "dabdacbdcd":
a,b,c
-> 0,0,0
d -> 0,0,0
a -> 1,0,0
b -> 1,1,0
d -> 1,1,0
a -> 2,1,0
c -> 2,1,1 -> 1,0,0
b -> 1,1,0
d -> 1,1,0
c -> 1,1,1 -> 0,0,0
d -> 0,0,0
Because we're only interested in the difference between the number of a's, b's and c's, not the actual number, we reduce a state like 2,1,1 to 1,0,0 by subtracting the lowest number from all three numbers.
We end up with a dictionary of these states, and the number of times they occur:
0,0,0 -> 4
1,0,0 -> 2
1,1,0 -> 4
2,1,0 -> 1
States which occur only once don't indicate an abc-equal substring, so we can discard them; we're then left with these repetitions of states:
4, 2, 4
If a state occurs twice, there is 1 abc-equal substring between those two locations. If a state occurs 4 times, there are 6 abc-equal substrings between them; e.g. the state 1,1,0 occurs at these points:
dab|d|acb|d|cd
Every substring between 2 of those 4 points is abc-equal:
d, dacb, dacbd, acb, acbd, d
In general, if a state occurs n times, it represents 1 + 2 + 3 + ... + (n-1) abc-equal substrings (or, easier to calculate: (n-1) × n/2). If we calculate this for every count in the dictionary, the total is our solution:
4 -> 3 x 2 = 6
2 -> 1 x 1 = 1
4 -> 3 x 2 = 6
--
13
Let's check the result by finding what those 13 substrings are:
1 d---------
2 dabdacbdc-
3 dabdacbdcd
4 -abdacbdc-
5 -abdacbdcd
6 --bdac----
7 ---d------
8 ---dacb---
9 ---dacbd--
10 ----acb---
11 ----acbd--
12 -------d--
13 ---------d
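The walk-through above can be sketched in Python as follows. The function name `count_abc_equal_substrings` is an assumption made for illustration. Like the walk-through, it counts substrings that contain no a, b or c at all (such as "d") as abc-equal.

```python
from collections import Counter


def count_abc_equal_substrings(s):
    """Count substrings holding equal numbers of 'a', 'b' and 'c'."""
    counts = {"a": 0, "b": 0, "c": 0}
    seen = Counter()
    seen[(0, 0, 0)] = 1  # the state before any character is read

    for ch in s:
        if ch in counts:
            counts[ch] += 1
        # Normalise by subtracting the lowest count from all three.
        low = min(counts.values())
        state = (counts["a"] - low, counts["b"] - low, counts["c"] - low)
        seen[state] += 1

    # A state seen n times contributes (n - 1) * n / 2 substrings.
    return sum(n * (n - 1) // 2 for n in seen.values())


print(count_abc_equal_substrings("dabdacbdcd"))  # 13
```

Each character costs one dictionary update, so the whole pass is linear in the length of the string.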

Lua Variable, Table, For Loop Syntax

Just saw this in the Lua self examples...
-- Example 24 -- Printing tables.
-- Simple way to print tables.
a={1,2,3,4,"five","elephant", "mouse"}
for i,v in pairs(a) do print(i,v) end
-------- Output ------
1 1
2 2
3 3
4 4
5 five
6 elephant
7 mouse
Press 'Enter' key for next example
I haven't seen this syntax before, for i,v in pairs(a) do print(i,v) end
Where did the v come into existence ?
Does the word in cause it to exist ?
By the same token, where does the i come into existence ?
Is this a syntax designed for tables ?
Thanks for any explanation.
pairs returns an iterator over all fields and their values.
More exactly, the first thing it returns is a function that takes the table and the previously seen index, and returns the next index together with its value.
> t = {4,5,6}
> p = pairs(t)
> =p(t)
1 4
> =p(t,1)
2 5
> =p(t,2)
3 6
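What the for loop does with that function can be imitated in Python. This is only a rough model, under the assumption that the table is a plain list with 1-based indices; `next_pair` and `generic_for` are hypothetical names, not part of Lua or Python.

```python
def next_pair(items, prev_index=None):
    """Given the previously seen index, return the next (index, value)
    pair, or None when there is nothing left. Indices are 1-based."""
    index = 1 if prev_index is None else prev_index + 1
    if index > len(items):
        return None
    return index, items[index - 1]


def generic_for(items):
    """Call next_pair repeatedly, feeding the last index back in each
    time. This is where the loop variables i and v get their values."""
    collected = []
    state = None
    while True:
        result = next_pair(items, state)
        if result is None:
            break
        i, v = result
        collected.append((i, v))
        state = i
    return collected


print(next_pair([4, 5, 6], 1))  # (2, 5)
print(generic_for([4, 5, 6]))   # [(1, 4), (2, 5), (3, 6)]
```

In other words, i and v are local variables declared by the for statement itself, and they are refilled on every pass with whatever the iterator function returns.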
There are two options: iterate over every key with pairs, or over just the consecutive integer keys with ipairs.
This loop is very similar to Python's:
l = [4,5,6]
for i, v in enumerate(l):
...
or
d = {"a":1, "b":2}
for k, v in d.items():
...
if you know python (it looks like everyone knows it)

CodeGolf: Brothers

I just finished participating in the 2009 ACM ICPC Programming Contest in the Latin American Finals. These questions were for Brazil, Bolivia, Chile, etc.
My team and I could only finish two questions out of the eleven (not bad I think for the first try).
Here's one we could finish. I'm curious to see any variations on the code. The question in full follows. PS: These questions can also be found on the official ICPC website, available to everyone.
In the land of ACM ruled a great king who became obsessed with order. The kingdom had a rectangular form, and the king divided the territory into a grid of small rectangular counties. Before dying the king distributed the counties among his sons.
The king was unaware of the rivalries between his sons: The first heir hated the second but not the rest, the second hated the third but not the rest, and so on...Finally, the last heir hated the first heir, but not the other heirs.
As soon as the king died, the strange rivalry among the king's sons sparked off a generalized war in the kingdom. Attacks only took place between pairs of adjacent counties (adjacent counties are those that share one vertical or horizontal border). A county X attacked an adjacent county Y whenever X hated Y. The attacked county was always conquered. All attacks were carried out simultaneously, and a set of simultaneous attacks was called a battle. After a certain number of battles, the surviving sons made a truce and never battled again.
For example if the king had three sons, named 0, 1 and 2, the figure below shows what happens in the first battle for a given initial land distribution:
INPUT
The input contains several test cases. The first line of a test case contains four integers, N, R, C and K.
N - The number of heirs (2 <= N <= 100)
R and C - The dimensions of the land. (2 <= R,C <= 100)
K - Number of battles that are going to take place. (1 <= K <= 100)
Heirs are identified by sequential integers starting from zero. Each of the next R lines contains C integers HeirIdentificationNumber (saying what heir owns this land) separated by single spaces. This is to layout the initial land.
The last test case is followed by a line containing four zeroes separated by single spaces. (To exit the program, so to speak.)
Output
For each test case your program must print R lines with C integers each, separated by single spaces in the same format as the input, representing the land distribution after all battles.
Sample Input:
3 4 4 3
0 1 2 0
1 0 2 0
0 1 2 0
0 1 2 2
Sample Output:
2 2 2 0
2 1 0 1
2 2 2 0
0 2 0 0
Another example:
Sample Input:
4 2 3 4
1 0 3
2 1 2
Sample Output:
1 0 3
2 1 2
Perl, 233 char
{$_=<>;($~,$R,$C,$K)=split;if($~){@A=map{$_=<>;split}1..$R;$x=0,
@A=map{$r=0;for$d(-$C,$C,1,-1){$r|=($y=$x+$d)>=0&$y<@A&1==($_-$A[$y])%$~
if($p=(1+$x)%$C)>1||1-$d-2*$p}$x++;($_-$r)%$~}@A
while$K--;print"@a\n"while@a=splice@A,0,$C;redo}}
The map is held in a one-dimensional array. This is less elegant than the two-dimensional solution, but it is also shorter. Contains the idiom @A=map{...}@A where all the fighting goes on inside the braces.
Python (420 characters)
I haven't played with code golf puzzles in a while, so I'm sure I missed a few things:
import sys
H,R,C,B=map(int,raw_input().split())
M=(1,0), (0,1),(-1, 0),(0,-1)
l=[map(int,r.split())for r in sys.stdin]
n=[r[:]for r in l[:]]
def D(r,c):
 x=l[r][c]
 a=[l[r+mr][c+mc]for mr,mc in M if 0<=r+mr<R and 0<=c+mc<C]
 if x==0and H-1in a:n[r][c]=H-1
 elif x-1in a:n[r][c]=x-1
 else:n[r][c]=x
G=range
for i in G(B):
 for r in G(R):
  for c in G(C):D(r,c)
 l=[r[:] for r in n[:]]
for r in l:print' '.join(map(str,r))
Lua, 291 Characters
g=loadstring("return io.read('*n')")repeat n=g()r=g()c=g()k=g()l={}c=c+1 for
i=0,k do w={}for x=1,r*c do a=l[x]and(l[x]+n-1)%n w[x]=i==0 and x%c~=0 and
g()or(l[x-1]==a or l[x+1]==a or l[x+c]==a or l[x-c]==a)and a or
l[x]io.write(i~=k and""or x%c==0 and"\n"or w[x].." ")end l=w end until n==0
F#, 675 chars
let R()=System.Console.ReadLine().Split([|' '|])|>Array.map int
let B(a:int[][]) r c g=
  let n=Array.init r (fun i->Array.copy a.[i])
  for i in 1..r-2 do for j in 1..c-2 do
    let e=a.[i].[j]-1
    let e=if -1=e then g else e
    if a.[i-1].[j]=e||a.[i+1].[j]=e||a.[i].[j-1]=e||a.[i].[j+1]=e then
      n.[i].[j]<-e
  n
let mutable n,r,c,k=0,0,0,0
while(n,r,c,k)<>(0,2,2,0)do
  let i=R()
  n<-i.[0]
  r<-i.[1]+2
  c<-i.[2]+2
  k<-i.[3]
  let mutable a=Array.init r (fun i->
                  if i=0||i=r-1 then Array.create c -2 else[|yield -2;yield!R();yield -2|])
  for j in 1..k do a<-B a r c (n-1)
  for i in 1..r-2 do
    for j in 1..c-2 do
      printf "%d" a.[i].[j]
    printfn ""
Make the array big enough to put an extra border of "-2" around the outside; this way we can look left/up/right/down without worrying about out-of-bounds exceptions.
B() is the battle function; it clones the array-of-arrays and computes the next layout. For each square, see if up/down/left/right is the guy who hates you (enemy 'e'), if so, he takes you over.
The main while loop just reads input, runs k iterations of battle, and prints output as per the spec.
Input:
3 4 4 3
0 1 2 0
1 0 2 0
0 1 2 0
0 1 2 2
4 2 3 4
1 0 3
2 1 2
0 0 0 0
Output:
2220
2101
2220
0200
103
212
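For readers who want an ungolfed reference, the battle rule from the problem statement can be sketched in Python 3. This is a plain illustration, not any of the submitted answers; `battle` and `simulate` are hypothetical names, and input parsing is left out. A county owned by heir v falls to a neighbour owned by heir (v - 1) mod N, since that is the heir who hates v.

```python
def battle(land, heirs):
    """Return the land distribution after one round of simultaneous attacks."""
    rows, cols = len(land), len(land[0])
    after = [row[:] for row in land]  # read from land, write to the copy
    for r in range(rows):
        for c in range(cols):
            enemy = (land[r][c] - 1) % heirs
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and land[rr][cc] == enemy:
                    after[r][c] = enemy
                    break
    return after


def simulate(land, heirs, battles):
    """Apply the given number of battles in sequence."""
    for _ in range(battles):
        land = battle(land, heirs)
    return land


first = simulate([[0, 1, 2, 0],
                  [1, 0, 2, 0],
                  [0, 1, 2, 0],
                  [0, 1, 2, 2]], heirs=3, battles=3)
second = simulate([[1, 0, 3],
                   [2, 1, 2]], heirs=4, battles=4)

for row in first + second:
    print(" ".join(map(str, row)))
```

Writing into a copy is what keeps the attacks simultaneous: every county is judged against the state before the battle, never against a neighbour that has already changed hands in the same round.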
Python 2.6, 383 376 Characters
This code is inspired by Steve Losh' answer:
import sys
A=range
l=lambda:map(int,raw_input().split())
def x(N,R,C,K):
 if not N:return
 m=[l()for _ in A(R)];n=[r[:]for r in m]
 def u(r,c):z=m[r][c];n[r][c]=(z-((z-1)%N in[m[r+s][c+d]for s,d in(-1,0),(1,0),(0,-1),(0,1)if 0<=r+s<R and 0<=c+d<C]))%N
 for i in A(K):[u(r,c)for r in A(R)for c in A(C)];m=[r[:]for r in n]
 for r in m:print' '.join(map(str,r))
x(*l())
x(*l())
Haskell (GHC 6.8.2), 570 446 415 413 388 Characters
Minimized:
import Monad
import Array
import List
f=map
d=getLine>>=return.f read.words
h m k=k//(f(\(a@(i,j),e)->(a,maybe e id(find(==mod(e-1)m)$f(k!)$filter(inRange$bounds k)[(i-1,j),(i+1,j),(i,j-1),(i,j+1)])))$assocs k)
main=do[n,r,c,k]<-d;when(n>0)$do g<-mapM(const d)[1..r];mapM_(\i->putStrLn$unwords$take c$drop(i*c)$f show$elems$(iterate(h n)$listArray((1,1),(r,c))$concat g)!!k)[0..r-1];main
The code above is based on the (hopefully readable) version below. Perhaps the most significant difference with sth's answer is that this code uses Data.Array.IArray instead of nested lists.
import Control.Monad
import Data.Array.IArray
import Data.List
type Index = (Int, Int)
type Heir = Int
type Kingdom = Array Index Heir
-- Given the dimensions of a kingdom and a county, return its neighbors.
neighbors :: (Index, Index) -> Index -> [Index]
neighbors dim (i, j) =
  filter (inRange dim) [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
-- Given the first non-Heir and a Kingdom, calculate the next iteration.
iter :: Heir -> Kingdom -> Kingdom
iter m k = k // (
  map (\(i, e) -> (i, maybe e id (find (== mod (e - 1) m) $
                                    map (k !) $ neighbors (bounds k) i))) $
  assocs k)
-- Read a line of integers from stdin.
readLine :: IO [Int]
readLine = getLine >>= return . map read . words
-- Print the given kingdom, assuming the specified number of rows and columns.
printKingdom :: Int -> Int -> Kingdom -> IO ()
printKingdom r c k =
  mapM_ (\i -> putStrLn $ unwords $ take c $ drop (i * c) $ map show $ elems k)
        [0..r-1]
main :: IO ()
main = do
  [n, r, c, k] <- readLine               -- read number of heirs, rows, columns and iters
  when (n > 0) $ do                      -- observe that 0 heirs implies [0, 0, 0, 0]
    g <- sequence $ replicate r readLine -- read initial state of the kingdom
    printKingdom r c $                   -- print kingdom after k iterations
      (iterate (iter n) $ listArray ((1, 1), (r, c)) $ concat g) !! k
    main                                 -- handle next test case
AWK - 245
A bit late, but nonetheless... Data in a 1-D array. Using a 2-D array the solution is about 30 chars longer.
NR<2{N=$1;R=$2;C=$3;K=$4;M=0}NR>1{for(i=0;i++<NF;)X[M++]=$i}END{for(k=0;k++<K;){
for(i=0;i<M;){Y[i++]=X[i-(i%C>0)]-(b=(N-1+X[i])%N)&&X[i+((i+1)%C>0)]-b&&X[i-C]-b
&&X[i+C]-b?X[i]:b}for(i in Y)X[i]=Y[i]}for(i=0;i<M;)printf"%s%d",i%C?" ":"\n",
X[i++]}
