Scripting LibreOffice Calc formulas into CSV - bash

I have a bash script that writes data extracted from raw log files into a file in CSV format. Now I want to apply LibreOffice Calc formulas to this dataset. My idea is to write "raw" Calc formulas into the CSV file directly from the bash script (using ';' [semicolon] instead of ',' [comma] to separate fields, to avoid breaking the formulas). So I have a script like this (for example):
#!/bin/bash
for (( i=1; i<=5; i++ ))
do
echo "$i; $((i+1)); =SUM(A$i, B$i)" >> sum.csv
done
Executing this script gives this sum.csv file:
1; 2; =SUM(A1, B1)
2; 3; =SUM(A2, B2)
3; 4; =SUM(A3, B3)
4; 5; =SUM(A4, B4)
5; 6; =SUM(A5, B5)
When I open it with Calc, it gives the expected result, with each value in its own cell. But the problem is that the formulas are not evaluated. Even manually editing a cell doesn't trigger an evaluation. The only thing that works is to copy the formula text without the '=', type '=' in the cell manually, and then paste the rest of the formula.
I tried using INDIRECT() but it didn't help.
Is there a way to force evaluation of the formulas? Or is there some other way to do what I want (without learning a new language...)?

It should work after removing the leading space before the equals sign. The third field currently has the content " =SUM(A1, B1)" (notice the leading space). LO will only recognize the formula if the cell content starts with '=', with no leading space:
1;2;=SUM(A1, B1)
2;3;=SUM(A2, B2)
3;4;=SUM(A3, B3)
4;5;=SUM(A4, B4)
5;6;=SUM(A5, B5)
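For reference, the corrected generator script would then look like this (the same loop as above, only with the padding spaces dropped so each formula field starts with '='):
#!/bin/bash
for (( i=1; i<=5; i++ ))
do
    # no space after ';' so that the third field starts directly with '='
    echo "$i;$((i+1));=SUM(A$i, B$i)" >> sum.csv
done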

Related

How do I parallelize a for-loop in Octave using pararrayfun (or any other function)?

Well, I'm new to Octave and I wanted to know how to implement parallel execution of a for loop in Octave.
I'm looking for a parallel implementation of the code below (it's not the exact code that I'm trying to execute, but something similar to this):
%read a csv file
master_sheet = csv2cell('master_sheet.csv');
delta = 0.001;
nprocs = nproc();

%extract some values from the csv file and store them in variables
a = master_sheet{34,2};
b = master_sheet{38,2};
c = master_sheet{39,2};

for i=0:1000
    %create variants of a, b and c by adding a delta value
    a_adj = a + i*delta;
    b_adj = b + i*delta;
    c_adj = c + i*delta;

    %combine the above variables into a single array
    array_abc = [a_adj, b_adj, c_adj];

    %send this array as an argument to processingData(), which performs
    %a series of calculations and writes the results to a file
    processingData(array_abc);
endfor
Currently, I'm using the parallel package (pararrayfun) to implement this, but if there is any other way (or package) that could achieve the parallelization of a for loop in Octave, then I'm open to exploring that as well.
Thank you!

Need help understanding this implementation of Tetris

I am trying to understand this implementation of Tetris.
I have a few questions.
In the update_score function:
if (( score > LEVEL_UP * level )); then   # if level should be increased
    ((level++))                           # increment level
    pkill -SIGUSR1 -f "/bin/bash $0"
What is the use of having a separate process at all for adjusting the delay? Why use SIGUSR1 and SIGUSR2?
In the draw_piece function, why multiply by 8? I don't understand how the conversion is taking place or how the concept of "rotation" is implemented here.
for ((i = 0; i < 8; i += 2)) {
    # relative coordinates are retrieved based on orientation and added to absolute coordinates
    ((x = $1 + ${piece[$3]:$((i + $4 * 8 + 1)):1} * 2))
    ((y = $2 + ${piece[$3]:$((i + $4 * 8)):1}))
    xyprint $x $y "$5"
    ...
}
Nor do I understand the syntax involving : here.
In clear_next, why is draw_next ${filled_cell//?/ } necessary instead of just ${filled_cell}? What do the // signify?
I'm a beginner to shell scripting and programming in general and I have been trying to understand this implementation of Tetris [in shell]
Somehow, I suspect you could have found easier programs to start with.
What is the use of having a separate process at all for adjusting the delay? Why use SIGUSR1 and SIGUSR2?
I don't think the separate process is there for adjusting the delay, but for implementing the timer. The timer must keep running even while the program is waiting for the player to give input, and if the shell doesn't give any way of having a timeout on read, that work must be delegated to another process. So you get what's at the end of the script: a split into the timer, the user input handler, and the actual game logic, with the output of the first two going to the last one:
(ticker & reader) | (controller)
Bash's read does have the -t flag for a timeout, so the extra timer process might not be strictly necessary here. However, putting the timer in an external process also makes it independent of the user input: a read timeout would instead reset every time the user hits a button. Working around that would require some way of accurately determining the elapsed time (or using a really short timeout on read and counting the ticks).
SIGUSR1 and SIGUSR2 are just "innocent" signals that don't have a meaning to the system at large, so they can be used here. Of course you could use others, but catching SIGINT or SIGHUP would annoy users if they wanted to stop the game.
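To make that structure concrete, here is a stripped-down sketch of the pipeline. The function names mirror the game's roles, but the bodies are illustrative only, not code from the actual game:
# two producers write into one pipe; a single consumer multiplexes them
ticker() {
    delay=1
    trap 'delay=0.5' USR1             # the game shortens the tick delay on SIGUSR1
    while true; do echo tick; sleep "$delay"; done
}
reader() {
    while IFS= read -rsn1 key; do echo "key $key"; done
}
controller() {
    while IFS= read -r cmd; do
        echo "controller got: $cmd"   # the game dispatches on $cmd here
    done
}
(ticker & reader) | controller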
In the draw_piece function, why multiply by 8?
((x = $1 + ${piece[$3]:$((i + $4 * 8 + 1)):1} * 2))
The piece array contains the different shapes and orientations of the pieces. A piece is 4 squares large, and each square needs two coordinates, so we get 8 digits per piece/orientation, which is why the code multiplies by 8. For example, the string for the S piece is 0001111201101120, so it has two orientations:
yx yx yx yx yx yx yx yx
00 01 11 12 01 10 11 20
And the piece looks something like this:
012 012
0 xx. 0 .x.
1 .xx 1 xx.
2 ... 2 x..
The ${variable:position:length} notation picks a substring from the given variable, so the program gets the single digits it needs from the bigger string. That's a somewhat weird way of implementing an array.
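For example, applied to the S piece string from above:
$ piece="0001111201101120"
$ echo "${piece:0:2}"   # first yx pair of the first orientation
00
$ echo "${piece:8:2}"   # first yx pair of the second orientation
01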
In clear_next, why is draw_next ${filled_cell//?/ } necessary ...? What do the // signify?
The ${parameter/foo/bar} construct is a pattern replacement (see e.g. Bash's manual on parameter expansion, look for "replace"). Whatever matches foo in the value of parameter is replaced with bar, and the result is expanded. With a double slash, all matches are replaced; with a single slash, only the first. The question mark matches any single character, as in filename globs, so the expansion effectively produces a string of spaces as long as the original string.
For example:
$ str="hallo hallo"
$ echo "${str/a/e}"
hello hallo
$ echo "${str//a/e}"
hello hello
$ str="abc"
$ echo "x${str//?/ }x"
x   x

Reading delimited values from a file into an array variable

I want to read data.txt, which contains a 2x2 matrix of numbers delimited by tabs, like this:
0.5 0.1
0.3 0.2
Is there any way to read this file in bash, store it in an array, process it a little, and then export it to a file again? For example, in MATLAB:
a = dlmread('data.txt');   % read file into array variable a
for i = 1:2
    for j = 1:2
        b(i,j) = a(i,j) + 100;
    end
end
dlmwrite('data2.txt', b);  % export array b to data2.txt
If the extent of your processing is something simple like adding 100 to every entry, a simple awk command like this might work:
awk '{ for(i = 1; i <= NF - 1; i++) { printf("%.1f%s", $i + 100, OFS); } printf("%.1f%s", $NF + 100, ORS); }' < data.txt
This just loops through each row and adds 100 to every field. It's possible to do more complex operations too, but if you really want to process matrices there are better tools (like Python+NumPy or Octave).
It's also possible to use bash arrays, but to do any of the operations you'd have to use an external program anyway, since bash doesn't handle floating point arithmetic.
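For completeness, here is a rough sketch of that combination, assuming the tab-delimited data.txt from the question: bash arrays hold the rows, and a small awk call does the per-entry floating-point addition that bash itself can't.
# read rows into a bash array, add 100 to each entry via awk, write data2.txt
mapfile -t rows < data.txt
for row in "${rows[@]}"; do
    out=()
    for val in $row; do                     # word-split each row on whitespace
        out+=( "$(awk -v v="$val" 'BEGIN { printf "%.1f", v + 100 }')" )
    done
    ( IFS=$'\t'; echo "${out[*]}" )         # re-join the row with tabs
done > data2.txt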

Performance of unpack combined with join in Perl

I have a parser written in Perl which parses files of fixed-length records. Part of each record consists of several strings (also fixed length) made up of numbers only. Each character in the string is encoded as a number, not as an ASCII character. I.e., if I have the string 12345, it's encoded as 01 02 03 04 05 (instead of 31 32 33 34 35).
I parse the record with unpack, and this particular part is unpacked as @array = unpack "C44", $s. Then I recover the needed string with a simple join, like $m = join("", @array).
I was wondering if that's an optimal way to decode. The files are quite big, millions of records, and obviously I tried to see if it's possible to optimize. The profiler shows that most of the time is spent parsing the records (i.e., reading, writing, and other stuff is not a problem), and within parsing most of the time is taken by these joins. I remember from other sources that join is quite an efficient operation. Any ideas whether it's possible to speed the code up further, or is it optimal already? Perhaps it would be possible to avoid the intermediate array in some clever way, e.g., by using a pack/unpack combination instead?
Edited: code example
The code which I try to optimise looks like this:
while (read(READ, $buf, $rec_l) == $rec_l) {
    my @s = unpack "A24 C44 H8", $buf;
    my $msisdn = substr $s[0], 0, 11;
    my $address = join("", @s[4..14]);
    my $imsi = join("", @s[25..39]);
    my $ts = localtime(hex($s[45]));
}
Untested (I'll come back and edit when I'm less busy) but this should work if I've done all of the math correctly, and be faster:
my ($msisdn, $address, $imsi, $ts) =
    unpack "A11 x13 x3 a10 x10 a15 x5 N", $buf;
$address |= "0" x 10;
$imsi |= "0" x 15;
$ts = localtime($ts);
As always in Perl, faster is less readable :-)
join("", unpack("C44", $s))
I don't believe this change would speed up your code. Everything depends on how often you call the join function while reading one whole file. If you're working in chunks, try increasing their size. If you're doing some operation on the array between the unpack and the join, try folding it into a map operation. If you post your source code, it would be easier to identify the bottleneck.
I'm a pack/unpack noob, but how about skipping the join by altering your sample code like so:
my $m = unpack "H*", $s ;
quick test:
#!/usr/bin/perl
use strict ;
use Test::More tests => 1 ;
is( unpack("H*", "\x12\x34\x56"),"123456");

Why is printf in F# so slow?

I've just been really surprised by how slow printf from F# is. I have a number of F# programs that process large data files and write out a number of CSV files. I originally started by using fprintf writer "%s,%d,%f,%f,%f,%s", thinking that it would be simple and reasonably efficient.
However, after a while I got a bit fed up with waiting for the files to process. (I've got 4 GB XML files to go through and write out entries from them.)
When I ran my applications through a profiler, I was amazed to see printf as being one of the really slow methods.
I changed the code to not use printf and now performance is so much better. Printf performance was killing my overall application performance.
To give an example, my original code is:
fprintf sectorWriter "\"%s\",%f,%f,%d,%d,\"%s\",\"%s\",\"%s\",%d,%d,%d,%d,\"%s\",%d,%d,%d,%d,%s,%d"
    sector.Label sector.Longitude sector.Latitude sector.RNCId sector.CellId
    siteName sector.Switch sector.Technology (int sector.Azimuth) sector.PrimaryScramblingCode
    (int sector.FrequencyBand) (int sector.Height) sector.PatternName (int sector.Beamwidth)
    (int sector.ElectricalTilt) (int sector.MechanicalTilt) (int (sector.ElectricalTilt + sector.MechanicalTilt))
    sector.SectorType (int sector.Radius)
And I've changed it to be the following
seq {
    yield sector.Label; yield string sector.Longitude; yield string sector.Latitude; yield string sector.RNCId; yield string sector.CellId;
    yield siteName; yield sector.Switch; yield sector.Technology; yield string (int sector.Azimuth); yield string sector.PrimaryScramblingCode;
    yield string (int sector.FrequencyBand); yield string (int sector.Height); yield sector.PatternName; yield string (int sector.Beamwidth);
    yield string (int sector.ElectricalTilt); yield string (int sector.MechanicalTilt);
    yield string (int (sector.ElectricalTilt + sector.MechanicalTilt));
    yield sector.SectorType; yield string (int sector.Radius)
}
|> writeCSV sectorWriter
Helper functions
let writeDelimited delimiter (writer:TextWriter) (values:seq<string>) =
    values
    |> Seq.fold (fun (s:string) v -> if s.Length = 0 then v else s + delimiter + v) ""
    |> writer.WriteLine

let writeCSV (writer:TextWriter) (values:seq<string>) = writeDelimited "," writer values
I'm writing out files with about 30,000 rows. Nothing special.
I am not sure how much it matters, but...
Inspecting the code for printf:
https://github.com/fsharp/fsharp/blob/master/src/fsharp/FSharp.Core/printf.fs
I see
// The general technique used in this file is to interpret
// a format string and use reflection to construct a function value that matches
// the specification of the format string.
and I think the word 'reflection' probably answers the question.
printf is great for writing simple type-safe output, but if you want good perf in an inner loop, you might want to use a lower-level .NET API to write output. I haven't done my own benchmarking to see.
TextWriter already buffers its output, so I recommend using Write to output each value one at a time, instead of formatting an entire line and passing it to WriteLine. On my laptop, writing 100,000 lines takes nearly a minute using your function, while using the following function it runs in half a second.
let writeRow (writer:TextWriter) siteName (sector:Sector) =
    let inline write (value:'a) (delim:char) =
        writer.Write(value)
        writer.Write(delim)
    let inline quote s = "\"" + s + "\""
    write (quote sector.Label) ','
    write sector.Longitude ','
    write sector.Latitude ','
    write sector.RNCId ','
    write sector.CellId ','
    write (quote siteName) ','
    write (quote sector.Switch) ','
    write (quote sector.Technology) ','
    write (int sector.Azimuth) ','
    write sector.PrimaryScramblingCode ','
    write (int sector.FrequencyBand) ','
    write (int sector.Height) ','
    write (quote sector.PatternName) ','
    write (int sector.Beamwidth) ','
    write (int sector.ElectricalTilt) ','
    write (int sector.MechanicalTilt) ','
    write (int (sector.ElectricalTilt + sector.MechanicalTilt)) ','
    write sector.SectorType ','
    write (int sector.Radius) '\n'
Now that F# 3.1 has been released in preview, the performance of printf is claimed to have increased by up to 40x. You might want to have a look at this:
F# 3.1 Compiler/Library Additions
Printf performance
The F# 3.1 core library sees improved performance of the printf family of functions for type-safe formatting. For example, printing using the following format string now runs up to 40x faster (though your exact mileage may vary):

sprintf "%d: %d, %x %X %d %d %s"

No changes in your code are needed to take advantage of this improved performance, though you do need to be using the F# 3.1 FSharp.Core.dll runtime component.
EDIT: This answer is only valid for simple format strings, like "%s" or "%d". See comments below.
It is also interesting to note that if you can make a curried function and reuse that, the reflection will only be carried out once. Sample:
let w = new System.IO.StringWriter() :> System.IO.TextWriter
let printer = fprintf w "%d"
let printer2 d = fprintf w "%d" d

let print1() =
    for i = 1 to 100000 do
        printer 2

let print2() =
    for i = 1 to 100000 do
        printer2 2

let time f =
    let sw = System.Diagnostics.Stopwatch()
    sw.Start()
    f()
    printfn "%s" (sw.ElapsedMilliseconds.ToString())

time print1
time print2
print1 takes 48 ms on my machine while print2 takes 1158 ms.
