gnuplot slow when plotting large data set as animation - performance

I'm trying to make an "animated" plot of a lot of data (the positions of 1000 particles) from a big text file with a script like:
set terminal wxt size 1000,600
k=999999
N = 999
do for [i=0:k]{
plot for [j=0:N-1] "pos.txt" using 2*j+1:2*j+2 every ::2*i+1::2*i+1 ls 1 pt 7 ps 2 notitle
}
In the file, every line has the coordinates X and Y at a certain time of the points I want to plot. I'm using every to plot all the data in each row once and then move on to the next row.
The output is something like this (1000 particles moving)
However, the plotting is way too slow and I don't know what I can do to make it faster. It plots one row every 5 seconds or more. The file weighs a few MB. Should I change the terminal? Or the way I store the data? I think there might be a problem when gnuplot loads a big file.
Some particles disappear during the simulation, so I also get the warning line 14: warning: Skipping data file with no valid points when the index j (well, 2j+1) goes beyond the number of particles. I tried making the script read the number of particles each time, but that is even slower. Many thanks.

I suspect gnuplot is reading the whole file every time you plot, as opposed to reading up to the line in question, then the next line, and so on. One possible strategy is to split your particle trajectories into separate files, but what should help most is to replace the plot for with a single plot plus a block selection with every: instead of selecting the columns for each particle, you put all the particle positions for the same time step in the same block.
Now your data looks something like this:
x1 y1 x2 y2 x3 y3 # Time step 1
x1 y1 x2 y2 x3 y3 # Time step 2
And gnuplot needs to read the file once for every time step and particle. If you structure the file as follows (note one blank line between blocks):
# Time step 1
x1 y1
x2 y2
x3 y3

# Time step 2
x1 y1
x2 y2
x3 y3
Then you don't need the plot for; instead, just select the corresponding block with all the particles by inserting one extra colon in every:
set terminal wxt size 1000,600
k=999999
#N = 999 you don't need this anymore!
do for [i=0:k] {
plot "pos.txt" every :::i::i
}
The code above reads the file for every time step, rather than every time step and particle, and plots all the particles at once.
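If it helps, the existing file can be converted to the block layout with a small script. This is only a sketch: the function name wide_to_blocks is made up for illustration, and it assumes whitespace-separated values with optional trailing # comments. Rows with fewer particles simply produce shorter blocks, which also avoids the "no valid points" warning for particles that have disappeared.

```python
def wide_to_blocks(lines):
    """Convert rows 'x1 y1 x2 y2 ...' into blocks of 'x y' lines.

    Each input row is one time step. Each output block holds one
    particle per line, and blocks are separated by one blank line.
    """
    out = []
    for line in lines:
        # Drop a trailing comment, then split on whitespace.
        fields = line.split("#", 1)[0].split()
        if not fields:
            continue  # skip empty or comment-only rows
        if out:
            out.append("")  # one blank line between blocks
        # Pair up consecutive values; an unpaired trailing value is ignored.
        for x, y in zip(fields[0::2], fields[1::2]):
            out.append(f"{x} {y}")
    return out


if __name__ == "__main__":
    rows = ["1 2 3 4 5 6 # Time step 1", "7 8 9 10 # Time step 2"]
    print("\n".join(wide_to_blocks(rows)))
```

For a real file you would read the rows from pos.txt and write the result to a new file, then point the plot command with every :::i::i at that new file.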

If performance is critical, you may consider using a completely different data format. Although changing the format of the ASCII file gives a huge improvement, it scales badly, because gnuplot must always scan from the beginning of the data file to find the position where it should start. In my tests, plotting the first 1000 frames took 60s, whereas plotting frames 9000 to 10000 took 600s.
You need a data format that lets you seek to any data set in constant time. In my thesis I saved all my experimental data (huge data sets) with hdf5; you can then use the external utility h5totxt to extract the desired data set. Here, the position of the requested data set can be calculated without scanning the whole file, and the access time is independent of the frame number.
For testing I used the following python script to generate a test data file points.h5:
from numpy import random
import h5py
P = random.normal(size=(10000,1000,2))
# The with block closes the file, so the data is flushed to disk
with h5py.File('points.h5', 'w') as f:
    f.create_dataset('points', data=P)
The gnuplot script for plotting is
set terminal wxt size 1000,600
k=9999
do for [i=0:k] {
plot sprintf("< h5totxt -s ' ' -x %d points.h5", i) using 1:2 ls 1 pt 7 ps 2 title sprintf("%d", i)
}
Now, plotting of 1000 frames takes 40s, no matter which frames you take (0-1000 or 9000-10000).
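To illustrate why the access time no longer depends on the frame number, here is a minimal standard-library sketch of the same idea. It is not HDF5 and not what h5totxt does internally. It just stores every frame as fixed-size binary records, so the byte offset of frame i can be computed directly. The names write_frames and read_frame and the record layout (two little-endian doubles per particle) are assumptions made for this example.

```python
import struct

# One particle: x and y stored as little-endian doubles (assumed layout).
RECORD = struct.Struct("<2d")


def write_frames(path, frames):
    """Write frames back to back; every frame must have the same length."""
    with open(path, "wb") as fh:
        for frame in frames:
            for x, y in frame:
                fh.write(RECORD.pack(x, y))


def read_frame(path, index, n_particles):
    """Read frame `index` by seeking straight to its byte offset."""
    frame_bytes = n_particles * RECORD.size
    with open(path, "rb") as fh:
        fh.seek(index * frame_bytes)  # no scan over the earlier frames
        data = fh.read(frame_bytes)
    return [RECORD.unpack_from(data, k * RECORD.size)
            for k in range(n_particles)]
```

The trade-off is that every frame must contain the same number of particles. A format like HDF5 handles the bookkeeping for you and is the better choice for real data.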

Related

Efficiently plot large coastline file with gnuplot

I have a large data file, coastlines.csv, which represents the coastlines of the world, in this format
-4.94237 55.725449
-4.941922 55.725585
....
where each row represents a point: the first entry is the longitude and the second is the latitude, both in degrees. The file has ~ 3 x 10^6 lines, and weighs ~ 700 MB.
Plotting this file with gnuplot
plot 'coastlines.csv'
takes some time, which is understandable. When I make the plot above and then click on it, draw a square with the mouse and zoom to replot a small region of the world only, the new plot takes the same amount of time as the full one.
I have the impression that gnuplot is checking all the points in the file again, because it does not know which ones will fall within the new plotting window.
Is there a way to speed up this replot?
Thanks!

Gnuplot: Making a gif of a map generated with matrix?

I generated a .dat file with 100 15x15 matrices, and now I want to create a gif which shows the evolution from the first to the last matrix. All the matrices contain only 1 or -1, so if I want to display the initial matrix I can copy and paste it into another file and run this in gnuplot:
plot 'firstmatrix.dat' matrix with image
It represents the 1, -1 matrix with yellow and black.
To create the gif I'm trying to do this in gnuplot:
set terminal gif animate delay 20
set output 'evolution.gif'
set xrange [0:15]
set yrange [0:15]
N=15
nframes=5
do for [i=1:int(nframes)] {
plot 'evolution.dat' every ::(i-1)*N+1::i*N matrix with image
}
I intend to read from the first line of the file to the 15th line, then from the 16th to the 30th and so on.
I set only 5 frames to see the result better, and the gif shows the first matrix in the first frame and nothing more, only white frames.
The error message is four times this one:
warning: Skipping data file with no valid points
So the data for the first frame, the first matrix, is processed correctly but the rest is not. That is my problem: I don't know why it processes the first one correctly and none of the others.
Thanks in advance.
It shows only the first matrix in the first frame
You were pretty close, but it also took me a few iterations of testing...
Apparently, slicing a block of rows from a matrix requires every :::rowFirst::rowLast (mind the 3 colons at the beginning). And then gnuplot apparently takes the row index of the whole matrix as y-coordinate. Since you want to have it "on top of each other" you need the modulo operator % (check help operators binary). It might have been a bit easier if your matrices were separated by one or two empty lines.
Code:
### animated matrix data
reset session
### create some random data
set print $Data
do for [n=1:20] {
    do for [y=1:15] {
        Line = ''
        do for [x=1:15] {
            Line=Line.sprintf("% 3g",int(rand(0)*2)*2-1)
        }
        print Line
    }
}
set print
set terminal gif animate delay 30
set output "tbMatrixAnimated.gif"
unset key
N=15
do for [i=1:20] {
plot $Data u 1:(int($2)%N):3 matrix every :::N*(i-1)::N*i-1 with image
}
set output
### end of code
Result: (only 20 matrices)
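The index arithmetic in the plot command can be checked outside gnuplot. The small sketch below only reproduces the two expressions N*(i-1)::N*i-1 and int($2)%N. The helper names frame_rows and local_row are made up for illustration.

```python
def frame_rows(i, n):
    """Return (first, last) zero-based row indices of frame i (i starts at 1)."""
    return n * (i - 1), n * i - 1


def local_row(global_row, n):
    """Map a row index of the whole file to a row index inside its frame."""
    return global_row % n


if __name__ == "__main__":
    for i in (1, 2, 3):
        first, last = frame_rows(i, 15)
        print(i, first, last, local_row(first, 15), local_row(last, 15))
```

Every frame maps back to local rows 0 to N-1, which is why all the matrices end up drawn at the same y position.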

Syntax of animating in gnuplot using 'blocks' in a single file

I have a data file of two column, ten row ‘blocks’, with two lines of whitespace between each block. Each frame of the animation I want the ten points in the successive block to be plotted, until the end of the data file.
I've searched for how to do this for ages but can't appropriate any of the examples I've found to my case as I don't understand the syntax and can't find an explanation of it anywhere.
How would the example here or here be extended to blocks of x rows?
E.g., in the second example, pasted below for easy reference
n=10 # n present here the number of blocks in your file
plot "output.dat" using 1:2 every :::i::i
i=i+1
if (i<n) reread
What does the number of colons in every :::i::i mean? Is that three data lines, then two whitespace lines? (Adapting the example on that assumption doesn't work.)
(If this question seems too obvious, I assure you it is due to my lack of knowledge, not my lack of effort in researching. I would very gladly accept being pointed towards the place in the documentation where this is covered.)
This is not an answer regarding the syntax of every, but a way to achieve this animated plot that is scalable for future users in my position.
A dataset, or block, is x consecutive lines of data, separated from the next one by exactly two blank lines.
The plot command option index can be used to access each of these blocks.
For example, plot "datafile.dat" using 1:2 index 0 would plot only the points in the first dataset (block of data), because index counts from 0.
A loop can be used to animate your data. The stats command can be used to find the number of datasets/blocks in your file, to use in the loop.
set terminal x11
stats 'bdata.txt' nooutput
set xrange [0:10]
set yrange [0:10]
do for [a = 0: int(STATS_blocks - 1)] {
plot "bdata.txt" using 1:2 index a
pause 0.1
}
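For anyone still puzzled by the difference between index and every, the sketch below models how a data file is split up. Two or more consecutive blank lines start a new data set (what index selects), and a single blank line starts a new block inside the current data set (what the block fields of every select). It is a simplified model written for illustration, not gnuplot's actual parser. The function name split_datasets is hypothetical, and the input is assumed to contain no comment lines.

```python
def split_datasets(text):
    """Split data-file text into data sets and blocks.

    Returns a list of data sets; each data set is a list of blocks;
    each block is a list of non-blank lines.
    """
    datasets = []
    blanks = 2  # treat the start of the file as a data-set boundary
    for line in text.splitlines():
        if not line.strip():
            blanks += 1
            continue
        if blanks >= 2:
            datasets.append([[]])      # new data set with a first block
        elif blanks == 1:
            datasets[-1].append([])    # new block in the same data set
        datasets[-1][-1].append(line)
        blanks = 0
    return datasets


if __name__ == "__main__":
    sample = "1 1\n2 2\n\n3 3\n\n\n4 4\n"
    for number, dataset in enumerate(split_datasets(sample)):
        print("index", number, "->", dataset)
```

With the file from the question (ten rows, then two blank lines), every group of ten rows is its own data set, so index a picks exactly one frame.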

Drawing a graph in linux

I have a file that contains the mapping between two entities. I have to plot it as a graph with an X and a Y axis on Linux. The X axis is time, like (12:35:07), and the Y axis has integer values in the range (1-14). I need to produce the plot with some method that can be called from a Linux shell script. The X axis will have around 500 to 800 samples in total, say from 12:14:00 to 12:30:00. Can anyone please help me out? I tried using gnuplot, but the graph does not come out right. Sample input is given below
12:34:58 5
12:35:06 9
12:35:07 14
12:35:07 13
12:35:08 4
12:35:08 5
12:35:17 9
12:35:17 13
12:35:18 14
12:35:19 4
12:35:19 5
This is what I have written
set terminal png
set output 'test.png'
set xdata time
set timefmt "%H:%M:%S"
set yrange [1:15]
plot "Graph1" using 1:2 title 'data A'
The problem I have is that since the image has a large number of samples (around 700 intervals), I am not able to see each value properly. Also the Y axis is not right. I need to restrict the Y axis to only 10 values (1-10). Also I need to draw 4 similar graphs and place each graph in its own corner (subplots). I have not done the subplots yet in gnuplot.
It seems that you have three main issues right now. For some of them the issue is not totally clear, so it would help if you could update your question (or make a comment) to clarify, and I can update my response.
1) Not able to see each of ~700 values properly.
Here I need a bit of clarification. I'm assuming you don't want to see each of 700 data points individually, rather you want to be able to get rid of clutter on the x axis which happens when plotting time values. For this you can use the command
set xtics X
where X is the interval between tics you want, in this case a number of seconds. Is this what you want?
You can also use
set format x '%M:%S'
(or something) if you want to control how the x values are displayed on the axis.
2) y axis needs to be restricted to 10 values.
Here also it's not totally clear what you are looking for. Do you want to scale everything down to be between 1 and 10? The range of your data presented is 4 to 14, so by subtracting four you can get things between 0 and 10 like this:
plot "Graph1" using 1:($2-4) title 'data A'
If you want to constrict an arbitrary y data set to be between 1 and 10, that's a little more tricky:
stats "Graph1"
plot "Graph1" using 1:(($2-STATS_min_y)*9/(STATS_max_y-STATS_min_y)+1) title 'data A'
The stats command gets statistics about a file before you plot it. The convoluted plot command should scale everything to be between 1 and 10 on the y axis.
3) You want to have 4 subplots.
This one is pretty easy. Before your first plot command, use the command
set multiplot layout 2,2
This will create a 2x2 grid for your plots. Every plot command will plot on a new subplot. Type help set multiplot at the gnuplot command line for details on changing the sequence of subplots used.
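The scaling formula from point 2 is easy to sanity-check outside gnuplot. The sketch below applies the same expression, (y - min) * 9 / (max - min) + 1, to the y values from the sample input. The function name rescale and its handling of constant data are assumptions made for this example.

```python
def rescale(values, lo=1.0, hi=10.0):
    """Linearly map values so that min(values) -> lo and max(values) -> hi."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # Assumption: constant data has no spread, so map everything to lo.
        return [lo for _ in values]
    span = hi - lo
    return [(v - vmin) * span / (vmax - vmin) + lo for v in values]


if __name__ == "__main__":
    sample_y = [5, 9, 14, 13, 4, 5, 9, 13, 14, 4, 5]
    print(rescale(sample_y))
```

The smallest value in the file lands on 1 and the largest on 10, with everything else spaced proportionally in between.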

Animate trajectory using Octave

I have a set of (x,y) coordinates that describe the trajectory of an object. I'd like to animate this trajectory using GNU Octave.
The data set is quite large so I won't be able to redraw the entire plot at every iteration if I want the animation to be smooth. What functions are there that would allow me to "update" a plot rather than redraw it?
Also, I have another set of (vx,vy) points, which describe the speed of the object. I'd like my animated trajectory to take speed into account. What function should I use to have the program sleep for a couple of milliseconds as to make the trajectory animate at the same speed as the object?
(I already know Octave has functions such as comet, but I need to write my own animator.)
Edit: Here's what I have up until now. I expected this to run too fast and require me to use pause, but it's still pretty slow (x and y have 10001 elements).
bounds = [min(x) max(x) min(y) max(y)];
axis(bounds);
hold on
for k = 2 : length(x)
plot(x(k-1:k), y(k-1:k));
drawnow("expose");
end
hold off
You can use the set command to change just the XData and YData data for a certain plot object h:
h = plot(my_xdata(1), my_ydata(1));  % Octave indices start at 1
for i_ = 1:length(my_xdata)
  set(h, 'YData', my_ydata(i_));
  set(h, 'XData', my_xdata(i_));
  pause(sqrt(vx(i_)^2 + vy(i_)^2));
end
The pause(x) command pauses for x seconds, which can be less than 1.
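One possible way to choose the pause lengths, if the animation should move at the object's real speed, is to divide the distance between consecutive points by the speed at that point. This is an assumption about what is wanted, not something the loop above does, since it pauses for the speed itself. The sketch is in Python for brevity, and the name step_delays and the min_speed guard are made up for illustration.

```python
import math


def step_delays(x, y, vx, vy, min_speed=1e-9):
    """Return the time in seconds to travel from point k-1 to point k.

    Assumes x, y, vx, vy have the same length and that the speed at
    point k is a fair estimate for the whole step.
    """
    delays = []
    for k in range(1, len(x)):
        dist = math.hypot(x[k] - x[k - 1], y[k] - y[k - 1])
        # Guard against division by zero when the object is at rest.
        speed = max(math.hypot(vx[k], vy[k]), min_speed)
        delays.append(dist / speed)
    return delays
```

Each delay could then be passed to pause in the animation loop, possibly multiplied by a constant factor to speed up or slow down playback.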
I think you are looking for the "hold" command. Holding the plot keeps all previous data on the plot, and the new data is added on top.
