Unix - Sorting based on field position [duplicate] - shell

I am trying to sort a file based on field position using the following command:
sort -d -k 2.42,44 -k 4.65,74
But when I check whether the output file is sorted (sort -c), one of the lines is always reported as out of order. The file to be sorted is given below:
000835NMSEFT 20140626095038403536
010835 577300053D0000000031860CZZBCCP01122000496 DA 2270000323 133052649999999999901121000358 DA 1235565067 201406301EG11388791 1953760980999999999 TN HCCLAIMPMT 40520140625PRREEP 21555 OXNARD STREET WOODLAND HILLS CA91367 BLEDI SOLUTIONS TE8002273983 IC URwww.anthem.com/provider/router PETWIN CITIES COMMUNITY HOS XX1396778197 FILE 57446 LOS ANGELES CA90074 TJ 752918765-0506330000-000 PN 050633-003 EY DCCP13305264991953760980 050633-003NMSINS 201406255004000270000323 0003230001 2014I 050633-003 00004 01505000323050633-003 00004
000835NMSEFT 20140626095038403536
010835 577300054D0000000002162CZZHCCP01122000496 DA 2270000323 133052649999999999901122000030 DA 1452500273 201406301EG11388792 1953760980999999999 TN HCCLAIMPMT 40520140625PRREEP 21555 OXNARD STREET WOODLAND HILLS CA91367 BLEDI SOLUTIONS TE8002273983 IC URwww.anthem.com/provider/router PESCRIPPS CLINIC XX1538297726 10666 N TORREY PINES RD S LA JOLLA CA92037 TJ 870737749-0000000000-000 BY DCCP13305264991953760980 870737749 NMSPRF 201406254004000270000323 0003230001 2014P 870737749-00300010 01405000323870737749-00300010
000835NMSEFT 20140626095038403536
010835 577300074D0000000002000CACHCCP01191179439 DA 3350209432 191179439099999999901121000358 DA 1459336554 201406301EG11388801 1953760980999999999 TN HCCLAIMPMT 40520140625PRALVORD UNIFIED SCHOOL DISTRICT 21555 OXNARD STREET WOODLAND HILLS CA91367 BLEDI SOLUTIONS TE8002273983 IC URwww.anthem.com/provider/router PEUSC CARE MEDICAL GROUP IN XX1902846306 50938 FILE LOS ANGELES CA90074 TJ 954540991-0000000000-000 EY DCCP19117943901953760980 954540991 NMSPRF 201406254022603350209432 2094320001 2014P 954540991 00045 01403209432954540991 00045
000835NMSEFT 20140626095038403536
010835 577300075D0000000021553CABHCCP01122000496 DA 2640001855 194147704099999999901121141819 DA 0009605886 201406301EG11388802 1953760980999999999 TN HCCLAIMPMT 40520140625PRHOOPA VALLEY TRIBAL COUNCIL 21555 OXNARD STREET WOODLAND HILLS CA91367 BLEDI SOLUTIONS TE8002273983 IC URwww.anthem.com/provider/router PEEUREKA FAMILY PRACTICE IN XX1114965761 2675 HARRIS ST EUREKA CA95503 TJ 941724607-0000000000-000 PN 941724 BY DCCP19414770401953760980 941724607 NMSPRF 201406254023202640001855 0018550001 2014P 941724607 00013 01405001855941724607 00013
000835NMSEFT 20140626095038403536
010835 577300076D0000000004819CABBCCP01122000496 DA 2640001855 194147704099999999901121141819 DA 990487159 201406301EG11388803 1953760980999999999 TN HCCLAIMPMT 40520140625PRHOOPA VALLEY TRIBAL COUNCIL 21555 OXNARD STREET WOODLAND HILLS CA91367 BLEDI SOLUTIONS TE8002273983 IC URwww.anthem.com/provider/router PEHUMBOLDT RADIOLOGY MED GR XX1326154196 PO BOX 6428 EUREKA CA95502 TJ 942188987-0000000000-000 EY DCCP19414770401953760980 942188987 NMSPRF 201406254023202640001855 0018550001 2014P 942188987 00004 01405001855942188987 00004
000835NMSEFT 20140626095038403536
010835 577300079D0000000010320CDAHCCP01122000496 DA 4000155328 133017499699999999901323070380 DA 485008267910 201406301EG11388804 1953760980999999999 TN HCCLAIMPMT 40520140625PRVIASAT, INC 21555 OXNARD STREET WOODLAND HILLS CA91367 BLEDI SOLUTIONS TE8002273983 IC URwww.anthem.com/provider/router PEWEST PACIFIC MEDICAL LABO XX1124050380 PO BOX 2339 SANTA FE SPGS CA90670 TJ 954391299-0000000000-000 BY DCCP13301749961953760980 954391299 NMSPRF 201406254025104000155328 1553280001 2014P 954391299 00025 01405155328954391299 00025

Two problems. The number after the decimal is a character position within the field (not from the beginning of the line). And you need to repeat the field number after the commas. So:
sort -k 2.25,2.27 -k 4,4
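To make the field.character notation more concrete, here is a rough Python sketch of how such a key could be extracted. It assumes the default behaviour of sort without -t or -b, where each field keeps the blanks that precede it, and it compares plain code points (similar to LC_ALL=C). The helper names split_fields, key_slice and sort_key are hypothetical, and the records are the first four fields of lines from the question.

```python
import re


def split_fields(line):
    # Without -t or -b, each field keeps the blanks that precede it.
    return re.findall(r"\s*\S+", line)


def key_slice(line, field, first=1, last=None):
    """Characters first..last (1-based, inclusive) of the given 1-based field."""
    fields = split_fields(line)
    if field > len(fields):
        return ""  # a missing field is treated as an empty key
    return fields[field - 1][first - 1:last]


def sort_key(line):
    # Mirrors: sort -k 2.25,2.27 -k 4,4
    return (key_slice(line, 2, 25, 27), key_slice(line, 4))


# First four fields of some records from the question (truncated for brevity).
records = [
    "010835 577300053D0000000031860CZZBCCP01122000496 DA 2270000323",
    "010835 577300074D0000000002000CACHCCP01191179439 DA 3350209432",
    "010835 577300075D0000000021553CABHCCP01122000496 DA 2640001855",
]

for record in sorted(records, key=sort_key):
    print(sort_key(record), record)
```

Because the leading blank counts as character 1 of the field, position 25 here is the 24th visible character of field 2. With -b the blank would be skipped and the positions would shift by one.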

Related

Opengl matrix without glm [duplicate]

Does anybody know the element order of a 4x4 GLfloat array matrix for transforming a 2D rectangle? I don't want to use glm or cglm to make my life easier; I'm trying to use as few libraries as possible.
Is the order something like this:
{ px, sx, rx, 0, py, sy, ry, 0, pz, sz, rz, 0, 0, 0, 0, 1 } ?
If not what is it?
Thanks!
A 4x4 matrix is for 3D.
Xx Yx Zx Tx
Xy Yy Zy Ty
Xz Yz Zz Tz
0 0 0 1
(Xx, Xy, Xz) - left (or right) vector
(Yx, Yy, Yz) - up vector
(Zx, Zy, Zz) - forward vector
(Tx, Ty, Tz) - translation (position) vector
indices:
m00 m01 m02 m03
m10 m11 m12 m13
m20 m21 m22 m23
m30 m31 m32 m33
Order (column by column, i.e. column-major): m00, m10, m20, m30, m01, m11, m21, m31, m02, m12, m22, m32, m03, m13, m23, m33
If you need only 2D transformations you can use a 3x3 matrix.
Xx Yx Tx
Xy Yy Ty
0 0 1
Order: m00, m10, m20, m01, m11, m21, m02, m12, m22
Or you want this?
Xx Yx 0 Tx
Xy Yy 0 Ty
0 0 1 0
0 0 0 1
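As an illustration of the layout above, the following Python sketch builds the 16 values for a 2D translate, rotate and scale transform in column-major order. It assumes the array is handed to OpenGL without transposition (for example with the transpose argument of glUniformMatrix4fv set to GL_FALSE). The function names transform_2d and apply_to_point are hypothetical and only serve to check the ordering.

```python
import math


def transform_2d(tx, ty, angle_rad=0.0, sx=1.0, sy=1.0):
    """16 floats, column-major, for translate * rotate * scale in the XY plane."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    # Written column by column: X axis, Y axis, Z axis, translation.
    return [
        c * sx,  s * sx, 0.0, 0.0,   # m00 m10 m20 m30
        -s * sy, c * sy, 0.0, 0.0,   # m01 m11 m21 m31
        0.0,     0.0,    1.0, 0.0,   # m02 m12 m22 m32
        tx,      ty,     0.0, 1.0,   # m03 m13 m23 m33
    ]


def apply_to_point(m, x, y):
    """Transform the 2D point (x, y, 0, 1) with a column-major 4x4 matrix."""
    return (m[0] * x + m[4] * y + m[12],
            m[1] * x + m[5] * y + m[13])


matrix = transform_2d(3.0, 4.0, angle_rad=math.pi / 2)
print(apply_to_point(matrix, 1.0, 0.0))
```

In this layout the translation occupies elements 12, 13 and 14 (the last column), not the first element of each group of four as in the array proposed in the question.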

From Front, Left & Up Vector to read Pitch, Yaw & Roll

I have a CSV file with:
FrontX
FrontY
FrontZ
LeftX
LeftY
LeftZ
UpX
UpY
UpZ
And I want to convert that to roll, yaw and pitch, but I do not know the formula for this.

how to collect maximum and minimum values from a grep search pattern?

I have a huge log.txt file from which I need to calculate the maximum and minimum throughput values.
As a first step, I collected the relevant lines into a separate file with the grep command below:
grep "x=0 y=0 Time=" log.txt > new_log.txt
So now I have a new_log.txt file that contains only lines matching the pattern below.
From this file, I need to calculate (in bash) the maximum and minimum Throughput values:
0-00:42:35.598 <tracing>:[DEFAULT]:[PRINT]:0-00:42:35.598 (7420:428:512:32 = 74.205024) <0xfa91>:Mbps: x=0 y=0 Time=200 ms Throughput=98.5463 Mbit/s
0-00:42:35.598 <tracing>:[DEFAULT]:[PRINT]:0-00:42:35.598 (7420:428:512:32 = 74.205024) <0xfa91>:Mbps: x=0 y=0 Time=200 ms Throughput=12.2456 Mbit/s
0-00:42:35.598 <tracing>:[DEFAULT]:[PRINT]:0-00:42:35.598 (7420:428:512:32 = 74.205024) <0xfa91>:Mbps: x=0 y=0 Time=200 ms Throughput=42.5675 Mbit/s
0-00:42:35.598 <tracing>:[DEFAULT]:[PRINT]:0-00:42:35.598 (7420:428:512:32 = 74.205024) <0xfa91>:Mbps: x=0 y=0 Time=200 ms Throughput=35.7289 Mbit/s
0-00:42:35.598 <tracing>:[DEFAULT]:[PRINT]:0-00:42:35.598 (7420:428:512:32 = 74.205024) <0xfa91>:Mbps: x=0 y=0 Time=200 ms Throughput=79.5687 Mbit/s
Desired output:-
maximum Throughput=98.5463 Mbit/s
minimum Throughput=12.2456 Mbit/s
Any support is appreciated.
You can use awk for that purpose.
E.g.:
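awk '
# Field 11 is "Throughput=<value>" in the whitespace-separated layout shown above.
{
  # Drop the 11-character "Throughput=" prefix; adding 0 forces a numeric comparison.
  v = substr($11, 12) + 0;
  if (NR == 1 || v < mi) mi = v;
  if (NR == 1 || v > ma) ma = v;
}
END {
  printf("maximum Throughput=%.4f Mbit/s\n", ma);
  printf("minimum Throughput=%.4f Mbit/s\n", mi);
}' new_log.txt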
You can also run it directly on the output of grep through a pipe, so you do not need to store the intermediate file.
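For comparison, the same idea can be sketched in Python by matching the Throughput= value with a regular expression instead of relying on a field position. The name throughput_range is hypothetical, and the sample lines are copied from the question.

```python
import re

# Captures the number that follows "Throughput=".
THROUGHPUT = re.compile(r"Throughput=([0-9]+(?:\.[0-9]+)?)")


def throughput_range(lines):
    """Return (minimum, maximum) throughput, or None if no line matches."""
    values = []
    for line in lines:
        match = THROUGHPUT.search(line)
        if match:
            values.append(float(match.group(1)))
    if not values:
        return None
    return min(values), max(values)


sample = [
    "0-00:42:35.598 <tracing>:[DEFAULT]:[PRINT]:0-00:42:35.598 (7420:428:512:32 = 74.205024) <0xfa91>:Mbps: x=0 y=0 Time=200 ms Throughput=98.5463 Mbit/s",
    "0-00:42:35.598 <tracing>:[DEFAULT]:[PRINT]:0-00:42:35.598 (7420:428:512:32 = 74.205024) <0xfa91>:Mbps: x=0 y=0 Time=200 ms Throughput=12.2456 Mbit/s",
    "0-00:42:35.598 <tracing>:[DEFAULT]:[PRINT]:0-00:42:35.598 (7420:428:512:32 = 74.205024) <0xfa91>:Mbps: x=0 y=0 Time=200 ms Throughput=42.5675 Mbit/s",
]

result = throughput_range(sample)
if result is not None:
    lowest, highest = result
    print(f"maximum Throughput={highest} Mbit/s")
    print(f"minimum Throughput={lowest} Mbit/s")
```

Matching on the Throughput= label keeps the sketch working even if the number of fields before it changes.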

Processing 2d scatter plot

How do I load a .csv file and use the data in the table to create a 2D scatter plot?
The data inside the table is
Name X Y Group Gender Year of Birth Grade
Victor Anderson 627 705 2 Female 2000 6
Jack Scott 808 643 1 Male 2002 4
Sean Robinson 624 627 1 Male 2002 4
William Rodriguez 423 396 1 Female 2004 2
Aaron Kelly 775 181 0 Female 2005 1
Raymond Taylor 433 731 1 Female 2000 1
Alan Foster 635 580 2 Male 2002 4
Charles Watson 884 262 0 Female 2003 3
Lillian Perez 334 190 4 Male 2005 1
Betty Moore 727 524 1 Female 2003 3
Bruce Adams 503 684 0 Male 2001 2
Kathryn Sanchez 284 246 0 Male 2001 2
Chris Hall 189 223 3 Male 2000 6
Eugene Harris 196 220 3 Female 2000 5
Gary Baker 707 559 0 Female 2001 5
Michael Ramirez 299 657 2 Male 2000 1
Walter Smith 400 755 0 Male 2005 5
Ann Murphy 0 895 1 Male 2003 3
Lois Thompson 783 631 1 Female 2003 2
Louis Jones 674 589 1 Female 2003 3
Tammy Bell 828 263 3 Male 2002 4
Carolyn Bailey 557 23 4 Male 2004 2
Larry Campbell 444 627 2 Female 2004 2
To draw a simple scatter plot from the data in the csv we need to:
Ensure X and Y columns can be read as numbers
Normalize or scale the X and Y values to the canvas size
Plot a point for each row at its scaled x and y position
Add labels
For the first item we could add code to clean up X and Y and turn them into numbers. Below I have assumed that X and Y are always integers and removed the extra white space, so we can just call row.getInt().
The data is saved like this:
Name,X,Y,Group,Gender,Year of Birth,Grade
Victor Anderson,627,705,2,Female,2000,6
Jack Scott,808,643,1,Male,2002,4
Notice that extra white space has been removed so that we can call table and row methods without special cleanup code.
Table table;
int xMin;
int yMin;
int xMax;
int yMax;

void setup() {
  size(900, 900);
  table = loadTable("plot_data.csv", "header");
  background(0);
  fill(255);
  // we get the minimum and maximum values for X and Y so we can scale to our canvas
  xMin = getMin(table, "X");
  yMin = getMin(table, "Y");
  xMax = getMax(table, "X");
  yMax = getMax(table, "Y");
}

void draw() {
  background(0);
  boolean labelDrawn = false;
  for (TableRow row : table.rows()) {
    int x = row.getInt("X");
    // the map function takes our value and maps it to a value that fits on our canvas
    float xPos = map(x, xMin, xMax, 0, width);
    int y = row.getInt("Y");
    float yPos = map(y, yMin, yMax, 0, height);
    ellipse(xPos, yPos, 10, 10);
    // Here we put a name label on the plotted data when the mouse is over it
    if (dist(mouseX, mouseY, xPos, yPos) < 10 && !labelDrawn) {
      fill(255);
      text(row.getString("Name"), mouseX - 10, mouseY - 10);
      labelDrawn = true; // only draw one label at a time
    }
  }
}

int getMin(Table table, String col) {
  int min = Integer.MAX_VALUE;
  for (TableRow row : table.rows()) {
    if (row.getInt(col) < min) {
      min = row.getInt(col);
    }
  }
  return min;
}

int getMax(Table table, String col) {
  int max = Integer.MIN_VALUE;
  for (TableRow row : table.rows()) {
    if (row.getInt(col) > max) {
      max = row.getInt(col);
    }
  }
  return max;
}
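The scaling step is plain linear interpolation. The Python sketch below shows the arithmetic that a call such as map(x, xMin, xMax, 0, width) performs. The name linear_map is hypothetical, and the guard against an empty input range is an addition of this sketch rather than something Processing does. The numbers used are the X and Y extremes of the table in the question and the 900 pixel canvas.

```python
def linear_map(value, in_min, in_max, out_min, out_max):
    """Rescale value from [in_min, in_max] to [out_min, out_max]."""
    if in_max == in_min:
        # Degenerate input range (assumption of this sketch): use the middle.
        return (out_min + out_max) / 2
    t = (value - in_min) / (in_max - in_min)
    return out_min + t * (out_max - out_min)


# X runs from 0 to 884 in the sample data; the canvas is 900 pixels wide.
print(linear_map(442, 0, 884, 0, 900))

# Swapping the output bounds flips the axis, so larger Y values land
# nearer the top of a canvas whose y coordinate grows downward.
print(linear_map(895, 23, 895, 900, 0))
```

Swapping the last two arguments is one way to draw the plot with Y increasing upward, since canvas coordinates in Processing grow downward.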

For loop crashes in gnuplot

I am experiencing erratic behaviour in a long gnuplot script.
The script is meant to plot borderlines, using filledcurve and line styles. A one-parameter family of curves is also displayed via
plot for [i=1:4] f(x,i) title sprintf("f(x,i) i=%d",i) w l dt 2
I tweeted an example here https://pbs.twimg.com/media/DHbeEgXXkAAV6aS.jpg:large
After this plot I continued by adding Argentina. After a few new lines of code the dashed blue lines started to behave erratically. Two of them simply vanished.
I started debugging the code and ultimately used the set table utility to output the whole plot into a single file.
I found that the for loop was behaving erratically. The title line showed that the loop variable i was correctly running from the start to the end value. However, the function f(x,i) was not being fed the correct value of i.
On the first iteration the loop worked fine and the script plotted f(x,1). On the second iteration the function was correctly fed and displayed f(x,2). But thereafter the function was always fed i=2, even though the title string correctly showed the change in i.
The resulting plot was
I would really appreciate a hint helping me to find the bug.
I will now add a few lines of code. The plot command
plot for [i=1:words(europa_cet)] countries u ($1+($2<36?1:0)*15):(mercator($2)) index (word(europa_cet,i)+0) w filledcurve lc rgb colorFondoCET,\
for [i=1:words(paises_wet)] countries u ($1+15+dst*word(dst_wet,i)*15):(mercator($2)) index (word(paises_wet,i)+0) w filledcurve lc rgb colorFondoWET,\
for [i=1:words(southAm)] countries u ($1+(4+dst*word(dstSouthAm,i))*15):(mercator(abs($2))) index int(word(southAm,i)) w filledcurve lc rgb colorFondoGET,\
"magallanes.dat" u (new!=2?$1+dst*15:1/0):(mercator($2)) w filledcurve lc rgb colorFondoEET,\
"no_magallanes.dat" u (new!=2?$1:1/0):(mercator($2)) w filledcurve lc rgb colorFondoEET,\
for [k=1:5:1] mercator(orto(x,real(b1)+real(k)*real(15.0))) tit sprintf("orto a %02d signo %d",k,Signo) w l lc rgb 'blue' dt 2 lw 3,\
mercator(orto(x,b1+1*15)) tit sprintf("no loop orto a %02d signo %d",1,Signo) w l lc rgb 'blue' dt 2 lw 3,\
mercator(orto(x,b1+2*15)) tit sprintf("no loop orto a %02d signo %d",2,Signo) w l lc rgb 'blue' dt 2 lw 3,\
mercator(orto(x,b1+3*15)) tit sprintf("no loop orto a %02d signo %d",3,Signo) w l lc rgb 'blue' dt 2 lw 3,\
mercator(orto(x,b1+4*15)) tit sprintf("no loop orto a %02d signo %d",4,Signo) w l lc rgb 'blue' dt 2 lw 3,\
mercator(orto(x,b1+5*15)) tit sprintf("no loop orto a %02d signo %d",5,Signo) w l lc rgb 'blue' dt 2 lw 3,\
mercator(orto(x,b1+0*15)) tit sprintf("no loop orto a %02d signo %d",0,Signo) w l lc rgb 'blue' dt 2 lw 3,\
The for loop (line #6) does not work. The line-by-line calls (lines 7-12) do work.
And here is the output obtained from set table, showing the first lines of each iteration:
# Curve 43 of 104, 100 points FIRST iteration
# Curve title: "orto a 01 signo 1"
# x y type
-22.5 0.874361° i
-21.9697 0.887919° i
-21.4394 0.901287° i
-20.9091 0.914467° i
# Curve 44 of 104, 100 points SECOND iter
# Curve title: "orto a 02 signo 1"
# x y type
-22.5 0.407722° i
-21.9697 0.426936° i
-21.4394 0.445968° i
-20.9091 0.464814° i
# Curve 45 of 104, 100 points THIRD iter (REPEATS 2nd)
# Curve title: "orto a 03 signo 1"
# x y type
-22.5 0.407722° i
-21.9697 0.426936° i
-21.4394 0.445968° i
-20.9091 0.464814° i
# Curve 46 of 104, 100 points FOURTH iter (repeats 2nd)
# Curve title: "orto a 04 signo 1"
# x y type
-22.5 0.407722° i
-21.9697 0.426936° i
-21.4394 0.445968° i
-20.9091 0.464814° i
# Curve 47 of 104, 100 points FIFTH iter (repeats 2nd)
# Curve title: "orto a 05 signo 1"
# x y type
-22.5 0.407722° i
-21.9697 0.426936° i
-21.4394 0.445968° i
-20.9091 0.464814° i
Now I have isolated the problem.
I am pasting the plot (multi)line again. The plot line is run twice in a multiplot environment: once for the left (sunrise) subplot and once for the right (sunset) subplot. The two runs differ in the definition of the function orto and the value of b1.
plot "usa.dat" u (usaBool?($1+6*15):1/0):(mercator($2)) w filledcurve lc rgb colorFondoFET,\
for [i=1:words(europa_cet)] countries u ($1+($2<36?1:0)*15):(mercator($2)) index (word(europa_cet,i)+0) w filledcurve lc rgb colorFondoCET,\
for [i=1:words(paises_wet)] countries u ($1+15+dst*word(dst_wet,i)*15):(mercator($2)) index (word(paises_wet,i)+0) w filledcurve lc rgb colorFondoWET,\
for [i=1:words(southAm)] countries u ($1+(4+dst*word(dstSouthAm,i))*15):(mercator(abs($2))) index int(word(southAm,i)) w filledcurve lc rgb colorFondoGET,\
"magallanes.dat" u (new!=2?$1+dst*15:1/0):(mercator($2)) w filledcurve lc rgb colorFondoEET,\
"no_magallanes.dat" u (new!=2?$1:1/0):(mercator($2)) w filledcurve lc rgb colorFondoEET,\
mercator(orto(x,b1+1*15)) tit sprintf("no loop orto a %02d signo %d",1,Signo) w l lc rgb 'blue' dt 2 lw 3,\
mercator(orto(x,b1+2*15)) tit sprintf("no loop orto a %02d signo %d",2,Signo) w l lc rgb 'blue' dt 2 lw 3,\
mercator(orto(x,b1+3*15)) tit sprintf("no loop orto a %02d signo %d",3,Signo) w l lc rgb 'blue' dt 2 lw 3,\
mercator(orto(x,b1+4*15)) tit sprintf("no loop orto a %02d signo %d",4,Signo) w l lc rgb 'blue' dt 2 lw 3,\
mercator(orto(x,b1+5*15)) tit sprintf("no loop orto a %02d signo %d",5,Signo) w l lc rgb 'blue' dt 2 lw 3,\
mercator(orto(x,b1+0*15)) tit sprintf("no loop orto a %02d signo %d",0,Signo) w l lc rgb 'blue' dt 2 lw 3,\
for [i=1:words(europa_cet)] countries u ($1+($2<36?1:0)*15):(mercator($2)) index (word(europa_cet,i)+0) w l lt 1 lc rgb colorLinea,\
for [i=1:words(paises_wet)] countries u ($1+15*(1+dst*word(dst_wet,i))):(mercator($2)) index (word(paises_wet,i)+0) w l lt 1 lc rgb colorLinea,\
for [i=1:words(southAm)] countries u ($1+(4+dst*word(dstSouthAm,i))*15):(mercator(abs($2))) index int(word(southAm,i)) w l lt 1 lc rgb colorLinea,\
"magallanes.dat" u (new!=2?$1+dst*15:1/0):(mercator($2)) w l lt 1 lc rgb colorLinea,\
"no_magallanes.dat" u (new!=2?$1:1/0):(mercator($2)) w l lt 1 lc rgb colorLinea,\
"usa.dat" u (usaBool?($1+6*15):1/0):(mercator($2)) w l lt 1 lc rgb 'gray50',\
for [i=0:2] "ciudadesChile.dat" u ($2-($6-1+((new==0&i==2)?-1:((new==2&i==1)?1:0))-dst*word(dstSouthAm,i+1))*15):(mercator(abs($1))) index i w p pt 7 lc 1 ps 1.5,\
"ciudadesUsa.dat" u (usaBool?($2+6*15):1/0):(mercator(abs($1))) w p pt 7 lc rgb 'blue' ps 1.5,\
for [k=1:5:1] mercator(orto(x,real(b1)+real(k)*real(15.0))) tit sprintf("orto a %02d signo %d",k,Signo) w l lc rgb 'green' dt 2 lw 10,\
for [i=0:2] "ciudadesChile.dat" u ($2-($6-1+((new==0&i==2)?-1:((new==2&i==1)?1:0))-dst*real(word(dstSouthAm,i+1)+0))*15-$4*.5):(mercator(abs($1+$5*1.1))):(stringcolumn($3)) index i w labels font "Arial-Bold,10" tc rgb word(corporativa,1),\
"ciudadesUsa.dat" u (usaBool?($2+6*15-$4*.5):1/0):(mercator(abs($1+$5*1.1))):3 w labels font "Arial-Bold,10" tc rgb 'blue',\
for [i=1:words(Coastlines)] coastline u 1:(mercator($2)) index (word(Coastlines,i)+0) w l lt 1 lc rgb 'gray50',\
The problem is with the third line from the bottom, the one starting with for [i=0:2].
If this line is removed, the loop for [k=1:5:1] works correctly.
If for [k=1:5:1] is placed after the third line from the bottom, the script works correctly.
If for [k=1:5:1] is placed before the third line from the bottom, the loop does not work correctly.
The for [i=0:2] line prints the names of Chilean and Argentinean cities. Column $1 is latitude, $2 is longitude. The labels are displaced from the center through $4 and $5. The variable i, through index, selects the country: Argentina=0, Chile=1, RegionMagallanes (Chile)=2. The variable new controls the time zone in Chile: new=0 is prior to 2015, new=1 is the current status, and new=2 refers to 2015. Column $6 refers to a reference time zone (-3 for Argentina and Magallanes, -4 for Chile).
The script also controls the winter (dst=0) or summer (dst=1) condition.
I have tried removing pieces (monomials) of the for [i=0:2] line one by one. No success. I even edited the ciudadesChile.dat file and changed a tab separating $1 and $2 to a space. No success.
