Given a matrix contained in a text file, print all the columns - shell

Suppose I have a text file containing a matrix:
0 1 2 3 4
5 6 7 8 9
10 11 12 13 14
15 16 17 18 19
20 21 22 23 24
I want to print all the columns in this way:
0 5 10 15 20
1 6 11 16 21
2 7 12 17 22
3 8 13 18 23
4 9 14 19 24
How can I do that in shell script or another programming language?

This sounds like a homework assignment.
If so, you will learn more by working out the answer yourself.
Here are some pointers (both are Java examples):
https://mkyong.com/java/how-to-convert-file-into-an-array-of-bytes/
https://www.javatpoint.com/java-program-to-transpose-matrix
With these you can put the pieces together.
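Since the question asks for a shell solution, here is a minimal awk sketch of the transposition. The function name transpose and the array name cell are my own, and the sketch assumes a whitespace-separated matrix in which every row has the same number of fields.

```shell
# transpose: read a whitespace-separated matrix on stdin and print its columns as rows.
# Assumption: all rows have the same number of fields (ragged rows would leave blanks).
transpose() {
    awk '
    {
        for (i = 1; i <= NF; i++) cell[NR, i] = $i   # remember every field by (row, column)
        if (NF > ncols) ncols = NF                    # widest row seen so far
    }
    END {
        for (i = 1; i <= ncols; i++) {                # one output line per input column
            line = cell[1, i]
            for (j = 2; j <= NR; j++) line = line " " cell[j, i]
            print line
        }
    }'
}

# Demo with the matrix from the question
printf '%s\n' '0 1 2 3 4' '5 6 7 8 9' '10 11 12 13 14' '15 16 17 18 19' '20 21 22 23 24' | transpose
```

The whole matrix is held in memory, which is fine for small files like this one but may not suit very large inputs.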


splitting file to smaller max n-chars files without cutting any line

Here is a sample input text file generated with the cal command:
$ cal 2743 > sample_text
In this example the file has 2180 characters:
$ wc sample_text
36 462 2180 sample_text
I want to split it into smaller files, each containing no more than 700 characters, while keeping every line intact (no line may be cut).
I can view each such block with the following awk code:
$ awk '{l=length+l;if(l<=700){print l,$0}else{l=length;print "\nnext block\n",l,$0}}' sample_text
32 2743
98 January February March
164 Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa
230 1 2 1 2 3 4 5 6 1 2 3 4 5 6
296 3 4 5 6 7 8 9 7 8 9 10 11 12 13 7 8 9 10 11 12 13
362 10 11 12 13 14 15 16 14 15 16 17 18 19 20 14 15 16 17 18 19 20
428 17 18 19 20 21 22 23 21 22 23 24 25 26 27 21 22 23 24 25 26 27
494 24 25 26 27 28 29 30 28 28 29 30 31
560 31
560
626 April May June
692 Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa
next block
66 1 2 3 1 1 2 3 4 5
132 4 5 6 7 8 9 10 2 3 4 5 6 7 8 6 7 8 9 10 11 12
198 11 12 13 14 15 16 17 9 10 11 12 13 14 15 13 14 15 16 17 18 19
264 18 19 20 21 22 23 24 16 17 18 19 20 21 22 20 21 22 23 24 25 26
330 25 26 27 28 29 30 23 24 25 26 27 28 29 27 28 29 30
396 30 31
396
462 July August September
528 Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa
594 1 2 3 1 2 3 4 5 6 7 1 2 3 4
660 4 5 6 7 8 9 10 8 9 10 11 12 13 14 5 6 7 8 9 10 11
next block
66 11 12 13 14 15 16 17 15 16 17 18 19 20 21 12 13 14 15 16 17 18
132 18 19 20 21 22 23 24 22 23 24 25 26 27 28 19 20 21 22 23 24 25
198 25 26 27 28 29 30 31 29 30 31 26 27 28 29 30
264
264
330 October November December
396 Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa
462 1 2 1 2 3 4 5 6 1 2 3 4
528 3 4 5 6 7 8 9 7 8 9 10 11 12 13 5 6 7 8 9 10 11
594 10 11 12 13 14 15 16 14 15 16 17 18 19 20 12 13 14 15 16 17 18
660 17 18 19 20 21 22 23 21 22 23 24 25 26 27 19 20 21 22 23 24 25
next block
66 24 25 26 27 28 29 30 28 29 30 26 27 28 29 30 31
132 31
My problem is saving each block of at most 700 characters to a separate file. The following command produces only file.0, whereas I expected file.0, file.1, file.2 and file.3 for this input:
$ awk 'c=0;{l=length+l;if(l<=700){print>"file."c}else{c=c++;l=length;print>"file."c}}' sample_text
$ cksum *
3868619974 2180 file.0
3868619974 2180 sample_text
This should do it:
BEGIN {
    maxChars = 700
    out = "file.0"
}
{
    numChars = length($0)
    totChars += numChars
    if ( totChars > maxChars ) {
        close(out)
        out = "file." ++cnt
        totChars = numChars
    }
    print > out
}
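As a sketch of a parameterized variant: the wrapper below takes the limit, the output prefix and the input file as arguments, and adds 1 per line for the newline so that the totals match what wc reports for plain ASCII input. The names split_by_chars, max and prefix are mine, not part of the question. A single line longer than the limit still gets a file of its own rather than being cut.

```shell
# split_by_chars MAX PREFIX FILE
# Writes PREFIX.0, PREFIX.1, ... with at most MAX characters each, never cutting a line.
# Assumption: the newline counts toward the limit (length + 1 per line).
split_by_chars() {
    awk -v max="$1" -v prefix="$2" '
    BEGIN { out = prefix "." 0 }
    {
        n = length($0) + 1                      # +1 for the newline that print adds back
        if (total > 0 && total + n > max) {     # line would overflow a non-empty block
            close(out)
            out = prefix "." (++cnt)
            total = 0
        }
        total += n
        print > out
    }' "$3"
}

# Usage: split_by_chars 700 file sample_text && wc -c file.*
```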

Shell Script to print a calendar after the user specifies the month and a day

Given a month and the day of the week on which that month starts, print a calendar for the month. (Remember that months have different numbers of days, and use \n to start a new line.)
Unix has a cal command especially for this purpose.
By default, cal shows the current month's calendar.
mayankp#mayank:~/$ cal
November 2018
Su Mo Tu We Th Fr Sa
1 2 3
4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30
If you want a calendar for a specific month of a specific year, do this:
mayankp#mayank:~/$ cal 1 2018
January 2018
Su Mo Tu We Th Fr Sa
1 2 3 4 5 6
7 8 9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31
This displays the calendar for January 2018.
So your shell script (for example, calendar.sh) would be:
#!/usr/bin/env bash
month=$1
year=$2
cal "$month" "$year"
Run the script like this:
mayankp#mayank:~/$ sh calendar.sh 3 2018
March 2018
Su Mo Tu We Th Fr Sa
1 2 3
4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31
Let me know if this helps.
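If the exercise is meant to be solved without cal, the layout can be built by hand. The following is only a sketch under my own assumptions: the function names days_in_month and print_month are hypothetical, the weekday of the 1st is passed as a number (0 = Sunday through 6 = Saturday), and February is treated as 28 days because no year is given.

```shell
# days_in_month MONTH: number of days in month 1-12.
days_in_month() {
    case "$1" in
        4|6|9|11) echo 30 ;;
        2)        echo 28 ;;   # assumption: non-leap year, since no year is supplied
        *)        echo 31 ;;
    esac
}

# print_month MONTH START: print a calendar grid.
#   START is the weekday of the 1st: 0 = Sunday ... 6 = Saturday.
print_month() {
    printf 'Su Mo Tu We Th Fr Sa\n'
    awk -v days="$(days_in_month "$1")" -v start="$2" 'BEGIN {
        line = ""
        for (i = 0; i < start; i++) line = line "   "   # blank cells before the 1st
        for (d = 1; d <= days; d++) {
            line = line sprintf("%2d", d)
            if ((start + d) % 7 == 0 || d == days) {    # Saturday or last day ends the row
                print line
                line = ""
            } else {
                line = line " "
            }
        }
    }'
}

# March starting on a Thursday, as in the March 2018 example
print_month 3 4
```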

Moving average with successive elements using awk

I am trying to write a script in which each row shows the average of the next N rows (including itself). I know how to do it with preceding rows, where the Nth row shows the average of the preceding N rows. Here is the script for that:
awk '
BEGIN {
    N = 5;
}
{
    x = $2;
    i = NR % N;
    aveg += (x - X[i]) / N;
    X[i] = x;
    print $1, $2, aveg;
}' < file > aveg.txt
where file looks like this
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
11 11
12 12
13 13
14 14
15 15
16 16
17 17
18 18
19 19
20 20
21 21
22 22
23 23
24 24
25 25
26 26
27 27
28 28
29 29
30 30
31 31
32 32
33 33
34 34
35 35
36 36
37 37
38 38
39 39
40 40
I want the first row to show the average of the next 5 elements (including itself), i.e.
(1+2+3+4+5)/5=3
second row (2+3+4+5+6)/5=4
third row (3+4+5+6+7)/5=5
and so on. The rows should look like
1 1 3
2 2 4
3 3 5
4 4 6 ...
Can it be done as simply as the script shown above? I was thinking of assigning each row the value of the Nth row below it and then proceeding with the script above, but I have been unable to assign a row a value from further down the file. Can someone help me write this script and find the moving average? I am open to other shell commands as well.
$ cat test.awk
BEGIN {
    N=5                        # the window size
}
{
    n[NR]=$1                   # store the value in an array
}
NR>=N {                        # for records where NR >= N
    x=0                        # reset the sum variable
    delete n[NR-N]             # delete the element that fell out of the window of N
    for(i in n)                # all array elements
        x+=n[i]                # ... must be summed
    print n[NR-(N-1)],x/N      # print the row from the beginning of the window
}                              # and the related window average
Try it:
$ for i in {1..36}; do echo $i $i >> test.in ; done
$ awk -f test.awk test.in
1 3
2 4
3 5
...
30 32
31 33
32 34
It can also be done with a running sum: add the current value and subtract n[NR-N], like this:
BEGIN {
    N=5
}
{
    n[NR]=$1
    x+=$1-n[NR-N]
}
NR>=N {
    delete n[NR-N]
    print n[NR-(N-1)],x/N
}
Using an N-sized array:
BEGIN { N=5 }
{
    s+=array[i++]=$1
    if (i>=N) i=0
}
NR>=N {
    print array[i], s/N
    s-=array[i]
}
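The running-sum scripts above read only the first column. Here is a sketch that keeps both columns from the question's layout and averages the second one, using a circular buffer indexed by NR % N. The function name forward_avg and the array names val and row are my own, and the window size defaults to 5 as an assumption.

```shell
# forward_avg [N]: for each row, print the row followed by the mean of column 2
# over that row and the N-1 rows after it. Rows without a full window are not printed.
forward_avg() {
    awk -v N="${1:-5}" '
    {
        slot = NR % N
        sum += $2 - val[slot]      # add the newest value, drop the one leaving the window
        val[slot] = $2
        row[slot] = $0
    }
    NR >= N {
        first = (NR + 1) % N       # slot holding the oldest row still in the window
        print row[first], sum / N
    }'
}

# Usage: forward_avg 5 < file > aveg.txt
```

Memory use stays at N rows regardless of file length, and each record costs constant work instead of a loop over the window.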
$ cat tst.awk
BEGIN { OFS="\t"; range=5 }
{ recs[NR%range] = $0 }
NR >= range {
    sum = 0
    for (i in recs) {
        split(recs[i],flds)
        sum += flds[2]
    }
    print recs[(NR+1-range)%range], sum / range
}
$ awk -f tst.awk file
1 1 3
2 2 4
3 3 5
4 4 6
5 5 7
6 6 8
7 7 9
8 8 10
9 9 11
10 10 12
11 11 13
12 12 14
13 13 15
14 14 16
15 15 17
16 16 18
17 17 19
18 18 20
19 19 21
20 20 22
21 21 23
22 22 24
23 23 25
24 24 26
25 25 27
26 26 28
27 27 29
28 28 30
29 29 31
30 30 32
31 31 33
32 32 34
33 33 35
34 34 36
35 35 37
36 36 38

Bash - compare 2 text files and find missing lines

I've got 2 log files generated by a traffic generator.
The format of the logs is:
[packet ID][Hour Tx][Min Tx][Sec Tx][Hour Rx][Min Rx][Sec Rx][packet size][flow number]
The first file is the sender log:
1 13 15 17.799915 13 15 17.799915 512 1
2 13 15 17.800016 13 15 17.800016 512 1
3 13 15 17.800034 13 15 17.800034 512 1
4 13 15 17.800050 13 15 17.800050 512 1
5 13 15 17.800081 13 15 17.800081 512 1
6 13 15 17.800094 13 15 17.800094 512 1
7 13 15 17.800117 13 15 17.800117 512 1
8 13 15 17.800126 13 15 17.800126 512 1
9 13 15 17.800135 13 15 17.800135 512 1
10 13 15 17.800157 13 15 17.800157 512 1
11 13 15 17.800166 13 15 17.800166 512 1
12 13 15 17.800173 13 15 17.800173 512 1
13 13 15 17.800181 13 15 17.800181 512 1
14 13 15 17.800202 13 15 17.800202 512 1
15 13 15 17.800212 13 15 17.800212 512 1
16 13 15 17.800220 13 15 17.800220 512 1
17 13 15 17.800228 13 15 17.800228 512 1
18 13 15 17.800257 13 15 17.800257 512 1
19 13 15 17.800266 13 15 17.800266 512 1
20 13 15 17.800274 13 15 17.800274 512 1
21 13 15 17.800297 13 15 17.800297 512 1
22 13 15 17.800305 13 15 17.800305 512 1
23 13 15 17.800313 13 15 17.800313 512 1
24 13 15 17.800321 13 15 17.800321 512 1
25 13 15 17.800343 13 15 17.800343 512 1
26 13 15 17.800351 13 15 17.800351 512 1
27 13 15 17.800359 13 15 17.800359 512 1
28 13 15 17.800367 13 15 17.800367 512 1
29 13 15 17.800387 13 15 17.800387 512 1
30 13 15 17.800397 13 15 17.800397 512 1
31 13 15 17.800404 13 15 17.800404 512 1
32 13 15 17.800414 13 15 17.800414 512 1
33 13 15 17.800436 13 15 17.800436 512 1
34 13 15 17.800444 13 15 17.800444 512 1
35 13 15 17.800452 13 15 17.800452 512 1
36 13 15 17.800460 13 15 17.800460 512 1
37 13 15 17.800483 13 15 17.800483 512 1
38 13 15 17.800491 13 15 17.800491 512 1
39 13 15 17.800499 13 15 17.800499 512 1
40 13 15 17.800507 13 15 17.800507 512 1
and it continues for several thousand lines.
The second file is the receiver file:
1 13 15 17.799915 13 15 17.800965 512 1
3 13 15 17.800034 13 15 17.801605 512 1
5 13 15 17.800081 13 15 17.802808 512 1
7 13 15 17.800117 13 15 17.811653 512 1
8 13 15 17.800126 13 15 17.811686 512 1
9 13 15 17.800135 13 15 17.811992 512 1
11 13 15 17.800166 13 15 17.812425 512 1
13 13 15 17.800181 13 15 17.812966 512 1
15 13 15 17.800212 13 15 17.814371 512 1
17 13 15 17.800228 13 15 17.814813 512 1
19 13 15 17.800266 13 15 17.815244 512 1
21 13 15 17.800297 13 15 17.815804 512 1
23 13 15 17.800313 13 15 17.816314 512 1
25 13 15 17.800343 13 15 17.816805 512 1
27 13 15 17.800359 13 15 17.817385 512 1
29 13 15 17.800387 13 15 17.817930 512 1
31 13 15 17.800404 13 15 17.819176 512 1
33 13 15 17.800436 13 15 17.819654 512 1
35 13 15 17.800452 13 15 17.820115 512 1
37 13 15 17.800483 13 15 17.820649 512 1
39 13 15 17.800499 13 15 17.821185 512 1
41 13 15 17.800528 13 15 17.821781 512 1
43 13 15 17.800545 13 15 17.822329 512 1
45 13 15 17.800573 13 15 17.822976 512 1
47 13 15 17.800590 13 15 17.824001 512 1
49 13 15 17.800619 13 15 17.824448 512 1
51 13 15 17.800738 13 15 17.824963 512 1
53 13 15 17.800772 13 15 17.828931 512 1
55 13 15 17.800788 13 15 17.829416 512 1
57 13 15 17.801005 13 15 17.829820 512 1
59 13 15 17.801035 13 15 17.830404 512 1
61 13 15 17.801053 13 15 17.830873 512 1
63 13 15 17.801088 13 15 17.831448 512 1
65 13 15 17.801106 13 15 17.832285 512 1
67 13 15 17.801225 13 15 17.832860 512 1
69 13 15 17.801243 13 15 17.833318 512 1
71 13 15 17.801274 13 15 17.833921 512 1
73 13 15 17.801290 13 15 17.834448 512 1
75 13 15 17.801321 13 15 17.834983 512 1
77 13 15 17.801339 13 15 17.835492 512 1
and it continues for several thousand lines.
The first column of the second file is not necessarily ordered.
As you've probably seen, lines of the 2 files starting with the same ID are not equal (the timestamps are different).
I'd like to isolate those packets (lines) that are in the first file but that are missing in the second file. That is I'd like to know the timestamps of packets that have been sent but not received. The primary key of those files is the first column (ID of the packets sent).
I tried sort and join, but I couldn't get the results I wanted.
Thank you
You can use this awk script for that:
awk 'FNR==NR{a[$1]=$0;next} !($1 in a) {print $1, $4}' file2 file1
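The one-liner reads the receiver log first and remembers every packet ID, then prints the sender rows whose ID was never seen. As a sketch, the same idea can be wrapped in a function that prints the whole transmit timestamp (hour, minute and second) instead of only the seconds field. The function name missing_packets and the hh:mm:ss output format are my own choices, and the sketch assumes the receiver log is not empty (otherwise FNR==NR would also hold for the sender log).

```shell
# missing_packets SENDER_LOG RECEIVER_LOG
# Prints "ID hh:mm:ss.ffffff" (Tx time) for every packet sent but not received.
missing_packets() {
    awk '
    FNR == NR { seen[$1]; next }                               # first file read: receiver IDs
    !($1 in seen) { printf "%s %s:%s:%s\n", $1, $2, $3, $4 }   # sender rows never received
    ' "$2" "$1"
}

# Usage: missing_packets file1 file2
```

Because the IDs are stored in an array, the receiver log does not need to be sorted.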

unix cal command special character

When I try "cal | tail -6" on my Unix machine, I get:
1 2 3
4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30
But when I try "cal | tail -6 | awk '{print $7}'", I get:
10
17
24
Where is the 3 going? What I actually need is all the weekdays, i.e. columns 2, 3, 4, 5 and 6.
But I'm getting the wrong output because of the strange behavior of "cal".
There are only 3 whitespace-delimited columns in your first row. cal is working exactly as expected; the misunderstanding is in how awk works. As far as awk is concerned there is no 7th column in your first row, because it pays attention to whitespace-delimited columns, not fixed-width columns.
A quick search shows that with GNU awk you can use
BEGIN { FIELDWIDTHS = "3 3 3 3 3 3 3" }
in your awk script. (FIELDWIDTHS is a gawk extension, so other awk implementations may ignore it.)
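Where gawk is not available, substr gives a similar fixed-width extraction in any POSIX awk. This is only a sketch: the function name cal_columns is hypothetical, and it assumes every day column is 3 characters wide (two digits plus a separator) and that today's date is not highlighted with escape sequences, which would shift the columns.

```shell
# cal_columns FROM TO: print day columns FROM..TO (1 = first column) of cal-style
# input read from stdin, by character position rather than by whitespace.
cal_columns() {
    awk -v from="$1" -v to="$2" '{
        start = (from - 1) * 3 + 1          # first character of column FROM
        width = (to - from + 1) * 3 - 1     # span of the columns, minus the trailing separator
        print substr($0, start, width)
    }'
}

# Usage: weekdays only (columns 2 to 6)
#   cal | tail -6 | cal_columns 2 6
```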
Since every column in each row is three characters wide, you can extract the days you want by character position. For example, if you wanted only the 7th day column, you could do the following:
cal | sed 's/^.\{18\}//'
This command removes the first 18 characters of each line, which are the entries for the first 6 days of the week.
To extract a particular day, such as the fourth day, you could do this:
cal | sed 's/^.\{9\}\(.\{3\}\).*$/\1/'
To remove the first day of the week and the last day, you could do this (the second expression keeps the next 15 characters and drops anything after them):
cal | sed -e 's/^.\{3\}//' -e 's/^\(.\{15\}\).*$/\1/'
Maybe a row-wise layout will do the trick. Try ncal. For example:
$ ncal
November 2012
Mo 5 12 19 26
Tu 6 13 20 27
We 7 14 21 28
Th 1 8 15 22 29
Fr 2 9 16 23 30
Sa 3 10 17 24
Su 4 11 18 25
Or fill the missing dates with a placeholder ('-', for example):
kent$ cal -s|tail -6|awk 'NR==1&&NF<7{gsub(/^ */,"");for(i=1;i<=(7-NF);i++) x=" - "x;$0=x" "$0;}1'
- - - - 1 2 3
4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30
Then you can pick out the column and, if needed, replace '-' with a space. For example, for $7:
kent$ cal -s|tail -6|awk 'NR==1&&NF<7{gsub(/^ */,"");for(i=1;i<=(7-NF);i++) x=" - "x;$0=x" "$0;}{print $7}'
3
10
17
24
Note that today's date is highlighted unless you turn that off (-h). Use cut to extract the wanted columns:
cal -h | cut -c19-20
Output:
Sa
3
10
17
24
