Here is a sample input text file generated with the cal command:
$ cal 2743 > sample_text
In this example the file has 2180 characters:
$ wc sample_text
36 462 2180 sample_text
I want to split it into smaller files, each holding no more than 700 characters, while keeping every line intact (no line may be cut).
I can view each such block with the following awk code:
$ awk '{l=length+l;if(l<=700){print l,$0}else{l=length;print "\nnext block\n",l,$0}}' sample_text
32 2743
98 January February March
164 Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa
230 1 2 1 2 3 4 5 6 1 2 3 4 5 6
296 3 4 5 6 7 8 9 7 8 9 10 11 12 13 7 8 9 10 11 12 13
362 10 11 12 13 14 15 16 14 15 16 17 18 19 20 14 15 16 17 18 19 20
428 17 18 19 20 21 22 23 21 22 23 24 25 26 27 21 22 23 24 25 26 27
494 24 25 26 27 28 29 30 28 28 29 30 31
560 31
560
626 April May June
692 Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa
next block
66 1 2 3 1 1 2 3 4 5
132 4 5 6 7 8 9 10 2 3 4 5 6 7 8 6 7 8 9 10 11 12
198 11 12 13 14 15 16 17 9 10 11 12 13 14 15 13 14 15 16 17 18 19
264 18 19 20 21 22 23 24 16 17 18 19 20 21 22 20 21 22 23 24 25 26
330 25 26 27 28 29 30 23 24 25 26 27 28 29 27 28 29 30
396 30 31
396
462 July August September
528 Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa
594 1 2 3 1 2 3 4 5 6 7 1 2 3 4
660 4 5 6 7 8 9 10 8 9 10 11 12 13 14 5 6 7 8 9 10 11
next block
66 11 12 13 14 15 16 17 15 16 17 18 19 20 21 12 13 14 15 16 17 18
132 18 19 20 21 22 23 24 22 23 24 25 26 27 28 19 20 21 22 23 24 25
198 25 26 27 28 29 30 31 29 30 31 26 27 28 29 30
264
264
330 October November December
396 Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa
462 1 2 1 2 3 4 5 6 1 2 3 4
528 3 4 5 6 7 8 9 7 8 9 10 11 12 13 5 6 7 8 9 10 11
594 10 11 12 13 14 15 16 14 15 16 17 18 19 20 12 13 14 15 16 17 18
660 17 18 19 20 21 22 23 21 22 23 24 25 26 27 19 20 21 22 23 24 25
next block
66 24 25 26 27 28 29 30 28 29 30 26 27 28 29 30 31
132 31
My problem is saving each block of at most 700 characters into a separate file. The following command produces only one file, file.0, whereas I expected file.0, file.1, file.2 and file.3 for this input:
$ awk 'c=0;{l=length+l;if(l<=700){print>"file."c}else{c=c++;l=length;print>"file."c}}' sample_text
$ cksum *
3868619974 2180 file.0
3868619974 2180 sample_text
This should do it:
BEGIN {
maxChars = 700
out = "file.0"
}
{
numChars = length($0)
totChars += numChars
if ( totChars > maxChars ) {
close(out)
out = "file." ++cnt
totChars = numChars
}
print > out
}
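For what it's worth, the one-liner in the question seems to fail for two reasons. The leading c=0; is a pattern without an action, so it is evaluated for every input line and resets c to 0 each time. And c=c++ assigns the old value of c back to itself, which typically leaves c unchanged. Below is a minimal sketch of the one-liner with both points addressed. The 36-line stand-in input and the scratch directory are assumptions made for illustration; they are not part of the original post.

```shell
# Work in a scratch directory so the file.N outputs do not clobber anything.
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in input (an assumption): 36 lines of 60 characters each.
# The original post used the output of `cal 2743` instead.
awk 'BEGIN { for (i = 1; i <= 36; i++) printf "%060d\n", i }' > sample_text

# Same idea as the one-liner, with the counter initialised once in BEGIN
# and incremented with c++ instead of c=c++.
awk 'BEGIN { c = 0 }
{
    l += length($0)
    if (l > 700) {            # this line would overflow the current block
        close("file." c)      # release the finished file
        c++
        l = length($0)
    }
    print > ("file." c)       # parentheses keep the file name unambiguous
}' sample_text

# The pieces, concatenated in order, should reproduce the input.
cat file.0 file.1 file.2 file.3 | cmp - sample_text && echo "split OK"
```

As in the question, length does not count the newline at the end of each line, so a piece can be slightly larger than 700 bytes on disk.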
*****Shell Script*******
Given a month and the day of the week that's the first of that month, print a calendar for the month. (Remember, number of days in months is different and use \n to go to a new line.)
Unix has a cal command especially for this purpose.
By default, cal shows the current month's calendar.
mayankp@mayank:~/$ cal
   November 2018
Su Mo Tu We Th Fr Sa
             1  2  3
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30
If you want a calendar for a specific month of a specific year, do this:
mayankp@mayank:~/$ cal 1 2018
    January 2018
Su Mo Tu We Th Fr Sa
    1  2  3  4  5  6
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31
This displays the calendar for January 2018.
So your shell script (e.g. calendar.sh) would be:
#!/usr/bin/env bash
# Usage: calendar.sh MONTH YEAR
month=$1
year=$2
cal "$month" "$year"
Run the script like this:
mayankp@mayank:~/$ sh calendar.sh 3 2018
     March 2018
Su Mo Tu We Th Fr Sa
             1  2  3
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31
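If the exercise really wants the calendar built from the weekday of the 1st rather than from cal, something along these lines might do. This is only a sketch: print_month is a hypothetical helper name, the weekday is assumed to be given as a number (0 = Sunday through 6 = Saturday), and the number of days is passed in rather than looked up.

```shell
# Hypothetical helper: print_month START_DOW NDAYS
#   START_DOW  weekday of the 1st, 0 = Sunday ... 6 = Saturday (assumed encoding)
#   NDAYS      number of days in the month
print_month() {
    start=$1
    ndays=$2
    echo "Su Mo Tu We Th Fr Sa"
    # Pad the first week up to the starting weekday.
    i=0
    while [ "$i" -lt "$start" ]; do
        printf '   '
        i=$((i + 1))
    done
    day=1
    while [ "$day" -le "$ndays" ]; do
        printf '%2d' "$day"
        # End the row after Saturday or after the last day;
        # otherwise print a separating space.
        if [ $(( (start + day) % 7 )) -eq 0 ] || [ "$day" -eq "$ndays" ]; then
            printf '\n'
        else
            printf ' '
        fi
        day=$((day + 1))
    done
}

# November 2018 started on a Thursday (4) and has 30 days.
print_month 4 30
```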
Let me know if this helps.
// L is a list and n is its length
// we assume that n = 4**k, for k ≥ 1
Alg1(L, n)
    remove the smallest and largest element from L
    if n-2 > (4**k)/2
        call Alg1(L, n-2)
Not what it does but what is it intended to do? I don't understand what the question means by "intended" but I think the algorithm just removes the largest and smallest element of the list recursively until 4 or 3 elements remain.
Given a starting list of size 4^k, which appears to be implied by the definition given for n, alg1 reduces the size of the supplied list to ((4^k) / 2) + 2 for k >= 1. I agree with @Ctznkane525 that the algorithm is incompletely specified in that it doesn't tell us what the return value should be. But if we make the simple assumption that two elements should be removed from the end of the list each time n is decremented by 2, we can continue. Thus, consider the following implementation in Clojure:
(defn exp [x n]
  (reduce * (repeat n x)))
(def k 1)
(defn alg1 [l n]
  (println "k=" k " n=" n " l=" l)
  (if (> (- n 2) (/ (exp 4 k) 2))
    (recur (take (- n 2) l) (- n 2))
    l))
I've added code here to print the values of k, n, and l so we can watch what happens at each step.
Given the above we'll start a little testing. We'll invoke alg1 as (alg1 (take (exp 4 k) (iterate #(+ 1 %) 1)) (exp 4 k)), which simply creates a list of 4^k elements and passes it as the first argument to alg1, and passes 4^k for the second argument. So here goes:
user=> (def k 1)
#'user/k
user=> (alg1 (take (exp 4 k) (iterate #(+ 1 %) 1)) (exp 4 k))
k= 1 n= 4 l= (1 2 3 4)
(1 2 3 4)
So with k=1 and the list defined as (1 2 3 4) the function returns immediately, because n-2 = 2, and that's less than or equal to (4^k)/2, which is also 2.
Let's try with k=2:
user=> (def k 2)
#'user/k
user=> (alg1 (take (exp 4 k) (iterate #(+ 1 %) 1)) (exp 4 k))
k= 2 n= 16 l= (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16)
k= 2 n= 14 l= (1 2 3 4 5 6 7 8 9 10 11 12 13 14)
k= 2 n= 12 l= (1 2 3 4 5 6 7 8 9 10 11 12)
k= 2 n= 10 l= (1 2 3 4 5 6 7 8 9 10)
(1 2 3 4 5 6 7 8 9 10)
Ah, that's a bit more interesting. We start with n=16, which is of course 4^k = 4^2 = 16, and the beginning list is (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16). When these values are considered by alg1 it finds that n-2 (14) is greater than (4^2)/2 (8), so it trims two elements from the end of the list and recursively invokes itself. On the second iteration it finds that n-2 (12) is greater than 8 so it trims another two elements and recursively invokes itself. This continues until n=10, when alg1 finds that n-2 (8) is no longer greater than (4^2)/2 (8), so it returns the list (1 2 3 4 5 6 7 8 9 10).
What happens with k=3?
user=> (def k 3)
#'user/k
user=> (alg1 (take (exp 4 k) (iterate #(+ 1 %) 1)) (exp 4 k))
k= 3 n= 64 l= (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64)
k= 3 n= 62 l= (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62)
k= 3 n= 60 l= (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60)
k= 3 n= 58 l= (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58)
k= 3 n= 56 l= (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56)
k= 3 n= 54 l= (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54)
k= 3 n= 52 l= (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52)
k= 3 n= 50 l= (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50)
k= 3 n= 48 l= (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48)
k= 3 n= 46 l= (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46)
k= 3 n= 44 l= (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44)
k= 3 n= 42 l= (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42)
k= 3 n= 40 l= (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40)
k= 3 n= 38 l= (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38)
k= 3 n= 36 l= (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36)
k= 3 n= 34 l= (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34)
(1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34)
Similar results to the above. At each iteration two elements are trimmed from the list until the condition specified in the algorithm is reached, at which point the algorithm exits.
You can continue bumping up the value of k, building the arguments, and watching the algorithm work, but in the end the results are always similar: the list is reduced in size to ((4^k) / 2) + 2.
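That final size can also be checked with plain integer arithmetic, without building any lists. A small sketch follows; final_size is a hypothetical helper name, and it only tracks n, assuming two elements are dropped per call as in the Clojure version above.

```shell
# Hypothetical helper: final_size K
# Prints the length the list ends up with when it starts at 4^K elements.
final_size() {
    k=$1
    # n = 4^k, computed by repeated multiplication to stay within POSIX sh.
    n=1
    i=0
    while [ "$i" -lt "$k" ]; do
        n=$((n * 4))
        i=$((i + 1))
    done
    half=$((n / 2))          # (4^k)/2 stays fixed across the recursion
    # Mirror the test in Alg1: keep removing two elements while n-2 > (4^k)/2.
    while [ $((n - 2)) -gt "$half" ]; do
        n=$((n - 2))
    done
    echo "$n"
}

for k in 1 2 3; do
    echo "k=$k final size=$(final_size "$k")"
done
```

For k = 1, 2 and 3 this gives 4, 10 and 34, matching the traces above.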
Best of luck.
I've got 2 log files generated by a traffic generator.
The format of the logs is:
[packet ID][Hour Tx][Min Tx][Sec Tx][Hour Rx][Min Rx][Sec Rx][packet size][flow number]
The first file is the sender log:
1 13 15 17.799915 13 15 17.799915 512 1
2 13 15 17.800016 13 15 17.800016 512 1
3 13 15 17.800034 13 15 17.800034 512 1
4 13 15 17.800050 13 15 17.800050 512 1
5 13 15 17.800081 13 15 17.800081 512 1
6 13 15 17.800094 13 15 17.800094 512 1
7 13 15 17.800117 13 15 17.800117 512 1
8 13 15 17.800126 13 15 17.800126 512 1
9 13 15 17.800135 13 15 17.800135 512 1
10 13 15 17.800157 13 15 17.800157 512 1
11 13 15 17.800166 13 15 17.800166 512 1
12 13 15 17.800173 13 15 17.800173 512 1
13 13 15 17.800181 13 15 17.800181 512 1
14 13 15 17.800202 13 15 17.800202 512 1
15 13 15 17.800212 13 15 17.800212 512 1
16 13 15 17.800220 13 15 17.800220 512 1
17 13 15 17.800228 13 15 17.800228 512 1
18 13 15 17.800257 13 15 17.800257 512 1
19 13 15 17.800266 13 15 17.800266 512 1
20 13 15 17.800274 13 15 17.800274 512 1
21 13 15 17.800297 13 15 17.800297 512 1
22 13 15 17.800305 13 15 17.800305 512 1
23 13 15 17.800313 13 15 17.800313 512 1
24 13 15 17.800321 13 15 17.800321 512 1
25 13 15 17.800343 13 15 17.800343 512 1
26 13 15 17.800351 13 15 17.800351 512 1
27 13 15 17.800359 13 15 17.800359 512 1
28 13 15 17.800367 13 15 17.800367 512 1
29 13 15 17.800387 13 15 17.800387 512 1
30 13 15 17.800397 13 15 17.800397 512 1
31 13 15 17.800404 13 15 17.800404 512 1
32 13 15 17.800414 13 15 17.800414 512 1
33 13 15 17.800436 13 15 17.800436 512 1
34 13 15 17.800444 13 15 17.800444 512 1
35 13 15 17.800452 13 15 17.800452 512 1
36 13 15 17.800460 13 15 17.800460 512 1
37 13 15 17.800483 13 15 17.800483 512 1
38 13 15 17.800491 13 15 17.800491 512 1
39 13 15 17.800499 13 15 17.800499 512 1
40 13 15 17.800507 13 15 17.800507 512 1
and it continues for several thousand lines.
The second file is the receiver file:
1 13 15 17.799915 13 15 17.800965 512 1
3 13 15 17.800034 13 15 17.801605 512 1
5 13 15 17.800081 13 15 17.802808 512 1
7 13 15 17.800117 13 15 17.811653 512 1
8 13 15 17.800126 13 15 17.811686 512 1
9 13 15 17.800135 13 15 17.811992 512 1
11 13 15 17.800166 13 15 17.812425 512 1
13 13 15 17.800181 13 15 17.812966 512 1
15 13 15 17.800212 13 15 17.814371 512 1
17 13 15 17.800228 13 15 17.814813 512 1
19 13 15 17.800266 13 15 17.815244 512 1
21 13 15 17.800297 13 15 17.815804 512 1
23 13 15 17.800313 13 15 17.816314 512 1
25 13 15 17.800343 13 15 17.816805 512 1
27 13 15 17.800359 13 15 17.817385 512 1
29 13 15 17.800387 13 15 17.817930 512 1
31 13 15 17.800404 13 15 17.819176 512 1
33 13 15 17.800436 13 15 17.819654 512 1
35 13 15 17.800452 13 15 17.820115 512 1
37 13 15 17.800483 13 15 17.820649 512 1
39 13 15 17.800499 13 15 17.821185 512 1
41 13 15 17.800528 13 15 17.821781 512 1
43 13 15 17.800545 13 15 17.822329 512 1
45 13 15 17.800573 13 15 17.822976 512 1
47 13 15 17.800590 13 15 17.824001 512 1
49 13 15 17.800619 13 15 17.824448 512 1
51 13 15 17.800738 13 15 17.824963 512 1
53 13 15 17.800772 13 15 17.828931 512 1
55 13 15 17.800788 13 15 17.829416 512 1
57 13 15 17.801005 13 15 17.829820 512 1
59 13 15 17.801035 13 15 17.830404 512 1
61 13 15 17.801053 13 15 17.830873 512 1
63 13 15 17.801088 13 15 17.831448 512 1
65 13 15 17.801106 13 15 17.832285 512 1
67 13 15 17.801225 13 15 17.832860 512 1
69 13 15 17.801243 13 15 17.833318 512 1
71 13 15 17.801274 13 15 17.833921 512 1
73 13 15 17.801290 13 15 17.834448 512 1
75 13 15 17.801321 13 15 17.834983 512 1
77 13 15 17.801339 13 15 17.835492 512 1
and it continues for several thousand lines.
The first column of the second file is not necessarily ordered.
As you've probably seen, lines of the 2 files starting with the same ID are not equal (the timestamps are different).
I'd like to isolate those packets (lines) that are in the first file but missing from the second file. That is, I'd like to know the timestamps of packets that have been sent but not received. The primary key of those files is the first column (the ID of the packets sent).
The problem is that I tried sort and join but couldn't get the results I wanted.
Thank you
You can use this awk script for that:
awk 'FNR==NR{a[$1]=$0;next} !($1 in a) {print $1, $4}' file2 file1
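To see the idea at work on a small case, here is a hedged sketch using a few rows taken from the logs above. The file names sender.log and receiver.log and the scratch directory are assumptions, and this variant prints the full Tx timestamp (hour, minute and second) instead of only the seconds field.

```shell
workdir=$(mktemp -d)

# A few rows from the sender log shown in the question.
cat > "$workdir/sender.log" <<'EOF'
1 13 15 17.799915 13 15 17.799915 512 1
2 13 15 17.800016 13 15 17.800016 512 1
3 13 15 17.800034 13 15 17.800034 512 1
4 13 15 17.800050 13 15 17.800050 512 1
EOF

# Receiver rows, deliberately out of order: the lookup does not need sorting.
cat > "$workdir/receiver.log" <<'EOF'
3 13 15 17.800034 13 15 17.801605 512 1
1 13 15 17.799915 13 15 17.800965 512 1
EOF

# First pass (FNR == NR) records every received ID; the second pass prints
# the Tx timestamp of each sent packet whose ID was never recorded.
lost=$(awk 'FNR == NR { seen[$1]; next }
            !($1 in seen) { print $1, $2 ":" $3 ":" $4 }' \
       "$workdir/receiver.log" "$workdir/sender.log")
printf '%s\n' "$lost"
```

Here packets 2 and 4 are reported as lost. One caveat: the FNR == NR test assumes the receiver log is not empty; with an empty first file, the sender log would be treated as the first pass.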
91
58
54
108
52
18
8
81
103
110
129
137
84
15
14
18
11
17
12
6
1
28
6
14
8
8
0
0
28
24
25
23
21
13
9
4
18
17
18
30
13
3
I want to split this list into chunks of six entries each: handle entries 1..6, break out of the loop, then continue with entries 7..12, then 13..18, and so on.
(Should I use a for loop, continue, or break?)
You can use paste:
paste -d' ' - - - - - - < inputfile
For your input, it'd return:
91 58 54 108 52 18
8 81 103 110 129 137
84 15 14 18 11 17
12 6 1 28 6 14
8 8 0 0 28 24
25 23 21 13 9 4
18 17 18 30 13 3
$ xargs -n 6 < file_name
91 58 54 108 52 18
8 81 103 110 129 137
84 15 14 18 11 17
12 6 1 28 6 14
8 8 0 0 28 24
25 23 21 13 9 4
18 17 18 30 13 3
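Since the question asks about loops, here is what a plain shell loop might look like, as a sketch only. chunk_lines is a hypothetical function name; it reads one entry per line from standard input and needs neither break nor continue, just a counter that resets after every chunk.

```shell
# Hypothetical helper: chunk_lines SIZE
# Joins every SIZE input lines into one space-separated output line.
chunk_lines() {
    size=$1
    count=0
    line=""
    while IFS= read -r entry; do
        if [ "$count" -eq 0 ]; then
            line=$entry              # first entry of a new chunk
        else
            line="$line $entry"
        fi
        count=$((count + 1))
        if [ "$count" -eq "$size" ]; then
            printf '%s\n' "$line"    # chunk complete, start over
            count=0
        fi
    done
    # Flush a final, shorter chunk if the input did not divide evenly.
    if [ "$count" -gt 0 ]; then
        printf '%s\n' "$line"
    fi
}

# Fourteen entries in chunks of six: two full chunks and one of two.
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12 13 14 | chunk_lines 6
```

In practice paste or xargs as shown above is shorter; the loop is mainly useful if each chunk needs further processing in the shell.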
I am trying to learn bash at a deeper level, and I decided to make a multiplication table. I have the functionality with this statement:
echo $[{1..10}*{1..10}]
but that gives me the following output:
1 2 3 4 5 6 7 8 9 10 2 4 6 8 10 12 14 16 18 20 3 6 9 12 15 18 21 24 27 30 4 8 12 16 20 24 28 32 36 40 5 10 15 20 25 30 35 40 45 50 6 12 18 24 30 36 42 48 54 60 7 14 21 28 35 42 49 56 63 70 8 16 24 32 40 48 56 64 72 80 9 18 27 36 45 54 63 72 81 90 10 20 30 40 50 60 70 80 90 100
Is there any way to format this output like the following using only one statement? (I can figure out how to do this with loops, but that's no fun :p )
1 2 3 4 5 6 7 8 9 10
2 4 6 8 10 12 14 16 18 20
3 6 9 12 15 18 21 24 27 30
4 8 12 16 20 24 28 32 36 40
5 10 15 20 25 30 35 40 45 50
6 12 18 24 30 36 42 48 54 60
7 14 21 28 35 42 49 56 63 70
8 16 24 32 40 48 56 64 72 80
9 18 27 36 45 54 63 72 81 90
10 20 30 40 50 60 70 80 90 100
Is it even possible to do in one statement, or would I have to loop?
Use this line for a nice output without using loops:
echo $[{1..10}*{1..10}] | xargs -n10 | column -t
Output:
1 2 3 4 5 6 7 8 9 10
2 4 6 8 10 12 14 16 18 20
3 6 9 12 15 18 21 24 27 30
4 8 12 16 20 24 28 32 36 40
5 10 15 20 25 30 35 40 45 50
6 12 18 24 30 36 42 48 54 60
7 14 21 28 35 42 49 56 63 70
8 16 24 32 40 48 56 64 72 80
9 18 27 36 45 54 63 72 81 90
10 20 30 40 50 60 70 80 90 100
Update
As a logical next step, I asked here if this multiplication table can have a variable range. With this help, my answer works with a variable ($boundary) range and stays quite readable:
boundary=4; eval echo $\[{1..$boundary}*{1..$boundary}\] | xargs -n$boundary | column -t
Output:
1 2 3 4
2 4 6 8
3 6 9 12
4 8 12 16
Also note that the $[..] arithmetic notation is deprecated and $((...)) should be used instead:
boundary=4; eval eval echo "$\(\({1..$boundary}*{1..$boundary}\)\)" | xargs -n$boundary | column -t
The printf built-in repeats its format as many times as necessary to print all arguments, so:
printf '%d %d %d %d %d %d %d %d %d %d\n' $[{1..10}*{1..10}]
If you want to avoid repeating the %d bit, it's trickier.
printf "$(echo %$[{1..10}*0]d)\\n" $[{1..10}*{1..10}]
In production code, use a loop.
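For completeness, the loop version could look roughly like this. It is a sketch: mult_table is a hypothetical function name, and the fixed column width of 4 is an assumption that fits tables up to 31 x 31.

```shell
# Hypothetical helper: mult_table N
# Prints an N x N multiplication table with right-aligned columns.
mult_table() {
    n=$1
    i=1
    while [ "$i" -le "$n" ]; do
        j=1
        while [ "$j" -le "$n" ]; do
            printf '%4d' $((i * j))   # one cell, padded to width 4
            j=$((j + 1))
        done
        printf '\n'                   # end of row
        i=$((i + 1))
    done
}

mult_table 4
```

The boundary is an ordinary argument here, so no eval is needed to make the range variable.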