bash: insert a space in the last-but-one position of each line

Using the $ regex I can match the last position of each line. But if I have the following:
12345
23456
34567
I need to add a space so it becomes
1234 5
2345 6
3456 7
Thanks!

$ sed 's/.$/ &/' file
1234 5
2345 6
3456 7

gawk -v FIELDWIDTHS='4 1' '{$1=$1}1' file
1234 5
2345 6
3456 7
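The gawk FIELDWIDTHS approach assumes fixed-width five-character lines; for lines of any length, here is a plain POSIX awk sketch in the spirit of the sed answer (it assumes lines of at least two characters):

# split off the last character and re-join with a space
awk '{ print substr($0, 1, length($0)-1) " " substr($0, length($0)) }' file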

Related

Is there a way to print lines from a file from n to m and then reverse their order?

I'm trying to print lines 10 to 20 of a file and then reverse their order.
I've tried this:
sed '10!G;h;$!d' file.txt
But it prints from line 10 to the end of the file. Is there any way to stop it at line 20 using only one sed command?
Almost there; you just need to replace the $!d with printing at the 'until' line number:
sed -n '10,20p' tst.txt
# prints lines 10 through 20
sed -n '10!G;h;20p' tst.txt
# prints lines 10 through 20 in reverse
output:
20
19
18
17
16
15
14
13
12
11
10
tst.txt:
1
2
3
4
...
19
20
You can use this to print a range of lines:
sed -n -e 10,20p file.txt | tac
tac will reverse the order of the lines
And for those of you without tac (like macOS users):
sed -n -e 10,20p file.txt | tail -r
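For completeness, a minimal awk sketch that buffers the range and prints it back to front (same 10-20 range assumed):

# collect lines 10 through 20 in an array, then print it in reverse
awk 'NR>=10 && NR<=20 { a[++n] = $0 } END { for (i=n; i>=1; i--) print a[i] }' tst.txt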

Dividing one file into separate files based on line numbers

I have the following test file:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
I want to separate it so that each file repeats the last line of the previous file as its first line. For example:
file1:
1
2
3
4
5
file2:
5
6
7
8
9
file3:
9
10
11
12
13
file4:
13
14
15
16
17
file5:
17
18
19
20
That would make 4 files with 5 lines and 1 file with 4 lines.
As a first step, I tried the following commands to produce just the first file, which should contain the first 5 lines. I can't figure out why the awk command in the if statement prints all 20 lines instead of the first 5:
d=$(wc test)
a=$(echo $d | cut -f1 -d " ")
lines=$(echo $a/5 | bc -l)
integer=$(echo $lines | cut -f1 -d ".")
for i in $(seq 1 $integer); do
    start=$(echo $i*5 | bc -l)
    var=$((var+=1))
    echo start $start
    echo $var
    if [[ $var = 1 ]]; then
        awk 'NR<=$start' test
    fi
done
Thanks!
Why not just use the split utility from your POSIX toolkit? It has an option to split on a number of lines, which you can set to 5:
split -l 5 input-file
From the split man page:
-l, --lines=NUMBER
put NUMBER lines/records per output file
Note that -l is also POSIX-compliant.
$ ls
$
$ seq 20 | awk 'NR%4==1{ if (out) { print > out; close(out) } out="file"++c } {print > out}'
$
$ ls
file1 file2 file3 file4 file5
$ cat file1
1
2
3
4
5
$ cat file2
5
6
7
8
9
$ cat file3
9
10
11
12
13
$ cat file4
13
14
15
16
17
$ cat file5
17
18
19
20
If you're ever tempted to use a shell loop to manipulate text again, make sure to read https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice first to understand at least some of the reasons to use awk instead. To learn awk, get the book Effective Awk Programming, 4th Edition, by Arnold Robbins.
Oh, and regarding why your awk command awk 'NR<=$start' test didn't work: awk is not shell; it has no more access to shell variables (or vice versa) than a C program does. To initialize an awk variable named awkstart with the value of a shell variable named start, and then use that awk variable in your script, you'd do awk -v awkstart="$start" 'NR<=awkstart' test. The awk variable can also be named start or anything else sensible; it is completely unrelated to the name of the shell variable.
You could improve your code by removing the unnecessary echo, cut, and bc calls and doing it like this:
#!/bin/bash
for i in $(seq $(wc -l < test)); do
    (( i % 4 != 1 )) && continue
    tail -n +"$i" test | head -5 > "file$(( 1 + i/4 ))"
done
But the awk solution is still much better. Reading the file only once and taking actions based on readily available information (like the line number) is the way to go. In shell you have to count the lines first; there is no way around it. awk gives you that (and a lot of other things) for free.
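For readability, here is the same awk one-liner from above, spread out with comments (same logic, reading the question's file test instead of seq output):

awk '
NR % 4 == 1 {           # every chunk boundary (lines 1, 5, 9, ...)
    if (out) {          # if a previous output file is open,
        print > out     #   write the boundary line to it as well
        close(out)      #   and close it
    }
    out = "file" ++c    # start the next output file
}
{ print > out }         # every line goes to the current file
' test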
Use split:
$ seq 20 | split -l 5
$ for fn in x*; do echo "$fn"; cat "$fn"; done
xaa
1
2
3
4
5
xab
6
7
8
9
10
xac
11
12
13
14
15
xad
16
17
18
19
20
Or, if you have a file:
$ split -l 5 test_file
Note that split by itself will not repeat each chunk's last line at the start of the next file; the awk approach above handles that.

How to combine columns from multiple text files? [duplicate]

I want to extract and combine a certain column from a bunch of text files into a single file as shown.
File1_example.txt
A 123 1
B 234 2
C 345 3
D 456 4
File2_example.txt
A 123 5
B 234 6
C 345 7
D 456 8
File3_example.txt
A 123 9
B 234 10
C 345 11
D 456 12
...
File100_example.txt
A 123 55
B 234 66
C 345 77
D 456 88
How can I loop through my files of interest and paste these columns together so that the final result looks like the following, without having to type out 100 unique file names?
1 5 9 ... 55
2 6 10 ... 66
3 7 11 ... 77
4 8 12 ... 88
Try this:
paste File[0-9]*_example.txt | awk '{i=3;while($i){printf("%s ",$i);i+=3}printf("\n")}'
Example:
File1_example.txt:
A 123 1
B 234 2
C 345 3
D 456 4
File2_example.txt:
A 123 5
B 234 6
C 345 7
D 456 8
Run command as:
$ paste File[0-9]*_example.txt | awk '{i=3;while($i){printf("%s ",$i);i+=3}printf("\n")}'
Output:
1 5
2 6
3 7
4 8
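For readability, here is the same pipeline spread out, with a for loop over NF instead of the while($i) test (the while form would stop early at a field whose value is 0; this sketch assumes three space-separated columns per file, as in the examples):

paste File[0-9]*_example.txt | awk '{
    # after paste, each file contributes 3 fields per output line,
    # so fields 3, 6, 9, ... hold the third column of each file
    for (i = 3; i <= NF; i += 3)
        printf("%s ", $i)
    printf("\n")
}'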
I tested the code below with the first 3 files:
cat File*_example.txt | awk '{a[$1$2]= a[$1$2] $3 " "} END{for(x in a){print a[x]}}' | sort
1 5 9
2 6 10
3 7 11
4 8 12
1) Use an awk array: a[$1$2] = a[$1$2] $3 " ". The index is column 1 plus column 2, and the value appends every column 3.
2) END{for(x in a){print a[x]}} traverses array a and prints all its values. (The order of for (x in a) is unspecified, hence the sort.)
3) Use sort to sort the output.
When cat'ing, you need to ensure the file order is preserved (the glob File*_example.txt would sort File10 before File2); one way is to specify the files explicitly:
cat File{1..100}_example.txt | awk '{print $NF}' | pr -100ts' '
This extracts the last column with awk and columnates it with pr, one column per file (with this many files you may also need pr's -w option to widen the page).
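A pure-awk alternative sketch that avoids pr entirely (it assumes every file has the same number of rows):

# build each output row by appending the last field of the
# corresponding row from every file, in command-line order
awk '{ row[FNR] = (NR == FNR ? "" : row[FNR] " ") $NF }
     END { for (i = 1; i <= FNR; i++) print row[i] }' File{1..100}_example.txt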

Updating the n-th column in a CSV using awk [duplicate]

Input file
1,2,3,4,5,6,7,8,9,10
11,22,33,44,55,66,77,88,99,100
111,222,333,444,555,666,777,888,999,1000
Expected Output
1,2,3,4,5,6,7,8MNINS,9,10
11,22,33,44,55,66,77,88MNINS,99,100
111,222,333,444,555,666,777,888MNINS,999,1000
I tried the following command:
awk -F "," '{$8=$8"MNINS"}1' 1.csv > 2.csv
output:
1 2 3 4 5 6 7 8MNINS 9 10
11 22 33 44 55 66 77 88MNINS 99 100
111 222 333 444 555 666 777 888MNINS 999 1000
It has removed all the commas, so my CSV file is turning into a space-separated file.
Please help.
You need to specify a comma as the output field separator (OFS) as well:
awk -F "," -v OFS="," '{$8=$8"MNINS"}1' 1.csv > 2.csv

Counting the number of 10-digit numbers in a file

I need to count the total number of instances in which a 10-digit number appears within a file. All of the numbers have leading zeros, e.g.:
This is some text. 0000000001
Returns:
1
If the same number appears more than once, it is counted again, e.g.:
0000000001 This is some text.
0000000010 This is some more text.
0000000001 This is some other text.
Returns:
3
Sometimes there are no spaces between the numbers, but each continuous string of 10-digits should be counted:
00000000010000000010000000000100000000010000000001
Returns:
5
How can I determine the total number of 10-digit numbers appearing in a file?
Try this:
grep -o '[0-9]\{10\}' inputfilename | wc -l
The last requirement - that multiple numbers per line must be counted - seemed to exclude grep; as far as I knew, it could only count per line.
Edit: Obviously, I stand corrected by Nate :) grep's -o option is what I was looking for.
You can however do this easily with sed like this:
$ cat mkt.sh
# turn every non-digit into a dot, replace each run of 10 digits with
# the word "num ", then delete whatever digits and dots remain
sed -r -e 's/[^0-9]/./g' -e 's/[0-9]{10}/num /g' -e 's/[0-9.]//g' "$1"
$ for i in *.txt; do echo "--- $i"; cat "$i"; echo "--- number count"; ./mkt.sh "$i" | wc -w; done
--- 1.txt
This is some text. 0000000001
--- number count
1
--- 2.txt
0000000001 This is some text.
0000000010 This is some more text.
0000000001 This is some other text.
--- number count
3
--- 3.txt
00000000010000000010000000000100000000010000000001
--- number count
5
--- 4.txt
1 2 3 4 5 6 6 7 9 0
11 22 33 44 55 66 77 88 99 00
123456789 0
--- number count
0
--- 5.txt
1.2.3.4.123
1234567890.123-AbceCMA-5553///q/\1231231230
--- number count
2
$
This might work for you:
cat <<! >test.txt
0000000001 This is some text.
0000000010 This is some more text.
0000000001 This is some other text.
00000000010000000010000000000100000000010000000001
1 a 2 b 3 c 4 d 5 e 6 f 7 g 8 h 9 i 0 j
12345 67890 12 34 56 78 90
!
sed 'y/X/ /;s/[0-9]\{10\}/\nX\n/g' test.txt | sed '/X/!d' | sed '$=;d'
8
"I need to count the total number of instances in which a 10-digit number appears within a file. All of the numbers have leading zeros"
So I think this might be more accurate:
$ grep -o '0[0-9]\{9\}' filename | wc -l
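And a small awk sketch of the same count, for awks that support ERE interval expressions (gawk does; gsub returns the number of non-overlapping matches it replaced):

awk '{ n += gsub(/0[0-9]{9}/, "") } END { print n+0 }' filename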
