Add numbers to the beginning of every line in a file - bash

How can I add numbers to the beginning of every line in a file?
E.g.:
This is
the text
from the file.
Becomes:
000000001 This is
000000002 the text
000000003 from the file.

Don't use cat or any other tool which is not designed to do that. Use the program:
nl - number lines of files
Example:
$ nl --number-format=rz --number-width=9 foobar
$ nl -n rz -w 9 foobar # short-hand
Because nl is made for it ;-)

AWK's printf, NR and $0 make it easy to have precise and flexible control over the formatting:
~ $ awk '{printf("%010d %s\n", NR, $0)}' example.txt
0000000001 This is
0000000002 the text
0000000003 from the file.

You're looking for the nl(1) command:
$ nl -nrz -w9 /etc/passwd
000000001 root:x:0:0:root:/root:/bin/bash
000000002 daemon:x:1:1:daemon:/usr/sbin:/bin/sh
000000003 bin:x:2:2:bin:/bin:/bin/sh
...
-w9 asks for numbers nine digits long; -nrz asks for the numbers to be formatted right-justified with zero padding.
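One caveat with nl: by default it numbers only non-empty lines and puts a tab between the number and the text. The following is a minimal sketch, assuming a POSIX-style nl; the function name number_all is made up for illustration. It numbers every line, blank ones included, and uses a single space as separator, as in the desired output of the question:

```shell
# number_all: number every line read from stdin.
#   -b a    number all lines (the default, -b t, skips empty lines)
#   -n rz   right-justified, zero-padded numbers
#   -w 9    numbers nine digits wide
#   -s ' '  one space between number and text (the default is a tab)
number_all() {
    nl -b a -n rz -w 9 -s ' '
}

# Three input lines, the second one empty; all three get a number.
printf 'This is\n\nthe text\n' | number_all
```

Leaving out -b a shows the default behaviour: the empty line is printed without a number and does not advance the counter.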

cat -n thefile will do the job, albeit with the numbers in a slightly different format.

Easiest, simplest option is
awk '{print NR,$0}' file
See comment above on why nl isn't really the best option.

Here's a bash script that will also do this:
#!/bin/bash
# Prefix each line of the file given as $1 with a zero-padded line number.
counter=1
filename=$1
while IFS= read -r line
do
printf "%010d %s\n" "$counter" "$line"
counter=$((counter + 1))
done < "$filename"

perl -pe 'printf "%09u ", $.' -- example.txt


How to remove consecutive repeating characters from every line?

I have the below lines in a file
Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;;Profilicollis;Profilicollis_altmani;
Acanthocephala;Eoacanthocephala;Neoechinorhynchida;Neoechinorhynchidae;;;;
Acanthocephala;;;;;;;
Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;;Polymorphus;;
and I want to remove the repeating semi-colon characters from all lines to look like below (note- there are repeating semi-colons in the middle of some of the above lines too)
Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;Profilicollis;Profilicollis_altmani;
Acanthocephala;Eoacanthocephala;Neoechinorhynchida;Neoechinorhynchidae;
Acanthocephala;
Acanthocephala;Palaeacanthocephala;Polymorphida;Polymorphidae;Polymorphus;
I would appreciate if someone could kindly share a bash one-liner to accomplish this.
You can use tr with "squeeze":
tr -s ';' < infile
perl -p -e 's/;+/;/g' myfile # writes output to stdout
or
perl -p -i -e 's/;+/;/g' myfile # does an in-place edit
If you want to edit the file itself:
printf "%s\n" 'g/;;/s/;\{2,\}/;/g' w | ed -s foo.txt
If you want to pipe a modified copy of the file to something else and leave the original unchanged:
sed 's/;\{2,\}/;/g' foo.txt | whatever
These replace runs of 2 or more semicolons with single ones.
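As a quick check of the squeeze behaviour described above, here is a small sketch that feeds sample lines through tr -s and through the sed substitution. The function name squeeze_semicolons is only illustrative, and the second input line is a made-up example:

```shell
# squeeze_semicolons: collapse every run of ';' on stdin into a single ';'.
squeeze_semicolons() {
    tr -s ';'
}

# A line from the question and a short line with runs in the middle.
printf '%s\n' 'Acanthocephala;;;;;;;' 'A;B;;C;;;' | squeeze_semicolons

# The sed form replaces runs of two or more semicolons with one.
printf '%s\n' 'A;B;;C;;;' | sed 's/;\{2,\}/;/g'
```

Both commands leave single semicolons untouched, so a line that is already clean passes through unchanged.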
This could be solved easily by substitutions.
Here is an awk solution that plays with the FS and OFS variables instead:
awk -F';+' -v OFS=';' '$1=$1' file
or, so that lines whose first field is empty or 0 are still printed:
awk -F';+' -v OFS=';' '($1=$1)||1' file
Here's a sed version of alaniwi's answer:
sed 's/;\+/;/g' myfile # Write output to stdout
or
sed -i 's/;\+/;/g' myfile # Edit the file in-place

How to add 100 spaces at end of each line of a file in Unix

I have a file that is supposed to contain 200 characters per line, but the source file I received has only 100 characters per line, so I need to append 100 extra spaces to each line. If it were only a few spaces, we could use sed like this:
sed 's/$/ /' filename > newfilename
Since it's 100 spaces, is there a practical way to do this in Unix?
If you want a fixed n characters per line and can't trust the input file to have exactly m characters per line, do the following. For an input file with a varying number of characters per line:
$ cat file
1
12
123
1234
12345
extend each line to 10 characters:
$ awk '{printf "%-10s\n", $0}' file | cat -e
1         $
12        $
123       $
1234      $
12345     $
Obviously, change 10 to 200 in your script. The $ printed by cat -e only marks the end of each line; it is not a character in the output. You don't need cat -e; it is used here just to show that the lines have been extended.
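Note that %-10s only pads: a line that is already longer than the width is printed in full. The sketch below (the function name pad_to_width is hypothetical) adds a precision so that longer lines are also cut to the width. Whether truncation is wanted is an assumption about the fixed-width format, so drop the precision if long lines must survive intact:

```shell
# pad_to_width WIDTH: pad each line of stdin with trailing spaces to WIDTH
# characters, and truncate lines longer than WIDTH.
# The format string is assembled in BEGIN, e.g. "%-10.10s\n" for WIDTH=10.
pad_to_width() {
    awk -v w="$1" 'BEGIN { fmt = "%-" w "." w "s\n" } { printf fmt, $0 }'
}

# One short and one long line; both come out exactly 10 characters wide.
printf '%s\n' '1' '12345678901234' | pad_to_width 10 | awk '{ print length($0) }'
```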
With awk
awk '{printf "%s%100s\n", $0, ""}' file.dat
$0 refers to the entire line.
Updated after Glenn's suggestion
As Glenn suggests in the comments, the substitution is unnecessary; you can simply append the spaces. Taking that logic further, you don't even need to concatenate: you can just say the spaces after the original line.
perl -nlE 'say $_," "x100' file
Original Answer
With Perl:
perl -pe 's/$/" " x 100/e' file
That says... "Substitute (s) the end of each line ($) with the calculated expression (e) of 100 repetitions of a space".
If you wanted to pad all lines to, say, 200 characters even if the input file was ragged (lines of differing lengths), you could use something like this:
perl -lpe '$pad=200-length;s/$/" " x $pad/e' file
which would pad lines of 83, 102 and 197 characters to 200 each. The -l strips the trailing newline before length is taken, so the newline is not counted as one of the 200 characters.
If you use Bash, you can still use sed, but use some readline functionality to keep you from manually typing 100 spaces (see manual for "Readline arguments").
You start typing normally:
sed 's/$/
Now, you want to insert 100 spaces. You can do this by preceding the space bar with a readline argument that says it should be repeated 100 times, i.e., you manually enter what would look like this as a readline keybinding:
M-1 0 0 \040
Or, if your meta key is the Alt key: Alt+1, 0, 0, Space
This inserts 100 spaces, and you get (with 100 spaces between the last two slashes)
sed 's/$/ /' filename
after typing the rest of the command.
This is useful for working in an interactive shell, but not very pretty for scripts – use any of the other solutions for that.
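For scripts, the same sed substitution can be kept without typing the spaces by generating them first. This is a minimal sketch, with the variable name spaces and the function name append_spaces chosen only for illustration:

```shell
# Build a string of 100 spaces: printf pads the empty string to width 100.
spaces=$(printf '%100s' '')

# append_spaces: append those 100 spaces to every line of stdin.
# Inside double quotes, \$ stays a literal $ (end of line for sed), while
# $spaces is expanded by the shell.
append_spaces() {
    sed "s/\$/$spaces/"
}

# A 3-character line grows to 103 characters.
printf 'abc\n' | append_spaces | awk '{ print length($0) }'
```

Command substitution strips only trailing newlines, so the spaces themselves are preserved in the variable.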
Just in case you are looking for a bash solution,
while IFS= read -r line
do
printf "%s%100s\n" "$line" ""
done < file > newfile
Test
Say I have a file with 3 lines in it:
$ wc -c file
16 file
$ wc -c newfile
316 newfile
Original Answer
spaces=$(echo {1..101} | tr -d 0-9)
while IFS= read -r line
do
printf '%s%s\n' "$line" "$spaces" >> newfile
done < file
You can use printf in awk:
awk '{printf "%s%*.s\n", $0, 100, " "}' filename > newfile
This printf will append 100 spaces at the end of each line.
Another way in GNU awk using string-manipulation function sprintf.
awk 'BEGIN{s=sprintf("%-100s", "");}{print $0 s}' input-file > file-with-spaces
An example as proof:
$ cat input-file
1234jjj hdhyvb 1234jjj
6789mmm mddyss skjhude
khora77 koemm sado666
nn1004 nn1004 457fffy
$ wc -c input-file
92 input-file
$ awk 'BEGIN{s=sprintf("%-100s", "");}{print $0 s}' input-file > file-with-spaces
$ wc -c file-with-spaces
492 file-with-spaces

Reverse strings in a text file using command-line tools like AWK or SED

I have a text file containing lines like
ABCD
1234
bear
Anders Ångström
And I need to transform it into
DCBA
4321
raeb
mörtsgnÅ srednA
I need to do this on the command line, so I think AWK is the best tool for the job, but I might be wrong. Maybe it's possible to do it with a regex replace? Notepad2 supports regex replace.
Later edit: I need to do it in Windows (I'm using GNUWin32 tools) and I need it to work with Unicode characters like Å, ö, ë, Ş, Ĕ, etc.
Use rev command like below.
$ rev < file
DCBA
4321
raeb
Try this with rev command:
rev File
Sample:
AMD$ cat File
ABCD
1234
bear
AMD$ rev File
DCBA
4321
raeb
man rev :
The rev utility copies the specified files to the standard output,
reversing the order of characters in every line. If no files are
specified, the standard input is read.
Here is an awk solution:
awk -vFS= '{do printf $(NF);while(--NF>0);print ""}' file
DCBA
4321
raeb
Or like this:
awk -vFS= '{for (i=NF;i>=1;i--) printf $(i);print ""}' file
DCBA
4321
raeb
For a more robust use of printf, change printf $(i) to printf "%s",$(i), so that a % in the data is not interpreted as a format specifier.
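The safer "%s" form can be sketched as follows. It assumes an awk that splits a record into single characters when FS is empty (gawk, mawk and BusyBox awk do; POSIX leaves this undefined), and the function name reverse_lines is made up for the example:

```shell
# reverse_lines: reverse the characters of each line read from stdin.
# With an empty FS every character becomes its own field, so the loop
# walks the fields from last to first.
# printf "%s" prints each character as data, never as a format string.
reverse_lines() {
    awk -v FS= '{ for (i = NF; i >= 1; i--) printf "%s", $i; print "" }'
}

# The input contains a % sign, which the "%s" form prints literally.
printf '%s\n' '100%d' | reverse_lines
```

For the input 100%d this prints d%001.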
If you are interested in Sed solution, here it is:
+$ sed '/\n/!G;s/\(.\)\(.*\n\)/&\2\1/;//D;s/.//' <<EOF
+> ABCD
+> 1234
+> bear
+> EOF
DCBA
4321
raeb
You may find more write-only programs in Sed at http://sed.sourceforge.net/sed1line.txt.
You can use the rev command.
rev < filename > output_filename
The output will be stored in that file. If you omit the output redirection, it will be printed to stdout.
This Perl one-liner may do it.
I'm not sure whether you have Perl for Windows, though. If not, copy the file to a Linux server and convert it there.
perl -ne 'chomp;print scalar reverse . "\n";' file
DCBA
4321
raeb
mörtsgnÅ srednA
This bash script also works on Linux:
cat convert
#!/bin/bash
input="$1"
for (( i=${#input}-1; i>=0; i-- ));do
reverse="$reverse${input:$i:1}"
done
echo "$reverse"
./convert Ångström
mörtsgnÅ

"grep"ing first 12 of last 24 character from a line

I am trying to extract the "first 12 of the last 24 characters" from a line, i.e.,
for a line:
species,subl,cmp= 1 4 1 s1,torque= 0.41207E-09-0.45586E-13
I need to extract "0.41207E-0".
(I have not written the code, so don't curse me for its formatting. )
I have managed to do this via:
var_s=`grep "species,subl,cmp= $3 $4 $5" $tfile |sed -n '$s/.*\(........................\)$/\1/p'|sed -n '$s/\(............\).*$/\1/p'`
But is there a more readable way of doing this, rather than counting dots?
EDIT
Thanks to both of you.
So I have sed, awk, grep and bash.
I will run this in a loop over hundreds of files,
so can you also suggest which one is the most efficient with respect to time?
One way with GNU sed (without counting dots):
$ sed -r 's/.*(.{11}).{12}/\1/' file
0.41207E-09
Similarly with GNU grep:
$ grep -Po '.{11}(?=.{12}$)' file
0.41207E-09
Perhaps a python solution may also be helpful:
python3 -c 'import sys; print("\n".join([a[-24:-13] for a in sys.stdin]))' < file
0.41207E-09
I'm not sure your example data and question match up so just change the values in the {n} quantifier accordingly.
Simplest is using pure bash:
echo "${str:(-24):12}"
OR awk can also do that:
awk '{print substr($0, length($0)-23, 12)}' <<< "$str"
OUTPUT:
0.41207E-09
EDIT: For using bash solution on a file:
while IFS= read -r l; do echo "${l:(-24):12}"; done < file
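To see exactly which characters each window selects, the sample line can be fed through a small helper. This is only a sketch: it uses awk's substr so that it runs in any POSIX shell, takes the sample line exactly as written in the question (single spaces), and the names first12_of_last24 and line are made up for illustration:

```shell
# first12_of_last24: print the first 12 of the last 24 characters of each
# line on stdin. The last 24 characters start at position length-23.
first12_of_last24() {
    awk '{ print substr($0, length($0) - 23, 12) }'
}

line='species,subl,cmp= 1 4 1 s1,torque= 0.41207E-09-0.45586E-13'

# Brackets make a leading or trailing space in the result visible.
printf '%s\n' "$line" | first12_of_last24 | sed 's/^/[/; s/$/]/'

# The 11-character window used by the sed and grep answers:
printf '%s\n' "$line" | awk '{ print substr($0, length($0) - 22, 11) }'
```

On this line the 12-character window starts with the space before 0.41207E-09, which is why the answers that print exactly 0.41207E-09 take 11 characters followed by 12.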
Another one, less efficient but has the advantage of making you discover new tools
echo "$str" | rev | cut -b 1-24 | rev | cut -b 1-12
You can use awk to get first 12 characters of last 24 characters from a line:
awk '{s = substr($0, length($0)-23); print substr(s, 1, 12)}' myfile.txt
