Is there a shell command to pick the n-th line of a string?
Example:
line1
line2
line3
pick line 2.
UPDATE: Thank you so far. With your help, I came up with this solution for a string:
Pick the 2nd line:
echo -e "1\n2\n3" | head -2 | tail -1
$ head -n N filename | tail -n 1
where N is your line number. It's a little inefficient, though, since it launches 2 processes.
Alternatively sed can do this. To print the 4th line:
$ sed -n 4p filename
This forum answer details 3 different methods for sed
# print line number 52
sed -n '52p' # method 1
sed '52!d' # method 2
sed '52q;d' # method 3, efficient on large files
Using gawk:
gawk -v n=3 'n==NR { print; exit }' a.txt
head -4 a.txt | tail -1
to print the 4th line of a.txt.
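A minimal sketch that wraps the sed approach in a function, assuming a POSIX shell. The name nth_line is made up for illustration. The q makes sed stop reading after the requested line, and with no file argument it reads standard input, which covers the "string" case:

```shell
# nth_line N [FILE]: print line N of FILE, or of standard input.
# (nth_line is a hypothetical helper name, not a standard command.)
nth_line() {
    n=$1
    shift
    # Address N: print that line, then quit without reading the rest.
    sed -n "${n}{p;q;}" "$@"
}

printf 'line1\nline2\nline3\n' | nth_line 2    # prints line2
```

If N is larger than the number of lines, nothing is printed.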
I have the following file (wishlist.txt):
Alligatoah Musik_ist_keine_lösung;https:///uhfhf
Alligatoah STRW;https:///uhfhf?i
Amewu Entwicklungshilfe;https:///uhfhf?i
and want to have the first word of line n.
so for n = 1:
Alligatoah
What I have so far is:
sed -e 's/\s.*//g' wishlist.txt
Is there an elegant way to get rid of all lines except n?
Edit:
How can I pass a bash variable "$i" to sed? Neither
sed -n '$is/ .*//p' $wishlist
nor
sed -n "\`${i}\`s/ .*//p" $wishlist
works.
A couple of other techniques to get the first word of the 3rd line:
awk -v line=3 'NR == line {print $1; exit}' file
or
head -n 3 file | tail -n 1 | cut -d ' ' -f 1
Something like this, for the 1st word of the 3rd line:
sed -n '3s/\s.*//p' wishlist.txt
To use a variable (note the double quotes):
line=3; sed -n "${line}s/\s.*//p" wishlist.txt
sed supports "addresses", so you can tell it what lines to operate on. To print only the first line, you can use
sed -e '1!d; s/\s.*//'
where 1!d means: on lines other than 1, delete the line.
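Putting the address and the variable together, here is a rough sketch as a function. first_word is a made-up name, and [[:space:]] is used instead of \s only because \s is a GNU sed extension:

```shell
# first_word N FILE: print the first whitespace-delimited word of line N.
# (first_word is a hypothetical helper name.)
first_word() {
    # Double quotes let the shell expand $1 before sed sees the script.
    # On line N: cut everything from the first whitespace on, print, quit.
    sed -n "${1}{s/[[:space:]].*//;p;q;}" "$2"
}
```

Called as first_word "$i" wishlist.txt, it should work inside a loop over i. The single-quoted attempt fails because the shell never expands $i inside single quotes.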
What's the easiest/quickest way to interleave the lines of two (or more) text files? Example:
File 1:
line1.1
line1.2
line1.3
File 2:
line2.1
line2.2
line2.3
Interleaved:
line1.1
line2.1
line1.2
line2.2
line1.3
line2.3
Sure, it's easy to write a little Perl script that opens them both and does the task. But I was wondering if it's possible to get away with less code, maybe a one-liner using Unix tools?
paste -d '\n' file1 file2
Here's a solution using awk:
awk '{print; if(getline < "file2") print}' file1
produces this output:
line 1 from file1
line 1 from file2
line 2 from file1
line 2 from file2
...etc
Using awk can be useful if you want to add some extra formatting to the output, for example if you want to label each line based on which file it comes from:
awk '{print "1: "$0; if(getline < "file2") print "2: "$0}' file1
produces this output:
1: line 1 from file1
2: line 1 from file2
1: line 2 from file1
2: line 2 from file2
...etc
Note: this code assumes that file1 is of greater than or equal length to file2.
If file1 contains more lines than file2 and you want to output blank lines for file2 after it finishes, add an else clause to the getline test:
awk '{print; if(getline < "file2") print; else print ""}' file1
or
awk '{print "1: "$0; if(getline < "file2") print "2: "$0; else print"2: "}' file1
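If you don't know which file is longer, a sketch along the following lines drops that assumption. The function name interleave and the awk variables other and line are mine, not from the answer above. It also tests getline's return value with > 0, because getline returns -1 (which awk treats as true) when the file cannot be read:

```shell
# interleave FILE1 FILE2: alternate lines; leftovers from the longer file
# are printed at the end. (interleave is a hypothetical helper name.)
interleave() {
    awk -v other="$2" '
        # For each line of FILE1, print it, then the next line of FILE2 if any.
        { print; if ((getline line < other) > 0) print line }
        # After FILE1 ends, flush whatever is left in FILE2.
        END { while ((getline line < other) > 0) print line }
    ' "$1"
}
```

Usage would be interleave file1 file2. Note that awk interprets backslash escapes in -v values, so a file name containing backslashes would need different handling.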
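@Sujoy's answer points in a useful direction. You can add line numbers, sort, and strip the line numbers (the -s makes the sort stable, so lines with the same number stay in file order; a plain sort -n breaks ties by comparing the rest of the line):
(cat -n file1 ; cat -n file2 ) | sort -s -n | cut -f2-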
Note (of interest to me) this needs a little more work to get the ordering right if instead of static files you use the output of commands that may run slower or faster than one another. In that case you need to add/sort/remove another tag in addition to the line numbers:
(cat -n <(command1...) | sed 's/^/1\t/' ; cat -n <(command2...) | sed 's/^/2\t/' ; cat -n <(command3) | sed 's/^/3\t/' ) \
| sort -n | cut -f2- | sort -n | cut -f2-
With GNU sed:
sed 'R file2' file1
Output:
line1.1
line2.1
line1.2
line2.2
line1.3
line2.3
Here's a GUI way to do it: Paste them into two columns in a spreadsheet, copy all cells out, then use regular expressions to replace tabs with newlines.
cat file1 file2 | sort -t. -k 2.1
Here it's specified that the separator is "." and that the sort key starts at the first character of the second field.
Here are two files where I need to eliminate the data that they do not have in common:
a.txt:
hello world
tom tom
super hero
b.txt:
hello dolly 1
tom sawyer 2
miss sunshine 3
super man 4
I tried:
grep -f a.txt b.txt >> c.txt
And this:
awk '{print $1}' test1.txt
because I need to check only if the first word of the line exists in the two files (even if not at the same line number).
But then what is the best way to get the following output in the new file?
output in c.txt:
hello dolly 1
tom sawyer 2
super man 4
Use awk where you iterate over both files:
$ awk 'NR == FNR { a[$1] = 1; next } a[$1]' a.txt b.txt
hello dolly 1
tom sawyer 2
super man 4
NR == FNR is only true for the first file making { a[$1] = 1; next } only run on said file.
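The same NR == FNR idiom can be flipped to list the lines that are not in common. A small sketch (not_in_first is a made-up name):

```shell
# not_in_first FILE_A FILE_B: print lines of FILE_B whose first word
# does not start any line of FILE_A. (Hypothetical helper name.)
not_in_first() {
    # First file: remember each first word. Second file: print if unseen.
    awk 'NR == FNR { seen[$1]; next } !($1 in seen)' "$1" "$2"
}
```

With the sample files, not_in_first a.txt b.txt should print only "miss sunshine 3". One caveat: if the first file is empty, NR == FNR is also true while reading the second file, so nothing is printed.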
Use sed to generate a sed script from the input, then use another sed to execute it.
sed 's=^=/^=;s= .*= /p=' a.txt | sed -nf- b.txt
The first sed turns your a.txt into
/^hello /p
/^tom /p
/^super /p
which prints (p) whenever a line contains hello, tom, or super at the beginning of line (^) followed by a space.
This combines grep, cut and sed with process substitution:
$ grep -f <(cut -d ' ' -f 1 a.txt | sed 's/^/^/;s/$/ /') b.txt
hello dolly 1
tom sawyer 2
super man 4
The output of the process substitution is this (piping to cat -A to show spaces):
$ cut -d ' ' -f 1 a.txt | sed 's/^/^/;s/$/ /' | cat -A
^hello $
^tom $
^super $
We then use this as input for grep -f, resulting in the above.
If your shell doesn't support process substitution, but your grep supports reading from stdin with the -f option (it should), you can use this instead:
$ cut -d ' ' -f 1 a.txt | sed 's/^/^/;s/$/ /' | grep -f - b.txt
hello dolly 1
tom sawyer 2
super man 4
I'm new with bash, and I want to combine two lines from different files when the same word is found in those lines.
E.g.:
File 1:
organism 1
1 NC_001350
4 NC_001403
organism 2
1 NC_001461
1 NC_001499
File 2:
NC_001499 » Abelson murine leukemia virus
NC_001461 » Bovine viral diarrhea virus 1
NC_001403 » Fujinami sarcoma virus
NC_001350 » Saimiriine herpesvirus 2 complete genome
NC_022266 » Simian adenovirus 18
NC_028107 » Simian adenovirus 19 strain AA153
I wanted an output like:
File 3:
organism 1
1 NC_001350 » Saimiriine herpesvirus 2 complete genome
4 NC_001403 » Fujinami sarcoma virus
organism 2
1 NC_001461 » Bovine viral diarrhea virus 1
1 NC_001499 » Abelson murine leukemia virus
Is there any way to get anything like that output?
You can get something pretty similar to your desired output like this:
awk 'NR == FNR { a[$1] = $0; next }
{ print $1, ($2 in a ? a[$2] : $2) }' file2 file1
This reads in each line of file2 into an array a, using the first field as the key. Then for each line in file1 it prints the first field followed by the matching line in a if one is found, else the second field.
If the spacing is important, then it's a little more effort but totally possible.
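One possible way to keep the spacing, sketched under the assumption that the ID is the first whitespace-delimited field in the lookup file and the second field in the data file (annotate and desc are names I made up). Instead of rebuilding the line from fields, it prints the original line and appends the rest of the lookup line:

```shell
# annotate LOOKUP DATA: append to each DATA line the description that LOOKUP
# gives for its second field, leaving the DATA line's own spacing untouched.
# (annotate is a hypothetical helper name.)
annotate() {
    awk '
        NR == FNR {
            key = $1
            # Drop the key itself; keep the separator and the description.
            sub(/^[[:space:]]*[^[:space:]]+/, "")
            desc[key] = $0
            next
        }
        ($2 in desc) { print $0 desc[$2]; next }
        { print }
    ' "$1" "$2"
}
```

Usage would be annotate file2 file1. Lines such as "organism 1", whose second field is not a known ID, pass through unchanged.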
For a more Bash 4-ish solution:
declare -A descriptions
# Map each ID in file2 to " » description" (no eval, no multi-byte cut delimiter).
while IFS= read -r line; do
    name=${line%%»*}              # text before the first »
    name=${name//[[:space:]]/}    # strip the whitespace around the ID
    [[ -n $name ]] && descriptions[$name]=" »${line#*»}"
done < file2
# Append the description to every file1 line whose second field is a known ID.
while IFS= read -r line; do
    name=$(echo "$line" | cut -d ' ' -f 2)
    if [[ -n "$name" && -n "${descriptions[$name]}" ]]; then
        echo "${line}${descriptions[$name]}"
    else
        echo "$line"
    fi
done < file1
We could create a sed script from the second file and apply it to the first file. It is straightforward: we use the sed s command to construct another sed s command from each line and store the result in a variable for later use:
sc=$(sed -rn 's#^\s*(\w+)(\W+)(.*)$#s/\1/\1\2\3/g;#g; p;' file2 )
sed "$sc" file1
The first command looks weird because we use # as the delimiter in the outer sed s command and the more common / in the inner one.
Run echo "$sc" to study the inner one. It just takes the parts of each line of file2 into different capture groups and then combines the captured strings into an s/find/replace/g; command where
find is \1
replace is \1\2\3
You want to rebuild file2 into a sed-command file.
sed 's# \(\w\+\) \(.*\)#s/\1/\1 \2/#' File2
You can use process substitution to use the result without storing it in a temp file.
sed -f <(sed 's# \(\w\+\) \(.*\)#s/\1/\1 \2/#' File2) File1
How can I remove the line before a line that matches a pattern?
Or
the opposite of:
sed -n '/pattern/{g;1!p;};h'
Use tac | sed | tac (Linux/Solaris): once the file is reversed, the line to remove is the next line after the matching pattern :)
sed is an excellent tool for simple substitutions on a single line, for anything else just use awk:
$ cat file
here is a
a bad line before
a good line
in a file
$ awk 'NR==FNR{if (/good/) del[NR-1]; next} !(FNR in del)' file file
here is a
a good line
in a file
You can use the above idiom to delete any number of lines before and/or after a given pattern, e.g. to delete the 3 lines before and 2 lines after a given target:
$ cat file
-5
-4
-3
-2
-1
target
+1
+2
+3
+4
+5
$
$ awk 'NR==FNR{if (/target/) for (i=-3;i<=2;i++) del[NR+i]; next} !(FNR in del)' file file
-5
-4
+3
+4
+5
or to leave the target in place and just delete the lines around it:
$ awk 'NR==FNR{if (/target/) for (i=-3;i<=2;i++) if (i!=0) del[NR+i]; next} !(FNR in del)' file file
-5
-4
target
+3
+4
+5
All very clear, trivial, and scalable...
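To make the counts adjustable, the idiom could be wrapped roughly like this. The name delete_around and its parameters are hypothetical. The file is passed twice on purpose: the first pass records which line numbers to drop, the second pass prints the rest:

```shell
# delete_around PATTERN BEFORE AFTER FILE: delete BEFORE lines before and
# AFTER lines after each line matching PATTERN, keeping the match itself.
# (delete_around is a hypothetical helper name; PATTERN is an awk regex and
# goes through -v, so backslashes in it are subject to escape processing.)
delete_around() {
    awk -v pat="$1" -v before="$2" -v after="$3" '
        NR == FNR {
            # Pass 1: mark the neighbours of every matching line.
            if ($0 ~ pat)
                for (i = -before; i <= after + 0; i++)
                    if (i != 0) del[NR + i]
            next
        }
        # Pass 2: print every line that was not marked.
        !(FNR in del)
    ' "$4" "$4"
}
```

For the original question, delete_around pattern 1 0 file removes just the line before each match.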
For "relatively complex" navigation around a search expression, ed might be a good solution (comments are not part of the command):
ed testfile << EOF
/r.*o/ # Search the pattern
-1d # delete one line above
w # write
EOF
Here is an example (using <<< and \n to write as a single line):
sh$ cat testfile
john
paul
george
ringo
sh$ ed testfile <<< $'/r.*o/\n-1d\nw'
23
ringo
16
sh$ cat testfile
john
paul
ringo
You can reverse the file, delete the line after the matched pattern (which is simple), and then reverse the result. Here is the code:
tail -r file | sed '/pattern/{n;d;}' | tail -r
Here is another awk:
awk '/pattern/ {f=1} !f&&NR>1 {print p} {p=$0;f=0} END {print p}' file
A tac awk version:
tac file | awk '1; /pattern/ {getline}' | tac
PS: getline should normally be avoided since it has many pitfalls, so here is a version without it:
tac file | awk '!p||NR!=p+1; /pattern/ {p=NR}' | tac
This might work for you (GNU sed):
sed '$!N;/\n.*pattern/!P;D' file
Keep a window of 2 lines and test the second of them for the pattern. If the pattern is present do not print the first line.
This would work, but only if the line count is more than 1 and the pattern is not on the last line.
sed -n '/pattern/ { h; b }; 1 { h; b }; ${ H }; x; p' file
Better to use awk instead:
awk '!/pattern/ && NR > 1 { print p } { p = $0 } END { if (NR) print p }' file