Command to get nth line of STDOUT - bash

Is there any bash command that will let you get the nth line of STDOUT?
That is to say, something that would take this
$ ls -l
-rw-r--r--@ 1 root wheel my.txt
-rw-r--r--@ 1 root wheel files.txt
-rw-r--r--@ 1 root wheel here.txt
and do something like
$ ls -l | magic-command 2
-rw-r--r--@ 1 root wheel files.txt
I realize this would be bad practice when writing scripts meant to be reused, BUT when working with the shell day to day it'd be useful to me to be able to filter my STDOUT in such a way.
I also realize this would be a semi-trivial command to write (buffer STDOUT, return a specific line), but I want to know if there's some standard shell command to do this that would be available without me dropping a script into place.

Using sed, just for variety:
ls -l | sed -n 2p
This alternative looks more efficient, since it stops reading the input once the required line is printed, but it may generate a SIGPIPE in the feeding process, which may in turn generate an unwanted error message:
ls -l | sed -n -e '2{p;q}'
I've seen that often enough that I usually use the first (which is easier to type, anyway), though ls is not a command that complains when it gets SIGPIPE.
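If you want to see the SIGPIPE effect for yourself, here is a quick check in bash (yes is just a convenient never-ending producer; 141 is 128+13, the exit status of a process killed by SIGPIPE):
yes | sed -n '2{p;q}'
echo "${PIPESTATUS[@]}"   # typically prints: 141 0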
For a range of lines:
ls -l | sed -n 2,4p
For several ranges of lines:
ls -l | sed -n -e 2,4p -e 20,30p
ls -l | sed -n -e '2,4p;20,30p'

ls -l | head -2 | tail -1
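Generalized for an arbitrary line number, with N as a shell variable you set yourself:
N=2
ls -l | head -n "$N" | tail -n 1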

Alternative to the nice head / tail way:
ls -al | awk 'NR==2'
or
ls -al | sed -n '2p'

From sed1line:
# print line number 52
sed -n '52p' # method 1
sed '52!d' # method 2
sed '52q;d' # method 3, efficient on large files
From awk1line:
# print line number 52
awk 'NR==52'
awk 'NR==52 {print;exit}' # more efficient on large files

For the sake of completeness ;-)
shorter code
find / | awk NR==3
shorter life
find / | awk 'NR==3 {print $0; exit}'

Try this sed version:
ls -l | sed '2 ! d'
It says "delete all the lines that aren't the second one".

You can use awk:
ls -l | awk 'NR==2'
Update
The above code will not get what we want because of an off-by-one error: the first line of ls -l output is the "total" line. For that, the following revised code will work:
ls -l | awk 'NR==3'

Another poster suggested
ls -l | head -2 | tail -1
but if you pipe head into tail, it looks like everything up to line N is processed twice.
Piping tail into head
ls -l | tail -n +2 | head -n1
would be more efficient?
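One rough way to test that claim yourself (a sketch; big.txt is just a throwaway test file):
seq 1000000 > big.txt
time head -n 500000 big.txt | tail -n 1   # everything up to line N flows through both commands
time tail -n +500000 big.txt | head -n 1  # head exits after a single line
Both print line 500000; compare the timings.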

Is Perl easily available to you?
$ perl -n -e 'if ($. == 7) { print; exit(0); }'
Obviously substitute whatever number you want for 7.

Yes, the most efficient way (as already pointed out by Jonathan Leffler) is to use sed with print & quit:
set -o pipefail # cf. help set
time -p ls -l | sed -n -e '2{p;q;}' # only print the second line & quit (on Mac OS X)
echo "$?: ${PIPESTATUS[*]}" # cf. man bash | less -p 'PIPESTATUS'

Hmm
sed did not work in my case.
I propose:
for "odd" lines 1,3,5,7... ls |awk '0 == (NR+1) % 2'
for "even" lines 2,4,6,8 ls |awk '0 == (NR) % 2'

For more completeness..
ls -l | (for ((x=0;x<2;x++)) ; do read ; done ; head -n1)
Throw away lines until you get to the second, then print out the first line after that. So, it prints the 3rd line.
If it's just the second line..
ls -l | (read; head -n1)
Put as many 'read's as necessary.
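The same idea wrapped in a small function (a sketch; nth is a hypothetical name, not a standard command):
nth() { local i; for ((i = 1; i < $1; i++)); do read -r; done; head -n 1; }
ls -l | nth 3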

Related

How can I combine a set of text files, leaving off the first line of each?

As part of a normal workflow, I receive sets of text files, each containing a header row. It's more convenient for me to work with these as a single file, but if I cat them naively, the header rows in files after the first cause problems.
The files tend to be large enough (10^3–10^5 lines, 5–50 MB) and numerous enough that it's awkward and/or tedious to do this in an editor or step-by-step, e.g.:
$ wc -l *
20251 1.csv
124520 2.csv
31158 3.csv
175929 total
$ tail -n 20250 1.csv > 1.tmp
$ tail -n 124519 2.csv > 2.tmp
$ tail -n 31157 3.csv > 3.tmp
$ cat *.tmp > combined.csv
$ wc -l combined.csv
175926 combined.csv
It seems like this should be doable in one line. I've isolated the arguments that I need but I'm having trouble figuring out how to match them up with tail and subtract 1 from the line total (I'm not comfortable with awk):
$ wc -l * | grep -v "total" | xargs -n 2
20251 foo.csv
124520 bar.csv
31158 baz.csv
87457 zappa.csv
7310 bingo.csv
29968 niner.csv
2086 hella.csv
$ wc -l * | grep -v "total" | xargs -n 2 | tail -n
tail: option requires an argument -- n
Try 'tail --help' for more information.
xargs: echo: terminated by signal 13
You don't need to use wc -l to calculate the number of lines to output; tail can skip the first line (or, more generally, start output at the Kth line) just by adding a + symbol when using the -n (or --lines) option, as described in the man page:
-n, --lines=K
       output the last K lines, instead of the last 10;
       or use -n +K to output starting with the Kth
This makes combining all files in a directory without the first line of each file as simple as:
$ tail -q -n +2 * > combined.csv
$ wc -l *
20251 foo.csv
124520 bar.csv
31158 baz.csv
87457 zappa.csv
7310 bingo.csv
29968 niner.csv
2086 hella.csv
302743 combined.csv
605493 total
The -q flag suppresses headers in the output when globbing for multiple files with tail.
Both tail and sed answers work fine.
For the sake of an alternative here is an awk command that does the same job:
awk 'FNR > 1' *.csv > combined.csv
The FNR > 1 condition skips the first row of each file: FNR is the line number within the current file, while NR counts lines across all input files.
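If you want to keep a single header row (from the first file) rather than dropping them all, a small variant of the same idea works, since NR == 1 is only true on the very first line overall:
awk 'NR == 1 || FNR > 1' *.csv > combined.csv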
With GNU sed:
sed -ns '2,$p' 1.csv 2.csv 3.csv > combined.csv
or
sed -ns '2,$p' *.csv > combined.csv
Another sed alternative
sed -s 1d *.csv
deletes the first line from each input file; without -s, only the first line of the first file would be deleted.

How do I pipe the last command in my command history to clipboard?

I'm a total noob when it comes to grep/awk/sed/cut, so I need help with this. I've got this: history | tail -n 1 | pbcopy, which returns 1968* mv ~/iPhoto\ Library.zip ./ ; bell. That's great because it's the last command I ran, but I need to remove the numbers at the beginning. I've tried various iterations of awk, grep, sed and cut, but like I said, I'm a noob when it comes to those kinds of commands. How would I do that?
You could try this sed command (note that pbcopy has to come last: it produces no output, so nothing would reach sed otherwise):
history | tail -n 1 | sed 's/^ *[0-9]*\** *//' | pbcopy
Through awk,
history | tail -n 1 | awk '{sub(/^ *[0-9]+\*? */,"")}1' | pbcopy
Output:
mv ~/iPhoto\ Library.zip ./ ; bell
Just pipe your output to
awk '{for(i=2;i<NF;i++)printf "%s",$i OFS; if (NF) printf "%s",$NF; printf ORS}'
Output:
mv ~/iPhoto\ Library.zip ./ ; bell
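In the original pipeline that would be, again with pbcopy last so that the cleaned text is what reaches the clipboard:
history | tail -n 1 | awk '{for(i=2;i<NF;i++)printf "%s",$i OFS; if (NF) printf "%s",$NF; printf ORS}' | pbcopy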
In zsh you can just tell history (which is a synonym for fc -l) not to print the numbers with -n. Also, you can get it to print only the last entry with -1:
history -n -1 | pbcopy
fc -l -n -1 | pbcopy
In bash, history has no options for this, but there is also the fc builtin, which supports the needed options. Unfortunately, 'suppress command numbers' (from man 1 bash) seems to mean 'print a TAB instead', so the output starts with a TAB and a space, which can be removed with sed:
fc -l -n -1 | sed 's/^\t //' | pbcopy

Getting head to display all but the last line of a file: command substitution and standard I/O redirection

I have been trying to get the head utility to display all but the last line of standard input. The actual code that I needed is something along the lines of cat myfile.txt | head -n $(($(wc -l)-1)). But that didn't work. I'm doing this on Darwin/OS X which doesn't have the nice semantics of head -n -1 that would have gotten me similar output.
None of these variations work either.
cat myfile.txt | head -n $(wc -l | sed -E -e 's/\s//g')
echo "hello" | head -n $(wc -l | sed -E -e 's/\s//g')
I tested out more variations and in particular found this to work:
cat <<EOF | echo $(($(wc -l)-1))
>Hola
>Raul
>Como Esta
>Bueno?
>EOF
3
Here's something simpler that also works.
echo "hello world" | echo $(($(wc -w)+10))
This one understandably gives me an illegal line count error. But it at least tells me that the head program is not consuming the standard input before passing stuff on to the subshell/command substitution, a remote possibility, but one that I wanted to rule out anyway.
echo "hello" | head -n $(cat && echo 1)
What explains the behavior of head and wc and their interaction through subshells here? Thanks for your help.
head -n -1 (with GNU head) will give you all except the last line of its input.
head is the wrong tool. If you want to see all but the last line, use:
sed \$d
The reason that
# Sample of incorrect code:
echo "hello" | head -n $(wc -l | sed -E -e 's/\s//g')
fails is that wc consumes all of the input and there is nothing left for head to see. wc inherits its stdin from the subshell in which it is running, which is reading from the output of the echo. Once it consumes the input, it returns and then head tries to read the data...but it is all gone. If you want to read the input twice, the data will have to be saved somewhere.
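If you really do need to read the data twice, one way is to save the stream to a temporary file first (a minimal sketch using mktemp):
tmp=$(mktemp)
cat myfile.txt > "$tmp"
head -n "$(($(wc -l < "$tmp") - 1))" "$tmp"   # all but the last line
rm -f "$tmp"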
Using sed:
sed '$d' filename
will delete the last line of the file.
$ seq 1 10 | sed '$d'
1
2
3
4
5
6
7
8
9
For Mac OS X specifically, I found an answer from a comment to this Q&A.
Assuming you are using Homebrew, run brew install coreutils then use the ghead command:
cat myfile.txt | ghead -n -1
Or, equivalently:
ghead -n -1 myfile.txt
Lastly, see brew info coreutils if you'd like to use the commands without the g prefix (e.g., head instead of ghead).
cat myfile.txt | echo $(($(wc -l)-1))
This works. It's overly complicated: you could just write echo $(($(wc -l <myfile.txt)-1)). The problem is the way you're using it.
cat myfile.txt | head -n $(wc -l | sed -E -e 's/\s//g')
wc consumes all the input as it's counting the lines. So there is no data left to read in the pipe by the time head is started.
If your input comes from a file, you can redirect both wc and head from that file.
head -n $(($(wc -l <myfile.txt) - 1)) <myfile.txt
If your data may come from a pipe, you need to duplicate it. The usual tool to duplicate a stream is tee, but that isn't enough here, because the two outputs from tee are produced at the same rate, whereas here wc needs to consume its entire input before head can start. So instead, you'll need a single tool that can detect the last line, which is a more efficient approach anyway.
Conveniently, sed offers a way of matching the last line. Either printing all lines but the last, or suppressing the last output line, will work:
sed -n '$! p'
sed '$ d'
Here is a one-liner that can get you the desired output, and it can be used more generally for getting all lines from a file except the last n lines.
grep -n "" myfile.txt \ # output the line number for each line
| sort -nr \ # reverse the file by using those line numbers
| sed '1,4d' \ # delete first 4 lines (last 4 of the original file)
| sort -n \ # reverse the reversed file (correct the line order)
| sed 's/^[0-9]*://' # remove the added line numbers
Here is the above command in an actual single line and runnable (can't execute the above due to the added comments):
grep -n "" myfile.txt | sort -nr | sed '1,4d' | sort -n | sed 's/^[0-9]*://'
It's a little cumbersome, and this problem can be solved with more comprehensive commands like ghead, but when you can't or don't want to download such tools, it's nice to be able to do this with the more basic options. I've been in situations where it's simply not an option to get better tools.
awk 'NR>1{print p}{p=$0}'
For this job, an awk one-liner is a bit longer than a sed one.
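That one-liner drops only the last line; here is a sketch of a one-pass generalization that drops the last n lines (n=4 here) using a small rolling buffer:
awk -v n=4 'NR > n {print buf[NR % n]} {buf[NR % n] = $0}' myfile.txt
Each line is printed only after n newer lines have been read, so the final n lines are never printed.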

bash echo number of lines of file given in a bash variable without the file name

I have the following three constructs in a bash script:
NUMOFLINES=$(wc -l $JAVA_TAGS_FILE)
echo $NUMOFLINES" lines"
echo $(wc -l $JAVA_TAGS_FILE)" lines"
echo "$(wc -l $JAVA_TAGS_FILE) lines"
And they all produce identical output when the script is run:
121711 /home/slash/.java_base.tag lines
121711 /home/slash/.java_base.tag lines
121711 /home/slash/.java_base.tag lines
I.e. the name of the file is also echoed (which I don't want). Why do these scriptlets fail, and how should I output a clean:
121711 lines
?
An Example Using Your Own Data
You can avoid having your filename embedded in the NUMOFLINES variable by using redirection from JAVA_TAGS_FILE, rather than passing the filename as an argument to wc. For example:
NUMOFLINES=$(wc -l < "$JAVA_TAGS_FILE")
Explanation: Use Pipes or Redirection to Avoid Filenames in Output
The wc utility will not print the name of the file in its output if input is taken from a pipe or redirection operator. Consider these various examples:
# wc shows filename when the file is an argument
$ wc -l /etc/passwd
41 /etc/passwd
# filename is ignored when piped in on standard input
$ cat /etc/passwd | wc -l
41
# unusual redirection, but wc still ignores the filename
$ < /etc/passwd wc -l
41
# typical redirection, taking standard input from a file
$ wc -l < /etc/passwd
41
As you can see, the only time wc will print the filename is when it's passed as an argument, rather than as data on standard input. In some cases, you may want the filename to be printed, so it's useful to understand when it will be displayed.
wc can't get the filename if you don't give it one.
wc -l < "$JAVA_TAGS_FILE"
You can also use awk:
awk 'END {print NR,"lines"}' filename
Or
awk 'END {print NR}' filename
(These work on a Mac, and probably on other Unixes.)
Actually, there is a problem with the wc approach: it does not count the last line if the file does not end with a newline.
Use this instead
nbLines=$(cat -n file.txt | tail -n 1 | cut -f1 | xargs)
or even better (thanks gniourf_gniourf):
nblines=$(grep -c '' file.txt)
Note: The awk approach by chilicuil also works.
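A quick demo of the difference (test.txt is a scratch file; printf deliberately omits the final newline):
printf 'a\nb' > test.txt
wc -l < test.txt     # prints 1 (counts newline characters)
grep -c '' test.txt  # prints 2 (counts lines)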
It's very simple:
NUMOFLINES=$(cat $JAVA_TAGS_FILE | wc -l )
or
NUMOFLINES=$(wc -l $JAVA_TAGS_FILE | awk '{print $1}')
I normally use the 'back tick' feature of bash:
export NUM_LINES=`wc -l < filename`
Note the 'tick' is the back tick, e.g. ` not the normal single quote. (The < redirection keeps the filename out of the result, as explained above.)

Linux commands to output part of input file's name and line count

What Linux commands would you use, successively, for a bunch of files, to count the number of lines in each file and write it to an output file, with part of the corresponding input file's name as part of the output line? So, for example, if we were looking at the file LOG_Yellow and it had 28 lines, the output file would have a line like this (Yellow and 28 are tab-separated):
Yellow 28
wc -l [filenames] | grep -v " total$" | sed s/[prefix]//
The wc -l generates the output in almost the right format; grep -v removes the "total" line that wc generates for you; sed strips the junk you don't want from the filenames.
wc -l * | head --lines=-1 > output.txt
produces output like this:
linecount1 filename1
linecount2 filename2
I think you should be able to work from here to extend to your needs.
edit: since I haven't seen the rules for your name extraction, I still leave the full name. However, unlike other answers, I'd prefer to use head rather than grep, which not only should be slightly faster, but also avoids the case of filtering out files named total*.
edit2 (having read the comments): the following does the whole lot:
wc -l * | head --lines=-1 | sed s/LOG_// | awk '{print $2 "\t" $1}' > output.txt
wc -l * | grep -v " total$" | sed s/[prefix]//
sends
28 Yellow
You can reverse the order of the fields if you want (using awk, provided there are no spaces in the file names):
wc -l * | egrep -v " total$" | sed s/[prefix]// | awk '{print $2 " " $1}'
Short of writing the script for you:
'for' for looping through your files
'echo -n' for printing the current file name
'wc -l' for finding out the line count
And don't forget to redirect ('>' or '>>') your results to your output file.
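Putting those pieces together, a sketch that assumes the LOG_ prefix from the question (wc -l < "$f" keeps the filename out of the output, and ${f#LOG_} strips the prefix):
for f in LOG_*; do
    printf '%s\t%s\n' "${f#LOG_}" "$(wc -l < "$f")"
done > output.txt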
