Reading a specific line of a file - shell

What is the best way (best performance) to read a specific line of a file? Currently, I'm using the following command line:
head -line_number file_name | tail -1
P.S.: preferably using shell tools.

You could use sed. All three of these print line 10; the last one quits as soon as it has done so, which makes it the fastest on large files:
# print line number 10
$ sed -n '10p' file_name
$ sed '10!d' file_name
$ sed '10q;d' file_name
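A parameterized sketch of the quit-early form, with the line number in a shell variable (the demo file and its contents are invented):

```shell
# Print line $n and quit immediately, so sed never reads past it.
n=3
printf 'alpha\nbeta\ngamma\ndelta\n' > /tmp/demo.txt
sed -n "${n}p;${n}q" /tmp/demo.txt
# -> gamma
```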

# print 10th line
awk 'NR==10' file_name

awk -v linenum=10 'NR == linenum {print; exit}' file

If you know the lines are all the same length, then a program could index directly to that line without reading all the preceding ones: something like od might be able to do that, or you could code it up in half a dozen lines in almost any language. Look for a function called seek() or fseek().
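For the fixed-length case, dd can do the seek straight from the shell; a minimal sketch (the 6-byte record length and the demo file are made up):

```shell
# All lines are LEN bytes (newline included), so line N starts at
# byte (N-1)*LEN; dd seeks there and reads exactly one record.
LEN=6
N=3
printf 'aaaaa\nbbbbb\nccccc\nddddd\n' > /tmp/fixed.txt
dd if=/tmp/fixed.txt bs="$LEN" skip=$((N - 1)) count=1 2>/dev/null
# -> ccccc
```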
Otherwise, perhaps...
tail -n +N file_name | head -n 1
...as this asks tail to skip straight to the Nth line, so fewer lines are pushed needlessly through the pipe than with your head-to-tail solution.

ruby -ne '$.==10 and (print; exit)' file

I tried each of these a couple of times, dropping the file cache between runs; head + tail was quick, but ruby was the fastest:
$ wc -l myfile.txt
920391 myfile.txt
$ time awk NR==334227 myfile.txt
my_searched_line
real 0m14.963s
user 0m1.235s
sys 0m0.126s
$ time head -334227 myfile.txt |tail -1
my_searched_line
real 0m5.524s
user 0m0.569s
sys 0m0.725s
$ time sed '334227!d' myfile
my_searched_line
real 0m12.565s
user 0m0.814s
sys 0m0.398s
$ time ruby -ne '$.==334227 and (print; exit)' myfile
my_searched_line
real 0m0.750s
user 0m0.568s
sys 0m0.179s

Related

How to start from the last line with tail?

I have a huge log file. I need to find something and print the last matching line, like this:
tail -n +1 "$log" | awk '$9 ~ "'$something'" {print $0}' | tail -n1
But when I execute this command, tail starts from the first line and reads every line, which takes a few minutes.
How can I start reading from the last line and stop as soon as I find something? Then I wouldn't need to read all the lines, and it might take just a few seconds, because I only need the last line about $something.
Note you are saying tail -n +1 "$log", which is interpreted by tail as: start reading from line 1. So you are in fact doing cat "$log".
You probably want to say tail -n 1 "$log" (without the + before 1) to get the last n lines.
Also, if you want to get the last match of $something, you may want to use tac. This prints a file backwards: first the last line, then the penultimate... and finally the first one.
So if you do
tac "$log" | grep -m1 "$something"
this will print the last match of $something and then exit, because -m X makes grep stop after the first X matches.
Or of course you can use awk as well:
tac "$log" | awk -v pattern="$something" '$9 ~ pattern {print; exit}'
Note the use of -v to pass the variable to awk. This way you avoid a confusing mixture of single and double quotes in your code.
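A quick way to see the behaviour on a throwaway log (the file name and contents are invented):

```shell
# grep alone prints every match; tac + grep -m1 prints only the
# last one and stops reading as soon as it is found.
printf 'start foo\nmiddle bar\nend foo\n' > /tmp/demo.log
grep 'foo' /tmp/demo.log            # prints both matching lines
tac /tmp/demo.log | grep -m1 'foo'  # prints only "end foo"
```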
tac "$FILE" | grep -m 1 "$SOMETHING"
tac: the reverse of cat :-)
grep -m: search and stop on first occurrence
Instead of tail, use tac. It will reverse the file and you can exit when you first grep something:
tac "$log" | awk '$9 ~ "'$something'" {print $0;exit}'
tail -1000 takes only the last 1000 lines from your file.
You could grep that part, but you wouldn't know whether the thing you grep for also occurred earlier in the file. grep itself has no way to search "backwards"; that is why the other answers reverse the file with tac first.

"grep"ing first 12 of last 24 character from a line

I am trying to extract "first 12 of last 24 character" from a line, i.e.,
for a line:
species,subl,cmp= 1 4 1 s1,torque= 0.41207E-09-0.45586E-13
I need to extract "0.41207E-0".
(I have not written the code, so don't curse me for its formatting. )
I have managed to do this via:
var_s=`grep "species,subl,cmp= $3 $4 $5" $tfile |sed -n '$s/.*\(........................\)$/\1/p'|sed -n '$s/\(............\).*$/\1/p'`
but is there any more readable way of doing this, rather than counting dots?
EDIT
Thanks to both of you;
so, I have sed,awk grep and bash.
I will run that in loop, for 100's of file.
so, can you also suggest me which one is most efficient, wrt time?
One way with GNU sed (without counting dots):
$ sed -r 's/.*(.{11}).{12}/\1/' file
0.41207E-09
Similarly with GNU grep:
$ grep -Po '.{11}(?=.{12}$)' file
0.41207E-09
Perhaps a Python solution may also be helpful (Python 3 syntax):
python3 -c 'import sys; print("\n".join(a[-24:-13] for a in sys.stdin))' < file
0.41207E-09
I'm not sure your example data and question match up, so just change the values in the {n} quantifiers accordingly.
Simplest is using pure bash:
echo "${str:(-24):12}"
OR awk can also do that:
awk '{print substr($0, length($0)-23, 12)}' <<< "$str"
OUTPUT:
0.41207E-09
EDIT: For using bash solution on a file:
while read l; do echo "${l:(-24):12}"; done < file
Another one, less efficient, but with the advantage of making you discover new tools:
echo "$str" | rev | cut -b 1-24 | rev | cut -b 1-12
You can use awk to get first 12 characters of last 24 characters from a line:
awk '{print substr($0, length($0)-23, 12)}' myfile.txt

remove n lines from STDOUT on bash

Do you have any bash solution to remove the last N lines from stdout?
Like the head command: print all lines except the last N.
A simple solution in bash:
find ./test_dir/ | sed '$d' | sed '$d' | sed '$d' | ...
but I need to repeat the sed command N times.
Any better solution? (Except awk, python, etc.)
Use head with a negative number. In my example it will print all lines but last 3:
head -n -3 infile
If head -n -3 filename doesn't work on your system (it doesn't on some BSD/macOS versions), you could also try the following approach (and maybe alias it or create a function in your .bashrc):
head -n "$(( $(wc -l < filename) - 3 ))" filename
Where filename and 3 above are your file and number of lines respectively.
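If negative head is unavailable and the input is a pipe (so wc -l can't run first), a single pass with only bash builtins also works. This is a sketch honouring the question's "no awk, no python" constraint; print_all_but_last is an invented name:

```shell
# Buffer the most recent N lines; once the buffer is full, each new
# line evicts (and prints) the line read N lines earlier, so the
# final N lines are never printed.
print_all_but_last() {
  local n=$1 i=0 line
  local -a buf
  while IFS= read -r line; do
    [ "$i" -ge "$n" ] && printf '%s\n' "${buf[i % n]}"
    buf[i % n]=$line
    i=$((i + 1))
  done
}
seq 1 5 | print_all_but_last 3
# -> 1
# -> 2
```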
The tail command can also count from either end of a file on macOS / BSD: tail accepts a + or - prefix on the line count. With a +, output starts at the given line, so the following skips the first two lines and prints from line 3 onwards:
tail -n +3 filename.ext
With a - prefix (the default), tail counts from the end of the file instead, printing the last 3 lines:
tail -n -3 filename.ext
See a similar answer to a different question here: Print a file skipping first X lines in Bash

Can I grep only the first n lines of a file?

I have very long log files, is it possible to ask grep to only search the first 10 lines?
The magic of pipes:
head -10 log.txt | grep <whatever>
For folks who find this on Google, I needed to search the first n lines of multiple files, but to only print the matching filenames. I used
gawk 'FNR>10 {nextfile} /pattern/ { print FILENAME ; nextfile }' filenames
The FNR..nextfile stops processing a file once 10 lines have been seen. The //..{} prints the filename and moves on whenever the first match in a given file shows up. To quote the filenames for the benefit of other programs, use
gawk 'FNR>10 {nextfile} /pattern/ { print "\"" FILENAME "\"" ; nextfile }' filenames
Or use awk for a single process without |:
awk '/your_regexp/ && NR < 11' INPUTFILE
On each line, if your_regexp matches, and the number of records (lines) is less than 11, it executes the default action (which is printing the input line).
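One caveat: that awk still reads the whole file, since nothing tells it to stop after line 10. A variant that exits early, shown on invented demo data:

```shell
# "match" appears on lines 1 and 22; awk exits at line 11, so only
# the early match is printed and the rest of the file is skipped.
{ echo match; seq 1 20; echo match; } > /tmp/demo.txt
awk 'NR > 10 {exit} /match/' /tmp/demo.txt
# -> match
```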
Or use sed:
sed -n '/your_regexp/p;10q' INPUTFILE
Checks your regexp and prints the line (-n means don't print the input, which is otherwise the default), and quits right after the 10th line.
You have a few options using programs along with grep. The simplest in my opinion is to use head:
head -n10 filename | grep ...
head will output the first 10 lines (using the -n option), and then you can pipe that output to grep.
grep "pattern" <(head -n 10 filename)
head -10 log.txt | grep -A 2 -B 2 pattern_to_search
-A 2: print two lines of context after each match.
-B 2: print two lines of context before each match.
head -10 log.txt # read the first 10 lines of the file.
You can use the following line:
head -n 10 /path/to/file | grep [...]
The output of head -10 file can be piped to grep in order to accomplish this:
head -10 file | grep …
Using Perl:
perl -ne 'last if $. > 10; print if /pattern/' file
An extension to Joachim Isaksson's answer: Quite often I need something from the middle of a long file, e.g. lines 5001 to 5020, in which case you can combine head with tail:
head -5020 file.txt | tail -20 | grep x
This gets the first 5020 lines, then shows only the last 20 of those, then pipes everything to grep.
(Edited: fencepost error in my example numbers, added pipe to grep)
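The same middle slice can also come from a single sed process, quitting at the last wanted line (demo data generated with seq):

```shell
# Lines 5001-5020 only; "5020q" makes sed quit there, so the
# remaining ~5000 lines are never read.
seq 1 10000 > /tmp/demo.txt
sed -n '5001,5020p;5020q' /tmp/demo.txt | grep 5010
# -> 5010
```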
grep -A 10 <Pattern>
This is to grab the pattern and the next 10 lines after the pattern. This would work well only for a known pattern, if you don't have a known pattern use the "head" suggestions.
grep -m6 "string" cov.txt
This stops after the first 6 matching lines. Note that -m counts matches, not input lines, so it is not the same as searching only the first 6 lines of the file.
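The -m behaviour is easy to see on a file where the pattern first appears after line 6 (demo file invented):

```shell
# "string" is only on line 10.
{ seq 1 9; echo string; } > /tmp/demo.txt
grep -m6 string /tmp/demo.txt        # -> string (grep reads past line 6)
head -6 /tmp/demo.txt | grep string || echo 'no match in the first 6 lines'
```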

How do I read the first line of a file using cat?

How do I read the first line of a file using cat?
You don't need cat.
head -1 file
will work fine.
You don't, use head instead.
head -n 1 file.txt
There are many different ways:
sed -n 1p file
head -n 1 file
awk 'NR==1' file
You could use cat file.txt | head -1, but it would probably be better to use head directly, as in head -1 file.txt.
This may not be possible with cat. Is there a reason you have to use cat?
If you simply need to do it with a bash command, this should work for you:
head -n 1 file.txt
cat alone may not be possible, but if you don't want to use head this works:
cat <file> | awk 'NR == 1'
I'm surprised that this question has been around as long as it has, and nobody has provided the built-in read approach (which predates mapfile) yet.
IFS= read -r first_line <file
...puts the first line of the file in the variable first_line, easy as that.
Moreover, because read is built into bash and this usage requires no subshell, it's significantly more efficient than approaches involving subprocesses such as head or awk.
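Wrapped as a reusable function (the name first_line is mine):

```shell
# Print the first line of a file using the read builtin alone;
# no external process is started.
first_line() {
  IFS= read -r line < "$1" && printf '%s\n' "$line"
}
printf 'alpha\nbeta\n' > /tmp/demo.txt
first_line /tmp/demo.txt
# -> alpha
```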
You don't need any external command if you have bash v4+:
< file.txt mapfile -n1 && echo "${MAPFILE[0]}"
Or, if you really want cat, remember that each side of a pipeline runs in a subshell, so MAPFILE has to be used in that same subshell:
cat file.txt | { mapfile -n1 && echo "${MAPFILE[0]}"; }
:)
Use the below command to get the first row from a CSV file or any file formats.
head -1 FileName.csv
There are plenty of good answers to this question already. Just gonna drop another one into the basket if you wish to do it with lolcat:
lolcat FileName.csv | head -n 1
Adding one more obnoxious alternative to the list:
perl -pe'$.<=1||last' file
# or
perl -pe'$.<=1||last' < file
# or
cat file | perl -pe'$.<=1||last'
