Extract line from a file in shell script - shell

I have a text file of 5,000,000 lines and I want to extract one line out of every 1000 and write them to a new text file. The new file should contain 5000 lines.
Can you help me?

I would use a Python script to do so, but the same logic can be used with your shell as well. Here is the Python code:
input_file = 'path/file.txt'
output_file = 'path/output.txt'
n = 0
with open(input_file, 'r') as f:
    with open(output_file, 'w') as o:
        for line in f:
            n += 1
            if n == 1000:
                o.write(line)
                n = 0
Basically, you initialise a counter, then iterate over the file line by line, incrementing the counter for each line; when the counter hits 1000, you write the line to the new file and reset the counter.
Here is how to iterate over lines of a file using Bash shell.
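A minimal sketch of the same counter logic in POSIX shell (wrapped in a function here for convenience; the file names in the usage line are placeholders):

```shell
# every_nth: print every Nth line of stdin -- the same counter
# logic as the Python script above, written as a shell function.
every_nth() {
    n=0
    while IFS= read -r line; do
        n=$((n + 1))
        if [ "$n" -eq "$1" ]; then
            printf '%s\n' "$line"
            n=0
        fi
    done
}

# Usage: every_nth 1000 < file.txt > output.txt
```

Note that a shell read-loop over 5,000,000 lines will be far slower than the awk one-liner below; this is only to show the logic.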

Try:
awk 'NR%1000==1' infile > outfile
see this link for more options: remove odd or even lines from text file in terminal in linux
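A quick sanity check of the one-liner, with seq standing in for the real file (5000 lines in; the same expression on the 5,000,000-line file yields 5000 lines):

```shell
# With 5000 input lines, NR%1000==1 selects lines 1, 1001, 2001, 3001, 4001:
seq 5000 | awk 'NR%1000==1'
```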

You can use either head or tail, depending on which line you'd like to extract.
To extract first line from each file (for instance *.txt files):
head -n1 *.txt | grep -ve ^= -e ^$ > first.txt
To extract the last line from each file, simply use tail instead of head.
For extracting specific line, see: How do I use Head and Tail to print specific lines of a file.
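For example, to pull out one specific line, a common combination is head piped into tail (seq generates sample input here so the snippet is self-contained; sed -n 'Np' does the same):

```shell
# Print only line 42 of the input:
seq 100 | head -n 42 | tail -n 1
# The sed equivalent:
seq 100 | sed -n '42p'
```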

Related

How to display n number of lines of a file which is passed as an argument in a shell script?

I have a log file and I want to print the first n lines from the log file by passing n as an argument in bash.
Example:
./hello.sh -n 10 filename
Output should be: First 10 lines of the file.
You could just use the head command in your terminal.
head -n 10 <filename>.log
This should do the job.
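To match the ./hello.sh -n 10 filename calling convention from the question, the head call can be wrapped with getopts. This is only a sketch (the function name and the default of 10 lines are assumptions); the body can be dropped into hello.sh as-is:

```shell
# Print the first N lines of a file, e.g.: first_lines -n 10 filename
first_lines() {
    n=10                      # assumed default when -n is omitted
    OPTIND=1                  # reset so the function can be called repeatedly
    while getopts 'n:' opt; do
        case $opt in
            n) n=$OPTARG ;;
            *) echo "usage: first_lines [-n lines] file" >&2; return 1 ;;
        esac
    done
    shift $((OPTIND - 1))
    head -n "$n" "$1"
}
```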

Copying first lines of multiple text files into single file

Using a single bash command (pipes, stdio allowed), copy the first line of each file whose name begins with ABC to a file named DEF.
Example:
Input:
ABC0:
qwe\n
rty\n
uio\n
ABC1:
asd\n
fgh\n
jkl\n
ABC2:
zxc\n
bvn\n
m,.\n
Result:
DEF:
qwe\n
asd\n
zxc\n
Already tried cat ABC* | head -n1, but it takes only the first line of the first file; the others are omitted.
You would want head -n1 ABC* to let head take the first line from each file; when reading from standard input, head knows nothing about where its input comes from.
head, though, adds its own header to identify which file each line comes from, so use awk instead:
awk 'FNR == 1 {print}' ./ABC* > DEF
FNR is the variable containing the line number of the current line of input, reset to 0 each time a new file is opened. Using ./ABC* instead of ABC* guards against filenames containing an =, which awk handles specially when the part before the = is a valid awk variable name (HT William Pursell).
Assuming that the file names don't contain spaces or newlines, and that there are no directories with names starting with ABC:
ls ABC* | xargs -n 1 head -n 1
The -n 1 ensures that head receives only one name at a time.
If the aforementioned conditions are not met, use a loop like chepner suggested, but explicitly guard against directory entries which are not plain files, to avoid error messages issued by head.
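One way to write that guarded loop (ABC and DEF are the names from the question; the function wrapper is just a convenience):

```shell
# Collect the first line of every regular file named ABC* in the
# current directory into DEF, skipping directories and other non-files:
collect_first_lines() {
    for f in ./ABC*; do
        [ -f "$f" ] && head -n 1 "$f"
    done > DEF
}
```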

How to append the output of bashrc to txtfile

In the Linux terminal, what is the command to append the output of .bashrc to a text file (e.g. mybash.txt)? I know that to append you use the double chevrons '>>', but I do not know how to append the output of .bashrc to the text file.
You can use cat file >> outfile. If you only want the start of the file you can use:
head -n N file >> outfile # where N is the number of lines you want to write
For the last part of a file you can use:
tail -n N file >> outfile # where N is the number of lines you want to write
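A quick demonstration of appending with >> (demo.txt and out.txt are placeholder names; a generated file stands in for .bashrc):

```shell
seq 10 > demo.txt              # sample file standing in for .bashrc
head -n 3 demo.txt >> out.txt  # '>>' appends; '>' would overwrite out.txt
tail -n 3 demo.txt >> out.txt  # out.txt now holds lines 1-3 followed by 8-10
```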

awk to write different columns from different lines into single line of output file?

I am using a while loop to read from a file that contains a list of hostnames, run a command against each host, and write specific data from the results to a second file. I need the output to come from line 33 column 3 and line 224 column 7 of the command's output, written as a single line in the second file. I can do it for one or the other, but I'm having trouble getting it to work for both. Example:
while read i; do
/usr/openv/netbackup/bin/admincmd/bpgetconfig -M $i |\
awk -v j=33 -v k=3 'FNR == j {print $k}' > /tmp/clientversion.txt
done < /tmp/clientlist.txt
Any hints or help is greatly appreciated!
You could use something like this:
awk 'NR==33{a=$3}NR==224{print a,$7}'
This saves the value in the third column of line 33 to the variable a, then prints it out along with the seventh column of line 224.
However, you're currently overwriting the file /tmp/clientversion.txt every iteration of the while loop. Assuming you want the file to contain all of the output once the loop has run, you should move the redirection outside the loop:
while read -r i; do
/usr/openv/netbackup/bin/admincmd/bpgetconfig -M $i |\
awk 'NR==33{a=$3}NR==224{print a,$7}'
done < /tmp/clientlist.txt > /tmp/clientversion.txt
As a bonus, I have added the -r switch to read (see related discussion here). Depending on the contents of your input file, you might also want to use double quotes around "$i" as well.

Cut and paste a line with an exact match using sed

I have a text file (~8 GB). Let's call this file A. File A has about 100,000 lines with 19 words and integers separated by spaces. I need to cut several lines from file A and paste them into a new file (file B). The lines should be deleted from file A. The lines to be cut from file A should contain an exact matching string.
I then need to repeat this several times, removing lines from file A with a different matching string every time. Each time, file A is getting smaller.
I can do this using sed, but it takes two commands, like this:
# Find lines in fileA with the matching string and copy those lines to fileB
sed -ne '/\<matchingString\>/p' fileA > fileB
# Again find the lines in fileA with the matching string and delete those lines,
# writing a tmp file to hold the lines that were not deleted.
sed '/\<matchingString\>/d' fileA > tmp
# Replace fileA with the tmp file.
mv tmp fileA
Here is an example of files A and B. I want to extract all lines containing hg15
File A:
ID pos frac xp mf ...
23 43210 0.1 2 hg15...
...
...
File B:
23 43210 0.1 2 hg15...
I'm fairly new to writing shell scripts and using the Unix tools, but I feel I should be able to do this more elegantly and faster. Can anyone guide me toward improving this script? I don't specifically need to use sed. I have been searching the web and Stack Overflow without finding a solution to this exact problem. I'm using RedHat and bash.
Thanks.
This might work for you (GNU sed):
sed 's|.*|/\\<&\\>/{w fileB\nd}|' matchingString_file | sed -i.bak -f - fileA
This makes a sed script from the matching strings that writes the matching lines to fileB and deletes them from fileA.
N.B. a backup of fileA is made too.
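To see what the generated sed script looks like, feed the first sed a sample matching string (hg15, from the question; the \n in the replacement requires GNU sed):

```shell
# Show the sed script generated for the sample string hg15:
echo 'hg15' | sed 's|.*|/\\<&\\>/{w fileB\nd}|'
# /\<hg15\>/{w fileB
# d}
```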
To make a different file for each exact word match use:
sed 's|.*|/\\<&\\>/{w "&.txt"\nd}|' matchingString_file | sed -i.bak -f - fileA
I'd use grep for this, but apart from that small improvement, this is probably already the fastest way to do it, even though it means applying the regexp to each line twice:
grep '<matchingString>' A > B
grep -v '<matchingString>' A > tmp
mv tmp A
The next approach would be to read the file line by line, check each line, and write it, depending on the check, either to B or to tmp (and mv tmp A again at the end). But there is no standard Unix tool which does this (AFAIK), and doing it in the shell will probably reduce performance massively:
while IFS= read -r line
do
    if expr "$line" : '.*<matchingString>' >/dev/null
    then
        echo "$line"        # matching lines go to B
    else
        echo "$line" 1>&3   # everything else goes to tmp
    fi
done < A > B 3> tmp
You could try to do this using Python (or a similar scripting language):
import os
import re

with open('B', 'w') as b:
    with open('tmp', 'w') as tmp:
        with open('A') as a:
            for line in a:
                if re.search(r'<matchingString>', line):  # match anywhere in the line
                    b.write(line)
                else:
                    tmp.write(line)
os.rename('tmp', 'A')
But this is a little out of scope here (not shell anymore).
Hope this will help you...
Hope this will help you...
while read -r matchingString
do
    # Find lines in fileA with the matching string and append those lines to fileB
    sed -ne "/\<$matchingString\>/p" fileA >> fileB
    # Again find the lines in fileA with the matching string and delete those lines,
    # writing a tmp file to hold the lines that were not deleted
    sed "/\<$matchingString\>/d" fileA > tmp
    # Once you are done grepping and copying, replace fileA with the tmp file
    mv tmp fileA
done < matchingStrings.txt
PS: I'm appending to fileB since we are grepping in a loop, once per matching string (matchingStrings.txt holds one search string per line).
