How to run a command over a text file with a different number of arguments on each line - shell

I have a text file similar to the one below (much longer). I'm trying to do a lookup for each of these IP addresses using the host command. Do you know how I could do this in the order of the text file (entire first line, then the second line, etc.)?
I tried using this, but it did not execute correctly:
while read in; do host "$in"; done < inputfile.txt > outputfile.txt
Input text file:
10.10.999.200 10.11.223.334 10.55.555.555
10.12.238.222 10.52.212.212
10.12.238.222 10.14.217.232
10.23.212.212 10.19.301.305 10.12.345.678

Convert the spaces to newlines so every IP ends up on its own line, then pipe each one to xargs to run host on it.
tr ' ' '\n' < inputfile.txt | xargs -IX host X > outputfile.txt
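If you prefer, an equivalent spelling lets xargs pass one argument per invocation instead of naming a placeholder (a minor variation, not a different approach):
tr ' ' '\n' < inputfile.txt | xargs -n 1 host > outputfile.txt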

I would do it this way:
cat file | while read -r line
do
    for ip in $line; do    # word splitting hands you one IP at a time
        host "$ip"
    done
done
This way you can work through the file line by line. However, if your file is huge it will take a long time, because the shell loop handles one line at a time and starts a new process for every lookup. In that case you may want to use awk instead.
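For completeness, here is a minimal sketch of that awk approach, assuming inputfile.txt contains nothing but whitespace-separated IP addresses (system() runs host once per field, and its output shares awk's stdout, so the redirection still captures everything):
awk '{ for (i = 1; i <= NF; i++) system("host " $i) }' inputfile.txt > outputfile.txt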

Related

grep of 50000 strings in a big file performance improvement

I have a file of about 200 MB with about 1.2 M lines in it; let's call it reading.txt. I have another file, input.txt, which has about 50000 lines. I want to take the string on each line of input.txt and grep for it in reading.txt. For every match, I want the complete line from reading.txt written to another file, output.txt.
Right now I am looping over every string in input.txt and running grep against reading.txt. This approach takes more than an hour. Is there any way to improve performance and reduce the time this takes?
while read line
do
LC_ALL=C grep ${line} reading.txt 2>/dev/null
done<input.txt >> output.txt
man grep yields (among others):
-f FILE, --file=FILE
Obtain patterns from FILE, one per line. If this option is used
multiple times or is combined with the -e (--regexp) option,
search for all patterns given. The empty file contains zero
patterns, and therefore matches nothing.
grep -f input.txt reading.txt > output.txt
...will print every line of reading.txt that contains a substring matching a line in input.txt, in the order they appear in reading.txt, to output.txt.
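If the 50000 strings are literal text rather than regular expressions (an assumption about your data, not something you stated), telling grep to treat them as fixed strings with -F usually speeds this up considerably:
LC_ALL=C grep -F -f input.txt reading.txt > output.txt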
You don't specify this, but it may be relevant (you said 1.2 M lines in reading.txt) - if you want a separate output file for every matching line:
#!/bin/sh
nl='
'
IFS=$nl
c=0
for i in $(grep -f input.txt reading.txt); do
c=$((c+1))
echo "$i" > output$c.txt
done
There are tidier ways of setting IFS to a newline, for example in bash: IFS=$'\n' (and in bash you can also write > output$((++c)).txt).
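Put together, a bash version of the same script using those shortcuts might look like this (same caveats as above; each matching line becomes its own numbered file):
#!/bin/bash
IFS=$'\n'
c=0
for i in $(grep -f input.txt reading.txt); do
    echo "$i" > "output$((++c)).txt"
done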

No newline produced by >>

I have the following piece of code that selects two line numbers in a file, extracts everything between those lines, replaces the newline characters with tabs, and writes the result to an output file. I want all lines extracted within one iteration of the loop to end up on the same line, but lines extracted on different iterations to go on new lines.
for ((i=1; i<=numTimePoints; i++)); do
# Get the starting point for line extraction. This is just an integer.
startScan=$(($(echo "${dataStart}" | sed -n ${i}p)+1))
# Get the end point for line extraction. This is just an integer.
endScan=$(($(echo "${dataEnd}" | sed -n ${i}p)-1))
# From file ${file}, take all lines between ${startScan} and ${endScan}. Replace new lines with tabs and output to file ${tmpOutputFile}
head -n ${endScan} ${file} | tail -n $((${endScan}-${startScan}+1)) | tr "\n" "\t" >> ${tmpOutputFile}
done
This script works mostly as intended; however, all new content is appended to the previous line rather than placed on new lines (as I thought >> would do). In other words, if I now do cat ${tmpOutputFile} | wc it returns 0 12290400 181970555. Can anyone point out what I'm doing wrong?
Redirection, including >>, has nothing to do with newline creation at all. Redirection operations don't generate any output themselves, newlines or otherwise; they only control where file descriptors (stdout, stderr, etc.) are connected, and it's the programs writing to those descriptors that are responsible for the content.
Consequently, your tr '\n' '\t' is entirely preventing newlines from ever reaching the output file: everything written to it goes through that pipeline, so there's nowhere a newline could come from.
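You can see the effect with a toy example. wc -l counts newline characters, so a stream that contains none of them reports 0 lines, which matches the leading 0 in your wc output:
printf 'a\nb\nc\n' | tr '\n' '\t' | wc -l    # prints 0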
Consider the following instead:
while read -r startScan <&3 && read -r endScan <&4; do
# generate your output
head -n "$endScan" "$file" | tail -n $(( endScan - startScan + 1 )) | tr '\n' '\t'
# append your newline
printf '\n'
done 3<<<"$dataStart" 4<<<"$dataEnd" >"$tmpOutputFile"
Note:
We aren't paying the cost of running sed to extract startScan and endScan, but rather are reading them a line at a time from herestrings created from the contents of dataStart and dataEnd
We're redirecting to our output file exactly once, and reusing that file handle for the entire loop (over multiple commands -- first the pipeline, and then the printf)
We're actually running a printf to generate that newline, rather than expecting it to be somehow implicitly created by magic.
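If the numbered file descriptors and herestrings are unfamiliar, the same pairing trick can be seen in miniature (a toy example, separate from the script above):
while read -r a <&3 && read -r b <&4; do
    echo "$a-$b"
done 3<<<$'1\n2\n3' 4<<<$'10\n20\n30'
# prints 1-10, 2-20 and 3-30, one pair per line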

How would I run a unix command in a loop with variable data?

I'd like to run a unix command in a loop, replacing a variable on each iteration, and then store the output in a file.
I'll be grabbing the HTTP headers of a series of URLs using curl -I, and I want the output for each URL to go onto a new line of a file.
I know I could store the output with | cat or redirect it into a file with >, but how would I run the loop?
I have a file with a list of URLs, one per line (or I could comma-separate them, alternatively).
You can write:
while IFS= read -r url ; do
curl -I "$url"
done < urls-to-query.txt > retrieved-headers.txt
(using the built-in read command, which reads a line from standard input — in this case redirected from urls-to-query.txt — and saves it to a variable — in this case $url).
Given a list of URLs in a file:
http://url1.com
http://url2.com
You could run
cat inputfile | xargs curl -I >> outputfile
That reads the URLs from the input file and appends the retrieved headers for each one to outputfile.
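If you want to be explicit that curl runs once per URL, and to start the output file fresh rather than append to it, a variant like this should also work:
xargs -n 1 curl -I < inputfile > outputfile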

Edit files in Bash

I have a few files that contain IP addresses. I'm creating a script and have to figure out how to create a new user file with an IP address based on the file created before it. If the last file contains an IP of A.B.C.D, the new file needs to be A.B.C.(D+4).
I think I need to use the 'sed' and 'awk' commands, but haven't been able to get anything working. How would I go about writing this part of the script?
Here's something to get you started: suppose there is a file called input that looks like this:
Input: contents of input
127.0.0.1
127.0.0.2
127.0.0.3
127.0.0.200
You can do on the cmdline:
awk 'BEGIN{FS=OFS="."} {$4=$4+4; print}' input > output
Explanation on what awk is doing here:
awk '...' - invoke awk, a tool used primarily for line-by-line manipulation of files; the part enclosed by the single quotes is the set of instructions given to awk.
BEGIN{FS=OFS="."} - tell awk to use . as the delimiter for both input and output. FS stands for "Field Separator" and OFS for "Output Field Separator".
{$4=$4+4; print} - $4 means the 4th field. Since . is the delimiter, D corresponds to the 4th field, and we add the integer value 4 to it. The print here is just shorthand for printing the entire line.
input - the input file, given as an argument to awk (which saves a cat)
> output - redirect the output to a file so you can inspect it for any issues before making the user files based on it.
Output: contents of output
127.0.0.5
127.0.0.6
127.0.0.7
127.0.0.204
And then you can read output one line at a time to create new user files as needed, maybe with another script along the lines of:
while read line
do
echo "this is a user file" > "$line"
done < output
(and adjust it to your needs)
Finally, as long as you understand what's going on in the above, you can skip the output file altogether and just do this all in a one-liner:
awk 'BEGIN{FS=OFS="."} {$4=$4+4; print}' input | while read line; do echo "hello world" > "$line"; done
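If you'd rather avoid awk, a rough pure-bash equivalent of the increment step (assuming every line of input is a well-formed dotted quad) would be:
while IFS=. read -r a b c d; do
    echo "$a.$b.$c.$((d + 4))"
done < input > output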

how to send text to a process in a shell script?

So I have a Linux program that runs in a while(true) loop, waiting for user input, processing it, and printing the result to stdout.
I want to write a shell script that opens this program, feeds it lines from a txt file one line at a time, and saves the program's output for each line to a file.
So I want to know if there is a command to:
- open a program
- send text to a process
- receive output from that program
Many thanks.
It sounds like you want something like this:
cat file | while read line; do
answer=$(echo "$line" | prog)
done
This will run a new instance of prog for each line. The line will be the standard input of prog and the output will be put in the variable answer for your script to further process.
Some people object to the "cat file |" as this creates a process where you don't really need one. You can also use file redirection by putting it after the done:
while read line; do
answer=$(echo "$line" | prog)
done < file
Have you looked at pipes and redirections? You can use pipes to feed the output of one program into another as input. You can use redirection to send the contents of files to programs, and/or write output to files.
I assume you want a script written in bash.
To open a file you just need to give its name.
To send text to a program you either pipe it in with | or redirect it in with < (take input from a file).
To capture output you use > to redirect it to a file, or >> to do the same but append the results instead of truncating the file.
To achieve what you want in bash, you could write:
#!/bin/bash
cat input_file | xargs -I{} your_program {} >> output_file
This calls your_program for each line from input_file and appends results to output_file
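Note that both of the approaches above start a new instance of the program for every line. If your program keeps running and reads one line of input at a time from stdin, you may not need a loop at all; a plain redirection (a sketch, assuming the program reads stdin and writes stdout) does the whole job in one go:
your_program < input_file > output_file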

Resources