How can I use each line of a file as an input switch in bash?

I have a file that contains a list of filenames on each line.
myfile.txt:
somepath/Documents/a.txt
somepath/Documents/b.txt
somepath/Documents/c.txt
This file can contain any number of lines. What I want to do is then run a command that runs cat with each line being an input, such as:
cat <line 1> <line 2> <line 3> > new_combination_file.txt
I was looking this up and I think I should be using xargs to do this, but when I looked up examples and the man page, it didn't make sense to me. Can someone help?

Use a while read loop to read each line into a variable:
while IFS= read -r filename
do
    cat "$filename"
done < myfile.txt > new_combination_file.txt
You can also use xargs:
xargs cat < myfile.txt > new_combination_file.txt
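However, this won't work if any filenames contain spaces, quotes, or backslashes, because by default xargs splits its input on whitespace and interprets quotes.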
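One way around the whitespace problem is to convert the newline-separated list into NUL-separated input and read it with xargs -0. The following is a minimal sketch, not a definitive recipe: the scratch directory and the sample file names are made up for the demo, and it assumes no file name contains a literal newline.

```shell
# Hypothetical scratch directory and sample files, created only for this demo.
workdir=$(mktemp -d)
printf 'alpha\n' > "$workdir/a file.txt"   # this name contains a space
printf 'beta\n'  > "$workdir/b.txt"
printf '%s\n' "$workdir/a file.txt" "$workdir/b.txt" > "$workdir/myfile.txt"

# Turn each newline into a NUL byte so that xargs -0 treats every line
# as exactly one argument, spaces and quotes included.
tr '\n' '\0' < "$workdir/myfile.txt" | xargs -0 cat > "$workdir/new_combination_file.txt"

cat "$workdir/new_combination_file.txt"
```

With GNU xargs, xargs -d '\n' cat < myfile.txt should behave the same way without the tr step, but -d is a GNU extension.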

You can use xargs to do this like so:
xargs -t -a myfile.txt cat >new_combination_file.txt

If you want to use the xargs command, just use
cat myfile.txt | xargs -I {} cat {}
-I {} tells xargs to run the command once for each line of the input (that is, the output of cat myfile.txt in this example), and
{} is replaced by the contents of that line.
If you want to combine the output, just redirect it with > followed by the filename.
Hope this helps.
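To make the per-line behaviour of -I visible, here is a small sketch that compares it with plain xargs. The three input names are placeholders for the lines of myfile.txt, and echo stands in for cat so that nothing needs to exist on disk.

```shell
# With -I {}, the command runs once per input line.
per_line=$(printf '%s\n' 'a.txt' 'b.txt' 'c.txt' | xargs -I {} echo "processing {}")

# Without -I, xargs batches all items into a single command invocation.
batched=$(printf '%s\n' 'a.txt' 'b.txt' 'c.txt' | xargs echo processing)

printf '%s\n' "$per_line"
printf '%s\n' "$batched"
```

For concatenating files the batched form is usually enough, since cat accepts many file names at once; -I matters when the command needs the name in a specific position.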

Related

launch several "while read" commands with xargs

I have a file that contains a list of commands like this
while read line;do tabix ftp://.../myfile.gz. >> output.vcf; done < input.txt
and I would like to pass this list of 45 commands to xargs.
I'm trying to call:
cat mycommands.txt | xargs -P45 -n10 bash
but I'm not sure whether bash understands > or >> as an argument and it is not working.
Does anyone see something I'm not seeing? A mistake...
Thank you very much in advance!
Did you try using the -I flag?
Like this
cat mycommands.txt | xargs -P45 -n10 -I {} bash -c {}
As it appears in the xargs man page:
Replace occurrences of replace-str in the initial-arguments with names
read from standard input.
kind regards
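As a rough illustration of the suggestion above, the sketch below runs three trivial commands from a file in parallel. The command list and output file names are invented for the demo, and the approach only makes sense when the command file is trusted, because every line is executed as shell code.

```shell
workdir=$(mktemp -d)

# Hypothetical command list; each line is a complete shell command with its own redirection.
printf 'echo first >> %s/out_1.txt\n'  "$workdir" >  "$workdir/mycommands.txt"
printf 'echo second >> %s/out_2.txt\n' "$workdir" >> "$workdir/mycommands.txt"
printf 'echo third >> %s/out_3.txt\n'  "$workdir" >> "$workdir/mycommands.txt"

# -I {} hands one whole line to bash -c, so bash itself parses the >> redirection.
# -P 3 allows up to three of these commands to run at the same time.
xargs -P 3 -I {} bash -c {} < "$workdir/mycommands.txt"

cat "$workdir"/out_*.txt
```

Because bash -c receives the entire line as its script, the redirection is interpreted by bash rather than being passed along as a literal argument.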

Add a prefix to logs with AWK

I am facing a problem with a script I need to use for log analysis; let me explain the question:
I have a gzipped file like:
5555_prova.log.gz
Inside the file there are many log lines like this one:
2018-06-12 03:34:31 95.245.15.135 GET /hls.playready.vod.mediasetpremium/farmunica/2018/06/218742_163f10da04c7d2/hlsrc/w12/21.ts
I need a script that reads the gzipped log file and writes modified log lines like this one to stdout:
5555 2018-06-12 03:34:31 95.245.15.135 GET /hls.playready.vod.mediasetpremium/farmunica/2018/06/218742_163f10da04c7d2/hlsrc/w12/21.ts
As you can see, the log line now starts with the number read from the gzip file name.
I need this new line to feed a logstash data crunching chain.
I have tried with a script like this:
echo "./5555_prova.log.gz" | xargs -ISTR -t -r sh -c "gunzip -c STR | awk '{$0="5555 "$0}' "
This is not exactly what I need (the prefix is static rather than captured from the file name with a regular expression), but even with this simplified version I receive an error:
sh -c gunzip -c ./5555_prova.log.gz | awk '{-bash=5555 -bash}'
-bash}' : -c: line 0: unexpected EOF while looking for matching `''
-bash}' : -c: line 1: syntax error: unexpected end of file
As you can see from the above output, $0 is no longer the whole line passed via the pipe to awk but a strange -bash.
I need to use xargs because the list of gzipped files is fed to the command line by another tool (i.e. an inotifywait instance listening to a directory where the files are written via FTP).
What am I missing? Do you have any suggestions to point me in the right direction?
Regards,
S.
Trying to follow @Charles Duffy's suggestion, I have written this code:
#!/bin/bash
#
# Usage: sendToLogstash.sh [pattern]
#
# Executes a command whenever files matching the pattern are closed in write
# mode or moved to. "{}" in the command is replaced with the matching filename (via xargs).
# Requires inotifywait from inotify-tools.
#
# For example,
#
# whenever.sh '/usr/local/myfiles/'
#
#
DIR="$1"
PATTERN="\.gz$"
script=$(cat <<'EOF'
awk -v filename="$file" 'BEGIN{split(filename,array,"_")}{$0=array[1] OFS $0} 1' < $(gunzip -dc "$DIR/$file")
EOF
)
inotifywait -q --format '%f' -m -r -e close_write -e moved_to "$DIR" \
| grep --line-buffered $PATTERN | xargs -I{} -r sh -c "file={}; $script"
But I got the error:
[root@ms-felogstash ~]# ./test.sh ./poppo
gzip: /1111_test.log.gz: No such file or directory
gzip: /1111_test.log.gz: No such file or directory
sh: $(gunzip -dc "$DIR/$file"): ambiguous redirect
Thanks for your help, I feel very lost writing bash scripts.
Regards,
S.
EDIT: Also, if you are dealing with multiple .gz files and want to print their contents prefixed with the first underscore-delimited part of each file name, the following may help:
for file in *.gz; do
    awk -v filename="$file" 'BEGIN{split(filename,array,"_")}{$0=array[1] OFS $0} 1' <(gzip -dc "$file")
done
I haven't tested your code (and couldn't completely follow it), so here is a general approach: if your code can pass the file name to awk, prefixing each line with the file name's leading digits is simple. For example:
awk 'FNR==1{split(FILENAME,array,"_")} {$0=array[1] OFS $0} 1' 5555_prova.log_file
Here I read awk's built-in FILENAME variable (only on the first line of each file), split it on _ into an array named array, and prepend the first element to every line of the file.
Also, the closing " after gunzip -c STR seems to be missing; add it before you pass the output to awk.
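The prefixing idea can be tried end to end with a throwaway gzip file. This is only a sketch under stated assumptions: the scratch directory and log contents are invented, the file name follows the 5555_prova.log.gz pattern from the question, and a pipe is used instead of process substitution so that it also runs in plain sh.

```shell
workdir=$(mktemp -d)
printf 'GET /a.ts\nGET /b.ts\n' | gzip -c > "$workdir/5555_prova.log.gz"

for path in "$workdir"/*.gz; do
    name=${path##*/}       # strip the directory part
    prefix=${name%%_*}     # keep everything before the first underscore
    # Decompress to stdout and let awk print the prefix in front of each line.
    gzip -dc "$path" | awk -v prefix="$prefix" '{ print prefix, $0 }'
done > "$workdir/prefixed.log"

cat "$workdir/prefixed.log"
```

Extracting the prefix with shell parameter expansion keeps the awk program trivial and avoids relying on FILENAME, which is not set to anything useful when awk reads from a pipe.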
NEVER, EVER use xargs -I with a string substituted into sh -c (or bash -c or any other context where that string is interpreted as code). This allows malicious filenames to run arbitrary commands -- think about what happens if someone runs touch $'$(rm -rf ~)\'$(rm -rf ~)\'.gz', and gets that file into your log.
Instead, let xargs append arguments after your script text, and write your script to iterate over / read those arguments as data, rather than having them substituted into code.
To show how to use xargs safely (well, safely if we assume that you've filtered out filenames with literal newlines):
# This way you don't need to escape the quotes in your script by hand
script=$(cat <<'EOF'
for arg; do gunzip -c <"$arg" | awk '{print "5555 " $0}'; done
EOF
)
# if you **did** want to escape them by hand, it would look like this:
# script='for arg; do gunzip -c <"$arg" | awk '"'"'{print "5555 " $0}'"'"'; done'
echo "./5555_prova.log.gz" | xargs -d $'\n' sh -c "$script" _
To be safer with all possible filenames, you'd instead use:
printf '%s\0' "./5555_prova.log.gz" | xargs -0 sh -c "$script" _
Note the use of NUL-delimited input (created with printf '%s\0') and xargs -0 to consume it.
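Putting the pieces together, the sketch below passes file names to sh -c as arguments and derives the prefix from each name instead of hard-coding 5555. The two sample files (one with a space in its name) and the scratch directory are assumptions made for the demo; the quoting pattern is the one shown above.

```shell
workdir=$(mktemp -d)
printf 'GET /a.ts\n' | gzip -c > "$workdir/5555_prova.log.gz"
printf 'GET /b.ts\n' | gzip -c > "$workdir/7777_other file.log.gz"

# The script text is fixed; file names only ever arrive as positional arguments.
script=$(cat <<'EOF'
for path; do
    name=${path##*/}
    gunzip -c <"$path" | awk -v prefix="${name%%_*}" '{ print prefix, $0 }'
done
EOF
)

# NUL-delimited names in, xargs -0 appends them after the "_" placeholder for $0.
printf '%s\0' "$workdir/5555_prova.log.gz" "$workdir/7777_other file.log.gz" |
    xargs -0 sh -c "$script" _ > "$workdir/out.log"

cat "$workdir/out.log"
```

Since the names are never pasted into the script text, a file called $(rm -rf ~).gz would simply fail to open rather than being executed.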

Unix shell scripting, need assign the text files values to the sed command

I was trying to add the lines from a text file to a sed command.
observered_list.txt
Uncaught SlingException
cannot render resource
IncludeTag Error
Recursive invocation
Reference component error
I need it to be coded like the following:
sed '/Uncaught SlingException\|cannot render resource\|IncludeTag Error\|Recursive invocation\|Reference component error/ d'
Please help me do this.
I would suggest you create a sed script and delete each pattern consecutively:
while read -r pattern; do
    printf "/%s/ d;\n" "$pattern"
done < observered_list.txt >> remove_patterns.sed
# now invoke sed on the file you want to modify
sed -f remove_patterns.sed file_to_clean
Alternatively you could construct the sed command like this:
pattern=
while read -r line; do
    pattern=$pattern'\|'$line
done < observered_list.txt
# strip off the first and last \|
pattern=${pattern#\\\|}
pattern=${pattern%\\\|}
printf "sed '/%s/ d'\n" "$pattern"
# you still need to invoke the command, it's just printed
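If printing the command is not enough, the alternation can also be built and applied in one go. The following sketch joins the lines with paste and uses extended regular expressions so that a plain | works as the separator. The sample log is invented, and it assumes the patterns contain no regex metacharacters and no / characters.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'Uncaught SlingException' 'IncludeTag Error' > "$workdir/observered_list.txt"
printf '%s\n' 'ok line' 'x Uncaught SlingException y' 'IncludeTag Error here' 'another ok line' \
    > "$workdir/app.log"

# Join all pattern lines into a single string separated by "|".
pattern=$(paste -sd '|' "$workdir/observered_list.txt")

# -E enables extended regexes, where | means alternation without a backslash.
sed -E "/$pattern/d" "$workdir/app.log" > "$workdir/cleaned.log"

cat "$workdir/cleaned.log"
```

If the patterns may contain characters such as . or [, they would need escaping first, or a fixed-string tool is the safer choice.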
You can use grep for that:
grep -vFf /file/with/patterns.txt /file/to/process.txt
Explanation:
-v excludes lines of process.txt which match one of the patterns from output
-F treats patterns in patterns.txt as fixed strings instead of regexes (looks like this is desired here)
-f reads patterns from patterns.txt
Check man grep for further information.
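A quick way to see what the three options do together is to run them on a tiny sample. The file names and log lines below are made up; one pattern deliberately contains brackets to show that -F matches it literally instead of as a character class.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'cannot render resource' 'value[0]' > "$workdir/patterns.txt"
printf '%s\n' 'startup complete' 'error: cannot render resource /x' \
    'value[0] is null' 'value0 ok' 'shutdown' > "$workdir/process.txt"

# -v inverts the match, -F treats patterns as fixed strings, -f reads them from a file.
grep -vFf "$workdir/patterns.txt" "$workdir/process.txt" > "$workdir/filtered.txt"

cat "$workdir/filtered.txt"
```

Note that the pattern file should not contain empty lines: an empty pattern matches every line, so with -v nothing would be printed.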

Need to concatenate a string to each line of ls command output in unix

I am a beginner in shell scripting. Below is my requirement in the UNIX Korn shell.
Example:
When we list files using the ls command and redirect the output to a file, the file names are stored as below.
$ ls FILE*>FLIST.TXT
$ cat FLIST.TXT
FILE1
FILE2
FILE3
But I need output as below with a prefixed constant string STR,:
$ cat FLIST.TXT
STR,FILE1
STR,FILE2
STR,FILE3
Please let me know what the ls command should be to achieve this output.
You can't use ls alone to prepend data to each file name. ls exists to list files.
You will need to use other tools alongside ls.
You can prepend text to each line using the sed command:
cat FLIST.TXT | sed 's/^/STR,/'
This will send the changes to stdout.
If you'd like to change the actual file, run sed in place:
sed -i -e 's/^/STR,/' FLIST.TXT
To add the prefix before writing to the file, pipe ls into sed:
ls FILE* | sed 's/^/STR,/' > FLIST.TXT
The following should work:
ls FILE* | xargs -I {} echo "STR,{}" > FLIST.TXT
It takes each file name listed by ls and adds the "STR," prefix to it before writing it to the file.
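Another option, sketched here as an alternative rather than a correction, skips ls entirely: the shell expands FILE* itself, and printf repeats its format once per argument. The scratch directory and the three empty files are created only for the demo.

```shell
workdir=$(mktemp -d)
touch "$workdir/FILE1" "$workdir/FILE2" "$workdir/FILE3"

# Run in a subshell so the caller's working directory is left unchanged.
(
    cd "$workdir" || exit 1
    # printf reuses the format for every file name the glob expands to.
    printf 'STR,%s\n' FILE* > FLIST.TXT
)

cat "$workdir/FLIST.TXT"
```

This form works in ksh as well as bash and is not confused by file names that contain spaces.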

Remove Lines in Multiple Text Files that Begin with a Certain Word

I have hundreds of text files in one directory. For all files, I want to delete all the lines that begin with HETATM. I need csh or bash code for this.
I would think you would use grep, but I'm not sure.
Use sed like this:
sed -i -e '/^HETATM/d' *.txt
to process all files in place.
-i means "in place".
-e means to execute the command that follows.
/^HETATM/ means "find lines starting with HETATM", and the following d means "delete".
Make a backup first!
If you really want to do it with grep, you could do this:
#!/bin/bash
for f in *.txt
do
    grep -v "^HETATM" "$f" > $$.tmp && mv $$.tmp "$f"
done
It makes a temporary file of the output from grep (in file $$.tmp) and only overwrites your original file if the command executes successfully.
Use the -v option of grep to get all the lines that do not match:
grep -v '^HETATM' input.txt > output.txt
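For many files, the single-file command above can be wrapped in a loop that writes to a temporary file and then replaces the original. The sketch below uses sed without -i so that it does not depend on GNU extensions; the two sample files and the scratch directory are invented for the demo.

```shell
workdir=$(mktemp -d)
printf 'ATOM 1\nHETATM 2\nATOM 3\n' > "$workdir/one.txt"
printf 'HETATM 9\nATOM 4\n'         > "$workdir/two.txt"

for f in "$workdir"/*.txt; do
    tmp=$(mktemp) || continue
    # Only replace the original if sed finished successfully.
    sed '/^HETATM/d' "$f" > "$tmp" && mv "$tmp" "$f"
done

cat "$workdir/one.txt" "$workdir/two.txt"
```

sed is used here instead of grep -v because grep exits with status 1 when it selects no lines, so a file consisting only of HETATM lines would be left untouched by a grep ... && mv pattern.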
