Run a command on items in file using Ansible - for-loop

I am looking for a way to run a command like smartctl on a file containing device names like /dev/sda (one per line). The Ansible playbook should be able to read each line and pass it as an argument to the command.

Are you looking for something like this?
<file_with_smartctl_args xargs -n1 smartctl
Replace file_with_smartctl_args with the file (complete path!) that contains the names of the files (arguments) you want to pass to smartctl. This will run "smartctl" one time for EACH of the lines (arguments) in the file.
Example:
If the file /usr/me/smartctl_args contains the following text:
file1
file2
file3
The command:
</usr/me/smartctl_args xargs -n1 smartctl
Will run smartctl 3 times (since the file has 3 lines in it), like this:
smartctl file1
smartctl file2
smartctl file3
The initial < tells the Unix shell that your "standard input" is going to come from the filename that follows (/usr/me/smartctl_args). xargs then converts the "standard input" into command arguments; the -n1 option causes xargs to execute the command (smartctl) once for each argument.
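As a rough sketch of the same idea that is safe to try anywhere, the snippet below uses "echo smartctl" as a stand-in for the real smartctl and a temporary file in place of /usr/me/smartctl_args. The device names and the variable names (args_file, result) are only illustrative assumptions.

```shell
# Hypothetical argument file: one device name per line.
args_file=$(mktemp)
printf '%s\n' /dev/sda /dev/sdb /dev/sdc > "$args_file"

# 'echo smartctl' stands in for smartctl, so nothing touches real disks.
# The trailing redirect is equivalent to the leading '<file xargs ...' form.
result=$(xargs -n1 echo smartctl < "$args_file")
printf '%s\n' "$result"

rm -f "$args_file"
```

Note that xargs splits its input on whitespace, so this only maps one line to one invocation as long as the lines themselves contain no spaces.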

Related

Need command or script to rename a list of files in linux using a pattern match

I have downloaded some 90 fasta files from NCBI for bacterial genomes. The downloaded files have default names given by NCBI. I need to change it to my desired file names. Thus I have created two .txt files:
file1.txt - contains the default file names provided by NCBI.
file2.txt - contains the names that should replace the defaults.
Both files are ordered so that the 1st entry of file1.txt corresponds to the 1st entry of file2.txt.
All the downloaded files are in one folder,
and I need a script which reads file1.txt, matches each entry against the file names in the folder, and replaces it with the corresponding name from file2.txt.
I am not a bioinformatician, new to this genre. I look forward to your help. Can this process be made simpler?
This can be done with a very small awk one-liner. For convenience, let's first combine your file1 and file2 to make processing easier. This can be done with paste file1.txt file2.txt > names.txt.
names.txt will be a text file with the old names in the first column and the new names in the second. Awk lets us conveniently run through a file line-by-line (or record-by-record in its terminology) and access each column/field.
Assuming you are in the directory with all these files, as well as names.txt, you can simply run awk '{system("mv " $1 " " $2)}' names.txt to transform them all. This will run through all the lines in names.txt, take the filename given in the first column, and move it to the name given in the second column. The system command allows you to access more basic file system operations through the shell, like moving mv, copying cp, or removing rm files.
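Before letting awk call mv, it may be worth previewing the commands it would run. The following is only a sketch: the file names are made up for illustration, and the awk program prints each mv command instead of passing it to system().

```shell
# Hypothetical old/new name lists in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' GCF_0001.fna GCF_0002.fna > "$workdir/file1.txt"
printf '%s\n' ecoli.fna bsubtilis.fna > "$workdir/file2.txt"

# Column 1 = old name, column 2 = new name.
paste "$workdir/file1.txt" "$workdir/file2.txt" > "$workdir/names.txt"

# Dry run: print the commands rather than executing them.
preview=$(awk '{print "mv " $1 " " $2}' "$workdir/names.txt")
printf '%s\n' "$preview"

rm -rf "$workdir"
```

Once the printed commands look right, swapping print back for system(...) performs the actual renames. This approach assumes the file names contain no whitespace.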
Use paste and xargs like so:
paste file1.txt file2.txt | xargs --verbose -n2 mv
The command is using paste to write lines from 2 files side by side, separated by TABs, to STDOUT. The STDOUT is read by xargs using a pipe (|). Option --verbose prints the command, and option -n2 specifies the max number of arguments for xargs to be 2, so that the resulting commands that are executed are something like mv old_file new_file.
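Here is a small self-contained trial of the paste and xargs pipeline in a scratch directory, so it can be tested without touching real data. The file names are invented for the example, and --verbose is left out because it is a GNU-specific option.

```shell
# Scratch directory with two hypothetical downloaded files.
workdir=$(mktemp -d)
touch "$workdir/GCF_0001.fna" "$workdir/GCF_0002.fna"
printf '%s\n' GCF_0001.fna GCF_0002.fna > "$workdir/file1.txt"
printf '%s\n' ecoli.fna bsubtilis.fna > "$workdir/file2.txt"

# Each pair of arguments becomes one 'mv old_file new_file' call.
(cd "$workdir" && paste file1.txt file2.txt | xargs -n2 mv)

# Record what the directory holds after the rename, then clean up.
renamed=$(cd "$workdir" && LC_ALL=C ls *.fna)
printf '%s\n' "$renamed"
rm -rf "$workdir"
```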
Alternatively, use the Perl one-liners below.
Print the commands to rename the files, without executing the commands ("dry run"):
paste file1.txt file2.txt | perl -lane '$cmd = "mv $F[0] $F[1]"; print $cmd;'
Print the commands to rename the files, then actually execute them:
paste file1.txt file2.txt | perl -lane '$cmd = "mv $F[0] $F[1]"; print $cmd; system $cmd;'
The command uses paste to write lines from the 2 files side by side, separated by TABs, to STDOUT. A pipe (|) passes that output to the Perl one-liner's STDIN.
The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-n : Loop over the input one line at a time, assigning it to $_ by default.
-l : Strip the input line separator ("\n" on *NIX by default) before executing the code in-line, and append it when printing.
-a : Split $_ into array @F on whitespace or on the regex specified in -F option.
$F[0], $F[1] : first and second elements of the array @F into which the line is split. They are old and new file names, respectively.
system executes the command $cmd, which actually moves the files.
SEE ALSO:
perldoc perlrun: how to execute the Perl interpreter: command line switches

xargs -a [file] mv -t [new-directory] gives me mv: cannot stat `filename*': No such file or directory error

I have been trying to run this command (that I have run before in a different directory), and everything I've read on the message boards has not solved my unknown issue.
Of note: 1) the files exist in this directory 2) I have proper permissions to move these files around 3) I have run this exact line of code before and it has worked. 4) I tried listing files with and without '' to capture all the files (see below). 5) I also tried to list each file as 'Sample1', but that did not work.
xargs -a [filename.txt] mv -t [new-directory]
I have file beginnings (I have ~5 files for each beginning), and I want to move all the files associated with that beginning.
Example: Sample1.bam Sample1.sorted.bam, etc
The lines in the file are listed as such:
Sample1*
Sample2*
Sample3* ...etc.
What am I doing incorrectly and how can I fix it?
TIA!
When you execute a command using 'xargs', the arguments are passed directly to the called program ('mv' in your case). Wildcard patterns in the input are not expanded: 'Sample1*' is passed as is to 'mv', which issues an error message about not having a file named 'Sample1*'.
To get file name expansion, you want to use the shell. One way to handle this situation is
xargs -a FILENAME.TXT -I__ sh -c "mv -t NEW-FOLDER -- __"
Security Note: the code provides some protection against command line injection (e.g., a file name starting with '-'). However, other attacks are still possible. A safer version is
cat FILENAME.txt | grep '^[A-Za-z0-9][A-Za-z0-9._-]*$' | xargs -I__ sh -c "mv -t NEW-FOLDER -- __"
which limits the input to names made of alphanumerics, '.', '_' and '-'. As written, this filter also rejects lines containing '*', so extend the 'grep' pattern as needed.
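To see the difference between the two behaviours side by side, here is a minimal sketch in a scratch directory. The file names, the patterns.txt list and the dest folder are assumptions for the demo, and plain "mv -- files dest/" is used instead of the GNU-only "mv -t".

```shell
# Scratch directory with a few hypothetical sample files.
workdir=$(mktemp -d)
mkdir "$workdir/dest"
touch "$workdir/Sample1.bam" "$workdir/Sample1.sorted.bam" "$workdir/Sample2.bam"
printf '%s\n' 'Sample1*' 'Sample2*' > "$workdir/patterns.txt"

# Plain xargs hands the pattern to the command unexpanded.
literal=$(cd "$workdir" && xargs -n1 echo < patterns.txt)
printf '%s\n' "$literal"

# Routing each line through 'sh -c' lets the shell expand the wildcard.
(cd "$workdir" && xargs -I__ sh -c 'mv -- __ dest/' < patterns.txt)

moved=$(LC_ALL=C ls "$workdir/dest")
printf '%s\n' "$moved"
rm -rf "$workdir"
```

Because every input line is pasted into a shell command, this should only be run on a list you trust or have filtered first.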
With GNU Parallel you would do something like:
cat FILENAME.txt | parallel mv {} NEW-FOLDER
One of the benefits of GNU Parallel is that it deals correctly with file names like:
My brother's 12" records cost > $1000.txt

Sending file contents to another command bash

I have a plain text file with two columns. I need to take each line which contains two columns and send them to a command.
The source file looks like this:
potato potato2
the line needs to be sent to another command so it looks like this
command potato potato2
The output can just go to stdout.
Been such a long time that I've tried a simple bash script...
I assume that your file contains two columns per line, separated by either spaces or tabs.
xargs -n 2 command < file.txt
See: man xargs
Looks like you just need to read a file line by line, so the following code should do:
while read -r line
do
echo "$line" | xargs your-other-command #Use xargs to convert input into arguments
done < source-file.txt
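For comparison, here is a small sketch that runs both approaches on a made-up two-column file, with "echo command" standing in for the real command. The loop variant below reads the two columns straight into variables (first and second are illustrative names) rather than piping each line through xargs.

```shell
# Hypothetical two-column input file.
src=$(mktemp)
printf '%s\n' 'potato potato2' 'carrot carrot2' > "$src"

# Variant 1: xargs groups the whitespace-separated words two at a time.
out_xargs=$(xargs -n 2 echo command < "$src")

# Variant 2: read splits each line into two variables.
out_loop=$(while read -r first second; do
  echo command "$first" "$second"
done < "$src")

printf '%s\n' "$out_xargs"
rm -f "$src"
```

The xargs form depends on every line having exactly two fields; the loop keeps line boundaries, and any extra fields end up in the second variable.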

awk: Output to different processes

I have an awk script which splits a big file into several files by some condition. Then I run another script over each file in parallel.
awk -f script.awk -v DEST_FOLDER=tmp input.file
find tmp/ -name "*.part" | xargs -P $ALLOWED_CPUS --replace --verbose /bin/bash -c "./process.sh {}"
The question is: is there any way to run ./process.sh under these constraints:
before the first script is done, because process.sh processes a file line by line (one line is too long to be passed to xargs directly);
each new file has a header (added in script.awk) that must be processed before the rest of the file;
the number of parallel processes is limited;
GNU parallel and inotifywait are not an option;
assume the dest folder is empty and the file names are unknown.
The purpose of the optimization is to avoid waiting until awk is done while some files are already ready to be processed.
Once you have created a file, you can pass the filename to a process' or script's input:
awk '{print name_of_created_file | "./process.sh &"}'
& sends process.sh to the background, so that they can run in parallel. However, this is a gawk extension and not POSIX. Check the manual
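One possible variation on this idea, sketched under several assumptions: instead of backgrounding each process, awk keeps a single pipe open to "xargs -P", which caps the number of parallel jobs. Here the splitter keys on the first column, the input is assumed to be grouped by that key (so a file is complete once the key changes), and basename stands in for ./process.sh. The names worker, dir, out and prev are all illustrative.

```shell
# Hypothetical input, already grouped by the key in column 1.
workdir=$(mktemp -d)
printf '%s\n' 'a 1' 'a 2' 'b 3' 'c 4' > "$workdir/input.file"

processed=$(awk -v dir="$workdir" '
  # One shared pipe; xargs runs at most 2 jobs at a time.
  BEGIN { worker = "xargs -n1 -P2 basename" }
  {
    out = dir "/" $1 ".part"
    # Key changed: the previous file is finished, hand it to the worker now.
    if (prev != "" && out != prev) { close(prev); print prev | worker; fflush(worker) }
    print $2 > out
    prev = out
  }
  END {
    if (prev != "") { close(prev); print prev | worker }
    close(worker)   # wait for the remaining jobs
  }
' "$workdir/input.file" | sort)

printf '%s\n' "$processed"
rm -rf "$workdir"
```

The fflush call pushes each file name to xargs right away, so processing can start while awk is still splitting the rest of the input. The -P option is not in POSIX xargs, but the original find | xargs -P command already relies on it.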
You basically give the answer yourself: GNU Parallel + inotifywait will work.
Since you are not allowed to use inotifywait, you can make your substitute for inotifywait. If you are allowed to write your own script, you are also allowed to run GNU Parallel (as that is just a script).
So something like this:
awk -f script.awk -v DEST_FOLDER=tmp input.file &
sleep 1
record file sizes of files in tmp
while tmp is not empty do
  for files in tmp:
    if file size is unchanged: print file
    record new file size
  sleep 1
done | parallel 'process {}; rm {}'
It is assumed that awk will produce some output within one second. If that takes longer, adjust the sleeps accordingly.

Pass Every Line of Input as stdin for Invocation of Utility

I have a file containing valid xmls (one per line) and I want to execute a utility (xpath) on each line one by one.
I tried xargs but that doesn't seem to have an option to pass the line as stdin :-
% cat <xmls-file> | xargs -p -t -L1 xpath -p "//Path/to/node"
Cannot open file '//Path/to/node' at /System/Library/Perl/Extras/5.12/XML/XPath.pm line 53.
I also tried parallel --spreadstdin but that doesn't seem to work either :-
% cat <xmls-file> | parallel --spreadstdin xpath -p "//Path/to/node"
junk after document element at line 2, column 0, byte 1607
If you want every line of a file to be split off and made stdin for a utility,
you could use a while loop in the bash shell:
cat xmls-file | while IFS= read -r f
do ( echo "$f" > /tmp/input$$;
xpath -p "//Path/to/node" </tmp/input$$
rm -f /tmp/input$$
);
done
The $$ appends the process id number, creating a unique name.
I assume xmls-file contains, on each line, what you want iterated into $f and that you want this as stdin for a command line, not as a parameter to the command.
On the other hand, your specification may be incorrect and maybe instead you need each line
to be part of a command. In that case, delete the echo and rm lines, and change the xpath command to include $f wherever the line from the file is needed.
I've not done much XML so the do command may need to be edited.
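The same loop can also be written without the temporary file by piping each line directly into the utility. This is only a sketch: the sample XML lines are invented, and a small sed command that strips tags stands in for xpath, which may not be installed everywhere.

```shell
# Hypothetical file with one small XML document per line.
xml_file=$(mktemp)
printf '%s\n' '<a><b>one</b></a>' '<a><b>two</b></a>' > "$xml_file"

# Each line becomes the stdin of one utility invocation.
# sed (tag stripper) is a stand-in for: xpath -p "//Path/to/node"
result=$(while IFS= read -r line; do
  printf '%s\n' "$line" | sed 's/<[^>]*>//g'
done < "$xml_file")

printf '%s\n' "$result"
rm -f "$xml_file"
```

Since the utility reads from its own pipe, it cannot accidentally consume the remaining lines that the loop is reading.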
You are very close with the GNU Parallel version; only -n1 missing:
cat <xmls-file> | parallel -n1 --spreadstdin xpath -p "//Path/to/node"
