Loop over files in folder and lines in file simultaneously. - bash

I have the following Bash file:
#!/bin/sh
# $1 is path of C executable
# $2 is path of folder with instances
# $3 is path of folder to save results
seed=333
# Write header to output file
echo "instance, cost, time" >> "$3/ACO/run.txt"
# Run ACO
for instance in "$2"/*
do
y=${instance%.txt}
output=$(eval "$1 --seed $seed --instance $instance --runtime 10 --aco --rep")
filename=${y##*/}
length=${#filename}
first=$(echo ${filename:3:$length-4} | awk '{print toupper($0)}')
second=${filename:$length-1:1}
echo "$first.$second, $output" >> "$3/ACO/run.txt"
done
This Bash file runs an algorithm in C on each instance (a .txt file) located in the folder specified by the second argument ($2). This algorithm requires different command line arguments, as can be seen by the call to eval(e.g. --seed and --instance).
Now, my problem is this. I have another .txt file which specifies for each instance the value for the command line argument --runtime. The values in this file are in the same order as the instances in the instances folder.
So my question is, how can I loop simultaneously over the instances in the instance folder and over the runtimes.txt file? Also, I would like this runtimes.txt file to be the fourth argument to this Bash file.

It's not particularly elegant but one option would be to use something like this:
line=1
for instance in "$2"/*
do
runtime=$(sed -n "${line}p" "$4")
line=$(( line + 1 ))
# etc.
done
The counter is incremented in the loop and sed is used to print a single line using the p command.
By the way, I think you can (and should!) drop the use of eval in your script - I can't see it doing anything useful.

You can use this to read several files :
while true
do
read -r val1 <&3 || break
read -r val2 <&4 || break
# do whatever you want with the values
done 3<file1 4<file2

Related

Bash File names will not append to file from script

Hello I am trying to get all files with Jane's name to a separate file called oldFiles.txt. In a directory called "data" I am reading from a list of file names from a file called list.txt, from which I put all the file names containing the name Jane into the files variable. Then I'm trying to test the files variable with the files in list.txt to ensure they are in the file system, then append the all the files containing jane to the oldFiles.txt file(which will be in the scripts directory), after it tests to make sure the item within the files variable passes.
#!/bin/bash
> oldFiles.txt
files= grep " jane " ../data/list.txt | cut -d' ' -f 3
if test -e ~data/$files; then
for file in $files; do
if test -e ~/scripts/$file; then
echo $file>> oldFiles.txt
else
echo "no files"
fi
done
fi
The above code gets the desired files and displays them correctly, as well as creates the oldFiles.txt file, but when I open the file after running the script I find that nothing was appended to the file. I tried changing the file assignment to a pointer instead files= grep " jane " ../data/list.txt | cut -d' ' -f 3 ---> files=$(grep " jane " ../data/list.txt) to see if that would help by just capturing raw data to write to file, but then the error comes up "too many arguments on line 5" which is the 1st if test statement. The only way I get the script to work semi-properly is when I do ./findJane.sh > oldFiles.txt on the shell command line, which is me essentially manually creating the file. How would I go about this so that I create oldFiles.txt and append to the oldFiles.txt all within the script?
The biggest problem you have is matching names like "jane" or "Jane's", etc. while not matching "Janes". grep provides the options -i (case insensitive match) and -w (whole-word match) which can tailor your search to what you appear to want without having to use the kludge (" jane ") of appending spaces before an after your search term. (to properly do that you would use [[:space:]]jane[[:space:]])
You also have the problem of what is your "script dir" if you call your script from a directory other than the one containing your script, such as calling your script from your $HOME directory with bash script/findJane.sh. In that case your script will attempt to append to $HOME/oldFiles.txt. The positional parameter $0 always contains the full pathname to the current script being run, so you can capture the script directory no matter where you call the script from with:
dirname "$0"
You are using bash, so store all the filenames resulting from your grep command in an array, not some general variable (especially since your use of " jane " suggests that your filenames contain whitespace)
You can make your script much more flexible if you take the information of your input file (e.g list.txt), the term to search for (e.g. "jane"), the location where to check for existence of the files (e.g. $HOME/data) and the output filename to append the names to (e.g. "oldFile.txt") as command line [positonal] parameters. You can give each default values so it behaves as you currently desire without providing any arguments.
Even with the additional scripting flexibility of taking the command line arguments, the script actually has fewer lines simply filling an array using mapfile (synonymous with readarray) and then looping over the contents of the array. You also avoid the additional subshell for dirname with a simple parameter expansion and test whether the path component is empty -- to replace with '.', up to you.
If I've understood your goal correctly, you can put all the pieces together with:
#!/bin/bash
# positional parameters
src="${1:-../data/list.txt}" # 1st param - input (default: ../data/list.txt)
term="${2:-jane}" # 2nd param - search term (default: jane)
data="${3:-$HOME/data}" # 3rd param - file location (defaut: ../data)
outfn="${4:-oldFiles.txt}" # 4th param - output (default: oldFiles.txt)
# save the path to the current script in script
script="$(dirname "$0")"
# if outfn not given, prepend path to script to outfn to output
# in script directory (if script called from elsewhere)
[ -z "$4" ] && outfn="$script/$outfn"
# split names w/term into array
# using the -iw option for case-insensitive whole-word match
mapfile -t files < <(grep -iw "$term" "$src" | cut -d' ' -f 3)
# loop over files array
for ((i=0; i<${#files[#]}; i++)); do
# test existence of file in data directory, redirect name to outfn
[ -e "$data/${files[i]}" ] && printf "%s\n" "${files[i]}" >> "$outfn"
done
(note: test expression and [ expression ] are synonymous, use what you like, though you may find [ expression ] a bit more readable)
(further note: "Janes" being plural is not considered the same as the singular -- adjust the grep expression as desired)
Example Use/Output
As was pointed out in the comment, without a sample of your input file, we cannot provide an exact test to confirm your desired behavior.
Let me know if you have questions.
As far as I can tell, this is what you're going for. This is totally a community effort based on the comments, catching your bugs. Obviously credit to Mark and Jetchisel for finding most of the issues. Notable changes:
Fixed $files to use command substitution
Fixed path to data/$file, assuming you have a directory at ~/data full of files
Fixed the test to not test for a string of files, but just the single file (also using -f to make sure it's a regular file)
Using double brackets — you could also use double quotes instead, but you explicitly have a Bash shebang so there's no harm in using Bash syntax
Adding a second message about not matching files, because there are two possible cases there; you may need to adapt depending on the output you're looking for
Removed the initial empty redirection — if you need to ensure that the file is clear before the rest of the script, then it should be added back, but if not, it's not doing any useful work
Changed the shebang to make sure you're using the user's preferred Bash, and added set -e because you should always add set -e
#!/usr/bin/env bash
set -e
files=$(grep " jane " ../data/list.txt | cut -d' ' -f 3)
for file in $files; do
if [[ -f $HOME/data/$file ]]; then
if [[ -f $HOME/scripts/$file ]]; then
echo "$file" >> oldFiles.txt
else
echo "no matching file"
fi
else
echo "no files"
fi
done

How to take multiple files in terminal in BASH shell scripting

I am new in sh and I am trying to scan output files and take some rows starts with "enthalpy new" to a new file. In advance I created a file named step.txt in the directory. There is not a certain number of files so I tried to do it like that:
for (( i=1; i<=$# ; i++))
do
grep "enthalpy new" $i >> step.txt
done
And I wrote this command to the terminal:
$bash hw1.sh sample2.out sample1.out
Then, I took these errors:
grep: 1: No such a file or directory
grep: 2: No such a file or directory
I am expecting to have step.txt file having 28 lines which 12 of them coming from sample1.out and 16 of them coming from sample2.out. Inside of step.txt will look like:
enthalpy new = 80 Ry
enthalpy new = 76 Ry
....
....
enthalpy new = 90 Ry
Is there anyone to tell my error and to help me fixing the code?
At present $I is referencing the iterations of the loop and so 1,2,3 .... These files cannot be found by grep and hence the error.
There are two approaches to ocercome this. $# contains the parameters passed to the script and so you could try:
grep "enthalpy new" "$#" >> step.txt
Alternatively, if you want to loop through each parameter/file try:
for fil in "$#"
do
grep "enthalpy new" "$fil" >> step.txt
done
To input filenames or any other variable you have to use $1 $2 $3 e.g. as the input for the script. It would work more flexible if you would drop them in a specific directory (let's say ./output) and call the script without variables in the parent directory - then it would be more flexible in terms of how many files you drop in there, without incriminating the variables and capturing input for code injection - the code should look like this:
for i in $(find ./output -name '*.out')
do
grep "enthalpy new" $i >> step.txt
done

How do count the amount of levels in a filepath including the root? (Shell scripting)

I'm trying to create a script that will take an input argument (filepath) and count the amount of levels are in the filepath. I tried to use the following code but it only counts the home and directories, but not the root and file if specified.
IFS="/" read -ra PARTS <<< "$(pwd)"
for i in "${PARTS[#]}"
do
echo "$i"
((NUM_FLEVEL=NUM_FLEVEL+1))
done
Say that I type in "script ~/hello1/file1" and the ~ happens to be home/house. I expect the actual total count to be 5.
/
home
house
hello1
file1
You are operating on the ouput of pwd, not on the file name you apparently tried to pass in as a command-line argument.
You can print the length of an array simply with ${#array[#]}.
IFS="/" read -ra PARTS <<< "$1"
echo "${#PARTS[#]}"
Demo (with some debugging details): https://ideone.com/1oWawt
A proper solution would probably loop over "$#" so you can pass in multiple arguments.

Redirecting the result files to different variable file names

I have a folder with, say, ten data files I01.txt, ..., I10.txt.. Each file, when executed using the command /a.out, gives me five output files, namely f1.txt, f2.txt, ... f5.txt.
I have written a simple bash program to execute all the files and save the output printed on the screen to a variable file using the command
./ cosima_peaks_444_temp_muuttuva -args > $counter-res.txt.
Using this, I am able to save the on screen output to the file. But the five files f1 to f5 are altered to store results of the last file run, in this case I10, and the results of the first nine files are lost.
So I want to save the output of each I*.txt file (f1 ... f5) to a a different file such that, when the program executes I01.txt, using ./a.out it stores the output of the files
f1>var1-f1.txt , f2>var1-f2.txt... f5 > var1-f5.txt
and then repeats the same for I02 (f1>var2-f1.txt ...).
#!/bin/bash
# echo "for looping over all the .txt files"
echo -e "Enter the name of the file or q to quit "
read dir
if [[ $dir = q ]]
then
exit
fi
filename="$dir*.txt"
counter=0
if [[ $dir == I ]]
then
for f in $filename ; do
echo "output of $filename"
((counter ++))
./cosima_peaks_444_temp_muuttuva $f -m202.75 -c1 -ng0.5 -a0.0 -b1.0 -e1.0 -lg > $counter-res.txt
echo "counter $counter"
done
fi
If I understand you want to pass files l01.txt, l02.txt, ... to a.out and save the output for each execution of a.out to a separate file like f01.txt, f02.txt, ..., then you could use a short script that reads each file named l*.txt in the directory and passes the value to a.out redirecting the output to a file fN.txt (were N is the same number in the lN.txt filename.) This presumes you are passing each filename to a.out and that a.out is not reading the entire directory automatically.
for i in l*.txt; do
num=$(sed 's/^l\(.*\)[.]txt/\1/' <<<"$i")
./a.out "$i" > "f${num}.txt"
done
(note: that is 's/(lowercase L) ... /\one..')
note: if you do not want the same N from the filename (with its leading '0'), then you can trim the leading '0' from the N value for the output filename.
(you can use a counter as you have shown in your edited post, but you have no guarantee in sort order of the filenames used by the loop unless you explicitly sort them)
note:, this presumes NO spaces or embedded newline or other odd characters in the filename. If your lN.txt names can have odd characters or spaces, then feeding a while loop with find can avoid the odd character issues.
With f1 - f5 Created Each Run
You know the format for the output file name, so you can test for the existence of an existing file name and set a prefix or suffix to provide unique names. For example, if your first pass creates filenames 'pass1-f01.txt', 'pass1-f02.txt', then you can check for that pattern (in several ways) and increment your 'passN' prefix as required:
for f in "$filename"; do
num=$(sed 's/l*\(.*\)[.]txt/\1/' <<<"$f")
count=$(sed 's/^0*//' <<<"$num")
while [ -f "pass${count}-f${num}.txt" ]; do
((count++))
done
./a.out "$f" > "pass${count}-f${num}.txt"
done
Give that a try and let me know if that isn't closer to what you need.
(note: the use of the herestring (<<<) is bash-only, if you need a portable solution, pipe the output of echo "$var" to sed, e.g. count=$(echo "$num" | sed 's/^0*//') )
I replaced your cosima_peaks_444_temp_muuttuva with a function myprog.
The OP asked for more explanation, so I put in a lot of comment:
# This function makes 5 output files for testing the construction
function myprog {
# Fill the test output file f1.txt with the input filename and a datestamp
echo "Output run $1 on $(date)" > f1.txt
# The original prog makes 5 output files, so I copy the new testfile 4 times
cp f1.txt f2.txt
cp f1.txt f3.txt
cp f1.txt f4.txt
cp f1.txt f5.txt
}
# Use the number in the inputfile for making a unique filename and move the output
function move_output {
# The parameter ${1} is filled with something like I03.txt
# You can get the number with a sed action, but it is more efficient to use
# bash functions, even in 2 steps.
# First step: Cut off from the end as much as possiple (%%) starting with a dot.
Inumber=${1%%.*}
# Step 2: Remove the I from the Inumber (that is filled with something like "I03").
number=${Inumber#I}
# Move all outputfiles from last run
for outputfile in f*txt; do
# Put the number in front of the original name
mv "${outputfile}" "${number}_${outputfile}"
done
}
# Start the main processing. You will perform the same logic for all input files,
# so make a loop for all files. I guess all input files start with an "I",
# followed by 2 characters (a number), and .txt. No need to use ls for listing those.
for input in I??.txt; do
# Call the dummy prog above with the name of the first input file as a parameter
myprog "${input}"
# Now finally the show starts.
# Call the function for moving the 5 outputfiles to another name.
move_output "${input}"
done
I guess you have the source code of this a.out binary. If so, I would modify it so that it outputs to several fds instead of several files. Then you can solve this very cleanly using redirects:
./a.out 3> fileX.1 4> fileX.2 5> fileX.3
and so on for every file you want to output. Writing to a file or to a (redirected) fd is equivalent in most programs (notable exception: memory mapped I/O, but that is not commonly used for such scripts - look for mmap calls)
Note that this is not very esoteric, but a very well known technique that is regularly used to separate output (stdout, fd=1) from errors (stderr, fd=2).

Get output filename in Bash Script

I would like to get just the filename (with extension) of the output file I pass to my bash script:
a=$1
b=$(basename -- "$a")
echo $b #for debug
if [ "$b" == "test" ]; then
echo $b
fi
If i type in:
./test.sh /home/oscarchase/test.sh > /home/oscarchase/test.txt
I would like to get:
test.txt
in my output file but I get:
test.sh
How can I procede to parse this first argument to get the right name ?
Try this:
#!/bin/bash
output=$(readlink /proc/$$/fd/1)
echo "output is performed to \"$output\""
but please remember that this solution is system-dependent (particularly for Linux). I'm not sure that /proc filesystem has the same structure in e.g. FreeBSD and certainly this script won't work in bash for Windows.
Ahha, FreeBSD obsoleted procfs a while ago and now has a different facility called procstat. You may get an idea on how to extract the information you need from the following screenshot. I guess some awk-ing is required :)
Finding out the name of the file that is opened on file descriptor 1 (standard output) is not something you can do directly in bash; it depends on what operating system you are using. You can use lsof and awk to do this; it doesn't rely on the proc file system, and although the exact call may vary, this command worked for both Linux and Mac OS X, so it is at least somewhat portable.
output=$( lsof -p $$ -a -d 1 -F n | awk '/^n/ {print substr($1, 2)}' )
Some explanation:
-p $$ selects open files for the current process
-d 1 selects only file descriptor 1
-a is use to require both -p and -d apply (the default is to show all files that match either condition
-F n modifies the output so that you get one line per field, prefixed with an identifier character. With this, you'll get two lines: one beginning with p and indicating the process ID, and one beginning with `n indicating the file name of the file.
The awk command simply selects the line starting with n and outputs the first field minus the initial n.

Resources