Command with two input files, one output file, looped - bash

So here is my dilemma. I have a command of the form:
grdpaste infile.grd infile.grd -Goutfile.grd
I have a series of folders in the same directory that each contain a file named infile.grd. I want to iterate through all the folders so that the first run combines infile.grd from the first and second folders, the second run combines outfile.grd from the first run with infile.grd from the third folder, and so on. I do not know how many folders exist, and the final product should contain the combination of all the infiles.
I think I can use a counter to control the combination part (I did that earlier in my script), but I do not know how to write a for loop that takes one file from one folder and the other file from the next folder, without knowing the names of the folders. I hope this makes sense; thanks much.

If grdpaste will accept an empty input file in a sane way then the following should work:
i=0
lastfile=dummy.grd
touch "$lastfile"
for infile in */infile.grd; do
    _outfile=outfile$((i++)).grd
    grdpaste "$lastfile" "$infile" -G"$_outfile"
    lastfile=$_outfile
done
If it can't then the above loop needs to be modified to store the first name it sees in $lastfile and do nothing else that first loop through... something like this:
i=0
lastfile=
for infile in */infile.grd; do
    [ -z "$lastfile" ] && { lastfile=$infile; continue; }
    _outfile=outfile$((i++)).grd
    grdpaste "$lastfile" "$infile" -G"$_outfile"
    lastfile=$_outfile
done
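In either variant, when the loop finishes, $lastfile holds the name of the final, fully combined grid, so you can give it a stable name (combined.grd below is an arbitrary name chosen here):
cp -- "$lastfile" combined.grd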

Solution posted below. For the complete code, see the moravi project here.
# Write a listing of each folder's contents into <folder>.tmp
for folder in */
do
    ls "$folder" > "${folder%/}.tmp"
done

for file in *.tmp
do
    lat=$(echo "$file" | awk -F "." '{print $1}')   # folder name without the .tmp suffix
    count=0
    while read line
    do
        count=$(( count + 1 ))
        if [ "$count" -eq 1 ]
        then
            tmp_1=$line                               # remember the first file
        elif [ "$count" -eq 2 ]
        then
            tmp_2=$line
            prod="P$(( count - 1 )).grd"
            grdpaste ./"${lat}"/"${tmp_1}" ./"${lat}"/"${tmp_2}" -G./"${lat}"/"${prod}" -V
        elif [ "$count" -gt 2 ]                       # -gt, not '>': inside [ ], '>' is a shell redirection
        then
            pprod="P$(( count - 2 )).grd"             # product of the previous iteration
            prod="P$(( count - 1 )).grd"
            grdpaste ./"${lat}"/"${line}" ./"${lat}"/"${pprod}" -G./"${lat}"/"${prod}" -V
        fi
    done < "$file"
done
rm *.tmp

Related

Move just those files named as specific rows in a sample sheet

Imagine I have these files in my working directory in bash:
123.tsv 456.tsv 789.tsv 101112.tsv 131415.tsv
and that I have this sample sheet (tab separated):
sampleID tissue
123 lung
124 bone
456 lung
457 bone
Now, I want to move those files corresponding to lung samples to a new directory, so I would like to have the following files in the new directory:
123.tsv
456.tsv
I was trying to use:
awk -F"\t" '$2 == "lung"'
But I am not sure how to use this in a for loop to select the filenames contained in the first column of the awk command's output.
How can I solve this?
If the row number is larger than 1 and the second column contains lung, then print the content of the first column with some text around it:
mkdir new_dir
awk 'NR>1 && $2=="lung" {print "mv", $1 ".tsv new_dir"}' sample.sheet
If output looks fine, append | sh to awk line to execute commands.
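The complete pipeline would then be (a minimal sketch, reusing the sample.sheet and new_dir names from above; run the mkdir first):
awk 'NR>1 && $2=="lung" {print "mv", $1 ".tsv new_dir"}' sample.sheet | sh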
#!/bin/sh

me=$( basename "${0}" )

# Adjust these as needed. If you want to use your current
# working directory change (or remove) `/tmp/` to `./`.
old_dir="/tmp/foo"
new_dir="/tmp/bar"
list="/tmp/sample_sheet"

# Make sure all the pieces are available. Exit if not.
if [ ! -d "${old_dir}" ]
then
    echo "ERROR: ${me}: Source '${old_dir}' does not exist." 1>&2
    exit 1
elif [ ! -d "${new_dir}" ]
then
    echo "ERROR: ${me}: Target '${new_dir}' does not exist." 1>&2
    exit 2
elif [ ! -r "${list}" ]
then
    echo "ERROR: ${me}: Sample sheet input '${list}' does not exist or is not readable." 1>&2
    exit 3
fi

# Iterate over the first column in `${list}`.
for file in $( awk 'NR>1 && $2=="lung" {print $1".tsv"}' "${list}" )
do
    # If the file exists move it, if not do nothing.
    if [ -f "${old_dir}/${file}" ]
    then
        echo "INFO: ${me}: mv ${old_dir}/${file} ${new_dir}/${file}"
        mv "${old_dir}/${file}" "${new_dir}/${file}"
    fi
done
Here's a script that you can run, for example, like this:
./move_files.sh lung
This works for both cases (lung and bone), and is general. Put this into a file called move_files.sh:
#!/usr/bin/env bash
files=$(sed -e "s/\([0-9]\{3\}\)\( *$1\)/\1/g" <(grep $1 eg.sheet))
if [ ! -d $1 ]; then
    mkdir $1
fi
for t in $files; do      # $files is a whitespace-separated list of sample IDs
    mv "./$t.tsv" $1
done
With the following directory content:
101112.tsv 123.tsv 124.tsv 131415.tsv 456.tsv 457.tsv 789.tsv eg.sheet move_files.sh
and eg.sheet containing:
sampleID tissue
123 lung
124 bone
456 lung
457 bone
... running the script with
./move_files.sh lung
... results in 123.tsv and 456.tsv being moved into a newly created lung directory (or simply moved there if the directory already exists).
You can then simply run
./move_files.sh bone
to move 124.tsv and 457.tsv to a newly created bone directory. Of course this is then generalisable to whatever is in eg.sheet.
Side note: you must run chmod +x move_files.sh in order to use it in the way I've suggested. Otherwise, you can invoke it with bash move_files.sh lung instead.
EDIT:
To address the point raised by keithpjolley in the comments, this can still work with "tissues" such as "eye lash" just by quoting the $1 variable throughout and by calling it with a quoted string (e.g., ./move_files.sh "eye lash"):
#!/usr/bin/env bash
files=$(sed -e "s/\([0-9]\{3\}\)\( *$1\)/\1/g" <(grep "$1" eg.sheet))
if [ ! -d "$1" ]; then
    mkdir "$1"
fi
for t in $files; do
    mv "./$t.tsv" "$1"
done

How to list files with words exceeding n characters in all subdirectories

I have to write a shell script that creates a file containing the name of each text file, from a folder (given as a parameter) and its subfolders, that contains words longer than n characters (n is read from the keyboard).
I wrote the following code so far:
#!/bin/bash
Verifies if the first given parameter is a folder:
if [ ! -d $1 ]
then echo $1 is not a directory\!
exit 1
fi
Reading n
echo -n "Give the number n: "
read n
echo "You entered: $n"
Destination where to write the name of the files:
destinatie="destinatie"
The actual part that I think is causing me problems:
nr=0;
#while read line;
#do
for fisier in `find $1 -type f`
do
    counter=0
    for word in $(<$fisier);
    do
        file=`basename "$fisier"`
        length=`expr length $word`
        echo "$length"
        if [ $length -gt $n ];
        then counter=$(($counter+1))
        fi
    done
    if [ $counter -gt $nr ];
    then echo "$file" >> $destinatie
    fi
done
break
done
exit
The script works, but it does a few more steps that I don't need. It seems like it reads some files more than one time. Can anyone help me, please?
Does this help?
egrep -lr "\w{$n,}" $1/* >$destinatie
Some explanation:
\w means: a character that words consist of
{$n,} means: the number of consecutive such characters is at least $n
Option -l lists the matching files rather than printing the matched text, and -r performs a recursive scan of the directory given in $1.
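For instance, with n=7 the pattern expands to \w{7,}, i.e. any word of seven or more characters (the directory name docs below is hypothetical):
n=7
egrep -lr "\w{$n,}" docs/*   # lists each file under docs containing a word of 7+ characters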
Edit:
a bit more complete version around the egrep command:
#!/bin/bash
die() { echo "$@" 1>&2 ; exit 1; }   # "$@" (the message), not "$#" (the argument count)
[ -z "$1" ] && die "which directory to scan?"
dir="$1"
[ -d "$dir" ] || die "$dir isn't a directory"
echo -n "Give the number n: "
read n
echo "You entered: $n"
[ $n -le 0 ] && die "the number should be > 0"
destinatie="destinatie"
egrep -lr "\w{$n,}" "$dir"/* | while read f; do basename "$f"; done >$destinatie
Your code has syntax errors, probably leftovers from your commented-out while loop. It would be best to remove the last three lines: the extra done causes the error, and break and exit are unnecessary, as there is nothing to break out of and the program terminates at its end anyway.
The program appears to output files multiple times because you just append to $destinatie. You could simply delete that file when you start:
rm "$destinatie"
You echo the numbers to stdout (echo "$length") and the file names to $destinatie (echo "$file" >> $destinatie). I do not know if that is intentional.
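If that length trace is only for debugging, a minimal tweak (my suggestion, not part of your original script) is to send it to stderr so it cannot mix with anything you redirect from stdout, and to remove the line once you are done:
echo "$length" 1>&2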
I found the problem. The problem was the directory in which I was searching. Because I had worked on the files in that directory and modified them, it seems there remained some files which were not displayed in the file explorer but which the script would still find. I created another directory, gave it as the parameter, and it works. Thank you for your answers.

Shell issue for loop in while loop

I am using a while loop to read the file xyz.txt, which contains content like below:
2 - info1
4 - info2
6 - info3
9 - info4
Further, I am using an if condition to check whether the count is greater than the threshold value x, so that a match sends an email. The problem I am facing: every time the condition matches, an email is sent, but I want only one. The script should read the file to the end, store the matching line's output to a file each time the condition matches, and then send that file with all the information. At present I am receiving a number of emails.
Hope my question is clear. I think I am looking for a way to keep reading the file to the end once the condition matches, storing the info as it goes.
count=`echo $line | awk '{print $3}'`
cnt=o
while read line
do
    if [ "$count" -gt "$x" ]; then      ---> This logic is working fine
        cnt=$(( $cnt + 1 ))             ---> This logic is working fine
        echo $line > info.txt           ---> In info.txt I want to store, in one go, every line that matches the condition
        export info.txt=$info.txt
        ${PERL_BIN}/perl $send_mail
    fi
done < file.txt
If you only want to send email once, don't put the invocation of Perl which sends mail inside the loop; put it outside the loop (after the end of the loop). Use append (>>) to build the file up piecemeal.
count=`echo $line | awk '{print $3}'`   # as in the question; presumably $line is set earlier in the real script
cnt=0 # 0 not o!
while read line
do
    if [ "$count" -gt "$x" ]; then
        cnt=$(($cnt + 1))
        echo $line >> info.txt
    fi
done < file.txt
if [ $cnt -gt 0 ]
then
    export info_txt=info.txt   # a valid variable name holding the file name; 'info.txt' is not a valid name
    ${PERL_BIN}/perl $send_mail
fi
Okay. I've tried to grasp what you want, I think it is this:
First, before the loop, remove any old info.txt file.
rm info.txt
Then, each time through the loop, append new lines to it like so:
echo $line >> info.txt
Notice the double arrows >>. This means append, instead of overwrite.
Finally, do the email sending after the loop.
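A two-line illustration of the difference, using the same info.txt name:
printf 'one\n' > info.txt    # '>' truncates: info.txt now contains only "one"
printf 'two\n' >> info.txt   # '>>' appends: info.txt now contains "one" then "two"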

Renaming Multiple Files with Different Names in a Directory Using Shell

I've found many questions of this kind, but in them the change of name is the same for the entire set of files in the directory.
Here, however, I am presented with a situation where I need to give a different name to every file in the directory, or rather add a different prefix.
For example, I have about 200 files in a directory, all of them with numbers in their filenames. What I want to do is add a prefix of 1 to 200 to the files, like 1_xxxxxxxx.png, 2_xxxxxxxx.png, ..., 200_xxxxxxxx.png.
I'm trying this, but it doesn't increment my $i every time; rather, it gives a prefix of 1_ to every file.
echo "renaming files"
i=1 #initializing
j=ls -1 | wc -l #Count number of files in that dir
while [ "$i" -lt "$j" ] #looping
do
for FILE in * ; do NEWFILE=`echo $i_$FILE`; #swapping the file with variable $i
mv $FILE $NEWFILE #doing the actual rename
i=`expr $i+1` #increment $i
done
Thanks for any suggestion/help.
To increment with expr, you definitely need spaces (expr $i + 1), but you would probably be better off just doing:
echo "renaming files"
i=1
for FILE in * ; do
mv $FILE $((i++))_$FILE
done
i=1
for f in *; do
    mv -- "$f" "${i}_$f"
    i=$(($i + 1))
done
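To preview the renames before committing to them, you can prefix mv with echo in either loop, a common dry-run trick:
i=1
for f in *; do
    echo mv -- "$f" "${i}_$f"   # prints the commands instead of running them
    i=$(($i + 1))
done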

Avoid going into subdirectories when "find" has a hit

I am trying to look for a certain file in multiple folders. Once I hit the file, I want to stop descending into that branch's subdirectories.
For example:
/foo/.target
/bar/buz/.target
/foo/bar/.target
I want only the first two:
/foo/.target
/bar/buz/.target
Your requirements are not completely clear. I understand them as: look for “wanted” files inside a directory tree; if a directory directly contains at least one match, then just print them, otherwise recurse into that directory.
I can't think of a pure find solution. You could write an awk or perl script to parse the output of find.
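For what it's worth, here is a rough, untested sketch of such a post-filter: sort find's matches so that shallower paths tend to come first, then drop any match whose directory lies strictly below a directory that has already produced a match (this assumes no newlines in path names):
find . -name '.target' | sort | awk '
{
    dir = $0
    sub(/\/[^\/]*$/, "", dir)                 # strip the file name, keeping the directory
    for (i = 1; i <= n; i++)
        if (dir != seen[i] && index(dir "/", seen[i] "/") == 1)
            next                              # below an already-matched directory: skip
    seen[++n] = dir
    print
}'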
Here's a shell script that I think does what you're looking for. Warning: I've only minimally tested it.
#!/bin/sh

## Return 0 if $1 is a matching file, 1 otherwise.
## Note that $1 is the full path to the file.
wanted () {
    case ${1##*/} in
        .target) true;;
        *) false;;   # without this default, a non-matching case would still return 0
    esac
}

## Recurse into the directory $1. Print all wanted files in this directory.
## If there is no wanted file, recurse into each subdirectory in turn.
traverse () {
    found=0
    for x in "$1"/.* "$1"/*; do
        if [ "$x" = "$1/." ] || [ "$x" = "$1/.." ]; then
            continue # skip '.' and '..' entries
        fi
        if ! [ -e "$x" ]; then
            continue # skip spurious '.*', '*' from non-matching patterns
        fi
        if wanted "$x"; then
            printf '%s\n' "$x"
            found=$(($found+1))
        fi
    done
    if [ $found -eq 0 ]; then # no match here, so recurse
        for x in "$1"/.*/ "$1"/*/; do
            x=${x%/}
            if [ "$x" = "$1/." ] || [ "$x" = "$1/.." ]; then
                continue
            fi
            if [ -d "$x" ]; then # only actual subdirs, not symlinks or '.*' or '*'
                found_stack=$found:$found_stack # no lexical scoping in sh
                traverse "${x%/}"
                found=${found_stack%%:*}
                found_stack=${found_stack#*:}
            fi
        done
    fi
}

found_stack=:
for x; do
    if wanted "$x"; then
        printf '%s\n' "$x"
    else
        traverse "$x"
    fi
done
Use sed, or perhaps awk, or anything that breaks the pipe once it has read enough lines from its input; when the reader exits, find receives SIGPIPE on its next write, which stops its execution quickly or at least soon enough.
find ... | sed 1q
find ... | awk '1; { exit }'
It would just show a single line.
For the first two:
find ... | sed 2q
find ... | awk '1; NR == 2 { exit }'
