I am setting up a directory structure with many different R & bash scripts in it. They all will be referencing files and folders. Instead of hardcoding the paths I would like to have a text file where each script can search for a descriptor in the file (see below) and read the relevant path from that.
Getting the search-append to work in R is easy enough for me; I am having trouble getting it to work in Bash, since I don't know the language very well.
My guess is that it has something to do with the way awk stores its output in the variable, or with how the / behaves when appended to that output. I'm not familiar enough with either and would really appreciate any help.
Text File "Master_File.txt":
NOT_DIRECTORY "/file/paths/Fake"
JOB_TEST_DIRECTORY "/file/paths/Real"
ALSO_NOT_DIRECTORY "/file/paths/Fake"
Bash Script:
#! /bin/bash
master_file_name="Master_File.txt"
R_SCRIPT="RScript.R"
SRCPATH=$(awk '/JOB_TEST_DIRECTORY/ { print $2 }' $master_file_name)
Rscript --vanilla $SRCPATH/$R_SCRIPT
The last line, $SRCPATH/$R_SCRIPT, seems to overwrite part of $SRCPATH with the value of $R_SCRIPT, producing something like /RScript.Rs/Real instead of what I want, which is /file/paths/Real/RScript.R.
Note: if I hard code the path path="/file/paths/Real" then the code $path/$R_SCRIPT outputs what I want.
The R Script:
system(command = "echo \"SUCCESSFUL_RUN\"", intern = FALSE, wait = TRUE)
q("no")
Please let me know if there's any other info that would be helpful, I added everything I could think of. And thank you.
Edit Upon Answer:
I found two solutions.
Solution 1 - By Mheni:
[ see his answer below ]
Solution 2 - My Adaptation of Mheni's Answer:
After seeing Mheni's note on ignoring the " quotation marks, I looked into it further and found that it is possible to change the character awk uses to split each line into fields. Adding -F\" to the awk call makes it split on the " character, so $2 is the path without the quotes.
The following works:
#!/bin/bash
master_file_name="Master_File.txt"
R_SCRIPT="RScript.R"
SRCPATH=$(awk -F\" -v r_script=$R_SCRIPT '/JOB_TEST_DIRECTORY/ { print $2 }' $master_file_name)
Rscript --vanilla $SRCPATH/$R_SCRIPT
Thank you so much everyone that took the time to help me out. I really appreciate it.
The problem is caused by the quotes around the path. This change to the awk command strips them before printing the path.
There was also a space in the shebang line that shouldn't be there, as @david mentioned.
#!/bin/bash
master_file_name="/tmp/data"
R_SCRIPT="RScript.R"
SRCPATH=$(awk '/JOB_TEST_DIRECTORY/ { if(NR==2) { gsub("\"",""); print $2 } }' "$master_file_name")
Rscript --vanilla "$SRCPATH/$R_SCRIPT"
OUTPUT
[1] "Hello World!"
In my example the paths are stored in /tmp/data:
NOT_DIRECTORY "/tmp/file/paths/Fake"
JOB_TEST_DIRECTORY "/tmp/file/paths/Real"
ALSO_NOT_DIRECTORY "/tmp/file/paths/Fake"
In the directory that corresponds to JOB_TEST_DIRECTORY I have a simple hello-world R script:
[user@host tmp]$ cat /tmp/file/paths/Real/RScript.R
print("Hello World!")
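If several scripts need this lookup, it could be wrapped in a small function. The following is only a sketch: lookup_path is a made-up name, the temporary file stands in for Master_File.txt, and it assumes the paths contain no spaces. It compares the first field exactly, so a key such as NOT_DIRECTORY does not also match ALSO_NOT_DIRECTORY, and it strips the quotes (plus a carriage return, in case the file has Windows line endings, which the question does not state).

```bash
#!/bin/bash
# lookup_path is a hypothetical helper, not part of the original answers.
# Usage: lookup_path KEY FILE  -> prints the path stored for KEY
lookup_path() {
    local key=$1 file=$2
    awk -v key="$key" '
        $1 == key {                # exact match on the descriptor
            gsub(/["\r]/, "", $2)  # drop quotes and a possible carriage return
            print $2
            found = 1
            exit
        }
        END { exit !found }        # non-zero exit status if the key is missing
    ' "$file"
}

# Demo data: a throwaway copy of the file from the question.
master_file=$(mktemp)
trap 'rm -f "$master_file"' EXIT
printf '%s\n' \
    'NOT_DIRECTORY "/file/paths/Fake"' \
    'JOB_TEST_DIRECTORY "/file/paths/Real"' \
    'ALSO_NOT_DIRECTORY "/file/paths/Fake"' > "$master_file"

R_SCRIPT="RScript.R"
SRCPATH=$(lookup_path JOB_TEST_DIRECTORY "$master_file")
echo "$SRCPATH/$R_SCRIPT"
```

Because the function returns a non-zero status for an unknown key, a caller can write `SRCPATH=$(lookup_path SOME_KEY "$master_file") || exit 1` to stop early.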
I would use
Master_File.txt :
NOT_DIRECTORY="/file/paths/Fake"
JOB_TEST_DIRECTORY="/file/paths/Real"
ALSO_NOT_DIRECTORY="/file/paths/Fake"
Bash Script:
#!/bin/bash
R_SCRIPT="RScript.R"
if [[ -r /path/to/Master_File.txt ]]; then
. /path/to/Master_File.txt
else
echo "ERROR -- Can't read Master_File" >&2
exit 1
fi
Rscript --vanilla $JOB_TEST_DIRECTORY/$R_SCRIPT
Basically, you create a configuration file of KEY=value lines, source it, and then use the keys as variables wherever you need them in the script.
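As a self-contained illustration of the source-and-use pattern, here is a sketch that writes a throwaway config file with mktemp (an assumption for the demo; in practice this would be your Master_File.txt), sources it, and builds the script path. The ${VAR:?message} form makes the script stop with an error if a key was never set.

```bash
#!/bin/bash
# Demo only: the temporary file stands in for /path/to/Master_File.txt.
config_file=$(mktemp)
trap 'rm -f "$config_file"' EXIT
cat > "$config_file" <<'EOF'
NOT_DIRECTORY="/file/paths/Fake"
JOB_TEST_DIRECTORY="/file/paths/Real"
ALSO_NOT_DIRECTORY="/file/paths/Fake"
EOF

R_SCRIPT="RScript.R"

# Source the file: every KEY=value line becomes a shell variable.
. "$config_file"

# Abort with a message if the key is unset or empty.
script_path="${JOB_TEST_DIRECTORY:?JOB_TEST_DIRECTORY is not set}/$R_SCRIPT"
echo "$script_path"
```

Since sourcing executes the file as shell code, this approach fits config files that you control yourself.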
I am writing a shell script, in which I get the location of java via which java. As response I get (for example)
/usr/pi/java7_32/jre/bin/java.
I need the path to be cut so that it ends with /jre/, more specifically
/usr/pi/java7_32/jre/
as the program this information is passed to cannot handle the longer path.
I used cut with / as the delimiter and, since I assumed the directory of the Java installation is always the same, a
cut -d'/' -f1-5
worked just fine to get this result:
/usr/pi/java7_32/jre/
But as Java could be installed somewhere else as well, for example at
/usr/java8_64/jre/
the statement would not work correctly.
I have tried sed, awk, cut and different combinations of them but found no answer I liked.
As the title says, I would count the number of occurrences of the character / until the substring jre/ is found, on the premise that the shell counts from left to right.
The resulting count would be the field number to cut at with that delimiter.
path=$(which java) # example: /usr/pi/java7_32/jre/bin/java
i=0
# while loop with a statement that would walk through path (pseudocode)
while substring != jre/ {
if (char = '/')
i++
}
#cut the path
path=$path | cut -d'/' -f 1-i
#/usr/pi/java7_32/jre result
Problem is the eventual difference in the path before and after
/java7_64/jre/, like */java*/jre/
I am open for any ideas and solutions, thanks a lot!
Greets
Jan
You can use the shell's built-in parameter operations to get what you need. (This will save the need to create other processes to extract the information you need).
jpath="$(which java)"
# jpath now /usr/pi/java7_32/jre/bin/java
echo ${jpath%jre*}jre
produces
/usr/pi/java7_32/jre
The same works for
jpath=/usr/java8_64/jre/
The % removes the shortest match of the shell glob pattern (not a regular expression) from the right-hand end of the string. Then we just put back jre to get your required path.
You can overwrite the value from which java
jpath=${jpath%jre*}jre
IHTH
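To try the parameter expansion on both example paths without needing Java installed, something like the following sketch could be used. jre_root is a made-up helper name; it keeps the trailing slash asked for in the question and returns a non-zero status when the path has no jre component.

```bash
#!/bin/bash
# jre_root is a hypothetical helper name, not part of the original answer.
# Prints the path up to and including /jre/, or fails if there is no jre part.
jre_root() {
    local p=$1
    case $p in
        */jre/*) printf '%s\n' "${p%/jre/*}/jre/" ;;  # cut everything after the last /jre/
        */jre)   printf '%s\n' "$p/" ;;               # already ends in jre, add the slash
        *)       return 1 ;;                          # no jre directory in this path
    esac
}

# The two example locations from the question.
for candidate in /usr/pi/java7_32/jre/bin/java /usr/java8_64/jre/; do
    jre_root "$candidate"
done
```

In a real script the argument would come from `which java`, for example `path=$(jre_root "$(which java)")`.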
You can get the results with grep:
path=$(echo $path | grep -o ".*/jre/")
Say I have 20 different files. The first 10 files end with .counts.tsv and the rest end with .libsize.tsv. For each .counts.tsv file there is a matching .libsize.tsv file. I would like to use a for loop to select each matching pair and run an R script on those two files.
Here is what I tried,
#!/bin/bash
arti='/home/path/tofiles'
for counts in ${arti}/*__counts.tsv ; do
for libsize in "$arti"/*__libsize.tsv ; do
Rscript score.R ${counts} ${libsize}
done;
done;
The above shell script iterates over the files more than 200 times whereas I have only 20 files. I need the Rscript to be executed 10 times for both files. Any suggestions would be appreciated.
I started typing up an answer before seeing your comment that you're only interested in a bash solution, posting anyway in case someone finds this question in the future and is open to an R based solution.
If I were approaching this from scratch, I'd probably just use an R function defined in the file that takes the two file names instead of messing around with the system() calls, but this would provide the behavior you desire.
## Get a vector of files matching each extension
counts_names <- list.files(path = ".", pattern = "\\.counts\\.tsv$")
libsize_names <- list.files(path = ".", pattern = "\\.libsize\\.tsv$")
## Get the root names of the files before the extensions
counts_roots <- sub("\\.counts\\.tsv$", "", counts_names)
libsize_roots <- sub("\\.libsize\\.tsv$", "", libsize_names)
## Get only root names that have both file types
shared_roots <- intersect(libsize_roots,counts_roots)
## Loop through the shared root names and execute an Rscript call based on the two files
for(i in seq_along(shared_roots)){
counts_filename <- paste0(shared_roots[[i]],".counts.tsv")
libsize_filename <- paste0(shared_roots[[i]],".libsize.tsv")
Command <- paste("Rscript score.R",counts_filename,libsize_filename)
system(Command)
}
Construct the second filename with ${counts%counts.tsv} (remove last part).
#!/bin/bash
arti='/home/path/tofiles'
for counts in ${arti}/*__counts.tsv ; do
libsize="${counts%counts.tsv}libsize.tsv"
Rscript score.R "${counts}" "${libsize}"
done
EDIT:
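Less safe is trying to make it a one-liner. When the filenames contain no spaces or newlines, you can risk an accident with the following (the sort puts each counts file next to its libsize file, so that xargs -n2 receives them as a pair)
printf '%s\n' ${arti}/*counts.tsv ${arti}/*libsize.tsv | LC_ALL=C sort | xargs -n2 Rscript score.R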
and when you feel really lucky (with no other files than those tsv files in $arti) make a bungee jump with
echo ${arti}/* | xargs -n2 Rscript score.R
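The pairing loop from this answer can be tried without R. The sketch below builds a temporary directory with made-up sample names (a, b and c are assumptions for the demo), prints the command instead of running it, and skips any counts file whose libsize partner is missing.

```bash
#!/bin/bash
# Demo data: three counts files, but only two of them have a libsize partner.
arti=$(mktemp -d)
trap 'rm -rf "$arti"' EXIT
touch "$arti"/{a,b,c}__counts.tsv "$arti"/{a,b}__libsize.tsv

pairs=0
for counts in "$arti"/*__counts.tsv; do
    [ -e "$counts" ] || continue                    # glob matched nothing
    libsize="${counts%__counts.tsv}__libsize.tsv"   # swap the suffix
    if [ -f "$libsize" ]; then
        # Replace echo with the real call once the pairing looks right.
        echo "Rscript score.R $counts $libsize"
        pairs=$((pairs + 1))
    else
        echo "skipping $counts: no matching libsize file" >&2
    fi
done
echo "pairs found: $pairs"
```

A single loop over the counts files runs the script once per pair, rather than once per combination as the nested loops in the question do.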
Have you tried list.files in base R? It lets you loop over all files in the folder.
arti <- '/home/path/tofiles'
for (i in list.files(arti)) {
  # run the script on file i here
}
See whether the below helps.
my_list = list.files("./Data")
counts = grep("counts.tsv", my_list, value=TRUE)
libsize = grep("libsize.tsv", my_list, value=TRUE)
for (i in seq_along(counts)){
system(paste("Rscript score.R",counts[i],libsize[i]))
}
Finally,
I tried the following and it helped me,
for sam in "$arti"/*__counts.tsv ; do
filebase=$(basename $sam)
samples=$(ls -1 ${filebase}|awk -F'[-1]' '{print $1}')
Rscript score.R ${samples}__counts.tsv ${samples}__libsize.tsv
done;
For someone looking for something similar :)
I'm trying to add a service named $1.service. However, the service file that gets created does not include the value of $1; it is just called .service.
There is also a place where $1 needs to be written inside the $1.service file, and it is not passed through there either.
This is basically what my bash script looks like:
#! /bin/bash
function addService {
cat << EOF > /etc/systemd/system/$1.service
(all that service stuff here)
PIDFile=${_var}/$1.pid
EOF
}
cfg_file=~/config/"$cfg.conf"
if [ -f "$cfg_file" ]; then
. "$cfg_file"
addService $1
fi
So you run the script as ./script.sh test and it should create a service called test.service in this example, but it doesn't seem to work properly. However, variables like ${_conf} are passed through without any problems.
Also, do I have to use EOF specifically for this task, or would echo alone do the job?
EDIT:
The config file exists; it is named $1 plus .conf, and this is the content of the test.conf file:
_var=var1
The .service file that is created receives this value without any problems, which means that if $1 weren't working, the config file wouldn't be found either. But apparently it is working.
First, you are checking for the existence of a file in ~/config that ends in .conf. What is the value of $cfg? Does ~/config/${cfg}.conf exist? If not, are you even entering the if clause? Running with set -x will help debug these things.
Second, you have EOF indented. For here-documents, the closing delimiter must start in the first column (unless you use <<- and indent with tabs). You should have gotten a warning about that when running the script, something like: here-document at line X delimited by end-of-file (wanted EOF). The delimiter string can be anything (e.g. EOSD for end of service definition), but it needs to start in column 1.
Here is what I quickly did to make sure things work.
#! /bin/bash
set -x
function addService {
cat << EOF > ./$1.service
(all that service stuff here)
PIDFile=${_conf}/$1.pid
EOF
}
cfg_file=./conf.in
if [ -f "$cfg_file" ]; then
. "$cfg_file"
addService $1
fi
Hope this helps.
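For reference, here is a variant of the same test that writes into a temporary directory instead of /etc/systemd/system, so it can be run without root. The value of _var and the [Service] line are assumptions standing in for the sourced config and the real unit content.

```bash
#!/bin/bash
# Demo values: out_dir replaces /etc/systemd/system, _var replaces the sourced config.
out_dir=$(mktemp -d)
trap 'rm -rf "$out_dir"' EXIT
_var=/var/run/demo

add_service() {
    local name=$1    # inside a function, $1 is the function's own first argument
    # Unquoted EOF: variables are expanded while the file is written.
    # The closing EOF below must start in column 1.
    cat > "$out_dir/$name.service" <<EOF
[Service]
PIDFile=${_var}/$name.pid
EOF
}

add_service test
unit=$(cat "$out_dir/test.service")
printf '%s\n' "$unit"
```

Writing `<<'EOF'` with the delimiter quoted would instead copy `${_var}` and `$name` into the file literally, which is rarely what a generated unit file needs.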
The problem has been solved by changing
cat << EOF > /etc/systemd/system/$1.service
(service content here)
EOF
to
echo "
(service content here)
" > /etc/systemd/system/$1.service
I have recently just made this script:
if test -s $HOME/koolaid.txt ; then
Billz=$(grep / $HOME/koolaid.txt)
echo $Billz
else
Billz=$HOME/notkoolaid
echo $Billz
fi
if test -d $Billz ; then
echo "Ok"
else touch $Billz
fi
So basically, if the file $HOME/koolaid.txt does NOT exist, then Billz is set to $HOME/notkoolaid. The script then successfully creates that file.
However, if I do make the koolaid.txt then I get this
mkdir: cannot create directory : No such file or directory
Any help would be appreciated
There is a difference between the content of a variable and its evaluated content.
If your variable contains the string $HOME/some, you need to expand it to get /home/login/some.
One dangerous method is eval.
bin=$(grep / ~/.rm.cfg)
eval rbin=${bin:-$HOME/deleted}
echo "==$rbin=="
Don't use eval unless you're absolutely sure what you are evaluating.
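If the only thing that needs expanding is a leading $HOME or ~, a narrower alternative to eval is to replace that prefix by hand. This is a sketch, not a general expander: expand_home is a made-up name, and it deliberately leaves every other variable reference or command substitution in the string untouched.

```bash
#!/bin/bash
# expand_home is a hypothetical helper: it only rewrites a leading
# literal "$HOME" or "~" and never executes anything from the string.
expand_home() {
    local p=$1
    case $p in
        '$HOME'|'$HOME'/*) p=$HOME${p:5} ;;  # drop the 5 characters of $HOME
        '~'|'~'/*)         p=$HOME${p:1} ;;  # drop the tilde
    esac
    printf '%s\n' "$p"
}

# Example: a line as it might be read from the config file.
raw='$HOME/notkoolaid'
Billz=$(expand_home "$raw")
echo "$Billz"
```

Anything that does not start with one of the two prefixes, such as an absolute path, is printed unchanged.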
Here are a couple things to fix:
Start your script with a "shebang," such as:
#!/bin/sh
This way the shell will know that you want to run this as a Bourne shell script.
Also, your conditional at the top of the script doesn't handle the case well in which .rm.cfg exists but doesn't contain a slash character anywhere in it. In that case the rbin variable never gets set.
Finally, try adding the line
ls ~
at the top so you can see how the shell is interpreting the tilde character; that might be the problem.
(Sorry for the confusion. Previous $ sign occurred when I tried to simplify the actual problem. Thanks for correcting the question)
I wanted to split a directory name on underscores (ex: dir_to_split="my_test_dir") like this:
my_dir=($dir_to_split)
var=$(echo $my_dir | awk -F"_" '{print $1,$2,$3}')
set -- $var
splited_1=$1
splited_2=$2
splited_3=$3
Now using these splited_x variables causes errors, e.g.
myprograme $splited_1 $splited_2 $splited_3
Can anyone please help me with this ? Thank you....
(Rewritten after updated question.)
What kind of errors do you get? I find it useful to add set -x to the top of my shell scripts when debugging; it makes the shell print every command it executes, so you can pinpoint the line where the problems begin.
Are you sure that $dir_to_split is actually set? Does it contain spaces or tabs? Does it contain two underscores? I don't see any other problems right now.
There are in-shell methods of splitting a variable such as:
dir="my_test_dir"
OIFS="$IFS"
IFS="_"
set -- $dir
IFS="$OIFS"
See also this SO question.
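Another in-shell option, assuming bash rather than plain sh, is to let read split the string into an array. Setting IFS only for that one command means nothing has to be saved and restored afterwards. The variable name parts is an assumption for the demo.

```bash
#!/bin/bash
dir_to_split="my_test_dir"

# IFS=_ applies to this read only; -a stores the fields in the array "parts".
IFS=_ read -r -a parts <<< "$dir_to_split"

splited_1=${parts[0]}
splited_2=${parts[1]}
splited_3=${parts[2]}

echo "$splited_1 $splited_2 $splited_3"
echo "number of fields: ${#parts[@]}"
```

Quoting the variables when calling the program, as in `myprograme "$splited_1" "$splited_2" "$splited_3"`, keeps each piece as a single argument.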