Bash: Adding input file names to output results

I'm using cURL to submit files to an API service, and it returns something called a task_id for each file submitted.
#submitter.sh
creation_date=$(date +"%m_%d_%Y")
task_id_file="/home/results/$creation_date"_task_ids".csv"
for i in $(find $1 -type f);do
task_id="$(curl -s -F file=@$i http://X.X.X.X:XXXX/api/tiscale/v1/upload)"
final_task_id=$(echo $task_id | grep -o 'http\?://[^"]\+')
echo "$final_task_id" >> $task_id_file
done
#Result ( 10_13_2016_task_ids.csv )
http://192.168.122.24:8080/api/tiscale/v1/task/17
http://192.168.122.24:8080/api/tiscale/v1/task/18
http://192.168.122.24:8080/api/tiscale/v1/task/19
Run method:
$ ./submitter.sh /home/files/pdf/
The find $1 -type f call returns each file's full path, including the file name:
#find /home/files/pdf -type f
/home/files/pdf/country.pdf
/home/files/pdf/summary.pdf
/home/files/pdf/age.pdf
How can I add the file names alongside the cURL API response? For example, when submitting "/home/files/country.pdf", the API might return the task_id http://192.168.122.24:8080/api/tiscale/v1/task/17.
Expected result:
country.pdf,http://192.168.122.24:8080/api/tiscale/v1/task/17
summary.pdf,http://192.168.122.24:8080/api/tiscale/v1/task/18
age.pdf,http://192.168.122.24:8080/api/tiscale/v1/task/19
I'm a beginner in Bash. Any suggestions on how to achieve this?
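A minimal sketch of one way to produce the expected CSV, assuming GNU grep and an API response that carries the task URL inside double quotes. The helper names extract_task_url, csv_line and submit_dir are hypothetical, and the endpoint is passed in as an argument rather than hard-coded.

```shell
#!/usr/bin/env bash
# extract_task_url: print the first http(s) URL found in a response string.
extract_task_url() {
    printf '%s\n' "$1" | grep -o 'https\?://[^"]\+' | head -n 1
}

# csv_line: print "name,url" for a file path and the raw API response.
csv_line() {
    local path=$1 response=$2
    printf '%s,%s\n' "$(basename "$path")" "$(extract_task_url "$response")"
}

# submit_dir (hypothetical helper): upload every regular file under a
# directory and append one CSV line per file to the output file.
submit_dir() {
    local dir=$1 endpoint=$2 out=$3 path response
    find "$dir" -type f -print0 |
    while IFS= read -r -d '' path; do
        response=$(curl -s -F "file=@$path" "$endpoint")
        csv_line "$path" "$response" >> "$out"
    done
}
```

It could be called as submit_dir "$1" http://X.X.X.X:XXXX/api/tiscale/v1/upload "$task_id_file". Reading the output of find with -print0 keeps file names that contain spaces intact, which the for i in $(find ...) form does not.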

Related

Qsub script - Unable to run job: Script length does not match declared length

I have a script that submits processing jobs to a queue. Before I submit the jobs, I assign the string variables to each respective data point so I can use them as the arguments before I submit the jobs through qsub.
I first had to fix the module I'm loading by passing a -v variable to set up my working environment. I then got the error message in the title, and there are very few resources for debugging it. One resource I found points to a possible extraneous space in the qsub command itself. Has anyone run into this?
I also echoed my qsub command to make sure it was being assembled correctly, and it was.
Here's my script:
#!/bin/bash
# This script is for submitting the initial registration subjects for Greedy registration.
# It can serve as a template for later studies when multiple submissions could be handy
# GO_HOME = Origin directory for all niftis of interest
GO_NIFTI="/gpfs/fs001/medorg/comp_space/myname/Test-Retest/Nifti/"
GO_B0="/gpfs/fs001/medorg/comp_space/myname/Test-Retest/Protocols/ants_SyNBaseline/W_Registration_antsSyN_Baseline7/"
GO_FM="/gpfs/fs001/medorg/comp_space/myname/Test-Retest/Protocols/brainmage_batch_t1/"
FINAL_DESTINATION="/gpfs/fs001/cbica/comp_space/wingerti/Test-Retest/Protocols/Registration_greedy_Rigid/"
cd $GO_NIFTI
nii_directories=($(find . -type d -name "*t1*" -o -name "*t0*" -o -name "*t2*" -maxdepth 1 ))
module load greedy
# Will look at these subjects individually, taking them out of the list to not run DTI_Preprocess
unset nii_directories[27] # 1000009_t0_test
unset nii_directories[17] # 1000001_t0
unset nii_directories[4] # 1000009_t2
# With directories, navigate into each, and find where the suitable niis are (31dir and 33dir)
for g in "${nii_directories[@]}";
do
# Subject ID argument
subjid=${g:2:9}
echo "$subjid is the subject ID..."
# -i argument (T1 NIFTI File and DTI)
cd $GO_NIFTI
nii_is=$(find $subjid -type f -name ${subjid}_T1.nii.gz)
nii_i=${GO_NIFTI}${nii_is}
cd $GO_B0
cd $subjid
GO_B0_2=$PWD
b0_is=$(find . -type f -name b0.nii.gz)
b0_i=${GO_B0_2}${b0_is}
echo "-i arguments for $subjid is $nii_i and $b0_i"
# -m argument (Mask File)
#cd $GO_DTI
#mask_ms=$(find $subjid -type f -name ${subjid}_tensor_mask.nii.gz)
#mask_m=${GO_DTI}${mask_ms}
#echo "-m argument for $subjid is $mask_m"
# -fm argument (T1 mask)
cd $GO_FM
mask_fms=$(find $subjid -type f -name ${subjid}_t1_brain_mask.nii.gz)
mask_fm=${GO_FM}${mask_fms}
echo "-fm argument for $subjid is $mask_fm"
# -o argument (Working Directory for possible debugging and tmp dir organization among experiments)
cd $FINAL_DESTINATION
g=${FINAL_DESTINATION:73:-1}
experiment_name="${subjid}_${g}"
mkdir $experiment_name
output_o=${FINAL_DESTINATION}${experiment_name}/${experiment_name}_rigid.txt
echo "-o argument for $g is $output_o"
#
printf "\nSubmitting the following command: \n
qsub -m beas -M myname@medschool.edu -N Registration_${experiment_name} "$(which greedy)" -d3 -i $nii_i $b0_i -o $output_o -a -m MI -n 100x100 -fm $mask_fm dof 6 -ia-identity\n
as JobID: Registration_${experiment_name}\n\n"
qsub -v /medorg/software/external/greedy/centos7/c6dca2e -m beas -M myname@medschool.edu -N Registration_${experiment_name} "$(which greedy)" -d3 -i $nii_i $b0_i -o $output_o -a -m MI -n 100x100 -fm $mask_fm dof 6 -ia-identity
# --- Above line submits Greedy Rigid jobs (dof 6) with
# --- "-m" for emailing updates on jobs, inbox sorts job submission emails
# --- "-N" names the job for book-keeping
cd $GO_NIFTI
done
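The question suspects an extraneous space in the assembled qsub command. A plain echo can hide that, because echo joins its arguments with single spaces, so an empty or space-containing argument looks the same as a clean one. One way to check, sketched here on the assumption that the script runs in bash, is to hold the command in an array and print each element with printf %q. The function name show_cmd and the sample arguments are hypothetical, and nothing is submitted.

```shell
#!/usr/bin/env bash
# show_cmd (hypothetical helper): print a command with each argument
# shell-quoted, so empty arguments and embedded whitespace become visible.
show_cmd() {
    local arg out=""
    for arg in "$@"; do
        out+="$(printf '%q' "$arg") "
    done
    printf '%s\n' "${out% }"
}

# Build the command as an array; each element stays one argument when expanded.
cmd=(qsub -N "job name" script.sh)
show_cmd "${cmd[@]}"    # prints: qsub -N job\ name script.sh
```

Once the printed form looks right, the same array can be executed with "${cmd[@]}", which passes exactly the arguments that were printed.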

batch processing : File name comparison error

I have written a script (Cifti_subject_fmri) that checks whether file names match between two folders and, where they do, executes a set of instructions:
#!/bin/bash -- fix_mni_paths
source activate ciftify_v1.0.0
export SUBJECTS_DIR=/scratch/m/mchakrav/dev/functional_data
export HCP_DATA=/scratch/m/mchakrav/dev/tCDS_ciftify
## make the $SUBJECTS_DIR if it does not already exist
mkdir -p ${HCP_DATA}
SUBJECTS=`cd $SUBJECTS_DIR; ls -1d *` ## list of my subjects
HCP=`cd $HCP_DATA; ls -1d *` ## List of HCP Subjects
cd $HCP_DATA
## submit the files to the queue
for i in $SUBJECTS;do
for j in $HCP ; do
if [[ $i == $j ]];then
parallel "echo ciftify_subject_fmri $i/filtered_func_data.nii.gz $j fMRI " ::: $SUBJECTS |qbatch --walltime '05:00:00' --ppj 8 -c 4 -j 4 -N ciftify_subject_fmri -
fi
done
done
When I run this code on the cluster I get the following error:
./Cifti_subject_fmri: [[AS1: command not found
ciftify_subject_fmri is part of the ciftify toolbox; it expects the following arguments:
ciftify_subject_fmri <func.nii.gz> <Subject> <NameOffMRI>
I have 33 subjects (AS1 to AS33), each with its own func.nii.gz file located in the SUBJECTS directory. The results need to be written to the HCP directory, and fMRI is the name of the file format.
Could someone kindly let me know why I am getting an error in the loop?
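The message "[[AS1: command not found" is typically what bash reports when [[ is written without a space before its first operand, so that [[$i expands to the single word [[AS1 and is looked up as a command. The sketch below shows the test with the required spaces and replaces the nested loop with a direct check for the matching entry. common_names is a hypothetical helper, and its two arguments stand in for $SUBJECTS_DIR and $HCP_DATA.

```shell
#!/usr/bin/env bash
# common_names (hypothetical helper): print the entries of dir_a that also
# exist in dir_b, one per line.
common_names() {
    local dir_a=$1 dir_b=$2 path name
    for path in "$dir_a"/*; do
        name=${path##*/}
        # Spaces after [[ and before ]] are mandatory; "[[$name" would be
        # parsed as a command name and fail with "command not found".
        if [[ -e "$dir_b/$name" ]]; then
            printf '%s\n' "$name"
        fi
    done
}
```

The resulting list could then be fed to parallel in place of $SUBJECTS, so that only subjects present in both directories are queued.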

How to define a fish shell function that searches for a pattern within certain files?

I want to create a function that has one required argument and two optional arguments:
(searchterm, filenamepattern='.*', grepopt='-in')
and searches the current directory recursively and prints a list of
files:linenumbers:linecontents
What is the best way to do this?
Both for dealing with function arguments in Fish, and for the find/grep command pipeline.
This is the best I came up with so far:
function findin --argument searchterm
set -q argv[1]; and echo "Searching for $searchterm"; or begin;
echo "searchterm is required."; return 1; end;
set -q argv[2]; and set -l filenamepattern $argv[2]; or set filenamepattern ".*"
set -q argv[3]; and set -l grepopt $argv[3]; or set grepopt '-in'
find . -type f -print0 | grep -iz "$filenamepattern" | xargs -0 grep "$searchterm" $grepopt
echo "just ran:\n find . -type f -print0 | grep -iz \"$filenamepattern\" | xargs -0 grep \"$searchterm\" $grepopt"
end
Invoked like this:
><>findin
searchterm is required.
><>findin 'love'
anyfile.html:343:/* research -> Healing of Love */
><>findin 'love' 'only.one'
only.one:343:/* research -> Healing of Love */
><>findin 'love' 'index.html' ''
index.html:/* case sensitive without line#s-> healing of love */

Ruby and shell : get the same output with ls

I have a folder with the following zip files:
13162.zip 14864.zip 19573.zip 20198.zip
In the console, when I run:
cd my_folder; echo `ls *{.zip,.ZIP}`
I get the following output (which is perfect):
ls: cannot access *.ZIP: No such file or directory
13162.zip 14864.zip 19573.zip 20198.zip
When I try the same in Ruby:
cmd= "cd my_folder; echo `ls {*.zip,*.ZIP}`";
puts `#{cmd}`
it only displays:
ls: cannot access {*.zip,*.ZIP}: No such file or directory
=> nil
I tried this solution:
Getting output of system() calls in Ruby
but it does not seem to work in my case.
How can I get the same output in Ruby as in the shell?
Ruby only
You can use Dir.glob with File::FNM_CASEFOLD for a case-insensitive search:
Dir.chdir 'my_folder' do
Dir.glob('*.zip', File::FNM_CASEFOLD).each do |zip_file|
puts zip_file
end
end
#=>
# 19573.zip
# 13162.zip
# 14864.zip
# 20198.zip
# 12345.zIp
Ruby + bash
You can use find for a case-insensitive search:
paths = `find my_folder -maxdepth 1 -iname '*.zip'`.split
#=> ["my_folder/19573.zip", "my_folder/13162.zip", "my_folder/14864.zip", "my_folder/20198.zip", "my_folder/12345.zIp"]
-printf '%P\n' can also be used to display only the file names:
files = `find my_folder -maxdepth 1 -iname '*.zip' -printf '%P\n'`.split
#=> ["19573.zip", "13162.zip", "14864.zip", "20198.zip", "12345.zIp"]
I think this should work directly in the terminal:
echo 'system("ls *.ZIP *.zip")' | ruby
or create a Ruby file with the following contents
system("cd my_folder; ls *.zip *.ZIP")
and then execute it. Since ls already prints the names, you don't need echo.
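The error in the question shows the braces reaching ls unexpanded, which is what happens when the command string is run by a /bin/sh that has no brace expansion (dash, for example). A sketch that avoids braces entirely and should behave the same under plain sh and bash is a case pattern over the directory listing. list_zips is a hypothetical helper name.

```shell
#!/bin/sh
# list_zips (hypothetical helper): print the names of regular files in a
# directory whose extension is "zip" in any letter case. No brace expansion
# is used, so this also works when the interpreter is plain sh.
list_zips() {
    dir=$1
    for f in "$dir"/*; do
        case $f in
            *.[zZ][iI][pP])
                if [ -f "$f" ]; then
                    printf '%s\n' "${f##*/}"
                fi
                ;;
        esac
    done
}
```

From Ruby, the helper could live in a small script file that is invoked with backticks, so the output does not depend on which shell /bin/sh points to.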

Regex in sed to match a subpath in a path with capturing groups

I have a list of dictionaries, each made up of two files named index with the extensions {aff,dic}, like
dictionaries/dictionaries/bg_BG/index.dic
dictionaries/dictionaries/ca_ES/index.dic
dictionaries/dictionaries/cs_CZ/index.dic
dictionaries/dictionaries/da_DK/index.dic
...
dictionaries/dictionaries/bg_BG/index.aff
dictionaries/dictionaries/ca_ES/index.aff
dictionaries/dictionaries/cs_CZ/index.aff
dictionaries/dictionaries/da_DK/index.aff
and I want to copy them into a different folder, naming each file after its subpath, such as it_IT, so as to have
myDicts/it_IT.dic
myDicts/it_IT.aff
I came up with this one-liner
for file in dictionaries/dictionaries/**/*.{dic,aff}; do echo ${file}; done
that lists the files in these folders, so that the loop variable $file holds a path such as dictionaries/dictionaries/da_DK/index.aff.
Using sed I was able to select (and thereby remove) the locale pattern with
sed 's:[a-z][a-z][_-][A-Z][A-Z]::';
which gives
for file in dictionaries/dictionaries/**/*.{dic,aff}; do echo ${file} | sed 's:[a-z][a-z][_-][A-Z][A-Z]::'; done
which this time prints
dictionaries/dictionaries//index.dic
dictionaries/dictionaries//index.dic
dictionaries/dictionaries//index.dic
...
dictionaries/dictionaries//index.aff
dictionaries/dictionaries//index.aff
dictionaries/dictionaries//index.aff
As I understand it, for sed to print only the captured group, the expression has to match both the captured group and the non-captured part - see here
But I was not able to figure out how to do this so that $file ends up as
bg_BG.aff
ca_ES.aff
da_DK.aff
...
bg_BG.dic
ca_ES.dic
da_DK.dic
where the extension {aff,dic} should be kept as well.
I need to execute this command inline for scripting reasons.
[UPDATE]
Thanks to the answer below I came up with this solution
for file in dictionaries/dictionaries/**/*.{dic,aff}; do echo $file | sed 's:.*\([a-z][a-z][_-][A-Z][A-Z]\)/index\(.*\):cp & myDicts/\1\2:' | sh; done
that does its job:
$ ls myDicts/
bg_BG.aff cs_CZ.aff de_AT.aff de_DE.aff en_AU.aff en_GB.aff en_ZA.aff eu_ES.aff gl_ES.aff it_IT.aff mn_MN.aff nl_NL.aff pl_PL.aff pt_PT.aff ru_RU.aff sl_SI.aff sv_SE.aff uk_UA.aff
bg_BG.dic cs_CZ.dic de_AT.dic de_DE.dic en_AU.dic en_GB.dic en_ZA.dic eu_ES.dic gl_ES.dic it_IT.dic mn_MN.dic nl_NL.dic pl_PL.dic pt_PT.dic ru_RU.dic sl_SI.dic sv_SE.dic uk_UA.dic
ca_ES.aff da_DK.aff de_CH.aff el_GR.aff en_CA.aff en_US.aff es_ES.aff fr_FR.aff hr_HR.aff lb_LU.aff nb_NO.aff nn_NO.aff pt_BR.aff ro_RO.aff sk_SK.aff sr_RS.aff tr-TR.aff vi_VN.aff
ca_ES.dic da_DK.dic de_CH.dic el_GR.dic en_CA.dic en_US.dic es_ES.dic fr_FR.dic hr_HR.dic lb_LU.dic nb_NO.dic nn_NO.dic pt_BR.dic ro_RO.dic sk_SK.dic sr_RS.dic tr-TR.dic vi_VN.dic
The only pitfall is that it does not capture path patterns like these:
dictionaries/dictionaries/ca_ES-valencia/
dictionaries/dictionaries/sr_RS-Latn/
Here's a way:
echo dictionaries/dictionaries/da_DK/index.aff |
sed 's:.*/\([^/]\+\)/index\(\..*\):\1\2:'
output:
da_DK.aff
However, there's a faster way than a for loop:
find dictionaries/dictionaries -name "index.dic" -or -name "index.aff" |
sed 's:dictionaries/dictionaries/\([^/]\+\)/index\(\..*\):mv & myDicts/\1\2:'
If that produces the commands you want, pipe it to sh:
mkdir myDicts
find dictionaries/dictionaries -name "index.dic" -or -name "index.aff" |
sed 's:dictionaries/dictionaries/\([^/]\+\)/index\(\..*\):mv & myDicts/\1\2:' |
sh
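As an alternative that sidesteps the regex, the target name can be derived with dirname, basename and parameter expansion, which also covers directory names such as ca_ES-valencia and sr_RS-Latn. This is only a sketch. dict_target and copy_dicts are hypothetical helper names, and it copies (cp) rather than moves, as the question asked.

```shell
#!/usr/bin/env bash
# dict_target (hypothetical helper): map .../<locale>/index.<ext> to
# <dest>/<locale>.<ext> and print the result.
dict_target() {
    local path=$1 dest=$2 locale ext
    locale=$(basename "$(dirname "$path")")   # e.g. da_DK or ca_ES-valencia
    ext=${path##*.}                           # e.g. aff or dic
    printf '%s/%s.%s\n' "$dest" "$locale" "$ext"
}

# copy_dicts (hypothetical helper): copy every index.dic and index.aff found
# one level below src into dest under its locale-based name.
copy_dicts() {
    local src=$1 dest=$2 path
    mkdir -p "$dest"
    for path in "$src"/*/index.dic "$src"/*/index.aff; do
        [ -e "$path" ] || continue   # skip a pattern that matched nothing
        cp -- "$path" "$(dict_target "$path" "$dest")"
    done
}
```

For the layout in the question this would be run as copy_dicts dictionaries/dictionaries myDicts.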
