How to segregate the display output of a binary into 2 different files? - shell

I have a shell script which executes a couple of binary files in an infinite loop. My task is to redirect the output displayed by these binaries into a log file stamped with the date (YY-MM-DD). This task is pretty easy; however, a problem arises when the day changes. If a particular binary file is still executing (not yet finished) when the day changes, its output should be logged into 2 different files with different date stamps. For example:
while true; do
    if [ "$current_date" = "$new_date" ]; then
        execute_binary >> log.out_$current_date
        # If the binary is still executing when the day changes,
        # how do I redirect into 2 files ???
    else
        execute_binary >> log.out_$new_date
    fi
done
Required output: output produced on current_date should be logged in the current log file, and output produced on the new date should be logged in the new file.
Please help.

Just create a further script that reads from the stdout of your binary and checks the date after every single line read.
(source of outputdatescript.sh)
#!/bin/bash
while read -r line; do
    # example made with a Unix timestamp; replace it with your date string generator
    date=$(date "+%s")
    echo "$line" >> "file.$date"
done
Then the command in your main script must be changed from
execute_binary >> log.out_$current_date
to
execute_binary | ./outputdatescript.sh
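Adapting that idea to the question's YY-MM-DD date stamp and log.out_ prefix (both taken from the question; the per-line approach is the answer's), a minimal sketch of outputdatescript.sh could be:
#!/bin/bash
# outputdatescript.sh (sketch): append each incoming line to a log file
# named after the day the line arrived, so a run that crosses midnight
# is split across two files automatically.
while IFS= read -r line; do
    printf '%s\n' "$line" >> "log.out_$(date +%y-%m-%d)"
done
The main loop then reduces to: while true; do execute_binary | ./outputdatescript.sh; done. Once midnight passes mid-run, subsequent lines simply land in the next day's file.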

Related

Add hydrogen to different PDB files at the same time using the reduce program in a shell script, with a modified file name (1axk.pdb <----> 1ackH.pdb)

I tried to run this code but it repeated the same file and ignored the other files. I want to store all the directory's files in a variable, read the files one by one, and after the reduce program adds hydrogen, save each output under the same file name with an H added, in a separate directory. How can I do this? Please help me, I am new to this coding field.
#!/bin/bash
# A script for reduce program
l = cd ~/hetero
for l in hetero/*.pdb
do
ls | xargs -L 1 -d '\n' reduce *.pdb > ....*H.pdb
done
echo "all done"
Your for-loop defines l as a variable whose value you obtain within the loop with the syntax $l. If you don't use $l inside your loop, you cannot expect loop iterations to depend on it.
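A minimal corrected sketch, assuming reduce writes the modified PDB to standard output when given a file argument (as the Richardson lab reduce does) and that a separate output directory such as ~/hetero_H is wanted (that directory name is an assumption):
#!/bin/bash
# Add hydrogens to every PDB file in ~/hetero with reduce and save each
# result as <name>H.pdb in a separate directory.
in_dir=~/hetero
out_dir=~/hetero_H            # assumed output directory
mkdir -p "$out_dir"
for f in "$in_dir"/*.pdb; do
    base=$(basename "$f" .pdb)              # e.g. 1axk
    reduce "$f" > "$out_dir/${base}H.pdb"   # e.g. 1axkH.pdb
done
echo "all done"
Here the loop variable f is actually used inside the loop, which is what makes each iteration operate on a different file.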

Process multiple files one by one dynamically in workflow using indirect file method

My workflow uses 3 indirect files.
The indirect files can have one or more file names.
Let's say all 3 indirect files have 2 file names each.
Indirect_file1 has (file1,file2)
Indirect_file2 has (filea,fileb)
Indirect_file3 has (filex,filey)
My workflow should run in sequence.
First sequence (file1,filea,filex)
Second sequence (file2,fileb,filey)
We are on a Linux environment, so I guess it can be done using a shell script.
Any pointers will be appreciated.
Thanks in Advance.
This should work:
In the Informatica session, modify the input type to 'Command'.
In the Informatica session, change the command type to 'Command generating file List'.
For the first workflow set the command like this: cut -d ',' -f1 file (if your delimiter is a comma).
For the second workflow set the command like this: cut -d ',' -f2 file (if your delimiter is a comma).
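As a quick illustration, assuming Indirect_file1 contains the single line file1,file2:
cut -d ',' -f1 Indirect_file1   # prints file1 (first sequence)
cut -d ',' -f2 Indirect_file1   # prints file2 (second sequence)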
You might want to build small work packages before processing; when the workflow takes a long time, it is easier to (re-)start new processes.
You can start with something like this:
# Step 1: move the current set to a temporary folder
combine_dir=/tmp/combine
mkdir -p "${combine_dir}"
mv Indirect_file1 Indirect_file2 Indirect_file3 "${combine_dir}"
# Step 2: construct work packages in another tmp dir
workload_dir=/tmp/workload
mkdir -p "${workload_dir}"
for file in "${combine_dir}"/Indirect_file1 "${combine_dir}"/Indirect_file2 "${combine_dir}"/Indirect_file3; do
    loadnr=1
    for work in $(grep -Eo '[^(,)]*' "${file}"); do
        echo "${work}" >> "${workload_dir}/sequence${loadnr}"
        ((loadnr++))
    done
done
# The sequenceXXX files have been generated with one file name on each line.
# When you must have it like (file1,filea,filex), change the loop above.
# Now the files are ready to be processed. Move them to the dir where they will be handled.
# Please clean up the temporary files afterwards.
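A possible step 3, sketched here with an echo as a stand-in for however the workflow is actually launched (that invocation is not given in the question):
# Step 3 (sketch): hand each work package to the workflow in order
for seq in "${workload_dir}"/sequence*; do
    # one file name per line; join into file1,filea,filex style if needed
    list=$(paste -sd ',' "$seq")
    echo "workflow would now run on: ${list}"   # replace with the real workflow call
done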

kickstart five concurrent processes in bash

I have a folder named datafolder which contains five csv files: aa.csv, ab.csv, ac.csv, ad.csv, ae.csv. Each csv file contains data from an excel sheet in the format: date, product type, name, address etc., and I am only interested in the second column, which is named product. Basically what I want to happen is for the jobmaster script to count the number of files in datafolder and then start a map process for each individual file. I have the following scripts:
The jobmaster script runs without problems; however, once the map script starts, only the first echo ("mapping $1") is displayed and the process is stuck in an infinite loop (my guess). When I run the ps command I expect to see 5 map.sh processes running, however there are none.
I suspect you missed an input redirection in map.sh:
file="$1"
echo "mapping $file"
while IFS="," read -r value1 product remainder; do
    # ...
done < "$file"
#      ^^^^^^^ provide the standard input from this file to `read`
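A fuller map.sh sketch along those lines, assuming the goal is to tally the product column (the second CSV field) of the file passed as the first argument; the .counts output name is an assumption:
#!/bin/bash
# map.sh (sketch): count occurrences of each product (2nd CSV column)
file="$1"
echo "mapping $file"
while IFS="," read -r value1 product remainder; do
    echo "$product"
done < "$file" | sort | uniq -c > "${file}.counts"
echo "finished mapping $file"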

Retrieving file name for bash/shell programming

I need to access two files in my shell script. The only issue is, I am not sure what the file name is going to be, as it is system generated. A part of the file name is always constant, but the rest of it is system generated and hence may vary. I am not sure how to access these files.
Sample File Names
Type 1
MyFile1.yyyy-mm-dd_xx:yy:zz.log
In this case, I know the MyFile1 portion is constant for all the files; the other portion varies based on date and time. I can use date +%Y-%m-%d to get as far as MyFile1.yyyy-mm-dd_ but I am not sure how to select the correct file. Please note each day will have just one file of this kind. In Unix the below command gives me the correct file.
unix> ls MyFile1.yyyy-mm-dd*
Type 2
MyFile2.yyyymmddxxyyxx.RandomText.SomeNumber.txt
In this file, as you can see, the MyFile2 portion is common. I can use date +%Y%m%d to get as far as (the current date) MyFile2.yyyymmdd, but again it is not very clear how to go on from there. In Unix the below command gives me the correct file. Also I need to have the previous date in the dd column for File 2.
unix> ls MyFile2.yyyymmdd*
Basically I am looking for the following lines in my shell script:
#!/bin/ksh
timeA=$(date +%Y-%m-%d)
timeB=$(date +%Y%m)
sysD=$(date +%d)
sysD=$((sysD-1))
filename1=($Home/folder/MyFile1.$timeA*)
filename2=($Home/folder/MyFile2.$timeB$sysD*)
Just not sure how to get the RHS for these two files.
The result when running the above script is as below:
Script.ksh[8]: syntax error at line 8 : `(' unexpected
Perhaps this
$ file=(MyFile1.yyyy-mm-dd*)
$ echo $file
MyFile1.yyyy-mm-dd_xx:yy:zz.log
It should be noted that you must declare variables in this manner
foo=123
NOT
foo = 123
Notice carefully, bad
filename1=$($HOME/folder/MyFile1.$timeA*)
good
filename1=($HOME/folder/MyFile1.$timeA*)
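Putting it together for both files in the question's ksh setting: the error `(' unexpected typically means the interpreter is an older ksh that does not accept the var=(...) array syntax, in which case set -A is the ksh way to capture a glob. The sketch below assumes that, uses $HOME (not $Home), and uses GNU date -d for yesterday because the question's sysD-1 arithmetic breaks on the first day of a month; adjust if GNU date is unavailable.
#!/bin/ksh
timeA=$(date +%Y-%m-%d)
yday=$(date -d yesterday +%Y%m%d)             # GNU date; assumption
set -A files1 $HOME/folder/MyFile1.$timeA*    # ksh array from a glob
set -A files2 $HOME/folder/MyFile2.$yday*
filename1=${files1[0]}
filename2=${files2[0]}
echo "$filename1"
echo "$filename2"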

batch job submission upon completion of job

I would like to write a script to execute the steps outlined below. If someone can provide simple examples on how to modify files and search through folders using a script (not necessarily solving my problem below), I will greatly appreciate it.
submit job MyJob in currentDirectory using myJobShellFile.sh to a queue
upon completion of MyJob, goto to currentDirectory/myJobDataFolder.
In myJobDataFolder, there are folders
myJobData.0000 myJobData.0001 myJobData.0002 myJobData.0003
I want to find the maximum number maxIteration of all the listed folders. Here it would be maxIteration=0003.
In the file myJobShellFile.sh, the last line says
mpiexec ./main input myJobDataFolder
I want to append this line to
'mpiexec ./main input myJobDataFolder 0003'
I want to submit MyJob to the queue while maxIteration < 10.
Upon completion of MyJob, find the new maxIteration, change this number in myJobShellFile.sh, and go to step 4.
I think people typically write python scripts to do this stuff, but I am having a hard time finding out how. I probably don't know the correct terminology for this procedure. I am also aware that the script will vary slightly depending on the queuing system, but any help will be greatly appreciated.
Quite a few aspects of your question are unclear, such as the meaning of “submit job MyJob in currentDirectory using myJobShellFile.sh to a queue”, “append this line to 'mpiexec ./main input myJobDataFolder 0003'”, how you detect when a job is done, the relevant parts of myJobShellFile.sh, and some other details. If you can list the specific shell commands you use in each iteration of job submission, then you can post a better question, with a bash tag instead of python.
In the following script, I put a ### at the end of any line where I am guessing what you are talking about. Lines ending with ### may be irrelevant to whatever you actually do, or may be pseudocode. Anyway, the general idea is that the script is supposed to do the things you listed in your items 1 to 5. This script assumes that you have modified myJobShellFile.sh to say
mpiexec ./main input $1 $2
instead of
mpiexec ./main input
because it is simpler to use parameters to modify what you tell mpiexec than it is to keep modifying a shell script. Also, it seems to me you would want to increment maxIter before submitting the next job, instead of after. If so, remove the # from the t=$((1$maxIter+1)); maxIter=${t#1} line. Note, see the “Parameter Expansion” section of man bash regarding expansion of the ${var#txt} form, and the “Arithmetic Expansion” section regarding the $((expression)) form. The 1$maxIter and similar forms are used to change text like 0018 (which is not a valid bash number because 8 is not an octal digit) to 10018.
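For example, with an assumed value of 0018 the trick works like this:
maxIter=0018
t=$((1$maxIter + 1))   # 1$maxIter expands to 10018, so t becomes 10019
maxIter=${t#1}         # strip the leading 1 again, leaving 0019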
#!/bin/bash
./myJobShellFile.sh MyJob ###
maxIter=0
while true; do
    waitforjobcompletion ###
    cd ./myJobDataFolder
    maxFile=$(ls myJobData* | tail -1)
    maxIter=${maxFile#myJobData.}   # Get max extension
    # If you want to increment maxIter, uncomment the next line
    # t=$((1$maxIter+1)); maxIter=${t#1}
    cd ..
    if [[ 1$maxIter -lt 11000 ]]; then
        ./myJobShellFile.sh MyJobDataFolder $maxIter
    else
        break
    fi
done
Notes: (1) To test with smaller runs than 1000 submissions, replace 11000 by 10000+n; for example, to do 123 runs, replace it with 10123. (2) In writing the above script, I assumed that not-previously-known numbers of output files appear in the output directory from time to time. If instead exactly one output file appears per run, and you just want to do one run per value for the values 0000, 0001, 0002, ..., 0999, 1000, then use a script like the following. (For testing with a smaller number than 1000, replace 1000 with (eg) 0020. The leading zeroes in these numbers tell bash to pad the generated numbers with leading zeroes.)
#!/bin/bash
for iter in {0000..1000}; do
    ./myJobShellFile.sh MyJobDataFolder $iter
    waitforjobcompletion ###
done
(3) If the system has a command that sleeps while it waits for a job to complete on the supercomputing resource, it is reasonable to use that command in place of waitforjobcompletion in the above scripts. Otherwise, if the system has a command jobisrunning that returns true if a job is still running, replace waitforjobcompletion with something like the following:
while jobisrunning ; do sleep 15; done
This will run the jobisrunning command; if it returns true, the shell will sleep for 15 seconds and then retest. Here is an example that illustrates waiting for a file to appear and then for it to go away:
while [ ! -f abc ]; do sleep 3; echo no abc; done
while ls abc >/dev/null 2>&1; do sleep 3; echo an abc; done
The second line's test could be [ -f abc ] instead; I showed a longer example to illustrate how to suppress output and error messages by routing them to /dev/null. (4) To reverse the sense of a while statement's test, replace the word while with until. For example, while [ ! -f abc ]; ... is equivalent to until [ -f abc ]; ....
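For instance, these two loops are equivalent ways of waiting for the file abc to appear:
while [ ! -f abc ]; do sleep 3; done
until [ -f abc ];   do sleep 3; done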
