I have around 50 input files to a terminal program. The program takes one file as input at a time, prints some data, and terminates.
When it has terminated, I run the program again with the next file and so on.
Is there a way to make this automatic, since the whole thing will take several hours (some files take a few minutes, others up to an hour), and to save each run's printed output in a file output_inputfile.txt?
I was thinking of having a file like
myprogram file-1
myprogram file-2
myprogram file-3
and execute it in some way.
You can accomplish that with a shell script; for an introduction, have a look at http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO-7.html. You could just put all the input files in one directory and use this simple script:
#!/bin/bash
cd /path/to/your/files || exit 1  # go to the directory with the input files
for i in *; do                    # loop over every file in that directory
    /path/to/your/program "$i"    # call your program on that file
done
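To also capture each run's printed output into its own file, as you described (output_inputfile.txt), a minimal sketch of the same loop with a redirection added (the paths are placeholders):
cd /path/to/your/files || exit 1
for i in *; do
    /path/to/your/program "$i" > "output_$i.txt"  # each run's stdout goes to its own file
done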
Recently I tried to list all of the images located in a directory I had (several hundred) and put them into a file. I used a very simple command
ls > "image_names.txt"
I was bored and decided to look inside the file, and realized that image_names.txt itself was listed in the file. Then I realized the order of operations was not what I thought. I had read the command left to right, as two separate steps:
ls (First list all the file names)
> "image_names.txt" (Then create this file and pipe it here)
Why is it creating the file first then listing all of the files in the directory, despite the ls command coming first?
When you use output redirection, the shell needs a place to put your output (suppose it were very long: it could otherwise be lost when the command terminates, or exhaust all working memory), so the first step is to open the output file so the executed command's stdout can be streamed into it.
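You can see this ordering even when the command itself fails to run: the shell has already created the output file before it ever tries to execute the command (the file name here is arbitrary):
$ nosuchcommand > out.txt
bash: nosuchcommand: command not found
$ ls out.txt
out.txt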
This is especially important to know in this kind of command
cat a.txt | grep "foo" > a.txt
since a.txt is opened first, and not in append mode, it is truncated, meaning there is no input left for cat. So the behaviour you might expect, that the lines are filtered from a.txt and written back to a.txt, does not actually happen; instead you just lose the contents of a.txt.
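If you do want to filter a file and write the result back to the same file, one common approach is to write to a temporary file first and then replace the original (a sketch, reusing the a.txt example):
grep "foo" a.txt > a.txt.tmp && mv a.txt.tmp a.txt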
Because the redirection > "image_names.txt" is performed before the ls command runs.
I have 1000 input files for a program, and I have no control over the program's output.
I can run the program over each file as shown below. The program takes an input file (named like input1, input2, input3), runs, and saves several output files, but each run overwrites the outputs of the previous one.
for i in {1..3}; do
    myprogram input"$i"
done
I thought I would create 3 folders, put one input file in each, and then run the program there so that it might write its output into that folder, but that was still not successful.
for i in {1..3}; do
    myprogram "$i"/input"$i"
done
Basically I want to execute the program, save the output of each run somewhere, and then move on to the next input.
Is there any way to cope with this?
Thanks
If it is overwriting the input file, as indicated in your comment, you can save the original input file by copying it under a new name and then calling the program on the copy. Then, if you really want them in a subdirectory, make a directory and move the input and/or output file(s) into it.
for i in {1..3}
do
    cp "infile$i" "outfile$i"
    ./myprogram "outfile$i"
    mkdir "programRun-$i"
    mv "infile$i" "outfile$i" "programRun-$i"
done
If it is leaving the input file alone, and just outputs to a consistent file name, then something like
for i in {1..3}
do
    ./myprogram "infile$i"
    mkdir "programRun-$i"
    mv outfile "programRun-$i/outfile-$i"
done
Note that in either case, I'd consider using something other than $i to identify each run of the program: perhaps a date/time in YYYYMMDDHHMMSS form, or just a Unix timestamp. That is purely for organizational purposes, so that all output files from a given run stay together, but use whatever fits your needs.
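For example, a run identifier based on the current date and time (a sketch; infile, outfile and myprogram are the same assumed names as above):
run_id=$(date +%Y%m%d%H%M%S)   # e.g. 20240131093045
for i in {1..3}
do
    mkdir "programRun-$run_id-$i"
    ./myprogram "infile$i"
    mv outfile "programRun-$run_id-$i/outfile-$i"
done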
If myprogram always creates the same file names, then you could move them out of the way before executing the next loop iteration. This example assumes the output files are called out*.txt.
for i in {1..3}; do ./myprogram input"$i"; mkdir output"$i"; mv out*.txt output"$i"/; done
If the file names created differ, you could create a new directory and cd into it before executing the application.
for i in {1..3}; do mkdir output"$i"; cd output"$i"; ../myprogram ../input"$i"; cd ..; done
I have a while loop in a bash script:
Example:
while read LINE
do
    echo $LINE >> $log_file
done < ./sample_file
My question is: why, when I delete sample_file while the script is running, does the loop not end? I can see that the log_file is still being updated. How can the loop keep going when there is no input file?
In unix, a file isn't truly deleted until the last directory entry for it is removed (e.g. with rm) and the last open file handle for it is closed. See this question (especially MarkR's answer) for more info. In the case of your script, the file is opened as stdin for the while read loop, and until that loop exits (or closes its stdin), rming the file will not actually delete it off disk.
You can see this effect pretty easily if you want. Open three terminal windows. In the first, run the command cat >/tmp/deleteme. In the second, run tail -f /tmp/deleteme. In the third, after running the other two commands, run rm /tmp/deleteme. At this point, the file has been unlinked, but both the cat and tail processes have open file handles for it, so it hasn't actually been deleted. You can prove this by typing into the first terminal window (running cat): every time you hit return, tail will see the new line added to the file and display it in the second window.
The file will not actually be deleted until you end those two commands (Control-D will end cat, but you need Control-C to kill tail).
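On Linux you can also confirm that tail still has the unlinked file open by inspecting its file descriptors under /proc (a sketch; it assumes this is the only tail process running):
$ ls -l /proc/$(pgrep -x tail)/fd
The descriptor that was reading /tmp/deleteme is still listed, marked with a "(deleted)" suffix, and stays that way until the process exits.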
See "Why file is accessible after deleting in unix?" for an excellent explanation of what you are observing here.
In short...
Underlying rm and any other command that may appear to delete a file
there is the system call unlink. And it's called unlink, not remove or
deletefile or anything similar, because it doesn't remove a file. It
removes a link (a.k.a. directory entry) which is an association
between a file and a name in a directory.
You can use the truncate command to destroy the actual contents (or shred if you need to be more secure), which would halt the execution of your example loop as soon as it next tries to read a line.
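For example (assuming sample_file is the file the loop is reading from):
truncate -s 0 sample_file   # empty the file in place; the loop hits end-of-file on its next read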
The moment the shell starts the while loop it has already opened sample_file, and from that point on it does not matter whether the file's directory entry still exists.
Test script:
$ cat test.sh
#!/bin/bash
while read line
do
echo $line
sleep 1
done < data_file
Test file:
$ seq 1 10 > data_file
Now run the script in one terminal and, in another terminal, delete the file data_file; you will still see the numbers 1 to 10 printed by the script.
I spent the better part of the day looking for a solution to this problem and I think I am nearing the brink... What I need to do in bash is: write one script that will periodically read your input and write it into a file, and a second script that will periodically print out the complete file, BUT only when something new has been written to it, meaning it will never print the same output twice in a row. The two scripts need to communicate by means of a lock, meaning script 1 will lock the file so that script 2 can't print anything from it, then script 1 will write something new into the file and unlock it (and then script 2 can print the updated file).
The only hints we got were to use flock and lockfile. We didn't get any hints on how to use them, except that the problem MUST be solved with flock or lockfile.
Edit: when I said I was looking for a solution, I meant that I had tried every single combination of flock with those flags and I just couldn't get it to work.
I will write pseudocode of what I want to do. A thing to note here is that this pseudocode is basically the same as how it is done in C. It is so simple; I don't know why everything has to be so complicated in bash.
script 1:
place a lock on file text.txt (no one else can read it or write to it)
read input
place that input into file ( not deleting previous text )
remove lock on file text.txt
repeat
script 2:
print out the complete text.txt (but only if it is not locked; if it is locked, obviously you can't)
repeat
And since script 2 repeats all the time, it should print the complete text.txt ONLY when something new has been written to it.
I have about 100 other commands like flock that I have to learn in a very short time, and I spent a whole day on just this one. It would be kind of you to at least give me a hint. As for the man page ...
I tried to do something like flock -x text.txt -c read > text.txt, and tried every other combination too, but nothing works. It takes only one command and won't accept arguments. I don't even know why there is an option for a command. I just want it to place a lock on the file, write into it, and then unlock it. In C it only takes flock("text.txt", ..).
Let's look at what this does:
flock -x text.txt -c read > text.txt
First, it opens text.txt for writing (and truncates all its contents) before doing anything else, including before calling flock!
Second, it tells flock to get an exclusive lock on the file and run the command read.
However, read is a shell builtin, not an external command -- so it can't be called by a non-shell process at all, mooting any effect that it might otherwise have had.
Now, let's try using flock the way the man page suggests using it:
{
    flock -x 3                      # grab a lock on file descriptor #3
    printf "Input to add to file: " # Prompt user
    read -r new_input               # Read input from user
    printf '%s\n' "$new_input" >&3  # Write new content to the FD
} 3>>text.txt                       # do all this with FD 3 open to text.txt
...and, on the read end:
{
    flock -s 3   # wait for a read (shared) lock
    cat <&3      # read contents of the file from FD 3
} 3<text.txt     # all of this with text.txt open to FD 3
You'll notice some differences from what you were trying before:
The file descriptor used to grab the lock is in append mode (when writing to the end), or in read mode (when reading), so you aren't overwriting the file before you even grab the lock.
The read command (which, again, is a shell builtin, and so can only be run by the shell itself) is run directly by the shell, rather than by telling the flock command to invoke it via the execve syscall (which is, again, impossible).
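To match the "repeat" in your pseudocode, you can simply wrap the writer block in a loop; a minimal sketch (using the same text.txt and FD 3 as above):
#!/bin/bash
# writer: repeatedly append a line of user input to text.txt under an exclusive lock
while true; do
    {
        flock -x 3                       # block until we hold the lock on FD 3
        printf "Input to add to file: "
        read -r new_input
        printf '%s\n' "$new_input" >&3   # append the new line
    } 3>>text.txt                        # FD 3 open to text.txt in append mode
done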
I'd like to run a program several times with slightly different inputs. The input file is a long .in file, and I'd only want to edit a single number in a specific line of that file. So ideally I'd like to write a Unix script that repeats this process several times:
Edits a line in a .in file
Runs a program which uses that file as input
Renames the output .nc file from the program and saves it
I'm completely new to this sort of scripting, and while I'm pretty sure I can figure out how to do steps 2 and 3 of this process, I'm not sure how to do the first step. Is it possible to use a script to automate the editing of a .in file, and how would I do that?
Here's an example that should get you started:
$ echo cat says meow >say.txt
$ sed -i s/meow/meowwwwwww/ say.txt
$ cat say.txt
cat says meowwwwwww
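Putting your three steps together, a minimal sketch (input.in, output.nc, the "param =" line, the list of values, and myprogram are all placeholders for your actual names):
#!/bin/bash
for value in 1.0 2.0 3.0; do
    # 1. edit the one number on a specific line of the .in file
    sed -i "s/^param = .*/param = $value/" input.in
    # 2. run the program, which reads input.in and writes output.nc
    ./myprogram input.in
    # 3. rename the output so the next run does not overwrite it
    mv output.nc "output_$value.nc"
done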
Let me know if you need more help.