subprocess missing output file - macOS

I am completely new to python but I am trying to learn.
I would like to use the subprocess module to run a simulation program that can be called from the terminal in a bash environment. The syntax is quite simple:
command inputfile.in
where the command is a larger simulation script running in a Tcl/Tk environment.
I have read a lot of the Python documentation and have decided to use the Popen functionality of the subprocess module.
So from what I understand I should be able to format the command as follows:
import subprocess
p = subprocess.Popen(['command', 'inputfile.in'], stdout=subprocess.PIPE)
print(p.communicate())
The output of this command is two files. When I run the command in the terminal, I get two files in the same directory as the original input file:
File1.fid and File2.spe.
When I use Popen there are two things that confuse me.
(1) I do not get any output files written to the directory of the input file. (2) The value returned by p.communicate() is present, indicating that the simulation was run.
What happened to the output files? Is there a particular way to call a command that produces files as a result?
I am running this in a Jupyter notebook cell inside a for loop. The loop iteratively changes the input file, systematically varying the conditions of the simulations. My operating system is macOS.
The goal is to run the command on each iteration of the for loop and store the output file data in a larger dictionary. Later I would like to compare the output file data to the experimental data iteratively in an optimization process that minimizes the residuals.
I would appreciate any help, and also any direction if Popen is not the correct Python function for this.

Let's learn from something easy like this:
# This is equivalent to running `dir *.* /s /b` at the Windows command prompt
import subprocess
sp = subprocess.Popen(['dir', '*.*', '/s', '/b'], stderr=subprocess.PIPE, stdout=subprocess.PIPE, shell=True)
(std_out, std_err) = sp.communicate()  # returns (stdout, stderr)
# print any error output (expect an empty value if the command succeeded)
print('std_err: ', std_err)
# print the captured output (can be a long listing)
print('std_out: ', std_out)
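Applying the same pattern to the simulation from the question, here is a minimal sketch (the command name, input file, and output file names come from the question; the input path and the use of cwd= are assumptions, meant to make the output files land next to the input file rather than in the notebook's working directory):
import os
import subprocess

input_path = "/path/to/inputfile.in"   # hypothetical location of the input file
work_dir = os.path.dirname(input_path)

# Run `command inputfile.in` with the working directory set to the input file's folder,
# so that File1.fid and File2.spe end up next to inputfile.in.
p = subprocess.Popen(
    ["command", os.path.basename(input_path)],
    cwd=work_dir,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
std_out, std_err = p.communicate()

# Collect the files the simulation produced, e.g. into a dictionary for later comparison.
results = {}
for name in ("File1.fid", "File2.spe"):
    out_path = os.path.join(work_dir, name)
    if os.path.exists(out_path):
        with open(out_path, "rb") as f:
            results[name] = f.read()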

Related

Why won't an external executable run without manual input from the terminal command line?

I am currently writing a Python script that will pipe some RNA sequences (strings) into a UNIX executable, which, after processing them, will then send the output back into my Python script for further processing. I am doing this with the subprocess module.
However, in order for the executable to run, it must also have some additional arguments provided to it. Using the subprocess.call method, I have been trying to run:
import subprocess
seq = "acgtgagtag"
output = subprocess.Popen(["./DNAanalyzer", seq])
Despite my environment variables being set properly, the executables running without problem from the terminal command line, and the subprocess module functioning normally (e.g. subprocess.Popen(["ls"]) works just fine), the Unix executable prints the same output:
Failed to open input file acgtgagtag.in
Requesting input manually.
There are a few other Unix executables in this package, and all of them behave the same way. I even tried to create a simple text file containing the sequence and specify it as the input in both the Python script as well as within the command line, but the executables only want manual input.
I have looked through the package's manual, but it does not mention why the executables can ostensibly be only run through the command line. Because I have limited experience with this module (and Python in general), can anybody indicate what the best approach to this problem would be?
Popen() is actually a constructor for an object – that object being a subprocess that directly runs the executable. But because I didn't set standard input or output (stdin and stdout), they default to None, meaning the child process simply inherits the parent's streams and my script has no pipe it can write to or read from.
What I should have done is pass subprocess.PIPE to signify to the Popen object that I want to pipe input and output between my program and the process.
Additionally, the environment variables of the script (in the main shell) were not the same as the environment variables of the subshell, and these specific executables needed certain environment variables in order to function (in this case, it was the path to the parameter files in its package). This was done in the following fashion:
import subprocess as sb
seq = "acgtgagtag"
my_env = {"BIOPACKAGEPATH": "/Users/Bobmcbobson/Documents/Biopackage/"}
p = sb.Popen(['biopackage/bin/DNAanalyzer'], stdin=sb.PIPE, stdout=sb.PIPE, env=my_env)
strb = (seq + '\n').encode('utf-8')
data = p.communicate(input=strb)
After creating the Popen object, we send it a formatted input string using communicate(). The output can now be read, and further processed in whatever way in the script.
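One caveat worth noting: passing env= replaces the child process's entire environment, so if the executable also relies on standard variables such as PATH, a safer pattern (a sketch, reusing the BIOPACKAGEPATH value from above) is to copy the current environment and add to it:
import os
import subprocess as sb

my_env = os.environ.copy()  # keep PATH and the rest of the current environment
my_env["BIOPACKAGEPATH"] = "/Users/Bobmcbobson/Documents/Biopackage/"
p = sb.Popen(['biopackage/bin/DNAanalyzer'], stdin=sb.PIPE, stdout=sb.PIPE, env=my_env)
out, err = p.communicate(input=b"acgtgagtag\n")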

How do I pipe output from one python script as input to another python script?

For example:
A script1.py gets an infix expression from the user and converts it to a postfix expression and returns it or prints it to stdout
script2.py gets a postfix expression from stdin and evaluates it and outputs the value
I wanted to do something like this:
python3 script1.py | python3 script2.py
This doesn't work though, could you point me in the right direction as to how I could do this?
EDIT -
here are some more details as to what "doesn't work".
When I execute python3 script1.py | python3 script2.py
the terminal asks me for input for the script2.py program, when it should be asking for input for the script1.py program and redirecting that as script2.py's input.
So it asks me to "Enter a postfix expression: ", when it should be asking "Enter an infix expression: " and redirect that to the postfix script.
If I understand your issue correctly, your two scripts each write out a prompt for input. For instance, they could both be something like this:
in_string = input("Enter something")
print(some_function(in_string))
Where some_function is a function that has different output depending on the input string (which may be different in each script).
The issue is that the "Enter something" prompt doesn't get displayed to the user correctly when the output of one script is being piped to another script. That's because the prompt is written to standard output, so the first script's prompt is piped to the second script, while the second script's prompt is displayed. That's misleading, since it's the first script that will (directly) receive input from the user. The prompt text may also mess up the data being passed between the two scripts.
There's no perfect solution to this issue. One partial solution is to write the prompt to standard error, rather than standard output. This would let you see both prompts (though you'd only actually be able to respond to one of them). I don't think you can directly do that with input, but print can write to other file streams if you want: print("prompt", file=sys.stderr)
Another partial solution is to check your input and output streams and skip printing the prompts if either one is not a "tty" (terminal). In Python, you can do sys.stdin.isatty(). Many command line programs have a different "interactive mode" if they're connected directly to the user, rather than to a pipe or a file.
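For example, a minimal sketch of that check (the prompt text and helper name are just placeholders):
import sys

def read_expression(prompt):
    # Show the prompt only when stdin is an interactive terminal;
    # when input is piped in, read a line silently instead.
    if sys.stdin.isatty():
        return input(prompt)
    return sys.stdin.readline().rstrip("\n")

expr = read_expression("Enter an infix expression: ")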
If piping the output around is a main feature of your program, you may not want to use prompts ever! Many standard Unix command-line programs (like cat and grep) don't have any interactive behavior at all. They require the user to pass command line arguments or set environment variables to control how they run. That lets them work as expected even when they don't have access to standard input and standard output.
For example if you have nginx running and script1.py:
import os
os.system("ps aux")
and script2.py
import os
os.system("grep nginx")
Then running:
python script1.py | python script2.py
will be the same as
ps aux | grep nginx
For completion's sake, and to offer an alternative to using the os module:
The fileinput module takes care of piping for you, and from running a simple test I believe it will make for an easy implementation.
To enable your files to support piped input, simply do this:
import fileinput

with fileinput.input() as f_input:  # this gets the piped data for you
    for line in f_input:
        pass  # do stuff with each line of piped data
all you'd have to do then is:
$ cat some_textfile.txt | ./myscript.py
Note that fileinput also enables data input for your scripts like so:
$ ./myscript.py some_textfile.txt
$ ./myscript.py < some_textfile.txt
This works with python's print output just as easily:
# test.py - prints the contents of some_textfile.txt
with open('some_textfile.txt', 'r') as f:
    for line in f:
        print(line)
$ ./test.py | ./myscript.py
Of course, don't forget the shebang #!/usr/bin/env python at the top of your scripts for this to work.
The recipe is featured in Beazley & Jones's Python Cookbook - I wholeheartedly recommend it.

Run multiple commands in a non-interactive shell session and parse the output

I would like to communicate with a (remote) non-interactive shell via its stdin/stdout to run multiple commands and read the outputs. The problem is that if I stuff multiple commands on shell stdin, I am not able to detect the boundaries between outputs of individual commands.
In Python-like pseudo-code:
sh = Popen(['ssh', 'user@remote', '/bin/bash'], stdin=PIPE, stdout=PIPE)
sh.stdin.write('ls /\n')
sh.stdin.write('ls /usr\n')
sh.stdin.close()
out = sh.stdout.read()
But obviously out contains the outputs of both commands concatenated, and I have no way of reliably splitting them.
So far my best idea is to insert \0 bytes between the outputs:
sh.stdin.write('ls /; echo -ne "\0"\n')
sh.stdin.write('ls /usr; echo -ne "\0"\n')
Then I can split out on zero characters.
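A minimal sketch of that sentinel idea (run against a local /bin/bash here just to keep it self-contained; Python 3, hence the byte strings):
from subprocess import Popen, PIPE

sh = Popen(['/bin/bash'], stdin=PIPE, stdout=PIPE)
sh.stdin.write(b'ls /; echo -ne "\\0"\n')
sh.stdin.write(b'ls /usr; echo -ne "\\0"\n')
sh.stdin.close()

out = sh.stdout.read()
# Each command's output ends with a NUL byte; the final split yields an empty trailing chunk.
outputs = out.split(b'\0')[:-1]
for i, chunk in enumerate(outputs):
    print('command', i, 'output:')
    print(chunk.decode())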
Other approaches that don't work for me:
I don't want to run a separate ssh session per command, as the handshake is too heavyweight.
I'd prefer not to force ControlMaster options on the created shells to respect end-user's ssh_config.
I'd prefer not to require users to install specific server programs.
Is there a better way of running several commands in one session and getting individual outputs? Is there a widely-deployed shell with some sort of binary output mode?
PS. There is a duplicate question, but it doesn't have a satisfactory answer:
Run multiple commands in a single ssh session using popen and save the output in separate files
For SSH I used paramiko and its invoke_shell method to create a programmatically-manageable shell instance.
The following is not a complete answer, it's still hacky, but I feel it's a step in the right direction.
I required the same read/write shell instance functionality in Windows but have had no luck, so I extended your approach a little (thank you for the idea by the way).
I verify each command executes successfully based on its exit code by placing a conditional exit between each command, then I use the text of said conditional check (a known string) as the delimiter to define each command's response.
A crude example:
from subprocess import Popen, PIPE
sh = Popen('cmd', stdin=PIPE, stdout=PIPE)
sh.stdin.write(b'F:\r\n')
sh.stdin.write(b"if not %errorlevel% == 0 exit\r\n")
sh.stdin.write(b'cd F:\\NewFolder\r\n')
sh.stdin.write(b"if not %errorlevel% == 0 exit\r\n")
sh.stdin.write(b'...some useful command with the current directory confirmed as set to F:\\NewFolder...')
sh.stdin.close()
out = sh.stdout.read()
sh.stdout.close()
# Split 'out' by each line that ends with 'if not %errorlevel% == 0 exit' and do what you require with the responses
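And a sketch of the splitting step the comment above describes (the delimiter text is whatever cmd echoes back for the conditional exit; treat the details as assumptions and adjust to the actual output):
# Continuing from the snippet above: split the combined output into per-command
# chunks, using the echoed conditional-exit line as the boundary between them.
DELIM = b'if not %errorlevel% == 0 exit'
chunks, current = [], []
for line in out.splitlines():
    if line.rstrip().endswith(DELIM):
        chunks.append(b'\r\n'.join(current))
        current = []
    else:
        current.append(line)
if current:
    chunks.append(b'\r\n'.join(current))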

Stata command line arguments in batch mode

A helpful FAQ from Stata describes that arguments can be passed to do files. My do file looks like this:
* program.do : Program to fetch information from main dataset
args inname outname
save `outname', emptyok // file to hold results
insheet using `inname', comma clear names case
// a bunch of processing
save `outname', replace
According to the FAQ, this script can be run using do filename.csv result.dta. When I run this command from within Stata, everything works fine. The program is long, however, so I want to run it in batch mode. Stata has another FAQ about batch mode.
Combining the information from these webpages, I type the following at my Unix prompt:
$ nohup stata -b do program.do filename.csv result.dta &
Stata starts up, but it terminates with the following error:
. save `outname', emptyok // file to hold results
invalid file specification
r(198);
A little experimentation tells me that Stata is never receiving the two arguments when I run the program in batch mode. What is the solution to this problem? (i.e. how do you pass arguments to a do file when running it in batch mode?)
The thread below may be helpful:
http://www.stata.com/statalist/archive/2012-09/msg00609.html
In Windows, if my program Test.do is:
args a b
display "`a'"
display "`b'"
I can run it in batch mode in Windows by simply typing:
"c:\Stata13\stata.exe" /e do "c:\Scripts\Test.do" Test Script
And it will display (within Stata):
Test
Script
So I wonder whether the nohup is what's preventing your program from working.

Python Windows7: Odd behaviour opening file for append

I am seeing odd behaviour when I open a file in append mode ('a+') under Windows 7 using Python.
I was wondering whether the behaviour is in fact incorrect or I am misunderstanding how to use the following code:
log_file= open(log_file_path, "a+")
return_code = subprocess.call(["make", target], stdout=log_file, stderr=subprocess.STDOUT)
log_file.close()
The above code does not properly append to the file. In fact, on subsequent runs it won't even modify the file.
I tested it out using the Python Shell as well.
Once the file has been opened for the first time, making multiple subprocess calls will append properly to the file, however once the file has been closed and reopened it will never append again.
Anyone have any clues?
Thanks
To further simplify the problem, here is another set of steps that will fail:
log_file=open("temp.txt", "a+")
log_file.write("THIS IS A TEST")
log_file.close()
log_file=open("temp.txt", "a+")
subprocess.call(["echo", "test"], stdout=log_file, stderr=subprocess.STDOUT, shell=True)
log_file.close()
If you open the file temp.txt here is what I see:
testS A MUTHER F** TEST
It looks like your problem is in the use of shell=True. From the Python documentation for Popen:
On Unix, with shell=True: If args is a string, it specifies the command string to execute through the shell. This means that the string must be formatted exactly as it would be when typed at the shell prompt. This includes, for example, quoting or backslash escaping filenames with spaces in them. If args is a sequence, the first item specifies the command string, and any additional items will be treated as additional arguments to the shell itself.
So it looks like "echo" is the command, and "test" gets sent as an argument to the shell, instead of to "echo".
So changing your subprocess call to either:
subprocess.call("echo test", stdout=log_file, stderr=subprocess.STDOUT, shell=True)
or:
subprocess.call(["echo", "test"], stdout=log_file, stderr=subprocess.STDOUT)
Fixes the problem, at least in my testing.
see http://mail.python.org/pipermail/python-list/2009-October/1221841.html
Briefly: opening a file in append mode leaves the file pointer in an implementation-dependent state. Seek to the end to get the same results on Windows as on Linux.
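A minimal sketch of that fix applied to the snippet above (explicitly seeking to the end of the freshly reopened file before handing it to subprocess):
import os
import subprocess

log_file = open("temp.txt", "a+")
log_file.seek(0, os.SEEK_END)  # move the file pointer to the end explicitly
subprocess.call("echo test", stdout=log_file, stderr=subprocess.STDOUT, shell=True)
log_file.close()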
