I have an exe that I need to open using Python 3.2.3. I also need to pass an argument in the form of bytes to the exe. I try doing something like:
argument = '\x50'*260
subprocess.call([command, argument])
This works fine but when I try to give a non-printable character as the argument like '\x86', it gets converted to '\x3f'. Printing the argument gives the following error:
UnicodeEncodeError: 'charmap' codec can't encode character '\x86' in position 262: character maps to <undefined>
So I tried doing it using os.system:
command = "C:\myfile.exe "+b"\x50"*260
os.system(command)
But obviously, this leads to a type error. Does anyone have any suggestions to get this thing done?
It can't be done. The thing is that what subprocess does is pretend to type commands into the command prompt. Do you have access to the source of myfile.exe? You can easily represent bytes as a string or number.
Related
My Installation needs to check the result of a command from cmd.exe. Thus, I redirect the result of the command to a text file and then try to read the file to get the result as follows:
// send command to cmd to execute and redirect the result to a text file
// try to read the file
szDir = "D:\\";
szFileName = "MyFile.txt";
if Is(FILEEXISTS, szDir ^ szFileName) then
listID = ListCreate(STRINGLIST);
if listID != LIST_NULL then
if OpenFIleMode(FILE_MODE_NORMAL) = 0 then
if OpenFile(nFileHandle, szDir, szFileName) = 0 then
// I run into problem here
while (GetLine(nFileHandle, szCurLine) = 0 )
ListAddString(listID, szCurLine, AFTER);
endwhile;
CloseFile(nFileHandle);
endif;
endif;
endif;
endif;
The problem is that right after the command prompt is executed and the result is redirected to MyFile.txt, I can set open file mode, open the file but I can not read any text into my list. ListReadFromFile() does not helps. If I open the file, edit and save it manually, my script works.
After debugging, I figured that GetLine() returns an error code (-1) which means the file pointer must be at the end of file or other errors. However, FILE_MODE_NORMAL sets the file as read only and SET THE FILE POINTER AT THE BEGINNING OF THE FILE.
What did I possibly do wrong? Is this something to do with read/write access of the file? I tried this command without result:
icacls D:\MyFile.txt /grant Administrator:(R,W)
I am using IstallShield 2018 and Windows 10 64-bit btw. Your help is much appreciated.
EDIT 1: I suspected the encoding and tried a few things:
After running "wslconfig /l", the content of MyFile.txt opened in Notepad++ is without an encoding, but still appeared normal and readable. I tried to converted the content to UTF-8 but it did not work.
If I add something to the file (echo This line is appended >> MyFile.txt), the encoding changed to UTF-8, but the content in step 1 is changeed also. NULL (\0) is added to between every character and even repelace new line character. Maybe this is why GetLine() failed to read the file.
Work around: after step 1, I run "find "my_desired_content" MyFile.txt" > TempFile.txt and read TempFile.txt (which is encoded in UTF-8).
My ultimate goal is to check if "my_desired_content" apeears in the result of "wslconfig /l" so this is fine. However, what I don't understand is that both MyFile.txt and TempFile.txt are created from cmd command but they are encoded differently?
The problem is due to the contents of the file. Assuming this is the file generated by your linked question, you can examine its contents in a hex editor to find out the following facts:
Its contents are encoded in UTF-16 (LE) without a BOM
Its newlines are encoded as CR or CR CR instead of CR LF
I thought the newlines would be more important than the text encoding, but it turns out I had it backwards. If I change each of these things independently, GetLine seems to function correctly for either CR, CR CR, or CR LF, but only handles UTF-16 when the BOM is present. (That is, in a hex editor, the file starts with FF FE 57 00 instead of 57 00 for a file starting with the character W.)
I'm at a bit of a loss for the best way to address this. If you're up for a challenge, you could read the file with FILE_MODE_BINARYREADONLY, and can use your extra knowledge about what should be in the file to ensure you interpret its encoding correctly. Note that for most of UTF-16, you can create a single code unit by combining two bytes in the following manner:
szResult[i] = (nHigh << 8) + nLow;
where nHigh and nLow are probably values like szBuffer[2*i + 1] and szBuffer[2*i], assuming you filled a STRING szBuffer by calling ReadBytes.
Other unproven ideas include editing it in binary to ensure the BOM (FF FE) is present, figuring out ways to ensure the file is originally created with the BOM, figuring out ways to create it in an alternate encoding, finding another command you can invoke to "fix" the file, or lodging a request with the vendor (my employer) and hoping the development team changes something to better handle this case.
Here's an easier workaround. If you can safely assume that the command will append UTF-16 characters without a signature, you can append this output to a file that has just a signature. How do you get such a file?
You could create a file with just the BOM in your development environment, and add it to your Support Files. If you need to use it multiple times, copy it around first.
You could create it with code. Just call the following (error checking omitted for clarity)
OpenFileMode(FILE_MODE_APPEND_UNICODE);
CreateFile(nFileHandle, szDir, szFileName);
CloseFile(nFileHandle);
and if szDir ^ szFileName didn't exist, it will now be a file with just the UTF-16 signature.
Assuming this file is called sig.txt, you can then invoke the command
wslconfig /l >> sig.txt to write to that file. Note the doubled >> for append. The resulting file will include the Unicode signature you created ahead of time, plus the Unicode data output from wslconfig, and GetLine should interpret things correctly.
The biggest problem here is that this hardcodes around the behavior of wslconfig, and that behavior may change at any point. This is why Christopher alludes to recommending an API, and I agree completely. In the mean time, You could try to make this more robust by invoking it in a cmd /U (but my understanding of what that does or guarantees is fuzzy at best), or by trying the original way and then with the BOM.
This whole WSL thing is pretty new. I don't see any APIs it but rather then screen scrapping command outputs you might want to look at this registry key:
HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windows\CurrentVersion\Lxss
It seems to have the list of installed distros that come from the store. Coming from the store probably explains why this is HKCU and not HKLM.
A brave new world.... sigh.
Yet another encoding question on Python.
How can I pass non-ASCII characters as parameters on a subprocess.Popen call?
My problem is not on the stdin/stdout as the majority of other questions on StackOverflow, but passing those characters in the args parameter of Popen.
Python script used for testing:
import subprocess
cmd = 'C:\Python27\python.exe C:\path_to\script.py -n "Testç on ã and ê"'
process = subprocess.Popen(cmd,stdout=subprocess.PIPE,stderr=subprocess.STDOUT)
output, err = process.communicate()
result = process.wait()
print result, '-', output
For this example call, the script.py receives Testç on ã and ê. If I copy-paste this same command string on a CMD shell, it works fine.
What I've tried, besides what's described above:
Checked if all Python scripts are encoded in UTF-8. They are.
Changed to unicode (cmd = u'...'), received an UnicodeEncodeError: 'ascii' codec can't encode character u'\xe7' in position 128: ordinal not in range(128) on line 5 (Popen call).
Changed to cmd = u'...'.decode('utf-8'), received an UnicodeEncodeError: 'ascii' codec can't encode character u'\xe7' in position 128: ordinal not in range(128) on line 3 (decode call).
Changed to cmd = u'...'.encode('utf8'), results in Testç on ã and ê
Added PYTHONIOENCODING=utf-8 env. variable with no luck.
Looking on tries 2 and 3, it seems like Popen issues a decode call internally, but I don't have enough experience in Python to advance based on this suspicious.
Environment: Python 2.7.11 running on an Windows Server 2012 R2.
I've searched for similar problems but haven't found any solution. A similar question is asked in what is the encoding of the subprocess module output in Python 2.7?, but no viable solution is offered.
I read that Python 3 changed the way string and encoding works, but upgrading to Python 3 is not an option currently.
Thanks in advance.
As noted in the comments, subprocess.Popen in Python 2 is calling the Windows function CreateProcessA which accepts a byte string in the currently configured code page. Luckily Python has an encoding type mbcs which stands in for the current code page.
cmd = u'C:\Python27\python.exe C:\path_to\script.py -n "Testç on ã and ê"'.encode('mbcs')
Unfortunately you can still fail if the string contains characters that can't be encoded into the current code page.
This is killing me. I have a config file, "myconfig.cfg", with the following content:
SOME_VAR=2
echo "I LOVE THIS"
Then I have a script that I'm trying to run, that sources the config file in order to use the settings in there as variables. I can print them out fine, but when I try to put one into a numeric variable for use in something like a "seq " command, I get this weird "invalid arithmetic operator" error.
Here's the script:
#!/bin/bash
source ./myconfig.cfg
echo "SOME_VAR=${SOME_VAR}"
let someVarNum=${SOME_VAR}
echo "someVarNum=${someVarNum}"
And here's the output:
I LOVE THIS
SOME_VAR=2
")syntax error: invalid arithmetic operator (error token is "
someVarNum=
I've tried countless things that theoretically shouldn't make a difference, and, surprise, they don't. I simply can't figure it out. If I simply take the line "SOME_VAR=2" and put it directly into the script, everything's fine. I'm guessing I'll have to read in the config file line by line, split the strings by "=", and find+create the variables I want to use manually.
The error is precisely as indicated in a comment by #TomFenech. The first line (and possibly all the lines) in myconfig.cfg is terminated with a Windows CR-LF line ending. Bash considers CR to be an ordinary character (not whitespace), so it will set SOME_VAR to the two character string 2CR. (CR is the character with hex code 0x0D. You could see that if you display the file with a hex-dumper: hd myconfig.cfg.)
The let command performs arithmetic on numbers. It also considers the CR to be an ordinary character, but it is neither a digit nor an operator so it complains. Unfortunately, it does not make any attempt to sanitize the display of the character in the error message, so the carriage return is displayed between the two " symbols. Consequently, the end of the error message overwrites the beginning.
Don't create Unix files with a Windows text editor. Or use a utility like dos2unix to fix them once you copy them to the Unix machine.
Underlying problem: I want to enable running robot tests using selenium 2 in a portable Firefox that's stored within EXECDIR.
${firefox_binary}= Evaluate sys.modules['selenium.webdriver.firefox.firefox_binary'].FirefoxBinary('${EXECDIR}${/}Firefox${/}App${/}Firefox${/}Firefox.exe') sys, selenium.webdriver
${firefox_profile}= Evaluate sys.modules['selenium.webdriver.firefox.firefox_profile'].FirefoxProfile('${EXECDIR}${/}Lib${/}SeleniumFirefoxProfile') sys, selenium.webdriver
Create Webdriver Firefox firefox_binary=${firefox_binary} firefox_profile=${firefox_profile}
That works fine if, instead of ${EXECDIR}, I use the actual path.
EXECDIR is something like C:\Users\bart.simpson\workspace\projectname here. The issue is that a backslash, when followed by the b, is converted to the ASCII backslash character. The test log then says:
Evaluating expression 'sys.modules['selenium.webdriver.firefox.firefox_profile'].FirefoxProfile('C:\Users\bart.simpson\workspace\projectname\Lib\SeleniumFirefoxProfile')' failed: OSError: [Errno 20047] Unknown error: 20047: 'C:\\Users\x08art.simpson\\workspace\\projectname\\Lib\\SeleniumFirefoxProfile'
Of course I've tried using ${fixedExecDir}= Replace String ${EXECDIR} '\' '/' and such, but nothing ever changes the outcome.
Ideas? Thanks.
Try treating the path as a raw string literal, by putting an "r" before the quote immediately before ${EXECDIR}:
${firefox_binary}= Evaluate ....FirefoxBinary(r'${EXECDIR}${/}Firefox...')
This should work because the robot variables are substituted before the string is passed to python, so the python interpreter only ever sees the complete string.
If you are unfamiliar with python raw string literals, see this question:
What exactly do “u” and “r” string flags do in Python, and what are raw string literals?
I am attempting to write a line of code that will take a line of japanese text and delete a certain set of characters. However I am having trouble with using unicode characters inside of the regular expression.
I am currently using text.gsub(/《.*?》/u, '') but I get the error
'gsub': invalid byte sequence in Windows-31J (Argument error)
Can anyone tell me what I am doing incorrectly?
Example text : その仕草《しぐさ》があまりに無造作《むぞうさ》だったので
Expected result: その仕草があまりに無造作だったので
Thanks
edit: # encoding: utf-8 is present at the top of the script.
Try this:
text.encode('utf-8', 'utf-8').gsub(/《.*?》/u, '')