Lines added to output text file - visual-studio

I am compiling a Fortran 77 project with Visual Studio 2008 using Intel Fortran 2013, and I am having an annoying issue with one of the output files created by the executable.
In this file, I expect to read something like
EXPECT FILE :
"
foo1
foo2
"
Instead, I obtain almost the same, but with empty lines in between:
OBTAINED FILE:
"
foo1
foo2
"
This may seem like a detail, but it is actually a problem because this file is read by another program that does not check for empty lines.
The strange thing is that I also compiled this under Linux and the problem does not appear, which is why I concluded it must be a Visual Studio option issue.
The source code looks like this :
      character*80 comment(2)
      comment(1)="foo1"
      comment(2)="foo2"
      do i=1, 2
        write(10,*)comment(i)
      end do
I tried changing several options in the Fortran Properties, but none of them worked.
Does anyone have an idea about this?

This is (most likely) because the string is printed including all 80 characters, i.e., even with the trailing spaces (as suggested in the comments). One can observe this directly by putting the string being printed in quotes:
WRITE(10, '(A)') "'"//comment(i)//"'"
One solution would be to use, e.g.,
WRITE(10, '(A)') TRIM(ADJUSTL(comment(i)))
Here, ADJUSTL would remove also leading spaces. If this is not desirable, one could use just TRIM.
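To make the padding argument concrete, here is a minimal Python sketch of what a fixed-width 80-character field holds and what the two intrinsics do to it. The helper names adjustl and trim are stand-ins written for this illustration (they imitate the Fortran intrinsics; they are not Fortran itself), and the sketch says nothing about how the compiler wraps long records.

```python
WIDTH = 80  # mirrors character*80 in the question


def adjustl(field: str) -> str:
    """Move leading blanks to the end, keeping the field width (like ADJUSTL)."""
    return field.lstrip(" ").ljust(len(field))


def trim(field: str) -> str:
    """Drop trailing blanks (like TRIM)."""
    return field.rstrip(" ")


comment = "foo1".ljust(WIDTH)   # what the fixed-width variable actually holds
quoted = "'" + comment + "'"    # quoting makes the trailing blanks visible
trimmed = trim(adjustl(comment))

print(len(comment), len(quoted), repr(trimmed))
```

The quoted form is 82 characters long even though only four of them are visible text, which is the effect the quoting test in the answer is meant to expose.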

Related

Can't seem to use more than one -c argument for tesseract

I'm just using tesseract through bash scripting. I've finally come up with all the settings that recognize my text for my images nearly perfectly; however, I can't seem to use all of the options together. My command is as follows:
$ tesseract infile.tif outputbase --psm 6 -c tosp_min_sane_kn_sp=0.0;tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-+&/\
I need the whitelist, because tesseract is picking up some lowercase characters, strange characters (such as yen sign), and other oddities. My images do not contain those characters, and since my document is quite simple I figured it would just be easier to whitelist the ones that do exist. Additionally, the image is in a "table" format (without any lines or borders), and tesseract only picks up the large spaces (which separate columns) and not individual spaces in between words within a column. Setting the tosp value to 0 seemed to fix that problem.
Now the issue is that tesseract won't process both of those -c arguments at the same time, but the man page explicitly states that you can use multiple -c arguments!
I've also tried to work around in the following way:
my_config_file
tosp_min_sane_kn_sp 0.0
tessedit_char_whitelist ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-+&/\
$ tesseract infile.tif outputbase --psm 6 my_config_file
The config file is saved in the correct directory, but again only one of the options will work at a time. If both options are in the config file, it seems like it ignores the tosp_min_sane_kn_sp 0.0. If I remove one, then the other works.
I'm pulling out my hair here, and I'm about to just work around this issue by running the OCR twice and then merging the two files with an awk script. I really don't want to do that, however, because it's obviously less efficient, and I don't like the idea of using awk when the OCR output isn't guaranteed to be formatted 100% the way my potential awk script would have to assume.
Please help!
EDIT:
I forgot to mention that I have indeed tried to pass multiple -c options. Rather than guess at various field separators between variables, I went with a semicolon, which made the most sense to me because I understand that tesseract is written in C++, which uses semicolons to signify the end of a statement. I know C++ isn't interpreted, but it just seemed to make sense. Now I'm digressing . . .
Additionally, I've tried the advice of putting the whitelist in quotation marks, but that has made no difference. I was really excited because that didn't even occur to me, but it doesn't seem that tesseract even recognizes quotations even if I run that one -c argument by itself.
You can't pass multiple arguments to a single -c option, especially not separated by semicolons. I don't have tesseract, but I'm pretty sure you need to pass a separate -c option for each config variable you want to set:
tesseract infile.tif outputbase --psm 6 -c tosp_min_sane_kn_sp=0.0 -c 'tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-+&/\'
(I also enclosed the second variable setting in single-quotes, so the shell doesn't try to interpret the backslash. Without the quotes, it'd escape the newline, so the next line would be treated as a continuation of this one.)
Explanation of the original problem: When the shell sees a semicolon (and it isn't in quotes or escaped), the shell treats it as a command separator. So it treated the line as two completely separate commands (with the next line combined, because of the backslash):
tesseract infile.tif outputbase --psm 6 -c tosp_min_sane_kn_sp=0.0
tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-+&/ <whatever's on the next line of the file>
The first runs tesseract with one -c option, and the second one creates a shell variable named tessedit_char_whitelist. And even if you quoted or escaped it, so the semicolon got passed to tesseract, I suspect it wouldn't treat it as a separator the way you want it to.
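One way to sidestep shell parsing entirely is to build the argument list in a program and let it do the quoting. The following is a minimal Python sketch under that assumption; it only constructs and prints the command line and does not run tesseract. The file names and options are the ones from the question, and shlex.join requires Python 3.8 or later.

```python
import shlex

# The whitelist from the question, including the trailing backslash.
WHITELIST = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-+&/\\"

# One "-c" per config variable, as suggested in the answer.
argv = [
    "tesseract", "infile.tif", "outputbase",
    "--psm", "6",
    "-c", "tosp_min_sane_kn_sp=0.0",
    "-c", "tessedit_char_whitelist=" + WHITELIST,
]

# shlex.join quotes any argument containing characters the shell would interpret.
command_line = shlex.join(argv)
print(command_line)

# Splitting the quoted line again gives back exactly the same arguments.
round_trip = shlex.split(command_line)
```

Passing argv as a list to something like subprocess.run would skip the shell altogether, so neither the & nor the backslash would need quoting at all.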

InstallScript GetLine() cannot read a text file containing the result of a command prompt command

My Installation needs to check the result of a command from cmd.exe. Thus, I redirect the result of the command to a text file and then try to read the file to get the result as follows:
// send command to cmd to execute and redirect the result to a text file
// try to read the file
szDir = "D:\\";
szFileName = "MyFile.txt";
if Is(FILEEXISTS, szDir ^ szFileName) then
    listID = ListCreate(STRINGLIST);
    if listID != LIST_NULL then
        if OpenFileMode(FILE_MODE_NORMAL) = 0 then
            if OpenFile(nFileHandle, szDir, szFileName) = 0 then
                // I run into problem here
                while (GetLine(nFileHandle, szCurLine) = 0 )
                    ListAddString(listID, szCurLine, AFTER);
                endwhile;
                CloseFile(nFileHandle);
            endif;
        endif;
    endif;
endif;
The problem is that right after the command is executed and the result is redirected to MyFile.txt, I can set the open file mode and open the file, but I cannot read any text into my list. ListReadFromFile() does not help. If I open the file, edit it, and save it manually, my script works.
After debugging, I figured that GetLine() returns an error code (-1) which means the file pointer must be at the end of file or other errors. However, FILE_MODE_NORMAL sets the file as read only and SET THE FILE POINTER AT THE BEGINNING OF THE FILE.
What did I possibly do wrong? Is this something to do with read/write access of the file? I tried this command without result:
icacls D:\MyFile.txt /grant Administrator:(R,W)
I am using InstallShield 2018 and Windows 10 64-bit, by the way. Your help is much appreciated.
EDIT 1: I suspected the encoding and tried a few things:
1. After running "wslconfig /l", the content of MyFile.txt opened in Notepad++ is shown without an encoding, but still appears normal and readable. I tried to convert the content to UTF-8, but it did not work.
2. If I add something to the file (echo This line is appended >> MyFile.txt), the encoding changes to UTF-8, but the content from step 1 changes as well: NULL (\0) is added between every character and even replaces the newline character. Maybe this is why GetLine() fails to read the file.
Workaround: after step 1, I run "find "my_desired_content" MyFile.txt" > TempFile.txt and read TempFile.txt (which is encoded in UTF-8).
My ultimate goal is to check whether "my_desired_content" appears in the result of "wslconfig /l", so this is fine. However, what I don't understand is why MyFile.txt and TempFile.txt, both created from cmd commands, are encoded differently.
The problem is due to the contents of the file. Assuming this is the file generated by your linked question, you can examine its contents in a hex editor to find out the following facts:
Its contents are encoded in UTF-16 (LE) without a BOM
Its newlines are encoded as CR or CR CR instead of CR LF
I thought the newlines would be more important than the text encoding, but it turns out I had it backwards. If I change each of these things independently, GetLine seems to function correctly for either CR, CR CR, or CR LF, but only handles UTF-16 when the BOM is present. (That is, in a hex editor, the file starts with FF FE 57 00 instead of 57 00 for a file starting with the character W.)
I'm at a bit of a loss for the best way to address this. If you're up for a challenge, you could read the file with FILE_MODE_BINARYREADONLY, and can use your extra knowledge about what should be in the file to ensure you interpret its encoding correctly. Note that for most of UTF-16, you can create a single code unit by combining two bytes in the following manner:
szResult[i] = (nHigh << 8) + nLow;
where nHigh and nLow are probably values like szBuffer[2*i + 1] and szBuffer[2*i], assuming you filled a STRING szBuffer by calling ReadBytes.
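The byte-combining idea can be tried outside InstallScript first. Below is a minimal Python sketch, assuming little-endian input with an optional FF FE signature; the function name decode_utf16le_units and the sample text are made up for the illustration. Like the formula above, it handles only characters that fit in a single code unit (no surrogate pairs).

```python
def decode_utf16le_units(raw: bytes) -> str:
    """Combine byte pairs (low byte first) into UTF-16 code units."""
    if raw[:2] == b"\xff\xfe":      # skip the signature if it happens to be present
        raw = raw[2:]
    units = []
    for i in range(len(raw) // 2):
        n_low = raw[2 * i]          # first byte of the pair
        n_high = raw[2 * i + 1]     # second byte of the pair
        units.append((n_high << 8) + n_low)
    # chr() on each unit is only valid for characters outside surrogate pairs.
    return "".join(chr(u) for u in units)


sample = "Windows".encode("utf-16-le")   # no BOM, like the redirected output
print(sample[:4].hex(" "))               # 57 00 69 00
print(decode_utf16le_units(sample))      # Windows
```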
Other unproven ideas include editing it in binary to ensure the BOM (FF FE) is present, figuring out ways to ensure the file is originally created with the BOM, figuring out ways to create it in an alternate encoding, finding another command you can invoke to "fix" the file, or lodging a request with the vendor (my employer) and hoping the development team changes something to better handle this case.
Here's an easier workaround. If you can safely assume that the command will append UTF-16 characters without a signature, you can append this output to a file that has just a signature. How do you get such a file?
You could create a file with just the BOM in your development environment, and add it to your Support Files. If you need to use it multiple times, copy it around first.
You could create it with code. Just call the following (error checking omitted for clarity)
OpenFileMode(FILE_MODE_APPEND_UNICODE);
CreateFile(nFileHandle, szDir, szFileName);
CloseFile(nFileHandle);
and if szDir ^ szFileName didn't exist, it will now be a file with just the UTF-16 signature.
Assuming this file is called sig.txt, you can then invoke the command
wslconfig /l >> sig.txt
to write to that file. Note the doubled >> for append. The resulting file will include the Unicode signature you created ahead of time, plus the Unicode data output from wslconfig, and GetLine should interpret things correctly.
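The effect of appending BOM-less UTF-16 data to a signature-only file can be simulated without wslconfig. This is a minimal Python sketch of that sequence; the text "Ubuntu" is a made-up stand-in for the command output, and the file is created in a temporary directory rather than the installer's location.

```python
import os
import tempfile

BOM_UTF16_LE = b"\xff\xfe"


def simulate(output_text: str) -> str:
    """Create a signature-only file, append BOM-less UTF-16 LE data, read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sig.txt")
        with open(path, "wb") as f:     # step 1: file containing just the signature
            f.write(BOM_UTF16_LE)
        with open(path, "ab") as f:     # step 2: stand-in for "command >> sig.txt"
            f.write(output_text.encode("utf-16-le"))
        with open(path, "rb") as f:
            # The generic "utf-16" codec uses the signature to pick the byte
            # order and strips it from the decoded result.
            return f.read().decode("utf-16")


print(repr(simulate("Ubuntu\r")))
```

A reader that relies on the signature, as GetLine appears to, then sees an ordinary UTF-16 file.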
The biggest problem here is that this hardcodes around the behavior of wslconfig, and that behavior may change at any point. This is why Christopher alludes to recommending an API, and I agree completely. In the meantime, you could try to make this more robust by invoking it in a cmd /U (but my understanding of what that does or guarantees is fuzzy at best), or by trying the original way and then with the BOM.
This whole WSL thing is pretty new. I don't see any APIs for it, but rather than screen-scraping command output, you might want to look at this registry key:
HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windows\CurrentVersion\Lxss
It seems to have the list of installed distros that come from the store. Coming from the store probably explains why this is HKCU and not HKLM.
A brave new world.... sigh.

jamplus: link command line too long for osx

I'm using jamplus to build a vendor's cross-platform project. On osx, the C tool's command line (fed via clang to ld) is too long.
Response files are the classic answer to command lines that are too long: jamplus states in the manual that one can generate them on the fly.
The example in the manual looks like this:
actions response C++
{
$(C++) ##(-filelist #($(2)))
}
Almost there! If I specifically blow out the C.Link command, like this:
actions response C.Link
{
"$(C.LINK)" $(LINKFLAGS) -o $(<[1]:C) -Wl,-filelist,#($(2:TC)) $(NEEDLIBS:TC) $(LINKLIBS:TC)
}
in my jamfile, I get the command line I need that passes through to the linker, but the entries in the response file aren't separated by newlines, so the link fails (osx ld requires newline-separated entries).
Is there a way to expand a jamplus list joined with newlines? I've tried using the join expansion $(LIST:TCJ=\n) without luck. $(LIST:TCJ=#(\n)) doesn't work either. If I can do this, the generated file would hopefully be correct.
If not, what jamplus code can I use to override the link command for clang, and generate the contents on the fly from a list? I'm looking for the least invasive way of handling this - ideally, modifying/overriding the tool directly, instead of adding new indirect targets wherever a link is required - since it's our vendor's codebase, as little edit as possible is desired.
The syntax you are looking for is:
newLine = "
" ;
actions response C.Link
{
"$(C.LINK)" $(LINKFLAGS) -o $(<[1]:C) -Wl,-filelist,#($(2:TCJ=$(newLine))) $(NEEDLIBS:TC) $(LINKLIBS:TC)
}
To be clear (I'm not sure how StackOverflow will format the above), the newLine variable should be defined by typing:
newLine = "" ;
And then placing the caret between the two quotes and hitting Enter. You can use this same technique for certain other characters, e.g.
tab = " " ;
Again, start with tab = "" and then place the caret between the quotes and hit Tab. In the above it is actually spaces rather than a tab character, which is wrong, but hopefully you get the idea. Another useful one to have is:
dollar = "$" ;
The last one is useful as $ is used to specify variables typically, so having a dollar variable is useful when you actually want to specify a dollar literal. For what it is worth, the Jambase I am using (the one that ships with the JamPlus I am using), has this:
SPACE = " " ;
TAB = " " ;
NEWLINE = "
" ;
Around line 28...
I gave up on trying to use escaped newlines and other language-specific characters within string joins. Maybe there's an awesome way to do that, that was too thorny to discover.
Use a multi-step shell command with multiple temp files.
For jamplus (and maybe other jam variants), the section of the actions response {} between the curly braces becomes an inline shell script. And the response file syntax #(<value>) returns a filename that can be assigned within the shell script, with the contents set to <value>.
Thus, code like:
actions response C.Link
{
_RESP1=#($(2:TCJ=#)#$(NEEDLIBS:TCJ=#)#$(LINKLIBS:TCJ=#))
_RESP2=#()
perl -pe "s/[#]/\n/g" < $_RESP1 > $_RESP2
"$(C.LINK)" $(LINKFLAGS) -o $(<[1]:C) -Wl,-filelist,$_RESP2
}
creates a pair of temp files, assigned to shell variable names _RESP1 and _RESP2. File at path _RESP1 is assigned the contents of the expanded sequence joined with a # character. Search and replace is done with a perl one liner into _RESP2. And link proceeds as planned, and jamplus cleans up the intermediate files.
I wasn't able to do this with characters like :;\n, but # worked as long as it had no adjacent whitespace. Not completely satisfied, but moving on.
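The join-then-substitute step is easy to check in isolation. Here is a minimal Python sketch of the same transformation the perl one-liner performs; the function name to_filelist and the object file names are hypothetical. Unlike the plain substitution, this version also drops empty entries (which appear when one of the joined lists is empty) and ends the file with a newline.

```python
def to_filelist(joined: str, sep: str = "#") -> str:
    """Turn 'a.o#b.o#libc.a' into one entry per line, as ld -filelist expects."""
    entries = [entry for entry in joined.split(sep) if entry]  # skip empty entries
    return "\n".join(entries) + "\n"


# An empty middle list leaves two adjacent separators in the joined string.
print(to_filelist("a.o#b.o##libc.a"), end="")
```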

os.execute() with command line options

The Question: How do I execute an OS Command with three command line options in Lua ?
I have a device connected to my PC. (Windows 7, USB cables, typical corporate)
The software which controls the device is located here...
C:\Program Files (x86)\PowerUSB\
The name of the executable file (aka "Program") is...
pwrusbcmd
That program wants three single-digit parameters, each either 1 or 0, separated by spaces.
I opened a command prompt box, switched to that directory, and tested all 8 cases. All worked fine.
I then switched to another subdirectory, and tried this command...
"C:\Program Files (x86)\PowerUSB\pwrusbcmd" 1 1 1
That also worked fine.
So I figured that the Lua command to execute that command would be either...
os.execute("C:\Program Files (x86)\PowerUSB\pwrusbcmd 1 1 1 ")
or
os.execute("C:\\Program Files (x86)\\PowerUSB\\pwrusbcmd 1 1 1")
Lua runs each, with no complaints, BUT, no action occurs on the device.
So I tried to alter the construction of the command itself, with the ".." connecting the two segments of the total string, like this...
os.execute("C:\\Program Files (x86)\\PowerUSB\\pwrusbcmd".." 1 1 1 ")
Still no action.
I looked here on StackOverflow and found several similar questions.
I am in sympathy with each person who wrote those questions. Much like user ID thatthing, I also tried..
square brackets
quote marks (")
single and double and triple backslashes
front slash and s (/s)
So far, I can't find a single syntax construction that works.
The only "fix" (misnomer if there ever was one) I could concoct on my own is to write eight different MS-DOS bat files, and give them unique names. This renders the machine de facto unusable.
How do I get Lua to execute this command ???
C:\Program Files (x86)\PowerUSB\pwrusbcmd 1 1 1
You forgot to add the double quotes around the command name. The easiest way is to use a single-quoted string:
os.execute('"C:\\Program Files (x86)\\PowerUSB\\pwrusbcmd" 1 1 1')
Try os.execute([["C:\Program Files (x86)\PowerUSB\pwrusbcmd" 1 1 1 ]])
I believe your problem is the spaces in the file path.
I know you say you used square brackets, but I can't see what combination of them you have used. This works for me.
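The quoting rule both answers rely on (wrap any argument that contains spaces in double quotes) can be checked mechanically. This minimal Python sketch uses subprocess.list2cmdline, a helper in the standard library that applies the Windows C runtime quoting rules; it only builds the string and does not run pwrusbcmd. It is offered as a cross-check of the expected command line, not as a replacement for the Lua call.

```python
import subprocess

# Raw string so the backslashes in the path are taken literally.
exe = r"C:\Program Files (x86)\PowerUSB\pwrusbcmd"

# Arguments containing spaces get wrapped in double quotes; the rest are left as-is.
command = subprocess.list2cmdline([exe, "1", "1", "1"])
print(command)  # "C:\Program Files (x86)\PowerUSB\pwrusbcmd" 1 1 1
```

The printed line matches the command that worked at the command prompt in the question.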

Makefile problem with files beginning with "#"

I have a directory "FS2" that contains the following files:
#ARGH#
this
that
I have a makefile with the following contents.
Template:sh= ls ./FS2/*
#all: $(Template)
echo "Template is: $(Template)"
touch all
When I run "clearmake -C sun" and the file "all" does not exist, I get the following output:
"Template is: ./FS2/#ARGH# ./FS2/that ./FS2/this"
Modifying either "this" or "that" does not cause "all" to be regenerated. When run with "-d" for debug, the "all" target is only dependent on the directory "./FS2", not the three files in the directory. I determined that when it expands "Template", the "#" gets treated as the beginning of a comment and the rest of the line is ignored!
The problem is caused by an editor that when killed leaves around files that begin with "#". If one of those files exists, then no modifications to files in the directory causes "all" to be regenerated.
Although, I do not want to make compilation dependent on whether a temporary file has been modified or not and will remove the file from the "Template" variable, I am still curious as to how to get this to work if I did want to treat the "#ARGH#" as a filename that the rule "all" is dependent on. Is this even possible?
I have a directory "FS2" that contains the following files: #ARGH# ...
Therein lies your problem. In my opinion, it is unwise to use "funny" characters in filenames. Now I know that those characters are allowed, but that doesn't make them a good idea (ASCII control characters like backspace are also allowed, with similar annoying results).
I don't even like spaces in filenames, preferring instead SomethingLikeThis to show independent words in a file name, but at least the techniques for handling spaces in many UNIX tools are reasonably well known.
My advice would be to rename the file if it was one of yours and save yourself some angst. But, since they're temporary files left around by an editor crash, delete them before your rules start running in the makefile. You probably shouldn't be rebuilding based on an editor temporary file anyway.
Or use a more targeted template like: Template:sh= ls ./FS2/[A-Za-z0-9]* to bypass those files altogether (that's an example only; you should ensure it doesn't falsely exclude files that should be included).
'#' is a valid Makefile comment char, so the second line is ignored by the make program.
Can you filter out (with grep) the files that start with # and process them separately?
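The truncation and the suggested filtering can both be modeled in a few lines. This is a deliberately naive Python sketch, not a description of how clearmake parses makefiles; the function names are made up for the illustration, and the input is the expanded value reported in the question.

```python
def strip_comment(line: str) -> str:
    """Naive model: everything from the first '#' onward is treated as a comment."""
    return line.split("#", 1)[0].rstrip()


def without_hash_names(paths):
    """Keep only the paths that contain no '#', similar to piping through grep -v '#'."""
    return [p for p in paths if "#" not in p]


expanded = "./FS2/#ARGH# ./FS2/that ./FS2/this"

print(strip_comment(expanded))                # ./FS2/
print(without_hash_names(expanded.split()))   # ['./FS2/that', './FS2/this']
```

The first result is consistent with the debug output in the question, where the target ended up depending only on the directory ./FS2.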
I'm not familiar with clearmake, but try replacing your template definition with
Template:sh= ls ./FS2/* | grep -v '#'
so that filenames containing # are not included in $(Template).
If clearmake follows the same rules as GNU make, then you can also re-write your target using something like Template := $(wildcard *.c) which will be a little more intelligent about files with oddball names.
If I really want the file #ARGH# to contribute to whether the target all should be rebuilt as well as be included in the artifacts produced by the rule, the Makefile should be modified so that the line
Template:sh= ls ./FS2/*
is changed to
Template=./FS2/*
Template_files:sh= ls $(Template)
This works because $(Template) will be replaced by the literal string ./FS2/* both after "all:" and in the expansion of $(Template_files).
Clearmake (and GNU make) then use ./FS2/* as a pathname containing a wildcard when evaluating the dependencies, which expands in to the filenames ./FS2/#ARGH# ./FS2/that ./FS2/this and $(Template_files) can be used in the rules where a list of filenames is needed.

Resources