Opening executables and writing output files quickly in Windows fails randomly

I've got an executable that does some structural analysis. It's compiled from old Fortran code, somewhat of a black box. It reads an input file and writes output to the command window.
I've integrated that executable into an Excel VBA macro to do design optimization. My optimization routine does the following:
Write 10 input files in different directories
Call 10 concurrent instances of the executable (each instance is a copied and renamed version of the exe file) and pipe each one's output to a file
Wait for them all to finish
Read in output files, use the results to generate a new set of designs, and start again.
The executable runs very quickly, less than a second for all the concurrent instances.
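For concreteness, here is a rough sketch of one round of the scheme above. It uses Python's subprocess module rather than VBA, and the executable name, directory layout and file names are all made up; the point is only the write-inputs, launch-in-parallel-with-redirected-output, wait-for-all, read-results structure.
# One optimization round, sketched in Python. All names are placeholders.
import subprocess, pathlib

N = 10

def run_round(designs):                      # designs: list of N input-file strings
    procs = []
    for i in range(N):
        workdir = pathlib.Path(f"job{i:02d}")
        workdir.mkdir(exist_ok=True)
        (workdir / "input.txt").write_text(designs[i])         # 1. write input file
        exe = workdir / f"analysis{i:02d}.exe"                 # copied/renamed exe
        out = open(workdir / "results.out", "w")
        p = subprocess.Popen([str(exe), "input.txt"],          # 2. launch instance,
                             cwd=workdir, stdout=out,          #    pipe output to file
                             stderr=subprocess.STDOUT)
        procs.append((p, out))
    for p, out in procs:                                       # 3. wait for all to finish
        p.wait()
        out.close()
    return [(pathlib.Path(f"job{i:02d}") / "results.out").read_text()
            for i in range(N)]                                 # 4. read output files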
This scheme is pretty reliable when I run it on its own. However, I'd like to run multiple optimization jobs concurrently, so imagine 8 or 10 instances of Excel each running these optimizations at the same time. On my computer it generally runs fine. On other machines with nominally identical specs we're running into problems where the output file isn't created, either because the executable isn't being called, is failing to run, or its output is failing to be piped to the results file; I'd welcome suggestions on how to check which of these it is. This doesn't happen every time, maybe once per 1000 iterations, but when it does happen it hits most of the Excel instances and most of the 10 executable calls simultaneously.
Any idea what is going wrong? It seems like it has something to do with calling so many executables or writing so many files so quickly.

Related

VB6 compiled is slow when copying files

I know, VB6 is historic... OK, but...
Years ago I wrote a backup program because I wasn't satisfied with the commercial products I had tested.
Now I wanted to renew it with some enhancements and a new GUI; the result is quite good for me. Since the file copying process is generally rather slow, I thought compiling it would squeeze out a few seconds... and instead it is much slower.
Here is some info:
Win10-64 (version 22H2 just upgraded)
Tested on the same PC with identical parameters
VB6 runs with admin privileges, in Win7 SP3 compatibility mode.
Even if it is not relevant here, the job was to copy a folder containing 426 subfolders and 4598 files of different sizes (from 1 kB to 435 MB, 1.05 GB in total) from an internal SSD to an external SSD.
The interpreted version took 7.2 sec while the compiled version finished in 18.6 sec!
I tried different native-code compilation settings, including removing all the advanced checks on array bounds, integer overflow and floating point, without any notable difference.
I could accept a small difference for some unknown reason, but it is unreal to get a 2.5:1 ratio.
Any idea?
EDIT
Based on comments:
I repeated the comparison several times; the variation (in both the compiled and the interpreted mode) is around ±1 sec.
Files are copied using FileSystemObject.CopyFile.
My admin privileges are the same for both.
Again, I'm not complaining about nor worried by the absolute time the copy takes; I can live with that, since it is an operation run every week and during quiet hours.
What is surprising is WHY it happens.
Even the idea of compiling the program was down to curiosity, since there is very little to optimize in the code; it is just a For-Next loop with very few calculations and assignments.
The program takes the directory and file info from a text-based DB created by recursively scanning the source folder, then loads it into a custom array... pretty simple.
This is done before the actual copy phase, which is what I'm investigating.
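For reference, the copy phase I'm timing has roughly this shape (transcribed to Python only to make the structure explicit; the file list and paths are made up):
# The copy phase: a plain loop over a prescanned (source, destination) list.
import shutil, time

file_list = [("C:/data/src/a.bin", "E:/backup/a.bin"),
             ("C:/data/src/b.txt", "E:/backup/b.txt")]   # loaded from the text DB in reality

start = time.perf_counter()
for src, dst in file_list:
    shutil.copyfile(src, dst)        # one plain copy per entry, like CopyFile
print(f"copy phase took {time.perf_counter() - start:.1f} s")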

Loop with one go-exiftool instance hangs on a large number of files

I'm looping over ~10k files using go-exiftool.
I'm using one instance of go-exiftool to get info for all the required files.
This line is called 10k times in the loop, with a different file each time:
fileInfos := et.ExtractMetadata(file)
After ~7k iterations the program hangs. I debugged go-exiftool and found that it hangs in
https://github.com/barasher/go-exiftool/blob/master/exiftool.go#L121
on the line:
fmt.Fprintln(io.WriteCloser, "-execute")
If I understood correctly, the io.WriteCloser here is the pipe returned by exec.Command(binary, initArgs...).StdinPipe().
So, the questions are:
Does exec.Command have some execution limit?
If not, what else could be the reason?
Does it depend on the file sizes? I tried another folder and it processed 35k files and then hung. How can I check that?
UPDATE:
I tried running the same file through the loop 10k times. That works fine. Could it be running out of memory? I see no problem in the system memory graph. Or is stdin overflowing? I have no idea how to check that.
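For what it's worth, an unread pipe filling up somewhere would produce exactly this symptom. The sketch below (Python rather than Go, with a throwaway child process) hangs as soon as the child's unread stdout pipe is full: the child blocks writing, stops reading its stdin, and then the parent blocks in its own write, which is what a hang on fmt.Fprintln would look like. Whether something similar happens here (e.g. an exiftool output or error stream not being drained) is only a guess.
# Demonstration of a pipe-buffer deadlock: the parent writes lines to the
# child but never reads the child's stdout, so after a few thousand lines
# both sides block and the program hangs.
import subprocess, sys

child_code = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write('echo: ' + line)\n"
    "    sys.stdout.flush()\n"
)

proc = subprocess.Popen([sys.executable, "-c", child_code],
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

for i in range(100_000):
    # Hangs once the OS pipe buffers fill up (typically after a few thousand
    # iterations), even though neither process has crashed or run out of memory.
    proc.stdin.write(f"file-{i}\n")
    proc.stdin.flush()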

VBScript alternatives needed for writing variables to a file (FSO.Write is too slow)

TL;DR: I need something way faster than FSO.write OR another way to share a variable in memory between different script instances.
Hello, I am running CCPulse (on Windows 7), which is a Call Center monitoring tool. Agents are represented as "Objects" and can have various statistics (like calls taken, total talk duration, etc.). CCPulse allows you to apply thresholds and actions to any statistic. These are basically VBScripts and, as far as I can tell, there are no restrictions.
This allows me to take the "Threshold StatValue" and do things with it, e.g. writing it to a file. The issue is that if I apply a threshold to a statistic for all agents, the script executes for each agent object separately (in sequence, not in parallel). However, I want to export all the agent stats to a single CSV file.
I already got it working by creating the file if it doesn't exist, then opening it and ReadAll-ing it into a string. If an agent has not been written to the file yet, his stat values get appended to the string as a new line; if he already exists in the file, I search for and replace his line using a regex pattern. I then write the entire multiline string back to the file:
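' Reopen the export file for overwriting (OpenTextFile mode 2 = ForWriting) and dump the whole buffer back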
Set objFile = objFSO.OpenTextFile(inFile,2)
objFile.Write strMemoryBuffer
objFile.Close
set objFile = nothing
strMemoryBuffer contains the file's original content, with either a new line added or an existing line modified. This string (and subsequently the export file) is around 30 kB in size after all agents have been exported. It looks like this (simplified):
LoginID;Calls;TotalTalkTime
2243;08;9403
2132;12;8439
As I said, since the script runs separately for each agent, only one line is ever added or modified per pass (CCPulse will execute the script one object at a time until all are finished).
However, the write process is very slow: according to Timer(), it needs between 0.10 and 0.15 seconds! That is way too slow, as I need to run the script for almost 500 agents (ideally in no more than 30-second intervals), but all the writing alone would take over a minute (CCPulse would create a backlog of threshold operations which could never be finished; I can decrease the recalculation frequency, but that is detrimental in other ways).
If I comment out only the above block, execution time dramatically decreases to ~0.02 seconds. So reading the file and manipulating the string takes almost no time at all, just the write process is slow.
I am writing the file locally to a hard drive (no SSD though). I cannot use a RAM Disk.
I also already tried writing to the volatile environment, but somehow this is even slower (it does work, but for some reason the explorer process goes crazy with up to 50% CPU usage and CCPulse locks up, although the export file is still being updated).
The ideal solution would be to have the string manipulated repeatedly in memory only and written to the file just once every 30 seconds or so, but I don't know how I can make the strMemoryBuffer variable available to the "next" agent. Any ideas?

How to monitor and control background processes in shell script

I need to write a shell (bash) script that will be executing several Hive queries.
Each of the queries will produce a directory with a lot of files.
After all queries are finished I need to process all these files in a specific order.
I want to run the Hive queries in parallel as background processes, as each one might take a couple of hours.
I would also like to parallelize the resulting file processing, but there are some complications that I don't know how to handle. For example, I can start processing the results of the first and second queries as soon as they are finished, but for the third, I need to wait until the first two processors are done. Similarly for the fourth and fifth.
I wouldn't have any problem writing such a program in Java, but how to do it in shell beats me.
If someone can give me a hint on how can I monitor execution of these components in the shell script, I would appreciate it greatly.
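To make the required ordering concrete (this is not the shell answer I'm after, just the dependency structure spelled out), the whole thing in Python with placeholder commands would look roughly like this:
# Queries run in parallel; processors wait on the futures they depend on.
# The hive/query file names and process_*.sh commands are placeholders.
import subprocess
from concurrent.futures import ThreadPoolExecutor

def sh(cmd):
    subprocess.run(cmd, shell=True, check=True)

with ThreadPoolExecutor(max_workers=10) as pool:
    # Launch all five queries as background tasks.
    q = {i: pool.submit(sh, f"hive -f query{i}.hql") for i in range(1, 6)}

    def process(i, *wait_for):
        for f in wait_for:          # block until the prerequisite processors are done
            f.result()
        q[i].result()               # and until query i itself has finished
        sh(f"./process_{i}.sh")

    p1 = pool.submit(process, 1)
    p2 = pool.submit(process, 2)
    # Processor 3 must wait for processors 1 and 2; here 4 and 5 are assumed
    # to follow the same rule.
    p3 = pool.submit(process, 3, p1, p2)
    p4 = pool.submit(process, 4, p1, p2)
    p5 = pool.submit(process, 5, p1, p2)
    for p in (p3, p4, p5):
        p.result()                  # propagate any failures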

Reading file in parallel from multiple processes

I'm running multiple processes in parallel and each of these processes read the same file in parallel. It looks like some of the processes see a corrupted version of the file if I increase the number of processes to > 15 or so. What is the recommended way of handling such a scenario?
More details:
The file being read in parallel is actually a Perl script. The multiple jobs are Python processes, and each of them launches this Perl script independently with different input parameters. When the number of jobs is increased, some of these jobs give errors that the Perl script has invalid syntax (which is not true). Hence, I suspect that some of these jobs are reading corrupted versions of the Perl script.
I'm running all of this on a 32core machine.
If any process is also writing to the file, then you need to enforce some synchronization, for example with a global named mutex.
If there is no asynchronous writing going on, I would not expect to see corruption during the reads. Are you opening the files with "r" access? If you're still encountering trouble, it might be worth experimenting with reducing the read buffer size, or calling out to a native Win32 API for the file access.
Good luck!
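If writes do turn out to be involved, here is a minimal sketch of that synchronization idea, using a lock file as a portable stand-in for a global named mutex; the script name, parameters and the copy-then-run detail are illustrative, not part of the original setup.
# Each job atomically acquires a lock, takes a private copy of the shared
# perl script under the lock, then runs the private copy in parallel.
import os, shutil, subprocess, tempfile, time

SCRIPT = "analysis.pl"            # the shared perl script (hypothetical name)
LOCK = SCRIPT + ".lock"

def acquire_lock(path, timeout=60.0):
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL makes creation atomic: only one process wins.
            return os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError(f"could not lock {path}")
            time.sleep(0.05)

fd = acquire_lock(LOCK)
try:
    private = os.path.join(tempfile.mkdtemp(), "analysis.pl")
    shutil.copyfile(SCRIPT, private)   # read the shared file only while locked
finally:
    os.close(fd)
    os.remove(LOCK)

# Outside the lock, every job runs its own private copy in parallel.
subprocess.run(["perl", private, "--param", "value"], check=True)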

Resources