We have a PDF document processing system implemented in AppleScript (we call the scripts from the shell using osascript). In some of the scripts, we call Acrobat Preflight Droplets from the AppleScript.
This usually works without problems. However, in some cases, where the processed document is big and/or complex, the droplet returns control to the script before the report is written and the document is moved to the "success" or "failure" folder. The consequence is that the process continues, but without the moved file, it eventually fails.
The workaround so far has been to add a delay after those droplet calls. This does help, but it is a waste of time for small documents, and there will always be a document big and complex enough to take longer than the delay.
We also found out that the time needed to finish writing the report and moving the document depends on the speed of the system (as was to be expected…).
A better workaround would be to calculate the delay from the document size, its number of pages, and a machine-dependent parameter. Document size and number of pages are no big deal; they can be retrieved in the AppleScript.
The problem is the machine-dependent parameter, which can be determined experimentally. But how do I make that parameter available to all the scripts needing it?
Incorporating it into the scripts is not an option, because we have a number of systems installed, and if we did that, we'd end up in a maintenance nightmare. Passing it as an argument in the initial system call is also not possible, because the calls are many, and that again would lead to a maintenance nightmare.
So, is there a way to set up a place where that machine parameter can be stored and easily read from any AppleScript, no matter how the script itself is called?
Thanks a lot for your advice.
You might find the Property List Suite in System Events useful. It’s a standard means of storing and then retrieving such information. Property List files themselves are simply XML files, so you can even create them outside of AppleScript and then read them within your scripts.
There’s a description with examples at https://apple.stackexchange.com/questions/58007/how-do-i-pass-variables-values-between-subsequent-applescript-runs-persistent
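As a rough illustration of the property-list idea, the sketch below writes and reads a single machine parameter in a plist file. It uses Python's standard plistlib module rather than the System Events suite, purely to show the shape of the data; the key name `delayFactor`, the function names, and the default value are assumptions made for this example, not part of the original setup.

```python
import plistlib


def write_machine_param(plist_path, factor):
    """Store the machine-dependent factor under a hypothetical key."""
    with open(plist_path, "wb") as fp:
        plistlib.dump({"delayFactor": float(factor)}, fp)


def read_machine_param(plist_path, default=1.0):
    """Return the stored factor, or `default` if the file or key is missing."""
    try:
        with open(plist_path, "rb") as fp:
            return float(plistlib.load(fp).get("delayFactor", default))
    except FileNotFoundError:
        return default
```

Because the result is an ordinary XML property list, the same file could then be read from AppleScript through System Events.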
A simple suggestion, if you only have one parameter to keep track of, would be to just have a text file in a known location on each machine. The only content of the text file would be the machine parameter. I like to use the Application Support folder for this kind of thing.
Assuming your machine parameter is CPU speed, you can save a text file in /Library/Application Support/Preflight Scripts/machinecpu.txt with the contents:
2.4
Then in AppleScript, you would just read the text file:
set machineParam to read file "Macintosh HD:Library:Application Support:Preflight Scripts:machinecpu.txt"
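To make the delay calculation from the question concrete, here is a minimal sketch of reading such a text file and turning document size, page count, and the machine parameter into a delay. It is written in Python for illustration only; the formula, the `per_mb` and `per_page` coefficients, and the assumption that a larger parameter means a faster machine are all placeholders that would have to be tuned experimentally.

```python
from pathlib import Path

# Path taken from the answer above; adjust per installation.
PARAM_FILE = Path("/Library/Application Support/Preflight Scripts/machinecpu.txt")


def read_machine_param(path=PARAM_FILE, default=1.0):
    """Read the single number stored in the text file; fall back to `default`."""
    try:
        return float(Path(path).read_text().strip())
    except (OSError, ValueError):
        return default


def estimate_delay(size_bytes, page_count, machine_param,
                   per_mb=0.5, per_page=0.1):
    """Guess a delay in seconds (hypothetical formula, not a measured model).

    Assumes a larger machine_param means a faster machine, so it divides.
    """
    if machine_param <= 0:
        raise ValueError("machine_param must be positive")
    size_mb = size_bytes / (1024 * 1024)
    return (size_mb * per_mb + page_count * per_page) / machine_param
```

A fixed formula like this still only estimates the time needed, so it shares the weakness of the plain delay for unusually complex documents.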
I am trying to use NiFi to get a file from an SFTP server. Potentially the file can be big, so my question is how to avoid getting the file while it is still being written. I am planning to use ListSFTP+FetchSFTP, but I am also okay with GetSFTP if it can avoid copying partially written files.
thank you
In addition to Andy's solid answer, you can also be a bit more flexible by using the ListSFTP/FetchSFTP processor pair with some metadata-based routing.
After ListSFTP each flowfile will have attributes such as 'file.lastModifiedTime' and others. You can read about them here https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.3.0/org.apache.nifi.processors.standard.ListSFTP/index.html
You can put a RouteOnAttribute processor in between the List and Fetch to detect objects that, at least based on the reported last modified time, are 'too new'. You could route those to a processor that is just a slow pass-through to intentionally wait a bit. You can then run those back through the first router until they are 'old enough'. Now, this is admittedly a power-user approach, but it does give you a lot of flexibility and control. The approach I'm mentioning here is not foolproof: the source system may not report the last modified time correctly, an old timestamp may not mean the source file is done being written, etc. But it gives you additional options IF you cannot do the definitely correct thing that Andy talks about.
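The routing decision described above boils down to a simple age comparison. The sketch below shows that logic in Python, outside NiFi; it assumes the last modified time has already been converted to epoch milliseconds, and the function name and the two route labels are made up for this example.

```python
import time


def route_by_age(last_modified_ms, min_age_seconds, now_ms=None):
    """Return 'old_enough' or 'too_new' based on the reported modification time."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    age_seconds = (now_ms - last_modified_ms) / 1000.0
    return "old_enough" if age_seconds >= min_age_seconds else "too_new"
```

Items routed as 'too_new' would be delayed and checked again, which is the loop the answer describes.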
If you have control over the process which writes the file, a common pattern to solve this is to initially write the file with a specific naming structure, such as beginning with a dot (.). After the successful write operation, the file is renamed without the leading dot and is then picked up by the processor. Both GetSFTP and ListSFTP have a processor property called Ignore Dotted Files, which is set to true by default and means those processors will not operate on or return files beginning with the dot character.
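On the writing side, the dot-file pattern might look like the following minimal sketch. It works on a local directory for simplicity (a real writer would do the equivalent on the SFTP server), and the function name is an assumption made for this example.

```python
import os


def write_then_publish(directory, final_name, data):
    """Write to a dot-prefixed name first, then rename once the write is complete."""
    hidden_path = os.path.join(directory, "." + final_name)
    final_path = os.path.join(directory, final_name)
    with open(hidden_path, "wb") as fp:
        fp.write(data)
        fp.flush()
        os.fsync(fp.fileno())  # make sure the bytes are on disk before renaming
    os.replace(hidden_path, final_path)  # the file appears under its final name in one step
    return final_path
```

A reader that ignores dotted files never sees the partially written file, only the finished one.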
There is a minimum file age property you can use. The last modification time gets updated as the file is being written, so setting this value to something other than 0 will help fix the problem.
I am performing very rapid file access in ruby (2.0.0 p39474), and keep getting the exception Too many open files
Having looked at this thread, here, and various other sources, I'm well aware of the OS limits (set to 1024 on my system).
The part of my code that performs this file access is mutexed, and takes the form:
File.open( filename, 'w'){|f| Marshal.dump(value, f) }
where filename is subject to rapid change, depending on the thread calling the section. It's my understanding that this form relinquishes its file handle after the block.
I can verify the number of File objects that are open using ObjectSpace.each_object(File). This reports that there are up to 100 resident in memory, but only one is ever open, as expected.
Further, the exception itself is thrown at a time when there are only 10-40 File objects reported by ObjectSpace. Further, manually garbage collecting fails to improve any of these counts, as does slowing down my script by inserting sleep calls.
My question is, therefore:
Am I fundamentally misunderstanding the nature of the OS limit---does it cover the whole lifetime of a process?
If so, how do web servers avoid crashing out after accessing over ulimit -n files?
Is ruby retaining its file handles outside of its object system, or is the kernel simply very slow at counting 'concurrent' access?
Edit 20130417:
strace indicates that ruby doesn't write all of its data to the file, returning and releasing the mutex before doing so. As such, the file handles stack up until the OS limit.
In an attempt to fix this, I have used syswrite/sysread, synchronous mode, and called flush before close. None of these methods worked.
My question is thus revised to:
Why is ruby failing to close its file handles, and how can I force it to do so?
Use dtrace or strace or whatever equivalent is on your system, and find out exactly what files are being opened.
Note that these could be sockets.
I agree that the code you have pasted does not seem to be capable of causing this problem, at least, not without a rather strange concurrency bug as well.
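As a lighter-weight complement to strace, a process can also count its own open descriptors, which include sockets and pipes as well as files. The sketch below does this in Python rather than Ruby, purely as an illustration; it relies on /proc/self/fd or /dev/fd being available (typical on Linux and macOS), and the function names are made up for this example.

```python
import os
import resource


def open_fd_count():
    """Count this process's open descriptors, or return None if unsupported."""
    for fd_dir in ("/proc/self/fd", "/dev/fd"):
        try:
            return len(os.listdir(fd_dir))
        except OSError:
            continue
    return None


def fd_soft_limit():
    """Return the soft limit on open descriptors (what `ulimit -n` reports)."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]
```

Logging this count around the suspect section shows whether descriptors really accumulate there or somewhere else in the program.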
What I'm trying to accomplish is to always keep a parsable duplicate of all printed documents, and execute a secondary process for each print.
(i.e.: Be able to parse all text, account for pages, vectors, images, etc).
Processing the document can either be done immediately or deferred (immediately is desirable).
As formats go, any PDL might be suitable. My best guess is that XPS would be the best bet for a parsable format; any recommendations for other formats are appreciated.
Ideally, I'd like not to interfere with the user's interaction with printing (e.g. the print settings page), and not to require a virtual printer that would save an XPS and then forward the print job to the physical printer,
since users might not be tech-savvy enough to set it up or use it properly, and/or might mess up the process at a later date.
What I'm looking for at this time:
Documentation on the print process and flow (WDK, PDL, what else?)
How this could be accomplished, if at all possible; are there any existing solutions?
Any directions into what I should be looking at.
It's only part of an answer, but rumor has it you can tell Windows to keep spooled documents (right-click the printer, choose "Printer Properties", Advanced, "Keep Printed Documents").
You could enable this, and then create a scheduled task (or system service, etc.) that watches the spool directory and moves all files older than a certain threshold to a more appropriate location for further processing. (The age threshold would be a reasonable way to avoid trying to move files that are currently being written.)
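The age-threshold idea could be sketched roughly as follows. This is a minimal Python illustration of what such a scheduled task might do, not a tested spooler integration; the function name and parameters are assumptions, and the actual spool directory depends on the system.

```python
import os
import shutil
import time


def move_settled_files(spool_dir, dest_dir, min_age_seconds, now=None):
    """Move files not modified for at least min_age_seconds; return their names."""
    now = time.time() if now is None else now
    os.makedirs(dest_dir, exist_ok=True)
    # Snapshot the directory first so moving files does not disturb the scan.
    with os.scandir(spool_dir) as it:
        entries = [entry for entry in it if entry.is_file()]
    moved = []
    for entry in entries:
        if now - entry.stat().st_mtime < min_age_seconds:
            continue  # possibly still being written; leave it for the next run
        shutil.move(entry.path, os.path.join(dest_dir, entry.name))
        moved.append(entry.name)
    return sorted(moved)
```

Running this periodically with a threshold of a minute or so would leave fresh spool files alone until a later pass.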
Then you'd have to find a program to convert the .spl files to whatever format you like, or try interpreting it yourself. It looks pretty low-level but Microsoft does offer some documentation about the MS-EMF and MS-EMFSPOOL formats that might be a start.
I want to be able to (programmatically) move (or copy and truncate) a file that is constantly in use and being written to. This would ensure that the file being written to never gets too big.
Is this possible? Either Windows or Linux is fine.
To be specific what I'm trying to do is log video with FFMPEG and create hour long videos.
It is possible in both Windows and Linux, but it would take cooperation between the applications involved. If the application that is writing the new data to the file is not aware of what the other application is doing, it probably would not work (well ... there is some possibility ... back to that in a moment).
In general, to get this to work, you would have to open the file shared. For example, if using the Windows API CreateFile, both applications would likely need to specify FILE_SHARE_READ and FILE_SHARE_WRITE. This would allow both (multiple) applications to read and write the file "concurrently".
Beyond sharing the file, though, it would also be necessary to coordinate the operations between the applications. You would need to use some kind of locking mechanism (either by locking some part of the file or some shared mutex/semaphore). Note that if you use file locking, you could lock some known offset in the file to act as a "semaphore" (it can even be a byte value beyond the physical end of the file). If one application were appending to the file at the same exact time that the other application were truncating it, then it would lead to unpredictable results.
Back to the comment about both applications needing to be aware of each other ... It is possible that if both applications opened the file exclusively and kept retrying the operations until they succeeded, then perform the operation, then close the file, it would essentially allow them to work without "knowledge" of each other. However, that would probably not work very well and not be very efficient.
Having said all that, you might want to consider alternatives for efficiency reasons. For example, if it were possible to have the writing application write to new files periodically, it might be more efficient than having to "move" the data constantly out of one file to another. Also, if you needed to maintain some portion of the file (e.g., move out the first 100 MB to another file and then move the second 100 MB to the beginning) that could be a fairly expensive operation as well.
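The "write to new files periodically" alternative can be sketched as a writer that switches to a new file whenever the hour changes. This is a generic Python illustration with made-up names; it only demonstrates the rotation idea for plain appended data, since a video container would need the encoder itself to start each new file.

```python
import os
import time


class HourlyFileWriter:
    """Append bytes to a file named after the current UTC hour."""

    def __init__(self, directory, prefix="capture", clock=time.time):
        self.directory = directory
        self.prefix = prefix
        self.clock = clock  # injectable so rotation can be tested
        self._hour = None
        self._fp = None

    def _path_for(self, hour_index):
        stamp = time.strftime("%Y%m%d-%H", time.gmtime(hour_index * 3600))
        return os.path.join(self.directory, f"{self.prefix}-{stamp}.bin")

    def write(self, data):
        hour_index = int(self.clock() // 3600)
        if hour_index != self._hour:
            self.close()  # finish the previous hour's file before starting a new one
            self._fp = open(self._path_for(hour_index), "ab")
            self._hour = hour_index
        self._fp.write(data)

    def close(self):
        if self._fp is not None:
            self._fp.close()
        self._fp = None
        self._hour = None
```

Because each finished file is closed before the next one is opened, another process can move or archive completed files without coordinating with the writer.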
logrotate would be a good option on Linux; it comes stock on just about any distro. I'm sure there's a similar Windows service out there somewhere.
When attempting to load the iTunes XML/plist file, I get "internal table overflow." After Googling, it looks like Applescript has run out of memory. The file is 18 meg on disk, so while on the larger side of things, it should still work on a Mac with 2 gigs.
How can I resolve this?
Obviously, since it's created by iTunes, I can't control how it is generated much.
Update: The relevant snippet:
tell application "System Events"
    tell property list file (itunes_xml_file as string)
        tell contents
            set my_tracks to value of property list item "Tracks"
            repeat with t in items of my_tracks
I guess that AppleScript is simply not made to handle this amount of data. I tried to use AppleScript a while back as well and tried to do something similar (reading an iTunes library). AppleScript's original intention was to automate applications by sending AppleEvents to them - which in combination with the weird syntax of AppleScript, confuses a lot and makes it difficult to do a lot of simple things.
After some frustrating time I decided to use Python instead, as it provides a simple module for reading the plist files: http://docs.python.org/dev/library/plistlib.html
Possibly not what you wanted to hear, but the problem with AppleScript is that it is easily overloaded with data, as the abstraction of data it works with is rather bulky and takes up a lot of memory.
I'm sure if you give Python a try, you'll have something up and running in less than an hour. Python is installed on all Macs by default and is really easy to learn.
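For reference, a minimal sketch of the plistlib approach might look like this. It assumes the usual library layout with a top-level `Tracks` dictionary whose entries carry `Name` and `Artist` keys; apart from `Tracks`, which appears in the question's snippet, those key names and the function name are assumptions of this example.

```python
import plistlib


def iter_tracks(library_path):
    """Yield (track_id, name, artist) for each entry under the 'Tracks' key."""
    with open(library_path, "rb") as fp:
        library = plistlib.load(fp)
    for track_id, track in library.get("Tracks", {}).items():
        yield track_id, track.get("Name"), track.get("Artist")
```

Missing keys simply come back as None, so incomplete track entries do not stop the iteration.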
OS version 10.6.8
Update: Confirmation of validity
After initially writing the below I tried to revert the code at hand but could not reproduce the error and hid the post by "deleting" it.
Just now the error happened when loading one nonexistent and one existing file path in an options-parsing script loaded with a load script call, initially handled by the run handler, called by the CLI osascript(1) program. This time it is reproducible and I feel confident to un-"delete" it.
In short, my solution is to change anyone_else's POSIX file path_posix to AppleScript's POSIX file path_posix
Some relation
After writing the below I now realize that I first only saw "iTunes" and missed the relevant first line with tell app "system events" and the use of its property list file, which perhaps/actually/somehow could be related to my issue with info for a POSIX file.
A note related to OP/question: file or alias as string gives a colon-separated "HFS"-path. System Events handles both.
My issue
In a script-loading script i got error "Internal table overflow." number -2707 from the block below.
It was issued when i called the block's handler using ~IPC~ (app app_name's handler_name()) (when i investigated it more thoroughly but, i had come across it before - without IPC).
try
    set file_modified_date to (info for my POSIX file file_path_posix)'s modification date
    true
on error error_message number error_number from error_source partial result error_result to error_class
    if {error_number} is not in {-43, -37} then error error_message number error_number from error_source partial result error_result to error_class
    false
end try
A (the(?)) parent (used for my) of this script, worth mentioning, is current application (with some levels in between) (compiled and bundled in "AppleScript Editor" to be run as a stand-alone .app)
My solution
Changing
set file_modified_date to (info for my POSIX file file_path_posix)'s modification date
to
set file_modified_date to (info for AppleScript's POSIX file file_path_posix)'s modification date
solved the issue - for now.
Thoughts
I'm guessing different ~modules~ have different "tables" (don't know much C) for handling a thing like POSIX file and info for (open "Scripting Addition" / extension (OSAX) "Standard Additions", as it (both) still is, is it not?).
Hope this helps, and that my level of detail (and parentheses) didn't lose or confuse you :) Good night.
Diggin' 'round the cradles
Around the AS memory grave: (Spam prevention made me downgrade all but 2 hyperlinks - give me some rep. and i'll fix it :p)
• The eminent ~has [on this error code] (http://lists.apple.com/archives/applescript-users/2005/Jul/msg00166.html) with a related but perhaps other source (and mentioning his library loader) via list.apple.com.
• Some [good questioning] (http://lists.apple.com/archives/applescript-implementors/2005/Jun/msg00104.html), some to which the below might be a slightly yawning answer:
• cs.cmu.edu provides (120625) a pascal source from 1992 that defines the same limits as my local /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/OpenScripting.framework/Versions/A/Headers/AppleScript.h
• And finally a more distantly related [issue with large scripts] (http://macscripter.net/viewtopic.php?id=11760) from macscripter.net - a good forum of knowledge and collection of resources around applescript.