How can I use Acrobat XI Pro (or any other software) to batch-optimize thousands of PDF files without being prompted for each file, as Acrobat XI Pro's batch mode now seems to do (with its "Add Document Description" followed by "Optimize Scanned PDF" followed by "Save As")? These documents were erroneously scanned at 600 DPI in full color but should have been B&W at 300 DPI.
Just out of curiosity: I noticed that when I copied some webpage text in Firefox that had a font size and color (set by CSS) and pasted it into OneNote, the font size and color were copied along with it. How is this formatting information transferred between the two applications?
OneNote offers several paste operations: keep the original formatting, merge formatting, and keep only the text. So presumably the formatting information is saved to the Windows clipboard when the copy button is pressed? I have no knowledge of Windows application development, but I assume that Firefox is the active window when I press the copy key, so it is Firefox that receives and handles this keyboard event?
I searched Firefox's documentation and didn't find anything related to the system clipboard.
From Microsoft's technical documentation I learned that there are many clipboard data formats (which makes sense, since the Windows clipboard has to handle many kinds of data). To pass data between two applications, I think the format must be one of the standard formats, but I'm not sure which one.
Or is the actual mechanism completely different from my guess?
When an application is asked to copy something to the clipboard it can store "that something" in multiple formats simultaneously and when another application is asked to paste, it can pick from all the applicable formats.
OneNote perhaps picks CF_HTML > CF_RTF > CF_UNICODETEXT. On the other hand, when you ask it to paste without formatting it might pick CF_UNICODETEXT first (and if it is not available, manually strip the formatting from the HTML/RTF).
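The priority order described above can be sketched in a few lines of Python. This is only a model of the negotiation, not OneNote's actual logic: the function name `pick_format`, the two preference lists, and the sample clipboard contents are all hypothetical, and the format names simply follow this answer (on Windows, HTML and RTF are registered formats named "HTML Format" and "Rich Text Format", while CF_UNICODETEXT is a predefined one).

```python
# Hypothetical model of paste-time format negotiation (not OneNote's real code).
# Format names follow the answer above.
RICH_PASTE_ORDER = ["CF_HTML", "CF_RTF", "CF_UNICODETEXT"]
PLAIN_PASTE_ORDER = ["CF_UNICODETEXT", "CF_HTML", "CF_RTF"]


def pick_format(available, preference):
    """Return the first preferred format that the clipboard offers, else None."""
    for fmt in preference:
        if fmt in available:
            return fmt
    return None


# The copying application stores the same selection in several formats at once.
clipboard = {
    "CF_HTML": "<span style='color:red'>Hi</span>",
    "CF_UNICODETEXT": "Hi",
}

print(pick_format(clipboard, RICH_PASTE_ORDER))   # CF_HTML
print(pick_format(clipboard, PLAIN_PASTE_ORDER))  # CF_UNICODETEXT
```

If only a rich format is available, the plain-paste order falls through to it, which is the case where the pasting application would have to strip the formatting itself.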
There are various tools that let you see which formats are on the clipboard...
I have an AppleScript source with this code:
«event coreslct» (last row of table X of document Y)
insert rows selection position below number of rows Z
The double angle brackets mean that the enclosed words are a raw event code.
So my question is: which command does the event code «event coreslct» stand for...
The «event coreslct» as used in your example is specific to Microsoft Excel, and loosely translates to "select the cells concerned", or simply "select". To determine precisely what the definition of the command is you can perhaps try the instructions below:
According to Microsoft's KB:
To use the program-specific capabilities of Excel for Mac with
AppleScript, open and examine the AppleScript dictionary that is
supplied with Excel for Mac.
To use the Script Editor to open the dictionary in Excel for Mac, follow
these steps:
1. Start the Script Editor. To do this, follow these steps:
a. Open your hard disk.
b. Open the Applications folder.
c. Open the AppleScript folder. For the Apple OS versions
earlier than OSX, open the Apple Extras folder, and then open
the AppleScript folder.
d. Double-click Script Editor.
2. On the File Menu, click Open Dictionary.
3. In the Open Dictionary dialog box, select Microsoft Excel
(Application) in the Name list, and then click Open.
In the window that appears, you can select an object or a class to
view its description. You can also click the bold suite names to view
a whole suite at one time. You can use the descriptions in this
window to create scripts in the Script Editor to control Excel for
Mac.
The versions of Excel for Mac listed at the beginning of this article
support a very large number of events. For a complete list, follow the
instructions in this article to open Excel for Mac in the AppleScript
Script Editor.
I don't have access to the AppleScript dictionary mentioned above; however, it was ported to Python at some point, and you can view a complete set of commands here.
It's the select command of Microsoft Word. Wrap a tell application "Microsoft Word" block around it, and the raw code will be resolved to the readable term using the application's dictionary.
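For reference, a raw event code such as «event coreslct» is an eight-character code: a four-character event class followed by a four-character event ID. The small Python sketch below only splits the code into those two parts; the function name `split_raw_event` is made up for illustration, and looking up the human-readable term still requires the application's dictionary.

```python
import re


def split_raw_event(raw):
    """Split a raw code like «event ccccdddd» into (event class, event ID)."""
    match = re.fullmatch(r"«event (.{4})(.{4})»", raw.strip())
    if match is None:
        raise ValueError(f"not a raw event code: {raw!r}")
    return match.group(1), match.group(2)


print(split_raw_event("«event coreslct»"))  # ('core', 'slct')
```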
I am writing a cross-platform application which generates PDF documents and then opens them with the standard PDF viewer. On Linux, updating (re-writing) an already opened document is no problem. The PDF viewer (have tried several) updates the displayed document after overwriting the file. I don't even have to trigger anything. This comes in very handy.
On Windows, however, I cannot overwrite a PDF document opened for reading by some viewer, because the viewers seem to open the document exclusively. Of course I cannot change the code of the viewers like Adobe Reader.
Is there any solution to this? I don't want to clutter the directory with arbitrary new file names, but rather reuse the same file name. Deleting the open file is also forbidden on Windows.
I cannot determine which PDF viewer is used. To open the document, I use:
cmd.exe /c start <fileName>
I will be writing code that takes a screenshot, crops it to a small, predefined area of the screen, extracts the text from the cropped image (via OCR tools), and then saves the resulting text to a file. Is there software (preferably for Windows) that can do this, or at least parts of it? I am already looking into Tesseract as an OCR tool. Does anyone know of software that can take the screenshot and possibly crop a predefined region of the image?
Thanks,
-Jason
I use Greenshot, which is a very awesome tool for screenshots and according to the FAQ it supports OCR (using MODI = Microsoft Office Document Imaging) as well. However, I never got it working on my Windows machine and used Tesseract instead (for Linux, with some scripting experience, this method should be possible as well):
Download Tesseract here for Ubuntu/Debian/Windows and install it.
Download and install Greenshot
Create a new Windows batch script called "Greenshot_Tesseract_OCR.bat" using a text editor like Notepad or Notepad++ and save it at a location of your choice, e.g. "C:\Users\MyUser\Scripts\Greenshot_Tesseract_OCR.bat", with the following content (adjust the path to match where Tesseract is installed):
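@ECHO OFF
REM %~1 is the screenshot path passed by Greenshot, with any surrounding quotes removed
set "arg1=%~1"
REM Tesseract appends ".txt" to the output base name given as the second argument
"C:\Program Files\Tesseract-OCR\tesseract.exe" "%arg1%" "%arg1%"
REM Copy the recognized text to the clipboard
type "%arg1%.txt" | clip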
Right-click the Greenshot icon in the toolbar and click "configure external command"
Add a new command with a name like "Tesseract OCR to Clipboard", select the batch script you just created as a command and as argument, use the default "{0}". Then click OK twice.
You should now be able to copy the text of a screenshot into your clipboard, with a shortcut ("Print" key in my case) and 1-2 mouse clicks (depending on your Greenshot settings)!
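If you would rather script the whole pipeline yourself (screenshot, crop to a predefined region, OCR, text), a rough Python sketch could look like the following. It rests on several assumptions: the third-party Pillow package is installed (its `ImageGrab.grab(bbox=...)` captures a screen region), Tesseract lives at the same path as in the batch script, and the function names, the default file name "region.png", and the example coordinates are made up for illustration.

```python
import subprocess
from pathlib import Path

# Assumed install location, same as in the batch script; adjust as needed.
TESSERACT_EXE = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def tesseract_command(image_path, exe=TESSERACT_EXE):
    """Build the tesseract call and the path of the text file it will write."""
    image_path = Path(image_path)
    output_base = image_path.with_suffix("")  # drop the image extension
    # Tesseract appends ".txt" to the output base name itself.
    text_path = output_base.parent / (output_base.name + ".txt")
    return [exe, str(image_path), str(output_base)], text_path


def capture_region(bbox, image_path):
    """Grab the screen region bbox=(left, top, right, bottom) and save it."""
    from PIL import ImageGrab  # third-party Pillow, imported only when needed

    ImageGrab.grab(bbox=bbox).save(image_path)


def ocr_region(bbox, image_path="region.png"):
    """Capture a region, run Tesseract on it, and return the recognized text."""
    capture_region(bbox, image_path)
    command, text_path = tesseract_command(image_path)
    subprocess.run(command, check=True)
    return text_path.read_text(encoding="utf-8")


# Example call (coordinates are placeholders for your predefined area):
# text = ocr_region((100, 200, 500, 260))
```

The recognized text is left in the ".txt" file next to the image, so saving it to a file needs no extra step.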
You can try the following open-source programs:
Greenshot for screenshots and VietOCR (a GUI frontend for Tesseract) for OCR on screenshots.
I used to convert Word documents to PDF via Word Automation: I enumerated the CommandBars until I found one containing "PDFmaker", then enumerated its controls and executed the relevant one.
With Word 2007 this no longer works, although the PDFMaker COM add-in is installed and accessible via the Acrobat menu.
PDFmaker is required for quality reasons. Therefore I cannot use the Microsoft "Save as PDF" addin; so the SaveAs method described in another post here is not applicable.
Any ideas?
A common way to get a PDF out of Word is to print to a virtual PDF printer. I would bet you have one installed, and you may find its quality acceptable.
The code would be:
Application.ActivePrinter = "whatever PDF printer you've got"
ThisDocument.PrintOut OutputFileName:="c:\whatever.pdf", PrintToFile:=True