Adding image from URL in Word 2011 for Mac OSX using VBA - macos

I am working on porting a project in Windows over to OSX. I have overcome issues with VBA for OSX Word 2011 not allowing you to send POSTs to a server and have figured out how to return a string result from an external script. Now I have to insert an image in my Word file from a URL that is built using the return of my external script.
The current attempt is as follows, and works in Windows but crashes Word in OSX:
Selection.InlineShapes.AddPicture FileName:=File_Name, _
LinkToFile:=False, SaveWithDocument:=True
After doing some research, it looks like MS may have disabled this functionality in OSX as a "security risk". I still need to make it work. Does anybody know of a way within VBA for Office 2011 to make this work, or barring that a workaround? I am trying to avoid writing the image file to the disk if possible.
UPDATE: I have created a Python script for getting the image file from a URL, but I still do not know how to get this image from the Python script into VBA, and from there into the Word document at the location of the cursor. The important bits of the script are below. The image is read in as a PIL object and I can show it using img.show() just fine, but I am not sure what filetype this is or how to get VBA to accept it.
# Import the required libraries
from urllib2 import urlopen, URLError
from cStringIO import StringIO
from PIL import Image
# Send request to the server and receive response, with error handling!
try:
    # Read the response and print to a file
    result = StringIO(urlopen(args.webAddr + args.filename).read())
    img = Image.open(result)
    img.show()
except URLError, e:
    if hasattr(e, 'reason'): # URL error case
        # a tuple containing error code and text error message
        print 'Error: Failed to reach a server.'
        print 'Reason: ', e.reason
    elif hasattr(e, 'code'): # HTTP error case
        # HTTP error code, see section 10 of RFC 2616 for details
        print 'Error: The server could not fulfill the request.'
        print 'Error code: ', e.code
Note that in the above, args.webAddr and args.filename are passed to the script using the argparse library. This script works, and will show the image file that I expect. Any ideas on how to get that image into Word 2011 for OSX and insert it under the cursor?
Thanks a lot!
Edit: updated the link to the project since migrating to github.

This is an old question, but it has no answer, and I see the same crash here when the image is at an http URL. I think you can use the following workaround:
Sub insertIncludePictureAndUnlink()
    ' Put your URL in here...
    Const theImageURL As String = ""
    Dim f As Word.Field
    Dim r As Word.Range
    Set f = Selection.Fields.Add(Range:=Selection.Range, Type:=wdFieldIncludePicture, Text:=Chr(34) & theImageURL & Chr(34), PreserveFormatting:=False)
    Set r = f.Result
    f.Unlink
    Set f = Nothing
    ' should have an inlineshape in r
    Debug.Print r.InlineShapes.Count
    ' so now you can do whatever you need, e.g....
    r.Copy
    Set r = Nothing
End Sub

Related

Saving decoded Protobuf content

I'm trying to setup a .py plugin that will save decoded Protobuf responses to file, but whatever I do, the result is always file in byte format (not decoded). I have also tried to do the same by using "w" in Mitmproxy - although on screen I saw decoded data, in the file it was encoded again.
Any thoughts how to do it correctly?
Sample code for now:
import mitmproxy
def response(flow):
    if flow.request.pretty_url.endswith("some-url.com/endpoint"):
        f = open("test.log","ab")
        with decoded(flow.response):
            f.write(flow.request.content)
            f.write(flow.response.content)
Eh, I'm not sure this helps, but what happens if you don't open the file in binary mode?
f = open("test.log","a")
Hi,
some basic things that I found.
Try replacing
f.write(flow.request.content)
with
f.write(flow.request.text)
I read it on this website
https://discourse.mitmproxy.org/t/modifying-https-response-body-not-working/645/3
Please read and try this to get the requests and responses assembled.
MITM Proxy, getting entire request and response string
Best of luck with your project.
I was able to find the way to do that. Seems mitmdump or mitmproxy wasn't able to save raw decoded Protobuf, so I used:
mitmdump -s decode_script.py
with the following script to save the decoded data to a file:
import mitmproxy
import subprocess
import time
def response(flow):
    if flow.request.pretty_url.endswith("HERE/IS/SOME/API/PATH"):
        protobuffedResponse=flow.response.content
        (out, err) = subprocess.Popen(['protoc', '--decode_raw'], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate(protobuffedResponse)
        outStr = str(out, 'utf-8')
        outStr = outStr.replace('\\"', '"')
        timestr = time.strftime("%Y%m%d-%H%M%S")
        with open("decoded_messages/" + timestr + ".decode_raw.log","w") as f:
            f.write(outStr)
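The core of the script above is piping the raw response body into an external decoder and capturing its output. Here is a minimal sketch of that step on its own, so it can be tried without mitmproxy. decode_with_command is a hypothetical helper name, and the command is passed in as an argument (for the answer above it would be ['protoc', '--decode_raw'], assuming protoc is on the PATH):

```python
import subprocess


def decode_with_command(payload, command):
    """Pipe `payload` (bytes) into `command` and return its stdout as text."""
    completed = subprocess.run(
        command,
        input=payload,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    if completed.returncode != 0:
        # Surface the decoder's own error message instead of writing an empty file.
        raise RuntimeError(completed.stderr.decode("utf-8", errors="replace"))
    return completed.stdout.decode("utf-8", errors="replace")
```

Inside a response(flow) hook this would be called as decode_with_command(flow.response.content, ['protoc', '--decode_raw']), and the returned text written to the log file as before.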

Ruby - How to add EOF marker into a PDF file or otherwise bypass PDF::Reader::MalformedPDFError: PDF does not contain EOF marker

I'm using the Mechanize ruby gem to click a button on the web to download a PDF file and save it to the local file system.
URL = "www.my-site.com"
agent = Mechanize.new
agent.pluggable_parser.pdf = Mechanize::File # FYI I have also tried Mechanize::FileSaver and Mechanize::Download here
page = agent.get(URL)
form = page.forms.first
button = page.form.button_with(:value => "Some Button Text")
local_file = "path/to/file.pdf"
response = agent.submit(form, button)
response.save_as(local_file)
But when I try to read this PDF file using the PDF::Reader gem, I get an error "PDF does not contain EOF marker".
reader = PDF::Reader.new(local_file) # this also happens if I try to use PDF::Reader.new(response.body) and PDF::Reader.new(response.body_io) depending on the different pluggable_parser configurations mentioned above
#> PDF::Reader::MalformedPDFError: PDF does not contain EOF marker
I'm able to save the PDF locally and view it and it looks fine, but the PDF::Reader gem is complaining about it missing an EOF marker.
So my question is: is there a way I could add an EOF marker into the PDF or something to get around this error so I can parse the PDF?
Thanks.
Related (unanswered) question: PDF does not contain EOF marker (PDF::Reader::MalformedPDFError) with pdf-reader
Related Docs:
http://mechanize.rubyforge.org/Mechanize/File.html
http://mechanize.rubyforge.org/Mechanize/Download.html
http://mechanize.rubyforge.org/Mechanize/FileSaver.html
https://github.com/yob/pdf-reader
EDIT:
I found the EOF marker somewhere in the middle of the downloaded file contents, followed by some HTML-looking stuff that I can't seem to figure out how to get rid of. I want to isolate the PDF content and then parse that, but still running into issues. Here is the full script I am using:
https://gist.github.com/s2t2/c6766846d024edd696586b2bc7fee0bf
The issue seems to be with the website you're accessing: http://employmentsummary.abaquestionnaire.org
They add HTML data at the end of the response.
However, you could truncate the response by searching for the first %%EOF marker and removing all the data after it.
i.e.:
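pdf_data = result.body
eof_marker = "%%EOF"
eof_index = pdf_data.index(eof_marker)
if eof_index.nil?
  # handle error: the marker was not found
else
  # keep everything up to and including the marker, dropping the trailing HTML
  pdf_data = pdf_data[0, eof_index + eof_marker.length]
  # save/send pdf_data
end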
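The same truncation idea, sketched in Python for comparison (the thread's own code is Ruby). trim_after_eof is a hypothetical helper. It assumes the unwanted data is appended after the first %%EOF marker, which matches the response described above but not PDFs saved with incremental updates, which can legitimately contain several markers:

```python
EOF_MARKER = b"%%EOF"


def trim_after_eof(data):
    """Return `data` cut just after the first %%EOF marker.

    Raises ValueError if the marker is missing.
    """
    index = data.find(EOF_MARKER)
    if index == -1:
        raise ValueError("no %%EOF marker found")
    # Keep the marker itself; drop whatever the server appended after it.
    return data[: index + len(EOF_MARKER)]
```

The trimmed bytes can then be written to disk in binary mode and handed to the PDF parser.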

How to convert PIL image file into string in python3.4?

I have been trying to read a JPEG file using PIL in Python 3.4, and I need to store its contents in a string. I have tried a few of the options suggested on this site, but none of them work. The following snippet, which I also found on this site, is what I have now:
from io import StringIO
fp = Image.open("images/login.jpg")
output = StringIO()
fp.save(output, format="JPEG")
contents = output.getvalue()
output.close()
But I am getting the following error:
TypeError: string argument expected, got 'bytes'
Could you please suggest what I have done wrong and how to get this working?
In Python 3 you should use a BytesIO because, as the Python docs put it:
StringIO is a native in-memory unicode container.
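A minimal sketch of the difference, using a few stand-in bytes rather than a real image so that it runs without Pillow. With Pillow, the buffer would be filled by fp.save(buffer, format="JPEG") instead of buffer.write(...). Base64 is one common way to turn the bytes into a str. That choice is an assumption, since the question does not say what the string is for, and bytes_to_text is a hypothetical name:

```python
import base64
from io import BytesIO, StringIO

# Stand-in for encoded image data (the first bytes of a JPEG file).
fake_jpeg = b"\xff\xd8\xff\xe0" + b"\x00" * 8

# StringIO only accepts str, so writing bytes fails with the error from the question.
try:
    StringIO().write(fake_jpeg)
    stringio_accepts_bytes = True
except TypeError:
    stringio_accepts_bytes = False

# BytesIO accepts bytes, which is what an image encoder produces.
buffer = BytesIO()
buffer.write(fake_jpeg)
contents = buffer.getvalue()
buffer.close()


def bytes_to_text(data):
    """Encode raw bytes as an ASCII-safe base64 string."""
    return base64.b64encode(data).decode("ascii")


text = bytes_to_text(contents)
```

base64.b64decode(text) recovers the original bytes when the image is needed again.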
Thanks a lot for the hint. I have actually found a different way of reading the image file and storing it in a string object in Python 2.x. Here is the code. Please let me know if there is any disadvantage to using this.
imgText = open("images/login.jpg", 'rb')
imgTextStr = imgText.read()
imgText.close()

Win32com Save PDF to XML with Acrobat Pro X > com_error "-2147467263, 'Not implemented'" [duplicate]

This question already has answers here:
"Not implemented" Exception when using pywin32 to control Adobe Acrobat
(2 answers)
Closed 6 years ago.
Python 2.7 (r27:82525, Jul 4 2010, 09:01:59) [MSC v.1500 32 bit (Intel)] on win32
Windows XP SP3
Python 2.7 pywin32-218
Adobe Acrobat X 10.0.0
I want to use Python to automate Acrobat Pro to export a PDF to XML. I already tried it manually using the 'Save As' dialog box from the running program and now want to do it via a Python script. I have read many pages including parts of the Adobe SDK, SDK Forum, VB Forums and am having no luck.
I read Blish's problem here: "Not implemented" Exception when using pywin32 to control Adobe Acrobat
And this page: timgolden python/win32_how_do_i/generate-a-static-com-proxy.html
I am missing something. My code is:
import win32com.client
import win32com.client.makepy
win32com.client.makepy.GenerateFromTypeLibSpec('Acrobat')
adobe = win32com.client.DispatchEx('AcroExch.App')
avDoc = win32com.client.DispatchEx('AcroExch.AVDoc')
avDoc.Open(r'C:\Documents and Settings\PC\Desktop\a_PDF.pdf', r'C:\Documents and Settings\PC\Desktop')
pdDoc = avDoc.GetPDDoc()
jObject = pdDoc.GetJSObject()
jObject.SaveAs(r'C:\Documents and Settings\PC\Desktop\a_PDF.xml', "com.adobe.acrobat.xml-1-00")
The full error is:
Traceback (most recent call last):
File "<pyshell#31>", line 1, in <module>
jObject.SaveAs('C:\Documents and Settings\PC\Desktop\a_PDF.xml', "com.adobe.acrobat.xml-1-00")
File "C:\Python27\lib\site-packages\win32com\client\dynamic.py", line 511, in __getattr__
ret = self._oleobj_.Invoke(retEntry.dispid,0,invoke_type,1)
com_error: (-2147467263, 'Not implemented', None, None)
I'm guessing it has to do with makepy, but I don't understand how to use it in my code.
I pulled this line from my code and got the same error when I ran it:
win32com.client.makepy.GenerateFromTypeLibSpec('Acrobat')
I then changed these two lines from 'DispatchEx' to 'Dispatch' and got the same error:
adobe = win32com.client.Dispatch('AcroExch.App')
avDoc = win32com.client.Dispatch('AcroExch.AVDoc')
When I run the Dispatches by themselves and then call them back I get:
>>> adobe = win32com.client.DispatchEx('AcroExch.App')
>>> adobe
<win32com.gen_py.Adobe Acrobat 10.0 Type Library.CAcroApp instance at 0x18787784>
>>> avDoc = win32com.client.Dispatch('AcroExch.AVDoc')
>>> avDoc
<win32com.gen_py.Adobe Acrobat 10.0 Type Library.CAcroAVDoc instance at 0x20365224>
Does this mean I should make only one call to Dispatch? I pulled:
adobe = win32com.client.Dispatch('AcroExch.App')
and got the same error.
This Adobe site says:
AVDoc
Product availability: Acrobat, Reader
Platform availability: Macintosh, Windows, UNIX
Syntax
typedef struct _t_AVDoc* AVDoc;
A view of a PDF document in a window. There is one AVDoc per displayed document. Unlike a PDDoc, an AVDoc has a window associated with it.
acrobat_sdk/9.1/Acrobat9_1_HTMLHelp/API_References/Acrobat_API_Reference/AV_Layer/AVDoc.html#AVDocSaveParams
The PDDoc page says:
A PDDoc object represents a PDF document. There is a correspondence between a PDDoc and an ASFile. Also, every AVDoc has an associated PDDoc, although a PDDoc may not be associated with an AVDoc.
/9.1/Acrobat9_1_HTMLHelp/API_References/Acrobat_API_Reference/PD_Layer/PDDoc.html
I tried the following code and also got the same error:
import win32com.client
import win32com.client.makepy
pdDoc = win32com.client.Dispatch('AcroExch.PDDoc')
pdDoc.Open(r'C:\Documents and Settings\PC\Desktop\a_PDF.pdf')
jObject = pdDoc.GetJSObject()
jObject.SaveAs(r'C:\Documents and Settings\PC\Desktop\a_PDF.xml', "com.adobe.acrobat.xml-1-00")
Same error if I change:
pdDoc = win32com.client.Dispatch('AcroExch.PDDoc')
to
pdDoc = win32com.client.gencache.EnsureDispatch('AcroExch.PDDoc')
like here: win32com.client.Dispatch works but not win32com.client.gencache.EnsureDispatch
user2993272, you were almost there: just one more line and the code you have should have worked flawlessly.
I'm going to attempt to answer in the same spirit as your question and provide as much detail as I can.
This thread holds the key to the solution you are looking for: https://mail.python.org/pipermail/python-win32/2002-March/000260.html
I admit that the post is not the easiest to find (perhaps Google scores it low based on the age of the content?).
Specifically, applying this piece of advice will get things running for you: https://mail.python.org/pipermail/python-win32/2002-March/000265.html
For completeness, this piece of code should get the job done and not require you to manually patch dynamic.py (snippet should run pretty much out of the box):
# gets all files under ROOT_INPUT_PATH with INPUT_FILE_EXTENSION and tries to extract text from them into ROOT_OUTPUT_PATH with same filename as the input file but with INPUT_FILE_EXTENSION replaced by OUTPUT_FILE_EXTENSION
from win32com.client import Dispatch
from win32com.client.dynamic import ERRORS_BAD_CONTEXT
import winerror
# try importing scandir and if found, use it as it's a few orders of magnitude faster than stock os.walk
try:
    from scandir import walk
except ImportError:
    from os import walk
import fnmatch
import sys
import os
ROOT_INPUT_PATH = None
ROOT_OUTPUT_PATH = None
INPUT_FILE_EXTENSION = "*.pdf"
OUTPUT_FILE_EXTENSION = ".txt"
def acrobat_extract_text(f_path, f_path_out, f_basename, f_ext):
    avDoc = Dispatch("AcroExch.AVDoc") # Connect to Adobe Acrobat
    # Open the input file (as a pdf)
    ret = avDoc.Open(f_path, f_path)
    assert(ret) # FIXME: Documentation says "-1 if the file was opened successfully, 0 otherwise", but this is a bool in practice?
    pdDoc = avDoc.GetPDDoc()
    dst = os.path.join(f_path_out, ''.join((f_basename, f_ext)))
    # Adobe documentation says "For that reason, you must rely on the documentation to know what functionality is available through the JSObject interface. For details, see the JavaScript for Acrobat API Reference"
    jsObject = pdDoc.GetJSObject()
    # Here you can save as many other types by using, for instance: "com.adobe.acrobat.xml"
    jsObject.SaveAs(dst, "com.adobe.acrobat.accesstext")
    pdDoc.Close()
    avDoc.Close(True) # We want this to close Acrobat, as otherwise Acrobat is going to refuse processing any further files after a certain threshold of open files are reached (for example 50 PDFs)
    del pdDoc
if __name__ == "__main__":
    assert(5 == len(sys.argv)), sys.argv # <script name>, <script_file_input_path>, <script_file_input_extension>, <script_file_output_path>, <script_file_output_extension>
    #$ python get.txt.from.multiple.pdf.py 'C:\input' '*.pdf' 'C:\output' '.txt'
    ROOT_INPUT_PATH = sys.argv[1]
    INPUT_FILE_EXTENSION = sys.argv[2]
    ROOT_OUTPUT_PATH = sys.argv[3]
    OUTPUT_FILE_EXTENSION = sys.argv[4]
    # tuples are of schema (path_to_file, filename)
    matching_files = ((os.path.join(_root, filename), os.path.splitext(filename)[0]) for _root, _dirs, _files in walk(ROOT_INPUT_PATH) for filename in fnmatch.filter(_files, INPUT_FILE_EXTENSION))
    # Magic piece of code that should get everything working for you!
    # patch ERRORS_BAD_CONTEXT as per https://mail.python.org/pipermail/python-win32/2002-March/000265.html
    global ERRORS_BAD_CONTEXT
    ERRORS_BAD_CONTEXT.append(winerror.E_NOTIMPL)
    for filename_with_path, filename_without_extension in matching_files:
        print "Processing '{}'".format(filename_without_extension)
        acrobat_extract_text(filename_with_path, ROOT_OUTPUT_PATH, filename_without_extension, OUTPUT_FILE_EXTENSION)
I have tested this on WinPython x64 2.7.6.3, Acrobat X Pro

Mac Office 2011 VBA - calling a server script

I'm porting a large VBA project over from Windows to the new Mac Word 2011. It's actually going very well...almost all of the code is working.
My code needs to call scripts on my server. On Windows, I call the system function InternetOpenUrl to call a script and InternetReadFile to read the results returned by the script. For example, I call a script like:
http://www.mysite.com/cgi-bin/myscript.pl?param1=Hello&param2=World
and it returns a string like "Success".
What's the best way to do the equivalent on the Mac? Is using Applescript (via the vba MacScript function) the answer? I do that to display the file chooser dialog, but I can't find what the applescript to call an online script would look like. Or is there a better/faster way to do this?
Thanks in advance,
gary
You can try the URL Access Scripting library, which is a front-end for curl, or open the script in a browser and read the text from there.
I recently figured this out for making a call to a server to convert a user-defined LaTeX string to an image of the equation. The call is made through VBA via the MacScript command as:
command = "do shell script """ & pyPath & "python " & getURLpath & "getURL.py --formula '" _
& Latex_Str & "' --fontsize " & Font_Size & " " & WebAdd & """"
result = MacScript(command)
This looks ugly, but it just builds the command do shell script /usr/bin/python {path to script}/getURL.py --formula '{LaTeX formula string}' --fontsize {int} {myurl} and passes it to MacScript. My Python script then uses argparse to parse the arguments sent to it, and urllib and urllib2 to handle sending the request to the server. The MacScript command reads the stdout of my Python script and returns it as a string in result.
This guide on urllib2 should help you get the Python script up and running.
EDIT: Sorry, my answer was incomplete last time. The Python script I used to finish the job is below.
# Import the required libraries
from urllib import urlencode
from urllib2 import Request, urlopen, URLError, ProxyHandler, build_opener, install_opener
import argparse
# Set up our argument parser
parser = argparse.ArgumentParser(description='Sends LaTeX string to web server and returns meta data used by LaTeX in Word project')
parser.add_argument('webAddr', type=str, help='Web address of LaTeX in Word server')
parser.add_argument('--formula', metavar='FRML', type=str, help='A LaTeX formula string')
parser.add_argument('--fontsize', metavar='SIZE', type=int, default=10, help='Integer representing font size (can be 10, 11, or 12. Default 10)')
parser.add_argument('--proxServ', metavar='SERV', type=str, help='Web address of proxy server, i.e. http://proxy.server.com:80')
parser.add_argument('--proxType', metavar='TYPE', type=str, default='http', help='Type of proxy server, i.e. http')
# Get the arguments from the parser
args = parser.parse_args()
# Define formula string if input
if args.formula:
    values = {'formula': str(args.fontsize) + '.' + args.formula} # generate formula from args
else:
    values = {}
# Define proxy settings if proxy server is input.
if args.proxServ: # set up the proxy server support
    proxySupport = ProxyHandler({args.proxType: args.proxServ})
    opener = build_opener(proxySupport)
    install_opener(opener)
# Set up the data object
data = urlencode(values)
data = data.encode('utf-8')
# Send request to the server and receive response, with error handling!
try:
    req = Request(args.webAddr, data)
    # Read the response and print to a file
    response = urlopen(req)
    print response.read()
except URLError, e:
    if hasattr(e, 'reason'): # URL error case
        # a tuple containing error code and text error message
        print 'Error: Failed to reach a server.'
        print 'Reason: ', e.reason
    elif hasattr(e, 'code'): # HTTP error case
        # HTTP error code, see section 10 of RFC 2616 for details
        print 'Error: The server could not fulfill the request.'
        print 'Error code: ', e.code
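The script above targets Python 2. On Python 3 the same names live in urllib.parse, urllib.request, and urllib.error. Here is a minimal sketch of just the request-building step under that assumption. build_request is a hypothetical helper, and the formula field layout is copied from the script above:

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_request(web_addr, formula=None, fontsize=10):
    """Build the POST request for the server without sending it."""
    values = {}
    if formula:
        # Same "<fontsize>.<formula>" layout as the Python 2 script.
        values["formula"] = str(fontsize) + "." + formula
    # urlencode returns str; Request needs bytes for the POST body.
    data = urlencode(values).encode("utf-8")
    return Request(web_addr, data)
```

Sending it is then urlopen(build_request(...)).read(), with urlopen imported from urllib.request and URLError caught from urllib.error, as in the original error handling.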

Resources