Process Facebook Messenger URL image

I'm trying to process this image provided by messenger-platform API (send-api-reference)
I used:
url = "https://scontent-lht6-1.xx.fbcdn.net/v/t34.0-12/20916840_10214193209010537_198030613_n.jpg?_nc_ad=z-m&oh=3eab9a3a400c7e05fb5b74c391852426&oe=5998B9A8"
@app.route('/photobot/<path:photo_url>')
def tensor_photobot(photo_url):
    file = cStringIO.StringIO(urllib.urlopen(photo_url).read())
    img = Image.open(file)
    if img:
        list_elements = process_image(img)
        return json.dumps(list_elements)
But the image is not recognized. Any idea?
Message:
{u'mid': u'mid.$cAAbv-uhIfdVkIn9OVld8TqA6u2Hz', u'seq': 40125,
u'attachments': [{u'type': u'image', u'payload': {u'url':
u'https://scontent-lht6-1.xx.fbcdn.net/v/t34.0-12/20916840_10214193209010537_198030613_n.jpg?_nc_ad=z-m&oh=3eab9a3a400c7e05fb5b74c391852426&oe=5998B9A8'}}]}
[Reference][1] python 2.x
[1]: https://developers.facebook.com/docs/messenger-platform/send-api-reference/image-attachment
Edit: following comment recommendations, I detected the problem is from url-string truncation.
I added all the implementation for more context.

From my comment in case the answer is needed by anyone in the future:
The query string is being truncated from the URL. To load the image, the entire URL including the query string is required.
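A minimal Python 3 sketch of why the query string goes missing, assuming the image URL is appended unencoded to the /photobot/ route. The localhost:5000 host and the path_seen_by_route helper are hypothetical, for illustration only. The first '?' in a request URL ends the path, so the route only receives what comes before it; percent-encoding the image URL first keeps it in one piece:

```python
from urllib.parse import quote, unquote, urlsplit

IMAGE_URL = ("https://scontent-lht6-1.xx.fbcdn.net/v/t34.0-12/"
             "20916840_10214193209010537_198030613_n.jpg"
             "?_nc_ad=z-m&oh=3eab9a3a400c7e05fb5b74c391852426&oe=5998B9A8")


def path_seen_by_route(request_url, prefix="/photobot/"):
    """Return the decoded part of the request path that follows the route prefix."""
    path = urlsplit(request_url).path
    return unquote(path[len(prefix):])


# Unencoded: the '?' of the image URL starts the query string of the outer request,
# so everything after it never reaches the <path:photo_url> argument.
naive_request = "http://localhost:5000/photobot/" + IMAGE_URL
truncated = path_seen_by_route(naive_request)

# Percent-encoding every reserved character keeps the whole image URL in the path.
encoded_request = "http://localhost:5000/photobot/" + quote(IMAGE_URL, safe="")
recovered = path_seen_by_route(encoded_request)

print(truncated)
print(recovered == IMAGE_URL)
```

Some servers and proxies treat encoded slashes specially, so passing the image URL as an ordinary query parameter may be a more robust alternative.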

Related

How to get the downloaded xlsx file from the api endpoint in karate?

I have an endpoint that downloads an xlsx file. In my test, I need to check the content of the file (not comparing the file with another file, but reading the content and checking). I am using karate framework for testing and I am trying to use apache POI for working with the excel sheet. However, the response I get from karate when calling the download endpoint is a String. For creating an excel file with POI I need an InputStream or the path to the actual file. I have tried the conversion, but it does not work.
I guess I am missing some connection here, or maybe the conversion is bad, I am new to karate and to the whole thing.
I appreciate any help, thanks!
Given url baseUrl
Given path downloadURI
When method GET
Then status 200
And match header Content-disposition contains 'attachment'
And match header Content-disposition contains 'example.xlsx'
And match header Content-Type == 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'
* def value= FileChecker.createExcelFile(response)
* print value
And the Java code:
public static String createExcelFile(String excel) throws IOException, InvalidFormatException {
    InputStream stream = IOUtils.toInputStream(excel, Charset.forName("UTF-8"));
    Workbook workbook = WorkbookFactory.create(stream);
    return ("Workbook has " + workbook.getNumberOfSheets() + " Sheets : ");
}
When running the scenario, I get the following error:
javascript evaluation failed: FileChecker.createExcelFile(response), java.io.IOException: Failed to read zip entry source
When testing the same endpoint in Postman, I am getting a valid excelsheet.
In Karate 0.9.X onwards you have a responseBytes variable which will be raw bytes, which may be what you need.
* def value = FileChecker.createExcelFile(responseBytes)
And you can change your method signature to be:
public static String createExcelFile(byte[] excel) {}
You should be easily able to convert a byte array to an InputStream and take it from there.
P.S. just saying that it "works in Postman" is not helpful :P
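To illustrate why the raw bytes matter, here is a small sketch in Python (rather than Java), assuming the response body was decoded as UTF-8 text before being handed to the helper. An .xlsx file is a zip archive, so the sketch builds a tiny stand-in archive in memory; the entry name and payload are made up for the example. Decoding the archive as text and encoding it again does not give back the same bytes:

```python
import io
import zipfile

# Build a small zip in memory as a stand-in for the downloaded .xlsx.
# ZIP_STORED keeps the payload bytes verbatim inside the archive.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("xl/media/image1.bin", bytes(range(256)))
original = buf.getvalue()


def entry_names(raw):
    """List the entries of a zip archive held in memory as bytes."""
    with zipfile.ZipFile(io.BytesIO(raw)) as zf:
        return zf.namelist()


# Treating the body as UTF-8 text and converting back alters the data:
# every byte sequence that is not valid UTF-8 gets replaced.
as_text = original.decode("utf-8", errors="replace")
round_tripped = as_text.encode("utf-8")

print(entry_names(original))        # the raw bytes open as an archive
print(round_tripped == original)    # the text round trip changed the bytes
```

This is the kind of corruption that could explain the "Failed to read zip entry source" error, which is why working from the byte array is the safer route.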
To download a zip file in a Karate test as a binary byte array:
Scenario: To verify and get the ADCI Uri from deployment
Given url basicURL + DeployUri +ArtifactUri
And headers {authorization:'#(authToken)',accept:'application/json',tenant:'#(tenantUUId)',Content-Type:'application/zip'}
When method get
Then status 200
And def responsebytes = responseBytes

requests.get() not retrieving correct url in python 2.7

I'm trying to access a URL and then parse its contents based on tags.
My code:
page = requests.get('https://support.apple.com/downloads/')
self.tree = html.fromstring(page.content)
names = self.tree.xpath("//span[@class='truncate_name']//text()")
Problem: the variable page contains the data of the URL 'https://support.apple.com/' instead.
I'm new to Python 2.7 and to the whole encoding issue in files. I'm using unicode-escape as my default encoding. The encoding of the resource at https://support.apple.com/downloads/ is utf-8, whereas the encoding of the resource at https://support.apple.com/ varies. Does this have something to do with the problem? Please suggest a solution.
It has nothing to do with encoding. What you are looking for is created dynamically, so it is not in the source you get back; a series of AJAX calls populates the data. To get the product names etc. from the carousel where you see span.truncate_name in your browser:
params = {"page": "products",
"locale": "en_US",
"doctype": "DOWNLOADS",
}
js = requests.get("https://km.support.apple.com/kb/index", params=params).content
Normally we could call .json() on the response object but in this case we need to use "unicode_escape" then call loads:
from json import loads, dumps
js2 = loads(js.decode("unicode_escape"))
print(js2)
Which gives you a huge dict of data like:
{u'products': [{u'name': u'Servers and Enterprise', u'urlpath': u'serversandenterprise', u'order': u'', u'products': .............
You can see the request in chrome tools:
We leave off callback:ACDownloadSearch.customCallBack as we want to get back valid JSON.
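A minimal Python 3 sketch of the decode-then-load step, using a made-up sample body (not the real Apple response) that contains a backslash escape JSON does not allow. The parse_escaped_json helper is a hypothetical name. Note that "unicode_escape" reads the bytes as Latin-1 and also resolves escaped quotes, so it only suits bodies that are plain ASCII without `\"` sequences:

```python
import json

# Hypothetical sample body: "\x26" is a Python-style escape, not a valid JSON one.
raw = (b'{"products": [{"name": "Servers \\x26 Enterprise", '
       b'"urlpath": "serversandenterprise"}]}')


def parse_escaped_json(body):
    """Resolve backslash escapes first, then parse the result as JSON."""
    return json.loads(body.decode("unicode_escape"))


# Parsing the body directly fails on the "\x" escape.
try:
    json.loads(raw.decode("ascii"))
    plain_parse_ok = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    plain_parse_ok = False

data = parse_escaped_json(raw)
print(plain_parse_ok)
print(data["products"][0]["name"])
```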

PdfBox: PDF/A-1A to PDF/A-3A

I have the following problem:
I want to transform a PDF/A-1A document to PDF/A-3A.
The original document is validated by Acrobat Reader Pro, so I can assume it is PDF/A-1A conformant.
I try to convert the PDF metadata with the following code:
private PDDocumentCatalog makeA3compliant(PDDocument doc) throws IOException, TransformerException {
    PDDocumentCatalog cat = doc.getDocumentCatalog();
    PDMetadata metadata = new PDMetadata(doc);
    cat.setMetadata(metadata);
    XMPMetadata xmp = new XMPMetadata();
    XMPSchemaPDFAId pdfaid = new XMPSchemaPDFAId(xmp);
    xmp.addSchema(pdfaid);
    XMPSchemaDublinCore dc = xmp.addDublinCoreSchema();
    String creator = "TestCr";
    String producer = "testPr";
    dc.addCreator(creator);
    dc.setAbout("");
    XMPSchemaBasic xsb = xmp.addBasicSchema();
    xsb.setAbout("");
    xsb.setCreatorTool(creator);
    xsb.setCreateDate(GregorianCalendar.getInstance());
    PDDocumentInformation pdi = new PDDocumentInformation();
    pdi.setProducer(producer);
    pdi.setAuthor(creator);
    doc.setDocumentInformation(pdi);
    XMPSchemaPDF pdf = xmp.addPDFSchema();
    pdf.setProducer(producer);
    pdf.setAbout("");
    PDMarkInfo markinfo = new PDMarkInfo();
    markinfo.setMarked(true);
    doc.getDocumentCatalog().setMarkInfo(markinfo);
    pdfaid.setPart(3);
    pdfaid.setConformance("A");
    pdfaid.setAbout("");
    metadata.importXMPMetadata(xmp);
    return cat;
}
If I try to validate the new file with Acrobat again, I get a validation error:
CIDset in subset font is incomplete (font contains glyphs that are not listed)
If I try to validate the file with this online validator (http://www.pdf-tools.com/pdf/validate-pdfa-online.aspx), it is reported as a valid PDF/A-3A.
Am I missing something?
is nobody able to help?
EDIT: Here is the PDF file
This worked for us to be fully PDF/A-3 compliant regarding the CIDset issue:
private void removeCidSet(PDDocumentCatalog catalog) {
    COSName cidSet = COSName.getPDFName("CIDSet");
    // iterate over all pdf pages
    for (Object object : catalog.getAllPages()) {
        if (object instanceof PDPage) {
            PDPage page = (PDPage) object;
            Map<String, PDFont> fonts = page.getResources().getFonts();
            Iterator<String> iterator = fonts.keySet().iterator();
            // iterate over all fonts
            while (iterator.hasNext()) {
                PDFont pdFont = fonts.get(iterator.next());
                if (pdFont instanceof PDType0Font) {
                    PDType0Font typedFont = (PDType0Font) pdFont;
                    if (typedFont.getDescendantFont() instanceof PDCIDFontType2Font) {
                        PDCIDFontType2Font f = (PDCIDFontType2Font) typedFont.getDescendantFont();
                        PDFontDescriptor fontDescriptor = f.getFontDescriptor();
                        if (fontDescriptor instanceof PDFontDescriptorDictionary) {
                            PDFontDescriptorDictionary fontDict = (PDFontDescriptorDictionary) fontDescriptor;
                            fontDict.getCOSDictionary().removeItem(cidSet);
                        }
                    }
                }
            }
        }
    }
}
OK - I think I have an answer on your question from the perspective of the callas and/or Adobe technology (and once more, I'm affiliated with callas and its pdfToolbox technology that is also used inside of Acrobat).
According to my research and the people I consulted, your example PDF document contains a font with a CID character set that is incomplete. Why does pdfToolbox or Acrobat say it's a valid PDF/A-1a file but not a valid PDF/A-3a file? Interesting question:
1) The rules for incomplete CID sets changed between PDF/A-1a and PDF/A-3a. They are stricter in PDF/A-3a.
2) But while in PDF/A-1a a CID set always had to be there, in PDF/A-3a you can have a valid, compliant file, without such a CID set.
So, your PDF file contains a CID set (which makes it valid for PDF/A-1a and A-3a), but while that CID set is fine for A-1a, it does not contain all the characters needed to be A-3a compliant.
To test at least part of this theory, I processed your file through pdfToolbox with a fixup entitled "Remove CIDset if incomplete". That correction (as the name implies) removes the CID set from the file but doesn't change anything else. After doing so your file validates as a valid A-3a file.
That leaves the question why the pdftools web site claims this is a valid PDF/A-3a file; according to the people I've spoken to, the result from preflight for this file is correct and there should be an error on this file. So perhaps that's something you need to take up with the pdftools guys (and they possibly with callas to figure out who's finally right).
Feel free to send me a personal message if you want to discuss this further - more discussion on the tools themselves probably becomes off-topic for this public site.

Importing binary data to parse.com

I'm trying to import data to parse.com so I can test my application (I'm new to parse and I've never used json before).
Can you please give me an example of a json file that I can use to import binary files (images) ?
NB: I'm trying to upload my data in bulk directly from the Data Browser. Here is a screencap: i.stack.imgur.com/bw9b4.png
In the Parse docs, I think two sections could help you, depending on whether you want to use the REST API or the Android SDK.
REST API - see the section on POST and uploading files; files can be uploaded to Parse using a REST POST.
SDK - see the section on "files".
Code for REST includes the following:
Use some HttpClient implementation that has a "ByteArrayEntity" class or something similar.
Map your image to a ByteArrayEntity and POST it with the correct MIME type header in HttpClient...
case POST:
    HttpPost httpPost = new HttpPost(url); // url ends in "audio" or "pic"
    httpPost.setProtocolVersion(new ProtocolVersion("HTTP", 1,1));
    httpPost.setConfig(this.config);
    if ( mfile.canRead() ){
        FileInputStream fis = new FileInputStream(mfile);
        FileChannel fc = fis.getChannel(); // Get the file's size and then map it into memory
        int sz = (int)fc.size();
        MappedByteBuffer bb = fc.map(FileChannel.MapMode.READ_ONLY, 0, sz);
        data2 = new byte[bb.remaining()];
        bb.get(data2);
        ByteArrayEntity reqEntity = new ByteArrayEntity(data2);
        httpPost.setEntity(reqEntity);
        fis.close();
    }
    ,,,
    request.addHeader("Content-Type", "image/*") ;
The above is pseudocode; post the runnable to execute the HTTP request.
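The same idea, raw image bytes as the request body plus a MIME type header, can be sketched in Python 3 with only the standard library. This is an illustration rather than Parse's documented API: the endpoint URL, the build_image_post helper, and the payload are all hypothetical, and any authentication headers a real service requires are left out. The request is built but not sent:

```python
from urllib.request import Request


def build_image_post(url, image_bytes, mime_type="image/jpeg"):
    """Build (but do not send) a POST request whose body is the raw image bytes."""
    return Request(
        url,
        data=image_bytes,
        headers={"Content-Type": mime_type},
        method="POST",
    )


# Hypothetical endpoint and stand-in payload, for illustration only.
payload = b"\xff\xd8\xff\xe0" + b"\x00" * 16
req = build_image_post("https://example.invalid/files/pic.jpg", payload)

# urllib.request.urlopen(req) would send it; omitted here to avoid network access.
print(req.get_method(), req.full_url, len(req.data))
```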
The only binary data allowed to be loaded to parse.com are images. In other cases like files or streams .. etc the most suitable solution is to store a link to the binary data in another dedicated storage for such type of information.

Jython, ImageInfo

I am trying to use ImageInfo and Jython to get information from an image on my hard drive.
I have imported the module fine but keep getting this error:
TypeError: setInput(): expected 2 args; got 1
And this is the code I am trying to use:
filename = "C:\\image.jpg"
img = ImageInfo.setInput(filename)
Could anyone point out what I am doing wrong?
Cheers
Eef
The missing argument Jython complains about is the ImageInfo object itself, which doesn't exist yet. You must construct it first. So:
filename = "C:\\image.jpg"
ii = ImageInfo()
img = ii.setInput(filename)
or
filename = "C:\\image.jpg"
img = ImageInfo().setInput(filename)
may work also.
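The same mistake can be reproduced in plain Python 3 with a stand-in class; ImageInfoStandIn is hypothetical and only mimics the shape of the call, not the real ImageInfo library. Calling an instance method on the class itself leaves one argument unfilled, while calling it on a constructed object works:

```python
class ImageInfoStandIn:
    """Hypothetical stand-in with one instance method, like ImageInfo.setInput."""

    def setInput(self, source):
        self.source = source


filename = "C:\\image.jpg"

# Called on the class: the filename is taken as 'self', so 'source' is missing.
try:
    ImageInfoStandIn.setInput(filename)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Called on an instance: the object fills 'self' and the filename fills 'source'.
info = ImageInfoStandIn()
info.setInput(filename)

print(error_message)
print(info.source)
```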

Resources