I'm having trouble saving the output given by the Google Vision API. I'm using Python and testing with a demo image. I get the following error:
TypeError: [mid: ...] is not JSON serializable
Code that I executed:
import io
import os
import json
# Imports the Google Cloud client library
from google.cloud import vision
from google.cloud.vision import types

# Instantiates a client
vision_client = vision.ImageAnnotatorClient()

# The name of the image file to annotate
file_name = os.path.join(
    os.path.dirname(__file__),
    'demo-image.jpg')  # Your image path from current directory

# Loads the image into memory
with io.open(file_name, 'rb') as image_file:
    content = image_file.read()
image = types.Image(content=content)

# Performs label detection on the image file
response = vision_client.label_detection(image=image)
labels = response.label_annotations

print('Labels:')
for label in labels:
    print(label.description, label.score, label.mid)

with open('labels.json', 'w') as fp:
    json.dump(labels, fp)
The output appears on the screen, but I do not know how to save it to a file. Does anyone have any suggestions?
FYI to anyone seeing this in the future: google-cloud-vision 2.0.0 has switched to using proto-plus, which uses different serialization/deserialization code. A possible error you can get if upgrading to 2.0.0 without changing the code is:
object has no attribute 'DESCRIPTOR'
Using google-cloud-vision 2.0.0 and protobuf 3.13.0, here is an example of how to serialize and deserialize (the example includes both JSON and protobuf):
import io, json
from google.cloud import vision_v1
from google.cloud.vision_v1 import AnnotateImageResponse
with io.open('000048.jpg', 'rb') as image_file:
    content = image_file.read()
image = vision_v1.Image(content=content)
client = vision_v1.ImageAnnotatorClient()
response = client.document_text_detection(image=image)
# serialize / deserialize proto (binary)
serialized_proto_plus = AnnotateImageResponse.serialize(response)
response = AnnotateImageResponse.deserialize(serialized_proto_plus)
print(response.full_text_annotation.text)
# serialize / deserialize json
response_json = AnnotateImageResponse.to_json(response)
response = json.loads(response_json)
print(response['fullTextAnnotation']['text'])
Note 1: proto-plus doesn't support converting to snake_case names, which protobuf supports with preserving_proto_field_name=True. So currently there is no way around the field names being converted from response['full_text_annotation'] to response['fullTextAnnotation'].
There is a (now closed) feature request for this: googleapis/proto-plus-python#109
Note 2: The Google Vision API doesn't return an x coordinate if x=0. If x doesn't exist, the protobuf will default to x=0. In python vision 1.0.0, using MessageToJson(), these x values weren't included in the JSON, but now with python vision 2.0.0 and .to_json() these values are included as x:0.
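For completeness, since the original question was about saving the output, here is a minimal sketch of persisting the serialized response to disk and reading it back (the file name response.json is just illustrative):

# Write the JSON string produced by AnnotateImageResponse.to_json() to a file
with open('response.json', 'w') as f:
    f.write(response_json)

# Read it back into a plain Python dict
with open('response.json') as f:
    reloaded = json.loads(f.read())
print(reloaded['fullTextAnnotation']['text'])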
Maybe you were already able to find a solution to your issue (if that is the case, I invite you to share it as an answer to your own post too), but in any case, let me share some notes that may be useful for other users with a similar issue:
As you can check using the type() function in Python, response is an object of type google.cloud.vision_v1.types.AnnotateImageResponse, while labels[i] is an object of type google.cloud.vision_v1.types.EntityAnnotation. Neither seems to have an out-of-the-box implementation to transform them to JSON, as you are trying to do, so I believe the easiest way to transform each EntityAnnotation in labels is to turn them into Python dictionaries, then group them all into an array, and transform this into a JSON.
To do so, I have added some simple lines of code to your snippet:
[...]
label_dicts = []  # Array that will contain all the EntityAnnotation dictionaries

print('Labels:')
for label in labels:
    # Write each label (EntityAnnotation) into a dictionary
    # (named label_dict so it does not shadow the built-in dict)
    label_dict = {'description': label.description, 'score': label.score, 'mid': label.mid}
    # Populate the array
    label_dicts.append(label_dict)

with open('labels.json', 'w') as fp:
    json.dump(label_dicts, fp)
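An equivalent, more compact way to build the same array is a list comprehension:

# Same result as the loop above, in a single expression
label_dicts = [{'description': label.description, 'score': label.score, 'mid': label.mid}
               for label in labels]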
There is a helper in the protobuf library released by Google:
from google.protobuf.json_format import MessageToJson
webdetect = vision_client.web_detection(blob_source)
jsonObj = MessageToJson(webdetect)
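A minimal sketch of how this could be combined with the label-detection snippet from the question (this applies to the pre-2.0 client, where responses are plain protobuf messages; the file name is illustrative):

from google.protobuf.json_format import MessageToJson

# Serialize the whole protobuf response to a JSON string and persist it
response = vision_client.label_detection(image=image)
with open('labels.json', 'w') as fp:
    fp.write(MessageToJson(response))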
I was able to save the output with the following function:
# Save output as JSON
def store_json(json_input):
    with open(json_file_name, 'a') as f:
        f.write(json_input + '\n')
And as @dsesto mentioned, I had to define a dictionary. In this dictionary I defined what types of information I would like to save in my output.
with open(photo_file, 'rb') as image:
    image_content = base64.b64encode(image.read())
    service_request = service.images().annotate(
        body={
            'requests': [{
                'image': {
                    'content': image_content
                },
                'features': [{
                    'type': 'LABEL_DETECTION',
                    'maxResults': 20,
                }, {
                    'type': 'TEXT_DETECTION',
                    'maxResults': 20,
                }, {
                    'type': 'WEB_DETECTION',
                    'maxResults': 20,
                }]
            }]
        })
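Putting the two pieces together, a hypothetical call could then look like this (store_json and json_file_name are defined above; execute() is the standard way to run a request built with the discovery-based API client, and json is assumed to be imported):

# Execute the annotate request and persist the response dict as JSON
response = service_request.execute()
store_json(json.dumps(response))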
The objects in the current Vision library lack serialization functions (although adding them would be a good idea).
It is worth noting that Google is about to release a substantially different library for Vision (it is on master of the vision repo now, although not released to PyPI yet) where this will be possible. Note that it is a backwards-incompatible upgrade, so there will be some (hopefully not too much) conversion effort.
That library returns plain protobuf objects, which can be serialized to JSON using:
from google.protobuf.json_format import MessageToJson
serialized = MessageToJson(original)
You can also use something like protobuf3-to-dict to convert the protobuf message into a plain Python dictionary.
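For example, a minimal sketch using that package (assuming it has been installed with pip install protobuf3-to-dict):

from protobuf_to_dict import protobuf_to_dict

# Convert the protobuf message into a plain dict that json.dump can handle
original_dict = protobuf_to_dict(original)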
When having a look at my Couchbase view using the web API, I got this result:
{
  "total_rows": 18279385,
  "rows": []
}
But I'm using the Ruby couchbase gem as follows:
require 'couchbase'
c = Couchbase.connect(...)
sources = c.design_docs['Data']
pp sources.All
#<Couchbase::View:47373151271840 @endpoint="_design/Data/_view/All" @params={:connection_timeout=>75000}>
But how do I get the total_rows from the view? I've found some documentation that refers to a total_rows method, but it doesn't seem to be present at this point.
Anthony's comment solves the problem:
sources.All(limit: 0).fetch.total_rows
Use limit: 0 to speed up the request.
I would like to get help with the code in a ruby filter.
The logstash version is 5.0.0 alpha4.
I am testing the code in the ruby filter below, but I am getting _rubyexception.
ruby {
  code => "
    event['newfield'] = 'test'
  "
}
The ruby filter is defined inside filter { }.
The logstash.log shows the following:
{:timestamp=>"2016-08-03T15:26:47.291000+0900", :message=>"Ruby exception occurred: undefined method `[]=' for 2016-08-03T06:26:46.829Z test %{message}:LogStash::Event", :level=>:error}
I can't find the reason why the ruby filter is unable to use the event object. I would appreciate some help with this issue.
Thanks,
Yu
There is a new Event API in Logstash 5.0.0, and you now need to set fields on the event as follows:
ruby {
  code => "
    event.set('newfield', 'test')
  "
}
To get the value of a field by name:
ruby {
  code => "
    value = event.get('field_name')
  "
}
It's pretty straightforward, but I have this issue:
Unexpected error while processing request: undefined method `PresignedPost' for Aws::S3:Module
My objective: get a presigned URL for an object, to which I can then upload a file.
My Gemfile has
gem 'aws-sdk', '~> 2'
Code:
@@aws_creds = Aws::SharedCredentials.new(profile_name: profile)
Aws.config.update({region: 'us-west-2', credentials: @@aws_creds})
s3 = Aws::S3::Resource.new
@bucket = s3.bucket(bucketName)
form = Aws::S3::PresignedPost(:key => key)
if (form)
  form.fields
end
You normally don't do a standalone PresignedPost; you do it using the bucket method.
Something like: @bucket.presigned_post(:key => key)
See the docs:
http://docs.aws.amazon.com/sdkforruby/api/Aws/S3/PresignedPost.html
Note: Normally you do not need to construct a PresignedPost yourself.
See Bucket#presigned_post and Object#presigned_post.
I've been searching around for a bit and couldn't find anything that really helped me, especially because sometimes things don't seem to be consistent.
I have the following YAML that I use to store data/configuration stuff:
---
global:
  name: Core Config
  cfg_version: 0.0.1
  databases:
    main_database:
      name: Main
      path: ~/Documents/main.reevault
      read_only: false
...
I know how to update fields with:
cfg = YAML.load_file("test.yml")
cfg['global']['name'] = 'New Name'
File.open("test.yml", "w"){ |f| YAML.dump(cfg, f) }
And that's essentially all anybody on the internet talks about. However, here is my problem: I want to be able to dynamically add new fields to that file, e.g. under the "databases" section have a "second_db" field with its own name, path and read_only boolean. I would have expected to do that by just adding stuff to the hash:
cfg['global']['databases']['second_db'] = nil
cfg['global']['databases']['second_db']['name'] = "Secondary Database"
cfg['global']['databases']['second_db']['path'] = "http://someurl.remote/blob/db.reevault"
cfg['global']['databases']['second_db']['read_only'] = "true"
File.open("test.yml", "w"){ |f| YAML.dump(cfg, f) }
But I get this error:
`<main>': undefined method `[]=' for nil:NilClass (NoMethodError)
My question now is: how do I do this? Is there a way with the YAML interface, or do I have to write stuff into the file manually? I would prefer something via the YAML module, as it takes care of formatting/indentation for me.
Hope someone can help me.
You have to initialize cfg['global']['databases']['second_db'] to be a hash, not nil. Try this: cfg['global']['databases']['second_db'] = {} before the four assignments; once the key holds a hash, they will work.