How to convert json into data type that is supported by Azure Form Recognizer - azure-sdk-python

I'd like to convert a json into the data type that is supported by Azure Form Recognizer. I'm able to convert the data type into a dic and then into a json but I'm not able to do the opposite without analysing once again the document. How could I use the data type supported by Azure Form Recognizer without having to analyse the document more than one time?
Here is what I have.
endpoint = "endpoint"
key = "key"
# create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
document_analysis_client = DocumentAnalysisClient(endpoint=endpoint, credential=AzureKeyCredential(key))
# Extract text from doc using "prebuilt-document"
with open("file.pdf", "rb") as f:
poller = document_analysis_client.begin_analyze_document(
"prebuilt-document", document=f)
result = poller.result()
import json
form_pages = poller.result()
d = form_pages.to_dict()
json_string = json.dumps(d)
print(json_string)
data = json.loads(json_string)
poller1 = form_pages.from_dict(data)

What's the scenario for converting the JSON model representation back to the SDK model? The operations don't take the result models as an input, as a solution the original result model be stored somewhere until it needs to be used again in that case.
Also, for converting the model to JSON, it would be better to use the AzureJSONEncoder to that the SDK can properly serialize all types. For example:
from azure.core.serialization import AzureJSONEncoder
# save the dictionary as JSON content in a JSON file, use the AzureJSONEncoder
# to help make types, such as dates, JSON serializable
# NOTE: AzureJSONEncoder is only available with azure.core>=1.18.0.
with open('data.json', 'w') as f:
json.dump(analyze_result_dict, f, cls=AzureJSONEncoder)
Here is a link to the full sync sample: https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/formrecognizer/azure-ai-formrecognizer/samples/v3.2/sample_convert_to_and_from_dict.py

Related

SWIFT - String based Key-Value Array decoding?

I have a String based Key-Value Array inside of a String, and I want to decoded it and assign the value to an existing array in Swift 4.2. For example:
let array: [String:String] = []
let stringToDecode = “[\“Hello\”:\”World\”, \"Key\":\"Value\"]”
// I want ‘array’ to be assigned
// to the value that is inside
// ‘stringToDecode’
I’ve tried the JSON decoder, but it couldn’t decode it. Is there a simple way to do this? Thank you.
Try using a library like SwiftyJson, it makes working with json much easier.

Any ar js multimarkers learning tutorial?

I have been searching for ar.js multimarkers tutorial or anything that explains about it. But all I can find is 2 examples, but no tutorials or explanations.
So far, I understand that it requires to learn the pattern or order of the markers, then it stores it in localStorage. This data is used later to display the image.
What I don't understand, is how this "learner" is implemented. Also, the learning process is only used once by the "creator", right? The output file should be stored and then served later when needed, not created from scratch at each person's phone or computer.
Any help is appreciated.
Since the question is mostly about the learner page, I'll try to break it down as much as i can:
1) You need to have an array of {type, URL} objects.
A sample of creating the default array is shown below (source code):
var markersControlsParameters = [
{
type : 'pattern',
patternUrl : 'examples/marker-training/examples/pattern-files/pattern-hiro.patt',
},
{
type : 'pattern',
patternUrl : 'examples/marker-training/examples/pattern-files/pattern-kanji.patt',
}]
2) You need to feed this to the 'learner' object.
By default the above object is being encoded into the url (source) and then decoded by the learner site. What is important, happens on the site:
for each object in the array, an ArMarkerControls object is created and stored:
// array.forEach(function(markerParams){
var markerRoot = new THREE.Group()
scene.add(markerRoot)
// create markerControls for our markerRoot
var markerControls = new THREEx.ArMarkerControls(arToolkitContext, markerRoot, markerParams)
subMarkersControls.push(markerControls)
The subMarkersControls is used to create the object used to do the learning. At long last:
var multiMarkerLearning = new THREEx.ArMultiMakersLearning(arToolkitContext, subMarkersControls)
The example learner site has multiple utility functions, but as far as i know, the most important here are the ArMultiMakersLearning members which can be used in the following order (or any other):
// this method resets previously collected statistics
multiMarkerLearning.resetStats()
// this member flag enables data collection
multiMarkerLearning.enabled = true
// this member flag stops data collection
multiMarkerLearning.enabled = false
// To obtain the 'learned' data, simply call .toJSON()
var jsonString = multiMarkerLearning.toJSON()
Thats all. If you store the jsonString as
localStorage.setItem('ARjsMultiMarkerFile', jsonString);
then it will be used as the default multimarker file later on. If you want a custom name or more areas - then you'll have to modify the name in the source code.
3) 2.1.4 debugUI
It seems that the debug UI is broken - the UI buttons do exist but are nowhere to be seen. A hot fix would be using the 'markersAreaEnabled' span style for the div
containing the buttons (see this source bit).
It's all in this glitch, you can find it under the phrase 'CHANGES HERE' in the arjs code.

How to use Resolv::DNS::Resource::Generic

I would like to better understand how Resolv::DNS handles records that are not directly supported. These records are represented by the Resolv::DNS::Resource::Generic class, but I could not find documentation about how to get the data out of this record.
Specifically, my zone will contain SSHFP and TLSA records, and I need a way to get to that data.
Through reverse engineering, I found the answer - documenting it here for others to see.
Please note that this involves undocumented features of the Resolv::DNS module, and the implementation may change over time.
Resource Records that the Resolv::DNS module does not understand are represented not through the Generic class, but rather through a subclass whose name represents the type and class of the DNS response - for instance, an SSHFP record (type 44) will be represented as Resolv::DNS::Resource::Generic::Type44_Class1
The object will contain a method "data" that gives you access to the RDATA of the record, in plain binary format.
Thus, to access an SSHFP record, here is how to get it:
def handle_sshfp(rr) do
# the RDATA is a string but contains binary data
data = rr.data.bytes
algo = data[0].to_s
fptype = data[1].to_s
fp = data[2..-1].to_s
hex = fp.map{|b| b.to_s(16).rr.rjust(2,'0') }.join(':')
puts "The SSHFP record is: #{fptype} #{algo} #{hex}"
end
Resolv::DNS.open do |dns|
all_records = dns.getresources('myfqdn.example.com', Resolv::DNS::Resource::IN::ANY ) rescue nil
all_records.each do |rr|
if rr.is_a? Resolv::DNS::Resource::Generic then
classname = rr.class.name.split('::').last
handle_sshfp(rr) if classname == "Type44_Class1"
end
end
end

how to send a JSON object from contentURL : data.url("foo.html") to contentScript

I tried using window.postMessage but this only sends a variable (containing string) to the contentScript. But I want to send a number of variables' values. This seems to be possible by using a JSON object.
Simply use JSON.stringify() to turn the object into a string:
var data = {a: 1, b: 2};
window.postMessage(JSON.stringify(data), "*");
On the other end use JSON.parse() to reverse the process:
var data = JSON.parse(message);
If you use:
self.port.emit('some-event', object)
...and only send objects that can be serialized into JSON properly, the SDK will handle serialization and parsing for you. Here's a quick builder example that illustrates this:
https://builder.addons.mozilla.org/addon/1036506/latest/
I had thought that postMessage would be the same?

Django models are not ajax serializable

I have a simple view that I'm using to experiment with AJAX.
def get_shifts_for_day(request,year,month,day):
data= dict()
data['d'] =year
data['e'] = month
data['x'] = User.objects.all()[2]
return HttpResponse(simplejson.dumps(data), mimetype='application/javascript')
This returns the following:
TypeError at /sched/shifts/2009/11/9/
<User: someguy> is not JSON serializable
If I take out the data['x'] line so that I'm not referencing any models it works and returns this:
{"e": "11", "d": "2009"}
Why can't simplejson parse my one of the default django models? I get the same behavior with any model I use.
You just need to add, in your .dumps call, a default=encode_myway argument to let simplejson know what to do when you pass it data whose types it does not know -- the answer to your "why" question is of course that you haven't told poor simplejson what to DO with one of your models' instances.
And of course you need to write encode_myway to provide JSON-encodable data, e.g.:
def encode_myway(obj):
if isinstance(obj, User):
return [obj.username,
obj.firstname,
obj.lastname,
obj.email]
# and/or whatever else
elif isinstance(obj, OtherModel):
return [] # whatever
elif ...
else:
raise TypeError(repr(obj) + " is not JSON serializable")
Basically, JSON knows about VERY elementary data types (strings, ints and floats, grouped into dicts and lists) -- it's YOUR responsibility as an application programmer to match everything else into/from such elementary data types, and in simplejson that's typically done through a function passed to default= at dump or dumps time.
Alternatively, you can use the json serializer that's part of Django, see the docs.

Resources