Azure Form Recognizer training not finding data


I'm trying to train a Form Recognizer model using the browser API console (https://eastus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api/operations/TrainCustomModel/console). I've uploaded training images to a container and created a SAS. The browser API console generates the following HTTP request:
POST https://eastus.api.cognitive.microsoft.com/formrecognizer/v1.0-preview/custom/train?source=https://pythonimages.blob.core.windows.net/?sv=2019-02-02&ss=bfqt&srt=sco&sp=rl&se=2020-01-22T00:23:33Z&st=2020-01-21T16:23:33Z&spr=https&sig=••••••••••••••••••••••••••••••••&prefix=images HTTP/1.1
Host: eastus.api.cognitive.microsoft.com
Content-Type: application/json
Ocp-Apim-Subscription-Key: ••••••••••••••••••••••••••••••••
{
"source": "string",
"sourceFilter": {
"prefix": "string",
"includeSubFolders": true
}
}
However, the answer I get back is
Transfer-Encoding: chunked
x-envoy-upstream-service-time: 4
apim-request-id: 5ad37aa2-e251-4b61-98ae-023930b47d27
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
x-content-type-options: nosniff
Date: Tue, 21 Jan 2020 16:25:03 GMT
Content-Type: application/json; charset=utf-8
{
"error": {
"code": "1004",
"message": "Dataset path must be relative to local input mount path '/input' if local data is referenced."
}
}
I don't understand why it seems to be looking for data locally. I've experimented with the SAS, e.g. including the container name (images) in the blob http address rather than as a query parameter, but no success so far.
I've also tried the Python/REST path (described here: https://learn.microsoft.com/en-gb/azure/cognitive-services/form-recognizer/quickstarts/python-train-extract-v1), which results in a different error:
Response status code: 408
Response body: {'error': {'code': '1011', 'innerError': {'requestId': 'e7f9ef9f-97bc-4b6a-86f3-0b29c9591c87'}, 'message': 'The operation exceeded allowed time limit and was canceled. The common reasons are that the data source is too large or contains unsupported content. Please check that your request conforms to service limits and retry with redacted data source.'}}
For completeness, the code I use is as follows (key/signature *ed out:)
########### Python Form Recognizer Train #############
from requests import post as http_post

# Endpoint URL
base_url = r"https://markusformsrecognizer.cognitiveservices.azure.com/" + "/formrecognizer/v1.0-preview/custom"
source = r"https://pythonimages.blob.core.windows.net/images?sv=2019-02-02&ss=bfqt&srt=sco&sp=rl&se=2020-01-22T15:37:26Z&st=2020-01-22T07:37:26Z&spr=https&sig=*********************************"
headers = {
    # Request headers
    'Content-Type': 'application/json',
    'Ocp-Apim-Subscription-Key': '*********************************'
}
url = base_url + "/train"
body = {"source": source}
try:
    resp = http_post(url = url, json = body, headers = headers)
    print("Response status code: %d" % resp.status_code)
    print("Response body: %s" % resp.json())
except Exception as e:
    print(str(e))

For error code 1004, follow the steps below to get the path of the source containing the training documents, and pass it as the value of the "source" key.
{
"source": "string",
"sourceFilter": {
"prefix": "string",
"includeSubFolders": true
}
}
Replace the "source" value with the Azure Blob storage container's shared access signature (SAS) URL. To retrieve the SAS URL, open Microsoft Azure Storage Explorer, right-click your container, and select Get shared access signature.
Make sure the Read and List permissions are checked, and click Create.
Then copy the value in the URL section. It should have the form:
https://<storage account>.blob.core.windows.net/<container name>?<SAS value>
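As a rough illustration of the checks described above, here is a minimal Python sketch that inspects a container SAS URL before it is sent as "source". The helper names (check_container_sas_url, build_train_body) are hypothetical and not part of any Azure SDK; the sketch only assumes that the permissions travel in the "sp" query parameter and the signature in "sig", as in the URLs shown in the question.

```python
from urllib.parse import urlsplit, parse_qs


def check_container_sas_url(sas_url):
    """Return a list of problems found in a container SAS URL (empty if none)."""
    parts = urlsplit(sas_url)
    query = parse_qs(parts.query)
    problems = []
    # The path should name the container, e.g. "/images".
    if not parts.path.strip("/"):
        problems.append("URL path does not name a container")
    # "sp" carries the permissions; training needs Read (r) and List (l).
    permissions = query.get("sp", [""])[0]
    for flag, name in (("r", "Read"), ("l", "List")):
        if flag not in permissions:
            problems.append(f"{name} permission missing from sp={permissions!r}")
    if "sig" not in query:
        problems.append("no sig parameter (signature) found")
    return problems


def build_train_body(sas_url, prefix="", include_subfolders=True):
    """Build the JSON body shown above, with the SAS URL as the source."""
    return {
        "source": sas_url,
        "sourceFilter": {"prefix": prefix, "includeSubFolders": include_subfolders},
    }
```

Run against the console request in the question, where the URL path is just "/", the first check would flag the missing container name.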

Please use the new Form Recognizer v2.0 release. It is an async API that enables training on large data sets and analyzing large documents: https://aka.ms/form-recognizer/api
Quick start: https://learn.microsoft.com/en-us/azure/cognitive-services/form-recognizer/quickstarts/python-train-extract
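Because the v2.0 API is asynchronous, a client submits the training request and then polls until the model reaches a final state. The following Python sketch shows only the polling half, under stated assumptions: wait_for_model is a hypothetical helper, fetch_status is any callable you supply that returns the current status string, and the terminal values "ready" and "invalid" are assumed here rather than taken from this thread.

```python
import time


def wait_for_model(fetch_status, interval_s=5.0, max_tries=60, sleep=time.sleep):
    """Poll fetch_status() until it reports a terminal state.

    fetch_status: callable returning the model status as a string.
    The status names "ready" and "invalid" are assumptions of this sketch.
    """
    for _ in range(max_tries):
        status = fetch_status()
        if status == "ready":
            return status
        if status == "invalid":
            raise RuntimeError("training failed: model status is 'invalid'")
        # Any other status is treated as "still training"; wait and retry.
        sleep(interval_s)
    raise TimeoutError("model was not ready after %d polls" % max_tries)
```

In practice fetch_status would wrap a GET request to the model URL returned by the training call; that wiring is left out because it depends on the exact API version you are using.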

To get started with Form Recognizer, log in to the Azure Portal and create a Form Recognizer resource (for v2.0 (preview), use the West US 2 or West Europe region).

Try removing the string value from the prefix property:
{
"source": "string",
"sourceFilter": {
"prefix": "",
"includeSubFolders": true
}
}

The Python Quick Start code for version 2.0 seems to be working; at least I don't get any errors anymore. I'm now feeling slightly silly that I didn't try this earlier. The API (web-browser) console, linked from the Quick Start page of the Form Recognizer, seems to automatically assume I want to use version 1.0, and there's no way to change that (or perhaps I've just overlooked something). Hence I assumed I'd been allocated a v1.0 trial, and that's what I used when I tried the Python Quick Start the first time around.

Instead of using just the SAS URI in the "source" request parameter of the API POST call, use the complete container URL followed by the SAS token.
For example:
https://.blob.core.windows.net//

Related

PCEP-SR draft version 6, SR Explicit Route Object/Record Route Object subobjects

I am setting up segment routing via Pathman-SR with an ODL Nitrogen controller and Juniper vMX routers. To allow this, I have to change the IANA subobject code points, but I am unable to do it...
I followed this documentation, but still no result:
https://docs.opendaylight.org/en/stable-carbon/user-guide/pcep-user-guide.html#segment-routing
https://test-odl-docs.readthedocs.io/en/latest/user-guide/pcep-user-guide.html
I tried to update the configuration via the REST API, but when I send a PUT request to:
/restconf/config/pcep-segment-routing-app-config:pcep-segment-routing-app-config
with the body:
<pcep-segment-routing-config xmlns="urn:opendaylight:params:xml:ns:yang:controller:pcep:segment-routing-app-config">
<iana-sr-subobjects-type>true</iana-sr-subobjects-type>
</pcep-segment-routing-config>
I get the following error:
{
"errors": {
"error": [
{
"error-type": "protocol",
"error-tag": "invalid-value",
"error-message": "URI has bad format. Possible reasons:\n 1. \"pcep-segment-routing-app-config:pcep-segment-routing-app-config\" was not found in parent data node.\n 2. \"pcep-segment-routing-app-config:pcep-segment-routing-app-config\" is behind mount point. Then it should be in format \"/yang-ext:mount/pcep-segment-routing-app-config:pcep-segment-routing-app-config\"."
}
]
}
}

I think there is a typo in the URL in the doc; you have to use /restconf/config/pcep-segment-routing-app-config:pcep-segment-routing-config
You can check this guide for reference:
https://docs.opendaylight.org/projects/bgpcep/en/stable-neon/pcep/pcep-user-guide-active-stateful-pce.html#iana-code-points
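For reference, here is a minimal Python sketch that builds the corrected PUT request with the standard library, without sending it. The function name build_sr_config_request is hypothetical, and the port (8181) and admin/admin credentials are assumptions about a default OpenDaylight install; adjust them to your controller. The URL path and XML body are the ones quoted in this thread.

```python
import base64
import urllib.request


def build_sr_config_request(host, port=8181, user="admin", password="admin"):
    """Build (but do not send) the PUT request with the corrected RESTCONF path."""
    url = (
        "http://%s:%d/restconf/config/"
        "pcep-segment-routing-app-config:pcep-segment-routing-config" % (host, port)
    )
    body = (
        "<pcep-segment-routing-config "
        'xmlns="urn:opendaylight:params:xml:ns:yang:controller:pcep:segment-routing-app-config">'
        "<iana-sr-subobjects-type>true</iana-sr-subobjects-type>"
        "</pcep-segment-routing-config>"
    )
    # HTTP Basic auth header; the credentials are placeholders (assumption).
    token = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/xml",
            "Authorization": "Basic " + token,
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it; printing req.full_url first is a quick way to confirm the path ends in pcep-segment-routing-config rather than pcep-segment-routing-app-config.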

Cannot fire Bigcommerce webhooks

So far I've managed to create two webhooks using their official gem (https://github.com/bigcommerce/bigcommerce-api-ruby) with the following events:
store/order/statusUpdated
store/app/uninstalled
The destination URL is a localhost tunnel managed by ngrok (the https version).
status_update_hook = Bigcommerce::Webhook.create(connection: connection, headers: { is_active: true }, scope: 'store/order/statusUpdated', destination: 'https://myapp.ngrok.io/bigcommerce/notifications')
uninstall_hook = Bigcommerce::Webhook.create(connection: connection, headers: { is_active: true }, scope: 'store/app/uninstalled', destination: 'https://myapp.ngrok.io/bigcommerce/notifications')
The webhooks seem to be active and correctly created, as I can retrieve and list them.
Bigcommerce::Webhook.all(connection:connection)
I manually created an order in my store dashboard but no matter to which state or how many states I change it, no notification is fired. Am I missing something?

The exception that I'm seeing in the logs is:
ExceptionMessage: true is not a valid header value
The "is_active" flag should be sent as part of the request body. Your headers, if you choose to include them, would be arbitrary key-value pairs that you can check at runtime to verify the hook's origin.
Here's an example request body:
{
"scope": "store/order/*",
"headers": {
"X-Custom-Auth-Header": "{secret_auth_password}"
},
"destination": "https://app.example.com/orders",
"is_active": true
}
Hope this helps!
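To make the distinction concrete, here is a small Python sketch that assembles the same request body and rejects non-string header values before anything is sent. build_webhook_payload is a hypothetical helper, not part of the BigCommerce gem or API; the string-only rule for header values is inferred from the exception message quoted above.

```python
import json


def build_webhook_payload(scope, destination, is_active=True, custom_headers=None):
    """Assemble a webhook body with is_active at the top level, not in headers."""
    custom_headers = custom_headers or {}
    for name, value in custom_headers.items():
        # Mirrors the "true is not a valid header value" failure: only strings pass.
        if not isinstance(value, str):
            raise ValueError("%r is not a valid header value for %s" % (value, name))
    return {
        "scope": scope,
        "headers": custom_headers,
        "destination": destination,
        "is_active": is_active,
    }


payload = build_webhook_payload(
    "store/order/statusUpdated",
    "https://myapp.ngrok.io/bigcommerce/notifications",
    custom_headers={"X-Custom-Auth-Header": "{secret_auth_password}"},
)
print(json.dumps(payload, indent=2))
```

Calling the helper with custom_headers={"is_active": True}, which is what the code in the question effectively does, raises ValueError instead of creating a hook that never fires.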

Performance test for graphQL API

Until now I have done my API automation testing and performance testing with JMeter, against a REST API server.
Development has now switched to a GraphQL API, and I have two questions about it:
What is the best way to perform API automation and performance testing against it?
Does JMeter support GraphQL APIs?

I use Apollo to build the GraphQL server, and use JMeter to query the GraphQL API as below.
1. Set up HTTP Request
2. Set up HTTP Headers
Depending on your application, you might also need to set up HTTP header Authorization for JWT web tokens, such as:
Authorization: Bearer xxxxxxxxxxxxxxxxxxxxxxxxxxxx
3. Set up HTTP Cookie if needed for your app
4. Run the test

Disclaimer: I work for LoadImpact; the company behind k6.
If you are willing to consider an alternative, I've recently written a blog post about this topic: Load testing GraphQL with k6.
This is what a k6 example looks like:
import http from "k6/http";

let accessToken = "YOUR_GITHUB_ACCESS_TOKEN";
let query = `
  query FindFirstIssue {
    repository(owner:"loadimpact", name:"k6") {
      issues(first:1) {
        edges {
          node {
            id
            number
            title
          }
        }
      }
    }
  }`;
let headers = {
  'Authorization': `Bearer ${accessToken}`,
  "Content-Type": "application/json"
};

// k6 runs the default export once per virtual-user iteration.
export default function () {
  let res = http.post("https://api.github.com/graphql",
    JSON.stringify({ query: query }),
    { headers: headers }
  );
}

The Serving over HTTP section of the GraphQL documentation says:
When receiving an HTTP GET request, the GraphQL query should be specified in the "query" query string.
So you can just append your GraphQL query to your request URL.
With regard to "best practices", you should follow the "normal" recommendations for testing web applications and HTTP APIs; for example, check out the article REST API Testing - How to Do it Right.
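Appending the query to the URL means URL-encoding it first. A minimal Python sketch, assuming a hypothetical endpoint (http://localhost:4000/graphql) and a hypothetical helper name graphql_get_url:

```python
from urllib.parse import urlencode


def graphql_get_url(endpoint, query):
    """Return endpoint?query=<url-encoded GraphQL query> for an HTTP GET request."""
    return endpoint + "?" + urlencode({"query": query})


# The endpoint below is a placeholder, not a real server.
url = graphql_get_url(
    "http://localhost:4000/graphql",
    "{ storeConfig { default_title copyright } }",
)
print(url)
```

The resulting string can be pasted into the Path field of a JMeter HTTP Request sampler, or fetched with any HTTP client.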

You can try using easygraphql-load-tester
How it works:
easygraphql-load-tester is a node library created to make load testing on GraphQL based on the schema; it'll create a bunch of queries, that are going to be the ones used to test your server.
Examples:
Artillery.io
K6
Result:
Using this package, I was able to identify a bad implementation using dataloaders on the server.
Results without dataloaders
All virtual users finished
Summary report # 10:07:55(-0500) 2018-11-23
Scenarios launched: 5
Scenarios completed: 5
Requests completed: 295
RPS sent: 36.88
Request latency:
min: 1.6
max: 470.9
median: 32.9
p95: 233.2
p99: 410.8
Scenario counts:
GraphQL Query load test: 5 (100%)
Codes:
200: 295
Results with dataloaders
All virtual users finished
Summary report # 10:09:09(-0500) 2018-11-23
Scenarios launched: 5
Scenarios completed: 5
Requests completed: 295
RPS sent: 65.85
Request latency:
min: 1.5
max: 71.9
median: 3.3
p95: 19.4
p99: 36.2
Scenario counts:
GraphQL Query load test: 5 (100%)
Codes:
200: 295
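To put the two reports side by side, the following Python sketch computes how much each figure improved. The numbers are copied from the reports above; the improvement helper and the dictionary keys are just illustrative names.

```python
# Figures copied from the two summary reports above.
without_dataloaders = {"rps": 36.88, "median": 32.9, "p95": 233.2, "p99": 410.8, "max": 470.9}
with_dataloaders = {"rps": 65.85, "median": 3.3, "p95": 19.4, "p99": 36.2, "max": 71.9}


def improvement(before, after):
    """Return improvement factors: throughput gain for rps, latency reduction otherwise."""
    factors = {}
    for key in before:
        if key == "rps":
            factors[key] = after[key] / before[key]   # higher is better
        else:
            factors[key] = before[key] / after[key]   # lower is better
    return factors


for key, factor in improvement(without_dataloaders, with_dataloaders).items():
    print("%-6s improved by a factor of %.1f" % (key, factor))
```

With dataloaders, the median latency drops by roughly a factor of 10 and the p95 by roughly a factor of 12, while the request rate a little less than doubles.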

I am testing our GraphQL implementation. You will need:
Thread Group
HTTP Header Manager: add Content-Type: application/json
https://i.stack.imgur.com/syXqK.png
HTTP Request: use GET and add in the Body Data your query
https://i.stack.imgur.com/MpxAb.png
Response Assertion: You want to count as correct requests only responses without errors
https://i.stack.imgur.com/eXWGs.png
A Listener:
https://i.stack.imgur.com/VOVLo.png

I have recently tried API testing of GraphQL with both GET and POST requests in JMeter.
Make sure it's a POST request for both queries and mutations.
Example GraphQL query:
{
storeConfig{
default_title
copyright
}
}
For Jmeter it would be like this
{
"query":"{ storeConfig { default_title copyright } }"
}
Set up the HTTP Request
Replace localhost with your domain name. Make sure you don't include https://
Example: https://mydomainname.com
In JMeter: mydomainname.com
Set up the HTTP Header Manager
Requesting a mutation in JMeter
Example mutation in GraphQL:
mutation {
  generateCustomerToken(
    email: "rd@mailinator.com"
    password: "1234567"
  ) {
    token
  }
}
In JMeter the mutation will look like this:
{
"query":"mutation { generateCustomerToken( email: \"rd@mailinator.com\" password: \"1234567\" ) { token } }"
}
Escape each double quote inside the query as \" as shown in the query above.
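Rather than escaping the quotes by hand, you can let a JSON encoder do it. A minimal Python sketch follows; graphql_body is a hypothetical helper, and the email address is a placeholder. Note that collapsing whitespace this way would also alter string arguments that contain runs of spaces, so it is only suitable for simple queries.

```python
import json


def graphql_body(query):
    """Wrap a GraphQL query or mutation in the JSON body a POST request expects."""
    # Collapse newlines and indentation into single spaces, then let
    # json.dumps escape the embedded double quotes.
    compact = " ".join(query.split())
    return json.dumps({"query": compact})


mutation = """
mutation {
  generateCustomerToken(
    email: "user@example.com"
    password: "1234567"
  ) {
    token
  }
}
"""
body = graphql_body(mutation)
print(body)
```

The printed string can be pasted into the Body Data tab of the JMeter HTTP Request sampler.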

The easiest way will be to use the GraphQL queries directly in JMeter without the need to convert them to JSON.
All you need to do is to pass "Content-Type" as "application/graphql" in the header.
Image Link for: HTTP Request with GraphQL Query as input
Image Link for: Header details

Updating a couchbaselite design document does not alter my view result

I am developing a react-native application and I am using Couchbase lite as a database locally. The way this works is that you spawn a local REST server when the app starts and you use the REST API to communicate with the CouchbaseLite server.
I have created a few design documents, but when I try to update them I do not get the new results when I run my REST client (a separate app I use for debugging). When I GET the design document, it has a new _rev after the update and the map function is the one I updated it to, but whenever I do a GET on the view the result is the same as with the first version of the map function.
Apparently the updated docs are not used by get.
The design doc:
var designDoc = {
  name: 'expenses',
  language: 'javascript',
  views: {
    contact_parts_for_group: {
      'map': function(doc){
        if(doc.type == 'expense'){
          emit('some things I emit', doc.amount)
        }
      }.toString()
    }
  }
};
I send this to the server along with the proper _rev as the json body: JSON.stringify(designDoc)
.1. I am updating my design document with a PUT call:
PUT /kittydb/_design/expenses?rev=4-6f89f1e13d1fbb89c712d6bab53ee7d4 HTTP/1.1
Host: 127.0.0.1:5800
Connection: close
User-Agent: Paw/2.2.2 (Macintosh; OS X/10.11.2) GCDHTTPRequest
Content-Length: 356
{"name":"expenses","language":"javascript","views":{"contact_parts_for_group":{"map":"function (doc){ if(doc.type=='expense'){ var i,len,part,ref; ref=doc.parts; for(i=0,len=ref.length;i<len;i++){ part=ref[i]; var amount=part.contact==doc.expense_by?-1*part.amount:part.amount; emit([doc.group_id,part.contact,part.contact==doc.expense_by],amount);}}}"}}}
.2. I populate the database using the interface of the app prototype I developed so far
.3. I am not sure what you mean by this.
.4. This is the get:
GET /kittydb/_design/expenses/_view/contact_parts_for_group HTTP/1.1
Host: 127.0.0.1:5800
Connection: close
User-Agent: Paw/2.2.2 (Macintosh; OS X/10.11.2) GCDHTTPRequest
More information in reaction to some comments:
I am using the CouchbaseLite Community Edition, version 1.1.1 for iOS. I am running the simulator as an iPhone 6 with iOS 9.2.
I made some screenshots to illustrate what is going on a bit more:
I don't know how to retrieve the map function that goes with this but what it seems to do is:
emit([doc.group_id,part.contact],amount)
I used the get as above.
Now my update:
PUT /kittydb/_design/expenses?rev=7-6f979706f38acce9c7db380fba8565e4 HTTP/1.1
Host: 127.0.0.1:5800
Connection: close
User-Agent: Paw/2.2.2 (Macintosh; OS X/10.11.2) GCDHTTPRequest
Content-Length: 350
{
"name": "expenses",
"language": "javascript",
"views": {
"contact_parts_for_group": {
"map": "function (doc){ if(doc.type=='expense'){ var i,len,part,ref; ref=doc.parts; for(i=0,len=ref.length;i<len;i++){ part=ref[i]; var amount=part.contact==doc.expense_by?-1*part.amount:part.amount; emit('Hello SO', 'Overflow');}}}"
}
}
}
What it should do now is: emit('Hello SO', 'Overflow');
I get this response when I run the above request:
HTTP/1.1 201 Created
Location: http://127.0.0.1:5800/kittydb/_design/expenses
Content-Type: application/json
Server: CouchbaseLite 1.1 (unofficial)
Etag: "8-3ae4b6ff37b936657ca23acb8d836619"
Accept-Ranges: bytes
Date: Wed, 20 Jan 2016 21:57:03 GMT
Transfer-Encoding: chunked
{"id":"_design\/expenses","rev":"8-3ae4b6ff37b936657ca23acb8d836619","ok":true}
Now I run the get request again:
And nothing changed...
When I create a new document with 'type = expense' I get the same result, just more of them.

I don't know how to retrieve the map function that goes with this
Aha -- if you don't know where the original view definition is, and you can't get it from the design document, it's probably being defined in native code (at app launch time.) Such a definition will override one in a design document.
I don't know anything about React-Native. Is there (as the name implies) native code in the app? If so, look for a call to [CBLView setMapBlock: ...].

How to format signedUserToken for sinch?

I'm trying to integrate Sinch into my ROR webapp, and am having some difficulty formatting the signedUserToken to start the sinchClient.
Here is my view, using haml :
#{@signedUserTicket}
%script{src: "//cdn.sinch.com/latest/sinch.min.js", type: "text/javascript"}
= javascript_tag do
$(function(){
$sinchClient = new SinchClient({
applicationKey: 'APP_KEY',
capabilities: {messaging: true, calling: true},
supportActiveConnection: true,
onLogMessage: function(message) {
console.log(message);
},
});
$sinchClient.start({
'userTicket' : "#{@signedUserTicket}",
});
});
And whatever formatting I try to do in the controller, the closest I get to succeeding is :
DOMException [InvalidCharacterError: "String contains an invalid character"
code: 5
nsresult: 0x80530005
location: http://cdn.sinch.com/latest/sinch.min.js:5]
I'd appreciate a little help and would even build a Rubygem for integrating Sinch in Rails if I get the right info and can spare some time.
Cheers,
James
Edit :
I have tried a few modifications and am getting closer (I think).
The problem of InvalidCharacter came from the trailing '='s which apparently don't decode well in Javascript.
My new controller is now :
class SinchController < ApplicationController
  skip_before_filter :verify_authenticity_token
  before_filter :authenticate_user!

  def client
    username = current_user.username
    applicationKey = "APP_KEY"
    applicationSecret = "APP_SECRET_B64"
    userTicket = {
      "identity" => {"type" => "username", "endpoint" => username},
      "expiresIn" => 3600,
      "applicationKey" => applicationKey,
      "created" => Time.now.utc.iso8601
    }
    userTicketJson = userTicket.to_json
    userTicketBase64 = Base64.strict_encode64(userTicketJson).chop
    digest = Digest::HMAC.digest(Base64.decode64(applicationSecret), userTicketJson, Digest::SHA256)
    signature = Base64.strict_encode64(digest).chop
    @signedUserTicket = (userTicketBase64 + ':' + signature).remove('=')
  end
end
But now I'm facing the following error:
POST https://api.sinch.com/v1/instance 500 (Internal Server Error)
client:1 XMLHttpRequest cannot load https://api.sinch.com/v1/instance. No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'http:// localhost:3000' is therefore not allowed access. The response had HTTP status code 500.
(the space before localhost is due to new user restrictions on SO)
I added Rack::Cors to my rails server to try and allow Cross-domain requests in case it came from my own requests, but whatever configuration I tried, it seems the request never contains the right headers.
Am I misunderstanding CORS requests? Does the problem come from the requests generated by sinch.min.js?
Regards,
James

The error message occurs because Firefox's base64 decoder can't decode the token, due to symbols (such as #) that are not in the base64 character set. This suggests that the ticket is actually not passed to start(), and this line may be incorrect:
'userTicket' : "#{@signedUserTicket}",

I don't know HAML, but shouldn't
'userTicket' : "#{@signedUserTicket}",
be 'userTicket' : @signedUserTicket,
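For comparison, the ticket construction from the controller in the question can be sketched in Python as follows. This mirrors the steps shown there (JSON ticket, base64 of the JSON, HMAC-SHA256 keyed with the base64-decoded secret, joined by a colon); it is not taken from Sinch documentation. The function name is hypothetical, and the padding '=' characters are kept because the thread does not confirm that stripping them is required.

```python
import base64
import hashlib
import hmac
import json
from datetime import datetime, timezone


def signed_user_ticket(username, application_key, application_secret_b64,
                       expires_in=3600, created=None):
    """Build '<base64 ticket JSON>:<base64 HMAC-SHA256 signature>'."""
    if created is None:
        created = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    ticket = {
        "identity": {"type": "username", "endpoint": username},
        "expiresIn": expires_in,
        "applicationKey": application_key,
        "created": created,
    }
    ticket_json = json.dumps(ticket, separators=(",", ":")).encode("utf-8")
    ticket_b64 = base64.b64encode(ticket_json).decode("ascii")
    # The secret is stored base64-encoded, so decode it before using it as the key.
    key = base64.b64decode(application_secret_b64)
    digest = hmac.new(key, ticket_json, hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return ticket_b64 + ":" + signature
```

A ticket built this way contains only base64 characters and a single colon, so if the browser still reports an invalid character, the value reaching start() is probably not the ticket itself, as the answers above suggest.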
