Running Stanford corenlp server with custom models - stanford-nlp

I've trained a POS tagger and neural dependency parser with Stanford corenlp. I can get them to work via command line, and now would like to access them via a server.
However, the documentation for the server doesn't say anything about using custom models. I checked the code and didn't find any obvious way of supplying a configuration file.
Any idea how to do this? I don't need all annotators, just the ones I trained.

Yes, the server should (in theory) support all the functionality of the regular pipeline. The properties GET parameter is translated into the Properties object you would normally pass into StanfordCoreNLP. Therefore, if you'd like the server to load a custom model, you can just call it via, e.g.:
wget \
--post-data 'the quick brown fox jumped over the lazy dog' \
'localhost:9000/?properties={"parse.model": "/path/to/model/on/server/computer", "annotators": "tokenize,ssplit,pos", "outputFormat": "json"}' -O -
Note that the server won't garbage-collect this model afterwards though, so if you load too many models there's a good chance you'll run into out-of-memory errors...
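As a rough sketch of the same request built programmatically: the snippet below URL-encodes the properties JSON so the quotes and commas survive transport. It assumes the property keys follow the annotator names (pos.model for the tagger, depparse.model for the neural dependency parser); the model paths, the host and port, and the helper names build_corenlp_url and annotate are placeholders, not part of CoreNLP.

```python
import json
from urllib.parse import urlencode, parse_qs, urlsplit
from urllib.request import Request, urlopen


def build_corenlp_url(base_url, pos_model, depparse_model):
    """Return a server URL whose `properties` parameter points at custom models."""
    properties = {
        # Only the annotators that are needed; depparse relies on pos tags.
        "annotators": "tokenize,ssplit,pos,depparse",
        "pos.model": pos_model,            # path as seen by the server process
        "depparse.model": depparse_model,  # path as seen by the server process
        "outputFormat": "json",
    }
    encoded = urlencode({"properties": json.dumps(properties, separators=(",", ":"))})
    return base_url + "/?" + encoded


def annotate(text, url):
    """POST raw text to the server and decode the JSON reply (needs a running server)."""
    request = Request(url, data=text.encode("utf-8"), method="POST")
    with urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


url = build_corenlp_url(
    "http://localhost:9000",
    "/path/to/custom-tagger.model",       # hypothetical path
    "/path/to/custom-depparse.model.gz",  # hypothetical path
)
```

With a server running, annotate("the quick brown fox jumped over the lazy dog", url) would send the text. Reusing one URL (and therefore one set of models) also avoids piling up models that are never unloaded.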


How to plugin into grpc-java to modify code generation and add setNameOrClear(null) methods?

We are having too many issues around all this extra code for every database field with regard to
if(databaseObj.getName() != null)
builder.setName(databaseObj.getName());
I read that Square hooked into protobuf to add setOrClear methods in Java. How can we do the same when we generate our code with Gradle?
We are using the Gradle code from this page right now:
https://github.com/grpc/grpc-java
thanks,
Dean
You can accomplish that via protoc insertion points. When you generate the Java code you will see comments like // @@protoc_insertion_point(...). That is where the insertion will occur.
While appearing useful, this approach has serious drawbacks for .protos used in multiple projects. All projects using the same .proto and in the same language should use the same plugins, otherwise it causes the diamond dependency problem. This is why gRPC did not use this approach and instead generates its classes in separate files from the normal message generation. I strongly discourage this approach, as it paints you into a corner and you don't know when you will need to "pay the piper."
To insert into a point, your plugin needs to run in the same protoc command-line invocation as the java builtin. Your plugin would then need to set CodeGeneratorResponse.file.insertion_point and content for each file you want to inject code.
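For illustration only (the drawbacks above still apply), here is a minimal sketch of such a plugin written in Python. It assumes string fields with snake_case names, and it assumes the caller already knows the exact file name the Java builtin emits for the message; the helper names, the example message example.Person and the file com/example/Person.java are all hypothetical. The last function needs the third-party protobuf package, so it is imported lazily.

```python
import sys


def set_or_clear_method(field_name):
    """Render a Java builder method that clears a string field when given null."""
    # Assumes a snake_case proto field name, e.g. first_name -> FirstName.
    camel = "".join(part.capitalize() for part in field_name.split("_"))
    return (
        f"public Builder set{camel}OrClear(java.lang.String value) {{\n"
        f"  return value == null ? clear{camel}() : set{camel}(value);\n"
        f"}}\n"
    )


def build_insertions(java_file, message_full_name, string_fields):
    """Describe one insertion per field, targeting the message's builder_scope point."""
    return [
        {
            "name": java_file,  # must match the file written by the Java builtin
            "insertion_point": f"builder_scope:{message_full_name}",
            "content": set_or_clear_method(field),
        }
        for field in string_fields
    ]


def run_as_plugin(insertions):
    """Write the insertions to stdout as a CodeGeneratorResponse."""
    from google.protobuf.compiler import plugin_pb2  # third-party, imported lazily

    # Read the request protoc sends on stdin; this sketch ignores its contents.
    request = plugin_pb2.CodeGeneratorRequest()
    request.ParseFromString(sys.stdin.buffer.read())

    response = plugin_pb2.CodeGeneratorResponse()
    for item in insertions:
        out = response.file.add()
        out.name = item["name"]
        out.insertion_point = item["insertion_point"]
        out.content = item["content"]
    sys.stdout.buffer.write(response.SerializeToString())
```

A real plugin would derive the file name, message name and fields from the request instead of hard-coding them, and would be listed next to the Java builtin in the same protoc invocation, as described above.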

Stanford Core NLP train tagger using Java API

Does anyone know if it's possible to train a Stanford tagger using the Java API? I'm only finding examples of people doing it through the command line. That should imply that there exists an API method somewhere, but I can't find it.
You can put all of your training properties in a .properties file and then call MaxentTagger.main(new String[] {"-props", "/path/to/training.properties"}). I don't see any easier way to do this in the Java API.
The only solution I came up with is to use MaxentTagger.main(...) and pass it a bunch of arguments that have been formatted command line style.

Use GET params to provide the Web interface with a specific text to annotate

I would like to link to my instance of CoreNLP server, with a specified text (and possibly, a specified set of annotators). (i.e. without having to paste the text then click on Submit)
Is there a way to do this?
(I know and use the API version, but I'm looking for the Web visualisation)
No, the current visualization doesn't let you specify text in the URL (though pull requests are always welcome; the source code lives here).
The server does respond to regular POST requests, e.g., if you want to call CoreNLP from your own webpage through JavaScript. For example, the following curl command (from the documentation page):
curl --data 'The quick brown fox jumped over the lazy dog.' 'http://localhost:9000/?properties={%22annotators%22%3A%22tokenize%2Cssplit%2Cpos%22%2C%22outputFormat%22%3A%22json%22}' -o -
The visualization is done with more or less vanilla brat.
So, answering my own question: this is now possible (I believe from CoreNLP 3.8), after a successfully merged pull request from yours truly.
This is the relevant pull request: https://github.com/stanfordnlp/CoreNLP/pull/423

4GL and Magento SOAP API. Need a simple example

My company runs a 4GL application internally. It's very old and no one really knows how to improve/develop for it since the developers are long gone.
I need to make a simple SOAP call to my Magento web store. There are tons of examples online in a multitude of languages, but I can't find a single 4GL (OpenEdge ABL) example.
I'm trying to set SKUs to Out of stock status.
Does anyone have a simple example that I can look at, or at least a starting point? There seems to be so little information on 4GL on the web.
Example of the call I need in PHP:
<?php
$proxy = new SoapClient('http://www.domain.com/api/soap/?wsdl');
$sessionId = $proxy->login('admin', 'password');
$proxy->call($sessionId, 'product_stock.update', array('sku123', array('qty'=>50, 'is_in_stock'=>1)));
For version 10.2B there's built in support for consuming web services in Progress ABL.
This is a basic tutorial of how to create a client for a SOAP-based web service in ABL. It's not best practices or in any way complete. Just a quick guide to get started.
1. Analyse the WSDL
There's a built in tool available via command line that lets you analyse a WSDL and create documentation about available services, datatypes, syntax etc. Invoke it on your wsdl like this:
proenv> bprowsdldoc yourwsdl-file c:\temp\docs
The wsdl can be local or remote. If it's remote you specify the URL; if it's local you can specify just the complete local path. Documentation in html format will end up in c:\temp\docs. Open up index.html in that folder.
2. Create a basic client
In the index.html document there's a number of headings. Click the link under "Port types". In the Port Type document you will find some useful data.
Copy-and-paste the example in "Connection Details" into your Progress Editor. It should look something like this (names of services and procedures will be different - they are defined in the wsdl):
DEFINE VARIABLE hWebService AS HANDLE NO-UNDO.
DEFINE VARIABLE hYYY AS HANDLE NO-UNDO.
CREATE SERVER hWebService.
hWebService:CONNECT("-WSDL 'file_or_url_to_wsdl.wsdl'").
RUN XXX SET hYYY ON hWebService.
If you run this code your client is connected to the web service but it's still not doing anything.
Further down the same document there's a heading called "Operation (internal procedure) details". This is where the actual web service is invoked. It will look something like the code below. It actually shows two ways of making the same call, one functional and one procedural, so choose whichever you prefer and insert it into your editor (I usually use the procedural one for no real reason other than old habits):
DEFINE VARIABLE strXMLRequest AS CHARACTER NO-UNDO.
DEFINE VARIABLE ProcessXMLResult AS CHARACTER NO-UNDO.
FUNCTION ProcessXML RETURNS CHARACTER
(INPUT strXMLRequest AS CHARACTER)
IN hYYY.
/* Function invocation of ProcessXML operation. */
ProcessXMLResult = ProcessXML(strXMLRequest).
/* Procedure invocation of ProcessXML operation. */
RUN ProcessXML IN hYYY (INPUT strXMLRequest, OUTPUT ProcessXMLResult).
Now all you need to end your program is disconnecting and cleaning up. So insert:
hWebService:DISCONNECT().
DELETE OBJECT hWebService.
If you've followed all steps you should have a skeleton for invoking a web service. The only problem is that you need to handle the in- and out-data.
3. Handle the answer and the request
Depending on how the web service is built this can be easy (if you only input and output simple data like strings and numbers) or quite complicated (if you input and output entire xml-documents). The documentation you created in step one lists all datatypes (in the index.html document) but it doesn't offer any support in how you create any needed xml documents. There's specific Progress documentation available on how to work with xml...
The better approach is to take a look at the official documentation. There you will find everything above and more - how to handle errors etc.
Here is an overview of all 10.2B documentation and here is the PDF named Web Services.
Here is a link to a complete (but actually not so good) example in the Progress KnowledgeBase where a client and corresponding request/response xml is created and handled.
Look at these chapters:
6 - Creating an ABL Client from WSDL
7 - Connecting to Web Services from ABL
8 - Invoking Web Service Operations from ABL
That will basically take you through the entire process from start to finish.

ProtocolViolationException Load testing web service (GET action with content-body)

I created an ASP.NET MVC4 Web API service (REST) with a single GET action. The action currently needs 11 input values, so rather than passing all of those values in the URL, I opted to encapsulate those values into a single class type and have it passed as Content-Body. When I test in Fiddler, I specify the verb as GET, and enter the JSON text in the "Request Body" input box. This works great!
The problem is when I attempt to perform Load Testing in Visual Studio 2010 Ultimate. I am able to specify the GET action and the JSON Content-Body just fine. But when I run the Load test, VS reports exceptions of type ProtocolViolationException (Cannot send a content-body with this verb-type) in the test results. The test executes in 1ms so I suspect the exceptions are causing the test to immediately abort. What can I do to avoid those exceptions? I'd prefer to not change my API to use URL arguments just to work-around the test tooling. If I should change the API for other reasons, let me know. Thanks!
I found it easier to post this as an answer rather than carry on the discussion.
Sending content with GET is not defined in RFC 2616, yet it has not been prohibited. So as far as the spec is concerned, we are in territory where we have to use our own judgement.
GET is canonically used to get a resource. So you are retrieving this resource using this verb with the parameters you are sending. Since GET is both safe and idempotent, it is ideal for caching. Caching usually takes place based on the resource URI, and sometimes based on various headers. The point is that cache implementations, AFAIK, would not use the GET content (and to be honest I have not seen any GET with content in the real world). And it would not make sense to include the content in the key generation, since that reduces the scalability of the caches.
If you have parameters to send, they must be in the URI, since they are part of what defines that URI. As such, I strongly believe sending content with GET is wrong.
Even when you look at implementations such as OData, they put the criteria in the URI. I cannot imagine that your (or any) application's requirements go beyond OData's query requirements.
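To make the point concrete, here is a small client-side sketch that puts every input value into the query string, so the URI alone identifies the resource and can serve as a cache key. The endpoint and the parameter names are made up for illustration and stand in for the 11 values the action needs.

```python
from urllib.parse import urlencode, urlsplit, parse_qsl


def build_get_uri(base_uri, params):
    """Encode every input value into the query string of a GET request."""
    # Sort by key so equal parameter sets always yield the same URI (and cache key).
    return base_uri + "?" + urlencode(sorted(params.items()))


# Hypothetical input values; a real call would carry all 11.
params = {"customerId": 42, "region": "EMEA", "from": "2013-01-01", "to": "2013-01-31"}
uri = build_get_uri("http://localhost/api/report", params)
```

A request built this way carries no body, so the load-test tooling has nothing to reject.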
