With the Simple API in Stanford CoreNLP, is there a way to get multi-token entity mentions? - stanford-nlp

This question is very similar to my question; however, due to the way SO works, I think it is better to ask a new question rather than continue a thread.
CoreNLP has the Simple API which allows for quicker access to various components of the NLP pipeline. The way to get named entities appears to be:
Form a document annotation from the text
Get the sentences from the document object
Use nerTags() on each sentence object to get the token-by-token NER labeling.
Via other mechanisms, as discussed in the question linked above, one can retrieve full multi-token entity mentions such as George Washington, which is an entity mention composed of 2 tokens. Is there a way to get these multi-token entity mentions using the Simple API?

Yes, though it gives you less information than the full API, returning only the String spans of the mention. See Sentence#mentions(String) and Sentence#mentions().
If you want to get more information about a mention, you'll have to either use the regular API, or re-implement the logic in these functions. You can also try mucking around in the raw Proto, which will certainly have all the information you could possibly want, but in a less-than-pleasant proto interface. The proto definition is here.
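If you do go the route of re-implementing the logic, the core idea can be sketched in a few lines. This is a minimal Python illustration, not CoreNLP's actual code: group_mentions is a hypothetical helper, and the tokens and tags are hand-written stand-ins for what a token-by-token NER labeling would give you. It merges each run of adjacent tokens that share the same non-"O" tag into one mention.

```python
from itertools import groupby


def group_mentions(tokens, ner_tags, outside_tag="O"):
    """Merge runs of adjacent tokens that share the same non-outside NER tag."""
    mentions = []
    pairs = zip(tokens, ner_tags)
    # groupby yields one group per run of consecutive pairs with the same tag.
    for tag, run in groupby(pairs, key=lambda pair: pair[1]):
        if tag == outside_tag:
            continue  # tokens outside any entity are skipped
        text = " ".join(token for token, _ in run)
        mentions.append((text, tag))
    return mentions


# Hand-written example input (an assumption, not real pipeline output).
tokens = ["George", "Washington", "was", "born", "in", "Virginia", "."]
ner_tags = ["PERSON", "PERSON", "O", "O", "O", "LOCATION", "O"]
mentions = group_mentions(tokens, ner_tags)
print(mentions)
```

One limitation of this simple grouping: two distinct entities of the same type that sit directly next to each other are merged into a single mention.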

Related

Elasticsearch request validation

I'm trying to figure out if I can validate elasticsearch requests against a pre-defined mapping. I've googled around and searched StackOverflow, but haven't been able to find anything that speaks to what I'm trying to do apart from this question from a year ago that went unanswered.
Is there any tool out there that fits this need, or that at the least would convert ES mappings to some other easily validatable entity like JSONSchema? Extra points if that tool would be accessible in Python, but any language would work.
Specifics:
I'm looking at the openFDA API, which has, e.g., this endpoint for animal and veterinary data. openFDA provides this mapping (YAML download here) for valid fields. I'd like to be able to e.g. make sure that a provided animal.age field is an object instead of some other type, since such a query not obeying the defined mapping returns a rather unhelpful message stating that no records were found.
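To make the kind of check I have in mind concrete, here is a rough Python sketch. It assumes the mapping has already been loaded into a dict in the usual Elasticsearch "properties" layout; MAPPING below is a made-up fragment, not the real openFDA mapping, and lookup_field and check_field are hypothetical helper names.

```python
# Made-up mapping fragment in Elasticsearch "properties" layout (an assumption).
MAPPING = {
    "properties": {
        "animal": {
            "properties": {
                "age": {
                    "properties": {
                        "min": {"type": "float"},
                        "unit": {"type": "keyword"},
                    }
                },
                "species": {"type": "keyword"},
            }
        }
    }
}


def lookup_field(mapping, dotted_path):
    """Walk a dotted field path through nested "properties"; None if unknown."""
    node = mapping
    for part in dotted_path.split("."):
        properties = node.get("properties", {})
        if part not in properties:
            return None
        node = properties[part]
    return node


def check_field(mapping, dotted_path, value):
    """Return an error message, or None if the value's shape fits the mapping."""
    node = lookup_field(mapping, dotted_path)
    if node is None:
        return f"unknown field: {dotted_path}"
    is_object_field = "properties" in node
    if is_object_field and not isinstance(value, dict):
        return f"{dotted_path} must be an object, got {type(value).__name__}"
    if not is_object_field and isinstance(value, dict):
        return f"{dotted_path} must be a {node.get('type')}, got an object"
    return None


print(check_field(MAPPING, "animal.age", 5))
print(check_field(MAPPING, "animal.age", {"min": 2}))
```

This only distinguishes objects from leaf values; checking individual leaf types (float, keyword, date and so on) would need a per-type rule on top.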

Is there a policy definition language for GraphQL APIs?

Is there a way for us to define the policies of a GraphQL API, which is both machine-readable and human-readable, which contains a set of rules (in other words, a specification) to describe the format of the API? I'm not talking about the schema, but of a spec where we can add security-related details (for example, complexity value to be assigned per field and depth limitation values) or any other related details. Any thoughts or ideas? Or can we send all of this within the SDL itself?
For example, for REST APIs, we use Swagger to define information on how to define paths, parameters, responses, models, security and more. Is there a need for a similar approach for GraphQL APIs? Your response is highly appreciated.
We are working on an approach to add policies to your GraphQL API and allow you to better manage it, especially as you expose the interface externally.
Part of the challenge is that as opposed to a REST call that can easily be differentiated from others, all GraphQL requests look the same, unless a deeper analysis is performed on the incoming query.
This blog post describes how we perform this analysis: https://www.ibm.com/blogs/research/2019/02/graphql-api-management/
If this is of interest, let's connect!
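As a very rough illustration of what "deeper analysis" of an incoming query can mean (this is not the approach from the blog post, only a naive sketch), the Python function below estimates the nesting depth of a query by counting braces while ignoring braces inside ordinary quoted strings. max_selection_depth is a hypothetical name. A real depth limit should use a proper GraphQL parser: this sketch also counts braces from input-object arguments, does not handle block strings, and does not expand fragments.

```python
def max_selection_depth(query):
    """Naively estimate nesting depth by counting braces outside quoted strings."""
    depth = 0
    deepest = 0
    in_string = False
    previous = ""
    for char in query:
        if char == '"' and previous != "\\":
            in_string = not in_string  # toggle on unescaped double quotes
        elif not in_string:
            if char == "{":
                depth += 1
                deepest = max(deepest, depth)
            elif char == "}":
                depth -= 1
        previous = char
    return deepest


query = '{ user(id: "{1}") { friends { name } } }'
depth = max_selection_depth(query)
print(depth)

# A policy could then be as simple as rejecting queries above a configured limit.
MAX_ALLOWED_DEPTH = 5  # assumed policy value
print(depth <= MAX_ALLOWED_DEPTH)
```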
As I understand it, you need a tool to generate documentation (parameters and so on) for the APIs you have built.
If that's what you are looking for, there is something like Swagger for GraphQL: Swagger-to-GraphQL.
Hope that helps!

Protocol buffers: read only fields?

Is it possible to mark fields as read only in a .proto file such that when the code is generated, these fields do not have setters?
Ultimately, I think the answer here will be "no". There's a good basic guidance rule that applies to DTOs:
DTOs should generally be as simple as possible to convey the data for serialization in a manner well-suited to the specific serializer.
if that basic model is sufficient for you to work with above that layer, then fine
but if not: do not fight the serializer; instead, create a separate domain model above the DTO layer, and simply map between the two models before serialization or after deserialization
Or put another way: the fact that the generator doesn't want to expose read-only members is irrelevant, because if you need something exotic, you shouldn't be using the generated type outside of the code that directly touches serialization. So: in your domain type that mirrors the DTO: make it read-only there.
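A minimal sketch of that layering, in Python for brevity. PersonDto is a hand-written stand-in for a generated message class (an assumption, not real generated code), and Person is the hypothetical domain type that mirrors it but cannot be modified after construction.

```python
from dataclasses import dataclass, FrozenInstanceError


class PersonDto:
    """Stand-in for a generated serialization type: plain, mutable fields."""

    def __init__(self, name="", age=0):
        self.name = name
        self.age = age


@dataclass(frozen=True)
class Person:
    """Domain model: same data as the DTO, but read-only once constructed."""

    name: str
    age: int

    @classmethod
    def from_dto(cls, dto):
        # Map after deserialization.
        return cls(name=dto.name, age=dto.age)

    def to_dto(self):
        # Map before serialization.
        return PersonDto(name=self.name, age=self.age)


person = Person.from_dto(PersonDto(name="Ada", age=36))
try:
    person.age = 37  # the domain type rejects writes
except FrozenInstanceError:
    print("Person is read-only")
```

Only the code that directly touches serialization sees PersonDto; everything above that layer works with Person.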
As for why read-only fields aren't usually a thing in serialization tools: you presumably want to be able to give it a value. Serialization tools usually want to be able to write everything they can read, and read everything they can write.
Minor note for completeness since you mention C#: if you are using a code-first approach with protobuf-net, it'll work fine with {get;}-only auto-props, and with {get;}-only manual props if all public members trivially map to an obvious constructor.

Search methods in FHIR

I'm working on extracting patient info from a FHIR server. However, I've come across two search methods that look somewhat different. What is the difference between the search method of
Bundle bundle = client.search().forResource(DiagnosticReport.class)
.
.
and
GET [base]/DiagnosticReport?result.code-value-quantity=http://loinc.org|2823-3$gt5.4|http://unitsofmeasure.org|mmol/L
It's very confusing, as there doesn't seem to be much documentation comparing these two search methods. Can I achieve the same level of filtering with the first method as with the URL method?
The first is how to perform a search using the Java reference implementation. The latter explains what the actual HTTP query looks like that hits the server (and also specifies some additional search criteria). Behind the scenes the Java code in the first example is actually making an HTTP call that looks similar to the second example. The primary documentation in the FHIR specification deals with the HTTP call. The reference implementations work differently based on which language they are and are documented outside the FHIR specification on a reference implementation by reference implementation basis.
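To illustrate the relationship, here is a small Python sketch of what any client library ultimately has to do: turn a resource type and search parameters into the HTTP query URL. The base URL is a made-up placeholder and build_search_url is a hypothetical helper, not part of any FHIR reference implementation; the parameter name and value are the ones from the question.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_search_url(base, resource_type, params):
    """Build a FHIR-style search URL: [base]/[type]?name=value&..."""
    return f"{base.rstrip('/')}/{resource_type}?{urlencode(params)}"


base = "https://fhir.example.org/baseR4"  # placeholder server (assumption)
params = {
    "result.code-value-quantity":
        "http://loinc.org|2823-3$gt5.4|http://unitsofmeasure.org|mmol/L",
}
url = build_search_url(base, "DiagnosticReport", params)
print(url)

# Decoding the query string gives back the original search parameter.
decoded = parse_qs(urlsplit(url).query)
print(decoded["result.code-value-quantity"][0])
```

The printed URL has the special characters percent-encoded, which is why it looks different from the readable form shown in the specification while carrying the same search.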

Documenting fields in Django Rest Framework

We're providing a public API that we use internally but also provide to our SaaS users as a feature. I have a Model, a ModelSerializer and a ModelViewSet. Everything is functional but it's blurting out the Model help_text for the description in the API documentation.
While this works for some fields, we would like to be a lot more explicit for API users, providing examples, not just explanations of guidance.
I realise I can redefine each field in a Serializer (with the same name, then just add a new help_text argument), but this is pretty boring work.
Can I provide (eg) a dictionary of field names and their text?
If not, how can I intercede in the documentation process to make something like that work?
Also, related, is there a way to provide a full example for each Viewset endpoint? Eg showing what is submitted and returned, like a lot of popular APIs do (Stripe as an example). Or am I asking too much from DRF's documentation generation? Should I handle this all externally?
To override help_text values coming from the models, you'll need to use your own schema generator subclass and override get_path_fields. There you'd be able to prioritize a mapping on the viewset (as you envision) over the model fields help_text values.
On adjusting the example generation - you could define a JSON language which just deals with raw JSON and illustrate the request side of things pretty easily, however, illustrating responses is difficult without really getting deep into the plumbing, as the default schema generated does not contain response structure data.
