openEHR, SNOMED and measurement units - openehr

I'm new to openEHR and SNOMED. I want to store an information pack definition for a tobacco summary. How do I go about storing the measurement units (grams, oz, number of cigarettes)? Is there a reference list of these in either of the standards?
Thanks

Your question should not be about storing, it should be about modeling with openEHR. Storage of openEHR data is a separate issue.
For modeling, you will first need to understand the information model: the structure, the data types, etc. You will find some types that might be useful in your case. For instance, DV_COUNT is for counting (like a number of cigarettes), so it doesn't have units of measure since it is a count. If you want to store a volume or weight, the openEHR information model has DV_QUANTITY. For standard units, as Bert says, you can use UCUM. For non-standard units, you might need to choose a different data type, since the recommendation for DV_QUANTITY.units is to use UCUM (the Unified Code for Units of Measure).
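To make the distinction concrete, here is a minimal sketch of the two data types mentioned above. This is a hedged illustration only: the real openEHR reference model classes carry many more attributes (normal ranges, accuracy, magnitude status, etc.), and the field names below are reduced to the essentials.

```python
from dataclasses import dataclass

# Sketch of openEHR DV_COUNT: a countable quantity with no units.
@dataclass
class DvCount:
    magnitude: int  # e.g. number of cigarettes per day

# Sketch of openEHR DV_QUANTITY: a measured quantity with UCUM units.
@dataclass
class DvQuantity:
    magnitude: float
    units: str  # recommended to be a UCUM code, e.g. "g" for grams

cigarettes_per_day = DvCount(magnitude=15)
tobacco_weight = DvQuantity(magnitude=12.5, units="g")

print(cigarettes_per_day.magnitude)  # 15
print(tobacco_weight.units)          # g
```

The point is simply that the count carries no unit at all, while the quantity pairs a magnitude with a unit string.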
When you have that figured out, you need to follow the openEHR methodology for modeling, using archetypes and templates. A template is the final form of your structure that can be used in software. At that point you can worry about storage.
Storing openEHR data is a solved problem today. There are many solutions, using relational, document, and mixed databases. My implementation, the EHRServer, uses a pure relational approach. But you can create your own: just map the openEHR information model structures to your database of preference, starting from the data types.
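As a toy sketch of that mapping idea, one data type (DV_QUANTITY) could be persisted as a relational table. This is not how the EHRServer actually does it; the table and column names here are made up for illustration, and a real persistence layer would cover the whole reference model.

```python
import sqlite3

# Map a single openEHR data type (DV_QUANTITY) to a relational table.
# Hypothetical schema: one row per quantity, magnitude + UCUM units.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dv_quantity (id INTEGER PRIMARY KEY, magnitude REAL, units TEXT)"
)
conn.execute(
    "INSERT INTO dv_quantity (magnitude, units) VALUES (?, ?)", (12.5, "g")
)
magnitude, units = conn.execute(
    "SELECT magnitude, units FROM dv_quantity"
).fetchone()
print(magnitude, units)  # 12.5 g
conn.close()
```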
And of course, start with the openEHR specs: https://www.openehr.org/programs/specification/workingbaseline
BTW, SNOMED doesn't play any role here; I'm not sure why you mentioned it in the title. You need to understand the standards before trying to implement them.

openEHR used to have its own unit list from which you would choose a unit for a DvQuantity, but recently the newest version of the specs states that you must use a unit from the UCUM standard. Check the description of the data types in the specifications.
You can find the UCUM standard here. The link is published by the Regenstrief Institute (the same institute that maintains the LOINC standard), so it is stable.
http://unitsofmeasure.org/ucum.html
There is a Go UCUM library:
https://github.com/BertVerhees/ucum
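For the units in the original question, plausible UCUM codes might look like the following. This is a hedged sketch: "g" and "[oz_av]" are standard UCUM codes for gram and avoirdupois ounce, and "{cigarettes}" uses UCUM's curly-brace annotation syntax, which attaches a human-readable label to the dimensionless unit 1. Verify each code against the UCUM specification before relying on it.

```python
# Candidate UCUM codes for the tobacco-summary units in the question.
TOBACCO_UNITS = {
    "grams": "g",                           # UCUM code for gram
    "ounces (avoirdupois)": "[oz_av]",      # UCUM code for avoirdupois ounce
    "number of cigarettes": "{cigarettes}", # UCUM annotation on unit 1
}

for label, code in TOBACCO_UNITS.items():
    print(f"{label}: {code}")
```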

Related

Clarify steps to add a language variant to Stanza

I would like to add a non-standard variant of a language already supported by Stanza. It should be named differently from the standard variety included in the common distribution of Stanza. I could use a modified corpus for training, since the changes are mostly morphological rather than syntactic, but how many steps would I need to take to make a new language variety for Stanza from this background? I don't understand which data are input and which are output in the process of adding a new language in the web documentation.
It sounds like you are trying to add a different set of processors rather than a whole new language. The difference being that other steps of the pipeline will still work the same, right? NER models, for example.
If that's the case, if you can follow the steps to retrain the current models, you should be able to then replace the input data with your morphological updates.
I suggest filing an issue on GitHub if you encounter difficulties in the process. It will be a lot easier to go back and forth there.
Times when we would actually recommend a whole new language are when 1) it's actually a new language or 2) it uses a different character set - think different writing systems for ZH, or for Punjabi if we had any Punjabi models.

How do I access h2o xgb model input features after saving a model to disk and reloading it?

I'm using h2o's xgboost implementation in Python. I've saved a model to disk and I'm trying to load it later for analysis and prediction. I want to access the input feature list or, even better, the feature list actually used by the model, which excludes the features it decided not to use. The usual advice is to use the varimp function to get the variable importance; while this does remove features that aren't used in the model, it actually gives you the importance of the intermediate features created by one-hot encoding (OHE) the categorical features, not the original categorical feature names.
I've searched for how to do this and so far I've found the following but no concrete way to do this:
Someone asking something very similar to this and being told the feature has been requested in Jira
Said Jira ticket, which has been marked resolved, but as I read it the feature was implemented without being customer-visible.
A similar ticket requesting this feature (original categorical feature importance) for variable importance heatmaps but it is still open.
Someone else who found an unofficial way to access the columns with model._model_json['output']['names'], but that doesn't exclude the features the model didn't use; they were told to use a different method that doesn't work if you have saved the model to disk and reloaded it (which I am doing).
The only option I see is to take the varimp features, split each on the period character to break up the OHE feature names, take the first part of each split, and then run a set over everything to get the unique column names. But I'm hoping there's a better way to do this.
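That workaround can be sketched in a few lines. The feature names below are made up for illustration, and this assumes H2O's "column.level" naming convention for one-hot-encoded features (a level name containing a period would break the split, which is one reason this approach is fragile):

```python
# Recover original categorical column names from OHE varimp feature names.
# Hypothetical varimp output: numeric columns appear as-is, categorical
# columns appear once per level as "column.level".
varimp_features = [
    "age",
    "income",
    "state.CA",
    "state.NY",
    "job_title.engineer",
]

# Split each name on the first period and deduplicate with a set.
original_columns = sorted({name.split(".", 1)[0] for name in varimp_features})
print(original_columns)  # ['age', 'income', 'job_title', 'state']
```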

Internationalisation - displaying gendered adjectives

I'm currently working on an internationalisation project for a large web application - initially we're just implementing French but more languages will follow in time. One of the issues we've come across is how to display adjectives.
Let's take "Active" as an example. When we received translations back from the company we're using, they returned "Actif(ve)", as English "Active" translates to masculine "Actif" or feminine "Active". We're unsure of how to display this, and wondered if there are any well established conventions in the web development world.
As far as I see it there are three possible scenarios:
We know at development time which noun a given adjective is referring to. In this case we can determine and use the correct gender.
We're referring to a user, either directly ("you") or in the third person. Short of making every user have a gender, I don't see a better approach than displaying both, i.e. "Actif(ve)"
We are displaying the adjective in isolation, not knowing which noun it's referring to. For example in a table of data, some rows might be dealing with a masculine entity, some feminine.
Scenarios 2 and 3 seem to be the toughest ones. Does anyone have any experience handling these issues? Any tips would be appreciated!
This is complex, because we cannot imagine all the cases, and there is a risk of drifting into opinion-based answers, so I'll keep it short and generic.
Usually I prefer to give the translator context, e.g. by providing a template: _("active {user_name}") (this way the ordering will also be correct in languages that want a different ordering).
Then you may need to change code and template into _("active {first_name_feminine}") and _("active {first_name_masculine}") (and possibly more for duals, trials, plurals, collectives, honorifics, etc.). Note: check that the translator will not mangle the {} and the string inside; usually you need specific export/import scripts. Alternatively, I add a note for the translator inside the string, and quickly strip the note out when producing the English text. This can also be automated (be creative in using special Unicode characters which should not appear in normal text to delimit such notes).
But if you cannot know the gender, "Actif(ve)" may be the accepted polite form in that language. You need a native speaker to test, and some changes back and forth.
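The message-id-per-gender approach above can be sketched as follows. In a real application these strings would live in gettext catalogs; the dict here merely stands in for the translation lookup, and the key names are made up:

```python
# Hypothetical translation catalog: one message id per grammatical
# gender, plus a combined polite fallback for unknown gender.
TRANSLATIONS = {
    "active_masculine": "Actif",
    "active_feminine": "Active",
    "active_unknown": "Actif(ve)",
}

def translate_active(gender=None):
    """Pick the gendered form when known, else the polite combined form."""
    if gender in ("masculine", "feminine"):
        return TRANSLATIONS[f"active_{gender}"]
    return TRANSLATIONS["active_unknown"]

print(translate_active("feminine"))  # Active
print(translate_active())            # Actif(ve)
```

This keeps the fallback decision in one place, so scenario 3 (adjective in isolation) degrades gracefully to the combined form.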

What is the difference between GraphQL and SPARQL?

I'm doing a lot of research right now on Semantic Web and complex data models that represent relationships between individuals and organizations.
I know a little about semantic ontologies, although I never understood what they are used for other than making graphs.
I saw on a university wiki that the language used to query an ontology is SPARQL (tell me if I'm wrong).
But recently I saw that a company which had created a semantic ontology exposed it in the form of GraphQL, which I did not know (https://diffuseur.datatourisme.gouv.fr/graphql/voyager/).
It seems to me that semantic ontologies are made to find information more easily, for example to build a chatbot (which is what I want to do), but here they turned a semantic ontology into an API - is that right? To build a GraphQL API, should I first build a semantic ontology?
Could you explain the difference between all this? Honestly, it's a little vague to me.
Context
Datatourisme is a platform that allows publishing (via the Producteur component) and consuming (via the Diffuseur component) POI-related open data.
It seems you have linked to a particular application developed on top of Diffuseur with the help of GraphQL Voyager. The application illustrates the capabilities of the GraphQL API exposed by Diffuseur.
The API documentation is available here (in French):
datatourisme.frama.io/api
framagit.org/datatourisme/api
Problem
Datatourisme stores data in the RDF format (presumably using the Blazegraph triplestore)
Datatourisme provides access via GraphQL (not SPARQL)
Why RDF
Partly because it is somewhat schemaless, RDF is convenient in heterogeneous data integration tasks:
Tourism national ontology structuring in a common sharing format the whole tourism data extracted from different official French databases: entertainment and events, natural and cultural sites, leisure and sports activities, touristic products, tours, accommodations, shops, restaurants.
RDF is semantic: in particular, RDF is self-describing.
SPARQL
SPARQL is the W3C-standardized language for querying RDF. Other RDF query languages have also been proposed.
BTW, it is possible to query non-RDF sources with SPARQL, e.g. by defining R2RML mappings.
RDF's self-describing nature and SPARQL's standardization remove the need to create or learn a new (often poorly designed) API every day.
GraphQL
Similar to SPARQL, GraphQL allows you to avoid multiple round trips.
GraphQL can wrap different data sources of different kinds, but typically these are REST APIs.
As you can see, it is possible to wrap a SPARQL endpoint (there is also HyperGraphQL).
Why GraphQL
Why did Datatourisme prefer GraphQL?
GraphQL is closer to developers and the technologies they use en masse. In the past, JSON-LD had the same motivation (however, see my note about JSON-LD here).
It seems that Diffuseur's GraphQL layer provides API-key support and prevents overly complex SPARQL queries.
Are data still semantic
The answer depends on what you mean by semantic. There is an opinion that even the relational model is quite semantic...
I'd answer affirmatively if it were possible to extract, e.g., the comment on the :rcs property with GraphQL (and the answer seems to be negative).
Conclusion
Answering your direct question:
it is not necessary (though possible) to create a semantic ontology first in order to use GraphQL;
it is not necessary (though possible) to use GraphQL after creating a semantic ontology.
Answering your indirect question:
probably you need a semantic ontology in order to build such chatbot;
probably you need something else in addition.
See also: How is the knowledge represented in Siri – is it an ontology or something else?
Update
In addition to HyperGraphQL, there are other interesting convergence projects:
in Stardog
in Topbraid
in Ontotext Platform
GraphQL and SPARQL are different languages for different purposes. SPARQL is a language for working with triple stores, graph datasets, and RDF nodes. GraphQL is an API language, preferably for working with JSON structures. As for your specific case, I would recommend clarifying your goal in using AI in your application. If you need to apply a graph dataset in your application and perform more advanced knowledge discovery, such as reasoning over the dataset, then you may need a Semantic Web approach with SPARQL on top of your dataset. The Semantic Web stack provides different layers for knowledge discovery and reasoning, through ontology design and RDF-izing datasets.
see here to read more.
If your AI application does not have such requirements and you can accomplish your data analysis using a JSON-based database, GraphQL is probably a good choice for creating your API, as it is widely used by Web and mobile applications these days. In particular, it can be used to share your data across different platforms and microservices. See here for more information.
Briefly, the differences are:
SPARQL (SPARQL Protocol and RDF Query Language) is a language dedicated to querying RDF graph databases (CRUD and more). It is a standard within Semantic Web tooling, defined by a W3C Recommendation.
GraphQL is a language created by Facebook, strongly resembling JSON, for communicating with APIs. It is a communication tool between clients and server endpoints; the request itself defines the structure of the answer. Its use is not limited to SQL or NoSQL databases.
Here, "Graph" does not mean "a structure made of triples" as it does for RDF.
These are two different languages for different applications.
A very important difference, which I didn't see mentioned in the previous answers, is that while SPARQL is the more powerful query language in general, its SELECT queries produce only tabular output, whereas GraphQL returns tree structures, which matters in some implementations.
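The shape difference can be illustrated with plain data structures. The bindings below are invented for illustration: a SPARQL SELECT result is a flat table of variable bindings (one row per solution, so the author repeats), while a GraphQL response nests related objects into a tree.

```python
import json

# Flat, SPARQL-SELECT-style result: one row per (author, book) binding.
sparql_style_rows = [
    {"author": "Alice", "book": "RDF Basics"},
    {"author": "Alice", "book": "SPARQL in Action"},
]

def rows_to_tree(rows):
    """Regroup flat bindings into the nested shape a GraphQL API returns."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["author"], []).append(row["book"])
    return {"authors": [{"name": a, "books": b} for a, b in grouped.items()]}

# Tree-shaped, GraphQL-style result: each author appears once,
# with their books nested underneath.
graphql_style = rows_to_tree(sparql_style_rows)
print(json.dumps(graphql_style, indent=2))
```

Client code consuming the tree form doesn't need to re-group rows itself, which is part of GraphQL's appeal for UI development.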

Inline data representation

I would like to represent data that gives an overview but allows users to drill down inline - so if you had a grouping of, say, 6 objects, the user could expand the data and it would show the 6 objects immediately below it before any more high-level data.
It would appear that MSHFlexGrid gives this ability, but I can't find any information about actually using it or what its limitations are (can you have differing numbers of fields, can they have different spacing, what about column headers, indentation at the start, etc.).
I found this site, but the images are broken (in IE8 and FF3.5). Google searches show people just using the flat data representation, but nothing using the hierarchical properties. Does anyone know any good tutorials or forums with a good discussion of the pitfalls?
Due to the lack of information about using it, I am thinking of coding my own version, but if anyone has done work in this area I haven't found it - I would have thought it would be a natural wish for data representation. If someone has coded a version of this (in any language) I wouldn't mind reading about it - maybe my idea of how to do it wouldn't be the best way.
You might want to check out vbAccelerator. He has a Multi-Column Treeview control that sounds like what you may be looking for. He gives you the source and has some pretty decent samples.
The MSHFlexGrid reference pages and the "using the MSHFlexGrid" topic in the Visual Basic manual?
Sorry if you've already looked at these!
