Should I use UUID in GraphQL? - graphene-python

While I was reading the official GraphQL documentation, I noticed that GraphQL uses UUIDs. When I Googled, though, most examples on the internet were fetching data using database-local IDs combined with type names. I have no idea why the usages differ.
Only a handful of documents mention that it is best to base64-encode and -decode the type:ID pair. Is that true? Django models use integers for primary keys, and those need to be serialized in some way to become UUIDs. I am planning to use BigInt primary keys in a few of my Django models. Will that approach still work for BigInt primary keys?
I found one post mentioning that I could actually use UUIDs at the DB level, but it seems there is a performance issue: https://blog.hasura.io/graphql-and-uuid-type-on-postgres-767f016479e9/
Should I use UUIDs at the DB level, just use base64 to encode and decode type:id, or neither and simply expose the local ID? Even if I use a UUID, it is still exposed to clients if I follow the examples on the internet, isn't it? And why do those examples use local IDs, in contrast to the official documentation?
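For reference, graphene's Relay Node interface builds its global IDs with the graphql-relay helpers, and the encoding is nothing more than base64 over "TypeName:localId", so the same mechanism covers integer, BigInt, and UUID primary keys alike. A minimal sketch (the Book type name and the ID value are placeholders):

import base64
from graphql_relay import to_global_id, from_global_id

# A Relay global ID is just base64("TypeName:localId"); nothing about it
# depends on the width or type of the underlying primary key.
global_id = to_global_id("Book", 9007199254740993)  # a BigInt-sized PK
print(base64.b64decode(global_id))  # b'Book:9007199254740993'

node_type, local_id = from_global_id(global_id)
print(node_type, local_id)  # Book 9007199254740993

Note that this is encoding, not encryption: anyone can base64-decode a Relay ID, so it hides nothing from clients. It only makes IDs globally unique across types and opaque by convention.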

Related

Defining a schema that will be compatible between multiple databases and different conventions

I want to define one schema that will be valid across teams & platforms. This is pretty simple and can be thought of as a kind of ontology. What I need is the ability to define what each field represents and, under it, the name of that field on each platform. I'd like the schema to be able to generate data objects for each of the languages in use, and therefore I'd like to know whether my need can be met by Protobuf or GraphQL. Notice: my conventions can differ from the trivial ones of my generated target language, since they need to be compatible with the databases. A simple example of my need:
{
    "lastName": {
        "mssqlName": "LastName",
        "oracleName": "FamilyName",
        "elasticName": "lastName",
        "cassandraName": "last_name",
        "rocksDbName": "surname"
    },
    "age": {
        ...
    }
}
As you can see, on some platforms I have totally different names than on the others. I'd like to know what the usual ways/technologies to solve this problem are, and whether it is possible with codegen-able technologies like Proto & GraphQL.
A single schema as the single point of truth for all object / message definitions across databases, comms links, multiple languages and platforms? It would be nice, wouldn't it?
The closest I can think of is XSD (XML Schema), but I don't think it holds up when it comes to tools. For example, I know of tools that will take an XSD schema and generate code that will serialise / deserialise objects to / from XML (e.g. Microsoft's xsd.exe). There are even some good ones.
And then there are tools that will create SQL tables from that XSD schema. But a code generator that builds classes to access those tables isn't also building them to serialise / deserialise objects to and from an XML wireformat.
Basically, I've not come across a schema language that has tooling that does everything. The ASN.1 tools are very good at creating serialisation classes, but I've never found one that also targets SQL interactions. Same with XSD.
My knowledge is of course not exhaustive, and there might be something in JSON-land that works.
Minimum Pain Compromise Approach
What I have settled on in the past is to accept that I'll have to do some manual coding around changes in schema, but probably not too much. I'd define messages fully in, say, Google Protocol Buffers, and use that for object exchange between applications / languages. Where I wanted to stash objects in a database, I'd accept a parallel definition of the object in the table columns, but only for the critical fields I'd want to search on. The last column would be an arbitrary container, able to store the serialised object whole.
For example, if a GPB message had an integer ID field and a string Name field, plus a bunch of other fields, my database table would then have an ID column, a Name column, and a column for storing bytes.
That way I could serialise an object and push it into a row's Bytes column whilst also filling in the ID and Name columns. I could quickly search for objects because of the Name / ID columns. If I then wanted access to the other fields of an object stored in the database, I'd have to retrieve the record and deserialise the Bytes column.
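A minimal sketch of that pattern in Python, using sqlite3; the person_pb2 module stands in for any GPB-generated code and is entirely hypothetical:

import sqlite3
from person_pb2 import Person  # hypothetical GPB-generated module

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, body BLOB)")

# Real columns only for the searchable keys; everything else rides along
# inside the serialised blob.
p = Person(id=1, name="Ada")  # plus many other fields in practice
conn.execute("INSERT INTO people VALUES (?, ?, ?)",
             (p.id, p.name, p.SerializeToString()))

# Fast lookup on the key column, then rehydrate the whole object.
row = conn.execute("SELECT body FROM people WHERE name = ?", ("Ada",)).fetchone()
found = Person()
found.ParseFromString(row[0])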
This way one is essentially taking a bet that those key columns / field names (ID, Name) won't ever be changed in the schema during development. But it's quite likely a safe bet: generally one can settle things like that quite easily, early on in a project; it's the rest of the schema that might change during development.
One small payoff is that if the reason to hunt out an object in the database is to be able to send it through a communications channel, it is already serialised in the database. No need to serialise it again before dispatch down the comms link.
So this approach can leave one with some duplication of code / points of truth, but can be quite performant in avoiding a serialisation step during parts of runtime.
You can also cheat a little. If the serialisation wireformat is text based (JSON, XML, some ASN.1 formats, etc), then there's a good chance that string searches on the bytes column will yield good results anyway. For instance, suppose a message field was MiddleName, but I'd not created that as a distinct table column in the database. I could find likely records for any given MiddleName by searching for the value in the Bytes column, as it's stored as text somewhere in there.
Reflection Based Approach?
A potential other approach is to accept that the tooling does not exist to satisfy all needs, and adapt using language features (reflection) to exploit a common feature of code generators.
For example, consider GPB's proto compiler. In the generated code you end up with classes whose members are named after the fields in messages. And it'll be more or less the same with any code generated to access a database table that has columns by the same name.
So it is possible to use reflection to make an auto-transcriber between generated classes. You iterate down the tree of members in one class, and you can match that up to a member in a different generated class.
This avoids the need for code like:
Protobuf::MyClass myObj_g; // an object built using GPB
JSON::MyClass myObj_j;     // equivalent object, to be copied from myObj_g
myObj_j.Field1 = myObj_g.Field1;
myObj_j.Field2 = myObj_g.Field2;
// ... one assignment per field, by hand
Instead:
Protobuf::MyClass myObj_g; // an object built using GPB
JSON::MyClass myObj_j;     // equivalent object, to be copied from myObj_g
foreach (Protobuf::MyClass::Reflection::Field field in Protobuf::MyClass.Fields)
{
    myObj_j.Reflection.FindByName(field.Name) = myObj_g.Reflection.FindByName(field.Name);
}
There'd be a fair bit of fiddling around to get this to work between each database and serialisation technology, per language, but the point is you'd only ever have to write it once. Subsequent schema changes do not require code changes, at least not so far as exchanging objects between a serialisation technology and a database access technology goes.
Obviously, reflection is easier / possible in some languages and not in others.
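In a language where reflection is cheap, the transcriber really is only a few lines. A sketch in Python using protobuf's ListFields reflection (person_pb2 is the same hypothetical generated module as above, and JsonPerson stands in for a class from some other code generator):

from person_pb2 import Person  # hypothetical GPB-generated module

def transcribe(src_msg, dst_obj):
    # Copy every populated protobuf field onto dst_obj, matched by name.
    for field, value in src_msg.ListFields():
        setattr(dst_obj, field.name, value)

class JsonPerson:  # stand-in for a class from another code generator
    pass

src = Person(id=1, name="Ada")
dst = JsonPerson()
transcribe(src, dst)  # no per-field code; schema changes need no edits here
print(dst.id, dst.name)  # 1 Ada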
The Fix It At Runtime Approach?
Apache Avro has the characteristic that serialised data describes its own shape. Basically, wireformat data comes with its own schema, so a consumer can build a representation of the data automatically. In some languages that's horrid (C, C++), but libraries exist.
Basically, it forces you to write applications so that they work out what to do with the data for themselves.
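A small Python sketch of that self-describing property, using the fastavro library (the record shape is made up for illustration):

import io
import fastavro

schema = {
    "type": "record", "name": "Person",
    "fields": [{"name": "name", "type": "string"},
               {"name": "age", "type": "int"}],
}

# Write an Avro container file: the schema travels inside the payload.
buf = io.BytesIO()
fastavro.writer(buf, schema, [{"name": "Ada", "age": 36}])

# A consumer needs no prior knowledge: it recovers the schema at runtime.
buf.seek(0)
reader = fastavro.reader(buf)
print(reader.writer_schema)  # the shape, discovered from the data itself
for record in reader:
    print(record)  # {'name': 'Ada', 'age': 36}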

Create subsets for certain Resources to better fit existing data model?

We are trying to implement a FHIR REST server for our application. In our current data model (and thus our live data) several FHIR resources are represented by multiple tables; e.g., what would all be Observations are stored in separate tables for vital values, laboratory values and diagnoses. Each table has an independent, auto-incrementing primary ID, so there are entries with the same ID in different tables. But for GET or DELETE calls to the FHIR server a unique ID is needed. What would be the most sensible way to handle this?
Searching didn't reveal an inherent way of doing this, so I'm considering these two options:
Add a prefix to all (or just the problematic) table IDs, e.g. lab-123 and vit-123
Add a UUID to every table and use that as the logical identifier
Both have drawbacks: the first needs an ID parser, and the second requires multiple database calls to identify the correct record.
Is there a FHIR way to split a resource into several sub-resources, even in the REST URL? Ideally I'd get something like GET server:port/Observation/laboratory/123
Server systems will have all sorts of different divisions of data in terms of how it is stored internally. What FHIR does is provide an interface that tries to hide those variations. So Observation/laboratory/123 would go against what we're trying to do, because every system would have different divisions and it would be very difficult to get interoperability happening.
Either of the options you've proposed could work. I have a slight leaning towards the first, because it doesn't involve changing your persistence layer and it's a relatively straightforward transformation to convert between the external/FHIR ID and the internal one.
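That transformation can be a couple of small functions. A hypothetical Python sketch (the prefixes and table names are invented):

TABLES = {"lab": "laboratory_values", "vit": "vital_values", "dia": "diagnoses"}

def to_internal(fhir_id):
    # 'lab-123' -> ('laboratory_values', 123)
    prefix, _, pk = fhir_id.partition("-")
    if prefix not in TABLES or not pk.isdigit():
        raise ValueError("unknown resource id: %s" % fhir_id)
    return TABLES[prefix], int(pk)

def to_external(prefix, pk):
    # ('lab', 123) -> 'lab-123'
    return "%s-%d" % (prefix, pk)

print(to_internal("lab-123"))   # ('laboratory_values', 123)
print(to_external("vit", 123))  # 'vit-123'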
"Is there a FHIR way to split a resource into several sub-resources, even in the REST URL? Ideally I'd get something like GET server:port/Observation/laboratory/123"
What would this mean for search? What would /Observation?code=xxx search through? Would it search labs, vitals etc. combined, or would you just allow access on /Observation/laboratory?
If these are truly "silos", maybe you could use http://servername/lab/Observation (i.e. swap the last two path parts), which suggests your server has multiple "endpoints" for the different kinds of observations. I think more clients will be able to handle that URL than the one you suggested.
Still, I think the best choice is one of your two original options, of which the first is indeed the easiest to implement.

Purpose and implementation of the json field type in the Laravel schema builder

What is the purpose of $table->json('options'); as a field type in the Laravel database schema builder? I tried searching hard but couldn't find any relevant info on it. Could someone please explain its purpose, with an example?
Some database engines - PostgreSQL being a major example - have JSON-friendly data types (which MySQL currently lacks - it'll just store the value as a TEXT data type there). This can be handy for working with data (like the options example you cite) that might contain a large amount of schema-less or loosely-structured data.
http://www.postgresql.org/docs/9.4/static/datatype-json.html
http://www.postgresql.org/docs/9.3/static/functions-json.html
Instead of having 100+ columns for a bunch of on/off options for a model, you could store them in a JSON object in the database.
Sometimes it is useful, even with MySQL, to store data as JSON.
If you are building an application with user settings and you only require a handful of them, a few columns in your users or settings table will do the trick nicely. But what about when you have dozens and dozens of configuration options? In such cases, you might consider encoding them as JSON and saving the result to a single column.
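The idea is the same in any stack; a minimal sketch in Python with sqlite3 (the table, column and option names are made up):

import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, options TEXT)")

# Dozens of loosely-structured options collapsed into one JSON column
# instead of one column per option.
options = {"newsletter": True, "theme": "dark", "items_per_page": 50}
conn.execute("INSERT INTO users (id, options) VALUES (?, ?)",
             (1, json.dumps(options)))

row = conn.execute("SELECT options FROM users WHERE id = 1").fetchone()
print(json.loads(row[0])["theme"])  # 'dark'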

couchdb validation based on content from existing documents

QUESTION
Is it possible to query other CouchDB documents as part of a standard CouchDB validation function?
If not, what is the standard approach for including properties of other documents as part of a validation rule inside a CouchDB validation function?
RATIONALE
Consider a run-of-the-mill address book application where the validation function is intended to prevent two or more entries having the same value in the 'e-mail' field of an address book entry.
Consider also an address book application where it is possible to specify validation rules in separate documents, based on whether the postal code is a US-based postal code or something else.
No, it is not possible to query other CouchDB documents in a validate_doc_update function. Each one runs in isolation, being passed only the new document, the old document, and the user context (where applicable).
My personal experience has been there are at least three options for dealing with duplicate checking:
Use Cloudant as your CouchDB provider. They offer a free tier for now if you'd like to experiment, and they guarantee consistency across nodes for a CouchDB database. (See #2.)
I've used a secondary "reserve table" for names, using the type-key as the document ID. Basically, a simple document holds the key to prevent duplicates (a sketch of this reservation pattern follows this list). If you're not using a system like Cloudant, you then need to check for conflicts. It's not fun code to write, given that you need to watch for conflicts. (Even with Cloudant you need to deal with failed write requests, but that's easier than dealing with the timing issues surrounding data replication across multiple nodes.)
Use a traditional DB, MySQL for example, that can maintain a unique and consistent index for specific data values like the ones you're describing, and store the documents themselves in CouchDB. Needing two data providers is slightly annoying, but it's reliable.
(Optional: decide that CouchDB isn't a great fit for the type of system you're building)
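A hypothetical sketch of the reservation pattern from option 2, speaking CouchDB's plain HTTP API from Python (the database name, URL, and ID scheme are all invented):

from urllib.parse import quote
import requests

COUCH = "http://localhost:5984/email_reservations"  # placeholder URL / db

def reserve_email(email, contact_id):
    # The document _id itself is the uniqueness constraint: CouchDB refuses
    # (409 Conflict) to create a second document with the same _id.
    resp = requests.put("%s/email:%s" % (COUCH, quote(email, safe="")),
                        json={"contact": contact_id})
    if resp.status_code == 201:
        return True   # reservation written; the address is ours
    if resp.status_code == 409:
        return False  # another document already holds this key
    resp.raise_for_status()

if reserve_email("ada@example.com", "contact-42"):
    print("safe to save the address book entry")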

Serializing a MongoDB grid ID into a string using ActiveRecord

In my Sinatra app, I'm using MongoDB with Grid to store book covers on Heroku. I want to be able to associate these with the books in my ActiveRecord-driven primary database. Currently, I download the image from Google Books, store it in MongoDB, and store the BSON::ObjectID object in the database as a string.
When I go to retrieve the image, however, grid won't accept this string as a way to get the file.
Is there a better way for me to store this information or a better way for me to associate data between the two databases?
A friend helped me with this one: the stored value was a YAML string, so calling YAML::load on the string from the database did the trick.
