How do I avoid n+1 queries with Spring Data REST?

Question. How do I avoid n+1 queries with Spring Data REST?
Background. When querying Spring Data REST for a list of resources, each of the resulting top-level resources has links to the associated resources, as opposed to having the associated resources embedded directly in the top-level resources. For example, if I query for a list of data centers, the associated regions appear as links, like this:
{
"links" : [ {
"rel" : "self",
"href" : "http://localhost:2112/api/datacenters/1"
}, {
"rel" : "datacenters.DataCenter.region",
"href" : "http://localhost:2112/api/datacenters/1/region"
} ],
"name" : "US East 1a",
"key" : "amazon-us-east-1a"
}
It is pretty typical, however, to want to get the associated information without having to do n+1 queries. To stick with the example above, I might want to display a list of data centers and their associated regions in a UI.
What I've tried. I created a custom query on my RegionRepository to get all the regions for a given set of data center keys:
@RestResource(path = "find-by-data-center-key-in")
Page<Region> findByDataCentersKeyIn(
@Param("key") Collection<String> keys,
Pageable pageable);
Unfortunately the links this query generates don't overlap with the links that the data center query above generates. Here are the links I get for the custom query:
http://localhost:2112/api/regions/search/find-by-data-center-key-in?key=amazon-us-east-1a&key=amazon-us-east-1b
{
"links" : [ ],
"content" : [ {
"links" : [ {
"rel" : "self",
"href" : "http://localhost:2112/api/regions/1"
}, {
"rel" : "regions.Region.datacenters",
"href" : "http://localhost:2112/api/regions/1/datacenters"
}, {
"rel" : "regions.Region.infrastructureprovider",
"href" : "http://localhost:2112/api/regions/1/infrastructureprovider"
} ],
"name" : "US East (N. Virginia)",
"key" : "amazon-us-east-1"
}, {
"links" : [ {
"rel" : "self",
"href" : "http://localhost:2112/api/regions/1"
}, {
"rel" : "regions.Region.datacenters",
"href" : "http://localhost:2112/api/regions/1/datacenters"
}, {
"rel" : "regions.Region.infrastructureprovider",
"href" : "http://localhost:2112/api/regions/1/infrastructureprovider"
} ],
"name" : "US East (N. Virginia)",
"key" : "amazon-us-east-1"
} ],
"page" : {
"size" : 20,
"totalElements" : 2,
"totalPages" : 1,
"number" : 1
}
}
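For reference, the bulk request above replaces n per-data-center requests with a single one by repeating the key parameter once per data center. Below is a minimal client-side sketch of how that URL could be assembled. The class and method names (RegionSearchUrl, forDataCenterKeys) are hypothetical and not part of Spring Data REST; only the path and the key parameter come from the request shown above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: builds the bulk search URL for a set of data center keys.
class RegionSearchUrl {

    static String forDataCenterKeys(String baseUrl, Collection<String> keys) {
        // One "key=..." pair per data center key, joined with '&'.
        String query = keys.stream()
                .map(k -> "key=" + URLEncoder.encode(k, StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
        return baseUrl + "/regions/search/find-by-data-center-key-in?" + query;
    }

    public static void main(String[] args) {
        System.out.println(forDataCenterKeys(
                "http://localhost:2112/api",
                List.of("amazon-us-east-1a", "amazon-us-east-1b")));
    }
}
```

This only addresses the request side; matching the returned regions back to their data centers still depends on the link problem described next.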
The challenge seems to be that the data center query returns links that aren't particularly informative once you already understand the shape of the data. For example, I already know that the region for data center 1 is at /datacenters/1/region, so if I want actual information about which specific region is involved, I have to follow the link to get it. In particular I have to follow the link to get the canonical URI that shows up in the bulk queries that would allow me to avoid n+1 queries.

The reason Spring Data REST works like this is the following: by default, we assume every application repository to be a primary resource of the REST service. Thus, if you expose a repository for an entity's related object, you get links rendered to it, and we expose the assignment of one entity to another via a nested resource (e.g. foo/{id}/bar).
To prevent this, annotate the related repository interface with @RestResource(exported = false), which prevents the entities managed by this repository from becoming top-level resources.
The more general approach is to start by letting Spring Data REST expose the resources you want it to manage, with the default rules applied. You can then customize the rendering and links by implementing ResourceProcessor<T> and registering your implementation as a Spring bean. The ResourceProcessor then lets you customize the data rendered, the links added to the representation, etc.
For everything else, manually implement controllers (potentially blending into the URI space of the default controllers) and add links to those through ResourceProcessor implementations. An example of this can be seen in the Spring RESTBucks sample. The sample project uses Spring Data REST to manage Order instances and implements a custom controller for the more complex payment process. Beyond that, it adds a link to the Order resource that points to the manually implemented code.

Spring Data REST will only create the representation you describe if the serializer that is configured inside the Jackson ObjectMapper is triggered by seeing a PersistentEntityResource, which is a special kind of Resource that is used inside Spring Data REST.
If you create a ResourceProcessor<Resource<MyPojo>> and return a new Resource<MyPojo>(origResource.getContent(), origResource.getLinks()), then the default Spring Data REST serialization machinery will not be triggered and Jackson's normal serialization rules will apply.
Note, however, that Spring Data REST handles associations the way it does because it's very difficult to arbitrarily stop traversing an object graph when serializing to JSON. Handling associations as links guarantees that the serializer won't start traversing an object graph that is N levels deep, which would slow down both the serialization itself and the transfer of the resulting representation over the wire.
Ensuring that Jackson does not try to serialize a PersistentEntityResource, which is what it's doing in the default configuration, will ensure that none of the Spring Data REST handling of associations is triggered. The down side to this, of course, is that none of Spring Data REST's helpers will be triggered. If you still want links to the associated resources, you'll have to make sure you create those yourself and add them to the outgoing plain Resource.

Related

FHIR: Extending the Basic resource with extensions

I'm an absolute FHIR newbie and I'm trying to create a set of StructureDefinitions and examples for an upcoming medical project.
For this project, we need a very specific resource, which is not supported by any FHIR resource yet. Here's our use case:
We are placing sensors on our Patients while they execute certain exercises (e.g. a leg squat). We capture the sensor measurements and, based on those, assign a pre-calculated bio-mechanical body model to each individual Patient. Those body models are calculated and assigned somewhere else in our system (this process is not relevant here). As a first step, I would like to add all the pre-calculated body models themselves to our FHIR dataset as resources, so that I'm able to output all existing body models in our system.
Such a body model consists of a unique identifier, a human-readable title and a set of attributes which describe the body model. The crucial part is the attributes: they may vary for each body model and we don't know the set of possible attributes beforehand, hence I need a dynamic format representing the key and value of each attribute. If I were to represent this in a simple JSON structure, it would look as follows:
{
"id": "0",
"title": "SAMPLE_BODY_MODEL",
"attributes": [
{
"key": "ATTRIBUTE_1",
"value": "EXAMPLE_1"
},
{
"key": "ATTRIBUTE_2",
"value": "EXAMPLE_2"
}
]
}
My goal now is to create a StructureDefinition corresponding to the custom resource I've described above.
Hence I looked up the topic of "custom resources" and found this article on the HL7 site: https://hl7.org/fhir/basic.html - explaining that the Basic resource should be used for custom resources.
So I went ahead and tried to create a Basic resource and extend it:
{
"resourceType": "StructureDefinition",
...
"type": "Basic",
"differential": {
"element": [
{
"id" : "Basic",
"path": "Basic",
"definition": "This element describes a general body model captured during an exercise or a movement, e.g. whilst doing leg squats."
},
{
"id" : "Basic.id",
"path": "Basic.id",
"definition": "ID of the body model"
},
{
"id": "Basic.extension:title",
"path": "Basic.extension",
"sliceName": "definition",
"definition": "Title of the body model",
"min": 0,
"max": "1",
"type": [
{
"code": "string" // I know that's wrong, but I somehow would like to restrict this to a string only
}
]
},
{
"id": "Basic.extension:attributes",
"path": "Basic.extension",
"sliceName": "attributes",
"definition": "Attributes of the body model",
// This is where I'm stuck - how do I define this to be a list of objects consisting of attributes key and value?
}
]
}
}
To sum it all up: How do I create a new StructureDefinition from a basic resource allowing me to specify a new required attribute named "attributes", which consists of one-to-many elements, which again contain the attributes key and value for the key and value of the body model attributes?
Hope this makes sense - otherwise please feel free to let me know and I'll try to rephrase my question.
Many thanks in advance!
First, for a newbie, you're doing really well :) (And nice job on framing the question well too!)
Your first extension slice has a few issues:
sliceName should be "title", not "definition" - essentially the 'extra' bit in the id is the slicename
The 'type' needs to be Extension. (The type of all extensions is always Extension.) However, you should also specify a profile on Extension that gives the canonical URL of the StructureDefinition you've used to define the 'title' extension. That extension will have a context of Basic, will constrain extension.value[x] to be of type string, and will also establish a fixed URL for extension.url.
Your second slice will be similar. However, the profile on extension it points to won't constrain extension.value. Instead, it'll slice extension.extension to have two slices, one with a fixed url of "name" and the other with a fixed url of "value". There's an example here of a 2-element complex extension. Your slice names and data types will differ, as will the context, but it should make a good model for you.
If you still have issues, add your revised version to your question and we'll see if we can help further.

Map HATEOAS links to actual API links

I'm trying to implement a HATEOAS Rest Client using Spring Boot.
Right now, I'm stuck at a point where I need to convert a HATEOAS link into an actual API URI.
If I post a new object of type Customer like:
{
"name": "Frank",
"address": "http://localhost:8080/address/23"
}
And then retrieve it with a request to http://localhost:8080/api/customer/1, HATEOAS gives me something like
{
"name": "Frank",
"_links": {
"address": {
"href": "http://localhost:8080/api/customer/1/address"
}
}
}
Is it possible to convert a link of the form of http://localhost:8080/api/customer/1/address to an API call like http://localhost:8080/api/address/23 ?
Look at what HATEOAS returns for
GET: http://localhost:8080/api/customer/1
which is
{
"name": "Frank",
"_links": {
"address": {
"href": "http://localhost:8080/api/customer/1/address"
}
}
}
According to Understanding HATEOAS,
It's possible to build more complex relationships. With HATEOAS, the output makes it
easy to glean how to interact with the service without looking up a specification or
other external document
In other words, once you have retrieved a resource with
http://localhost:8080/api/customer/1
the response lists the other operations available on that resource as links, so that clients can click through your service or application.
In this case, HATEOAS exposes the link http://localhost:8080/api/customer/1/address, which becomes accessible once you have customer/1; from there, the address of customer/1 can be fetched at /customer/1/address without going anywhere else.
Similarly, if customer/1 also had occupation details, there would be another link below the address link, http://localhost:8080/api/customer/1/occupation.
So if an address depends on a customer, i.e. there can be no address without a customer, then your API endpoint has to be /api/customer/1/address and not /api/address/23 directly.
However, if you understand these conventions and the logic behind such responses and still want your own links, which may not align with that logic, you can use
the Link object, built through the LinkBuilder interface of Spring HATEOAS.
Example:
With object of type Customer like:
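// Assumes Customer extends ResourceSupport and a static import of
// org.springframework.hateoas.mvc.ControllerLinkBuilder.linkTo (older Spring HATEOAS naming).
Customer customer = new Customer( /*parameters*/ );
// withRel(...) turns the link builder into a Link with the given relation name.
Link link = linkTo(AnyController.class).slash("address").slash(addressId).withRel("address");
customer.add(link);
// Adds a link such as http://localhost:8080/api/address/23, where 23 is the addressId.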
Also you can create a list of Links and keep adding many such links to that list and then add that list to your object.
Hope this helps you !

Elasticsearch Terms Query exclude large amount of users

I'm working on a Tinder-like app. In order to exclude profiles that the user has swiped before, I use a "must_not" query like this:
must_not : [{"terms": { "swipedusers": ["userid1", "userid1", "userid1"…]}}]
What are the limits of this approach? Is it scalable, and would it still work when the swipedusers array contains 2000 user ids? If there is a better, more scalable approach, I would be happy to know...
There is a better approach! It is called "terms lookup", and it is something like the traditional join that you would do in a relational database...
I could try to explain it here, but all the information you need is well documented on the official Elasticsearch page:
https://www.elastic.co/guide/en/elasticsearch/reference/5.0/query-dsl-terms-query.html#query-dsl-terms-lookup
The final solution is to have 2 indices, one for the registered users and another one to track the swipes of each user.
Then, for each swipe, you should update the document containing the current user's swipes... Here you will need to add elements to an array, which is another problem in Elasticsearch (a big problem if you are using AWS managed Elasticsearch) that can only be solved using scripting...
More info at https://www.elastic.co/guide/en/elasticsearch/guide/current/partial-updates.html#_using_scripts_to_make_partial_updates
For your case, the query will look something like this (wrapped in must_not so that already-swiped users are excluded):
GET /possible_matches/_search
{
"query" : {
"bool" : {
"must_not" : [ {
"terms" : {
"user" : {
"index" : "swipes",
"type" : "users",
"id" : "current-user-id",
"path" : "swipedUserId"
}
}
} ]
}
}
}
Another thing you should take into account is the replication configuration for the swipes index: since each node will perform "joins" against that index, it is highly recommended to have a full copy of it on each node. You can achieve this by creating the index with the "auto_expand_replicas" setting set to "0-all".
PUT /swipes
{
"settings": {
"auto_expand_replicas": "0-all"
}
}

Can I use parent-child relationships on Kibana?

On a relational DB, I have two tables connected by a foreign key, on a typical one-to-many relationship. I would like to translate this schema into ElasticSearch, so I researched and found two options: the nested and parent-child. My ultimate goal was to visualize this dataset in Kibana 4.
Parent-child seemed the most adequate one, so I'll describe the steps that I followed, based on the official ES documentation and a few examples I found on the web.
curl -XPUT http://server:port/accident_struct -d '
{
"mappings" : {
"event" : {
},
"details": {
"_parent": {
"type": "event"
} ,
"properties" : {
}
}
}
}
';
here I create the index accident_struct, which contains two types (corresponding to the two relational tables): event and details.
Event is the parent, thus each document of details has an event associated to it.
Then I upload the documents using the bulk API. For event:
{"index":{"_index":"accident_struct","_type":"event","_id":"17f14c32-53ca-4671-b959-cf47e81cf55c"}}
{values here...}
And for details:
{"index":{"_index":"accident_struct","_type":"details","_id": "1", "_parent": "039c7e18-a24f-402d-b2c8-e5d36b8ad220" }}
The event does not know anything about children, but each child (details) needs to set its parent. In the ES documentation I see the parent being set using "parent", while in other examples I see it using "_parent". I wonder what is the correct option (although at this point, none works for me).
The requests complete successfully and I can see that the number of documents contained in the index corresponds to the sum of events + details.
I can also query parents for children and children for parents, on ES. For example:
curl -XPOST host:port/accident_struct/details/_search?pretty -d '{
"query" : {
"has_parent" : {
"type" : "event",
"query" : {
"match_all" : {}
}
}
}
}'
After setting the index on Kibana, I am able to list all the fields from parent and child. However, if I go to the "discover" tab, only the parent fields are listed.
If I uncheck a box that reads "hide missing fields", the fields from the child documents are shown greyed out, along with an error message.
Am I doing something wrong or is the parent-child not supported in Kibana4? And if it is not supported, what would be the best alternative to represent this type of relationship?
Per a comment in this discussion on the Elastic site, parent-child, like nested objects, is not supported, at least in visualizations. Le sigh.

HATEOAS with multiple pages

Is there a rule to define a sequential list of links using HATEOAS?
It is easy to add self, next and previous links to a HATEOAS-based response. Below is a sample response:
{
links : [{
rel : "next",
href : "http://localhost:8080/persons?page=1&size=20"
}],
content : [{
id: "",
name: "",
lastname: "",
age: 0
}],
pageMetadata : {
size : 20,
totalElements : 30,
totalPages : 2,
number : 0
}
};
So, pagination with next and previous links is not difficult, but I couldn't figure out how to access, say, the 10th page directly, for example using a select element (let's assume there are more than 10 pages). Should I add all the links to the response as, for example, page1, page2, page3, etc.? Of course I know the format of the request, so I could just construct the appropriate query myself, but that seems wrong because the whole point seems to be not relying on the actual URLs of the service. I am not an expert on this issue and I couldn't find an answer on this site or on Google.
Thanks in advance.
The common solution is to use URI templates (RFC 6570), specifically section 3.2.8 on Form-Style Query Expansion.
Assuming both page and size are optional, you would construct your link URI template as http://localhost:8080/persons{?page,size}.
URI templates do not define which values are acceptable; there's no mechanism for restricting page or size to numeric values. As such, you should include this template in addition to your existing next/prev links. I'm not exactly sure which link relation best describes such a resource, but "collection" doesn't seem too far off.
In order to use URI templates, you'll need a library in whichever languages will be producing/consuming the template. A quick search produced a list of libraries in various languages which you may find helpful.
Lastly, depending on your hypermedia format you may need to specify a templated URI as such. For example, HAL+JSON uses a boolean templated property.
{
"_links": {
"collection": {
"href": "/persons{?page,size}",
"templated": true
}
}
}
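To make the client side concrete, here is a minimal sketch of expanding such a template to jump straight to a given page. It handles only a single form-style query expression like {?page,size} and is not a full RFC 6570 implementation; a real client would normally use one of the URI template libraries mentioned above. The class name FormQueryTemplate is hypothetical, and java.net.URLEncoder applies form encoding (a space becomes +), which differs slightly from the percent-encoding the RFC prescribes.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: expands one form-style query expression, e.g. {?page,size}.
class FormQueryTemplate {

    private static final Pattern QUERY_EXPR = Pattern.compile("\\{\\?([^}]+)\\}");

    static String expand(String template, Map<String, ?> vars) {
        Matcher m = QUERY_EXPR.matcher(template);
        if (!m.find()) {
            return template; // nothing to expand
        }
        // Yields "?a=1&b=2", or an empty string if no variable is defined.
        StringJoiner query = new StringJoiner("&", "?", "");
        query.setEmptyValue("");
        for (String name : m.group(1).split(",")) {
            String key = name.trim();
            Object value = vars.get(key);
            if (value == null) {
                continue; // undefined variables are left out of the query
            }
            query.add(key + "=" + URLEncoder.encode(value.toString(), StandardCharsets.UTF_8));
        }
        return template.substring(0, m.start()) + query + template.substring(m.end());
    }

    public static void main(String[] args) {
        // Assuming zero-based page numbers, as in the sample response, page=9 is the 10th page.
        System.out.println(expand("/persons{?page,size}", Map.of("page", 9, "size", 20)));
        // prints /persons?page=9&size=20
    }
}
```

With this, a select element only needs the templated link from the response and the chosen page number; the client never hard-codes the query format itself.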

Resources