Nested queries in SmallRye GraphQL - graphql

I want to create a GraphQL query that can filter by multiple fields.
The data being filtered are statistics. For example, a statistic contains the fields "Number of deaths", "Number of cases", and "Number of recovered".
I have already written queries that filter by each individual field. Now I want to write a query that applies several filters at once, or one in which multiple queries are nested.
I have tried combining the steps of the individual queries into a single query, as shown in the attached screenshot. The program compiles. However, when I execute the query in the GraphQL UI, I get error messages.
Unfortunately, I have not yet received any helpful tips regarding my query or the error.
Screenshot
At the top left you can see the individual queries, at the top right the merged query and at the bottom the errors as soon as I try to execute the query.

Related

How can I get information from 2 different ElasticSearch indexes?

So, I have 2 indexes in my Elasticsearch server.
I need to gather the results from the first index, and for each result I need to gather info from the second index.
How to do that? Tried the foreach processor, but no luck so far.
Thank you.
"I need to gather the results from the first index, and for each result I need to gather info from the second index."
Unless you create parent/child relationships, that's not possible in ElasticSearch.
However, note:
In Elasticsearch the key to good performance is to de-normalize your data into documents. Each join field, has_child or has_parent query adds a significant tax to your query performance.
Handle reading from multiple indexes within your application or rethink your index mapping.
The foreach processor is for ingest pipelines, meaning, stuff that gets done at indexing time. So it won't help you when you are trying to gather the results.
In general, it's not going to be possible to query another index (which might live on another shard) from within a query.
In some cases, you can use a join field. There are performance implications, it's only recommended in specific cases.
If you are not in the join field use case, and you can restructure your data to use nested objects, it will be more performant than join fields.
Otherwise, you'll be better off running multiple queries in the application code (maybe you can fetch all the "secondary" results using just one query, so you'd have 2 queries in total?)
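The "two queries in total" idea can be sketched roughly as follows. This is only an illustration, not a definitive implementation: the `search` callable, the index names, and the join field passed as `key` are all hypothetical stand-ins for whatever client and mapping the application actually uses. The first search fetches the primary results, and a single `terms` query then fetches every matching secondary document in one round trip.

```python
from typing import Callable, Dict, List

# Hypothetical search function: takes an index name and a query body and
# returns a list of hit dicts shaped like {"_source": {...}}.
SearchFn = Callable[[str, dict], List[dict]]


def build_secondary_query(primary_hits: List[dict], key: str, max_size: int = 1000) -> dict:
    """Build one `terms` query that fetches all secondary documents at once."""
    # Collect the distinct join values from the primary results, keeping order.
    values = list(dict.fromkeys(hit["_source"][key] for hit in primary_hits))
    return {"size": max_size, "query": {"terms": {key: values}}}


def join_in_application(
    search: SearchFn,
    primary_index: str,
    secondary_index: str,
    primary_query: dict,
    key: str,
) -> List[dict]:
    """Run two searches and attach matching secondary documents to each primary hit."""
    primary_hits = search(primary_index, primary_query)
    secondary_hits = search(secondary_index, build_secondary_query(primary_hits, key))

    # Group secondary documents by join value so the lookup below is cheap.
    by_key: Dict[object, List[dict]] = {}
    for hit in secondary_hits:
        by_key.setdefault(hit["_source"][key], []).append(hit["_source"])

    return [
        {**hit["_source"], "related": by_key.get(hit["_source"][key], [])}
        for hit in primary_hits
    ]
```

With a large first result set, the list of join values may need to be split into batches, since the number of terms accepted by one `terms` query is capped by an index setting.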

Associating each document with a function to be satisfied by search parameters in Elasticsearch

In Elasticsearch, can I associate each document with a (different) function that must be satisfied by parameters I supply on a search, in order to be returned on that search?
The particular functions I would particularly like to use involve a loop, some kind of simple branching (if-statement of switch-statement), an array-like data structure, strings comparisons, and simple boolean operators.
A couple of key notes here:
At query time:
- If you're looking to shape the relevancy function, meaning the actual relevancy score of each document, you could use a script score query.
- If you're only looking to filter out unwanted documents, you could use a script query that allows you to do just that.
Both of those solutions enable you to evaluate a script that compares incoming query parameters against previously indexed values.
Take note that using scripts at query time can lead to increased memory usage and performance issues.
Elasticsearch can also apply a second batch of filtering rules to the actual query result, in the form of a post filter. This can come in handy if you are not in a position to process the output as a stream at the API view level.
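As a minimal sketch of the filtering variant, the request body for a script query could be built like this. The field name `min_quantity`, the parameter name `quantity`, and the helper `build_script_filter` are assumptions made up for illustration; the Painless source simply compares an indexed value against a parameter supplied with the search.

```python
def build_script_filter(source: str, params: dict) -> dict:
    """Wrap a Painless script in a `script` query placed in filter context."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "script": {
                        "script": {
                            "lang": "painless",
                            "source": source,
                            "params": params,
                        }
                    }
                }
            }
        }
    }


# Hypothetical rule: keep documents whose indexed `min_quantity` does not
# exceed the quantity supplied by the caller.
body = build_script_filter(
    "doc['min_quantity'].value <= params.quantity",
    {"quantity": 5},
)
```

Passing values through `params` rather than concatenating them into the script source lets Elasticsearch reuse the compiled script across requests.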
At index time:
There are script fields, which let you store a function that computes a result based on other field values and incoming query parameters. They can be really powerful given that they are assigned at index time. I think they might be what you are looking for.
I would not use them unless those field values need to be compared against query parameters. The reason is that I like my indexing process to be lean and fast, so I tend to compute those kinds of values at the stream level, upstream of the actual bulk indexing request.
Although convenient, the results of those custom scripts can likely be achieved with a combination of regular queries and filters. In each release, the Elasticsearch team adds new query and field types that let you do what you used to do via scripted queries without the risk of blowing out your memory. A good example of this is the rank feature datatype recently introduced in the 7.x release.
A piece of advice: think of your Elasticsearch service as a regular API in your data layer. As such, you can do query processing before the actual call to Elasticsearch and data processing on the results it returns. Only if you really can't fit your business rules in there should scripts be your last resort.
Feel free to contact me if you still have any questions. All the best.

Spring Data ElasticSearch: returned scores are off

I have a Spring Boot project with org.springframework.boot:spring-boot-starter-data-elasticsearch:jar:2.0.0.RELEASE connecting to an elasticsearch-6.3.1 server.
I have the following scenario: for some elasticsearch query (which involves a should bool), I get different scores from when I run the query manually, using curl.
Steps I have tried: Extract query with debugger from SearchQuery before calling the repo, extract query from elasticsearch logs (using "index.search.slowlog.threshold.fetch.debug" : "0s", "index.search.slowlog.threshold.query.debug" : "0s"); in both cases, running the queries manually, with curl, gives a set of scores that are different from the ones given by Java api.
I mention that I couldn't find a pattern by looking at the diff between the two score sets. The scores returned by the manual query seem to be the correct ones, because I expect some of them to have the same value, which does not happen for the ones returned by the api.
If you have any ideas on what might cause this or how to continue the investigation, they would be much appreciated.
I have managed to make the API return the same scores as the manual run by wrapping the inner query in a constantScoreQuery; it seems that the TF/IDF scoring was the culprit.
It is still curious, though, why the manual query behaved as if it ignored TF/IDF in the first place.
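For reference, the JSON that such a wrapping produces can be sketched as below. This is an assumption about the shape of the query, since the original query is not shown: the field names, the two `should` clauses, and the helper `with_constant_score` are hypothetical. Each clause wrapped in `constant_score` contributes a fixed score equal to its boost, so term frequency and document frequency no longer influence the result.

```python
def with_constant_score(inner_query: dict, boost: float = 1.0) -> dict:
    """Wrap a query so every matching document gets the same fixed score."""
    return {"constant_score": {"filter": inner_query, "boost": boost}}


# Hypothetical should-clauses; documents matching the same set of clauses
# end up with identical scores.
body = {
    "query": {
        "bool": {
            "should": [
                with_constant_score({"match": {"title": "elasticsearch"}}),
                with_constant_score({"match": {"tags": "search"}}, boost=2.0),
            ]
        }
    }
}
```

Sending a body like this with curl and through the Java API should make the two score sets directly comparable.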

Can I get messages from the Kibana visualization?

I am wondering if there is a way to get a list of the messages related to a Kibana visualization. I understand that if I apply the same filter in "Discover" as in the "Visualization", I can filter the related messages. But I want a more direct user experience, where a user clicks on a region of a graph and gets the related messages that formed that region. Is there any way to do it?
This helped me:
https://discuss.elastic.co/t/can-i-get-the-related-messages-from-a-kibana-visualization/101692/2
It says:
Not directly, unfortunately. You can click on the visualization to create a filter, and you can pin that filter and take it to discover, which will do what you're asking, but isn't very obvious.
The reason is that visualizations are built using aggregate data, so they don't know what the underlying documents are, they only know the aggregate representation of the information. For example, if you have a bunch of traffic data, and you are looking at bytes over time, the records get bucketed by time and the aggregate of the bytes in that bucket are shown (average, sum, etc.).
In contrast, Discover only works with the raw documents, showing you exactly what you have stored in Elasticsearch. Both documents and aggregations can use filters and queries, which is why you can create a filter in one and use it in the other, but the underlying data is not the same.
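The difference can be illustrated with a small sketch in plain Python, with no Elasticsearch involved. The record shape (`timestamp` in seconds, `bytes`) and both function names are assumptions for illustration. The aggregation keeps only one number per time bucket, so the documents behind a bucket can only be recovered by applying the bucket's time range as a filter on the raw records, which is essentially what pinning a filter and taking it to Discover does.

```python
from collections import defaultdict
from typing import Dict, List


def sum_bytes_per_bucket(records: List[dict], bucket_seconds: int = 3600) -> Dict[int, int]:
    """Aggregate: bucket records by time and keep only the sum of bytes."""
    buckets: Dict[int, int] = defaultdict(int)
    for record in records:
        # Round the timestamp down to the start of its bucket.
        start = record["timestamp"] - record["timestamp"] % bucket_seconds
        buckets[start] += record["bytes"]
    return dict(buckets)


def documents_in_bucket(records: List[dict], bucket_start: int, bucket_seconds: int = 3600) -> List[dict]:
    """Filter: recover the raw records that fall inside one bucket's time range."""
    end = bucket_start + bucket_seconds
    return [r for r in records if bucket_start <= r["timestamp"] < end]
```

The output of `sum_bytes_per_bucket` contains no reference to individual records, which is why a chart built from it cannot list them directly.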

How to write fast Elastic Search queries

Is there a guide to writing ES queries: what to do, what to avoid, this sort of stuff? The official site describes all the various ways to search, but provides little guidance as to when to select what.
In my particular instance I have a list of providers; each one has a name, an address, and a number of IDs. I want to give the user a box in which he can type anything he knows about the provider and run a search based on whatever is provided. Essentially I would like to match every word from the box against the records (documents) in the index.
For the end user this should look like a simple keyword search.
Matching should cover exact matches, wild card matches, phonetic matches, synonyms (for names). Also some fuzziness should be included too.
The official site describes various ways to do that, but how to combine them together? For instance to support wild card search do I use wild card query, or do I index it with the NGram and do just text query?
With SQL queries, a sure way to get this sort of information is to check the execution plan for the query. If the SQL optimizer tells you that it will use a table scan against a table of considerable size, you know you should change your query or maybe add an index. AFAIK there is no equivalent of this powerful feature in ES, and I am not even sure it is possible to build one.
But at least some generic considerations...? Pretty please...
There is no single best way to go about this, because a lot of the time it depends on what you are indexing and how you map your data into fields within Elasticsearch.
Some rules of thumb that you should look out for:
a. Faceted Queries in Elasticsearch work in sequences:
{
  "query": {
    // data will be searched from this block first //
  },
  "facets": {
    // after the data is received, it will be processed into facets //
  }
}
Hence if your query size is huge, you are going to slow down your query further by faceting. Monitor the results of your query.
b. Filters vs Queries
Filters work on a subset of your query: a filter takes the entire result of your query and then keeps what you want or removes what you do not want.
Queries are usually direct searches for data.
Hence, if you can make your query as specific as possible before you apply a filter, it should yield faster results.
c. Queries are cached; running them again and again will generally yield faster responses. The Warmers API should be able to make your queries even quicker if you are always going to use the same set of queries.
Again, all these are rules of thumb and cannot be followed strictly, because what you index into specific fields will affect processing times. A string is different from a long, and strings with analyzers are different from strings without them. What you probably need to do is experiment with your queries to get a better judgement.
One correction to the above: filters are cacheable by ES, not queries. Queries do the extra step of relevance scoring and full text search. So, wherever full text search is not needed, using a filter is advised.
Also, design your mappings with the correct index values (not_analyzed, no, analyzed).
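A rough sketch of the filter-versus-query split for the provider search is shown below. It assumes an Elasticsearch version whose `bool` query has a `filter` clause, which is newer than the facet-era API this thread refers to; the field names `name` and `state` and the helper `build_provider_search` are hypothetical. The full text part goes into `must`, where it is scored, and the exact-value restriction goes into `filter`, where it is not scored and can be cached.

```python
def build_provider_search(text: str, state: str) -> dict:
    """Combine a scored full text clause with an unscored exact-match filter."""
    return {
        "query": {
            "bool": {
                # Scored: analyzed full text match with some fuzziness.
                "must": [
                    {"match": {"name": {"query": text, "fuzziness": "AUTO"}}}
                ],
                # Not scored: exact value on a non-analyzed field.
                "filter": [
                    {"term": {"state": state}}
                ],
            }
        }
    }


body = build_provider_search("jon smith", "NY")
```

Anything the user does not need ranked by relevance, such as IDs or codes, is a candidate for the `filter` list.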
