Elasticsearch "self-join" type operation

I have an index containing documents that look something like this (unnecessary fields omitted):
{
  _id: String,
  ...
  relatedIds: [ String ]
}
The entries in relatedIds refer to the _id values of other documents in the same index.
I want to write a query that returns only the IDs from the relatedIds array that are not the _id of any document.
In the abstract, I want to collect all of these identifiers and perform some computation so that, in the end, every ID in relatedIds refers to an existing document in the index.
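For reference, one way to perform that "computation" step is client-side: collect all the relatedIds values, then ask Elasticsearch which of them actually exist using an ids query, and treat anything missing from the response as a dangling reference. The index name and ID values below are placeholders, so this is only a sketch:
POST index/_search
{
  "query": {
    "ids": {
      "values": ["some-related-id-1", "some-related-id-2", "some-related-id-3"]
    }
  },
  "_source": false
}
The IDs that come back exist; the ones you sent but did not get back are the ones that no longer point at a document.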

If your id field is also contained in the source document, you can do it like this:
POST index/_search
{
  "script_fields": {
    "relatedIdsLessId": {
      "script": {
        "inline": "doc['relatedIds'].values - doc['id'].value"
      }
    }
  }
}
This will compute a new field named relatedIdsLessId that contains only the related IDs that are not the ID of the document itself.
Note: you need to make sure dynamic scripting is enabled, if it isn't already.
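On newer Elasticsearch versions, where inline Groovy scripting is no longer available, a roughly equivalent Painless sketch might look like the following; it assumes both relatedIds and id are keyword fields with doc values:
POST index/_search
{
  "script_fields": {
    "relatedIdsLessId": {
      "script": {
        "lang": "painless",
        "source": "List out = new ArrayList(); for (def v : doc['relatedIds']) { if (v != doc['id'].value) { out.add(v); } } return out;"
      }
    }
  }
}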

Related

What data structure is used for the Elasticsearch flattened type

I was trying to find out how the flattened type in Elasticsearch works under the hood. The documentation specifies that all leaf values will be indexed into a single field as keywords; as a result, there is a dedicated index for all those flattened keywords.
From documentation:
By default, Elasticsearch indexes all data in every field and each indexed field has a dedicated, optimized data structure. For example, text fields are stored in inverted indices, and numeric and geo fields are stored in BKD trees.
The specific case that I am trying to understand:
If I have a flattened field and index an object with nested objects, there is the ability to query a specific nested key in the flattened object. See how labels.release is queried:
PUT bug_reports
{
  "mappings": {
    "properties": {
      "labels": {
        "type": "flattened"
      }
    }
  }
}

POST bug_reports/_doc/1
{
  "labels": {
    "priority": "urgent",
    "release": ["v1.2.5", "v1.3.0"]
  }
}

POST bug_reports/_search
{
  "query": {
    "term": {"labels.release": "v1.3.0"}
  }
}
Would a flattened field have the same index structure as a keyword field, and how is it able to reference a specific child key of the flattened object?
The initial design and implementation of the flattened field type are described in this issue. The leaf keys are indexed along with the leaf values, which is what allows searching on a specific sub-field.
There are some ongoing improvements to the flattened field type, and Elastic would also like to support numeric values, but that is not yet released.
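To illustrate what that keyed indexing makes possible on the bug_reports example above: a term query on the root flattened field matches any leaf value, while a query on labels.priority is restricted to that sub-key.
POST bug_reports/_search
{
  "query": {
    "term": { "labels": "urgent" }
  }
}

POST bug_reports/_search
{
  "query": {
    "term": { "labels.priority": "urgent" }
  }
}
Both queries hit document 1: the first because "urgent" appears somewhere under labels, the second only because the priority key is indexed together with its value.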

Kibana regex: check if a field value contains another field's value

I'm trying to search for documents in which a description field contains the value of a name field (from another document). I tried a regexp query as follows:
GET inventory-full-index/_search
{
  "query": {
    "regexp": {
      "description.description_data.value.keyword": ".*doc['name.keyword'].*"
    }
  }
}
It returns interesting documents that fit my need. The problem is that I created a document that contains "python3" in the description, and I made sure there was a document named "python3" as well. This query doesn't return that document, so I have obviously missed something.
Any idea how to fix this?

Querying Elasticsearch documents based on a particular value without knowing the field name

I need to query the entire index for a particular text value. I don't have a field name to query against. Is it possible to search the documents for a particular piece of text?
You can use a query_string query.
You can specify multiple fields. If no field is specified, it will search the entire document:
{
  "query": {
    "query_string": {
      "query": "text"
    }
  }
}
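For example, to limit the search to specific fields rather than the whole document, a fields list can be added; the field names below are placeholders:
{
  "query": {
    "query_string": {
      "query": "text",
      "fields": ["title", "description"]
    }
  }
}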

Using a combined field as the _id mapping in Elasticsearch

From this question ("Use existing field as id in elasticsearch") I can see that it is possible.
My question is whether I can do a similar thing, but by concatenating fields.
{
  "RecordID": "a06b0000004SWbdAAG",
  "SystemModstamp": "01/31/2013T07:46:02.000Z",
  "body": "Test Body"
}
And then do something like:
{
  "your_mapping": {
    "_id": {
      "path": "RecordID" + "body"
    }
  }
}
So that the _id is automatically formed by concatenating those fields.
No, you can't; you can only make the _id point to a single field within the document, using dot notation if needed (e.g. level1.level2.id).
I'd suggest having a field that contains the whole id in your documents, or even better taking the id out and providing it in the URL, as configuring a path causes the document to be parsed when it otherwise wouldn't need to be.
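As a sketch of that second suggestion, the client can build the concatenated ID itself and pass it in the URL when indexing. The index name, the sanitized timestamp used in the ID, and the modern _doc endpoint below are all assumptions for illustration:
PUT my_index/_doc/a06b0000004SWbdAAG_20130131T074602
{
  "RecordID": "a06b0000004SWbdAAG",
  "SystemModstamp": "01/31/2013T07:46:02.000Z",
  "body": "Test Body"
}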

Can Elasticsearch return multiple value fields in a single facet?

I am looking for a way to create a facet such that I can essentially return two values for one key.
For instance, I am attempting to retrieve both the amount and schedule properties of an object. I attempted to use a computed value script, but the calculations that have to be done using those two properties are date-based and require an external library to perform them.
Basically, something along the lines of:
"theFacet": {
"terms_stats": {
"key_field": "someKeyProbablyADate",
"value_field": "amount",
"value_field": "simpleSchedule"
}
}
Workarounds are also appreciated. Perhaps some way to return a new dynamic object with both fields?
Sounds like you want to pre-process your data into a single field before you index it, then facet on that.
Something along the lines of a single string containing key#amount#schedule.
Then, when you get the faceting results back, you can split it up again and run whatever logic you want.
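For instance, each document could carry an extra pre-computed field to facet on; the facetKey field name and the sample values below are invented for illustration:
{
  "someKeyProbablyADate": "2014-01-01",
  "amount": 100,
  "simpleSchedule": "weekly",
  "facetKey": "2014-01-01#100#weekly"
}
A terms facet on facetKey then returns strings that the client can split on # to recover the key, amount, and schedule.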
Try combining different fields with a script element. For example:
"facets": {
"facet-name": {
"terms": {
"field": "some-field",
"script": "_source['another-field'] + '/' + term
}
}
}
