Elasticsearch: average count of matching nested documents - elasticsearch

I have documents with nested items. The mapping is something like the following:
"document": {
"properties": {
"fieldA": { "type": "integer" },
"items": { "type": "nested",
"properties": {
"is_x": {"type": "boolean"},
"name": {"type": "string"}
}
}
}
}
And a sample document:
document:
fieldA: 123,
...
items:
[
{ "name": "item1", "is_x":true},
{ "name": "item2", "is_x":false},
...
{ "name": "itemn", "is_x":true}
]
I want to get the average count of items per document that have "is_x"=false
One option is to save this value during the indexing, but I would love to know how this can be done during the search itself (search performance is not an issue in this case).

Related

Return only top level fields in elasticsearch query?

I have a document that has nested fields. Example:
"mappings": {
"blogpost": {
"properties": {
"title": { "type": "text" },
"body": { "type": "text" },
"comments": {
"type": "nested",
"properties": {
"name": { "type": "text" },
"comment": { "type": "text" },
"age": { "type": "short" },
"stars": { "type": "short" },
"date": { "type": "date" }
}
}
}
}
}
}
Can the query be modified so that the response only contains non-nested fields?
In this example, the response would only contain body and title.
Using _source you can exclude/include fields
GET /blogpost/_search
{
"_source":{
"excludes":["comments"]
}
}
But you have to explicitly put the field names inside exclude, I'm searching for a way to exclude all nested fields without knowing their field name
You can achieve that but in a static way, which means you entered the field(s) name using excludes keyword, like:
GET your_index/_search
{
"_source": {
"excludes": "comments"
},
"query": {
"match_all" : {}
}
}
excludes can take an array of strings; not just one string.

Elasticsearch: Schema without mapping?

According to Elasticsearch's roadmap, mapping types are going to be completely removed at 7.x
How are we going to give a schema structure to Documents without mapping?
For example how would we replace this (A Doc/mapping_type with 3 fields of specific data type):
PUT twitter
{
"mappings": {
"user": {
"properties": {
"name": { "type": "text" },
"user_name": { "type": "keyword" },
"email": { "type": "keyword" }
}
}
}
They are going to remove types (user in you example) from mapping, because there is only 1 type per index now, the rest will be the same:
PUT twitter
{
"mappings": {
"_doc": {
"properties": {
"name": { "type": "text" },
"user_name": { "type": "keyword" },
"email": { "type": "keyword" }
}
}
}
}
As you can see, there is no user type anymore.

Elasticsearch Field Preference for result sequence

I have created the index in elasticsearch with the following mapping:
{
"test": {
"mappings": {
"documents": {
"properties": {
"fields": {
"type": "nested",
"properties": {
"uid": {
"type": "keyword"
},
"value": {
"type": "text",
"copy_to": [
"fulltext"
]
}
}
},
"fulltext": {
"type": "text"
},
"tags": {
"type": "text"
},
"title": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
},
"url": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
}
}
}
}
}
}
While searching I want to set the preference of fields for example if search text found in title or url then that document comes first then other documents.
Can we set a field preference for search result sequence(in my case preference like title,url,tags,fields)?
Please help me into this?
This is called "boosting" . Prior to elasticsearch 5.0.0 - boosting could be applied in indexing phase or query phase( added as part of field mapping ). This feature is deprecated now and all mappings after 5.0 are applied in query time .
Current recommendation is to to use query time boosting.
Please read this documents to get details on how to use boosting:
1 - https://www.elastic.co/guide/en/elasticsearch/guide/current/_boosting_query_clauses.html
2 - https://www.elastic.co/guide/en/elasticsearch/guide/current/_boosting_query_clauses.html

Querying inner objects by field name in Elasticsearch

I'm reading the documentation on this site: https://www.elastic.co/guide/en/elasticsearch/guide/current/complex-core-fields.html#_how_inner_objects_are_indexed
I'm interested in this paragraph:
Inner fields can be referred to by name (for example, first). To distinguish between two fields that have the same name, we can use the full path (for example, user.name.first) or even the type name plus the path (tweet.user.name.first).
so if I have the example from the linked docu site:
{
"gb": {
"tweet": {
"properties": {
"tweet": { "type": "string" },
"user": {
"type": "object",
"properties": {
"id": { "type": "string" },
"gender": { "type": "string" },
"age": { "type": "long" },
"name": {
"type": "object",
"properties": {
"full": { "type": "string" },
"first": { "type": "string" },
"last": { "type": "string" }
}
}
}
}
}
}
}
}
According to the docu I should be able to search with condition last: whatever, but it does not work. I always have to use the full path user.last: whatever. Is the documentation false or my understanding of it? Note that last occurs only in the inner object, so in theory full path should not be necessary to reference it.
edit:
query that works:
get /my_index/_search?q=user.name.last:test
query that does not work but should according to documentation:
get /my_index/_search?q=last:test

Nested type in Elasticsearch: "object mapping can't be changed from nested to non-nested" when indexing a document

I try to index some nested documents into an Elasticsearch (v2.3.1) mapping which looks as follows (based on this example from the documentation):
PUT /my_index
{
"mappings": {
"blogpost": {
"properties": {
"title": { "type": "string" },
"comments": {
"type": "nested",
"properties": {
"name": { "type": "string" },
"comment": { "type": "string" }
}
}
}
}
}
}
However, I do not understand what my JSON documents have to look like in order to fit into that mapping. I tried with
PUT /my_index/some_type/1
{
"title": "some_title",
"comments": {
"name": "some_name",
"comment": "some_comment"
}
}
as well as with
PUT /my_index_some_type/1
{
"title": "some_title",
"comments": [
{
"name": "some_name",
"comment": "some_comment"
}
]
}
which both result in
{
"error":
{
"root_cause":
[
{
"type": "remote_transport_exception",
"reason": "[Caiman][172.18.0.4:9300][indices:data/write/index[p]]"
}
],
"type": "illegal_argument_exception",
"reason": "object mapping [comments] can't be changed from nested to non-nested"
},
"status": ​400
}
Which is the correct format to index nested documents? Any working examples are much appreciated, most examples here at SO or on other pages concentrate on nested queries rather than how the documents have been indexed before.
It seems you're really creating a document of type some_type and comments will default to a normal object (i.e. not nested), which is not allowed since you already have a nested object called comments in the blogpost mapping type in the same index.
Try this instead and it should work:
PUT /my_index/blogpost/1
{
"title": "some_title",
"comments": {
"name": "some_name",
"comment": "some_comment"
}
}

Resources