Elasticsearch facet term versus label

The object that I am indexing has both a UserId (a GUID) and a FullName property. I would like to do a faceted search using the UserId but display the FullName so that it's readable. I don't want to do the facet on the FullName since it technically doesn't have to be unique.
Right now I'm doing something like this:
{
"query": {
"match_all": {}
},
"facets": {
"userFacet": {
"terms": {
"field": "userId"
}
}
}
}
But the response then contains the GUIDs, and I would have to hit the database to look up each full name, which is obviously not a real solution.
So how can I facet on one field but use a different field for the display values?

I am using the lines below to define the display labels for the fields.
This works for grids only, not for facets.
The facet still shows the id, not the label.
fields: [{
id: 'name_first',
'label': 'First Name'
}]

Try adding a fields clause to your query, as follows:
{
"query": {
"match_all": {}
},
"facets": {
"userFacet": {
"terms": {
"field": "userId"
}
}
},
"fields": [ "FullName" ]
}
Note that in order to return field values in search results, you need to actually store them in ES. So you need to either store the _source (this happens by default, unless you override it with "_source" : {"enabled" : false}), or set "store": "yes" for that field in your mapping.
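The fields clause returns FullName on each hit, not inside the facet entries, so the two still have to be joined on the client. Here is a minimal Python sketch of that join. It assumes the request asks for both userId and FullName in "fields" and that the facet response carries a "terms" list of term/count entries. The helper name label_facet and the sample data are hypothetical. Only users that appear in the returned hits get a label; the rest fall back to the raw GUID.

```python
def _first(value):
    """Field values may arrive as a scalar or as a one-element list."""
    if isinstance(value, list):
        return value[0] if value else None
    return value


def label_facet(response, facet_name="userFacet",
                id_field="userId", label_field="FullName"):
    """Pair each facet term with a display name taken from the returned hits."""
    names = {}
    for hit in response.get("hits", {}).get("hits", []):
        fields = hit.get("fields", {})
        user_id = _first(fields.get(id_field))
        full_name = _first(fields.get(label_field))
        if user_id is not None and full_name is not None:
            names[user_id] = full_name

    labelled = []
    for entry in response.get("facets", {}).get(facet_name, {}).get("terms", []):
        term = entry["term"]
        # Fall back to the raw id when no hit supplied a name for it.
        labelled.append({"id": term,
                         "label": names.get(term, term),
                         "count": entry["count"]})
    return labelled


# Hand-written sample response (illustrative data, not real output).
sample = {
    "hits": {"hits": [
        {"_id": "1", "fields": {"userId": ["guid-a"], "FullName": ["Ada Lovelace"]}},
        {"_id": "2", "fields": {"userId": "guid-b", "FullName": "Bob Stone"}},
    ]},
    "facets": {"userFacet": {"_type": "terms", "terms": [
        {"term": "guid-a", "count": 3},
        {"term": "guid-c", "count": 1},
    ]}},
}

for row in label_facet(sample):
    print(row["label"], row["count"])
```

Because the labels come from the hits, a facet term whose documents are not among the returned hits has no name available, which is why the fallback matters.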

Related

How to rank ElasticSearch documents based on scores

I have an Elasticsearch index that contains thousands of documents; each document represents a user.
Each document has a set of fields (is_verified: boolean, country: string, is_creator: boolean). I also have another service that calls ES search to look up documents. How can I rank the retrieved documents based on those fields? For example, a matching verified user should come before an unverified one.
Is there some kind of document scoring while indexing the documents? If yes, can I modify it based on my criteria?
What should I read to understand how ranking works in Elasticsearch?
Thanks
I guess the sorting function mentioned by Mikael is pretty straightforward and should cover your use cases. Check the Elastic docs for more information on that.
But in case you want to do really fancy sorting, you could use a bool query with different boost values to set the desired relevancy for each matched field. I tried to come up with a real-life example but honestly didn't find one. For the sake of completeness, the following snippet should give you an idea of how to achieve results similar to the sort API (but still, I would prefer using sort).
GET /yourindexname/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"name": "Monica"
}
}
],
"should": [
{
"term": {
"is_verified": {
"value": true,
"boost": 2
}
}
},
{
"term": {
"is_creator": {
"value": true,
"boost": 2
}
}
}
]
}
}
}
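If the boosts need to vary per field, the same request body can be generated rather than written by hand. Here is a minimal Python sketch. The helper name build_boosted_query and the boost values are hypothetical, chosen only to show a verified user outweighing a creator.

```python
import json


def build_boosted_query(name, boosts):
    """Require a name match and add one optional, boosted term clause per flag."""
    should = [
        {"term": {field: {"value": True, "boost": boost}}}
        for field, boost in boosts.items()
    ]
    return {
        "query": {
            "bool": {
                "must": [{"match": {"name": name}}],
                "should": should,
            }
        }
    }


# Illustrative weights: being verified counts more than being a creator.
body = build_boosted_query("Monica", {"is_verified": 3, "is_creator": 1.5})
print(json.dumps(body, indent=2))
```

The should clauses never exclude a document; they only add to the score of documents that already satisfy the must clause.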
Is there some kind of document scoring while indexing the documents? If yes, can I modify it based on my criteria?
I wouldn't assign a fixed score to a document while indexing, as the score should depend on the query. However, if you insist on having a predefined relevancy for each document, you could in theory add a relevancy field that holds that value and sort on it in the query:
GET /yourindexname/_search
{
"query" : {
"match" : {
"name": "Monica"
}
},
"sort" : [
{
"relevancy": {
"order": "desc"
}
},
"_score"
]
}
You can consider using the sort API in your search queries. The example below searches on the country field and sorts the results by the Boolean field is_verified. You can also add the other Boolean field to the sort array.
GET /yourindexname/_search
{
"query" : {
"match" : {
"country": "Iceland"
}
},
"sort" : [
{
"is_verified": {
"order": "desc"
}
}
]
}
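Adding the second Boolean field to the sort array could look like the Python sketch below. The helper name build_sorted_search is hypothetical, and appending "_score" as a final tie-breaker is an assumption about the desired ordering, not part of the answer above.

```python
import json


def build_sorted_search(query, sort_fields):
    """Sort by each field in descending order, then by relevance score."""
    sort = [{field: {"order": "desc"}} for field in sort_fields]
    sort.append("_score")  # tie-breaker among documents with equal flags
    return {"query": query, "sort": sort}


body = build_sorted_search(
    {"match": {"country": "Iceland"}},
    ["is_verified", "is_creator"],
)
print(json.dumps(body, indent=2))
```

The order of the entries matters: is_creator is only consulted for documents that have the same is_verified value.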

Finding which nested entry matched an elasticsearch query

Say I'm indexing elasticsearch data like so:
{"entities": [
{
"type": "firstName",
"value": "Barack"
},
{
"type": "lastName",
"value": "Obama"
}
]}
I'd like users to be able to add custom attributes, so I don't know every possible value of "type" ahead of time.
My mappings might look like:
typename:
entities:
type: nested
If I do a match query for the text "Obama", with highlighting, is there a way to get back the full nested "entity" which matched? I would like to know if my query for "Obama" matched the firstName or the lastName.
I was able to solve this with inner_hits (thanks Andrei!)
{
"query": {
"nested": {
"path": "entities",
"query": {
"match": {"entities.value": "Obama"}
},
"inner_hits": {
"highlight": {
"fields": {
"entities.value": {}
}
}
}
}
}
}
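Once inner_hits is in the response, the matching nested entries can be read from each hit. Here is a minimal Python sketch. It assumes the inner hits are keyed by the nested path ("entities") and that each inner hit's _source is the nested object with the "type" and "value" fields from the question. The helper name matched_entities and the sample response are hypothetical.

```python
def matched_entities(response, path="entities"):
    """Collect the nested objects that matched, with their parent document id."""
    matches = []
    for hit in response.get("hits", {}).get("hits", []):
        inner = (hit.get("inner_hits", {})
                    .get(path, {})
                    .get("hits", {})
                    .get("hits", []))
        for inner_hit in inner:
            source = inner_hit.get("_source", {})
            matches.append({
                "doc_id": hit.get("_id"),
                "type": source.get("type"),
                "value": source.get("value"),
                "highlight": inner_hit.get("highlight", {}),
            })
    return matches


# Hand-written sample response (illustrative shape, not real output).
sample = {"hits": {"hits": [{
    "_id": "1",
    "inner_hits": {"entities": {"hits": {"hits": [{
        "_nested": {"field": "entities", "offset": 1},
        "_source": {"type": "lastName", "value": "Obama"},
        "highlight": {"entities.value": ["<em>Obama</em>"]},
    }]}}},
}]}}

for match in matched_entities(sample):
    print(match["doc_id"], match["type"], match["value"])
```

The "type" of each returned entry answers the original question: it says whether the match was on the firstName or the lastName entity.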

How to make use of `gt` and `fields` in the same query in Elasticsearch

In my previous question, I was introduced to the fields parameter of the query_string query and how it can help me search nested fields of a document.
{
"query": {
"query_string": {
"fields": ["*.id","id"],
"query": "2"
}
}
}
But it only works for matching. What if I want to do a comparison? After some reading and testing, it seems queries like range do not support fields. Is there any way I can perform a range query, e.g. on a date, over a field that can be scattered anywhere in the document hierarchy?
i.e. considering the following document:
{
"id" : 1,
"Comment" : "Comment 1",
"date" : "2016-08-16T15:22:36.967489",
"Reply" : [ {
"id" : 2,
"Comment" : "Inner comment",
"date" : "2016-08-16T16:22:36.967489"
} ]
}
Is there a query searching over the date field (like date > '2016-08-16T16:00:00.000000') which matches the given document, because of the nested field, without explicitly giving the address to Reply.date? Something like this (I know the following query is incorrect):
{
"query": {
"range" : {
"date" : {
"gte" : "2016-08-16T16:00:00.000000",
},
"fields": ["date", "*.date"]
}
}
}
The range query itself doesn't support it, however, you can leverage the query_string query (again) and the fact that you can wildcard fields and that it supports range queries in order to achieve what you need:
{
"query": {
"query_string": {
"query": "\\*date:[2016-08-16T16:00:00.000Z TO *]"
}
}
}
The above query will return your document because Reply.date matches *date.
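The wildcard in the field name is escaped with a backslash in the query_string syntax, and that backslash must itself be escaped inside a JSON string. Building the body in code and letting a JSON serializer do the second escaping avoids getting it wrong by hand. Here is a minimal Python sketch; the helper name wildcard_range_query is hypothetical.

```python
import json


def wildcard_range_query(field_suffix, lower_bound):
    """Open-ended range over every field whose name ends with field_suffix."""
    # "\\*" in Python source is one backslash followed by an asterisk.
    query = "\\*{}:[{} TO *]".format(field_suffix, lower_bound)
    return {"query": {"query_string": {"query": query}}}


body = wildcard_range_query("date", "2016-08-16T16:00:00.000Z")
encoded = json.dumps(body)  # the serializer doubles the backslash for JSON
print(encoded)
```

The serialized text carries two backslashes before the asterisk, which the server decodes back to the single backslash that query_string expects.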

How to query for many facets in single elasticsearch query

I'm looking for a way to query the distribution of the top n values for many object fields in a single query.
My object in Elasticsearch looks like:
obj: {
os: "Android",
device_model: "Samsung Galaxy S II (GT-I9100)",
device_brand: "Samsung",
os_version: "Android-2.3",
country: "BR",
interests: [1,2,3],
behavioral_segment: ["sport", "lifestyle"]
}
The following query returns the distribution of values for a specific field, with the number of occurrences of each value, for UK users only:
curl -XPOST http://<endpoint>/profiles/_search?search_type=count -d '
{
"query": {
"match": {
"country" : "UK"
}
},
"facets": {
"ItemsPerCategoryCount": {
"terms": {
"field": "behavioral_segment"
}
}
}
}'
How can I query for many fields? For example, I would like to get results for behavioral_segment, device_brand and os in a single query. Is it possible?
In the facets section of the query, you should use the fields parameter.
"facets": {
"ItemsPerCategoryCount": {
"terms": {
"fields": ["behavioral_segment","device_brand"]
}
}
}
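That should solve your problem, but note that it returns a single facet whose term counts are aggregated across all of the listed fields, so it does not guarantee a separate, coherent distribution for each field.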
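If a separate distribution per field is needed, one option is to declare one named terms facet per field in the same request. Here is a minimal Python sketch. The helper name build_facets_per_field is hypothetical, and using each field name as its facet name is just a convenient convention.

```python
import json


def build_facets_per_field(fields, size=10):
    """One terms facet per field, each named after the field it covers."""
    return {field: {"terms": {"field": field, "size": size}} for field in fields}


body = {
    "query": {"match": {"country": "UK"}},
    "facets": build_facets_per_field(["behavioral_segment", "device_brand", "os"]),
}
print(json.dumps(body, indent=2))
```

Each facet then comes back under its own name in the response, so the top values for os are not mixed with those for device_brand.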

Elasticsearch grouping facet by owner, mine vs others

I am using Elasticsearch to index documents that have an owner which is stored in a userId property of the source object. I can easily do a facet on the userId and get facets for each owner that there is, but I'd like to have the facets for owner show up like so:
Documents owned by me (X)
Documents owned by others (Y)
I could handle this on the client side and take all of the facets returned by elasticsearch and go through them and figure out those owned by the current user and not and display it appropriately, but I was hoping there was a way to tell elasticsearch to handle this in the query itself.
You can use filtered facets to do this:
curl -XGET "http://localhost:9200/_search" -d'
{
"query": {
"match_all": {}
},
"facets": {
"my_docs": {
"filter": {
"term": { "user_id": "my_user_id" }
}
},
"others_docs": {
"filter": {
"not": {
"term": { "user_id": "my_user_id" }
}
}
}
}
}'
One of the nice things about this is that the two term filters are identical and so are only executed once. The not filter just inverts the results of the cached term filter.
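Since the user id changes per request, the body above can be generated, and the two counts read back from the response. Here is a minimal Python sketch. It assumes each filter facet reports its total under a "count" key. The helper names build_owner_facets and owner_counts and the sample response are hypothetical.

```python
def build_owner_facets(user_id, field="user_id"):
    """Two filter facets: documents owned by user_id, and everything else."""
    mine = {"term": {field: user_id}}
    return {
        "query": {"match_all": {}},
        "facets": {
            "my_docs": {"filter": mine},
            "others_docs": {"filter": {"not": mine}},
        },
    }


def owner_counts(response):
    """Read the two counts, defaulting to 0 when a facet is missing."""
    facets = response.get("facets", {})
    return {
        "mine": facets.get("my_docs", {}).get("count", 0),
        "others": facets.get("others_docs", {}).get("count", 0),
    }


body = build_owner_facets("my_user_id")

# Hand-written sample response (illustrative data, not real output).
sample = {"facets": {
    "my_docs": {"_type": "filter", "count": 4},
    "others_docs": {"_type": "filter", "count": 17},
}}
counts = owner_counts(sample)
print("Documents owned by me ({})".format(counts["mine"]))
print("Documents owned by others ({})".format(counts["others"]))
```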
You're right, Elasticsearch has a way to do that. Take a look at scripting term facets, especially the second example ("using the boolean feature"). You should be able to do something like:
{
"query" : {
"match_all" : { }
},
"facets" : {
"userId" : {
"terms" : {
"field" : "userId",
"size" : 10,
"script" : "term == '<your user id>' ? true : false"
}
}
}
}
