Elasticsearch version: 5.0.2
I populate my index with:
{_id: 1, tags: ['plop', 'plip', 'plup']},
{_id: 2, tags: ['plop', 'plup']},
{_id: 3, tags: ['plop']},
{_id: 4, tags: ['plap', 'plep']},
{_id: 5, tags: ['plop', 'plip', 'plup']},
{_id: 6, tags: ['plup', 'plip']},
{_id: 7, tags: ['plop', 'plip']}
Then I would like to retrieve the most relevant rows for the tags plop and plip:
query: {
bool: {
should: [
{term: {tags: {value:'plop', _name: 'plop'}}},
{term: {tags: {value:'plip', _name: 'plip'}}}
]
}
}
which is equivalent to (but I used the former one to debug):
query: {
bool: {
should: [
{terms: {tags: ['plop', 'plip']}}
]
}
}
Then, I find really strange scores:
[
{ id: '2', score: 0.88002616, tags: [ 'plop', 'plup' ] },
{ id: '6', score: 0.88002616, tags: [ 'plup', 'plip' ] },
{ id: '5', score: 0.5063205, tags: [ 'plop', 'plip', 'plup' ] },
{ id: '7', score: 0.3610978, tags: [ 'plop', 'plip' ] },
{ id: '1', score: 0.29277915, tags: [ 'plop', 'plip', 'plup' ] },
{ id: '3', score: 0.2876821, tags: [ 'plop' ] }
]
Here is the detail of the response:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 6,
"max_score": 0.88002616,
"hits": [
{
"_index": "myindex",
"_type": "mytype",
"_id": "2",
"_score": 0.88002616,
"_source": {
"tags": [
"plop",
"plup"
]
},
"matched_queries": [
"plop"
]
},
{
"_index": "myindex",
"_type": "mytype",
"_id": "6",
"_score": 0.88002616,
"_source": {
"tags": [
"plup",
"plip"
]
},
"matched_queries": [
"plip"
]
},
{
"_index": "myindex",
"_type": "mytype",
"_id": "5",
"_score": 0.5063205,
"_source": {
"tags": [
"plop",
"plip",
"plup"
]
},
"matched_queries": [
"plop",
"plip"
]
},
{
"_index": "myindex",
"_type": "mytype",
"_id": "7",
"_score": 0.3610978,
"_source": {
"tags": [
"plop",
"plip"
]
},
"matched_queries": [
"plop",
"plip"
]
},
{
"_index": "myindex",
"_type": "mytype",
"_id": "1",
"_score": 0.29277915,
"_source": {
"tags": [
"plop",
"plip",
"plup"
]
},
"matched_queries": [
"plop",
"plip"
]
},
{
"_index": "myindex",
"_type": "mytype",
"_id": "3",
"_score": 0.2876821,
"_source": {
"tags": [
"plop"
]
},
"matched_queries": [
"plop"
]
}
]
}
}
So, two questions:
Why does a row matching only one query (ids 2 and 6) get a better score than one matching two (ids 1, 5 and 7)?
Why can two rows with the same tags have different scores (ids 1 and 5)?
Did I miss something?
OK, I figured out your real problem. Elasticsearch by default uses 5 shards to store your index data, and with a small number of documents this can matter when the _score value is computed. Some theory about shards: https://www.elastic.co/guide/en/elasticsearch/reference/current/_basic_concepts.html
Why does it matter? Because for better performance each shard computes the _score on its own data, and while computing the score Elasticsearch uses the TF/IDF algorithm, which relies on the total number of documents and the frequency of the search terms IN THAT SHARD (https://www.elastic.co/guide/en/elasticsearch/guide/current/scoring-theory.html).
To solve this problem you can create the index with a single shard, like this:
{
"settings": {
"number_of_shards" : 1,
"number_of_replicas" : 0
},
"mappings": {
"my_type": {
"properties": {
"tags": {
"type": "keyword"
}
}
}
}
}
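To apply this you have to create the index with these settings up front, since number_of_shards cannot be changed on an existing index. A minimal sketch (the index name test1 is just an example, matching the URL below):
PUT /test1
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "my_type": {
      "properties": {
        "tags": {
          "type": "keyword"
        }
      }
    }
  }
}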
You can verify my theory using ?explain in your search query:
http://localhost:9200/test1/my_type/_search?explain
Or you can read this example if you need more ;)
These are my results for your query (["plop", "plip"]):
{
"took": 5,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 6,
"max_score": 0.9808292,
"hits": [
{
"_index": "test",
"_type": "my_type",
"_id": "2",
"_score": 0.9808292,
"_source": {
"tags": [
"plop",
"plup"
]
}
},
{
"_index": "test",
"_type": "my_type",
"_id": "6",
"_score": 0.9808292,
"_source": {
"tags": [
"plup",
"plip"
]
}
},
{
"_index": "test",
"_type": "my_type",
"_id": "5",
"_score": 0.5753642,
"_source": {
"tags": [
"plop",
"plip",
"plup"
]
}
},
{
"_index": "test",
"_type": "my_type",
"_id": "1",
"_score": 0.36464313,
"_source": {
"tags": [
"plop",
"plip",
"plup"
]
}
},
{
"_index": "test",
"_type": "my_type",
"_id": "7",
"_score": 0.36464313,
"_source": {
"tags": [
"plop",
"plip"
]
}
},
{
"_index": "test",
"_type": "my_type",
"_id": "3",
"_score": 0.2876821,
"_source": {
"tags": [
"plop"
]
}
}
]
}
}
Why is the document with plop, plip, plup only third? Check the explain output for this one:
"_shard": "[test][1]",
"_node": "LjGrgIa7QgiPlEvMxqKOdA",
"_index": "test",
"_type": "my_type",
"_id": "5",
"_score": 0.5753642,
"_source": {
"tags": [
"plop",
"plip",
"plup"
]
},
This is the only doc in this shard, test[1] (I verified against the other returned docs)! So its IDF is computed from docFreq=1 and docCount=1, i.e. from the statistics of a one-document shard rather than from the whole index. Since the score is based on TF/IDF, these shard-local statistics are exactly what make the scores inconsistent. Check how this 0.5753642 score is computed for this doc:
"value": 0.2876821,
"description": "weight(tags:plop...
"details": [
{
"value": 0.2876821,
"description": "idf(docFreq=1, docCount=1)",
sum with
{
"value": 0.2876821,
"description": "weight(tags:plip..
"value": 0.2876821,
"description": "idf(docFreq=1, docCount=1)",
"details": []
},
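To make the arithmetic explicit (this is my own calculation, using the BM25 idf formula that Elasticsearch 5.x applies by default, idf = log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5))): with docFreq = 1 and docCount = 1 each term contributes log(1 + 0.5 / 1.5) ≈ 0.2876821, the tf normalization works out to 1.0 for a single occurrence in a one-document shard, and the two matching terms (plop and plip) sum to 0.5753642, exactly the score reported above.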
The problem I had is nicely explained in jgr's answer.
The solution I found is to use dfs_query_then_fetch as the search type.
Here is the resulting query with the JavaScript client:
body: {
query: {
bool: {
should: [
{terms: {tags: ['plop', 'plip']}}
]
}
},
searchType: 'dfs_query_then_fetch'
}
Note that with more data in the index type, this certainly wouldn't be needed because scores would balance naturally between shards.
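For reference, the same thing without the client library is just a search_type URL parameter on the search request; a sketch reusing the index and type names from the question:
GET /myindex/mytype/_search?search_type=dfs_query_then_fetch
{
  "query": {
    "bool": {
      "should": [
        {"terms": {"tags": ["plop", "plip"]}}
      ]
    }
  }
}
With dfs_query_then_fetch, Elasticsearch first gathers term and document frequencies from all shards and then scores with those global statistics, which removes the per-shard IDF skew described in jgr's answer.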
Related
I have several documents that look like the following stored in my Elasticsearch index:
PUT tests
{
"mappings": {
"_doc": {
"dynamic": false,
"properties": {
"objects": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
},
"text": {
"type": "text"
}
}
}
}
}
PUT tests/_doc/1
{
"text": "lel",
"objects": ["A"]
}
PUT tests/_doc/2
{
"text": "lol",
"objects": ["B"]
}
PUT tests/_doc/3
{
"text": "lil",
"objects": ["C"]
}
PUT tests/_doc/4
{
"text": "lul",
"objects": ["A", "B", "C"]
}
I want to query for objects with the following query:
GET _search
{
"query": {
"terms": {
"objects.keyword": ["A", "B", "C"]
}
}
}
The result includes all three sample objects I provided.
My question is simply whether I can give a higher importance (boost) to an object that has a full match (all keywords in the objects array) rather than only a partial match, and if so how, since I could not find any information on this in the Elasticsearch documentation.
This is the result I am currently receiving:
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 11,
"successful": 11,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 4,
"max_score": 1,
"hits": [
{
"_index": "tests",
"_type": "_doc",
"_id": "2",
"_score": 1,
"_source": {
"text": "lol",
"objects": [
"B"
]
}
},
{
"_index": "tests",
"_type": "_doc",
"_id": "4",
"_score": 1,
"_source": {
"text": "lul",
"objects": [
"A",
"B",
"C"
]
}
},
{
"_index": "tests",
"_type": "_doc",
"_id": "1",
"_score": 1,
"_source": {
"text": "lel",
"objects": [
"A"
]
}
},
{
"_index": "tests",
"_type": "_doc",
"_id": "3",
"_score": 1,
"_source": {
"text": "lil",
"objects": [
"C"
]
}
}
]
}
}
I think your best bet is using a bool query with should and minimum_should_match: 1.
GET _search
{
"query": {
"bool": {
"should": [
{
"term": {
"objects.keyword": "A"
}
},
{
"term": {
"objects.keyword": "B"
}
},
{
"term": {
"objects.keyword": "C"
}
}
],
"minimum_should_match": 1
}
}
}
Results:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 6,
"successful": 6,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 4,
"max_score": 1.5686159,
"hits": [
{
"_index": "tests",
"_type": "_doc",
"_id": "4",
"_score": 1.5686159,
"_source": {
"text": "lul",
"objects": [
"A",
"B",
"C"
]
}
},
{
"_index": "tests",
"_type": "_doc",
"_id": "1",
"_score": 0.2876821,
"_source": {
"text": "lel",
"objects": [
"A"
]
}
},
{
"_index": "tests",
"_type": "_doc",
"_id": "3",
"_score": 0.2876821,
"_source": {
"text": "lil",
"objects": [
"C"
]
}
},
{
"_index": "tests",
"_type": "_doc",
"_id": "2",
"_score": 0.18232156,
"_source": {
"text": "lol",
"objects": [
"B"
]
}
}
]
}
}
EDIT: Here's why, as explained by the docs (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html):
The bool query takes a more-matches-is-better approach, so the score from each matching must or should clause will be added together to provide the final _score for each document.
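Building on that: if you want the full match (all keywords present) to stand out more clearly, one option is to add an extra should clause that requires all three terms and boost it. This is my own sketch, not part of the documentation quote above, and the boost value of 2.0 is arbitrary:
GET _search
{
  "query": {
    "bool": {
      "should": [
        { "term": { "objects.keyword": "A" } },
        { "term": { "objects.keyword": "B" } },
        { "term": { "objects.keyword": "C" } },
        {
          "bool": {
            "must": [
              { "term": { "objects.keyword": "A" } },
              { "term": { "objects.keyword": "B" } },
              { "term": { "objects.keyword": "C" } }
            ],
            "boost": 2.0
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
Only documents containing A, B and C match the inner bool, so they get that boosted contribution on top of the per-term scores.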
My data looks like this after sorting. Below is the response, but it is not the expected output.
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 8,
"max_score": null,
"hits": [
{
"_index": "test",
"_type": "size",
"_id": "AWVVTy-v9pbhY5QtJPGe",
"_score": null,
"_source": {
"e_size": "5"
},
"sort": [
"5"
]
},
{
"_index": "test",
"_type": "size",
"_id": "AWVVTmY89pbhY5QtJPGa",
"_score": null,
"_source": {
"e_size": "3"
},
"sort": [
"3"
]
},
{
"_index": "test",
"_type": "size",
"_id": "AWVVTxXe9pbhY5QtJPGd",
"_score": null,
"_source": {
"e_size": "100"
},
"sort": [
"100"
]
},
{
"_index": "test",
"_type": "size",
"_id": "AWVVTpYJ9pbhY5QtJPGc",
"_score": null,
"_source": {
"e_size": "10-"
},
"sort": [
"10-"
]
},
{
"_index": "test",
"_type": "size",
"_id": "AWVVTk1x9pbhY5QtJPGY",
"_score": null,
"_source": {
"e_size": "1-7"
},
"sort": [
"1-7"
]
},
{
"_index": "test",
"_type": "size",
"_id": "AWVVTnsm9pbhY5QtJPGb",
"_score": null,
"_source": {
"e_size": "1-6"
},
"sort": [
"1-6"
]
},
{
"_index": "test",
"_type": "size",
"_id": "AWVVTjAq9pbhY5QtJPGX",
"_score": null,
"_source": {
"e_size": "1-2"
},
"sort": [
"1-2"
]
},
{
"_index": "test",
"_type": "size",
"_id": "AWVVThkT9pbhY5QtJPGW",
"_score": null,
"_source": {
"e_size": "1"
},
"sort": [
"1"
]
}
]
}
}
Below is the sort used to get the results.
{
"sort": [{
"e_size": {
"order": "desc"
}
}]
}
The e_size type is "string" and the index is "not_analyzed".
How do I fix this sort issue? Do we need to use an analyzer for this, or should the e_size data type be different?
It has sorted it correctly. You can try to sort those strings yourself and you would get the same result.
Example:
["100", "10-"]
here "0" < "-" so that's why "10-" comes after "100" and so on. you can think of it as how do you find some words in a dictionary.
Either make e_size a number or use a consistent string format for each e_size value.
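If you go the numeric route, note that an existing field's type cannot be changed in place, so the data has to be reindexed, and values like "10-" or "1-7" would first have to be normalized to plain numbers (or stored in a separate numeric field). A minimal mapping sketch, reusing the index and type names from the response above and assuming integer values:
PUT test
{
  "mappings": {
    "size": {
      "properties": {
        "e_size": {
          "type": "integer"
        }
      }
    }
  }
}
With a numeric e_size, "order": "desc" then sorts by value instead of by string comparison.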
Currently the ES logs are indexed in a way that some fields have a list instead of a single value.
For example:
_source: {
  "field1": ["item1", "item2", "item3"],
  "field2": "something",
  "field3": "something_else"
}
Of course, the length of the list is not always the same. I'm trying to find a way to count, for each item, the number of logs that contain it (so some logs will be counted multiple times).
I know I have to use aggs, but how can I form the right query (after -d)?
You can use the query below, which uses a terms aggregation and top_hits.
{
"size": 0,
"aggs": {
"group": {
"terms": {
"script": "_source.field1.each{}"
},
"aggs":{
"top_hits_log" :{
"top_hits" :{
}
}
}
}
}
}
Output will be:
"buckets": [
{
"key": "item1",
"doc_count": 3,
"top_hits_log": {
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "so",
"_type": "test",
"_id": "1",
"_score": 1,
"_source": {
"field1": [
"item1",
"item2",
"item3"
],
"field2": "something1"
}
},
{
"_index": "so",
"_type": "test",
"_id": "2",
"_score": 1,
"_source": {
"field1": [
"item1"
],
"field2": "something2"
}
},
{
"_index": "so",
"_type": "test",
"_id": "3",
"_score": 1,
"_source": {
"field1": [
"item1",
"item2"
],
"field2": "something3"
}
}
]
}
}
},
{
"key": "item2",
"doc_count": 2,
"top_hits_log": {
"hits": {
"total": 2,
"max_score": 1,
"hits": [
{
"_index": "so",
"_type": "test",
"_id": "1",
"_score": 1,
"_source": {
"field1": [
"item1",
"item2",
"item3"
],
"field2": "something1"
}
},
{
"_index": "so",
"_type": "test",
"_id": "3",
"_score": 1,
"_source": {
"field1": [
"item1",
"item2"
],
"field2": "something3"
}
}
]
}
}
},
{
"key": "item3",
"doc_count": 1,
"top_hits_log": {
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "so",
"_type": "test",
"_id": "1",
"_score": 1,
"_source": {
"field1": [
"item1",
"item2",
"item3"
],
"field2": "something1"
}
}
]
}
}
}
]
Make sure to enable dynamic scripting by setting script.disable_dynamic: false in elasticsearch.yml.
Hope this helps.
There is no need to use scripting; it will be slow, especially the _source parsing. You also need to make sure your field1 is not_analyzed, or you will get weird results, since the terms aggregation runs on the unique tokens in the inverted index.
{
"size": 0,
"aggs": {
"unique_items": {
"terms": {
"field": "field1",
"size": 100
},
"aggs": {
"documents": {
"top_hits": {
"size": 10
}
}
}
}
}
}
Here the size is 100 inside the terms aggregation; change this according to how many unique values you think you have (the default is 10).
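For reference, a minimal mapping sketch that keeps field1 as single not_analyzed tokens (the index and type names so/test are just placeholders matching the example output in the other answer):
PUT so
{
  "mappings": {
    "test": {
      "properties": {
        "field1": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
This way the terms aggregation buckets on the exact item values ("item1", "item2", ...) rather than on analyzed tokens.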
Hope this helps!
How to sort by match, prioritising the leftmost matched words
Explanation
Sort the prefix query results by the matched word, but prioritise matches in words further to the left.
Tests I've made
Data
DELETE /test
PUT /test
PUT /test/person/_mapping
{
"properties": {
"name": {
"type": "multi_field",
"fields": {
"name": {"type": "string"},
"original": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
PUT /test/person/1
{"name": "Berta Kassulke"}
PUT /test/person/2
{"name": "Kaley Bartoletti"}
PUT /test/person/3
{"name": "Kali Hahn"}
PUT /test/person/4
{"name": "Karolann Klein"}
PUT /test/person/5
{"name": "Sofia Mandez Kaloo"}
The mapping was added for the 'sort on original value' test.
Simple query
Query
POST /test/person/_search
{
"query": {
"prefix": {"name": {"value": "ka"}}
}
}
Result
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 4,
"max_score": 1,
"hits": [
{
"_index": "test",
"_type": "person",
"_id": "4",
"_score": 1,
"_source": {
"name": "Karolann Klein"
}
},
{
"_index": "test",
"_type": "person",
"_id": "5",
"_score": 1,
"_source": {
"name": "Sofia Mandez Kaloo"
}
},
{
"_index": "test",
"_type": "person",
"_id": "1",
"_score": 1,
"_source": {
"name": "Berta Kassulke"
}
},
{
"_index": "test",
"_type": "person",
"_id": "2",
"_score": 1,
"_source": {
"name": "Kaley Bartoletti"
}
},
{
"_index": "test",
"_type": "person",
"_id": "3",
"_score": 1,
"_source": {
"name": "Kali Hahn"
}
}
]
}
}
With sorting
Request
POST /test/person/_search
{
"query": {
"prefix": {"name": {"value": "ka"}}
},
"sort": {"name": {"order": "asc"}}
}
Result
{
"took": 7,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 4,
"max_score": null,
"hits": [
{
"_index": "test",
"_type": "person",
"_id": "2",
"_score": null,
"_source": {
"name": "Kaley Bartoletti"
},
"sort": [
"bartoletti"
]
},
{
"_index": "test",
"_type": "person",
"_id": "1",
"_score": null,
"_source": {
"name": "Berta Kassulke"
},
"sort": [
"berta"
]
},
{
"_index": "test",
"_type": "person",
"_id": "3",
"_score": null,
"_source": {
"name": "Kali Hahn"
},
"sort": [
"hahn"
]
},
{
"_index": "test",
"_type": "person",
"_id": "5",
"_score": null,
"_source": {
"name": "Sofia Mandez Kaloo"
},
"sort": [
"kaloo"
]
},
{
"_index": "test",
"_type": "person",
"_id": "4",
"_score": null,
"_source": {
"name": "Karolann Klein"
},
"sort": [
"karolann"
]
}
]
}
}
With sort on original value
Query
POST /test/person/_search
{
"query": {
"prefix": {"name": {"value": "ka"}}
},
"sort": {"name.original": {"order": "asc"}}
}
Result
{
"took": 6,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 4,
"max_score": null,
"hits": [
{
"_index": "test",
"_type": "person",
"_id": "1",
"_score": null,
"_source": {
"name": "Berta Kassulke"
},
"sort": [
"Berta Kassulke"
]
},
{
"_index": "test",
"_type": "person",
"_id": "2",
"_score": null,
"_source": {
"name": "Kaley Bartoletti"
},
"sort": [
"Kaley Bartoletti"
]
},
{
"_index": "test",
"_type": "person",
"_id": "3",
"_score": null,
"_source": {
"name": "Kali Hahn"
},
"sort": [
"Kali Hahn"
]
},
{
"_index": "test",
"_type": "person",
"_id": "4",
"_score": null,
"_source": {
"name": "Karolann Klein"
},
"sort": [
"Karolann Klein"
]
},
{
"_index": "test",
"_type": "person",
"_id": "5",
"_score": null,
"_source": {
"name": "Sofia Mandez Kaloo"
},
"sort": [
"Sofia Mandez Kaloo"
]
}
]
}
}
Intended result
Sorted by name ascending, but prioritising matches on the leftmost words:
Kaley Bartoletti
Kali Hahn
Karolann Klein
Berta Kassulke
Sofia Mandez Kaloo
Good question. One way to achieve this would be a combination of an edge n-gram filter and a span_first query.
These are my settings:
{
"settings": {
"analysis": {
"analyzer": {
"my_custom_analyzer": {
"tokenizer": "standard",
"filter": ["lowercase",
"edge_filter",
"asciifolding"
]
}
},
"filter": {
"edge_filter": {
"type": "edgeNGram",
"min_gram": 2,
"max_gram": 8
}
}
}
},
"mappings": {
"person": {
"properties": {
"name": {
"type": "string",
"analyzer": "my_custom_analyzer",
"search_analyzer": "standard",
"fields": {
"standard": {
"type": "string"
}
}
}
}
}
}
}
After that I inserted your sample documents. Then I wrote the following query with dis_max. Notice that the end parameter of the first span_first query is 1, so it prioritizes (scores higher) the leftmost match. I am sorting first by score and then by name.
{
"query": {
"dis_max": {
"tie_breaker": 0.7,
"boost": 1.2,
"queries": [
{
"match": {
"name": "ka"
}
},
{
"span_first": {
"match": {
"span_term": {
"name": "ka"
}
},
"end": 1
}
},
{
"span_first": {
"match": {
"span_term": {
"name": "ka"
}
},
"end": 2
}
}
]
}
},
"sort": [
{
"_score": {
"order": "desc"
}
},
{
"name.standard": {
"order": "asc"
}
}
]
}
The result I get
"hits": [
{
"_index": "esedge",
"_type": "policy_data",
"_id": "2",
"_score": 0.72272325,
"_source": {
"name": "Kaley Bartoletti"
},
"sort": [
0.72272325,
"bartoletti"
]
},
{
"_index": "esedge",
"_type": "policy_data",
"_id": "3",
"_score": 0.72272325,
"_source": {
"name": "Kali Hahn"
},
"sort": [
0.72272325,
"hahn"
]
},
{
"_index": "esedge",
"_type": "policy_data",
"_id": "4",
"_score": 0.72272325,
"_source": {
"name": "Karolann Klein"
},
"sort": [
0.72272325,
"karolann"
]
},
{
"_index": "esedge",
"_type": "policy_data",
"_id": "1",
"_score": 0.54295504,
"_source": {
"name": "Berta Kassulke"
},
"sort": [
0.54295504,
"berta"
]
},
{
"_index": "esedge",
"_type": "policy_data",
"_id": "5",
"_score": 0.2905494,
"_source": {
"name": "Sofia Mandez Kaloo"
},
"sort": [
0.2905494,
"kaloo"
]
}
]
I hope this helps.
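If you want to see why this works, you can run the custom analyzer over one of the names; a sketch, assuming the index was created with the settings above under the name esedge shown in the results:
GET /esedge/_analyze?analyzer=my_custom_analyzer&text=Kaley%20Bartoletti
With min_gram 2 and max_gram 8, the edge n-gram filter indexes grams such as ka, kal, kale, kaley for the first word and ba, bar, ... for the second, each at its word's position. The standard search analyzer leaves the query "ka" as a single term, so it matches those grams directly, and span_first with end 1 only matches when the gram comes from the first word, which is what pushes leftmost matches up.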
I am using Elasticsearch via NEST (C#). I have a large list of information about people:
{
firstName: 'Frank',
lastName: 'Jones',
City: 'New York'
}
I'd like to be able to filter and sort this list of items by lastName, and also order by length, so that people with only 5 characters in their last name appear at the beginning of the result set, followed by people with 10 characters.
So, in pseudocode, I'd like to do something like:
list.wildcard("j*").sort(m => lastName.length)
You can do the sorting with script-based sorting.
As a toy example, I set up a trivial index with a few documents:
PUT /test_index
POST /test_index/doc/_bulk
{"index":{"_id":1}}
{"name":"Bob"}
{"index":{"_id":2}}
{"name":"Jeff"}
{"index":{"_id":3}}
{"name":"Darlene"}
{"index":{"_id":4}}
{"name":"Jose"}
Then I can order search results like this:
POST /test_index/_search
{
"query": {
"match_all": {}
},
"sort": {
"_script": {
"script": "doc['name'].value.length()",
"type": "number",
"order": "asc"
}
}
}
...
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 4,
"max_score": null,
"hits": [
{
"_index": "test_index",
"_type": "doc",
"_id": "1",
"_score": null,
"_source": {
"name": "Bob"
},
"sort": [
3
]
},
{
"_index": "test_index",
"_type": "doc",
"_id": "4",
"_score": null,
"_source": {
"name": "Jose"
},
"sort": [
4
]
},
{
"_index": "test_index",
"_type": "doc",
"_id": "2",
"_score": null,
"_source": {
"name": "Jeff"
},
"sort": [
4
]
},
{
"_index": "test_index",
"_type": "doc",
"_id": "3",
"_score": null,
"_source": {
"name": "Darlene"
},
"sort": [
7
]
}
]
}
}
To filter by length, I can use a script filter in a similar way:
POST /test_index/_search
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"script": {
"script": "doc['name'].value.length() > 3",
"params": {}
}
}
}
},
"sort": {
"_script": {
"script": "doc['name'].value.length()",
"type": "number",
"order": "asc"
}
}
}
...
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 3,
"max_score": null,
"hits": [
{
"_index": "test_index",
"_type": "doc",
"_id": "4",
"_score": null,
"_source": {
"name": "Jose"
},
"sort": [
4
]
},
{
"_index": "test_index",
"_type": "doc",
"_id": "2",
"_score": null,
"_source": {
"name": "Jeff"
},
"sort": [
4
]
},
{
"_index": "test_index",
"_type": "doc",
"_id": "3",
"_score": null,
"_source": {
"name": "Darlene"
},
"sort": [
7
]
}
]
}
}
Here's the code I used:
http://sense.qbox.io/gist/22fef6dc5453eaaae3be5fb7609663cc77c43dab
P.S.: If any of the last names contain spaces, you might want to use "index": "not_analyzed" on that field.
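A minimal sketch of that mapping for the toy index above (it has to be in place before the documents are indexed):
PUT /test_index
{
  "mappings": {
    "doc": {
      "properties": {
        "name": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
With the field not_analyzed, doc['name'].value returns the whole name rather than a single analyzed token, so both the length filter and the length sort operate on the full string.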