Searching multiple types in elasticsearch - elasticsearch

I have a use case where there are two different types in the same index. The two types have different structures and mappings.
I need to query both types at the same time, using a different query DSL for each.
How can I build my query DSL to query more than one type of the same index simultaneously?
I looked into the Elasticsearch guide at https://www.elastic.co/guide/en/elasticsearch/guide/current/multi-index-multi-type.html but there is no proper explanation there. According to it, even if I set two different types in my request:
/index/type1,type2/_search
I will have to send the same query DSL.

You need to use the multi search API and the _msearch endpoint:
curl -XGET localhost:9200/index/_msearch -d '
{"type": "type1"}
{"query" : {"match_all" : {}}, "from" : 0, "size" : 10}
{"type": "type2"}
{"query" : {"match_all" : {}}, "from" : 0, "size" : 10}
'
Note: make sure every line is terminated by a newline, including the last one.
You'll get two responses, in the same order as the requests.
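The newline-delimited body is easy to get wrong by hand, so here is a minimal Python sketch of how it could be assembled. The helper name `build_msearch_body` and the fields `title` and `status` are hypothetical and only illustrate that each type can carry its own query DSL. The `type` header only applies to Elasticsearch versions that still support mapping types.

```python
import json


def build_msearch_body(searches):
    """Serialize (header, body) pairs into the newline-delimited _msearch format."""
    lines = []
    for header, body in searches:
        lines.append(json.dumps(header))  # header line: which index/type to search
        lines.append(json.dumps(body))    # body line: the query DSL for that search
    # Every line, including the last one, must end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical queries: a different DSL per type.
searches = [
    ({"type": "type1"}, {"query": {"match": {"title": "foo"}}, "from": 0, "size": 10}),
    ({"type": "type2"}, {"query": {"term": {"status": "active"}}, "from": 0, "size": 10}),
]
payload = build_msearch_body(searches)
```

The resulting `payload` string is what would be sent as the request body to `/index/_msearch`.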

Related

Improving performance of search query using index field when working with alias

I am using an alias name when writing data using the Bulk API.
I have 2 questions:
Can I get the index name after writing data using the alias name, maybe as part of the response?
Can I improve performance if I send search queries to specific indexes instead of searching all indexes of the same alias?
If you're using an alias name for writes, that alias can only point to a single index (or have a single designated write index), and you will receive the name of that index back in the bulk response.
For instance, if test_alias is an alias to the test index, then when sending this bulk command:
POST test_alias/_doc/_bulk
{"index":{}}
{"foo": "bar"}
You will receive this response:
{
  "index" : {
    "_index" : "test", <---- here is the real index name
    "_type" : "_doc",
    "_id" : "WtcviYABdf6lG9Jldg0d",
    "_version" : 1,
    "result" : "created",
    "_shards" : {
      "total" : 2,
      "successful" : 2,
      "failed" : 0
    },
    "_seq_no" : 0,
    "_primary_term" : 1,
    "status" : 201
  }
}
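As a rough illustration of reading the concrete index name back, the following Python sketch collects the `_index` values from a bulk response. It assumes the full response wraps each per-action result (like the one above) in an `items` array; the helper name `written_indices` and the trimmed sample response are hypothetical.

```python
def written_indices(bulk_response):
    """Return the set of concrete index names reported in a bulk response."""
    names = set()
    for item in bulk_response["items"]:
        # Each item has one key naming the action ("index", "create", ...).
        for action_result in item.values():
            names.add(action_result["_index"])
    return names


# Trimmed, hypothetical response for a write through the alias test_alias.
sample_bulk_response = {
    "errors": False,
    "items": [
        {"index": {"_index": "test", "_id": "WtcviYABdf6lG9Jldg0d", "status": 201}},
    ],
}
indices = written_indices(sample_bulk_response)
```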
Common sense has it that searching a single index is always faster than searching an alias spanning several indexes, but if the alias only spans a single index, there's no difference.
You can provide multiple index names when searching. If you are using an alias that spans multiple indices, a search on it covers all of them by default. If you want to restrict the search to only a few indices in your alias, that is also possible, based on the fields in the underlying indices.
You can read the "Filter-based aliases to limit access to data" section in this blog on how to achieve it. Because it queries fewer indices and less data, search performance would be better.
Also, an alias can have only a single writable index. You can get its name from the _cat/aliases?v API response as well, which shows which index is the write index for the alias; you can see the sample output here.

Elastic Search pipeline search queries

I am looking for a way to pipeline multiple queries into Elasticsearch. My main problem is that when I receive the results, I want to be able to know which query generated each result. In pseudo-code, I would like to do something like the following:
query1="James Bond"
query2="Sean Connery"
query3="Charlie Chaplin"
pipeline=new ElasticSearchPipeline()
pipeline.add(query1);pipeline.add(query2);pipeline.add(query3)
pipeline.execute()
jamesBondResults=pipeline.getResultsForQuery(query1)
seanConneryResults=pipeline.getResultsForQuery(query2)
charlieChaplinResults=pipeline.getResultsForQuery(query3)
The key feature is that I want to avoid the overhead of sending multiple requests to the ES server, but still be able to treat the results as if I had sent the queries one by one.
The multi search API is exactly what you're looking for.
You can send many queries and the response will contain an array with the responses to each query in the same order:
curl -XPOST localhost:9200/_msearch -d '
{"index" : "test1"}
{"query" : {"match_all" : {}}, "from" : 0, "size" : 10}
{"index" : "test2"}
{"query" : {"match_all" : {}}}
'
The response array of the above multi search queries will contain two ES responses with the documents from the first and second queries.
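Because the responses come back in request order, the `getResultsForQuery` behaviour from the question can be approximated by pairing queries and responses by position. This is only a sketch: `pair_results` is a hypothetical helper, and the sample response is a heavily trimmed, made-up stand-in for what `_msearch` returns under its `responses` key.

```python
def pair_results(queries, msearch_response):
    """Map each query to its response, relying on _msearch preserving request order."""
    responses = msearch_response["responses"]
    if len(responses) != len(queries):
        raise ValueError("expected exactly one response per query")
    return dict(zip(queries, responses))


queries = ["James Bond", "Sean Connery", "Charlie Chaplin"]

# Hypothetical, trimmed response used for illustration only.
sample_response = {
    "responses": [
        {"hits": {"hits": [{"_id": "1"}]}},
        {"hits": {"hits": []}},
        {"hits": {"hits": [{"_id": "7"}, {"_id": "9"}]}},
    ]
}

results = pair_results(queries, sample_response)
james_bond_hits = results["James Bond"]["hits"]["hits"]
```

Keying the dictionary by the query string assumes the queries are distinct; with duplicates, keeping the positional list is the safer choice.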

Elasticsearch: is bulk search possible?

I know there is support for bulk index operations, but is it possible to do the same for search queries? I want to send many different, unrelated queries (to do precision/recall testing), and it would probably be faster using a bulk query.
Yes, you can use the multi search API and the /_msearch endpoint to send as many queries as you wish in one shot.
curl -XPOST localhost:9200/_msearch -d '
{"index" : "test1"}
{"query" : {"match_all" : {}}, "from" : 0, "size" : 10}
{"index" : "test2"}
{"query" : {"match_all" : {}}}
'
You'll get a responses array with the response of each query in the same order as in the request.
Note:
make sure to separate each line by a newline character
make sure to add the extra newline after the last query.

How to view the response for multiple indices for a single query

I have created multiple indices in Elasticsearch and have passed a single query to all of them. Is there any way to know how many results came from each index?
Here is the screenshot of my elasticsearch head, showing a single aggregation applied to two indices
screenshot:
As you can see in the figure, I have done an aggregation named "posted_time" on the indices foodfind and comics (red box 1).
But in the response window, to the right, only the results for the index "comics" are shown. How can I see the results for the other index too?
You can use a terms aggregation on the _index field for this.
Let's say you need to run the same query on index-a, index-b and index-c.
You need to make the request in this pattern:
curl -XPOST 'http://localhost:9200/index-a,index-b,index-c/_search' -d '{
  "aggs" : {
    "indexStats" : {
      "terms" : {
        "field" : "_index"
      }
    }
  }
}'
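To turn the aggregation result into per-index counts, the buckets can be read out of the response. The Python sketch below assumes the aggregation is named `indexStats` as in the request above; the helper name `counts_per_index` and the sample numbers are made up for illustration.

```python
def counts_per_index(search_response, agg_name="indexStats"):
    """Return {index name: document count} from a terms aggregation on _index."""
    buckets = search_response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Hypothetical, trimmed search response.
sample_search_response = {
    "aggregations": {
        "indexStats": {
            "buckets": [
                {"key": "index-a", "doc_count": 12},
                {"key": "index-b", "doc_count": 3},
            ]
        }
    }
}
per_index = counts_per_index(sample_search_response)
```

An index with no matching documents produces no bucket, so it is simply absent from the resulting dictionary.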

How to enable fuzziness for phrase queries in ElasticSearch

We're using ElasticSearch for searching through millions of tags. Our users should be able to include boolean operators (+, -, "xy", AND, OR, brackets). If no hits are returned, we fall back to a spelling suggestion provided by ES and search again. That's our query:
$ curl -XGET 'http://127.0.0.1:9200/my_index/my_type/_search' -d '
{
  "query" : {
    "query_string" : {
      "query" : "some test query +bools -included",
      "default_operator" : "AND"
    }
  },
  "suggest" : {
    "text" : "some test query +bools -included",
    "simple_phrase" : {
      "phrase" : {
        "field" : "my_tags_field",
        "size" : 1
      }
    }
  }
}'
Instead of only providing a fallback to spelling suggestions, we'd like to enable fuzzy matching. If, for example, a user searches for "stackoverfolw", ES should return matches for "stackoverflow".
Additional question: What's the better-performing method for "correcting" spelling errors? As it is now, we have to perform two subsequent requests, first with the original search term, then with the term suggested by ES.
The query_string query does support some fuzziness, but only when using the ~ operator, which I think doesn't fit your use case. I would add a fuzzy query then, and combine it in an OR with the existing query_string. For instance, you can use a bool query and add the fuzzy query as a should clause, keeping the original query_string as a must clause.
As for your additional question about how to correct spelling mistakes: I would use fuzzy queries to correct them automatically, and two subsequent requests if you want the user to select the right correction from a list (e.g. "Did you mean"), but your approach sounds good too.
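A minimal Python sketch of such a combined request body is shown below. It takes the "or" reading of the answer: both queries are should clauses with `minimum_should_match` set to 1, so a document matches if either one does. (In a bool query that also has a must clause, should clauses are optional by default and mainly affect scoring.) The helper name `build_fuzzy_fallback_query`, the choice of `fuzziness: "AUTO"`, and applying the fuzzy query to a single term are assumptions, not something the original answer specifies.

```python
import json


def build_fuzzy_fallback_query(user_query, field, term):
    """Build a bool query that matches the query_string OR a fuzzy term match."""
    return {
        "query": {
            "bool": {
                "should": [
                    # The original boolean-operator query, unchanged.
                    {"query_string": {"query": user_query, "default_operator": "AND"}},
                    # Fuzzy match on one term; "AUTO" fuzziness is an assumption.
                    {"fuzzy": {field: {"value": term, "fuzziness": "AUTO"}}},
                ],
                # Require at least one of the two should clauses to match.
                "minimum_should_match": 1,
            }
        }
    }


body = build_fuzzy_fallback_query("stackoverfolw", "my_tags_field", "stackoverfolw")
request_json = json.dumps(body)
```

Since the fuzzy query works on a single term, a multi-word user query would need one fuzzy clause per term, which this sketch leaves out.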
