How to use Lucene SpanQuery in Elasticsearch

For my project, I plan to use Elasticsearch Span Near queries, with the constraint that certain tokens may have to be searched with fuzziness. I was able to generate a set of SpanQuery (org.apache.lucene.search.spans.SpanQuery) objects, some with fuzziness enabled and some without, but I couldn't figure out how to use this set of SpanQuery objects in an Elasticsearch spanNearQuery.
Can someone point me to the right samples or docs? And is there any way to construct an ES SpanNearQueryBuilder with fuzziness enabled on some clauses?

You can wrap a fuzzy query in a span query with the Span Multi Term Query (span_multi):
{
"span_near" : {
"clauses" : [
{ "span_term" : { "field" : "value1" } },
{ "span_multi" : {
"match" : {
"fuzzy" : { "field" : { "value" : "value2" } }
}
} }
],
...
}
}
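As a rough illustration of the answer above, the following Python sketch builds a span_near request body in which only selected tokens are wrapped in span_multi with a fuzzy query. The helper names (span_clause, span_near), the field name "body", and the sample tokens are hypothetical and not from the original post; the sketch only assembles JSON and does not call any Elasticsearch client.

```python
import json

def span_clause(field, value, fuzzy=False, fuzziness="AUTO"):
    """Return one span clause: an exact span_term, or a fuzzy query wrapped in span_multi."""
    if not fuzzy:
        return {"span_term": {field: value}}
    return {
        "span_multi": {
            "match": {
                "fuzzy": {field: {"value": value, "fuzziness": fuzziness}}
            }
        }
    }

def span_near(field, tokens, slop=0, in_order=True):
    """Build a span_near query; tokens is a list of (value, is_fuzzy) pairs."""
    return {
        "span_near": {
            "clauses": [span_clause(field, value, is_fuzzy) for value, is_fuzzy in tokens],
            "slop": slop,
            "in_order": in_order,
        }
    }

# Hypothetical example: "quick" must match exactly, "borwn" is matched fuzzily.
query = {"query": span_near("body", [("quick", False), ("borwn", True)], slop=2)}
print(json.dumps(query, indent=2))
```

The printed JSON can be pasted into a search request as-is. In Java, the same shape would presumably be built with the span query builders, but the exact builder signatures depend on the client version.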

Related

Combining results of two queries

I'm using Kibana v6.1.1 and trying to send two different queries in one GET request, so that I can use the "must" or "should" terms more than once.
When I run this query under "Dev Tools" in Kibana, it works.
When I try to apply this "double query" (without the GET line, of course) under "Discover"->"Add a filter"->"Edit filter"->"Edit Query DSL", it doesn't accept the {} syntax that creates an 'OR' between the queries.
The two "must" clauses need to stay separate but live in the same filter.
GET _my_index/_search
{
"query" : {
"bool" : {
"must" : [{
...
}]
}
}
}
{}
{
"query" : {
"bool" : {
"must" : [{
...
}]
}
}
}
P.S.
Using the simple_query_string doesn't seem to solve the problem and so far, I couldn't find the way to combine these two queries.
I'm not sure what you actually want to achieve. Use the following if at least one of the should clauses has to match (when the bool query has no must or filter clauses, minimum_should_match implicitly defaults to 1, but you can also set an explicit value):
{
"query" : {
"bool" : {
"should" : [
{
...
},
{
...
}
]
}
}
}
If you want to run independent queries, use the Multi Search API (_msearch).
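For the multi search route, the request body is newline-delimited JSON: one header line followed by one search body line per query, with a trailing newline. The sketch below only assembles that body; the function name msearch_body, the index name, and the two sample queries are hypothetical placeholders, not taken from the question.

```python
import json

def msearch_body(index, queries):
    """Build an NDJSON body for _msearch: a header line, then a search body, per query."""
    lines = []
    for query in queries:
        lines.append(json.dumps({"index": index}))   # header line
        lines.append(json.dumps({"query": query}))   # search body line
    return "\n".join(lines) + "\n"                   # the body must end with a newline

# Hypothetical stand-ins for the two independent bool queries.
first = {"bool": {"must": [{"match": {"status": "open"}}]}}
second = {"bool": {"must": [{"match": {"status": "closed"}}]}}

body = msearch_body("my_index", [first, second])
print(body)
```

The resulting string would be sent to the _msearch endpoint with an NDJSON content type. Each query returns its own entry in the response, so this does not merge the two result sets into one.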

Will Elasticsearch remove the existing filter cache after I set _cache to false in the query?

Say I have a filter in a query like this:
{
"query" : {
"filtered" : {
"filter" : {
"term" : {
"price" : 20
}
}
}
}
}
According to the official docs, a filter cache entry will be associated with the key "price".
One day, I change the query as follows:
{
"query" : {
"filtered" : {
"filter" : {
"term" : {
"price" : 20,
"_cache" : false
}
}
}
}
}
Will Elasticsearch automatically remove the existing cache?
Not really sure. It will probably be removed eventually, but probably not immediately. It doesn't really matter, however, as setting _cache to false tells Elasticsearch not to use the cache even if it is technically still there. If you want to clear the cache manually, there's an API for it.
Here is an example:
curl -XPOST 'http://localhost:9200/twitter/_cache/clear'
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-clearcache.html

How to get the most used words in Elasticsearch?

I am using a terms aggregation in Elasticsearch to get the most used words in an index with 380607390 (about 380 million) documents, and my application receives a timeout.
The aggregated field is a text field with a simple analyzer (it holds post content).
My question is: is the terms aggregation the correct aggregation for that, given such a large content field?
{
"aggs" : {
"keywords" : {
"terms" : { "field" : "post_content" }
}
}
}
You can try this using min_doc_count. You would of course not want the words that have been used just once, twice, or three times.
Set min_doc_count as per your requirement. This should reduce the time.
{
"aggs" : {
"keywords" : {
"terms" : {
"field" : "post_content",
"min_doc_count" : 5
}
}
}
}

Elasticsearch query to find documents that don't exist

Is there a way in Elasticsearch, through filters, queries, aggregations, etc., to search for a list of document ids and get back which ids did not hit?
With a small list it is easy enough to compare the results against the requested ids, but I'm dealing with lists of ids in the tens of thousands, and it is not going to be performant to do that.
Do you mean something like this, from https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-not-filter.html:
"filtered" : {
"query" : {
"term" : { "name.first" : "shay" }
},
"filter" : {
"not" : {
"range" : {
"postDate" : {
"from" : "2010-03-01",
"to" : "2010-04-01"
}
}
}
}
}
Take a look at the guide at https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html
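A different angle on the original question, offered only as a sketch: if the ids are fetched with the Multi Get API (_mget), each entry in the response carries a found flag, so the missing ids can be collected in a single pass on the client. The function name missing_ids and the sample response below are hypothetical; a real response contains more fields per document.

```python
def missing_ids(requested_ids, mget_response):
    """Return the requested ids that the _mget response did not report as found."""
    found = {
        doc["_id"]
        for doc in mget_response.get("docs", [])
        if doc.get("found")
    }
    # Set lookup keeps this linear in the number of requested ids.
    return [doc_id for doc_id in requested_ids if doc_id not in found]

# Hypothetical, trimmed-down response shape for three requested ids.
sample_response = {
    "docs": [
        {"_id": "1", "found": True},
        {"_id": "2", "found": False},
        {"_id": "3", "found": True},
    ]
}

missing = missing_ids(["1", "2", "3"], sample_response)
print(missing)  # ['2']
```

Because the lookup goes through a set, tens of thousands of ids should not be a problem on the client side; very large id lists may still need to be split across several requests.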

Is there any way to create an alias on a search query?

I want to create an alias on top of this. Index - test, Type - type
POST /test/type/_search
{
"query": {
"match": {
"brand_name": "xyz"
}
}
}
But I don't see any way of doing it, since Elasticsearch aliases can only be created on filters, and when I try with a term filter, I don't get the results I want. Any trick to achieve this?
You can use a query filter to use any query as a filter:
"filter" : {
"query" : {
"match" : {
"brand_name" : "xyz"
}
}
}
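To show where that filter would go, here is a sketch that assembles an _aliases request body with the match query wrapped in a query filter, following the syntax in the answer. The function name filtered_alias_action and the alias name "xyz_brand" are hypothetical. The "query" filter wrapper belongs to older Elasticsearch versions; on newer versions the query is assumed to go directly under "filter", so check the syntax for your version.

```python
import json

def filtered_alias_action(index, alias, query):
    """Build an _aliases request body that adds a filtered alias using a query filter."""
    return {
        "actions": [
            {
                "add": {
                    "index": index,
                    "alias": alias,
                    # Older-style query filter: wraps any query so it can act as a filter.
                    "filter": {"query": query},
                }
            }
        ]
    }

# "xyz_brand" is a made-up alias name for illustration.
alias_body = filtered_alias_action("test", "xyz_brand", {"match": {"brand_name": "xyz"}})
print(json.dumps(alias_body, indent=2))
```

The printed body would be posted to the _aliases endpoint; searches against the alias would then only see documents matching the wrapped query.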
