I am using the tire gem to access Elasticsearch from my Rails application. My code generates the following query, which, as far as I understand the docs for sort, is correct:
[2013-08-09 15:06:08,538][DEBUG][action.search.type ] [Nitro] [tweets][3], node[INKC2ryGQ4Sx1qYP4qC-Og], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest#54daad1d]
org.elasticsearch.search.SearchParseException: [tweets][3]: query[ConstantScore(*:*)],from[-1],size[-1]: Parse Failure [Failed to parse source [{
"query": {
"match_all": {
}
},
"sort": [
{
"author": "asc"
}
],
"filter": {
"terms": {
"entities_ids": [
"10"
]
}
},
"size": 10,
"from": 0
}]]
The method generating this looks like this:
def Tweet.search_tweets(params = {})
Tweet.search(page: params[:page], per_page: params[:per_page]) do
if params[:query_string].present? || params[:sentiment].present? ||
(params[:start_date].present? && params[:end_date].present?)
query do
boolean do
if params[:query_string].present?
must { string params[:query_string], default_operator: "AND" }
end
if params[:sentiment].present?
must { term :sentiment, params[:sentiment]}
end
if params[:start_date].present? && params[:end_date].present?
must { string "created_at:[#{params[:start_date]} TO #{params[:end_date]}]"}
end
end
end
else
query do
all
end
end
if params[:entity_id].present?
filter :terms, entities_ids: [params[:entity_id]]
end
if params[:sort].present?
sort { by params[:sort][:by], params[:sort][:order] }
end
end
end
I have absolutely no idea why this doesn't work.
I had similar issues with sorting. It depends on how you have indexed that field: if it's a tokenised (analysed) string, sorting on it may not work as expected. Instead, you could use a multi-field with an unanalysed sub-field and sort on that, or you could use a _script sort based on the _source:
{
"query": {
"query_string": {
"query": "something or other"
}
},
"sort": {
"_script": {
"script": "_source.contentLegend",
"type": "string",
"order": "desc"
}
}
}
Hope that helps a bit!
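A minimal sketch of the multi-field approach, written as Python dictionaries. It assumes a recent Elasticsearch version, where an unanalysed sub-field has type `keyword`; in the 2013-era versions the question is about, the equivalent would be a string field with `"index": "not_analyzed"`. The sub-field name `raw` and both helper names are hypothetical.

```python
import json


def author_mapping():
    """Mapping fragment: analysed `author` plus an unanalysed `author.raw` sub-field.

    Assumption: modern field types (`text` / `keyword`); `raw` is a made-up name.
    """
    return {
        "properties": {
            "author": {
                "type": "text",
                "fields": {"raw": {"type": "keyword"}},
            }
        }
    }


def sort_by_raw_author(order="asc", size=10, offset=0):
    """Search body that sorts on the unanalysed sub-field, not the analysed one."""
    return {
        "query": {"match_all": {}},
        "sort": [{"author.raw": {"order": order}}],
        "size": size,
        "from": offset,
    }


print(json.dumps(author_mapping(), indent=2))
print(json.dumps(sort_by_raw_author(), indent=2))
```

Sorting on `author.raw` compares whole values, whereas sorting on the analysed `author` field would compare individual tokens.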
I would like to construct an elasticsearch query in which I can search for a term and on-the-fly compute a new field for each found document, which is calculated based on some existing fields as well as the query term. Is this possible?
For example, let's say that in my Elasticsearch query I am searching for documents which have the keyword "amsterdam" in the "text" field.
"filter": [
{
"match_phrase": {
"text": {
"query": "amsterdam"
}
}
}]
Now I would also like to have a script field in my query, which computes some value based on other fields as well as the query.
So far, though, I have only found how to access the other fields of a document, using doc['someOtherField'], for example:
"script_fields" : {
"new_field" : {
"script" : {
"lang": "painless",
"source": "if (doc['citizens'].value > 10000) { return 'large'; } return 'small';"
}
}
}
How can I integrate the query term, e.g. if I wanted to add to the if statement "if the query term starts with a-e"?
You're on the right track but script_fields are primarily used to post-process your documents' attributes — they won't help you filter any docs because they're run after the query phase.
With that being said, you can use scripts to filter your documents through script queries. Before you do that, though, you should explore alternatives.
In other words, scripts should be used when all other mechanisms and techniques have been exhausted.
Back to your example. I see three possibilities off the top of my head.
Match phrase prefix queries as a group of bool-should subqueries:
POST your-index/_search
{
"query": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"match_phrase_prefix": {
"text_field": "a"
}
},
{
"match_phrase_prefix": {
"text_field": "b"
}
},
{
"match_phrase_prefix": {
"text_field": "c"
}
},
... till the letter "e"
]
}
}
]
}
}
}
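Writing out one clause per letter by hand gets tedious, so here is a small sketch that generates the same bool-should group in Python. The helper names are hypothetical and `text_field` is just the placeholder used above.

```python
import json
import string


def prefix_should_clauses(field, first="a", last="e"):
    """Return one match_phrase_prefix clause per letter from `first` to `last`."""
    letters = string.ascii_lowercase
    start, stop = letters.index(first), letters.index(last)
    return [
        {"match_phrase_prefix": {field: letter}}
        for letter in letters[start:stop + 1]
    ]


def prefix_range_query(field, first="a", last="e"):
    """Wrap the clauses in the same bool/must/bool/should structure as above."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"bool": {"should": prefix_should_clauses(field, first, last)}}
                ]
            }
        }
    }


print(json.dumps(prefix_range_query("text_field"), indent=2))
```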
A regexp query:
POST your-index/_search
{
"query": {
"bool": {
"must": [
{
"regexp": {
"text_field": "[a-e].+"
}
}
]
}
}
}
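One thing worth keeping in mind is that a regexp query has to match an entire indexed term, not just part of it. The sketch below previews which terms the pattern accepts by using Python's `re.fullmatch`. Python's regex dialect is not the same as Lucene's, but for a pattern this simple the two should agree. The sample terms are made up.

```python
import re

# Same pattern as in the regexp query; fullmatch mimics whole-term matching.
PATTERN = re.compile(r"[a-e].+")


def matches_term(term):
    """True if the whole term matches the pattern."""
    return PATTERN.fullmatch(term) is not None


for term in ["amsterdam", "berlin", "zurich", "a"]:
    print(term, matches_term(term))
```

Because of the `.+`, a one-character term such as "a" is rejected; `[a-e].*` would accept it.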
Script queries using .charAt comparisons:
POST your-index/_search
{
"query": {
"bool": {
"must": [
{
"script": {
"script": {
"source": """
char c = doc['text_field.keyword'].value.charAt(0);
return c >= params.gte.charAt(0) && c <= params.lte.charAt(0);
""",
"params": {
"gte": "a",
"lte": "e"
}
}
}
}
]
}
}
}
If you're relatively new to ES and would love to see real-world examples, check out my recently released Elasticsearch Handbook. One chapter is dedicated to scripting and as it turns out, you can achieve a lot with scripts (if of course executed properly).
I store objects like this in Elasticsearch:
{
"userName": "Cool User",
"orders":[
{
"orderType": "type1",
"amount": 500
},
{
"orderType": "type2",
"amount": 1000
}
]
}
Everything works fine while I'm searching by the 'orders.orderType' or 'orders.amount' fields separately.
But what query do I have to use to get objects which have 'orders.amount >= 500' and 'orders.orderType=type2'?
I've tried a query like this:
{
"query": {
"bool": {
"must": [
{
"range": {
"orders.amount": {
"from": "499"
}
}
},
{
"query_string": {
"query": "type2",
"fields": [
"orders.orderType"
]
}
}
]
}
}
}
...but this request returns records that have 'orders.orderType=type2' OR 'orders.amount >= 500'.
Please help me construct a query that finds documents in which a single object inside the orders array has both amount >= 500 AND orderType=type2.
Finally, I found a blog post that describes exactly my case.
https://www.bmc.com/blogs/elasticsearch-nested-searches-embedded-documents/
Thanks for help.
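For readers who don't follow the link: the approach it describes is a nested query. Here is a minimal sketch as a Python dictionary, assuming `orders` is mapped with `"type": "nested"`. With the default object mapping the array elements are flattened, which is why the per-element AND cannot be expressed. The helper name is hypothetical.

```python
import json


def orders_nested_query(order_type, min_amount):
    """Match documents where ONE element of `orders` satisfies both conditions.

    Assumption: the index mapping declares `orders` as a nested field.
    """
    return {
        "query": {
            "nested": {
                "path": "orders",
                "query": {
                    "bool": {
                        "must": [
                            {"match": {"orders.orderType": order_type}},
                            {"range": {"orders.amount": {"gte": min_amount}}},
                        ]
                    }
                },
            }
        }
    }


print(json.dumps(orders_nested_query("type2", 500), indent=2))
```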
I'm using Elasticsearch with the Python library, and I have a problem with the search query when the object becomes a little more complex. The objects in my index are built like this:
{
"id" : 120,
"name": "bob",
"shared_status": {
"post_id": 123456789,
"text": "This is a sample",
"urls" : [
{
"url": "http://test.1.com",
"displayed_url": "test.1.com"
},
{
"url": "http://blabla.com",
"displayed_url": "blabla.com"
}
]
}
}
Now I want to write a query that returns this document only if one of the displayed URLs contains the substring "test" and the field "text" exists in the main document. So I wrote this query:
{
"query": {
"bool": {
"must": [
{"exists": {"field": "text"}}
]
}
}
}
But I don't know what to add for the part "one of the displayed URLs contains the substring 'test'".
Is that possible? How does iteration over the list work?
If you didn't define an explicit mapping for your schema, Elasticsearch creates a default mapping based on the input data:
urls will be of type object
displayed_url will be of type string and will use the standard analyzer
As you don't need any association between url and displayed_url, the current schema will work fine.
You can use a match query for full-text matching (if the fields sit under shared_status, as in your sample document, prefix the field names accordingly, e.g. shared_status.urls.displayed_url):
GET _search
{
"query": {
"bool": {
"must": [
{
"exists": {
"field": "text"
}
},
{
"match": {
"urls.displayed_url": "test"
}
}
]
}
}
}
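Since the question uses the Python library, the same query can also be built as a dictionary. This sketch assumes the fields live under `shared_status`, as in the sample document, so it uses the full dotted paths. The function name and the index name in the usage note are hypothetical.

```python
import json


def displayed_url_query(term):
    """Require that `text` exists and that some displayed_url contains the token `term`.

    Assumption: both fields are nested under `shared_status`.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    {"exists": {"field": "shared_status.text"}},
                    {"match": {"shared_status.urls.displayed_url": term}},
                ]
            }
        }
    }


print(json.dumps(displayed_url_query("test"), indent=2))
```

The dictionary could then be handed to the client, for example `es.search(index="my-index", body=displayed_url_query("test"))`; newer client versions prefer passing the inner `query` part as a keyword argument. Note that `match` finds whole tokens produced by the analyzer, not arbitrary substrings.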
Per our requirements, we need to find the max ID among the existing documents before adding a new one. The problem is that the field may also contain string data, so I had to use an inline script in the Elasticsearch query to compute the max ID only over documents that hold integer data, returning 0 otherwise. I am using the following inline script query to find the max key, but it is not working. Can you help me with this?
{
"size": 0,
"query": {
"bool": {
"filter": [
{
"term": {
"Name": {
"value": "Test2"
}
}
}
]
}
},
"aggs": {
"MaxId": {
"max": {
"field": "Key",
"script": {
"inline": "((doc['Key'].value).isNumber()) ? Integer.parseInt(doc['Key'].value) : 0"
}
}
}
}
}
The error occurs because the max aggregation only supports numeric fields, so you cannot specify a string field such as Key in it.
Simply remove the "field": "Key" part and keep only the script part:
{
"size": 0,
"query": {
"bool": {
"filter": [
{
"term": {
"Name": "Test2"
}
}
]
}
},
"aggs": {
"MaxId": {
"max": {
"script": {
"source": "((doc['Key'].value).isNumber()) ? Integer.parseInt(doc['Key'].value) : 0"
}
}
}
}
}
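To make the intent of the script concrete, here is a plain-Python sketch of the same rule applied client-side: keys made up only of digits are parsed, anything else counts as 0, and the maximum is taken. The sample keys are invented for illustration and the function names are hypothetical.

```python
def key_as_int(key):
    """Parse a key made up only of decimal digits; anything else counts as 0."""
    return int(key) if key.isdecimal() else 0


def max_key(keys):
    """Largest numeric key, or 0 when there are no numeric keys at all."""
    return max((key_as_int(k) for k in keys), default=0)


sample_keys = ["7", "abc", "42", "x9"]  # made-up data
print(max_key(sample_keys))  # 42
```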
I want to sort by name, but keep entries with null or missing price data at the end of the results.
I have tried:
sort: {
sorting.name: {
order: "asc"
},
prices.minprice.price: {
missing: "_last"
}
}
but this only sorts by name.
Actually, you could use the missing option of the sort:
https://www.elastic.co/guide/en/elasticsearch/reference/current/sort-search-results.html#_missing_values
To do this, we can use a custom function score (which, according to the Elasticsearch documentation, is faster than custom script-based sorting). When the price is missing, the numeric value is returned as 0.
We then sort on score first to split into a section with missing price data, and a section without missing price data, and then sort each of these two sections based on name.
{
"query": {
"function_score": {
"boost_mode": "replace",
"query": {"match_all": {}},
"script_score": {
"script": "doc['price'].value == 0 ? 0 : 1"
}
}
},
"sort": [
"_score",
"name"
]
}
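The two-level ordering can be illustrated in plain Python: first split on whether a price is present, then order by name within each group. The documents are invented for the example. Here a missing price is an absent or `None` value, whereas the script above sees it as 0.

```python
def sort_missing_price_last(docs):
    """Priced documents first (by name), then documents without a price (by name)."""
    # False sorts before True, so documents that have a price come first.
    return sorted(docs, key=lambda d: (d.get("price") is None, d["name"]))


docs = [  # made-up sample data
    {"name": "cherry", "price": 3},
    {"name": "apple"},
    {"name": "banana", "price": 1},
    {"name": "date", "price": None},
]
print([d["name"] for d in sort_missing_price_last(docs)])
# ['banana', 'cherry', 'apple', 'date']
```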
This has worked for me:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-sort.html#_script_based_sorting
"sort":[
{
"_script":{
"type":"number",
"script":{
"lang":"painless",
"source":"return params._source.store[0].sku_available_unit == 0 ? 0 : 1;"
},
"order":"desc"
}
}
]