Elasticsearch - boosting fields for multi match without specifying complete field list in query - elasticsearch

I am trying to boost fields using multi match query without specifying complete field list but I cannot find out how to do it. I am searching through multiple indices on all fields, which I don't know at the run time, but I know which are the important ones.
For example I have index A with the fields 1,2,3,4 and index B with fields 1,5,6,7,8. I need to search across both indexes through all fields with the boosting on field 1.
So far I got
GET A,B/_search
{
"query": {
"multi_match" : {
"query" : "somethingToSearch"
}
}
}
Which goes through all fields on both indices, but I would like to have something like this (boosting match on field 1 before the others)
GET A,B/_search
{
"query": {
"multi_match" : {
"query" : "somethingToSearch",
"fields" : ["1^5,*"]
}
}
}
Is there any way how to do it without using bool queries?

Related

Elasticsearch bool query with filter doesn't work on field with "-"

I have multiple document in my index per customer:
{"customer":"m-test-service", "customer_id":"x55sg" "book":"f","date"....}
{"customer":"m-test-service", "customer_id":"x55sg" "book":"g","date"....}
{"customer":"x12", "customer_id":"dhb5" "book":"e","date"....}
{"customer":"x12", "customer_id":"dhb5" "book":"d","date"....}
I want to retrieve all the documents that has both customer and customer_id set to some two specific values like the following SQL query : SELECT * FROM TABLE WHERE CUSTOMER='XXX' AND CUSTOMER_ID="YYYY".
I have been using the following query :
GET my_index/_search
{
"query":{
"bool":{
"filter" : [
{"term" : {"customer" : "x12"} },
{"term" : {"customer_id" : "dhb5"}}
]
}
}
}
However, when I try to query a customer that has "-" in its string I don't get any documents. I tried to run the previous query mentioned with the following values:
"customer":"m-test-service", "customer_id":"x55sg"
and I didn't get results.
When I remove the '-' char from the customer field I got results including the documents of m-test-service.
Why is this char problematic?
Term query returns documents that contain an exact term in a provided field.
If you want to search for "customer":"m-test-service", then try using customer.keyword field. This uses the keyword analyzer instead of the standard analyzer
{
"query":{
"bool":{
"filter" : [
{"term" : {"customer.keyword" : "m-test-service"} },
{"term" : {"customer_id" : "x55sg"}}
]
}
}
}
Looks like your customer field in your mapping is text which is using the standard analyzer while breaks the text, you should use .keyword field if its dynamically generated otherwise using multi-field you should add it and query it.

Is it possible to check that specific data matches the query without loading it to the index?

Imagine that I have a specific data string and a specific query. The simple way to check that the query matches the data is to load the data into the Elastic index and run the online query. But can I do it without putting it into the index?
Maybe there are some open-source libraries that implement the Elastic search functionality offline, so I can call something like getScore(data, query)? Or it's possible to implement by using specific API endpoints?
Thanks in advance!
What you can do is to leverage the percolator type.
What this allows you to do is to store the query instead of the document and then test whether a document would match the stored query.
For instance, you first create an index with a field of type percolator that will contain your query (you also need to add in the mapping any field used by the query so ES knows what their types are):
PUT my_index
{
"mappings": {
"properties": {
"query": {
"type": "percolator"
},
"message": {
"type": "text"
}
}
}
}
Then you can index a real query, like this:
PUT my_index/_doc/match_value
{
"query" : {
"match" : {
"message" : "bonsai tree"
}
}
}
Finally, you can check using the percolate query if the query you've just stored would match
GET /my_index/_search
{
"query" : {
"percolate" : {
"field" : "query",
"document" : {
"message" : "A new bonsai tree in the office"
}
}
}
}
So all you need to do is to only store the query (not the documents), and then you can use the percolate query to check if the documents would have been selected by the query you stored, without having to store the documents themselves.

Make a prefix query on whole filed in elastic search

Hi I am having a field called text_field in which i have two document
1.lubricant
2.air lube
I have used Edge-N gram analyzer with term query but in result when i serch with lub
Terms query over filed analyzed with edge n-gram analyzer
{
"terms" : {
"text_field" : [ "lub" ]
}
}
prefix query over filed analyzed with keyword tokenizer:
{
"prefix" : {
"text_field" : {
"prefix" : "lub"
}
}
}
In both these queries m getting two results in result set
"lubricant",
"air lube"
I don't want air lube to be in result as it starts with word air,is there any way to make a search prefix query on whole field,looks like here it's checking terms,is there any way to sort this out.

Elastic Search boost query corresponding to first search term

I am using PyElasticsearch (elasticsearch python client library). I am searching strings like Arvind Kejriwal India Today Economic Times and that gives me reasonable results. I was hoping I could increase weight of the first words more in the search query. How can I do that?
res = es.search(index="article-index", fields="url", body={
"query": {
"query_string": {
"query": "keywordstr",
"fields": [
"text",
"title",
"tags",
"domain"
]
}
}
})
I am using the above command to search right now.
split given query into multiple terms. In your example it will be Arvind, Kejriwal... Now form query string queries(or field query or any other which fits into the need) for each of the given terms. A query string query will look like this
http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/query-dsl-query-string-query.html
{
"query_string" : {
"default_field" : "content",
"query" : "<one of the given term>",
"boost": <any number>
}
}
Now you have got multiple queries like above with different boost values(depending upon which have higher weight). Combine all of those queries into one query using BOOL query. http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
If you want all of the terms to be present in the result, query will be like this.
{
"bool" : {
"must" : [q1, q2, q3 ...]
}
}
you can use different options of bool query. for example you want any of 3 terms to present in result then query will be like
{
"bool" : {
"should" : [q1, q2,q3 ...]
},
"minimum_should_match" : 3,
}
theoretically:
split into terms using api
query against terms with different boosting
Lucene Query Syntax does the trick. Thanks
http://lucene.apache.org/core/2_9_4/queryparsersyntax.html#Boosting%20a%20Term

Full-text schema in ElasticSearch

I'm (extremely) new to ElasticSearch so forgive my potentially ridiculous question. I currently use MySQL to perform full-text searches, and want to move this to ElasticSearch. Currently my table has a fulltext index spanning three columns:
title,description,tags
In ES, each document would therefore have title, description and tags fields, allowing me to do a fulltext search for a general phrase, or filter on a given tag.
I also want to add further searchable fields such as username (so I can retrieve posts by a given user). So, how do I specify that a fulltext search should match title OR description OR tags but not username?
From the OR filter example, I'd assume I'd have to use something like this:
{
"filtered" : {
"query" : {
"match_all" : {}
},
"filter" : {
"or" : [
{
"term" : { "title" : "foobar" }
},
{
"term" : { "description" : "foobar" }
},
{
"term" : { "tags" : "foobar" }
}
]
}
}
}
Coming at this new, it doesn't seem like this is very efficient. Is there a better way of doing this, or do I need to move the username field to a separate index?
This is fine.
I general I would suggest getting familiar with ElasticSearch mapping types and options.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping.html

Resources