Elasticsearch bool query with filter doesn't work on field with "-"

I have multiple documents per customer in my index:
{"customer":"m-test-service", "customer_id":"x55sg", "book":"f","date"....}
{"customer":"m-test-service", "customer_id":"x55sg", "book":"g","date"....}
{"customer":"x12", "customer_id":"dhb5", "book":"e","date"....}
{"customer":"x12", "customer_id":"dhb5", "book":"d","date"....}
I want to retrieve all documents that have both customer and customer_id set to two specific values, like the following SQL query: SELECT * FROM TABLE WHERE CUSTOMER='XXX' AND CUSTOMER_ID='YYYY'.
I have been using the following query:
GET my_index/_search
{
  "query": {
    "bool": {
      "filter": [
        {"term": {"customer": "x12"}},
        {"term": {"customer_id": "dhb5"}}
      ]
    }
  }
}
However, when I query a customer whose name contains "-", I don't get any documents. I ran the same query with the following values:
"customer":"m-test-service", "customer_id":"x55sg"
and got no results.
When I removed the '-' character from the customer field, I got results, including the documents of m-test-service.
Why is this character problematic?

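A term query returns documents that contain an exact term in the given field; the search value itself is not analyzed.
To search for "customer":"m-test-service", try the customer.keyword field (assuming it exists, as it does under default dynamic mapping). That subfield stores the whole value as a single unanalyzed term, whereas the text field is split into several terms by the standard analyzer: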
{
  "query": {
    "bool": {
      "filter": [
        {"term": {"customer.keyword": "m-test-service"}},
        {"term": {"customer_id": "x55sg"}}
      ]
    }
  }
}

It looks like the customer field in your mapping is of type text, which uses the standard analyzer and breaks the text into separate tokens. If the mapping was generated dynamically, query the .keyword subfield; otherwise, add a keyword subfield using multi-fields and query that.
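To check how the value is tokenized, the _analyze API can be run with the standard analyzer. This is a minimal sketch; it assumes the customer field uses the default standard analyzer, which the question does not show.

```
GET /_analyze
{
  "analyzer": "standard",
  "text": "m-test-service"
}
```

With the standard analyzer this should return the tokens m, test and service. None of them equals the term m-test-service, which is why the term filter on the text field finds nothing.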

Related

How to query strings case-insensitively in Elasticsearch

I'm searching on two fields: one field must match exactly, and the other is matched using a query string.
I have this data:
{
  "NUMBER": "5587120",
  "SID": "121213-13131-_X",
  "ADDRESS": "purwakarta"
}
I have tried using a query string like this:
GET test/_doc/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"NUMBER": "5587120"}}
      ],
      "filter": {
        "query_string": {
          "default_field": "SID.keyword",
          "query": "*X*"
        }
      }
    }
  }
}
When I enter the text with the same case as the stored value, the data I want appears, but when I write it in lowercase, the data doesn't appear.
It's not clear from your question which field you want the case-insensitive search on; based on the context, I am assuming it is the SID.keyword field.
Why your solution is not working: keyword fields are not analyzed and are indexed in Elasticsearch as-is. For your field SID.keyword, the value 121213-13131-_X is stored unchanged, as a single token that is exactly the same as the provided value.
You are using query_string on the field SID.keyword, so the query string uses the analyzer configured for that field. That is the keyword analyzer, which is a no-op analyzer and therefore doesn't lowercase the *X* provided in the query.
Solution: if you want a case-insensitive search, then instead of using the SID.keyword field, create a custom analyzer that uses the keyword tokenizer followed by the lowercase token filter, so that 121213-13131-_X is indexed as 121213-13131-_x (note the lowercase x). Your query string will then use the same analyzer and match the document, since Elasticsearch ultimately works by matching tokens.
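The custom analyzer described above could be sketched as follows. The index name test_lowercase and the analyzer name lowercase_keyword are hypothetical, and the typeless mapping syntax assumes Elasticsearch 7.x or later; treat this as one possible setup rather than the exact configuration the answer had in mind.

```
PUT /test_lowercase
{
  "settings": {
    "analysis": {
      "analyzer": {
        "lowercase_keyword": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "SID": {
        "type": "text",
        "analyzer": "lowercase_keyword"
      }
    }
  }
}
```

With a mapping like this, the query_string filter would target SID rather than SID.keyword. Documents indexed before the mapping change need to be reindexed for the new analyzer to apply to them.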

Elasticsearch - boosting fields for multi match without specifying complete field list in query

I am trying to boost fields using a multi_match query without specifying the complete field list, but I cannot find out how to do it. I am searching through multiple indices on all fields, which I don't know at run time, but I know which are the important ones.
For example, I have index A with fields 1, 2, 3, 4 and index B with fields 1, 5, 6, 7, 8. I need to search across both indices through all fields, with a boost on field 1.
So far I have:
GET A,B/_search
{
  "query": {
    "multi_match": {
      "query": "somethingToSearch"
    }
  }
}
This goes through all fields on both indices, but I would like to have something like this (boosting a match on field 1 over the others):
GET A,B/_search
{
  "query": {
    "multi_match": {
      "query": "somethingToSearch",
      "fields": ["1^5,*"]
    }
  }
}
Is there a way to do this without using bool queries?

Make a prefix query on whole field in Elasticsearch

Hi, I have a field called text_field, in which I have two documents:
1. lubricant
2. air lube
I have used an edge n-gram analyzer with a terms query, searching with lub.
Terms query over the field analyzed with an edge n-gram analyzer:
{
  "terms": {
    "text_field": ["lub"]
  }
}
Prefix query over the field analyzed with the keyword tokenizer:
{
  "prefix": {
    "text_field": {
      "prefix": "lub"
    }
  }
}
With both of these queries I am getting two results in the result set:
"lubricant",
"air lube"
I don't want air lube in the results, as it starts with the word air. Is there any way to run a prefix query against the whole field? It looks like the query is checking individual terms. Is there any way to sort this out?

ElasticSearch filter on exact url

Let's say I create this document in my index:
PUT /nursery/rhyme/1
{
  "url": "http://example.com/mary",
  "text": "Mary had a little lamb"
}
Why does this query not return anything?
POST /nursery/rhyme/_search
{
  "query": {
    "match_all": {}
  },
  "filter": {
    "term": {
      "url": "http://example.com/mary"
    }
  }
}
The term query finds documents that contain the exact term specified in the inverted index. When you save the document, the url property is analyzed, which results in the following terms (with the default analyzer): [http, example, com, mary].
So what you currently have in your inverted index is that bunch of terms, none of which is http://example.com/mary.
What you want is either to not analyze the url property, or to use a match query, which will split the query into terms just as is done when indexing.
Exact match does not work on an analyzed field. A string is analyzed by default, which means the string http://example.com/mary will be split and stored in the inverted index as http, example, com, mary. That's why your query returns nothing.
You can make your field not analyzed:
{
  "url": {
    "type": "string",
    "index": "not_analyzed"
  }
}
but for this you will have to reindex your data.
Read up on not_analyzed and the term query in the Elasticsearch documentation.
Hope this helps.
In Elasticsearch 7.x you have to use the keyword type in the mapping properties, which is not analyzed: https://www.elastic.co/guide/en/elasticsearch/reference/current/keyword.html
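For newer versions, a keyword mapping for the url field might look like the sketch below. It assumes Elasticsearch 7.x or later (typeless API, so the rhyme type from the question is dropped) and that the nursery index is created fresh with this mapping before any documents are indexed.

```
PUT /nursery
{
  "mappings": {
    "properties": {
      "url": {"type": "keyword"},
      "text": {"type": "text"}
    }
  }
}

POST /nursery/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": {"url": "http://example.com/mary"}
      }
    }
  }
}
```

Because a keyword field indexes the whole string as one term, the term filter can match http://example.com/mary exactly. The filter sits inside a bool query here rather than at the top level of the request as in the original question.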

Elasticsearch bulk or search

Background
I am working on an API that allows the user to pass in a list of details about a member (name, email addresses, ...). I want to use this information to match up with account records in my Elasticsearch database and return a list of potential matches.
I thought this would be as simple as doing a bool query on the fields I want; however, I seem to be getting no hits.
I'm relatively new to Elasticsearch; my current _search request looks like this.
Example Query
POST /member/account/_search
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "should": [
            {"term": {"email": "jon.smith#gmail.com"}},
            {"term": {"email": "samy#gmail.com"}},
            {"term": {"email": "bo.blog#gmail.com"}}
          ]
        }
      }
    }
  }
}
Question
How should I update this query to return records that match any of the email addresses?
Am I able to prioritise records that match email and another field, for example "family_name"?
Will this be a problem if I need to do this against a few hundred email addresses?
Well, you need to make the change on the index side rather than the query side.
By default, your email ID is broken into tokens while indexing:
jon.smith#gmail.com => [jon, smith, gmail, com]
When you search using a term query, the analyzer is not applied to the search value, and the query looks for an exact match of jon.smith#gmail.com, which, as you can see, won't work.
Even if you use a match query, you will end up matching every document that shares any of those tokens, such as gmail or com.
Hence you need to change the mapping to index the email ID as a single token rather than tokenizing it.
Using not_analyzed would be the best solution here.
When you define the email field as not_analyzed, the following happens while indexing:
jon.smith#gmail.com => [jon.smith#gmail.com]
After changing the mapping and reindexing all your documents, you can freely run the above query.
I would suggest using a terms query, as follows:
{
  "query": {
    "terms": {
      "email": [
        "jon.smith#gmail.com",
        "samy#gmail.com",
        "bo.blog#gmail.com"
      ]
    }
  }
}
To answer the second part of your question: you are looking for boosting, and I would recommend going through the function score query.
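As a simpler alternative to function score (not what this answer recommends, only a sketch), a bool query can require an email match and use a should clause so that records also matching another field score higher. The family_name value smith is a made-up example, and the sketch assumes email is indexed as a single token as described above.

```
POST /member/account/_search
{
  "query": {
    "bool": {
      "must": {
        "terms": {
          "email": [
            "jon.smith#gmail.com",
            "samy#gmail.com",
            "bo.blog#gmail.com"
          ]
        }
      },
      "should": {
        "match": {"family_name": "smith"}
      }
    }
  }
}
```

Because the bool query has a must clause, the should clause is optional: documents that match only the email are still returned, while those that also match family_name are ranked above them.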
