Elasticsearch: find an exact phrase in a field

I want to search for an exact phrase in a single field. This is my approach:
"query": {
"match_phrase": {
"word": "developer"
}
}
The problem is that this query finds every document whose field contains the keyword developer,
for example both "word": "developer" and "word": "php developer".
How can I write a query that returns only the "word": "developer" document when I search for developer,
and only the "word": "php developer" document when I search for php developer?
Thanks

The simplest approach: if your word field is of type keyword, you can use a Term Query as shown below:
POST <your_index_name>/_search
{
"query": {
"term" : { "word" : "developer" }
}
}
If word is mapped only as text, I'd suggest adding a keyword sub-field through multi-fields. That way you can use word for full-text matches and word.keyword for exact matches.
PUT <your_index_name>
{
"mappings": {
"_doc": {
"properties": {
"word": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
}
}
}
}
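To see why match_phrase returns both documents while a term query on a keyword field returns one, here is a rough Python model. It is only a sketch: analyze is a simplified stand-in for the standard analyzer (lowercase, split on whitespace), and the function names are made up for illustration, not Elasticsearch APIs.

```python
def analyze(text):
    """Simplified stand-in for the standard analyzer: lowercase, split on whitespace."""
    return text.lower().split()

def match_phrase(field_value, query):
    """True when the query tokens appear as a contiguous run in the field tokens."""
    tokens, wanted = analyze(field_value), analyze(query)
    if not wanted:
        return False
    return any(tokens[i:i + len(wanted)] == wanted
               for i in range(len(tokens) - len(wanted) + 1))

def term_on_keyword(field_value, query):
    """A keyword field keeps the whole value as one term, so only equality matches."""
    return field_value == query

docs = ["developer", "php developer"]
phrase_hits = [d for d in docs if match_phrase(d, "developer")]
exact_hits = [d for d in docs if term_on_keyword(d, "developer")]
print(phrase_hits)  # ['developer', 'php developer']
print(exact_hits)   # ['developer']
```

The phrase only has to occur somewhere inside the token stream, which is why "php developer" also matches; the keyword comparison has no token stream to search in.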

Related

How to turn off autocomplete for Elasticsearch match_phrase or match_phrase_prefix?

I have ES data that contains a field name of type text. I have to search with lowercase input, while the actual name may contain both lowercase and uppercase characters. I need only exact (but case-insensitive) name matches.
I tried match_phrase (as well as match_phrase_prefix), but it returns results as if it were autocompleting. For example, the query
"match_phrase": {
"name": {
"query": "apple iphone 11"
}
}
returns two items:
{
"id": "547",
"name": "Apple iPhone 11"
}
and
{
"id": "253",
"name": "Apple iPhone 11 Pro"
}
I need only the one with id: 547, i.e. the one with no extra characters in the name.
Does Elasticsearch have tools to find the exact name, but in a case-insensitive form and without autocomplete?
Does Elasticsearch have tools to find the exact name?
Yes, Elasticsearch provides the keyword type for exact matching.
in a case insensitive form and without autocomplete?
You can use a normalizer with a lowercase filter.
Add the normalizer to the index settings:
PUT /so_index/
{
"settings":{
"analysis":{
"normalizer":{
"name_normalizer":{
"type":"custom",
"filter":[
"lowercase"
]
}
}
}
}
}
Mapping (either map name as a keyword only, for exact matches, or keep both text and keyword for full-text and exact search):
PUT /so_index/_mapping
{
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"normalizer": "name_normalizer"
}
}
}
}
}
Use a match or term query on name.keyword; the normalizer is applied to the query input as well as to the indexed value:
GET /so_index/_search
{
"query": {
"match": {
"name.keyword": "apple iphone 11"
}
}
}
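As a rough illustration of what the normalized keyword sub-field does, the Python sketch below lowercases both the stored value and the query and then compares them whole. The helper names are hypothetical and only model the behaviour; they are not part of any Elasticsearch client.

```python
def name_normalizer(value):
    """Stand-in for a custom normalizer that has only the lowercase filter."""
    return value.lower()

def keyword_match(stored_value, query):
    # Both sides pass through the normalizer; the value is never split into tokens.
    return name_normalizer(stored_value) == name_normalizer(query)

names = {"547": "Apple iPhone 11", "253": "Apple iPhone 11 Pro"}
hits = [doc_id for doc_id, name in names.items()
        if keyword_match(name, "apple iphone 11")]
print(hits)  # ['547']
```

Because the whole value is compared, "Apple iPhone 11 Pro" no longer matches, while the difference in case is ignored.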
I achieved my needs via a simple script:
"filter": [
{
"script": {
"script": {
"source": "doc[params.nameField].value != null && doc[params.nameField].value.equalsIgnoreCase(params.name)",
"lang": "painless",
"params": {
"name": "apple iphone 11",
"nameField": "name.exact"
}
},
"boost": 1.0
}
}
]

Elasticsearch: Look for a term at the end of text

I'm indexing the following documents in Elasticsearch 5.x:
{
"id": 1,
"description": "The quick brown fox"
}
{
"id": 2,
"description": "The fox runs fast"
}
If I search for the word "fox" in these documents, I will get both of them.
How can I find documents whose description field ends with the word "fox"? In the above example, I am looking for the one with id=1.
If possible, I prefer to do this with a Query String.
A regexp query should work. Note that if you need ends-with lookups often, reindexing with a suitable analyzer is the better option (https://discuss.elastic.co/t/elasticsearch-ends-with-word-in-phrases/60473)
"regexp":{
"description.keyword": {
"value": ".*fox"
}
}
Make sure your index mapping includes a keyword field for the description. For example:
PUT my_index
{
"mappings": {
"_doc": {
"properties": {
"description": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"id": {
"type": "long"
}
}
}
}
}
Add the documents:
POST my_index/_doc
{
"id": 1,
"description": "The quick brown fox"
}
POST my_index/_doc
{
"id": 2,
"description": "The fox runs fast"
}
And use a query string like this:
GET /my_index/_search
{
"query": {
"query_string" : {
"default_field" : "description.keyword",
"query" : "*fox"
}
}
}
Elasticsearch uses Lucene's regular expression engine, which does not support every regular expression feature. For your particular problem, however, you can use a wildcard query or a regexp query like so:
GET /products/_search
{
"query": {
"wildcard": {
"description.keyword": {
"value": "*able"
}
}
}
}
GET /products/_search
{
"query": {
"regexp":{
"description.keyword": {
"value": ".*fox",
"flags": "ALL",
"case_insensitive": true
}
}
}
}
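The ends-with behaviour of both queries can be approximated locally. This is only a sketch under assumptions: Python's fnmatchcase and re.fullmatch stand in for the wildcard and regexp queries on a keyword field (Lucene's syntax differs in details, for example fnmatch also treats [...] specially), and fullmatch is used because Lucene patterns are anchored to the whole term.

```python
import re
from fnmatch import fnmatchcase

# Whole field values, as a keyword sub-field would store them.
descriptions = {1: "The quick brown fox", 2: "The fox runs fast"}

def wildcard_hits(pattern):
    # * matches any run of characters, as in a wildcard query.
    return [i for i, d in descriptions.items() if fnmatchcase(d, pattern)]

def regexp_hits(pattern):
    # fullmatch models the anchoring: the pattern must cover the entire value.
    return [i for i, d in descriptions.items() if re.fullmatch(pattern, d)]

print(wildcard_hits("*fox"))  # [1]
print(regexp_hits(".*fox"))   # [1]
print(regexp_hits("fox"))     # []
```

The last call shows why a leading .* is needed: without it the pattern would have to equal the whole value.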

Difference between keyword and text in ElasticSearch

Can someone explain the difference between keyword and text in ElasticSearch with an example?
keyword type:
If you define a field to be of type keyword like this:
PUT products
{
"mappings": {
"_doc": {
"properties": {
"name": {
"type": "keyword"
}
}
}
}
}
then a search on this field must supply the whole value, because the value is indexed as a single term (hence the name keyword). For example, index this document:
POST products/_doc
{
"name": "washing machine"
}
When you execute a search like this:
GET products/_search
{
"query": {
"match": {
"name": "washing"
}
}
}
it will not match any docs. You have to search with the whole value "washing machine".
The text type, on the other hand, is analyzed, so you can search using individual tokens from the field value, i.e. a full-text search within the value:
PUT products
{
"mappings": {
"_doc": {
"properties": {
"name": {
"type": "text"
}
}
}
}
}
and the search:
GET products/_search
{
"query": {
"match": {
"name": "washing"
}
}
}
will return the matching document.
You can check this for more details: keyword vs. text
The primary difference between the text datatype and the keyword datatype is that text fields are analyzed at the time of indexing, and keyword fields are not.
What that means is, text fields are broken down into their individual terms at indexing to allow for partial matching, while keyword fields are indexed as is.
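A small Python sketch of that difference, assuming a simplified analyzer (lowercase plus whitespace split) and plain dictionaries in place of the real inverted index. The function names are illustrative only.

```python
from collections import defaultdict

def index_text(docs):
    """text field: one index entry per lowercased token."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        for token in value.lower().split():
            index[token].add(doc_id)
    return index

def index_keyword(docs):
    """keyword field: one index entry for the whole, unmodified value."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        index[value].add(doc_id)
    return index

docs = {1: "washing machine"}
text_index = index_text(docs)
keyword_index = index_keyword(docs)

print(sorted(text_index))                   # ['machine', 'washing']
print(sorted(keyword_index))                # ['washing machine']
print(text_index.get("washing", set()))     # {1}
print(keyword_index.get("washing", set()))  # set()
```

Looking up "washing" succeeds only in the text index, because the keyword index contains the single entry "washing machine".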
Keyword Mapping
"channel" : {
"type" : "keyword"
},
"product_image" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
Along with the other advantages of the keyword type in Elasticsearch, one more is that it accepts values of other JSON types as well. Strings, numbers and date strings are all indexed as keyword terms.
PUT /demo-index/
{
"mappings": {
"properties": {
"name": { "type": "keyword" }
}
}
}
POST /demo-index/_doc
{
"name": "2021-02-21"
}
POST /demo-index/_doc
{
"name": 100
}
POST /demo-index/_doc
{
"name": "Jhon"
}

Elasticsearch 5.X Percolate: How to autogenerate copy_to fields?

In ES 2.3.3, many queries in the system I'm working on use the _all field. Some of these are registered to a percolate index, and when a doc is percolated, _all is generated automatically.
In ES 5.X, _all is being deprecated, so we replaced _all with a copy_to field that contains only the components we actually care about. That works great for those searches.
Registering the same query to a percolate index with the same document mapping, including the copy_to fields, works fine. However, sending a percolate query with the document never results in a hit on a copy_to field.
Manually building the copy_to field via simple string concatenation seems to work, but I'd expect Query -> DocIndex and Doc -> PercolateQuery to give the same result. So I'm looking for a way to have ES generate the copy_to fields automatically on a document being percolated.
It turned out there was nothing wrong with ES, of course; I'm posting here in case it helps someone else. I figured it out while building a simpler example to post here. The issue was that percolating a document of a type that doesn't exist in the percolate index returns no error. Instead, ES seems to run all percolate queries without applying any mappings. That was confusing, because it worked for simple test cases but not for complex ones. Here's an example:
From the copy_to docs, generate an index with a copy_to mapping. See that a query to the copy_to field works.
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"first_name": {
"type": "text",
"copy_to": "full_name"
},
"last_name": {
"type": "text",
"copy_to": "full_name"
},
"full_name": {
"type": "text"
}
}
}
}
}
PUT my_index/my_type/1
{
"first_name": "John",
"last_name": "Smith"
}
GET my_index/_search
{
"query": {
"match": {
"full_name": {
"query": "John Smith",
"operator": "and"
}
}
}
}
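For intuition about what copy_to contributes here, a minimal Python model follows. It is a sketch with made-up helper names: apply_copy_to imitates the mapping step that feeds first_name and last_name into full_name, and match_and imitates a match query with operator "and" over whitespace-split, lowercased tokens.

```python
def apply_copy_to(doc, copy_to):
    """Return the searchable values per field after copy_to has been applied."""
    fields = {name: [value] for name, value in doc.items()}
    for source, target in copy_to.items():
        if source in doc:
            fields.setdefault(target, []).append(doc[source])
    return fields

def match_and(values, query):
    """True when every query token occurs among the field's tokens."""
    tokens = {t for v in values for t in v.lower().split()}
    return all(t in tokens for t in query.lower().split())

doc = {"first_name": "John", "last_name": "Smith"}
copy_to = {"first_name": "full_name", "last_name": "full_name"}

indexed = apply_copy_to(doc, copy_to)
unmapped = apply_copy_to(doc, {})  # models a document handled without the mapping

print(indexed["full_name"])                            # ['John', 'Smith']
print(match_and(indexed["full_name"], "John Smith"))   # True
print(match_and(indexed["first_name"], "John Smith"))  # False
print("full_name" in unmapped)                         # False
```

The last line is the situation described above: when no mapping is applied, full_name never exists, so only queries on plain fields such as first_name can hit.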
Create a percolate index with the same type:
PUT /my_percolate_index
{
"mappings": {
"my_type": {
"properties": {
"first_name": {
"type": "text",
"copy_to": "full_name"
},
"last_name": {
"type": "text",
"copy_to": "full_name"
},
"full_name": {
"type": "text"
}
}
},
"queries": {
"properties": {
"query": {
"type": "percolator"
}
}
}
}
}
Create a percolate query that mirrors the search query above on the copy_to field, and a second query that just queries a basic, unmodified field:
PUT /my_percolate_index/queries/1?refresh
{
"query": {
"match": {
"full_name": {
"query": "John Smith",
"operator": "and"
}
}
}
}
PUT /my_percolate_index/queries/2?refresh
{
"query": {
"match": {
"first_name": {
"query": "John"
}
}
}
}
Search, but with the wrong type. There will be a hit on the basic field (first_name: John) even though no document mapping matches the request:
GET /my_percolate_index/_search
{
"query" : {
"percolate" : {
"field" : "query",
"document_type" : "non_type",
"document" : {
"first_name": "John",
"last_name": "Smith"
}
}
}
}
{"took":7,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":1,"max_score":0.2876821,"hits":[{"_index":"my_percolate_index","_type":"queries","_id":"2","_score":0.2876821,"_source":{
"query": {
"match": {
"first_name": {
"query": "John"
}
}
}
}}]}}
Send the correct document_type and see both matches, as expected:
GET /my_percolate_index/_search
{
"query" : {
"percolate" : {
"field" : "query",
"document_type" : "my_type",
"document" : {
"first_name": "John",
"last_name": "Smith"
}
}
}
}
{"took":7,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":2,"max_score":0.51623213,"hits":[{"_index":"my_percolate_index","_type":"queries","_id":"1","_score":0.51623213,"_source":{
"query": {
"match": {
"full_name": {
"query": "John Smith",
"operator": "and"
}
}
}
}},{"_index":"my_percolate_index","_type":"queries","_id":"2","_score":0.2876821,"_source":{
"query": {
"match": {
"first_name": {
"query": "John"
}
}
}
}}]}}

Elasticsearch query_string query with multiple default fields

I would like to avail myself of the feature of a query_string query, but I need the query to search by default across a subset of fields (not all, but also not just one). When I try to pass many default fields, the query fails. Any suggestions?
I am not specifying a field in the query itself, so I want to search three fields by default:
{
"query": {
"query_string" : {
"query" : "some search using advanced operators OR dog",
"default_field": ["Title", "Description", "DesiredOutcomeDescription"]
}
}
}
If you want to create a query on 3 specific fields as above, just use the fields parameter.
{
"query": {
"query_string" : {
"query" : "some search using advanced operators OR dog",
"fields": ["Title", "Description", "DesiredOutcomeDescription"]
}
}
}
Alternatively, if you want to search by default on those 3 fields without specifying them, you will have to use the copy_to parameter when you set up the mapping. Then set the default field to be the concatenated field.
PUT my_index
{
"settings": {
"index.query.default_field": "full_name"
},
"mappings": {
"my_type": {
"properties": {
"first_name": {
"type": "text",
"copy_to": "full_name"
},
"last_name": {
"type": "text",
"copy_to": "full_name"
},
"full_name": {
"type": "text"
}
}
}
}
}
I have used this and don't recommend it, because control over tokenization is limited: you can only specify one analyzer, and therefore one tokenizer, for the concatenated field.
Here is the page on copy_to.
