How to query by number and disregard special characters - elasticsearch

Currently I have a document in my OpenSearch database with the value 1301-003.023.
If I run the following query, the document is returned:
GET test/example
{
  "query": {
    "match": {
      "my_number": "1301-003.023"
    }
  }
}
The main problem is when the user runs this query instead:
GET test/example
{
  "query": {
    "match": {
      "my_number": "1301003.023"
    }
  }
}
In the query above the symbol - is missing, so nothing is returned. I need a search that can deal with this, but without returning documents that don't have exactly the same digits. So, if I search for 1301003023 I want to find the document with 1301-003.023, but not documents with 1301-003.032 (note that the last two digits are swapped).

I created a new analyzer with a mapping char filter that maps the symbols "." and "-" to the empty string. So the number "1301-003.023" becomes the token "1301003023".
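As a rough illustration of what this char filter does at index and search time, here is a minimal Python sketch. It is not Elasticsearch code: `strip_separators` is a hypothetical helper that mimics the two mapping rules, and it assumes the remaining digits end up as a single token.

```python
# Hypothetical stand-in for the mapping char filter:
# it deletes "." and "-" before the text is tokenized.
_DELETE_SEPARATORS = str.maketrans("", "", ".-")


def strip_separators(text: str) -> str:
    """Return text with every '.' and '-' removed."""
    return text.translate(_DELETE_SEPARATORS)


indexed = strip_separators("1301-003.023")

# Every spelling of the same digits collapses to the same token...
same = [strip_separators(q) == indexed
        for q in ("1301003023", "1301003.023", "1301-003023")]

# ...while a number with swapped digits yields a different token.
different = strip_separators("1301-003.032") == indexed

print(indexed, same, different)
```

Because the same filter runs on both the stored value and the query text, the comparison happens on digits only, which is what keeps 1301-003.032 from matching.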
Full example:
PUT /test
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "char_filter": [
            "my_filter"
          ]
        }
      },
      "char_filter": {
        "my_filter": {
          "type": "mapping",
          "mappings": [
            ". => ",
            "- => "
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "my_number": {
        "type": "text",
        "analyzer": "my_analyzer"
      }
    }
  }
}
POST test/_bulk
{"index": {}}
{"my_number": "1301-003.023"}
GET test/_search
{
  "query": {
    "match": {
      "my_number": {
        "query": "1301003023"
      }
    }
  }
}
Response:
"hits": [
  {
    "_index": "test",
    "_id": "MC7v0IUBKJKciEqCrBP-",
    "_score": 0.2876821,
    "_source": {
      "my_number": "1301-003.023"
    }
  }
]


How to find the word 'food2u' by searching for 'food' in Elasticsearch?

I am a rookie who just started learning Elasticsearch, and I want to find words like 'food2u' by searching for the keyword 'food'. But I only get results like 'Food Repo', 'Give Food', etc. The field's mapping type is 'text', and this is my query:
GET api/_search
{
  "query": {
    "match": {
      "Name": {
        "query": "food"
      }
    }
  }
}
You are getting results like 'Food Repo' and 'Give Food' because a text field uses the standard analyzer if no analyzer is specified. Food Repo gets tokenized into food and repo. Similarly, Give Food gets tokenized into give and food.
But food2u gets tokenized into the single token food2u. Since that does not match the token "food", you will not get the food2u document.
You need to use the edge_ngram tokenizer to do a partial text match.
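To see why an edge_ngram tokenizer lets 'food' match 'food2u', here is a small Python sketch of prefix generation. It is only an approximation: `edge_ngrams` is a hypothetical helper, it assumes the whole word is treated as one token, and it borrows the min_gram and max_gram values (4 and 10) from the example mapping in this answer.

```python
def edge_ngrams(token: str, min_gram: int = 4, max_gram: int = 10) -> list:
    """Return the prefixes of token whose length is between min_gram and max_gram."""
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


indexed_terms = edge_ngrams("food2u")   # terms stored for the document
query_terms = edge_ngrams("food")       # the query goes through the same analyzer

# A match query finds the document if any query term is among the indexed terms.
is_match = any(term in indexed_terms for term in query_terms)

print(indexed_terms, query_terms, is_match)
```

In this sketch the document is stored under the prefixes food, food2 and food2u, so the query term food finds it.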
Adding a working example with index data, mapping, search query and search result
Index Mapping:
"settings": {
"analysis": {
"analyzer": {
"my_analyzer": {
"tokenizer": "my_tokenizer"
"tokenizer": {
"my_tokenizer": {
"type": "edge_ngram",
"min_gram": 4,
"max_gram": 10,
"token_chars": [
"max_ngram_diff": 10
"mappings": {
"properties": {
"name": {
"type": "text",
"analyzer": "my_analyzer"
Index Data:
Search Query:
{
  "query": {
    "match": {
      "name": "food"
    }
  }
}
Search Result:
"hits": [
  {
    "_index": "67552800",
    "_type": "_doc",
    "_id": "1",
    "_score": 0.2876821,
    "_source": {
      "name": "food2u"
    }
  }
]
If you don't want to change the mapping, you can instead use a wildcard query to return the matching documents:
{
  "query": {
    "wildcard": {
      "Name": {
        "value": "food*"
      }
    }
  }
}
Or you can use query_string with a wildcard:
"query": {
"query_string": {
"query": "food*",
"fields": [

Which type is best for an Elasticsearch "KEYWORDS" (like hashtags) field?

I want to build an Elasticsearch index for KEYWORDS, something like hashtags, and add a synonym filter for those keywords.
I can think of two ways to index a keyword. The first is to use the keyword type:
"settings": {
  "keywordField": {
    "type": "keyword"
  }
}
If I index a document for League of Legends, it might look like this:
"keywordField": ["leagueoflegends", "league", "legends", "lol" /* synonym */]
The second is to use the text type:
"settings": {
  "keywordField": {
    "type": "text",
    "analyzer": "lowercase_and_whitespace_and_synonym_analyzer"
  }
}
In that case the document might look like this:
"keywordField": ["league of legends"] (synonym: lol => leagueoflegends)
If I run the _analyze API on this field, I expect the tokens "leagueoflegends", "league", "legends".
The search queries 'lol', 'league of legends' and 'League of Legends' all have to match this field.
Which practice is best?
Adding a working example with index data, mapping, search query, and search result. In the below example, I have taken two synonyms lol and leagueoflegends
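Before the full example, the effect of applying a synonym filter at search time can be sketched in a few lines of Python. This is only an illustration, not how Elasticsearch is implemented: `SYNONYMS` and `expand` are hypothetical names, and the sketch assumes lowercase, single-word tokens.

```python
# Hypothetical synonym group, mirroring the rule "leagueoflegends, lol".
SYNONYMS = [{"leagueoflegends", "lol"}]


def expand(token: str) -> set:
    """Return the token plus every synonym from a group that contains it."""
    expanded = {token}
    for group in SYNONYMS:
        if token in group:
            expanded |= group
    return expanded


# Terms assumed to be stored for the document (no synonyms applied at index time).
indexed_terms = {"leagueoflegends", "league", "legends"}

# The query "lol" is expanded at search time and then compared with the index.
query_terms = expand("lol")
is_match = bool(query_terms & indexed_terms)

print(sorted(query_terms), is_match)
```

The point of the sketch is that the document itself never needs to contain "lol"; the expansion happens on the query side.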
Index Mapping:
"settings": {
"index": {
"analysis": {
"filter": {
"synonym_filter": {
"type": "synonym",
"synonyms": [
"leagueoflegends, lol"
"analyzer": {
"synonym_analyzer": {
"filter": [
"tokenizer": "standard"
"mappings": {
"properties": {
"keywordField": {
"type": "text"
Index Data:
{
  "keywordField": ["leagueoflegends", "league", "legends"]
}
Search Query:
{
  "query": {
    "match": {
      "keywordField": {
        "query": "lol",
        "analyzer": "synonym_analyzer"
      }
    }
  }
}
Search Result:
"hits": [
"_index": "66872989",
"_type": "_doc",
"_id": "1",
"_score": 0.19363807,
"_source": {
"keywordField": [

Search an array of strings by partial match in Elasticsearch

I have a field like this:
names: ["Red:123", "Blue:45", "Green:56"]
Its mapping is:
"names": {
  "type": "keyword"
}
How can I search like this
{
  "query": {
    "match": {
      "names": "red"
    }
  }
}
to get all the documents where red is part of an element of the names array?
Right now it only works with
{
  "query": {
    "match": {
      "names": "red:123"
    }
  }
}
To achieve the required result, you can either add multi-fields or just change the type to text.
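To see why either change helps, the difference between keyword and text indexing can be approximated in Python. This is a rough sketch, not Elasticsearch's actual analysis chain: `analyze_text` is a hypothetical helper that assumes the standard analyzer behaves like lowercasing plus splitting on non-alphanumeric characters for values such as "Red:123".

```python
import re


def analyze_text(value: str) -> list:
    """Crude stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", value.lower())


names = ["Red:123", "Blue:45", "Green:56"]

# keyword type: each array element is indexed unchanged as one term.
keyword_terms = set(names)

# text type: each element is analyzed into several lowercase terms.
text_terms = {term for name in names for term in analyze_text(name)}

print("red" in keyword_terms)  # False
print("red" in text_terms)     # True
```

With a keyword sub-field kept alongside the text field, the exact value "Red:123" stays available for exact matching, sorting or aggregations.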
Index mapping using multi-fields:
{
  "mappings": {
    "properties": {
      "names": {
        "type": "text",
        "fields": {
          "raw": {
            "type": "keyword"
          }
        }
      }
    }
  }
}
Adding a working example with index data, mapping, search query, and search result
Index Mapping:
Index Data:
"names": [
Search Query:
{
  "query": {
    "match": {
      "names": "red"
    }
  }
}
Search Result:
"hits": [
"_index": "64665127",
"_type": "_doc",
"_id": "1",
"_score": 0.2876821,
"_source": {
"names": [

Custom analyzer, use case : zip-code [ElasticSearch]

Consider an index/type named customers/customer.
Each document in it has a zip-code property.
Basically, a zip-code can look like:
String-String (ex : 8907-1009)
String String (ex : 211-20)
String (ex : 30200)
I'd like to configure my index analyzer to match as many relevant documents as possible. Currently, I work like this:
PUT /customers/
"zip-code": {
some string properties ...
When I search for a document, I use this request:
GET /customers/customer/_search
That works if you want to search rigorously. But, for instance, if the zip-code is "200 30", then searching for "200-30" will not return any results.
I'd like to configure my index analyzer so that I don't have this problem.
Can someone help me?
P.S. If you want more information, please let me know ;)
As soon as you want to find variations you don't want to use not_analyzed.
Let's try this with a different mapping:
PUT zip
{
  "settings": {
    "number_of_shards": 1,
    "analysis": {
      "analyzer": {
        "zip_code": {
          "tokenizer": "standard",
          "filter": [ ]
        }
      }
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "zip": {
          "type": "text",
          "analyzer": "zip_code"
        }
      }
    }
  }
}
We're using the standard tokenizer; strings will be broken up at whitespace and punctuation marks (including dashes) into tokens. You can see the actual tokens if you run the following request:
POST zip/_analyze
{
  "analyzer": "zip_code",
  "text": ["8907-1009", "211-20", "30200"]
}
Add your examples:
POST zip/_doc
{
  "zip": "8907-1009"
}
POST zip/_doc
{
  "zip": "211-20"
}
POST zip/_doc
{
  "zip": "30200"
}
Now the query seems to work fine:
GET zip/_search
{
  "query": {
    "match": {
      "zip": "211-20"
    }
  }
}
This will also work if you just search for "211". However, this might be too lenient, since a match query for "211-20" will also find documents such as "20", "20-211", "211-10", ...
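The leniency can be reproduced with a small Python sketch. It is a simplification rather than Elasticsearch's real behaviour: `tokens` and `match_any` are hypothetical helpers, the tokenizer is approximated by a regular expression, and the match query is assumed to use its default "any token matches" behaviour.

```python
import re


def tokens(text: str) -> list:
    """Rough stand-in for the standard tokenizer: runs of letters and digits."""
    return re.findall(r"[A-Za-z0-9]+", text)


def match_any(query: str, value: str) -> bool:
    """Default match behaviour: at least one query token occurs in the field."""
    return bool(set(tokens(query)) & set(tokens(value)))


for value in ["211-20", "20", "20-211", "211-10", "30200"]:
    print(value, match_any("211-20", value))
```

Under these assumptions every value except "30200" matches the query "211-20", because sharing a single token is enough.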
What you probably want is a phrase search where all the tokens in your query need to be in the field and also in the right order:
GET zip/_search
{
  "query": {
    "match_phrase": {
      "zip": "211"
    }
  }
}
If the ZIP codes have a hierarchical meaning (if you have "211-20" you want this to be found when searching for "211", but not when searching for "20"), you can use the path_hierarchy tokenizer.
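As a sketch of what the path_hierarchy tokenizer produces with "-" as the delimiter, the following Python snippet builds the cumulative prefixes of a ZIP code. `path_hierarchy_tokens` is a hypothetical helper written for illustration, and the matching rule (one shared term is enough) is an assumption about how a plain match query would behave.

```python
def path_hierarchy_tokens(text: str, delimiter: str = "-") -> list:
    """Return cumulative prefixes of text, cut at each delimiter."""
    parts = text.split(delimiter)
    return [delimiter.join(parts[:i]) for i in range(1, len(parts) + 1)]


indexed = path_hierarchy_tokens("8907-1009")

for query in ["8907", "8907-1009", "1009"]:
    # The query text is analyzed the same way; a match needs one shared term.
    query_terms = path_hierarchy_tokens(query)
    print(query, "->", any(term in indexed for term in query_terms))
```

The second part of the code never becomes a term on its own, which is why a search for the trailing part alone finds nothing in this sketch.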
So changing the mapping to this:
PUT zip
{
  "settings": {
    "number_of_shards": 1,
    "analysis": {
      "analyzer": {
        "zip_code": {
          "tokenizer": "zip_tokenizer",
          "filter": [ ]
        }
      },
      "tokenizer": {
        "zip_tokenizer": {
          "type": "path_hierarchy",
          "delimiter": "-"
        }
      }
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "zip": {
          "type": "text",
          "analyzer": "zip_code"
        }
      }
    }
  }
}
Using the same 3 documents from above you can use the match query now:
GET zip/_search
{
  "query": {
    "match": {
      "zip": "1009"
    }
  }
}
"1009" won't find anything, but "8907" or "8907-1009" will.
If you want to also find "1009", but with a lower score, you'll have to analyze the zip code with both variations I have shown (combine the 2 versions of the mapping):
PUT zip
{
  "settings": {
    "number_of_shards": 1,
    "analysis": {
      "analyzer": {
        "zip_hierarchical": {
          "tokenizer": "zip_tokenizer",
          "filter": [ ]
        },
        "zip_standard": {
          "tokenizer": "standard",
          "filter": [ ]
        }
      },
      "tokenizer": {
        "zip_tokenizer": {
          "type": "path_hierarchy",
          "delimiter": "-"
        }
      }
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "zip": {
          "type": "text",
          "analyzer": "zip_standard",
          "fields": {
            "hierarchical": {
              "type": "text",
              "analyzer": "zip_hierarchical"
            }
          }
        }
      }
    }
  }
}
Add a document with the inverse order to properly test it:
POST zip/_doc
{
  "zip": "1009-111"
}
Then search both fields, but boost the one with the hierarchical tokenizer by 3:
GET zip/_search
{
  "query": {
    "multi_match": {
      "query": "1009",
      "fields": [ "zip", "zip.hierarchical^3" ]
    }
  }
}
Then you can see that "1009-111" has a much higher score than "8907-1009".

Elasticsearch: discard documents that contain a superset of the query

Let's say I have 3 documents:
{ "cities": "Paris Zurich Milan" }
{ "cities": "Paris Zurich" }
{ "cities": "Zurich"}
cities is just a text field; I'm not using any custom analyzer.
I want to query for documents whose cities field contains both Paris and Zurich, in this order, and no other city. So I want to get only the second document.
This is what I've tried so far:
{
  "query": {
    "match_phrase": {
      "cities": "Paris Zurich"
    }
  }
}
But this also returns the first document.
What should I do instead?
If case-sensitive matching is acceptable, just use a term query on the keyword sub-field:
{
  "query": {
    "term": {
      "cities.keyword": "Paris Zurich"
    }
  }
}
It will only match the exact value of the field.
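The exact-value behaviour can be sketched in Python as a plain equality check on the whole field. This is an illustration only: `term_query` is a hypothetical helper, and it assumes the keyword sub-field holds the original string unchanged.

```python
docs = [
    {"cities": "Paris Zurich Milan"},
    {"cities": "Paris Zurich"},
    {"cities": "Zurich"},
]


def term_query(field: str, term: str) -> list:
    """Return docs whose whole field value equals term exactly (case-sensitive)."""
    return [doc for doc in docs if doc[field] == term]


exact_hits = term_query("cities", "Paris Zurich")
lowercase_hits = term_query("cities", "paris zurich")

print(exact_hits)      # [{'cities': 'Paris Zurich'}]
print(lowercase_hits)  # []
```

The empty second result shows the limitation of this approach: a differently cased query term finds nothing.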
Alternatively, you can create a custom analyzer that still indexes the whole field value as a single token (just like keyword), with one exception: the indexed value is converted to lowercase, so documents containing Paris Zurich as well as paris Zurich are found by the same lowercase term. Here is an example:
{
  "settings": {
    "analysis": {
      "analyzer": {
        "lowercase_analyzer": {
          "type": "custom",
          "tokenizer": "keyword",
          "char_filter": [],
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "cities": {
          "type": "text",
          "fields": {
            "lowercased": {
              "type": "text",
              "analyzer": "lowercase_analyzer"
            }
          }
        }
      }
    }
  }
}
{
  "query": {
    "term": {
      "cities.lowercased": "paris zurich" // Query string should also be in lowercase
    }
  }
}
