How to match on prefix in Elasticsearch - elasticsearch

Let's say that my Elasticsearch index has a field called "dots" that contains a string of words separated by punctuation (e.g. "first.second.third").
I need to search for e.g. "first.second" and get all entries whose "dots" field is either exactly "first.second" or starts with "first.second.".
I have trouble understanding how text querying works; at least, I have not been able to write a query that does the job.

Elasticsearch has a Path Hierarchy Tokenizer that was created for exactly this use case. Here is an example of how to set it up for your index:
# Create a new index with a custom path_hierarchy analyzer
# (older versions used "index_analyzer" instead of "analyzer"; that setting is deprecated)
curl -XPUT "localhost:9200/prefix-test" -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "prefix-test-analyzer": {
          "type": "custom",
          "tokenizer": "prefix-test-tokenizer"
        }
      },
      "tokenizer": {
        "prefix-test-tokenizer": {
          "type": "path_hierarchy",
          "delimiter": "."
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "dots": {
          "type": "string",
          "analyzer": "prefix-test-analyzer",
          "search_analyzer": "keyword"
        }
      }
    }
  }
}'
# Put some test data
curl -XPUT "localhost:9200/prefix-test/doc/1" -d '{"dots": "first.second.third"}'
curl -XPUT "localhost:9200/prefix-test/doc/2" -d '{"dots": ""}'
curl -XPUT "localhost:9200/prefix-test/doc/3" -d '{"dots": "first.baz.something"}'
curl -XPOST "localhost:9200/prefix-test/_refresh"
# Test searches
curl -XPOST "localhost:9200/prefix-test/doc/_search?pretty=true" -d '{
  "query": {
    "term": {
      "dots": "first"
    }
  }
}'
curl -XPOST "localhost:9200/prefix-test/doc/_search?pretty=true" -d '{
  "query": {
    "term": {
      "dots": "first.second"
    }
  }
}'
curl -XPOST "localhost:9200/prefix-test/doc/_search?pretty=true" -d '{
  "query": {
    "term": {
      "dots": ""
    }
  }
}'
curl -XPOST "localhost:9200/prefix-test/doc/_search?pretty=true&q=dots:first.second"
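To see why the term queries above behave like prefix matches, it can help to look at the tokens the path_hierarchy tokenizer emits. The following is a minimal Python sketch, not Elasticsearch code: the function names are made up for illustration, and it only models the tokenizer with "." as the delimiter plus a verbatim term lookup (which is what the keyword search analyzer amounts to).

```python
def path_hierarchy_tokens(text, delimiter="."):
    """Emit every leading path of `text`, the way path_hierarchy does."""
    if not text:
        return []
    parts = text.split(delimiter)
    return [delimiter.join(parts[: i + 1]) for i in range(len(parts))]


def term_matches(indexed_value, query_term):
    """Model a term query: the query string is looked up verbatim among the indexed tokens."""
    return query_term in path_hierarchy_tokens(indexed_value)


# Two of the test documents from the answer above
docs = {1: "first.second.third", 3: "first.baz.something"}

hits_first = [i for i, v in docs.items() if term_matches(v, "first")]
hits_first_second = [i for i, v in docs.items() if term_matches(v, "first.second")]

print(path_hierarchy_tokens("first.second.third"))
print(hits_first, hits_first_second)
```

Because only whole leading paths are indexed, a value such as "first.secondary" would not be returned for "first.second" in this model, which is the behaviour the question asks for.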

There is also a much easier way, as pointed out in the Elasticsearch documentation.
Just use:
"text_phrase_prefix" : {
  "fieldname" : "yourprefix"
}
or, since 0.19.9:
"match_phrase_prefix" : {
  "fieldname" : "yourprefix"
}
instead of:
"prefix" : {
  "fieldname" : "yourprefix"
}

Have a look at prefix queries.
$ curl -XGET 'http://localhost:9200/index/type/_search' -d '{
  "query" : {
    "prefix" : { "dots" : "first.second" }
  }
}'

You should use wildcard characters in your query, something like this:
$ curl -XGET http://localhost:9200/myapp/index -d '{
  "dots": "first.second*"
}'
more examples about the syntax at:
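One caveat with a trailing wildcard (the same applies to a plain prefix match): it is slightly broader than what the question asks for, because "first.secondary" also starts with "first.second". The Python sketch below only illustrates the difference on plain strings; the function names are hypothetical, and it assumes the field value is compared as one un-analyzed string.

```python
from fnmatch import fnmatchcase


def wildcard_match(value, prefix):
    """Model the pattern "<prefix>*": anything that starts with the prefix."""
    return fnmatchcase(value, prefix + "*")


def dotted_prefix_match(value, prefix, delimiter="."):
    """Match the prefix exactly, or the prefix followed by the delimiter."""
    return value == prefix or value.startswith(prefix + delimiter)


values = ["first.second", "first.second.third", "first.secondary", "first.baz.something"]

loose = [v for v in values if wildcard_match(v, "first.second")]
strict = [v for v in values if dotted_prefix_match(v, "first.second")]

print(loose)
print(strict)
```

The stricter variant corresponds to what the question describes: exactly "first.second", or anything starting with "first.second.".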

I was looking for a similar solution, but one that matches only a prefix. @imtov's answer got me almost there, with one change: switching the analyzers around:
"mappings": {
  "doc": {
    "properties": {
      "dots": {
        "type": "string",
        "analyzer": "keyword",
        "search_analyzer": "prefix-test-analyzer"
      }
    }
  }
}
instead of
"mappings": {
  "doc": {
    "properties": {
      "dots": {
        "type": "string",
        "index_analyzer": "prefix-test-analyzer",
        "search_analyzer": "keyword"
      }
    }
  }
}
This way, adding:
'{"dots": "first.second"}'
'{"dots": "first.third"}'
will index only these full values as tokens, without storing the first, second, and third tokens separately.
Yet searching for either
will correctly return only the first entry:
'{"dots": "first.second"}'
This is not exactly what you asked for, but it is related, so I thought it could help someone.
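The swapped setup can be sketched in a few lines of Python. This is only a model, assuming an analyzed query (for example a match query): the indexed side keeps each value as a single term, as the keyword analyzer does, while the query string is split into its leading paths. The query string "first.second.anything" and the function names are hypothetical, chosen only for illustration.

```python
def path_hierarchy_tokens(text, delimiter="."):
    """Leading paths of `text`, modelling the path_hierarchy search analyzer."""
    parts = text.split(delimiter)
    return [delimiter.join(parts[: i + 1]) for i in range(len(parts))]


def search(stored_values, query):
    """Return stored values (kept whole, keyword-style) that equal one of the query's paths."""
    query_terms = set(path_hierarchy_tokens(query))
    return [v for v in stored_values if v in query_terms]


stored = ["first.second", "first.third"]

print(search(stored, "first.second.anything"))
print(search(stored, "first.second"))
```

In this model a stored value matches when it is a prefix (on delimiter boundaries) of the query, which is the reverse of the direction in the original question.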


Elasticsearch not using synonyms from synonym file

I am new to elasticsearch so before downvoting or marking as duplicate, please read the question first.
I am testing synonyms in elasticsearch (v 2.4.6) which I have installed on Ubuntu 16.04. I am giving synonyms through a file named synonym.txt which I have placed in config directory. I have created an index synonym_test as follows-
curl -XPOST localhost:9200/synonym_test/ -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_synonyms": {
          "tokenizer": "whitespace",
          "filter": ["lowercase","my_synonym_filter"]
        }
      },
      "filter": {
        "my_synonym_filter": {
          "type": "synonym",
          "ignore_case": true,
          "synonyms_path" : "synonym.txt"
        }
      }
    }
  }
}'
The index contains two fields- id and some_text. I configure the field some_text with the custom analyzer as follows-
curl -XPUT localhost:9200/synonym_test/rulers/_mapping -d '{
  "properties": {
    "id": {
      "type": "double"
    },
    "some_text": {
      "type": "string",
      "search_analyzer": "my_synonyms"
    }
  }
}'
Then I have inserted some data as -
curl -XPUT localhost:9200/synonym_test/external/5 -d '{
  "id" : "5",
  "some_text":"apple is a fruit"
}'
curl -XPUT localhost:9200/synonym_test/external/7 -d '{
  "id" : "7",
  "some_text":"english is spoken in england"
}'
curl -XPUT localhost:9200/synonym_test/external/8 -d '{
  "id" : "8",
  "some_text":"Scotland Yard is a popular game."
}'
curl -XPUT localhost:9200/synonym_test/external/9 -d '{
  "id" : "9",
  "some_text":"bananas contain potassium"
}'
The synonym.txt file contains following-
After doing all this, when I run the query for term fruit (which should also return the text containing bananas as they are synonyms in file), I get the text containing fruit only.
"some_text":"apple is a fruit"
I have also tried the following links, but none seem to have helped me -
Synonym analyzer not working ,
Elasticsearch synonym analyzer not working , How to apply synonyms at query time instead of index time in Elasticsearch , how to configure the synonyms_path in elasticsearch and many other links.
So, can anyone please tell me if I am doing anything wrong? Is there anything wrong with the settings or synonym file? I want the synonyms to work (query time) so that when I search for a term, I get all documents related to that term.
Please refer to the Custom Analyzer documentation for how custom analyzers should be configured.
If we follow the guidance in that documentation, the index definition becomes:
curl -XPOST localhost:9200/synonym_test/ -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_synonyms": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase","my_synonym_filter"]
        }
      },
      "filter": {
        "my_synonym_filter": {
          "type": "synonym",
          "ignore_case": true,
          "synonyms_path" : "synonym.txt"
        }
      }
    }
  }
}'
Which currently works on my elasticsearch instance.

Elasticsearch does not filter as expected

I am using Elasticsearch 1.4
I have an Index:
curl -XPUT "http://localhost:49200/customer" -d '{"mappings": {"venues": {"properties": {"party_id": {"type": "string"},"sup_party_id": {"type": "string"},"location": {"type": "geo_point"} } } }}'
And I put in some data, for instance:
curl -XPOST "http://localhost:49200/customer/venues/RO2" -d '{ "party_id":"RO2", "sup_party_id": "SUP_GT_R1A_0001","location":{ "lat":"21.030347","lon":"105.842896" }}'
curl -XPOST "http://localhost:49200/customer/venues/RO3" -d '{ "party_id":"RO3", "sup_party_id": "SUP_GT_R1A_0004","location":{ "lat":"20.9602051","lon":"105.78709179999998" }}'
and my filter is:
The above query does not return data, but it does return data when I remove the following terms:
Please show me the problem; any suggestions are appreciated!
That's because the sup_party_id field is an analyzed string. Change your mapping like this instead and it will work:
# The line to add is "index": "not_analyzed" on sup_party_id
curl -XPUT "http://localhost:49200/customer" -d '{
  "mappings": {
    "venues": {
      "properties": {
        "party_id": {
          "type": "string"
        },
        "sup_party_id": {
          "type": "string",
          "index": "not_analyzed"
        },
        "location": {
          "type": "geo_point"
        }
      }
    }
  }
}'

elasticsearch mapping analyzer - GET not getting result

I am trying to create an analyzer that replaces special characters with whitespace and converts the text to uppercase. After that, searching with lowercase text should also work.
Mapping Analyzer:
soundarya@soundarya-VirtualBox:~/Downloads/elasticsearch-2.4.0/bin$ curl -XPUT 'http://localhost:9200/aida' -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "char_filter": ["my_char_filter"],
          "filter": ["uppercase"]
        }
      },
      "char_filter": {
        "my_char_filter": {
          "type": "pattern_replace",
          "pattern": "(\\d+)-(?=\\d)",
          "replacement": "$1 "
        }
      }
    }
  }
}'
soundarya@soundarya-VirtualBox:~/Downloads/elasticsearch-2.4.0/bin$ curl -XPOST 'http://localhost:9200/aida/_analyze?pretty' -d '{
  "text":"My name is Soun*arya?jwnne&yuuk"
}'
It tokenizes the words properly, replacing the special characters with whitespace. But if I now search for a word from the text, no results are retrieved.
soundarya@soundarya-VirtualBox:~/Downloads/elasticsearch-2.4.0/bin$ curl -XGET 'http://localhost:9200/aida/_search' -d '{
I am not getting any results from the above GET query. The result looks like this:
soundarya@soundarya-VirtualBox:~/Downloads/elasticsearch-2.4.0/bin$ curl -XGET 'http://localhost:9200/aida/_search' -d '{
Can anyone help me with this? Thank you!
You don't seem to have indexed any data after creating your index. The call to _analyze will not index anything but simply show you how the content you send to ES would be analyzed.
First, you need to create your index by specifying a mapping in which you use the analyzer you've defined:
# The "mappings" section is the addition: a mapping type ("doc") with a field ("text") that uses your analyzer
curl -XPUT 'http://localhost:9200/aida' -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "char_filter": ["my_char_filter"],
          "filter": ["uppercase"]
        }
      },
      "char_filter": {
        "my_char_filter": {
          "type": "pattern_replace",
          "pattern": "(\\d+)-(?=\\d)",
          "replacement": "$1 "
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "text": {
          "type": "string",
          "analyzer": "my_analyzer"
        }
      }
    }
  }
}'
Then you can index a new real document:
curl -XPOST 'http://localhost:9200/aida/doc' -d '{
  "text": "My name is Soun*arya?jwnne&yuuk"
}'
Finally, you can search:
curl -XGET 'http://localhost:9200/aida/_search' -d '{
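As a side note on the char filter used in these settings: the pattern "(\\d+)-(?=\\d)" only rewrites a hyphen that sits between digits, so the splitting at "*", "?" and "&" seen in the _analyze output presumably comes from the standard tokenizer rather than from the char filter. The Python sketch below applies the same regular expression with the `re` module to show what the filter does and does not change; the function name simply mirrors the filter name and is not an Elasticsearch API.

```python
import re


def my_char_filter(text):
    # Python spelling of pattern "(\\d+)-(?=\\d)" with replacement "$1 "
    return re.sub(r"(\d+)-(?=\d)", r"\1 ", text)


print(my_char_filter("123-456-789"))
print(my_char_filter("My name is Soun*arya?jwnne&yuuk"))
```

If the goal is to replace other special characters before tokenization, the pattern itself would have to cover them.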

Elasticsearch terms filter returns no results

I have a bunch of documents with an array field like this:
{ "feed_uids": ["math.CO", "cs.IT"] }
I would like to find all documents that contain some subset of these values i.e. treat them as tags. Documentation leads me to believe a terms filter should work:
{ "query": { "filtered": { "filter": { "terms": { "feed_uids": [ "cs.IT" ] } } } } }
However, the query matches nothing. What am I doing wrong?
The terms-filter works as you expect. I guess your problem here is that you have a mapping where feed_uids is using the standard analyzer.
This is quite a common problem which is described in a bit more depth here: Troubleshooting Elasticsearch searches, for Beginners
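The mismatch can be modelled in a few lines of Python. This is a simplified sketch, not Elasticsearch code: it models only the lowercasing that the standard analyzer applies at index time and treats the terms filter as a verbatim lookup that succeeds when any of the requested values is among the indexed terms. The function names are hypothetical.

```python
def index_terms(values, analyzed):
    """Terms stored for an array field; `analyzed` models only the lowercasing step."""
    return {v.lower() if analyzed else v for v in values}


def terms_filter(indexed, wanted):
    """Model a terms filter: true when at least one wanted value is an indexed term."""
    return any(w in indexed for w in wanted)


feed_uids = ["math.CO", "cs.IT"]

analyzed_hit = terms_filter(index_terms(feed_uids, analyzed=True), ["cs.IT"])
not_analyzed_hit = terms_filter(index_terms(feed_uids, analyzed=False), ["cs.IT"])

print(analyzed_hit, not_analyzed_hit)
```

With the analyzed field the stored term is lowercased, so the filter value "cs.IT" is never found; with the values stored untouched, it is.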
Here is a runnable example showcasing how it works if you specify "index": "not_analyzed" for the field:
export ELASTICSEARCH_ENDPOINT="http://localhost:9200"
# Create indexes
"mappings": {
  "type": {
    "properties": {
      "feed_uids": {
        "type": "string",
        "index": "not_analyzed"
      }
    }
  }
}
# Index documents
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_bulk?refresh=true" -d '
# Do searches
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
"query": {
"filtered": {
"filter": {
"terms": {
"feed_uids": [

Indexing website/url in Elastic Search

I have a website field of a document indexed in Elasticsearch. Example value: . The problem is that when I search for example, the document is not included. How do I map the website/url field correctly?
I created the index below:
"tokenizer": "standard",
"char_filter": "html_strip"
"blogshops": {
"properties": {
"category": {
"properties": {
"name": {
"type": "string"
"reviews": {
"properties": {
"user": {
"properties": {
"_id": {
"type": "string"
I guess you are using the standard analyzer, which splits http://example.com into two tokens - http and example.com. You can take a look at http://localhost:9200/_analyze?text=
If you want to split the URL, you need to use a different analyzer or specify your own custom analyzer.
You can see how the URL would be indexed with the simple analyzer - http://localhost:9200/_analyze?text= As you can see, the URL is now indexed as three tokens ['http', 'example', 'com']. If you don't want to index tokens like ['http', 'www'] etc., you can define your own analyzer with the lowercase tokenizer (the one used by the simple analyzer) and a stop filter. For example, something like this:
# Delete index
curl -s -XDELETE 'http://localhost:9200/url-test/' ; echo
# Create index with mapping and custom analyzer
curl -s -XPUT 'http://localhost:9200/url-test/' -d '{
  "mappings": {
    "document": {
      "properties": {
        "content": {
          "type": "string",
          "analyzer" : "lowercase_with_stopwords"
        }
      }
    }
  },
  "settings" : {
    "index" : {
      "number_of_shards" : 1,
      "number_of_replicas" : 0
    },
    "analysis": {
      "filter" : {
        "stopwords_filter" : {
          "type" : "stop",
          "stopwords" : ["http", "https", "ftp", "www"]
        }
      },
      "analyzer": {
        "lowercase_with_stopwords": {
          "type": "custom",
          "tokenizer": "lowercase",
          "filter": [ "stopwords_filter" ]
        }
      }
    }
  }
}' ; echo
curl -s -XGET 'http://localhost:9200/url-test/_analyze?text='
# Index document
curl -s -XPUT 'http://localhost:9200/url-test/document/1?pretty=true' -d '{
  "content" : "Small content with URL"
}'
# Refresh index
curl -s -XPOST 'http://localhost:9200/url-test/_refresh'
# Try to search document
curl -s -XGET 'http://localhost:9200/url-test/_search?pretty' -d '{
  "query" : {
    "query_string" : {
      "query" : "content:example"
    }
  }
}'
NOTE: If you would rather not use stopwords, here is an interesting article: "Stop stopping stop words: a look at common terms query".
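What the lowercase_with_stopwords analyzer does to a URL can be approximated in Python. This is a rough sketch under stated assumptions, not the Elasticsearch implementation: the lowercase tokenizer is modelled as "split on anything that is not a letter, then lowercase", the stop filter as a set lookup, and the sample URLs are placeholders.

```python
import re

# Same list as the "stopwords_filter" in the settings above
STOPWORDS = {"http", "https", "ftp", "www"}


def lowercase_tokenizer(text):
    """Split on non-letter characters and lowercase each token."""
    return [t.lower() for t in re.findall(r"[^\W\d_]+", text)]


def lowercase_with_stopwords(text):
    """Lowercase tokenizer followed by the stop filter."""
    return [t for t in lowercase_tokenizer(text) if t not in STOPWORDS]


print(lowercase_tokenizer("http://example.com"))
print(lowercase_with_stopwords("http://www.example.com"))
```

With the scheme and "www" removed, a search for "example" has a matching token to find, which is why the query_string search above can return the document.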
