Multiple Analyzers for query_string by Field - elasticsearch

For a particular query, how can I define separate query analyzers per field (phonetic_name, name)? Should I just define a search_analyzer for phonetic_name and name in the Put Mapping of the index/type?
{
"query_string" : {
"fields" : ["phonetic_name", "name^5"],
"query" : "italian food",
"use_dis_max" : true
}
}

You can specify the analyzer for a field when the index is created, for example:
curl -s -XPOST localhost:9200/myindex -d '{
"mappings":{
"mytype":{
"properties":{
"field1":{"store":"yes","index":"not_analyzed","type":"string"},
"field2":{"store":"yes","analyzer":"whitespace","type":"string"},
"field3":{"store":"yes","analyzer":"simple","type":"string"}
}
}
}
}'
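To see why the analyzer chosen per field matters, here is a minimal Python sketch of roughly what the whitespace and simple analyzers do to a string. It is a toy model, not Elasticsearch code; the function names whitespace_analyzer and simple_analyzer are made up for illustration.

```python
import re

def whitespace_analyzer(text):
    """Toy stand-in for the whitespace analyzer: split on whitespace, keep case."""
    return text.split()

def simple_analyzer(text):
    """Toy stand-in for the simple analyzer: split on non-letters, then lowercase."""
    return [tok.lower() for tok in re.findall(r"[^\W\d_]+", text)]

sample = "Italian Food-2go"
print(whitespace_analyzer(sample))  # ['Italian', 'Food-2go']
print(simple_analyzer(sample))      # ['italian', 'food', 'go']
```

The same input produces different terms per field, so a query term only matches a field if it is analyzed the same way that field was.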

Related

Don't make some fields searchable when using query_string or term/terms in Elasticsearch

Having this mapping:
curl -XPUT 'localhost:9200/testindex?pretty=true' -d '{
"mappings": {
"items": {
"dynamic": "strict",
"properties" : {
"title" : { "type": "string" },
"body" : { "type": "string" },
"tags" : { "type": "string" }
}}}}'
I add two simple items:
curl -XPUT 'localhost:9200/testindex/items/1' -d '{
"title": "This is a test title",
"body" : "This is the body of the java",
"tags" : "csharp"
}'
curl -XPUT 'localhost:9200/testindex/items/2' -d '{
"title": "Another text title",
"body": "My body is great and Im super handsome",
"tags" : ["cplusplus", "python", "java"]
}'
If I search the string java:
curl -XGET 'localhost:9200/testindex/items/_search?q=java&pretty=true'
... it will match both items. The first item will match on the body and the other one on the tags.
How can I avoid searching in some fields? In this example I don't want it to match on the tags field, but I want to keep tags indexed because I use them for aggregations.
I know I can do it using this:
{
"query" : {
"query_string": {
"query": "java AND -tags:java"
}},
"_source" : {
"exclude" : ["*.tags"]
}
}
But is there any other more elegant way, like putting something in the mapping?
PS: My searches are always query_strings and term / terms and I'm using ES 2.3.2
You can specify the fields option if you only want to match against certain fields:
{
"query_string" : {
"fields" : ["body"],
"query" : "java"
}
}
EDIT 1
You could use the "include_in_all": false parameter inside the mapping (check the documentation). The query string query defaults to the _all field, so you can add "include_in_all": false to every field you don't want to match. After that, this query would only look in the body field:
{
"query_string" : {
"query" : "java"
}
}
Does this help?
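As a rough illustration of the include_in_all idea, the following Python sketch models the _all field as the combined tokens of every field that has not opted out. It is a simplified stand-in, not how Elasticsearch is implemented; all_field_tokens and search_all are hypothetical helper names, and the documents are the two items from the question.

```python
def all_field_tokens(doc, mapping):
    """Collect lowercase tokens from every field whose include_in_all is not False."""
    tokens = []
    for field, value in doc.items():
        if not mapping.get(field, {}).get("include_in_all", True):
            continue  # the field stays indexed on its own but is left out of _all
        values = value if isinstance(value, list) else [value]
        for v in values:
            tokens.extend(v.lower().split())
    return tokens

def search_all(docs, mapping, term):
    """Return ids of documents whose toy _all field contains the term."""
    return [doc_id for doc_id, doc in docs.items()
            if term.lower() in all_field_tokens(doc, mapping)]

docs = {
    1: {"title": "This is a test title",
        "body": "This is the body of the java",
        "tags": "csharp"},
    2: {"title": "Another text title",
        "body": "My body is great and Im super handsome",
        "tags": ["cplusplus", "python", "java"]},
}

default_mapping = {}
no_tags_mapping = {"tags": {"include_in_all": False}}

print(search_all(docs, default_mapping, "java"))  # [1, 2]
print(search_all(docs, no_tags_mapping, "java"))  # [1]
```

In this model only tags is excluded, so title would still be searched; excluding it as well would need the same setting on that field.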

Elastic Search Percolate Boolean Queries

I am trying to get boolean queries which are stored in ES using Percolate API.
Index mapping is given below
curl -XPUT 'localhost:9200/my-index' -d '{
"mappings": {
"taggers": {
"properties": {
"content": {
"type": "string"
}
}
}
}
}'
I am inserting records like this (the queries use proper boolean syntax such as AND, OR, NOT, as in the example below):
curl -XPUT 'localhost:9200/my-index/.percolator/1' -d '{
"query" : {
"match" : {
"content" : "Audi AND BMW"
}
}
}'
And then I am posting a document to get matched queries.
curl -XGET 'localhost:9200/my-index/my-type/_percolate' -d '{
"doc" : {
"content" : "I like audi very much"
}
}'
In the above case no records should be returned, because the boolean query is "Audi AND BMW", but a record is still returned. It means the AND condition is being ignored. I cannot figure out why it does not work for boolean queries.
You need to percolate this query instead. match queries do not understand the AND operator (they treat it like the ordinary token and), but query_string does.
curl -XPUT 'localhost:9200/my-index/.percolator/1' -d '{
"query" : {
"query_string" : {
"query" : "Audi AND BMW",
"default_field": "content"
}
}
}'
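The difference can be sketched in a few lines of Python. This is a toy model under simplifying assumptions (whitespace tokenizing, lowercase only, and a query_string parser that understands nothing except terms joined by an upper-case AND); the function names are invented for the example.

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()

def match_query(query_text, doc_text):
    """Toy match query: every word is an ordinary token and any one may match (OR)."""
    doc_tokens = set(analyze(doc_text))
    return any(tok in doc_tokens for tok in analyze(query_text))

def query_string_and(query_text, doc_text):
    """Toy query_string: all terms joined by an upper-case AND must be present."""
    doc_tokens = set(analyze(doc_text))
    terms = [t.strip() for t in query_text.split(" AND ")]
    return all(term.lower() in doc_tokens for term in terms)

doc = "I like audi very much"
print(match_query("Audi AND BMW", doc))       # True
print(query_string_and("Audi AND BMW", doc))  # False
```

With match, the text becomes the tokens audi, and, bmw, and the document matches because audi alone is enough. With the AND-aware parser, both audi and bmw are required, so the document is not returned.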

How to perform wildcard search on a date field?

I have a field containing values like 2011-10-20 with the mapping:
"joiningDate": { "type": "date", "format": "dateOptionalTime" }
The following query ends up in a SearchPhaseExecutionException.
"wildcard" : { "ingestionDate" : "2011*" }
Seems like ES (v1.1) doesn't provide that luxury. This post suggests the idea of scripting (an unaccepted answer says even more). I'll try that; I'm just asking whether anyone has done it already.
Expectation
A search string 13 should match all documents where the joiningDate field has values :
2011-10-13
2013-01-11
2100-13-02
I'm not sure I understand your needs correctly, but I would suggest using a range query on the date field.
The query below should return the results you want.
{
"query": {
"range": {
"joiningDate": {
"gt": "2011-01-01",
"lt": "2012-01-01"
}
}
}
}
I hope this could help you.
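A year prefix such as 2011* can be expressed as a half-open date range. The Python sketch below builds such a request body; year_range is a hypothetical helper, and it uses gte rather than gt so that 1 January itself is included, which is an assumption about what the prefix search was meant to cover.

```python
from datetime import date

def year_range(year):
    """Build a range query body for one calendar year (start inclusive, end exclusive)."""
    return {
        "query": {
            "range": {
                "joiningDate": {
                    "gte": date(year, 1, 1).isoformat(),
                    "lt": date(year + 1, 1, 1).isoformat(),
                }
            }
        }
    }

body = year_range(2011)
print(body["query"]["range"]["joiningDate"])
# {'gte': '2011-01-01', 'lt': '2012-01-01'}
```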
Edit (searching for dates that contain "13" itself)
I suggest using the multi-field functionality of Elasticsearch.
It means you can index the "joiningDate" field as two different field types at the same time.
Please see and try the example code below.
Create an index:
curl -XPUT 'localhost:9200/blacksmith'
Define a mapping in which the type of the "joiningDate" field is "multi_field":
curl -XPUT 'localhost:9200/blacksmith/my_type/_mapping' -d '{
"my_type" : {
"properties" : {
"joiningDate" : {
"type": "multi_field",
"fields" : {
"joiningDate" : {
"type" : "date",
"format" : "dateOptionalTime"
},
"verbatim" : {
"type" : "string",
"index" : "not_analyzed"
}
}
}
}
}
}'
Index 4 documents (3 of them containing "13"):
curl -s -XPOST 'localhost:9200/blacksmith/my_type/1' -d '{ "joiningDate": "2011-10-13" }'
curl -s -XPOST 'localhost:9200/blacksmith/my_type/2' -d '{ "joiningDate": "2013-01-11" }'
curl -s -XPOST 'localhost:9200/blacksmith/my_type/3' -d '{ "joiningDate": "2130-12-02" }'
curl -s -XPOST 'localhost:9200/blacksmith/my_type/4' -d '{ "joiningDate": "2014-12-02" }' # no 13
Run the wildcard query against the "joiningDate.verbatim" field, NOT the "joiningDate" field:
curl -XGET 'localhost:9200/blacksmith/my_type/_search?pretty' -d '{
"query": {
"wildcard": {
"joiningDate.verbatim": {
"wildcard": "*13*"
}
}
}
}'
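To check which of the four documents the pattern should hit, the matching can be reproduced on the raw strings in Python. This is only an approximation: fnmatchcase from the standard library stands in for the wildcard query (it also accepts bracket expressions, which are not needed here), and wildcard_ids is a made-up helper.

```python
from fnmatch import fnmatchcase

# The verbatim (not_analyzed) values of the four indexed documents, keyed by id.
dates = {
    "1": "2011-10-13",
    "2": "2013-01-11",
    "3": "2130-12-02",
    "4": "2014-12-02",
}

def wildcard_ids(pattern, docs):
    """Return ids whose raw string value matches the wildcard pattern."""
    return [doc_id for doc_id, raw in docs.items() if fnmatchcase(raw, pattern)]

print(wildcard_ids("*13*", dates))   # ['1', '2', '3']
print(wildcard_ids("2011*", dates))  # ['1']
```

The match works because the verbatim field keeps the date as a plain string, whereas the date-typed field does not store text that a wildcard could scan.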

Is there any analyzer that performs exact word matching in elastic search

For example, if I have words like "America", "American" and "America's" and I search for "America", I should get only the first one. With the standard analyzer it returns all three.
I want to enforce this only at query time; I don't want to make changes to the existing index. Please help me.
Set your mapping to not_analyzed
curl -XPOST localhost:9200/test -d '{
"mappings" : {
"type1" : {
"properties" : {
"field1" : { "type" : "string", "index" : "not_analyzed" }
}
}
}
}'
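That will not do anything to your input string, but will still allow it to be searchable. Note that if you do this, searching on "america" will not match "America", as there is a difference in case.
If you want to be able to match those, then you could try the keyword analyzer, though be aware that the built-in keyword analyzer emits the whole input as a single token without lowercasing, so case-insensitive matching would additionally need a lowercase token filter in a custom analyzer.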
curl -XPOST localhost:9200/test -d '{
"mappings" : {
"type1" : {
"properties" : {
"field1" : { "type" : "string", "analyzer" : "keyword" }
}
}
}
}'
You need not worry about the analyzer. When querying, use term and terms queries; they will behave as you asked.
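One caveat with term queries is that the search term is not analyzed, so it must equal the indexed term exactly. The Python sketch below is a toy model of that behavior; the "analyzed" branch simply lowercases and splits on whitespace, which is a simplification of the standard analyzer, and both function names are hypothetical.

```python
def index_terms(value, analyzed):
    """Terms stored in the toy index: lowercased words if analyzed, else the raw value."""
    return value.lower().split() if analyzed else [value]

def term_query(term, values, analyzed):
    """Toy term query: the search term is compared as-is, with no analysis."""
    return [v for v in values if term in index_terms(v, analyzed)]

values = ["America", "American"]
print(term_query("America", values, analyzed=False))  # ['America']
print(term_query("america", values, analyzed=False))  # []
print(term_query("america", values, analyzed=True))   # ['America']
print(term_query("America", values, analyzed=True))   # []
```

So against an existing analyzed field, the term would have to be supplied in the lowercased form the index actually holds.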

elasticsearch percolator stemmer

I'm attempting to use the percolation function in elasticsearch. It works great but out of the box there is no stemming to handle singular/plurals etc. The documentation is rather thin on this topic so I was wondering if anyone has gotten this working and what settings are required. At the moment I'm not indexing my documents since I'm not searching them, just passing them through the percolator to trigger notifications.
You can use the percolate API to test documents against percolators without indexing them. However, the percolate API requires an index and a type for your doc. This is so that it knows how each field in your document is defined (or mapped).
Analyzers belong to an index, and the fields in a mapping/type definition can use either globally defined analyzers, or custom analyzers defined for your index.
For instance, we could define a mapping for index test, type test using a globally defined analyzer as follows:
curl -XPUT 'http://127.0.0.1:9200/test/?pretty=1' -d '
{
"mappings" : {
"test" : {
"properties" : {
"title" : {
"type" : "string",
"analyzer" : "english"
}
}
}
}
}
'
Or alternatively, you could set up a custom analyzer that belongs just to the test index:
curl -XPUT 'http://127.0.0.1:9200/test/?pretty=1' -d '
{
"mappings" : {
"test" : {
"properties" : {
"title" : {
"type" : "string",
"analyzer" : "my_english"
}
}
}
},
"settings" : {
"analysis" : {
"analyzer" : {
"my_english" : {
"stopwords" : [],
"type" : "english"
}
}
}
}
}
'
Now we can create our percolator, specifying which index it belongs to:
curl -XPUT 'http://127.0.0.1:9200/_percolator/test/english?pretty=1' -d '
{
"query" : {
"match" : {
"title" : "singular"
}
}
}
'
And test it out with the percolate API, again specifying the index and the type:
curl -XGET 'http://127.0.0.1:9200/test/test/_percolate?pretty=1' -d '
{
"doc" : {
"title" : "singulars"
}
}
'
# {
# "ok" : true,
# "matches" : [
# "english"
# ]
# }
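Why the mapping's analyzer makes "singulars" match a percolator registered for "singular" can be sketched in Python. This is a toy model: crude_stem just drops one trailing "s" and is nowhere near the real english analyzer, and percolate is a made-up function, not the percolate API.

```python
def crude_stem(token):
    """Very crude stand-in for an English stemmer: drop one trailing 's'."""
    return token[:-1] if len(token) > 3 and token.endswith("s") else token

def analyze(text, stem):
    """Lowercase, split on whitespace, and optionally apply the crude stemmer."""
    tokens = text.lower().split()
    return [crude_stem(t) for t in tokens] if stem else tokens

def percolate(doc_title, registered, stem):
    """Return names of registered queries sharing a term with the document title."""
    doc_tokens = set(analyze(doc_title, stem))
    return [name for name, query_text in registered.items()
            if any(tok in doc_tokens for tok in analyze(query_text, stem))]

registered = {"english": "singular"}
print(percolate("singulars", registered, stem=True))   # ['english']
print(percolate("singulars", registered, stem=False))  # []
```

The match only happens when both the registered query and the percolated document pass through the same stemming step, which is what mapping the title field to a stemming analyzer achieves.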
