Exclude a field from an Elasticsearch query

Having the following mapping:
curl -XPUT 'localhost:9200/testidx?pretty=true' -d '{
"mappings": {
"items": {
"dynamic": "strict",
"properties" : {
"title" : { "type": "string" },
"body" : { "type": "string" }
}}}}'
I put two items in it:
curl -XPUT 'localhost:9200/testidx/items/1' -d '{
"title": "Titulo anterior",
"body": "blablabla blablabla blablabla blablabla blablabla blablabla"
}'
curl -XPUT 'localhost:9200/testidx/items/2' -d '{
"title": "Joselr",
"body": "Titulo stuff more stuff"
}'
Now I want to search for the word titulo in every field but body, so what I do is (following this post):
curl -XGET 'localhost:9200/testidx/items/_search?pretty=true' -d '{
"query" : {
"query_string": {
"query": "Titulo"
}},
"_source" : {
"exclude" : ["*.body"]
}
}'
It's supposed to show only item 1, as the second one contains the word Titulo only in the body, and that's what I want to ignore. How can I achieve this?
PS: This is just a simple example; I have a mapping with a lot of properties and I want to ignore some of them in some searches.
PS2: I'm using ES 2.3.2

The _source/exclude setting is only useful for not returning the body field in the response; it doesn't exclude that field from being searched.
What you can do is specify all the fields you want to search instead (whitelist approach):
curl -XGET 'localhost:9200/testidx/items/_search?pretty=true' -d '{
"query" : {
"query_string": {
"fields": ["title", "field2", "field3"], <-- add this
"query": "Titulo"
}},
"_source" : {
"exclude" : ["*.body"]
}
}'
Another thing you can do is explicitly specify that body should not be matched, using -body:Titulo:
curl -XGET 'localhost:9200/testidx/items/_search?pretty=true' -d '{
"query" : {
"query_string": {
"query": "Titulo AND -body:Titulo" <-- modify this
}},
"_source" : {
"exclude" : ["*.body"]
}
}'

Up to Elasticsearch 6.0.0 you can set "include_in_all": false on your index field properties, see e.g. https://www.elastic.co/guide/en/elasticsearch/reference/5.5/include-in-all.html.
(This of course requires reindexing the data.)
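A minimal sketch of such a mapping, reusing the title/body fields from the example above (hypothetical index name, string-field syntax as in the question). Since an unqualified query_string searches the _all field by default, a field excluded from _all is effectively ignored by that query:
# hypothetical index; body is kept out of the _all field
curl -XPUT 'localhost:9200/testidx_noall?pretty=true' -d '{
"mappings": {
"items": {
"properties" : {
"title" : { "type": "string" },
"body" : { "type": "string", "include_in_all": false }
}}}}'
After reindexing the documents into this mapping, the original unqualified query for Titulo should match only documents containing it outside body.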

Related

How to get multiple type search results from elasticsearch index?

So, I have a myindex Elasticsearch index with two types, type1 and type2. Both types have two common fields, name and description, as below:
{
"name": "",
"description": ""
}
I want 5 results from type1 and 5 results from type2 when I specify the size as 10 in a single search query.
The query below gives me 10 results from type1 if most of the matching results are in type1:
curl -XPOST 'localhost:9200/myindex/_search?pretty&pretty' -H 'Content-Type: application/json' -d'
{
"size": 10,
"query": {
"match": {
"name": "xyz"
}
}
}'
I can do this in two different queries as below, but I want to do it in one go.
curl -XPOST 'localhost:9200/myindex/type1/_search?pretty&pretty' -H 'Content-Type: application/json' -d'
{
"size": 5,
"query": {
"match": {
"name": "xyz"
}
}
}'
curl -XPOST 'localhost:9200/myindex/type2/_search?pretty&pretty' -H 'Content-Type: application/json' -d'
{
"size": 5,
"query": {
"match": {
"name": "xyz"
}
}
}'
You can use the Multi Search API (_msearch); the results for the two searches come back as separate responses:
curl -XGET 'localhost:9200/_msearch?pretty' -H 'Content-Type: application/x-ndjson' --data-binary '{ "index" : "myindex", "type" : "type1" }
{ "size" : 5, "query" : { "match" : { "name" : "xyz" } } }
{ "index" : "myindex", "type" : "type2" }
{ "size" : 5, "query" : { "match" : { "name" : "xyz" } } }
'
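The reply is a single JSON object with a responses array, one element per sub-search, so the two result sets stay separate. Abridged sketch of the shape:
{
"responses" : [
{ "hits" : { "hits" : [] } }, <-- up to 5 hits from type1
{ "hits" : { "hits" : [] } } <-- up to 5 hits from type2
]
}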

How do you set a field that contains spaces to be not_analyzed?

I have a 'grade' field in an Elasticsearch index that contains text and numbers. I have set the field mapping to be 'not_analyzed', but I can't search for grade = 'Year 1'.
I have read the finding exact values section of the docs but it doesn't seem to work for me.
Create the index.
curl -XPUT http://localhost:9200/my_test_index
Create the mapping template.
curl -XPUT http://localhost:9200/_template/my_test_index_mapping -d '
{
"template" : "my_test_index",
"mappings" : {
"my_type": {
"properties": {
"grade": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
'
Create some documents.
curl -XPUT 'http://localhost:9200/my_test_index/my_type/1' -d '{
"title" : "some title",
"grade" : "Year 1"
}'
curl -XPUT 'http://localhost:9200/my_test_index/my_type/3' -d '{
"title" : "some title",
"grade" : "preschool"
}'
Query for "Year 1" returns 0 results.
curl -XPOST http://localhost:9200/my_test_index/_search -d '{
"query": {
"filtered" : {
"filter" : {
"term": {
"grade": "Year 1"
}
}
}
}
}'
Query for 'preschool' returns 1 result.
curl -XPOST http://localhost:9200/my_test_index/_search -d '{
"query": {
"filtered" : {
"filter" : {
"term": {
"grade": "preschool"
}
}
}
}
}'
Checking the mapping shows that the 'grade' field is not 'not_analyzed'.
curl -XGET http://localhost:9200/my_test_index/_mapping
{
"my_test_index" : {
"mappings" : {
"my_type" : {
"properties" : {
"grade" : {
"type" : "string"
},
"title" : {
"type" : "string"
}
}
}
}
}
}
The template will only impact newly created indices.
Re-create the index after the template has been created.
Alternatively, specify the mappings while creating the index instead of relying on a template for a single index; a sketch follows.
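A minimal sketch of creating the index with the mapping inline, reusing the field definition from the template above (same 'string'/'not_analyzed' syntax as in the question):
curl -XPUT http://localhost:9200/my_test_index -d '
{
"mappings" : {
"my_type": {
"properties": {
"title": { "type": "string" },
"grade": { "type": "string", "index": "not_analyzed" }
}
}
}
}
'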
If you don't want the field to be analysed you can specify "index" : "not_analyzed" in the mapping. You'll then be able to search for exact matches as desired.
See: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#string
In your case, please try re-creating the index so that the mapping actually takes effect.
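A sketch of that re-create-and-verify flow (note the DELETE is destructive and drops the existing data, which then needs re-indexing):
curl -XDELETE http://localhost:9200/my_test_index
curl -XPUT http://localhost:9200/my_test_index
curl -XGET http://localhost:9200/my_test_index/_mapping
Since the my_test_index_mapping template already exists, the newly created index should pick it up, the _mapping output should now show "index" : "not_analyzed" on grade, and after re-indexing the documents the term filter for "Year 1" should return 1 result.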

Elasticsearch completion suggester matching multiple inputs

I have an issue with ES completion suggester. I have the following index mapping:
curl -XPUT localhost:9200/test_index/ -d '{
"mappings": {
"item": {
"properties": {
"test_suggest": {
"type": "completion",
"index_analyzer": "whitespace",
"search_analyzer": "whitespace",
"payloads": false
}
}
}
}
}'
I index some names like so:
curl -X PUT 'localhost:9200/test_index/item/1?refresh=true' -d '{
"test_suggest" : {
"input": [ "John", "Smith" ],
"output": "John Smith",
"weight" : 34
}
}'
curl -X PUT 'localhost:9200/test_index/item/2?refresh=true' -d '{
"test_suggest" : {
"input": [ "John", "Doe" ],
"output": "John Doe",
"weight" : 34
}
}'
Now if I call suggest and provide only the first name John it works fine:
curl -XPOST localhost:9200/test_index/_suggest -d '{
"test_suggest":{
"text":"john",
"completion": {
"field" : "test_suggest"
}
}
}'
Same works for last names:
curl -XPOST localhost:9200/test_index/_suggest -d '{
"test_suggest":{
"text":"doe",
"completion": {
"field" : "test_suggest"
}
}
}'
Even searching for parts of last or first names work fine:
curl -XPOST localhost:9200/test_index/_suggest -d '{
"test_suggest":{
"text":"sm",
"completion": {
"field" : "test_suggest"
}
}
}'
However, when I try to search for something that includes part or all of the second word (the last name), I get no suggestions; none of the calls below work:
curl -XPOST localhost:9200/test_index/_suggest -d '{
"test_suggest":{
"text":"john d",
"completion": {
"field" : "test_suggest"
}
}
}'
curl -XPOST localhost:9200/test_index/_suggest -d '{
"test_suggest":{
"text":"john doe",
"completion": {
"field" : "test_suggest"
}
}
}'
curl -XPOST localhost:9200/test_index/_suggest -d '{
"test_suggest":{
"text":"john smith",
"completion": {
"field" : "test_suggest"
}
}
}'
I wonder how I can achieve this without having to put the whole name into a single input string, since I want the completion to match on first and/or last names.
You should do this:
curl -X PUT 'localhost:9200/test_index/item/1?refresh=true' -d '{
"test_suggest" : {
"input": [ "John", "Smith", "John Smith" ],
"output": "John Smith",
"weight" : 34
}
}'
i.e. add all the term combinations you want to match into the input array.
I faced the same problem, and then I used something like this:
curl -XPOST localhost:9200/test_index/_suggest -d '{
"test_suggest":{
"text":["john", "smith"],
"completion": {
"field" : "test_suggest"
}
}
}'

Elastic Search parent with same type

Sorry if this is a duplicate (I did try searching) or if this is a silly question; I'm new to posting questions.
I am trying to do parent/child relations and queries in Elasticsearch with the following:
#!/bin/bash
curl -XDELETE 'http://localhost:9200/test/'
echo
curl -XPUT 'http://localhost:9200/test/' -d '{
"settings" : {
"index" : {
"number_of_shards" : 1
}
}
}'
echo
curl -XPUT localhost:9200/test/_mapping/nelement -d '{
"nelement" : {
"_id" : { "path" : "nid", "store" : true, "index" : "not_analyzed"},
"_parent" : { "type" : "nelement"},
"properties" : {
"name" : { "type" : "string", "index" : "not_analyzed" },
"nid": { "type" : "string", "copy_to" : "_id" }
}
}
}'
echo
curl -s -XPOST localhost:9200/_bulk --data-binary @test_data.json
test_data.json is as follows:
{"index":{"_index":"test","_type":"nelement","_parent":"abc"}}
{"nid":"1a","name":"parent1"}
{"index":{"_index":"test","_type":"nelement","_parent":"1a"}}
{"nid":"2b","name":"child1"}
{"index":{"_index":"test","_type":"nelement","_parent":"2b"}}
{"nid":"2c","name":"child2"}
curl -XGET 'localhost:9200/test/nelement/_search?pretty=true' -d '{
"query": {
"has_child": {
"child_type": "nelement",
"query": {
"match": {
"nid": "2c"
}
}
}
}
}'
echo
echo
curl -XGET 'localhost:9200/test/nelement/_search?pretty=true' -d '{
"query": {
"has_parent": {
"type": "nelement",
"query": {
"term": {
"nid": "2b"
}
}
}
}
}'
For some reason, my search queries get no results. I have confirmed that the objects are indexed....
That is because you are using a self-referential parent/child relationship (the parent and the child are the same type in the same index).
For now, Elasticsearch does not support this.
See the open issue: Explore parent/child self referential support
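For reference, parent/child does work when the parent and child are distinct types. A minimal sketch using hypothetical index and type names, with the same 1.x/2.x mapping style as the question:
curl -XPUT 'localhost:9200/test_pc' -d '{
"mappings" : {
"nparent" : {
"properties" : { "name" : { "type" : "string", "index" : "not_analyzed" } }
},
"nchild" : {
"_parent" : { "type" : "nparent" },
"properties" : { "name" : { "type" : "string", "index" : "not_analyzed" } }
}
}
}'
A has_child query against the parent type then looks like:
curl -XGET 'localhost:9200/test_pc/nparent/_search?pretty=true' -d '{
"query" : {
"has_child" : {
"type" : "nchild",
"query" : { "term" : { "name" : "child1" } }
}
}
}'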

How to match on prefix in Elasticsearch

Let's say that in my Elasticsearch index I have a field called "dots" which will contain a string of punctuation-separated words (e.g. "first.second.third").
I need to search for e.g. "first.second" and then get all entries whose "dots" field contains a string being exactly "first.second" or starting with "first.second.".
I have a problem understanding how the text querying works; at least I have not been able to create a query which does the job.
Elasticsearch has the Path Hierarchy Tokenizer, which was created exactly for this use case. Here is an example of how to set it up for your index (in older ES versions the analyzer setting below was called index_analyzer):
# Create a new index with custom path_hierarchy analyzer
# See http://www.elasticsearch.org/guide/reference/index-modules/analysis/pathhierarchy-tokenizer.html
curl -XPUT "localhost:9200/prefix-test" -d '{
"settings": {
"analysis": {
"analyzer": {
"prefix-test-analyzer": {
"type": "custom",
"tokenizer": "prefix-test-tokenizer"
}
},
"tokenizer": {
"prefix-test-tokenizer": {
"type": "path_hierarchy",
"delimiter": "."
}
}
}
},
"mappings": {
"doc": {
"properties": {
"dots": {
"type": "string",
"analyzer": "prefix-test-analyzer",
"search_analyzer": "keyword"
}
}
}
}
}'
echo
# Put some test data
curl -XPUT "localhost:9200/prefix-test/doc/1" -d '{"dots": "first.second.third"}'
curl -XPUT "localhost:9200/prefix-test/doc/2" -d '{"dots": "first.second.foo-bar"}'
curl -XPUT "localhost:9200/prefix-test/doc/3" -d '{"dots": "first.baz.something"}'
curl -XPOST "localhost:9200/prefix-test/_refresh"
echo
# Test searches.
curl -XPOST "localhost:9200/prefix-test/doc/_search?pretty=true" -d '{
"query": {
"term": {
"dots": "first"
}
}
}'
echo
curl -XPOST "localhost:9200/prefix-test/doc/_search?pretty=true" -d '{
"query": {
"term": {
"dots": "first.second"
}
}
}'
echo
curl -XPOST "localhost:9200/prefix-test/doc/_search?pretty=true" -d '{
"query": {
"term": {
"dots": "first.second.foo-bar"
}
}
}'
echo
curl -XPOST "localhost:9200/prefix-test/doc/_search?pretty=true&q=dots:first.second"
echo
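To see why this works, you can ask Elasticsearch what the analyzer produces at index time (using the classic query-parameter form of the _analyze API):
curl -XGET "localhost:9200/prefix-test/_analyze?analyzer=prefix-test-analyzer&text=first.second.third&pretty"
It should return the tokens first, first.second and first.second.third, which is why the keyword-analyzed term query for first.second matches documents 1 and 2 but not document 3.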
There is also a much easier way, as pointed out in the Elasticsearch documentation:
just use:
{
"text_phrase_prefix" : {
"fieldname" : "yourprefix"
}
}
or since 0.19.9:
{
"match_phrase_prefix" : {
"fieldname" : "yourprefix"
}
}
instead of:
{
"prefix" : {
"fieldname" : "yourprefix"
}
}
Have a look at prefix queries.
$ curl -XGET 'http://localhost:9200/index/type/_search' -d '{
"query" : {
"prefix" : { "dots" : "first.second" }
}
}'
You could use wildcard characters in your query, something like this:
$ curl -XGET 'http://localhost:9200/myapp/_search' -d '{
"query" : {
"query_string" : { "query" : "dots:first.second*" }
}
}'
More examples of the syntax at: http://lucene.apache.org/core/old_versioned_docs/versions/2_9_1/queryparsersyntax.html
I was looking for a similar solution, but one matching only a prefix. I found that #imtov's answer got me almost there, but with one change: switching the analyzers around:
"mappings": {
"doc": {
"properties": {
"dots": {
"type": "string",
"analyzer": "keyword",
"search_analyzer": "prefix-test-analyzer"
}
}
}
}
instead of
"mappings": {
"doc": {
"properties": {
"dots": {
"type": "string",
"analyzer": "prefix-test-analyzer",
"search_analyzer": "keyword"
}
}
}
}
This way, adding:
'{"dots": "first.second"}'
'{"dots": "first.third"}'
will index only these full values as single tokens, without storing the shorter prefix tokens (e.g. first on its own).
Yet searching for either
first.second.anyotherstring
first.second
will correctly return only the first entry:
'{"dots": "first.second"}'
Not exactly what you asked for but somewhat related, so I thought it could help someone.
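Assuming the reversed-analyzer mapping and the two documents above, that behaviour can be checked with a query like this: the search analyzer expands the query text into its dot-prefixes, while the index contains only the whole values as single keyword tokens, so only the first.second document matches.
curl -XPOST "localhost:9200/prefix-test/doc/_search?pretty=true" -d '{
"query": {
"match": {
"dots": "first.second.anyotherstring"
}
}
}'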
