How to exclude certain fields from the _source field in ElasticSearch - elasticsearch

I have following simple snippet:
PUT /lib36
{
"mappings": {
"_source": {"enabled": false},
"properties": {
"name": {
"type": "text"
},
"description":{
"type": "text"
}
}
}
}
PUT /lib36/_doc/1
{
"name":"abc",
"description":"xyz"
}
POST /lib36/_search
{
"query": {
"match": {
"name": "abc"
}
}
}
With "_source": {"enabled": false}, the queried result doesn't include _source field.
I would to know how to write the query that the query result has the _source field ,but only contain name field, but not description field.
Thanks!

You can use the source filtering of elasticsearch, but for that first you need to have _source enabled in your mapping.
You need to have below key-value in your search query JSON. below search will exclude all other fields apart from name.
{
"query": {
"match": {
"name": "abc"
}
},
"_source": [
"name"
],
}

Related

Make query in elasticsearch

I'm learning elastic for a new project.
I have a index with format:
{
"content_id": "bbbbbb",
"title": "content 2",
"data": [
{
"id": "xx",
"value": 3,
"tags": ["a","b","c"]
},
{
"id": "yy",
"value": 1,
"tags": ["e","d","c"]
}
]
}
How can i make a query to search contents that have at least one element in data that include tags "a" and "b" ?
thanks so much !!
How can i make query or re design my format data to easy make new query ?
Working with lists and nested documents in Elasticsearch is a bit tricky.
When creating the index, you need to specify the mapping for the nested documents. You can do this by adding a mapping for the data field.
PUT /my_index
{
"mappings": {
"doc": {
"properties": {
"data": {
"type": "nested"
}
}
}
}
}
If you have already created the index, you should delete it first, then create it again with the mapping.
Before deleting the index, you can reindex (copy) the data to a new index.
Now you can query the data field using the nested query:
GET /my_index/_search
{
"query": {
"nested": {
"path": "data",
"query": {
"terms": {
"data.tags": [
"j",
"b",
"c"
]
}
}
}
}
}
If the requirement is to get the document if at least one object in data contains a and b both, then you need to specify the nested mapping as suggested by #ChamRun.
But for getting the results having a and b both, you need to use the below query :
{
"query": {
"nested": {
"path": "data",
"query": {
"bool": {
"must": [
{
"term": {
"data.tags": "a"
}
},
{
"term": {
"data.tags": "b"
}
}
]
}
}
}
}
}

Elasticsearch - Mapping fields from other indices

How can I define mapping in Elasticsearch 7 to index a document with a field value from another index? For example, if I have a users index which has a mapping for name, email and account_number but the account_number value is actually in another index called accounts in field number.
I've tried something like this without much success (I only see "name", "email" and "account_id" in the results):
PUT users/_mapping
{
"properties": {
"name": {
"type": "text"
},
"email": {
"type": "text"
},
"account_id": {
"type": "integer"
},
"accounts": {
"properties": {
"number": {
"type": "text"
}
}
}
}
}
The accounts index has the following mapping:
{
"properties": {
"name": {
"type": "text"
},
"number": {
"type": "text"
}
}
}
As I understand it, you want to implement field joining as is usually done in relational databases. In elasticsearch, this is possible only if the documents are in the same index. (Link to doc). But it seems to me that in your case you need to work differently, I think your Account object needs to be nested for User.
PUT /users/_mapping
{
"mappings": {
"properties": {
"account": {
"type": "nested"
}
}
}
}
You can further search as if it were a separate document.
GET /users/_search
{
"query": {
"nested": {
"path": "account",
"query": {
"bool": {
"must": [
{ "match": { "account.number": 1 } }
]
}
}
}
}
}

elastic search copy_to field not filled

I'm trying to copy a main title field in Elastic Search 5.6, to an other field with: index:false, so I can use this field to match the exact value.
However. After the reindex, and performed search with _source:["exact_hoofdtitel"], the field "exact_hoofdtitel" is not filled with the value of "hoofdtitel".
PUT producten_prd_5_test
{
"aliases": {},
"mappings": {
"boek": {
"properties": {
"hoofdtitel": {
"type": "text",
"copy_to": [
"suggest-hoofdtitel", "exact_hoofdtitel"
]
},
"suggest-hoofdtitel": {
"type": "completion",
"analyzer": "simple",
"preserve_separators": false,
"preserve_position_increments": true,
"max_input_length": 50
},
"exact_hoofdtitel":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword",
"index":false
}
}
},
}
}
},
"settings": {
"number_of_shards": "1",
"number_of_replicas": "0"
}
}
GET producten_prd_5_test/_search
{
"_source":["hoofdtitel","exact_hoofdtitel"]
}
hits": [
{
"_index": "producten_prd_5_test",
"_type": "boek",
"_id": "9781138340671",
"_score": 1,
"_source": {
"hoofdtitel": "The Nature of the Firm in the Oil Industry"
}
},
I believe that you can achieve what you want without copy_to. Let me show you how and why you don't need it here.
How can I make both full-text and exact match queries on the same field?
This can be done with fields mapping attribute. Basically, with the following piece of mapping:
PUT producten_prd_5_test_new
{
"aliases": {},
"mappings": {
"boek": {
"properties": {
"hoofdtitel": {
"type": "text", <== analysing for full text search
"fields": {
"keyword": {
"type": "keyword" <== analysing for exact match
},
"suggest": {
"type": "completion", <== analysing for suggest
"analyzer": "simple",
"preserve_separators": false,
"preserve_position_increments": true,
"max_input_length": 50
}
}
}
}
}
}
}
you will be telling Elasticsearch to index the same field three times: one for full-text search, one for exact match and one for suggest.
The exact search will be possible to do via a term query like this:
GET producten_prd_5_test_new/_search
{
"query": {
"term": {
"hoofdtitel.keyword": "The Nature of the Firm in the Oil Industry"
}
}
}
Why the field exact_hoofdtitel does not appear in the returned document?
Because copy_to does not change the source:
The original _source field will not be modified to show the copied
values.
It works like _all field, allowing you to concat values of multiple fields in one imaginary field and analyse it in a special way.
Does it make sense to do a copy_to to an index: false field?
With index: false the field will not be analyzed and will not be searchable (like in your example, the field exact_hoofdtitel.keyword).
It may still make sense to do so if you want to do keyword aggregations on that field:
GET producten_prd_5_test/_search
{
"aggs": {
"by copy to": {
"terms": {
"field": "exact_hoofdtitel.keyword"
}
}
}
}
This will return something like:
{
"aggregations": {
"by copy to": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "The Nature of the Firm in the Oil Industry",
"doc_count": 1
}
]
}
}
}

Elasticsearch 5.X Percolate: How to autogenerate copy_to fields?

In ES 2.3.3, many queries in the system I'm working on use the _all field. Sometimes these are registered to a percolate index, and when running percolator on the doc, _all is generated automatically.
In converting to ES 5.X _all is being deprecated and so _all has been replaced with a copy_to field that contains the components that we actually care about, and it works great for those searches.
Registering the same query to a percolate index with the same document mapping including copy_to fields works fine. Sending a percolate query with the document never results in a hit for a copy_to field however.
Manually building the copy_to field via simple string concatenation seems to work, it's just that I'd expect to be able to Query -> DocIndex and get the same result as Doc -> PercolateQuery... So I'm just looking for a way to have ES generate the copy_to fields automatically on a document being percolated.
Ended up there was nothing wrong with ES of course, posting here in case it helps someone else. Figured it out while attempting to generate a simpler example to post here with details... Basically the issue came down to the fact that attempting to percolate a document of a type that doesn't exist in the percolate index doesn't give any errors back, but seems to apply all percolate queries without applying any mappings which was just confusing as it worked for simple test cases, but not complex ones. Here's an example:
From the copy_to docs, generate an index with a copy_to mapping. See that a query to the copy_to field works.
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"first_name": {
"type": "text",
"copy_to": "full_name"
},
"last_name": {
"type": "text",
"copy_to": "full_name"
},
"full_name": {
"type": "text"
}
}
}
}
}
PUT my_index/my_type/1
{
"first_name": "John",
"last_name": "Smith"
}
GET my_index/_search
{
"query": {
"match": {
"full_name": {
"query": "John Smith",
"operator": "and"
}
}
}
}
Create a percolate index with the same type
PUT /my_percolate_index
{
"mappings": {
"my_type": {
"properties": {
"first_name": {
"type": "text",
"copy_to": "full_name"
},
"last_name": {
"type": "text",
"copy_to": "full_name"
},
"full_name": {
"type": "text"
}
}
},
"queries": {
"properties": {
"query": {
"type": "percolator"
}
}
}
}
}
Create a percolate query that matches our other percolate query on the copy_to field, and a second query that just queries on a basic unmodified field
PUT /my_percolate_index/queries/1?refresh
{
"query": {
"match": {
"full_name": {
"query": "John Smith",
"operator": "and"
}
}
}
}
PUT /my_percolate_index/queries/2?refresh
{
"query": {
"match": {
"first_name": {
"query": "John"
}
}
}
}
Search, but with the wrong type... there will be a hit on the basic field (first_name: John) even though no document mappings match the request
GET /my_percolate_index/_search
{
"query" : {
"percolate" : {
"field" : "query",
"document_type" : "non_type",
"document" : {
"first_name": "John",
"last_name": "Smith"
}
}
}
}
{"took":7,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":1,"max_score":0.2876821,"hits":[{"_index":"my_percolate_index","_type":"queries","_id":"2","_score":0.2876821,"_source":{
"query": {
"match": {
"first_name": {
"query": "John"
}
}
}
}}]}}
Send in the correct document_type and see both matches as expected
GET /my_percolate_index/_search
{
"query" : {
"percolate" : {
"field" : "query",
"document_type" : "my_type",
"document" : {
"first_name": "John",
"last_name": "Smith"
}
}
}
}
{"took":7,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":2,"max_score":0.51623213,"hits":[{"_index":"my_percolate_index","_type":"queries","_id":"1","_score":0.51623213,"_source":{
"query": {
"match": {
"full_name": {
"query": "John Smith",
"operator": "and"
}
}
}
}},{"_index":"my_percolate_index","_type":"queries","_id":"2","_score":0.2876821,"_source":{
"query": {
"match": {
"first_name": {
"query": "John"
}
}
}
}}]}}

Elasticsearch Multi-Field With 'Raw' Value Not Being Created

I'm attempting to add an un-analyzed version of an analyzed field, as a 'raw' multi-field, as per the ElasticSearch documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/2.4/multi-fields.html
This seems to be a common, well-supported pattern.
I've created the following index / field :
{
"person": {
"aliases": {},
"mappings": {
"employee": {
"properties": {
"userName": {
"type": "string",
"analyzer": "autocomplete",
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
If I query the index directly, i.e. GET /person, I see the mapping as I've posted above, so I'm confident that there wasn't a syntax error, etc.
However, when we're pushing data into the index, a userName.raw field is not being created.
{
"_index": "person",
"_type": "employee",
"_id": "2",
"_version": 1,
"found": true,
"_source": {
"username": "Test Value"
}
}
Anyone see something I'm missing?
Thanks!
EDIT:
This was a novice mistake when creating my index.
PUT person
{
"person": {
"aliases": {},
"mappings": {
"employee": {
"properties": {
"email": {
Notice the person key is being PUT in the 'person' index. This was creating a nested person.
Correct syntax is to remove the extra "person"
PUT person
{
"aliases": {},
"mappings": {
"employee": {
"properties": {
"email": {
Please see Linoy.M.K's answer, as he is correct.
The 'raw' field will not appear when retrieving a record by ID. Its only useful as part of a query.
Adding multiple analyzers will not modify your source document means your source document will always have username only not username.raw
Added analyzers are useful when you do searching, means you can now search with username and username.raw to achieve different behavior like below.
GET /person/employee/_search
{
"query": {
"match": {
"username": "Te"
}
}
}
GET /person/employee/_search
{
"query": {
"match": {
"username.raw": "Test Value"
}
}
}

Resources