Is it possible to index only part of an object in Elasticsearch?
Example:
$ curl -XPUT 'http://localhost:9200/test/item/1' -d '
{
"record": {
"city": "London",
"contact": "Some person name"
}
}'
$ curl -XPUT 'http://localhost:9200/test/item/2' -d '
{
"record": {
"city": "London",
"contact": { "phone": "some-phone-number", "name": "Other person\u0027s name" }
}
}'
$ curl -XPUT 'http://localhost:9200/test/item/3' -d '
{
"record": {
"city": "Oslo",
"headquarters": { "phone": "some-other-phone-number",
"address": "some address" }
}
}'
I want only the city name to be searchable; the rest of the object should stay unindexed and completely arbitrary. For example, some fields can change their type from one object to another.
Is it possible to write a mapping that allows such behaviour?
UPDATE
My final solution looks like this:
{
"test": {
"dynamic": "false",
"properties": {
"name": {
"type": "string"
}
}
}
}
I added "dynamic": "false" at the lowest level of my mapping and it works as expected.
You can achieve this by disabling dynamic mapping on the entire type or just on the inner object record:
"mappings": {
"doc": {
"properties": {
"record": {
"type": "object",
"properties": {
"city": {"type": "string"}
},
"dynamic": false
}
}
}
}
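To make the effect of "dynamic": false concrete, here is a minimal Python sketch that models which parts of the three example documents would be searchable under the mapping above. It is only a client-side illustration, not Elasticsearch's own logic: MAPPING and indexed_paths are hypothetical names, and the model assumes dynamic mapping is disabled on every object it walks.

```python
# Mapping from the answer above: only record.city is declared.
MAPPING = {
    "properties": {
        "record": {
            "type": "object",
            "dynamic": False,
            "properties": {"city": {"type": "string"}},
        }
    }
}


def indexed_paths(mapping, doc, prefix=""):
    """Return the dotted paths of doc that the mapping declares explicitly.

    Simplified model (an assumption of this sketch): every object is treated
    as if it had "dynamic": false, so undeclared fields are never indexed.
    """
    paths = []
    props = mapping.get("properties", {})
    for name, value in doc.items():
        field = props.get(name)
        if field is None:
            # Undeclared field: it stays in _source but is not searchable.
            continue
        path = prefix + name
        if "properties" in field and isinstance(value, dict):
            paths.extend(indexed_paths(field, value, path + "."))
        else:
            paths.append(path)
    return paths


doc = {"record": {"city": "Oslo", "headquarters": {"phone": "some-other-phone-number"}}}
print(indexed_paths(MAPPING, doc))
```

Whatever shape contact or headquarters takes, only record.city shows up in the result, which is the behaviour the question asks for.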
According to Elasticsearch's roadmap, mapping types are going to be completely removed in 7.x.
How are we going to give a schema structure to documents without mapping types?
For example, how would we replace this (a doc/mapping type with 3 fields of specific data types):
PUT twitter
{
"mappings": {
"user": {
"properties": {
"name": { "type": "text" },
"user_name": { "type": "keyword" },
"email": { "type": "keyword" }
}
}
}
}
They are going to remove types (user in your example) from the mapping, because there is only one type per index now. The rest will be the same:
PUT twitter
{
"mappings": {
"_doc": {
"properties": {
"name": { "type": "text" },
"user_name": { "type": "keyword" },
"email": { "type": "keyword" }
}
}
}
}
As you can see, there is no user type anymore.
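For what it's worth, the typeless form that 7.x accepts by default goes one step further, as far as I know: the type name level is dropped altogether and "properties" sits directly under "mappings". Below is a small Python sketch of that conversion for an index-creation body. strip_type_name is a hypothetical helper, and the "exactly one key that is not properties" check is only a heuristic for spotting a typed mapping.

```python
import json


def strip_type_name(body):
    """Return a copy of an index-creation body without the mapping type level."""
    mappings = body.get("mappings", {})
    if "properties" in mappings or len(mappings) != 1:
        # Already typeless, empty, or multi-type: leave it untouched.
        return body
    (type_mapping,) = mappings.values()
    if not isinstance(type_mapping, dict):
        return body
    typeless = dict(body)  # shallow copy; the input body is not modified
    typeless["mappings"] = type_mapping
    return typeless


typed = {
    "mappings": {
        "user": {
            "properties": {
                "name": {"type": "text"},
                "user_name": {"type": "keyword"},
                "email": {"type": "keyword"},
            }
        }
    }
}
print(json.dumps(strip_type_name(typed), indent=2))
```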
I created the index with this mapping so that I can find my objects when I search in Kibana:
curl -XPUT 'localhost:9200/exception_index?pretty' -H 'Content-Type: application/json' -d'
{
"mappings": {
"siam_exception": {
"dynamic": "true",
"properties": {
"Title": { "type": "text" },
"Date&Time": { "type": "text" },
"Tags": { "type": "text" },
"Level": { "type": "text" },
"Message": {
"dynamic": "true",
"properties": {
"Root": { "type": "text" },
"ExceptionList": { "type": "text" }
}
}
}
}
}
}
'
And this is my exception.log:
"siam_exception":{
"Title": "exception",
"Date&Time": "2018-01-21 09:52:20.759950",
"Tags": "SiamCore",
"Level": "Fatal",
"Message":[
{
"Root": "( ['MQROOT' : 0x7f0a902b2d80]
(0x01000000:Name ):Properties = ( ['MQPROPERTYPARSER' : 0x7f0a902bffa0]
(0x03000000:NameValue):MessageSet = 'SIAMCOMMONMSG' (CHARACTER)",
"ExceptionList": "(0x03000000:NameValue):ReplyIdentifier = X'000000000000000000000000000000000000000000000000' (BLOB)","
}
]
}
But Kibana finds nothing when I search for *.
Even if I omit ' "siam_exception": ' from the first line of exception.log, nothing changes.
What is my problem?
I'd like to map the following structure:
- I have blog posts
- Blog posts can have comments
- Comments can have replies (which are also comments), so it should be a recursive data structure
POST -----*--> COMMENT
COMMENT -----*---> COMMENT
Here's what I tried:
mappings: {
"comment": {
"properties": {
"content": { "type": "string" },
"replies": { "type": "comment" }
}
},
"post": {
"properties": {
"comments": {
"type": "comment"
}
}
}
}
Of course it's not working. How can I achieve this?
You're trying to declare the types as you would in OO programming, but that's not how it works in ES. You need to use a parent-child relationship like the one below: post doesn't have a field called comments; instead, the comment mapping type has a _parent meta field referencing the post parent type.
Also, in order to model replies, I suggest simply adding another field called in_reply_to, which would contain the id of the comment that the reply relates to. Much easier that way!
PUT blogs
{
"mappings": {
"post": {
"properties": {
"title": { "type": "string"}
}
},
"comment": {
"_parent": {
"type": "post"
},
"properties": {
"id": {
"type": "long"
},
"content": {
"type": "string"
},
"in_reply_to": {
"type": "long"
}
}
}
}
}
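With in_reply_to stored on each comment, the reply tree can be rebuilt on the client after fetching a post's comments. A minimal Python sketch follows. It assumes each comment is a dict with the id, content and in_reply_to fields from the mapping above, that top-level comments have no in_reply_to value, and that replies never form a cycle. build_thread is a hypothetical helper name.

```python
def build_thread(comments):
    """Group flat comment documents into a nested reply tree."""
    # Index comments by the id of the comment they reply to (None = top level).
    children = {}
    for comment in comments:
        children.setdefault(comment.get("in_reply_to"), []).append(comment)

    def attach(parent_id):
        # Recursively collect the replies of parent_id, keeping input order.
        return [
            {
                "id": c["id"],
                "content": c["content"],
                "replies": attach(c["id"]),
            }
            for c in children.get(parent_id, [])
        ]

    return attach(None)


flat = [
    {"id": 1, "content": "First!", "in_reply_to": None},
    {"id": 2, "content": "Reply to first", "in_reply_to": 1},
    {"id": 3, "content": "Reply to the reply", "in_reply_to": 2},
    {"id": 4, "content": "Another top-level comment"},
]
for top in build_thread(flat):
    print(top)
```

Because the recursion happens in application code, the mapping itself stays flat no matter how deep the replies go.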
mappings: {
"post": {
"properties": {
"content": { "type": "string" },
"comment": {
"properties" : {
"content": { "type": "string" },
"replies": {
"properties" : {
"content": { "type": "string" }
}
}
}
}
}
}
}
ElasticSearch has the ability to copy values to other fields (at index time), enabling you to search on multiple fields as if they were one field (Core Types: copy_to).
However, there doesn't seem to be any way to specify the order in which these values should be copied. This could be important when phrase matching:
curl -XDELETE 'http://10.11.12.13:9200/helloworld'
curl -XPUT 'http://10.11.12.13:9200/helloworld'
# copy_to is ordered alphabetically!
curl -XPUT 'http://10.11.12.13:9200/helloworld/_mapping/people' -d '
{
"people": {
"properties": {
"last_name": {
"type": "string",
"copy_to": "full_name"
},
"first_name": {
"type": "string",
"copy_to": "full_name"
},
"state": {
"type": "string"
},
"city": {
"type": "string"
},
"full_name": {
"type": "string"
}
}
}
}
'
curl -X POST "10.11.12.13:9200/helloworld/people/dork" -d '{"first_name": "Jim", "last_name": "Bob", "state": "California", "city": "San Jose"}'
curl -X POST "10.11.12.13:9200/helloworld/people/face" -d '{"first_name": "Bob", "last_name": "Jim", "state": "California", "city": "San Jose"}'
curl "http://10.11.12.13:9200/helloworld/people/_search" -d '
{
"query": {
"match_phrase": {
"full_name": {
"query": "Jim Bob"
}
}
}
}
'
Only "Jim Bob" is returned; it seems that the fields are copied in field-name alphabetical order.
How would I switch the copy_to order such that the "Bob Jim" person would be returned?
You can control this more deterministically by registering a transform script in your mapping.
Something like this:
"transform" : [
{"script": "ctx._source['full_name'] = [ctx._source['first_name'] + ' ' + ctx._source['last_name'], ctx._source['last_name'] + ' ' + ctx._source['first_name']]"}
]
Transform scripts can also be "native", i.e. Java code. Make your custom classes available on the Elasticsearch classpath of every node in the cluster and register them as native scripts with the setting:
script.native.<name>.type=<fully.qualified.class.name>
In your mapping you would then register the native script as a transform like so:
"transform" : [
{
"script" : "<name>",
"params" : {
"param1": "val1",
"param2": "val2"
},
"lang": "native"
}
],
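If registering a script in the mapping is not an option (to my knowledge the mapping transform feature was deprecated in Elasticsearch 2.0), the same full_name value can be computed in the client before the document is sent. This swaps the server-side transform for client-side preprocessing and is not the transform API itself. A minimal Python sketch, where with_full_name is a hypothetical helper and full_name is assumed to be a plain string field without copy_to feeding it:

```python
def with_full_name(doc):
    """Return a copy of doc whose full_name holds both name orders."""
    first, last = doc["first_name"], doc["last_name"]
    enriched = dict(doc)  # leave the caller's document untouched
    # Store both orders so a phrase query can match either one.
    enriched["full_name"] = [f"{first} {last}", f"{last} {first}"]
    return enriched


person = {"first_name": "Bob", "last_name": "Jim", "state": "California", "city": "San Jose"}
print(with_full_name(person)["full_name"])
```

The enriched dict would then be indexed in place of the original document.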
In ElasticSearch, given the following document, is it possible to add items to the "lists" sub-document without passing the parent attributes (i.e. message and tags)?
I have several attributes in the parent document which I don't want to pass every time I add an item to the sub-document.
{
"tweet" : {
"message" : "some arrays in this tweet...",
"tags" : ["elasticsearch", "wow"],
"lists" : [
{
"name" : "prog_list",
"description" : "programming list"
},
{
"name" : "cool_list",
"description" : "cool stuff list"
}
]
}
}
What you are looking for is how to insert a nested document.
In your case, you can use the Update API to append a nested document to your list.
curl -XPOST localhost:9200/index/tweets/1/_update -d '{
"script" : "ctx._source.tweet.lists += new_list",
"params" : {
"new_list" : {"name": "fun_list", "description": "funny list" }
}
}'
To support nested documents, you have to define your mapping, which is described here.
Assuming your type is tweets, the following mapping should work:
curl -XDELETE http://localhost:9200/index
curl -XPUT http://localhost:9200/index -d'
{
"settings": {
"index.number_of_shards": 1,
"index.number_of_replicas": 0
},
"mappings": {
"tweets": {
"properties": {
"tweet": {
"properties": {
"lists": {
"type": "nested",
"properties": {
"name": {
"type": "string"
},
"description": {
"type": "string"
}
}
}
}
}
}
}
}
}'
Then add a first entry:
curl -XPOST http://localhost:9200/index/tweets/1 -d '
{
"tweet": {
"message": "some arrays in this tweet...",
"tags": [
"elasticsearch",
"wow"
],
"lists": [
{
"name": "prog_list",
"description": "programming list"
},
{
"name": "cool_list",
"description": "cool stuff list"
}
]
}
}'
And then add your element with:
curl -XPOST http://localhost:9200/index/tweets/1/_update -d '
{
"script": "ctx._source.tweet.lists += new_list",
"params": {
"new_list": {
"name": "fun_list",
"description": "funny list"
}
}
}'
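The script above uses the older scripting syntax with a top-level "params" object. On newer Elasticsearch versions that use Painless, my understanding is that the script becomes an object with "source", "lang" and "params", and the append is written with add(...) on the list. A minimal Python sketch that only builds such a request body, without sending it anywhere; append_list_body is a hypothetical helper and the exact body shape should be checked against the Update API docs for your version:

```python
import json


def append_list_body(new_list):
    """Build an _update body that appends new_list to tweet.lists."""
    return {
        "script": {
            "lang": "painless",
            # params.new_list refers to the value passed under "params" below.
            "source": "ctx._source.tweet.lists.add(params.new_list)",
            "params": {"new_list": new_list},
        }
    }


body = append_list_body({"name": "fun_list", "description": "funny list"})
print(json.dumps(body, indent=2))
```

Either way, only the new list entry travels over the wire; message and tags are not resent.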