I have an index that contains documents of different types (not talking about _type here) and each document has a field document_type that states their type. Is it possible to define mappings for each type of document within this index?
Is it possible to define mappings for each type of document within this index?
No, if you think of using the same field name with different types. For instance, field name id of type string and integer won't work.
Having different document_type basically indicates different domains. What you could do is to group information under each respective domain or type. For instance, an employee and project, both have an id and name, but different types in this example. Some call that nesting.
An example index mapping:
PUT example
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
},
"mappings": {
"doc": {
"properties": {
"employee": {
"properties": {
"id": {
"type": "integer"
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 64
}
}
}
}
},
"project": {
"properties": {
"id": {
"type": "keyword"
},
"name": {
"type": "keyword",
"ignore_above": 32
}
}
}
}
}
}
}
If you write the information, with different types.
PUT example/doc/1
{
"employee": {
"id": 4711,
"name": "John Doe"
},
"project": {
"id": "Project X",
"name": "Firebrand"
}
}
Others would argue to store employee and project in separate indices. This approach depends on your scenario and is also desirable. You allow both domains to evolve separately from each other.
Having a separate employee and project index gives you an advantage regarding maintenance. For querying some would argue, that you can group than with an alias. In the above example, it doesn't make sense since the field types are different. A search for the name over an analysed text field is different than over a keyword. Querying makes sense if you have the same field type.
No, if you want to use a single index, you would need to define a single mapping that combines the fields of each document type.
A better way might be to define separate indices on the same cluster for each document type. You can then create a single index alias that aliases to both of those indices if you want to be able to query across document types. Be sure that all fields that exist in both documents have the same data type in both mappings.
Having a single field name with more than one mapping type in the same index is not possible. Two options I can think of:
1. Separate the different doc types to separate indices.
2. Use different fields names for different doc types, so that each name can have different mapping. You can also use nesting, like: type_a.my_field and type_b.my_field, both in the same index.
Related
I have a simple mapping in elasticsearch-6, like this.
{
"mappings": {
"_doc": {
"properties": {
"#timestamp": {
"type": "date"
},
"fields": {
"properties": {
"meta": {
"properties": {
"task": {
"properties": {
"field1": {
"type": "keyword"
},
"field2": {
"type": "keyword"
}
}
}
}
}
}
}
}
}
}
}
Now I have to add another property to it - tasks which is just an array of the task property already defined.
Is there a way to reference the properties of task so that I don't have to duplicate all the properties? Something like:
{
"fields": {
"properties": {
"meta": {
"properties": {
"tasks": {
"type": "nested",
"properties": "fields.properties.meta.properties.task"
},
"task": {
...
}
}
}
}
}
}
you can already use your task field as an array of task objects, only, you cannot query them independently. If your goal is to achieve this (as I assume from your second example), I would directly set the "nested" data type into the mapping of the task field - then, yes, you'll need to reindex.
I can't imagine a use case where you would need the same array of objects duplicated in two fields, with one nested and the other not.
EDIT
Below, some considerations/suggestions based on the discussion in the comments:
One field can have either one value or an array of values. In your case, your task field can have either one task object or an array of task objects. You should only care about setting the "nested" datatype for task, if you plan to query its objects independently (of course, if they are more than one)
I would suggest to design your documents in such a way to avoid duplicated information in the first place. Duplicated information will make your documents bigger and more complex to process, leading to greater storage requirements and slower queries
If it's not possible to redesign your document mapping, you might check whether alias datatypes can help you avoiding some repetitions.
If it's not possible to redesign your document mapping, you might check whether dynamic templates can help you avoiding some repetitions
I want to enable both numberic and full-text search for a field.
I need the two ways to search the field for different scenarios.
How can I index the field?
You can always make use of fields for such use cases. Lets say the field name is field1. Below is how you can define it for indexing it in different ways:
"field1": {
"type": "integer",
"fields": {
"textval": {
"type": "text"
},
"keyword": {
"type": "keyword"
}
}
}
Refer this for understanding more on fields.
How can I use, create two index or what?
I have one entity goods and one entity shop, should I create two index or two type in elastic search 6?
I have tried two mapping two type but it throw Exception.
How Can I do ?
In elacticsearch 6, you cannot create more than one doc type for an index. Earlier for an index company you could have doc type employee, infra, 'building' etc but now you it will throw an error.
In future versions doc type will be completely removed, so you will only have to deal with index.
An index in the elasticsearch is like table in normal database. And every document that you store will be row, and fields of that document will be columns.
Without seeing the data and knowing what you want to accomplish it is pretty hard to suggest how you should plan the schema of elasticsearch, but these information can help you decide.
you can use one of these two options:
1)Index per document type
2)Custom type field
for option 2:
PUT twitter
{
"mappings": {
"goods": {
"properties": {
"field1": { "type": "text" },
"field2": { "type": "keyword" },
}
},
"shop": {
"properties": {
"field1": { "type": "text" },
"field2": { "type": "date" }
}
}
}
}
see this
I'm reading about mapping in elasticsearch and I see these 2 terms: Nested-field & Depth. I think these 2 terms are quite equivalent. I'm currently confused by these 2. Please can anyone clear me out? Thank you.
And btw, are there any ways to check a document depth via Kibana?
Sorry for my english.
The source of confusion is probably because in Elasticsearch term nested can be used in two different contexts:
"nested" as a regular JSON notation nested, i.e. JSON object within JSON object;
"nested" as Elasticsearch nested data type.
In the mappings documentation page when they mention "depth" they refer to the first meaning. Here the setting index.mapping.depth.limit defines how deeply nested can your JSON documents be.
How is JSON depth interpreted by Elasticsearch mapping?
Here is an example of JSON document with depth 1:
{
"name": "John",
"age": 30
}
Now with depth 2:
{
"name": "John",
"age": 30,
"cars": {
"car1": "Ford",
"car2": "BMW",
"car3": "Fiat"
}
}
By default (as of ES 6.3) the depth cannot exceed 20.
What is a nested data type and why isn't it the same as a document with depth>1?
nested data type allows to index arrays of objects and query their items individually via nested query. What this means is that Elasticsearch will index a document with such fields differently (see the page Nested Objects of the Definitive Guide for more explanation).
For instance, if in the following example we do not define "user" as nested field in the mapping, a query for user.first: John and user.last: White will return a match and it will be a mistake:
{
"group" : "fans",
"user" : [
{
"first" : "John",
"last" : "Smith"
},
{
"first" : "Alice",
"last" : "White"
}
]
}
If we do, Elasticsearch will index each item of the "user" list as an implicit sub-document and thus will use more resources, more disk and memory. This is why there is also another setting on the mappings: index.mapping.nested_fields.limit regulates how many different nested fields one can declare (which defaults to 50). To customize this you can see this answer.
So, Elasticsearch documents with depth > 1 are not indexed as nested unless you explicitly ask it to do so, and that's the difference.
Can I have nested fields inside nested?
Yes, you can! Just to stop this confusion, yes, you can define a nested field inside nested field in a mapping. It will look something like this:
PUT my_index
{
"mappings": {
"_doc": {
"properties": {
"user": {
"type": "nested",
"properties": {
"name": {
"type": "keyword"
},
"cars": {
"type": "nested",
"properties": {
"brand": {
"type": "keyword"
}
}
}
}
}
}
}
}
}
But keep in mind that the amount of implicit documents to be indexed will be multiplied, and it will be simply not that efficient.
Can I get the depth of my JSON objects from Kibana?
Most likely you can do it with scripts, check this blog post for further details: Using Painless in Kibana scripted fields.
After updating to ElasticSearch 2 I am no more able to map the ContextSuggester for different types:
PUT /test/foo/_mapping
{
"properties": {
"suggest": {
"type": "completion",
"context": {
"type": {
"type": "category",
"path": "_type",
"default": [
"foo"
]
}
}
}
}
}
PUT /test/bar/_mapping
{
"properties": {
"suggest": {
"type": "completion",
"context": {
"type": {
"type": "category",
"path": "_type",
"default": [
"bar"
]
}
}
}
}
}
Putting the map for the second type ends in the following exception:
Mapper for [suggest] conflicts with existing mapping in other types: [mapper [suggest] has different [context_mapping] values]
The problem is that the default value differs for the different types. From my point of view, this should be the expected approach. How can I solve this problem?
Tested version of ES: 2.2.1
You have a field conflict.
Mapping - field conflicts
Mapping types are used to group fields, but the fields in each
mapping type are not independent of each other. Fields with:
the same name
in the same index
in different mapping types
map to the same field internally, and must have the same mapping. If a
title field exists in both the user and blogpost mapping types, the
title fields must have exactly the same mapping in each type. The only
exceptions to this rule are the copy_to, dynamic, enabled,
ignore_above, include_in_all, and properties parameters, which may
have different settings per field.
Either create a separate index or rename the field for the other type.