Can anyone help me - how to use arrays in opensearch? - elasticsearch

I put an object with some field and i wanna figure out how to mapping the index to handle and show the values like elasticsearch. I dunno why opensearch separate to individual fields the values. Both app has the same index mappings but the display is different for something.
I tried to map the object type set to nested but nothing changes
PUT test
{
"mappings": {
"properties": {
"szemelyek": {
"type": "nested",
"properties": {
"szam": {
"type": "integer"
},
"nev": {
"type": "text"
}
}
}
}
}
}

Related

Elasticsearch custom mapping definition

I have to upload data to elk in the following format:
{
"location":{
"timestamp":1522751098000,
"resources":[
{
"resource":{
"name":"Node1"
},
"probability":0.1
},
{
"resource":{
"name":"Node2"
},
"probability":0.01
}]
}
}
I'm trying to define a mapping this kind of data and I produced he following mapping:
{
"mappings": {
"doc": {
"properties": {
"location": {
"properties" : {
"timestamp": {"type": "date"},
"resources": []
}
}
}
}
}
I have 2 questions:
how can I define the "resources" array in my mapping?
is it possible to define a custom type (e.g. resource) and use this type in my mapping (e.g "resources": [{type:resource}]) ?
There is a lot of things to know about the Elasticsearch mapping. I really highly suggest to read through at least some of their documentation.
Short answers first, in case you don't care:
Elasticsearch automatically allows storing one or multiple values of defined objects, there is no need to specify an array. See Marker 1 or refer to their documentation on array types.
I don't think there is. Since Elasticsearch 6 only 1 type per index is allowed. Nested objects is probably the closest, but you define them in the same file. Nested objects are stored in a separate index (internally).
Long answer and some thoughts
Take a look at the following mapping:
"mappings": {
"doc": {
"properties": {
"location": {
"properties": {
"timestamp": {
"type": "date"
},
"resources": { [1]
"type": "nested", [2]
"properties": {
"resource": {
"properties": {
"name": { [3]
"type": "text"
}
}
},
"probability": {
"type": "float"
}
}
}
}
}
}
}
}
This is how your mapping could look like. It can be done differently, but I think it makes sense this way - maybe except marker 3. I'll come to these right now:
Marker 1: If you define a field, you usually give it a type. I defined resources as a nested type, but your timestamp is of type date. Elasticsearch automatically allows storing one or multiple values of these objects. timestamp could actually also contain an array of dates, there is no need to specify an array.
Marker 2: I defined resources as a nested type, but it could also be an object like resource a little below (where no type is given). Read about nested objects here. In the end I don't know what your queries would look like, so not sure if you really need the nested type.
Marker 3: I want to address two things here. First, I want to mention again that resource is defined as a normal object with property name. You could do that for resources as well.
Second thing is more a thought-provoking impulse: Don't take it too seriously if something absolutely doesn't fit your case. Just take it as an opinion.
This mapping structure looks very inspired by a relational database approach. I think you usually want to define document structures for elasticsearch more for the expected searches. Redundancy is not a problem, but nested objects can make your queries complicated. I think I would omit the whole resources part and do it something like this:
"mappings": {
"doc": {
"properties": {
"location": {
"properties": {
"timestamp": {
"type": "date"
},
"resource": {
"properties": {
"resourceName": {
"type": "text"
}
"resourceProbability": {
"type": "float"
}
}
}
}
}
}
}
}
Because as I said, in this case resource can contain an array of objects, each having a resourceName and a resourceProbability.

ElasticSearch Indexing, Adding Fields

I would like to use elastic search to index the JSON schema provided below
{
"data": "etc",
"metadata": {
"foo":"bar",
"baz": "etc"
}
}
However the metadata can vary and I do not know all the fields that could be present. Is there a way to tell elastic search that if it sees a value in the metadata object to index it in a certain way? (I do know that all the values would be strings)
Thanks
Yes, you can do that using dynamic templates, basically like this:
PUT my_index
{
"mappings": {
"_doc": {
"dynamic_templates": [
{
"full_name": {
"path_match": "metadata.*",
"mapping": {
"type": "text" <---- add your desired mapping here
}
}
}
]
}
}
}

Create new Index Mapping error

When I create an index with mapping like this one, what does it mean the _template/ word? what does the _ mean? I ask your help to understand more about creating an index, are they stored in a kind of folder, like template/packets folder?
PUT _template/packets
{
"template": "packets-*",
"mappings": {
"pcap_file": {
"dynamic": "false",
"properties": {
"timestamp": {
"type": "date"
},
"layers": {
"properties": {
"frame": {
"properties": {
"frame_frame_len": {
"type": "long"
},
"frame_frame_protocols": {
"type": "keyword"
}
}
},
"ip": {
"properties": {
"ip_ip_src": {
"type": "ip"
},
"ip_ip_dst": {
"type": "ip"
}
}
},
"udp": {
"properties": {
"udp_udp_srcport": {
"type": "integer"
},
"udp_udp_dstport": {
"type": "integer"
}
}
}
}
}
}
}
}
}
I ask this because after typing this, I recieve he following error
! Deprecation: Deprecated field [template] used, replaced by [index_patterns]
{
"acknowledged": true
}
I copied the pattern from this link:
https://www.elastic.co/blog/analyzing-network-packets-with-wireshark-elasticsearch-and-kibana
And I'm trying to do exactly what is taught in the link, and I already can capture files with tshark and parse copy them into a packets.json file, and I will use filebeat to transfer the data to Elasticsearch, I already uploaded some data to Elasticsearch, but it wasn't indexed correctly, I just saw a lot of information with a lot of data.
My aim is to inderstand exactly how to create a new index pattern, and also how to relate what I upload to that index.
Thank you very much.
Just replace word template with index_patterns:
PUT _template/packets
{
"index_patterns": ["packets-*"],
"mappings": {
...
Index templates allow you to define templates that will automatically be applied when new indices are created.
After version 5.6 the format of Elasticsearch index templates has changed; the template field, which was used to specify one or more patterns for matching index names that would use the template at create time, was deprecated and superseded by the more appropriately named field index_patterns which works exactly the same way.
To solve the issue and get rid of the deprecation warnings you will have to update all your pre-6.0 index templates, changing the template to index_patterns.
You can list all your index templates by running this command:
curl -XGET 'http://localhost:9200/_template/*?pretty'
Or replace the asterisk with the name of one specific index template.
More about ES templates is here.

How to map dynamic field value in elasticsearch?

I'm mapping a couchbase gateway document and I'd like to tell elasticsearch to avoid indexing the internal attributes added by the gateway like the "_sync", this object contains another object named "channels" which has the following form:
"channels": {
"i7de5558-32ad-48ca-bf91-858c3a1e4588": 12
}
So I guess the mapping of this object would be like:
"channels": {
"type": "object",
"properties": {
"i7de5558-32ad-48ca-bf91-858c3a1e4588": {
"type": "integer",
"index": "not_analyze"
}
}
}
The problem is that the keys are always changing, so I don't know if I should use a wildcard like this "*": {"type": "integer", "index": "not_analyze"} for this property or do something else.
Any advice please?
If the fields are of integer types, you don't have to provide them explicitly in the mapping. You can create an empty mapping ,index documents with these fields. Elasticsearch will infer the type of field and update the mapping dynamically. You can also use dynamic templates for this.
{
"mappings": {
"my_type": {
"dynamic_templates": [
{
"analysed_string_template": {
"path_match": "channels.*",
"mapping": {
"type": "integer"
}
}
}
]
}
}
}
There`s a dynamic way to do that as you need, is called dynamic template
Using templates you are able to create rules like this:
PUT /my_index
{
"mappings": {
"my_type": {
"date_detection": false
}
}
}
In your case you could create a template to set all news fields inside the channel object as not_analyzed.
Hope it will help

ElasticSearch - issue with sub term aggregation with array fields

I have the two following documents:
{
"title":"The Avengers",
"year":2012,
"casting":[
{
"name":"Robert Downey Jr.",
"category":"Actor",
},
{
"name":"Chris Evans",
"category":"Actor",
}
]
}
and:
{
"title":"The Judge",
"year":2014,
"casting":[
{
"name":"Robert Downey Jr.",
"category":"Producer",
},
{
"name":"Robert Duvall",
"category":"Actor",
}
]
}
I would like to perform aggregations, based on two fields : casting.name and casting.category.
I tried with a TermsAggregation based on casting.name field, with a subaggregation, which is another TermsAggregation based on the casting.category field.
The problem is that for the "Chris Evans" entry, ElasticSearch set buckets for ALL categories (Actor, Producer) whereas it should set only 1 bucket (Actor).
It seems that there is a cartesian product between all casting.category occurences and all casting.name occurences.
It behaves like this with array fields (casting), whereas I don't have the problem with simple fields (as title, or year).
I also tried to use nested aggregations, but maybe not properly, and ElasticSearch throws an error telling that casting.category is not a nested field.
Any idea here?
Elasticsearch will flatten the nested objects, so internally you will get:
{
"title":"The Judge",
"year":2014,
"casting.name": ["Robert Downey Jr.","Robert Duvall"],
"casting.category": ["Producer", "Actor"]
}
if you want to keep the relationship you'll need to use either nested objects or a parent child relationship
To do a nested mapping you'd need to do something like this:
"mappings": {
"movies": {
"properties": {
"title" : { "type": "string" },
"year" : { "type": "integer" },
"casting": {
"type": "nested",
"properties": {
"name": { "type": "string" },
"category": { "type": "string" }
}
}
}
}
}

Resources