Elasticsearch parent child: merging two tables together

How do I do something like a JOIN in Elasticsearch? I have 2 indices:
"_index": "blacklist1",
"_type": "logs",
"_id": "KS8CI-XKSID8DKSLAKS"

"_index": "raw_firewall_logs",
"_type": "logs",
"_id": "SADLFSJFOI3098WOIJFD",
Assuming in "raw_firewall_logs" I have a field "source_ip" that contains "123.123.123.1"
Assuming in "blacklist1" I have a field "block_ip" that contains "123.123.123.1" and a field "threat_type" containing "worm"
How should I update my mapping and how should I query so that I can see the following:
{"source_ip": "123.123.123.1", "blacklisted": "Yes", "threat_type": "worm"}
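Elasticsearch has no JOIN across indices. One option, assuming a recent version (7.5 or later), is the enrich processor: define an enrich policy over the blacklist index, execute it, and run the firewall logs through an ingest pipeline that performs the lookup at index time. A sketch (index and field names are taken from the question; the policy and pipeline names are made up):

PUT /_enrich/policy/blacklist-lookup
{
  "match": {
    "indices": "blacklist1",
    "match_field": "block_ip",
    "enrich_fields": ["threat_type"]
  }
}

POST /_enrich/policy/blacklist-lookup/_execute

PUT /_ingest/pipeline/firewall-enrich
{
  "processors": [
    {
      "enrich": {
        "policy_name": "blacklist-lookup",
        "field": "source_ip",
        "target_field": "blacklist_match"
      }
    }
  ]
}

Documents indexed into raw_firewall_logs through this pipeline gain a blacklist_match object containing block_ip and threat_type whenever source_ip matches an entry in blacklist1, so at query time you only ever read one index.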

Related

Nested attribute term Query

I have documents something like below:
{
"_index": "lines",
"_type": "lineitems",
"_id": "4002_11",
"_score": 2.6288738,
"_source": {
"data": {
"type": "Shirt"
}
}
}
I want to get a count based on the type attribute value. Any suggestion on this?
I tried a term query but had no luck with that.
You should use the terms aggregation; it returns the number of documents in a bucket for each value of the "type" field.
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html
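Given the document above, and assuming the default dynamic mapping (which gives data.type a .keyword sub-field), a request like this returns one bucket per type value with its document count:

GET lines/_search
{
  "size": 0,
  "aggs": {
    "types": {
      "terms": { "field": "data.type.keyword" }
    }
  }
}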

Elastic filter with dot (.) in name

I'm pretty new to ELK and seem to start with the complicated questions ;-)
I have elements that look like the following:
{
"_index": "asd01",
"_type": "doc",
"_id": "...",
"_score": 0,
"_source": {
"#version": "1",
"my-key": "hello.world.to.everyone",
"#timestamp": "2018-02-05T13:45:00.000Z",
"msg": "myval1"
}
},
{
"_index": "asd01",
"_type": "doc",
"_id": "...",
"_score": 0,
"_source": {
"#version": "1",
"my-key": "helloworld.from.someone",
"#timestamp": "2018-02-05T13:44:59.000Z",
"msg": "myval2"
}
}
I want to filter for my-key(s) that start with "hello." and ignore elements that start with "helloworld.". The dot seems to be interpreted as a wildcard, and every kind of escaping doesn't seem to work.
Ideally as a filter, since I want to be able to use the same expression in Kibana as well as in the API directly.
Can someone point me to how to get it working with Elasticsearch 6.1.1?
It's not being used as a wildcard; it's just being removed by the default analyzer (the standard analyzer). If you do not specify a mapping, Elasticsearch will create one for you. For string fields it creates a multi-field: the default is a text field (analyzed with the standard analyzer) plus a keyword sub-field. If you do not want this behaviour, you must specify the mapping explicitly during index creation, or update it and reindex the data.
Try using this
GET asd01/_search
{
"query": {
"wildcard": {
"my-key.keyword": {
"value": "hello.*"
}
}
}
}
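If you'd rather not rely on the .keyword sub-field, you can map my-key as a plain keyword field when the index is created. A sketch in ES 6.x syntax, since the question mentions 6.1.1 (the index would have to be recreated and the data reindexed):

PUT asd01
{
  "mappings": {
    "doc": {
      "properties": {
        "my-key": { "type": "keyword" }
      }
    }
  }
}

With that mapping, the wildcard query can target my-key directly instead of my-key.keyword.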

Is it better to have a field with an 'Unknown' value or to not have the field at all?

I'm currently using Elasticsearch 2.4.4. I have records about fruits in my ES. Some records have the field 'color' specified with a value, and the field doesn't exist in the remaining ones. I'm worried that when the size of the ES data increases, this will affect my query performance. Should I repopulate my ES data so that the records which don't have a color field get one with the value "Unknown", or should I leave the field out entirely?
Example of a record with color field
{
"_index": "test_data",
"_type": "test_type",
"_id": "AVqlmVt1DMREQvQmAIpk",
"_source": {
"fruitName":"Apple",
"origin" : "New Zealand",
"weight" : "50gms",
"size" : "3 inches",
"color" : "Red"
}
}
Example of a record without color field
{
"_index": "test_data",
"_type": "test_type",
"_id": "AVqlmVt1DMREQvQmAIpn",
"_source": {
"fruitName":"Banana",
"origin" : "Europe",
"weight" : "500gms",
"size" : "6 inches"
}
}
So my question is: does Elasticsearch perform better when the field exists with a placeholder value than when the field doesn't exist at all?
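For what it's worth, a document that simply lacks a field generally costs nothing for queries that don't touch that field, and documents without the field can still be found with an exists query inside a must_not clause, so a placeholder value is not needed just to make them findable. A sketch against the index from the question:

GET test_data/_search
{
  "query": {
    "bool": {
      "must_not": { "exists": { "field": "color" } }
    }
  }
}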

Kibana 4 index patterns time-field

Is there a way to make Kibana 4 show a timestamp field which is an epoch time as the time field when creating an index pattern?
I know how to make this with the _timestamp field by editing the metaFields in the settings, but I would like this to be a custom field.
Eg: Let's say this is the document I am storing in ES:
{
"_id": "AVCbqgiV7A6BIPyJuJRS",
"_index": "scm-get-config-stg",
"_score": 1.0,
"_source": {
"serverDetails": {
"cloudDC": "xxx",
"cloudName": "yyyy",
"hostName": "hostname",
"ipAddress": "10.247.194.49",
"runOnEnv": "stg",
"serverTimestamp": 1445720623246
}
},
"_type": "telemetry"
}
Now I would like to create an index pattern where the Time-field name should be serverTimestamp.
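Kibana only offers date-mapped fields as candidates for the time field. One approach is to map serverTimestamp as a date with the epoch_millis format, so the epoch value shown above is parsed as a timestamp; a sketch of the relevant mapping fragment (existing data would need to be reindexed for the change to take effect):

PUT scm-get-config-stg
{
  "mappings": {
    "telemetry": {
      "properties": {
        "serverDetails": {
          "properties": {
            "serverTimestamp": { "type": "date", "format": "epoch_millis" }
          }
        }
      }
    }
  }
}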

Is it possible to include '_id' in '_source' in Elasticsearch?

Usually ElasticSearch documents are stored as:
{
"_index": "some_index",
"_type": "some_type",
"_id": "blah_blah",
"_score": null,
"_source": {
"field_a" : "value_a",
"field_b" : "value_b"
........
}
}
Is it possible to include _id in the _source itself while querying the data? e.g.
{
"_index": "some_index",
"_type": "some_type",
"_id": "blah_blah",
"_score": null,
"_source": {
"_id": "blah_blah", // Added in the _source object
"field_a" : "value_a",
"field_b" : "value_b"
........
}
}
Let's assume I do not have control over the data I am writing, so I cannot insert it into the source. Also, I can read the entire object and manually include it, but I'm wondering if there is a way to do so via an ES query.
The _id field is neither indexed nor stored, meaning it doesn't really exist as a regular field.
The _type field is indexed but not stored. _id and _type are both metadata fields in Elasticsearch; they are concatenated together as type#id in the _uid field.
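If you do control the indexing path (the question assumes you don't, so treat this as a workaround), an ingest pipeline with a set processor can copy the document id into the source at index time; this is available from ES 5.0, and the pipeline and field names here are made up:

PUT _ingest/pipeline/copy-id
{
  "processors": [
    { "set": { "field": "id", "value": "{{_id}}" } }
  ]
}

At pure query time there is no supported way to inject _id into _source; reading it from the hit metadata on the client side is the usual approach.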