Time-based date field not working when configuring an index pattern - Elasticsearch

Hi!
I have an issue setting a date field as time-based when I configure my index pattern. When I choose my date field as the time field name, I cannot visualize any data in the Discover tab.
However, when I uncheck the box named "Index contains time-based events", all data appears.
Maybe I forgot something in my mapping? Here is the mapping I've set for this index:
"index_test" : {
"mappings": {
"tr": {
"_source": {
"enabled":true
},
"properties" : {
"id" : { "type" : "integer" },
"volume" : { "type" : "integer" },
"high" : { "type" : "float" },
"low" : { "type" : "float" },
"timestamp" : { "type" : "date", "format" : "yyyy-MM-dd HH:mm:ss" }
}
}
}'
}
I am also currently trying to use Timelion, and it does not seem to find any data to show. I think this is because the time-based option is unchecked... Any idea how to set this timestamp as time-based without losing data access in the Discover tab?

Simple question with a simple answer... I just forgot to set the time picker in the top right of the Discover tab to show past data.
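For anyone hitting the same thing: a range query on the time field is a quick way to confirm that the index really contains documents in the window you expect. A minimal sketch against the mapping above (the dates are made-up placeholders and must match the yyyy-MM-dd HH:mm:ss format declared in the mapping):

GET index_test/_search
{
  "query": {
    "range": {
      "timestamp": {
        "gte": "2017-01-01 00:00:00",
        "lte": "2017-12-31 23:59:59"
      }
    }
  }
}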

Related

Creating a new field in an existing index - Elasticsearch

I want to create a new field and add it to an existing index so that I can send a unique value to that new field. I was hoping there was an API to do this without having to do it in the Kibana CLI, but I ran into this article that tells you how to add new fields to an existing index.
I tried to add it under the _source field, but it did not allow me:
PUT customer-simulation-es-app-logs-development-2021-07/_mapping
{
  "_source": {
    "TransactionKey": {
      "type": "keyword"
    }
  }
}
So I then added it under properties, which allowed me:
PUT customer-simulation-es-app-logs-development-2021-07/_mapping
{
  "properties": {
    "TransactionKey": {
      "type": "keyword"
    }
  }
}
To make sure it was updated, I ran GET customer-simulation-es-app-logs-development-2021-07/_mapping, which did return it:
{
  "customer-simulation-es-app-logs-development-2021-07" : {
    "mappings" : {
      "properties" : {
        "#timestamp" : {
          "type" : "date"
        },
        "TransactionKey" : {
          "type" : "keyword"
        },
        "exceptions" : {
          "properties" : {
            "ClassName" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            .....
But when I go to Discover and type TransactionKey into the field search, nothing pops up. Did I not add the new field correctly to the existing index?
If you're running a version prior to 7.11, then you need to go to Stack Management > Index patterns and refresh your index pattern before seeing your new field in the Discover view. You need to do this every time your index mapping changes.
Since 7.11, index patterns are refreshed automatically whenever needed.
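Also note that updating the mapping does not backfill values into documents that were already indexed: only documents that actually contain TransactionKey will show a value for it in Discover. A minimal sketch to sanity-check the new field by indexing a test document (the key value is a made-up placeholder):

POST customer-simulation-es-app-logs-development-2021-07/_doc
{
  "TransactionKey": "test-key-001"
}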

What does "mappings" do in Elasticsearch?

I just started learning Elasticsearch. I am trying out creating an index, adding data, deleting data, and searching data.
I can also understand Elasticsearch's settings.
When using "PUT" to apply settings:
{
  "settings": {
    "index.number_of_shards" : 1,
    "index.number_of_replicas" : 0
  }
}
When using "GET" to retrieve settings information
{
  "dsm" : {
    "settings" : {
      "index" : {
        "creation_date" : "1555487684262",
        "number_of_shards" : "1",
        "number_of_replicas" : "0",
        "uuid" : "qsSr69OdTuugP2DUwrMh4g",
        "version" : {
          "created" : "7000099"
        },
        "provided_name" : "dsm"
      }
    }
  }
}
However, what does "mappings" do in Elasticsearch?
{
  "kibana_sample_data_flights" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "AvgTicketPrice" : {
          "type" : "float"
        },
        "Cancelled" : {
          "type" : "boolean"
        },
        "Carrier" : {
          "type" : "keyword"
        },
        "Dest" : {
          "type" : "keyword"
        },
        "DestAirportID" : {
          "type" : "keyword"
        },
        "DestCityName" : {
        }, // just part of the data
The mapping document is a way of describing the structure of your data and defining the types, e.g. boolean, text, keyword. These types are important because they determine how your fields are indexed and analysed.
Elasticsearch supports dynamic mapping, so it effectively performs an automatic best guess of the appropriate types, but you may wish to override these.
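For example, a minimal sketch of creating an index with an explicit mapping rather than relying on dynamic mapping (the index and field names here are made up for illustration):

PUT my-index
{
  "mappings": {
    "properties": {
      "title":  { "type": "text" },
      "status": { "type": "keyword" },
      "price":  { "type": "float" }
    }
  }
}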
I found this to be a useful article to explain the mapping process:
https://www.elastic.co/blog/found-elasticsearch-mapping-introduction
Indexing is determined by the field type: for example, where the type is 'keyword' the search engine will be expecting an exact match, whereas when the type is 'text' the search engine will try to determine how well the document matches the query term, and in doing so will perform a 'full text search'.
So for example:
- A search for jump should also match jumped, jumps, jumping, and perhaps even leap.
This is a great article describing exact vs full text search and is where I took the jump example: https://www.elastic.co/guide/en/elasticsearch/guide/current/_exact_values_versus_full_text.html
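To make the distinction concrete, a sketch using the hypothetical my-index mapping above: a term query against a keyword field matches only the exact stored value, while a match query against a text field analyzes the query string first (and, with a stemming analyzer, could match jumped or jumps as well):

GET my-index/_search
{
  "query": { "term": { "status": "published" } }
}

GET my-index/_search
{
  "query": { "match": { "title": "jumping" } }
}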
Much of the power of Elasticsearch is in the mapping and analysis.
It's the mapping of the index. This means it describes the data that is stored in this index. Take a deeper look here.

NEST aggregation with dynamic fields - Elasticsearch

Is it possible to use NEST to create buckets with keywords/fields that are not strongly typed?
Due to the nature of this project, I do not have any root objects to pass in.
Below is an example.
var result = client.Search<PortalDoc>(s => s
    .Aggregations(a => a
        .Terms("agg_objecttype", t => t.Field(l => "CUSTOM_FIELD_HERE"))
    )
);
string implicitly converts to Field, so you can pass a string for any field name:
var result = client.Search<PortalDoc>(s => s
    .Aggregations(a => a
        .Terms("agg_objecttype", t => t
            .Field("CUSTOM_FIELD_HERE")
        )
    )
);
Yes, something like that is possible. Look here for my solution using nested fields. It allows you to do all the operations on "dynamic" fields, but with somewhat greater effort (nested fields are tougher to manipulate). The gist has some proofs for search, but I implemented aggregations as well.
curl -XPOST localhost:9200/something -d '{
  "mappings" : {
    "live" : {
      "_source" : { "enabled" : true },
      "dynamic" : false,
      "properties" : {
        "customFields" : {
          "type" : "nested",
          "properties" : {
            "fieldName" : { "type" : "string", "index" : "not_analyzed" },
            "stringValue" : {
              "type" : "string",
              "fields" : {
                "raw" : { "type" : "string", "index" : "not_analyzed" }
              }
            },
            "integerValue" : { "type" : "long" },
            "floatValue" : { "type" : "double" },
            "datetimeValue" : { "type" : "date", "format" : "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd" },
            "booleanValue" : { "type" : "boolean" }
          }
        }
      }
    }
  }
}'
Search should be done inside the same nested query using AND, and aggregations should be done inside a nested aggregation.
I made it for dynamic fields, but it can probably be adjusted for something else. I doubt there can be much more flexibility for searchable/aggregatable fields, given the principles of how indices work.
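For reference, a sketch of what such a nested aggregation could look like against the mapping above (the aggregation names are made up): bucket first by field name, then by the raw string value within each name bucket.

curl -XPOST localhost:9200/something/_search -d '{
  "size": 0,
  "aggs": {
    "custom_fields": {
      "nested": { "path": "customFields" },
      "aggs": {
        "by_name": {
          "terms": { "field": "customFields.fieldName" },
          "aggs": {
            "by_value": {
              "terms": { "field": "customFields.stringValue.raw" }
            }
          }
        }
      }
    }
  }
}'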

How to add default values while adding a new field to an existing mapping in Elasticsearch

This is my existing mapping in Elasticsearch for one of the child document types:
sessions" : {
"_routing" : {
"required" : true
},
"properties" : {
"operatingSystem" : {
"index" : "not_analyzed",
"type" : "string"
},
"eventDate" : {
"format" : "dateOptionalTime",
"type" : "date"
},
"durations" : {
"type" : "integer"
},
"manufacturer" : {
"index" : "not_analyzed",
"type" : "string"
},
"deviceModel" : {
"index" : "not_analyzed",
"type" : "string"
},
"applicationId" : {
"type" : "integer"
},
"deviceId" : {
"type" : "string"
}
},
"_parent" : {
"type" : "userinfo"
}
}
In the above mapping, the "durations" field is an integer array. I need to update the existing mapping by adding a new field called "durationCount" whose default value should be the size of the durations array.
PUT sessions/_mapping
{
  "properties" : {
    "sessionCount" : {
      "type" : "integer"
    }
  }
}
Using the above request I am able to update the existing mapping, but I cannot figure out how to assign a value (which would vary for each session document; it should be the durations array size) while updating the mapping. Any ideas?
Well, two recommendations here:
1. Instead of adding a default value, you can account for it in the query using a missing filter. Let's say you want to search based on a match query: instead of just the match query, use a bool query with a should clause containing the match and a missing filter, inside a filtered query. This way, documents that do not have the field are also accounted for.
2. If you absolutely need the value in that field for existing documents, you need to reindex the whole set of documents, or use the out-of-the-box plugin, update-by-query (see the sketch below).
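On recent Elasticsearch versions update-by-query is built in rather than a plugin, and populating durationCount for documents that don't have it yet could look roughly like this (a sketch; the script assumes every matched document has a durations array):

POST sessions/_update_by_query
{
  "script": {
    "source": "ctx._source.durationCount = ctx._source.durations.size()"
  },
  "query": {
    "bool": {
      "must_not": { "exists": { "field": "durationCount" } }
    }
  }
}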

How to index a dump of HTML files into Elasticsearch?

I am totally new to Elasticsearch, so my knowledge comes only from the Elasticsearch site, and I need some help.
My task is to index a large amount of raw data in HTML format into Elasticsearch. I have already crawled my data and stored it on disk (200,000 HTML files). My question is: what is the simplest way to index all the HTML files into Elasticsearch? Should I do it manually, by making a PUT request to Elasticsearch for each document? For example:
curl -XPUT 'http://localhost:9200/registers/tomas/1' -d '{
  "user" : "tomasko",
  "post_date" : "2009-11-15T14:12:12",
  "field 1" : "field data",
  "field 2" : "field 2 data"
}'
And my second question is whether I have to parse the HTML documents to retrieve the data for JSON fields like "field 1" in the example code above.
And finally, after indexing, may I delete all the HTML documents? Thanks for all.
I'd look at the bulk API, which allows you to send more than one document in a single request, in order to speed up your indexing process. You can send batches of 10, 20 or more documents, depending on how big they are.
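A minimal sketch of what a bulk request for the documents above could look like (note --data-binary rather than -d, since -d strips the newlines that the bulk format relies on):

curl -XPOST 'http://localhost:9200/_bulk' --data-binary '
{ "index" : { "_index" : "registers", "_type" : "tomas", "_id" : "1" } }
{ "user" : "tomasko", "post_date" : "2009-11-15T14:12:12", "field 1" : "field data", "field 2" : "field 2 data" }
{ "index" : { "_index" : "registers", "_type" : "tomas", "_id" : "2" } }
{ "user" : "tomasko", "post_date" : "2009-11-15T14:13:00", "field 1" : "more field data", "field 2" : "more field 2 data" }
'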
Depending on what you want to index, you might need to parse the HTML, unless you want to index the whole HTML as a single field (in that case you might want to use the html_strip char filter to strip out the HTML tags from the indexed text).
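A sketch of wiring up the html_strip char filter when creating the index (the html_text analyzer name is made up; you would then reference it from the relevant field in your mapping):

curl -XPUT 'http://localhost:9200/registers' -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "html_text": {
          "type": "custom",
          "tokenizer": "standard",
          "char_filter": [ "html_strip" ]
        }
      }
    }
  }
}'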
After indexing, I'd suggest making sure the mapping is correct and that you can find what you're looking for. You can always reindex using the _source special field that Elasticsearch stores under the hood, but if you already wrote your indexer code you might want to use it again to reindex when needed (with the same HTML documents, of course). In practice, you never index your data just once... so be careful :) Even though Elasticsearch always helps you out with the _source field, reindexing is just a matter of querying the existing index and indexing all its documents into another index.
@javanna's suggestion to look at the Bulk API will definitely lead you in the right direction. If you are using NEST, you can store all your objects in a list, which you can then serialize into JSON objects for indexing the content.
Specifically, if you want to strip the HTML tags out prior to indexing while storing the content as-is, you can use the mapper attachment plugin, in which, when you define the mapping, you can set the content_type to be "html".
The mapper attachment is useful for many things, especially if you are handling multiple document types, but most notably, I believe just using it for the purpose of stripping out the HTML tags is sufficient (which you cannot do with the html_strip char filter).
Just a forewarning though: NONE of the HTML tags will be stored. So if you do need those tags somehow, I would suggest defining another field to store the original content. Another note: you cannot specify multi-fields for mapper attachment documents, so you would need to store that outside of the mapper attachment document. See my working example below.
You'll end up with this mapping:
{
  "html5-es" : {
    "aliases" : { },
    "mappings" : {
      "document" : {
        "properties" : {
          "delete" : {
            "type" : "boolean"
          },
          "file" : {
            "type" : "attachment",
            "fields" : {
              "content" : {
                "type" : "string",
                "store" : true,
                "term_vector" : "with_positions_offsets",
                "analyzer" : "autocomplete"
              },
              "author" : {
                "type" : "string",
                "store" : true,
                "term_vector" : "with_positions_offsets"
              },
              "title" : {
                "type" : "string",
                "store" : true,
                "term_vector" : "with_positions_offsets",
                "analyzer" : "autocomplete"
              },
              "name" : {
                "type" : "string"
              },
              "date" : {
                "type" : "date",
                "format" : "strict_date_optional_time||epoch_millis"
              },
              "keywords" : {
                "type" : "string"
              },
              "content_type" : {
                "type" : "string"
              },
              "content_length" : {
                "type" : "integer"
              },
              "language" : {
                "type" : "string"
              }
            }
          },
          "hash_id" : {
            "type" : "string"
          },
          "path" : {
            "type" : "string"
          },
          "raw_content" : {
            "type" : "string",
            "store" : true,
            "term_vector" : "with_positions_offsets",
            "analyzer" : "raw"
          },
          "title" : {
            "type" : "string"
          }
        }
      }
    },
    "settings" : { //insert your own settings here },
    "warmers" : { }
  }
}
Then in NEST, I assemble the content as follows:
// Read the HTML file from disk and base64-encode it for the attachment field
Attachment attachment = new Attachment();
attachment.Content = Convert.ToBase64String(File.ReadAllBytes("path/to/document"));
attachment.ContentType = "html";

// Keep the original markup in a separate field, since the attachment strips all tags
Document document = new Document();
document.File = attachment;
document.RawContent = InsertRawContentFromString(originalText);
I have tested this in Sense - results are as follows:
"file": {
"_content": "PGh0bWwgeG1sbnM6TWFkQ2FwPSJodHRwOi8vd3d3Lm1hZGNhcHNvZnR3YXJlLmNvbS9TY2hlbWFzL01hZENhcC54c2QiPg0KICA8aGVhZCAvPg0KICA8Ym9keT4NCiAgICA8aDE+VG9waWMxMDwvaDE+DQogICAgPHA+RGVsZXRlIHRoaXMgdGV4dCBhbmQgcmVwbGFjZSBpdCB3aXRoIHlvdXIgb3duIGNvbnRlbnQuIENoZWNrIHlvdXIgbWFpbGJveC48L3A+DQogICAgPHA+wqA8L3A+DQogICAgPHA+YXNkZjwvcD4NCiAgICA8cD7CoDwvcD4NCiAgICA8cD4xMDwvcD4NCiAgICA8cD7CoDwvcD4NCiAgICA8cD5MYXZlbmRlci48L3A+DQogICAgPHA+wqA8L3A+DQogICAgPHA+MTAvNiAxMjowMzwvcD4NCiAgICA8cD7CoDwvcD4NCiAgICA8cD41IDA5PC9wPg0KICAgIDxwPsKgPC9wPg0KICAgIDxwPjExIDQ3PC9wPg0KICAgIDxwPsKgPC9wPg0KICAgIDxwPkhhbGxvd2VlbiBpcyBpbiBPY3RvYmVyLjwvcD4NCiAgICA8cD7CoDwvcD4NCiAgICA8cD5qb2c8L3A+DQogIDwvYm9keT4NCjwvaHRtbD4=",
"_content_length": 0,
"_content_type": "html",
"_date": "0001-01-01T00:00:00",
"_title": "Topic10"
},
"delete": false,
"raw_content": "<h1>Topic10</h1><p>Delete this text and replace it with your own content. Check your mailbox.</p><p> </p><p>asdf</p><p> </p><p>10</p><p> </p><p>Lavender.</p><p> </p><p>10/6 12:03</p><p> </p><p>5 09</p><p> </p><p>11 47</p><p> </p><p>Halloween is in October.</p><p> </p><p>jog</p>"
},
"highlight": {
"file.content": [
"\n <em>Topic10</em>\n\n Delete this text and replace it with your own content. Check your mailbox.\n\n  \n\n asdf\n\n  \n\n 10\n\n  \n\n Lavender.\n\n  \n\n 10/6 12:03\n\n  \n\n 5 09\n\n  \n\n 11 47\n\n  \n\n Halloween is in October.\n\n  \n\n jog\n\n "
]
}
