ElasticSearch sorting doesn't work

I am trying to sort by a "timestamp" field (which is not the default "_timestamp" field).
The "timestamp" field stores a Unix timestamp in milliseconds in
"_source": {
...
"timestamp": "1381256450000"
...
}
In the screenshot you can see this value in 'sort' (right side). The last record should definitely be at the top of the results, but it isn't.
Screenshot: (image not included)

From the Elasticsearch website:
By default, the search request will fail if there is no mapping associated with a field. The ignore_unmapped option allows to ignore fields that have no mapping and not sort by them.
You have to map the field if you want to use it to sort the results.
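As a minimal sketch, assuming the modern type-less mapping syntax and an index called my_index (on 1.x/2.x-era clusters the properties block sits under a type name), mapping the field as a date and sorting on it would look like:

PUT my_index
{
  "mappings": {
    "properties": {
      "timestamp": { "type": "date" }
    }
  }
}

GET my_index/_search
{
  "sort": [
    { "timestamp": { "order": "desc" } }
  ]
}

With a date mapping, millisecond epoch values like "1381256450000" are parsed as dates and sort chronologically rather than as strings.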

Related

Search inside _id field Elasticsearch

Recently I made a change to the way IDs were being generated in my ES index. Previously, we were generating the IDs in code, using a format like: uuid_WEEKDAY_COUNTRY_TIMESTAMP
I removed this and instead let the value of this field be auto-generated by ES (as I guess it should be).
How can I write a query that checks that none of the old-format IDs are still being generated? I tried something like
GET /_search
{
  "query": {
    "query_string": {
      "query": "*WEDNESDAY*",
      "default_field": "_id"
    }
  }
}
But I got errors saying I can't query the _id field, only text or keyword fields.
How can I do this otherwise?
Thanks
The _id field is a special field handled in Elasticsearch as the ID of the document. It is not an indexed field like other text fields; though we can set the value, for documents where we do not specify this field it is actually "generated" based on the UID of the document (see: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-id-field.html).
The downside of this is that the field only supports a limited subset of the query functionality. One way to get over this is to add a field called id_field (as a text / keyword field) into the document itself and then run term queries on this field.
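For illustration, a sketch of that approach (the index name, the type-less 7.x syntax, and the sample document value are assumptions; note that a leading wildcard like *WEDNESDAY* is expensive on large indices):

PUT my_index
{
  "mappings": {
    "properties": {
      "id_field": { "type": "keyword" }
    }
  }
}

PUT my_index/_doc/1
{
  "id_field": "uuid_WEDNESDAY_COUNTRY_TIMESTAMP"
}

GET my_index/_search
{
  "query": {
    "wildcard": {
      "id_field": "*WEDNESDAY*"
    }
  }
}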

Elasticsearch - how to check which fields are indexed in an index?

I browse to curl -XGET 'http://localhost:9200/<index_name>/_mappings' and it returns the fields in an index.
This is one of the fields:
{ "dob": { "type": "string", "analyzer": "my_custom_analyzer" } }
Given the above response, does this mean the dob field is indexed by default, or does index: true have to be explicitly set for this field to be indexed?
It seems you're using a very old version of Elasticsearch, likely 2.x or earlier.
However, based on the mapping you've shared, string fields are indexed by default. In your case, dob is analyzed by a custom analyzer called my_custom_analyzer, and the resulting tokens will be indexed automatically.
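For contrast, a sketch of how a deliberately non-indexed string field would look in a 2.x-era mapping (the field name is taken from the question; in 2.x the string default is "index": "analyzed", and "index": "no" disables indexing):

{ "dob": { "type": "string", "index": "no" } }

If the mapping you get back shows neither "index": "no" nor "index": "not_analyzed", the field is analyzed and indexed.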

Kibana scripted field which loops through an array

I am trying to use the Metricbeat http module to monitor F5 pools.
I make a request to the F5 API and bring back JSON, which is indexed into Elasticsearch and viewed in Kibana. But the JSON contains an array of pool members, and I want to count the number which are up.
The advice seems to be that this can be done with a scripted field. However, I can't get the script to retrieve the array, e.g.
doc['http.f5pools.items.monitor'].value.length()
returns the following in the preview results (with the same 'Additional Field' added for comparison):
[
  {
    "_id": "rT7wdGsBXQSGm_pQoH6Y",
    "http": {
      "f5pools": {
        "items": [
          { "monitor": "default" },
          { "monitor": "default" }
        ]
      }
    },
    "pool.MemberCount": [ 7 ]
  },
If I try
doc['http.f5pools.items']
or similar, I just get an error:
"reason": "No field found for [http.f5pools.items] in mapping with types []"
Googling suggests that the doc construct does not contain arrays?
Is it possible to make a scripted field which can access the set of values? I.e. is my code or the way I'm indexing the data wrong?
If not, is there an alternative approach within Metricbeat? I don't want to have to make a whole new API to do the calculation and add a separate field.
-- Update
Weirdly, it seems that the number values in the array do return the expected results. I.e.
doc['http.f5pools.items.ratio']
returns
{
  "_id": "BT6WdWsBXQSGm_pQBbCa",
  "pool.MemberCount": [
    1,
    1
  ]
},
-- Update 2
OK, so if the strings in the field have different values then you get all the values; if they are the same, you just get one. WTF?
I'm adding another answer instead of deleting my previous one, which doesn't address the actual question but may still be helpful for someone else in the future.
I found a hint in the same documentation:
Doc values are a columnar field value store
Upon googling this further I found this Doc Values intro, which says that doc values are essentially an "uninverted index" useful for operations like sorting. My hypothesis is that while sorting you essentially don't want the same values repeated, and hence the data structure they use removes those duplicates. That still did not answer why it works differently for strings than for numbers: numbers are preserved but strings are filtered down to unique values.
This “uninverted” structure is often called a “column-store” in other
systems. Essentially, it stores all the values for a single field
together in a single column of data, which makes it very efficient for
operations like sorting.
In Elasticsearch, this column-store is known as doc values, and is
enabled by default. Doc values are created at index-time: when a field
is indexed, Elasticsearch adds the tokens to the inverted index for
search. But it also extracts the terms and adds them to the columnar
doc values.
Some more deep-diving into doc values revealed it is a compression technique which actually de-duplicates the values for efficient and memory-friendly operations.
Here's a NOTE given on the link above which answers the question:
You may be thinking "Well that’s great for numbers, but what about
strings?" Strings are encoded similarly, with the help of an ordinal
table. The strings are de-duplicated and sorted into a table, assigned
an ID, and then those ID’s are used as numeric doc values. Which means
strings enjoy many of the same compression benefits that numerics do.
The ordinal table itself has some compression tricks, such as using
fixed, variable or prefix-encoded strings.
Also, if you don't want this behavior then you can disable doc values.
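For illustration, disabling doc values is a per-field mapping setting; a minimal sketch (index and field names are assumptions, and note that you then lose sorting and aggregations on that field):

PUT my_index
{
  "mappings": {
    "properties": {
      "monitor": {
        "type": "keyword",
        "doc_values": false
      }
    }
  }
}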
OK, solved it.
https://discuss.elastic.co/t/problem-looping-through-array-in-each-doc-with-painless/90648
So, as I discovered, arrays are pre-filtered to only return distinct values (except in the case of ints, apparently?).
The solution is to use params._source instead of doc[].
The answer for why doc doesn't work
Quoting below:
Doc values are a columnar field value store, enabled by default on all
fields except for analyzed text fields.
Doc-values can only return "simple" field values like numbers, dates,
geo- points, terms, etc, or arrays of these values if the field is
multi-valued. It cannot return JSON objects
Also, it's important to add a null check, as mentioned below:
Missing fields
The doc['field'] will throw an error if field is
missing from the mappings. In painless, a check can first be done with
doc.containsKey('field')* to guard accessing the doc map.
Unfortunately, there is no way to check for the existence of the field
in mappings in an expression script.
Also, here is why _source works
Quoting below:
The document _source, which is really just a special stored field, can
be accessed using the _source.field_name syntax. The _source is loaded
as a map-of-maps, so properties within object fields can be accessed
as, for example, _source.name.first.
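A minimal sketch of looping over the array via params._source in a script_fields search request (the index pattern, the Painless body, and treating monitor == 'default' as "up" are all assumptions):

GET metricbeat-*/_search
{
  "script_fields": {
    "members_up": {
      "script": {
        "lang": "painless",
        "source": "int up = 0; def src = params._source; if (src.http != null && src.http.f5pools != null && src.http.f5pools.items != null) { for (item in src.http.f5pools.items) { if (item.monitor == 'default') { up++; } } } return up;"
      }
    }
  }
}

Because it reads the original JSON rather than doc values, this sees every array element, duplicates included.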
Responding to your comment with an example:
The keyword here is: it cannot return JSON objects. The field doc['http.f5pools.items'] is a JSON object.
Try running the below and see the mapping it creates:
PUT t5/doc/2
{
  "items": [
    { "monitor": "default" },
    { "monitor": "default" }
  ]
}
GET t5/_mapping
{
  "t5" : {
    "mappings" : {
      "doc" : {
        "properties" : {
          "items" : {
            "properties" : {
              "monitor" : {    <-- monitor is a property of the items property (object)
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Timestamp not appearing in Kibana

I'm pretty new to Kibana and just set up an instance to look at some Elasticsearch data.
I have one index in Elasticsearch which has a few fields, including _timestamp. When I go to the 'Discover' tab and look at my documents, each has the _timestamp field, but with a yellow warning next to the field saying "No cached mapping for this field". As a result, I can't seem to sort/filter by time.
When I try and create a new index pattern and click on "Index contains time-based events", the 'Time-field name' dropdown doesn't contain anything.
Is there something else I need to do to get Kibana to recognise the _timestamp field?
I'm using Kibana 4.0.
You'll need to take these quick steps first:
Go to Settings → Advanced.
Edit the metaFields and add "_timestamp". Hit save.
Now go back to Settings → Indices and _timestamp will be available in the drop-down list for "Time-field name".
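For reference, the edited metaFields value would look something like this (the default list plus "_timestamp"; the exact defaults vary by Kibana 4.x version):

["_source", "_id", "_type", "_index", "_timestamp"]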
In newer versions you are required to specify the date field before you send your data.
Your date field must be in a standard format such as milliseconds since epoch (a long number) or, as suggested by MrE, in ISO 8601.
See more info here: https://www.elastic.co/guide/en/elasticsearch/reference/current/date.html
Again, before you send your data to the index, you must specify the mapping for this field. In Python:
import requests

# The key under "mappings" is the document type (pre-7.x), not the index name
mapping = '{"mappings": {"your_type": {"properties": {"your_timestamp_field": {"type": "date"}}}}}'
requests.put('http://yourserver/your_index',
             data=mapping,
             headers={'Content-Type': 'application/json'})  # header required by ES 6+
...
send_data()
My ES version is 2.2.0.
You have to use the right schema.
I followed the guide.
E.g.:
{
  "memory": "integer",
  "geo.coordinates": "geo_point",
  "#timestamp": "date"
}
If you have the #timestamp field, you will see it in the 'Time-field name' dropdown.
PS: if your schema doesn't have a "date" field, do not check "Index contains time-based events".
The accepted answer is obsolete as of Elasticsearch 2.0 (the special _timestamp field is deprecated there).
You should use a simple date field in your data and set it explicitly using either a timestamp or a date string in ISO 8601 format.
https://en.wikipedia.org/wiki/ISO_8601
You also need to set the mapping to date BEFORE you start sending data, apparently.
curl -XPUT 'http://localhost:9200/myindex' -d '{
  "mappings": {
    "my_type": {
      "properties": {
        "date": {
          "type": "date"
        }
      }
    }
  }
}'
Go to Settings → Indices, select your index, and click the yellow "refresh" icon. That will get rid of the warning, and perhaps make the field available in your visualization.

How to search fields with '-' characters in elastic search

I am new to Elasticsearch. I have the following document, where one of the fields, "eventId", has "-" in its value.
When I try to search with the complete value of eventId, I don't get any results.
Sample document app/event:
{
  "tags": {},
  "eventId": "cc98d57b-c6bc-424c-b54c-df1e3df0d942"
}
I haven't created any explicit settings for my index.
Thanks.
You should check if the tokenizer splits your value into multiple tokens. Maybe your value is stored as 5 tokens: "cc98d57b", "c6bc", "424c", "b54c" and "df1e3df0d942".
You can analyze that with the Kopf plugin (https://github.com/lmenezes/elasticsearch-kopf).
If that is your problem, you should change your field mapping so that the value is not analyzed ("index": "not_analyzed").
For an example of how to set that mapping see here: Elasticsearch mapping settings 'not_analyzed' and grouping by field in Java
After that, you should be able to search for your specific value.
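As a minimal sketch of that fix (index and type names are taken from the question; the string/not_analyzed syntax applies to pre-5.x versions, where on 5.x+ you would use "type": "keyword" instead):

PUT app
{
  "mappings": {
    "event": {
      "properties": {
        "eventId": { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}

GET app/event/_search
{
  "query": {
    "term": { "eventId": "cc98d57b-c6bc-424c-b54c-df1e3df0d942" }
  }
}

Note the index has to be recreated (or the data reindexed) for the new mapping to take effect.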
