Adding docs to elasticsearch via API, discovering them via kibana. How? - elasticsearch

I typically insert data (logs) into elasticsearch via logstash plugin. Then, I
can search them from kibana.
However, if I try to insert data into elasticsearch programmatically (in order to
skip filebeat and logstash), I cannot find the data in kibana.
This is what I tested:
from datetime import datetime

from elasticsearch import Elasticsearch

es = Elasticsearch(["XXX"], ...)
doc = {
    "#version": 1,
    "#timestamp": datetime.now(),
    "timestamp": datetime.now(),  # Just in case this is needed too
    "message": "test message"
}
res = es.index(
    index="foobar-2019.05.13", doc_type='whatever', id=3, body=doc,
    refresh=True
)
# Doc is indexed by above code, as proved by
# es.search(
#     index="foobar-*", body={"query": {"match_all": {}}}
# )
I added the index pattern `foobar-*` to kibana in "Index Pattern -> Create
index pattern". Then, I can use the "discover" page to search for documents in
that index. But no documents are found by kibana, even though they exist in
elasticsearch.
What am I missing? Are there any mappings that should be configured for the index?
(note: using 6.x versions)
UPDATE: example of doc indexed, and mapping of index
# Example of doc indexed
{'_index': 'foobar-2019.05.13', '_type': 'doc', '_id': '41', '_score': 1.0,
'_source': {'author': 'foobar', 'message': 'karsa big and crazy. icarium crazy. mappo big.',
'timestamp': '2019-05-13T15:52:19.857898',
'#version': 1, '#timestamp': '2019-05-13T15:52:19.857900'}}
# mapping of foobar-2019.05.13
{
    "mapping": {
        "doc": {
            "properties": {
                "#timestamp": {
                    "type": "date"
                },
                "#version": {
                    "type": "long"
                },
                "author": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "message": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "timestamp": {
                    "type": "date"
                }
            }
        }
    }
}

I found the issue... there was a 2 hour timezone difference between the host
where the python code runs and the elasticsearch/kibana hosts.
So, since I was using datetime.now(), I was inserting documents with a timestamp "hours in the future", and I was searching for them "anywhere in the past".
If I look for them in the future (or if I wait for 2 hours without updating
them), they are found.
Embarrassing mistake on my side.
The fix for me was to use datetime.now(timezone.utc).
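For reference, a minimal sketch of the corrected indexing code (same placeholder host and index as above), using a timezone-aware UTC timestamp:

from datetime import datetime, timezone

from elasticsearch import Elasticsearch

es = Elasticsearch(["XXX"], ...)
doc = {
    "#version": 1,
    "#timestamp": datetime.now(timezone.utc),  # timezone-aware UTC avoids host clock skew
    "message": "test message"
}
es.index(index="foobar-2019.05.13", doc_type="whatever", id=3, body=doc, refresh=True)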

Related

What is the correct setup for ElasticSearch 7.6.2 highlighting with FVH?

How to properly setup highlighting search words in huge documents using fast vector highlighter?
I've followed the documentation and tried the following settings for the index (written as a Python literal; the commented-out alternatives were also tried, both with and without store):
{
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0
    },
    "mappings": {
        "members": {
            "dynamic": "strict",
            "properties": {
                "url": {
                    "type": "text",
                    "term_vector": "with_positions_offsets",
                    #"index_options": "offsets",
                    "store": True
                },
                "title": {
                    "type": "text",
                    #"index_options": "offsets",
                    "term_vector": "with_positions_offsets",
                    "store": True
                },
                "content": {
                    "type": "text",
                    #"index_options": "offsets",
                    "term_vector": "with_positions_offsets",
                    "store": True
                }
            }
        }
    }
}
The search is done with the following query (again, the commented-out parts were tried one by one, in some combinations):
{
    "query": {
        "multi_match": {
            "query": term,
            "fields": ["url", "title", "content"]
        },
    },
    "_source": {
        #"includes": ["url", "title", "_id"],
        #"excludes": ["content"]
    },
    "highlight": {
        "number_of_fragments": 40,
        "fragment_size": 80,
        "fields": {
            "content": {"matched_fields": ["content"]},
            #"content": {"type": "fvh", "matched_fields": ["content"]},
            #"title": {"type": "fvh", "matched_fields": ["title"]},
        }
    }
}
The problem is that when FVH is not used, ElasticSearch complains that the "content" field is too large (and I do not want to increase the allowed size). When I add the "fvh" type, ES complains that term vectors are needed, even though I've checked they are there by querying the document info (offsets, starts, etc.):
the field [content] should be indexed with term vector with position
offsets to be used with fast vector highlighter
It seems like:
When I omit "type": "fvh", it is not used, even though the documentation says it's the default when "term_vector": "with_positions_offsets".
I can see term vectors in the index, but ES does not find them. (Indirectly: when indexing with term vectors, the index is almost twice as large.)
All the trials included removing the old index and adding it again.
It's also treacherous in that it fails only when a large document is encountered. Highlights are there for queries where the documents are small.
What is the proper way to set up highlights in ElasticSearch 7, free edition (I tried under Ubuntu with the binary deb from the vendor)?
The fvh highlighter uses the Lucene Fast Vector Highlighter. This highlighter can be used on fields with term_vector set to with_positions_offsets in the mapping, which increases the size of the index.
You can define a mapping like the one below for your field.
"mappings": {
"properties": {
"text": {
"type": "text",
"term_vector": "with_positions_offsets"
}
}
}
While querying for highlight fields, you need to use "type": "fvh".
The fast vector highlighter will be used by default for the text field because term vectors are enabled.
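For instance, a minimal sketch in Python (the index name testindex is hypothetical; the field and highlight parameters come from the question) that requests the fast vector highlighter explicitly:

from elasticsearch import Elasticsearch

es = Elasticsearch(["localhost:9200"])
res = es.search(
    index="testindex",
    body={
        "query": {"match": {"text": "some term"}},
        "highlight": {
            "number_of_fragments": 40,
            "fragment_size": 80,
            "fields": {
                # Explicitly request the fast vector highlighter; requires
                # "term_vector": "with_positions_offsets" in the mapping.
                "text": {"type": "fvh"}
            }
        }
    }
)
for hit in res["hits"]["hits"]:
    print(hit.get("highlight", {}).get("text", []))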

Can't update mapping in elasticsearch

When putting an analyzer into the mapping using PUT /job/_mapping/doc/, I get conflicts,
but there isn't an analyzer in the mappings.
PUT /job/_mapping/doc/
{
    "properties": {
        "title": {
            "type": "text",
            "analyzer": "ik_smart",
            "search_analyzer": "ik_smart"
        }
    }
}
{
    "error": {
        "root_cause": [
            {
                "type": "illegal_argument_exception",
                "reason": "Mapper for [title] conflicts with existing mapping in other types:\n[mapper [title] has different [analyzer]]"
            }
        ],
        "type": "illegal_argument_exception",
        "reason": "Mapper for [title] conflicts with existing mapping in other types:\n[mapper [title] has different [analyzer]]"
    },
    "status": 400
}
"title": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
},
"fielddata": true
},
The Logstash output config is like this.
output {
    elasticsearch {
        hosts => ["<Elasticsearch Hosts>"]
        user => "<user>"
        password => "<password>"
        index => "<table>"
        document_id => "%{<MySQL_PRIMARY_KEY>}"
    }
}
You can't update a mapping in Elasticsearch: you can add fields to a mapping, but not update existing ones. Elasticsearch uses the mapping at indexing time, which is why you can't update the mapping of an existing field. The analyzer is part of the mapping (in fact, if you don't specify one, ES applies a default one); the analyzer tells Elasticsearch how to index the documents. So:
1. Create a new index with your new mappings (including the analyzer).
2. Reindex your documents from your existing index to the new one (https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html).
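A minimal sketch of those two steps with the Python client (the new index name job_v2 is hypothetical; the analyzer and source index come from the question):

from elasticsearch import Elasticsearch

es = Elasticsearch(["localhost:9200"])

# 1. Create a new index whose mapping already includes the analyzer.
es.indices.create(index="job_v2", body={
    "mappings": {
        "doc": {
            "properties": {
                "title": {
                    "type": "text",
                    "analyzer": "ik_smart",
                    "search_analyzer": "ik_smart"
                }
            }
        }
    }
})

# 2. Copy the documents over with the _reindex API.
es.reindex(body={
    "source": {"index": "job"},
    "dest": {"index": "job_v2"}
})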
Updating Mapping:
Once a document is indexed into an index, i.e. the mapping is generated under a given type (as in our case, where the mapping of EmployeeCode, EmployeeName & isDeveloper is generated under type "customtype"), we cannot modify it afterwards. If we want to modify it, we need to delete the index first, then apply the modified mapping manually, and then re-index the data. Adding a new property under a given type, however, is feasible. For example, the document attached to our index "inkashyap-1002" under type "customtype" is as follows:
{
    "inkashyap-1002": {
        "mappings": {
            "customtype": {
                "properties": {
                    "EmployeeCode": {
                        "type": "long"
                    },
                    "isDeveloper": {
                        "type": "boolean"
                    },
                    "EmployeeName": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    }
                }
            }
        }
    }
}
Now let's add another property, "Grade" (inkashyap-1002 is the index name, customtype the type name):
curl -XPUT localhost:9200/inkashyap-1002/customtype/2 -d '{
    "EmployeeName": "Vaibhav Kashyap",
    "EmployeeCode": 13629,
    "isDeveloper": true,
    "Grade": 5
}'
Now hit the GET mapping API. In the results, you can see there is another field added called "Grade".
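For example, a sketch with the Python client (equivalent to GET /inkashyap-1002/_mapping):

from elasticsearch import Elasticsearch

es = Elasticsearch(["localhost:9200"])
# Fetch the mapping; the auto-added "Grade" field (type long) should appear.
print(es.indices.get_mapping(index="inkashyap-1002"))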
Common Error:
In the index "inkashyap-1002", so far we have indexed 2 documents. Both documents had the same type for the field "EmployeeCode": long. Now let us try to index a document like the one below:
curl -XPUT localhost:9200/inkashyap-1002/customtype/3 -d '{
    "EmployeeName": "Vaibhav Kashyap",
    "EmployeeCode": "onethreesixtwonine",
    "isDeveloper": true,
    "Grade": 5
}'
Note that here "EmployeeCode" is given as a string. The response to the above request will look like this:
{
    "error": {
        "root_cause": [
            {
                "type": "mapper_parsing_exception",
                "reason": "failed to parse [EmployeeCode]"
            }
        ],
        "type": "mapper_parsing_exception",
        "reason": "failed to parse [EmployeeCode]",
        "caused_by": {
            "type": "number_format_exception",
            "reason": "For input string: \"onethreesixtwonine\""
        }
    },
    "status": 400
}
In the above response, we can see the error "mapper_parsing_exception" on the field "EmployeeCode". This indicates that the expected field was of another type, not string. In such cases, re-index the document with the appropriate type.

Elasticsearch 6 rejecting mapping update even on a simple document

After upgrading from ES 5 to ES 6, I get an error message each time I want to store something in a new index. However, all old indexes are working fine.
The error message is:
Rejecting mapping update to [test] as the final mapping would have more than 1 type: [user, group]
I'm using elasticsearch 6.3. It works properly on the production server on previously created indexes. I've tried dropping the index, to no avail.
My test documents are:
PUT test/group/1
{
    "id": "5b29fb9aa3d24b5a2b6b8fcb",
    "_mongo_id_": "5b29fb9aa3d24b5a2b6b8fcb"
}
and
PUT test/user/1
{
    "id": "5ad4800ca3d24be81d7a6806",
    "_mongo_id_": "5ad4800ca3d24be81d7a6806"
}
Index mapping seems ok:
{
    "mapping": {
        "group": {
            "properties": {
                "_mongo_id_": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "id": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                }
            }
        }
    }
}
You're trying to add more than one type into the same index:
PUT test/group/1
PUT test/user/1
This behaviour is not allowed as of ES 6.
From the breaking changes documentation:
The ability to have multiple mapping types per index has been removed
in 6.0. New indices will be restricted to a single type. This is the
first step in the plan to remove mapping types altogether. Indices
created in 5.x will continue to support multiple mapping types.
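One way around it (a sketch with hypothetical names, not the only option) is to keep a single mapping type per index and carry the entity kind in an ordinary field:

from elasticsearch import Elasticsearch

es = Elasticsearch(["localhost:9200"])

# Single mapping type ("doc"); "doc_kind" is a hypothetical discriminator field.
es.index(index="test", doc_type="doc", id="group-1", body={
    "doc_kind": "group",
    "id": "5b29fb9aa3d24b5a2b6b8fcb",
    "_mongo_id_": "5b29fb9aa3d24b5a2b6b8fcb"
})
es.index(index="test", doc_type="doc", id="user-1", body={
    "doc_kind": "user",
    "id": "5ad4800ca3d24be81d7a6806",
    "_mongo_id_": "5ad4800ca3d24be81d7a6806"
})

Alternatively, split users and groups into two separate indices.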

Date_histogram Elasticsearch facet can't find field

I am using the date_histogram facet to find results based on an Epoch timestamp. The results are displayed on a histogram, with the date on the x-axis and the count of events on the y-axis. Here is the code I have that doesn't work:
angular.module('controllers', [])
    .controller('FacetsController', function($scope, $http) {
        var payload = {
            query: {
                match: {
                    run_id: '9'
                }
            },
            facets: {
                date: {
                    date_histogram: {
                        field: 'event_timestamp',
                        factor: '1000',
                        interval: 'second'
                    }
                }
            }
        };
    });
It works if I am using
field: '#timestamp'
which is in ISO8601 format; however, I now need it to work with Epoch timestamps.
Here is an example of what's in my Elasticsearch, maybe this can lead to some answers:
{"#version":"1",
"#timestamp":"2014-07-04T13:13:35.372Z","type":"automatic",
"installer_version":"0.3.0",
"log_type":"access.log","user_id":"1",
"event_timestamp":"1404479613","run_id":"9"}
},
When I run this, I receive this error:
POST 400 (Bad Request)
Any ideas as to what could be wrong here? I don't understand why the two fields behave so differently, as the only difference is the format. I researched as best I could and discovered I should be using 'factor', but that didn't solve my problem. I am probably making a silly beginner mistake!
You need to set up the mapping before indexing. Elasticsearch is good at defaults, but it cannot determine on its own whether a provided value is a timestamp, an integer, or a string, so it's your job to tell Elasticsearch.
Let me explain by example. Let's say the following document is what you are trying to index:
{
    "#version": "1",
    "#timestamp": "2014-07-04T13:13:35.372Z",
    "type": "automatic",
    "installer_version": "0.3.0",
    "log_type": "access.log",
    "user_id": "1",
    "event_timestamp": "1404474613",
    "run_id": "9"
}
So initially you don't have an index and you index your document by making an HTTP request like so:
POST /test/date_experiments
{
    "#version": "1",
    "#timestamp": "2014-07-04T13:13:35.372Z",
    "type": "automatic",
    "installer_version": "0.3.0",
    "log_type": "access.log",
    "user_id": "1",
    "event_timestamp": "1404474613",
    "run_id": "9"
}
This creates a new index called test and a new doc type in index test called date_experiments.
You can check the mapping of this doc type date_experiments by doing so:
GET /test/date_experiments/_mapping
What you get in the result is the mapping auto-generated by Elasticsearch:
{
    "test": {
        "date_experiments": {
            "properties": {
                "#timestamp": {
                    "type": "date",
                    "format": "dateOptionalTime"
                },
                "#version": {
                    "type": "string"
                },
                "event_timestamp": {
                    "type": "string"
                },
                "installer_version": {
                    "type": "string"
                },
                "log_type": {
                    "type": "string"
                },
                "run_id": {
                    "type": "string"
                },
                "type": {
                    "type": "string"
                },
                "user_id": {
                    "type": "string"
                }
            }
        }
    }
}
Notice that the type of the event_timestamp field is set to string, which is why your date_histogram is not working. Also notice that the type of the #timestamp field is already date, because you pushed the date in a standard format, which made it easy for Elasticsearch to recognize that your intention was to push a date into that field.
Drop this mapping by sending a DELETE request to /test/date_experiments and let's start from the beginning.
This time, instead of pushing the document first, we will define the mapping according to our requirements so that our event_timestamp field is treated as a date.
Make the following HTTP request:
PUT /test/date_experiments/_mapping
{
    "date_experiments": {
        "properties": {
            "#timestamp": {
                "type": "date"
            },
            "#version": {
                "type": "string"
            },
            "event_timestamp": {
                "type": "date"
            },
            "installer_version": {
                "type": "string"
            },
            "log_type": {
                "type": "string"
            },
            "run_id": {
                "type": "string"
            },
            "type": {
                "type": "string"
            },
            "user_id": {
                "type": "string"
            }
        }
    }
}
Notice that I have changed the type of the event_timestamp field to date. I have not specified a format, because Elasticsearch understands a few standard formats out of the box, as in the case of the #timestamp field where you pushed an ISO date. Here, Elasticsearch will understand that you are pushing a UNIX timestamp, convert it internally to treat it as a date, and allow all date operations on it. You can specify a date format in the mapping in case the dates you are pushing are not in any standard format.
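If you prefer to be explicit, here is a hedged sketch with the Python client, assuming a newer Elasticsearch version where the epoch_second date format is available (index and type names taken from the example above):

from elasticsearch import Elasticsearch

es = Elasticsearch(["localhost:9200"])
# Declare event_timestamp as a date holding epoch seconds explicitly.
es.indices.put_mapping(
    index="test",
    doc_type="date_experiments",
    body={
        "date_experiments": {
            "properties": {
                "event_timestamp": {"type": "date", "format": "epoch_second"}
            }
        }
    }
)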
Now you can start indexing your documents and running your date queries and facets the same way as before.
You should read more about mapping and date format.

elasticsearch search query for exact match not working

I am using query_string for search. Searching works fine, but it matches records in both lower and upper case, and I want an exact, case-sensitive match.
For example:
Search field: "title"
Current output:
title
Title
TITLE
I want only the first one (title). How can I resolve this issue?
My code in java :
QueryBuilder qbString = QueryBuilders.queryString("title").field("field_name");
You need to configure your mappings / text processing so tokens are indexed without being lowercased.
The "standard"-analyzer lowercases (and removes stopwords).
Here's an example that shows how to configure an analyzer and a mapping to achieve this: https://www.found.no/play/gist/7464654
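As a rough sketch of that idea (hypothetical index and analyzer names, assuming an ES 7-style typeless mapping), a custom analyzer that tokenizes without lowercasing:

from elasticsearch import Elasticsearch

es = Elasticsearch(["localhost:9200"])
es.indices.create(index="case_sensitive_idx", body={
    "settings": {
        "analysis": {
            "analyzer": {
                # Standard tokenization, but no lowercase filter,
                # so tokens keep their original case.
                "case_sensitive": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": []
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "field_name": {"type": "text", "analyzer": "case_sensitive"}
        }
    }
})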
With version 5+ of ElasticSearch there is no concept of analyzed and not_analyzed for an index; it is driven by the type!
The string data type is deprecated and replaced with text and keyword, so if your data type is text it will behave like string and can be analyzed and tokenized.
But if the data type is defined as keyword, then it is automatically NOT analyzed and returns full exact matches.
So you should remember to mark the type as keyword when you want to do an exact, case-sensitive match.
Code example below for creating an index with this definition:
PUT testindex
{
    "mappings": {
        "original": {
            "properties": {
                "#timestamp": {
                    "type": "date"
                },
                "#version": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "APPLICATION": {
                    "type": "text",
                    "fields": {
                        "exact": {"type": "keyword"}
                    }
                },
                "type": {
                    "type": "text",
                    "fields": {
                        "exact": {"type": "keyword"}
                    }
                }
            }
        }
    }
}
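With that mapping, a term query against the un-analyzed keyword subfield gives the exact, case-sensitive behaviour; a minimal sketch (hypothetical search value):

from elasticsearch import Elasticsearch

es = Elasticsearch(["localhost:9200"])
res = es.search(index="testindex", body={
    "query": {
        # Matches documents whose APPLICATION is exactly "title",
        # but not "Title" or "TITLE".
        "term": {"APPLICATION.exact": "title"}
    }
})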
