Can I put the result from Kibana into Elasticsearch again? - elasticsearch

Can I put the response of a query I run in Kibana Dev Tools back into Elasticsearch directly?
Or must I write a script to achieve it?
Any recommendations?

OK, so here is a basic understanding after some discussion.
If you have the head plugin installed for ES, search for the .kibana index.
Open the .kibana index and you will see all the designed dashboards listed there, along with the processed info.
Think of ES as just another store from which you can read data and write it into another ES index.
Refer to this link:
https://www.elastic.co/blog/kibana-under-the-hood-object-persistence
Logstash is a tool you can use for the reading and writing; learning Grok patterns will give you a good lead on that. A sketch of the read/write idea follows.
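Here is a minimal sketch of that read-from-one-index, write-to-another idea, using the official Python client's reindex API instead of Logstash; the index names and URL are hypothetical.

    # Copy every document from one index into another, server side.
    # Index names are placeholders; Logstash's elasticsearch input and
    # output plugins express the same idea declaratively.
    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")

    es.reindex(
        source={"index": "source-index"},       # hypothetical source
        dest={"index": "destination-index"},    # hypothetical destination
        wait_for_completion=True,               # block until the copy finishes
    )

Logstash is the better fit when you also need to transform or filter the documents on the way through.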
Let me know if you need some real screenshots for this same problem.
Happy learning.

It is like cooking in a kitchen and then asking to put the cooked food back in the kitchen. If you have cooked the food, better to consume it :)
See, the visualizations and processed data you see on the Kibana end are just for Kibana. The algorithms and processing techniques applied to the data set residing in Elasticsearch will also be applied to the incoming data.
So of course you can put/consume your data back into Elasticsearch.
It depends on what sort of requirement you are facing.
Note: data in Elasticsearch (an inverted index) does not change its structure after Kibana processing, which is why you can apply other processing techniques from Kibana over the same index, assuming the data is in its earlier state.

Related

Implementing popular keywords in ElasticSearch

I'm using ElasticSearch on AWS EC2, and I want to implement a "today's popular keywords" function in ES.
There are 3 indexes (place, genre, name), and I want to see today's popular keywords in the name index only.
I tried to use the ES slowlog and Logstash, but the slowlog saves a log per shard.
(e.g. number of shards: 5, then 5 query logs are saved)
Is there any good and easy way to implement popular keywords in ES?
As far as I know, this is not supported by Elasticsearch and you need to build your own custom solution.
The design you mentioned using the slowlog is not good: as you said, it works on a per-shard basis, and even if you do some more computing and manage to merge the shard logs and relate them to a single search at the index level, it would still not be good, because:
you have to change the slowlog configuration, and every index needs a different threshold; you can set it to 0ms to make sure you get all the search queries in the slowlogs, but that would take a huge amount of disk space and would hurt Elasticsearch performance.
You would also have to do some parsing of the slowlog in your application, and doing that at runtime would be very costly.
I think you can instead maintain a distributed cache in your application where you store the top searched keywords, like the leaderboard of a multiplayer gaming app; that leaderboard changes very frequently, but in your case you don't even have to update the cache very often. I won't go into much implementation detail, but a simple hashmap with the search term as key and its count as value would solve the issue, as sketched below.
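A minimal in-process Python sketch of that idea (a real deployment would keep the counts in a shared cache such as Redis; all names here are illustrative):

    # Count search terms and expose the top-N, leaderboard style.
    from collections import Counter

    class PopularKeywords:
        def __init__(self):
            self.counts = Counter()

        def record(self, term: str) -> None:
            # Call this from your search endpoint on every user search.
            self.counts[term.lower()] += 1

        def top(self, n: int = 10):
            # Most searched terms so far, highest count first.
            return self.counts.most_common(n)

    keywords = PopularKeywords()
    keywords.record("jazz bar")
    keywords.record("jazz bar")
    keywords.record("ramen")
    print(keywords.top(2))  # [('jazz bar', 2), ('ramen', 1)]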
Hope this helps. Let me know if you have questions.

How to use Elasticsearch to make files in a directory searchable?

I am very new to search engines and Elasticsearch, so please bear with me, and apologies if this question sounds vague. I have a large directory with lots of .csv and .hdr files, and I want to be able to search the text within these files. I've done the tutorials and read some of the documentation, but I'm still struggling to understand the concept of indexing. It seems like all the tutorials show you how to index one document at a time, but this will take a long time as I have lots of files. Is there an easier way to make Elasticsearch index all the documents in this directory and be able to search for what I want?
Elasticsearch can only search documents it has indexed. Indexed means Elasticsearch has consumed the documents one by one and stored them internally.
Normally the internal structure matters, and you should understand what you're doing to get the best performance.
So you need a way to get your files into Elasticsearch; I'm afraid there is no "one-click way" to achieve this...
You need:
1. A running cluster
2. An index designed for the documents
3. A way to get the documents from the filesystem into Elasticsearch
Your question is focused on 3).
For this, search for script examples or tools that can crawl your directory and feed the documents to Elasticsearch.
5 seconds of using Google brought me to:
https://github.com/dadoonet/fscrawler
https://gist.github.com/stevehanson/7462063
Theoretically it could be done with Logstash (https://www.elastic.co/guide/en/logstash/current/plugins-inputs-file.html), but I would give fscrawler a try.
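For a sense of what such a crawler script does, here is a minimal Python sketch of point 3 (the index name, directory, and URL are hypothetical, and it assumes the official elasticsearch client is installed):

    # Crawl a directory and bulk-index each file's text so it is searchable.
    from pathlib import Path
    from elasticsearch import Elasticsearch, helpers

    es = Elasticsearch("http://localhost:9200")

    def file_docs(directory: str):
        # One bulk action per .csv/.hdr file: the file's text goes into
        # a "content" field, its path into "path" for traceability.
        for path in Path(directory).rglob("*"):
            if path.suffix in (".csv", ".hdr"):
                yield {
                    "_index": "files",  # hypothetical index name
                    "_source": {
                        "path": str(path),
                        "content": path.read_text(errors="ignore"),
                    },
                }

    helpers.bulk(es, file_docs("/data/mydir"))  # hypothetical directory

fscrawler does essentially this, plus content extraction for many more file types.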

Can Beats update existing documents in Elasticsearch?

Consider the following use case:
I want the information from one particular log line to be indexed into Elasticsearch, as a document X.
I want the information from some log line further down the log file to be indexed into the same document X (not overriding the original, just adding more data).
The first part, I can obviously achieve with filebeat.
For the second, does anyone have any idea about how to approach it? Could I still use filebeat + some pipeline on an ingest node for example?
Clearly, I can use the ES API to update the said document, but I was looking for a solution that doesn't require changes to my application - rather, one where everything is achievable using the log files.
Thanks in advance!
No, this is not something that Beats were intended to accomplish. Enrichment like you describe is one of the things that Logstash can help with.
Logstash has an Elasticsearch input that would allow you to retrieve data from ES and use it in the pipeline for enrichment. And the Elasticsearch output supports upsert operations (update if exists, insert new if not). Using both those features you can enrich and update documents as new data comes in.
You might want to consider ingesting the log lines as-is into Elasticsearch. Then, using Logstash, build a separate index that is entity-specific and driven by the data from the logs.
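To make the upsert part concrete, here is a minimal sketch using the official Python client rather than Logstash's output config (index name, document id, and fields are hypothetical); Logstash's Elasticsearch output exposes the same update-or-insert semantics declaratively:

    # Update document X if it exists, otherwise create it (upsert).
    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")

    # First log line: creates document X with its initial fields.
    es.update(
        index="app-logs",
        id="X",
        doc={"started_at": "2021-01-01T00:00:00Z"},
        doc_as_upsert=True,  # insert if the document does not exist yet
    )

    # A later log line: merges new fields into the same document X
    # without overriding the fields already there.
    es.update(
        index="app-logs",
        id="X",
        doc={"finished_at": "2021-01-01T00:05:00Z", "status": "ok"},
        doc_as_upsert=True,
    )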

Is it possible to know when some data is available for being searched in Elasticsearch?

I'm implementing a software in which data is sent to some web server, stored in an Elasticsearch and then queried right away. I know that Elasticsearch is a NoSQL following BASE (Basically Available, soft State, eventual consistency) principles which means there's no guarantee when your data will be available for searching.
That's why, when I query for data that has just been added to Elasticsearch, I have to wait for some time before it is found. Right now all I can do is implement a polling mechanism to detect when the data is fully applied. It is worth mentioning that if I use the _id to retrieve a document, it is found right away. But if I search for it using some type of Elasticsearch query (like term or query_string), it takes a while before the document is found.
So my question is: Is there a cheaper way to detect when data is completely indexed in Elasticsearch?
This is governed by the Refresh API, and that API does not provide a way to know when indexed data becomes available. But the folks at Elastic are working on a way to let a request wait for a refresh.
I think you should take a look here: https://www.elastic.co/blog/refreshing_news
This post has a good overview of the issues and of what they are working on to improve things.
Hope it helps :D
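For reference, the option that came out of that work is refresh="wait_for" on write requests (available from Elasticsearch 5.0). A minimal Python sketch, with a hypothetical index and document:

    # Block the index call until the document is visible to search.
    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")

    es.index(
        index="events",                  # hypothetical index
        document={"message": "hello"},
        refresh="wait_for",              # wait for the next refresh
    )

    # The document is now findable by a full-text query, not just by _id.
    resp = es.search(index="events", query={"match": {"message": "hello"}})
    print(resp["hits"]["total"])

The trade-off is added write latency, so it suits read-after-write checks rather than bulk ingestion.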

Comparison of Handling Logs and PDFs in Solr & Elasticsearch and Data Visualization in Banana & Kibana

How do Elasticsearch and Solr compare in respect to the following:
Indexing logs.
Indexing events.
Indexing PDF documents.
Ease of creating and distributing visualizations. Kibana vs Banana.
Support and documentation for developers.
Any help is appreciated.
EDIT
More specifically, I am trying to figure out how exactly a PDF document or an event can be indexed at all. I have worked a little with Elasticsearch, and since I am a fan of JSON, I found it quite useful when I tried to index structured data.
For example, logs are mostly structured and thus, I guess, easier to index and search. Now what if I want to index the whole log file itself?
Follow up
Is Kibana the only visualization tool available for Elasticsearch?
Is Banana the only visualization tool available for Solr?
Here is an answer to try to address just the Elasticsearch aspect of the post.
Take a look at https://github.com/elastic/elasticsearch-mapper-attachments for handling PDFs
For events/logs, you would need to transform them into structured data to index in Elasticsearch. You can include a field for the source (the log file the data came from, and other information like that), so all the data from the whole log file is indexed in that fashion. You can then take advantage of ES aggregations to group results by log file, calculate statistics, etc., as sketched below.
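A minimal Python sketch of that kind of aggregation (index and field names are hypothetical): a terms aggregation that groups indexed log events by their source file and counts them.

    # Group log events by source file and count them per bucket.
    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")

    resp = es.search(
        index="logs",   # hypothetical index
        size=0,         # only the aggregation buckets, no hits
        aggs={
            "by_source_file": {
                "terms": {"field": "source.keyword"}  # hypothetical field
            }
        },
    )
    for bucket in resp["aggregations"]["by_source_file"]["buckets"]:
        print(bucket["key"], bucket["doc_count"])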
The ELK stack is definitely worth a look.
I don't know if Kibana is the only visualization tool, but it is probably the most popular and likely to offer more than the alternatives.
