Let's say we have 1 million documents indexed in Elasticsearch with fields such as name, id, skills, etc., in the form of a resume.
If I search for kartheek in the search box, it retrieves some results. I find kartheek's resume, click it, and view it.
Once I am viewing the resume, I need to see similar profiles based on the viewed resume.
Is this possible in Elasticsearch?
1. I learned about the More Like This query, but it needs an input parameter, i.e. 'like'.
2. I have seen this link:
http://www.datasciencecentral.com/profiles/blogs/document-similarity-analysis-using-elasticsearch-and-python
Can anyone please give me ideas regarding the above request?
Thanks
Kartheek Gummaluri
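On the 'like' parameter from point 1: as far as I know, the More Like This query also accepts a reference to an already indexed document instead of free text, so the viewed resume itself can act as the input. Below is a minimal sketch in Python that only builds the request body. The index name resumes, the document id 42, and the field names skills and name are hypothetical, and the two frequency thresholds are illustrative values, not recommendations. Older Elasticsearch versions may also expect a _type entry in the document reference.

```python
import json

def similar_profiles_query(index, doc_id, fields, size=10):
    """Build a More Like This search body that uses an indexed document as input."""
    return {
        "size": size,
        "query": {
            "more_like_this": {
                "fields": fields,
                # Reference the viewed resume by index and id instead of passing text.
                "like": [{"_index": index, "_id": doc_id}],
                # Illustrative thresholds; tune them for your own data.
                "min_term_freq": 1,
                "min_doc_freq": 1,
            }
        },
    }

# Hypothetical index, id, and field names.
body = similar_profiles_query("resumes", "42", ["skills", "name"])
print(json.dumps(body, indent=2))
```

The printed JSON can be sent to the _search endpoint of the index, for example from Kibana Dev Tools.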
To go from 1.x to 2.x you need to do what is known as a full cluster restart upgrade. You won't lose data in this case. The relevant steps are laid out well in the docs:
Elastic Search Upgrade Docs
The Open Distro alert history is stored in the .opendistro-alerting-alert-history-<date> index.
Is it possible to get the alert query data/execution result from a past COMPLETED alert?
I'm not able to find the data.
Thanks in advance
Yes, all COMPLETED alerts are saved in the .opendistro-alerting-alert-history-<date> index, as you mentioned (reference).
Try adding an index pattern, and make sure to include system indices: .opendistro-alerting-alert-history-*
You can always list these indices in Elasticsearch with this query: http://my-awseome-es:9200/_cat/indices?expand_wildcards=open,hidden
Because the index name starts with a . (period), it is hidden, just like hidden files and folders on UNIX.
I have made a Grafana dashboard to visualize the alert data: 12875
Please have a look; it can serve as a reference for Kibana.
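If you would rather query the history indices directly than go through an index pattern, something like the following may work. It is only a sketch that builds the request path and body in Python. The index pattern comes from the answer above, but the field names state and end_time are my assumption about how alert documents are laid out, so check the index mapping first.

```python
import json

# Index pattern taken from the answer above.
ALERT_HISTORY_PATTERN = ".opendistro-alerting-alert-history-*"

def completed_alerts_request(size=20):
    """Return (path, body) for a search over the alert history indices."""
    path = f"/{ALERT_HISTORY_PATTERN}/_search"
    body = {
        "size": size,
        # Assumption: alert documents carry `state` and `end_time` fields.
        "query": {"bool": {"filter": [{"term": {"state": "COMPLETED"}}]}},
        "sort": [{"end_time": {"order": "desc"}}],
    }
    return path, body

path, body = completed_alerts_request(size=5)
print("GET", path)
print(json.dumps(body, indent=2))
```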
Can I put the response that I get from a query in Kibana Dev Tools directly into Elasticsearch?
Or must I write a script to achieve it?
Any recommendations?
OK, so here is one basic understanding after discussion.
Please read carefully.
If you have the head plugin installed for ES, search for the .kibana index.
Open the .kibana index and you will see all the designed dashboards listed there with their processed info.
Think of ES as just another store from which you can read data and then put that data into another ES index.
Refer to this link:
https://www.elastic.co/blog/kibana-under-the-hood-object-persistence
One tool you can opt for is Logstash, for both reading and writing.
Learning Grok patterns can give you a good lead on that.
Tell me if you need some real screenshots for the same problem.
Happy learning.
It is like cooking in the kitchen and then asking to put the cooked food back in the kitchen. If you have cooked the food, better to consume it :)
The visualization or processed data you see on the Kibana end is just for Kibana. The algorithms or processing techniques applied to the data set residing in Elasticsearch will also be applied to the incoming data.
So of course you can put your data back into Elasticsearch again.
It depends on what sort of requirement you are facing.
Note: Data in Elasticsearch (the inverted index) does not change its structure after Kibana processing, which is why you can apply other processing techniques from Kibana over the same index, assuming the data is still in its earlier state.
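If you do end up writing a small script, one possible shape is to turn the hits of a search response into a Bulk API payload for the target index. The sketch below is Python and only builds the newline-delimited body. The sample response and the index name copied-logs are made up for illustration. If all you need is to copy documents matching a query into another index, the _reindex API may already be enough.

```python
import json

def hits_to_bulk(response, target_index):
    """Convert the hits of a search response into a Bulk API (NDJSON) payload."""
    lines = []
    for hit in response["hits"]["hits"]:
        # Action line: index the document into the target index, keeping its id.
        lines.append(json.dumps({"index": {"_index": target_index, "_id": hit["_id"]}}))
        # Source line: the document itself.
        lines.append(json.dumps(hit["_source"]))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n" if lines else ""

# Made-up search response, shaped like the output you see in Dev Tools.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "a1", "_source": {"message": "first"}},
            {"_id": "a2", "_source": {"message": "second"}},
        ]
    }
}

payload = hits_to_bulk(sample_response, "copied-logs")
print(payload, end="")
```

The resulting text is what you would send in a POST to the _bulk endpoint.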
I have a problem with Kibana: dashboards and visualizations don't show any results!
As you can see in this screenshot, in the Discover tab I get some results, which means data exists in my index pattern "as-*", right? But I used a trick in order to display this data:
1) I changed the range to "Today" ==> it shows "No results found"!
2) I clicked the "New" button ==> then I get my data displayed!
Is there another, more proper way to get the data displayed?
Then in my dashboard (or visualization) I can't get any results, even if the range is the same as in the Discover tab!
I restarted Kibana ==> no changes!
I deleted as-* and then recreated it ==> no changes!
I'm using Curator to create daily indices and Logstash to index the data into ES!
I'm stuck here! I'll be glad if you can help me figure it out!
Thank you very much!
I am fairly new to elasticsearch and Kibana, but here are three mistakes that I made in the past:
Are you using the correct index? Make sure that the index you have chosen for the logs that are displayed on the visualise page and for the actual visualisations is the same.
Correct time period: does the time period you have chosen contain the data you are looking for? Or did you happen to have zero logs during that time?
Correct filters and aggregations: when you were making the visualisations, did they show any results? Or were they empty from the beginning? Maybe one of your filters or aggregations is wrong and it's excluding the results you're expecting to see.
Not sure if this is any help; I hope you've solved the problem by now :)
If you can see information in "Discover", it means that Kibana has connected to the database and that the database has information. You shouldn't have to click the "New" button to see information in the Discover view. I believe the "New" button on the "Discover" page is used to create a new search.
Maybe try zooming in on the time period of the data on the "Discover" page, or try checking the system logs to see if Logstash is successfully pushing information to Elasticsearch.
With Kibana, dashboards are made up of visualisations, and visualisations are made up of searches.
The "No results found" message on the dashboard page, shown in your second screenshot, is due to the visualisation having no results. I guess you imported the visualisations into Kibana.
I hope that helps.
I'm new to Kibana and Elasticsearch and I have run into this problem:
My ES contains (among other things) data holding the current value of a custom performance counter, and I would like my dashboard to show this value, e.g. as a big number. I therefore tried to use the Metric visualization, but I have no idea how to show only the last value. Any help would be highly appreciated. Thanks.
We had a similar issue for our use case. We found two ways to handle it:
If the data is generated periodically, you can use Kibana's time filter to show only the most recent n days and so see the latest data.
In our case, the above option was not possible, so we went with a hack: our documents have a property called "IsLatest", and we apply the filter "IsLatest":true in all charts where we need the latest info. The code that feeds data to Elasticsearch is written so that it updates the older data and sets its "IsLatest" to false.
Hope it helps.
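To make the "IsLatest" hack a bit more concrete, here is a rough sketch of the two requests involved, written in Python as plain request builders. This is not our actual feeder code. The index name perf and the field name counter are hypothetical, and using _update_by_query with a Painless script is just one possible way to flip the flag on the older documents before indexing a new one.

```python
import json

def latest_only_query(extra_filters=None):
    """Search body that keeps only documents flagged as the latest value."""
    filters = [{"term": {"IsLatest": True}}]
    filters.extend(extra_filters or [])
    return {"query": {"bool": {"filter": filters}}}

def demote_previous_request(index, counter_name):
    """Return (path, body) that sets IsLatest to false on the current latest docs."""
    path = f"/{index}/_update_by_query"
    body = {
        "query": {"bool": {"filter": [
            # Hypothetical field identifying which counter a document belongs to.
            {"term": {"counter": counter_name}},
            {"term": {"IsLatest": True}},
        ]}},
        "script": {"lang": "painless", "source": "ctx._source.IsLatest = false"},
    }
    return path, body

path, body = demote_previous_request("perf", "queue_length")
print("POST", path)
print(json.dumps(body))
print(json.dumps(latest_only_query()))
```

The feeder would send the demote request first and then index the new document with "IsLatest": true.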
Specifically, I'm using Elasticsearch to do pagination, but this question could apply to any database.
Elasticsearch provides methods to paginate search results with the handy from and size parameters.
So I run a query:
get me the most recent data from result 1 to 10
This works great.
The user clicks "next page" and the query is:
get me the most recent data from result 11 to 20
The problem is that in the time between the two queries, 2 new records have been added to the backing database, which means the paginated results will overlap (the last 2 from the first page show up as first two on the second page).
What's the best solution to avoid this? Right now, I'm adding a filter to the query that tells it to only include results that come after the last result of the previous query. But it just seems hackish.
A filter is not a bad option, if you're already indexing a relevant timestamp. You have to track that timestamp on the client side in order to correctly prepare your queries. You also have to know when to get rid of it. But those aren't insurmountable problems.
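As a rough illustration of the filter approach, the sketch below (Python, request body only) pins every page to the time of the first query so that newer documents cannot shift the pages. The field name timestamp and the example value are hypothetical.

```python
import json

def page_body(snapshot_ts, page, page_size=10, ts_field="timestamp"):
    """Search body for one page, ignoring documents newer than snapshot_ts."""
    return {
        "from": (page - 1) * page_size,
        "size": page_size,
        "sort": [{ts_field: {"order": "desc"}}],
        "query": {
            "bool": {
                # Upper bound recorded by the client when page 1 was requested.
                "filter": [{"range": {ts_field: {"lte": snapshot_ts}}}]
            }
        },
    }

# The client remembers this value for the whole paging session.
snapshot = "2015-06-01T12:00:00Z"
first_page = page_body(snapshot, page=1)
second_page = page_body(snapshot, page=2)
print(json.dumps(second_page, indent=2))
```

Dropping the remembered timestamp (for example when the user hits refresh) starts a fresh session that includes the new records.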
The Scroll API is a solid option for this, because it effectively snapshots in time on the Elasticsearch side. The intent of the Scroll API is to provide a stable search query for deep pagination, which has to deal with the exact issue of change that you're experiencing.
You begin a scrolling search by supplying your query and the scroll parameter, for which Elasticsearch returns a scroll_id. You then make requests to /_search/scroll supplying that ID, each of which returns a page of results and a new scroll_id for the next request.
(Note that you don't want the scan search type here. That's used to extract documents en masse, and does not apply any sorting.)
Compared to filtering, you do still have to track a value: the scroll_id for your next page of results. Whether that's easier than tracking a timestamp depends on your app.
There are other potential downsides to consider. Elasticsearch keeps the context for your search open inside the cluster, on the nodes holding the shards involved, for as long as the scroll is alive. Conceivably these contexts could accumulate in your cluster, depending on how heavily you rely on scrolling search. You'll want to test the performance implications there. And if I recall correctly, scrolling searches also do not persist through a node failure or restart.
The ES documentation for the Scroll API provides good details on all of the above.
Bottom line: filtering by timestamp is actually not a bad choice. The Scroll API is another valid option, designed for a similar use case, but is not without its drawbacks.
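For reference, the scrolling flow described above could look roughly like this in Python. The send callable is a hypothetical stand-in for whatever HTTP client you use, and fake_send only simulates three responses so the loop can be run without a cluster. The index name events is made up, and the exact request format for /_search/scroll has varied between Elasticsearch versions, so treat the body shape as an assumption.

```python
def scroll_pages(send, index, body, keep_alive="1m"):
    """Yield pages of hits, following scroll IDs until an empty page arrives."""
    resp = send("POST", f"/{index}/_search?scroll={keep_alive}", body)
    while resp["hits"]["hits"]:
        yield resp["hits"]["hits"]
        # Each response carries the scroll ID to use for the next request.
        resp = send("POST", "/_search/scroll",
                    {"scroll": keep_alive, "scroll_id": resp["_scroll_id"]})

def fake_send(method, path, body):
    """Simulated transport: two pages of hits, then an empty page."""
    pages = {
        "first": ([{"_id": "1"}, {"_id": "2"}], "s1"),
        "s1": ([{"_id": "3"}], "s2"),
        "s2": ([], "s3"),
    }
    hits, next_id = pages[body.get("scroll_id", "first")]
    return {"_scroll_id": next_id, "hits": {"hits": hits}}

ids = [[hit["_id"] for hit in page]
       for page in scroll_pages(fake_send, "events", {"query": {"match_all": {}}})]
print(ids)
```

In a web app you would store the latest scroll ID per user session instead of looping, and fetch one page per "next page" click.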
I realise this is a bit old, but Elasticsearch (6.3 at the time of writing) has the search_after parameter for the request body, which allows cursor-style paging:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-search-after.html
It is very similar to the scroll API, but unlike scroll, the search_after parameter is stateless: it is always resolved against the latest version of the searcher.
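A small sketch of how the paging could work, in Python and building request bodies only: each hit in a sorted search response carries a sort array, and the values from the last hit of one page become search_after for the next page. The field names timestamp and id are hypothetical; the second one is assumed to be a unique field used as a tiebreaker so the sort order is unambiguous.

```python
import json

# Hypothetical sort: newest first, with a unique field as tiebreaker.
SORT = [{"timestamp": "desc"}, {"id": "asc"}]

def search_after_body(last_hit=None, size=10):
    """Search body for the first page, or for the page following last_hit."""
    body = {"size": size, "sort": SORT, "query": {"match_all": {}}}
    if last_hit is not None:
        # Sort values of the last hit on the previous page act as the cursor.
        body["search_after"] = last_hit["sort"]
    return body

first = search_after_body()
# Made-up last hit from the first page of results.
last_hit = {"_id": "doc-10", "sort": [1528371600000, "doc-10"]}
second = search_after_body(last_hit)
print(json.dumps(second, indent=2))
```

Since the cursor is just the last hit's sort values, records added at the top between requests do not cause the overlap described in the question.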
You need to use the scan API for this. The scan and scroll APIs let you do point-in-time search and pagination.
Scan API -