Ideas on making a Java Application with Nutch/Elastic Search and Kibana - elasticsearch

I have an idea for an application make a search engine using tools Nutch, ES and Kibana. Nutch for crawling, ES for indexing and Kibana for the visualisation.
Currently, I have all the programs fine and I can successfully use them in terminal. My question is, is it possible to make a Java Application which incoporates Nutch, Es and Kibana all in one?
My idea for the application is that it will accept a URL for nutch to crawl, after crawling it will then accept terms to index. Finally, it will make a visualisation page with Kibana of the data.
Any pointers on how to do this?

Why do you want to have them as a single application? ES and Kibana are services and meant to run continuously. If you had StormCrawler (see comment above) that would be another continuous service. All you'd need to do is build a UI to send the URLs to a queue.

Related

How to know where the Elastic Search Hits are coming from

I have elastic search cluster.
Currently designing a python service for client for read and write query to my elastic search. The python service will not be maintained by me. Only internally python service will call our elastic search for fetching and writing
Is there any way to configure the elastic search so that we get to know that the requests are coming from python service, Or any way we can pass some extra fields while querying based on that fields we will get the logs
There is no online feature in elasticsearch to resolve your request. (you want to check the source and add fields to query).
but there is a solution for audit logs.
https://www.elastic.co/guide/en/elasticsearch/reference/current/enable-audit-logging.html
What you can do is placing a proxy in front of it and do the logging there, we have an Apache in front of our Elastic clusters to enable SSL-offloading there and add logging and ACL possibilities.

ELK with Grafana instead of Kibana for centralized log

When comes to centralized log tools, I see lot of comparison of ELK vs EFK vs Loki vs other.
But I have hard time to actually see information about "ELG", ELK (or EFK) but with Grafana instead of Kibana.
I know Grafana can use Elasticsearch as datasource, so it should be technically working. But how good is it? Any drawback compare to using Kibana? Maybe there are more existing dashboard for Kibana than Grafana when it comes to log?
I am asking this as I would like to have one UI system for both my metrics dashboard and my logs dashboard.
Kibana is part of the stack, so it is deeply integrated with elasticsearch, you have a lot of pre-built dashboards and apps inside Kibana like SIEM and Observability. If you use filebeat, metricbeat or any other beat to collect data it will have a lot of dashboards for a lot of systems, services and devices, so it is pretty easy to visualize your data without having to do a lot of work, basically you just need to follow the documentation.
But if you have some data that doesn't fit with one of pre-built dashboards, or want more flexibility and creat your own dashboards, Kibana needs more work than Grafana, and Kibana also only works with elasticsearch, so if you have other datasources you would need to put the data in elasticsearch. Also, if you want to have map visualizations, Kibana Map app is pretty good.
The Grafana plugin for Elasticsearch has some small bugs, but in overall it works fine, things probably will change for better since Elastic and Grafana made a partnership to improve the plugin.
So, if all your data is in elasticsearch, use Kibana, if you have different datasources, use grafana.

Is there application client for ElasticSeach 6.4.3 (similar to DBvear)

I tried to see my node data from application client (like DBvear), but I didn't found information about that. someone found way to connect DBvear to this version or to see the data by similar application?
I believe what you are looking for is GUI for Elasticsearch.
Typically the industry calls the elasticsearch stack as ELK stack and I believe what you are looking for is the K part of it which is Kibana.
I'm not sure if you are asking for SQL feature but if you are thinking to make use of the SQL feature you can check the Elasticsearch SQL plugin.
Other widely used client application for elasticsearch is Grafana. There are others available too(I think Splunk, Graylog, Loggly) but I believe Kibana and Grafana are the best bet.
Hope this helps!
Actually no, I using elastic search as a Database in different deployments and I don't want to maintenance Kibana instance (i prefer to see all the data in tool like DBvear)

how to implement elasticsearch

can kibana's console (in Dev Tools) be used for writing and implementing elasticsearch ? I am new to elasticsearch and very confused when it comes to doing hands-on it. thank you in advance.
kibana Dev tools makes calling elastic search API's easier so you can develop what ever you want in kibana Dev tools to make aggregation call or make query string to call the API's.
on the other hand you should use it with an SDK in your application like Elasticsearch JS for javascript so you can use the developed queries and aggregations in kibana to be used in your application and more you can monitor your shards health or put mapping for your indexes and more of functionality which can be found in Documentation, Although, you can find JS API's Documentation here
You can use Kibana Dev Tools to invoke REST API commands to perform cluster level actions such as taking snapshots, restore etc and also index simple documents. But, if you are looking to writing data to Elastic on a regular basis like ingesting server/ app logs or server metrics (CPU, memory, Disk usage etc) you should look at installing filebeats or metricbeats.

How to integrate AEM with ElasticSearch?

I have been through all the sites currently available to refer AEM & ElasticSearch, but could not find anything exact which is related to integration of these both.
Requirement : To create site search functionality for publish which will bring out all the results which are related to particular keyword. Currently we are using default AEM site search functionality, which very slow and thus we want to migrate it to ES. There are very less documents available on integration of these both, so we are troubling with it. Mainly we have to do this In Java.
That's because you are question is very vague. You have not specified what is it that you are trying to achieve. Do you want you the search results on the AEM publish side to be served by Elastic Search or do you want all your content(even in AEM author to be indexed?). There are multiple patterns hence it is not possible to provide a general answer. There are multiple ways you can integrate.
1) write custom replication agents in AEM to push content to ES.
2) create a workflow which can be triggered with launchers whenever node is added/modified. I would suggest you to refrain from this and consider option 1 instead as this will trigger too many workflow instances and will impact overall performance.
3) You can write crawlers to crawl your aem publish & index the content in ES.
4) you can write code which runs in ES(river in ES terminology) to fetch the content from AEM & index it.
Here is complete implementation of Apache Solr, Elasticsearch and Apache Lucene with AEM 6.5 - https://github.com/tadijam64/search-engines-comparison
There is detailed explanation of how every search engine works, and how it is integrated with AEM - step by step explained in six write-ups here
Its an old repo but may help you with the integration..
https://github.com/viveksachdeva/elasticsearch-cq
I know, this is an old question but I had the same problem and came up with a new implementation you can find on github:
https://github.com/deveth0/elasticsearch-aem
The usage is quite easy, you have to include several bundles and then configure, which Elasticsearch Instance to use.
Upon Page-Activation AEM triggers a Replication Agent that pushes the data to Elasticsearch.
For more detailed information, have a look at my blog

Resources