Where does Elasticsearch store data?

I have managed to install and try Elasticsearch.
I thought I needed to install a NoSQL server like MongoDB, but Elasticsearch seems to embed its own storage or database system.
So I think Elasticsearch is not just a search tool.
It also provides storage and database functions. Is this correct?
Thanks
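To illustrate the storage point, here is a minimal sketch with the Python elasticsearch client (the "notes" index and the document are hypothetical): a document indexed into Elasticsearch is persisted by Elasticsearch itself, on disk under path.data, and can be read back by id with no external database involved.

```python
# Minimal sketch: Elasticsearch both indexes and stores the document.
# "notes" is a hypothetical index; assumes a local node on :9200.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Store a document, then read it back by id.
es.index(index="notes", id="1", document={"text": "hello"})  # use body= on 7.x clients
doc = es.get(index="notes", id="1")
print(doc["_source"])  # -> {'text': 'hello'}
```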

Related

Is there an application client for Elasticsearch 6.4.3 (similar to DBeaver)?

I tried to view my node data from an application client (like DBeaver), but I didn't find information about that. Has anyone found a way to connect DBeaver to this version, or to see the data with a similar application?
I believe what you are looking for is a GUI for Elasticsearch.
The industry typically refers to the Elasticsearch stack as the ELK stack, and I believe what you are looking for is the K part of it, which is Kibana.
I'm not sure if you are asking about SQL support, but if you want to make use of it you can check the Elasticsearch SQL plugin.
Another widely used client application for Elasticsearch is Grafana. There are others available too (I think Splunk, Graylog, Loggly), but I believe Kibana and Grafana are your best bet.
Hope this helps!
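If the goal is only ad-hoc, SQL-style browsing without standing up Kibana, here is a minimal sketch of the SQL plugin route mentioned above (assuming X-Pack SQL is enabled on the cluster; "my_index" is a placeholder):

```python
# Minimal sketch against the SQL endpoint. On 6.x (as in the question)
# the path is /_xpack/sql; on 7.x+ it is /_sql.
import requests

resp = requests.post(
    "http://localhost:9200/_xpack/sql",
    params={"format": "txt"},
    json={"query": "SELECT * FROM my_index LIMIT 10"},
)
print(resp.text)  # tabular output, similar to a SQL console
```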
Actually no. I am using Elasticsearch as a database in different deployments, and I don't want to maintain a Kibana instance (I prefer to see all the data in a tool like DBeaver).

How to build a relational graph using Elasticsearch data

We are building a log analytics application using Graylog and Elasticsearch. I have Elasticsearch installed, but I want to take the data from Elasticsearch and create relational graphs with it on my own instead of using X-Pack Graph.
I could have used the X-Pack Graph API and made HTTP calls to get the data, but it is not freeware and I'm not sure we will be able to buy a licence.
Is there any other alternative to the X-Pack Graph API that is free?
Or can I query Elasticsearch directly using aggregations? If so, how feasible is that? Can you share some resources on this?
Kindly share your thoughts on this.
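As a sketch of the aggregation route asked about above (index and field names are hypothetical), a terms aggregation with a nested significant_terms aggregation yields vertex/edge candidates much like X-Pack Graph does, using only free functionality:

```python
# Minimal sketch: approximate a relational graph with aggregations.
# "logs", "host.keyword" and "service.keyword" are placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

query = {
    "size": 0,
    "aggs": {
        "hosts": {
            "terms": {"field": "host.keyword", "size": 10},
            "aggs": {
                "related_services": {
                    "significant_terms": {"field": "service.keyword"}
                }
            },
        }
    },
}

resp = es.search(index="logs", body=query)  # on 8.x clients, pass size=/aggs= directly

# Treat each (host, significant service) pair as an edge candidate.
for host in resp["aggregations"]["hosts"]["buckets"]:
    for svc in host["related_services"]["buckets"]:
        print(host["key"], "->", svc["key"], svc["score"])
```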

Elasticsearch with Google BigQuery

I have event logs loaded into Elasticsearch and I visualise them using Kibana. My event logs are actually stored in a Google BigQuery table. Currently I dump the JSON files to a Google Cloud Storage bucket and download them to a local drive. Then, using Logstash, I move the JSON files from the local drive into Elasticsearch.
Now I am trying to automate the process by establishing a connection between BigQuery and Elasticsearch. From what I have read, I understand that there is an output connector which sends data from Elasticsearch to BigQuery, but not vice versa. I am wondering whether I should upload the JSON files to a Kubernetes cluster and then establish the connection between the cluster and Elasticsearch.
Any help in this regard would be appreciated.
Although this solution may be a little complex, I suggest using the Google Cloud Storage connector with ES-Hadoop. Both are very mature and used in production by many large companies.
Logstash over a lot of pods on Kubernetes will be very expensive and, I think, not a very resilient or scalable approach.
Apache Beam has connectors for BigQuery and Elasticsearch, so I would definitely do this with Dataflow; that way you don't need to implement a complex ETL and staging storage. You can read the data from BigQuery using BigQueryIO.read().from(...) (take a look at BigQueryIO Read vs fromQuery if performance is important) and load it into Elasticsearch using ElasticsearchIO.write().
See this example of reading data from BigQuery with Dataflow:
https://github.com/GoogleCloudPlatform/professional-services/blob/master/examples/dataflow-bigquery-transpose/src/main/java/com/google/cloud/pso/pipeline/Pivot.java
And this example of Elasticsearch indexing:
https://github.com/GoogleCloudPlatform/professional-services/tree/master/examples/dataflow-elasticsearch-indexer
UPDATE 2019-06-24
The BigQuery Storage API was released earlier this year; it improves the parallelism of extracting data from BigQuery and is natively supported by Dataflow. Refer to https://beam.apache.org/documentation/io/built-in/google-bigquery/#storage-api for more details.
From the documentation:
The BigQuery Storage API allows you to directly access tables in BigQuery storage. As a result, your pipeline can read from BigQuery storage faster than previously possible.
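Beam's built-in ElasticsearchIO is Java-only; if you would rather stay in Python, a rough equivalent (project, table, host and index names are placeholders) reads the rows via the Storage API and indexes them from a DoFn:

```python
# Minimal sketch: BigQuery -> Elasticsearch on Beam/Dataflow in Python.
# Table, host and index names are placeholders.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


class IndexToES(beam.DoFn):
    def setup(self):
        from elasticsearch import Elasticsearch
        self.es = Elasticsearch("http://localhost:9200")  # placeholder host

    def process(self, row):
        # row is a dict produced by ReadFromBigQuery.
        self.es.index(index="event-logs", document=row)   # use body= on 7.x clients
        yield row


with beam.Pipeline(options=PipelineOptions()) as pipeline:
    (
        pipeline
        | "ReadBQ" >> beam.io.ReadFromBigQuery(
            table="my-project:my_dataset.event_logs",
            method=beam.io.ReadFromBigQuery.Method.DIRECT_READ,  # Storage API
        )
        | "WriteES" >> beam.ParDo(IndexToES())
    )
```

For real volumes you would batch the writes (e.g. with elasticsearch.helpers.bulk) rather than index row by row.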
I have recently worked on a similar pipeline. The workflow I would suggest either uses the mentioned Google Cloud Storage connector or another method to read your JSON files into a Spark job. You should be able to quickly and easily transform your data, and then use the elasticsearch-spark plugin to load it into your Elasticsearch cluster.
You can use Google Cloud Dataproc or Cloud Dataflow to run and schedule your job.
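A PySpark sketch of that workflow might look like this (bucket path, host and index are placeholders; it assumes the elasticsearch-hadoop jar is on the classpath, e.g. via --packages, and the GCS connector, which Dataproc preinstalls):

```python
# Minimal sketch: GCS JSON -> transform -> Elasticsearch via elasticsearch-spark.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("bq-json-to-es").getOrCreate()

# Read the exported JSON files from the bucket (placeholder path).
df = spark.read.json("gs://my-bucket/event-logs/*.json")

# Any transforms go here, then bulk-write to the cluster.
(df.write
   .format("org.elasticsearch.spark.sql")
   .mode("append")
   .option("es.nodes", "10.0.0.5")        # placeholder Elasticsearch host
   .option("es.resource", "event-logs")   # placeholder index
   .save())
```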
As of 2021, there is a Dataflow template that allows a "GCP native" connection between BigQuery and Elasticsearch.
More information is available in a blog post by elastic.co, and there is further documentation and a step-by-step guide from Google.

Install Grafana without Elasticsearch

I'm trying to install Grafana to work with an OpenTSDB data source. What should I do to install it without Elasticsearch?
I'm using Grafana with InfluxDB and I'm not using Elasticsearch.
Grafana 2 is out in beta and I've been using it in production for a while. Grafana 2 now has its own data store, which uses either MySQL or SQLite, but you can still use Elasticsearch as well. You can read more about it here.
Update: The stable version of Grafana 2 is now out, and it just works.
Grafana is a frontend; you will need some kind of database to store values and configuration. I just grabbed the .tar.gz file from Grafana's downloads page, created a config.js, and pointed it at my InfluxDB server. No Elasticsearch here, either.
You might want to take a look at gofana, which lets you run Grafana without Elasticsearch. It's a self-contained binary that stores dashboards on the filesystem rather than in Elasticsearch or InfluxDB. It also supports HTTPS and basic authentication.
Note: I'm the author of gofana.

Run a query on Couchbase data imported using Sqoop and the Hadoop connector

I am using Sqoop with the Couchbase Hadoop connector to import some data from Couchbase to HDFS.
As stated in http://docs.couchbase.com/hadoop-plugin-1.1/#limitations, querying is not supported for Couchbase.
I want a solution to run queries using the Hadoop connector.
For example, I have two documents in the database as follows:
{'doctype':'a'}
and
{'doctype':'b'}
I need to get only the docs where doctype=a.
Is there a way to do this?
If you want to select data from Couchbase, you don't need the Hadoop connector; you can just use a Couchbase view that filters on doc.doctype == 'a'.
See the Couchbase views documentation.
On the other hand, I recommend the new N1QL query functionality from Couchbase. It is quite a flexible query language (similar to SQL); see the online N1QL tutorial.
Note: N1QL requires Couchbase v2.2 or higher (see N1QL Compatibility). You will need to deploy a Couchbase N1QL query server and point it at your existing CB v2.2 cluster; see Couchbase N1QL queries on server.
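For completeness, here is a minimal N1QL sketch with the Python Couchbase SDK (4.x API against a current cluster, so newer than the CB 2.2 setup described above; bucket name and credentials are placeholders):

```python
# Minimal sketch: filter documents by doctype with N1QL.
# Bucket name and credentials are placeholders; a primary index
# (or an index on doctype) must exist on the bucket.
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions, QueryOptions

cluster = Cluster(
    "couchbase://localhost",
    ClusterOptions(PasswordAuthenticator("user", "password")),
)

result = cluster.query(
    "SELECT d.* FROM `default` AS d WHERE d.doctype = $type",
    QueryOptions(named_parameters={"type": "a"}),
)
for row in result:
    print(row)
```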
Another alternative to Sqoop for the above requirement is Couchdoop.
Couchdoop uses views to fetch data from Couchbase, so you can write a view that matches your needs and use Couchdoop to hit it and fetch the data.
https://github.com/Avira/couchdoop
Worked for me.
