I have data loaded in Elasticsearch.
How can I get Elasticsearch data into H2O?
There is no direct way or API available to load data into H2O from Elasticsearch. H2O supports files and JDBC, so you can write the data out of ES into a CSV file, then import that file into H2O using POST /3/ImportFiles. You can refer to my related answer at how to create an h2oframe.
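The CSV route can be sketched in Python; the hit shape, field names, and file name here are made-up examples, and the H2O import itself is left as a comment since it needs a running H2O cluster:

```python
import csv

def hits_to_csv(hits, csv_path):
    """Flatten Elasticsearch hits (their _source dicts) into a CSV file."""
    sources = [h["_source"] for h in hits]
    fieldnames = sorted({k for s in sources for k in s})
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(sources)

# Hits shaped like the "hits" array of an Elasticsearch search response:
hits = [
    {"_id": "1", "_source": {"name": "alice", "age": 30}},
    {"_id": "2", "_source": {"name": "bob", "age": 25}},
]
hits_to_csv(hits, "es_export.csv")

# With the CSV on disk, H2O can import it (requires a running H2O cluster):
# import h2o
# h2o.init()
# frame = h2o.import_file("es_export.csv")
```

In practice you would page through the index with the scroll API rather than a single search, but the flattening step stays the same.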
The latest version of Elasticsearch comes with an SQL interface that can be connected to via JDBC or ODBC. I haven't attempted to use this with H2O, but in theory...
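In that untested spirit, a sketch of what the JDBC route might look like from Python (the URL format follows the Elasticsearch SQL JDBC driver; the index name and the use of `h2o.import_sql_select` are assumptions):

```python
def es_jdbc_url(host, port=9200):
    """Build a JDBC URL in the format used by the Elasticsearch SQL JDBC driver."""
    return f"jdbc:es://{host}:{port}"

url = es_jdbc_url("localhost")
print(url)  # jdbc:es://localhost:9200

# In theory H2O could then pull rows through that driver (untested; needs a
# running H2O cluster with the ES JDBC driver jar on its classpath):
# import h2o
# h2o.init()
# frame = h2o.import_sql_select(url, "SELECT * FROM my_index",
#                               username="", password="")
```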
Can you please suggest which Logstash plugin can be used for pulling data from Cosmos DB into Elasticsearch?
If there is no such plugin, is there any other way to do the same?
Based on the Logstash plugins for Microsoft Azure Services and this thread, it seems that a Cosmos DB input plugin is not supported so far.
From what I can find, you could use an ADF copy activity to transfer your Cosmos DB data into one of the supported input sources, then complete the subsequent work.
For example, use ADF to transfer the Cosmos DB data into SQL DB, then follow this link to integrate it with your Elasticsearch service.
I have managed to install and try Elasticsearch.
I thought I needed to install a NoSQL server like MongoDB, but Elasticsearch seems to embed its own storage or database system.
So I think Elasticsearch is not just a search tool.
It also provides storage and database functions. Is this correct?
Thanks
I am a learner of Apache NiFi and am currently exploring how to import MySQL data to HDFS using Apache NiFi.
Please guide me on creating the flow by providing a doc describing the end-to-end flow.
I have searched many sites; it is not available.
To import MySQL data, you would create a DBCPConnectionPool controller service, pointing at your MySQL instance, driver, etc. Then you can use any of the following processors to get data from your database (please see the documentation for usage of each):
ExecuteSQL
QueryDatabaseTable
GenerateTableFetch
Once the data is fetched from the database, it is usually in Avro format. If you want it in another format, you will need to use some conversion processor(s) such as ConvertAvroToJSON. When the content of the flow file(s) is the way you want it, you can use PutHDFS to place the files into HDFS.
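What the conversion step produces can be pictured outside NiFi; a minimal sketch with made-up records, approximating the JSON that ConvertAvroToJSON would emit from ExecuteSQL's Avro output:

```python
import json

# Records as dicts, standing in for the Avro-decoded rows ExecuteSQL fetched:
rows = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
]

def records_to_json(records):
    """Roughly what ConvertAvroToJSON emits: a JSON array of the records."""
    return json.dumps(records)

# This string is what the flow file content would look like before PutHDFS
# writes it out:
flowfile_content = records_to_json(rows)
print(flowfile_content)
```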
I'm looking for the best method to export data from Elasticsearch.
Is there something better than running a query with from/size until all the data is exported?
Specifically, I want to copy parts of it to Neo4j, if there is any plugin for that...
I'm not aware of any plugin that can do what you want.
But you can write one. I recommend using Jest, because the default Elasticsearch Java client uses a different Lucene version than Neo4j, and those versions are incompatible.
The second option is to export the data from Elasticsearch to CSV and then use LOAD CSV in Neo4j. This approach is good enough if importing the data is a one-time operation.
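A minimal sketch of that second option, assuming made-up document fields, a people.csv file name, and a Person label:

```python
import csv

# Documents as they might come back from an Elasticsearch query:
docs = [
    {"id": "p1", "name": "alice"},
    {"id": "p2", "name": "bob"},
]

# Write them to a CSV file that Neo4j can consume:
with open("people.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "name"])
    writer.writeheader()
    writer.writerows(docs)

# The matching one-time Neo4j import (run in the Cypher shell or browser,
# with people.csv placed in Neo4j's import directory):
cypher = """
LOAD CSV WITH HEADERS FROM 'file:///people.csv' AS row
CREATE (:Person {id: row.id, name: row.name})
"""
```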
I am using Sqoop with the Couchbase Hadoop connector to import some data from Couchbase to HDFS.
As stated in
http://docs.couchbase.com/hadoop-plugin-1.1/#limitations
querying is not supported for Couchbase.
I want a solution to run a query using the Hadoop connector.
For example, I have two documents in the db as follows:
{'doctype':'a'}
and
{'doctype':'b'}
I need to get only the docs where doctype = 'a'.
Is there a way to do this?
If you want to select data from Couchbase, you don't need the Hadoop connector; you can just use a Couchbase view that filters on doc.doctype == 'a'.
See the Couchbase views documentation.
On the other hand, I recommend using the new N1QL query functionality from Couchbase. It is quite a flexible query language (similar to SQL); see the online N1QL tutorial.
Note: N1QL requires Couchbase v2.2 or higher to run (see N1QL Compatibility). You will need to deploy a Couchbase N1QL query server and point it at your existing CB v2.2 cluster; see Couchbase N1QL queries on server.
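For the example documents in the question, the N1QL statement might look like this (the bucket name `default` is an assumption); the list comprehension just mirrors the same selection locally:

```python
# The N1QL statement for the question's example (bucket name is an assumption;
# backticks are needed because "default" is a reserved word in N1QL):
query = "SELECT * FROM `default` WHERE doctype = 'a'"

# The same selection expressed locally over the two example documents:
docs = [{"doctype": "a"}, {"doctype": "b"}]
matches = [d for d in docs if d.get("doctype") == "a"]
print(matches)  # only the doctype 'a' document
```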
Suggesting another alternative to Sqoop for the above requirement, called Couchdoop.
Couchdoop uses views to fetch data from Couchbase, so we can write a view as per our need and use Couchdoop to hit the view and fetch the data.
https://github.com/Avira/couchdoop
Worked for me.