Options for automating data offloading from Teradata to Hadoop - hadoop

I am aware that Sqoop can be used to pull data from sources such as Teradata. But could you suggest how to load data from Teradata into Hadoop with fewer man-hours when there are hundreds of tables?
Thanks in advance... :)
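
One low-effort pattern, sketched below under assumptions (the Teradata JDBC driver jar is on Sqoop's classpath; the host, credentials, paths and tables.txt file are all placeholders), is to drive sqoop import from a plain list of table names, so adding a table costs one line instead of one hand-built job:

    #!/usr/bin/env bash
    # Hypothetical sketch: loop Sqoop over a list of Teradata tables.
    # td-host, etl_user and all paths are placeholder assumptions.
    TD_URL="jdbc:teradata://td-host/DATABASE=sales"

    while read -r TABLE; do
      sqoop import \
        --connect "$TD_URL" \
        --driver com.teradata.jdbc.TeraDriver \
        --username etl_user --password-file /user/etl/td.password \
        --table "$TABLE" \
        --target-dir "/data/teradata/sales/$TABLE" \
        --num-mappers 4
    done < tables.txt

If you want every table in the database anyway, sqoop import-all-tables with --warehouse-dir avoids even maintaining the list.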

Related

Are there any ways to transfer data from an RDBMS to HDFS besides Sqoop?

I want a way other than Sqoop to transfer data from an RDBMS to HDFS. Please give me a clue.
Please, can anyone explain to me what the relation between Hive and Sqoop is?
Adding to dev ツ's answer: you have one more tool called StreamSets Data Collector, which helps you get data from MySQL to HDFS by creating a JDBC connection.
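
Another commonly used non-Sqoop route (separate from StreamSets) is Spark's built-in JDBC reader. A minimal sketch, assuming a reachable MySQL instance and a connector jar on the local path; the URL, jar name, table and credentials below are placeholders:

    # Hypothetical sketch: read a table over JDBC with spark-shell and
    # write it to HDFS as Parquet. All names are assumptions.
    spark-shell --jars mysql-connector-java-5.1.49.jar \
      --driver-class-path mysql-connector-java-5.1.49.jar <<'EOF'
    val df = spark.read.format("jdbc").
      option("url", "jdbc:mysql://dbhost:3306/sales").
      option("dbtable", "orders").
      option("user", "etl").
      option("password", "secret").
      load()
    df.write.parquet("hdfs:///data/sales/orders")
    EOF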

What is Hive best suited for?

I need daily snapshots of all the databases in the enterprise, and to update Hive with them.
In case that is the best approach, how do I go about it? I have used Sqoop to manually import data into Hive, but what do I connect PHP to: Hive or Sqoop?
I understand Hive is used for OLAP and not OLTP, but is taking snapshots once a day something Hive would support nicely, or should I consider other options like HBase? (A sketch of such a snapshot job follows below.)
I am open to more suggestions, considering that the data is structured for the most part.
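
For the once-a-day cadence, a minimal sketch of such a snapshot job, assuming Sqoop's Hive integration is configured; the database host, schema, table and schedule are placeholder assumptions:

    #!/usr/bin/env bash
    # Hypothetical sketch: re-import one table into Hive daily,
    # replacing yesterday's snapshot. dbhost/erp/customers are placeholders.
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/erp \
      --username etl --password-file /user/etl/db.password \
      --table customers \
      --hive-import --hive-overwrite \
      --hive-table snapshots.customers \
      --num-mappers 4

    # Scheduled from cron, e.g.: 0 2 * * * /opt/etl/daily_snapshot.sh

As for the PHP question: PHP would query Hive (for example through HiveServer2), not Sqoop; Sqoop is only the batch mover and has no query interface.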

Load data into Greenplum DB using MapReduce or Sqoop

I want to try loading data into Greenplum using MapReduce or Sqoop. For now, the way to load a Greenplum DB from HDFS is to create an external table with gphdfs and then load an internal table from it. I want to try a solution that loads the data directly into Greenplum with Sqoop or MapReduce. I need some input on how I can proceed with this. Could you please help me out?
With regard to Sqoop, Sqoop export will help achieve this (sketched after the link below).
http://www.tutorialspoint.com/sqoop/sqoop_export.htm
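A minimal sketch of that export, assuming Greenplum accepts PostgreSQL JDBC connections (it speaks the PostgreSQL protocol) and that the target table already exists; the host, database, table and delimiter are placeholders:

    # Hypothetical sketch: push HDFS files into an existing Greenplum table.
    sqoop export \
      --connect jdbc:postgresql://gp-master:5432/analytics \
      --username gpload --password-file /user/etl/gp.password \
      --table sales_fact \
      --export-dir /data/warehouse/sales_fact \
      --input-fields-terminated-by ','

Note that this streams rows through a JDBC connection, so for very large volumes the gphdfs external-table route mentioned in the question may still perform better.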
While not Sqoop, I am currently in the experimental phase of using Greenplum's external tables to load from HDFS. So far it seems to perform well.
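
For reference, a minimal sketch of that external-table route, assuming the gphdfs protocol is configured on the cluster; the database, namenode address, columns and table names are placeholder assumptions:

    # Hypothetical sketch: expose an HDFS path as a readable external table,
    # then load it into a regular Greenplum table with one INSERT...SELECT.
    psql -d analytics <<'EOF'
    CREATE EXTERNAL TABLE ext_sales_fact (id int, amount numeric, sold_on date)
    LOCATION ('gphdfs://namenode:8020/data/warehouse/sales_fact')
    FORMAT 'TEXT' (DELIMITER ',');

    INSERT INTO sales_fact SELECT * FROM ext_sales_fact;
    EOF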

Unable to retain Hive tables

I have set up a single-node Hadoop cluster on Ubuntu, with Hadoop version 2.6 installed on my machine.
Problem:
Every time I create Hive tables and load data into them, I can see the data by querying them, but once I shut down Hadoop, the tables get wiped out. Is there any way I can retain them, or is there a setting I am missing?
I tried some solutions suggested online, but nothing worked. Kindly help me out with this.
Thanks
B
The Hive table data lives on HDFS; Hive just adds metadata on top and provides users with SQL-like commands so they don't have to write basic MapReduce jobs. So if you shut down the Hadoop cluster, Hive can't find the data in the table.
But if you are saying the data is lost when you restart the Hadoop cluster, that's another problem.
It seems you are using the default Derby database as the metastore. Configure the Hive metastore properly; I am pointing you to a link, please follow it:
Hive is not showing tables
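
For reference, the usual fix is to move off embedded Derby to an external metastore. A minimal sketch of the properties to place inside the <configuration> element of hive-site.xml, assuming a MySQL instance is available; the host, database name and credentials are placeholders:

    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>hiveuser</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionPassword</name>
      <value>hivepassword</value>
    </property>

If the underlying HDFS data itself disappears after a machine reboot, also check that dfs.namenode.name.dir and dfs.datanode.data.dir do not default to locations under /tmp, which is cleared on restart.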

ETL associated with the Hadoop database HBase?

Hi, can anybody tell me which ETL tools can be used with HBase, the database of Hadoop?
I mean, just as data in an Oracle database is pulled and worked with in tools like Informatica and SSIS, is there any ETL tool that can be used for HBase?
Kindly help me.
Take a look at Pentaho Data Integration for Hadoop.
Check out Cascading.HBase: http://www.cascading.org/modules.html
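Besides full ETL suites, HBase also ships its own bulk-loading entry point, which is often enough for plain file-to-table loads. A minimal sketch, assuming a tab-separated file already on HDFS and an existing target table; the table name, column family and path are placeholders:

    # Hypothetical sketch: load a tab-separated HDFS file into an HBase
    # table with the built-in ImportTsv MapReduce job.
    hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
      -Dimporttsv.columns=HBASE_ROW_KEY,cf:amount,cf:sold_on \
      sales /data/incoming/sales.tsv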
