Where to set config values in a Cloudera Hive setup? - hadoop

I am new to Cloudera QuickStart. We need to partition the data of large Hive tables, but Hive caps the number of dynamic partitions at 100. We need to raise this limit in the configuration, and I don't want to set it on the CLI every time.
Where can I find the configuration file to update these settings?
Will Sqoop cause any problems when importing data from SQL Server into Hive with dynamic partitions?

Hive is configured through hive-site.xml.
On your local server, run this command to try to find it:
locate hive-site.xml
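
Once the file is found, the limits can be raised by adding or updating the relevant properties. The following is a minimal sketch, not a definitive procedure: hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode are the Hive properties that bound dynamic partitioning (the per-node one defaults to 100), while the values, the helper names set_properties and update_hive_site, and the idea of editing the file from a script are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Real Hive property names; the values are illustrative assumptions,
# pick limits that fit your own tables.
DYNAMIC_PARTITION_LIMITS = {
    "hive.exec.max.dynamic.partitions": "5000",
    "hive.exec.max.dynamic.partitions.pernode": "2000",
}


def set_properties(root, settings):
    """Add or update <property> entries under a <configuration> root element."""
    existing = {p.findtext("name"): p for p in root.findall("property")}
    for name, value in settings.items():
        prop = existing.get(name)
        if prop is None:
            # Property not present yet: append a new <property> block.
            prop = ET.SubElement(root, "property")
            ET.SubElement(prop, "name").text = name
            ET.SubElement(prop, "value").text = value
        else:
            # Property already present: overwrite its <value>.
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
    return root


def update_hive_site(path, settings):
    """Hypothetical helper: rewrite the hive-site.xml at `path` in place."""
    tree = ET.parse(path)
    set_properties(tree.getroot(), settings)
    tree.write(path, encoding="utf-8", xml_declaration=True)
```

Calling update_hive_site with the path printed by locate and DYNAMIC_PARTITION_LIMITS would apply the change; back up the file first, and restart the Hive services so the new values are picked up.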


Is it possible to use/query data using Pig/Tableau or some-other tool from HDFS which was inserted/loaded using a HIVE Managed table?

Is it possible to use or query data with Pig, Drill, Tableau or some other tool if the data in HDFS was inserted/loaded through a Hive managed table, or is that only possible for data in HDFS that was inserted/loaded through a Hive external table?
Edit 1: Is the data associated with Managed Hive Tables locked to Hive?
The managed/external distinction only concerns the file system, not visibility to clients. Both kinds of table can be accessed with Hive clients.
Hive has HiveServer2 (which uses Thrift), a service that lets clients execute queries against Hive. It provides JDBC/ODBC access.
So you can query data in Hive whether it is in a managed table or an external table.
DBeaver/Tableau can query Hive once connected to HiveServer2.
For Pig, you can use HCatalog:
pig -useHCatalog

Lineage is not visible for Hive Managed Table in HDP Atlas

I am using Atlas with HDP to create the lineage flow for my Hive tables, but the lineage is only visible for Hive external tables. I created Hive managed tables, performed a join to create a new table, and imported the Hive metastore using import-hive.sh, located under the hook-bin folder. But the lineage for the managed tables is not visible.
Even the HDFS directory is not listed for the managed table, whereas for the external table the HDFS directory is available.
Can anyone help me over here? Thanks in advance.
Two factors were causing the issue in my case: the first was the Hive hook and the second was offsets.topic.replication.factor. The following steps resolved it:
1) Re-install the Hive hook for Atlas
Grep the list of services installed for Apache Atlas and re-install the Hive hook jar.
2) Kafka offset replication property
Change the offsets.topic.replication.factor value to 1.
After these changes, lineage shows up in Atlas for Hive as well as Sqoop.

Does Hive depend on/require Hadoop?

The Hive installation guide says that Hive can be applied to an RDBMS. My question is: it sounds like Hive can exist without Hadoop, right? Is it an independent HQL engine that could work with any data source?
You can run Hive in local mode to use it without Hadoop for debugging purposes. See the URL below.
Hive provides a JDBC driver so you can query Hive over JDBC. However, if you plan to run Hive queries on a production system, you need Hadoop infrastructure to be available. Hive queries are eventually converted into MapReduce jobs, and HDFS is used as the data storage for Hive tables.

Unable to retain HIVE tables

I have set up a single-node Hadoop cluster on Ubuntu, with Hadoop 2.6 installed on my machine.
Every time I create Hive tables and load data into them, I can see the data by querying it, but once I shut down Hadoop, the tables get wiped out. Is there any way to retain them, or is there a setting I am missing?
I tried some of the solutions provided online, but nothing worked. Kindly help me out with this.
Hive table data lives on HDFS; Hive just adds metadata and gives users SQL-like commands so they don't have to write basic MR jobs. So if you shut down the Hadoop cluster, Hive can't find the data in the table.
But if you are saying that the data is lost when you restart the Hadoop cluster, that's another problem.
It seems you are using the default Derby database as the metastore. Configure the Hive metastore; please follow this link:
Hive is not showing tables

hadoop hive question

I'm trying to create tables programmatically using JDBC. However, I can't see the tables I created from the Hive shell. What's worse, when I access the Hive shell from different directories, I see different contents in the database.
Is there any setting I need to configure?
Thanks in advance.
Make sure you run Hive from the same directory every time, because when you launch the Hive CLI for the first time, it creates a Derby metastore database in the current directory. This Derby database contains the metadata of your Hive tables. If you change directories, Hive creates a separate metastore there, so your table metadata ends up scattered across directories. Also, the embedded Derby database cannot handle multiple sessions. To allow concurrent Hive access, you need a real database to manage the metastore rather than the wimpy little Derby database that comes with it. You can download MySQL for this and change the Hive JDBC connection properties to use the MySQL type 4 pure Java driver.
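
To check whether this has already happened, you can look for stray Derby metastores. The sketch below assumes the embedded metastore directory has its default name, metastore_db; the function name find_derby_metastores is made up for illustration.

```python
import os


def find_derby_metastores(start_dir):
    """Return sorted paths of directories named 'metastore_db' under start_dir."""
    found = []
    for dirpath, dirnames, _ in os.walk(start_dir):
        if "metastore_db" in dirnames:
            found.append(os.path.join(dirpath, "metastore_db"))
            # Do not descend into the Derby files themselves.
            dirnames.remove("metastore_db")
    return sorted(found)
```

If find_derby_metastores(os.path.expanduser("~")) returns more than one path, the Hive CLI has been started from several directories and each one holds its own, separate set of table definitions.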
Try emailing the Hive user list or asking on the IRC channel.
You probably need to set up a central Hive metastore (by default it is Derby, but it can be MySQL/Oracle/Postgres). The metastore is the "glue" between Hive and HDFS. It tells Hive where your data files live in HDFS, what type of data they contain, what tables they belong to, etc.
For more information, see http://wiki.apache.org/hadoop/HiveDerbyServerMode
Examine your Hadoop logs. For me this happened when my Hadoop system was not set up properly: the namenode was not able to contact the datanodes on other machines, etc.
Yeah, it's due to the metastore not being set up properly. The metastore stores the metadata associated with your Hive tables (e.g. the table name, table location, column names, column types, bucketing/sorting information, partitioning information, SerDe information, etc.).
The default metastore is an embedded Derby database which can only be used by one client at any given time. This is obviously not good enough for most practical purposes. You, like most users, should configure your Hive installation to use a different metastore. MySQL seems to be a popular choice. I have used this link from Cloudera's website to successfully configure my MySQL metastore.
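
For reference, pointing Hive at MySQL comes down to four javax.jdo.option.* connection properties in hive-site.xml. The sketch below only renders them as property blocks to paste inside the configuration element. The host, database name, user and password are placeholders, the helper names are hypothetical, and the driver class shown is the older Connector/J one (newer Connector/J releases use com.mysql.cj.jdbc.Driver), so treat it as a starting point rather than the exact setup from the Cloudera guide.

```python
from xml.sax.saxutils import escape


def metastore_properties(host, database, user, password):
    """Build the JDO connection properties for a MySQL-backed metastore."""
    return {
        "javax.jdo.option.ConnectionURL":
            f"jdbc:mysql://{host}/{database}?createDatabaseIfNotExist=true",
        # Assumption: Connector/J 5.x driver class; adjust for newer drivers.
        "javax.jdo.option.ConnectionDriverName": "com.mysql.jdbc.Driver",
        "javax.jdo.option.ConnectionUserName": user,
        "javax.jdo.option.ConnectionPassword": password,
    }


def render_properties(props):
    """Render a dict of settings as hive-site.xml <property> blocks."""
    lines = []
    for name, value in props.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(value)}</value>")
        lines.append("  </property>")
    return "\n".join(lines)
```

For example, print(render_properties(metastore_properties("localhost", "metastore", "hive", "changeme"))) prints the four blocks; the MySQL JDBC driver jar also has to be on Hive's classpath for the connection to work.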
