How to get hive table comment through JDBC - jdbc

Assume that a table is created like following in Hive:
create table test1
(
field_name int
)comment='TestTableComment';
Now I'd like to get the comment for the table (TestTableComment) through JDBC in Java, how can I get it?

Have you tried DatabaseMetaData.getTables()?
Reference

Related

Hive table always set column comment is "from deserializer"

I execute query to create Hive table below:
CREATE TABLE db1.test_create_tbl( column1 smallint COMMENT 'desc of column')
COMMENT 'desc of table'
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
I execute query to display table schema below:
DESCRIBE db1.test_create_tbl
But when i get table schema, column description always display "from deserializer"
Please touch me, thanks
I have come across same issue. I have changed the hive configuration and it worked fine.
set the below parameter in hive shell.
set hive.serdes.using.metastore.for.schema=org.apache.hadoop.hive.serde2.OpenCSVSerde;

Can I add a subcolumn to a hive struct column using alter table?

I've a very complex table structure in hive, let's say that it's like the following table:
create table dirceu ( a struct<b:string,c:string>);
Now I do need to add another subcolumn to the a column, and it should have the structure b,c and d, I'm trying to do it with the following alter table:
alter table dirceu change column a a struct<b:string,c:string, d:string>;
But this throw the following error:
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. The following columns have types incompatible with the existing columns in their respective positions :
a (state=08S01,code=1)
Is there a way to do this using alter table? I know that I can do it using create table and copy the data, but I would like to know if there is another way to do it.
UPDATE
I'm using hive: 2.1.0.2.6.1.0-129
HortonWorks: HDP-2.6.1.0
This is the command you need to give,
ALTER TABLE dirceu CHANGE COLUMN a a STRUCT<b:STRING, c:STRING, d:STRING>;
You can't add a new field to Collection data type. Instead, you need to completely change the schema.
Hope you like the answer. Yippee!!
I guess it works by this command,
alter table dirceu add column a struct<d:string>;
Please let me know if it doesn't work by comment.

Hive Alter External Table and Update Schema

I am looking for a command to add columns and update schema for my Hive External table backed by Avro schema.
Here is what I have tried so far.
I have a Hive External Table with Avro backed Schema created with this command -
CREATE EXTERNAL TABLE `person_hourly`(
'personid' string COMMENT '',
'name' string COMMENT ''
)
PARTITIONED BY (
'partitiontime' string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION
'hdfs://nameservice1/web/PersonData/'
TBLPROPERTIES (
'avro.schema.url'='hdfs:///schemas/PersonV1.avsc'
)
I would like to add additional columns and update schema for this table.
alter table person_hourly ADD COLUMNS (lastname string ) SET TBLPROPERTIES ('avro.schema.url' = 'hdfs:///schemas/PersonV2.avsc')
But I cannot do this since I get an error
FAILED: ParseException line 1:64 missing EOF at 'SET' near ')'
So I tried adding column separately, which worked, but I cannot update the schema
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. at least one column must be specified for the table
The Data Definition Language (DDL) for ALTER TABLE can be found here
ALTER TABLE table_name SET TBLPROPERTIES table_properties;
 
table_properties:
  : (property_name = property_value, property_name = property_value, ... )
And your comment
I tried adding column separately, which worked
I think that's what you should do. Add the column, then set the properties
if you modify the schema in the hdfs, it will be detected by Hive. Hive read the schema on runtime, it doesn't save any schema information when you use avsc through avro.schema.url
Regards,
Hector
The code below worked for me..
You can change the schema definition in avsc file (with proper formatting) then can use simply alter command with setting path of updated schema file.
ALTER TABLE table_name SET TBLPROPERTIES ("path of updated schema avsc format file")

How to let CREATE TABLE...AS SELECT in HIVE do not populate data?

When I run CTAS in HIVE, the data is also populated simultaneously. But I just want to create the table, but not populate the data. How and what I should do? Thanks.
You can do that by using the LIKE keyword.
create table new_table_name LIKE old_table_name
This will create the table structure without the data.
Use create EXTERNAL table instead of create table. Observe External keyword.
Use where condition in select statement and give a value of where which fetches no records from hive.
Example table name demo1
id name country
1 abc India
2 xyz Germany
3 pqr France
In CREATE TABLE…AS SELECT in HIVE
Create table demo2...As SELECT id, name, country from demo1 where id=0;
So, in above where condition of id is given as 0 and from above data the select statement will fetch no record, similarly choose a value in where condition which returns no records. Hence no data will be inserted in newly created table.
#Sunil's answer helped me as well, I am just posting an addition that was necessary in my case.
The source table was in Avro format but the new one I wanted in ORC, hence,
CREATE TABLE dataaggregate_orc_empty LIKE dataaggregate_avro_compressed ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' TBLPROPERTIES ('orc.compress'='ZLIB');
The above step can be split in two steps, if required :
CREATE TABLE dataaggregate_orc_empty LIKE dataaggregate_avro_compressed;
alter table dataaggregate_orc_empty set fileformat ORC;
I would be glad if someone provides inputs for the data format changes that occur in this process and related problems, if any.

Hive load specific columns

I am interested in loading specific columns into a table created in Hive.
Is it possible to load the specific columns directly or I should load all the data and create a second table to SELECT the specific columns?
Thanks
Yes you have to load all the data like this :
LOAD DATA [LOCAL] INPATH /Your/Path [OVERWRITE] INTO TABLE yourTable;
LOCAL means that your file is on your local system and not in HDFS, OVERWRITE means that the current data in the table will be deleted.
So you create a second table with only the fields you need and you execute this query :
INSERT OVERWRITE TABLE yourNewTable
yourSelectStatement
FROM yourOldTable;
It is suggested to create an External Table in Hive and map the data you have and then create a new table with specific columns and use the create table as command
create table table_name as select statement from table_name;
For example the statement looks like this
create table employee as select id as id,emp_name as name from emp;
Try this:
Insert into table_name
(
#columns you want to insert value into in lowercase
)
select columns_you_need from source_table;

Resources