Can I add a subcolumn to a hive struct column using alter table? - hadoop

I've a very complex table structure in hive, let's say that it's like the following table:
create table dirceu ( a struct<b:string,c:string>);
Now I do need to add another subcolumn to the a column, and it should have the structure b,c and d, I'm trying to do it with the following alter table:
alter table dirceu change column a a struct<b:string,c:string, d:string>;
But this throw the following error:
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. The following columns have types incompatible with the existing columns in their respective positions :
a (state=08S01,code=1)
Is there a way to do this using alter table? I know that I can do it using create table and copy the data, but I would like to know if there is another way to do it.
UPDATE
I'm using hive: 2.1.0.2.6.1.0-129
HortonWorks: HDP-2.6.1.0

This is the command you need to give,
ALTER TABLE dirceu CHANGE COLUMN a a STRUCT<b:STRING, c:STRING, d:STRING>;
You can't add a new field to Collection data type. Instead, you need to completely change the schema.
Hope you like the answer. Yippee!!

I guess it works by this command,
alter table dirceu add column a struct<d:string>;
Please let me know if it doesn't work by comment.

Related

After changing column name in hive, value of column are getting NULL

Working on hive table, where I need to change column name as below, its working as expected and changing column name but underline value of this column getting NULL.
ALTER TABLE db.tbl CHANGE hdfs_loaddate hdfs_load_date String;
Here changed column name is hdfs_load_date and values are getting NULL after renaming column name.
Does any one have idea to fix this. Thanks in advance!!
#Ajay_SK Referencing this article: Hive Alter table change Column Name
There is a comment:
Note that the column change will not change any underlying data if it is a parquet table. That is, if you have data in the table already, renaming a column will not make the data in that column accessible under the new name: select a from test_change; 1 alter table test_change change a a1 int; select a1 from test_change; null
He is specific to parquet, but the scenario you describe is similar where you have successfully changed the name, but hive still thinks the original data is in the original key.
A better approach to solve your issue, would be to create a new table of the schema you want with column name change. Then perform an Insert INTO new table select FROM * old table.

How to drop hive column?

I have two columns Id and Name in Hive table, and I want to delete the Name column. I have used following command:
ALTER TABLE TableName REPLACE COLUMNS(id string);
The result was that the Name column values were assigned to the Id column.
How can I drop a specific column of the table and is there any other command in Hive to achieve my goal?
In addition to the existing answers to the question : Alter hive table add or drop column
As per Hive documentation,
REPLACE COLUMNS removes all existing columns and adds the new set of columns.
REPLACE COLUMNS can also be used to drop columns. For example, ALTER TABLE test_change REPLACE COLUMNS (a int, b int); will remove column c from test_change's schema.
The query you are using is right. But this will modify only schema i.e, the metastore. This will not modify anything on data side.
So, before you are dropping the column you should make sure that you hav correct data file.
In your case the data file should not contain name values.
If you don't want to modify the file then create another table with only specific column that you need.
Create table tablename as select id from already_existing_table
let me know if this helps.

how to add a column after another one in monetDB

I am trying to add a new column in a monetDB database and I want it positioned after a specific one. In mysql this is possible using the AFTER keyword.
ALTER TABLE myTable ADD myNewColumn VARCHAR(255) AFTER myOtherColumn
I am trying this in the mclient:
sql>ALTER TABLE dbname.table_name ADD COLUMN new_name AFTER existing_name SET DEFAULT NULL;
What I get is a syntax error:
syntax error, unexpected AFTER in: "ALTER TABLE dbname.table_name ADD COLUMN new_name AFTER"
It is true that the ALTER documentation does not specify that AFTER exists, but I am hoping that anybody knows an alternative.
The safe way is to create an new table with the columns properly ordered and move the data; you probably know this already.
However if you really cannot do that, create a view:
CREATE VIEW AS SELECT [order the columns however you want here] FROM your_table;

Add enum type column to table

im trying to add a new column to my existing table 'Results', and it seems to be very easy but I cant see the mistake.
Here is my code:
SQL> Alter table results add column CAL ENUM('A','B');
ERROR at line 1: ORA-00904: : invalid identifier
What am I missing?
I've read this from w3 and this from java2s but cant see the difference to mine.
Thanks, and sorry for the dumb question.
OK, with an ORA- error I am assuming that this is an oracle database, and not mysql. you have both tags and you are linking to MySQL example, but the error is not a MySQL error.
Assuming that this IS an Oracle DB, then there is no native ENUM data type. You have to do this as follows: First - you add the column with a correctly defined data type, and second you create a constrained list of permitted values on that column as a check constraint.
Alter table results add (cal varchar2(1));
Alter table results add constraint chk_cal CHECK (cal in ('A','B')) ENABLE;
(EDITED to fix brackets in constraint creation line)

In Hive, how can I add a column only if that column does not exist?

I would like to add a new column to a table, but only if that column does not already exist.
This works if the column does not exist:
ALTER TABLE MyTable ADD COLUMNS (mycolumn string);
But when I execute it a second time, I get an error.
Column 'mycolumn' exists
When I try to use the "IF NOT EXISTS" syntax that is supported for CREATE TABLE and ADD PARTITION, I get a syntax error:
ALTER TABLE MyTable ADD IF NOT EXISTS COLUMNS (mycolumn string);
FAILED: ParseException line 3:42 required (...)+ loop did not match anything at input 'COLUMNS' in add partition statement
What I need is something that can execute itempotently so I can run my query whether this column exists or not.
You can partially work it around, by setting the hive.cli.errors.ignore flag. In this case hive CLI will force the execution of further queries even when queries on the way fail.
In this example:
SET hive.cli.errors.ignore=true;
ALTER TABLE MyTable ADD COLUMNS (mycolumn string);
ALTER TABLE MyTable ADD COLUMNS (mycolumn string);
ALTER TABLE MyTable ADD COLUMNS (mycolumn2 string);
hive will execute all queries, even though there'll be an error in the second query.
Well there is no direct way to do that. I mean through a single query.
There are two other ways:
1.) Using JDBC:
1.1) Do describe on the table name.
1.2) You will get a list of columns in result set.
1.3) Check if your columns exists or not by iterating through the result set.
2.) Using hive Metastore client:
2.1) Create a object of HiveMetastoreClient
2.2) HiveMetastoreClient.getFields(<>db_name, <table_name>).get(index).getName() will give you the column name.
2.3) Check if your column exists of not by comparing the list.
Hope it helps...!!!

Resources