Working on a Hive table, I need to change a column name as below. It works as expected and changes the column name, but the underlying values of this column are getting NULL.
ALTER TABLE db.tbl CHANGE hdfs_loaddate hdfs_load_date String;
Here the changed column name is hdfs_load_date, and its values are getting NULL after renaming the column.
Does anyone have an idea how to fix this? Thanks in advance!!
@Ajay_SK Referencing this article: Hive Alter table change Column Name
There is a comment:
Note that the column change will not change any underlying data if it is a parquet table. That is, if you have data in the table already, renaming a column will not make the data in that column accessible under the new name:
select a from test_change;
1
alter table test_change change a a1 int;
select a1 from test_change;
null
He is specific to Parquet, but the scenario you describe is similar: you have successfully changed the name, but Hive still thinks the original data lives under the original column name.
A better approach to solve your issue would be to create a new table with the schema you want (with the column name changed), then perform an INSERT INTO the new table SELECT * FROM the old table.
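A minimal sketch of that approach, assuming the table from the question and showing only the renamed column (in practice you would list every column of db.tbl; tbl_new is a placeholder name):
-- new table with the desired column name
CREATE TABLE db.tbl_new (hdfs_load_date STRING);
-- copy the data so it is rewritten under the new schema
INSERT INTO TABLE db.tbl_new SELECT hdfs_loaddate FROM db.tbl;
-- once verified, drop db.tbl and rename db.tbl_new to take its place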
Related
I have two columns, Id and Name, in a Hive table, and I want to delete the Name column. I have used the following command:
ALTER TABLE TableName REPLACE COLUMNS(id string);
The result was that the Name column values were assigned to the Id column.
How can I drop a specific column of the table and is there any other command in Hive to achieve my goal?
In addition to the existing answers to the question: Alter hive table add or drop column
As per Hive documentation,
REPLACE COLUMNS removes all existing columns and adds the new set of columns.
REPLACE COLUMNS can also be used to drop columns. For example, ALTER TABLE test_change REPLACE COLUMNS (a int, b int); will remove column c from test_change's schema.
The query you are using is right, but it will modify only the schema, i.e. the metastore. It will not modify anything on the data side.
So, before you drop the column, you should make sure that you have the correct data file.
In your case the data file should not contain Name values.
If you don't want to modify the file, then create another table with only the specific column that you need:
CREATE TABLE tablename AS SELECT id FROM already_existing_table;
Let me know if this helps.
I created a table and defined the data type of every field the same as in the source table. When I use "insert into table select ..." to fill this new table with data, there is no error. And I am sure the productid field, which is of type bigint, has no null values in the source table. But after the insert, I find that a small number of records have a null productid. I also tried stored as textfile and as parquet. It makes no sense that there are still null values in the outcome table.
However, when I use "create table as select ... from ...", there are no nulls in the outcome productid.
So I don't know where the problem is?
Thanks.
This happens mostly when the actual data you are trying to load has a different data type than the destination Hive table's column data type in the DDL, or when the length defined in the target table is smaller. Check the rows where productid ends up missing: look at the actual data and its length versus what is defined in the DDL statement.
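A minimal sketch of such a check in Hive, assuming the source table is named source_table (a placeholder) and the target column is defined as bigint: any value that cannot be cast to the target type will land as NULL after the insert.
SELECT productid
FROM   source_table
WHERE  productid IS NOT NULL
  AND  CAST(productid AS BIGINT) IS NULL;
-- rows returned here hold values that do not fit a BIGINT and would become NULL on insert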
How can I change the DATA TYPE of a column from NUMBER to VARCHAR2 without deleting the table data?
You can't.
You can, however, create a new column with the new data type, migrate the data, drop the old column, and rename the new column. Something like
ALTER TABLE table_name
ADD( new_column_name varchar2(10) );
UPDATE table_name
SET new_column_name = to_char(old_column_name, <<some format>>);
ALTER TABLE table_name
DROP COLUMN old_column_name;
ALTER TABLE table_name
RENAME COLUMN new_column_name TO old_column_name;
If you have code that depends on the position of the column in the table (which you really shouldn't have), you could rename the table and create a view with the table's original name that exposes the columns in the order your code expects, until you can fix that buggy code.
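A hedged sketch of that workaround (the _base suffix and the surrounding column names are illustrative):
ALTER TABLE table_name RENAME TO table_name_base;
CREATE VIEW table_name AS
  SELECT other_col1, old_column_name, other_col2  -- columns in the order the old code expects
  FROM   table_name_base;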
You have to first deal with the existing rows before you modify the column DATA TYPE.
You could do the following steps:
Add the new column with a new name.
Update the new column from old column.
Drop the old column.
Rename the new column with the old column name.
For example,
alter table t add (col_new varchar2(50));
update t set col_new = to_char(col_old);
alter table t drop column col_old cascade constraints;
alter table t rename column col_new to col_old;
Make sure you re-create any required indexes which you had.
You could also try the CTAS approach, i.e. CREATE TABLE AS SELECT. But the above is safe and preferable.
The most efficient way is probably to do a CREATE TABLE ... AS SELECT (CTAS).
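A minimal sketch of the CTAS route, assuming the NUMBER column from the question is called old_column_name and the other names are placeholders:
CREATE TABLE table_name_new AS
  SELECT TO_CHAR(old_column_name) AS old_column_name,
         other_col
  FROM   table_name;
-- then drop table_name, rename table_name_new to table_name,
-- and re-create any indexes, constraints and grants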
alter table table_name modify (column_name VARCHAR2(255));
Since we can't change the data type of a column that has values, the approach I followed is below.
Say the column whose type you want to change is 'A'; this can be achieved with SQL Developer.
First sort the table data by another column (e.g. datetime).
Next copy the values of column 'A' and paste them into an Excel file.
Delete the values of column 'A' and commit.
Change the data type and commit.
Again sort the table data by the previously used column (e.g. datetime).
Then paste the copied data back from Excel and commit.
I need to change a column type in more than 200 tables. I am following this recipe:
Disable all foreign constraints if the column is referenced by any FK
Store the columns in a varray and drop the primary key if the column is part of a PK
Create a temporal new column in the table with the same type
Update the temporal new column with original values
Delete values from original column
Change column type of original column
Update original column with temporal column values
Restore primary key if applied
Enable FK if applied
I am having some issues with the following cases:
When a primary key is compound (multiple columns).
I need to store the original FK and PK signatures to allow me to restore them after the change.
------- My ideas --------
Backup the all_constraints and all_cons_columns records in a temporary table, and after changing the column type restore the constraint info from there.
Keep with the same idea of storing the FK and PK signatures to restore them after changing the column type.
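For example, a minimal sketch of capturing those definitions with Oracle's DBMS_METADATA (the column name is a placeholder):
SELECT c.owner,
       c.table_name,
       c.constraint_name,
       c.constraint_type,
       DBMS_METADATA.GET_DDL(
         CASE c.constraint_type WHEN 'R' THEN 'REF_CONSTRAINT' ELSE 'CONSTRAINT' END,
         c.constraint_name, c.owner) AS recreate_ddl
FROM   all_constraints  c
JOIN   all_cons_columns cc
  ON   cc.owner = c.owner
 AND   cc.constraint_name = c.constraint_name
WHERE  cc.column_name = 'COLUMN_TO_CHANGE'
  AND  c.constraint_type IN ('P', 'R');
-- the returned DDL can be stored and re-executed after the column type change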
Any suggestions would be appreciated, thx!!
You could try the old CTAS then rename method:
basically:
create new table as select c1, c2, c3 ... from old table (include the data type transformation here)
establish any indexes, constraints ...
drop old table (may need to disable constraints first)
rename new table to old table
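A minimal sketch of that sequence (the table name, columns, and the new data type are illustrative):
CREATE TABLE t_new AS
  SELECT c1,
         CAST(c2 AS NUMBER(10)) AS c2,  -- the data type transformation
         c3
  FROM   t;
-- re-create indexes, constraints, grants and triggers on t_new here
DROP TABLE t;                           -- referencing constraints may need to be disabled/dropped first
ALTER TABLE t_new RENAME TO t;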
I have a tab-separated text file in HDFS and want to export it into a MySQL table.
Since the rows in the text file do not have numerical ids, how do I export into a table where an ID is set automatically during the SQL INSERT (auto-increment)?
If I try to export (id being the last defined attribute in the table), I get
java.util.NoSuchElementException
at java.util.AbstractList$Itr.next(AbstractList.java:350)
at entity.__loadFromFields(entity.java:996)
If I take the autogenerated class and modify it to exclude the id-attribute, I get
java.io.IOException: java.sql.SQLException: No value specified for parameter 27
where parameter 27 is 'id'.
Version is Sqoop 1.3.0-cdh3u3
In Sqoop 1.4.1, writing "null" in the text-file field position corresponding to the auto-increment field worked for me. After the export to MySQL you will see an incremented, automatically assigned ID.
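For illustration (the column order is an assumption, and <TAB> stands for the tab separator), a record in the export file would then look like:
value_x<TAB>value_y<TAB>null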
As somebody on the Sqoop mailing list suggested:
Create a temporary table without the ID
Sqoop-export into this table
Copy the rows of this table into the final table (that has the autoincrement ID)
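A minimal sketch of the MySQL side of that approach (staging and final_table are placeholder names):
CREATE TABLE staging LIKE final_table;
ALTER TABLE staging DROP COLUMN id;   -- the temporary table has no auto-increment column
-- sqoop-export the HDFS file into staging
INSERT INTO final_table (x, y)
SELECT x, y FROM staging;             -- MySQL assigns the auto-increment id here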
My source table is in Hive. What works for me is to add a column called id (int) and populate it with NULL. After the Sqoop export, MySQL receives insert (id, X, Y) values (null, "x_value", "y_value"). Then MySQL knows to populate the id via auto-increment.
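A minimal sketch of that Hive-side staging step, with placeholder table and column names:
-- stage the data with a NULL id column in front
CREATE TABLE export_stage AS
SELECT CAST(NULL AS INT) AS id, x, y
FROM   source_table;
-- sqoop-export export_stage into MySQL; the NULL ids are replaced by auto-increment values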