Why are there always some null values when I use "insert into" in Hive? - hadoop

I created a table and defined the data type of every field to match the source table. When I use "insert into table ... select ..." to populate the new table, there is no error. I am also sure the 'productid' field, which is of type bigint, has no null values in the source table. But after the insert, a small number of records have a null productid. I also tried storing the table as textfile and as parquet, and the outcome table still contains null values.
However, when I use "create table ... as select ... from ...", there are no nulls in the resulting productid column.
So I don't know where the problem is.
Thanks.

This happens mostly when the actual data you are trying to load has a different datatype than the destination Hive column's datatype in the DDL, or when the target column is defined with a smaller length than the data. Check the rows whose productid comes out missing: compare the actual values (and their length) against what is defined in the target table's DDL.
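A minimal HiveQL sketch of that check, assuming the source productid may really be string-backed data that does not always parse as a number (new_table and source_table are placeholders, as is the productname column):
-- rows whose productid would not survive a cast to BIGINT and so become NULL on insert
SELECT productid
FROM source_table
WHERE productid IS NOT NULL
  AND CAST(productid AS BIGINT) IS NULL;
-- insert with an explicit cast so the conversion happens visibly in the statement itself
INSERT INTO TABLE new_table
SELECT CAST(productid AS BIGINT) AS productid,
       productname
FROM source_table;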

Related

How to change data type for column on partitioned external Hive table (parquet) without deleting data?

I have a partitioned external Hive table with data loaded from parquet files. A few columns in that table require a data type change (TIMESTAMP -> STRING). Currently, when you query these columns, they return NULL values because of the wrong data type.
I ran ALTER TABLE table_name CHANGE col_1 col_1 STRING; to successfully change the data type for that column to STRING, but when I query the table again, the data in that column is still showing NULL. Is there a way to update the data without dropping the partitions and re-loading the data from scratch?
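One thing worth checking here, offered as a hedged note rather than something from the post: on a partitioned table, ALTER TABLE ... CHANGE only updates the table-level schema by default, while existing partitions keep their old column metadata, which can leave those queries still returning NULL. In Hive 1.1.0 and later the CASCADE keyword also propagates the change to the partition metadata, roughly:
ALTER TABLE table_name CHANGE col_1 col_1 STRING CASCADE;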

After changing a column name in Hive, values of that column are getting NULL

Working on a Hive table where I need to change a column name as below. The rename works as expected and changes the column name, but the underlying values of this column are getting NULL.
ALTER TABLE db.tbl CHANGE hdfs_loaddate hdfs_load_date String;
Here the changed column name is hdfs_load_date, and its values are coming back NULL after the rename.
Does anyone have an idea how to fix this? Thanks in advance!
@Ajay_SK Referencing this article: Hive Alter table change Column Name
There is a comment:
Note that the column change will not change any underlying data if it is a parquet table. That is, if you have data in the table already, renaming a column will not make the data in that column accessible under the new name:
select a from test_change;    -- returns 1
alter table test_change change a a1 int;
select a1 from test_change;   -- returns NULL
He is specific to parquet, but the scenario you describe is similar: you have successfully changed the name, but Hive still thinks the original data lives under the original column name.
A better approach to solve your issue would be to create a new table with the schema you want, including the renamed column, and then perform an INSERT INTO the new table with a SELECT * FROM the old table, as sketched below.
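A hedged sketch of that approach, assuming the old table still exposes the data under its original name hdfs_loaddate (i.e. the rename has been reverted or not yet applied); db.tbl_new and other_col are placeholder names:
CREATE TABLE db.tbl_new (
  hdfs_load_date STRING,
  other_col STRING
);
INSERT INTO TABLE db.tbl_new
SELECT hdfs_loaddate, other_col
FROM db.tbl;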

Oracle SQL query to fetch all table names in the DB wherever a given value is treated as a PK

Just want to know if this is possible.
Say I have the value 'X' and I am sure it is referenced in some other table as a PK value, but I am not sure exactly which table that is; I would like to get the list of those tables.
A pseudo query of what I mean:
SELECT TABLE_NAME FROM DBA_TABLES WHERE <<AT LEAST ONE ROW OF THE TABLE HAS A PK VALUE EQUAL TO 'X'>>;
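Something along those lines is possible by driving it from the data dictionary. A hedged PL/SQL sketch, not from the original post: it walks DBA_CONSTRAINTS / DBA_CONS_COLUMNS for primary-key columns and probes each table dynamically for the value 'X' (composite keys are checked column by column; tables whose key column cannot be compared to the literal are simply skipped). It assumes access to the DBA_ views and SERVEROUTPUT enabled.
DECLARE
  v_count NUMBER;
BEGIN
  -- every column that belongs to a primary-key constraint
  FOR r IN (SELECT cc.owner, cc.table_name, cc.column_name
              FROM dba_constraints c
              JOIN dba_cons_columns cc
                ON cc.owner = c.owner
               AND cc.constraint_name = c.constraint_name
             WHERE c.constraint_type = 'P') LOOP
    BEGIN
      EXECUTE IMMEDIATE
        'SELECT COUNT(*) FROM "' || r.owner || '"."' || r.table_name ||
        '" WHERE "' || r.column_name || '" = :val'
        INTO v_count USING 'X';
      IF v_count > 0 THEN
        DBMS_OUTPUT.PUT_LINE(r.owner || '.' || r.table_name);
      END IF;
    EXCEPTION
      WHEN OTHERS THEN NULL;  -- e.g. datatype mismatch against 'X'; skip that table
    END;
  END LOOP;
END;
/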

ODBC with Oracle Trigger Key Column

I'm trying to update some existing code that is supposed to write data to a variety of Databases (SQL, Access, Oracle) via ODBC, but I'm having a few problems with Oracle and am looking for any suggestions.
I've set my Oracle database up using a Trigger (basic tutorial online, which I'd like to support).
CREATE TABLE TABLE1 (
RECORDID NUMBER NOT NULL PRIMARY KEY,
ID VARCHAR(40) NULL,
COUNT NUMBER NULL
);
CREATE SEQUENCE TABLE1_SEQ;
CREATE OR REPLACE TRIGGER TABLE1_TRG
BEFORE INSERT ON TABLE1
FOR EACH ROW
WHEN (new.RECORDID IS NULL)
BEGIN
  -- assign the next sequence value when no key was supplied
  SELECT TABLE1_SEQ.nextval
  INTO :new.RECORDID
  FROM dual;
END;
/
I then populate a DataTable using a SELECT * FROM TABLE1. The first problem is that this DataTable doesn't know that the RecordId column is auto-generated. If I already have data in my table then I can't mark the column as auto-increment myself, because I get an error:
Cannot change AutoIncrement of a DataColumn with type 'Double' once it
has data.
If I continue, ignoring this, then I quickly get stuck. If I create a new DataRow and try to insert it, I can't set RecordID to DBNull.Value because it complains that the column has to be non-null (NoNullAllowedException). I can't generate a value myself either, because I don't know what value I should be using, and I don't want to interfere with the trigger by using the next available value.
Any suggestions on how I should insert data without ODBC complaining?
It does not appear that your first problem is with an Oracle database. There is no such thing as an "Autoincrement" column in Oracle. Are you sure that message is coming from an Oracle database?
With Oracle, you should be able to provide a placeholder value for the primary key on insert and let the trigger take care of it. Note, though, that the trigger above only fires WHEN (new.RECORDID IS NULL), so as written you would insert NULL for RECORDID (or omit the column entirely) rather than a dummy value, unless you drop the WHEN clause.
There is also nothing in your description that would prevent you from updating this value in Oracle afterwards (since your trigger is on insert only), unless you have foreign key references to the key.
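A hedged sketch of inserts that work with the trigger as defined (the literal values are illustrative):
-- RECORDID omitted, so :new.RECORDID is NULL and the trigger fills it from the sequence
INSERT INTO TABLE1 (ID, COUNT) VALUES ('abc', 5);
-- equivalent: supply NULL explicitly so the WHEN clause matches
INSERT INTO TABLE1 (RECORDID, ID, COUNT) VALUES (NULL, 'abc', 5);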

Using JPA/Oracle can I have a unique constraint that ignores string case?

I have a db table with a column that is a String. I do not consider the case to be significant (e.g. "TEST" == "test"). Unfortunately, it appears that JPA2 does, because both values are inserted into my table; I would like the second one to be rejected.
Is there a generic way to annotate an "ignore-case" unique constraint on a string column?
As an alternative, I could also consider putting a unique "ignore-case" constraint on the actual db column. Is that possible in Oracle 10?
What I don't want to do is write code, because this occurs often in this particular db.
All help is greatly appreciated.
You can achieve this with a function-based unique index:
create unique index <index_name> on <table_name> (UPPER(<column_name>));
For example:
create table t111( col varchar2(10));
create unique index test_idx on t111 (UPPER(col));
insert into t111 values('test');
insert into t111 values ('TEST');
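With that index in place, the second insert should fail with ORA-00001 (unique constraint violated), which is exactly the rejection the question asks for.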
