Check if a column exists in Hive through shell script

I am trying to write a shell script that checks the file's header: if a column is present in the file but not in the table's schema, it should add that column to the table's schema, while if a column is not present in the file but is in the table's schema, it should remove that column. Column names are included in the file as headers.
I tried to find the answer to my question here on SO, but I could not find it. If it is already answered, feel free to mark this as a duplicate. Thanks for all the help.
The file might contain something like this:
First Last ID
James Hardy 1
John Smith 2
And the table may have:
Last ID
Hartley 4
Birkhold 5
So I need to write a script that would add a new column called First to the table's schema, but if some column is present in the table and not in the file, it needs to be deleted. If a column is already there in both the file and the table, then do nothing except add the data to the table.
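From what I gather, the individual schema changes the script would have to issue come down to Hive DDL along these lines (the table name and column types are just my assumption, and REPLACE COLUMNS only works for tables whose SerDe supports it):
-- Add a column that appears in the file but not in the table:
ALTER TABLE my_table ADD COLUMNS (`first` STRING);
-- Remove a column that is in the table but not in the file; Hive has no DROP
-- COLUMN, so restate every column you want to keep:
ALTER TABLE my_table REPLACE COLUMNS (`first` STRING, `last` STRING, `id` INT);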

Related

Why does CSVREAD not work as expected when it is supposed to read the column names from the csv file?

According to the H2 documentation for CSVREAD
If the column names are specified (a list of column names separated with the fieldSeparator), those are used, otherwise (or if they are set to NULL) the first line of the file is interpreted as the column names.
I'd expect reading the csv file
id,name,label,origin,destination,length
81,foobar,,19,11,27.4
like this
insert into route select * from csvread ('routes.csv',null,'charset=UTF-8')
would work. However, a JdbcSQLIntegrityConstraintViolationException is actually thrown, saying NULL not allowed for column "ORIGIN" and indicating error code 23502.
If I explicitly add the column names to the insert statement like so,
insert into route (id,name,label,origin,destination,length) select * from csvread ('routes.csv',null,'charset=UTF-8')
it works fine. However, I'd prefer not to repeat myself - following the DRY principle :)
Using version 2.1.212.
The CSVREAD function produces a virtual table. Its column names can be specified in parameters or in the CSV file.
An INSERT command with a query doesn't map column names from the query to the column names of the target table; it uses their ordinal positions instead. The value from the first column of the query is inserted into the first column specified in the insert column list (or into the first column of the target table if no insert column list is specified), the second into the second column, and so on.
You can omit the insert column list only if your table was defined with the same columns in the same order as in the source query (in your case, in the CSV file). If your table has columns declared in a different order, or it has some additional columns, you need to specify the list.
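For illustration, here is a sketch of the positional mapping; the table definition below is hypothetical, just to show why an explicit column list matters when the column order differs from the CSV header (id,name,label,origin,destination,length):
-- Hypothetical table whose column order differs from the CSV header:
CREATE TABLE route (
    origin      INT NOT NULL,
    destination INT,
    length      DOUBLE,
    id          INT PRIMARY KEY,
    name        VARCHAR,
    label       VARCHAR
);
-- Without an insert column list, values are matched purely by position, so the
-- CSV's id would land in origin, name in destination, and so on. An explicit
-- column list restores the intended mapping:
INSERT INTO route (id, name, label, origin, destination, length)
    SELECT * FROM CSVREAD('routes.csv', NULL, 'charset=UTF-8');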

How to select from another table during SQLLDR importing file

Is there a way to create a SQLLDR control file in such a way that it checks the ID number before inserting data into another table, and if the ID does not match, the loader throws it into the discard file?
In the original import file there is a column, ID, which contains an 8-digit number.
Then we have an ID table where these 8-digit IDs are found.
Now I need to check against this ID table first, to see whether the ID in the file matches or not. Matched ones will be inserted into the table SETS and mismatches end up in the sets.dsc file.
Can I use WHEN, or should I put this selection right into ID inside quotation marks?
We use Oracle 11.
Why wouldn't you rather create a referential integrity constraint? If the ID doesn't exist in the "ID table" (which is the "parent"), then such a row won't be loaded from the input file. Oracle will raise
ORA-02291: integrity constraint violated - parent key not found
error.
I mean, why reinvent the wheel?
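A minimal sketch of such a constraint, assuming hypothetical names SETS(id) for the target table and ID_TABLE(id) for the parent:
-- SETS is the table being loaded; ID_TABLE holds the valid 8-digit IDs.
ALTER TABLE sets
  ADD CONSTRAINT fk_sets_id
  FOREIGN KEY (id) REFERENCES id_table (id);
-- In a conventional-path load, rows whose ID has no parent row are rejected
-- with ORA-02291 and written to the bad file instead of being inserted.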

Compare text file and table values for insert/update/delete

I have a text file which looks like below:
ID1~name1~city1~zipcode1~position1
ID2~name2~city2~zipcode2~position2
ID3~name3~city3~zipcode3~position3
ID4~name4~city4~zipcode4~position4
.
.
etc goes on...
This text file is the source file; I want to split the file on the delimiter (~) and compare it with the table on ID.
If the ID is not in the table, an insert operation should be performed.
If the ID is available in the table but other column values are different, then the table needs to be updated.
If the ID is not available in the text file but is available in the table, then the record should get deleted.
I googled it, but I could only find the page below:
https://www.experts-exchange.com/questions/27419804/VBScript-compare-differences-in-two-record-sets.html
Please help me with how I can proceed using VBScript.
Whose leg are you trying to pull? Obviously the desired/resulting table is the input table, so use LOAD DATA INFILE to import the file.
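If a full reload of the table from the file is acceptable, a minimal sketch (assuming MySQL and a hypothetical table people(id, name, city, zipcode, position)) would be:
-- Hypothetical MySQL table matching the five ~-delimited fields in the file.
TRUNCATE TABLE people;
LOAD DATA INFILE '/path/to/source.txt'
INTO TABLE people
FIELDS TERMINATED BY '~'
(id, name, city, zipcode, position);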

Temp Variables in Oracle SQL Loader

I need to upload data from flat files.
Platform/Version: Oracle 10g/Windows
My Flat file looks like below:
H,1,10302014
P,10.00,ABC
P,15.00,XYZ
P,14.75,BBY
T,3
First Record - Header (Row Indicator, FileType, Date)
Second to fourth - Detail Records (Row Indicator, Amount, Name)
Last Record - Trailer (Row Indicator, Number of Detail Records)
create table Mytable
(Row_ind Varchar2(2),
Amount number(6,2),
name varchar2(15),
file_Dt date);
I need to use the date (10302014) from the header record while inserting the detail records. Is it possible?
Note:
The file size is over a million records and I don't have update permission on the file (the file is NOT in ASCII format).
If you're on Oracle 9i or above, there's a way to bind a value and use it later in the process, but I'm assuming you can tell the customer how to write or modify the control file.
I'm wondering if it might work to use multiple inserts: load the header record into a table (maybe just to bind the value to a date column) and have the succeeding inserts include the bound column. Search Oracle's documentation on SQL*Loader for that. I found part of it here.

how to export header and data from separate tables while creating SSIS package

I have two tables in my database, i.e. Columns and Data. The data in these tables is like:
Columns:
ID
Name
Data:
1 John
2 Steve
Now I want to create a package which will create a csv file like:
ID NAME
------------
1 John
2 Steve
Can we achieve the same output? I have searched on Google but I haven't found any solution.
Please help me.
You can achieve this effect through a Script Task, or you can create a temporary dataset in SQL Server where you combine a header row built from your Columns table with the data from the Data table appended to it. My guess is you would have to fight with metadata issues while doing anything of this sort. Another suggestion I can think of is to dump the combination to a flat file, but again you will have to take care of the metadata conversion.
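As a sketch of that second idea (the query below assumes the Data table's two values are an ID and a name, as in the question; feed its output to a flat-file destination with the header row option turned off):
-- Build the header row from literals and cast the data rows to text so the
-- UNION types match; the ord column keeps the header on top.
SELECT col1, col2
FROM (
    SELECT 0 AS ord, 'ID' AS col1, 'NAME' AS col2
    UNION ALL
    SELECT 1, CAST(ID AS VARCHAR(10)), Name
    FROM Data
) AS combined
ORDER BY ord;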
