Unable to create Iceberg table using Trino

I am trying to create an Iceberg table using Trino, but the DDL does not execute and fails with an error saying that sort_order was not found. Below is the sample DDL:
CREATE TABLE <tableschema>.test (
  C1 datetime,
  C2 double,
  C3 double
)
WITH (
  format = 'parquet',
  location = 's3a://<location of file>',
  partitioning = ARRAY['day(C1)'],
  sort_order = ARRAY['C2', 'C3']
)
The DDL does not execute when I define sort_order for the table; without sort_order the table gets created.
Can you please help me with this? Is there anything I need to modify in the DDL? I am using trino-jdbc-380.jar as the driver to connect.
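For comparison, more recent Trino releases document a sorted_by table property for the Iceberg connector rather than sort_order; whether it is available depends on the server version (trino-jdbc-380 suggests an older cluster), so treat the following only as a hedged sketch, not a confirmed fix:
CREATE TABLE <tableschema>.test (
  C1 timestamp(6),                        -- assumption: Trino has no datetime type, timestamp is used here
  C2 double,
  C3 double
)
WITH (
  format = 'PARQUET',
  location = 's3a://<location of file>',
  partitioning = ARRAY['day(C1)'],
  sorted_by = ARRAY['C2', 'C3']           -- property name in newer releases; not recognized by older ones
)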

Related

Cannot create an external Hadoop table in Db2

I have a problem creating an external Hadoop table in Db2.
I use this CREATE statement:
CREATE EXTERNAL HADOOP TABLE DATA_LAKE.TABLE_TEST (
Id int,
blabla varchar(10)
)
but when I run this I get an error saying:
The name of the object to be created is identical to the existing name
"TABLE_TEST" of type "Table".. SQLCODE=-601,SQLSTATE=42710.
On my file share I don't have any table with this name, and in Db2 under the DATA_LAKE schema I don't have any TABLE_TEST table either. I also tried to find where this table is, maybe in a catalog, but I didn't find anything:
SELECT * FROM SYSCAT.TABLES
WHERE TABNAME LIKE '%TABLE_TEST%'
Thanks for any help.
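One hedged check that stays within the catalog you already queried: list the schema and object type of everything carrying that name, since the conflict may come from an object in another schema or of another type:
SELECT TABSCHEMA, TABNAME, TYPE, OWNER
FROM SYSCAT.TABLES
WHERE TABNAME = 'TABLE_TEST';
-- TYPE shows whether the match is a table (T), view (V), alias (A), etc.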

CTE (WITH table AS) in SQL Server equivalent in Hive?

I use the WITH table_name AS (select ...) command in SQL Developer to create a temporary table and use that temp table in the following queries. What is the similar command in Hadoop Hive?
I am using the SQL Assistant user interface on Hadoop Hive.
I tried the following example, which gives an error:
Create table Failed,80:
CREATE TEMPORARY TABLE temp1(col1 string);
CREATE TEMPORARY TABLE temp2 AS Select * from table_name;
Maybe you must write it case-sensitively, like this:
CREATE TEMPORARY TABLE temp1(col1 STRING);
The same CTE as in MySQL works:
with your_table as (
select 'some value' --from etc etc
)
select * from your_table;
Another example: https://stackoverflow.com/a/54960324/2700344
Hive CTE Official docs
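If the goal is specifically a temporary table fed by a CTE, Hive also accepts a CTE inside CREATE TABLE ... AS SELECT; a small sketch with hypothetical names:
CREATE TEMPORARY TABLE temp2 AS
WITH src AS (
  SELECT col1, col2
  FROM some_source_table   -- placeholder source table
  WHERE col1 IS NOT NULL
)
SELECT * FROM src;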

Import selected data from Oracle DB to S3 using Sqoop and create a Hive table script on AWS EMR with the selected data

I am new to big data technologies. I am working on the requirement below and need help making my work simpler.
Suppose I have 2 tables in an Oracle DB and each table has 500 columns. My task is to move the selected columns' data from both tables (via a join query) to AWS S3 and populate a Hive table on AWS EMR with that data.
Currently, to fulfill my requirement, I follow the steps below.
1) Create an external Hive table on AWS EMR with the selected columns. I know the column names, but to identify each column's data type for Hive I go to the Oracle tables, identify the column types there, and then write the Hive script.
2) Once the table is created, write a Sqoop import command with the select query and point its target directory to S3.
3) Repair the table from the S3 data.
To explain in detail: suppose T1 and T2 are two tables. T1 has 500 columns, T1_C1 to T1_C500, with various data types (NUMBER, VARCHAR, DATE, etc.). Similarly, T2 also has 500 columns, T2_C1 to T2_C500.
Now suppose I want to move some columns, for example T1_C23, T1_C230, T1_C239, T2_C236, T1_C234, T2_C223, to S3 and create the Hive table for the selected columns; to know the data types I need to look into the T1 and T2 table schemas.
Is there any simpler way to achieve this?
In the steps mentioned above, the first step takes a lot of manual time because I need to look at the table schemas, get the data types of the selected columns, and then create the Hive table.
A brief overview of the work environment:
Services running in the data center:
Oracle DB
Sqoop on a Linux machine; Sqoop talks to the Oracle DB and is configured to push the data to S3.
Services running on AWS:
S3
AWS EMR Hive; Hive talks to S3 and uses the S3 data to repair the table.
1) To ease your Hive table generation, you may use the Oracle dictionary:
SELECT t.column_name || ' ' ||
       decode(t.data_type, 'VARCHAR2', 'VARCHAR', 'NUMBER', 'DOUBLE') ||
       ' COMMENT ''' || cc.comments || ''',' AS hive_column_def,
       t.*
FROM user_tab_columns t
LEFT JOIN user_col_comments cc
  ON cc.table_name = t.table_name
 AND cc.column_name = t.column_name
WHERE t.table_name IN ('T1', 'T2')
ORDER BY t.table_name, t.column_id;
The first column of this result set will be your column list for the CREATE TABLE command.
You will need to extend the DECODE to correctly translate all the Oracle types you use into Hive types.
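For completeness, a rough sketch of what the generated Hive DDL and the repair step (steps 1 and 3 in the question) might look like; the table name, column types, delimiter, and S3 path are all placeholders:
CREATE EXTERNAL TABLE t1_t2_selected (
  t1_c23  DOUBLE,
  t1_c230 STRING,
  t2_c236 TIMESTAMP
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 's3://your-bucket/path/t1_t2_selected/';

-- step 3 from the question; only needed if the table is partitioned
MSCK REPAIR TABLE t1_t2_selected;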
2) As I remember, Sqoop exports a table easily, so you may create a view in Oracle that hides the join query inside and export this view with Sqoop:
CREATE OR REPLACE VIEW V_T1_T2 AS
SELECT * FROM T1 JOIN T2 ON ...;
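Building on that, a sketch of such a view restricted to the selected columns from the question (the join condition below is only a placeholder assumption):
CREATE OR REPLACE VIEW V_T1_T2 AS
SELECT t1.T1_C23, t1.T1_C230, t1.T1_C239, t1.T1_C234,
       t2.T2_C236, t2.T2_C223
FROM T1 t1
JOIN T2 t2 ON t1.T1_C1 = t2.T2_C1;   -- placeholder join condition
Sqoop can then typically import V_T1_T2 like a plain table, so no query or column list is needed on the Sqoop side.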

Hive load specific columns

I am interested in loading specific columns into a table created in Hive.
Is it possible to load specific columns directly, or should I load all the data and create a second table to SELECT the specific columns?
Thanks
Yes, you have to load all the data, like this:
LOAD DATA [LOCAL] INPATH '/your/path' [OVERWRITE] INTO TABLE yourTable;
LOCAL means that your file is on your local filesystem and not in HDFS; OVERWRITE means that the current data in the table will be deleted.
Then you create a second table with only the fields you need and execute this query:
INSERT OVERWRITE TABLE yourNewTable
SELECT yourSelectedColumns
FROM yourOldTable;
Another suggestion is to create an external table in Hive, map your existing data onto it, and then create a new table with the specific columns using the CREATE TABLE ... AS command:
create table new_table as select <columns> from existing_table;
For example, the statement looks like this:
create table employee as select id as id, emp_name as name from emp;
Try this:
INSERT INTO TABLE table_name
(
  col1, col2   -- the columns you want to insert values into, in lowercase
)
SELECT columns_you_need FROM source_table;
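Putting the two answers above together, a rough end-to-end sketch with placeholder names, types, and paths (stage everything first, then keep only the columns you need):
-- stage the raw file as-is (external, so nothing is copied)
CREATE EXTERNAL TABLE staging_raw (
  id    INT,
  name  STRING,
  extra STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/your/hdfs/path/';

-- second table holding only the columns of interest
CREATE TABLE employee AS
SELECT id, name
FROM staging_raw;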

How to create a temporary table in ORACLE with READ ONLY access

I am using a CREATE GLOBAL TEMPORARY TABLE script to create a temporary table in the Oracle DB, but it shows SQL Error: ORA-01031: insufficient privileges. I want to create a temp table with read-only access. Please help me out with this.
What we are trying to achieve is:
We have to create a table in the destination database, which is always Greenplum.
In the source database (Oracle) we get a select query from the user, for example "select * from ABC A join DEF D on A.Col1=D.col1", and then we create a TEMP TABLE (in the case of Oracle) on top of it, for example "CREATE GLOBAL TEMPORARY TABLE table101 AS (select * from ABC A join DEF D on A.Col1=D.col1)".
Then, using this temp table, we get the required information from INFORMATION_SCHEMA (the dictionary views in Oracle), for example "select * from ALL_TAB_COLUMNS where table_name='table101'". This gives us the column_name, data_type, character_maximum_length, etc. Using this information we build the "create table" statement with JavaScript.
Then we store this create table statement in a variable and run it in an Execute Row Script step (in the Pentaho Data Integration tool), which creates the table in the destination DB.
The problem is that we have read-only access in Oracle. Now what can we do?
NOTE: In short, we are creating a table in the destination DB using the select statement from the source DB, meaning the structure of the table in the destination DB depends on the select query in the source DB.
If the target database is also an Oracle database, then you should be able to set up a database link there to the source database and use a "CREATE TABLE ... AS SELECT * FROM <source table>@<database link>;".
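As a concrete sketch of that suggestion (link name, credentials, and table names are all placeholders):
-- on the target Oracle database, create a link back to the source
CREATE DATABASE LINK src_link
  CONNECT TO source_user IDENTIFIED BY source_password
  USING 'SOURCE_TNS_ALIAS';

-- copy structure and data across the link
CREATE TABLE local_copy AS
SELECT * FROM some_source_table@src_link;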
Whoops, I just noticed that this is from 2014. Oh well, maybe it will help future inquirers.
