Tables created in HBase shell are not detected in Phoenix shell - hadoop

When I create a table through the Phoenix shell it is detected in the HBase shell by the list command, but a table created in the HBase shell is not identified in Phoenix.
Phoenix only detects the tables that were created in the Phoenix shell, in addition to the default HBase system tables.
How can I fix this problem?

The problem is that Phoenix is case-sensitive: unquoted identifiers are folded to uppercase, so it only finds tables whose names are in uppercase unless you quote the name.

You need to create a view on top of the HBase table to be able to query it from Phoenix.
To create the view, open the Phoenix shell and issue a CREATE VIEW command like the one below:
CREATE VIEW "<table_name>" ( ROWKEY VARCHAR PRIMARY KEY, "<column_family_name>"."<column_name>" <data_type>, "<column_family_name>"."<column_name>" <data_type> )
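For example, assuming a hypothetical HBase table user with a column family info and string-encoded columns name and city, the mapping view could look like this (the double quotes preserve the lowercase names exactly as they exist in HBase):
CREATE VIEW "user" (
    ROWKEY VARCHAR PRIMARY KEY,
    "info"."name" VARCHAR,
    "info"."city" VARCHAR
);
After that, SELECT * FROM "user" should work in the Phoenix shell, assuming the values were written as plain strings.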
For more details you can check How to use existing HBase table in Apache Phoenix

Related

Spark (2.3) not able to identify new columns in Parquet table added via Hive Alter Table command

I have a Hive Parquet table which I am creating using the Spark 2.3 API df.saveAsTable. There is a separate Hive process that alters the same Parquet table to add columns (based on requirements).
However, next time when I try to read the same parquet table into Spark dataframe, the new column which was added to the parquet table using Hive Alter Table command is not showing up in the df.printSchema output.
Based on initial analysis, it seems that there might be some conflict, and Spark is using its own schema instead of reading the Hive metastore.
Hence, I tried the options below:
Changing the spark setting:
spark.sql.hive.convertMetastoreParquet=false
and Refreshing the spark catalog:
spark.catalog.refreshTable("table_name")
However, the above two options are not solving the problem.
Any suggestions or alternatives would be super helpful.
This sounds like the bug described in SPARK-21841. The JIRA description also contains the idea for a possible workaround:
...Interestingly enough it appears that if you create the table differently like:
spark.sql("create table mydb.t1 select ip_address from mydb.test_table limit 1")
Run your alter table on mydb.t1, then
val t1 = spark.table("mydb.t1")
Then it works properly...
To apply this workaround, you have to run the same ALTER TABLE command that was used in Hive from the spark-shell as well:
spark.sql("alter table TABLE_NAME add COLUMNS (col_A string)")

How to use describe 'table_name' in HBase shell to create a table.

I have to create a table in a different cluster and I only have the description of the HBase table handy. How do I create the new HBase table in the different cluster?
Go to the HBase shell by typing hbase shell in a terminal on your new cluster, then issue the command create '<table name>', '<column family>', supplying the table name and column family name that you already have from describe '<table name>' on the previous cluster.
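For example, if describe on the old cluster showed a table named 't1' with a single column family 'cf1' (hypothetical names), the commands on the new cluster would be:
hbase shell
create 't1', 'cf1'
describe 't1'   # verify the new table matches the original description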
for more info:
https://www.tutorialspoint.com/hbase/hbase_create_table.htm
https://www.tutorialspoint.com/hbase/hbase_describe_and_alter.htm

HBase Shell - Create a reduced table from existing Hbase table

I want to create a reduced version of an HBase table via the HBase shell. For example:
HBase Table 'test' is already present in HBase with following info:
TableName: 'test'
ColumnFamily: 'f'
Columns: 'f:col1', 'f:col2', 'f:col3', 'f:col4'
I want to create another table in HBase 'test_reduced' which looks like this
TableName: 'test_reduced'
ColumnFamily: 'f'
Columns: 'f:col1', 'f:col3'
How can we do this via the HBase shell? I know how to copy the table using the snapshot command, so I am mainly looking for a way to drop columns from an HBase table.
You can't do it with the HBase shell alone; you need to use the HBase Client API (a minimal sketch is below):
1- Read the source table in.
2- Only "put" the columns you want into your new table.
Cloudera came close by enabling users to perform "partial HBase table copies" with the CopyTable utility, but that only lets you rename column families ... (I am not sure you are using Cloudera), and even that is not what you are looking for.
for your ref:
http://blog.cloudera.com/blog/2012/06/online-hbase-backups-with-copytable-2/

Why doesn't Hive allow creating an external table with CTAS?

In Hive, creating an external table with CTAS is a semantic error. Why?
The table created by CTAS is atomic, while an external table just means the data will not be deleted when the table is dropped; the two do not seem to conflict.
In Hive, when we create a (non-external) table the data is stored under /user/hive/warehouse.
But when an external Hive table is created the file can live anywhere else; we are just pointing to that HDFS directory and exposing the data as a Hive table to run Hive queries etc.
More precisely, see this SO answer: Create hive table using "as select" or "like" and also specify delimiter
Am I missing something here?
Try this... You should be able to create an external table with CTAS:
CREATE TABLE ext_table LOCATION '/user/XXXXX/XXXXXX'
AS SELECT * from managed_table;
I was able to create one. I am using 0.12.
I think it's a semantic error because it misses the most important parameter of an external table definition, viz. the external location of the data file. By definition:
1. External means the data is outside Hive's control, residing outside the Hive data warehouse directory.
2. If the table is dropped, the data remains intact; only the table definition is removed from the Hive metastore.
So:
i. If CTAS were allowed with a managed source table, the new external table would have its file in the warehouse, which would be removed on drop table, making #2 wrong.
ii. If CTAS were done from another external table, the two tables would point to the same file location.
CTAS creates a managed Hive table with the new name, using the schema and data of the source table.
You can convert it to an external table using:
ALTER TABLE <TABLE_NAME> SET TBLPROPERTIES('EXTERNAL'='TRUE');
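Combining the two suggestions above, a minimal sketch (the table names and HDFS path are placeholders):
-- CTAS creates a managed table; a custom LOCATION keeps the data outside the warehouse dir
CREATE TABLE ext_table
LOCATION '/user/hive/staging/ext_table'
AS SELECT * FROM managed_table;

-- Flip it to external so DROP TABLE no longer deletes the data files
ALTER TABLE ext_table SET TBLPROPERTIES('EXTERNAL'='TRUE');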

Sqoop - Create empty hive partitioned table based on schema of oracle partitioned table

I have an Oracle table which has 80 columns and is partitioned on the state column. My requirement is to create a Hive table with a schema similar to the Oracle table, partitioned on state.
I tried using the sqoop --create-hive-table option, but I keep getting this error:
ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.IllegalArgumentException: Partition key state cannot be a column to import.
I understand that in Hive the partition column should not be part of the table definition, but then how do I get around the issue?
I do not want to write the create table commands manually, as I have 50 such tables to import and would like to use sqoop.
Any suggestion or ideas?
Thanks
There is a workaround for this. Below is the procedure I follow (a sketch of such a script follows the steps):
1. On Oracle, run a query to get the schema for the table and store it in a file.
2. Move that file to Hadoop.
3. On Hadoop, create a shell script which constructs an HQL file. That HQL file contains the Hive CREATE TABLE statement along with the columns; for this we can use the file from step 1 (the Oracle schema file copied to Hadoop).
4. To run the script you just need to pass the Hive database name, table name, partition column name, path, etc., depending on your level of customization. At the end of the shell script add hive -f <HQL filename>.
If everything is ready, it takes just a couple of minutes for each table creation.
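A minimal bash sketch of such a driver script; every name, the schema-file format (one "column hive_type" pair per line, partition column already excluded), and the storage clause are assumptions for illustration:
#!/bin/bash
# Hypothetical usage: ./create_hive_table.sh <hive_db> <table> <partition_col> <hdfs_location>
DB=$1; TABLE=$2; PART_COL=$3; LOCATION=$4
SCHEMA_FILE="${TABLE}_schema.txt"   # Oracle schema dump copied to this host, "col_name hive_type" per line
HQL_FILE="${TABLE}.hql"

{
  echo "CREATE TABLE ${DB}.${TABLE} ("
  # turn the "col type" lines into a comma-separated column list
  sed 's/$/,/' "${SCHEMA_FILE}" | sed '$ s/,$//'
  echo ") PARTITIONED BY (${PART_COL} STRING)"
  echo "STORED AS TEXTFILE"
  echo "LOCATION '${LOCATION}';"
} > "${HQL_FILE}"

hive -f "${HQL_FILE}"   # create the empty partitioned table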
