How to use a SQL query to define a table in dbtable? - jdbc

In JDBC To Other Databases I found the following explanation of the dbtable parameter:
The JDBC table that should be read. Note that anything that is valid in a FROM clause of a SQL query can be used. For example, instead of a full table you could also use a subquery in parentheses.
When I use the code:
CREATE TEMPORARY TABLE jdbcTable
USING org.apache.spark.sql.jdbc
OPTIONS (
url "jdbc:postgresql:dbserver",
dbtable "mytable"
)
everything works great, but the following:
dbtable "SELECT * FROM mytable"
leads to an error.
What is wrong?

Since dbtable is used as the source of a SELECT statement, it has to be in a form that would be valid in a normal SQL query. If you want to use a subquery, you should pass the query in parentheses and provide an alias:
CREATE TEMPORARY TABLE jdbcTable
USING org.apache.spark.sql.jdbc
OPTIONS (
url "jdbc:postgresql:dbserver",
dbtable "(SELECT * FROM mytable) tmp"
);
It will be passed to the database as the following (Spark appends WHERE 1=0 so it can resolve the schema without fetching any rows):
SELECT * FROM (SELECT * FROM mytable) tmp WHERE 1=0

Code In Scala
val checkQuery = s"(SELECT * FROM $inputTableName ORDER BY $columnName DESC LIMIT 1) AS timetable"
val timeStampDf = spark.read
  .format("jdbc")
  .option("url", url)
  .option("dbtable", checkQuery)
  .load()
An alias is also required after the query in parentheses.
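On Spark 2.4 and later, the JDBC source also accepts a query option that does the wrapping and aliasing for you. A minimal sketch, assuming the same PostgreSQL url and table name as above (query and dbtable are mutually exclusive):
// Sketch: the Spark 2.4+ "query" option as an alternative to a dbtable subquery.
val df = spark.read
  .format("jdbc")
  .option("url", "jdbc:postgresql:dbserver")
  .option("query", "SELECT * FROM mytable") // no parentheses or alias needed
  .load()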

Related

How do I search for all the columns/field names starting with "XYZ" in Azure Databricks

I would like to do a big search on all field/column names that contain "XYZ".
I tried the SQL below, but it gives me an error.
SELECT
  table_name,
  column_name
FROM information_schema.columns
WHERE column_name LIKE '%account%'
ORDER BY table_name, column_name
The error states: "Table or view not found: information_schema.columns; line 4, pos 5"
information_schema.columns is not supported in Databricks SQL. There are no built-in views available to get the complete details of tables along with columns. There are SHOW TABLES (a database needs to be given) and SHOW COLUMNS (a table name needs to be given).
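For a quick look, those two commands can be run directly. A minimal sketch in Scala, where the database and table names are assumptions:
// Sketch: the built-in commands mentioned above (database/table names assumed).
val tables = spark.sql("SHOW TABLES IN default")           // database, tableName, isTemporary
val columns = spark.sql("SHOW COLUMNS IN default.mytable") // one row per column
columns.show()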
You might have to use PySpark to get the required result. First, use the following code to collect all tables and their respective columns:
from pyspark.sql.functions import lit

# Collect every table in the target database.
db_tables = spark.sql("SHOW TABLES IN default")

final_df = None
for row in db_tables.collect():
    # DESCRIBE TABLE yields one row per column; tag each row with its source table.
    table_df = spark.sql(f"DESCRIBE TABLE {row.database}.{row.tableName}") \
        .withColumn('database', lit(row.database)) \
        .withColumn('tablename', lit(row.tableName)) \
        .select('database', 'tablename', 'col_name')
    final_df = table_df if final_df is None else final_df.union(table_df)

# display(final_df)
final_df.createOrReplaceTempView('req')
With the temporary view in place, apply the following query:
%sql
SELECT tablename, col_name FROM req WHERE col_name LIKE '%id%' ORDER BY tablename, col_name

Is there a Hive equivalent of SQL “LIKE ANY ( SUBQUERY )”

While Hive doesn't support multi-value LIKE queries, which are supported in SQL, e.g.:
SELECT * FROM user_table WHERE first_name LIKE ANY ( 'root~%' , 'user~%' );
we can convert them into an equivalent Hive query:
SELECT * FROM user_table WHERE first_name LIKE 'root~%' OR first_name LIKE 'user~%'
Does anyone know an equivalent solution that Hive does support when a subquery is used with LIKE? Have a look at the example below:
SELECT * FROM user_table WHERE first_name LIKE ANY ( SELECT expr FROM exprTable);
Since the values come from a subquery rather than a literal list, I can't use the same approach of generating multiple LIKE expressions separated by OR/AND operators. Initially I thought of writing a Hive UDF for it. Can you please help me find a Hive equivalent for such an expression?
You can use Hive's RLIKE relational operator, as shown below:
SELECT * FROM user_table WHERE first_name RLIKE 'root~|user~|admin~';
Hope this helps!
This is a case involving theta joins in Hive. There is a wiki page for this and a JIRA request; please go through the details on this page: https://cwiki.apache.org/confluence/display/Hive/Theta+Join
Your case is similar to the Side-Table Similarity case given on the page.
You need to collect the expr values and then use a regular expression to emulate the LIKE. Alternatively, you can use UNION ALL with each LIKE expression in a separate SQL clause; the query might become tedious, so you can generate it programmatically (see the sketch below).
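A minimal Scala sketch of the programmatic approach. Using Spark SQL as the client running the Hive query is an assumption, as is the table layout:
// Sketch: collect the patterns from exprTable, then build one OR'd LIKE query.
// Assumes the pattern list is small enough to collect to the driver
// and that patterns contain no embedded quotes.
val patterns = spark.sql("SELECT expr FROM exprTable")
  .collect()
  .map(_.getString(0))
val condition = patterns.map(p => s"first_name LIKE '$p'").mkString(" OR ")
val result = spark.sql(s"SELECT * FROM user_table WHERE $condition")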
What about this, using EXISTS:
SELECT * FROM user_table WHERE EXISTS ( SELECT * FROM exprTable WHERE first_name LIKE expr );

PostgreSQL - migrate a query with 'start with' and 'connect by' in oracle

I have the following query in Oracle and want to convert it to its PostgreSQL form. Could someone help me out with this?
SELECT user_id, user_name, reports_to, position
FROM pr_operators
START WITH reports_to = 'dpercival'
CONNECT BY PRIOR user_id = reports_to;
Something like this should work for you (SQL Fiddle):
WITH RECURSIVE q AS (
  SELECT po.user_id, po.user_name, po.reports_to, po.position
  FROM pr_operators po
  WHERE po.reports_to = 'dpercival'
  UNION ALL
  SELECT po.user_id, po.user_name, po.reports_to, po.position
  FROM pr_operators po
  JOIN q ON q.user_id = po.reports_to
)
SELECT * FROM q;
You can read more on recursive CTEs in the docs.
Note: your design looks strange; reports_to contains string literals, yet it is compared with user_id, which is typically of type integer.

Query an XMLTable in Oracle 10g

I have a query that uses XML input to generate an XML table, and I give that table the alias "XMLalias". How can I query this table in some other SELECT statement that is part of the same batch?
I want to do something like " select * from XMLalias ".
I am new to Oracle, so please excuse me if this is something really simple.
Thanks.
I'm not sure what you need exactly; I figure what you want is one of these two:
SELECT * FROM
  (SELECT * FROM XMLalias) insider
WHERE insider.col1 /* ... */
Or you wanted something like this:
SELECT *
FROM XMLalias a,
     XMLalias b
WHERE a.key_col = b.other_key_col
  AND a.col1 = /* ... */
  AND b.col2 = /* ... */
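If the goal is to reuse the XMLTABLE output by name in a later SELECT, a WITH clause may also work (XMLTABLE is available from Oracle 10g Release 2). A hedged sketch over JDBC, where the source table, XPaths, column definitions, and connection details are all assumptions:
// Sketch: naming an XMLTABLE result with a WITH clause so it can be reselected.
// The table my_xml_source, its XMLType column doc, and the XPaths are hypothetical.
import java.sql.DriverManager
val conn = DriverManager.getConnection("jdbc:oracle:thin:@//dbserver:1521/orcl", "user", "pass")
val sql =
  """WITH XMLalias AS (
    |  SELECT x.id, x.name
    |  FROM my_xml_source s,
    |       XMLTABLE('/rows/row' PASSING s.doc
    |                COLUMNS id   NUMBER       PATH 'id',
    |                        name VARCHAR2(50) PATH 'name') x
    |)
    |SELECT * FROM XMLalias WHERE id > 0""".stripMargin
val rs = conn.createStatement().executeQuery(sql)
while (rs.next()) println(s"${rs.getInt("id")} ${rs.getString("name")}")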

Set an array of values in a SQL query

I'm using JDBC and have to bind an array of values to a single column.
I know this works in Hibernate and iBatis, but it seems hard to get it working in pure JDBC SQL.
I have an array of String values:
String[] names = new String[]{"A", "B", "C"};
and a query like:
SELECT * FROM emp WHERE name IN (?)
I tried pstmt.setObject(1, names), but it is not working.
This is not supported in pure JDBC. You have to generate the query so that the IN clause contains one placeholder for each element of the array (see the sketch below).
Spring's JDBC helper classes support named parameters and can do what you want.
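A minimal sketch of the placeholder-generation approach in plain JDBC, written in Scala here; the Connection is assumed to come from elsewhere:
// Sketch: expand the IN clause to one "?" per array element, then bind each value.
import java.sql.Connection
def findByNames(conn: Connection, names: Array[String]) = {
  val placeholders = names.map(_ => "?").mkString(", ")
  val pstmt = conn.prepareStatement(s"SELECT * FROM emp WHERE name IN ($placeholders)")
  names.zipWithIndex.foreach { case (n, i) => pstmt.setString(i + 1, n) }
  pstmt.executeQuery()
}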
This will also work with the following syntax (note: the TABLE(...) collection form is Oracle-specific, and createArrayOf support varies by driver):
"SELECT * FROM emp WHERE name IN ( SELECT column_value FROM TABLE( ? ) )"
pstmt.setArray(1, conn.createArrayOf("VARCHAR", names));
