Sqoop --where condition with multiple clauses - Oracle

I'm trying to get data from Oracle and import it into a Hadoop table. I'm changing an existing Sqoop job, and I have to use --where to filter the records. Right now the --where condition is date=somedate; I need to add another condition, like date = somedate and status ='Active'. I have to make this change in --where. I'm not allowed to use --query 🥺.
Can you help me with this?

You can try it like this (inside double quotes, write \$CONDITIONS so that the shell does not expand it as a variable):
--query "select * from table where status = 'Active' AND date=somedate AND \$CONDITIONS"

Use the --where condition wrapped in double quotes, like below.
--where " date = somedate and status ='Active'"
The good news is that you can add as many conditions as you need. You can even add a subquery, as long as it is syntactically valid in the source database.
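As a minimal sketch of how such a combined filter could be assembled before it is handed to Sqoop, the Python snippet below joins several predicates with AND and builds an argument list. The helper name build_where, the column names, the connection string, the table name, and the target directory are all placeholders invented for illustration; only import, --connect, --table, --where, and --target-dir are taken from Sqoop itself.

```python
import shlex

def build_where(conditions):
    """Join individual SQL predicates into a single --where value."""
    return " AND ".join(f"({c})" for c in conditions)

# Hypothetical predicates; column names and the date literal are placeholders.
where = build_where(["load_date = DATE '2020-01-31'", "status = 'Active'"])

# Passing the arguments as a list sidesteps shell-quoting problems.
args = [
    "sqoop", "import",
    "--connect", "jdbc:oracle:thin:@//dbhost:1521/ORCL",  # placeholder
    "--table", "MY_TABLE",                                 # placeholder
    "--where", where,
    "--target-dir", "/tmp/my_table",                       # placeholder
]

# Shell-escaped form, e.g. for logging or pasting into a script.
print(shlex.join(args))
```

The list form could be passed to subprocess.run(args) as is; the printed string is only there to show how the single quotes inside the filter end up escaped for a shell.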

This syntax worked for me:
--query "select * from table where date=somedate AND status ='Active' AND (\$CONDITIONS)"
There is no need to use --where.

Related

Equivalent of 'IN' or 'NOT' operator in sqoop

I have a Sqoop job to run. The conditions include:
WHERE cond1='' AND date = '2-12-xxxx' AND date = '3-12-xxxx' AND date = '3-12-xxxx'.
Is there an IN conditional in Sqoop, similar to SQL?
You can run a Sqoop import using --query and pass any query to get the data.
In --where you have to pass conditions like this: --where "cond1='value' and cond2 in (<comma separated values>)".
If you use a where condition on a table, Sqoop applies it as select * from <table> where <condition specified in where clause> to fetch the data, so you can pass any condition that is valid in the source database.
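Note that several equality tests on the same column joined with AND can never all be true at once, which is why IN (an OR over the listed values) is the form wanted here. A small sketch of generating such a clause in Python follows; the helper name sql_in is hypothetical, and the date strings are copied verbatim from the question, elided year included.

```python
def sql_in(column, values):
    """Render `column IN ('v1', 'v2', ...)`, doubling embedded single quotes."""
    if not values:
        raise ValueError("IN () with no values is not valid SQL")
    quoted = ", ".join("'" + v.replace("'", "''") + "'" for v in values)
    return f"{column} IN ({quoted})"

# Value to pass after --where; 'xxxx' is the elided year from the question.
where = "cond1 = 'value' AND " + sql_in("date", ["2-12-xxxx", "3-12-xxxx"])
print(where)
```

Doubling the single quotes only guards against a stray quote breaking the literal; values that come from untrusted input would need stricter validation than this sketch does.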

Hive join without common field

I have the following tables:
Table1:
user_name Url
Rahul www.cric.info.com
ranbir www.rogby.com
sahil www.google.com
banit www.yahoo.com
Table2:
Keyword category
cric sports
footbal sports
google search
I want to search Table1 by matching the keywords in Table2. I can do this with a case statement and the query works, but it is not the right approach, because I have to add another case branch every time I add a new search keyword.
select user_name,
case when url like '%cric%' then 'sports'
else 'undefined'
end as category
from table1;
Thanks, I found a solution for this. First we need to do the join, and after that we need to filter the records.
select user_name, url, keyword, category from (select table1.user_name, table1.url, table2.keyword, table2.category from table1 left outer join table2) a where a.url like concat('%', a.keyword, '%');
I'm not sure about more recent versions, but I've run into a similar problem. The primary issue is that Hive only supports equi-joins: when you apply logic to either side of the join condition, it has difficulty translating the join into a MapReduce job.
An alternative, if the field is reliably structured, is to derive a matching key from the larger field. For example, if you know the keyword always appears in the second position of a dot-delimited URL, you could do something like:
select
t1.Url
, t1.matchKey
from
(select Url, split(Url, "\\.")[1] as matchKey from Table1) t1
join Table2 t2 on t2.keyword = t1.matchKey
;
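To see what a derived key does to the sample data from the question, here is a small Python simulation. It is not Hive code: the tables are plain Python structures, and the helper name match_key is made up for the illustration.

```python
# Sample rows copied from the question.
table1 = [
    ("Rahul", "www.cric.info.com"),
    ("ranbir", "www.rogby.com"),
    ("sahil", "www.google.com"),
    ("banit", "www.yahoo.com"),
]
table2 = {"cric": "sports", "footbal": "sports", "google": "search"}

def match_key(url):
    """Second dot-delimited component, or None if the URL has no dot."""
    parts = url.split(".")
    return parts[1] if len(parts) > 1 else None

# Inner equi-join on the derived key: only exact keyword matches survive.
joined = [
    (user, url, table2[match_key(url)])
    for user, url in table1
    if match_key(url) in table2
]

for row in joined:
    print(row)
```

Only the rows for Rahul and sahil come through, because the key must equal a keyword exactly. A keyword sitting elsewhere in the URL would be missed, which is the trade-off of this approach compared with a LIKE match.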

Is there a Hive equivalent of SQL “LIKE ANY ( SUBQUERY )”

Hive doesn't support multi-value LIKE queries of the kind that some SQL dialects support, for example:
SELECT * FROM user_table WHERE first_name LIKE ANY ( 'root~%' , 'user~%' );
We can convert it into an equivalent Hive query:
SELECT * FROM user_table WHERE first_name LIKE 'root~%' OR first_name LIKE 'user~%'
Does anyone know an equivalent that Hive supports when a subquery is used with LIKE? Have a look at the example below:
SELECT * FROM user_table WHERE first_name LIKE ANY ( SELECT expr FROM exprTable);
Since the values are not written out in the expression, I can't use the same approach of generating multiple LIKE expressions joined with OR / AND. Initially I thought of writing a Hive UDF for it. Can you help me find a Hive equivalent for such an expression?
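You can use Hive's RLIKE operator with a regular-expression alternation. RLIKE matches anywhere in the string, so anchor the pattern with ^ to get the same prefix match as LIKE 'root~%', for example:
SELECT * FROM user_table WHERE first_name RLIKE '^(root~|user~|admin~)';
Hope this helps!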
This is a case of a theta join in Hive. There is a wiki page and a JIRA request for it. Please go through the details on this page: https://cwiki.apache.org/confluence/display/Hive/Theta+Join
Your case is similar to the Side-Table Similarity case given on the page.
You need to convert the expr values into a map and then use a regular expression to do the LIKE matching. Alternatively, you can use UNION ALL with each LIKE expression in a separate query; the query might become tedious to write, so you can generate it programmatically.
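One way to generate such a regular expression programmatically is to translate each LIKE pattern and join the results into one alternation. The sketch below assumes the patterns have already been read from exprTable into a list and that no ESCAPE character is in use; the function names are hypothetical. Python's re module is used only to check the translation. Hive's RLIKE takes Java regular expressions, whose escaping rules differ in small ways, and backslashes may need doubling inside a Hive string literal.

```python
import re

def like_to_regex(pattern):
    """Translate a SQL LIKE pattern into an anchored regular expression."""
    out = []
    for ch in pattern:
        if ch == "%":
            out.append(".*")           # % matches any run of characters
        elif ch == "_":
            out.append(".")            # _ matches exactly one character
        else:
            out.append(re.escape(ch))  # everything else is literal
    return "^" + "".join(out) + "$"

def combine(patterns):
    """OR several LIKE patterns together as one alternation."""
    return "|".join("(?:" + like_to_regex(p) + ")" for p in patterns)

exprs = ["root~%", "user~%"]  # stand-in for the rows of exprTable
combined = combine(exprs)

for name in ["root~alice", "user~", "xroot~alice"]:
    print(name, bool(re.search(combined, name)))
```

The anchors matter: without ^ and $ a regular-expression search would also accept names that merely contain the text, which LIKE does not.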
What about using EXISTS?
SELECT * FROM user_table WHERE EXISTS ( SELECT * FROM exprTable WHERE first_name LIKE expr );

Ruby OCI8 DBI: how to check the query generated after parameter binding? Need to check for "in" queries

While using Ruby-DBI I am facing issues with parameter binding for where "in" queries.
Two questions:
How do I get the SQL generated after parameter binding?
Does an "in" parameter work properly when using DBI with OCI8?
My code looks like this:
dbh = DBI.connect(setting[:tns], setting[:username], setting[:password])
# date and in_params are the parameters of the SQL query.
# In the query they appear as ? (question marks).
sth = dbh.execute(File.read('import_values.sql'), date, in_params)
The query looks like this:
SELECT date, col1, col2
FROM TABLEX
WHERE date = ?
AND col1 not in ( ? )
Please help.
I refactored the code to not use "in".
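For reference, a placeholder normally binds one scalar value, so a whole list cannot usually be bound to a single ? inside IN ( ? ). A common workaround is to generate one placeholder per value. The sketch below shows the idea in Python with the standard sqlite3 module standing in for the Oracle connection; it does not verify how Ruby DBI or OCI8 behave, and the table contents and the helper name not_in_query are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablex (d TEXT, col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO tablex VALUES (?, ?, ?)",
    [
        ("2020-01-01", "a", "x"),
        ("2020-01-01", "b", "y"),
        ("2020-01-01", "c", "z"),
        ("2020-01-02", "d", "w"),
    ],
)

def not_in_query(n):
    """Build the query text with one placeholder per excluded value."""
    if n < 1:
        raise ValueError("need at least one value for NOT IN")
    marks = ", ".join(["?"] * n)
    return (
        "SELECT d, col1, col2 FROM tablex "
        f"WHERE d = ? AND col1 NOT IN ({marks}) ORDER BY col1"
    )

excluded = ["a", "b"]
sql = not_in_query(len(excluded))

# Flatten the parameters: the date first, then each excluded value.
rows = conn.execute(sql, ["2020-01-01", *excluded]).fetchall()
print(sql)
print(rows)
```

Printing the generated text together with the parameter list is also a simple way to inspect what is being sent, since drivers generally pass bound values separately rather than substituting them into the SQL string.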

Is it possible to refer to column names via bind variables in Oracle?

I am trying to refer to a column name to order a query in an application communicating with an Oracle database. I want to use a bind variable so that I can dynamically change what to order the query by.
The problem that I am having is that the database seems to be ignoring the order by column.
Does anyone know if there is a particular way to refer to a database column via a bind variable or if it is even possible?
E.g. my query is
SELECT * FROM PERSON ORDER BY :1
(where :1 will be bound to PERSON.NAME)
The query is not returning results in alphabetical order. I am worried that the database is interpreting this as:
SELECT * FROM PERSON ORDER BY 'PERSON.NAME'
which will obviously not work.
Any suggestions are much appreciated.
No. You cannot use bind variables for table or column names.
This information is needed to create the execution plan. Without knowing what you want to order by, it would be impossible to figure out what index to use, for example.
Instead of bind variables, you have to directly interpolate the column name into the SQL statement when your program creates it. Assuming that you take precautions against SQL injection, there is no downside to that.
Update: If you really wanted to jump through hoops, you could probably do something like
order by decode(?, 'colA', colA, 'colB', colB)
but that is just silly. And slow. Don't.
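One common precaution when interpolating a column name is to accept only names from a fixed whitelist. A minimal sketch in Python follows, with the standard sqlite3 module standing in for the Oracle connection; the table contents, the ALLOWED mapping, and the helper name order_by_query are assumptions made for the illustration.

```python
import sqlite3

# User-facing sort key -> real column name. Anything else is rejected.
ALLOWED = {"name": "name", "age": "age"}

def order_by_query(sort_key):
    """Return the SELECT text with a whitelisted ORDER BY column."""
    try:
        column = ALLOWED[sort_key]
    except KeyError:
        raise ValueError(f"unsupported sort key: {sort_key!r}")
    return f"SELECT name, age FROM person ORDER BY {column}"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("Carol", 30), ("Alice", 41), ("Bob", 25)],
)

by_name = conn.execute(order_by_query("name")).fetchall()
by_age = conn.execute(order_by_query("age")).fetchall()
print(by_name)
print(by_age)
```

Because the interpolated text always comes from the mapping and never from the caller's string, input such as "name; DROP TABLE person" raises ValueError instead of reaching the database.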
Since you are using JDBC, you can rewrite your code without bind variables. This way you can also change the ORDER BY dynamically, e.g.:
String query = "SELECT * FROM PERS ";
if (condition1) {
    query = query + " order by name ";
    // insert more if/else or case statements
} else {
    query = query + " order by other_column ";
}
Statement select = conn.createStatement();
ResultSet result = select.executeQuery(query);
Or even:
String columnName = getColumnName(input);
Statement select = conn.createStatement();
ResultSet result = select.executeQuery("SELECT * FROM PERS ORDER BY "+columnName);
ResultSet result = select.executeQuery(
"SELECT * FROM PERS ORDER BY " + columnName
);
will always be a new statement to the database.
That means it is, as Thilo already explained, impossible to "reorder" an already bound, calculated, prepared, parsed statement. If you use this result set over and over in your application and the only thing that changes over time is the order of presentation, try sorting the set in your client code.
Otherwise, dynamic SQL is fine, but comes with a huge footprint.
