I need your expert suggestion. I am working on a web project (using JSP and Oracle) that has multiple database tables, one per category, in which most of the columns match across tables. I want to build search functionality that searches only the matching columns (the columns that exist in all tables). I was thinking of creating a view (a union of all the tables) and searching that view, but I think this will degrade performance, since the tables are partitioned by state and city and hold a huge amount of data.
Example:
Table A: Col 1, Col 2, Col 3
Table B: Col 1, Col 2, Col 3, Col 4
Table C: Col 1, Col 2, Col 3, Col 5
I just want to search on Col 1, Col 2 and Col 3 (these columns exist in all tables).
Is there any other way to build the search that also optimizes performance?
Please help.
WITH table_a AS
(SELECT col1, col3 FROM table1),
table_b AS
(SELECT col1, col3 FROM table2)
SELECT col1, col3 FROM table_a
UNION ALL
SELECT col1, col3 FROM table_b
Just a suggestion
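As a rough illustration of the view-based approach described in the question, the sketch below builds a UNION ALL view over the shared columns and searches it. It uses Python's built-in sqlite3 module as a stand-in for Oracle, purely so the example is self-contained. The table names, the view name all_categories, the source_table tag column and the sample rows are all hypothetical, and the sketch says nothing about how Oracle would perform on large partitioned tables.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle schema (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (col1 TEXT, col2 TEXT, col3 TEXT);
    CREATE TABLE table_b (col1 TEXT, col2 TEXT, col3 TEXT, col4 TEXT);
    CREATE TABLE table_c (col1 TEXT, col2 TEXT, col3 TEXT, col5 TEXT);

    INSERT INTO table_a VALUES ('x', 'red', 'NY');
    INSERT INTO table_b VALUES ('y', 'blue', 'NY', 'extra');
    INSERT INTO table_c VALUES ('z', 'red', 'LA', 'other');

    -- Only the columns shared by all three tables are exposed.
    -- source_table is a hypothetical tag so a hit can be traced back.
    CREATE VIEW all_categories AS
        SELECT 'A' AS source_table, col1, col2, col3 FROM table_a
        UNION ALL
        SELECT 'B', col1, col2, col3 FROM table_b
        UNION ALL
        SELECT 'C', col1, col2, col3 FROM table_c;
""")


def search(col2_value):
    """Search the shared columns through the view, using a bound parameter."""
    cursor = conn.execute(
        "SELECT source_table, col1, col2, col3 FROM all_categories "
        "WHERE col2 = ? ORDER BY source_table",
        (col2_value,),
    )
    return cursor.fetchall()


results = search("red")
print(results)
```

UNION ALL is used rather than UNION because UNION also has to remove duplicate rows across the branches, which is extra work the search does not need.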
Let's say I have a table1 in schema1 like this:
Stu_ID Math
1 A
2 B
3 B+
Now, I want to add a new column, for instance, Literature, into table1 in schema1.
ALTER TABLE schema1.table1
ADD COLUMN Literature STRING;
Table1 now looks like:
Stu_ID Math Literature
1 A NULL
2 B NULL
3 B+ NULL
I want to load the Literature data from table2 in schema2, matched on the respective Stu_ID. Is there a way to do so? I have thought of UPDATE, but as far as I understand, Impala only supports UPDATE on Kudu tables. Please correct me if I'm wrong.
Instead of UPDATE, you can use INSERT OVERWRITE:
insert overwrite table schema1.table1
select
t1.stu_id, t1.Math, t2.Literature
from schema1.table1 t1
left join schema2.table2 t2 ON t1.stu_id=t2.stu_id
This replaces the entire contents of table1 with the old data plus the new column. The left join keeps students that have no match in table2, with Literature left as NULL.
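To make the rebuild-instead-of-update idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. SQLite has no INSERT OVERWRITE, so the sketch writes the joined result into a new table (table1_rebuilt, a hypothetical name) instead. The sample Literature grades are made up, and the LEFT JOIN is an assumption on my part so that students missing from table2 are kept with NULL.

```python
import sqlite3

# SQLite stands in for Impala here (assumption); schemas are flattened away.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (stu_id INTEGER, math TEXT, literature TEXT);
    CREATE TABLE table2 (stu_id INTEGER, literature TEXT);

    INSERT INTO table1 VALUES (1, 'A', NULL), (2, 'B', NULL), (3, 'B+', NULL);
    -- Hypothetical grades; student 3 deliberately has no row in table2.
    INSERT INTO table2 VALUES (1, 'A-'), (2, 'C');

    -- Rebuild the full data set: old columns from table1, new column from table2.
    CREATE TABLE table1_rebuilt AS
        SELECT t1.stu_id, t1.math, t2.literature
        FROM table1 t1
        LEFT JOIN table2 t2 ON t1.stu_id = t2.stu_id;
""")

rebuilt = conn.execute(
    "SELECT stu_id, math, literature FROM table1_rebuilt ORDER BY stu_id"
).fetchall()
print(rebuilt)
```

With an inner join instead, student 3 would disappear from the rebuilt table, which matters when the result overwrites the original.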
There are three tables involved in total: one header base table, one material base table, and one staging table.
I have created the staging table with 4 columns; its values will be loaded from an uploaded CSV. Column 1 is batch_no and column 2 is the attribute.
>header base table (h) has batch_no and batch_id
>material base table (m) has batch_id, attr_m (empty, to be updated)
>staging table (s) has batch_no and attr_s
create table he (BATCH_ID number, BATCH_NO varchar2(30));
create table me (a6 varchar2(30), BATCH_id number);
create table s (batch_no varchar2(30), att varchar2(30));
I want to take values from attr_s and update attr_m against batch_no. How do I do that?
Here's my code. Please help me fix it; it doesn't work:
update me
set a6 = (select att
from s where batch_no = (select he.batch_no
from he, s
where he.batch_no=s.batch_no))
error received:
single row subquery return multiple rows.
The update statement is applied to each individual row in ME. Therefore the assignment operation requires one scalar value to be returned from the subquery. Your subquery is returning multiple values, hence the error.
To fix this you need to further restrict the subquery so it returns one row for each row in ME. From your data model the only way to do this is with the BATCH_ID, like so:
update me
set a6 = (select att
          from s
          where batch_no = (select he.batch_no
                            from he, s
                            where he.batch_no = s.batch_no
                            and he.batch_id = me.batch_id))
This solution will work provided that there is only one record in S matching a given combination of (batch_no, batch_id). As you haven't provided any sample data, I can't verify that the above statement will actually solve your problem.
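Since no sample data was given, here is a small sketch of the correlated update run against made-up rows, using Python's built-in sqlite3 module in place of Oracle. The batch numbers and attribute values are hypothetical, and the subquery is written as a single join between s and he rather than the nested form above; it relies on the same correlation on me.batch_id.

```python
import sqlite3

# SQLite stands in for Oracle (assumption); table shapes follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE he (batch_id INTEGER, batch_no TEXT);
    CREATE TABLE me (a6 TEXT, batch_id INTEGER);
    CREATE TABLE s  (batch_no TEXT, att TEXT);

    -- Hypothetical sample rows: one staging row per batch_no.
    INSERT INTO he VALUES (1, 'B001'), (2, 'B002');
    INSERT INTO me VALUES (NULL, 1), (NULL, 2);
    INSERT INTO s  VALUES ('B001', 'red'), ('B002', 'blue');
""")

# The subquery is evaluated once per row of me; the reference to
# me.batch_id is what narrows it down to a single value.
conn.execute("""
    UPDATE me
    SET a6 = (SELECT s.att
              FROM s
              JOIN he ON he.batch_no = s.batch_no
              WHERE he.batch_id = me.batch_id)
""")

updated = conn.execute(
    "SELECT batch_id, a6 FROM me ORDER BY batch_id"
).fetchall()
print(updated)
```

Without the condition on me.batch_id, the subquery would return one value per matching staging row, which is what triggers the error in Oracle.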
Suppose the table name is ABC_XYZ_123. I want to extract the integer value after the last _.
In the above case, the output should be 123.
I have used the SQL query below:
select from table_name like 'XXX_%';
But I am not getting the required output. Can anyone help me with this query?
Thanks
Using REGEXP_SUBSTR with a capture group we can try:
SELECT REGEXP_SUBSTR(name, '_(\d+)$', 1, 1, NULL, 1)
FROM yourTable;
The question is somewhat unclear:
it looks as if you're looking for table names that contain a number at the end, while
the query you posted suggests that you're trying to select those numbers from one of the table's columns
I'll stick to: "Suppose if the table name is ABC_XYZ_123"
If that's so, it is the data dictionary you'll query. USER_TABLES contains that information.
Let's create that table:
SQL> create table abc_xyz_123 (id number);
Table created.
This query selects the numbers at the end of table names, for all my tables whose names end with a number.
SQL> select table_name,
2 regexp_substr(table_name, '\d+$') result
3 from user_tables
4 where regexp_like(table_name, '\d+$');
TABLE_NAME RESULT
-------------------- ----------
TABLE1 1
TABLE2 2
restore_point-001 001
ABC_XYZ_123 123 --> here's your table
SQL>
Apparently, I have a few of them.
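For anyone who wants to try the pattern outside the database, the capture-group idea from the first answer can be sketched with Python's re module. The helper name trailing_number and the sample names are made up for illustration. Note that this pattern requires an underscore before the digits, so a name like TABLE1 does not match, unlike the '\d+$' pattern in the second answer.

```python
import re


def trailing_number(name):
    """Return the digits after the last underscore as an int, or None."""
    # '_(\d+)$' : an underscore, then one or more digits, anchored at the end.
    match = re.search(r"_(\d+)$", name)
    return int(match.group(1)) if match else None


names = ["ABC_XYZ_123", "TABLE1", "NO_DIGITS"]
extracted = {name: trailing_number(name) for name in names}
print(extracted)
```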
I am trying to learn about deleting duplicate records from a Hive table.
My Hive table: 'dynpart' with columns: Id, Name, Technology
Id Name Technology
1 Abcd Hadoop
2 Efgh Java
3 Ijkl MainFrames
2 Efgh Java
We have options like 'Distinct' to use in a select query, but a select query only retrieves data from the table. Could anyone tell me how to use a delete query to remove the duplicate rows from a Hive table?
I know that deleting or updating records in Hive is not recommended and not standard practice, but I want to learn how it is done.
You can use an insert overwrite statement to rewrite the table with only the distinct rows:
insert overwrite table dynpart select distinct * from dynpart;
This is for the case where your table has rows that are duplicated on only a few selected columns. Suppose you have a table structured as shown below:
id Name Technology
1 Abcd Hadoop
2 Efgh Java --> Duplicate
3 Ijkl Mainframe
2 Efgh Python --> Duplicate
Here the rows are duplicated on the id and Name columns.
You can use an analytic function to find the duplicate row:
select * from
(select Id,Name,Technology,
row_Number() over (partition By Id,Name order by id desc) as row_num
from yourtable)tab
where row_num > 1;
This will give you output as:
id Name Technology row_num
2 Efgh Python 2
When you need to get both the duplicate rows:
select * from
(select Id,Name,Technology,
count(*) over (partition By Id,Name order by id desc) as duplicate_count
from yourtable)tab
where duplicate_count> 1;
Output as:
id Name Technology duplicate_count
2 Efgh Java 2
2 Efgh Python 2
You can insert the distinct records into another table:
create table temp as select distinct * from dynpart;
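The two deduplication ideas above (select distinct for exact duplicates, row_number for duplicates on selected columns) can be tried out in a small sketch. It uses Python's built-in sqlite3 module rather than Hive, and it assumes an SQLite build with window function support (3.25 or later). Ordering by technology inside the window is my own choice to make the surviving row deterministic; it is not part of the answers above.

```python
import sqlite3

# SQLite stands in for Hive (assumption); data follows the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dynpart (id INTEGER, name TEXT, technology TEXT);
    INSERT INTO dynpart VALUES
        (1, 'Abcd', 'Hadoop'),
        (2, 'Efgh', 'Java'),
        (3, 'Ijkl', 'MainFrames'),
        (2, 'Efgh', 'Java');
""")

# Case 1: rows that are exact duplicates collapse under SELECT DISTINCT.
distinct_rows = conn.execute(
    "SELECT DISTINCT id, name, technology FROM dynpart ORDER BY id"
).fetchall()

# Case 2: add a row that is a duplicate on (id, name) only.
conn.execute("INSERT INTO dynpart VALUES (2, 'Efgh', 'Python')")

# Number the rows within each (id, name) group and keep only the first.
one_per_key = conn.execute("""
    SELECT id, name, technology
    FROM (SELECT id, name, technology,
                 ROW_NUMBER() OVER (PARTITION BY id, name
                                    ORDER BY technology) AS row_num
          FROM dynpart) AS tab
    WHERE row_num = 1
    ORDER BY id
""").fetchall()

print(distinct_rows)
print(one_per_key)
```

In Hive, the result of either query would then be written back with insert overwrite, as in the first answer.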
Writing a Hive query over a table to pick the row with the maximum value in a column
There is a table with the following data, for example:
key value updated_at
1 "a" 1
1 "b" 2
1 "c" 3
The row that was updated last needs to be selected.
I am currently using the following logic:
select tab1.* from table_name tab1
join (select key, max(updated_at) as max_updated from table_name group by key) tab2
on tab1.key=tab2.key and tab1.updated_at = tab2.max_updated;
Is there a better way to do this?
If it is true that updated_at is unique for that table, then the following is perhaps a simpler way of getting you what you are looking for:
-- I'm using Hive 0.13.0
SELECT * FROM table_name ORDER BY updated_at DESC LIMIT 1;
If it is possible for updated_at to be non-unique for some reason, you may need to adjust the ORDER BY logic to break any ties in the fashion you wish.
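To show how the two approaches differ once the table holds more than one key, here is a small sketch using Python's built-in sqlite3 module in place of Hive. The rows for key 2 are made up for illustration. The join on max(updated_at) returns the latest row for each key, while ORDER BY ... LIMIT 1 returns only the single latest row in the whole table.

```python
import sqlite3

# SQLite stands in for Hive (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_name (key INTEGER, value TEXT, updated_at INTEGER);
    INSERT INTO table_name VALUES
        (1, 'a', 1), (1, 'b', 2), (1, 'c', 3),
        (2, 'x', 5), (2, 'y', 4);   -- key 2 is hypothetical extra data
""")

# Latest row per key: join each row to its key's maximum updated_at.
latest_per_key = conn.execute("""
    SELECT t.key, t.value, t.updated_at
    FROM table_name t
    JOIN (SELECT key, MAX(updated_at) AS max_updated
          FROM table_name
          GROUP BY key) m
      ON t.key = m.key AND t.updated_at = m.max_updated
    ORDER BY t.key
""").fetchall()

# Latest row overall: sort everything and keep the first row.
latest_overall = conn.execute(
    "SELECT key, value, updated_at FROM table_name "
    "ORDER BY updated_at DESC LIMIT 1"
).fetchone()

print(latest_per_key)
print(latest_overall)
```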