Can I use distinct on two fields in Korma?

I have a table with four fields: a, b, c and d.
I want a query like:
select distinct a, b from t;
The documentation suggests something like
(k/select my-table
(k/modifier "DISTINCT")
(k/fields :a :b))
But the generated SQL is like:
SELECT distinct a, b, c, d FROM my_table;
What I want is:
SELECT distinct a, b FROM my_table;
How do I restrict the distinct modifier to only two fields?
Experimenting with different modifier values (e.g. DISTINCT (a, b)) results in a mangled SQL query.
Here's a complete example:
(k/defentity my-table (k/entity-fields :a :b :c :d))
(k/sql-only (k/select my-table (k/fields :a :b)
(k/modifier "DISTINCT")))
"SELECT DISTINCT `my-table`.`a`,
`my-table`.`b`,
`my-table`.`c`,
`my-table`.`d`,
`my-table`.`a`,
`my-table`.`b` FROM `my-table`"

The SQL keyword DISTINCT applies to the entire select list (that is, to whole result rows) in most (all?) databases, including MySQL and SQL Server. Therefore, in at least those databases, there is no way to apply DISTINCT to only a subset of the selected columns.
A different query, possibly like those suggested in SQL - 'DISTINCT' based on only some columns?, might get you what you need.
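To make the row-level behaviour of DISTINCT concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Korma. The table t and its sample rows are made up for illustration, and the GROUP BY variant is only one possible "different query", not something the linked question necessarily recommends.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER, d INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, 1, 10, 100), (1, 1, 20, 200), (1, 2, 30, 300), (2, 2, 40, 400)],
)

# DISTINCT covers every column in the select list, so listing only a and b
# yields the distinct (a, b) pairs.
pairs = conn.execute("SELECT DISTINCT a, b FROM t ORDER BY a, b").fetchall()

# With all four columns selected, every sample row is already unique,
# so DISTINCT removes nothing.
all_rows = conn.execute("SELECT DISTINCT a, b, c, d FROM t").fetchall()

# One row per (a, b) while still returning a value from c: aggregate it.
grouped = conn.execute(
    "SELECT a, b, MIN(c) FROM t GROUP BY a, b ORDER BY a, b"
).fetchall()

print(pairs)
print(len(all_rows))
print(grouped)
```

If the generated SQL lists only a and b, plain DISTINCT already gives the wanted result; the extra columns in the select list are what change it.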

Related

Using query hints to use a index in an inner table

I have a query which uses the view a as follows and the query is extremely slow.
select *
from a
where a.id = 1 and a.name = 'Ann';
The view a is made up of another four views b, c, d, e.
select b.id, c.name, c.age, e.town
from b,c,d,e
where c.name = b.name AND c.id = d.id AND d.name = e.name;
I have created an index on the table of c named c_test and I need to use it when executing the first query.
Is this possible?
Are you really using this deprecated 1980s join syntax? You shouldn't. Use proper explicit joins (INNER JOIN in your case).
You are joining the two tables C and D on their IDs. That should mean they are 1:1 related. If not, "ID" is a misnomer, because an ID is supposed to identify a row.
Now let's look at the access route: You have the ID from table B and the name from tables B and C. We can tell from the column name that b.id is unique and Oracle guarantees this with a unique index, if the database is set up properly.
This means the DBMS will look for the B row with ID 1, find it instantly in the index, find the row instantly in the table, see the name and see whether it matches 'Ann'.
The only thing that can be slow, then, is joining C, D, and E. Joining on unique IDs is extremely fast. Joining on (non-unique?) names is only fast if you provide indexes on the names. I'd recommend the following indexes accordingly:
create index idx_c on c (name);
create index idx_e on e (name);
To get this faster still, use covering indexes instead:
create index idx_b on b (id, name);
create index idx_c on c (name, id, age);
create index idx_d on d (id, name);
create index idx_e on e (name, town);
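As a rough illustration of the advice above, the following sketch rebuilds the view with explicit INNER JOINs and adds the two name indexes. It runs on SQLite through Python's sqlite3 module, not on Oracle, so it says nothing about Oracle's optimizer or its hint syntax. The column types and sample rows are assumptions, and b, c, d, e are plain tables here rather than views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE b (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE c (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
CREATE TABLE d (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE e (name TEXT, town TEXT);

-- Indexes on the non-unique name columns used in the joins.
CREATE INDEX idx_c ON c (name);
CREATE INDEX idx_e ON e (name);

-- The view, rewritten with explicit INNER JOINs.
CREATE VIEW a AS
SELECT b.id AS id, c.name AS name, c.age AS age, e.town AS town
FROM b
INNER JOIN c ON c.name = b.name
INNER JOIN d ON d.id = c.id
INNER JOIN e ON e.name = d.name;

INSERT INTO b VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO c VALUES (1, 'Ann', 30), (2, 'Bob', 40);
INSERT INTO d VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO e VALUES ('Ann', 'Oslo'), ('Bob', 'Rome');
""")

# The original slow query, now running against the rewritten view.
rows = conn.execute("SELECT * FROM a WHERE id = 1 AND name = 'Ann'").fetchall()
print(rows)
```

Whether a given database actually uses these indexes depends on its planner and statistics, so it is worth checking the execution plan on the real system.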

Join if not exists

I encountered the following problem:
I have a long query (let's call it query "Z") that includes lots of joins and subqueries. It outputs two columns:
A: item
B: Integer attribute, range guaranteed to be 1-10
I want to join from table X items (column A) that are not present as output of query Z and give them an arbitrary attribute value 10 (column B).
I tried creating a subquery with an inner subquery using NOT EXISTS, but that requires copying my original query inside it and takes a lot of time (I didn't even manage to execute it).
Any suggestions? Thanks
It's not entirely clear from your question but I think you mean table X is a superset of the records from query Z. If so, a simple outer join should give you the result you want:
select coalesce(z.a, x.a) as a
, coalesce(z.b, 10) as b
from x
left outer join ( your query ) z
on z.a = x.a
If X is not a superset of Z then you should try a FULL OUTER JOIN instead.
I have assumed that column A works as a UID for the query Z and the table X. If this is not the case you'll need to tweak the above statement, or edit your question to include more details.
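A small sketch of the outer-join approach, run on SQLite through Python's sqlite3 module. The table z_source is a hypothetical stand-in for the long query Z, and the sample items are invented. It assumes, as the answer does, that X is a superset of Z and that column A identifies a row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE x (a TEXT);
CREATE TABLE z_source (a TEXT, b INTEGER);  -- hypothetical stand-in for query Z
INSERT INTO x VALUES ('apple'), ('banana'), ('cherry');
INSERT INTO z_source VALUES ('apple', 3), ('banana', 7);
""")

# Every item of x is kept; items missing from z get the default value 10.
rows = conn.execute("""
    SELECT x.a AS a, COALESCE(z.b, 10) AS b
    FROM x
    LEFT OUTER JOIN (SELECT a, b FROM z_source) z
      ON z.a = x.a
    ORDER BY x.a
""").fetchall()
print(rows)
```

The subquery in parentheses is written only once, which avoids repeating the long query inside a NOT EXISTS clause.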
According to your comments, one possibility is to first take the union of the results of your "Z" query with every item of X at the default value 10, and then select only the MIN() of column B per item. Because B from query Z never exceeds 10, MIN() keeps Z's value whenever an item appears in both.
It would look something along the lines of:
SELECT A, MIN(B) AS B FROM
(
(QUERY Z)
UNION
(SELECT ITEM AS A, 10 AS B FROM X)
) U
GROUP BY A
I did it similarly to what @Ancaron posted.
SELECT A, B FROM Z
UNION ALL
SELECT A, 10 FROM X
WHERE NOT EXISTS
(
select Z.A
from Z
WHERE Z.A=X.A
)
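If Z is a long query rather than a table, one way to avoid pasting it twice is a common table expression. This is an assumption about what the target database supports (the question does not name it); the sketch below uses SQLite via Python's sqlite3 module, with a hypothetical z_source table standing in for the real query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE x (a TEXT);
CREATE TABLE z_source (a TEXT, b INTEGER);  -- hypothetical stand-in for query Z
INSERT INTO x VALUES ('apple'), ('banana'), ('cherry');
INSERT INTO z_source VALUES ('apple', 3), ('banana', 7);
""")

# The CTE names the long query once; both branches of the UNION ALL reuse it.
rows = conn.execute("""
    WITH z AS (SELECT a, b FROM z_source)
    SELECT a, b FROM z
    UNION ALL
    SELECT x.a, 10 FROM x
    WHERE NOT EXISTS (SELECT 1 FROM z WHERE z.a = x.a)
    ORDER BY 1
""").fetchall()
print(rows)
```

Whether the database evaluates the CTE once or once per reference is engine-specific, so the performance benefit is not guaranteed.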

How to efficiently select data from two tables?

I have two tables: A, B.
A has prisoner_id and prisoner_name columns.
B has all other info about prisoners, including a prisoner_name column.
First I select all of the data that I need from B:
WITH prisoner_datas AS
(SELECT prisoner_name, ... FROM B WHERE ...)
Then I want to know the IDs of all the rows in prisoner_datas. To do this I need to join on the prisoner_name column, because it is common to both tables.
I did the following:
SELECT A.prisoner_id, prisoner_datas.prisoner_name, prisoner_datas. ...,
FROM A, prisoner_datas
WHERE A.prisoner_name = prisoner_datas.prisoner_name
But it runs very slowly. How can I improve performance?
Add an index on the prisoner_name join column in the B table. Then the following join should have some performance improvement:
SELECT
A.prisoner_id,
B.prisoner_name,
B.prisoner_datas.id -- and other columns if needed
FROM A
INNER JOIN B
ON A.prisoner_name = B.prisoner_name
Note that I used explicit join syntax here. It isn't required, and the query plan might not change, but it makes the query easier to read. I don't think the CTE will change much, but the lack of an index on the join column should be important here.
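For reference, here is a minimal sketch that keeps the asker's CTE and combines it with the suggested index and explicit join. It runs on SQLite through Python's sqlite3 module. The cell column, the filter on it, the index name and the sample rows are all hypothetical, since the original query elides those details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (prisoner_id INTEGER PRIMARY KEY, prisoner_name TEXT);
CREATE TABLE B (prisoner_name TEXT, cell TEXT);  -- cell is a made-up column

-- Index on the join column, as the answer suggests.
CREATE INDEX idx_b_prisoner_name ON B (prisoner_name);

INSERT INTO A VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
INSERT INTO B VALUES ('Ann', 'C1'), ('Bob', 'C2'), ('Cid', 'D1');
""")

query = """
WITH prisoner_datas AS (
    SELECT prisoner_name, cell FROM B WHERE cell LIKE 'C%'
)
SELECT A.prisoner_id, pd.prisoner_name, pd.cell
FROM A
INNER JOIN prisoner_datas pd ON A.prisoner_name = pd.prisoner_name
ORDER BY A.prisoner_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

An index on A.prisoner_name may help as well, depending on which table the planner chooses to drive the join.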

Oracle: INSERT values from SELECT...JOIN, SQL Error: ORA-00947: not enough values

I'm trying to do the following:
INSERT INTO MyTable(a, b, c)
SELECT a FROM source1
JOIN source2 ON ...
Where source2 contains columns B and C.
However Oracle doesn't seem to like this and is telling me "SQL Error: ORA-00947: not enough values".
Am I doing something wrong here? Is this syntax even possible? Or do I have to rewrite it as:
SELECT a, b, c FROM source1, source2 WHERE ....
Thanks!
Use as many identifiers in the SELECT clause as in the INSERT clause, as in:
INSERT INTO MyTable(a, b, c)
SELECT s1.a, s2.b, s2.c FROM source1 s1
JOIN source2 s2 ON ...
The select needs to return the same number of columns as you listed in the INSERT statement.
So: yes, you need to rewrite the query to SELECT a,b,c FROM ...
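The column-count rule can be tried out quickly. The sketch below uses SQLite via Python's sqlite3 module, so the error raised is SQLite's own rather than ORA-00947, and the id join column is an assumption because the original ON clause is elided.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source1 (id INTEGER, a TEXT);
CREATE TABLE source2 (id INTEGER, b TEXT, c TEXT);
CREATE TABLE my_table (a TEXT, b TEXT, c TEXT);
INSERT INTO source1 VALUES (1, 'a1');
INSERT INTO source2 VALUES (1, 'b1', 'c1');
""")

# Three target columns but only one selected column: the statement is rejected.
try:
    conn.execute(
        "INSERT INTO my_table (a, b, c) "
        "SELECT s1.a FROM source1 s1 JOIN source2 s2 ON s2.id = s1.id"
    )
    mismatch_rejected = False
except sqlite3.Error:
    mismatch_rejected = True

# Selecting one value per target column works, with the JOIN kept as is.
conn.execute(
    "INSERT INTO my_table (a, b, c) "
    "SELECT s1.a, s2.b, s2.c FROM source1 s1 JOIN source2 s2 ON s2.id = s1.id"
)
rows = conn.execute("SELECT a, b, c FROM my_table").fetchall()
print(mismatch_rejected, rows)
```

So the JOIN syntax itself is fine; only the select list has to match the column list of the INSERT.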

INNER JOIN Optimization

I have two tables to join. TABLE_A (contains column 'a') and TABLE_BC (contains columns 'b' and 'c').
There is a condition on TABLE_BC. The two tables are joined by 'rowid'.
Something like:
SELECT a, b, c FROM main.TABLE_A INNER JOIN main.TABLE_BC WHERE (b > 10.0 AND c < 10.0) ON main.TABLE_A.rowid = main.TABLE_BC.rowid ORDER BY a;
Alternatively:
SELECT a, b, c FROM main.TABLE_A AS s1 INNER JOIN (SELECT rowid, b, c FROM main.TABLE_BC WHERE (b > 10.0 AND c < 10.0)) AS s2 ON s1.rowid = s2.rowid ORDER BY a;
I need to do this a couple of times with different TABLE_A tables, but TABLE_BC does not change... I could therefore speed things up by creating a temporary in-memory database (mem) for the constant part of the query.
CREATE TABLE mem.cache AS SELECT rowid, b, c FROM main.TABLE_BC WHERE (b > 10.0 AND c < 10.0);
followed by (many)
SELECT a, b, c FROM main.TABLE_A INNER JOIN mem.cache ON main.TABLE_A.rowid = mem.cache.rowid ORDER BY a;
I get the same result set from all the queries above, but the last option is by far the fastest one.
The problem is that I would like to avoid splitting the query into two parts. I would expect SQLite to do the same thing for me automatically (at least in the second scenario), but it does not seem to happen... Why is that?
Thanks.
SQLite is pretty light on optimization. The general rule of thumb: SmallTable Inner Join BigTable is faster than the reverse.
That being said I wonder if your first query would run faster in the following form:
SELECT a, b, c
FROM main.TABLE_A
INNER JOIN main.TABLE_BC ON main.TABLE_A.rowid = main.TABLE_BC.rowid
WHERE (b > 10.0 AND c < 10.0)
ORDER BY a;
Answer from the SQLite User Mailing List:
In short, because SQLite cannot read your mind.
To understand the answer compare speeds of executing one query (with
one TABLE_A) and creating an in-memory database, creating a table in
it and using that table in one query (with the same TABLE_A). I bet
the first option (straightforward query without in-memory database)
will be much faster. So SQLite selects the fastest way to execute your
query. It cannot predict what the future queries will be to understand
how to execute the whole set of queries faster. You can do that and
you should split your query in two parts.
Pavel
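The two-step approach described above can be sketched with Python's sqlite3 module. The table names table_a1 and table_a2, the rid alias and the sample data are made up for illustration, and the main database is in memory here only to keep the example self-contained. The sketch shows the mechanics, not the speed-up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second in-memory database to hold the cached, constant part.
conn.execute("ATTACH DATABASE ':memory:' AS mem")
conn.executescript("""
CREATE TABLE main.table_bc (b REAL, c REAL);
CREATE TABLE main.table_a1 (a TEXT);
CREATE TABLE main.table_a2 (a TEXT);
INSERT INTO main.table_bc VALUES (11.0, 5.0), (9.0, 5.0), (12.0, 9.0);
INSERT INTO main.table_a1 VALUES ('x'), ('y'), ('z');
INSERT INTO main.table_a2 VALUES ('p'), ('q'), ('r');
""")

# Step 1: filter TABLE_BC once and keep the matching rowids.
conn.execute("""
    CREATE TABLE mem.cache AS
    SELECT rowid AS rid, b, c FROM main.table_bc
    WHERE b > 10.0 AND c < 10.0
""")

# Step 2: reuse the cache for each TABLE_A variant.
results = {}
for table in ("table_a1", "table_a2"):  # trusted, hard-coded names
    results[table] = conn.execute(
        f"SELECT a, b, c FROM main.{table} AS s1 "
        "INNER JOIN mem.cache ON s1.rowid = mem.cache.rid ORDER BY a"
    ).fetchall()
print(results)
```

The cache only pays off when the filtered result is reused by several queries, which is exactly the knowledge SQLite does not have when it plans a single statement.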
