Merging Group by and order by in Oracle - oracle

This is what I have, followed by what I want to do. d is a table, and act is a view that combines about 50 tables:
SELECT d.extensionaspect1 AS "Exception ID",
d.causedat AS "Date",
max(act.date_) AS "Causedate",
act.prodsteppath AS "Production Step",
--I'm selecting other columns from d, but I'm not going to include
--them for the sake of clarity
FROM pasx.deviationevent d,
pasx.vtemp_prodsteppath act
--There's more here, but I'm not going to include them for the sake of clarity
WHERE act.date_ < d.causedat
--as before, there's a large amount of AND clauses here that make
--sure that I get the timestamps I want. This is a pretty complex query
GROUP BY d.causedat,
ORDER BY d.extensionaspect1;
Running it like this gives me ORA-00936: missing expression. I'm very new at this, so I'm probably fundamentally misunderstanding how to use max(act.date_) to find the timestamp that occurred just before d.causedat. act.date_ contains all the possible timestamps that could lead to an exception being thrown, but only the one just before each timestamp in d.causedat is relevant to me.
vtemp_prodsteppath is a view and contains no ORDER BY or GROUP BY clauses. How do I join these two (d and act) together? What expression am I missing?

Related

Oracle Sql group function is not allowed here

I need someone to explain the error "group function is not allowed here" to me, because I don't understand it and would like to.
I have to get the product name and the unit price of the products whose price is above the average.
I initially tried this, but Oracle quickly told me that it was wrong:
SELECT productname,unitprice
FROM products
WHERE unitprice>(AVG(unitprice));
I searched for information and found that I could get it this way:
SELECT productname,unitprice FROM products
WHERE unitprice > (SELECT AVG(unitprice) FROM products);
What I want to know is: why do you need two SELECTs?
What does "group function is not allowed here" mean?
I have encountered this error more than once, and I would like to understand what to do when it appears.
Thank you very much for your time.
The phrase "group function is not allowed here" refers to anything that is in some way an aggregation of data, e.g. SUM, MIN, MAX, etc. These functions must operate on a set of rows, and to operate on a set of rows you need a SELECT statement. (I'm leaving out UPDATE/DELETE here.)
If this were not the case, you would end up with ambiguities. For example, let's say we allowed this:
select *
from products
where region = 'USA'
and avg(price) > 10
Does this mean you want the average price across all products, or just the average price of the products in the USA? The query would be ambiguous.
Here's another option:
SELECT *
FROM (
SELECT productname,unitprice,AVG(unitprice) OVER (PARTITION BY 1) avg_price
FROM products)
WHERE unitprice > avg_price
The reason your original SQL doesn't work is because you didn't tell Oracle how to compute the average. What table should it find it in? What rows should it include? What, if any, grouping do you wish to apply? None of that is communicated with "WHERE unitprice>(AVG(unitprice))".
Now, as a human, I can make a pretty educated guess that you intend the averaging to happen over the same set of rows you select from the main query, with the same granularity (no grouping). We can accomplish that either by using a sub-query to make a second pass on the table, as your second SQL did, or the newer windowing capabilities of aggregate functions to internally make a second pass on your query block results, as I did in my answer. Using the OVER clause, you can tell Oracle exactly what rows to include (ROWS BETWEEN ...) and how to group it (PARTITION BY...).
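To make the difference concrete, here is a minimal sketch that runs the three forms against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for Oracle here, the sample rows are made up, and the window-function query assumes the bundled SQLite is version 3.25 or newer. SQLite reports the misplaced aggregate with its own error message rather than Oracle's, but the idea is the same: the aggregate in WHERE is rejected, while the sub-query and the OVER clause both work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (productname TEXT, unitprice REAL)")
# Made-up sample rows; the average unit price is 22.8125
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("Chai", 18.0), ("Chang", 19.0), ("Tofu", 23.25), ("Ikura", 31.0)],
)

# 1. Aggregate used directly in WHERE: the database refuses to compile it
try:
    conn.execute(
        "SELECT productname FROM products WHERE unitprice > AVG(unitprice)"
    )
    aggregate_in_where_rejected = False
except sqlite3.Error:
    aggregate_in_where_rejected = True

# 2. Sub-query: a second pass over the table computes the average
subquery_rows = conn.execute(
    "SELECT productname, unitprice FROM products "
    "WHERE unitprice > (SELECT AVG(unitprice) FROM products) "
    "ORDER BY productname"
).fetchall()

# 3. Window function: the average is attached to every row, then filtered
window_rows = conn.execute(
    "SELECT productname, unitprice FROM ("
    "  SELECT productname, unitprice, AVG(unitprice) OVER () AS avg_price"
    "  FROM products) "
    "WHERE unitprice > avg_price ORDER BY productname"
).fetchall()

print(aggregate_in_where_rejected, subquery_rows, window_rows)
```

Both working forms return the same two products; which one performs better on a real table is something to check in the query plan.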

Azure Databricks: Error in SQL statement: package.TreeNodeException: execute, tree: Exchange hashpartitioning

I am currently working with two temporary views, A and B. Selecting records from either view on its own returns results. Creating a third view C as a join of A and B also works, but any SELECT query on view C fails with "Error in SQL statement: package.TreeNodeException: execute, tree: Exchange hashpartitioning".
Please help me understand what is going wrong here.
One possible reason is a skewed join, i.e. the fields you are joining on match in many combinations. This mainly happens when the join fields contain null values in both views, in which case many null keys on one side may end up paired with many null keys on the other.
It can be avoided by adding another field to the join condition. Alternatively, join the non-null values separately and then append the null-key rows from both sides.
If this does not solve the problem, please share a code snippet so that we can try to reproduce and fix the issue.
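The "join the non-null values separately, then append the nulls" idea can be sketched in plain Python. This is not Spark code: the function name, the (key, value) row layout, and the sample views are all assumptions made for illustration, and it only shows the shape of the result.

```python
from collections import defaultdict

def join_with_nulls_appended(left, right):
    """Join (key, value) rows on non-null keys, then append null-key rows unmatched."""
    right_by_key = defaultdict(list)
    for key, value in right:
        if key is not None:
            right_by_key[key].append(value)

    # Step 1: join only the rows whose key is not null
    joined = [
        (key, lval, rval)
        for key, lval in left
        if key is not None
        for rval in right_by_key.get(key, [])
    ]
    # Step 2: append the null-key rows from each side, padded with None
    joined += [(None, lval, None) for key, lval in left if key is None]
    joined += [(None, None, rval) for key, rval in right if key is None]
    return joined

# Hypothetical stand-ins for views A and B
view_a = [(1, "a1"), (None, "a2"), (None, "a3")]
view_b = [(1, "b1"), (None, "b2")]
result = join_with_nulls_appended(view_a, view_b)
print(result)
```

The null-key rows never multiply against each other: two nulls on one side and one on the other yield three appended rows, not two paired ones.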

Should I apply string manipulation after or before joining tables in Oracle

I have two tables that I need to inner join; one has a relatively small number of records compared to the other. I need to apply some string manipulation to the smaller table. My question is: can I apply the string function after the join, or should I apply it in a subquery and then join that subquery to the bigger table?
An example would be something like this:
Option 1:
SELECT SUBSTR("SMALL_TABLE"."COL_NAME",x,y) "NEW_COL" FROM "BIG_TABLE"
JOIN "SMALL_TABLE" ON ...
Option 2:
SELECT "NEW_COL"
FROM "BIG_TABLE"
JOIN
(
SELECT SUBSTR("SMALL_TABLE"."COL_NAME",x,y) "NEW_COL" FROM "SMALL_TABLE"
) "T"
ON ...
Which is better for performance, option 1 or option 2?
I am using Oracle 11g.
Regardless of how you structure the query, Oracle's optimizer is free to evaluate the function before or after the join. Assuming that the string manipulation is only done as part of the projection step (i.e. it is done only in the SELECT clause and is not used as a predicate in the WHERE clause), I would expect that Oracle would apply the SUBSTR before joining the tables if you used either formulation because it would then have to apply the function to fewer rows (though it can probably treat the SUBSTR as a deterministic call and cache the results if it applies the function after the join).
As with any query optimization question, the first step is always to generate a query plan and see if the different queries actually produce different plans. I would expect the plans to be identical and, thus, the performance to be identical. But there are any number of reasons that one of the two options might produce different plans on your system given your optimizer statistics, initialization parameters, etc.
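As a rough illustration of "compare the plans first", the sketch below runs both formulations against an in-memory SQLite database and prints each plan. SQLite is a stand-in only: its optimizer and its EXPLAIN QUERY PLAN output tell you nothing about what Oracle 11g will do, and the table contents, the id/small_id join columns, and the SUBSTR arguments are invented for the example. On Oracle you would inspect the plans with its own tooling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE small_table (id INTEGER PRIMARY KEY, col_name TEXT);
    CREATE TABLE big_table (id INTEGER PRIMARY KEY, small_id INTEGER);
    INSERT INTO small_table VALUES (1, 'alpha'), (2, 'bravo');
    INSERT INTO big_table VALUES (10, 1), (11, 2), (12, 1);
""")

# Option 1: apply the string function after the join
OPTION_1 = """
    SELECT SUBSTR(s.col_name, 1, 3) AS new_col
    FROM big_table b
    JOIN small_table s ON s.id = b.small_id
    ORDER BY b.id
"""
# Option 2: apply the string function in a subquery, then join
OPTION_2 = """
    SELECT t.new_col
    FROM big_table b
    JOIN (SELECT id, SUBSTR(col_name, 1, 3) AS new_col FROM small_table) t
      ON t.id = b.small_id
    ORDER BY b.id
"""

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is the readable step text
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

rows_1 = conn.execute(OPTION_1).fetchall()
rows_2 = conn.execute(OPTION_2).fetchall()
for name, sql in (("option 1", OPTION_1), ("option 2", OPTION_2)):
    print(name, plan(sql))
```

Both options return the same rows; whether the plans differ is exactly what the printed output is for.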
It is better to apply the operations before doing the join, and then join and query for the final result. This is called query optimization.
In your case, you will perform fewer operations when joining, because you will have eliminated the useless rows beforehand.
Lots of examples here: http://beginner-sql-tutorial.com/sql-query-tuning.htm
and this is the best one I could find: http://www.cse.iitb.ac.in/~sudarsha/db-book/slide-dir/ch14.ppt

LINQ - Using where or join - Performance difference?

Based on this question:
What is difference between Where and Join in linq?
My question is following:
Is there a performance difference in the following two statements:
from order in myDB.OrdersSet
from person in myDB.PersonSet
from product in myDB.ProductSet
where order.Persons_Id==person.Id && order.Products_Id==product.Id
select new { order.Id, person.Name, person.SurName, product.Model,UrunAdı=product.Name };
and
from order in myDB.OrdersSet
join person in myDB.PersonSet on order.Persons_Id equals person.Id
join product in myDB.ProductSet on order.Products_Id equals product.Id
select new { order.Id, person.Name, person.SurName, product.Model,UrunAdı=product.Name };
I would always use the second one just because it's clearer.
My question now is: is the first one slower than the second one?
Does it build a Cartesian product and filter it afterwards with the where clauses?
Thank you.
It entirely depends on the provider you're using.
With LINQ to Objects, it will absolutely build the Cartesian product and filter afterwards.
For out-of-process query providers such as LINQ to SQL, it depends on whether it's smart enough to realise that it can translate it into a SQL join. Even if LINQ to SQL doesn't, it's likely that the query engine actually performing the query will do so - you'd have to check with the relevant query plan tool for your database to see what's actually going to happen.
Side-note: multiple "from" clauses don't always result in a Cartesian product - the contents of one "from" can depend on the current element of earlier ones, e.g.
from file in files
from line in ReadLines(file)
...
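The same dependent-iteration pattern can be sketched in Python, where the inner sequence is computed from the current outer element. The file names, their contents, and the read_lines helper are hypothetical stand-ins for files and ReadLines.

```python
# Hypothetical in-memory "files": name -> lines
files = {"a.txt": ["a1", "a2"], "b.txt": ["b1"]}

def read_lines(name):
    # Stand-in for ReadLines(file): the result depends on the current file
    return files[name]

# The inner loop ranges over a different sequence for each outer element,
# so this is a flattening, not a Cartesian product of two fixed collections
pairs = [(name, line) for name in files for line in read_lines(name)]
print(pairs)
```

Three pairs come out, one per line of each file, rather than every file paired with every line.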
My question now is: is the first one slower than the second one? Does it build a Cartesian product and filter it afterwards with the where clauses?
If the collections are in memory, then yes. There is no query optimizer for LinqToObjects - it simply does what the programmer asks in the order that is asked.
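A rough Python analogue of the in-memory case shows where the cost goes. The order and person tuples are invented, and the dictionary lookup is only an approximation of a keyed in-memory join, not LINQ itself.

```python
from itertools import product

orders = [(1, 10), (2, 11), (3, 10)]                     # (order_id, person_id)
persons = [(10, "Ada"), (11, "Grace"), (12, "Edsger")]   # (person_id, name)

# "where" style: enumerate every (order, person) pair, then filter
all_pairs = list(product(orders, persons))
where_style = [(o[0], p[1]) for o, p in all_pairs if o[1] == p[0]]

# "join" style: index one side by key once, then look each order up
persons_by_id = {pid: name for pid, name in persons}
join_style = [
    (oid, persons_by_id[pid]) for oid, pid in orders if pid in persons_by_id
]

print(len(all_pairs), where_style, join_style)
```

Both produce the same rows, but the first examines len(orders) * len(persons) pairs, while the second does one lookup per order after building the index.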
If the collections are in a database (which is suspected due to the myDB variable), then no. The query is translated into SQL and sent off to the database, where there is a query optimizer. This optimizer will generate an execution plan. Since both queries are asking for the same logical result, it is reasonable to expect the same efficient plan will be generated for both. The only ways to be certain are to inspect the execution plans or to measure the IO (SET STATISTICS IO ON).
Is there a performance difference
If you find yourself in a scenario where you have to ask, you should cultivate tools with which to measure and discover the truth for yourself. Measure, don't ask.

Data structure to hold HQL or EJB QL

We need to produce a fairly complex dynamic query builder for retrieving reports on the fly. We're scratching our heads a little on what sort of data structure would be best.
It's really nothing more than holding a list of selectParts, a list of fromParts, a list of where criteria, order by, group by, that sort of thing, for persistence. When we start thinking about joins, especially outer joins, having clauses, and aggregate functions, things start getting a little fuzzy.
We're building it interfaces-first for now and trying to think as far ahead as we can, but we will definitely go through a series of refactorings when we discover limitations in our structures.
I'm posting this question here in the hope that someone has already come up with something that we can base it on, or knows of a library or the like. It would be nice to get some tips or a heads-up on potential issues before we dive into the implementation next week.
I've done something similar a couple of times in the past. A couple of the bigger things spring to mind.
The where clause is the hardest to get right. If you divide things up into what I would call "expressions" and "predicates" it makes it easier.
Expressions - column references, parameters, literals, functions, aggregates (count/sum)
Predicates - comparisons, like, between, in, is null (predicates have expressions as children, e.g. expr1 = expr2). Then you also have composites such as and/or/not.
The whole where clause, as you can imagine, is a tree with a predicate at the root, with maybe sub-predicates underneath eventually terminating with expressions at the leaves.
To construct the HQL you walk the model (depth first usually). I used a visitor as I need to walk my models for other reasons, but if you don't have multiple purposes you can build the rendering code right into the model.
e.g. If you had
"where upper(column1) = :param1 AND ( column2 is null OR column3 between :param2 and :param3)"
Then the tree is
Root
- AND
  - Equal
    - Function(upper)
      - ColumnReference(column1)
    - Parameter(param1)
  - OR
    - IsNull
      - ColumnReference(column2)
    - Between
      - ColumnReference(column3)
      - Parameter(param2)
      - Parameter(param3)
Then you walk the tree depth first and merge rendered bits of HQL on the way back up. The upper function for example would expect one piece of child HQL to be rendered and it would then generate
"upper( " + childHql + " )"
and pass that up to its parent. Something like Between expects three child HQL pieces.
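Here is a minimal sketch of that model in Python, with the rendering built right into the nodes rather than a visitor. The class names follow the tree above; the Composite class, the render method name, and the choice to always parenthesise composites are my own assumptions for illustration, not part of any library.

```python
from dataclasses import dataclass
from typing import List

class Node:
    def render(self) -> str:
        raise NotImplementedError

@dataclass
class ColumnReference(Node):
    name: str
    def render(self) -> str:
        return self.name

@dataclass
class Parameter(Node):
    name: str
    def render(self) -> str:
        return ":" + self.name

@dataclass
class Function(Node):
    name: str
    args: List[Node]
    def render(self) -> str:
        # Each child is rendered first, then wrapped in the function call
        return f"{self.name}({', '.join(a.render() for a in self.args)})"

@dataclass
class Equal(Node):
    left: Node
    right: Node
    def render(self) -> str:
        return f"{self.left.render()} = {self.right.render()}"

@dataclass
class IsNull(Node):
    operand: Node
    def render(self) -> str:
        return f"{self.operand.render()} is null"

@dataclass
class Between(Node):
    operand: Node
    low: Node
    high: Node
    def render(self) -> str:
        # Between merges three rendered child pieces
        return f"{self.operand.render()} between {self.low.render()} and {self.high.render()}"

@dataclass
class Composite(Node):
    operator: str  # "AND" or "OR"
    children: List[Node]
    def render(self) -> str:
        return "(" + f" {self.operator} ".join(c.render() for c in self.children) + ")"

where = Composite("AND", [
    Equal(Function("upper", [ColumnReference("column1")]), Parameter("param1")),
    Composite("OR", [
        IsNull(ColumnReference("column2")),
        Between(ColumnReference("column3"), Parameter("param2"), Parameter("param3")),
    ]),
])
hql = "where " + where.render()
print(hql)
```

Always wrapping composites in parentheses costs a few redundant brackets but avoids having to reason about AND/OR precedence when rendering.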
You can then re-use the expression model in the select/group by/order by clauses.
You can skip storing the group by if you wish: store only the select and, before constructing the query, scan it for aggregates. If there is one or more, just copy all the non-aggregate select expressions into the group by.
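The group-by shortcut is small enough to sketch directly. Here each select expression is reduced to its rendered text plus an aggregate flag; the SelectExpr type and the infer_group_by name are hypothetical, chosen only for this example.

```python
from typing import List, NamedTuple

class SelectExpr(NamedTuple):
    hql: str            # rendered expression text
    is_aggregate: bool  # True for count/sum/etc.

def infer_group_by(select: List[SelectExpr]) -> List[str]:
    # No aggregates in the select means no group by is needed
    if not any(expr.is_aggregate for expr in select):
        return []
    # Otherwise every non-aggregate select expression goes into the group by
    return [expr.hql for expr in select if not expr.is_aggregate]

select = [
    SelectExpr("c.region", False),
    SelectExpr("c.status", False),
    SelectExpr("count(o)", True),
]
group_by = infer_group_by(select)
print(group_by)
```

In a full model the flag would come from scanning the expression tree for aggregate nodes rather than being stored by hand.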
The from clause is just a table reference plus zero or more join clauses. Each join clause has a type (inner/left/right) and a table reference. A table reference is a table name plus an optional alias.
Plus, if you ever get into wanting to parse a query language (or anything really) then I can highly recommend ANTLR. Learning curve is quite steep but there are plenty of example grammars to look at.
HTH.
If you need an EJB-QL parser and data structures, EclipseLink (well, several of its internal classes) has a good one:
JPQLParseTree tree = org.eclipse.persistence.internal.jpa.parsing.jpql.JPQLParser.buildParserFor(" _the_ejb_ql_string_ ").parse();
JPQLParseTree contains all the data.
However, generating EJB-QL back from a modified JPQLParseTree is something you have to do yourself.
