Google Sheet Query SUM exclude all SUM equal zero - sorting

I've been racking my brain over this.
I have a very simple query with group and sum, and I want to exclude all sum results that are zero.
My actual query is:
=query(A:J;"SELECT D, SUM(I), SUM(H) WHERE C<>'S' GROUP BY D ORDER BY D DESC")
I know I can't do something like this:
=query(A:J;"SELECT D, SUM(I), SUM(H) WHERE C<>'S' AND SUM(I)>0 GROUP BY D ORDER BY D DESC")
I'm trying with query inside filter, query inside query, but I can't figure out how to solve it.

try:
=QUERY(QUERY(A:J;
"select D,sum(I),sum(H)
where C<>'S'
group by D
order by D desc");
"where Col2>0"; 1)

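The nested QUERY above aggregates first and then filters the aggregated rows, much like a HAVING clause in SQL. As a rough illustration only, the same two-step idea can be sketched in Python. The sample rows and the function names grouped_sums and nonzero_groups are made up for this sketch and do not come from the original sheet.

```python
from collections import defaultdict

# Hypothetical rows standing in for columns C, D, H and I of the sheet.
rows = [
    {"C": "A", "D": "x", "H": 2, "I": 5},
    {"C": "S", "D": "x", "H": 9, "I": 9},  # skipped by the inner step: C == 'S'
    {"C": "A", "D": "y", "H": 4, "I": 0},
    {"C": "B", "D": "y", "H": 1, "I": 0},
    {"C": "B", "D": "z", "H": 3, "I": 7},
]


def grouped_sums(rows):
    """Inner step: drop C == 'S', group by D, sum I and H, order by D descending."""
    totals = defaultdict(lambda: [0, 0])
    for r in rows:
        if r["C"] != "S":
            totals[r["D"]][0] += r["I"]
            totals[r["D"]][1] += r["H"]
    return [(d, s[0], s[1]) for d, s in sorted(totals.items(), reverse=True)]


def nonzero_groups(rows):
    """Outer step: keep only groups whose sum of I (Col2 in the formula) is > 0."""
    return [g for g in grouped_sums(rows) if g[1] > 0]


print(nonzero_groups(rows))
```

Note that the condition > 0 also drops groups with a negative sum; a "not equal to zero" test would keep them.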

Regroup By in PigLatin

In Pig Latin, I want to group twice, so as to select rows by two different criteria.
I'm having trouble explaining the problem, so here is an example. Let's say I want to get the details of the people whose age is nearest to mine ($my_age) and who have a lot of money.
Relation A has five columns: (name, address, zipcode, age, money)
B = GROUP A BY (address, zipcode); -- group by the address
-- generate the address, the person's age ...
C = FOREACH B GENERATE group, MIN($my_age - age) AS min_age, FLATTEN(A);
D = FILTER C BY min_age == age;
-- Then group again to select the richest; this GROUP BY fails:
E = GROUP D BY group; -- or: E = GROUP D BY (address, zipcode);
-- The rest would work:
D = FOREACH E GENERATE group, MAX(money) AS max_money, FLATTEN(A);
F = FILTER C BY max_money == money;
I've tried to filter for the nearest age and the most money at the same time, but it doesn't work, because the richest people may be much older than I am.
Another, more realistic example:
You have a demands file with the fields: iddem, idopedem, datedem
You have an operations file with the fields: idope, labelope, dateope, idoftheday, infope
I want to return the operations that match the demands, such that:
idopedem matches idope.
The dateope must be the nearest to datedem.
If datedem - dateope > 0, then I must select the operation with the max(idoftheday), else I must select the operation with the min(idoftheday).
Relation A has 5 columns (idope, labelope, dateope, idoftheday, infope)
Relation B has 3 columns (iddem, idopedem, datedem)
C = JOIN A BY idope, B BY idopedem;
D = FOREACH C GENERATE iddem, idope, datedem, dateope, ABS(datedem - dateope) AS datedelta, idoftheday, infope;
E = GROUP D BY iddem;
F = FOREACH E GENERATE group, MIN(D.datedelta) AS deltamin, FLATTEN(D);
G = FILTER F BY deltamin == datedelta;
--Then I must group by another time as to select the min or max idoftheday
H = GROUP G BY group; --Does not work when dump
H = GROUP G BY iddem; --Does not work when dump
I = FOREACH H GENERATE group, (datedem - dateope >= 0 ? max(idoftheday) as idofdaysel : min(idoftheday) as idofdaysel), FLATTEN(D);
J = FILTER F BY idofdaysel == idoftheday;
DUMP J;
Data for the second example (note that the dates are already in Unix format):
The demands file looks like:
1, 'ctr1', 1359460800000
2, 'ctr2', 1354363200000
The operations file looks like:
idope,labelope,dateope,idoftheday,infope
'ctr0','toto',1359460800000,1,'blabla0'
'ctr0','tata',1359460800000,2,'blabla1'
'ctr1','toto',1359460800000,1,'blabla2'
'ctr1','tata',1359460800000,2,'blabla3'
'ctr2','toto',1359460800000,1,'blabla4'
'ctr2','tata',1359460800000,2,'blabla5'
'ctr3','toto',1359460800000,1,'blabla6'
'ctr3','tata',1359460800000,2,'blabla7'
The result must look like:
1, 'ctr1', 'tata',1359460800000,2,'blabla3'
2, 'ctr2', 'toto',1359460800000,1,'blabla4'
Sample input and output would help greatly, but from what you have posted it appears to me that the problem is not so much in writing the Pig script but in specifying what exactly it is you hope to accomplish. It's not clear to me why you're grouping at all. What is the purpose of grouping by address, for example?
Here's how I would solve your problem:
First, design an optimization function that will induce an ordering on your dataset that reflects your own prioritization of money vs. age. For example, to severely penalize large age differences but prefer more money with small ones, you could try:
scored = FOREACH A GENERATE *, money / POW(1+ABS($my_age-age)/10, 2) AS score;
ordered = ORDER scored BY score DESC;
top10 = LIMIT ordered 10;
That gives you the 10 best people according to your optimization function.
Then the only work is to design a function that matches your own judgments. For example, in the function I chose, a person with $100,000 who is your age would be preferred to someone with $350,000 who is 10 years older (or younger). But someone with $500,000 who is 20 years older or younger is preferred to someone your age with just $50,000. If either of those doesn't fit your intuition, then modify the formula. Likely a simple quadratic factor won't be sufficient. But with a little experimentation you can hit upon something that works for you.
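The four comparisons above can be checked with a small Python sketch of the same formula. The value of MY_AGE, the candidate labels and the function name score are assumptions made for illustration. Python's / is true division; whether the Pig expression divides as integers depends on the field types, which are not stated here.

```python
def score(money, age, my_age):
    """Same shape as the Pig expression: money / (1 + |my_age - age| / 10) ** 2."""
    return money / (1 + abs(my_age - age) / 10) ** 2


MY_AGE = 30  # hypothetical value for $my_age

# (label, age, money): the four people mentioned in the text above.
candidates = [
    ("same_age_50k", 30, 50_000),
    ("plus20_500k", 50, 500_000),
    ("plus10_350k", 40, 350_000),
    ("same_age_100k", 30, 100_000),
]

# ORDER scored BY score DESC
ranked = sorted(candidates, key=lambda p: score(p[2], p[1], MY_AGE), reverse=True)
for label, age, money in ranked:
    print(label, round(score(money, age, MY_AGE), 2))
```

With these numbers, the person of the same age with $100,000 ranks first, followed by $350,000 at ten years' difference, $500,000 at twenty years' difference, and $50,000 at the same age.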

returning values with a where clause

I have two tables as follows:
ScholarSubject
ScholarSubjectID<pk>
ScholarID
SubjectID
Mark
and
AdmissionReq
SubjectID
DegreeCode
MinimumMark
I'm trying to return everything from a Degree table (with PK degreeID) where the scholar's mark meets the minimum mark for admission. My query is as follows:
public List<object> getDegreeByAPSandRequirements()
{
using (DataLayer.CareerDatabaseEntities context = new DataLayer.CareerDatabaseEntities())
{
return (from Degrees in context.Degrees
join admissions in context.AdmissionReqs on
Degrees.DegreeCode equals admissions.DegreeCode
join subject in context.Subjects on
admissions.SubjectID equals subject.SubjectID
join scholarsubject in context.ScholarSubjects on
subject.SubjectID equals scholarsubject.SubjectID
join scholar in context.Scholars on
scholarsubject.ScholarID equals scholar.ScholarID
where Degrees.APSScore <= scholar.APSScore && admissions.MinimumMark <= scholarsubject.NSC && scholarsubject.SubjectID.Equals(admissions.SubjectID)
select Degrees).Distinct().ToList<object>();
}
}
Everything works, except that if I change one of the marks (in ScholarSubject) to a value lower than the minimum mark (in AdmissionReq), it still returns a degree. I want to return a degree only if both marks are greater than the minimum requirements, not just one of them.
What am I doing wrong? Can someone please help me?
I'm still not sure I understand what you're trying to do - unless you have just one scholar in your database it seems to me you will return a list of all degrees achieved by all scholars. If you don't want this, you need to filter on scholarID in the where clause.
But anyway - in order to get more information I would try to do 2 things.
I would change your query and start it with the scholar not with the degree as this might avoid some duplicates:
(from scholar in context.Scholars
join scholarsubject in context.ScholarSubjects on scholar.ScholarID equals scholarsubject.ScholarID
join subject in context.Subjects on scholarsubject.SubjectID equals subject.SubjectID
join admission in context.AdmissionReqs on subject.SubjectID equals admission.SubjectID
join degree in context.Degrees on admission.DegreeCode equals degree.DegreeCode
where degree.APSScore <= scholar.APSScore
&& admission.MinimumMark <= scholarsubject.NSC
//&& scholarsubject.SubjectID.Equals(admission.SubjectID) // you should not need this line as you have the joins in place to assert this
select degree)
.Distinct()
.ToList<object>();
If this produces the same result as your previous query, then I'd change the return type to see what exactly you are getting - replace the last line with this and inspect the collection:
select new {ScholarID = scholar.ScholarID, Degree = degree})
.Distinct()
.ToList();
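The behaviour described in the question (a degree should come back only when every required subject meets its minimum) is an "all" condition, whereas a join followed by a per-row where keeps a degree as soon as any one requirement row passes. This is not part of the answer above; it is a minimal Python sketch of the difference, using hypothetical in-memory data and the made-up function names degrees_any and degrees_all.

```python
from collections import defaultdict

# Hypothetical data: (degree_code, subject_id, minimum_mark)
admission_reqs = [("BSC", "MATH", 60), ("BSC", "PHYS", 50), ("BA", "ENG", 55)]
# Marks of a single hypothetical scholar: subject_id -> mark
scholar_marks = {"MATH": 70, "PHYS": 40, "ENG": 65}


def degrees_any(reqs, marks):
    """Join + per-row filter: a degree appears if at least one requirement passes."""
    return sorted({d for d, s, m in reqs if s in marks and marks[s] >= m})


def degrees_all(reqs, marks):
    """Group requirements per degree and keep a degree only if all of them pass."""
    by_degree = defaultdict(list)
    for d, s, m in reqs:
        by_degree[d].append((s, m))
    return sorted(
        d
        for d, rs in by_degree.items()
        if all(s in marks and marks[s] >= m for s, m in rs)
    )


print(degrees_any(admission_reqs, scholar_marks))  # BSC included: MATH passes
print(degrees_all(admission_reqs, scholar_marks))  # BSC excluded: PHYS fails
```

In LINQ the same idea would usually be expressed by grouping the requirements per degree and testing the group as a whole, rather than filtering individual joined rows.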

Having trouble getting an SQL query converted to Linq to Entities

I have the following SQL query:
SELECT Comment, JD, Jurisdiction, RegStatus, Region, SPDR_NAME
FROM dbo.Registered_status_by_year
GROUP BY Comment, JD, SPDR_NAME, Region, Jurisdiction, RegStatus
HAVING (Region = #Region) ORDER BY JD
I am trying to convert it to LINQ to Entities. I have the following so far:
var result = (from x in myEntities.IRTStatusByYearSet
group x by new { x.Comment, x.JD, x.SPDR_NAME, x.Region, x.Jurisdiction, x.RegStatus } into g
select g);
The problem is that I can't seem to use the "orderby" because "g" doesn't have any "properties" that are the column names.
Does anyone know how I might do this? I have looked for examples of doing the order by with grouping, but all of them show only grouping by one thing, or ordering by the "count" of items in grouping, instead of by some other value.
Have you tried orderby g.Key.JD?
var result = (from x in myEntities.IRTStatusByYearSet
group x by new { x.Comment, x.JD, x.SPDR_NAME, x.Region, x.Jurisdiction, x.RegStatus } into g
orderby g.Key.JD
select g);
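The point of g.Key.JD is that, after grouping by a composite key, the grouped columns live on the key and can be used for ordering. A rough Python analogue, with made-up rows and only three of the six key columns for brevity:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows; only three of the grouping columns are shown.
rows = [
    {"Comment": "c1", "JD": 3, "Region": "N"},
    {"Comment": "c2", "JD": 1, "Region": "N"},
    {"Comment": "c1", "JD": 3, "Region": "N"},
    {"Comment": "c3", "JD": 2, "Region": "S"},
]

key = itemgetter("Comment", "JD", "Region")  # composite grouping key

# itertools.groupby only merges adjacent items, so sort by the key first.
groups = [(k, list(g)) for k, g in groupby(sorted(rows, key=key), key=key)]

# Equivalent of "orderby g.Key.JD": sort the groups by the JD part of the key.
groups.sort(key=lambda kg: kg[0][1])

for k, members in groups:
    print(k, len(members))
```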

Greater Than Condition in Linq Join

I tried to join two tables on a condition, but it gives me a syntax error. I searched online but could not find how to do a join on a condition other than equality. The only alternative I can see is to get the value from one table first and then run a second query.
I just want to confirm whether there is any other way to do a conditional join with LINQ.
Here is my code. I am trying to find all positions that are equal to or lower than mine; basically, I want to get my peers and subordinates.
from e in entity.M_Employee
join p in entity.M_Position on e.PostionId >= p.PositionId
select p;
You can't do that with a LINQ join - the join clause only supports equijoins. However, you can do this:
var query = from e in entity.M_Employee
from p in entity.M_Position
where e.PostionId >= p.PositionId
select p;
Or a slightly alternative but equivalent approach:
var query = entity.M_Employee
.SelectMany(e => entity.M_Position
.Where(p => e.PostionId >= p.PositionId));
The following:
from e in entity.M_Employee
from p in entity.M_Position.Where(p => e.PostionId >= p.PositionId)
select p;
will produce exactly the SQL you are after (INNER JOIN Position P ON E.PostionId >= P.PositionId).
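The two "from" clauses plus a "where" amount to a cross product filtered by an arbitrary condition. A minimal Python sketch of that shape, with invented sample data (the PostionId spelling follows the code above):

```python
# Hypothetical in-memory stand-ins for M_Employee and M_Position.
employees = [
    {"Name": "Ann", "PostionId": 2},
    {"Name": "Bob", "PostionId": 1},
]
positions = [
    {"PositionId": 1, "Title": "Junior"},
    {"PositionId": 2, "Title": "Senior"},
    {"PositionId": 3, "Title": "Lead"},
]


def positions_at_or_below(employees, positions):
    """Pair every employee with every position, keeping pairs that satisfy >=."""
    return [
        (e["Name"], p["Title"])
        for e in employees          # from e in ...
        for p in positions          # from p in ...
        if e["PostionId"] >= p["PositionId"]  # where ...
    ]


print(positions_at_or_below(employees, positions))
```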
var currentDetails = from c in customers
group c by new { c.Name, c.Authed } into g
where g.Key.Authed == "True"
select g.OrderByDescending(t => t.EffectiveDate).First();
var currentAndUnauthorised = (from c in customers
join cd in currentDetails
on c.Name equals cd.Name
where c.EffectiveDate >= cd.EffectiveDate
select c).OrderBy(o => o.CoverId).ThenBy(o => o.EffectiveDate);
This is for a table of historic detail changes that includes authorisation status and effective date. The first query finds each customer's current details, and the second query adds all subsequent unauthorised detail changes in the table.
I hope this is helpful, as it took me some time and some help to get there too.

multiple grouping, inner joining in Linq

I am trying to translate this into Linq and cannot figure it out:
SELECT
CustomerOrder.ShipState, MONTH(OrderFulfillment.OrderDate) AS Mnth,
YEAR(OrderFulfillment.OrderDate) AS Yer,
SUM(OrderFulfillment.Tax) AS TotalTax
FROM
OrderFulfillment INNER JOIN
CustomerOrder ON OrderFulfillment.OrderID = CustomerOrder.OrderID
WHERE
(OrderFulfillment.Tax > 0)
GROUP BY
CustomerOrder.ShipState, MONTH(OrderFulfillment.OrderDate),
YEAR(OrderFulfillment.OrderDate)
ORDER BY
YEAR(OrderFulfillment.OrderDate) DESC, CustomerOrder.ShipState,
MONTH(OrderFulfillment.OrderDate) DESC
I have Linqpad and have gone through a bunch of the examples but cannot figure this out.
I think you want to do something like this:
from c in CustomerOrder
join o in OrderFulfillment on c.OrderId equals o.OrderId
where
o.Tax > 0
group o by
new { c.ShipState, Mnth = o.OrderDate.Month, Yer = o.OrderDate.Year }
into g
orderby
g.Key.Yer descending, g.Key.ShipState, g.Key.Mnth descending
select
new { g.Key.ShipState, g.Key.Mnth, g.Key.Yer,
TotalTax = g.Sum(i => i.Tax) };
I haven't tried to compile it, but I think this is something along the lines of what you want.
The idea is that first you perform your join to link the customers and orders. Then apply your filter condition.
At that point, you want to get all the orders that have a particular group, so the group operator is applied.
Finally, order the results, then select out all info from the keys for each group, and sum up the tax in each group.
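The same four steps (join, filter, group by a composite key, then order and sum) can be sketched in Python as follows. The sample data and the function name tax_by_state_month are assumptions for illustration, and the lookup dictionary assumes OrderID is unique in CustomerOrder.

```python
from collections import defaultdict
from datetime import date

# Hypothetical stand-ins for the CustomerOrder and OrderFulfillment tables.
customer_orders = [
    {"OrderID": 1, "ShipState": "CA"},
    {"OrderID": 2, "ShipState": "CA"},
    {"OrderID": 3, "ShipState": "NY"},
    {"OrderID": 4, "ShipState": "NY"},
]
fulfillments = [
    {"OrderID": 1, "OrderDate": date(2023, 5, 1), "Tax": 2.0},
    {"OrderID": 2, "OrderDate": date(2023, 5, 20), "Tax": 3.0},
    {"OrderID": 3, "OrderDate": date(2022, 1, 3), "Tax": 1.5},
    {"OrderID": 4, "OrderDate": date(2023, 6, 9), "Tax": 0.0},  # dropped: Tax > 0 fails
]


def tax_by_state_month(customer_orders, fulfillments):
    # Join: look up the ship state by OrderID (assumes OrderID is unique).
    state_of = {c["OrderID"]: c["ShipState"] for c in customer_orders}
    totals = defaultdict(float)
    for f in fulfillments:
        # Filter, then group by (ShipState, month, year) and sum the tax.
        if f["Tax"] > 0 and f["OrderID"] in state_of:
            key = (state_of[f["OrderID"]], f["OrderDate"].month, f["OrderDate"].year)
            totals[key] += f["Tax"]
    # ORDER BY year DESC, ShipState ASC, month DESC
    ordered = sorted(totals.items(), key=lambda kv: (-kv[0][2], kv[0][0], -kv[0][1]))
    return [(state, month, year, total) for (state, month, year), total in ordered]


print(tax_by_state_month(customer_orders, fulfillments))
```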
First of all, it would be nice to know what exactly you can't figure out. If you're completely lost and don't know where to begin then you need to google around for linq joining and grouping.
Here's something I did recently that may (possibly) point you in the right direction:
// This groups items by category and lists all category ids and names
from ct in Categories
join cst in Category_Subtypes
on ct.Category_Id equals cst.Category_Id
join st in Subtypes
on cst.Subtype_Id equals st.Subtype_Id
where
st.Type_Id == new Guid(id)
group ct by new { ct.Category_Id, ct.Display_Text } into ctg
select new
{
Id = ctg.Key.Category_Id,
Name = ctg.Key.Display_Text
}
