How can I group results into multiple condition subgroups with Eloquent in Laravel? - laravel

I have a product table and I am trying to write an Eloquent query that groups the results into five subgroups depending on conditions.
Each subgroup should show the total number of rows that meet its condition.
DB::table(Product::TABLE_NAME)->select(
    DB::raw("prize"),
    DB::raw("count(*) as total")
)
->groupBy('prize')->get();
I don't know where to put the conditions.
I expect this output:
id_group | total |
------------------
1 | 34 |
2 | 46 |
3 | 126 |
id_group represents each condition:
1 - prize = null
2 - prize = 0
3 - prize > 0
Solution:
I've got this, but I don't know if it is the most efficient way, or if there is another way to do it:
DB::table(Product::TABLE_NAME)->select(
    DB::raw("SUM(CASE WHEN prize IS NULL THEN 1 ELSE 0 END) AS `1`"),
    DB::raw("SUM(CASE WHEN prize = 0 THEN 1 ELSE 0 END) AS `2`"),
    DB::raw("SUM(CASE WHEN prize > 0 THEN 1 ELSE 0 END) AS `3`")
)->get();
Note that purely numeric aliases must be quoted (backticks in MySQL), or the query is a syntax error.
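The SUM(CASE ...) pattern itself is plain SQL and can be checked outside Laravel; a minimal sketch using Python's sqlite3 (the table name and sample rows are invented for illustration):

```python
import sqlite3

# In-memory table mirroring the question's product/prize setup, with invented rows
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, prize REAL)")
conn.executemany("INSERT INTO products (prize) VALUES (?)",
                 [(None,), (None,), (0,), (5.0,), (12.5,), (0,)])

# One pass over the table, one SUM(CASE ...) per subgroup
row = conn.execute("""
    SELECT
        SUM(CASE WHEN prize IS NULL THEN 1 ELSE 0 END) AS null_count,
        SUM(CASE WHEN prize = 0    THEN 1 ELSE 0 END) AS zero_count,
        SUM(CASE WHEN prize > 0    THEN 1 ELSE 0 END) AS positive_count
    FROM products
""").fetchone()

print(row)  # (2, 2, 2)
```

Because every subgroup is computed in the same scan, this is generally cheaper than running one filtered COUNT query per condition.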

Related

Cannot validate data using complex check constraint case in mysql 7

ALTER TABLE
src_branches ADD CONSTRAINT bmod_to_id_check(
CASE WHEN bmod 7 THEN CASE WHEN id BETWEEN 0 AND 2 THEN 1 ELSE 0
CASE WHEN bmod 6 THEN CASE WHEN id BETWEEN 600 AND 699 THEN 1 ELSE 0
CASE WHEN bmod 5 THEN CASE WHEN id BETWEEN 500 AND 599 THEN 1 ELSE 0
CASE WHEN bmod 4 THEN CASE WHEN id BETWEEN 400 AND 499 THEN 1 ELSE 0
CASE WHEN bmod 3 THEN CASE WHEN id BETWEEN 300 AND 499 THEN 1 ELSE 0
CASE WHEN bmod 2 THEN CASE WHEN id BETWEEN 2 AND 300 THEN 1 ELSE 0
CASE WHEN bmod 1 THEN CASE WHEN id BETWEEN 2 AND 300 THEN 1 ELSE 0
END ELSE 1
END = 1
);
// my goal is to keep my id in a set range when bmod = some integer
// I'm open to alternatives
Pay attention to the syntax of the CASE operator:
ALTER TABLE
src_branches ADD CONSTRAINT bmod_to_id check(
case bmod when 7 then id BETWEEN 0 AND 2
when 6 then id BETWEEN 600 AND 699
when 5 then id BETWEEN 500 AND 599
when 4 then id BETWEEN 400 AND 499
when 3 then id BETWEEN 300 AND 499
when 2 then id BETWEEN 2 AND 300
when 1 then id BETWEEN 2 AND 300
else 0
end)
So id BETWEEN x AND y is an expression that is true (1) or false (0).
Note the constraint syntax too.
ref: fiddle
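SQLite also accepts the simple-CASE form inside a CHECK constraint, so the accepted rewrite can be exercised without a MySQL instance; a minimal sketch (only three of the bmod branches, for brevity):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# CHECK uses the simple CASE form from the answer: one id range per bmod value
conn.execute("""
    CREATE TABLE src_branches (
        id INTEGER,
        bmod INTEGER,
        CHECK (CASE bmod
                   WHEN 7 THEN id BETWEEN 0 AND 2
                   WHEN 6 THEN id BETWEEN 600 AND 699
                   WHEN 5 THEN id BETWEEN 500 AND 599
                   ELSE 0
               END)
    )
""")

conn.execute("INSERT INTO src_branches VALUES (1, 7)")        # in range: accepted
try:
    conn.execute("INSERT INTO src_branches VALUES (650, 7)")  # out of range
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

The BETWEEN expression evaluates to 1 or 0, and the constraint rejects any row where the CASE result is 0, which is exactly the behavior the question was after.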

How to increase the speed of pandas search process?

I would like to search for keywords in a dataframe column called 'string'.
The keywords are contained in a dictionary; for each key, the value is an array of several keywords.
My concern is that this is very slow and takes a lot of time.
Maybe there are too many loops involved and df.str.contains cannot be used directly.
How can I speed up the process?
import re

import numpy as np
import pandas as pd


def match(string, keyword):
    m = len(string)
    n = len(keyword)
    idx = string.find(keyword)
    if idx == -1:
        return 0
    # reject a hit whose start sits inside a longer word
    if len(re.findall('[a-zA-Z]', string[idx])) > 0:
        if idx > 0:
            if len(re.findall('[a-zA-Z]', string[idx - 1])) > 0:
                return 0
    # reject a hit whose end sits inside a longer word
    if len(re.findall('[a-zA-Z]', string[idx + n - 1])) > 0:
        if idx + n < m:
            if len(re.findall('[a-zA-Z]', string[idx + n])) > 0:
                return 0
    return 1


def match_keyword(df, keyword_dict, name):
    df_new = pd.DataFrame()
    for owner_id, keyword in keyword_dict.items():
        try:
            for index, data in df.iterrows():
                a = [match(data['string'], word) for word in keyword]
                t = int(np.sum(a))
                if t > 0:
                    df_new.loc[index, name + '_' + str(owner_id)] = 1
                else:
                    df_new.loc[index, name + '_' + str(owner_id)] = 0
        except:
            df_new[name + '_' + str(owner_id)] = 0
    return df_new.astype(int)
Input:
String
0 New Beauty Company is now offering 超級discounts
1 Swimming is good for children and adults
2 Children love food though it may not be good
keywords:{'a':['New', 'is', '超級'], 'b':['Swim', 'discounts', 'good']}
Results:
'New' 'is' '超級' result(or relation)
0 1 1 1 1
1 0 1 0 1
2 0 0 0 0
'Swim' 'discounts' 'good' result(or relation)
0 0 1 0 1
1 0 0 1 1
2 0 0 1 1
Final results:
'a' 'b'
0 1 1
1 1 1
2 0 1
I believe you need str.contains in a loop over the dict, with word boundaries (\b) and the words joined by | for regex OR:
for k, v in keywords.items():
    pat = '|'.join(r"\b{}\b".format(x) for x in v)
    #print (pat)
    df[k] = df['String'].str.contains(pat).astype(int)

print (df)
String a b
0 New Beauty Company is now offering discounts 1 1
1 Swimming is good for children and adults 1 1
2 Children love food though it may not be good 0 1
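One caveat not covered in the answer: if a keyword contains regex metacharacters, str.contains will interpret them as regex syntax. Escaping each word with re.escape keeps it literal; a hedged variant of the same loop with invented sample data:

```python
import re

import pandas as pd

df = pd.DataFrame({'String': ['costs 5.00 now', 'code 5x00 here']})
keywords = {'a': ['5.00']}  # '.' is a regex metacharacter

for k, v in keywords.items():
    # re.escape keeps each keyword literal inside the alternation;
    # without it, \b5.00\b would also match '5x00' ('.' = any character)
    pat = '|'.join(r"\b{}\b".format(re.escape(x)) for x in v)
    df[k] = df['String'].str.contains(pat).astype(int)

print(df)
```

With escaping, only the first row matches; without it, both rows would.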
If you also need a column per value, create a MultiIndex in the columns:
df = df.set_index('String')
for k, v in keywords.items():
    for x in v:
        df[(k, x)] = df.index.str.contains(x).astype(int)

df.columns = pd.MultiIndex.from_tuples(df.columns)
print (df)
a b
New is Swim discounts good
String
New Beauty Company is now offering discounts 1 1 0 1 0
Swimming is good for children and adults 0 1 1 0 1
Children love food though it may not be good 0 0 0 0 1
And then it is possible to get the max by the first MultiIndex level:
df = df.max(axis=1, level=0)
print (df)
a b
String
New Beauty Company is now offering discounts 1 1
Swimming is good for children and adults 1 1
Children love food though it may not be good 0 1

Laravel query builder expression for a one-to-many table, from the SQL expression

a16s
id pic
1 1.jpg
2 2.jpg
3 3.jpg
4 4.jpg
a16s_like
id p_id u_id approve
1 1 2 0
2 1 1 1
3 1 5 1
4 1 6 1
5 1 7 0
6 2 2 0
7 2 3 0
8 2 1 1
9 4 4 0
10 4 3 1
11 4 2 1
SELECT
A.id,
A.PIC,
SUM(CASE WHEN B.approve IS NULL THEN 0 ELSE 1 END) AS Ashowcount,
SUM(CASE WHEN B.approve=0 THEN 1 ELSE 0 END) AS Nshow,
SUM(CASE WHEN B.approve=1 THEN 1 ELSE 0 END) AS Yshow,
B.approve,
SUM(CASE WHEN B.approve=1 AND B.u_id=3 THEN 1 when B.approve=0 AND B.u_id=3 then 0 ELSE null END) AS U_id3show
FROM a16s AS A
LEFT JOIN a16s_like AS B ON A.ID = B.p_id
GROUP BY A.id,A.pic
This gets the list and works well on MySQL 5.7.
When I execute the select with u_id = 2, I get:
id pic Ashowcount approve_0_count approve_1_count u_id2_approve
1 1.jpg 5 2 3 0
2 2.jpg 3 2 1 0
3 3.jpg 0 0 0 null
4 4.jpg 3 0 3 0
u_id=3
id pic Ashowcount approve_0_count approve_1_count u_id3_approve
1 1.jpg 0 0 0 null
2 2.jpg 0 1 0 0
3 3.jpg 0 0 0 null
4 4.jpg 1 0 1 1
When I translate the SQL to Laravel:
$search_alls=
DB::select('A.id','A.route','B.approve')
->addSelect(DB::raw('SUM(CASE WHEN B.approve IS NULL THEN 0 ELSE 1 END) as Ashowcount'))
->addSelect(DB::raw('SUM(CASE WHEN B.approve = 0 THEN 1 ELSE 0 END) as Nshow'))
->addSelect(DB::raw('SUM(CASE WHEN B.approve = 1 THEN 1 ELSE 0 END) as Yshow'))
->addSelect(DB::raw('SUM(CASE WHEN B.approve = 1 AND b.u_id = 2 then 1
when B.approve = 0 AND b.u_id = 2 then 0 ELSE null END) as U_idshow'))
->from('a16s as A')
->join('a16s_like as B', function($join) {
$join->on('A.ID', '=', 'B.p_id');
})
->groupBy('A.id')
->orderby('A.id', 'DESC')
->paginate(12);
return View('comefo.results')
->with('search_alls', $search_alls)
->with('table',$table);
I get this error:
Symfony \ Component \ Debug \ Exception \ FatalThrowableError (E_RECOVERABLE_ERROR)
Type error: Argument 1 passed to Illuminate\Database\Connection::prepareBindings() must be of the type array, string given, called in D:\AppServ\www\product\vendor\laravel\framework\src\Illuminate\Database\Connection.php on line 665
You have to explicitly create a query:
DB::query()->select(...
You placed the table name in the wrong place. Use the table method:
echo DB::table('a16s as A')
->select('A.id','A.route','B.approve')
...
->orderby('A.id', 'DESC')->toSql();
reveals normal SQL like:
select A.id, A.route, B.approve, SUM(CASE WHEN B.approve
IS NULL THEN 0 ELSE 1 END) as Ashowcount, SUM(CASE WHEN B.approve = 0
THEN 1 ELSE 0 END) as Nshow, SUM(CASE WHEN B.approve = 1 THEN 1 ELSE 0
END) as Yshow, SUM(CASE WHEN B.approve = 1 AND b.u_id = 2 then 1 when
B.approve = 0 AND b.u_id = 2 then 0 ELSE null END) as U_idshow from
a16s as A inner join a16s_like as B on A.ID = B.p_id
group by A.id order by A.id desc
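Note that the generated SQL uses an inner join while the original query used LEFT JOIN (use leftJoin in the builder to keep pictures that have no likes, such as pic 3). The aggregation itself can be sanity-checked with Python's sqlite3 on the question's sample data; a sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Tables and rows copied from the question's a16s / a16s_like sample data
conn.executescript("""
    CREATE TABLE a16s (id INTEGER, pic TEXT);
    INSERT INTO a16s VALUES (1,'1.jpg'),(2,'2.jpg'),(3,'3.jpg'),(4,'4.jpg');
    CREATE TABLE a16s_like (id INTEGER, p_id INTEGER, u_id INTEGER, approve INTEGER);
    INSERT INTO a16s_like VALUES
        (1,1,2,0),(2,1,1,1),(3,1,5,1),(4,1,6,1),(5,1,7,0),
        (6,2,2,0),(7,2,3,0),(8,2,1,1),
        (9,4,4,0),(10,4,3,1),(11,4,2,1);
""")

rows = conn.execute("""
    SELECT A.id, A.pic,
           SUM(CASE WHEN B.approve IS NULL THEN 0 ELSE 1 END) AS Ashowcount,
           SUM(CASE WHEN B.approve = 0 THEN 1 ELSE 0 END) AS Nshow,
           SUM(CASE WHEN B.approve = 1 THEN 1 ELSE 0 END) AS Yshow
    FROM a16s AS A
    LEFT JOIN a16s_like AS B ON A.id = B.p_id
    GROUP BY A.id, A.pic
    ORDER BY A.id
""").fetchall()

for r in rows:
    print(r)
```

With the LEFT JOIN, pic 3 appears with all-zero counts instead of being dropped.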

Is there an efficient method similar to SQL window functions in PySpark?

I am dealing with a huge dataframe containing 3 columns and 5 billion rows; the data is 360 GB in size. For the analysis I am using the following set-up:
- Jupyter notebooks running on an AWS r4.16xlarge
- PySpark kernel
The table called customer_sales looks similar to the following example:
+------------------+----------+-------+
| business_unit_id | customer | sales |
+------------------+----------+-------+
| 1                | a        | 5000  |
| 1                | b        | 2000  |
| 1                | c        | 3000  |
| 1                | d        | 5000  |
| 2                | f        | 600   |
| 2                | c        | 7000  |
| 3                | j        | 200   |
| 3                | k        | 800   |
| 3                | c        | 4500  |
+------------------+----------+-------+
Now I want to get, for each business_unit_id, the customer with the highest sales. If there is a tie in sales between several customers, I want to get them all. The information should be stored in a table called best_customers_for_each_unit. For the example above, best_customers_for_each_unit looks as follows:
+------------------+----------+-------+
| business_unit_id | customer | sales |
+------------------+----------+-------+
| 1                | a        | 5000  |
| 1                | d        | 5000  |
| 2                | c        | 7000  |
| 3                | c        | 4500  |
+------------------+----------+-------+
In a second step I want to count how often each customer is the one with the highest sales in a specific business_unit_id. The output of this query will be:
+----------+-------+
| customer | count |
+----------+-------+
| a        | 1     |
| d        | 1     |
| c        | 2     |
+----------+-------+
For the first query I used spark.sql with window functions. The used query looks as follows:
best_customers_for_each_unit = spark.sql("""
    SELECT
        business_unit_id,
        customer,
        sales
    FROM (
        SELECT
            business_unit_id,
            customer,
            sales,
            dense_rank() OVER (PARTITION BY business_unit_id ORDER BY sales DESC) AS rank
        FROM customer_sales) tmp
    WHERE rank = 1
""")
For the second query I used the following PySpark snippet:
best_customers_for_each_unit.groupBy("customer").count()
My queries do actually work, but it takes ages to process even a minor part of the data. Do you know a more efficient way to run such queries with PySpark?
Regards
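One standard alternative to the window function is a two-step aggregate-then-join: compute max(sales) per business_unit_id, then join back to keep every row (ties included) that hits the max. The logic can be illustrated with plain Python on the question's sample data:

```python
from collections import defaultdict

# Rows copied from the question's customer_sales example
customer_sales = [
    (1, 'a', 5000), (1, 'b', 2000), (1, 'c', 3000), (1, 'd', 5000),
    (2, 'f', 600),  (2, 'c', 7000),
    (3, 'j', 200),  (3, 'k', 800),  (3, 'c', 4500),
]

# Step 1: max sales per business_unit_id (what groupBy().agg(max) would compute)
max_sales = {}
for unit, _, sales in customer_sales:
    max_sales[unit] = max(max_sales.get(unit, 0), sales)

# Step 2: "join back" to keep every customer that reaches the max, ties included
best = [(u, c, s) for u, c, s in customer_sales if s == max_sales[u]]

# Step 3: count how often each customer is a unit's best
counts = defaultdict(int)
for _, c, _ in best:
    counts[c] += 1

print(sorted(best))   # [(1, 'a', 5000), (1, 'd', 5000), (2, 'c', 7000), (3, 'c', 4500)]
print(dict(counts))   # {'a': 1, 'd': 1, 'c': 2}
```

In PySpark this would be roughly `sales_max = df.groupBy('business_unit_id').agg(F.max('sales').alias('sales'))` followed by `df.join(sales_max, ['business_unit_id', 'sales'])`. Whether it beats the window function depends on your data skew and partitioning, so treat it as a sketch to benchmark, not a guaranteed speedup.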

Oracle 11g - Adding a Total Column to a Pivot Table

I've created a pivot table with data from multiple tables (using JOINs). How can I add another column that sums the values of each row?
Example:
Category | A | B | C
ABC      | 1 | 1 | 1
A        | 1 | 0 | 0
B        | 0 | 1 | 0
C        | 0 | 0 | 1

Category | A | B | C | TOTAL
ABC      | 1 | 1 | 1 | 3
A        | 1 | 0 | 0 | 1
B        | 0 | 1 | 0 | 1
C        | 0 | 0 | 1 | 1
SCOTT#research 15-APR-15> select * from testing ;
CATEG A B C
----- ---------- ---------- ----------
ABC 1 1 1
A 1 0 0
B 0 1 0
C 0 0 1
SCOTT#research 15-APR-15> select category,a,b,c, sum(a+b+c) as "total" from testing group by category,a,b,c order by category;
CATEG A B C total
----- ---------- ---------- ---------- ----------
A 1 0 0 1
ABC 1 1 1 3
B 0 1 0 1
C 0 0 1 1
If you want to store the total in an actual column, add one:
alter table testing add total int;
then use this procedure to populate the values:
create or replace procedure add_test
is
    total1 int;
begin
    for i in (select * from testing) loop
        select sum(a+b+c) into total1 from testing where category = i.category;
        update testing set total = total1 where category = i.category;
    end loop;
    commit;
end;
/
exec add_test;
SCOTT#research 15-APR-15> select * from testing;
CATEG A B C TOTAL
----- ---------- ---------- ---------- ----------
ABC 1 1 1 3
A 1 0 0 1
B 0 1 0 1
C 0 0 1 1
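Note that the per-row total usually does not need a stored column at all; a plain a + b + c at query time does the job without the GROUP BY or the update procedure. A quick sketch with Python's sqlite3 on the same sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testing (category TEXT, a INT, b INT, c INT)")
conn.executemany("INSERT INTO testing VALUES (?, ?, ?, ?)", [
    ('ABC', 1, 1, 1), ('A', 1, 0, 0), ('B', 0, 1, 0), ('C', 0, 0, 1),
])

# Computing the total per row at query time: no extra column, no procedure
rows = conn.execute("""
    SELECT category, a, b, c, a + b + c AS total
    FROM testing
    ORDER BY category
""").fetchall()

for r in rows:
    print(r)
```

In Oracle 11g the same expression could also back a virtual column if the total must appear in the table definition itself.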
