Convert DAX if statement to Power Query (or M)? - dax

I need to create a calculated column in order to filter a Tabular model table with the following structure:
Table1
| ID | Attr A | Attr B | Value |
|-----|-----------|--------|-------|
| 123 | text here | blah | 130 |
| 123 | blah | blah | 70 |
| 456 | blah | blah | 90 |
| 456 | blah | blah | 110 |
And I want the following new column to be created:
| ID | Attr A | Attr B | Value | MaxValue |
|-----|-----------|--------|-------|----------|
| 123 | text here | blah | 130 | TRUE |
| 123 | blah | blah | 70 | FALSE |
| 456 | blah | blah | 90 | FALSE |
| 456 | blah | blah | 110 | TRUE |
I would like to create a calculated column using Power Query equivalent to the following DAX statement which returns TRUE if the Values column is the largest for a given ID, FALSE otherwise.
= IF(CALCULATE(MAX('Table1'[Value]),ALLEXCEPT('Table1','Table1'[ID])) = 'Table1'[Value], TRUE(), FALSE())
P.S. I used the default M language editor to generate an if shell statement so this is similar to what I'm looking for:
= Table.AddColumn(#"Changed Type", "MaxValue", each if [#"[Value]"] = 'some logic here' then true else false)

If your source table is set up as in the question and is called Table1, then this M code should do what you're asking:
let
    Source = Table1,
    // Group by ID, keeping the per-ID maximum and the original rows of each group
    #"Grouped Rows" = Table.Group(Source, {"ID"}, {{"ValueMax", each List.Max([Value]), type number}, {"AllData", each _, type table [ID=text, Attr A=text, Attr B=text, Value=number]}}),
    // Expand the nested rows again so every row carries its group's ValueMax
    #"Expanded AllData" = Table.ExpandTableColumn(#"Grouped Rows", "AllData", {"Attr A", "Attr B", "Value"}, {"Attr A", "Attr B", "Value"}),
    // true when the row's Value equals the maximum for its ID
    #"Added Custom" = Table.AddColumn(#"Expanded AllData", "MaxValue", each [ValueMax]=[Value]),
    // Drop the helper ValueMax column
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"ID", "Attr A", "Attr B", "Value", "MaxValue"})
in
    #"Removed Other Columns"
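It should give you the result table shown in the question, with MaxValue set to TRUE on the row holding the largest Value for each ID.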
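To sanity-check the logic outside Power Query, the same group-then-compare steps can be sketched in plain Python. This is only an illustration of the approach, not M code; the function name add_max_flag and the list-of-dicts representation are assumptions made for the example, and the rows are the ones from the question.

```python
# Rows from the question's Table1, represented as a list of dicts (an assumption
# made for this sketch; Power Query works on tables, not dicts).
rows = [
    {"ID": "123", "Attr A": "text here", "Attr B": "blah", "Value": 130},
    {"ID": "123", "Attr A": "blah", "Attr B": "blah", "Value": 70},
    {"ID": "456", "Attr A": "blah", "Attr B": "blah", "Value": 90},
    {"ID": "456", "Attr A": "blah", "Attr B": "blah", "Value": 110},
]


def add_max_flag(rows):
    """Return copies of rows with a MaxValue flag (hypothetical helper)."""
    # Mirrors the Table.Group step: find the largest Value for each ID.
    value_max = {}
    for row in rows:
        current = value_max.get(row["ID"], row["Value"])
        value_max[row["ID"]] = max(current, row["Value"])
    # Mirrors the Table.AddColumn step: compare each row with its group's maximum.
    return [{**row, "MaxValue": row["Value"] == value_max[row["ID"]]} for row in rows]


flagged = add_max_flag(rows)
for row in flagged:
    print(row["ID"], row["Value"], row["MaxValue"])
```

As in the M version, ties are all flagged: if two rows of the same ID share the largest Value, both get TRUE.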

Related

How to split a row where there's 2 data in each cells separated by a carriage return?

Someone sends me a file that sometimes contains malformed data.
The data should look like this:
+---------+-----------+--------+
| Name | Initial | Age |
+---------+-----------+--------+
| Jack | J | 43 |
+---------+-----------+--------+
| Nicole | N | 12 |
+---------+-----------+--------+
| Mark | M | 22 |
+---------+-----------+--------+
| Karine | K | 25 |
+---------+-----------+--------+
Sometimes it arrives like this, though:
+---------+-----------+--------+
| Name | Initial | Age |
+---------+-----------+--------+
| Jack | J | 43 |
+---------+-----------+--------+
| Nicole | N | 12 |
| Mark | M | 22 |
+---------+-----------+--------+
| Karine | K | 25 |
+---------+-----------+--------+
As you can see, Nicole and Mark are put in the same row, but the data are separated by a carriage return.
I can split by row, but that duplicates the data:
+---------+-----------+--------+
| Nicole | N | 12 |
| | M | 22 |
+---------+-----------+--------+
| Mark | N | 12 |
| | M | 22 |
+---------+-----------+--------+
This loses the fact that Mark is associated with the second line of data.
(The data here is purely an example)
One way to do this is to transform each cell into a list by doing a Text.Split on the line feed / carriage return symbol.
TextSplit = Table.TransformColumns(Source,
    {
        {"Name", each Text.Split(_,"#(lf)"), type list},
        {"Initial", each Text.Split(_,"#(lf)"), type list},
        {"Age", each Text.Split(_,"#(lf)"), type list}
    }
)
Now each column is a list of lists, which you can combine into one long list using List.Combine. You can then glue these columns together to make a table with Table.FromColumns.
= Table.FromColumns(
{
List.Combine(TextSplit[Name]),
List.Combine(TextSplit[Initial]),
List.Combine(TextSplit[Age])
},
{"Name", "Initial", "Age"}
)
Putting this together, the whole query looks like this:
let
    Source = <Your data source>,
    TextSplit = Table.TransformColumns(Source,{{"Name", each Text.Split(_,"#(lf)"), type list},{"Initial", each Text.Split(_,"#(lf)"), type list},{"Age", each Text.Split(_,"#(lf)"), type list}}),
    FromColumns = Table.FromColumns({List.Combine(TextSplit[Name]),List.Combine(TextSplit[Initial]),List.Combine(TextSplit[Age])},{"Name","Initial","Age"})
in
    FromColumns
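For readers who want to trace the idea step by step, here is a minimal Python sketch of the same split, combine and rebuild sequence. It is not M code; the function name split_rows and the list-of-dicts table are assumptions for illustration, and the sketch assumes every cell in a given row holds the same number of lines.

```python
from itertools import chain

# Sample data from the question: the second row holds two people,
# separated by a line feed inside each cell.
source = [
    {"Name": "Jack", "Initial": "J", "Age": "43"},
    {"Name": "Nicole\nMark", "Initial": "N\nM", "Age": "12\n22"},
    {"Name": "Karine", "Initial": "K", "Age": "25"},
]
columns = ["Name", "Initial", "Age"]


def split_rows(table, columns):
    """Split multi-line cells into separate rows (hypothetical helper)."""
    # Like Text.Split: turn every cell into a list of its lines.
    split = {col: [row[col].split("\n") for row in table] for col in columns}
    # Like List.Combine: flatten each column's list of lists into one long list.
    flat = {col: list(chain.from_iterable(split[col])) for col in columns}
    # Like Table.FromColumns: line the flattened columns up again, row by row.
    return [dict(zip(columns, values)) for values in zip(*(flat[col] for col in columns))]


result = split_rows(source, columns)
for row in result:
    print(row)
```

Because each column is flattened independently and then zipped by position, Mark stays paired with M and 22 instead of being crossed with Nicole's values.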

Undefined binding(s) detected when compiling SELECT query

I am following a tutorial for strapi and am stuck at a part where I query for dishes belonging to a restaurant. I'm sure everything is set up properly with a one(restaurant) to many(dishes) relationship defined but the query doesn't work. I've traced it to the actual query which is:
query {
restaurant(id: "1") {
id
name
dishes {
name
description
}
}
}
which returns an error when I run it in playground. The query doesn't show any issues while I write it and doesn't allow me to write anything like:
query {
restaurant(where:{id: "1"}) {
id
name
dishes {
name
description
}
}
}
My database is mysql and the two tables look like this:
mysql> describe dishes;
+-------------+---------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------+------+-----+-------------------+-----------------------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | varchar(255) | YES | MUL | NULL | |
| description | longtext | YES | | NULL | |
| price | decimal(10,2) | YES | | NULL | |
| restaurant | int(11) | YES | | NULL | |
| created_at | timestamp | NO | | CURRENT_TIMESTAMP | |
| updated_at | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+-------------+---------------+------+-----+-------------------+-----------------------------+
7 rows in set (0.00 sec)
mysql> describe restaurants;
+-------------+--------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+-------------------+-----------------------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | varchar(255) | YES | MUL | NULL | |
| description | longtext | YES | | NULL | |
| created_at | timestamp | NO | | CURRENT_TIMESTAMP | |
| updated_at | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+-------------+--------------+------+-----+-------------------+-----------------------------+
5 rows in set (0.00 sec)
These tables were auto-generated by Strapi.
The full error in playground is this:
{
"errors": [
{
"message": "Undefined binding(s) detected when compiling SELECT query: select `restaurants`.* from `restaurants` where `restaurants`.`id` = ? limit ?",
"locations": [
{
"line": 2,
"column": 3
}
],
"path": [
"restaurant"
],
"extensions": {
"code": "INTERNAL_SERVER_ERROR",
"exception": {
"stacktrace": [
"Error: Undefined binding(s) detected when compiling SELECT query: select `restaurants`.* from `restaurants` where `restaurants`.`id` = ? limit ?",
" at QueryCompiler_MySQL.toSQL (/Users/redqueen/development/deliveroo/server/node_modules/knex/lib/query/compiler.js:85:13)",
" at Builder.toSQL (/Users/redqueen/development/deliveroo/server/node_modules/knex/lib/query/builder.js:72:44)",
" at /Users/redqueen/development/deliveroo/server/node_modules/knex/lib/runner.js:37:34",
"From previous event:",
" at Runner.run (/Users/redqueen/development/deliveroo/server/node_modules/knex/lib/runner.js:33:30)",
" at Builder.Target.then (/Users/redqueen/development/deliveroo/server/node_modules/knex/lib/interface.js:23:43)",
" at runCallback (timers.js:705:18)",
" at tryOnImmediate (timers.js:676:5)",
" at processImmediate (timers.js:658:5)",
" at process.topLevelDomainCallback (domain.js:120:23)"
]
}
}
}
],
"data":
Any idea why this is happening?
It seems this was a bug in the alpha.v20 and alpha.v21 versions of Strapi. A fix has been published, and there is an issue thread about it on GitHub.

Elasticsearch index with jdbc driver

Sorry my english is bad
I am using Elasticsearch and the JDBC river. I have two tables with a many-to-many relation. For example:
product
+---+---------------+
| id| title |
+---+---------------+
| 1 | Product One |
| 2 | Product Two |
| 3 | Product Three |
| 4 | Product Four |
| 5 | Product Five |
+---+---------------+
product_category
+------------+-------------+
| product_id | category_id |
+------------+-------------+
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 4 |
| 2 | 5 |
+------------+-------------+
category
+---+---------------+
| id| name |
+---+---------------+
| 1 | Category One |
| 2 | Category Two |
| 3 | Category Three|
| 4 | Category Four |
| 5 | Category Five |
+---+---------------+
I want to index the categories as an array type:
{
"id": 1,
"name": "Product one",
"categories": ["Category One", "Category Two", "Category Three"]
},
How should I write the SQL?
Use elasticsearch-jdbc structured objects with sql, no need to group_concat:
SELECT
product.id AS _id,
product.id,
title,
name AS categories
FROM product
LEFT JOIN (
SELECT *
FROM product_category
LEFT JOIN category
ON product_category.category_id = category.id
) t
ON product.id = t.product_id
Since rivers have been deprecated since ES v1.5, running a standalone importer may be a better choice.
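To see what the join above returns and why no group_concat is needed, here is a small Python sketch that runs the same SELECT against an in-memory SQLite database and folds rows sharing an _id into one document. The folding function fold_rows is a hypothetical stand-in written for this example, not the importer's actual code, and treating a NULL category as an empty list is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product_category (product_id INTEGER, category_id INTEGER);
    INSERT INTO product VALUES (1, 'Product One'), (2, 'Product Two'), (3, 'Product Three');
    INSERT INTO category VALUES (1, 'Category One'), (2, 'Category Two'),
        (3, 'Category Three'), (4, 'Category Four'), (5, 'Category Five');
    INSERT INTO product_category VALUES (1, 1), (1, 2), (1, 3), (2, 4), (2, 5);
""")

# The SELECT from the answer: one output row per (product, category) pair.
rows = conn.execute("""
    SELECT product.id AS _id, product.id, title, name AS categories
    FROM product
    LEFT JOIN (
        SELECT *
        FROM product_category
        LEFT JOIN category ON product_category.category_id = category.id
    ) t ON product.id = t.product_id
""").fetchall()


def fold_rows(rows):
    """Merge rows that share an _id into one document with a categories list."""
    docs = {}
    for _id, product_id, title, category in rows:
        doc = docs.setdefault(_id, {"id": product_id, "title": title, "categories": []})
        if category is not None:  # products without categories come back as NULL
            doc["categories"].append(category)
    return docs


docs = fold_rows(rows)
for doc in docs.values():
    print(doc)
```

The point of the sketch is only the shape of the data: the query yields repeated product rows, and the repeated categories values are what end up as an array in the indexed document.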

How are product attributes and attribute options stored in Magento database?

I am trying to figure out how the links between attributes and attribute options, and between products and attributes, are made in Magento. Is there any reference on how this works, or can anyone give me a hint?
Thanks,
Balan
As Alan Storm says: "you do not have to know about how your db works. You have to learn how the models work ". (This is not an exact quote. I gave you the meaning).
But I created my own diagram to understand the DB structure. This screenshot shows how it works:
Hope it helps.
Also I recommend you to look through these links:
http://www.magentocommerce.com/wiki/2_-_magento_concepts_and_architecture/magento_database_diagram
http://alanstorm.com/magento_advanced_orm_entity_attribute_value_part_1
1) The attributes are stored in eav_attribute. There you get the attribute_id.
2) The options are stored in eav_attribute_option_value. There you get the option_id.
3) The options are assigned to the product in catalog_product_entity_varchar. There you need the entity_id of the product, the attribute_id from 1) and the value, which is the comma-separated list of option_ids from 2).
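The three lookups above can be traced with a minimal Python sketch. The dictionaries below are hypothetical stand-ins for the three tables, with made-up IDs and labels and only the columns the steps mention; the real tables have more columns, and the function name option_labels is an assumption for the example.

```python
# Hypothetical stand-ins for the three tables (all IDs and labels are made up).
eav_attribute = {"features": 900}  # attribute_code -> attribute_id
eav_attribute_option_value = {  # option_id -> option label
    11: "Waterproof",
    12: "Insulated",
    13: "Reflective",
}
catalog_product_entity_varchar = [
    # value holds the comma-separated option_ids assigned to the product
    {"entity_id": 1, "attribute_id": 900, "value": "11,13"},
]


def option_labels(entity_id, attribute_code):
    """Resolve a product's comma-separated option_ids to their labels."""
    attribute_id = eav_attribute[attribute_code]  # step 1: find the attribute_id
    for row in catalog_product_entity_varchar:  # step 3: find the product's value row
        if row["entity_id"] == entity_id and row["attribute_id"] == attribute_id:
            # step 2: look up each option_id in the option value table
            return [
                eav_attribute_option_value[int(option_id)]
                for option_id in row["value"].split(",")
                if option_id
            ]
    return []


labels = option_labels(1, "features")
print(labels)
```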
I've found these queries very helpful for hunting down things like where the database says a product's color is black, for example.
-- show_product_attr.sql
select
p.entity_id,
p.entity_type_id,
p.attribute_set_id,
p.type_id,
p.sku,
a.attribute_id,
a.frontend_label as attribute,
av.value
from
catalog_product_entity p
left join catalog_product_entity_{datatype} av on
p.entity_id = av.entity_id
left join eav_attribute a on
av.attribute_id = a.attribute_id
where
-- p.entity_id = 28683
-- p.sku = '0452MR'
p.entity_id = {eid}
;
And for attr_options
-- show_product_attr_options.sql
select
p.entity_id,
-- p.entity_type_id,
-- p.attribute_set_id,
p.type_id,
p.sku,
a.attribute_id,
a.frontend_label as attribute,
-- a.attribute_code,
av.value,
ao.*
from
catalog_product_entity p
left join catalog_product_entity_int av on
p.entity_id = av.entity_id
left join eav_attribute a on
av.attribute_id = a.attribute_id
left join eav_attribute_option_value ao on
av.value = ao.option_id
where
-- p.entity_id = 28683
p.entity_id = {eid}
;
You need to replace {datatype} with text, varchar, int, decimal, etc. in the first query, and {eid} with the entity_id in both queries. You can do that on the command line like this:
$ cat show_product_attr_options.sql | sed -e "s/{eid}/30445/" | mysql -uUSER -pPASS DATABASE -t
+-----------+---------+--------------+--------------+---------------------------+-------+----------+-----------+----------+--------------------+-------------+
| entity_id | type_id | sku | attribute_id | attribute | value | value_id | option_id | store_id | value | colorswatch |
+-----------+---------+--------------+--------------+---------------------------+-------+----------+-----------+----------+--------------------+-------------+
| 30445 | simple | 840001179127 | 96 | Status | 1 | 5972 | 1 | 0 | Male | NULL |
| 30445 | simple | 840001179127 | 102 | Visibility | 1 | 5972 | 1 | 0 | Male | NULL |
| 30445 | simple | 840001179127 | 122 | Tax Class | 2 | 5973 | 2 | 0 | Female | NULL |
| 30445 | simple | 840001179127 | 217 | Size | 257 | 17655 | 257 | 0 | XS | NULL |
| 30445 | simple | 840001179127 | 217 | Size | 257 | 17657 | 257 | 1 | XS | NULL |
| 30445 | simple | 840001179127 | 224 | Color | 609 | 18717 | 609 | 0 | Arctic Ice Heather | NULL |
| 30445 | simple | 840001179127 | 260 | Featured | 0 | NULL | NULL | NULL | NULL | NULL |
| 30445 | simple | 840001179127 | 262 | Clearance Product | 0 | NULL | NULL | NULL | NULL | NULL |
| 30445 | simple | 840001179127 | 263 | Skip from Being Submitted | 0 | NULL | NULL | NULL | NULL | NULL |
| 30445 | simple | 840001179127 | 283 | Discontinued | 0 | NULL | NULL | NULL | NULL | NULL |
+-----------+---------+--------------+--------------+---------------------------+-------+----------+-----------+----------+--------------------+-------------+
A similar set of sql scripts can be created for catalog.
Product attributes are extra values that you can assign to a product. Each attribute is stored by name in the main EAV table, and its data is then stored in one of a few different tables based on the data type: varchar, decimal, text, integer, date, etc.
If you have multiple values for your product attribute, they are stored in the attribute options tables, again in different tables based on the data type.
The following link explains the relationships better:
http://www.magentocommerce.com/wiki/2_-_magento_concepts_and_architecture/magento_database_diagram
And deeper developer's detail:
http://www.magentocommerce.com/knowledge-base/entry/magento-for-dev-part-7-advanced-orm-entity-attribute-value
And Attribute sets will be the other thing you come across, like the name suggests, a set of attributes grouped together. http://www.magentocommerce.com/knowledge-base/entry/how-do-i-create-an-attribute-set
HTH
Shaun
SELECT pei.value
FROM `catalog_product_entity_int` pei
JOIN `eav_attribute` ea
ON pei.attribute_id = ea.attribute_id
WHERE pei.entity_id = {your product_id}
AND ea.attribute_code = '{your attribute_code}'
Note that there are a number of different tables like catalog_product_entity_int depending on the type of the attribute, so one of those other ones could be appropriate.
You can get all product properties by using this query:
SELECT CPEV.entity_id, CPE.sku, EA.attribute_id, EA.frontend_label, CPEV.value
FROM catalog_product_entity_varchar AS CPEV
INNER JOIN catalog_product_entity AS CPE ON CPE.entity_id = CPEV.entity_id
INNER JOIN eav_attribute AS EA ON(CPEV.attribute_id = EA.attribute_id AND EA.entity_type_id = 4)
INNER JOIN catalog_eav_attribute AS CEA ON(CEA.attribute_id = EA.attribute_id AND CEA.is_visible_on_front = 1 AND CEA.is_visible_in_grid = 1)

How to remove repeated columns using ruby FasterCSV

I'm using Ruby 1.8 and FasterCSV.
The csv file I'm reading in has several repeated columns.
| acct_id | amount | acct_num | color | acct_id | acct_type | acct_num |
| 345 | 12.34 | 123 | red | 345 | 'savings' | 123 |
| 678 | 11.34 | 432 | green | 678 | 'savings' | 432 |
...etc
I'd like to condense it to:
| acct_id | amount | acct_num | color | acct_type |
| 345 | 12.34 | 123 | red | 'savings' |
| 678 | 11.34 | 432 | green | 'savings' |
Is there a general purpose way to do this?
Currently my solution is something like:
headers = CSV.read_line(file)
headers = CSV.read_line # get rid of garbage line between headers and data
FasterCSV.filter(file, :headers => headers) do |row|
row.delete(6) #delete second acct_num field
row.delete(4) #delete second acct_id field
# additional processing on the data
row['color'] = color_to_number(row['color'])
row['acct_type'] = acct_type_to_number(row['acct_type'])
end
Assuming you want to get rid of the hardcoded deletions, these two lines:
row.delete(6) #delete second acct_num field
row.delete(4) #delete second acct_id field
can be replaced by:
row = row.to_hash
This will clobber duplicates. The rest of the posted code will keep working.
