Power Query Custom Column with Numbers and Text - powerquery

I am trying to clean up my data by converting certain values to the number 2, while leaving the remaining data and its data type as is.
I am using the following code in a custom column step in Power Query (Excel), but it gives me an error when returning a number.
Number.From([Values_old]) otherwise if Text.Contains([Value_old],"not required",Comparer.OrdinalIgnoreCase) or Text.Lower([Value_old]) = "N/A" or Text.Lower([Value_old]) ="NA" or [Value_old] = "100" or [Value_old] = 100 then 2 else [Value_old]
See the result from the step:
I based my conditional column from the following comment I found in a forum: https://community.powerbi.com/t5/Desktop/data-type-which-contains-both-test-and-number/m-p/55785/highlight/true#M22664
However, this seems to break my if then else condition as well.

It's simple: you used the wrong column name, Values_old instead of Value_old.
But your formula won't work even then, since neither of these comparisons can ever be true:
Text.Lower([Value_old]) = "N/A" or Text.Lower([Value_old]) ="NA"
because you are comparing a value you just converted to lower case against upper-case text.
You probably want the version below, which also includes the try keyword you seem to have left out of your code:
= Table.AddColumn(#"Changed Type", "Custom", each try Number.From([Value_old]) otherwise if Text.Contains([Value_old],"not required",Comparer.OrdinalIgnoreCase) or Text.Lower([Value_old]) = "n/a" or Text.Lower([Value_old]) ="na" or [Value_old] = "100" or [Value_old] = 100 then 2 else [Value_old])
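One thing to watch in the formula above: try Number.From(...) succeeds for 100 and "100", so those rows come back as 100 and the later tests for 100 are never reached. A minimal sketch of one way to order the checks follows, assuming the goal is still to turn 100 into 2 and, like the formula above, to convert other numeric text to numbers. The sample table and the step name AddCustom are hypothetical.

```powerquery
let
    // Hypothetical sample data standing in for the real table
    Source = Table.FromRecords({
        [Value_old = "Not Required"],
        [Value_old = "N/A"],
        [Value_old = "100"],
        [Value_old = 100],
        [Value_old = "7"],
        [Value_old = "keep me"]
    }),
    AddCustom = Table.AddColumn(Source, "Custom", each
        let
            v = [Value_old],
            // null when the value cannot be read as a number
            n = try Number.From(v) otherwise null,
            // lower-cased copy, only when the value is text
            t = if v is text then Text.Lower(v) else null
        in
            if n = 100 then 2
            else if n <> null then n
            else if t <> null and (Text.Contains(t, "not required") or t = "n/a" or t = "na") then 2
            else v
    )
in
    AddCustom
```

Guarding the text functions with "v is text" keeps them from being called on numeric values.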

Related

Please can I have a Power Query formula to allow me to check if cells contain some text and then replace?

I have a list of residents who have different professions including being a student. However, some have written “Student” or “Overseas Student” or something else with the word student in it. I would like Power Query to search the column for any cell containing “Student” and replace it with “Student” so it removes any other references. Please can someone help?
I have tried to write the formula, but it has not been successful.
You'll need to use the Text.Contains function, which returns true or false depending on whether the search text is found in the value. You then wrap it in an if expression like:
if Text.Contains([Column1], "Student") then "Student" else [Column1]
Which will result in the following new column.
It's not clear from your question whether you instead want to replace text inside a longer string, for example turning "I am a overseas student" into "I am a student". In that case you'll have to combine the replace and contains functions in a multi-branch if expression that checks which string you are searching for and then replaces that value.
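A minimal sketch of that replace-and-contains idea, assuming a single search phrase and the column name Column1 used above. The sample rows and the new column name Cleaned are hypothetical. Note that Text.Replace is case-sensitive, so this only matches the exact lower-case phrase.

```powerquery
let
    // Hypothetical sample data
    Source = Table.FromRecords({
        [Column1 = "I am a overseas student"],
        [Column1 = "Teacher"]
    }),
    Replaced = Table.AddColumn(Source, "Cleaned", each
        if Text.Contains([Column1], "overseas student")
        then Text.Replace([Column1], "overseas student", "student")
        else [Column1],
        type text)
in
    Replaced
```

More phrases could be handled by chaining further else if branches in the same way.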
You can Transform the column in Power Query: (You'll need to edit your M-Code in the Advanced Editor to add the #"Normalize Student" line)
Source
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table7"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Profession", type text}}),
//edit table and column name to reflect your actual variables
#"Normalize Student" = Table.TransformColumns(#"Changed Type", {
"Profession", each if Text.Contains(_, "student", Comparer.OrdinalIgnoreCase) then "Student" else _, type text
})
in
#"Normalize Student"
Result

Bring Value with Sumifs in Pow.Query language to specified row, and column(location)

Next step? Using SUMIFS and many SUMIF formulas against another workbook, I have brought information into exact rows and columns of an Excel workbook. Now I want to do the same with the Power Query language. I can bring in two values if a condition is met, but it is unclear how to bring the total sum into one row of the Excel workbook. Can anyone show me the path? I guess I will need the Data Model...
= Table.AddColumn(#"Changed Type", "Sumif", each if [Column2] =2 or [Column2]=1 then [Column3]+[Column4] else 0)
let
Source = Folder.Files...
#"C:\Users...
#"Imported Excel" = Excel.Workbook(#"C:\...
SegPL_Chart = #"Imported Excel"{[Name="SegPL_Chart"]}[Data],
#"Removed Top Rows" = Table.Skip(SegPL_Chart,12),
#"Removed Alternate Rows" = Table.AlternateRows(#"Removed Top Rows",1,1,90),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Alternate Rows"),
#"Filtered Rows" = Table.SelectRows(#"Promoted Headers", each ([Col1]="1" or [Col1]="2")),
#"Table Group = Table.Group(#"Filtered Rows", {}, List.TransformMany(Table.ColumnNames(#"Filtered Rows",(x)=>{each if x = "Names" then "Totals" else List.Sum(Table.Column(_,x))},(x,y)=>{x,y})),
#"append" = Table.Combine({#"Filtered Rows",#"Table Group"})
in
#"append"
It gives an error at "in": Token comma needed..? What else do I need to do to bring in total rows?
You can use several steps to create several helper columns with intermediate results of conditional sums. Then you can create a new column that sums all the intermediate results, and then delete the helper columns.
Keep in mind that, unlike Excel, the calculations in Power Query always return constants, so you can delete calculated columns you no longer need. So:
Create helper column 1 with complicated IF and Sum scenario
Create helper column 2 with complicated IF and Sum scenario
Create total column to add column 1 + column 2
Delete helper columns and keep only the total column
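The four steps above can be sketched roughly as follows. The column names follow the snippet in the question, while the sample rows, the conditions in each helper, and the step names are assumptions for illustration only.

```powerquery
let
    // Hypothetical sample data
    Source = Table.FromRecords({
        [Column2 = 1, Column3 = 10, Column4 = 5],
        [Column2 = 2, Column3 = 20, Column4 = 1],
        [Column2 = 3, Column3 = 30, Column4 = 2]
    }),
    // Helper column 1: one conditional value
    Helper1 = Table.AddColumn(Source, "Helper1", each if [Column2] = 1 then [Column3] else 0, type number),
    // Helper column 2: another conditional value
    Helper2 = Table.AddColumn(Helper1, "Helper2", each if [Column2] = 2 then [Column4] else 0, type number),
    // Total column adds the helpers together
    Total = Table.AddColumn(Helper2, "Total", each [Helper1] + [Helper2], type number),
    // The helpers are no longer needed once Total exists
    Cleaned = Table.RemoveColumns(Total, {"Helper1", "Helper2"})
in
    Cleaned
```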
That gives me exactly the result I was looking for, but it is with a DAX formula in PowerPivot:
=SUMX(FILTER('TableName',[ColName] = 1),'TableName'[ColName2])
So I would be glad to convert it to a Power Query formula.
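A rough M counterpart of that DAX expression, as a sketch only: filter the rows, then sum one column of the filtered table. The sample rows are hypothetical, and ColName and ColName2 are the placeholder names from the DAX formula.

```powerquery
let
    // Hypothetical sample data
    Source = Table.FromRecords({
        [ColName = 1, ColName2 = 10],
        [ColName = 2, ColName2 = 20],
        [ColName = 1, ColName2 = 5]
    }),
    // Same role as FILTER('TableName', [ColName] = 1)
    Matching = Table.SelectRows(Source, each [ColName] = 1),
    // Same role as SUMX over ColName2; evaluates to 15 for the sample rows
    Total = List.Sum(Matching[ColName2])
in
    Total
```

The result is a single number rather than a table, so it would still need to be placed into a row or combined with other steps.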

Data Manipulation - Power Query

Relatively new to Power Query. How would one get "value" into the "False" position based on "TEST"?
Table
Assuming you want to test one column, and place a result into a new column
Add Column ... Custom column ...
formula
= if [YourTestColumnName] = "TEST" then "FALSE" else null

Filter on date in PowerQuery (PowerBI)

I'm currently getting too much data from my Cosmos DB, which I want to reduce to the last 8 weeks.
How can I filter in Power Query to get the last 8 weeks based on my date column?
This is my powerquery to get the data:
let
Source = DocumentDB.Contents("https://xxx.xxx", "xxx", "xxx"),
#"Expanded Document" = Table.ExpandRecordColumn(Source, "Document", {"$v"}, {"Document.$v"}),
#"Expanded Document.$v" = Table.ExpandRecordColumn(#"Expanded Document", "Document.$v", {"date"}, {"Document.$v.date"}),
#"Expanded Document.$v.date" = Table.ExpandRecordColumn(#"Expanded Document.$v", "Document.$v.date", {"$v"}, {"Document.$v.date.$v"}),
#"Changed Type" = Table.TransformColumnTypes(#"Expanded Document.$v.date",{{"Document.$v.date.$v", type text}})
in
#"Changed Type"
And this is how the data is in my CosmosDB:
{
"_id" : ObjectId("5c6144bdf7ce070001acc213"),
"date" : {
"$date" : 1549792055030
},
If you want to do all the work on your end (maybe the server can do some/all of it):
Assuming the 1549792055030 (shown in example) is a Unix timestamp expressed in milliseconds, to convert to a datetime in Power Query, try something like: #datetime(1970, 1, 1, 0, 0, 0) + #duration(0, 0, 0, 1549792055030/1000)
You seem to expand a record field named $v (which itself was nested within a field named date, which itself was nested within a field named $v) in your M code, but $v is not shown as being present in the structure. I mention this as it's confusing to know whether to follow your M code or the structure. I'm going to assume that you have $v field, which contains a date field, which itself contains a $date field. To get at the nested Unix timestamp, you could try something like: someRecord[#"$v"][date][#"$date"]
Since you're interested in only the last 8 weeks, you could test for something like: Date.IsInPreviousNWeeks(DateTime.AddZone(someDatetime, 0), 8). (You could also do it the other way, by converting 8 weeks ago before now to a Unix timestamp and then filter for timestamps >= to the value you've worked out.)
Putting the above together, we might get some M code that looks like:
let
Source = DocumentDB.Contents("https://xxx.xxx", "xxx", "xxx"),
filterDates = Table.SelectRows(Source, each
let
millisecondsSinceEpoch = Number.From([Document][#"$v"][date][#"$date"]),
toDatetime = #datetime(1970, 1, 1, 0, 0, 0) + #duration(0, 0, 0, millisecondsSinceEpoch/1000),
toFilter = Date.IsInPreviousNWeeks(DateTime.AddZone(toDatetime, 0), 8)
in toFilter
)
in filterDates
The code above may be functional (hopefully) but, conceptually, it might not be the right way to do it. I am not familiar with the function DocumentDB.Contents, but this link (https://www.powerquery.io/accessing-data/document-db/documentdb.contents) suggests it has these parameters:
function (url as text, optional database as nullable any, optional
collection as nullable any, optional options as nullable record) as
table
and it goes on to say:
if the field Query is specified in the options record the results of
the query being executed on either the specified database and/or
collection will be returned.
What I understand this to mean is that if you change your first line to something like:
Source = DocumentDB.Contents("https://xxx.xxx", "xxx", "xxx", [Query = "..."])
and the query you specify in "..." is understood by the server (presumably the query needs to be in Cosmos DB's native query language), only the last 8 weeks' worth of data will be returned to you (meaning less data needs sending and less work for you). As I said, I'm unfamiliar with Azure Cosmos DB, so I can't really comment further. But this seems the better way of doing it.
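For the client-side alternative mentioned earlier (working out the timestamp for 8 weeks ago and comparing raw values), a minimal sketch might look like this. It assumes the millisecond timestamps have already been expanded into a column, here called ms; that column name, the id column and the sample rows are hypothetical.

```powerquery
let
    Epoch = #datetime(1970, 1, 1, 0, 0, 0),
    // Current UTC time as a plain datetime
    NowUtc = DateTimeZone.RemoveZone(DateTimeZone.UtcNow()),
    // Milliseconds since the epoch for "now" and for 8 weeks (56 days) ago
    NowMs = Duration.TotalSeconds(NowUtc - Epoch) * 1000,
    CutoffMs = Duration.TotalSeconds((NowUtc - #duration(56, 0, 0, 0)) - Epoch) * 1000,
    // Hypothetical rows: one old timestamp from the question, one current
    Source = Table.FromRecords({
        [id = "old", ms = 1549792055030],
        [id = "new", ms = NowMs]
    }),
    // Keep only rows at or after the cutoff
    Recent = Table.SelectRows(Source, each [ms] >= CutoffMs)
in
    Recent
```

Unlike Date.IsInPreviousNWeeks, this keeps a rolling 56-day window that includes the current week.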

Is there an ISNUMBER() or ISTEXT() equivalent for Power Query?

I have a column with mixed types of Number and Text and am trying to separate them into different columns using an if ... then ... else conditional. Is there an ISNUMBER() or ISTEXT equivalent for power query?
Here is how to check the type in Excel Power Query:
IsNumber
=Value.Is(Value.FromText([ColumnOfMixedValues]), type number)
IsText
=Value.Is(Value.FromText([ColumnOfMixedValues]), type text)
hope it helps!
That depends a bit on the nature of the data and how it is originally encoded. Power Query is more strongly typed than Excel.
For example:
Source = Table.FromRecords({[A=1],[A="1"],[A="a"]})
Creates a table with three rows. The first row's data type is number. The second and third rows are both text. But the second row's text could be interpreted as a number.
The following is a query that creates two new columns showing if each row is a text or number type. The first column checks the data type. The second column attempts to guess the data type based on the value. The guessing code assumes everything that isn't a number is text.
Example Code
Edit: Borrowing from @AlejandroLopez-Lago-MSFT's comment for the interpreted type.
let
Source = Table.FromRecords({[A=1],[A="1"],[A="a"]}),
#"Added Custom" = Table.AddColumn(Source, "Type", each
let
TypeLookup = (inputType as type) as text =>
Table.FromRecords(
{
[Type=type text, Value="Text"],
[Type=type number, Value="Number"]
}
){[Type=inputType]}[Value]
in
TypeLookup(Value.Type([A]))
),
#"Added Custom 2" = Table.AddColumn(#"Added Custom", "Interpreted Type", each
let
result = try Number.From([A]) otherwise "Text",
resultType = if result = "Text" then "Text" else "Number"
in
resultType
)
in
#"Added Custom 2"
Sample output
Put it in logical test format
Value.Type([Column1]) = type number
Value.Type([Column1]) = type text
The function Value.Type returns a type, so comparing it with a type in an equation returns true or false.
Also, equivalently,
Value.Type([Column1]) = Date.Type
Value.Type([Column1]) = Text.Type
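Applied to the original goal of splitting a mixed column in two, a minimal sketch could use Value.Is for the test. The sample table reuses the three-row example from the earlier answer, and the new column names Numbers and Texts are hypothetical.

```powerquery
let
    Source = Table.FromRecords({[A = 1], [A = "1"], [A = "a"]}),
    // Keep the value only when it is stored as a number
    Numbers = Table.AddColumn(Source, "Numbers", each if Value.Is([A], type number) then [A] else null),
    // Keep the value only when it is stored as text
    Texts = Table.AddColumn(Numbers, "Texts", each if Value.Is([A], type text) then [A] else null)
in
    Texts
```

This checks the stored type, so the text "1" lands in the Texts column, not in Numbers.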
HTH
ISTEXT() doesn't exist in any language I've worked with - typically any numeric or date value can be converted to text so what would be a false result?
For ISNUMBER, I would solve this without any code by changing the Data Type to a number type e.g. Whole Number. Any rows that don't convert will show Error - you can then apply Replace Errors or Remove Errors to handle them.
Use Duplicate Column first if you don't want to disturb the original column.
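The same no-code steps correspond roughly to the M below; this is a sketch with hypothetical sample data and a hypothetical name for the duplicated column.

```powerquery
let
    // Hypothetical sample data: all values stored as text
    Source = Table.FromRecords({[A = "1"], [A = "a"], [A = "22"]}),
    // Duplicate Column, so the original stays untouched
    Duplicated = Table.DuplicateColumn(Source, "A", "A as number"),
    // Change Type to Whole Number; non-numeric rows become Error cells
    Typed = Table.TransformColumnTypes(Duplicated, {{"A as number", Int64.Type}}),
    // Remove Errors on the duplicated column
    NoErrors = Table.RemoveRowsWithErrors(Typed, {"A as number"})
in
    NoErrors
```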
I agree with Mike Honey.
I have a SKU code that is a mix of Char and Num.
Normally the last 8 Char are Numbers but in some weird circumstances the SKU is repeated with an additional letter but given the same EAN which causes chaos.
By creating a new temp column using Text.End(SKU, 1) I get only the last character. I then convert that column to Whole Number. Any Error rows are then removed to leave only the rows I need. I then delete the temp column and am left with the rows I need in the format I started with.
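A variation on the same idea that skips the temp column, as a sketch only: test the last character inside a row filter. The sample SKU values are made up.

```powerquery
let
    // Hypothetical SKUs: the second one carries an extra trailing letter
    Source = Table.FromRecords({[SKU = "AB12345678"], [SKU = "AB12345678X"]}),
    // Keep rows whose last character can be read as a number
    Kept = Table.SelectRows(Source, each (try Number.From(Text.End([SKU], 1)) otherwise null) <> null)
in
    Kept
```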
