Power Query - How to extract after delimiter

I have data in a single column that needs to be split into two columns. For example:
1,000,1111,000 - what we should see is 1,000,111 - 1,000, or
1,1111,100 - what we should see is 1,111 - 1,100
etc.
I need to separate these values into two columns. I assume the condition should be: "If there are four digits after a comma, split at that point into two columns."
It's not immediately obvious how I should do this. Any thoughts?
EDIT: essentially, the criterion is: if the 4th character after any comma is not another comma, move everything from that 4th character onward into another column.

This query works in five steps: it splits the text string into a list, using its commas as delimiters; looks at each list entry to find the one that is longer than 3 characters; inserts a semicolon after the 3rd character of that entry; recombines the list into a text string, with commas; and finally splits the recombined string into two columns, using the semicolon as the delimiter.
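let
    // Load the source table, named Table1 in the workbook
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    // Split on commas, insert ";" after the 3rd character of any entry longer than 3 characters, then rejoin with commas
    Custom1 = Table.TransformColumns(Source, {"Column1", each Text.Combine(List.Transform(Text.Split(_,","), each if Text.Length(_) > 3 then Text.Insert(_,3,";") else _),",")}),
    // Split at the inserted semicolon into two columns
    #"Split Column by Delimiter" = Table.SplitColumn(Custom1, "Column1", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), {"Column1.1", "Column1.2"})
in
    #"Split Column by Delimiter"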
I developed and tested this against a table named Table1, with the data in Column1.
The query result has the two parts in columns Column1.1 and Column1.2.
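For readers who want to check the splitting rule outside Power Query, the same steps can be sketched in Python. This is only an illustration of the logic described in this answer, not part of the original query; the function name split_after_three is made up for the example, and it assumes at most one entry is longer than 3 characters.

```python
def split_after_three(text: str) -> tuple[str, str]:
    """Split a comma-grouped string where an entry runs past 3 characters."""
    # Step 1: split on commas.
    parts = text.split(",")
    # Steps 2-3: mark any entry longer than 3 characters with ";" after its 3rd character.
    marked = [p[:3] + ";" + p[3:] if len(p) > 3 else p for p in parts]
    # Step 4: rejoin with commas. Step 5: split once at the semicolon.
    left, _, right = ",".join(marked).partition(";")
    return left, right


print(split_after_three("1,000,1111,000"))  # ('1,000,111', '1,000')
print(split_after_three("1,1111,100"))      # ('1,111', '1,100')
```

If no entry is longer than 3 characters, the second element comes back as an empty string.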

Related

Modify query function so it can work as an arrayformula in Google Sheets

How do I modify this formula so I can use it as an array formula instead of dragging it down?
SUBSTITUTE(JOIN(", ", UNIQUE(QUERY(A:D,"SELECT B WHERE C = '"&G2&"'"))), ", , ", "")
Explanation of the formula:
The formula extracts and concatenates the unique values from column B of the range A:D, where the value in column C matches a specific criterion. It is made up of several parts:
QUERY extracts all values from column B of the range A:D where the value in column C matches the criterion in G2.
UNIQUE removes any duplicate values from the previous step.
JOIN concatenates the unique values into a single string, separated by commas.
SUBSTITUTE replaces occurrences of ", , " with an empty string.

Can you try:
=BYROW(G2:G,LAMBDA(gx,IF(gx="",,TEXTJOIN(", ",1,IFNA(UNIQUE(FILTER(B:B,C:C=gx)))))))
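As a rough illustration of what the BYROW formula computes for each row of G, here is the same filter / unique / join logic sketched in Python. The function name join_unique_matches and the sample data are assumptions made up for this example; they are not from the original sheet.

```python
def join_unique_matches(rows, criteria):
    """For each criterion, join the unique non-empty B values whose C value matches it.

    rows is a list of (b, c) pairs standing in for columns B and C;
    criteria stands in for the values in G2:G.
    """
    results = []
    for key in criteria:
        if key == "":
            # Mirrors IF(gx="",, ...): blank criteria give a blank result.
            results.append("")
            continue
        seen = []
        for b, c in rows:
            # FILTER on C, UNIQUE on B, and TEXTJOIN's ignore-empty flag.
            if c == key and b != "" and b not in seen:
                seen.append(b)
        results.append(", ".join(seen))
    return results


# Hypothetical sample data.
rows = [("apple", "x"), ("pear", "y"), ("apple", "x"), ("fig", "x")]
print(join_unique_matches(rows, ["x", "y", "", "z"]))
# ['apple, fig', 'pear', '', '']
```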

Regular expression to remove a portion of text from each entry in commas separated list

I have a string of comma-separated values that I want to trim down for display purposes.
The string is a comma-separated list with a varying number of entries, each of varying length.
Each entry in the list is formatted as a five-character pattern in the format "##-NX" followed by some text.
e.g., "01-NX sometext, 02-NX morertext, 09-NX othertext, 12-NX etc..."
Is there a regular expression function I can use to remove the text after the 5-character prefix of each entry in the list, returning "01-NX, 02-NX, 09-NX, 12-NX,..."?
I am a novice with regular expressions and I haven't been able to figure out how to code the pattern.

I think what you need is
regexp_replace(regexp_replace(mystring, '(\d{2}-NX)(.*?)(,)', '\1\3'), '(\d{2}.*NX).*', '\1')
The inner REGEXP_REPLACE looks for a pattern like nn-NX (two numeric characters followed by "-NX") and any number of characters up to the next comma, then replaces it with the first and third term, dropping the "any number of characters" part.
The outer REGEXP_REPLACE looks for a pattern like two numeric characters followed by any number of characters up to the last NX, and keeps that part of the string.
Here is the Oracle code I used for testing:
with a as (
    select '01-NX sometext, 02-NX morertext, 09-NX othertext, 12-NX etc.' as myString
    from dual
)
select mystring
     , regexp_replace(regexp_replace(mystring, '(\d{2}-NX)(.*?)(,)', '\1\3'), '(\d{2}.*NX).*', '\1') as output
from a

This alternative calls REGEXP_REPLACE() once.
Match 2 digits, a dash and 'NX', followed by zero or more characters (non-greedy), up to a comma or the end of the string. Replace the match with the first group and the 3rd group, which is either the comma or the end of the string.
EDIT: Took dougp's advice and eliminated the RTRIM by adding the 3rd capture group. Thanks for that!
WITH tbl(str) AS (
    SELECT '01-NX sometext, 02-NX morertext, 09-NX othertext, 12-NX etc.' FROM dual
)
SELECT
    REGEXP_REPLACE(str, '(\d{2}-NX)(.*?)(,|$)', '\1\3') str
FROM tbl;
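To try the patterns outside Oracle, both approaches can be sketched with Python's re module. This is only a sanity check of the regular expressions themselves; Python's regex engine is not Oracle's, and the function names are made up for the example.

```python
import re

SAMPLE = "01-NX sometext, 02-NX morertext, 09-NX othertext, 12-NX etc."


def trim_two_pass(s: str) -> str:
    # Inner pass: drop the text between each nn-NX prefix and the next comma.
    inner = re.sub(r"(\d{2}-NX)(.*?)(,)", r"\1\3", s)
    # Outer pass: keep everything up to the last NX, dropping the trailing text.
    return re.sub(r"(\d{2}.*NX).*", r"\1", inner)


def trim_single_pass(s: str) -> str:
    # One pass: the third group is either the comma or the end of the string.
    return re.sub(r"(\d{2}-NX)(.*?)(,|$)", r"\1\3", s)


print(trim_two_pass(SAMPLE))     # 01-NX, 02-NX, 09-NX, 12-NX
print(trim_single_pass(SAMPLE))  # 01-NX, 02-NX, 09-NX, 12-NX
```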

How do I identify whether a column entry starts with a letter or a number using m code in power query?

I have a column that contains either letters or numbers. I want to add a column identifying whether each cell contains a letter or a number. The problem is that there are thousands of records in this particular database.
I tried the following syntax:
= Table.AddColumn(Source, "Column2", each if [Column1] is number then "Number" else "Letters")
My problem is that when I enter this, it returns everything as "Letters" because it looks at the column type instead of the actual value in the cell. This remains the case even when I change the column type from Text to General. Either way, it still produces "Letters", since text is automatically assigned as the data type when the column contains both text and numbers.

Use this expression:
= Table.AddColumn(Source, "Column2", each if List.Contains({"0".."9"}, Text.Start([Column1], 1)) then "Numbers" else "Letters")
Note: It would have been smart to add sample data to your question so I wouldn't have to guess what your data actually looks like!

Add a custom column with:
= try if Value.Is(Number.From([Column1]), type number) then "number" else "not" otherwise "not"
Peter's method works if every value is either all letters or all digits (AAA/111), but this one also handles mixed values such as A11 and 1BC, which it labels "not".
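The difference between the two answers (checking only the first character versus checking whether the whole value converts to a number) can be sketched in Python. This is an illustration of the two ideas, not a translation of M's exact conversion rules; the function names are made up, and float() is only a stand-in for Number.From.

```python
DIGITS = set("0123456789")


def classify_by_first_char(value: str) -> str:
    # Looks only at the leading character, like the Text.Start / List.Contains check.
    return "Numbers" if value[:1] in DIGITS else "Letters"


def classify_whole_value(value: str) -> str:
    # Tries to convert the whole value, like the try ... otherwise check.
    try:
        float(value)
        return "number"
    except ValueError:
        return "not"


for sample in ["AAA", "111", "A11", "1BC"]:
    print(sample, classify_by_first_char(sample), classify_whole_value(sample))
# AAA Letters not
# 111 Numbers number
# A11 Letters not
# 1BC Numbers not
```

Which check is appropriate depends on whether the goal is "starts with a digit" or "is entirely numeric".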

SSRS [Sort Alphanumerically]: How to sort a specific column in a report to be [A-Z] & [ASC]

I have a field set that contains bill numbers and I want to sort them first alphabetically then numerically.
For instance I have a column "Bills" that has the following sequence of bills.
- HB200
- SB60
- HB67
Desired outcome is below
- HB67
- HB200
- SB60
How can I use sorting in the SSRS Group Properties to have the field sort first by [A-Z] and then by [1 - 1000....]?

This should be doable by adding just 2 separate Sort options in the group properties. To test this, I created a simple dataset using your examples.
CREATE TABLE #temp (Bills VARCHAR(20))
INSERT INTO #temp(Bills)
VALUES ('HB200'),('SB60'),('HB67')
SELECT * FROM #temp
Next, I added a matrix with a single row and a single column for my Bills field with a row group.
In the group properties, I set up two sorting options.
To get this working, my theory was that you need to isolate the numeric characters from the non-numeric characters and use each in its own sort option. To do this, I used the relatively unknown Regex Replace function in SSRS.
This expression gets only the non-numeric characters and is used in the top sorting option:
=System.Text.RegularExpressions.Regex.Replace(Fields!Bills.Value, "[0-9]", "")
While this expression isolates the numeric characters:
=System.Text.RegularExpressions.Regex.Replace(Fields!Bills.Value, "[^0-9]", "")
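With these sorting options, my results match what you expect to happen. Note that Regex.Replace returns a string, so if the numeric part ends up sorting as text (200 before 67), wrapping the second expression in CInt(...) should force a numeric sort.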
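The two-key idea (letters first, then the number) can be checked with a short Python sketch. It mirrors the two Regex.Replace expressions from this answer, but it is not SSRS code; the function name bill_sort_key is made up, and the digits are converted to an integer here so that 67 sorts before 200.

```python
import re


def bill_sort_key(bill: str):
    # First key: the non-numeric characters (digits stripped out).
    letters = re.sub(r"[0-9]", "", bill)
    # Second key: the numeric characters, compared as a number rather than as text.
    digits = re.sub(r"[^0-9]", "", bill)
    return letters, int(digits) if digits else 0


bills = ["HB200", "SB60", "HB67"]
print(sorted(bills, key=bill_sort_key))  # ['HB67', 'HB200', 'SB60']
```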

In the sort expression for the tablix/table that displays the dataset, set the sort to something like:
=IIF(Fields!Bills.Value = "HB67", 1, IIF(Fields!Bills.Value = "HB200", 2, IIF(Fields!Bills.Value = "SB60", 3, 4)))
Then when you sort A-Z, it'll sort by the number assigned in the sort expression.
This is only a practical solution if you don't have hundreds of values, as the expression becomes tedious to write when there are hundreds of possible conditions.

NIFI: Unable to extract two values from a list during each iteration over a loop

I would like to retrieve a large SQL dump between date ranges. To do this, I constructed a loop over a date list, which is intended to extract adjacent fields. Unfortunately, it doesn't work as planned.
Following is my flow:
Replace Text: Takes flowfile content date list as all_first_dates
Initialize Count:
While Loop:
Get first and adjacent dates:
However, on inspecting the queue, I get the first and second as this:
Whereas I wanted 2016-01-01 and 2016-01-02 for first and second respectively on my first iteration, and so on.

Check the description of the getDelimitedField function and its parameters:
Description: Parses the Subject as a delimited line of text and returns just a single field from that delimited text.
Arguments:
index: The index of the field to return. A value of 1 will return the first field, a value of 2 will return the second field, and so on.
delimiter: Optional argument that provides the character to use as a field separator. If not specified, a comma will be used. This value must be exactly 1 character.
...
You are not passing the second parameter, so a comma is used to split the subject. Your subject presumably contains no commas, so you get the whole subject back as a single element.
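The effect of the missing delimiter can be reproduced with a small Python sketch of the behaviour quoted above. This is not NiFi code: get_delimited_field is a hypothetical stand-in for getDelimitedField, the space-separated date list is an assumed example (the actual separator in the question is not shown), and returning an empty string for an out-of-range index is also an assumption.

```python
def get_delimited_field(subject: str, index: int, delimiter: str = ",") -> str:
    """Return the 1-based field of subject, splitting on a single-character delimiter."""
    if len(delimiter) != 1:
        raise ValueError("delimiter must be exactly 1 character")
    fields = subject.split(delimiter)
    # Assumption for this sketch: an out-of-range index yields an empty string.
    return fields[index - 1] if 1 <= index <= len(fields) else ""


dates = "2016-01-01 2016-01-02 2016-01-03"  # assumed separator: a space

# Default comma delimiter: nothing to split on, so field 1 is the whole subject.
print(get_delimited_field(dates, 1))       # 2016-01-01 2016-01-02 2016-01-03
# Passing the actual delimiter returns the individual dates.
print(get_delimited_field(dates, 1, " "))  # 2016-01-01
print(get_delimited_field(dates, 2, " "))  # 2016-01-02
```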
