I have text like this: (fr_CA_conf_3001_3001_00863211-2_channel1__174c431c-96d2-5a53-b9e4-e04fcc191e61)
I need to extract the part of the text before "__".
The expected result is: fr_CA_conf_3001_3001_00863211-2_channel1
Could you please help me to find the right formula using REGEXEXTRACT?
Thanks in advance,
Try:
=regexextract(A1,"^\((.*)__")
Where cell A1 contains your data.
You can also use an array formula to work down a whole column of values, for example in column B, row 1:
=arrayformula(if(A1:A<>"",regexextract(A1:A,"^\((.*)__"),))
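To see what the capture group picks up, here is a minimal JavaScript sketch of the same idea. The helper name extractBeforeDoubleUnderscore is made up for illustration, and the pattern differs from the formula above in two ways that are my own assumptions: the opening parenthesis is optional (in case the brackets are not really part of the data), and the capture is lazy so it stops at the first "__" rather than the last.

```javascript
// Hypothetical helper: returns the text before the first "__",
// or null when the input contains no "__" at all.
function extractBeforeDoubleUnderscore(text) {
  // ^\(?   optional opening parenthesis at the start of the text
  // (.*?)  lazy capture, so matching stops at the first "__"
  const match = /^\(?(.*?)__/.exec(text);
  return match ? match[1] : null;
}

const sample =
  "(fr_CA_conf_3001_3001_00863211-2_channel1__174c431c-96d2-5a53-b9e4-e04fcc191e61)";
const extracted = extractBeforeDoubleUnderscore(sample);
console.log(extracted); // fr_CA_conf_3001_3001_00863211-2_channel1
```

If the value could contain more than one "__", the greedy (.*) in the formula would capture up to the last one, while the lazy version here stops at the first.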
I have a column that contains either letters or numbers. I want to add a column identifying whether each cell contains a letter or a number. The problem is that there are thousands of records in this particular database.
I tried the following syntax:
= Table.AddColumn(Source, "Column2", each if [Column1] is number then "Number" else "Letters")
My problem is that when I enter this, it returns "Letters" for every row, because it looks at the column type instead of the actual value in the cell. This remains the case even when I change the column type from Text to General. Either way, it still produces "Letters", since the column is automatically typed as text because it contains both text and numbers.
Use this expression:
= Table.AddColumn(Source, "Column2", each if List.Contains({"0".."9"}, Text.Start([Column1], 1)) then "Numbers" else "Letters")
Note: It would have been smart to add sample data to your question so I wouldn't have to guess what your data actually looks like!
Add column, custom column with
= try if Value.Is(Number.From([Column1]), type number) then "number" else "not" otherwise "not"
Peter's method works if the choice is AAA/111 but this one tests for A11 and 1BC as well
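The difference between the two answers is easier to see side by side. Below is a rough JavaScript sketch (not M) of both tests; the function names are invented for illustration, and JavaScript's Number() parsing is only an approximation of what Number.From accepts in Power Query.

```javascript
// Test 1: look only at the first character (the idea behind the first answer).
function classifyByFirstChar(value) {
  return /^[0-9]/.test(value) ? "Numbers" : "Letters";
}

// Test 2: the whole value must convert to a number (the idea behind the second answer).
function classifyWholeValue(value) {
  const trimmed = String(value).trim();
  return trimmed !== "" && Number.isFinite(Number(trimmed)) ? "number" : "not";
}

// The four shapes of data mentioned above.
const samples = ["AAA", "111", "A11", "1BC"];
const byFirstChar = samples.map(classifyByFirstChar);
const byWholeValue = samples.map(classifyWholeValue);
console.log(byFirstChar);  // [ 'Letters', 'Numbers', 'Letters', 'Numbers' ]
console.log(byWholeValue); // [ 'not', 'number', 'not', 'not' ]
```

The two tests agree on AAA and 111 but disagree on 1BC, which is exactly the mixed case the second answer is guarding against.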
I know of the M function Text.Proper, which capitalizes every word in a sentence. However, I want to know how I can capitalize only the first word of a sentence.
Something along the lines of the following. You didn't specify any details.
= Table.AddColumn(Source, "Converted", each Text.Upper(Text.Middle([Column1],0,1))&Text.Middle([Column1],1,Text.Length([Column1])))
Try this, Excel style ;-)
let
Input = "text to capitalize",
Output = Text.Upper(Text.Start(Input,1)) & Text.End(Input,Text.Length(Input)-1)
in
Output
There are a couple of decent answers already, but here's another option that demonstrates a couple more functions:
Text.Upper(Text.At([Text],0)) & Text.Range([Text], 1, Text.Length([Text]) - 1)
I would like to filter a specific value as well as blank values.
Example: Filter if the value is "VALUE" or ""
I tried this:
=filter({Summation!E2:K},match(Summation!D2:D,{$B$1,""},false))
And also tried this:
=filter({Summation!E2:K},or(match(Summation!D2:D,{$B$1},false),isblank(Summation!D2:D)))
But none of these work. How do I match blank values? I want all blank values as well as those with the value in B1.
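You could use something like =QUERY(B1:K,"Select * where B is null or B='VALUE'",0)
That selects all data in the range B1:K where column B is blank (B is null) OR where B is equal to VALUE (B='VALUE'). The two conditions have to be joined with or, since a cell cannot be blank and equal to VALUE at the same time.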
Replace B with whatever column contains the value you're trying to find.
I have a text input with '|' separator as
0.0000|25000| |BM|BM901002500109999998|SZ
which I split using PigStorage
A = LOAD '/user/hue/data.txt' using PigStorage('|');
Now I need to split the field BM901002500109999998 into separate fields based on position, say 0-2 = BM as Field1, and likewise for the rest.
So after this step I should get BM, 90100, 2500, 10, 9999998.
Is there any way to achieve this in a Pig script? Otherwise I plan to write a UDF that puts a separator at the required positions.
Thanks.
You are looking for SUBSTRING:
A = LOAD '/user/hue/data.txt' using PigStorage('|');
B = FOREACH A GENERATE SUBSTRING($4,0,2) AS FIELD_1, SUBSTRING($4,2,7) AS FIELD_2, SUBSTRING($4,7,11) AS FIELD_3, SUBSTRING($4,11,13) AS FIELD_4, SUBSTRING($4,13,20) AS FIELD_5;
The output would be:
dump B;
(BM,90100,2500,10,9999998)
You can find more info about this function here.
I think it will be much more efficient to use the built-in UDF REGEX_EXTRACT_ALL.
You can get some idea of how to use this UDF from:
http://pig.apache.org/docs/r0.8.1/piglatin_ref2.html#REGEX_EXTRACT_ALL
STRSPLIT and REGEX_EXTRACT_ALL in PigLatin
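For the regex route, the pattern only needs one fixed-width capture group per field. Here is a small JavaScript sketch of that idea; the pattern itself is my assumption, derived from the widths in the question (2, 5, 4, 2, 7), and I have not run it through REGEX_EXTRACT_ALL in Pig.

```javascript
// The field from the question, 20 characters long.
const field = "BM901002500109999998";

// One capture group per fixed-width field: 2 + 5 + 4 + 2 + 7 = 20 characters.
const fixedWidthPattern = /^(.{2})(.{5})(.{4})(.{2})(.{7})$/;

const m = fixedWidthPattern.exec(field);
// m[0] is the whole match, so the captured fields start at index 1.
const parts = m ? m.slice(1) : [];
console.log(parts); // [ 'BM', '90100', '2500', '10', '9999998' ]
```

Because the pattern is anchored with ^ and $, a value that is not exactly 20 characters long does not match at all, which you may or may not want compared with SUBSTRING.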
Sorry for this extreme beginner question. I have a string variable originaltext containing some multiline text. I can convert it into an array of lines like so:
lines = originaltext.split("\n");
But I need help sorting this array. This DOES NOT work:
lines.sort;
The array remains unsorted.
And an associated question. Assuming I can sort my array somehow, how do I then convert it back to a single variable with no separators?
Your only issue is a small one - sort is actually a method, so you need to call lines.sort(). In order to join the elements together, you can use the join() method:
var originaltext = "This\nis\na\nline";
var lines = originaltext.split("\n"); // array of lines
lines.sort();                         // sorts the array in place
var joined = lines.join("");          // back to one string, no separators
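One thing to watch: with no arguments, sort() compares strings by UTF-16 code units, so uppercase letters sort before lowercase ones. Here is a short sketch of the difference, using made-up sample text and a comparator built on localeCompare; the "en" locale and the sensitivity option are assumptions, so pick whatever suits your data.

```javascript
// Made-up sample text with mixed case.
const text = "pear\nApple\nbanana\nZebra";
const lines = text.split("\n");

// Default sort: code-unit order, so "Zebra" lands before "banana".
const codeUnitOrder = [...lines].sort();

// Comparator sort: alphabetical order, ignoring case.
const alphabetical = [...lines].sort((a, b) =>
  a.localeCompare(b, "en", { sensitivity: "base" })
);

// join("") glues the lines together with nothing in between;
// join("\n") rebuilds multiline text instead.
const noSeparator = alphabetical.join("");
const backToLines = alphabetical.join("\n");

console.log(codeUnitOrder); // [ 'Apple', 'Zebra', 'banana', 'pear' ]
console.log(alphabetical);  // [ 'Apple', 'banana', 'pear', 'Zebra' ]
console.log(noSeparator);   // ApplebananapearZebra
```

The spread copies ([...lines]) are only there so that both orderings can be shown from the same input, since sort() modifies the array it is called on.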