How can I write a Google Sheets QUERY function that properly sorts a list whose items have a letter and a number?

In Google Sheets, I have a ton of data that needs to be sorted like this: P1, P2, P3, etc., using a QUERY function. When I use an "ORDER BY" clause in my QUERY formula, it returns the list in the wrong order, putting P10 right after P1, as shown below.
I got the list returned like this...
P1
P10
P2
P3
etc.
How can I get it to sort properly so that P10 comes after P9 and so forth?
Thank you!

What's happening here is entirely correct, it's just the consequence of how lexicographic ordering works with digits when the strings aren't all the same length. To get around this you need to split the column in two (one just containing the alphabetic portion, the other containing the numeric portion) and sort by both of those columns, OR use string manipulation to add padding zeroes to the number portion (i.e. make P1 into P01). There are various ways of doing all the above, and without any context it's difficult to state which is the most appropriate way for your needs.
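Both workarounds can be sketched outside of Sheets. The Python below is only an illustration of the idea, not a Sheets formula; the helper names `natural_key` and `pad_number` are made up for this example, and it assumes every label is a run of letters followed by a run of digits.

```python
import re

def natural_key(label):
    """Return a sort key that compares the numeric part of 'P10' as a number."""
    match = re.fullmatch(r"([A-Za-z]*)(\d+)", label)
    if match is None:
        # Assumption: labels without the letters+digits shape sort last, as plain text.
        return (1, label, 0)
    prefix, digits = match.groups()
    return (0, prefix, int(digits))

def pad_number(label, width=2):
    """Zero-pad every run of digits, so 'P1' becomes 'P01'."""
    return re.sub(r"\d+", lambda m: m.group().zfill(width), label)

labels = ["P1", "P10", "P2", "P3", "P9"]
print(sorted(labels))                   # plain text order: P10 lands right after P1
print(sorted(labels, key=natural_key))  # split into letters and number: P10 lands after P9
print(sorted(labels, key=pad_number))   # padded keys give the same numeric order
```

The padding width has to cover the longest number in the data, which is why splitting into two sort keys is usually the more robust of the two.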

You may try wrapping your working QUERY in this SORT/REGEXEXTRACT part to get your expected output style.
=LAMBDA(z,SORT(z,--REGEXEXTRACT(INDEX(z,,1),"\d+"),1))(QUERY({A2:B13},"Select * ORDER BY Col1"))
Replace the 1 in INDEX(z,,1) with the number of the column that holds the P values in your query output.

Multiple Search Key in a Matrix

I've been trying to solve this problem for some days now, but it seems I have reached a dead end. Maybe someone can help me.
I have two sheets. The first one contains the list of my clients and their delivery numbers, which depend on the weekday.
In my second sheet I would like to get the delivery number of the client (red cells) depending on the weekday I select (yellow cells).
I tried a VLOOKUP formula, INDEX/MATCH and QUERY, but I wasn't able to find a way to get the delivery number based on the client's name and the weekday. I think the main issue is that in the first sheet the weekday is a column title.
Maybe the solution is simply to build my tables differently...
Thank you for your help
You can try something like this, assuming A2 holds the first name and B2 the first day to look up:
=INDEX(Sheet1!$1:$1000,MATCH(A2,Sheet1!$A:$A,0),MATCH(B2,Sheet1!$1:$1,0))
Or, if you want this same formula for the full column:
=byrow(A2:A,lambda(each,if(each="","",INDEX(Sheet1!$1:$1000,MATCH(each,Sheet1!$A:$A,0),MATCH(offset(each,0,1),Sheet1!$1:$1,0)))))
This is also doable (and perhaps more simply) with MAP/FILTER. With your 'Caption 1' table in Sheet1!A1:D4 and your 'Caption 2' table at the top-left of Sheet2, the following in Sheet2!C2 gives you the delivery number for as many names/days as you enter in the columns alongside:
=map(A2:A,B2:B,lambda(name,day,ifna(filter(filter(Sheet1!B2:D4,Sheet1!A2:A4=name),Sheet1!B1:D1=day))))
N.B. The IFNA blanks out errors for those rows where a Name/Day pair hasn't been entered yet. Extend the ranges in the filter to suit your real data.
All you need is a simple VLOOKUP:
=INDEX(IFNA(VLOOKUP(A9:A11&B9:B11,
SPLIT(FLATTEN(A2:A4&B1:D1&"​​"&B2:D4), "​​"), 2, )))
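All of the formulas above do the same two-way lookup: find the row by client name, find the column by weekday header, and read the cell where they cross. A minimal Python sketch of that logic follows; the table contents and the function name `lookup_delivery` are invented for illustration and are not taken from the asker's sheet.

```python
def lookup_delivery(table, name, day):
    """Return the cell where the row for `name` meets the column headed `day`."""
    header, *rows = table
    try:
        # Start at index 1 so the name column itself is never matched as a day.
        col = header.index(day, 1)
    except ValueError:
        return None  # unknown weekday, like IFNA blanking out an error
    for row in rows:
        if row[0] == name:
            return row[col]
    return None  # unknown client

# Hypothetical data: first row holds the weekday titles, first column the clients.
table = [
    ["Client", "Monday", "Tuesday", "Wednesday"],
    ["Alice", 1, 4, 7],
    ["Bob", 2, 5, 8],
    ["Carol", 3, 6, 9],
]

print(lookup_delivery(table, "Bob", "Tuesday"))
print(lookup_delivery(table, "Dave", "Monday"))
```

Returning `None` for a missing name or day plays the role that IFNA plays in the formulas.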

Automatically update formula after adding columns

My issue has been dealt with in here, but I can't think it through and apply it to my formula:
How to automatically update formula when inserting columns.
I would really appreciate some help here guys. Thank you!!!
This is the formula I got:
=IF($A$14=NAMES!$W2;$D15;
IF($A$14=NAMES!$W3;$G15;
IF($A$14=NAMES!$W4;$J15;
IF($A$14=NAMES!$W5;$M15;
IF($A$14=NAMES!$W6;$P15;
0)))))
After inserting three more columns it has to go to:
=IF($A$14=NAMES!$W2;$D15;
IF($A$14=NAMES!$W3;$G15;
IF($A$14=NAMES!$W4;$J15;
IF($A$14=NAMES!$W5;$M15;
IF($A$14=NAMES!$W6;$P15;
IF($A$14=NAMES!$W7;$S15;
0))))))
Each time 3 more columns are added to the right, the formula looks up the value in $A$14; if it matches NAMES!$W(i), where the row i increments by 1, it returns a specific column in the same row 15, be it $D15, $G15 or $J15, always jumping three columns.
The 3 columns are automatically inserted via a trigger based on date/time in GAS but I was not able to make the formula automatically update via GAS.
I'm not even sure if this is possible.
Please help!
Thank you!
I write here again because the comment is too short:
You're right!
Here's the link to the Sheet with editor access.
https://docs.google.com/spreadsheets/d/1Qkb96SgZ4dLRBebgxEiNkYiZ4sTtps-ZbBtBtE4J5os/edit?usp=sharing
So basically, whenever a new year is created or 3 more columns are added, the formulas in C15, C16 and C17 should be updated automatically from
C15 =IF($A$14=Dates!$A2;$D15;IF($A$14=Dates!$A3;$G15;0))
to =IF($A$14=Dates!$A2;$D15;IF($A$14=Dates!$A3;$G15;IF($A$14=Dates!$A4;$J15;0)))
C16 =IF($A14=Dates!$A2;$D16+$E15+$B13;IF($A14=Dates!$A3;$G16+$H15;0))
to =IF($A14=Dates!$A2;$D16+$E15+$B13;IF($A14=Dates!$A3;$G16+$H15;IF($A14=Dates!$A4;$J16+$K15;0)))
C17 =IF($A14=Dates!$A2;$D17;IF($A14=Dates!$A3;$D17+$G17;0))
to =IF($A14=Dates!$A2;$D17;IF($A14=Dates!$A3;$D17+$G17;IF($A14=Dates!$A4;$D17+$G17+$J17;0)))
and so on...
Daniel, I've had a try, and have found a way to get part of your answer. I believe the rest of your problem could be solved in the same way. Look at tab Test-GK in your sample sheet. Your original formula is in C15; mine is in C19.
Both update as you change the value in B14.
I've used a formula to calculate which cell is needed to be used, instead of lots of IF statements. The formula is as follows.
=INDIRECT(ADDRESS(15;(MATCH(A14;Dates!$A$2:$A$15;0))*3+1;4))
This uses MATCH to obtain an 'offset' value for your date range from the values in your Dates sheet.
Then it multiplies this by 3 and adds 1 to jump to the right column.
So for the first date range in your list, "a 31 de diciembre 2018", MATCH returns "1" (first date range in the list); times 3 plus 1 equals "4", which ADDRESS converts to column "D". The row is always 15 using your sheet.
So ADDRESS(15;4;4) returns D15.
And INDIRECT gets the value of D15.
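The arithmetic behind the MATCH/ADDRESS step can be checked with a small Python sketch. It is only a model of the calculation, not something that runs in Sheets; `column_letter` and `target_cell` are hypothetical helper names, and the row (15), step (3) and offset (1) are the values described above.

```python
def column_letter(index):
    """Convert a 1-based column number to A1-style letters (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters

def target_cell(position, row=15, step=3, offset=1):
    """Map the MATCH position to the cell that the INDIRECT/ADDRESS formula reads."""
    return f"{column_letter(position * step + offset)}{row}"

# Positions 1..6 correspond to the first six date ranges in the list.
for position in range(1, 7):
    print(position, target_cell(position))
```

Because the column is computed from the position, no formula text has to change when the script inserts three more columns; only the list of date ranges grows.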
I think the same principle could be used for the formula in C16. But I'm not as clear what that formula is doing. Does your script put a value into B13?
Let me know if this looks useful.
Update#1. I've got the formula (I think) to replace your formula in C16 as well now. Except for the third term, adding B$13, in only the first case? If necessary, this could be handled with one IF statement, to check if the data range is "a 31 de diciembre 2018".
Let me know how the B13 value is used...
Update #2. Also added a formula to replace your IFs in C17. Effectively the same, but used a CHOOSE function to select which terms get added together, instead of an IF. Maybe a bit clearer to maintain if you add more years to the date range.
Try this formula in C17 - double check the logic.
=CHOOSE(MATCH(A14;Dates!$A$2:$A$15;0); D17; D17+G17; D17+G17+J17; D17+G17+J17+M17; D17+G17+J17+M17+P17; D17+G17+J17+M17+P17+S17; D17+G17+J17+M17+P17+S17+V17; D17+G17+J17+M17+P17+S17+V17+Y17)
It's good to 2025, but if you need another five years or so, you can just make it longer. There is perhaps a way to build it dynamically, but I didn't have the time now for that.
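As a rough idea of what "building it dynamically" could mean, the CHOOSE terms are just a running total over every third column, starting at D, up to the matched position. The Python sketch below models that; the row values are made up, the list index 0 stands for column A, and `cumulative_total` is a hypothetical name.

```python
def cumulative_total(row_values, position, first=3, step=3):
    """Sum every `step`-th cell, starting at 0-based index `first`, for `position` blocks."""
    return sum(row_values[first + step * i] for i in range(position))

# Hypothetical row 17: index 0 is column A, so D, G and J are indexes 3, 6 and 9.
row_17 = list(range(1, 27))

print(cumulative_total(row_17, 1))  # D17 only
print(cumulative_total(row_17, 2))  # D17 + G17
print(cumulative_total(row_17, 3))  # D17 + G17 + J17
```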

How to get the sum of values of a column in tmap?

I have 2 columns: Matches (Integer) and Accounts_type (String). I want to create a third column holding the proportion of matches played by each account type. I am new to Talend and have been stuck on this for the past 2 days; I did a lot of research, but to no avail. Please help.
You can do it like this:
You need to read your source data twice (I used tFixedFlowInput_1 and tFixedFlowInput_2 with the same data). The idea is to calculate the total of your matches in tAggregateRow_1, it simply does a sum of all Matches without a group by column, then use that as a lookup.
The tMap then joins your source data with the calculated total. Since the total will always be one record, you don't need any join column. You then simply divide Matches by Total as required.
This is supposing you have unique values in Account_type; if you don't, you need to add another tAggregateRow between your source and tMap_1, in order to get sum of Matches for each Account_type (group by Account_type).
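The data flow described above (aggregate once to get the total, then join that single total back onto every row and divide) can be sketched in Python. This is not Talend code; the sample rows and the name `add_proportions` are invented for illustration, and it assumes each account type appears only once.

```python
def add_proportions(rows):
    """Append Matches / Total to each (matches, account_type) row."""
    # Equivalent of the aggregate step: one sum over all rows, no group-by.
    total = sum(matches for matches, _ in rows)
    if total == 0:
        # Assumption: with no matches at all, report every proportion as 0.
        return [(matches, account_type, 0.0) for matches, account_type in rows]
    # Equivalent of the join step: every row is paired with the single total.
    return [(matches, account_type, matches / total) for matches, account_type in rows]

# Hypothetical source data.
source = [(30, "free"), (50, "premium"), (20, "trial")]

for matches, account_type, proportion in add_proportions(source):
    print(account_type, matches, round(proportion, 2))
```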

Best datatype to store postal codes in oracle

I'm new to Oracle and am using Oracle 11g. I'm storing UK postal codes. The values look like these:
N22 5HF
SW1 4JD
N14 8IT
N22 1JT
E1 5DP
e1 8DS
E3 8TU
I should be able to easily compare first four characters of each postal code.
What is the best data type to store these data ?
As a slight variation on Lalit's answer, since you want the outward code rather than a fixed substring of the first four characters (which could include a space and the start of the inward code), you can create a virtual column based on the first word of the value:
postcode varchar2(8),
outward_code generated always as
(substr(postcode, 1, instr(postcode, ' ', 1, 1) - 1))
And optionally, but probably if you're using this to search, an index on the virtual column.
This assumes the post codes are formatted properly in the first place. It won't work if you don't always have the space between the outward and inward codes. And to answer your original question, the actual post code should be a varchar2(8) column to hold alphanumeric values up to the maximum size and with the standard format.
SQL Fiddle demo.
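To see what the virtual column expression computes, here is a small Python model of the same "everything before the first space" rule. It is an illustration only, not Oracle code; `outward_code` is a hypothetical name, and the trimming and upper-casing are extra assumptions added here because the sample data contains a lower-case value.

```python
def outward_code(postcode):
    """Return the part of a UK postcode before the first space, or None if there is none."""
    cleaned = postcode.strip().upper()  # assumption: normalise case and outer spaces
    space = cleaned.find(" ")
    if space <= 0:
        # No separating space: the SQL expression would yield NULL here as well.
        return None
    return cleaned[:space]

for value in ["N22 5HF", "SW1 4JD", "E1 5DP", "e1 8DS", "N225HF"]:
    print(repr(value), "->", outward_code(value))
```

Note how "E1 5DP" gives a two-character outward code, which is why a fixed four-character substring does not work for every value.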
I should be able to easily compare first four characters of each postal code.
Then keep these first four characters in a separate column. And index this column. You could keep the other characters in different column. Now, if the codes are a mixture of alphanumeric characters, then you are left with VARCHAR2 data type.
Your query predicate would look like this:
WHERE post_code_col = substr('N22 5HF', 1, 4)
Thus the indexed column post_code_col would be efficient in performance.
On 11g, you have the option to create a virtual column. However, indexing it would be equivalent to a function-based index, so I would prefer the first way I suggested above.
It is better to normalize the table during the design phase, else the issues would start creeping in later.
In my opinion you should use the varchar2 data type, because this field is not going to be used in mathematical calculations (so it should not be int or decimal) and the values are not big enough to need a text type.

How to sort only those rows which have no blank cell?

I have a Google Spreadsheet with two separate sheets. The first one is just a big list of names and data, and the second is supposed to be a sorted listing of all the data on the first sheet (sorted by, say, last name). Here is the current way I am defining the second sheet:
=sort(sheet1!A2:L100, sheet1!D2:D100, TRUE)
This works fine for the most part, except for one issue: in sheet1, some of the cells in the 4th column (column D) are blank. How can I change the formula so that the sort ignores rows that have a blank cell in column D?
The formulas I tried, which gave undesirable results:
=arrayformula(if(istext(sheet1!D2:D100), sort(sheet1!A2:L100, sheet1!D2:D100, true), ""))
It sorted as desired but with one issue: the blank cells were not pushed to the end but scattered in between the rows.
=arrayformula(sort(filter(sheet1!A2:L100, istext(sheet1!D2:D100)),sheet1!D2:D100, true))
The filter part does its job perfectly, but when coupled with sort it gives an error: Mismatched range lengths.
To filter out the rows with blank cells in column D, you could do something like #2, but as the error message suggested, the second argument would need to be filtered as well to ensure the ranges are the same length. Fortunately there is an easier way, and that is to use column indices rather than ranges:
=SORT(FILTER(sheet1!A2:L100;ISTEXT(sheet1!D2:D100));4;TRUE)
Alternatively you can use the QUERY function for this sort of thing:
=QUERY(sheet1!A2:L100;"select * where D != '' order by D";0)
For anyone looking at this: the accepted answer works great for filtering out cells that are truly blank, but if the cells contain formulas that evaluate to blank (""), ISTEXT will evaluate to TRUE and those blanks will not be filtered out. I modified the accepted answer slightly to work in my situation, in which I had cells containing formulas (that evaluated to "") that I wanted to filter out:
=SORT(FILTER(sheet1!A2:L100,sheet1!D2:D100 <> ""),4,TRUE)
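The order of operations matters here: filter first, then sort the filtered rows by a column index, so the sort never sees a range of a different length. A minimal Python sketch of the same idea follows; the rows and the name `sort_non_blank` are made up for illustration, and both a missing value and an empty string are treated as blank.

```python
def sort_non_blank(rows, key_index):
    """Drop rows whose key cell is blank, then sort the remaining rows by that cell."""
    kept = [row for row in rows if row[key_index] not in ("", None)]
    return sorted(kept, key=lambda row: row[key_index])

# Hypothetical rows; index 3 plays the role of column D.
rows = [
    ["r1", "x", 1, "Smith"],
    ["r2", "y", 2, ""],       # formula that evaluated to ""
    ["r3", "z", 3, "Adams"],
    ["r4", "w", 4, None],     # truly empty cell
    ["r5", "v", 5, "Jones"],
]

for row in sort_non_blank(rows, 3):
    print(row)
```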
