Query referencing 20 sheets / Indirect error with multiple ranges - google-sheets-formula

I have 20 sheets (Eagle, Kestrel etc.) and want to reference the whole group of them in different queries.
To stop the query formula text becoming massive, I have tried the Indirect function, but it looks like Indirect cannot return multiple ranges.
Example for just 2 sheets:
Query({Indirect(A1)}) where A1 contains the text Eagle!F3:I33;Kestrel!F3:I33
gives Indirect error "not a valid cell/range reference".
The 2 formulas below work OK but become unwieldy when referencing 20 sheets.
Query({Eagle!F3:I33;Kestrel!F3:I33})
Query({indirect(A2); indirect(A3)}) where A2 is Eagle!F3:I33 and A3 is Kestrel!F3:I33
Suggestions please (no script).
Challenge2 = How to include sheet name (bird) in Col1 of query output. Sheet name (bird) is written in cell A1 of each sheet.

Here is the solution that I settled on.
Problem summary
Challenge 1: Avoid oversized query formula when referencing many sheets/tabs.
Challenge2: Return sheet name as part of the query output.
Key information
Script is not an option as it causes access and performance issues for users in my organisation.
The Indirect function cannot pull multiple ranges into a Query.
There is no function that returns sheet names (except within Script).
I started with a static list of sheet names.
Each sheet contains Name and Total data, but needs to be tagged with sheet name to identify it in output of query. Each sheet also included the sheet name in cell A1 (but not used in solution).
Solutions
Solution to Challenge 1: Specify the unique sheet ranges & select statements within hidden helper columns then reference them in the query.
Solution to Challenge 2: Insert sheet name as text within each select statement.
=query(
{query({indirect(B4)},C4);query({indirect(B5)},C5);
query({indirect(B6)},C6);query({indirect(B7)},C7);
query({indirect(B8)},C8);query({indirect(B9)},C9);
query({indirect(B10)},C10);query({indirect(B11)},C11);
query({indirect(B12)},C12);query({indirect(B13)},C13);
query({indirect(B14)},C14);query({indirect(B15)},C15);
query({indirect(B16)},C16);query({indirect(B17)},C17);
query({indirect(B18)},C18);query({indirect(B19)},C19);
query({indirect(B20)},C20);query({indirect(B21)},C21);
query({indirect(B22)},C22);query({indirect(B23)},C23)}
,"where Col3 >="&F2 &B2 ,0)
useful screen shot - helper columns and output
Cells F2 & B2 are user defined. F2 is the minimum value to return. B2 relates to ordering of output.
B2 creates an extra bit of text for select statement, depending on user defined dropdown in E2.
=if(E2="order by lap count"," order by Col3 desc",)
The ,0 at the end of the final wraparound query is the optional query header row clause. Zero tells query that the input data has no headers. Necessary for this query.
The curly brackets inside each sheet query convert column names F, G, H to Col1, Col2, Col3.
The curly brackets and semicolons in the final wraparound query combine the sheet query outputs into an array, one underneath the other.
Top Tip: When referencing multiple sheets/tabs in a query, it is better to create a wraparound query (as above) to filter the output. If you filter the individual sheet queries instead and one of them returns no data, the curly brackets in the wraparound query return an array error.
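Typing twenty nested query calls by hand is error-prone. As a rough sketch, the formula text can be generated outside the spreadsheet and pasted in. The Python below assumes the layout described above (range text in column B, select statements in column C, rows 4 to 23). The function names are hypothetical, and nothing here runs inside Google Sheets.

```python
def helper_ranges(sheet_names, cells="F3:I33"):
    """Text for helper column B: one range reference per sheet."""
    return [f"{name}!{cells}" for name in sheet_names]


def sheet_query(row):
    """One inner query: range text from B<row>, select statement from C<row>."""
    return f"query({{indirect(B{row})}},C{row})"


def wraparound_formula(first_row, last_row):
    """Stack the inner queries with semicolons and filter once in an outer query."""
    inner = ";".join(sheet_query(r) for r in range(first_row, last_row + 1))
    return '=query({' + inner + '},"where Col3 >="&F2&B2,0)'


if __name__ == "__main__":
    print(helper_ranges(["Eagle", "Kestrel"]))
    print(wraparound_formula(4, 23))
```

The printed formula is one line; line breaks can be added after the semicolons when pasting it into the sheet.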

Related

Google Sheet Query: Select misses data when there are different data type in a column?

I have a table like this:
a    b    c
1    2    abc
2    3    4.00
Note that C2 is text while C3 is a number.
When I do
=QUERY(A1:C,"select *")
The result is like
a    b    c
1    2
2    3    4.00
The "text" in C2 has been missed. You can see the live sheet here:
https://docs.google.com/spreadsheets/d/1UOiP1JILUwgyYUsmy5RzQrpGj7opvPEXE46B3xfvHoQ/edit?usp=sharing
How to deal with this issue?
QUERY is very useful, but it has one main limitation: it can handle only one data type per column, and values of any other type come back blank. There are ways to try to work around this from inside QUERY, but I've found them unfruitful. Instead, you can just use:
={A:C}
You can work with FILTER on its own, but here is a step-by-step way to reproduce the main features of QUERY. To add conditions, use LAMBDA, INDEX and FILTER.
For example, to keep the rows where A is not empty (INDEX(quer,,1) accesses the first column):
=LAMBDA(quer,FILTER(quer,INDEX(quer,,1)<>""))({A:C})
Where B is greater than the value in one cell (D1) and less than the value in another (D2):
=LAMBDA(quer,FILTER(quer,INDEX(quer,,2)>D1,INDEX(quer,,2)<D2))({A:C})
To sort and limit the number of items, use SORTN. For example, to sort by the 3rd column and keep the 5 highest values in that column:
=LAMBDA(quer,SORTN(FILTER(quer,INDEX(quer,,1)<>""),5,1,3,0))({A:C})
Or, to limit to 5 elements without sorting use ARRAY_CONSTRAIN:
=ARRAY_CONSTRAIN(LAMBDA(quer,FILTER(quer,INDEX(quer,,1)<>""))({A:C}),5,3)
There are other options too: you can use REGEXMATCH and similar functions to emulate QUERY's features without losing data. Let me know!
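To make the limitation concrete, here is a rough Python model of the rule that QUERY's documentation describes: the majority data type decides the column's type, and minority values are treated as null. This is only an illustration, not how Sheets is implemented. The sample column is hypothetical, and how Sheets breaks a tie between types is not modelled.

```python
from collections import Counter


def kind(value):
    """Classify a cell as 'number', 'text' or 'blank' for this sketch."""
    if value is None or value == "":
        return "blank"
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return "number"
    return "text"


def query_like_column(values):
    """Keep cells of the column's majority type; replace all others with None."""
    counts = Counter(kind(v) for v in values if kind(v) != "blank")
    if not counts:
        return list(values)
    majority = counts.most_common(1)[0][0]
    return [v if kind(v) == majority else None for v in values]


if __name__ == "__main__":
    # Hypothetical column: mostly numbers, one text value.
    print(query_like_column([1, 2, "abc", 4.0]))
```

The text value is the one that disappears, which is the behaviour the FILTER-based formulas above avoid.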
shenkwen,
If you are comfortable with adding a Google Apps Script to your sheet to give you a custom function, I have a QUERY replacement function that supports all standard SQL SELECT syntax. It does not analyze the column data to force each column to whichever type is most common, so this is not an issue.
The custom function code is a single file, available at:
https://github.com/demmings/gsSQL/tree/main/dist
After you save, you have a new function from your sheet. In your example, the syntax would be
=gsSQL("select a,b,c from testTable", {{"testTable", "F150:H152", 60, true}})
If your data is on a separate tab called 'testTable'(or whatever you want), the second parameter is not required.
I have typed in your example data into my test sheet (see line 150)
https://docs.google.com/spreadsheets/d/1Zmyk7a7u0xvICrxen-c0CdpssrLTkHwYx6XL00Tb1ws/edit?usp=sharing

Xpath, fetching table with text and images in Google sheets

I'm trying to parse this table into Google Sheets: https://exvius.gamepedia.com/Chaining/Bolting_Strike
I also want the title text from the cells that contain images.
I can't figure out how to get the text from the full table as well as img/@alt where it's available. I can get the table with
=IMPORTXML("https://exvius.gamepedia.com/Chaining/Bolting_Strike","//table[@class='wikitable']/tbody/tr[position()>=3]")
And only the image texts with
=IMPORTXML("https://exvius.gamepedia.com/Chaining/Bolting_Strike","//table[@class='wikitable']/tbody/tr[position()>=3]/td/a/img/@alt")
But I can't seem to do both. Is that a limitation of Google Sheets IMPORTXML?
I've tried OR and other boolean operators with no luck. I also tried axes, but that was a no-go for me too.
I propose something like this :
Sheet
Description:
In B1 we have the url of the webpage.
In B3 we have the following formula to import the first part of the table :
=QUERY(IMPORTHTML(B1;"table";1);"select Col1,Col2,Col3 OFFSET 2";0)
Columns L to O contain the following formulas to get the element names and the ability names (which will be used as a key in a VLOOKUP step). There are 4 formulas because an ability can have 2 element names. In L3, M3, N3 and O3 we have:
=IMPORTXML(B1;"//td/a[1]/img[@srcset]/ancestor::td[1]/preceding::a[1][@title]")
=IMPORTXML(B1;"//td/a[1]/img[@srcset]/@alt")
=IMPORTXML(B1;"//td/a[2]/img[@srcset]/ancestor::td[1]/preceding::a[1][@title]")
=IMPORTXML(B1;"//td/a[2]/img[@srcset]/@alt")
Formula in E4 is a one liner where the results of 2 VLOOKUP are merged together. We use VLOOKUP to pair each ability name with an element.
=ARRAYFORMULA(REGEXREPLACE(ARRAYFORMULA(IFERROR(VLOOKUP(C4:INDIRECT("C"&COUNTA(C:C)+2);L:M;2;FALSE);"")&"|"&ARRAYFORMULA(IFERROR(VLOOKUP(C4:INDIRECT("C"&COUNTA(C:C)+2);N:O;2;FALSE);"")));"^\||\|$";""))
To finish, in H3 we have the last part of the table :
=QUERY(IMPORTHTML(B1;"table";1);"select Col5,Col6 OFFSET 2";0)
The rest (colours, borders, ...) is standard and conditional formatting.
Side note: I'm based in Europe, so you might have to change ; to , in the formulas.
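Outside Sheets, the same "cell text, or image alt where there is an image" idea can be sketched in a few lines of Python. The markup below is a small hypothetical stand-in for the wiki table (the rows and element names are made up), and the parser is the standard library's ElementTree, which only accepts well-formed markup. Element names are joined with | as in the E4 formula.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the real page's table.
MARKUP = """
<table class="wikitable">
  <tr><td>Ability A</td><td><a><img alt="Lightning"/></a></td><td>8</td></tr>
  <tr><td>Ability B</td><td><a><img alt="Fire"/></a><a><img alt="Wind"/></a></td><td>5</td></tr>
</table>
"""


def cell_value(td):
    """Return the alt texts of any images in the cell, else the cell's text."""
    alts = [img.get("alt") for img in td.iter("img") if img.get("alt")]
    if alts:
        return "|".join(alts)
    return "".join(td.itertext()).strip()


def table_rows(markup):
    """Parse the table and return one list of cell values per row."""
    root = ET.fromstring(markup.strip())
    return [[cell_value(td) for td in tr.findall("td")] for tr in root.iter("tr")]


if __name__ == "__main__":
    for row in table_rows(MARKUP):
        print(row)
```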

Linq to Excel ignoring header rows and using subheaders

I'm looking at Linq to Excel tutorials and they all seem pretty simple and straightforward, except that all of them assume the Excel table being used has its column headers neatly placed on row 1, starting at column A.
I need to query data from Excel files where the tables not only start around row 6 (some may start on lower rows) but also have headers and subheaders (headers represent a specific place/company; subheaders represent column values for that place like id, stock remaining, sales made, etc.).
Is there any way to tell the query which row holds the headers I want to use, so that it only takes information from beneath them?
Can you just skip the number of rows you don't care about?
rows.Skip(1).Select(r => // Rest of your stuff here...
Better yet, query the appropriate range from the start like the LinqToExcel wiki suggests:
// Selects data within the B3 to G10 cell range
var indianaCompanies = from c in excel.WorksheetRange<Company>("B3", "G10")
                       where c.State == "IN"
                       select c;
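The "take only what is beneath a chosen header row" step is independent of LinqToExcel. As a rough illustration in plain Python (not LinqToExcel code), the sketch below assumes the worksheet has already been read into a list of rows. The sample data and the function name are hypothetical.

```python
def records_below_header(rows, header_index):
    """Use rows[header_index] as column names and return the rows beneath it as dicts."""
    header = rows[header_index]
    records = []
    for row in rows[header_index + 1:]:
        # Skip rows that are entirely empty.
        if not any(cell not in ("", None) for cell in row):
            continue
        records.append(dict(zip(header, row)))
    return records


if __name__ == "__main__":
    # Hypothetical sheet: a title row, a blank row, then the subheader row at index 2.
    sheet = [
        ["Monthly report", "", ""],
        ["", "", ""],
        ["id", "stock", "sales"],
        [1, 40, 12],
        ["", "", ""],
        [2, 15, 30],
    ]
    print(records_below_header(sheet, 2))
```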

Dynamic Sheet Name in Query in Google Spreadsheet

In Google spreadsheet I want to query data in another sheet but the problem is that the name of sheet is present in a cell. So is there a way in QUERY function to dynamically mention sheet name. Basically I am trying to do something like this but with dynamic sheet name:
=QUERY('2012'!A2:F;"select C, sum(F) where A='December' group by C order by sum(F) desc")
I tried to do this but I get Parse Error:
=QUERY(INDIRECT("Overview!L5")!A2:F;"select C, sum(F) where A='December' group by C order by sum(F) desc")
Here Overview!L5 is the cell holding the name of the sheet to query. I also tried to concatenate quotes around INDIRECT, but that didn't work either.
I think it is quite evident what I am trying to do from the query i.e. get sum of values in cells grouped by values in other cells.
the INDIRECT looks to be the problem.
Try like this:
=query(INDIRECT(A1&"!A5:A10"),"select Col1")
i.e. if Cell A1 contains "food" the above is the same as:
=query(food!A5:A10,"select A")
and the same as:
=query(INDIRECT("food!A5:A10"),"select *")
Note: the indirect uses "Col1" etc and not "A" because it does not pass the col letters.
Also ... The google groups forum might be a good place to look for spreadsheet formula answers. productforums.google.com/forum/#!categories/docs/spreadsheets
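The easiest way to use dynamic structures in a query is not to build the strings inside the query itself but to prepare them in separate cells, for instance A1 for the range address and B1 for the query arguments, and then just use QUERY(INDIRECT(A1);B1). The INDIRECT is needed because A1 holds the address as text.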
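One detail to watch when the sheet name comes from a cell: names that contain spaces or punctuation need single quotes around them in the range text, as in '2012'!A2:F in the question. The small Python sketch below shows the quoting rule as I understand it, with embedded apostrophes doubled. The function name is hypothetical, and the result is just the text you would hand to INDIRECT.

```python
def range_text(sheet_name, cells):
    """Build A1-notation text such as 'My Sheet'!A2:F, quoting the sheet name."""
    escaped = sheet_name.replace("'", "''")  # an apostrophe inside the name is doubled
    return f"'{escaped}'!{cells}"


if __name__ == "__main__":
    print(range_text("2012", "A2:F"))
    print(range_text("Food budget", "A5:A10"))
```

In the sheet itself, the equivalent would be along the lines of INDIRECT("'"&A1&"'!A5:A10").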

a bit of a string matching conundrum in excel-vba

I'm writing a program at work for a categorizing issue.
I get data in the form of CODE, DESCRIPTION, SUB-TOTAL, for example:
LIQ013 COGNAC 25
LIQ023 VODKA 21
FD0001 PRETZELS 10
PP0502 NAPKINS 5
Now it all generally follows something like this...the problem is my company supplies numerous different bars. So there are like 800 records a month with data like this. My boss wants to breakdown the data so she knows how much we spend on a certain category each month. For example:
ALCOHOL 46
FOOD 10
PAPER 5
What I've thought of is I setup a sort of "data-base" which is really a csv text file that contains entries like this:
LIQ,COGNAC,ALCOHOL
LIQ,VODKA,ALCOHOL
FD,PRETZELS,FOOD
FD,POPCORN,FOOD
I've already written code that imports the database as a worksheet and separates each field into its own column. I want excel to look through the file and when it sees LIQ and COGNAC to assign it the ALCOHOL designator. That way I can use a pivot table to get the category sums. For example I want the final product to look like this:
LIQ013 COGNAC 25 ALCOHOL
LIQ023 VODKA 21 ALCOHOL
FD0001 PRETZELS 10 FOOD
PP0502 NAPKINS 5 PAPER
Does anyone have any suggestions? My worry is that a single point expression match to JUST the code i.e. just to LIQ without a match to COGNAC as well would maybe result in problems later when there are conflicting descriptions? I'd also like the user to be able to add ledger entries so that the database of recognized terms grows and becomes more expansive and hopefully more accurate.
EDIT
As per @Marc's request I'm including my solution:
code file
Please note that this is a pretty dumbed-down solution. I removed a bunch of the fail-safes and other bits of code that matter for robust code but not for our particular solution.
in order to get this to work there are two parts:
the first is the macro source code
the second is the actual file
because all the fail-safes are removed, the file needs to be imported to excel exactly the way it appears. i.e. Sheet1 on the googleDoc should be Sheet1 on the excel, start pasting data at cell "A1". before the macro is run, be sure to select cell "A1" in Sheet1. as i said, there are implementations in the finished product to make it more user friendly! enjoy!
EDIT2
These links suck. They don't paste well into excel.
If you're comfortable with it, I can email you the actual workbook, which would help preserve the formatting etc.
Use a lookup table in a separate sheet. Column A of the lookup sheet contains the lookup value (e.g. PRETZELS), Column B contains the category (FOOD, ALCOHOL, etc). In the cells where you want the category to show up in your original sheet (let's use D3 for the result where B3 holds the "PRETZELS" value), type this formula:
=VLOOKUP(B3,OtherSheet!$A$1:$B$500,2,FALSE)
That assumes that your lookup table is in range A1:B500 of a worksheet named "OtherSheet".
This formula tells Excel to find the lookup value (B3) in column A of your lookup and return the corresponding value from column B of your lookup table. Absolute references (the $) ensure that your formula won't increment cell references when you copy/paste the formula in other cells.
When you get new categories and/or inventory, you can update your lookup table in this one place by just adding new rows to it.
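The same lookup-then-sum logic can be sketched in Python, which may help when checking the spreadsheet results. The ledger rows are the ones from the question. The lookup table is keyed on the description, like the VLOOKUP above, and the NAPKINS entry is an assumption inferred from the desired output (it is not in the question's sample database). Unknown descriptions are tagged UNKNOWN here rather than raising an error, which is a choice made for this sketch.

```python
from collections import defaultdict

# Lookup table: description -> category (NAPKINS is an assumed entry).
LOOKUP = {
    "COGNAC": "ALCOHOL",
    "VODKA": "ALCOHOL",
    "PRETZELS": "FOOD",
    "POPCORN": "FOOD",
    "NAPKINS": "PAPER",
}

LEDGER = [
    ("LIQ013", "COGNAC", 25),
    ("LIQ023", "VODKA", 21),
    ("FD0001", "PRETZELS", 10),
    ("PP0502", "NAPKINS", 5),
]


def categorize(ledger, lookup):
    """Append a category to each (code, description, subtotal) row."""
    return [(code, desc, subtotal, lookup.get(desc, "UNKNOWN"))
            for code, desc, subtotal in ledger]


def category_totals(tagged):
    """Sum the subtotals per category, like the pivot table would."""
    totals = defaultdict(int)
    for _code, _desc, subtotal, category in tagged:
        totals[category] += subtotal
    return dict(totals)


if __name__ == "__main__":
    tagged = categorize(LEDGER, LOOKUP)
    print(tagged)
    print(category_totals(tagged))
```

Rows that come back as UNKNOWN are the ones that need a new entry in the lookup table.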
