How do I return multiple columns of data using ImportXML in Google Spreadsheets? - xpath

I'm using ImportXML in a Google Spreadsheet to access the user_timeline method in the Twitter API. I'd like to extract the created_at and text fields from the response and create a two-column display of the results.
Currently I'm doing this by calling the API twice, with
=ImportXML("http://twitter.com/status/user_timeline/matthewsim.xml?count=200","/statuses/status/created_at")
in the cell at the top of one column, and
=ImportXML("http://twitter.com/status/user_timeline/matthewsim.xml?count=200","/statuses/status/text")
in another.
Is there a way for me to create this display with a single call?

ImportXML supports using the xpath | separator to include as many queries as you like.
=ImportXML("http://url"; "//#author | //#catalogid| //#publisherid")
However it does not expand the results into multiple columns. You get a single column of repeating triplets (or however many attributes you've selected) as shown below in column A.
The following is deprecated
2015.06.16: continue is not available in "the new Google Sheets" (see: The Google Documentation for continue).
However you don't need to use the automatically inserted CONTINUE() function to place your results.
=CONTINUE($A$2, (ROW()-ROW($A$2)+1)*$A$1-B$1, 1)
Placed in B2 that should cleanly fill down and right to give you sane column data.
ImportXML is in A2.
A3 and below are how the CONTINUE() functions are automatically filled in.
A1 is the number of attributes.
B1:D1 are the attribute index for their columns.

Another way to convert the rows of =CONTINUE() into columns is to use transpose():
=transpose(importxml("http://url","//a | //b | //c"))

Just concatenate your queries with "|"
=ImportXML("http://twitter.com/status/user_timeline/matthewsim.xml?count=200","/statuses/status/created_at | /statuses/status/text")

I posed this question to the Google Support Forum and this is was a solution that worked for me:
=ArrayFormula(QUERY(QUERY(IFERROR(IF({1,1,0},IF({1,0,0},INT((ROW(A:A)-1)/2),MOD(ROW(A:A)-1,2)),IMPORTXML("http://example.com","//td/a | //td/a/#href"))),"select min(Col3) where Col3 <> '' group by Col1 pivot Col2",0),"offset 1",0))
Replace the contents of IMPORTXML with your data and query and see if that works for you. I
Apparently, this attempts to invoke the IMPORTXML function only once. It's a solution for now, at least.
Here's the full thread.

This is the best solution (NOT MINE) posted in the comments below. To be honest, I'm not sure how it works. Perhaps #Pandora, the original poster, could provide an explanation.
=ArrayFormula(iferror(hlookup(1,{1;ARRAY},(row(A:A)+1)*2-transpose(sort(row(A1:A2)+0,1,0)))))
This is a very ugly solution and doesn't even explain how it works. At least I couldn't get it to work due to multiple errors, like i.e. to much parameters for IF (because an array is used). A shorter solution can be found here =ArrayFormula(iferror(hlookup(1,{1;ARRAY},(row(A:A)+1)*2-transpose(sort(row(A1:A2)+0,1,0))))) "ARRAY" can be replaced with IMPORTXML-Function. This function can be used for as much XPATHS one wants. – Pandora Mar 7 '19 at 15:51
In particular, it would be good to know how to modify the formula to accommodate more columns.

Related

Google Sheets Query Sorted Results

I am writing a query function that I would like sorted. I have this figured out. What I cannot solve is attempting to insert 2 blank rows between the sorted results. Is this at all possible? Here is my query as it currently stands. Works perfectly as written. Just would like to have a 2 row gap between results.
Thank you.
=query('Form Responses 1'!A:CM,"Select B,C,D,I,L,AU,AX Where K = '"&Titles!B2&"' OR AW = '"&Titles!B2&"'Order by K,AW",0)
enter image description here
Since you've clarified your question to say that you want two blank rows, only to be added at a specific point in your query results, I've created another answer. But you haven't answered my question about what criteria you have for deciding where to insert these two lines, so I've made an assumption based on your data. This formula should be easily adapted if you have some other criteria for where to insert the blank lines. See my added tab HelpGK in yur sample sheet.
Try this formula, where you have your query:
={query('Data Sheet'!A:J,"Select B,C,D,E,F,H,I Where E='' and (G = 'ABC' OR J = 'ABC') Order by J,G",0);
{"","","","","","",""};
{"","","","","","",""};
query('Data Sheet'!A:J,"Select B,C,D,E,F,H,I Where E<>'' and (G = 'ABC' OR J = 'ABC') Order by J,G",0)}
Note that I don't believe your exact Desired Result can be obtained from your Data sheet. You are missing data rows. But this formula produces the result in the screenshot you provided.
EDIT:
Just realised that I created an answer for a single column of data.
Please share a sample sheet since this is not your case.
You can try this formula, but you'll need to replace this text "A4:A8" with your exact query:
=ArrayFormula(flatten(split(A4:A8 & "♥ ♥ ","♥",0,0)))
Note that the column with the "~" values in the blank cells is just to demonstrate what is happening.
If you had provided a sample sheet, with sample data, it would have been easier.
Let me know if this works for you. If not, please provide a sample sheet, with sample data, and let us know what else you need.
Be aware that the you can't put any values into the two inserted rows of blank cells, or the arrayformula will fail. But if you are planning to save the results of this as values, then pasted into place, then you could overwrite the blank cells, if desired.
Also be aware that FLATTEN is an undocumented function, and may possibly be removed from Sheets. If this is for a critical application, and alternative formula can be provided. Let us know if that is a concern.
I will delete my earlier answer, which mistakenly focused on a one column result.
This messy formula seems to handle multiple columns. If it works for you, I'll see if I can simplify it or clean it up. I can also provide details of the steps it is doing, if necessary. Replace the text "A4:C8" in this formula with the range where your query result currently exists. Once you've tested that, you can replace the text "A4:C8" with your actual query statement, to have everything done in one formula, if you want.
=ARRAYFORMULA(SUBSTITUTE(SUBSTITUTE(SPLIT(FLATTEN(ARRAYFORMULA(SPLIT(TRANSPOSE(ARRAYFORMULA(QUERY(TRANSPOSE(ARRAYFORMULA(SUBSTITUTE(A4:C8," ","♥"))),,99^99)&"♦~♦~")),"♦",1,0)))," ",1,0),"~",""),"♥"," "))
Please let me know if this works for you. If you have any issues, please share a sample sheet, with only sample data, and clarify what isn't working as you expect.

Get meaning for a word from dictionary Using XPath Google sheets importxml function

I'm trying to use the IMPORTXML function in Google sheets to grab the meaning and information words on https://www.powerthesaurus.org/
I kind of succeeded getting some data from another website, but as a newbie, I got some troubles to get any data when I try on this one in this Google sheet in D6 cell.
=ImportXML("https://www.powerthesaurus.org/"&A6,"//*[#id='link link--primary link--term']")
Could someone help to educate me with the correct formula?
You're looking for synonyms. Note you can display up to 200 on Power Thesaurus.
To get the 50 first synonyms in one cell (since you have one word per row), you can try this :
Create 50 numbered columns in your GoogleSheet.
Apply this formula to the first cell and drag it to the right.
=IMPORTXML("https://www.powerthesaurus.org/abbreviation/synonyms";"(//div[#class='pt-thesaurus-card__term'])"&"["&B2&"]")
Then use join formula to get all the words in one cell (XX:XX is the range of your columns, B3:F3 on the provided screenshot).
=JOIN("|";XX:XX)
Result :
Alternatively we could have use this one-liner (and make some cleanup afterwards) but GoogleSheet returns a blank cell whereas the XPath is perfectly valid :
=IMPORTXML("https://www.powerthesaurus.org/abbreviation/synonyms";"normalize-space(//div[#class='pt-list-terms__container'])")

ArrayFormula column disappears when sorting in a filter view in Google Sheets

I'd like to use ArrayFormula to populate a column in spreadsheet, but when I Sort A->Z in a filter view, the ArrayFormula column vanishes. In some cases, the column includes a #REF! error about the range, and in some cases the column is just blank after the Sort. The following is a simplified version of what I'm trying to do (in my actual application, I'm doing a Vlookup to another sheet):
https://docs.google.com/spreadsheets/d/1XbqqedOjuSKuE-ZLIHNw59-r01EsNMpx7YVqOoxSOR4/edit?usp=sharing
The column 3 header uses an ArrayFormula to copy from column 1. If you go to the Filter 1 filter view, you'll note that column 3 is blank except for an error. This happens after I try to Sort Z->A on column 2. In my more complicated use-case, involving a Vlookup, after a Sort the column disappears entirely (leaving no #REF! error). Before sorting in both cases, everything is fine.
How do I make ArrayFormula values persist in filter views after sorting?
Thanks for your help!
I'm guessing that, because your references are normal (relative, not anchored/absolute), the range A2:A10 after sorting down turns into something absurd, like A7:A4, depending on actual sorted values.
Also, if you hover with your mouse on the #REF error, what does it tell you?
Anyway, try using absolute references in your formula:
=arrayformula({"Column 3"; A$2:A$10})
Edit
Fascinating. It's the first time I see this type of error. Taking it at face value, it seems that it's a limitation of Google Spreadsheets - you cannot use ARRAYFORMULAS spanning multiple rows inside sorted filter views, because, like I sort of guessed, it messes up the ARRAYFORMULA's range (as indicated by the fact that the formula is now in C4 instead of C1).
But that gives you also the solution: do not include the cell with the arrayformula in the filter view. Instead of making your filter view's target range A1:C20, make it A1:B20. Then the arrayformula in C1 will be untouched by the filter and will indeed continue to work.
I have found a solution for my usecase, in your case, it could be:
=arrayformula(if(row(C:C)=1;"Column 3";A:A))
But you'll need to consider the whole columns in your formulas.
Example
Have you tried A2:A?
If you don't put an ending row, means the end of the column.
It worked for me.
Cheers

Filtering the output for importhtml in Google Sheets

I am building a google sheet to do calculations based on information I found on different websites and stumbled upon the IMPORTHTML function in Google Sheets.
Terrific, I want to import tables and then use some of the values out of those tables to build my sheet and make further calculations.
However, since the function retrieves both the headers and all the information in the table that makes it quite hard to work with. Instead I would like to pull only certain of the data, preferably specific cells in the table pulled.
Is this possible?
For example:
=ImportHtml("http://en.wikipedia.org/wiki/Demographics_of_India"; "table";3)
returns a huge list, what if I would like to pull only the values of B7 and D7? Is that possible? Even filtering out a single row would be useful, whatever that is more feasible. The most important part is that I can get a single row and dont have the full table.
Found the INDEX function, doing exactly what I need it to do!

Why is this function throwing a filter error?

I'm working with a Google Spreadsheet that's pulling data from another sheet if certain conditions are met. Well, at least that's what it should be doing—instead, I'm getting "No matches are found in FILTER evaluation."
The function is:
=filter(importRange("https://docs.google.com/spreadsheets/d/1Z_7hl4uEc-an2rOUgOd_zYhCeb_QNIZopahJqBYooRg/edit#gid=0", "Sheet1!R2:R5000"), SEARCH( A3 , index(importRange("https://docs.google.com/spreadsheets/d/1Z_7hl4uEc-an2rOUgOd_zYhCeb_QNIZopahJqBYooRg/edit#gid=0", "Sheet1!V2:V5000")) ) )
I've tried it with a variety of row and column parameters for the index() function. I've also tried adding * to the beginning and end of the search term in A3, in case that's the issue. I've also tried putting quotes around the value in A3.
What am I missing? Sample spreadsheet is here.
I can't find a reference at the moment, but there is a known issue associated with the fact that the newest version of Sheets requires that you explicitly allow access to the other sheet via ImportRange. The issue is, when the ImportRange is nested, it doesn't give the opportunity to allow access - it will just return a #REF error inside your formula.
The work around is to just invoke the ImportRange by itself first (you could use a smaller range):
=ImportRange("https://docs.google.com/spreadsheets/d/abcdefg","Sheet1!R2")
then "Allow access" when prompted; then nest it in your formula.
As an aside, it is advisable to use ImportRange as few times as possible, so in your case it might be better to use QUERY:
=QUERY(ImportRange("https://docs.google.com/spreadsheets/d/abcdefg","Sheet1!R2:V5000"),"select Col1 where Col5 contains '"&A3&"'",0)
You can cheat the IMPORTRANGE issue by having a page which just pulls a single cell from every sheet you want to reference nested. Once it's been given permission the permission persists throughout the sheet.

Resources