Xpath query, making a certain query more generic - xpath

I'm trying to extract information from Wikipedia tables.
More specifically, I'm trying to build a list of all teams and all players in the Premier League.
So far I can traverse all the teams in the 2019-2020 Premier League table of teams; for every team I open its Wikipedia page and traverse its players to get their information.
I assumed there was a fixed template, with every Premier League team page on Wikipedia having its players table at position 3, but after six teams I hit a team whose table is at position 2.
So I was using the following XPath query on every team's wiki page:
"//table[3]/tbody//tr[position() > 1]//td[4]//span/a/@href"
but, for example, the players table of the following team is at position 2. How can I make this query more generic, so that it isn't tied to a fixed position? I have noticed that every relevant table is preceded by an element with the text "First-team squad".
The HTML of the table is too long, so I'm posting the wiki link of one such team here:
https://en.wikipedia.org/wiki/Crystal_Palace_F.C.
Hope to get help! Thanks.

You have to use another "anchor" which works for each page. The table you need is always the first one after the span element with the id "Players".
So with this:
//span[@id='Players']/following::table[1]//span[@class="fn"]//text()
You'll get the names of all players in the current squad.
With this:
//span[@id='Players']/following::table[1]//span[@class="fn"]//@href
You'll get the associated URLs. Warning: some players don't have a Wikipedia page.
So you can end up with 26 player names but only 25 URLs. Like here:
https://en.wikipedia.org/wiki/Chelsea_F.C.
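To deal with the name/URL mismatch, it can help to read the name and the link from the same span, so a player without a page simply gets no URL. Below is a minimal Python sketch of that idea. It uses only the standard library, whose ElementTree module has no `following::` axis, so the anchor logic is emulated by walking the elements in document order; a full XPath 1.0 engine could evaluate the expressions above directly. The `SAMPLE` markup, the player names and the `squad` helper are all made up for illustration and are not real Wikipedia HTML.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for a club page (not real Wikipedia markup).
SAMPLE = """
<html><body>
  <table><tr><td><span class="fn"><a href="/wiki/Not_A_Player">Not a player</a></span></td></tr></table>
  <h2><span id="Players">Players</span></h2>
  <table>
    <tr><td><span class="fn"><a href="/wiki/Player_One">Player One</a></span></td></tr>
    <tr><td><span class="fn">Player Two</span></td></tr>
  </table>
  <table><tr><td><span class="fn"><a href="/wiki/Loaned_Player">Loaned Player</a></span></td></tr></table>
</body></html>
"""


def squad(root):
    """Return (name, href or None) for each span.fn in the first table after span#Players."""
    seen_anchor = False
    for el in root.iter():  # iter() walks the elements in document order
        if el.tag == "span" and el.get("id") == "Players":
            seen_anchor = True
        elif seen_anchor and el.tag == "table":
            players = []
            for span in el.iter("span"):
                if span.get("class") != "fn":
                    continue
                name = "".join(span.itertext()).strip()
                link = span.find(".//a")  # None when the player has no page
                players.append((name, link.get("href") if link is not None else None))
            return players  # stop at the first table after the anchor
    return []


players = squad(ET.fromstring(SAMPLE))
for name, href in players:
    print(name, href)
```

Because each name is paired with its own link (or `None`), the two lists can no longer drift out of step when one player has no page.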

Related

How to display only filtered data range in data validation rule in Google sheet

I have a spreadsheet with two sheets. The first (Records) records the players from the form, and a filter is applied there to select the team (1, 2 or 3).
The second sheet (Players) performs calculations for individual players (one row per player). In the first column I select a player from a dropdown. The problem is that the dropdown lists all players; I need only the filtered ones (e.g. teams 1 and 3). Can anyone help me? Thanks
Example sheet here
Update:
I'll first apply a filter to the team, e.g. 1 and 2 in the Records sheet.
Then I select the winner in the Players sheet (from the filtered list for teams 1 and 2), and then the loser from the same filtered list, which should no longer include the player I just selected as the winner.
=SUBTOTAL(103; C2)
=FILTER(Records!D2:D; Records!F2:F=1)
demo spreadsheet
You can use the FILTER function to create a dynamic dropdown based on the team numbers you select. I have included an example where you can select a combination of team numbers. If you choose only team 1 and leave the other two dropdowns (yellow) blank, it will show only the names from team 1.
Dynamic Dropdown Sample
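For readers who want to see the filtering logic outside the spreadsheet, here is a small Python sketch of what the dropdown source has to compute: keep the players whose team is among the selected ones, and drop anyone already picked (for the loser dropdown). The sample records and the `dropdown_options` helper are hypothetical and only illustrate the rule, not the demo sheet itself.

```python
# Hypothetical rows from the Records sheet: (player name, team number).
records = [("Ann", 1), ("Bob", 2), ("Cid", 3), ("Dee", 1), ("Eve", 2)]


def dropdown_options(records, teams, exclude=()):
    """Names whose team is in `teams`, minus any names that were already chosen."""
    return [name for name, team in records if team in teams and name not in exclude]


# Winner dropdown: everyone from teams 1 and 2.
winner_options = dropdown_options(records, teams={1, 2})
# Loser dropdown: same list, without the winner that was just selected.
loser_options = dropdown_options(records, teams={1, 2}, exclude={"Ann"})

print(winner_options)
print(loser_options)
```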

Google Sheets: dynamic sum-formula of multiple sheets

I'm working on a call tracking sheet for our sales teams to see their numbers.
Now I have the following case, which I don't know how to solve.
Every salesperson has their own sheet, named after them.
In the main sheet I want to add up the data from all sellers, which is currently done with the following formula:
='Closer 1'!C4+'Closer 2'!C4+'Closer 3'!C4+'Closer 4'!C4+'Closer 5'!C4+'Closer 6'!C4+'Closer 7'!C4+'Closer 8'!C4
My question is: how can I extend the formula dynamically, using a table of seller names, so that the additional person is included automatically when another seller is added and I don't have to adjust the formula?
Here is a picture of how it looks: Picture 1
The sheet of a salesperson looks like this: Picture 2
Otherwise it would take a lot of time to change all the formulas for every day of the year.
Thank you very much for your help, guys!
My idea is to use the INDIRECT function and store the sellers' names in the sum sheet.
INDIRECT treats a string as a reference, so you can take the sheet names from cells in your spreadsheet.
You can find my sheet here:
https://docs.google.com/spreadsheets/d/1f31vxTFhAvmPNzx5oIleZHYfyxnFz73jCHkbP6_5hmA/copy
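The idea can be sketched in Python: build each reference as a string from a list of seller names, resolve it, and sum the results. The `workbook` dictionary, its values and the `indirect` and `total` helpers below are hypothetical stand-ins for the spreadsheet and only imitate how a string is turned into a reference.

```python
# Hypothetical workbook: sheet name -> {cell address: value}.
workbook = {
    "Closer 1": {"C4": 5},
    "Closer 2": {"C4": 7},
    "Closer 3": {"C4": 2},
}
# The list of seller names kept in the summary sheet.
seller_names = ["Closer 1", "Closer 2", "Closer 3"]


def indirect(workbook, reference):
    """Resolve a reference string such as "'Closer 1'!C4" to a cell value (0 if empty)."""
    sheet, cell = reference.split("!")
    return workbook[sheet.strip("'")].get(cell, 0)


def total(workbook, names, cell):
    """Sum the same cell across every sheet named in `names`."""
    return sum(indirect(workbook, f"'{name}'!{cell}") for name in names)


print(total(workbook, seller_names, "C4"))

# Adding a seller only means adding a sheet and a name; the summing code is unchanged.
workbook["Closer 4"] = {"C4": 1}
seller_names.append("Closer 4")
print(total(workbook, seller_names, "C4"))
```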

Power BI. Sort a column with repeated values based on another column

For my requirement, I've got a specific layout for a report. To simplify, the series of Categories should present the Areas in a particular order (financial information).
Every Category will be present on a different Page on Power BI. However, as you can see, some areas belong to multiple Categories. Because of this, I'm getting an error message if I try to order this column by an index, and I can't modify the name of the area.
Is it possible to specify that a cross table should not apply any alphabetical ordering?
I've looked for a possible answer, but so far I have not found any solution to this.
Regards.

Google Sheets Extract Data from Table and make a row per data set

I'm stuck with Google Sheets.
Situation:
I have a data table with projects. Each project has a few attributes, most importantly which team members have worked on the project this month.
Goal:
I need to convert the data to a new table that is built up differently. I need one row per project per active team member.
Sample data and goal:
https://docs.google.com/spreadsheets/d/1QcNPsvHX8hBNUpCJiutof8yD8ukFYcCXM_pLNrQmDUs/edit?usp=sharing (can edit)
As you can see, SEO and Island now have two rows instead of one, as Jan AND Chris have worked on the projects this month.
Approach:
I tried FILTER and QUERY (with pivot) and thought about scripting (basically it's an iteration over the matrix B3:E8...). However, I am not particularly skilled at Sheets and am very thankful for your help. Thanks a billion, guys!!!
You can do this in a fairly standard way: use TEXTJOIN to join together the corresponding column headers and other data for the non-blank cells, then separate the result into rows, and the rows into columns, with the SPLIT and TRANSPOSE functions:
=ArrayFormula(split(transpose(split(textjoin("¶",true,if(B3:E8="","",A3:A8&"|"&F3:F8&"|"&G3:G8&"|"&H3:H8&"|"&I3:I8&"|"&B2:E2)),"¶")),"|"))
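If the formula is hard to read, the same unpivot step can be written as a plain loop. The Python sketch below is only an illustration of the logic: for every project row and every team-member column that is not blank, emit one output row with the project, its other attributes, and the member's name. Apart from Jan, Chris, SEO and Island, the sample values are made up, and `unpivot` is a hypothetical helper name.

```python
# Team-member column headers (B2:E2 in the formula); the last two names are made up.
members = ["Jan", "Chris", "Ana", "Tom"]

# One tuple per project: (project, marks per member column, other attributes).
# A blank string means the member did not work on the project this month.
rows = [
    ("SEO", ["x", "x", "", ""], ["Client A", "May"]),
    ("Island", ["x", "x", "", ""], ["Client B", "May"]),
    ("Shop", ["", "", "x", ""], ["Client C", "May"]),
    ("Idle", ["", "", "", ""], ["Client D", "May"]),
]


def unpivot(rows, members):
    """Return one row per (project, active member), with the member name last."""
    out = []
    for project, marks, attrs in rows:
        for member, mark in zip(members, marks):
            if mark != "":  # skip blank cells, like the if(B3:E8="", "", ...) test
                out.append([project, *attrs, member])
    return out


for row in unpivot(rows, members):
    print(row)
```

Projects with two active members produce two rows, and a project with no active members produces none, which matches the goal table described in the question.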

Well formed query suggestions

I am developing an autocomplete feature in which I intend to show query suggestions, something like this:
students who live in {City_name} [ City_name could contain values from list of cities ]
example_type 1 :
students who live in New...
[ following query suggestions should pop-up ] :
students who live in New york
students who live in New Jersey
(Looking up different entities: here cities, or sports, e.g. "students who play basketball", etc.)
example_type 2:
students who live in New york and play ba...
[ following query suggestions should pop-up ] :
students who live in New York and play basketball
students who live in New York and play baseball
etc..
I have tried building basic autocomplete on entities index using ElasticSearch, which is gisted here.
(In my case, the child/entities index is dumped using a river-plugin.) I have naively looked at Nested Types and the Parent / Child relationship, but was not able to figure out whether either is the right fit for my requirement.
I am not sure how to index these (parent) phrases along with the child index to enable autocomplete search and generate possible suggest trees by querying/searching a single index.
It would be great if I could get some help solving this kind of problem.
Thanks in advance!
I'd index phrases such as:
live in New york
live in New Jersey
play basketball
play baseball
And then do some work client-side to figure out that you've started a new section in the query, and then only send the letters in the new section to ES for typeahead completion.
This will take some work on the front end, but I could see this working. The other alternative is indexing every possible variation of a query phrase for typeahead, but I highly doubt that's viable.
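A rough Python sketch of that client-side step, under a few assumptions: the query always starts with "students who ", sections are joined with " and ", and a simple in-memory prefix match stands in for the typeahead request to ES. The `suggest` helper and the `PHRASES` list are made up for illustration.

```python
# Phrases that would be indexed for typeahead (in-memory stand-in for the ES index).
PHRASES = ["live in New York", "live in New Jersey", "play basketball", "play baseball"]
PREFIX = "students who "  # assumed fixed start of every query


def suggest(query, phrases=PHRASES):
    """Complete only the last section of the query and keep the finished sections as typed."""
    if not query.startswith(PREFIX):
        return []
    body = query[len(PREFIX):]
    # Everything before the last " and " is finished; the remainder is being typed.
    *done, current = body.split(" and ")
    head = PREFIX + "".join(part + " and " for part in done)
    needle = current.lower()
    # Only `needle` would be sent to the typeahead backend in a real setup.
    return [head + phrase for phrase in phrases if phrase.lower().startswith(needle)]


for suggestion in suggest("students who live in New york and play ba"):
    print(suggestion)
```

The finished sections are kept exactly as the user typed them, so only the short trailing fragment ever needs to be matched against the index.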
