Get Element using XPath in Puppeteer - xpath

I am trying to scrape multiple elements that share the same class names, but each has a different number of children. I am looking for a way to select specific elements using XPath (this would make my loop easiest).
const gameTimeElement = await page.$$('//*[@id="section-content"]/div[2]/div[1]/div/div['+ i + ']');
const gameTimeString = await gameTimeElement[j].$eval('h3', (h3) => h3.innerHTML);
This currently does not work.
After I select the element, I grab the h3 tag inside and evaluate it to get the innerHTML.
Is there a way to do this utilizing xpath?
<div id="section-content" style="display: block;">
</div>
<div class="matches">
<div class="day day-28-1" data-week="1" style="display: block;">
<h4>Sat, March 28, 2020</h4>
<div class="day-wrap">
<div class="match region-7-57d5ab4-9qs98v" data-week="1">
<h3 class="time">2:00PM
<span>(Central Daylight Time)</span>
<span class="fr">Best of 7</span>
</h3>
<div class="row ac ">
<div class="col-xs-3 ar">
<img class="team-logo" src="url"></div>
<div class="col-xs-2 al">
<h4 class="loss">(NA)<br>
<span class="team-name">Team1</span>
<br>
<span class="win spoiler-wrap">0</span>
</h4>
</div>
<div class="col-xs-2">
<img class="league-logo" src="url">
<h4> V.S.</h4>
</div>
<div class="col-xs-2 ar">
<h4 class="">(NA)<br>
<span class="team-name">Team2</span>
<br>
<span class="win spoiler-wrap">4</span>
</h4>
</div>
This is a sample of what I am working with for HTML on the website.

Yes, div class="day-wrap" can have a different number of children, but I don't think that's a problem.
You want to get the game times of all Rocket League matches. As you've noticed, the game times are located within h3 elements. You can access them directly with one of the following XPaths:
//div[@id="section-content"]//h3
//div[@class="day-wrap"]//h3
//div[contains(@class,"match region")]//h3
If you want something for a loop, you can try:
(//div[@class="day-wrap"]//h3)[i]
where i is the number to increment (from 1 to x).
Side notes: your sample data looks incorrect (according to your XPath). You have a closing div on line 2, and it seems you omitted div class="row middle-xs center-xs weeks" before div class="matches".
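As a rough sketch of how these expressions could be used from Puppeteer: a plain string passed to page.$$ is treated as a CSS selector, so one option is to evaluate the XPath inside the page with the standard DOM call document.evaluate. The helper names textsByXPath and nthGameTimeXPath are made up for this illustration, and page is assumed to be an already opened Puppeteer Page. It returns textContent rather than innerHTML, so the text of the nested span elements is included.

```javascript
// Hypothetical helper: evaluate an XPath in the browser and return the
// trimmed text of every matching node. `page` is assumed to be a Puppeteer Page.
async function textsByXPath(page, xpath) {
  // page.evaluate runs the callback in the page context, where the DOM
  // API document.evaluate is available.
  return page.evaluate((expr) => {
    const snapshot = document.evaluate(
      expr,
      document,
      null,
      XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
      null
    );
    const texts = [];
    for (let i = 0; i < snapshot.snapshotLength; i++) {
      texts.push(snapshot.snapshotItem(i).textContent.trim());
    }
    return texts;
  }, xpath);
}

// Hypothetical helper: build the indexed XPath from the answer.
// XPath positions are 1-based, so the first h3 is i = 1.
function nthGameTimeXPath(i) {
  return `(//div[@class="day-wrap"]//h3)[${i}]`;
}

// Possible usage inside an async function, once `page` has loaded the site:
//   const allTimes = await textsByXPath(page, '//div[@class="day-wrap"]//h3');
//   const [firstTime] = await textsByXPath(page, nthGameTimeXPath(1));
```

Fetching all h3 elements in one call and looping over the returned array avoids one round trip to the browser per index.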

Related

xpath: How to combine multiple conditions on different axes

I try to extract all links based on these three conditions:
Must be part of <div data-test="cond1">
Must have a <a href="..." class="cond2">
Must not have a <img src="..." class="cond3">
The result should be "/product/1234".
<div data-test="test1">
<div>
<div data-test="cond1">
Link 1
<div class="test4">
<div class="test5">
<div class="test6">
<div class="test7">
<div class="test8">
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<div data-test="test2">
<div>
<div data-test="cond1">
Link 2
<div class="test4">
<div class="test5">
<div class="test6">
<div class="test7">
<div class="test8">
<img src="bild.jpg" class="cond3">
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
I'm able to extract the links with the following xpath query.
//div[starts-with(@data-test,"cond")]/a[starts-with(@class,"cond")]/@href
(I know the first part is not really necessary. But better safe than sorry.)
But I'm still struggling with excluding the links that have a descendant img tag, and with how to add that condition to the query above.
This should do what you want:
//div[@data-test="cond1" and not(.//img[@class="cond3"])]
/a[@class="cond2"]
/@href
/product/1234

How to check if a node in xpath has a certain child x but also doesn't have a child y

I'm trying to get the non-forked repositories of a given GitHub user. Currently I manage to get all the repositories with this XPath query:
parser.xpath("//ul[@data-filterable-for='your-repos-filter']/li/div/div/h3/a/@href").map{|repository| ...}
The point is that I need to filter out the ones where the next sibling of the h3 inside the last div is a span, something like:
parser.xpath("//ul[@data-filterable-for='your-repos-filter']/li/div/div/h3 and not span/a/@href").map{|repository| ...}
The HTML I'm looking for is the following (inspecting one of the forked repositories):
<li class="col-12 d-flex width-full py-4 border-bottom public fork" itemprop="owns" itemscope itemtype="http://schema.org/Code">
<div class="col-10 col-lg-9 d-inline-block">
<div class="d-inline-block mb-1">
<h3 class="wb-break-all">
<a href="/DominikAngerer/rails-boilerplate" itemprop="name codeRepository" >
rails-boilerplate</a>
</h3>
<span class="f6 text-gray mb-1">
Forked from <a class="muted-link" href="/polomasta/rails-boilerplate">polomasta/rails-boilerplate</a>
</span>
</div>
<div>
<p class="col-9 d-inline-block text-gray mb-2 pr-4" itemprop="description">
Ruby on Rails Storyblok Starter Boilerplate
</p>
</div>
When it is not a forked repository (those are the ones I'm looking for), there is no such <span class="f6 text-gray mb-1">.
Is such a query possible, and if so, how?
You can use the following XPath to select the links of non-forked repositories:
//div[@class="d-inline-block mb-1"][not(./span[contains(.,"Forked from")])]//@href
Output : 17 nodes for https://github.com/DominikAngerer?tab=repositories

XPath: how to select elements that are related to other on the same level

The question is simple but I don't have enough practice for this case :)
How do I get the price text from every "block" div, given that we only need the blocks that contain an item_promo element?
<div class="block">
<div class="item_promo">item</div>
<div class="item_price">123</div>
</div>
<div class="block">
<div class="item_promo">item</div>
<div class="item_price">456</div>
</div>
<div class="block">
<div class="item_promo">item</div>
<div class="item_price">789</div>
</div>
<div class="block">
<div class="item">item</div>
<div class="item_price">222</div>
</div>
<div class="block">
<div class="item">item</div>
<div class="item_price">333</div>
</div>
You could use the XPath:
//div[@class='block']/*[@class='item_promo']/following-sibling::div[@class='item_price']/text()
This looks for elements whose class attribute is item_promo inside a block div, then takes the following sibling div whose class is item_price and grabs its text.
This XPath,
//div[div/@class='item_promo']/div[@class='item_price']
will return those item_price class div elements that have a sibling item_promo class div element:
<div class="item_price">123</div>
<div class="item_price">456</div>
<div class="item_price">789</div>
This will work regardless of label/price order.

Collapse using Transition Vue

I would like to use Vue's collapse in my code, but I have an error.
[Vue warn]: <transition-group> children must be keyed: <p>
My component:
<template xmlns:v-model="http://www.w3.org/1999/xhtml" xmlns:v-on="http://www.w3.org/1999/xhtml">
<section style="background-color: #dedede;">
<div class="container-fluid">
<div class="Consult-faq container">
<div class="row">
<div class="col-sm-12">
<h2>Cursos</h2>
<a v-for="(course,id) in courses" v-on:click="course.show = !course.show">
<a v-on:click="show = !show">
<div class="col-xs-12" style="border-bottom: solid;border-bottom-color: #999999;border-bottom-width:1px ">
<div class="col-xs-12">
<h4>
<i v-if="course.show" class="fa fa-plus-square text-right" aria-hidden="true"/>
<i v-else class="fa fa-minus-square text-right" aria-hidden="true"/>
{{course.text}}
</h4>
</div>
</div>
<transition-group name="fade">
<p v-if="show">
<div class="col-xs-12">
<article v-for="n in 2" class="Module-content">
<div class=" col-sm-12 col-md-6" style="position: relative;">
<div v-for="(course, index) in course.courses">
<course-card v-if="index % 2 == n - 1" :course="course"></course-card>
</div>
</div>
</article>
</div>
</p>
</transition-group>
</a>
</a>
</div>
</div>
</div>
</div>
</section>
</template>
<script>
export default{
props : [
'courses'
],
data(){
return {
show: false
}
},
mounted() {
console.log(this.courses)
}
}
</script>
So, I'd like to know how to collapse the courses item by item, as in the image.
Currently, when I click to expand, all courses expand, and when I click to close, all courses close.
Transition is irrelevant here (though you can get rid of that warning by using transition instead of transition-group, because the transition is only acting on a single node, not a group.)
Right now you're depending on a single variable show to control all of the elements' visibility, so they will all respond to clicks on any of them:
<a v-on:click="show = !show">
<p v-if="show" >
You need individual variables for each element if you want them to expand/collapse separately. You partially did this already; just change the remaining instances of show to course.show and you should be good to go.
(Probably want to clean up that nested <a> within <a> while you're at it; you can just remove the inner one.)
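A minimal sketch of the per-item state in plain JavaScript, assuming Vue 2, where a property is only tracked if it already exists when the object becomes reactive. The helper names withShowFlags and toggleCourse are invented for this illustration; in the component, the prepared list would typically be stored in data() and each click handler would flip only its own course.show.

```javascript
// Hypothetical helper: give every course its own `show` flag up front,
// so each item carries independent expand/collapse state.
function withShowFlags(courses) {
  return courses.map((course) => ({ ...course, show: false }));
}

// Hypothetical helper: flip the flag of a single course only.
function toggleCourse(courses, index) {
  courses[index].show = !courses[index].show;
}

// Possible usage with made-up data:
const courseList = withShowFlags([{ text: 'Course A' }, { text: 'Course B' }]);
toggleCourse(courseList, 1);
console.log(courseList.map((course) => course.show)); // [ false, true ]
```

Copying the items instead of mutating the prop also keeps the parent's data untouched.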
I solved this using vue-resource. I was using Guzzle in Laravel and passing the data in from the controller, which made it non-reactive. Loading the data with vue-resource inside the component solved the problem.

Whitespace is clickable with href image's

I am using bootstrap framework.
<div class="container">
<h1>Menu</h1>
<div class="row">
<div class="col-md-4">
<img src="images/placeholder-200x200.jpg" alt="Image" class="img-rounded center-block">
Step 1: Credit & Money
</div>
<div class="col-md-4">
<img src="images/placeholder-200x200.jpg" alt="Image" class="img-rounded center-block">
Step 1: Credit & Money
</div>
<div class="col-md-4">
<img src="images/placeholder-200x200.jpg" alt="Image" class="img-rounded center-block">
Step 1: Credit & Money
</div>
</div>
</div> <!-- /container -->
The whitespace on the left and right sides of the images is also clickable; it looks like .center-block is the culprit. How can I fix this?
A block spans the entire width of the div and is centered using margins. Because the image is wrapped in a link, I would suggest instead removing center-block from the images themselves and creating a class:
.center {
text-align: center;
}
and setting that class on the containing div, in your case:
<div class="col-md-4 center">
Or something similar.
Also I would suggest placing your text description for each image into a div, since without the image being a block, the text would flow next to it. Simply placing the text in a paragraph tag would suffice.
Here is a jsbin to demonstrate:
http://jsbin.com/zamavoha/1/edit
