Search for specific text on a webpage using XPath

I want to search specific text on a webpage using XPath.
<?php
$url = 'http://www.barringtonsports.com/browse/hockey_sticks/show/325/list';
$html = file_get_contents($url);
$doc = new DOMDocument();
@$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$found = $xpath->evaluate("//span[contains(text(),'blablabla')]");
if (!$found) {
    echo "NOT FOUND";
} else {
    echo "found";
}
?>
It always outputs "found", even though the text blablabla is not on the webpage.
Where is the problem?
Is my evaluate expression correct for searching for specific text?

evaluate() does not return a boolean for an XPath expression that selects nodes; it returns a DOMNodeList, which PHP treats as truthy even when it is empty. You have to use:
$found = $xpath->evaluate("boolean(//span[contains(text(),'blablabla')])");
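To make the difference concrete, here is a minimal sketch. The inline HTML is an assumption that stands in for the downloaded page, and the variable names are only illustrative. It shows boolean() returning a real PHP bool, and the alternative of checking the length of the node list returned by query().

```php
<?php
// Small inline document standing in for the downloaded page (an assumption).
$html = '<html><body><span>Hockey sticks</span><span>Field hockey</span></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// evaluate() with boolean() yields a real PHP bool.
$present = $xpath->evaluate("boolean(//span[contains(text(),'Hockey')])");
$missing = $xpath->evaluate("boolean(//span[contains(text(),'blablabla')])");

// Without boolean(), a node-set expression gives a DOMNodeList, so test its length.
$nodes = $xpath->query("//span[contains(text(),'blablabla')]");

echo $present ? "found" : "NOT FOUND", "\n";
echo $missing ? "found" : "NOT FOUND", "\n";
echo $nodes->length === 0 ? "NOT FOUND" : "found", "\n";
```

Either form works; boolean() keeps the test inside the XPath expression, while checking length is handy if you also need the matching nodes.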

Related

Laravel custom Blade directive: except-tag function for htmlentities output

\Blade::directive('specialReplace', function($expression){
    $expression = explode(',', $expression);
    $exception = $expression[0];
    $output = htmlspecialchars($expression[1]);
    if ($exception == "img") {
        $output = str_replace("&lt;img", "<img", $output);
        $output = str_replace("/&gt;", "/>", $output);
    } else {
        $output = str_replace("&lt;".$exception."&gt;", "<".$exception.">", $output);
        $output = str_replace("&lt;/".$exception."&gt;", "</".$exception.">", $output);
    }
    return "<?PHP echo $output?>";
});
@specialReplace(img, <img src=....)
I am trying to write a custom directive in Laravel that outputs HTML from the database with htmlentities applied to everything except a given tag, such as an image.
My problem is that I get the error "syntax error, unexpected '&'", and I have no idea why.
Does anyone know how to fix this?
Try this; it should fix the error:
return "<?PHP echo \"$output\"?>";
Also, regarding the argument to your function: given the input
@specialReplace(img, <img src=....)
you'll receive the exact string you passed to the directive (together with the brackets):
(img, <img src=....)
It doesn't look like you parse it properly.
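As a rough sketch of what parsing that argument could look like: the helper below is hypothetical (parseSpecialReplaceArgs is not part of Laravel) and assumes the expression arrives as one raw string, possibly wrapped in brackets. It strips the brackets and splits only on the first comma, so commas inside the markup survive.

```php
<?php
// Hypothetical helper (not part of Laravel): split the raw directive
// expression into the tag name and the remaining markup.
function parseSpecialReplaceArgs(string $expression): array
{
    $expression = trim($expression);

    // Drop the surrounding brackets if they were passed along.
    if (strlen($expression) >= 2
        && $expression[0] === '('
        && substr($expression, -1) === ')') {
        $expression = substr($expression, 1, -1);
    }

    // Limit of 2 so that commas inside the markup are left alone.
    $parts = explode(',', $expression, 2);
    $tag = trim($parts[0]);
    $markup = isset($parts[1]) ? trim($parts[1]) : '';

    return [$tag, $markup];
}

[$tag, $markup] = parseSpecialReplaceArgs('(img, <img src="a.png" alt="x, y" />)');
echo $tag, "\n";
echo $markup, "\n";
```

Inside the directive closure you would call this helper on $expression instead of a plain explode(',', ...).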

preg_match search pattern and insert

Hi, can anyone help me? I am trying to insert an image or text from a variable using preg_match.
First I store the variable in one file and read it with:
function bbanner_fetch($a) {
$bannerUrl = get_config('bbanner','bannerUrl');
Followed by:
$url = $_SERVER['REQUEST_URI'];
switch($url){
case "/";
if(preg_match("/^(\Social)/", strtolower($line)))
{
echo 'SEE DETAILS FOUND';
$a->page['htmlhead'] .= "$bannerUrl" . "\r\n";
break;
}
That is a test that checks whether the word Social is in the URL /.
But nothing is found or inserted. My idea is to insert text or an embedded image from the variable bannerUrl
after the code "<script type="text/javascript"> $(document).ready(function() { $("#id_username").focus();} );</script>".
Does anyone have an idea how to solve this?
Thanks
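One possible way to approach the goal described above, as a sketch only: if the markup is available as a string, the banner can be inserted right after a known marker with strpos() and substr_replace(), with no regular expression involved. The helper name insertAfterMarker, the sample $head string, and the banner value are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: insert $insert directly after the first occurrence of $marker.
function insertAfterMarker(string $html, string $marker, string $insert): string
{
    $pos = strpos($html, $marker);
    if ($pos === false) {
        return $html; // marker not present, leave the markup untouched
    }
    // A length of 0 means nothing is replaced; $insert is spliced in at that offset.
    return substr_replace($html, $insert, $pos + strlen($marker), 0);
}

// Sample head markup and banner value (placeholders, not the real page).
$head = '<title>Login</title><script type="text/javascript"> $(document).ready(function() { $("#id_username").focus();} );</script>';
$bannerUrl = '<img src="banner.png">';

$result = insertAfterMarker($head, '</script>', "\r\n" . $bannerUrl);
echo $result, "\n";
```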

Extract the first image and remove it from a string

I have a string that contains both text and images, but I would like to remove only the first image using PHP.
$string = 'This is my test <img src="link_to_image1">, some other text.
<img src="link_to_another_image" border="0">';
$str = preg_replace('/<img src="[a-zA-Z0-9\/_.]+">/', '', $string, 1);
You can also do it using a callback. This might be a little more flexible in case you want to do things in the future based on images at different positions in the string.
class imgReplacer {
    // Number of images seen so far.
    public $counter = 0;

    public function cb($matches) {
        if (!$this->counter) {
            // First image: count it and drop it.
            $this->counter++;
            return '';
        }
        // Later images are kept unchanged.
        return $matches[0];
    }
}
$ir = new imgReplacer;
$string = 'This is my test <img src="link_to_image1">, some other text. <img src="link_to_another_image" border="0">';
$string = preg_replace_callback(
    '#(<img.*?>)#',
    array($ir, 'cb'),
    $string);
echo $string;
This is my test , some other text. <img src="link_to_another_image" border="0">
$feed_desc = preg_replace('/<img\b[^>]*>/i', '', $str, 1);
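Since the title also asks to extract the image, here is a small sketch that does both steps with the same pattern: preg_match() captures the first tag and preg_replace() with a limit of 1 removes it. The pattern is a simple assumption that matches any img tag; it is not a full HTML parser.

```php
<?php
$string = 'This is my test <img src="link_to_image1">, some other text. <img src="link_to_another_image" border="0">';

// Matches an <img ...> tag; \b keeps it from matching other tags that start with "img".
$pattern = '/<img\b[^>]*>/i';

// Step 1: capture the first image tag, if there is one.
$firstImage = null;
if (preg_match($pattern, $string, $m)) {
    $firstImage = $m[0];
}

// Step 2: remove only the first match (the fourth argument is the limit).
$without = preg_replace($pattern, '', $string, 1);

echo $firstImage, "\n";
echo $without, "\n";
```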

XPath to find anchor with no text?

I have a couple of anchor tags which have no text. How do I find those anchors?
I've tried this but it's returning nothing:
global $post;
$doc = new DOMDocument();
$doc->loadHTML($post->post_content);
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//a[string-length(.) = 0]');
foreach ($nodes as $node) {
    $remove_elements[] = $node->getAttribute('href');
}
return $remove_elements;
The HTML looks like this:
<br/>
<br/>
<br/>
<br/>
If you want to know whether a node has no text, you can use string-length(), which accepts a node. In this case '.' refers to the current element.
You can do:
//a[string-length(.) = 0]
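A short sketch of that expression in use, next to a variant with normalize-space() that also treats whitespace-only anchors as empty. The sample HTML and the href values are made up for illustration; the real post content is not shown in the question.

```php
<?php
// Made-up markup: one empty anchor, one whitespace-only anchor, one with text.
$html = '<p><a href="/empty"></a><br/><a href="/blank">   </a><br/><a href="/text">Read more</a></p>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Anchors whose string value has length zero.
$empty = [];
foreach ($xpath->query('//a[string-length(.) = 0]') as $node) {
    $empty[] = $node->getAttribute('href');
}

// Anchors that are empty after trimming and collapsing whitespace.
$blank = [];
foreach ($xpath->query('//a[normalize-space(.) = ""]') as $node) {
    $blank[] = $node->getAttribute('href');
}

print_r($empty);
print_r($blank);
```

If the real anchors contain only spaces or line breaks, the normalize-space() form may be the one that matches.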

Incorrect characters in output

I'm trying to learn web scraping with XPath. The code below works; however, the output contains incorrect characters and I can't manage to get this right.
Example:
Output: EmÃ¥mejeriet
How it should be: Emåmejeriet
PHP Code:
<?php
// Tried with these parameters, but they don't make any difference
$html = new DOMDocument('1.0', 'UTF-8');
$html->loadHTMLFile('http://thesite.com/thedoc.html');
$xpath = new DOMXPath($html);
$nodelist = $xpath->query("//table");
foreach ($nodelist as $n) {
    echo $n->nodeValue."\n";
}
?>
What can I do to fix this?
You should try PHP's utf8_encode() and utf8_decode() functions if the data is ISO-8859-1, or iconv() if not.
Example:
<?php
iconv_set_encoding("internal_encoding", "UTF-8");
iconv_set_encoding("output_encoding", "ISO-8859-1");
?>
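As a sketch of what iconv() can do for a single string: the example below assumes the symptom is UTF-8 text that was read as ISO-8859-1 and then encoded to UTF-8 a second time (this is an assumption about the page, not something confirmed in the question). Converting from UTF-8 back to ISO-8859-1 undoes that extra step.

```php
<?php
// Assumed symptom: the two bytes of a UTF-8 "å" were each turned into a
// separate character (U+00C3 and U+00A5) and encoded as UTF-8 again.
$garbled = "Em\u{C3}\u{A5}mejeriet";

// Converting UTF-8 back to ISO-8859-1 maps each of those characters to a
// single byte again, which restores the original UTF-8 byte sequence.
$fixed = iconv('UTF-8', 'ISO-8859-1', $garbled);

echo $fixed, "\n";
```

If the text is not double-encoded in this way, this conversion will not help, so it is worth checking the page's declared charset first.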
