how to pull href link - xpath

I am trying to pull a link from a page, and it is in a format I can't seem to find by simply googling. It might be simple, but XPath is not my area of expertise.
I am using C# and want to pull the link and write it to the console, just to figure out how to get it.
Here is my C# code:
var document = webGet.Load("http://classifieds.castanet.net/cat/vehicles/cars/0_-_4_years_old/");
var browser = document.DocumentNode.SelectSingleNode("//a[starts-with(@href,'/details/')]");
if (browser != null)
{
string htmlbody = browser.OuterHtml;
Console.WriteLine(htmlbody);
}
The relevant section of the HTML is:
<div class="last">…</div>13»
<select name="sortby" class="sortby" onchange="doSort(this);">
<option value="">Most Recent</option>
<option value="of" >Oldest First</option>
<option value="mw" >Most Views</option>
<option value="lw" >Fewest Views</option>
<option value="lp" >Lowest Price</option>
<option value="hp" >Highest Price</option>
</select><div style="clear:both"></div>
</div>
<br /><br /><br />
<a href="/details/2008_vw_gti/1454282/" class="prod_container" >
<h2>2008 VW GTi</h2>
<div style="float:left; width:122px; z-index:1000">
<div class="thumb"><img src="http://c.castanet.net/img/28/thumbs/1454282-1-1.jpg" border="0"/></div>
<div class="clear"></div>
mls
</div>
<div class="descr">
The most fun car I have owned. Dolphin Grey, 4 door, Dual Climate control, DRG Transmission with paddle shift. Leather...
</div>
<div class="pdate">
<p class="price">$19,000.00</p>
<p class="date">Kelowna<br />Posted: Oct 15, 2:54 PM<br />Views: 349</p>
</div>
<div style="clear:both" ></div>
<div class="seal"><img src="/images/bookmark.png" /></div>
</a>
<a href="/details/price_drop_gorgeous_rare_white_2009_honda_accord_ex-l_coupe/1447341/" class="prod_container" >
<h2>PRICE DROP!!! Gorgeous Rare White 2009 Honda Accord EX-L Coupe </h2>
<div style="float:left; width:122px; z-index:1000">
<div class="thumb"><img src="http://c.castanet.net/img/28/thumbs/1447341-1-1.jpg" border="0"/></div>
<div class="clear"></div>
sun2010
</div>
<div class="descr">
The link I'm trying to get is the "/details/2008_vw_gti/1454282/" part. Thanks.

HTML isn't XML.
XPath is a tool for navigating XML documents, but HTML does not have to conform to XML's requirements. The HTML you've posted above isn't well-formed XML, so a strict XML parser will reject it and XPath won't work on it directly.
You either need to run the page through an HTML-to-XML converter and add the converted output to your question so XPath can be written against it, or investigate a different tool for the job. I'd suggest searching for "C# HTML scrapers", but I'm not familiar with .NET, so I can't offer a narrower option.

Try the following XPath expression:
//a[@class="prod_container"]/@href
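As a rough illustration of what that kind of expression selects, here is a minimal Python sketch using only the standard library. It assumes a trimmed, well-formed copy of the markup from the question; the SNIPPET string, the "/about/" link, and the detail_links helper are made up for this example. xml.etree.ElementTree supports only a subset of XPath (no starts-with() and no attribute step such as /@href), so the href is read and filtered in Python instead.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed stand-in for the listing markup in the question.
# The "/about/" anchor is a hypothetical extra link that should be skipped.
SNIPPET = """
<div>
  <a href="/details/2008_vw_gti/1454282/" class="prod_container"><h2>2008 VW GTi</h2></a>
  <a href="/about/" class="nav">About</a>
  <a href="/details/price_drop_gorgeous_rare_white_2009_honda_accord_ex-l_coupe/1447341/" class="prod_container"><h2>Honda Accord</h2></a>
</div>
"""


def detail_links(markup):
    """Return the href of every <a class="prod_container"> that points under /details/."""
    root = ET.fromstring(markup)
    # ElementTree understands the [@class='...'] predicate but not starts-with().
    anchors = root.findall(".//a[@class='prod_container']")
    hrefs = [a.get("href", "") for a in anchors]
    return [h for h in hrefs if h.startswith("/details/")]


if __name__ == "__main__":
    for link in detail_links(SNIPPET):
        print(link)
```

This only works because the stand-in snippet is well-formed; the real page would need an HTML-tolerant parser before any XPath is applied.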

Thymeleaf switch block returns incorrect value

I have a switch block in my thymeleaf page where I show an image depending on the reputation score of the user:
<h1>
<span th:text="#{user.reputation} + ${reputation}">Reputation</span>
</h1>
<div th:if="${reputation lt 0}">
<img th:src="@{/css/img/troll.png}"/>
</div>
<div th:if="${reputation} == 0">
<img th:src="@{/css/img/smeagol.jpg}"/>
</div>
<div th:if="${reputation gt 0} and ${reputation le 5}">
<img th:src="@{/css/img/samwise.png}"/>
</div>
<div th:if="${reputation gt 5} and ${reputation le 15}">
<img th:src="@{/css/img/frodo.png}"/>
</div>
<div th:if="${reputation gt 15}">
<img th:src="@{/css/img/gandalf.jpg}"/>
</div>
This always shows smeagol (the image for reputation 0), even though the reputation of this user is 7.
EDIT:
I was wrong; the image being shown came from a rogue line:
<!--<img th:src="@{/css/img/smeagol.jpg}"/>-->
I have now commented it out, and no image is shown at all.
EDIT2:
I changed my comparators (see original post) and now I get the following error:
The value of attribute "th:case" associated with an element type "div" must not contain the '<' character.
EDIT3:
It works now; I updated the original post with the working code.

According to the documentation, Thymeleaf's switch statement works just like Java's, and the example suggests the same.
In other words: you cannot do
<th:block th:switch="${reputation}">
<div th:case="${reputation} < 0">
[...]
but would need to do
<th:block th:switch="${reputation}">
<div th:case="0">
[...]
which is not what you want.
Instead, you will have to use th:if, i.e. something like this:
<div th:if="${reputation lt 0}">
<img th:src="@{/css/img/troll.png}"/>
</div>
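Because each th:if is evaluated on its own, the ranges have to be mutually exclusive and together cover every score. A minimal Python sketch of the same thresholds can help check the boundaries; the function name image_for is made up for this example, and the file names are the ones from the question.

```python
def image_for(reputation):
    """Map a reputation score to the image file used in the question's template."""
    if reputation < 0:
        return "troll.png"
    if reputation == 0:
        return "smeagol.jpg"
    if reputation <= 5:       # greater than 0, up to and including 5
        return "samwise.png"
    if reputation <= 15:      # greater than 5, up to and including 15
        return "frodo.png"
    return "gandalf.jpg"      # greater than 15


if __name__ == "__main__":
    # Print the image chosen at and around each boundary.
    for score in (-3, 0, 5, 7, 15, 16):
        print(score, image_for(score))
```

With these ranges, a reputation of 7 falls in the frodo bracket rather than smeagol.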

Change
<div th:case="0">
<img th:src="@{/css/img/smeagol.jpg}"/>
</div>
to
<div th:case="${reputation == 0}">
<img th:src="@{/css/img/smeagol.jpg}"/>
</div>

Python/Plone: Getting all keywords and showing for EDIT content is very slow

Collecting all keywords and rendering them on the edit form (keywords.pt) is very slow.
There are about 20,000 keywords, and traversing that many takes about one minute.
The delay has grown with the number of keywords. Any solution is welcome.
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:tal="http://xml.zope.org/namespaces/tal"
xmlns:metal="http://xml.zope.org/namespaces/metal"
xmlns:i18n="http://xml.zope.org/namespaces/i18n"
i18n:domain="plone">
<head><title></title></head>
<body>
<!-- Keyword Widgets -->
<metal:view_macro define-macro="view"
tal:define="kssClassesView context/@@kss_field_decorator_view;
getKssClasses nocall:kssClassesView/getKssClassesInlineEditable;">
<div metal:define-macro="keyword-field-view"
tal:define="kss_class python:getKssClasses(fieldName,
templateId='widgets/keyword', macro='keyword-field-view');
uid context/UID|nothing"
tal:attributes="class kss_class;
id string:parent-fieldname-$fieldName-$uid">
<ul metal:define-slot="inside">
<li tal:repeat="item accessor"
tal:content="item"/>
</ul>
</div>
</metal:view_macro>
<metal:define define-macro="edit">
<metal:use use-macro="field_macro | context/widgets/field/macros/edit">
<tal:define metal:fill-slot="widget_body" define="contentKeywords accessor;
allowedKeywords python: context.collectKeywords(fieldName, field.accessor, widget.vocab_source);
site_props context/portal_properties/site_properties|nothing;
format widget/format | string:select;
allowRolesToAddKeywords site_props/allowRolesToAddKeywords|nothing;">
<div tal:condition="allowedKeywords" id="existingTagsSection">
<tal:comment tal:replace="nothing">
dl semantically associates selector name with values
</tal:comment>
<dl id="existingTags">
<label for="subject">
<dt id="existingTagsTitle">uuuuuuuuuuuuuuuuuuuuuuuuu
<span i18n:translate="label_select_existing_tags">
Select from existing tags.
</span>
</dt>
<span id="existingTagsHelp" class="formHelp" i18n:translate="label_existingTagsHelp">
Use Control/Command/Shift keys to select multiple tags.
</span>
<tal:comment tal:replace="nothing">
Type-to-skip functionality with javascript enabled
could be described as
"Hover and type the first letter to skip through tags."
However, on touch-driven devices, vertical hover typically
scrolls the page, so horizontal hover is necessary to enable this.
Alternatively, clicking any of the tags also enables type-to-skip.
So the help could technically be extended to handle this special case
as "Hover or click and type the first letter to skip through tags.",
but I think this would be confusing to the majority of users.
The decision at this point is to not try to explain any of this on the page.
</tal:comment>
</label>
<div class="visualClear"><!-- --></div>
<select id="predefined_subjects"
name="predefined_subjects:list"
size="14"
multiple="multiple"
tal:condition="python:format!='checkbox'"
tal:attributes="id string:${fieldName};
name string:${fieldName}_existing_keywords:list;">
<option value="#" tal:repeat="keyword allowedKeywords"
tal:content="keyword" tal:attributes="value keyword;
selected python:test(context.unicodeTestIn(keyword, value), 'selected', None)">
An existing tag
</option>
</select>
<tal:comment tal:replace="nothing">
These spans are hidden by css, and used by the JavaScript called below.
</tal:comment>
<span id="noTagsSelected" i18n:translate="label_noTagsSelected">No tags currently selected.</span>
<span id="oneOrMoreTagsSelected" i18n:translate="label_oneOrMoreTagsSelected">% tags currently selected.</span>
<tal:comment tal:replace="nothing">
Call js to modify this widget with both a scrollbar and checkboxes.
There may be a better place to put this js call;
examples exist in others' widget.py and js files,
but having it here covers cases where some but not all select elements
call js to be modified.
Todo: The #subject should eventually refer to the template variable.
</tal:comment>
<script type="text/javascript">
jq(document).ready( function() {
jq("#subject").multiSelect();
});
</script>
<input type="hidden"
value=""
tal:condition="not:field/required | nothing"
tal:attributes="name string:${fieldName}_existing_keywords:default:list" />
<tal:loop tal:repeat="keyword allowedKeywords"
tal:condition="python:format=='checkbox'">
<div class="ArchetypesKeywordValue" id=""
tal:attributes="id string:archetypes-value-${fieldName}_${repeat/keyword/number}">
<input class="blurrable"
tal:attributes="
type string:checkbox;
name string:${fieldName}_existing_keywords:list;
id string:${fieldName}_${repeat/keyword/number};
checked python:test(context.unicodeTestIn(keyword, value), 'checked', None);
value keyword" />
<label
tal:content="keyword"
tal:attributes="for string:${fieldName}_${repeat/keyword/number}">
An existing tag
</label>
</div>
</tal:loop>
</dl>
<dl id="selectedTagsSection">
<dt id="selectedTagsHeading" class="formHelp"></dt>
<dd id="selectedTags"></dd>
</dl>
<div class="visualClear"><!-- --></div>
</div>
<!-- <tal:condition condition="python:not widget.roleBasedAdd or (allowRolesToAddKeywords and [role for role in user.getRolesInContext(context) if role in allowRolesToAddKeywords])">-->
<dl id="newTagsSection">
<label for="subject_keywords">
<dt id="newTagsTitle">
<span i18n:translate="label_create_new_tags">
Create and apply new tags.
</span>
</dt>
<span id="newTagsHelp" i18n:translate="label_newTagsHelp" class="formHelp">
Enter one tag per line, multiple words allowed.
</span>
</label>
<br />
<dd id="newTags">
<textarea
id="entered_subjects"
name="subject:lines"
rows="4"
tal:attributes="id string:${fieldName}_keywords;
name string:${fieldName}_keywords:lines;"
tal:define="subject python:[item for item in value if not context.unicodeTestIn(item,allowedKeywords)]"
tal:content="python:'\n'.join(subject)">
A new tag
</textarea>
</dd>
</dl>
<!-- </tal:condition>-->
</tal:define>
</metal:use>
</metal:define>
<div metal:define-macro="search">
<div metal:use-macro="context/widgets/keyword/macros/edit">
</div>
</div>
</body>
</html>

I fear this is a known issue with the old Plone 4 keyword widget.
You should probably change the widget.
An add-on like eea.tags should help.
You can also try the Plone 5 widget from plone.app.widgets, but that is probably a more complex task (and not without side effects).
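Separately from the choice of widget, the edit macro in the question calls context.unicodeTestIn once per keyword, and once more against the full allowedKeywords list for every selected value. Whether that accounts for much of the one-minute render has not been measured here; it is an assumption. A minimal sketch of the alternative in plain Python, building the lookup sets once up front (the function split_keywords and its arguments are hypothetical and not part of Plone's API):

```python
def split_keywords(all_keywords, selected):
    """Flag which keywords are selected and find selected values that are new.

    Both lookups use sets built once, so each membership test is O(1)
    instead of scanning a 20,000-item list per keyword.
    """
    selected_set = set(selected)
    known = set(all_keywords)
    # (keyword, is_selected) pairs, in the original keyword order.
    options = [(kw, kw in selected_set) for kw in all_keywords]
    # Selected values that are not among the existing keywords.
    new_tags = [kw for kw in selected if kw not in known]
    return options, new_tags


if __name__ == "__main__":
    opts, new = split_keywords(["alpha", "beta", "gamma"], ["beta", "zeta"])
    print(opts)
    print(new)
```

In a real site this logic would have to live in a view or helper method that the template calls once, and the values would need to be normalized to one string type first.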

listElement property not behaving as expected

I have created a JSFiddle of my problem here:
http://jsfiddle.net/L7o1nct6/2/
I will also repeat the code here, as Stack Overflow requires.
JavaScript
// using Fine Uploader 5.1.3 at http://keysymmetrics.com/jsfiddle/jquery.fine-uploader.js
$(document).ready(function()
{
$("#fine-uploader").fineUploader({
listElement: $('#listElement'),
debug: true,
template: 'qq-template-bootstrap',
request: {
endpoint: "/my-endpoint"
}
});
});
HTML
<script type="text/template" id="qq-template-bootstrap" class="qq-uploader-selector">
<div class="row">
<div class="col-sm-4" >
<div class="qq-upload-button-selector
qq-upload-drop-area-selector
drag-drop-area" >
<div>Drag and drop files here or click to upload</div>
</div>
</div>
</div>
<div class="qq-upload-list-selector" id="#listElement" >
<div class="panel panel-default" >
<div class="panel-body" >
<div class="qq-progress-bar-container-selector progress">
<div class="qq-progress-bar-selector progress-bar"></div>
</div>
<span class="qq-upload-spinner-selector qq-upload-spinner"></span>
<span class="qq-upload-file-selector qq-upload-file"></span>
<span class="qq-upload-size-selector qq-upload-size"></span>
<span class="qq-upload-status-text-selector qq-upload-status-text"></span>
<img class="qq-thumbnail-selector" qq-max-size="100" />
</div><!-- close panel-body -->
</div><!-- close panel -->
</div>
</script>
<h1>Fine Uploader Test</h1>
<div id="fine-uploader"></div>
When viewing the JSFiddle example, if you open the debug console, you will see the message "Uncaught Error: Could not find the file list container in the template!".
I am unsure what this means. I thought I could use the listElement property to tell Fine Uploader which element to use for this list?
On a side note, if I cut and paste the div with id=listElement and move it adjacent to the div with class=qq-upload-button-selector, then this example works fine.
Any help would be appreciated. I have spent hours on this and haven't found an answer on Stack Overflow either.

A couple of issues with your code:
"#listElement" is not a valid HTML element ID in all browsers.
You are attempting to select an element that does not yet exist in the DOM. It's not clear why you are specifying a list element anyway. Fine Uploader should find the list in the template when it renders.

Google Structured Data Testing Tool doesn't validate GoodRelations extension

<div
itemscope="itemscope"
itemtype="http://schema.org/Product"
itemid="urn:mpn:123456789">
<link
itemprop="additionalType"
href="http://www.productontology.org/id/Lawn_mower">
<span
itemprop="http://purl.org/goodrelations/v1#category"
content="Lawn mower">
Lawn mower
</span>
</div>
Above is a fragment of my markup. When I run it through the Google Structured Data Testing Tool, I receive the error:
'Error: Page contains property "http://purl.org/goodrelations/v1#category" which is not part of the schema.'.
I was thinking about removing the microdata from the span tag and keeping only the link tag above, so that it validates.
On [http://www.productontology.org/doc/Lawn_mower] there is the statement: "Breaking news: schema.org has just implemented our proposal to define an additionalType property with the use of this service in mind!" and I think it means it is compatible.
Can this error impact my SEO? Is there any advice for me? I searched a lot and couldn't find anything related.
The final markup after @daviddeering's help:
<div itemscope="itemscope" itemtype="http://schema.org/Product" itemid="urn:mpn:123456789">
<a href="http://127.0.0.1/jkr/123456789" itemprop="url">
<img itemprop="image" alt="Partnumber:123456789" src="http://127.0.0.1/jkr/img/123456789.jpg" content="http://127.0.0.1/jkr/img/123456789.jpg">
<span itemprop="name">123456789 - Bosh lawn mower</span>
</a>
<span>PartNumber: </span>
<span itemprop="mpn">123456789</span>
<span>Line: </span>
<span itemprop="additionalType" href="http://www.productontology.org/id/Lawn_Mower">Lawn mower</span>
<span>Manuf.: </span>
<div itemscope="itemscope" itemprop="manufacturer"
itemtype="http://schema.org/Organization"><span itemprop="name">Bosh</span>
</div>
<div itemprop="offers" itemscope="itemscope" itemtype="http://schema.org/Offer">
<meta itemprop="availabilityStarts" content="2013-10-20 05:27:36"><span itemprop="priceCurrency" content="USD">USS</span><span itemprop="price" content="565.29">565,29*</span>
<link itemprop="availability" href="http://schema.org/OutOfStock"><span itemprop="inventoryLevel" content="0">Ask for it</span>
</div>
</div>

The Product schema must always include a name, and the structure of your last itemprop line was incorrect. The following code tested fine in Google's testing tool:
<div
itemscope="itemscope"
itemtype="http://schema.org/Product"
itemid="urn:mpn:123456789">
<span itemprop="name">Name of Lawn Mower</span>
<link
itemprop="additionalType"
href="http://www.productontology.org/id/Lawn_mower">
<span rel="gr:hasBusinessFunction" resource="http://purl.org/goodrelations/v1#sell"
content="Lawn mower">
Lawn mower
</span>
</div>
Although in your case, I'm not sure it's necessary to combine the Product schema with the GoodRelations markup. You could create the entire markup using just GoodRelations, or you could use schema.org and simply keep the tag <link itemprop="additionalType" href="http://www.productontology.org/id/Lawn_mower"> where it currently is in the code, then continue using schema.org to mark up the rest.

Rich Snippets : Microdata itemprop out of the itemtype?

I've recently decided to update a website by adding rich snippets (microdata).
I'm new to this kind of thing and have a small question about it.
I'm trying to define the Organization, as you can see from the code below:
<div class="block-content" itemscope itemtype="http://schema.org/Organization">
<p itemprop="name">SOME ORGANIZATION</p>
<p itemprop="address" itemscope itemtype="http://schema.org/PostalAddress">
<span itemprop="streetAddress">Manufacture Street no 4</span>,
<span itemprop="postalCode">4556210</span><br />
<span itemprop="addressLocality">CityVille</span>,
<span itemprop="addressCountry">SnippetsLand</span></p>
<hr>
<p itemprop="telephone">0444 330 226</p>
<hr>
<p>info@snippets.com</p>
</div>
My problem is the following: I'd also like to tag the logo in order to make a complete Organization profile, but the logo is in the header of my page, while the div I've posted above is in the footer, and the style/layout of the page doesn't permit me to add the logo there and also make it visible.
So, how can I solve this thing? What's the best solution?
Thanks.

You can use the itemref attribute.
Give your logo in the header an id and add the corresponding itemprop:
<img src="acme-logo.png" alt="ACME Inc." itemprop="logo" id="logo" />
Now add itemref="logo" to your div in the footer:
<div class="block-content" itemscope itemtype="http://schema.org/Organization" itemref="logo">
…
</div>
If this is not possible in your case, you could "duplicate" the logo so that it’s included in your div, but not visible. Microdata allows meta and link elements in the body for this case. You should use the link element, as http://schema.org/Organization expects a URL for the logo property. (Alternatively, add it via meta as a separate ImageObject).
<div class="block-content" itemscope itemtype="http://schema.org/Organization">
…
<link itemprop="logo" href="logo.png" />
…
</div>
Side note: I don’t think that you are using the hr element correctly in your example. If you simply want to display a horizontal line, you should use CSS (e.g. border-top on the p) instead.

Dan, you could simply add the logo markup with this code:
<img itemprop="logo" src="http://www.example.com/logo.png" />
So in your example, you could simply tag it as:
<div class="block-content" itemscope itemtype="http://schema.org/Organization">
<p itemprop="name">SOME ORGANIZATION</p>
<img itemprop="logo" src="http://www.example.com/logo.png" />
<p itemprop="address" itemscope itemtype="http://schema.org/PostalAddress">
<span itemprop="streetAddress">Manufacture Street no 4</span>,
<span itemprop="postalCode">4556210</span><br />
<span itemprop="addressLocality">CityVille</span>,
<span itemprop="addressCountry">SnippetsLand</span></p>
<hr>
<p itemprop="telephone">0444 330 226</p>
<hr>
<p>info@snippets.com</p>
</div>
I believe that should work for your particular case, and you wouldn't have to mark up the logo separately. Note that an img element is rendered like any other image, so you would have to hide it with CSS if it should not be visible there. Hope that helps.
