XPath: navigating documents in a collection with sibling axes

I have a collection of 1000s of TEI documents in the variable $data (note: the top-most node in each document is tei:TEI).
Within XPath (under XQuery) I can output all the @type="example" documents with:
let $coll := $data/tei:TEI[@type="example"]
return $coll
I can then further extract one document from the above result with:
let $coll := $data/tei:TEI[@type="example"]
return $coll[@xml:id="TC0005"]
The above work fine.
Now, I would like to get the documents before and after a certain document, which I assume could be done with preceding-sibling / following-sibling:
let $coll := $data/tei:TEI[@type="example"]
return ($coll[@xml:id="TC0005"]/preceding-sibling[1],
$coll[@xml:id="TC0005"],
$coll[@xml:id="TC0005"]/following-sibling[1])
However the above only returns the document for $coll[#xml:id="TC0005"].
Is this syntax correct for navigating document to document within the collection?
Many thanks.

In a collection or sequence of document nodes there are no siblings; siblings exist only within each document tree. So I think you simply want positional access, in the form of
let $examples := $data/tei:TEI[@type="example"]
for $example at $pos in $examples
where $example/@xml:id = 'TC0005'
return (
$examples[$pos - 1],
$example,
$examples[$pos + 1]
)
That is based on my understanding of XQuery and XPath sequences, I hope it applies to your collection in an XQuery database as well.
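The same positional idea can be sketched outside XQuery. The following is a minimal Python illustration, assuming the documents sit in an ordinary list and each carries an id; the `neighbours` helper and the sample ids are hypothetical and not part of the question. One difference worth noting: in XQuery, `$examples[$pos - 1]` yields the empty sequence when `$pos` is 1, whereas a Python index of -1 wraps around to the last item, so the lower bound is clamped explicitly here.

```python
def neighbours(docs, target_id):
    """Return the matching doc together with the docs just before and after it.

    A neighbour that does not exist (first or last position) is simply omitted.
    """
    for pos, doc in enumerate(docs):
        if doc["id"] == target_id:
            start = max(pos - 1, 0)      # clamp so that -1 does not wrap around
            return docs[start:pos + 2]   # slicing past the end is safe in Python
    return []                            # no document with that id


# Hypothetical stand-ins for the TEI documents in the collection.
examples = [{"id": "TC0004"}, {"id": "TC0005"}, {"id": "TC0006"}]
result = neighbours(examples, "TC0005")
```

The order of the list plays the role of the sequence order in `$examples`, so the result depends on how the collection was ordered in the first place.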

Related

Is it possible to get data contained in another document by id, when map function is running for some document in couchbase view?

I have two kinds of documents in my couchbase bucket with keys like -
product.id.1.main
product.id.2.main
product.id.3.main
and
product.id.1.extended
product.id.2.extended
product.id.3.extended
I want to write a view for documents of first kind, such that when some conditions are matched for a document, I can emit the attributes contained in the documents of first kind as well as the document of second kind.
Something like -
function(doc, meta){
if ((meta.id).match("product.id.*.main") && doc.attribute1.match("value1")) {
var extendedDocId = replaceMainWithExtended(meta.id);
emit(meta.id, doc.attribute1 + getExtendedDoc(extendedDocId).extendedAttribute1 );
}
}
I want to know how to implement this kind of function in couchbase views -
getExtendedDoc(extendedDocId).extendedAttribute1

Is it possible to filter the descendant elements returned from an XPath query?

At the moment, I'm trying to scrape forms from some sites using the following query:
select * from html
where url="http://somedomain.com"
and xpath="//form[@action]"
This returns a result like so:
{
form: {
action: "/some/submit",
id: "someId",
div: {
input: [
... some input elements here
]
},
fieldset: {
div: {
input: [
... some more input elements here
]
}
}
}
}
On some sites this could go many levels deep, so I'm not sure how to begin trying to filter out the unwanted elements in the result. If I could filter them out here, then it would make my back-end code much simpler. Basically, I'd just like the form and any label, input, select (and option) and textarea descendants.
Here's an XPath query I tried, but I realised that the element hierarchy would not be maintained and this might cause a problem if there are multiple forms on the page:
//form[@action]/descendant-or-self::*[self::form or self::input or self::select or self::textarea or self::label]
However, I did notice that the elements returned by this query were no longer returned under divs and other elements beneath the form.
I don't think it will be possible in a plain query as you have tried.
However, it would not be too much work to create a new data table containing some JavaScript that does the filtering you're looking for.
Data table
A quick, little <execute> block might look something like the following.
var elements = y.query("select * from html where url=@u and xpath=@x", {u: url, x: xpath}).results.elements();
var results = <url url={url}></url>;
for each (element in elements) {
var result = element.copy();
result.setChildren("");
result.normalize();
for each (descendant in y.xpath(element, filter)) {
result.node += descendant;
}
results.node += result;
}
response.object = results;
Example query
use "store://VNZVLxovxTLeqYRH6yQQtc" as example;
select * from example where url="http://www.yahoo.com"
Hopefully the above is a step in the right direction, and doesn't look too daunting.
Links
Open Data Tables Reference
Executing JavaScript in Open Data Tables
YQL Editor
This is how I would filter specific nodes but still allow the parent tag with all attributes to show:
//form[@name]/@* | //form[@action]/descendant-or-self::node()[name()='input' or name()='select' or name()='textarea' or name()='label']
If there are multiple form tags on the page, they should be grouped off by this parent tag and not all wedged together and unidentifiable.
You could also reverse the union if it would help how you'd like the nodes to appear:
//form[@action]/descendant-or-self::node()[name()='input' or name()='select' or name()='textarea' or name()='label'] | //form[@name]/@*

Can I use linq to join two result sets on an ordinal/ index #?

I'm trying to use LINQ to Objects with the Html Agility Pack to join two result sets on their relative ordinal position. One set is a list of headers, the other is a set of tables, with each table corresponding to one of the header values. Each set has a count of five. I've read the post here, which looks very similar, but I can't get it to translate to my purposes.
Here is what I'm using to get the two html node collections:
HtmlNodeCollection ratingsChgsHdrs = htmlDoc.DocumentNode.SelectNodes("//div[@id='calendar-header']");
HtmlNodeCollection ratingsChgsTbls = htmlDoc.DocumentNode.SelectNodes("//table[@class='calendar-table']");
The collection ratingsChgsHdrs contains the headers for each of the tables in ratingsChgsTbls, within the InnerText property. The end result I'm looking for is one result set consisting of all of the rows from all five tables, with the header value added as a property to each row. I hope that is clear. Any help would be great.
This might work:
// Anonymous-type members built from a method call need explicit names.
ratingsChgsHdrs.Select((x, i) => new { Header = x, Table = ratingsChgsTbls.ElementAt(i) });
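To make the end result described in the question concrete (every row of every table, tagged with its header), here is a small Python sketch of the pairing-by-position idea. The header texts and rows are made up, and `rows_with_headers` is a hypothetical helper; in C# the corresponding building blocks would be `Enumerable.Zip` or the indexed `Select` above, followed by `SelectMany` to flatten.

```python
# Made-up stand-ins for the InnerText of each header and the rows of each table.
headers = ["Upgrades", "Downgrades"]
tables = [["row A1", "row A2"], ["row B1"]]


def rows_with_headers(headers, tables):
    """Pair the i-th header with the i-th table, then flatten to one entry per row."""
    return [
        {"header": header, "row": row}
        for header, table in zip(headers, tables)  # join on ordinal position
        for row in table                           # one result per table row
    ]


flat = rows_with_headers(headers, tables)
```

Note that `zip` stops at the shorter input, so the pairing is only meaningful if both collections have the same count, as they do in the question.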

Querying M:M relationships using Entity Framework

How would I modify the following code:
var result = from p in Cache.Model.Products
from f in p.Flavours
where f.FlavourID == "012541-5-5-5-651"
select p;
So that f.FlavourID is supplied a range of IDs, as opposed to just one value as shown in the above example?
Given the following ERD Model:
Products* => ProdCombinations <= *Flavours
ProdCombinations is a junction/link table and simply has one composite key in there.
Off the top of my head:
string[] ids = new[] { "012541-5-5-5-651", "012541-5-5-5-652", "012541-5-5-5-653" };
var result = from p in Cache.Model.Products
from f in p.Flavours
where ids.Contains(f.FlavourID)
select p;
There are some limitations, but an array of IDs has worked for me before. I have only tried this with a SQL Server backend, and my IDs were integers.
As I understand it, LINQ has to translate your query into SQL, and that is possible only in some cases. For example, it is not possible with an IEnumerable<SomeClass>, which produces a runtime error, but it is possible with a collection of simple types.
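One detail of the query shape may be worth checking: the nested `from p ... from f ... where` form yields a product once per matching flavour, so a product with two matching flavours appears twice. The Python sketch below illustrates this with in-memory data only; it says nothing about how Entity Framework translates the query, and the product names and the second flavour list are invented. In LINQ, a `Distinct()` call or an `Any` condition on `p.Flavours` would be the usual counterparts to the second form.

```python
# Invented in-memory data: each product lists the ids of its flavours.
products = [
    {"name": "P1", "flavours": ["012541-5-5-5-651", "012541-5-5-5-652"]},
    {"name": "P2", "flavours": ["000000-0-0-0-000"]},
]
ids = {"012541-5-5-5-651", "012541-5-5-5-652"}

# Mirrors the nested "from p ... from f ... where" shape:
# one result per matching flavour, so P1 is listed twice.
per_match = [p["name"] for p in products for f in p["flavours"] if f in ids]

# One result per product that has at least one matching flavour.
per_product = [p["name"] for p in products if any(f in ids for f in p["flavours"])]
```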

Rearranging active record elements in Yii

I am using a CDbCriteria with its own condition, with, and order clauses. However, the order I want to give to the elements in the array is far too complex to specify in the order clause.
The solution I have in mind consists of obtaining the active records with the defined criteria like this
$theModelsINeed = MyModel::model()->findAll($criteria);
and then rearranging the order in my PHP code. How can I do this? I know how to iterate through the elements, but I don't know whether it is possible to actually change their order.
I have been looking into this link about populating active records, but it seems quite complicated and maybe someone could have some better advice.
Thanks
There is nothing special about Yii's active records. The find family of methods will return an array of objects, and you can sort this array like any other array in PHP.
If you have complex sort criteria, this means that probably the best tool for this is usort. Since you will be dealing with objects, your user-defined comparison functions will look something like this:
function compare($x, $y)
{
// First sort criterion: $obj->Name
if ($x->Name != $y->Name) {
return $x->Name < $y->Name ? -1 : 1; // this is an ascending sort
}
// Second sort criterion: $obj->Age
if ($x->Age != $y->Age) {
return $x->Age < $y->Age ? 1 : -1; // this is a descending sort
}
// Add more criteria here
return 0; // if we get this far, the items are equal
}
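In PHP the comparator above would be applied with `usort($theModelsINeed, 'compare');`, which sorts the array in place. To show the resulting order on concrete data, here is the same two-level comparison as a Python sketch; the `Person` class and the sample values are hypothetical, and `functools.cmp_to_key` stands in for `usort`'s callback mechanism.

```python
from functools import cmp_to_key


class Person:
    """Hypothetical stand-in for an active record with Name and Age attributes."""

    def __init__(self, name, age):
        self.name = name
        self.age = age


def compare(x, y):
    # First sort criterion: name, ascending.
    if x.name != y.name:
        return -1 if x.name < y.name else 1
    # Second sort criterion: age, descending.
    if x.age != y.age:
        return 1 if x.age < y.age else -1
    return 0  # the items are equal under both criteria


people = [Person("Bob", 30), Person("Alice", 25), Person("Alice", 40)]
ordered = sorted(people, key=cmp_to_key(compare))
```

Both Alices come before Bob, and among them the older one comes first, which is the ascending-then-descending order the comparator encodes.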
If you do want to get an array as a result, you can use this method for fetching data that supports dbCriteria:
$model = MyModel::model()->myScope();
$model->dbCriteria->condition .= " AND date BETWEEN :d1 AND :d2";
$model->dbCriteria->order = 'field1 ASC, field2 DESC';
$model->dbCriteria->params = array(':d1'=>$d1, ':d2'=>$d2);
$theModelsINeed = $model->getCommandBuilder()
->createFindCommand($model->tableSchema, $model->dbCriteria)
->queryAll();
The above example shows using a defined scope and modifying the condition with named parameters.
If you don't need Active Record, you could also look into Query Builder, but the above method has worked pretty well for me when I want to use AR but need an array for my result.
