Is there an XPath equivalent for LINQ to XML? - linq

I have been using LINQ to XML for a few hours. It seems lovely and powerful for loops and complex selections, but less convenient when I just want to select a single node value, which is something XPath is good at.
I may be missing something obvious here, but is there a way to use XPath and LINQ to XML together without having to parse the document twice?

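You can still use XPath on a LINQ to XML tree through the XPathEvaluate, XPathSelectElement and XPathSelectElements extension methods in the System.Xml.XPath namespace. You can also call CreateNavigator to create an XPathNavigator over the same tree, so the document is parsed only once.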

Related

How to search elements matching an xpath expression in emacs nxml-mode?

Is there a way to interactively search for nodes that match a given XPath expression in Emacs?
I would like something similar to re-forward-search but instead of using a regular expression I'd type an xpath expression.
I don't have an answer with respect to XPath queries; sorry. But you might try the Icicles search keys M-s M-s x and M-s M-s X (commands icicle-search-xml-element and icicle-search-xml-element-text-node).
These let you search the contents and the text() nodes, respectively, of top-level XML elements whose names match a regexp that you provide.
For icicle-search-xml-element, the matching elements can have any of these forms:
<ELEMENTNAME>...</ELEMENTNAME>
<ELEMENTNAME ATTRIBUTE1="..."...>...</ELEMENTNAME>
<ELEMENTNAME/>
<ELEMENTNAME ATTRIBUTE1="...".../>
You can alternatively choose to search, not the search contexts as
defined by the element-name regexp, but the non-contexts, that is, the
buffer text that is outside such elements. To do this, use `C-M-~'
during completion. (This is a toggle, and it affects only future
search commands, not the current one.)
For icicle-search-xml-element-text-node, the top-level matching elements must not have attributes. Only top-level elements of the form <ELEMENTNAME>...</ELEMENTNAME> are
matched.
HTH.
I did something like that a long time ago. I can't give you any details now, but I'll provide an overview of the approach I took.
I created some Emacs functions to interact with (query) a native XML database. I did it with a MarkLogic server once and with a Berkeley DB XML database another time. One of those functions simply queried the database. Another would send an XQuery query that included an Emacs buffer or buffer selection.
The native XML database server would process the query, return the results, and my Emacs functions would render the result in a result buffer.
This approach allowed me to query the XML with XPath and XQuery, which is a much more powerful query language that includes XPath. (I wrote about XQuery a long time ago, here: https://www.ibm.com/developerworks/library/x-xqueryxpath/)
As difficult as all of this might sound, it turned out to be surprisingly easy.

Get the inner XML using XPath?

This is my XML
<my_xml>
<record>
<p>hello <b>world</b> this is some html</p>
</record>
</my_xml>
Can I use XPath to return the following?
<p>hello <b>world</b> this is some html</p>
my_xml/record/child::*
child::* selects all element children of the context node.
The quick answer is, no. You can't accomplish this with XPath, but, once you select the parent node (i.e. "record" in your example), you should be able to manipulate it in whichever language you are using to parse the XML. Unfortunately, it may not be "easy".
It sounds like you would want something like the innerHTML property, but for XML DOM instead of the HTML DOM. Unfortunately, nothing like this exists for the XML DOM. If you don't care about the nodes themselves, you could use the textContent property; in the case of your example, you would get "hello world this is some html", which doesn't seem to be what you want.
Check out this similar question, which includes a parsing algorithm in Java. It seems that you will need to write a similar algorithm in whichever language you're using to parse the XML.
For anyone looking for this in the future: this is very much possible using a dot (.), which returns the entire node content as text (at least in MSSQL XPath it does).
'(/my_xml/record/.)[1]'
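The "select the parent node, then serialize its children in the host language" approach described above can be sketched as follows. This is only an illustration in Python's standard library (the question does not say which language is in use), and the helper name inner_xml is made up for this example.

```python
import xml.etree.ElementTree as ET

XML = """<my_xml>
<record>
<p>hello <b>world</b> this is some html</p>
</record>
</my_xml>"""

def inner_xml(element):
    """Serialize everything inside `element` (hypothetical helper, not a library API)."""
    parts = [element.text or ""]
    for child in element:
        # ET.tostring() also emits the child's tail text, so mixed content is kept.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts).strip()

root = ET.fromstring(XML)
record = root.find("./record")   # the path expression only selects the parent node
print(inner_xml(record))
```

The path expression does the selection; turning the selected nodes back into markup is left to the serializer of whatever XML library is being used.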

Nokogiri: ids Vs hierarchy xpath performance

I have to write the XML schema for a dataset that is hierarchically organized. It will be parsed by Nokogiri for information retrieval. My question is: from a performance point of view, is it better to respect the hierarchy or to flatten it?
E.g.
<item_1 id="id_1">
<item_2 id="id_2">value</item_2>
</item_1>
or
<item id_1="id_1" id_2="id_2">value</item>
I know that multiple attributes should be avoided as far as readability and maintainability are concerned, but performance is my priority.
If you want the absolute fastest performance and the documents are large, you probably don't want to use XPath at all. A SAX (or Reader) filter will be the fastest.
But if you are going to have Nokogiri parse the document and create a DOM for XPath, I don't think it will make much difference whether you query using:
doc.xpath('/item_1[@id=x]/item_2[@id=y]') #first case
or
doc.xpath('/item[@id_1=x and @id_2=y]') #second case
Of course, benchmarking these two solutions against your real data is the only way to know for sure.

Xpath best practices

I have a readonly xml file and I have a set of xpath values.
I need to create a function which would take in the xpath and return the value(s) corresponding to the xpath.
I am a little confused regarding what would be the best way to proceed. The options I am thinking are using the regular XPathDocument/Navigator/Iterator classes or using LINQ to xml.
The function I am trying to implement is:
T GetString(string inputXpath) where T could be bool/string/array etc.
Can someone help? Also, this function is going to be called all across the application, so performance might be a consideration.
Thank you!
-Agent
What you want to write will just return:
XPathNavigator.Evaluate(inputXpath);
Obviously, T must be just... object :)
Read the XPathNavigator.Evaluate() documentation here.
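Since performance was mentioned, the general shape of "parse the read-only file once, then evaluate many paths against it" can be sketched as below. This is a rough analogue in Python's standard library, not the .NET API: the class XmlLookup, the method get_values and the sample document are all invented for illustration, and ElementTree supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; a real application would load its read-only file.
SAMPLE = """<config>
  <name>demo</name>
  <flags><flag>a</flag><flag>b</flag></flags>
</config>"""

class XmlLookup:
    """Parse the document once and answer many path queries against it."""

    def __init__(self, xml_text):
        self._root = ET.fromstring(xml_text)  # parsed a single time

    def get_values(self, path):
        """Return the text of every element matching `path` (possibly an empty list)."""
        return [el.text for el in self._root.findall(path)]

lookup = XmlLookup(SAMPLE)
print(lookup.get_values("./name"))
print(lookup.get_values("./flags/flag"))
```

Returning a list in every case plays the same role as returning object in the answer above: the caller decides how to interpret the result.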

Is it possible to alter a php file using XPath?

I'm unsure about this. Would having PHP (or, I guess, any template language such as Django's or Mako) inside an HTML file prevent me from making changes to it with XPath?
I'm very new to XPath. I would think that you could not, but as I said, I'm unsure.
XPath is a query language. You use it to query XML content, not to change it.
You can use XPath in conjunction with other technologies (XSLT is the first one that comes to mind) to query your XML and then use the results of those queries to transform it.
XPath doesn't change the XML document.
Use XSLT or any other XPath-hosting language that can produce a new XML document.
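The division of labour described in these answers (the path expression locates a node, the host language changes it) might look like this in practice. The sketch uses Python's standard library purely as an example host language, and the input document is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical input; any well-formed XML would do.
root = ET.fromstring("<page><title>Old</title><body>text</body></page>")

# Step 1: the path expression only locates the node.
title = root.find("./title")

# Step 2: the host language (Python here) performs the change.
title.text = "New"

result = ET.tostring(root, encoding="unicode")
print(result)
```

Note that this only works on input that parses as well-formed XML in the first place.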
