How do I add a comment to an XPath?

For example, I have an XPath and wish to add a comment next to it to identify it:
/html/body/div/table/tr/td/a{this is a link}

XPath 2.0 does allow comments.
From http://www.w3.org/TR/xpath20/#comments:
Comments may be used to provide informative annotation for an
expression. Comments are lexical constructs only, and do not affect
expression processing.
Comments are strings, delimited by the symbols (: and :). Comments may
be nested.
A comment may be used anywhere ignorable whitespace is allowed (see
A.2.4.1 Default Whitespace Handling).
The following is an example of a comment:
(: Houston, we have a problem :)
Bad news if we ever need to parse XML containing emoticons! :-)
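If the engine at hand only speaks XPath 1.0, one option is to keep the (: ... :) annotations in the source string and strip them before evaluation. The following is a minimal sketch, not part of any XPath library: the function name strip_xpath_comments is made up for illustration, and it assumes string literals are delimited by plain single or double quotes.

```python
def strip_xpath_comments(expr: str) -> str:
    """Remove XPath 2.0 style (: ... :) comments, including nested ones.

    Hypothetical helper for illustration; text inside quoted string
    literals is left untouched.
    """
    out = []
    depth = 0      # current comment nesting level
    quote = None   # delimiter of the string literal we are inside, if any
    i = 0
    while i < len(expr):
        ch = expr[i]
        two = expr[i:i + 2]
        if depth == 0 and quote:
            # Inside a string literal: copy verbatim until the closing quote.
            out.append(ch)
            if ch == quote:
                quote = None
            i += 1
        elif two == "(:":
            depth += 1
            i += 2
        elif two == ":)" and depth > 0:
            depth -= 1
            i += 2
        elif depth > 0:
            # Inside a comment: drop the character.
            i += 1
        else:
            if ch in "'\"":
                quote = ch
            out.append(ch)
            i += 1
    return "".join(out)


commented = "/html/body/div/table/tr/td/a (: this is a link :)"
plain = strip_xpath_comments(commented).strip()
```

The stripped string can then be handed to whatever XPath 1.0 evaluator the host language provides.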
As an aside: I was looking for this info while working with TIBCO Designer for BusinessWorks v5.x, where comments can be added within the TIBCO Designer XPath formula builder using:
{-- Houston, we've had a problem --}

This is not a comment syntax, but you can use string literals as predicates. A non-empty string literal evaluates to true, so it should not change the outcome of the expression. I don't know whether this has a significant performance cost.
/html/body/div/table["this is"]["a table"]/tr/td/a["this is a link"]
But like mjv said, I also would stick to the syntax of the host language.

2019 edit
As pointed out in @Sepster's reply and elsewhere, starting with XPath 2.0, comments became possible with their cute "smiley face"-looking syntax. I'm only about 10 years late in editing this reply to mention this very useful fact ;-)
Original reply c. 2009 (assumed XPATH 1.0)
No, the XPath syntax does not allow comments to be embedded within the path string.
This is typically not a significant limitation because paths are usually short and a comment can be placed nearby, in the particular syntax of the host language (XSLT, C#, whatever...)
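As a rough illustration of commenting in the host language, here is a small Python sketch using the standard library's xml.etree.ElementTree, which supports only a limited XPath subset. The sample document and the name LINK_PATH are invented for the example.

```python
import xml.etree.ElementTree as ET

# Made-up sample document, shaped like the path in the question.
doc = ET.fromstring(
    "<html><body><div><table><tr><td>"
    "<a href='x'>link</a>"
    "</td></tr></table></div></body></html>"
)

# this is a link  (the comment lives in Python, not in the path itself)
LINK_PATH = "./body/div/table/tr/td/a"

links = doc.findall(LINK_PATH)
```

Giving the path a descriptive constant name often does the job of the comment on its own.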


What is spifno1stsp really doing as a rsyslog property?

I was reading the template documentation of rsyslog to find better properties and I stumbled upon this one:
spifno1stsp - expert options for RFC3164 template processing
However, as you can see, the documentation is quite vague. Moreover, I have not been able to find a longer explanation anywhere. The only mentions found with Google are always about the same snippet or the same very short description.
Indeed, there is no explanation of this property:
on the entire rsyslog.com website,
or in the RFC3164,
or anywhere else actually.
It seems everybody copies and pastes the same snippet here and there, but it is very difficult to understand what it is actually doing.
Any ideas?
Think of it as somewhat like an if statement. If a space is present, don't do anything. Otherwise, if a space is not present, add a space.
It is useful for ensuring that just one space is added to the output, often between two strings.
For any cases like this that you find where the docs can be improved please feel free to open an issue with a request for clarification in the official GitHub rsyslog documentation project. The documentation team is understaffed, but team members will assist where they can.
If you're looking for general help, the rsyslog-users mailing list is also a good resource. I've learned a lot over the years by going over the archives and reading prior threads.
Back to your question about the spifno1stsp option:
While you will get a few hits on that option, you'll probably find more results by searching for the older string template option, sp-if-no-1st-sp. Here is an example of its use from the documentation page you linked to:
template(name="forwardFormat" type="string"
string="<%PRI%>%TIMESTAMP:::date-rfc3339% %HOSTNAME% %syslogtag:1:32%%msg:::sp-if-no-1st-sp%%msg%"
)
Here is the specific portion that is relevant here:
`%msg:::sp-if-no-1st-sp%%msg%`
From the Property Replacer documentation:
sp-if-no-1st-sp
This option looks scary and should probably not be used by a user. For
any field given, it returns either a single space character or no
character at all. Field content is never returned. A space is returned
if (and only if) the first character of the field’s content is NOT a
space. This option is kind of a hack to solve a problem rooted in RFC
3164: 3164 specifies no delimiter between the syslog tag sequence and
the actual message text. Almost all implementation in fact delimit the
two by a space. As of RFC 3164, this space is part of the message text
itself. This leads to a problem when building the message (e.g. when
writing to disk or forwarding). Should a delimiting space be included
if the message does not start with one? If not, the tag is immediately
followed by another non-space character, which can lead some log
parsers to misinterpret what is the tag and what the message. The
problem finally surfaced when the klog module was restructured and the
tag correctly written. It exists with other message sources, too. The
solution was the introduction of this special property replacer
option. Now, the default template can contain a conditional space,
which exists only if the message does not start with one. While this
does not solve all issues, it should work good enough in the far
majority of all cases. If you read this text and have no idea of what
it is talking about - relax: this is a good indication you will never
need this option. Simply forget about it ;)
In short, sp-if-no-1st-sp (the string template option) is the counterpart of spifno1stsp (the property parameter used in list templates).
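The rule quoted above can be modelled in a few lines of Python. This is only a sketch of the documented behaviour, not rsyslog code; the function names are invented, and returning a space for an empty field is an assumption, since the documentation does not spell out that case.

```python
def sp_if_no_1st_sp(field: str) -> str:
    """Return a single space unless the field already starts with one.

    The field content itself is never returned. Treating an empty field
    like "does not start with a space" is an assumption.
    """
    return "" if field.startswith(" ") else " "


def render(tag: str, msg: str) -> str:
    # Mirrors the template fragment %syslogtag%%msg:::sp-if-no-1st-sp%%msg%
    return tag + sp_if_no_1st_sp(msg) + msg


with_space = render("sshd[123]:", " session opened")
without_space = render("sshd[123]:", "session opened")
```

Both calls yield the same line with exactly one space between tag and message, which is the point of the option.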
Hope that helps.

Find but skip strings and comments?

One thing that constantly annoys me about VS is that when I do a Find or Find all, it looks in comments, strings, and other places. When I'm trying to find a particular bit of code, like and rent, it finds it all over. Is there a way to limit searches just to code?
Not sure if there is a specific setting to ignore comments, but you could do a regex find. For example, assuming you want to find "text", you could use this:
^(?!\s*?//).*?text
Caveats:
Assumes comments start with // as first non-whitespace characters. E.g. C# comment types
Doesn't work for comments at the end of code lines (only comments on their own lines)
Doesn't work with block comments, for example /* comment */
So overall it isn't perfect by any means, but depending how many hits you are getting, it might help to cut them down which can be useful if you have a lot of false positives in one-liner comments
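To see the caveats concretely, the same pattern can be tried outside the IDE. The sketch below runs it with Python's re module against a few made-up lines; Visual Studio uses the .NET regex engine, but this particular pattern uses only constructs both engines share.

```python
import re

# Same pattern as above: skip lines whose first non-whitespace text is //
pattern = re.compile(r"^(?!\s*?//).*?text")

lines = [
    "var text = 1;",           # code: should match
    "// text in a comment",    # whole-line comment: skipped
    "    // indented text",    # indented whole-line comment: skipped
    "x = 1; // trailing text", # end-of-line comment: still matches (caveat)
    "/* block text */",        # block comment: still matches (caveat)
]

hits = [line for line in lines if pattern.search(line)]
```

Only the two whole-line // comments are filtered out; the trailing and block comments still count as hits, as the caveats warn.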
The 'Find All References' function may suit you : it ignores all commented-out code and text in strings. CTRL+K, R is the keyboard shortcut.
(Note that it's designed for going from a specific instance of a search string to all other instances, so if you haven't already found an instance of what you're searching for, you would have to (temporarily) type one into the editor window, then search. Also, it's not available for all languages: I know it works fine for C#, though.)

Search using Xapian Omega - with Wild Cards or Regular Expressions

We are comparing different search engines for our research
archives and having browsed the Xapian-Omega documentation, we
decided to try it out since the Omega option appears to be an
appropriate solution with several interesting search options.
We installed Xapian-Omega on a Linux Server (Deb 7) and tested
the setup with success. However we are unsure as to how one can
employ or perhaps even enable the use of Wild Cards or Regular
Expressions with Xapian-Omega.
We read that for Xapian one has to enable the wildcard option
via the "QueryParser flags".
Could someone clarify this, i.e. explain it or point to a page
with an example or two?
But we did not see much information regarding examples with Omega
CGI and although this latter runs well, wild card options
(such as * for the general wild card and ? as a single character),
do not seem to work as expected by default and they would be
useful, even though stemming and substrings etc may be functional.
Eg: It would be interesting to be able to employ standard simple
wild char searches with a certain precision such as :
medic* for medicine medical medicament
or with ? for single characters
Can Regexp be recognised with Omega ?
eg : sep[ae]r[ae]te(\w+)?
or searching for structured formats such as Email or Credit Card
Numbers or certain formula types in research papers etc.
In a note from Olly Betts long ago (Dev Mailing List) regarding
this one suggestion was to grep the index file but this would
defeat the RAD advantage of Omega.
Any examples of searches using Omega with Wild Cards or Regular
Expressions would be most appreciated ... even an indication of
a page where information regarding this theme is well presented
with examples illustrating how to develop advanced searches
using Xapian alone would be most welcome (PHP or Python perhaps).
(We are not concerned for the moment about the eventual
substantial increase in the size of the index size or in the
time to index the archive)
You can enable right-wildcards (such as "medic*") in Omega using $set{flag_wildcard,1} (covered in the Omegascript documentation), which enables FLAG_WILDCARD. There's a section in the user manual on using wildcards.
Xapian doesn't provide support for regular expression searching, although in theory I believe it would be possible to support, if potentially costly (depending on the regex). It would have to run the regular expression against unstemmed terms in the database, and then feed them into the search. Where it becomes difficult is if the regex expands to a lot of terms (eg just 'a' as a regex). There's also some subtlety in making it efficient; it's easy to jump through the term list to something with a constant prefix, and you'd want to take advantage of that if possible.
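To make the idea of expanding a pattern against the term list more concrete, here is a rough Python sketch. It does not use the Xapian API at all: the sorted list stands in for a database's unstemmed term list, and expand_prefix and expand_regex are hypothetical names. It shows why a constant prefix helps (a binary search jumps straight to the right region) and how a regex could then filter the candidates.

```python
import bisect
import re


def expand_prefix(sorted_terms, prefix):
    """Return all terms starting with prefix, seeking via binary search."""
    start = bisect.bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # sorted order: no later term can match
        matches.append(term)
    return matches


def expand_regex(sorted_terms, literal_prefix, pattern):
    """Filter the terms sharing literal_prefix by a full regex match."""
    rx = re.compile(pattern)
    candidates = expand_prefix(sorted_terms, literal_prefix)
    return [t for t in candidates if rx.fullmatch(t)]


# Made-up term list standing in for an index.
terms = sorted([
    "median", "medic", "medical", "medicament", "medicine",
    "separate", "separated", "seperate", "sepia",
])

wildcard_terms = expand_prefix(terms, "medic")
regex_terms = expand_regex(terms, "sep", r"sep[ae]r[ae]te(\w+)?")
```

A regex with no literal prefix (such as just 'a') would force a scan of every term, which is where the cost concern comes from.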
For your example of sep[ae]r[ae]te(\w+)?, it sounds like you actually want a combination of spelling correction (for the a-e substitutions, which you can enable using $set{flag_spelling_correction,1}) and stemming (for the trailing letters after 'te'; Omega defaults to English stemming, but that can be changed), or either wildcard or partial match support.
If you do need regular expressions for your use case, then I'd suggest bringing it up on the xapian-discuss mailing list. Xapian has moved on since the last discussion, and I believe it would be easier to build such support now than it was then.
James Ayatt: Thank you for your answer and help, my apologies for this belated reply, a distraction with other work.
We had already seen the Omegascript page but it was not clear to us how to employ these options with the CGI interface. Also the use of * seems to be for trailing chars, is that correct ? ie not for internal groups of words eg: omeg*ipt; there are cases where the stemming option would not be sufficient. We did not see an option for single wild chars, sometimes represented by ? in certain search engines. Could you comment here ?
Regarding the use of regular expressions, we had imagined that it might not be quite as simple as one could hope. The examples mentioned in the preceding post were of course simple possible uses; there are many more. Your comment on using the stemming option seems appropriate.
In certain cases it could be interesting to enable some type of regexp option for the extraction of text forms, such as those mentioned. The quick extraction of such text, perhaps together with some surrounding text, could be very useful.
We will certainly try your proposal with the mailing list.
Thank you again.

Recommendation on using abbreviations in CamelCase from Code Complete

In the latest code review I was asked why I changed the method name from GetHDRFrame to GetHdrFrame, given that HDR is an abbreviation. I'm pretty sure there was such a recommendation in Code Complete: when using abbreviations in CamelCase names, treat them as regular words. But I cannot find the place where it is written. Could somebody give me the exact phrase in Code Complete where it is stated?
There is a similar question with useful links to the MS rules, but I'm looking for a Code Complete quote.
As far as I've been able to determine, there is no such advice in Code Complete. But it does say:
People have managed to have zealous, blistering debates over fine
points such as whether the first character in a name should be
capitalized (TotalPoints vs. totalPoints), but as long as you and
your team are consistent, it won't make much difference.
And that may help you avoid such nitpicking in future code reviews. ;-)

Freemarker ".vars" names can't contain dashes?

We're using Freemarker version 2.3.16, and I've just tracked down a weird bug in one of our apps. It came down to there now being hyphens in some of our product code strings. The codes are used to pull hashes of localized text from the global scope using .vars.
Reducing the issue brought me to an example that anyone can try:
${.vars["foo-bar"]} in a template outputs 0
${.vars["foo+bar"]} outputs nullnull
${.vars["foobar"]} correctly triggers an InvalidReferenceException
All three should trigger exceptions. Instead, it appears the .vars parameter string is being evaluated! :-(
http://freemarker.sourceforge.net/docs/app_faq.html#faq_strange_variable_name implies this should work.
I saw mention of a similar issue a few weeks ago on the Freemarker mailing list, and it was suggested to prefix the parameter string with "#". That might work with other hashes, but it does NOT work with .vars. I just took a working example (.vars["resources_title"]) and changing it made it throw an InvalidReferenceException (.vars["#resources_title"]). I also tried it on the hyphenated reference, and it also threw the exception.
Upgrading to 2.3.18 did not seem to make a difference.
Sorry for the delay. After some good mailing-list help on places to put breakpoints, here's what I wrote back to the list on June 10th:
Short story: It's not a Freemarker issue. Rather the Struts team chose to hard-wire Freemarker to treat .vars names as OGNL expressions, and there seems no way to tell OGNL to not parse them. So under Struts, "-" and "+" (and possibly other characters) cannot appear in .vars names.
Long story...
freemarker.core.BuiltinVariable (line 192) is where Freemarker starts to process .vars expressions
freemarker.core.Environment (line 1088) hands control over to the "rootDataModel" which the Struts team hard-wired to be an instance of org.apache.struts2.views.freemarker.ScopesHashModel
line 70 of that class (using version 2.1.8.1 of Struts) calls "stack.findValue"; "stack" has been wired to be an instance of com.opensymphony.xwork2.ognl.OgnlValueStack
at line 236 this class in turn asks an instance of OgnlUtil to find the object, and that's where the name is assumed to be an OGNL expression and is parsed, turning "foo-bar" into ( foo - bar )
At no point along the way does there seem to be a choice to NOT treat the .vars name as an expression (a comment in FreemarkerResult hints at the possibility, but the code doesn't follow through). In theory I could have my implementation of FreemarkerManager create a variant of ScopesHashModel, but that would take a lot of work to change all the associated classes with it.
(Nor does there seem to be a way to escape "-" characters in OGNL expressions. Seems there was discussion 5-6 years ago to do this, but.... .vars( "foo\\-bar" ) fails on finding "-" after "\", so presumably "-" isn't escapable?)
:-(
I'm not clear what the use-case is for treating .vars names as expressions... but I don't think Struts is going to change, now. Rather than override a half dozen Struts classes, I instead changed the code that loads our ResourceBundles into the value stack: it now changes the names to replace "-" with "_", and likewise my .vars names are changed the same way in the template and... tada. It works. Woo.
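The renaming workaround is simple enough to sketch. The actual change was made in Java code that is not shown here, so the following Python is only an illustration of the idea; normalize_key, load_bundles and the sample product codes are all invented.

```python
def normalize_key(name: str) -> str:
    """Replace hyphens so the name is not parsed as a subtraction."""
    return name.replace("-", "_")


def load_bundles(bundles: dict) -> dict:
    """Return a copy of the bundles keyed by expression-safe names."""
    return {normalize_key(key): value for key, value in bundles.items()}


# Made-up product codes mapping to localized text.
raw = {"foo-bar": {"title": "Foo Bar"}, "baz": {"title": "Baz"}}
safe = load_bundles(raw)

# The template side applies the same rule before the lookup.
lookup = safe[normalize_key("foo-bar")]
```

One thing to watch for with this approach is a collision between names such as foo-bar and foo_bar, which would map to the same key.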
Works for me. And like already mentioned on the freemarker-user mailing list: maybe you use a strange data model, or even a fancy ObjectWrapper. But a discussion like this is probably better suited for the freemarker-user mailing list...
It works if the hyphen is escaped with a single backslash: foo\-bar.
Since FreeMarker version 2.3.22 it is possible to use a dot (.), minus sign (-) or colon (:) in a variable name (details here).
In my case, with FreeMarker 2.3.21 it fails when I use variables like:
api["x-link"]
If I change freemarker to version 2.3.22 it works.
