How to remove whitespace in Telegram Instant View API? - xpath

I am trying to set up a Telegram Instant View for a website.
This website has a lot of empty tags that I don't know how to remove.
<p style="text-align: justify;">&nbsp;</p>
I want to find a way to get rid of these kinds of tags.

If you just want to remove the non-breaking spaces, use the @replace function:
@replace("\u00a0", ""): //nodes/text()
If you want to remove the tags that contain only spaces, call @remove and pass a predicate that selects the nodes that are empty after space normalization.
@remove: //nodes[not( normalize-space(text()) )]

I found a solution:
First replace all "&nbsp;" content with a placeholder string such as "null-tag", then remove those elements using this code:
@replace("\u00a0", "null-tag"): //p
@remove: //p[.="null-tag"]
Note that this might alter some legitimate content, so check a few pages to make sure it works for you.
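The replace-then-remove idea from the answers above can be tried outside Instant View with a short Python sketch. It also hints at why the replace step matters: XPath 1.0's normalize-space() only treats space, tab, carriage return and line feed as whitespace, so a non-breaking space survives it. The sample markup and the helper names (is_blank, remove_blank_paragraphs) are assumptions made up for illustration. This is not Instant View rule syntax.

```python
import xml.etree.ElementTree as ET

NBSP = "\u00a0"
# The only characters XPath 1.0 normalize-space() treats as whitespace.
XML_WHITESPACE = " \t\r\n"


def is_blank(elem):
    """Return True for a childless element whose text is only whitespace or NBSP."""
    text = (elem.text or "").replace(NBSP, "")
    return len(elem) == 0 and text.strip(XML_WHITESPACE) == ""


def remove_blank_paragraphs(root):
    """Drop every blank <p> in place; tail text after a dropped element is lost too."""
    # Snapshot the tree first so removals do not disturb the traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "p" and is_blank(child):
                parent.remove(child)
    return root


# Hypothetical sample markup, not taken from the site in the question.
sample = (
    '<div><p style="text-align: justify;">\u00a0</p>'
    "<p>Real text</p><p> </p></div>"
)
root = remove_blank_paragraphs(ET.fromstring(sample))
remaining = [p.text for p in root.findall("p")]
print(remaining)
```

Unlike the "null-tag" placeholder trick, this only removes paragraphs that are entirely blank, so paragraphs that merely contain a non-breaking space somewhere in their text are left alone.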

Related

find xpath of label with non breaking space

I am trying to get an XPath for the following HTML code, but nothing seems to work. I would appreciate your suggestions. I need to build the XPath based on the text.
<label class="label" securityidpath="ACCOUNTS_FS.PART_ACCOUNT_HEADER_FS.PART_ACCOUNT_STATUS" title="Part Account Status">Part Account Status:
</label>
FYI, I tried the following XPath variants:
//label[normalize-space(text())='Part Account Status:\u00a0']
//label[normalize-space(text())='Part Account Status:\u000a']
//label[normalize-space(text())='Part Account Status:\u202f']
and all the options listed at the following URL: https://en.wikipedia.org/wiki/Whitespace_character
Thank You,
Yougander
You can use
/label[normalize-space(text())='Part Account Status:&#160;']
Or use the hexadecimal variant &#xA0; instead of the decimal &#160;.
Also note that XPath uses slashes (and not backslashes) to define paths, so referencing the root node label would be done by /label.
The typo lable instead of label is trivial.
BTW you can avoid the trouble with the entity by using the title attribute in your XPath:
label[normalize-space(@title)='Part Account Status']
Your element name is wrong, and you should write the Unicode character as a hexadecimal entity value:
//label[normalize-space(text())='Part Account Status:&#xA0;']
XPath requires forward slashes.
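The suggestion to match on the title attribute instead of the text can be checked with a small Python sketch using the standard library's ElementTree, which supports a limited XPath subset. The wrapping form element and the helper name normalized are assumptions for illustration, and the sample keeps only two of the label's attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the label from the question, wrapped in a <form>
# and ending in a non-breaking space written as a hexadecimal entity.
doc = ET.fromstring(
    '<form><label class="label" title="Part Account Status">'
    "Part Account Status:&#xA0;\n</label></form>"
)

# Matching on the attribute avoids the non-breaking space entirely.
label = doc.find(".//label[@title='Part Account Status']")


def normalized(text):
    """Map NBSP to a plain space, then collapse whitespace runs."""
    return " ".join(text.replace("\u00a0", " ").split())


label_text = normalized(label.text)
print(repr(label.text), "->", repr(label_text))
```

The raw text still ends in U+00A0 followed by a newline, which is why a comparison against a plain-space string does not match.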

(Grafana Table) ${__cell} containing an apostrophe/single quote breaks the Query String to Kibana

In Grafana I've got a table panel which contains some names (one per row) that, when clicked, open a new window in Kibana, passing the clicked name (${__cell}) via the URL in order to drill down on that particular name.
This used to work fine, but I'm facing a problem when the name contains a special character, such as "Identita' Digitale" (without the double quotes): as you can see, it contains an apostrophe/single quote that breaks the query, so the Kibana URL becomes incomplete.
Try
${__cell:lucene}
instead of
${__cell}
All special characters should then be escaped for the Lucene query. Actually, what you need in your case is URL encoding, so you may want to try the other advanced formatting options.
Doc: http://docs.grafana.org/reference/templating/#advanced-formatting-options
Another dirty, hackish solution is to use JS to URL-encode the link in the onclick event. Add this string at the end of your link definition in Grafana:
" onclick="location.href=encodeURI(this);
So in full HTML it will create link:
<a href="<URL>" onclick="location.href=encodeURI(this);">...
The syntax in my example may be wrong, and it may need some minor changes to work properly. In theory, you can also use jQuery.
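To see what URL encoding does to the problematic value, here is a minimal Python sketch using the standard library's urllib.parse.quote. The function name encode_cell is hypothetical, and the sketch only shows the encoded value, not a real Kibana URL. One caveat worth checking before relying on the onclick hack: JavaScript's encodeURI leaves the apostrophe unescaped, so it may not change this particular character.

```python
from urllib.parse import quote


def encode_cell(value: str) -> str:
    """Percent-encode a cell value so it can be embedded in a URL query."""
    # safe="" also encodes "/", which quote() leaves alone by default.
    return quote(value, safe="")


encoded = encode_cell("Identita' Digitale")
print(encoded)
```

The apostrophe becomes %27 and the space becomes %20, so the value can no longer terminate a quoted part of the link early.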

ImportXML and replacing quotes for enters

I'm trying to import a Google Play Store description into a Google spreadsheet, and that works fairly well with this formula:
=importXML("https://play.google.com/store/apps/details?id=com.facebook.katana", "//div[@itemprop='description']")
However, I'm running into the issue that this:
Keeping up with friends is faster than ever.<p>• See what friends are up to...</p>
Will be parsed as:
"Keeping up with friends is faster than ever.• See what friends are up to..."
Ideally I'd like to see the <p> tag replaced by a break, or at least a space. I've been trying the following formula
=importXML("https://play.google.com/store/apps/details?id=com.facebook.katana", "normalize-space(translate(//div[@itemprop='description'],'&quot;',' '))")
but this removes every occurrence of &, q, u, o, t and ; because translate() works character by character rather than on whole strings.
How can I replace these HTML tags with a break or a space?
You can actually use this:
=join(char(10),IMPORTXML("https://play.google.com/store/apps/details?id=com.facebook.katana","//*[@jsname='C4s9Ed']"))
which gives you a newline for each element. Note that, for the first example, if you want to replace the •, you would substitute it with a space or a newline.
If you just want a space instead of a newline in either case, change char(10) to " ".
Here is another app page I tried it with:
=join(char(10),IMPORTXML("https://play.google.com/store/apps/details?id=com.facebook.orca","//*[@jsname='C4s9Ed']"))
Try:
=SUBSTITUTE(importXML("https://play.google.com/store/apps/details?id=com.facebook.katana", "//div[@itemprop='description']"), "•"," ")
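The join(char(10), ...) approach amounts to collecting each text fragment separately and joining the fragments with a newline. A rough Python equivalent, using the snippet from the question as hard-coded input rather than fetching the live page, might look like this (the variable names are assumptions for illustration):

```python
import xml.etree.ElementTree as ET

# The fragment quoted in the question, wrapped in its description <div>.
snippet = (
    '<div itemprop="description">'
    "Keeping up with friends is faster than ever."
    "<p>• See what friends are up to...</p>"
    "</div>"
)

div = ET.fromstring(snippet)

# itertext() yields each text fragment on its own, so the <p> boundary
# is not lost the way it is when the whole node is read as one string.
parts = [text.strip() for text in div.itertext() if text.strip()]
joined = "\n".join(parts)
print(joined)
```

Swapping "\n" for " " gives the space-separated variant, mirroring the change from char(10) to " " in the formula.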

Processing form input in a Joomla component

I am creating a Joomla component and one of the pages contains a form with a text input for an email address.
When a < character is typed in the input field, that character and everything after is not showing up in the input.
I tried $_POST['field'] and JFactory::getApplication()->input->getCmd('field')
I also tried alternatives for getCmd like getVar, getString, etc. but no success.
E.g. John Doe <j.doe@mail.com> returns only John Doe.
When the < is left out, as in John Doe j.doe@mail.com>, the value comes in correctly.
What can I do to also have the < character in the posted variable?
BTW, I had to use &lt; in this question to display it the way I want. This form suffers from the same problem!
You actually need to set the filtering that you want when you grab the input. Otherwise, you will get some heavy filtering. (Typically, I will also lose @ symbols.)
Replace this line:
JFactory::getApplication()->input->getCmd('field');
with this line:
JFactory::getApplication()->input->getRaw('field');
The name after the get part of the function is the filter that will be applied. Cmd strips everything but alphanumeric characters and ., -, and _. String runs the value through Joomla's HTML tag-cleaning feature and, depending on your settings, will strip out <>. (That usually doesn't happen for me, but my settings are generally pretty open, to the point of no filtering for super admins and such.)
getRaw should definitely work, but note that there is no filtering at all, which can open security holes in your application.
The default text filter trims html from the input for your field. You should set the property
filter="raw"
in your form's manifest (xml) file, and then use getRaw() to retrieve the value. getCmd removes the non-alphanumeric characters.
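To make the effect of the filters described above concrete, here is a Python sketch that imitates them with regular expressions. Both functions are rough stand-ins written for illustration, not Joomla's actual implementation: cmd_like follows the description of Cmd given above, and strip_tags_like is a crude model of tag cleaning.

```python
import re


def cmd_like(value: str) -> str:
    """Keep only alphanumerics, '.', '-' and '_' (approximation of a Cmd filter)."""
    return re.sub(r"[^A-Za-z0-9._-]", "", value)


def strip_tags_like(value: str) -> str:
    """Remove anything that looks like a tag (crude stand-in for tag cleaning)."""
    return re.sub(r"<[^>]*>", "", value)


raw = "John Doe <j.doe@mail.com>"
print(repr(cmd_like(raw)))
print(repr(strip_tags_like(raw)))
print(repr(strip_tags_like("John Doe j.doe@mail.com>")))
```

The address wrapped in < and > looks like a tag to the second filter, which is consistent with the value surviving once the opening < is left out.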

removing <br/> from GET request

I'm using a GET request to fetch some page data, but I need to strip the break tags from the finished file. Basically, I'm taking the output of the GET request and saving it to a file, but it has hundreds of break tags in it that I need removed. I'm fine with running a batch file or VBScript after the file is saved to remove the tags, but I'm not sure how to do that either. So far the only solutions I have seen remove entire lines.
EDIT: This will be deployed to multiple Windows servers so I would like to keep the requirements as minimal as possible. I.E. commands/software that Windows has by default.
If you're au fait with Python, you could use Beautiful Soup to remove <br /> elements in a fairly robust manner. See here for how to remove elements from the tree.
Unless I have misunderstood, you could replace the break tags using the Replace function in VBScript (assumed from the tag). For example:
cleanedText = Replace(rawText,"<br/>","")
More information on usage can be found here
http://www.w3schools.com/Vbscript/func_replace.asp
It is worth mentioning, though, that the function matches verbatim, so you might have to run it a few times to cover all the common tag variants:
cleanedText = Replace(rawText,"<br/>","") ' no space
cleanedText = Replace(cleanedText,"<br />","") ' a space
cleanedText = Replace(cleanedText,"<br>","") ' unterminated
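The three passes above can be collapsed into a single regular expression that tolerates optional whitespace and an optional slash. The sketch below shows the pattern in Python purely as an illustration. Python is not installed on Windows by default, so this does not meet the stated deployment constraint as written, and the function name strip_breaks is hypothetical.

```python
import re

# Matches <br>, <br/>, <br /> and upper-case variants such as <BR>.
BREAK_TAG = re.compile(r"<br\s*/?>", re.IGNORECASE)


def strip_breaks(text: str) -> str:
    """Remove every break tag variant in one pass."""
    return BREAK_TAG.sub("", text)


cleaned = strip_breaks("a<br/>b<br />c<br>d<BR>e")
print(cleaned)
```

A regex is adequate for a fixed, attribute-free tag like this one; for anything more general, an HTML parser such as the Beautiful Soup suggestion above is the more robust route.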
