Value without fraction - jstl

How can I print a value without its fractional part? For example:
${bites/1024}
where bites equals 100000. The output is 97,65625. I want to print 97 instead.

You can do this with the JSTL fmt:formatNumber tag.
Here's an example.
<%@ taglib uri="http://java.sun.com/jsp/jstl/core" prefix="c" %>
<%@ taglib uri="http://java.sun.com/jsp/jstl/fmt" prefix="fmt" %>
<c:set var="bites" value="100000" />
<c:set var="result" value="${bites/1024}"/>
<fmt:formatNumber value="${result-(result%1)}" pattern="#"/>
pattern="#" - prints no decimal places; you can specify your own number of decimal places if you want.
n-(n%1) - subtracts the fractional part, which gives the floor for non-negative values.
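The n-(n%1) arithmetic can be checked outside a JSP. The following is a minimal Ruby sketch, not JSTL: it uses Float#remainder on the assumption that % in the JSP expression behaves like Java's remainder operator (the sign follows the dividend), and the helper name drop_fraction is made up for illustration.

```ruby
# Hypothetical helper mirroring the EL expression n-(n%1).
# Float#remainder keeps the sign of the receiver, which is assumed here
# to match how % behaves in the JSP expression.
def drop_fraction(n)
  n - n.remainder(1)
end

bites = 100_000
result = bites / 1024.0          # 97.65625

puts drop_fraction(result).to_i  # prints 97
puts drop_fraction(-1.5).to_i    # prints -1 (truncation toward zero, not floor)
```

Under that assumption the trick truncates toward zero, so it only coincides with the floor when the value is not negative, which is the case for a byte count.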
Or combine it with fn:substring from the JSTL functions library. This keeps only the first three characters of the result, so it depends on how many digits the integer part has.
<%@ taglib uri="http://java.sun.com/jsp/jstl/fmt" prefix="fmt" %>
<%@ taglib uri="http://java.sun.com/jsp/jstl/functions" prefix="fn" %>
<fmt:formatNumber value="${fn:substring(bites/1024, 0, 3)}" type="NUMBER"/>

How to remove repeated HTML elements except first one?

I have an HTML file with some text repeated throughout the document. The repeated strings have font size 4 or 5, and my goal is to delete all of those
repeated strings except the first appearance.
For example:
India! with size=5 appears 9 times and with size=4 appears 2 times. I'd like to remove all appearances of India! with size=5 except the first.
India!
I've tried the sed command below in bash (I'm open to suggestions using other tools), but it doesn't work because it removes everything after the first match:
sed 's/<font size=\"[4-5]\".*<\/font>//g'
and I get as output only this:
<!DOCTYPE html> <html> <body>
<h1>Some header</h1>
<p> </p>
<p> This is other text. </p>
</body>
</html>
My input file is this:
<!DOCTYPE html>
<html>
<body>
<h1>Some header</h1>
<p>
<font size="5">India!</font>
<p>
<font size="4">Japan!</font>
</p>
</p>
<p>Some text 1</p>
<p>
<font size="5">India!</font>
</p>
<p>Some text 2</p>
<p>
<font size="5">India!</font>
<p>
<font size="4">Japan!</font>
</p>
</p>
<p>Some text 3</p>
<p>
<font size="5">Uganda!</font>
</p>
<p>Some text 4</p>
<p>
<font size="5">India!</font>
<p>
<font size="4">Japan!</font>
</p>
</p>
<p>Some text 5</p>
<p>
<font size="5">India!</font>
</p>
<p>Some text 6</p>
<p>
<font size="5">Cameroon!</font>
</p>
<p>Some text 7</p>
<p>
<font size="4">India!</font>
</p>
<p>Some text 8</p>
<p>
<font size="5">India!</font>
</p>
<p>Some text 9</p>
<p>
<font size="5">India!</font>
</p>
<p>Some text 10</p>
<p>
<font size="5">Pakistan!</font>
</p>
<p>Some text 11</p>
<p>
<font size="5">Pakistan!</font>
</p>
<p>Some text 12</p>
<p>
<font size="5">India!</font>
</p>
<p>Some text 13</p>
<p>
<font size="4">Uganda!</font>
</p>
<p>
<font size="5">India!</font>
</p>
<p>Some text 14</p>
<p>
<font size="4">India!</font>
</p>
<p> This is other text. </p>
</body>
</html>
The image below shows the input (left) and the desired output (right), both as text and as an HTML preview.
As you requested in your comment, here is a slightly different program to remove the associated paragraph tags as well.
In order to remove the <p> and </p> before and after the lines you want removed ( the duplicates ), I found it conceptually easier to run through the file twice.
The first pass through the file, I keep track of whether or not I've seen the combination of font size and country just as before. In addition, I also track the line numbers (FNR) of the lines that need to be removed. The code "knows" the first pass through the file when NR == FNR. NR is total number of records so far and FNR is the record number in the file. Thus, when they are equal, awk is parsing the first file.
In the second pass through the same input file, I print out the current record if it is not marked as suppressed. The FNR is used to index the suppressed array because FNR is the same in the first pass as the second pass of the file.
Lastly, in order to tell awk to parse the file twice, we'll need to pass the input file to awk twice on the command line.
Here's the revised code. I also illustrate how to parse your input file twice by adding the file (let's call it input.html) two times to the command line:
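awk -F"[\"<>= ]*" '
# First pass (NR == FNR): mark repeated font lines and their neighbours.
NR == FNR {
    if ($2 == "font") {
        # $4 is the font size and $5 is the text, e.g. 5 and India!
        if (seen[$4, $5])
            suppress[NR - 1] = suppress[NR] = suppress[NR + 1] = 1
        seen[$4, $5] = 1
    }
    next
}
# Second pass: print every line that was not marked.
!suppress[FNR]
' input.html input.html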
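For readers who prefer to see the two passes spelled out, here is a rough Ruby sketch of the same idea. It is not a translation of the awk field splitting: the regular expression FONT_LINE and the helper name drop_repeated_fonts are made up for illustration, and it assumes every font line sits between its own <p> line and </p> line.

```ruby
# Assumed line shape: <font size="N">text</font> on a line of its own.
FONT_LINE = /<font size="(\d+)">(.*?)<\/font>/

# Hypothetical helper: removes repeated font lines plus the line before
# and the line after each repeat (assumed to be <p> and </p>).
def drop_repeated_fonts(lines)
  seen = {}
  suppressed = {}

  # Pass 1: remember each (size, text) pair and mark repeats for removal.
  lines.each_with_index do |line, i|
    match = FONT_LINE.match(line)
    next unless match

    key = [match[1], match[2]]
    [i - 1, i, i + 1].each { |j| suppressed[j] = true } if seen[key]
    seen[key] = true
  end

  # Pass 2: keep only the lines that were not marked.
  lines.each_with_index.reject { |_, i| suppressed[i] }.map(&:first)
end

sample = <<~HTML.lines
  <p>
  <font size="5">India!</font>
  </p>
  <p>Some text 1</p>
  <p>
  <font size="5">India!</font>
  </p>
  <p>
  <font size="4">India!</font>
  </p>
HTML

puts drop_repeated_fonts(sample)
```

As with the awk version, this works on lines rather than on a parsed document, so nested paragraphs or differently formatted tags would need a real HTML parser.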
Here's an awk 'solution' for you:
awk -F"[\"<>= ]*" '
$2 == "font" {
    # Print a font line only the first time its size ($4) and text ($5) are seen.
    if (!printed[$4, $5])
        print
    printed[$4, $5] = 1
    next
}
# Print every other line unchanged.
1
' input.html
Since awk is not a robust HTML parser, this is not a great general solution. However, if your input files are consistently formatted, this small script may do the trick.

Remove comments from XHTML using XSLT

How can I remove comments from XHTML using XSLT?
Example: test.xtml
<html>
<head/>
<body>
This is Test Code.
<!--Test Comment -->
</body>
</html>
XSLT: The stylesheet below gives this warning: "Severity: warning Description: Ambiguous rule match for html[1]/body[1]/comment()[1] Matches both "comment()" on line 10 of remove_comment.xsl
and "node()|@*" on line 4 of remove_comment.xsl"
<xsl:template match="node()|@*">
  <xsl:copy>
    <xsl:apply-templates select="node()|@*" />
  </xsl:copy>
</xsl:template>
<xsl:template match="comment()"/>
Both patterns have the same default priority (-0.5), so the processor cannot choose between them. Give the template on line 10 with match="comment()" a higher priority, e.g. <xsl:template match="comment()" priority="5"/>.

XPath in Nokogiri returning empty array [] whereas I am expecting to have results

I am trying to parse XML files using Nokogiri, Ruby and XPath. I usually don't encounter any problems, but with the following file none of my XPath queries return anything:
doc = Nokogiri::HTML(open("myfile.xml"))
doc.xpath("//Meta").count
# result ==> 0
doc.xpath("//Meta")
# result ==> []
doc.xpath(".").count
# result => 1
Here is a simplified version of my XML file:
<Answer xmlns="test:com.test.search" context="hf%3D10%26target%3Dst0" last="0" estimated="false" nmatches="1" nslices="0" nhits="1" start="0">
<time>
...
</time>
<promoted>
...
</promoted>
<hits>
<Hit url="http://www.test.com/" source="test" collapsed="false" preferred="false" score="1254772" sort="0" mask="272" contentFp="4294967295" did="1287" slice="1">
<groups>
...
</groups>
<metas>
<Meta name="enligne">
<MetaString name="value">
</MetaString>
</Meta>
<Meta name="language">
<MetaString name="value">
fr
</MetaString>
</Meta>
<Meta name="text">
<MetaText name="value">
<TextSeg highlighted="false" highlightClass="0">
La
</TextSeg>
</MetaText>
</Meta>
</metas>
</Hit>
</hits>
<keywords>
...
</keywords>
<groups>
...
</groups>
</Answer>
How can I get all children of <Hit> from this XML?
Include the namespace information when calling xpath:
doc.xpath("//x:Meta", "x" => "test:com.test.search")
Alternatively, call remove_namespaces! on the document first; after that, unprefixed queries such as //Meta match.
This is one of the most frequently asked XPath questions; search for "XPath default namespace".
If there is no way to register a prefix for the default namespace and use it in the expression (say "x" in //x:Meta), then use:
//*[name() = 'Meta' and namespace-uri() = 'test:com.test.search']
If it is known that Meta can only belong to the default namespace, then the above can be shortened to:
//*[name() = 'Meta']
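To see the prefix registration in isolation, here is a minimal sketch that uses REXML (shipped with standard Ruby installations) rather than Nokogiri; the library is swapped in only so the snippet runs without extra gems. The trimmed XML and the prefix x are illustrative; only the namespace URI comes from the question.

```ruby
require 'rexml/document'

# Trimmed-down stand-in for the XML in the question.
xml = <<~XML
  <Answer xmlns="test:com.test.search">
    <hits>
      <Hit url="http://www.test.com/">
        <metas>
          <Meta name="enligne"/>
          <Meta name="language"/>
          <Meta name="text"/>
        </metas>
      </Hit>
    </hits>
  </Answer>
XML

doc = REXML::Document.new(xml)

# Map an arbitrary prefix to the default namespace URI, then use it in the path.
namespaces = { 'x' => 'test:com.test.search' }
metas = REXML::XPath.match(doc, '//x:Meta', namespaces)

meta_names = metas.map { |meta| meta.attributes['name'] }
puts meta_names.inspect
```

The prefix itself is arbitrary. What matters is that the URI it maps to is exactly the one declared in the xmlns attribute of the document.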

Targeting part of a comment using XPath

I'm trying to use xpath to return the value "Vancouver", from either the comment or the text after it. Can anyone point me in the right direction?
The location li is always the first item but is not always present, and the number of list items after it varies for each item.
<item>
<title/>
<description>
<!-- Comment #1 -->
<ul class="class1">
<li> <!-- ABC Location=Vancouver -->Location: Vancouver</li>
<li> <!-- More comments -->Text</li>
<li> text</li>
</ul>
</description>
</item>
This will pull it from the text after the comment:
substring-after(//ul[@class='class1']/li[position()=1 and contains(.,'Location:')],'Location: ')
This selects the first <li> inside the <ul> of class 'class1', only when it contains 'Location:', and takes the string after 'Location: '. If you want to relax the requirement that it be the first li, use this:
substring-after(//ul[@class='class1']/li[contains(.,'Location:')],'Location: ')
This isn't elegant, and because it relies on fixed character positions it could break if the structure of "Location: #####" changes, but it works for the above:
substring(//item//li[1],12,string-length(//item//li[1])-10)
It returns a string, not a node.
I rushed this one a bit, so I'll give a better solution when I have time, but it is something to think about.
(It just strips off "Location: " and returns whatever comes after it.)
Use:
substring-after(/*/description/ul
/li[1]/text()[starts-with(., 'Location: ')],
'Location: '
)
To extract the location from the comment use:
substring-after(/*/description/ul
/li[1]/comment()[starts-with(., ' ABC Location=')],
' ABC Location='
)
XSLT - based verification:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:copy-of select=
"substring-after(/*/description/ul
/li[1]/text()[starts-with(., 'Location: ')],
'Location: '
)
"/>
==========
<xsl:copy-of select=
"substring-after(/*/description/ul
/li[1]/comment()[starts-with(., ' ABC Location=')],
' ABC Location='
)
"/>
</xsl:template>
</xsl:stylesheet>
when this transformation is applied on the provided XML document:
<item>
<title/>
<description>
<!-- Comment #1 -->
<ul class="class1">
<li>
<!-- ABC Location=Vancouver -->Location: Vancouver
</li>
<li>
<!-- More comments -->Text
</li>
<li> text</li>
</ul>
</description>
</item>
the two XPath expressions are evaluated and the results of the evaluations are copied to the output:
Vancouver
==========
Vancouver
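The behaviour of substring-after in the expressions above can be mirrored in plain Ruby. This is only a sketch of the string semantics, not an XPath evaluation: the helper name substring_after is hypothetical, and the two input strings are typed in by hand from the sample document.

```ruby
# Hypothetical helper with the same contract as XPath substring-after():
# returns the text after the first occurrence of marker, or "" if absent.
def substring_after(text, marker)
  text.partition(marker).last
end

li_text = 'Location: Vancouver'
comment_text = ' ABC Location=Vancouver '  # comment content keeps its spaces

from_text = substring_after(li_text, 'Location: ')
from_comment = substring_after(comment_text, ' ABC Location=')

puts from_text.inspect
puts from_comment.inspect        # the trailing space of the comment survives
puts from_comment.strip.inspect  # trimmed, similar in spirit to normalize-space()
```

The comment variant keeps the space that precedes the closing --> of the comment, so wrapping the XPath expression in normalize-space() may be worthwhile if the value is compared against other strings.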

XPath: how to get text from this and next tag?

I have HTML like this:
<h1>Hello1</h1>
<p>World1</p>
<h1>Hello2</h1>
<p>World2</p>
<h1>Hello2</h1>
<p>World2</p>
I need to get each heading together with its paragraph in one pass: Hello1 with World1, Hello2 with World2, etc.
UPDATE: I use Ruby Mechanize library
The Ruby library "Mechanize" uses the Nokogiri parsing library, so you can call Nokogiri directly. One potential solution might look something like this:
require 'mechanize'
require 'pp'
html = "<h1>Hello1</h1>
<p>World1</p>
<h1>Hello2</h1>
<p>World2</p>
<h1>Hello2</h1>
<p>World2</p>"
results = []
Nokogiri::HTML(html).xpath("//h1").each do |header|
  p = header.xpath("following-sibling::p[1]").text
  results << [header.text, p]
end
pp results
EDIT:
This example was tested with Mechanize v2.0.1 which uses Nokogiri ~v1.4. I also tested directly against Nokogiri v1.5.0 without issue.
EDIT #2:
This example answers a follow-up question to the original solution:
require 'nokogiri'
require 'pp'
html = <<HTML
<h1>
<p>
<font size="4">
<b>abide by (something)</b>
</font>
</p>
</h1>
<p>
<font size="3">- to follow the rules of something</font>
</p>
The cleaning staff must abide by the rules of the school.
<br>
<h1>
<p>
<font size="4">
<b>able to breathe easily again</b>
</font>
</p>
</h1>
<p>
My friend was able to breathe easily again when his company did not go bankrupt.
<br>
HTML
doc = Nokogiri::HTML(html)
results = []
doc.xpath("//h1").each do |header|
  h1 = header.xpath("following-sibling::p/font/b").text
  results << h1
end
pp results
H1 tags with nested elements are invalid, so Nokogiri corrects the error during the parsing process. The process to get at the formerly nested elements is very similar to the original solution.
Note: I glossed over the XPath part of this request. This answer is for an XSLT style sheet instead.
Expanding your XML example to give it a root element:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<h1>Hello1</h1>
<p>World1</p>
<h1>Hello2</h1>
<p>World2</p>
<h1>Hello3</h1>
<p>World3</p>
</root>
You could use a for-each loop along with "following-sibling" to get the elements with something like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output encoding="UTF-8" method="text"/>
  <xsl:template match="/">
    <!-- start looking for <h1> nodes -->
    <xsl:for-each select="/root/h1">
      <!-- output the h1 text -->
      <xsl:value-of select="."/>
      <!-- print a dash for spacing -->
      <xsl:text> - </xsl:text>
      <!-- select the next <p> node -->
      <xsl:value-of select="following-sibling::p[1]"/>
      <!-- print a new line -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
The output would look like this:
Hello1 - World1
Hello2 - World2
Hello3 - World3
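The same pairing can be sketched in Ruby without XSLT. This version uses REXML from the standard Ruby distribution rather than Nokogiri, and instead of evaluating following-sibling::p[1] it walks forward with next_element until it reaches a p element, which is an assumption about how to imitate that axis by hand.

```ruby
require 'rexml/document'

xml = <<~XML
  <root>
    <h1>Hello1</h1>
    <p>World1</p>
    <h1>Hello2</h1>
    <p>World2</p>
    <h1>Hello3</h1>
    <p>World3</p>
  </root>
XML

doc = REXML::Document.new(xml)

pairs = []
doc.root.elements.each('h1') do |h1|
  # Walk forward to the nearest following sibling element named p.
  sibling = h1.next_element
  sibling = sibling.next_element while sibling && sibling.name != 'p'
  pairs << [h1.text, sibling ? sibling.text : nil]
end

pairs.each { |heading, paragraph| puts "#{heading} - #{paragraph}" }
```

A heading with no paragraph after it ends up paired with nil, which prints as an empty string after the dash.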
