pdfclown highlighting doesn't work for some pdf files - pdfclown

I am using the pdfclown library to highlight some text inside the pdf file but for some reason, I get nullpointerexception error when I run TextHighlightSample.
[java] java.lang.NullPointerException
[java] at java.util.Hashtable.hash(Hashtable.java:239)
[java] at java.util.Hashtable.put(Hashtable.java:519)
[java] at org.pdfclown.documents.contents.fonts.SimpleFont.onLoad(SimpleFont.java:139)
[java] at org.pdfclown.documents.contents.fonts.Font.load(Font.java:738)
[java] at org.pdfclown.documents.contents.fonts.Font.<init>(Font.java:351)
[java] at org.pdfclown.documents.contents.fonts.SimpleFont.<init>(SimpleFont.java:62)
[java] at org.pdfclown.documents.contents.fonts.TrueTypeFont.<init>(TrueTypeFont.java:68)
[java] at org.pdfclown.documents.contents.fonts.Font.wrap(Font.java:253)
[java] at org.pdfclown.documents.contents.FontResources.wrap(FontResources.java:72)
[java] at org.pdfclown.documents.contents.FontResources.wrap(FontResources.java:1)
[java] at org.pdfclown.documents.contents.ResourceItems.get(ResourceItems.java:119)
[java] at org.pdfclown.documents.contents.objects.SetFont.getResource(SetFont.java:119)
[java] at org.pdfclown.documents.contents.objects.SetFont.getFont(SetFont.java:83)
[java] at org.pdfclown.documents.contents.objects.SetFont.scan(SetFont.java:97)
[java] at org.pdfclown.documents.contents.ContentScanner.moveNext(ContentScanner.java:1330)
[java] at org.pdfclown.documents.contents.ContentScanner$TextWrapper.extract(ContentScanner.java:811)
[java] at org.pdfclown.documents.contents.ContentScanner$TextWrapper.<init>(ContentScanner.java:777)
[java] at org.pdfclown.documents.contents.ContentScanner$TextWrapper.<init>(ContentScanner.java:770)
[java] at org.pdfclown.documents.contents.ContentScanner$GraphicsObjectWrapper.get(ContentScanner.java:690)
[java] at org.pdfclown.documents.contents.ContentScanner$GraphicsObjectWrapper.access$0(ContentScanner.java:682)
[java] at org.pdfclown.documents.contents.ContentScanner.getCurrentWrapper(ContentScanner.java:1154)
[java] at org.pdfclown.tools.TextExtractor.extract(TextExtractor.java:633)
[java] at org.pdfclown.tools.TextExtractor.extract(TextExtractor.java:647)
[java] at org.pdfclown.tools.TextExtractor.extract(TextExtractor.java:647)
[java] at org.pdfclown.tools.TextExtractor.extract(TextExtractor.java:296)
[java] at org.pdfclown.samples.cli.TextHighlightSample.run(TextHighlightSample.java:56)
[java] at org.pdfclown.samples.cli.SampleLoader.run(SampleLoader.java:140)
[java] at org.pdfclown.samples.cli.SampleLoader.main(SampleLoader.java:56)
Does anyone know how to solve this problem?

The foreground issue
The foreground issue is that PdfClown in SimpleFont.onLoad() (while reading the Widths from the font dictionary into its own structures) assumes that it has a glyphIndexes entry for each codes value for a key from the FirstChar-based indices in the Widths array:
if(glyphWidthObjects != null)
{
ByteArray charCode = new ByteArray(
new byte[]
{(byte)((PdfInteger)getBaseDataObject().get(PdfName.FirstChar)).getIntValue()}
);
for(PdfDirectObject glyphWidthObject : glyphWidthObjects)
{
int glyphWidth = ((PdfNumber<?>)glyphWidthObject).getIntValue();
if(glyphWidth > 0)
{
Integer code = codes.get(charCode);
if(code != null)
{
glyphWidths.put(
glyphIndexes.get(code), //<<<<<<<<<<<<<<<<<<<<<<
glyphWidth
);
}
}
charCode.data[0]++;
}
}
If you check for null here, e.g. replacing
if(code != null)
by
if(code != null && glyphIndexes.get(code) != null)
you will get rid of the NullPointerException.
Usually there are glyphIndexes entries for all those values. Thus, usually you don't get the NullPointerException here. But PdfClown in its attempt to be able to extract as much as possible uses a mixture of information from the PDF objects and the embedded font objects, and there still seem to be some shortcomings in the coordination of those information, e.g. in case of your document:
The background issue
While constructing a TrueTypeFont object for the font SourceSansPro-Regular PdfClown
(Font.load) tries to read a ToUnicode map to get a mapping from character codes to Unicode and put it into codes; unfortunately the font has no ToUnicode map; thus, codes remains null;
(OpenFontParser construction in TrueTypeFont.loadEncoding initially called by SimpleFont.onLoad) tries to read information from the embedded font file; among other data it retrieved a mapping 32..213 -> 0..44 mapping character codes to in-font glyph indices;
(still in TrueTypeFont.loadEncoding initially called by SimpleFont.onLoad) sets the font object's glyphIndexes member to that map; if there was a codes mapping already now, this would be used here to change the mapping to a mapping Unicode -> 0..44; but codes is null (see above), so glyphIndexes remains as is;
(still in TrueTypeFont.loadEncoding initially called by SimpleFont.onLoad) as there is no codes mapping yet, it creates one based on the MacRomanEncoding entry from the PDF font dictionary;
(still in TrueTypeFont.loadEncoding initially called by SimpleFont.onLoad) if there were no glyphIndexes yet, it would derive one from the current codes mapping and the Widths array; but we already have one, so it remains as is;
(SimpleFont.onLoad) finally it tries to put the contents of the PDF font dictionary's Widths array into its glyphWidths map. The code (see above) assumes that glyphIndexes is a mapping of Unicode codes and, therefore, translates them using codes first. Unfortunately glyphIndexes here is not from Unicode codes but from character codes. Thus the failure observed above occurs.
Font extraction in PdfClown 0.1.3 is in need of clean-up. It tries to make use of information from both the PDF objects and the embedded fonts (which is a good idea) but for some situations like here shoots itself in the foot.
But it's still an early 0.x version after all, so some issues are to be expected...

Related

Function not found when generating code with DDL Database - jooq

I have a spring boot gradle project with a mysql database. Previously under jooq version 3.13.6 my sql was parsed without errors. When updating to a higher jooq version (3.14.X and 3.15.X) and generating/parsing the migrations with jooq, I get the following output:
SEVERE DDLDatabase Error: Your SQL string could not be parsed or
interpreted. This may have a variety of reasons, including:
The jOOQ parser doesn't understand your SQL
The jOOQ DDL simulation logic (translating to H2) cannot simulate your SQL
org.h2.jdbc.JdbcSQLSyntaxErrorException: Function "coalesce" not found;
A basic sql example where the error occurs is given below. Parsing the same view worked with jooq 3.13.6.
DROP VIEW IF EXISTS view1;
CREATE VIEW view1 AS
SELECT COALESCE(SUM(table1.col1), 0) AS 'sum'
FROM table1;
I am currently lost here. I don't see any related changes in the jooq changelog.
Any help or directions to further have a look into are highly appreciated.
Extended Stacktrace:
11:10:30 SEVERE DDLDatabase Error : Your SQL string could not be parsed or interpreted. This may have a variety of reasons, including:
- The jOOQ parser doesn't understand your SQL
- The jOOQ DDL simulation logic (translating to H2) cannot simulate your SQL
If you think this is a bug or a feature worth requesting, please report it here: https://github.com/jOOQ/jOOQ/issues/new/choose
As a workaround, you can use the Settings.parseIgnoreComments syntax documented here:
https://www.jooq.org/doc/latest/manual/sql-building/dsl-context/custom-settings/settings-parser/
11:10:30 SEVERE Error while loading file: /Users/axel/projects/service/./src/main/resources/db/migration/V5__create_view1.sql
11:10:30 SEVERE Error in file: /Users/axel/projects/service/build/tmp/generateJooq/config.xml. Error : Error while exporting schema
org.jooq.exception.DataAccessException: Error while exporting schema
at org.jooq.meta.extensions.AbstractInterpretingDatabase.connection(AbstractInterpretingDatabase.java:103)
at org.jooq.meta.extensions.AbstractInterpretingDatabase.create0(AbstractInterpretingDatabase.java:77)
at org.jooq.meta.AbstractDatabase.create(AbstractDatabase.java:332)
at org.jooq.meta.AbstractDatabase.create(AbstractDatabase.java:322)
at org.jooq.meta.AbstractDatabase.setConnection(AbstractDatabase.java:312)
at org.jooq.codegen.GenerationTool.run0(GenerationTool.java:531)
at org.jooq.codegen.GenerationTool.run(GenerationTool.java:237)
at org.jooq.codegen.GenerationTool.generate(GenerationTool.java:232)
at org.jooq.codegen.GenerationTool.main(GenerationTool.java:204)
Caused by: org.jooq.exception.DataAccessException: SQL [create view "view1" as select "coalesce"("sum"("table1"."col1"), 0) "sum" from "table1"]; Function "coalesce" not found; SQL statement:
create view "view1" as select "coalesce"("sum"("table1"."col1"), 0) "sum" from "table1" [90022-200]
at org.jooq_3.15.5.H2.debug(Unknown Source)
at org.jooq.impl.Tools.translate(Tools.java:2988)
at org.jooq.impl.DefaultExecuteContext.sqlException(DefaultExecuteContext.java:639)
at org.jooq.impl.AbstractQuery.execute(AbstractQuery.java:349)
at org.jooq.meta.extensions.ddl.DDLDatabase.load(DDLDatabase.java:183)
at org.jooq.meta.extensions.ddl.DDLDatabase.lambda$export$0(DDLDatabase.java:156)
at org.jooq.FilePattern.load0(FilePattern.java:307)
at org.jooq.FilePattern.load(FilePattern.java:287)
at org.jooq.FilePattern.load(FilePattern.java:300)
at org.jooq.FilePattern.load(FilePattern.java:251)
at org.jooq.meta.extensions.ddl.DDLDatabase.export(DDLDatabase.java:156)
at org.jooq.meta.extensions.AbstractInterpretingDatabase.connection(AbstractInterpretingDatabase.java:100)
... 8 more
Caused by: org.h2.jdbc.JdbcSQLSyntaxErrorException: Function "coalesce" not found; SQL statement:
create view "view1" as select "coalesce"("sum"("table1"."col1"), 0) "sum" from "table1" [90022-200]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:576)
at org.h2.message.DbException.getJdbcSQLException(DbException.java:429)
at org.h2.message.DbException.get(DbException.java:205)
at org.h2.message.DbException.get(DbException.java:181)
at org.h2.command.Parser.readJavaFunction(Parser.java:3565)
at org.h2.command.Parser.readFunction(Parser.java:3770)
at org.h2.command.Parser.readTerm(Parser.java:4305)
at org.h2.command.Parser.readFactor(Parser.java:3343)
at org.h2.command.Parser.readSum(Parser.java:3330)
at org.h2.command.Parser.readConcat(Parser.java:3305)
at org.h2.command.Parser.readCondition(Parser.java:3108)
at org.h2.command.Parser.readExpression(Parser.java:3059)
at org.h2.command.Parser.readFunctionParameters(Parser.java:3778)
at org.h2.command.Parser.readFunction(Parser.java:3772)
at org.h2.command.Parser.readTerm(Parser.java:4305)
at org.h2.command.Parser.readFactor(Parser.java:3343)
at org.h2.command.Parser.readSum(Parser.java:3330)
at org.h2.command.Parser.readConcat(Parser.java:3305)
at org.h2.command.Parser.readCondition(Parser.java:3108)
at org.h2.command.Parser.readExpression(Parser.java:3059)
at org.h2.command.Parser.parseSelectExpressions(Parser.java:2931)
at org.h2.command.Parser.parseSelect(Parser.java:2952)
at org.h2.command.Parser.parseQuerySub(Parser.java:2817)
at org.h2.command.Parser.parseSelectUnion(Parser.java:2649)
at org.h2.command.Parser.parseQuery(Parser.java:2620)
at org.h2.command.Parser.parseCreateView(Parser.java:6950)
at org.h2.command.Parser.parseCreate(Parser.java:6223)
at org.h2.command.Parser.parsePrepared(Parser.java:903)
at org.h2.command.Parser.parse(Parser.java:843)
at org.h2.command.Parser.parse(Parser.java:815)
at org.h2.command.Parser.prepareCommand(Parser.java:738)
at org.h2.engine.Session.prepareLocal(Session.java:657)
at org.h2.engine.Session.prepareCommand(Session.java:595)
at org.h2.jdbc.JdbcConnection.prepareCommand(JdbcConnection.java:1235)
at org.h2.jdbc.JdbcStatement.executeInternal(JdbcStatement.java:212)
at org.h2.jdbc.JdbcStatement.execute(JdbcStatement.java:201)
at org.jooq.tools.jdbc.DefaultStatement.execute(DefaultStatement.java:102)
at org.jooq.impl.SettingsEnabledPreparedStatement.execute(SettingsEnabledPreparedStatement.java:227)
at org.jooq.impl.AbstractQuery.execute(AbstractQuery.java:414)
at org.jooq.impl.AbstractQuery.execute(AbstractQuery.java:335)
... 16 more
> Task :generateJooq FAILED
Jooq Configuration:
jooq {
version = "3.15.5"
edition = JooqEdition.OSS
configurations {
main {
generationTool {
generator {
name = 'org.jooq.codegen.KotlinGenerator'
strategy {
name = 'org.jooq.codegen.DefaultGeneratorStrategy'
}
generate {
relations = true
deprecated = false
records = true
immutablePojos = true
fluentSetters = true
daos = false
pojosEqualsAndHashCode = true
javaTimeTypes = true
}
target {
packageName = 'de.project.service.jooq'
}
database {
name = 'org.jooq.meta.extensions.ddl.DDLDatabase'
properties {
property {
key = 'scripts'
value = 'src/main/resources/db/migration/*.sql'
}
property {
key = 'sort'
value = 'semantic'
}
property {
key = 'unqualifiedSchema'
value = 'none'
}
property {
key = 'defaultNameCase'
value = 'lower'
}
}
}
}
}
}
}
You probably have the following configuration set:
<property>
<key>defaultNameCase</key>
<value>lower</value>
</property>
In jOOQ 3.15, this transforms all identifiers to lower case and quotes them before handing the SQL statement to H2 behind the scenes for DDL simulation, in order to emulate e.g. PostgreSQL behaviour, where unquoted identifiers are lower case, not upper case as in many other RDBMS.
There's a bug in the current implementation, which also quotes built-in functions, not just user defined objects. See:
https://github.com/jOOQ/jOOQ/issues/9931 (general problem related to "system names")
https://github.com/jOOQ/jOOQ/issues/12752 (DDLDatabase specific problem)
The only workaround I can think of would be to turn off that property again, and manually quote all identifiers to be lower case. Alternatively, instead of using the DDLDatabase, you can always connect to an actual database instead, e.g. by using testcontainers. This will be much more robust in many ways, anyway, than the DDLDatabase
In any case, this is quite the frequent problem, so, I've fixed this for the upcoming jOOQ 3.16. The above setting will no longer quote "system names", which are well known identifiers of built-in functions

iText7 PdfTextExtractor.GetTextFromPage "'StandardEncoding' is not a supported encoding name."

I have a method in our software that pulls the text from a PDF, from a scan or text generated.
I usually try the GetTextFromPage() method first. If it doesn't return text, then I move onto OCR'ing the page.
I have a particular 6 page PDF with the first three pages being a scanned document, and the last two being a form.
On this PDF I'm getting an error that I can't figure out how to resolve.
'StandardEncoding' is not a supported encoding name. For information on defining a custom encoding, see the documentation for the Encoding.RegisterProvider method.
Parameter name: name
at System.Globalization.EncodingTable.internalGetCodePageFromName(String name)
at System.Globalization.EncodingTable.GetCodePageFromName(String name)
at iText.IO.Util.IanaEncodings.GetEncodingEncoding(String name)
at iText.IO.Util.EncodingUtil.ConvertToBytes(Char[] chars, String encoding)
at iText.IO.Font.PdfEncodings.ConvertToBytes(String text, String encoding)
at iText.IO.Font.FontEncoding.FillNamedEncoding()
at iText.IO.Font.FontEncoding.CreateFontEncoding(String baseEncoding)
at iText.Kernel.Font.PdfType1Font..ctor(PdfDictionary fontDictionary)
at iText.Kernel.Font.PdfFontFactory.CreateFont(PdfDictionary fontDictionary)
at iText.Kernel.Pdf.Canvas.Parser.PdfCanvasProcessor.GetFont(PdfDictionary fontDict)
at iText.Kernel.Pdf.Canvas.Parser.PdfCanvasProcessor.SetTextFontOperator.Invoke(PdfCanvasProcessor processor, PdfLiteral operator, IList`1 operands)
at iText.Kernel.Pdf.Canvas.Parser.PdfCanvasProcessor.InvokeOperator(PdfLiteral operator, IList`1 operands)
at iText.Kernel.Pdf.Canvas.Parser.PdfCanvasProcessor.ProcessContent(Byte[] contentBytes, PdfResources resources)
at iText.Kernel.Pdf.Canvas.Parser.PdfTextExtractor.GetTextFromPage(PdfPage page, ITextExtractionStrategy strategy, IDictionary`2 additionalContentOperators)
at iText.Kernel.Pdf.Canvas.Parser.PdfTextExtractor.GetTextFromPage(PdfPage page)
at EFR.OCR.OCR.ExtractTextFromPDF(FileInfo fileInfo, Int32 StartingPage, Int32 NumberOfPages) in P:\Cloud\Dropbox\EF Recovery\OCRTest\EFR.OCR\OCR.vb:line 113
I've processed many PDFs through my code, some text, some scans, some mixed together. Some had forms... This is the first time that I've had this error.
Here's a snippet of my code...
Using reader As New iText.Kernel.Pdf.PdfReader(fileInfo.FullName)
reader.SetUnethicalReading(True)
Using sourceDoc As New iText.Kernel.Pdf.PdfDocument(reader)
If NumberOfPages = 0 Then NumberOfPages = sourceDoc.GetNumberOfPages
For i As Integer = StartingPage To StartingPage + NumberOfPages - 1
Dim pageText As String = ""
Try
pageText = iText.Kernel.Pdf.Canvas.Parser.PdfTextExtractor.GetTextFromPage(sourceDoc.GetPage(i))
Catch ex As Exception
OCRLog.Log($"Error attempting to extract text from page {i}. {ex.ToString}")
End Try
If pageText = "" Then
'extract this page
Dim results As OCRResults = ExtractTextFromPDFImagePage(fileInfo.FullName, i)
pageText = results.Text
pageItems.Add(New OCRResults.PagesClass(results.Accuracy, True, pageText))
Else
pageItems.Add(New OCRResults.PagesClass(100, False, pageText))
End If
stringBuilder.Append(pageText)
Next
Return New OCRResults(stringBuilder.ToString, pageItems)
End Using
End Using
Any ideas?
There is an error in the PDF, just as indicated by the error text "'StandardEncoding' is not a supported encoding name.".
The fonts on the page you shared use the name StandardEncoding in their Encoding entries. This is not a valid name here. According to the specification ISO 32000-1 the only valid values here are MacRomanEncoding, MacExpertEncoding, and WinAnsiEncoding, see Table 111 – Entries in a Type 1 font dictionary – and Table 114 – Entries in an encoding dictionary.
Adobe Preflight also complains about these names when checking for syntax errors:
An unexpected value is associated with the key
Key: BaseEncoding
Value: /StandardEncoding
Type: CosName
Formal Representation: Encoding
Cos ID: 38
Traversal Path: ->Pages->Kids->[0]->Resources->Font->WARSP->Encoding
An unexpected value is associated with the key
Key: Encoding
Value: /StandardEncoding
Type: CosName
Formal Representation: Font.FontType1
Cos ID: 27
Traversal Path: ->Pages->Kids->[0]->Resources->Font->Arial,Bold
An unexpected value is associated with the key
Key: BaseEncoding
Value: /StandardEncoding
Type: CosName
Formal Representation: Encoding
Cos ID: 22
Traversal Path: ->Pages->Kids->[0]->Resources->Font->Arial->Encoding
An unexpected value is associated with the key
Key: BaseEncoding
Value: /StandardEncoding
Type: CosName
Formal Representation: Encoding
Cos ID: 19
Traversal Path: ->Pages->Kids->[0]->Resources->Font->ARROW->Encoding
(Excerpt from a preflight report for your shared PDF)
In spite of StandardEncoding not being a valid name here, the PDF specification knows a "Standard Encoding", see Annex D of ISO 32000-1. Most likely your document attempts to refer to that encoding at the locations outlined above.
If you need to extract text from the document in question, therefore, you may want to follow the recommendation of the error message:
For information on defining a custom encoding, see the documentation for the Encoding.RegisterProvider method.
The Encoding class here is the one in System.Text.
To extract the text from your PDF, therefore, it should suffice to implement an EncodingProvider that for the name StandardEncoding provides an Encoding instance according to the information from the STD column of the table in Annex D.2 – Latin Character Set and Encodings – of ISO 32000-1.

Getting Content is not allowed in prolog when running in Java

I am facing Content is not allowed in prolog exception when I run my xxx.jmx file from Java (Jmeter 5.0).
I tested the jmx in the GUI mode and everything works fine and in the Java I am just following the standard way to call the jmx file and execute it.
The jmx just have some normal stuff. Sending HTTP request and validate the expected and received XML (I am using this snipped to validate):
import org.apache.commons.io.FileUtils
expect = FileUtils.readFileToString(new File('some_path'))
XmlParser parser = new XmlParser()
expectedXML = new XmlSlurper().parseText(expect)
actualXML = new XmlSlurper().parseText(prev.getResponseDataAsString())
if (expectedXML != actualXML) {
AssertionResult.setFailure(true)
AssertionResult.setFailureMessage('Mismatch between expected and actual XML \n'+ prev.getResponseDataAsString())
and the stack trace:
2018/10/24 15:18:03,386 12675 [ERROR ] [Thread Group 1-1] (JSR223Assertion.java:52) – Problem in JSR223 script: Validate resposne
javax.script.ScriptException: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
at org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:320)
at org.codehaus.groovy.jsr223.GroovyCompiledScript.eval(GroovyCompiledScript.java:72)
at javax.script.CompiledScript.eval(CompiledScript.java:92)
at org.apache.jmeter.util.JSR223TestElement.processFileOrScript(JSR223TestElement.java:221)
at org.apache.jmeter.assertions.JSR223Assertion.getResult(JSR223Assertion.java:49)
at org.apache.jmeter.threads.JMeterThread.processAssertion(JMeterThread.java:901)
at org.apache.jmeter.threads.JMeterThread.checkAssertions(JMeterThread.java:892)
at org.apache.jmeter.threads.JMeterThread.executeSamplePackage(JMeterThread.java:565)
at org.apache.jmeter.threads.JMeterThread.processSampler(JMeterThread.java:486)
at org.apache.jmeter.threads.JMeterThread.run(JMeterThread.java:253)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at groovy.util.XmlSlurper.parse(XmlSlurper.java:207)
at groovy.util.XmlSlurper.parse(XmlSlurper.java:260)
at groovy.util.XmlSlurper.parseText(XmlSlurper.java:286)
at groovy.util.XmlSlurper$parseText.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
at Script1.run(Script1.groovy:9)
at org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:317)
... 10 more
UPDATE 1:
The problem of using Response Assertion is it cannot ignore the space or tab. So if the format is not exactly the same when I use equal will always fail. Any ideas how to ignore these things by using Response Assertion?
UPDATE 2:
I found out that the issue is not related to the BOM. Is becasue if I run the jmx from my Java application:
prev.getResponseDataAsString()
above function always return:
${__FileToString(${inputFilePath},,)}
but not the actual response. This function is come from the body data of the HTTP Request sampler!!!!!! If I give the acutal body there then I am able to run the jmx...... Any idea how to deal with this dynamic body data?
It might be the case your "expected" XML file contains BOM and it causes your code failure.
BOM is basically 3 first bytes so you can remove them using code like:
def expect = FileUtils.readFileToString(new File('some_path')).getBytes().flatten()
1.upto(3) {
expect.remove(0)
}
XmlParser parser = new XmlParser()
def expectedXML = parser.parseText(new String(expect.toArray(new Byte[0])))
The rest of your code should work fine.
Check out The Groovy Templates Cheat Sheet for JMeter article to learn more Groovy tips and tricks.
Also be informed that in majority of cases it's easier to use Response Assertion or when it comes to XML - XPath Assertion, Java code will work faster than Groovy in any case.

Service Now : transform map stopped due to error: java.lang.NumberFormatException

The transform is getting aborted but only if I marked the checkbox copy empty fields and also the rest of the entry of the Import set is getting stuck at pending, also I verified the transform script but no luck.
Below is the error :
Import set: ISETxxxxxxx transform stopped due to error: java.lang.NumberFormatException
java.lang.NumberFormatException
at java.math.BigDecimal.<init>(BigDecimal.java:596)
at java.math.BigDecimal.<init>(BigDecimal.java:383)
at java.math.BigDecimal.<init>(BigDecimal.java:806)
at com.glide.script.glide_elements.GlideNumber.getSafeBigDecimal(GlideNumber.java:42)
at com.glide.currency.GlideElementCurrency.coerceAmount(GlideElementCurrency.java:406)
at com.glide.currency.GlideElementCurrency.cleanAmount(GlideElementCurrency.java:389)
at com.glide.currency.GlideElementCurrency.setDisplayValue(GlideElementCurrency.java:136)
at com.glide.currency.GlideElementCurrency.setValue(GlideElementCurrency.java:89)
at com.glide.db.impex.transformer.TransformerField.copyEmptyFields(TransformerField.java:202)
at com.glide.db.impex.transformer.TransformerField.setValue(TransformerField.java:130)
at com.glide.db.impex.transformer.TransformerField.transformField(TransformerField.java:84)
at com.glide.db.impex.transformer.TransformRow.transformCurrent(TransformRow.java:100)
at com.glide.db.impex.transformer.TransformRow.transform(TransformRow.java:69)
at com.glide.db.impex.transformer.Transformer.transformBatch(Transformer.java:150)
at com.glide.db.impex.transformer.Transformer.transform(Transformer.java:76)
at com.glide.system_import_set.ImportSetTransformerImpl.transformEach(ImportSetTransformerImpl.java:239)
at com.glide.system_import_set.ImportSetTransformerImpl.transformAllMaps(ImportSetTransformerImpl.java:91)
at com.glide.system_import_set.ImportSetTransformer.transformAllMaps(ImportSetTransformer.java:64)
at com.glide.system_import_set.ImportSetTransformer.transformAllMaps(ImportSetTransformer.java:50)
at com.snc.automation.ScheduledImportSetJob.runImport(ScheduledImportSetJob.java:55)
at com.snc.automation.ScheduledImportJob.execute(ScheduledImportJob.java:45)
at com.glide.schedule.JobExecutor.execute(JobExecutor.java:83)
at com.glide.schedule.GlideScheduleWorker.executeJob(GlideScheduleWorker.java:207)
at com.glide.schedule.GlideScheduleWorker.process(GlideScheduleWorker.java:145)
at com.glide.schedule.GlideScheduleWorker.run(GlideScheduleWorker.java:62)
I'm guessing you have a field that required that is a decimal or similar.
The error java.lang.NumberFormatException indicates it's failing to convert an empty string to 0.0.
Use a source script line to convert this, something along the lines of this
answer = (function transformEntry(source) {
if (source.u_number_field.nil())
return 0.0;
})(source);

can not make xpath with functions work with JDOM2

I used JDOM1 before to parse xmls with xpath, and tired with the non-generic style, so I decide to try JDOM2, OK, everything works perfect for me ( the generic, XPathFactory, XPathExpression). then I try a xpath statement with contains function :
XPathExpression<Text> timeXpath = XPathFactory.instance().compile(
"./p[contains(.,'time:')]/text()", Filters.textOnly());
String time = timeXpath.evaluateFirst(div).getTextTrim();
then I got exeptions:
java.lang.IllegalStateException: Unable to evaluate expression. See cause
at org.jdom2.xpath.jaxen.JaxenCompiled.evaluateRawFirst(JaxenCompiled.java:200)
at org.jdom2.xpath.util.AbstractXPathCompiled.evaluateFirst(AbstractXPathCompiled.java:327)
at peace.org.tm.spider.star.DamaiStarSpider.syncStarTracks(DamaiStarSpider.java:123)
at peace.org.tm.spider.star.DamaiStarSpider.main(DamaiStarSpider.java:156)
Caused by: org.jaxen.UnresolvableException: Function :contains
at org.jaxen.SimpleFunctionContext.getFunction(SimpleFunctionContext.java:142)
at org.jaxen.ContextSupport.getFunction(ContextSupport.java:189)
at org.jaxen.Context.getFunction(Context.java:153)
at org.jaxen.expr.DefaultFunctionCallExpr.evaluate(DefaultFunctionCallExpr.java:183)
at org.jaxen.expr.DefaultPredicate.evaluate(DefaultPredicate.java:106)
at org.jaxen.expr.PredicateSet.evaluatePredicates(PredicateSet.java:188)
at org.jaxen.expr.DefaultLocationPath.evaluate(DefaultLocationPath.java:218)
then I tried:
XPathExpression<Text> timeXpath = XPathFactory.instance().compile(
"./p[fn:contains(.,'time:')]/text()", Filters.textOnly());
String time = timeXpath.evaluateFirst(div).getTextTrim();
xpath compile failed:
java.lang.IllegalArgumentException: Unable to compile './p[fn:contains(.,'time:')]/text()'. See Cause.
at org.jdom2.xpath.jaxen.JaxenCompiled.<init>(JaxenCompiled.java:152)
at org.jdom2.xpath.jaxen.JaxenXPathFactory.compile(JaxenXPathFactory.java:82)
at org.jdom2.xpath.XPathFactory.compile(XPathFactory.java:282)
at peace.org.tm.spider.star.DamaiStarSpider.syncStarTracks(DamaiStarSpider.java:91)
at peace.org.tm.spider.star.DamaiStarSpider.main(DamaiStarSpider.java:156)
Caused by: org.jaxen.XPathSyntaxException: Unexpected '('
at org.jaxen.BaseXPath.<init>(BaseXPath.java:136)
at org.jaxen.BaseXPath.<init>(BaseXPath.java:157)
at org.jdom2.xpath.jaxen.JaxenCompiled.<init>(JaxenCompiled.java:150)
... 4 more
I already google the stack trace for about 2 hours, nothing useful founded, I think maybe I made a very stupid mistake, is anyone can figure it out for me? thanks!
I can't reproduce the exceptions you are getting.... I am using JDOM 2.0.5 with Jaxen 1.1.6.
I have created the following:
public static void main(String[] args) {
Element root = new Element ("root");
Element p = new Element("p");
p.addContent(" Return this time: Boo! ");
root.addContent(p);
XPathExpression<Text> timeXpath = XPathFactory.instance().compile(
"./p[contains(.,'time:')]/text()", Filters.textOnly());
XPathDiagnostic<Text> xpd = timeXpath.diagnose(root, true);
System.out.println(xpd);
System.out.println(timeXpath.evaluateFirst(root).getTextTrim());
}
and it produces:
[XPathDiagnostic: './p[contains(.,'time:')]/text()' evaluated (first) against org.jdom2.Element produced raw=1 discarded=0 returned=1]
Return this time: Boo!
I believe you may have out-of-date Jaxen class libraries?

Resources