InvoiceAddRq fails for unknown reason - etl

I hope I stay in line here, I don't want to waste your time.
I developed a simple app that extracts customer invoice data from a database and writes xml requests to load the invoice data into QuickBooks. It works well. I've run into and fixed a dozen or so quirks and glitches along the way; it's in production about a year.
While loading today's invoices into QuickBooks, one failed.
There were no objectionable characters or oddities in the data of the failed invoice; I compared it to other invoices and found very similar data that was successful; I verified the ListID values for each entity; I am stumped.
If anybody can help with this, I will appreciated it greatly.
Here's the xml that failed today (the only change is that I blocked a couple of names):
FAILED InvoiceAddRq:
<?xml version="1.0"?>
<?qbxml version="8.0"?>
<QBXML>
<QBXMLMsgsRq onError="stopOnError">
<InvoiceAddRq requestID="16012816370245509">
<InvoiceAdd>
<CustomerRef><ListID>3E00001-1139583887</ListID></CustomerRef>
<ARAccountRef><ListID>380000-1137509930</ListID></ARAccountRef>
<TemplateRef><ListID>B0000-1142608867</ListID></TemplateRef>
<TxnDate>2016-01-28</TxnDate>
<RefNumber>60125008</RefNumber>
<PONumber>AI600292569</PONumber>
<TermsRef><ListID>20000-1137508984</ListID></TermsRef>
<ShipDate>2016-01-25</ShipDate>
<Other>XYXYX PLASTICS</Other>
<InvoiceLineAdd>
<ItemRef><ListID>20000-1139578831</ListID></ItemRef>
<Desc>RECOVERING 1/25DEL. 1/26 # 11:57 AM EST – POD XYXYX</Desc>
<Quantity>1</Quantity>
<Rate>480.79</Rate>
<Other1>6</Other1>
<Other2>3466</Other2>
</InvoiceLineAdd>
</InvoiceAdd>
<IncludeRetElement>TxnID</IncludeRetElement><IncludeRetElement>TimeCreated</IncludeRetElement><IncludeRetElement>RefNumber</IncludeRetElement>
</InvoiceAddRq>
</QBXMLMsgsRq>
</QBXML>
I don't see anything wrong here. I compared this with the xml for many successful InvoiceAdd requests that occurred before and after this one. Then I tried it again for the insanity check (same result).
Below I'll post a similar invoice as a successful example.
SUCCESSFUL InvoiceAddRq:
<?xml version="1.0"?>
<?qbxml version="8.0"?>
<QBXML>
<QBXMLMsgsRq onError="stopOnError">
<InvoiceAddRq requestID="160128164851801">
<InvoiceAdd>
<CustomerRef><ListID>80000915-1294766937</ListID></CustomerRef>
<ARAccountRef><ListID>380000-1137509930</ListID></ARAccountRef>
<TemplateRef><ListID>B0000-1142608867</ListID></TemplateRef>
<TxnDate>2016-01-28</TxnDate>
<RefNumber>60125011</RefNumber>
<PONumber>2200166200</PONumber>
<TermsRef><ListID>20000-1137508984</ListID></TermsRef>
<ShipDate>2016-01-26</ShipDate>
<Other>X&Y MEHOOPANY </Other>
<InvoiceLineAdd>
<ItemRef><ListID>20000-1139578831</ListID></ItemRef>
<Desc>UNABLE TO DELIVER DUE TO THE WEATHER 1-22DEL. 1-25, POD. AXYXYX BXYXYXY #14:30</Desc>
<Quantity>1</Quantity>
<Rate>148.60</Rate>
<Other1>1</Other1>
<Other2>3</Other2>
</InvoiceLineAdd>
</InvoiceAdd>
<IncludeRetElement>TxnID</IncludeRetElement><IncludeRetElement>TimeCreated</IncludeRetElement><IncludeRetElement>RefNumber</IncludeRetElement>
</InvoiceAddRq>
</QBXMLMsgsRq>
</QBXML>
Thanks in advance.
edit: I forgot to mention that the error was very generic
"QuickBooks found an error when parsing the provided XML text stream."
Source:QBXMLRP2.RequestProcessor.2

Ok, I found the cause. The logged message cited an invalid byte "(–)". The dash character was bouncing my Invoice. I had previously looked into invoices using a dash and found many were successful loaded using the same method. What I had failed to notice was that the dash characters were different in all of the successful invoices I found except 1.
The failed xml includes – or, if you prefer, – or "En dash" instead of the normal ascii dash "-" (-).
The SDK validator doesn't flag it as a problem. Also, 1 invoice that included an "En dash" had previously loaded successful; I can't explain that.
I use character substitution tables to deal with the requirements (or idiosyncrasies) of Xml, QbXml, and the database (Oracle). Instead of changing my encoding, I'll substitute - for all occurrences of – (and —, "Em dash", for that matter). This worked for the previously failing invoice.
Thanks again to WilliamLorfing for the help.

Related

Issue with xsl-fo :footnote when generating pdf/ua-1 document with fop.: "tagged PDF note id is missing"

I have an issue with <fo:footnote> when generating pdf/ua-1 document with fop.
The resulting pdf displays correctly the footnote in the page but don’t pass the pdf-ua validation. A severe error on pdf tag Note “id is missing” is raised so the document is not conformed. I'm using PAC3 for the conformance test.
In the example below I have extracted the basic <fo:footnote> element which has a unique id.
How can I generate the missing Id attribute in the pdf tagged Note element?
Here is the xsl-fo really simple footnote. Note that I used an id to reference the footnote.
some text...
<fo:footnote id="FNE0001">
<fo:inline font-size="6pt" baseline-shift="super">E0001</fo:inline>
<fo:footnote-body>
<fo:block>
<fo:inline>E0001</fo:inline><fo:inline > JO L 139 du 29.5.2002, p. 9.</fo:inline>
</fo:block>
</fo:footnote-body>
</fo:footnote> some text...
Apache FOP has been set to generate pdf-ua through the conf file as follow:
<renderers>
<renderer mime="application/pdf">
<!-- Before setting the pdf-ua-mode, we must insert metadata Title in FO declaration -->
<pdf-ua-mode>PDF/UA-1</pdf-ua-mode>
....
PAC3 is checking against the failure conditions in the Matterhorn Protocol; latest edition at https://www.pdfa.org/resource/the-matterhorn-protocol/.
PAC3 is probably reporting on failure condition 19-003, ID entry of the tag is not present.
fo:footnote-body can have an id property. You could try adding an ID to that. In the fo:inline for the footnote marker, you might also need to add an fo:basic-link that refers to that ID.
FWIW, your footnote does not cause that error when checking PDF/UA generated by AH Formatter, and I'm only guessing at what FOP would be doing differently. (PAC3 and I generally disagree when it complains about a footnote being a possibly incorrect use of a Note tag, but that's another story: PAC3 tries to automate checking the conditions that should be checked by a human, and it doesn't always get it right.)

Writing Capybara expectations to verify phone numbers

I'm using AWS Textract to pull information from PDF documents. After the scanned text is returned from AWS and persisted to a var, I'm doing this:
phone_number = '(555) 123-4567'
scanned_pdf_text.should have_text phone_number
But this fails about 20% of the time because of the non-deterministic way that AWS is returning the scanned PDF text. On occasion, the phone numbers can appear either of these two ways:
(555)123-4567 or (555) 123-4567
Some of this scanned text is very large, and I'd prefer not to go through the exercise of sanitizing the text coming back if I can avoid it (I'm also not good at regex usage). I also think using or logic to handle both cases seems to be a little heavy handed just to check text that is so similar (and clearly near-identical to the human eye).
Is there an rspec matcher that'll allow me to check on this text? I'm also using Capybara.default_normalize_ws = true but that doesn't seem to help in this case.
Assuming scanned_pdf_text is a string and the only differences you're seeing is in spaces then you can just get rid of the spaces and compare
scanned_pdf_text.gsub(/\s+/, '').should eq('(555)123-4567') # exact
scanned_pdf_text.gsub(/\s+/, '').should match('(555)123-4567') # partial
scanned_pdf_text.gsub(/\s+/, '').should have_text('(555)123-4567') # partial

Makecat failure: no members found

I am trying to modify existing input cdf file to use SHA256 instead of SHA1 by adding following two lines under [CatalogHeader] section:
CatalogVersion=2
HashAlgorithms=SHA256
Executing makecat.exe now gives me following failure message even though nothing under [CatalogFiles] has changed:
Failed: CryptCATCDFEnumMembersByCDFTagEx. Last Error: 0x00000057
Failed: No members found. Last Error: 0x00000057
Failed 0x00000057 (87)
Makecat does find and hash all files if I take out two lines I added.
Can anybody give me an idea what might be going wrong here?
Here is an example cdf file for MCVE:
[CatalogHeader]
Name=MCVE.cat
CatalogVersion=2
HashAlgorithms=SHA256
[CatalogFiles]
MCVE.xml=MCVE.xml
MCVE.xml is any old xml file you can find.
I encountered the same problem but was able to get it to work by putting '< HASH >' (without spaces) in front of each file entry. Example:
[CatalogFiles]
<HASH>manifest.json=.\manifest.json
<HASH>bsi.json=.\bsi.json
However, this causes the catalog file's entries to be tagged by their hash, instead of their filename, when viewing the .cat file in Windows Explorer. You can somewhat work around this by adding a custom attribute to display the filename in the catalog entry's details, as follows:
[CatalogFiles]
<HASH>manifest.json=.\manifest.json
<HASH>manifest.jsonATTR1=0x11010001:File:manifest.json
<HASH>bsi.json=.\bsi.json
<HASH>bsi.jsonATTR1=0x11010001:File:bsi.json
The attribute type is composed of (https://learn.microsoft.com/en-us/windows/desktop/seccrypto/makecat):
0x10000000: attribute is included in the catalog's hash
0x01000000: don't create a duplicated attribute with SHA1 hash (when using SHA256 and catalog version 2)
0x00010000: attribute is in plaintext, not base64
0x00000001: attribute is a keyvalue pair (e.g. File=bsi.json)
I discovered this workaround after running into the same problem as you when I found this example here: https://www-user.tu-chemnitz.de/~heha/viewzip.cgi/basteln/PC/USB2LPT/usb2lpt.zip/src/Makefile?auto=MAK
Hope this helps.
Can't add comments yet ---
Just wanted to say Jonathan's example with the 0x11010001 attribute works great, but PowerShell's Test-FileCatalog will still say it fails to parse the file. Using FilePath instead of File fixed this. Not sure if this is in the spec or just a powershell quirk or what, but it's what PowerShell does with New-FileCatalog.
Bonus points for not including the SHA1 hash, thanks!

EOF with Nokogiri

I have the following line in a long loop
page = Nokogiri::HTML(open(topic[:url].first)).xpath('//ul[#class = "pages"]//li').first
Sometimes my Ruby application crashes raising the "End of file reached " exception in this line.
How can I resolve this problem? Just a begin;raise;end block?
Is a script that performs a forum backup, so is important that doesn't skip any thread.
Thanks in advance.
In addition to #Phrogz's excellent advice (in particular about at_css with the simpler expression), I would pull the raw xml [content] separately:
page = if (content = open(topic[:url].first)).strip.length > 0
Nokogiri::HTML(content).xpath('//ul[#class = "pages"]//li').first
end
I would suggest that you should first to fix the underlying issue so that you do not get this error.
Does the same URL always cause the problem? (Output it in your log files.) If so, perhaps you need to URI encode the URL.
Is it random, and therefor likely related to a connection hiccup or server problem? If so, you should rescue the specific error and then retry one or more times to get the crucial data.
Secondarily, you should know that the CSS syntax for that query is far simpler:
page = Nokogiri.HTML(...).at_css('ul.pages li')
Not only is this less than half the bytes, it allows for cases like <ul class="foo pages"> that the XPath would miss.
Using at_css (or at_xpath) is the same as .css(...).first, but is faster and simpler.

How do I remove HTML encoded characters from a string?

I have a string which contains some HTML encoded characters and I want to remove them:
"<div>Hi All,</div><div class=\"paragraph_break\">< /></div><div>Starting today we are initiating PoLS.</div><div class=\"paragraph_break\"><br /></div><div>Please use the following communication protocols:<br /></div><div>1. Task Breakup and allocation - Gravity<br /></div><div>2. All mail communications - BC messages<br /></div><div>3. Reports on PoC / Spikes: Writeboard<br /></div><div>4. Non story related tasks: BC To-Do<br /></div><div>5. All UI and HTML will communicated to you through BC.<br /></div><div>6. For File sharing, we'll be using Dropbox.<br /></div><div>7. Use Skype for lighter and generic desicussions. However, in case you need any approvals, data for later reference, etc, then please use BC. PoLS conversation has been created on skype.</div><div class=\"paragraph_break\"><br /></div><div>You'll have been given necessary accesses to all these portals. Please start using them judiciously.</div><div class=\"paragraph_break\"><br /></div><div>All the best!</div><div class=\"paragraph_break\"><br /></div><div>Thanks,<br /></div><div>Saurav<br /></div>"
What you want to do is doable many ways. Perhaps looking at why you might want to do that will help. Usually when I want to remove encoded HTML, I want to recover the contents of the HTML. Ruby has some modules that make it easy.
require 'cgi'
require 'nokogiri'
html = "<div>Hi All,</div><div class=\"paragraph_break\">< /></div><div>Starting today we are initiating PoLS.</div><div class=\"paragraph_break\"><br /></div><div>Please use the following communication protocols:<br /></div><div>1. Task Breakup and allocation - Gravity<br /></div><div>2. All mail communications - BC messages<br /></div><div>3. Reports on PoC / Spikes: Writeboard<br /></div><div>4. Non story related tasks: BC To-Do<br /></div><div>5. All UI and HTML will communicated to you through BC.<br /></div><div>6. For File sharing, we'll be using Dropbox.<br /></div><div>7. Use Skype for lighter and generic desicussions. However, in case you need any approvals, data for later reference, etc, then please use BC. PoLS conversation has been created on skype.</div><div class=\"paragraph_break\"><br /></div><div>You'll have been given necessary accesses to all these portals. Please start using them judiciously.</div><div class=\"paragraph_break\"><br /></div><div>All the best!</div><div class=\"paragraph_break\"><br /></div><div>Thanks,<br /></div><div>Saurav<br /></div>"
puts CGI.unescapeHTML(html)
which outputs:
<div>Hi All,</div><div class="paragraph_break">< /></div><div>Starting today we are initiating PoLS.</div><div class="paragraph_break"><br /></div><div>Please use the following communication protocols:<br /></div><div>1. Task Breakup and allocation - Gravity<br /></div><div>2. All mail communications - BC messages<br /></div><div>3. Reports on PoC / Spikes: Writeboard<br /></div><div>4. Non story related tasks: BC To-Do<br /></div><div>5. All UI and HTML will communicated to you through BC.<br /></div><div>6. For File sharing, we'll be using Dropbox.<br /></div><div>7. Use Skype for lighter and generic desicussions. However, in case you need any approvals, data for later reference, etc, then please use BC. PoLS conversation has been created on skype.</div><div class="paragraph_break"><br /></div><div>You'll have been given necessary accesses to all these portals. Please start using them judiciously.</div><div class="paragraph_break"><br /></div><div>All the best!</div><div class="paragraph_break"><br /></div><div>Thanks,<br /></div><div>Saurav<br /></div>
If I want to take it a step farther and remove the tags, retrieving all the text:
puts Nokogiri::HTML(CGI.unescapeHTML(html)).content
Will output:
Hi All,Starting today we are initiating PoLS.Please use the following communication protocols:1. Task Breakup and allocation - Gravity2. All mail communications - BC messages3. Reports on PoC / Spikes: Writeboard4. Non story related tasks: BC To-Do5. All UI and HTML will communicated to you through BC.6. For File sharing, we'll be using Dropbox.7. Use Skype for lighter and generic desicussions. However, in case you need any approvals, data for later reference, etc, then please use BC. PoLS conversation has been created on skype.You'll have been given necessary accesses to all these portals. Please start using them judiciously.All the best!Thanks,Saurav
Which is where I usually want to get when I see that sort of string.
Ruby's CGI makes encoding and decoding HTML easy. The Nokogiri gem makes it easy to remove the tags.
I think the easiest way to do this is, Assuming you want to use the html in the string.
raw CGI.unescapeHTML('The string you want to manipulate')
If you have assigned that string to a variable s, is this the result you want?
puts s.gsub(/<[^&]*>/, '')
I would suggest:
clean = str.gsub /<.+?>/, ''

Resources