How to remove "amp;" from getting sent in XML? - ruby

I have a string like this: "Travel & Hospitality".
When I send it through the XML API, the output looks like this:
"Travel &amp; Hospitality"
I tried removing it with Ruby code like the following before sending it through XML:
"Travel &amp; Hospitality".gsub("&amp;","&")
<Specialization__c>Travel &amp; Hospitality</Specialization__c>
Even though gsub removes the "amp;", it comes back as soon as the value is sent inside the XML tags.
How can I remove it? My desired output is:
"Travel & Hospitality"

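XML doesn't allow that. A literal & is not allowed to appear unescaped in element content; it has to be written as &amp;, and any XML parser reading the document turns it back into &.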
You could have an XML file like this instead:
<Specialization__c><![CDATA[Travel & Hospitality]]></Specialization__c>
That would work, but the problem is how to convince your XML output library to produce it. It might not even be possible. (I might be wrong about that last part; I know nothing about Ruby.)
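
As a rough illustration of why the escaped form is harmless, here is a minimal Ruby sketch using REXML (an assumption; the question does not say which XML library is in use). It suggests that the & is escaped only in the serialized document, and that a parser on the receiving side hands back the original string.

```ruby
require 'rexml/document'

# Build a one-element document; the element name comes from the question.
doc = REXML::Document.new
element = doc.add_element('Specialization__c')
element.text = 'Travel & Hospitality'

# Serializing escapes the ampersand, as XML requires.
xml = doc.to_s
puts xml

# Parsing the serialized form gives the unescaped text back.
parsed = REXML::Document.new(xml)
value = parsed.root.text
puts value
```

If the consumer of the XML is a real XML parser, it should see "Travel & Hospitality" without any extra work on the sending side.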

Related

How to prevent liberty from converting "+" to " " in URL

We had an issue where clients were sending "+" as part of a parameter value without percent-encoding it. After digging in, it looks like converting "+" to " " is from HTML form encoding, but not part of the URL spec.
I found https://www.ibm.com/mysupport/s/question/0D50z00005phvXb/urls-with-or-2b-in-the-path-or-query-are-incorrectly-decoded-to-space?language=en_US which sounds exactly like what we're hitting, but with Liberty 19.0.0.8 (and probably for some time), even explicitly setting decodeUrlPlusSign="false" doesn't seem to help.
That is, when we call req.getParameter(queryParameterName) it is returning the value with a " " instead of a "+".
I'm setting it in server.xml as follows:
<webContainer disableXPoweredBy="true" decodeUrlPlusSign="false" />
What exactly is decodeUrlPlusSign supposed to do? Is it working as expected?
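
For reference, the form-encoding rule mentioned above can be sketched outside of Liberty. The Ruby snippet below uses the standard `uri` library and says nothing about what Liberty or decodeUrlPlusSign actually do; it only shows how a decoder that follows application/x-www-form-urlencoded treats "+" and "%2B". The sample value 1+2 is made up for illustration.

```ruby
require 'uri'

# Form decoding treats a bare "+" as an encoded space.
decoded_plus = URI.decode_www_form_component('1+2')

# A percent-encoded plus sign survives form decoding as a literal "+".
decoded_escaped = URI.decode_www_form_component('1%2B2')

# Form encoding therefore writes a literal "+" as %2B.
encoded = URI.encode_www_form_component('1+2')

puts decoded_plus
puts decoded_escaped
puts encoded
```

A client that wants a literal plus sign to survive this kind of decoding would need to send %2B.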

Populating an email with data from IBM i (AS400) screen

I'm trying to grab data from an AS400 screen & populate an email using that data but seem to have bumped into something I'm struggling to overcome. Here's a slice of what I have so far:
Dim polNo
polNo = GetText(10,18,10)
Dim wsh
Set wsh=CreateObject("WScript.Shell")
subSub1_()
sub subSub1_()
' Just doing this to check the text I have
SendKeys(polNo)
' Send the email with the text
wsh.Run "mailto:testing#somemailbox.com?Subject=" & polNo
end sub
With the above, the resulting email subject line contains only the first word, up to the first space. From what I've found, this is a parsing issue, and I have discovered the following line that should help.
polNo = Chr(34) + Replace(polNo,chr(34),chr(34)&chr(34))
The above line places all of the text in quotes (I know this because my SendKeys line now shows the GetText result with a " at the start).
The issue is when I reach the mailto line as Outlook pops up a window saying:
"The command line argument is not valid. Verify the switch you are using."
My end result will be an email that has a subject & a body with text taken from various parts of the screen.
Solved: Thanks to dmc below, he started me on the right line.
However, the solution was not to use Chr(34) but to use something as simple as:
polNo = Replace(polNo," ","%20")
Although it might not look like it, you're constructing a URL. As such, the contents of that URL must be URL Encoded. Certain characters can't be included in a URL, including a space. Those characters are represented with a percent sign followed by the ASCII code of the character in hexadecimal. For example, a space is changed to %20.
See the link below for a VBScript routine that will URL encode and decode strings.
http://www.justskins.com/forums/wsh-equivalent-of-server-38778.html
Edit: Although this is commonly known as URL encoding, the thing you're constructing is technically a URI. Wikipedia has a good page that explains further.
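
The encoding rule described above (a percent sign followed by the character's code in hexadecimal) can be sketched in a few lines. The example is in Ruby rather than VBScript, and both the helper name percent_encode and the address someone@example.com are hypothetical, chosen only for illustration.

```ruby
# Percent-encode every byte that is not an unreserved URI character
# (letters, digits, "-", ".", "_", "~").
def percent_encode(text)
  text.gsub(/[^A-Za-z0-9\-._~]/) do |char|
    char.bytes.map { |byte| format('%%%02X', byte) }.join
  end
end

subject = percent_encode('POL 12345 A')
mailto = "mailto:someone@example.com?subject=#{subject}"
puts mailto
```

Encoding the whole value, rather than replacing only spaces, also covers characters such as & and ? that would otherwise be read as part of the URI syntax.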

Preserving whitespace / line breaks with REXML

I'm using Ruby 1.9.3 and REXML to parse an XML document, make a few changes (additions/subtractions), then re-output the file. Within this file is a block that looks like this:
<someElement>
some.namespace.something1=somevalue1
some.namespace.something2=somevalue2
some.namespace.something3=somevalue3
</someElement>
The problem is that after re-writing the file, this block always ends up looking like this:
<someElement>
some.namespace.something1=somevalue1
some.namespace.something2=somevalue2 some.namespace.something3=somevalue3
</someElement>
The newline after the second value (but never the first!) has been lost and turned into a space. Later, some other code that I have no control or influence over will read this file and depend on those newlines to parse the content properly. Generally in this situation I'd use a CDATA section to preserve the whitespace, but this isn't an option because the code that parses this data later is not expecting one. It's essential that the inner text of this element is preserved exactly as-is.
My read/write code looks like this:
xmlFile = File.open(myFile)
contents = xmlFile.read
xmlDoc = REXML::Document.new(contents, { :respect_whitespace => :all })
xmlFile.close
{perform some tasks}
out = ""
xmlDoc.write(out, 2)
File.open(filePath, "w"){|file| file.puts(out)}
I'm looking for a way to preserve the whitespace of text between elements when reading/writing a file in this manner using REXML. I've read a number of other questions here on stackoverflow on this subject, but none that quite replicate this scenario. Any ideas or suggestions are welcome.
I get correct behavior by removing the indent (second) parameter to Document.write():
#xmlDoc.write(out, 2)
xmlDoc.write(out)
That seems like a bug in Document.write() according to my reading of the docs, but if you don't really need to set the indentation, then leaving that off should solve your problem.
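
A minimal, self-contained sketch of the approach above, using an in-memory string in place of the file (the file handling from the question is left out for brevity). It writes the document without an indent argument and checks that the line breaks inside the element are still there.

```ruby
require 'rexml/document'

source = "<someElement>\n" \
         "some.namespace.something1=somevalue1\n" \
         "some.namespace.something2=somevalue2\n" \
         "some.namespace.something3=somevalue3\n" \
         "</someElement>"

xml_doc = REXML::Document.new(source, { :respect_whitespace => :all })

# Write without the indent argument so the text nodes are emitted unchanged.
out = String.new
xml_doc.write(out)

puts out
```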

MSXML / XPath: special Characters

I'm reading and writing XML-files with Microsoft XML Core Services 6.0 (MSXML).
When writing element content with "special" characters that have to be escaped in XML, such as writing "&" as &amp;, I don't have to care about this because MSXML does the conversion. This means that if I assign a text to an element, e.g. oXMLElement.Text = "1 & 2", MSXML actually writes 1 &amp; 2 when I create an XML file. That's pretty nice and saves me some work.
Now, what I want to do is "de-mask" XML strings automatically. I read from an XML file with the selectNodes method, which takes an XPath statement, e.g. //ns:element/text(). Unfortunately, the result string I get looks like 1 &amp; 2 and not like 1 & 2. Is there a way to tell the MSXML object, or maybe the XPath statement, to give me a "de-masked" string? I'm using MSXML with ObjectPal / Paradox, so the best solution would be a method from the MSXML library or a "special" XPath statement.
What you're seeing is the "escaped" XML notation for the text. This is what you should see if you use the .xml property to retrieve the string.
To get the string without the escapes, use .nodeValue.
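
The same distinction between a node's serialized form and its value exists in other XML libraries. As an analogy only (this is Ruby's REXML, not the MSXML API), the sketch below selects a text node with XPath and reads it both ways; the element names are made up for the example.

```ruby
require 'rexml/document'

doc = REXML::Document.new('<root><element>1 &amp; 2</element></root>')

# Select the text node itself, similar to //ns:element/text() in the question.
node = REXML::XPath.first(doc, '//element/text()')

escaped = node.to_s   # serialized form, still escaped
plain = node.value    # node value, with entities resolved

puts escaped
puts plain
```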

Ruby - Writing Hpricot data to a file

I am currently doing some XML parsing and I've chosen to use Hpricot because of its ease of use and syntax, but I am running into some problems. I need to write a piece of XML data that I have found out to another file. However, when I do this the format is not preserved. For example, the content should look like this:
<dict>
<key>item1</key><value>12345</value>
<key>item2</key><value>67890</value>
<key>item3</key><value>23456</value>
</dict>
Assume that there are many entries like this in the document. I am iterating through the 'dict' items using:
hpricot_element = Hpricot(xml_document_body)
f = File.new('some_new_file.xml', 'w')
(hpricot_element/:dict).each { |dict| f.write( dict.to_original_html ) }
After running the above code, I would expect the output to look exactly like the XML shown above. However, to my surprise, the output in the file looks more like this:
<dict>\n", " <key>item1</key><value>12345</value>\n", " <key>item2</key><value>67890</value>\n", " <key>item3</key><value>23456</value\n", " </dict>
I've tried splitting at the "\n" characters and writing to the file one line at a time, but that didn't seem to work either, as it did not recognize the "\n" characters. Any help is greatly appreciated. It might be a very simple solution, but I am having trouble finding it. Thanks!
hpricot_element = Hpricot::XML(xml_document_body)
File.open('some_new_file.xml', 'w') {|f| f.write xml_document_body }
Don't use an XML parser if you want the original XML to be written. It is unnecessary. You should still use one if you want to further process the data, though.
Also, for XML, you should be using Hpricot::XML instead of just Hpricot.
My solution was to just replace the literal '\n' characters with line breaks and remove the extra punctuation by simply adding two gsubs that looked like the following:
f.write( dict.to_original_html.gsub('\n', "\n").gsub('" ,"', '') )
I don't know why I didn't see this before. Like I said, it might be an easy answer that I wasn't seeing and that's exactly how it turned out. Thanks for all the answers!
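
The reason the first gsub above works is that a single-quoted '\n' in Ruby is two characters (a backslash and the letter n), while a double-quoted "\n" is one newline character. A short sketch with a made-up sample string, not actual Hpricot output:

```ruby
# Single quotes keep the backslash and the "n" as two separate characters.
literal = '<dict>\n<key>item1</key><value>12345</value>\n</dict>'

# Replace each two-character sequence with a real line break.
fixed = literal.gsub('\n', "\n")

puts '\n'.length
puts "\n".length
puts fixed
```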
