XmlUnit empty Elements - xmlunit

I try to compare two xml with xmlUnit. I have the following problem. When i have two empty elements like the example below xmlUnit identificate the elements as a difference. Can i configure xmlUnit to ignore this?
</name> and <name></name>
I am only interesting in difference like the next two examples.
<name>test1</name> and <name>test2</name>
difference: test1 and test2
or
<name>test1</name> and <name></name>
difference
test1 and ...
My code:
`
Diff diff = new Diff(fr1, fr2);
DetailedDiff detailedDiff = new DetailedDiff(diff);
List differenceList = detailedDiff.getAllDifferences();
List differences = detailedDiff.getAllDifferences();
for (Object object : differences) {
Difference difference = (Difference)object;
String node1;
String node2;
node1 = difference.getControlNodeDetail().getNode().getNodeName() + " " + difference.getControlNodeDetail().getNode().getNodeValue();
node2 = difference.getTestNodeDetail().getNode().getNodeName() + " " + difference.getTestNodeDetail().getNode().getNodeValue();
}
`

Assuming your </name> is a typo and it is <name/> as per the comment,
then you could try the following.
XMLUnit.setIgnoreWhitespace(true);
Seems to work for me.
ie.
When I try to compare <Carp1></Carp1> with <Carp1/>.
Without the above setting, I get
Expected text value '
' but was '
' - comparing <CfgDN ...>
</CfgDN> at /CfgDN[1]/text()[19] to <CfgDN ...>
</CfgDN> at /CfgDN[1]/text()[19]
With the above setting, all is similar and identical.

Related

Syntax to generate a Syntax in SPSS

I’m trying to construct a Syntax to generate a Syntax in SPSS, but I’m having some issues…
I have an excel file with metadata and I would like to use it in order to make a syntax to extract information from it (like this, if I have a huge database, I just need to keep the excel updated – add/delete variables, etc. - and then run a syntax to extract the needed information for a new syntax).
I also noticed the produced syntax has always around 15Mb, which is a lot (applied to more than 500 lines)!
I don’t use Python due to run syntax in different computers and/or configurations.
Any ideas? Can anyone please help me?
Thank you in advance.
Example:
(test.xlsx – sheet 1)
Var Code Label List Var_label (concatenate Var+Label)
V1 3 Sex 1 V1 “Sex”
V2 1 Work 2 V2 “Work”
V3 3 Country 3 V3 “Country”
V4 1 Married 2 V4 “Married”
V5 1 Kids 2 V5 “Kids”
V6 2 Satisf1 4 V6 “Satisf1”
V7 2 Satisf2 4 V7 “Satisf2”
(information from other file)
List = 1
1 “Male”
2 “Female”
List = 2
1 “Yes”
2 “No”
List = 3
1 “Europe”
2 “America”
3 “Asia”
4 “Africa”
5 “Oceania”
List = 4
1 “Very unsatisfied”
10 “Very satisfied”
I want to make a Syntax that generates a new syntax to apply “VARIABLE LABELS” and “VALUE LABELS”. So, I thought about something like this:
GET DATA
/TYPE=XLSX
/FILE="test.xlsx"
/SHEET=name 'sheet 1'
/CELLRANGE=FULL
/READNAMES=ON
/DATATYPEMIN PERCENTAGE=95.0.
EXECUTE.
STRING vlb (A15) labels (A150) value (A12) lab (A1500) point (A2) separate (A50) space (A2) list1 (A100) list2 (A100).
SELECT IF (Code=1).
COMPUTE vlb = "VARIABLE LABELS".
COMPUTE labels = CONCAT (RTRIM(Var_label)," ").
COMPUTE point = ".".
COMPUTE value = "VALUE LABELS".
COMPUTE lab = CONCAT (RTRIM(Var)," ").
COMPUTE list1 = '1 " Yes "'.
COMPUTE list2 = '2 "No".'.
COMPUTE space = " ".
COMPUTE separate="************************************************.".
WRITE OUTFILE = "list_01.sps" / vlb.
WRITE OUTFILE = "list_01.sps" /labels.
WRITE OUTFILE = "list_01.sps" /point.
WRITE OUTFILE = "list_01.sps" /value.
WRITE OUTFILE = "list_01.sps" /lab.
WRITE OUTFILE = "list_01.sps" /list1.
WRITE OUTFILE = "list_01.sps" /list2.
WRITE OUTFILE = "list_01.sps" /space.
WRITE OUTFILE = "list_01.sps" /separate.
WRITE OUTFILE = "list_01.sps" /space.
If there is only one variable with same list (ex: V1), it works ok. However, if there is more than one variable having the same list, it reproduces the codes as much times as number of variables (Ex: V2, V4 and V5).
What I have (Ex: V2, V4 and V5), after running code above:
VARIABLE LABELS
V2 "Work"
.
VALUE LABELS
V2
1 " Yes "
2 " No "
************************************************.
VARIABLE LABELS
V4 "Married"
.
VALUE LABELS
V4
1 " Yes "
2 " No "
************************************************.
VARIABLE LABELS
V5 "Kids"
.
VALUE LABELS
V5
1 " Yes "
2 " No "
************************************************.
What I would like to have:
VARIABLE LABELS
V2 "Work"
V4 "Married"
V5 "Kids"
.
VALUE LABELS
V2 V4 V5
1 " Yes "
2 " No "
I think there are probably ways to automate the whole process better, including the use of your second data source. But for the scope of this question I will suggest a way to get what you asked for specifically.
The key is to build the command with special conditions for first and last lines:
string cmd1 cmd2 (a200).
sort cases by code.
match files /file=* /first=first /last=last /by code. /* marking first and last lines.
do if first.
compute cmd1="VARIABLE LABELS".
compute cmd2="VALUE LABELS".
end if.
if not first cmd1=concat(rtrim(cmd1), " /"). /* "/" only appears from the second varname.
compute cmd1=concat(rtrim(cmd1), " ", Var_label).
compute cmd2=concat(rtrim(cmd2), " ", Var).
do if last.
compute cmd1=concat(rtrim(cmd1), " .").
compute cmd2=concat(rtrim(cmd2), " ", ' 1 " Yes " 2 "No". ').
end if.
exe.
The commands are now ready, but we don't want to get them mixed up so we'll stack them one under the other, and only then write them out:
add files /file=* /rename cmd1=cmd /file=* /rename cmd2=cmd.
exe.
WRITE OUTFILE = "var definitions.sps" / cmd .
exe.
EDIT:
Note that the code above assumes you've already run a select cases if code = ... and that there is a single code in all the remaining lines.
Note also I added an exe. command at the end - without running that the new syntax will appear empty.

Changing node values in XML using VBS with similar names

I am facing this challenge with changing values of nodes in an xml file which has same names, using VBScript. Following is the sample XML:
- <MappingData>
<Name>Name 1</Name>
- <ValueField FieldName="Name 2">
<CharValue>Value 1</CharValue>
</ValueField>
- <ValueField FieldName="Name 3">
<CharValue>Value 1</CharValue>
</ValueField>
My requirement is to change the values of both occurrences of tag
<CharValue>
.
I could have done this if there was any attribute in these tags, but in this case I am stuck.
I tried the following code, but could'nt get what I need.
Set NodeList = objXMLDoc.documentElement.selectNodes("//MappingData/ValueField/CharValue")
For i = 0 To NodeList.length - 1
node.Text = "Value 1"
Next
Any help is appreciated. Thanks.
Try the following code
Set xDoc = CreateObject( "Msxml2.DOMDocument.6.0" )
xDoc.setProperty "SelectionLanguage", "XPath"
xDoc.async = False
xDoc.load "test.xml"
Set nNodes = xDoc.selectNodes("//MappingData/ValueField/CharValue")
For i = 0 To nNodes.Length - 1
nNodes(i).text = "Changed value " & i
Next
xDoc.Save "test.xml"

XMLUNIT - Comparing two xmls with differing Nodes

I have 2 XMLs to be compared:
File1.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<member>
<SchoolID>1021</SchoolID>
<CandidateType>First Year</CandidateType>
<CandidateName>John</CandidateName>
</member>
File2.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<member>
<CandidateID>3147</CandidateID>
<SchoolID>1021</SchoolID>
<CandidateType>Second Year</CandidateType>
<CandidateName>Peter</CandidateName>
</member>
I am using xmlunit to compare, however the output I am getting is like:
Similar? false
Identical? false
***********************
Expected number of child nodes '2' but was '3' - comparing <member...> at /member[1] to <member...> at /member[1]
***********************
***********************
Expected sequence of child nodes '0' but was '1' - comparing <SchoolID...> at /member[1]/SchoolID[1] to <SchoolID...> at /member[1]/SchoolID[1]
***********************
***********************
Expected text value 'John' but was 'Peter' - comparing <CandidateName ...>John</CandidateName> at /member[1]/CandidateName[1]/text()[1] to <CandidateName ...>Peter</CandidateName> at /member[1]/CandidateName[1]/text()[1]
***********************
***********************
Expected sequence of child nodes '1' but was '2' - comparing <CandidateName...> at /member[1]/CandidateName[1] to <CandidateName...> at /member[1]/CandidateName[1]
***********************
***********************
Expected presence of child node 'null' but was 'CandidateID' - comparing at null to <CandidateID...> at /member[1]/CandidateID[1]
I need to represent the output such that it only tells me the following differences:
Node CandidateID is missing in File1.xml and the data difference for Node CandidateName. I don't need the extra details. Is there a way to tweak the output of detDiff.getAllDifferences().
The code snapshot looks like:
try {
// fr1 and fr2 are my two xml files.
Diff diff = new Diff(fr1, fr2);
System.out.println("Similar? " + diff.similar());
System.out.println("Identical? " + diff.identical());
DetailedDiff detDiff = new DetailedDiff(diff);
List differences = detDiff.getAllDifferences();
for (Object object : differences) {
Difference difference = (Difference)object;
System.out.println("***********************");
System.out.println(difference);
System.out.println("***********************");
}
Depending on your needs you could either filter the differences after the comparison or override the DifferenceListener in order to ignore the differences you are not interested in.
For filtering it seems you could just look at isRecoverable and strip away all recoverable differences.
A custom DifferenceListener would be something like
detDiff.overrideDifferenceListener(new DifferenceListener() {
#Override
public int differenceFound(Difference diff) {
if (diff.getId() == DifferenceConstants.CHILD_NODELIST_SEQUENCE_ID
|| diff.getId() == DifferenceConstants.CHILD_NODELIST_LENGTH_ID) {
return RETURN_IGNORE_DIFFERENCE_NODES_IDENTICAL;
}
return RETURN_ACCEPT_DIFFERENCE;
}
#Override
public void skippedComparison(Node arg0, Node arg1) { }
});
before you call getAllDifferences.

Join array of strings into 1 or more strings each within a certain char limit (+ prepend and append texts)

Let's say I have an array of Twitter account names:
string = %w[example1 example2 example3 example4 example5 example6 example7 example8 example9 example10 example11 example12 example13 example14 example15 example16 example17 example18 example19 example20]
And a prepend and append variable:
prepend = 'Check out these cool people: '
append = ' #FollowFriday'
How can I turn this into an array of as few strings as possible each with a maximum length of 140 characters, starting with the prepend text, ending with the append text, and in between the Twitter account names all starting with an #-sign and separated with a space. Like this:
tweets = ['Check out these cool people: #example1 #example2 #example3 #example4 #example5 #example6 #example7 #example8 #example9 #FollowFriday', 'Check out these cool people: #example10 #example11 #example12 #example13 #example14 #example15 #example16 #example17 #FollowFriday', 'Check out these cool people: #example18 #example19 #example20 #FollowFriday']
(The order of the accounts isn't important so theoretically you could try and find the best order to make the most use of the available space, but that's not required.)
Any suggestions? I'm thinking I should use the scan method, but haven't figured out the right way yet.
It's pretty easy using a bunch of loops, but I'm guessing that won't be necessary when using the right Ruby methods. Here's what I came up with so far:
# Create one long string of #usernames separated by a space
tmp = twitter_accounts.map!{|a| a.insert(0, '#')}.join(' ')
# alternative: tmp = '#' + twitter_accounts.join(' #')
# Number of characters left for mentioning the Twitter accounts
length = 140 - (prepend + append).length
# This method would split a string into multiple strings
# each with a maximum length of 'length' and it will only split on empty spaces (' ')
# ideally strip that space as well (although .map(&:strip) could be use too)
tweets = tmp.some_method(' ', length)
# Prepend and append
tweets.map!{|t| prepend + t + append}
P.S.
If anyone has a suggestion for a better title let me know. I had a difficult time summarizing my question.
The String rindex method has an optional parameter where you can specify where to start searching backwards in a string:
arr = %w[example1 example2 example3 example4 example5 example6 example7 example8 example9 example10 example11 example12 example13 example14 example15 example16 example17 example18 example19 example20]
str = arr.map{|name|"##{name}"}.join(' ')
prepend = 'Check out these cool people: '
append = ' #FollowFriday'
max_chars = 140 - prepend.size - append.size
until str.size <= max_chars do
p str.slice!(0, str.rindex(" ", max_chars))
str.lstrip! #get rid of the leading space
end
p str unless str.empty?
I'd make use of reduce for this:
string = %w[example1 example2 example3 example4 example5 example6 example7 example8 example9 example10 example11 example12 example13 example14 example15 example16 example17 example18 example19 example20]
prepend = 'Check out these cool people:'
append = '#FollowFriday'
# Extra -1 is for the space before `append`
max_content_length = 140 - prepend.length - append.length - 1
content_strings = string.reduce([""]) { |result, target|
result.push("") if result[-1].length + target.length + 2 > max_content_length
result[-1] += " ##{target}"
result
}
tweets = content_strings.map { |s| "#{prepend}#{s} #{append}" }
Which would yield:
"Check out these cool people: #example1 #example2 #example3 #example4 #example5 #example6 #example7 #example8 #example9 #FollowFriday"
"Check out these cool people: #example10 #example11 #example12 #example13 #example14 #example15 #example16 #example17 #FollowFriday"
"Check out these cool people: #example18 #example19 #example20 #FollowFriday"

REXML parsing an XML in ruby

Folks,
I am using REXML for a sample XML file:
<Accounts title="This is the test title">
<Account name="frenchcustomer">
<username name = "frencu"/>
<password pw = "hello34"/>
<accountdn dn = "https://frenchcu.com/"/>
<exporttest name="basic">
<exportname name = "basicexport"/>
<exportterm term = "oldschool"/>
</exporttest>
</Account>
<Account name="britishcustomer">
<username name = "britishcu"/>
<password pw = "mellow34"/>
<accountdn dn = "https://britishcu.com/"/>
<exporttest name="existingsearch">
<exportname name = "largexpo"/>
<exportterm term = "greatschool"/>
</exporttest>
</Account>
</Accounts>
I am reading the XML like this:
#data = (REXML::Document.new file).root
#dataarr = ##testdata.elements.to_a("//Account")
Now I want to get the username of the frenchcustomer, so I tried this:
#dataarr[#name=fenchcustomer].elements["username"].attributes["name"]
this fails, I do not want to use the array index, for example
#dataarr[1].elements["username"].attributes["name"]
will work, but I don't want to do that, is there something that i m missing here. I want to use the array and get the username of the french user using the Account name.
Thanks a lot.
I recommend you to use XPath.
For the first match, you can use first method, for an array, just use match.
The code above returns the username for the Account "frenchcustomer" :
REXML::XPath.first(yourREXMLDocument, "//Account[#name='frenchcustomer']/username/#name").value
If you really want to use the array created with ##testdata.elements.to_a("//Account"), you could use find method :
french_cust_elt = the_array.find { |elt| elt.attributes['name'].eql?('frenchcustomer') }
french_username = french_cust_elt.elements["username"].attributes["name"]
puts #data.elements["//Account[#name='frenchcustomer']"]
.elements["username"]
.attributes["name"]
If you want to iterate over multiple identical names:
#data.elements.each("//Account[#name='frenchcustomer']") do |fc|
puts fc.elements["username"].attributes["name"]
end
I don't know what your ##testdata are, I tried with the following testcode:
require "rexml/document"
#data = (REXML::Document.new DATA).root
#dataarr = #data.elements.to_a("//Account")
# Works
p #dataarr[1].elements["username"].attributes["name"]
#Works not
#~ p #dataarr[#name='fenchcustomer'].elements["username"].attributes["name"]
##dataarr is an array
#dataarr.each{|acc|
next unless acc.attributes['name'] =='frenchcustomer'
p acc.elements["username"].attributes["name"]
}
##dataarr is an array
puts "===Array#each"
#dataarr.each{|acc|
next unless acc.attributes['name'] =='frenchcustomer'
p acc.elements["username"].attributes["name"]
}
puts "===XPATH"
#data.elements.to_a("//Account[#name='frenchcustomer']").each{|acc|
p acc.elements["username"].attributes["name"]
}
__END__
<Accounts title="This is the test title">
<Account name="frenchcustomer">
<username name = "frencu"/>
<password pw = "hello34"/>
<accountdn dn = "https://frenchcu.com/"/>
<exporttest name="basic">
<exportname name = "basicexport"/>
<exportterm term = "oldschool"/>
</exporttest>
</Account>
<Account name="britishcustomer">
<username name = "britishcu"/>
<password pw = "mellow34"/>
<accountdn dn = "https://britishcu.com/"/>
<exporttest name="existingsearch">
<exportname name = "largexpo"/>
<exportterm term = "greatschool"/>
</exporttest>
</Account>
</Accounts>
I'm not very familiar with rexml, so I expect there is a better solution. But perhaps aomebody can take my code to build a better solution.

Resources