The question basically says it all: how can I convert an XML file to YAML?
I've tried this:
require 'active_support/core_ext/hash/conversions'
require 'yaml'
file = File.open("data/mconvert.xml", "r")
hash = Hash.from_xml(file.read)
yaml = hash.to_yaml
File.open("data/mirador.yml", "w") { |file| file.write(yaml) }
But I am getting an "Exception parsing" error. I thought that was because I had dashes in an XML tag name, so I replaced the dashes with dashcharacterr, but that still didn't work.
If we have a look at the XML 1.0 specification, we'll see that start tags look like this:
[40] STag ::= '<' Name (S Attribute)* S? '>'
and then elsewhere, we find the definition of Name:
[4] NameStartChar ::= ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
[4a] NameChar ::= NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]
[5] Name ::= NameStartChar (NameChar)*
You'll notice that - is not in NameStartChar so this:
<-vikings->1336162202</-vikings->
is not valid XML and this part of your code:
hash = Hash.from_xml(file.read)
is failing because your file doesn't contain XML, it contains text that looks like XML but isn't quite real XML.
Fix your data/mconvert.xml file to contain real XML and try again.
If you try a simple experiment in the Rails console, you'll see what's going on:
> Hash.from_xml('<-vikings->1336162202</-vikings->')
REXML::ParseException: #<REXML::ParseException: malformed XML: missing tag start
Line: 1
Position: 33
Last 80 unconsumed characters:
<-vikings->1336162202</-vikings->>
Notice the "malformed XML: missing tag start"?
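The name rule quoted from the spec can also be checked mechanically before handing the file to the parser. This is only a minimal sketch: it assumes an ASCII-only approximation of NameStartChar and NameChar (the real productions also allow the non-ASCII ranges listed above), and valid_xml_name? is a hypothetical helper, not part of REXML or ActiveSupport.

```ruby
# ASCII-only approximation of the XML 1.0 Name production:
# first character from NameStartChar, the rest from NameChar.
XML_NAME = /\A[A-Za-z_:][A-Za-z_:.0-9-]*\z/

# Hypothetical helper: true if the string could be used as a tag name.
def valid_xml_name?(name)
  XML_NAME.match?(name)
end

puts valid_xml_name?('vikings')    # true
puts valid_xml_name?('vi-kings')   # true  (a dash is fine after the first character)
puts valid_xml_name?('-vikings-')  # false (a dash cannot start a name)
puts valid_xml_name?('1vikings')   # false (neither can a digit)
```

A dash inside a name is allowed, so only leading dashes (or digits and dots) need to be fixed in the source file.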
I'm attempting to parse email body to excel file.
After some manipulations, my current output is an array, where each line is data related to a product.
[
"Periods: 01.01.2023 - 01.02.2023 | Code: 111 | Code2: 1111 | product-name",
"Periods: 01.01.2023 - 01.02.2023 | Code: 222 | Code2: 2222 | product-name2"
]
I need to replace the 3rd occurrence of " | " with " | Product: ", so I can get a Product field before the product name.
I've tried to use Apply to each -> current item -> various ways to find 3rd occurrence and replace it, but can't succeed.
Any suggestion?
You should be able to loop through each item and perform a simple replace expression like this:
replace(item(), split(item(), ' | ')[3], concat('Product: ', split(item(), ' | ')[3]))
That should get you across the line. Of course, I'm basing my answer off the limited information you provided.
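For readers who want to see the same logic outside the flow designer, here is a rough Ruby sketch of the idea (not a Power Automate expression). It assumes every line has exactly the four " | "-separated fields shown in the question; label_product is a hypothetical helper name. Unlike a plain substring replace, it only touches the fourth field, so a product name that also appears earlier in the line is not affected.

```ruby
# Prefix the fourth " | "-separated field with "Product: ".
# Lines with fewer than four fields are returned unchanged.
def label_product(line, sep = ' | ')
  parts = line.split(sep, 4) # limit 4: everything after the 3rd separator stays together
  return line if parts.size < 4

  parts[3] = "Product: #{parts[3]}"
  parts.join(sep)
end

puts label_product('Periods: 01.01.2023 - 01.02.2023 | Code: 111 | Code2: 1111 | product-name')
# Periods: 01.01.2023 - 01.02.2023 | Code: 111 | Code2: 1111 | Product: product-name
```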
I have read the data from a file into an array and kept only the part of that array I need. Basically, what I want is to access each column independently. As the file will keep changing, I don't want anything hard-coded (otherwise I would have done it already :)). The data looks as follows:
Element0: | data | address | type | source | disable |
Element1: | 0x000001 | 0x123456 | in | D | yes |
Element2: | 0x0d0f00 | 0xffffff | out | M | yes |
Element3: | 0xe00ab4 | 0xaefbd1 | in | E | no |
I have tried the regexp /\|\s+.*\s+\|/, but it prints just a few lines (it removes the data I care about). I also tried /\|.*\|/, and it prints nothing at all.
I have googled the split method, and I know this happens because the .* swallows the data I care about. I have also tried the regexp \|\s*\|, but it prints the whole line. I have tried many regexps, but at this moment I can't think of a way to solve this.
Any recommendation?
line_ary = ary_element.split(/\|\s.*\|/)
puts line_ary unless line_ary.nil?
You should use the CSV class instead of trying to parse it with a regex. Something like this will do:
require 'csv'
data = CSV.read('data.csv', col_sep: '|')
You can access rows and columns as a two-dimensional array, e.g. to access row 2, column 4: data[1][3].
If for example you just wanted to print the address column for all rows you could do this instead:
CSV.foreach('data.csv', col_sep: '|') do |row|
  puts row[2]
end
I'd probably use a CSV parser for this but if you want to use a regex and you're sure that you'll never have | inside one of the column values, then you want to say:
row = line.split(/\s*\|\s*/)
so that the whitespace on either side of the pipe becomes part of the delimiter. For example:
> 'Element0: | data | address | type | source | disable |'.split(/\s*\|\s*/)
=> ["Element0:", "data", "address", "type", "source", "disable"]
> 'Element1: | 0x000001 | 0x123456 | in | D | yes |'.split(/\s*\|\s*/)
=> ["Element1:", "0x000001", "0x123456", "in", "D", "yes"]
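Building on that split, the columns can be made accessible by name instead of by position, so nothing is hard-coded beyond the header row. This is a sketch under the assumption that the first line is always the header and that every line starts with an "ElementN:" label; the variable names are illustrative only.

```ruby
lines = [
  'Element0: | data | address | type | source | disable |',
  'Element1: | 0x000001 | 0x123456 | in | D | yes |',
  'Element2: | 0x0d0f00 | 0xffffff | out | M | yes |',
  'Element3: | 0xe00ab4 | 0xaefbd1 | in | E | no |'
]

# Split each line on pipes (with surrounding whitespace) and drop the "ElementN:" label.
rows = lines.map { |line| line.split(/\s*\|\s*/).drop(1) }

# The first row holds the column names; the rest are records.
header, *records = rows
table = records.map { |record| header.zip(record).to_h }

addresses = table.map { |row| row['address'] }
puts addresses.inspect # ["0x123456", "0xffffff", "0xaefbd1"]
```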
Split together with strip might be the easiest option. Have you tried something like this?
"Element3:...".split(/\|/).collect(&:strip)
I'm currently trying to test a service that should be properly replacing certain special characters within a certain Unicode range, including emojis, transportation icons, emoticons and dingbats. I have been using Cucumber and Ruby to do the testing, and my latest scenario outline won't work. I've tried looking up other ways of getting the character from the examples table, however I can't seem to get it working, and the cucumber printout just complains that the Given step isn't defined.
Here is my feature scenario:
Scenario Outline: I update a coupon with a name including emojis/emoticons/dingbats/symbols
Given I have a name variable with a <character> included
When I patch my coupon with this variable
Then the patch should succeed
And the name should include replacement characters
Examples:
| character |
| π³ |
| π |
| π₯ |
| π |
| β |
| β |
| β |
| β |
| β¨ |
| π |
| π¦Ώ° |
And here is my step definition for the Given (which is the step that is complaining that it isn't defined):
Given(/^I have a name variable with a (\w+) included$/) do |char|
  @name = 'min length ' + char
  @json = { 'name' => @name }.to_json
end
I've tried using some regex's to capture the character, and a (\w+) and (\d+), although I can't find information on how to capture the special character. It's possible for me to write 11 different step definitions, but that would be such poor practice it would drive me nuts.
Unless you have spaces in your specials, it's safe to use non-space \S:
Given(/^I have a name variable with a (\S+) included$/) do |char|
...
\w would not give you the desired result, since \w is resolved to [a-zA-Z0-9_].
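The difference is easy to check in plain Ruby, without Cucumber. In this small sketch the step text and the constant names are made up for illustration, and U+1F633 is just one arbitrary emoji:

```ruby
WORD_STEP      = /^I have a name variable with a (\w+) included$/
NON_SPACE_STEP = /^I have a name variable with a (\S+) included$/

# "\u{1F633}" is an emoji written as an escape so the file stays ASCII-only.
step_text = "I have a name variable with a \u{1F633} included"

word_match      = WORD_STEP.match(step_text)      # nil: \w does not cover emoji
non_space_match = NON_SPACE_STEP.match(step_text) # matches and captures the emoji

puts word_match.inspect
puts non_space_match[1] == "\u{1F633}"
```

Because the \w version never matches, Cucumber reports the step as undefined rather than failing inside it.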
I am using a scenario table (multiline step arguments) to check some data from a screen using Cucumber, using the built-in .diff! method on the Cucumber AST table.
I would like to check the content matches against regular expressions.
Scenario: One
Then the table appears as:
| One | Two | Three |
| /\d+/ | /\d+/ | /\d+/ |
The actual table could look something like
| One | Two | Three |
| 123 | 456 | 789 |
so this scenario translates to "as long as there are some digits, I don't care".
An example step implementation that fails:
Then /^the table appears as:$/ do |expected_table|
  actual_table = [['One','Two', 'Three'],['123', '456', '789']]
  expected_table.diff! actual_table
end
Error:
Then the table appears as: # features/step_definitions/my_steps.rb:230
| One | Two | Three |
| /\\d+/ | /\\d+/ | /\\d+/ |
| 123 | 456 | 789 |
Tables were not identical (Cucumber::Ast::Table::Different)
I have tried using step transforms to transform the cells into regular expressions, but they still aren't identical.
Transform code:
expected_table.raw[0].each do |column|
  expected_table.map_column! column do |cell|
    if cell.respond_to? :start_with?
      if cell.start_with? "/"
        cell.to_regexp
      else
        cell
      end
    else
      cell
    end
  end
end
which produces the error:
Then the table appears as: # features/step_definitions/my_steps.rb:228
| One | Two | Three |
| (?-mix:\\d+) | (?-mix:\\d+) | (?-mix:\\d+) |
| 123 | 456 | 789 |
Tables were not identical (Cucumber::Ast::Table::Different)
Any ideas? I am stuck.
Using regular expressions in a scenario is almost certainly the wrong approach. Cucumber features are intended to be read and understood by business-focussed stakeholders.
How about writing the step at a higher level, such as:
Then the first three columns of the table should contain a digit
There is no way to do it without writing your own implementation of the diff! method from Ast::Table. Take a look at cucumber/lib/ast/table.rb. Internally it uses the diff-lcs library to do the actual comparison, and that library doesn't support regex matching.
It seems that you want to write this in a way that provides the cool diff output. Otherwise, I'd look at writing this such that you simply check the rows. It won't be as pretty, and it won't get you the diff of the entire table, but it's something.
Then /^the table appears as:$/ do |expected_table|
  actual_table = [['One','Two', 'Three'],['123', '456', '789']]
  # y is the row index and x the column index, so index the row first
  expected_table.raw.each_with_index { |row, y|
    row.each_with_index { |cell, x|
      actual_table[y][x].should == cell
    }
  }
end
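The cell-by-cell idea can be extended so that cells written as /.../ are treated as patterns. The following is a plain-Ruby sketch that runs without Cucumber; table_mismatches is a hypothetical helper, and anchoring each pattern to the whole cell is an assumption about the desired behaviour, not something diff! does.

```ruby
# Returns the [row, column] positions where the actual table does not
# satisfy the expected one. Expected cells of the form "/.../" are used as
# regular expressions that must match the whole actual cell; all other
# cells are compared literally.
def table_mismatches(expected, actual)
  mismatches = []
  expected.each_with_index do |row, y|
    row.each_with_index do |cell, x|
      actual_cell = actual.dig(y, x).to_s
      pattern = cell.match(%r{\A/(.+)/\z})
      matched = if pattern
                  Regexp.new("\\A(?:#{pattern[1]})\\z").match?(actual_cell)
                else
                  cell == actual_cell
                end
      mismatches << [y, x] unless matched
    end
  end
  mismatches
end

expected = [['One', 'Two', 'Three'], ['/\d+/', '/\d+/', '/\d+/']]
actual   = [['One', 'Two', 'Three'], ['123', '456', '789']]

puts table_mismatches(expected, actual).inspect # []
```

Inside a step definition, expected_table.raw could be passed as the first argument and the step could fail when the returned list is not empty.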
How can I print a large variable nicely in an irb prompt? I have a variable that contains many variables which are long and the printout becomes a mess to wade through. What if I just want the variable names without their values? Or, can I print each one on a separate line, tabbed-in depending on depth?
Use pp (pretty print):
require 'pp'
very_long_hash = Hash[(1..23).zip(20..42)]
pp very_long_hash
# Prints:
{1=>20,
2=>21,
3=>22,
4=>23,
5=>24,
6=>25,
7=>26,
8=>27,
9=>28,
10=>29,
11=>30,
12=>31,
13=>32,
14=>33,
15=>34,
16=>35,
17=>36,
18=>37,
19=>38,
20=>39,
21=>40,
22=>41,
23=>42}
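For the other two parts of the question (names without values, and indentation by depth), a sketch along these lines may help. The config hash is made-up sample data; for a plain object rather than a hash, instance_variables would play the role of keys.

```ruby
require 'pp'

# Made-up nested sample data.
config = {
  server: { host: 'example.test', port: 8080, options: { tls: true, timeout: 30 } },
  users: %w[alice bob carol]
}

# Names only, without their values.
top_level_names = config.keys
puts top_level_names.inspect # [:server, :users]

# PP.pp(object, output, width) appends to the given output and returns it;
# a narrow width forces nested values onto separate, indented lines.
formatted = PP.pp(config, +'', 30)
puts formatted
```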
If you want something that is even more awesome than "pretty" print, you can use "awesome" print. And for a truly spaced out experience, sprinkle some hirbal medicine on your IRb!
Hirb, for example, renders ActiveRecord objects (or pretty much any database access library) as actual ASCII tables:
+-----+-------------------------+-------------+-------------------+-----------+-----------+----------+
| id | created_at | description | name | namespace | predicate | value |
+-----+-------------------------+-------------+-------------------+-----------+-----------+----------+
| 907 | 2009-03-06 21:10:41 UTC | | gem:tags=yaml | gem | tags | yaml |
| 906 | 2009-03-06 08:47:04 UTC | | gem:tags=nomonkey | gem | tags | nomonkey |
| 905 | 2009-03-04 00:30:10 UTC | | article:tags=ruby | article | tags | ruby |
+-----+-------------------------+-------------+-------------------+-----------+-----------+----------+