Base64.decode64 in ruby returning strange results - ruby

I'm having trouble decoding a string with Base64.decode64 in Ruby. For comparison, I'm using this site, which decodes such strings in PHP: https://rnd.feide.no/simplesaml/module.php/saml2debug/debug.php.
The string I'm testing with is:
fZJNT%2BMwEIbvSPwHy%2Fd8tMvHympSdUGISuwS0cCBm%2BtMUwfbk%2FU4zfLvSVMq2Euv45n3fd7xzOb%2FrGE78KTRZXwSp5yBU1hpV2f8ubyLfvJ5fn42I2lNKxZd2Lon%2BNsBBTZMOhLjQ8Y77wRK0iSctEAiKLFa%2FH4Q0zgVrceACg1ny9uMy7rCdaM2%2Bs0BWrtppK2UAdeoVjW2ruq1bevGImcvR6zpHmtJ1MHSUZAuDKU0vY7Si2h6VU5%2BiMuJuLx65az4dPql3SHBKaz1oYnEfVkWUfG4KkeBna7A%2Fxm6M14j1gZihZazBRH4MODcoKPOgl%2BB32kFz08PGd%2BG0JJIkr7v46%2BhRCaEpod17DCRivYZCkmkd4N28B3wfNyrGKP5bws9DS6PKDz%2FMpsl36Tyz%2F%2Fax1jeFmi0emcLY7C%2F8SDD0Z7dobcynHbbV3QVbcZW0TlqQemNhoqzJD%2B4%2Fn8Yw7l8AA%3D%3D
The output should be:
<?xml version="1.0" encoding="UTF-8"?>
<samlp:AuthnRequest xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" ID="agdobjcfikneommfjamdclenjcpcjmgdgbmpgjmo" Version="2.0" IssueInstant="2007-04-26T13:51:56Z" ProtocolBinding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" ProviderName="google.com" AssertionConsumerServiceURL="https://www.google.com/a/solweb.no/acs" IsPassive="true"><saml:Issuer xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">google.com</saml:Issuer><samlp:NameIDPolicy AllowCreate="true" Format="urn:oasis:names:tc:SAML:2.0:nameid-format:unspecified" /></samlp:AuthnRequest>
But in Ruby I keep getting a strange output:
}\222MO?0\206?H?\a??|\264???jRuA\210J????\233?LS\aۓ?8???IS*?K\257??}????\377\254a;??e|\022\247\234\201SXiWg????~?y~~6#iM+\026]غ'??6L:\022?C?;?J?$\234\264#\"(\261Z?~?8\255ǀ\n\rg??\214˺?u\2436??Z?i\244\255\224\001רV5\266\256?m??"g/G\254?kI???Q\220.\f\2454\275\216ҋhzUN~\210ˉ\270\274z??t???!?)\254????}Y\026Q?*G\201\235\256??\031\2723^#?b\205\226\263\005\021?0?ܠ\243΂_\201?i\005?O\017\031߆ВH\222\276??\241D&\204\246\207u?0?\212?
I\244w\203v??|ܫ\030\243?o
.\217(<\3772\233%ߤ?????X?h\264zg\vc\260\277? ??\236ݡ\2672\234v?Wt\025m?V?9jA鍆\212\263$?\270\376\177\030ù|\000
The code I use is:
require 'cgi'
require 'base64'
Base64::decode64(CGI::unescape('fZJNT%2BMwEIbvSPwHy%2Fd8tMvHympSdUGISuwS0cCBm%2BtMUwfbk%2FU4zfLvSVMq2Euv45n3fd7xzOb%2FrGE78KTRZXwSp5yBU1hpV2f8ubyLfvJ5fn42I2lNKxZd2Lon%2BNsBBTZMOhLjQ8Y77wRK0iSctEAiKLFa%2FH4Q0zgVrceACg1ny9uMy7rCdaM2%2Bs0BWrtppK2UAdeoVjW2ruq1bevGImcvR6zpHmtJ1MHSUZAuDKU0vY7Si2h6VU5%2BiMuJuLx65az4dPql3SHBKaz1oYnEfVkWUfG4KkeBna7A%2Fxm6M14j1gZihZazBRH4MODcoKPOgl%2BB32kFz08PGd%2BG0JJIkr7v46%2BhRCaEpod17DCRivYZCkmkd4N28B3wfNyrGKP5bws9DS6PKDz%2FMpsl36Tyz%2F%2Fax1jeFmi0emcLY7C%2F8SDD0Z7dobcynHbbV3QVbcZW0TlqQemNhoqzJD%2B4%2Fn8Yw7l8AA%3D%3D'))
What could possibly be wrong? Thanks in advance.

That string is not a plain Base64 encoding of your XML. If you pass the first bit of the XML (<?x) through Base64.encode64() and then CGI.escape(), you get:
PD94
at the start, which is nothing like your string. In fact, your first four characters "fZJN" are values 31, 25, 9 and 13 in base 64 so will give you:
011111 011001 001001 001101
then, grouping them in octets instead of sextets (I guess that's the right word):
01111101 10010010 01001101
7D 92 4D
which are not the characters you're expecting to see.
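The sextet-to-octet arithmetic above can be reproduced with a few lines of Ruby. This is only a sketch for checking the numbers by hand; the helper name sextets_to_hex and the ALPHABET constant are made up for illustration and are not part of any library.

```ruby
# The standard Base64 alphabet: a character's index in this array is its 6-bit value.
ALPHABET = [*'A'..'Z', *'a'..'z', *'0'..'9', '+', '/'].freeze

# Hypothetical helper: turn a 4-character Base64 chunk into hex octets.
def sextets_to_hex(chunk)
  values = chunk.chars.map { |c| ALPHABET.index(c) }      # e.g. [31, 25, 9, 13]
  bits = values.map { |v| v.to_s(2).rjust(6, '0') }.join  # 24 bits in total
  bits.scan(/[01]{8}/).map { |octet| format('%02X', octet.to_i(2)) }
end

puts sextets_to_hex('fZJN').join(' ')  # 7D 92 4D
puts sextets_to_hex('PD94').join(' ')  # 3C 3F 78, the bytes of "<?x"
```

The first chunk of the question's string decodes to 7D 92 4D, while the first chunk of a plain Base64 encoding of the XML decodes to the expected "<?x".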
Putting the whole string in gives you:
PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiPz4gPHNhbWxw
OkF1dGhuUmVxdWVzdCB4bWxuczpzYW1scD0idXJuOm9hc2lzOm5hbWVzOnRj
OlNBTUw6Mi4wOnByb3RvY29sIiBJRD0iYWdkb2JqY2Zpa25lb21tZmphbWRj
bGVuamNwY2ptZ2RnYm1wZ2ptbyIgVmVyc2lvbj0iMi4wIiBJc3N1ZUluc3Rh
bnQ9IjIwMDctMDQtMjZUMTM6NTE6NTZaIiBQcm90b2NvbEJpbmRpbmc9InVy
bjpvYXNpczpuYW1lczp0YzpTQU1MOjIuMDpiaW5kaW5nczpIVFRQLVBPU1Qi
IFByb3ZpZGVyTmFtZT0iZ29vZ2xlLmNvbSIgQXNzZXJ0aW9uQ29uc3VtZXJT
ZXJ2aWNlVVJMPSJodHRwczovL3d3dy5nb29nbGUuY29tL2Evc29sd2ViLm5v
L2FjcyIgSXNQYXNzaXZlPSJ0cnVlIj48c2FtbDpJc3N1ZXIgeG1sbnM6c2Ft
bD0idXJuOm9hc2lzOm5hbWVzOnRjOlNBTUw6Mi4wOmFzc2VydGlvbiI+Z29v
Z2xlLmNvbTwvc2FtbDpJc3N1ZXI+PHNhbWxwOk5hbWVJRFBvbGljeSBBbGxv
d0NyZWF0ZT0idHJ1ZSIgRm9ybWF0PSJ1cm46b2FzaXM6bmFtZXM6dGM6U0FN
TDoyLjA6bmFtZWlkLWZvcm1hdDp1bnNwZWNpZmllZCIgLz48L3NhbWxwOkF1
dGhuUmVxdWVzdD4=
When you escape that, you get:
PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiPz4gPHNhbWxw%0AOkF1dGhuUmVxdWVzdCB4bWxuczpzYW1scD0idXJuOm9hc2lzOm5hbWVzOnRj%0AOlNBTUw6Mi4wOnByb3RvY29sIiBJRD0iYWdkb2JqY2Zpa25lb21tZmphbWRj%0AbGVuamNwY2ptZ2RnYm1wZ2ptbyIgVmVyc2lvbj0iMi4wIiBJc3N1ZUluc3Rh%0AbnQ9IjIwMDctMDQtMjZUMTM6NTE6NTZaIiBQcm90b2NvbEJpbmRpbmc9InVy%0AbjpvYXNpczpuYW1lczp0YzpTQU1MOjIuMDpiaW5kaW5nczpIVFRQLVBPU1Qi%0AIFByb3ZpZGVyTmFtZT0iZ29vZ2xlLmNvbSIgQXNzZXJ0aW9uQ29uc3VtZXJT%0AZXJ2aWNlVVJMPSJodHRwczovL3d3dy5nb29nbGUuY29tL2Evc29sd2ViLm5v%0AL2FjcyIgSXNQYXNzaXZlPSJ0cnVlIj48c2FtbDpJc3N1ZXIgeG1sbnM6c2Ft%0AbD0idXJuOm9hc2lzOm5hbWVzOnRjOlNBTUw6Mi4wOmFzc2VydGlvbiI%2BZ29v%0AZ2xlLmNvbTwvc2FtbDpJc3N1ZXI%2BPHNhbWxwOk5hbWVJRFBvbGljeSBBbGxv%0Ad0NyZWF0ZT0idHJ1ZSIgRm9ybWF0PSJ1cm46b2FzaXM6bmFtZXM6dGM6U0FN%0ATDoyLjA6bmFtZWlkLWZvcm1hdDp1bnNwZWNpZmllZCIgLz48L3NhbWxwOkF1%0AdGhuUmVxdWVzdD4%3D%0A
So, the bottom line is that you're getting junk from the decode because the data is not of the correct format.

It appears that the data is also deflated/compressed.
require 'cgi'
require 'base64'
require 'zlib'
# URL-unescape and Base64-decode first; the result is still raw DEFLATE data.
deflated = Base64.decode64(CGI.unescape('fZJNT%2BMwEIbvSPwHy%2Fd8tMvHympSdUGISuwS0cCBm%2BtMUwfbk%2FU4zfLvSVMq2Euv45n3fd7xzOb%2FrGE78KTRZXwSp5yBU1hpV2f8ubyLfvJ5fn42I2lNKxZd2Lon%2BNsBBTZMOhLjQ8Y77wRK0iSctEAiKLFa%2FH4Q0zgVrceACg1ny9uMy7rCdaM2%2Bs0BWrtppK2UAdeoVjW2ruq1bevGImcvR6zpHmtJ1MHSUZAuDKU0vY7Si2h6VU5%2BiMuJuLx65az4dPql3SHBKaz1oYnEfVkWUfG4KkeBna7A%2Fxm6M14j1gZihZazBRH4MODcoKPOgl%2BB32kFz08PGd%2BG0JJIkr7v46%2BhRCaEpod17DCRivYZCkmkd4N28B3wfNyrGKP5bws9DS6PKDz%2FMpsl36Tyz%2F%2Fax1jeFmi0emcLY7C%2F8SDD0Z7dobcynHbbV3QVbcZW0TlqQemNhoqzJD%2B4%2Fn8Yw7l8AA%3D%3D'))
# A negative window size tells zlib to expect a raw DEFLATE stream (no zlib header).
zlib = Zlib::Inflate.new(-Zlib::MAX_WBITS)
zlib.inflate(deflated)
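To make the order of the layers explicit, here is a minimal round-trip sketch: raw DEFLATE, then Base64, then URL escaping, and the reverse on the way back. The helper names saml_encode and saml_decode are hypothetical and only illustrate the layering; they are not part of any SAML library.

```ruby
require 'cgi'
require 'base64'
require 'zlib'

# Hypothetical helper: raw DEFLATE -> Base64 -> URL-escape.
def saml_encode(xml)
  # Negative window bits produce a raw DEFLATE stream without the zlib header.
  deflater = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -Zlib::MAX_WBITS)
  raw = deflater.deflate(xml, Zlib::FINISH)
  deflater.close
  CGI.escape(Base64.strict_encode64(raw))
end

# Hypothetical helper: URL-unescape -> Base64-decode -> inflate.
def saml_decode(param)
  raw = Base64.decode64(CGI.unescape(param))
  inflater = Zlib::Inflate.new(-Zlib::MAX_WBITS)
  xml = inflater.inflate(raw)
  inflater.close
  xml
end

sample = '<samlp:AuthnRequest ID="example"/>'
puts saml_decode(saml_encode(sample))
```

Decoding with Base64 alone stops one layer too early, which is why the output in the question looks like binary junk.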

Related

Parsing out contents of XML tag in Ruby

I have an XML document whose payload, as I understand it, has already been parsed into tags. My goal is to extract all the information inside the <GetResidentsContactInfoResult> tag. In the sample XML below, that tag holds two records, each beginning with the Lease PropertyId key. How can I iterate over the <GetResidentsContactInfoResult> tag and print the key/value pairs for each record? I'm new to Ruby and to working with XML files; is this something I can do with Nokogiri?
<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soap:Body>
<GetResidentsContactInfoResponse xmlns="http://tempuri.org/">
<GetResidentsContactInfoResult><PropertyResidents><Lease PropertyId="21M" BldgID="00" UnitID="0903" ResiID="3" occustatuscode="P" occustatuscodedescription="Previous" MoveInDate="2016-01-07T00:00:00" MoveOutDate="2016-02-06T00:00:00" LeaseBeginDate="2016-01-07T00:00:00" LeaseEndDate="2017-01-31T00:00:00" MktgSource="DBY" PrimaryEmail="noemail1#fake.com"><Occupant PropertyId="21M" BldgID="00" UnitID="0903" ResiID="3" OccuSeqNo="3444755" OccuFirstName="Efren" OccuLastName="Cerda" Phone2No="(832) 693-9448" ResponsibleFlag="Responsible" /></Lease><Lease PropertyId="21M" BldgID="00" UnitID="0908" ResiID="2" occustatuscode="P" occustatuscodedescription="Previous" MoveInDate="2016-02-20T00:00:00" MoveOutDate="2016-04-25T00:00:00" LeaseBeginDate="2016-02-20T00:00:00" LeaseEndDate="2017-02-28T00:00:00" MktgSource="PW" PrimaryEmail="noemail1#fake.com"><Occupant PropertyId="21M" BldgID="00" UnitID="0908" ResiID="2" OccuSeqNo="3451301" OccuFirstName="Donna" OccuLastName="Mclean" Phone2No="(713) 785-4240" ResponsibleFlag="Responsible" /></Lease></PropertyResidents></GetResidentsContactInfoResult>
</GetResidentsContactInfoResponse>
</soap:Body>
</soap:Envelope>
This uses Nokogiri to find all the GetResidentsContactInfoResponse elements, and then Active Support to convert the inner text to a hash of key-value pairs.
Read "sparklemotion/nokogiri" and "Tutorials" regarding installing and using Nokogiri.
Read "Active Support Core Extensions" about more capabilities of Active Support (though the guide does not include Hash.from_xml). To install it simply do gem install activesupport.
I assume you're fine with Nokogiri as you mentioned it in your question.
If you don't want to use Active Support, consider looking into "Convert a Nokogiri document to a Ruby Hash" as an alternative to the line Hash.from_xml(elm.text):
# Needed for Nokogiri::XML
require 'nokogiri'
# Needed in order to use `Hash.from_xml`
require 'active_support/core_ext/hash/conversions'
def find_key_values(str)
  doc = Nokogiri::XML(str)
  # Ignore namespaces for easier traversal
  doc.remove_namespaces!
  doc.css('GetResidentsContactInfoResponse').map do |elm|
    Hash.from_xml(elm.text)
  end
end
Usage:
# Option 1: if your XML above is stored in a variable called `string`
find_key_values string
# Option 2: if your XML above is stored in a file
find_key_values File.open('/path/to/file')
Which returns:
[{"PropertyResidents"=>
{"Lease"=>
[{"PropertyId"=>"21M",
"BldgID"=>"00",
"UnitID"=>"0903",
"ResiID"=>"3",
"occustatuscode"=>"P",
"occustatuscodedescription"=>"Previous",
"MoveInDate"=>"2016-01-07T00:00:00",
"MoveOutDate"=>"2016-02-06T00:00:00",
"LeaseBeginDate"=>"2016-01-07T00:00:00",
"LeaseEndDate"=>"2017-01-31T00:00:00",
"MktgSource"=>"DBY",
"PrimaryEmail"=>"noemail1#fake.com",
"Occupant"=>
{"PropertyId"=>"21M",
"BldgID"=>"00",
"UnitID"=>"0903",
"ResiID"=>"3",
"OccuSeqNo"=>"3444755",
"OccuFirstName"=>"Efren",
"OccuLastName"=>"Cerda",
"Phone2No"=>"(832) 693-9448",
"ResponsibleFlag"=>"Responsible"}},
{"PropertyId"=>"21M",
"BldgID"=>"00",
"UnitID"=>"0908",
"ResiID"=>"2",
"occustatuscode"=>"P",
"occustatuscodedescription"=>"Previous",
"MoveInDate"=>"2016-02-20T00:00:00",
"MoveOutDate"=>"2016-04-25T00:00:00",
"LeaseBeginDate"=>"2016-02-20T00:00:00",
"LeaseEndDate"=>"2017-02-28T00:00:00",
"MktgSource"=>"PW",
"PrimaryEmail"=>"noemail1#fake.com",
"Occupant"=>
{"PropertyId"=>"21M",
"BldgID"=>"00",
"UnitID"=>"0908",
"ResiID"=>"2",
"OccuSeqNo"=>"3451301",
"OccuFirstName"=>"Donna",
"OccuLastName"=>"Mclean",
"Phone2No"=>"(713) 785-4240",
"ResponsibleFlag"=>"Responsible"}}]}}]

Nokogiri removing xml encoding

I am using Nokogiri to decode some XML whose values contain HTML. I am seeing strange behavior when parsing it: Nokogiri appears to remove some of the HTML-encoded tags, so when I parse the HTML I am unable to decode it properly. See the examples below:
doc = Nokogiri::XML '<?xml version="1.0"?><manifest
xmlns="http://www.imsglobal.org/xsd/imscp_v1p1"
identifier="Manifest-eaf97d26-aa83-4399-8e9b-ae9f6f5fc6a2"
xmlns="http://www.imsglobal.org/xsd/imscp_v1p1"
xmlns:imsmd="http://www.imsglobal.org/xsd/imsmd_v1p2"
xmlns:imsqti="http://www.imsglobal.org/xsd/imsqti_v2p1">
<imsmd:langstring><p>
 These are the<strong>instructions</strong> for the pool</p></imsmd:langstring>'
this yields the following value:
"<?xml version=\"1.0\"?>\n<manifest xmlns=\"http://www.imsglobal.org/xsd/imscp_v1p1\" xmlns:imsmd=\"http://www.imsglobal.org/xsd/imsmd_v1p2\" xmlns:imsqti=\"http://www.imsglobal.org/xsd/imsqti_v2p1\" identifier=\"Manifest-eaf97d26-aa83-4399-8e9b-ae9f6f5fc6a2\">\n<imsmd:langstring>p
 These are thestrong instructions/strong for the pool/p</imsmd:langstring></manifest>\n"
Notice how the < > tags are missing. However the following works as expected.
doc = Nokogiri::XML '<?xml version="1.0"?><imsmd:langstring><p>
 These are the<strong> instructions</strong> for the pool</p></imsmd:langstring>'
and gives the following result
"<?xml version=\"1.0\"?>\n<imsmd:langstring><p>
 These are the<strong> instructions</strong> for the pool</p></imsmd:langstring>\n"
I am sure I am missing something but can't figure out what is causing this.

Using ruby SAX parsers for GB2312 encoded xml

Good day,
I have a lot of big XML files that I need to parse, but the problem is that they use the 'gb2312' encoding. I would normally use a SAX parser for this.
Here is an example of the XML:
<?xml version="1.0" encoding="gb2312"?>
<Root>
<ValueList Count="112290" FieldCount="11">
<Item1 Value1="23743" Value2="Дипломатия � Пустой кувшин" Value3="1" Value4="" Value5="6" Value6="0" Value7="0" Value8="0" Value9="0" Value10="0" Value11="0"/>
<Item2 Value1="6611" Value2="ДЛ � 018 омела � золотой кинжал" Value3="1" Value4="" Value5="6" Value6="0" Value7="0" Value8="0" Value9="0" Value10="0" Value11="0"/>
<Item3 Value1="6608" Value2="Наука (ДЛ)�круг фей 021�тяпка" Value3="1" Value4="" Value5="6" Value6="0" Value7="0" Value8="0" Value9="0" Value10="0" Value11="0"/>
<Item4 Value1="6612" Value2="Знаки ДЛ � 003руны � разрушение" Value3="1" Value4="" Value5="6" Value6="0" Value7="0" Value8="0" Value9="0" Value10="0" Value11="0"/>
....
</Root>
I'm trying to use the Nokogiri SAX parser (I also tried libxml-ruby, with the same result):
require 'nokogiri'
class SchemaParser < Nokogiri::XML::SAX::Document
  def initialize
    # Counter of <Item1> elements seen so far
    @cnt = 0
  end
  def start_element(name, attrs = [])
    if name == "Item1"
      @cnt += 1
      puts @cnt
    end
  end
end
parser = Nokogiri::XML::SAX::Parser.new(SchemaParser.new)
parser.parse_io(File.open('2_4_EQUIPMENT_ESSENCE.xml'), 'gb2312')
But this gives the error "`check_encoding': 'GB2312' is not a valid encoding (ArgumentError)". If I remove the encoding declaration and let Nokogiri detect the encoding itself, I get this error:
encoding error : input conversion failed due to input error, bytes 0xA8 0x43 0x20 0xA7
encoding error : input conversion failed due to input error, bytes 0xA8 0x43 0x20 0xA7
I/O error : encoder error
I also tried to open File with proper encoding, but that didn't help SAX parser:
[3] pry(main)> f = File.open('2_4_EQUIPMENT_ESSENCE.xml', "r:gb2312")
=> #<File:2_4_EQUIPMENT_ESSENCE.xml>
[4] pry(main)> f.external_encoding.name
=> "GB2312"
Has anyone used the 'gb2312' encoding with SAX parsers in Ruby? Any recommendations on how to proceed?
It seems the issue is that Libxml2 does not support the GB2312 encoding (see here for a list of supported encodings).
I'm not sure if you have tried this, but I think you can work around it by removing the encoding declaration from the XML files (so Libxml2 does not try to transcode the data) and opening the File with GB2312 as the external encoding and UTF-8 as the internal encoding (for example, mode "r:GB2312:UTF-8"). Ruby then transcodes the file to UTF-8 as it is read, and from then on everything remains UTF-8. Setting only the external encoding, as in "r:gb2312", tags the strings as GB2312 but does not transcode them unless Encoding.default_internal is set.
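As a rough illustration of transcoding on read, the sketch below writes a small GB18030-encoded file and reads it back as UTF-8 using an "external:internal" mode string. The helper name read_as_utf8 and the temporary file are assumptions made for the example; GB18030 is used here as the source encoding, but any encoding Ruby can transcode could be passed instead.

```ruby
require 'tempfile'

# Hypothetical helper: read a file in the given source encoding and
# return its contents transcoded to UTF-8.
def read_as_utf8(path, source_encoding = 'GB18030')
  # "r:EXTERNAL:INTERNAL" makes Ruby transcode while reading.
  File.open(path, "r:#{source_encoding}:UTF-8", &:read)
end

# Build a throwaway GB18030 file so the example is self-contained.
sample = Tempfile.new(['sample', '.xml'])
sample.binmode
sample.write('золотой кинжал'.encode('GB18030'))
sample.close

decoded = read_as_utf8(sample.path)
puts decoded
puts decoded.encoding

sample.unlink
```

The string that comes back is already UTF-8, so whatever consumes it afterwards never has to deal with the original encoding.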
So, here is my workaround.
Problems:
Some of the characters in the XML files are not valid 'gb2312'; I found that 'GB18030' is a better choice because it covers the full set of Chinese characters.
I converted all the XML files to UTF-8 so that I can use the SAX parser.
I ended up with this rake task:
desc "convert chinese xml files to utf-8"
task :convert do
  rm_rf 'data/utf8'
  mkdir 'data/utf8'
  Dir.foreach('data') do |f|
    if f.end_with?('.xml')
      puts "converted:: data/utf8/#{f}" if system("iconv -f GB18030 -t UTF-8 data/#{f} > data/utf8/#{f}")
    end
  end
  # replace the encoding declaration in the converted XML files
  system("bundle exec ruby -pi -e \"gsub(/gb2312/, 'UTF-8')\" data/utf8/*.xml")
end

I always get an UndefinedConversionError in Ruby 2.0 while scraping with Mechanize

When I try to submit a textarea with Mechanize and Ruby 2.0, I always get an
Encoding::UndefinedConversionError: U+0151 from UTF-8 to ISO-8859-1
Then I tried to convert the text with Iconv and got a similar result:
Iconv.iconv("LATIN1", "UTF-8", text)
I get this error message:
Iconv::IllegalSequence: "őzködik, melyet "...
The text contains East European characters. What can I do to avoid this kind of problem, and how can I convert properly between different encodings?
I have found an elegant solution:
replacements = [["À", "À"], ["Á", "Á"], ["Â", "Â"], ["Ã", "Ã"], ["Ä", "Ä"], ["Å", "Å"], ["Æ", "Æ"], ["Ç", "Ç"], ["È", "È"], ["É", "É"], ["Ê", "Ê"], ["Ë", "Ë"], ["Ì", "Ì"], ["Í", "Í"], ["Î", "Î"], ["Ï", "Ï"], ["Ð", "Ð"], ["Ñ", "Ñ"], ["Ò", "Ò"], ["Ó", "Ó"], ["Ô", "Ô"], ["Õ", "Õ"], ["Ö", "Ö"], ["Ø", "Ø"], ["Ù", "Ù"], ["Ú", "Ú"], ["Û", "Û"], ["Ü", "Ü"], ["Ý", "Ý"], ["Þ", "Þ"], ["ß", "ß"], ["à", "à"], ["á", "á"], ["â", "â"], ["ã", "ã"], ["ä", "ä"], ["å", "å"], ["æ", "æ"], ["ç", "ç"], ["è", "è"], ["é", "é"], ["ê", "ê"], ["ë", "ë"], ["ì", "ì"], ["í", "í"], ["î", "î"], ["ï", "ï"], ["ð", "ð"], ["ñ", "ñ"], ["ò", "ò"], ["ó", "ó"], ["ô", "ô"], ["õ", "õ"], ["ö", "ö"], ["ø", "ø"], ["ù", "ù"], ["ú", "ú"], ["û", "û"], ["ü", "ü"], ["ý", "ý"], ["þ", "þ"], ["ÿ", "ÿ"]]
def replace(str, replacements)
  replacements.each { |replacement| str.gsub!(replacement[0], replacement[1]) }
  str
end
my_string = replace(my_string, replacements)
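A different approach, not the one used in the answer above, is to let String#encode handle the characters that ISO-8859-1 cannot represent. The sketch below shows two options: a fallback that emits numeric character references, and a lossy replacement with "?". The helper name to_latin1 is hypothetical, and whether numeric character references are acceptable depends on what the receiving form expects.

```ruby
# Hypothetical helper: convert UTF-8 text to ISO-8859-1, turning characters
# that Latin-1 cannot represent into numeric character references.
def to_latin1(text)
  text.encode('ISO-8859-1', fallback: ->(char) { format('&#%d;', char.ord) })
end

encoded = to_latin1('őzködik')
puts encoded.encode('UTF-8')  # &#337;zködik

# Lossy alternative: replace every unrepresentable character with "?".
lossy = 'őzködik'.encode('ISO-8859-1', undef: :replace, replace: '?')
puts lossy.encode('UTF-8')    # ?zködik
```

Here "ő" is U+0151, the character named in the error message, while "ö" exists in ISO-8859-1 and passes through unchanged.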

Parse string to JSON/Hash

I'm trying to convert the following string to either a hash or json.
How do I do this in ruby?
[{"place":null,"coordinates":null,"in_reply_to_user_id":null,"in_reply_to_status_id":null,
"favorited":false,"truncated":false,"created_at":"Wed Nov 16 08:00:46 +0000 2011","retweet_count":0,"in_reply_to_screen_name":null,
"user":{"profile_background_image_url":"http:\/\/a1.twimg.com\/profile_background_images\/190989640\/afcx.jpg","protected":false,
"statuses_count":23414,"profile_link_color":"FF0000"},"retweeted":false,"in_reply_to_status_id_str":null,"in_reply_to_user_id_str":null,"contributors":null,"geo":null}]
I'm running Ruby 1.8.7.
What you have appears to be JSON already, so I assume you're looking to get a Ruby Hash from it. If so, this should work:
Get a JSON library. I used gem install json_pure, which is a pure-Ruby implementation (there's a faster, C-based version, but you wouldn't notice the difference unless your JSON strings are very large or you have a lot of them).
Then:
require 'json'
arr = JSON(your_json_string_here)
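The same parse can be written with JSON.parse, which also makes it easy to handle malformed input. The following is a sketch that assumes a reasonably current json library; the shortened sample string and the helper name parse_tweets are made up for illustration.

```ruby
require 'json'

# Shortened stand-in for the string in the question (hypothetical sample).
json = '[{"place":null,"favorited":false,"retweet_count":0,' \
       '"user":{"statuses_count":23414,"profile_link_color":"FF0000"}}]'

# Hypothetical helper: parse the string, returning nil if it is not valid JSON.
def parse_tweets(json)
  JSON.parse(json)
rescue JSON::ParserError => e
  warn "not valid JSON: #{e.message}"
  nil
end

tweets = parse_tweets(json)
puts tweets.class                             # Array
puts tweets[0]['user']['profile_link_color']  # FF0000
puts tweets[0]['place'].inspect               # nil, from JSON null

bad = parse_tweets('[{"place":')              # truncated input
puts bad.inspect                              # nil
```

JSON null becomes nil, JSON objects become Hashes with string keys, and nested objects are reached by chaining the keys.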
Note that the string you gave is a single-element array containing something that will map to a Ruby Hash. If you just want the hash:
the_hash = arr[0] # or maybe arr.first
I get this:
{"coordinates"=>nil, "created_at"=>"Wed Nov 16 08:00:46 +0000 2011",
"truncated"=>false, "favorited"=>false, "in_reply_to_user_id_str"=>nil,
"contributors"=>nil, "in_reply_to_status_id_str"=>nil, "retweet_count"=>0,
"geo"=>nil, "retweeted"=>false, "in_reply_to_user_id"=>nil,
"user"=>{"profile_link_color"=>"FF0000", "protected"=>false,
"statuses_count"=>23414,
"profile_background_image_url"=>"http://a1.twimg.com/profile_background_images/190989640/afcx.jpg"},
"in_reply_to_screen_name"=>nil, "place"=>nil, "in_reply_to_status_id"=>nil}
