What is the significance of [1] in FormData? - form-data

I'm inspecting FormData fields in HTTPS requests with Chrome. Here is what I see:
What is the significance of [1] in FormData?

They are the HTML string's escape characters for the square brackets, as processed in the URL. My guess is that [1] and [1][1] refer to array elements.
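If the escape characters in question are percent-encoded square brackets (an assumption, since the original screenshot is not available), the relationship can be sketched with Ruby's standard URI module. The field name "field" is made up for illustration.

```ruby
require 'uri'

# "field" is a hypothetical field name used only for illustration.
key = 'field[1][1]'

# Square brackets are not safe in form-encoded data, so they are escaped.
encoded = URI.encode_www_form_component(key)
puts encoded   # field%5B1%5D%5B1%5D

# Decoding restores the bracketed, array-style key.
decoded = URI.decode_www_form_component(encoded)
puts decoded   # field[1][1]
```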

Related

Removing trailing newlines with regex in Ruby's 'String#scan'

I have a string, which contains a bunch of HTML documents, tagged with #name:
string = "#one\n\n<html>\n</html>\n\n#two\n<html>\n</html>\n\n\n"
I want to get an array of two-element arrays, each with a tag as the first element and the HTML document as the second:
[ ["#one", "<html>\n</html>"], ["#two", "<html>\n</html>"] ]
In order to solve the problem, I crafted the following regular expression:
regex = /(#.+)\n+([^#]+)\n+/
and applied it in string.scan regex.
However, instead of the desired output, I get the following:
[ ["#one", "<html>\n</html>\n"], ["#two", "<html>\n</html>\n\n"] ]
There are trailing newline characters at the end of each document. It appears that only one newline character was removed from each document, while the others stayed in place.
How can the aforementioned regular expression be changed in order to remove all the trailing newline characters from the resulting documents?
Only the last \n was thrown away because the two capturing parts of your regex, .+ and [^#]+, are greedy: they capture everything up to the last \n, which they leave over only so that the match is possible at all. It does not matter that they are followed by \n+. Remember that a regex is matched from left to right. If some substring (a sequence of \n in this case) can fit in either the preceding part or the following part of a regex, it goes to the preceding part.
More generally, I would suggest doing this:
string.split(/\s+(?=#)/).map{|s| s.strip.split(/\s+/, 2)}
# => [["#one", "<html>\n</html>"], ["#two", "<html>\n</html>"]]
You can remove duplicated newlines first:
string.gsub(/\n+/, "\n").scan(regex)
# => [["#one", "<html>\n</html>"], ["#two", "<html>\n</html>"]]
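Following the explanation of greedy matching above, another option is to make the document capture lazy and anchor the trailing newlines with a lookahead. This is only a sketch of that idea, not something proposed in the original answers; the name lazy_regex is made up here.

```ruby
string = "#one\n\n<html>\n</html>\n\n#two\n<html>\n</html>\n\n\n"

# [^#]+? is lazy: it captures as few characters as possible.
# (?=#|\z) requires the run of newlines to reach the next tag or the end
# of the string, so all trailing newlines fall outside the capture group.
lazy_regex = /(#.+)\n+([^#]+?)\n+(?=#|\z)/

result = string.scan(lazy_regex)
p result
# => [["#one", "<html>\n</html>"], ["#two", "<html>\n</html>"]]
```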

Partial string replace with gsub

I have an array of different image URLs where I need to replace "/s_" with "/xl_". I've tried a number of different ways, but none of them seems to work as I expect. Here is my latest version:
available_images.each do |img|
  img.gsub(/.*(\/s_).*\.jpg/, "\/xl_")
end
available_images is the array holding a number of strings (which of course match the provided regex: .*(/s_).*.jpg ).
Any thoughts on how that can be fixed?
Thanks in advance!
A gsub! (with the !, because you iterate with each and not map) with a plain string instead of a regex should work:
"path/to/s_image.jpg".gsub '/s_', '/xl_'
# => "path/to/xl_image.jpg"
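Applied to a whole array, the two variants (map with gsub, or each with gsub!) might look like the sketch below. The sample paths are invented for illustration and stand in for the question's available_images.

```ruby
# Hypothetical sample data standing in for the question's array.
available_images = ["path/to/s_image.jpg", "other/path/s_photo.jpg"]

# Option 1: build a new array with map and the non-mutating gsub.
xl_images = available_images.map { |img| img.gsub('/s_', '/xl_') }

# Option 2: keep each, but change every string in place with gsub!.
in_place = available_images.map(&:dup)   # copies, so the originals stay untouched
in_place.each { |img| img.gsub!('/s_', '/xl_') }

p xl_images
p in_place
```

Plain gsub inside each has no visible effect because it returns a new string that is then discarded.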
Update
As pointed out in the comments, the solution might result in unexpected behavior if the path contains multiple occurrences of '/s_'.
"path/s_thing/s_image.jpg".gsub '/s_', '/xl_'
#=> "path/xl_thing/xl_image.jpg"
#          ▲        ▲
Borodin posted a nice, short regex substitution, which works in that case:
"path/s_thing/s_image.jpg".sub %r|/s_(?!.*/)|, '/xl_'
#=> "path/s_thing/xl_image.jpg"
#          △       ▲
It only replaces the last occurrence of '/s_'.

How do I extract urls from hyperlinks using hpricot?

I'd like to get the actual url strings from the hyperlinks. I'd like my result to be stripped of html.
So, if one of my input strings is
<a href="http://target.com/resource.tar.gz">resource</a>
I'd like to get:
http://target.com/resource.tar.gz
How can I do this?
In Hpricot you access attributes of an element using square brackets (like you would when accessing elements in a Hash). So, to use your example:
require 'hpricot'
doc = Hpricot('<a href="http://target.com/resource.tar.gz">resource</a>')
puts doc.at('a')['href'] # => http://target.com/resource.tar.gz

Storing HTML codes (like &#216;) as letters

I need to read some test data from an HTML document. The problem is that some non-English characters appear there as HTML codes (e.g. &#216; for Ø). How can I change those into a single character? Later I'll need to compare these characters to what the user enters in a web form.
I'm trying to do this in Ruby 1.9.2.
Thanks in advance
This has been asked on SO many times, but I can't find the duplicate. From memory:
require 'cgi'
some_string = '&#216;&amp;&gt;'
p CGI.unescapeHTML(some_string).gsub(/&#(\d+);/){[$1.to_i].pack 'U'}
# => "\u00D8&>"
\u00D8 is your symbol. The &amp;&gt; part is only there to show CGI::unescapeHTML at work.
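The two decoding steps and the later comparison with form input can be sketched as follows. The sample string assumes the HTML code in question is the decimal reference &#216;, and user_input is a stand-in for whatever the web form delivers.

```ruby
require 'cgi'

# Hypothetical input: one decimal character reference plus two named entities.
raw = '&#216;&amp;&gt;'

# Step 1: decode named entities such as &amp; and &gt;
# (recent Ruby versions may already decode &#216; at this point).
step1 = CGI.unescapeHTML(raw)

# Step 2: decode any decimal references that are still left.
decoded = step1.gsub(/&#(\d+);/) { [$1.to_i].pack('U') }

# Stand-in for the value a user typed into the form.
user_input = "\u00D8&>"
puts decoded == user_input
```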

What's the correct way to get HTML from an AJAX Response in JSON Format?

I have an AJAX request that creates a 'post', and upon successful post, I want to get HTML to inject back into the DOM. Right now I'm returning a JSON array that details success/error, and when I have a success I also include the HTML for the post in the response. So, I parse the response as JSON, and set a key in the JSON array to a bunch of HTML Code.
Naturally, the HTML code is making the JSON array break -- what should I do to escape it (or is there a better way to do this?). I get an AJAX response with a JSON array like so:
[{response:"success"},{html:'<div class="this is going to break...
Thanks!
Contrary to what you're probably used to in JavaScript, ' can't begin a string in JSON; it must be a ". Single quotes only work when you're passing JSON to JavaScript, much like <br> works when you want to put an XHTML line break.
So, use " to open the HTML string, and escape the quotes inside it as \".
json.org has more info on what you need to escape. Though the list of special characters isn't long, it's probably best to use a library, as Anurag suggests in a comment.
Apart from escaping double quotes as mentioned by BranTheMan, newlines also break JSON strings. You need to replace newlines with \n.
Personally I've found this to be enough:
// Don't know what your serverside language is, example in javascript syntax:
print(encodeJSON({
  response : "success",
  html : htmlString.replace(/\n/g,'\\n').replace(/"/g,'\\"')
}));
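As a sketch of the library route mentioned above, here is what it could look like if the server side happened to be Ruby (an assumption; the question does not say). The standard json library takes care of quotes and newlines, so no manual replace calls are needed. The markup in html is invented for illustration.

```ruby
require 'json'

# Hypothetical post markup containing both double quotes and a newline.
html = "<div class=\"post\">\nHello</div>"

# JSON.generate escapes the quotes and the newline inside the string value.
payload = JSON.generate({ "response" => "success", "html" => html })
puts payload

# Parsing the payload again yields the original markup unchanged.
round_trip = JSON.parse(payload)
puts round_trip["html"] == html
```

Returning a single object with both keys, rather than an array of one-key objects, also keeps the client-side lookup simple.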
