String Include is not matching - ruby

I'm having an issue where a link string that contains ".pdf" does not match with include? in Ruby. Example code:
link = somelink.pdf
puts link.include?(".pdf")
Output when I run the program.
http://somelink.com/somepdf.pdf
false

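include? only does a substring search when link is a String. If link is some other kind of object, try converting it to a string first:
link = somelink.pdf
puts link.to_s.include?(".pdf")
Or compare the extension:
File.extname(link.to_s) == ".pdf"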
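One way to end up with this symptom, offered as an assumption because the question does not show how link is built: link is an Array holding the URL rather than a String. puts prints an Array's elements one per line, so the output looks like a plain string, but Array#include? compares whole elements instead of searching for a substring. A minimal sketch:

```ruby
# Hypothetical reproduction: link is a one-element Array, not a String.
link = ["http://somelink.com/somepdf.pdf"]

# Array#include? asks "is there an element equal to '.pdf'?"
array_match = link.include?(".pdf")

# String#include? asks "does this text contain '.pdf'?"
string_match = link.to_s.include?(".pdf")

# Working on the element itself is usually cleaner than calling to_s on the Array.
element_match = link.first.include?(".pdf")
extension = File.extname(link.first)
```

Checking link.class in the real program would show whether this is what is happening.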

Related

How to search a XML file using the value in ARGV[1]

I am trying to search a file using a value from the ARGV array, but doc.at is not working. I have set the variable keyword to ARGV[1], and the value I pass in prints to the console correctly, but when I puts the variable text it comes up blank.
require 'nokogiri'
input = ARGV[0]
keyword = ARGV[1]
case input
when input = "list"
doc = File.open("emails.xml") { |f| Nokogiri::XML(f) }
text = doc.at('record:contains("{keyword}")')
puts text
puts keyword
else
puts "no"
end
Your string interpolation is wrong.
Change it to:
doc.at("record:contains('#{keyword}')")
Interpolation needs the #{} form, and it only works inside a double-quoted string, so start the string with " and use single quotes inside it.
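The difference can be seen without Nokogiri at all. This is a small sketch with a made-up keyword value; only the quoting behaviour is the point:

```ruby
keyword = "alice" # hypothetical search term standing in for ARGV[1]

# Single quotes: nothing is interpolated, the text is kept as typed.
single_quoted = 'record:contains("#{keyword}")'

# Double quotes with #{}: the variable's value is inserted.
double_quoted = "record:contains('#{keyword}')"

puts single_quoted
puts double_quoted
```

Only the second string contains the actual keyword, so only it can match anything in the document.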

How do I search a text file for a string and then print/return/put out what line of the file it was found on as a number in Ruby

So, I want to use the number I get from it in this:
line = answer to question
database.read.lines[line]
Database being the text file I am searching in.
You can also do it this way:
text_to_find = 'some random text' # or use gets.chomp to take input from the user
text_found_at_index = database.readlines.index { |line| line.include?(text_to_find) }
The result is the zero-based index of the first matching line, or nil if nothing matches. Hope this is what you need :)
I would try something like this:
query = gets.chomp
database.each_line.with_index do |line, index|
if line.include?(query)
puts "Line #{index}: #{line}"
end
end
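To tie this back to the question, here is a sketch that finds the index and then uses it to fetch the line. It uses a StringIO with made-up contents in place of the real database file, which is an assumption for the sake of a self-contained example:

```ruby
require 'stringio'

# Hypothetical stand-in for the opened text file.
database = StringIO.new("alpha\nbeta\ngamma\n")
query = "beta"

# Zero-based index of the first line containing the query, or nil if none does.
line_index = database.each_line.find_index { |line| line.include?(query) }

# each_line consumed the stream, so rewind before reading again.
database.rewind
found_line = line_index ? database.readlines[line_index] : nil

# Add 1 when a human-friendly line number is wanted.
line_number = line_index ? line_index + 1 : nil
```

Because the index is zero-based it can be passed straight to lines[...], while the printed line number usually wants the + 1.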

Using binary data (strings in utf-8) from external file

I have a problem with strings in UTF-8 format, e.g. "\u0161\u010D\u0159\u017E\u00FD".
When such a string is defined as a variable in my program, it works fine. But when I read the same string from an external file, I get the wrong output (not what I expect). I'm clearly missing some necessary encoding step...
My code:
file = "c:\\...\\vlmList_unicode.txt" #\u306b\u3064\u3044\u3066
data = File.open(file, 'rb') { |io| io.read.split(/\t/) }
puts data
data_var = "\u306b\u3064\u3044\u3066"
puts data_var
Output:
\u306b\u3064\u3044\u3066 # what I don't want
について # what I want
I'm trying to read the file in binary mode by specifying 'rb', but obviously there is some other problem...
I run my code in NetBeans 7.3.1 with the built-in JRuby 1.7.3 (I also tried Ruby 2.0.0, with the same result).
Since I'm new to the Ruby world, any ideas are welcome...
If your file contains the literal escaped string:
\u306b\u3064\u3044\u3066
Then you will need to unescape it after reading. Ruby does this for you with string literals, which is why the second case worked for you. Taken from the answer to "Is this the best way to unescape unicode escape sequences in Ruby?", you can use this:
file = "c:\\...\\vlmList_unicode.txt" #\u306b\u3064\u3044\u3066
data = File.open(file, 'rb') { |io|
contents = io.read.gsub(/\\u([\da-fA-F]{4})/) { |m|
[$1].pack("H*").unpack("n*").pack("U*")
}
contents.split(/\t/)
}
Alternatively, if you would like to make it more readable, extract the substitution into a new method and add it to the String class:
class String
def unescape_unicode
self.gsub(/\\u([\da-fA-F]{4})/) { |m|
[$1].pack("H*").unpack("n*").pack("U*")
}
end
end
Then you can call:
file = "c:\\...\\vlmList_unicode.txt" #\u306b\u3064\u3044\u3066
data = File.open(file, 'rb') { |io|
io.read.unescape_unicode.split(/\t/)
}
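The difference between the two cases in the question can be reproduced in memory. This sketch uses a single-quoted literal to mimic text read from a file, and swaps the pack/unpack chain above for String#hex, which is a different but equivalent way to turn the four hex digits into a code point. It is only a sketch: it handles 4-digit \uXXXX escapes and nothing else.

```ruby
# Single quotes keep the backslashes literal, like text read from a file.
raw = '\u306b\u3064\u3044\u3066'

# Replace each \uXXXX escape with the character for that code point.
unescaped = raw.gsub(/\\u([\da-fA-F]{4})/) { [$1.hex].pack("U") }

# Double quotes: Ruby itself unescapes the sequence when parsing the literal.
literal = "\u306b\u3064\u3044\u3066"

puts raw.length       # 24 characters of backslashes, letters, and digits
puts unescaped.length # 4 characters
puts unescaped == literal
```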
Just as an FYI:
data = File.open(file, 'rb') { |io| io.read.split(/\t/) }
can be written more simply as one of these:
data = File.read(file, mode: 'rb').split(/\t/)
data = File.readlines(file, "\t", mode: 'rb') # each element keeps its trailing tab
(Remember that File inherits from IO, which is where these methods are defined, so look in IO for documentation on them.)
readlines takes a "separator" parameter, which in the example above is "\t". Ruby will substitute it for the usual "\n" on *nix or Mac OS, or "\r\n" on Windows, so records will be retrieved using the tab-delimiter.
This makes me wonder a bit why you'd want to do that, though. I've never seen tabs used as record delimiters, only as column/field delimiters in "TSV" (Tab-Separated Values) files. That leads me to think you should probably be using Ruby's CSV class, with "\t" as the column separator. But without samples of the actual file you're reading, I can't say for sure.
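If the file does turn out to be tab-separated columns, the CSV approach could look roughly like this. The sample data and column names are invented for illustration, since the real file was not shown:

```ruby
require 'csv'

# Hypothetical sample: two tab-separated columns with a header row.
tsv = "name\tcity\nAda\tLondon\nLinus\tHelsinki\n"

# Plain arrays of fields, one array per line.
rows = CSV.parse(tsv, col_sep: "\t")

# With headers: true, each row can be indexed by column name.
table = CSV.parse(tsv, col_sep: "\t", headers: true)
cities = table.map { |row| row["city"] }
```

For a file on disk, CSV.foreach(path, col_sep: "\t") takes the same option and reads one row at a time.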

Extracting a part of a string in Ruby

I know this is an easy question, but I want to extract one part of a string in Rails.
In Java I would do this by finding the start and end characters of the substring and extracting what lies between them, but I want to do it the Ruby way, which is why I need your help.
My string is:
STACK OVER AND FLOW
I want the numerical value between the quotation marks => 99999 and the text of the link => STACK OVER AND FLOW
How should I parse this string in Ruby?
Thanks.
If you need to parse html:
> require 'nokogiri'
> str = %q[STACK OVER AND FLOW]
> doc = Nokogiri.parse(str)
> link = doc.at('a')
> link.text
=> "STACK OVER AND FLOW"
> link['href'][/(\d+)/, 1]
=> "99999"
http://nokogiri.org/
This should work if you have only one link in the string:
str = %{STACK OVER AND FLOW }
num = str.match(/href=".*?'(\d*)'.*?/)[1].to_i
name = str.match(/>(.*?)</)[1].strip
Way to get both at a time:
str = "STACK OVER AND FLOW "
num, name = str.scan(/launchRemote\('(\d+)'[^>]+>\s*(.*?)\s*</).first
# => ["99999", "STACK OVER AND FLOW"]
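For a runnable version of the scan approach, here is a sketch with an anchor tag made up to fit the patterns used in the answers. The exact markup from the question is not known, so the href contents below, including the second argument, are purely hypothetical:

```ruby
# Hypothetical markup, shaped to match the launchRemote('...') pattern above.
str = %q[<a href="javascript:launchRemote('99999', 'x')">STACK OVER AND FLOW </a>]

# First capture: digits inside launchRemote('...').
# Second capture: link text with surrounding whitespace trimmed.
num, name = str.scan(/launchRemote\('(\d+)'[^>]+>\s*(.*?)\s*</).first

puts num
puts name
```

scan returns an array of capture arrays, so .first picks the captures from the first link; both values come back as strings, and num.to_i converts the number if needed.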

How to get the file extension from a url?

New to ruby, how would I get the file extension from a url like:
http://www.example.com/asdf123.gif
Also, how would I format this string, in c# I would do:
string.format("http://www.example.com/{0}.{1}", filename, extension);
Use File.extname
File.extname("test.rb") #=> ".rb"
File.extname("a/b/d/test.rb") #=> ".rb"
File.extname("test") #=> ""
File.extname(".profile") #=> ""
To format the string
"http://www.example.com/%s.%s" % [filename, extension]
This also works for URLs with a query string:
require 'uri'
file = 'http://recyclewearfashion.com/stylesheets/page_css/page_css_4f308c6b1c83bb62e600001d.css?1343074150'
File.extname(URI.parse(file).path) # => '.css'
It also returns "" if the file has no extension.
url = 'http://www.example.com/asdf123.gif'
extension = url.split('.').last
This will get you the extension of a URL, without the leading dot, in the simplest manner possible. Now, for output formatting:
printf "http://www.example.com/%s.%s", filename, extension
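The split approach is short, but it returns everything after the last dot, so a query string comes along with the extension. The sketch below compares it with parsing the URL first; url_extension is a hypothetical helper name and the query string is invented for the example:

```ruby
require 'uri'

# Hypothetical helper: extension of the URL's path, including the leading dot.
def url_extension(url)
  File.extname(URI.parse(url).path.to_s)
end

plain      = 'http://www.example.com/asdf123.gif'
with_query = 'http://www.example.com/asdf123.gif?size=large'

# Naive split: fine for the plain URL, wrong once a query string is present.
naive_plain = plain.split('.').last
naive_query = with_query.split('.').last

# Parsing first isolates the path, so the query string is ignored.
ext_plain = url_extension(plain)
ext_query = url_extension(with_query)

# Formatting, the Ruby counterpart of string.Format (the extension already has its dot).
filename = 'asdf123'
rebuilt = format('http://www.example.com/%s%s', filename, ext_query)
```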
You could use Ruby's URI class like this to get the path component of the URI and match the extension after the last dot (this also works if the URL contains a query part):
require 'uri'
your_url = 'http://www.example.com/asdf123.gif'
path = URI.split(your_url)[5] # index 5 of URI.split is the path component
extension = path[/\.([\w+-]+)$/, 1] # => "gif"
I realize this is an ancient question, but here's another vote for using Addressable (the addressable gem). You can use the extname method, which works as desired even with a query string:
require 'addressable/uri'
Addressable::URI.parse('http://www.example.com/asdf123.gif').extname # => ".gif"
Addressable::URI.parse('http://www.example.com/asdf123.gif?foo').extname # => ".gif"

Resources