Ruby Twitter, Retrieving Full Tweet Text - ruby

I'm using the Ruby Twitter gem to retrieve the full text of a tweet.
I first tried this, and as you can see the text was truncated.
[5] pry(main)> t = client.status(782845350918971393)
=> #<Twitter::Tweet id=782845350918971393>
[6] pry(main)> t.text
=> "A #Gameofthrones fan? Our #earlybird Dublin starter will get you
touring the GOT location in 2017
#traveldealls… (SHORTENED URL WAS HERE)"
Then I tried this:
[2] pry(main)> t = client.status(782845350918971393, tweet_mode: 'extended')
=> #<Twitter::Tweet id=782845350918971393>
[3] pry(main)> t.full_text
=>
[4] pry(main)> t.text
=>
Both the text and full text are empty when I use the tweet_mode: 'extended' option.
I also tried editing the bit of the gem that makes the request, the response was the same.
perform_get_with_object("/1.1/statuses/show/#{extract_id(tweet)}.json?tweet_mode=extended", options, Twitter::Tweet)
Any help would be greatly appreciated.

Here's a workaround I found helpful:
Below is the way I am handling this issue ATM. Seems to be working. I am using both Streaming (with Tweetstream) and REST APIs.
status = @client.status(1234567890, tweet_mode: "extended")
if status.truncated? && status.attrs[:extended_tweet]
  # Streaming API, and REST API default
  t = status.attrs[:extended_tweet][:full_text]
else
  # REST API with extended mode, or untruncated text in Streaming API
  t = status.attrs[:text] || status.attrs[:full_text]
end
From https://github.com/sferik/twitter/pull/848#issuecomment-329425006
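The branching in the workaround can be pulled into a small helper that works on the raw attribute hash. This is only a minimal sketch: full_text_from is a hypothetical name, the sample hashes are made up for illustration (they are not real API responses), and it assumes the hash has the same shape as status.attrs above.

```ruby
# Hypothetical helper: returns the untruncated text from a tweet's raw
# attribute hash (assumed to look like `status.attrs` in the workaround).
def full_text_from(attrs)
  extended = attrs[:extended_tweet]
  # Truncated tweet that carries an :extended_tweet payload
  return extended[:full_text] if attrs[:truncated] && extended

  # Extended mode (:full_text) or an untruncated tweet (:text)
  attrs[:full_text] || attrs[:text]
end

# Made-up sample payloads, for illustration only
streaming_truncated = {
  truncated: true,
  text: 'A #Gameofthrones fan? ...',
  extended_tweet: { full_text: 'A #Gameofthrones fan? (full text)' }
}
rest_extended = { truncated: false, full_text: 'Full text from extended mode' }
short_tweet   = { truncated: false, text: 'Short tweet' }

puts full_text_from(streaming_truncated)
puts full_text_from(rest_extended)
puts full_text_from(short_tweet)
```

Keeping the decision in one place means the Streaming and REST code paths can share it.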

Related

Ruby - Matching Twitter URL from any html page using Regex

I am trying to fetch the Twitter URL from this page, for instance; however, my result is nil. I am pretty sure my regex is not too bad, but my code fails. Here it is:
doc = `(curl --url "http://www.rabbitreel.com/")`
twitter_url = ("/^(?i)[http|https]+:\/\/(?i)[twitter]+\.(?i)(com)\/?\S+").match(doc)
puts twitter_url
# => nil
Maybe, I misused regex syntax. My initial idea was simple: I wanted to match a regular Twitter url structure. I even tried http://rubular.com to test my regex, and it seemed to be fine when I entered a Twitter url.
http://ruby-doc.org/core-2.2.0/String.html#method-i-match
tells you that the object you're calling match on should be the string you're parsing, and the parameter should be the regex pattern. So if anything, you should call :
doc.match("/^(?i)[http|https]+:\/\/(?i)[twitter]+\.(?i)(com)\/?\S+")
I prefer
doc[/your_regex/]
syntax, because it directly delivers a String, and not a MatchData, which needs another step to get the information out of.
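To make the difference concrete, here is a small sketch comparing the two calls, plus what happens when receiver and argument are swapped as in the question. The HTML fragment is a made-up stand-in for the downloaded page.

```ruby
# Made-up HTML fragment standing in for the downloaded page
html = '<a href="https://twitter.com/rabbitreel">Follow us</a>'
pattern = /twitter\.com\/\w+/

# String#match returns a MatchData (or nil); index 0 holds the whole match
match_data = html.match(pattern)
from_match = match_data && match_data[0]

# String#[] with a Regexp returns the matched String directly (or nil)
from_index = html[pattern]

# Receiver and argument swapped, as in the question: the HTML is now
# compiled as the pattern and searched for inside the short string
swapped = 'twitter.com/rabbitreel'.match(html)

puts from_match
puts from_index
puts swapped.inspect
```

When String#match is given a String, Ruby converts that argument into a Regexp, which is why the swapped call quietly returns nil instead of raising an error.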
For Regexen, I always try to begin as simple as possible
[3] pry(main)> doc[/twitter/]
=> "twitter"
[4] pry(main)> doc[/twitter\.com/]
=> "twitter.com"
[5] pry(main)> doc[/twitter\.com\//]
=> "twitter.com/"
[6] pry(main)> doc[/twitter\.com\/\//] #OOPS. One \/ too many
=> nil
[7] pry(main)> doc[/twitter\.com\//]
=> "twitter.com/"
[8] pry(main)> doc[/twitter\.com\/\S+/]
=> "twitter.com/rabbitreel\""
[9] pry(main)> doc[/twitter\.com\/[^"]+/]
=> "twitter.com/rabbitreel"
[10] pry(main)> doc[/http:\/\/twitter\.com\/[^"]+/]
=> nil
[11] pry(main)> doc[/https?:\/\/twitter\.com\/[^"]+/]
=> "https://twitter.com/rabbitreel"
[12] pry(main)> doc[/https?:\/\/twitter\.com\/[^" ]+/]
=> "https://twitter.com/rabbitreel"
[13] pry(main)> doc[/https?:\/\/twitter\.com\/\w+/] #DONE
=> "https://twitter.com/rabbitreel"
EDIT:
Sure, Regexen cannot parse an entire HTML document.
Here, we only want to find the first occurrence of a Twitter URL. So, depending on the requirements, the possible input, and the chosen platform, it could make sense to use a Regexp.
Nokogiri is a huge gem, and it might not be possible to install it.
Independently from this fact, it would be a very good idea to check that the returned String really is a correct Twitter URL.
I think this Regexp:
/https?:\/\/twitter\.com\/\w+/
is safe.
[31] pry(main)> malicious_doc = "https://twitter.com/userid#maliciouswebsite.com"
=> "https://twitter.com/userid#maliciouswebsite.com"
[32] pry(main)> malicious_doc[/https?:\/\/twitter\.com\/\w+/]
=> "https://twitter.com/userid"
Using Nokogiri doesn't prevent you from checking for malicious input.
The proposed solution from @mudasobwa is interesting, but isn't safe yet:
[33] pry(main)> Nokogiri::HTML('<html><body><a href="http://maliciouswebsitethatisnottwitter.com/">Link</a></body></html>').css('a').map { |e| e.attributes.values.first.value }.select {|e| e =~ /twitter.com/ }
=> ["http://maliciouswebsitethatisnottwitter.com/"]
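One way to check that a candidate string really points at Twitter, whichever extraction method produced it, is to parse it and compare the host exactly rather than pattern-matching the whole string. The following is a sketch using Ruby's standard URI library; twitter_url? is a hypothetical helper name, and the list of accepted hosts is an assumption to adjust as needed.

```ruby
require 'uri'

# Hosts accepted as "Twitter" in this sketch (an assumption)
TWITTER_HOSTS = %w[twitter.com www.twitter.com].freeze

# Hypothetical helper: true only for http(s) URLs whose host is exactly
# one of TWITTER_HOSTS
def twitter_url?(candidate)
  uri = URI.parse(candidate)
  %w[http https].include?(uri.scheme) &&
    TWITTER_HOSTS.include?(uri.host.to_s.downcase)
rescue URI::InvalidURIError
  false
end

puts twitter_url?('https://twitter.com/rabbitreel')
puts twitter_url?('http://maliciouswebsitethatisnottwitter.com/')
puts twitter_url?('not a url')
```

Because the host is compared as a whole, a domain that merely ends in "twitter.com" is rejected.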
NB: as of Nov 2021, the rabbitreel.com domain is for sale, so please read the comments about the possibility of its serving malicious content.
One should never use regexps to parse HTML and here is why.
Below is a robust solution using Nokogiri HTML parsing library:
require 'nokogiri'
doc = Nokogiri::HTML(`(curl --url "http://www.rabbitreel.com/")`)
doc.css('a').map { |e| e.attributes.values.first.value }
.select {|e| e =~ /twitter.com/ }
#⇒ [
# [0] "https://twitter.com/rabbitreel",
# [1] "https://twitter.com/rabbitreel"
# ]
Or, alternatively, with xpath:
require 'nokogiri'
doc = Nokogiri::HTML(`(curl --url "http://www.rabbitreel.com/")`)
doc.xpath('//a[contains(@href, "twitter.com")]')
   .map { |e| e.attributes['href'].value }

chef 11: any way to turn attributes into a ruby hash?

I'm generating a config for my service in chef attributes. However, at some point, I need to turn the attribute mash into a simple ruby hash. This used to work fine in Chef 10:
node.myapp.config.to_hash
However, starting with Chef 11, this does not work. Only the top level of the attribute is converted to a hash, with the nested values remaining immutable mash objects. Modifying them leads to errors like this:
Chef::Exceptions::ImmutableAttributeModification
------------------------------------------------
Node attributes are read-only when you do not specify which precedence level to set. To
set an attribute use code like `node.default["key"] = "value"'
I've tried a bunch of ways to get around this issue which do not work:
node.myapp.config.dup.to_hash
JSON.parse(node.myapp.config.to_json)
The json parsing hack, which seems like it should work great, results in:
JSON::ParserError
unexpected token at '"#<Chef::Node::Attribute:0x000000020eee88>"'
Is there any actual reliable way, short of including a nested parsing function in each cookbook, to convert attributes to a simple, ordinary, good old ruby hash?
After a resounding lack of answers both here and on the Opscode Chef mailing list, I ended up using the following hack:
class Chef
  class Node
    class ImmutableMash
      def to_hash
        h = {}
        self.each do |k,v|
          if v.respond_to?('to_hash')
            h[k] = v.to_hash
          else
            h[k] = v
          end
        end
        return h
      end
    end
  end
end
I put this into the libraries dir in my cookbook; now I can use attribute.to_hash in both Chef 10 (which already worked properly and is unaffected by this monkey-patch) and Chef 11. I've also reported this as a bug to Opscode. If you don't want to have to monkey-patch your Chef, speak up on this issue:
http://tickets.opscode.com/browse/CHEF-3857
Update: monkey-patch ticket was marked closed by these PRs
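For readers who would rather not reopen Chef's classes, the same recursion can live in a standalone helper. This is a sketch only: deep_to_hash is a hypothetical name, and it assumes the attribute objects behave like Hash and Array subclasses (which is why plain frozen Ruby objects stand in for them here). Unlike the monkey-patch above, it also descends into arrays.

```ruby
# Hypothetical helper: recursively copies Hash-like and Array-like values
# into plain, unfrozen Hash and Array objects. Leaf values are kept as-is.
def deep_to_hash(value)
  case value
  when Hash
    value.each_with_object({}) { |(k, v), h| h[k] = deep_to_hash(v) }
  when Array
    value.map { |v| deep_to_hash(v) }
  else
    value
  end
end

# Deep-frozen plain objects standing in for immutable node attributes
frozen_config = {
  'listen' => { 'ports' => [{ 'number' => 80 }.freeze].freeze }.freeze
}.freeze

config = deep_to_hash(frozen_config)
config['listen']['ports'][0]['number'] = 8080 # allowed on the copy

puts config.inspect
puts frozen_config.inspect
```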
I hope I am not too late to the party but merging the node object with an empty hash did it for me:
chef (12.6.0)> {}.merge(node).class
=> Hash
I had the same problem and after much hacking around came up with this:
json_string = node[:attr_tree].inspect.gsub(/\=\>/,':')
my_hash = JSON.parse(json_string, {:symbolize_names => true})
inspect does the deep parsing that is missing from the other methods proposed and I end up with a hash that I can modify and pass around as needed.
This has been fixed for a long time now:
[1] pry(main)> require 'chef/node'
=> true
[2] pry(main)> node = Chef::Node.new
[....]
[3] pry(main)> node.default["fizz"]["buzz"] = { "foo" => [ { "bar" => "baz" } ] }
=> {"foo"=>[{"bar"=>"baz"}]}
[4] pry(main)> buzz = node["fizz"]["buzz"].to_hash
=> {"foo"=>[{"bar"=>"baz"}]}
[5] pry(main)> buzz.class
=> Hash
[6] pry(main)> buzz["foo"].class
=> Array
[7] pry(main)> buzz["foo"][0].class
=> Hash
[8] pry(main)>
Probably fixed sometime in or around Chef 12.x or Chef 13.x; it is certainly no longer an issue in Chef 15.x/16.x/17.x.
The above answer is a little unnecessary. You can just do this:
json = node[:whatever][:whatever].to_hash.to_json
JSON.parse(json)
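A quick sketch of what the JSON round trip does, using a deep-frozen plain Hash as a stand-in for a node attribute subtree (no Chef objects involved). One caveat worth knowing: JSON has no symbols, so symbol keys come back as strings unless symbolize_names is passed.

```ruby
require 'json'

# Deep-frozen plain Hash standing in for a node attribute subtree
source = { 'foo' => [{ 'bar' => 'baz' }.freeze].freeze }.freeze

# Serializing and parsing again builds brand-new, unfrozen objects
copy = JSON.parse(source.to_json)
copy['foo'][0]['bar'] = 'changed' # allowed: the copy is independent

# Symbol keys are turned into strings by the round trip ...
string_keyed = JSON.parse({ port: 8080 }.to_json)
# ... unless the parser is asked to symbolize them again
symbol_keyed = JSON.parse({ port: 8080 }.to_json, symbolize_names: true)

puts copy.inspect
puts string_keyed.inspect
puts symbol_keyed.inspect
```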

Specifying the offset and limit in a Redmine REST API request with Ruby

I'm using the Ruby REST API for Redmine (here: http://www.redmine.org/projects/redmine/wiki/Rest_api_with_ruby). I need to be able to get all issues in a chunk of 100 at a time.
I know there is an options[:offset] and an options[:limit] that the method "api_offset_and_limit" is looking for.
How do I pass those options when I'm doing this? I tried putting them in the URL as GET options, but they didn't come through on the other end. The following gives me the first 25 issues, as I expect it to.
class Issue < ActiveResource::Base
  self.site = 'http://redmine.server/'
  self.user = 'foo'
  self.password = 'bar'
end
# Retrieving issues
issues = Issue.find(:all)
I'm not familiar with the API, but the way you describe it, the following should work:
issues = Issue.find(:all, :params => {:offset => 0, :limit => 100})
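To walk through every issue in chunks of 100, the offset can be advanced in a loop until a short page comes back. The sketch below keeps the paging logic separate from the API call so it runs on its own: fetch_all is a hypothetical helper, and the block is where a call such as Issue.find(:all, :params => {:offset => offset, :limit => limit}) would go. It assumes the server returns fewer than limit items on the last page.

```ruby
# Hypothetical helper: keeps requesting pages until one comes back short.
# The block receives (offset, limit) and must return an Array-like page.
def fetch_all(limit: 100)
  results = []
  offset = 0
  loop do
    page = yield(offset, limit).to_a
    results.concat(page)
    break if page.size < limit

    offset += limit
  end
  results
end

# Fake data source standing in for the Redmine server (250 "issues")
fake_issues = (1..250).to_a
requests = []

all_issues = fetch_all(limit: 100) do |offset, limit|
  requests << offset
  fake_issues[offset, limit] || []
end

puts all_issues.size
puts requests.inspect
```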

Ruby rest_client and windows LIVE connect OAUTH Wrap

Hi all I am trying to get my rails app to talk to Windows LIVE (through OAuth Wrap) so I can retrieve a list of contacts. I am using the rest_client gem to do this. Here is the action code for it:
def hotmail
  app_id = 'some_id'
  app_sec = 'some_secret'
  app_callback = 'http://my.callback.com/same/as/getting/verification_code'
  app_var = params[:wrap_verification_code]
  encoded = "wrap_client_id=#{app_id}&wrap_client_secret=#{app_sec}&wrap_verification_code=#{app_var}&wrap_callback=#{app_callback}".encode!('UTF-8')
  begin
    r = RestClient.post("https://consent.live.com/AccessToken.aspx", encoded.bytes.to_a, {:content_type => 'application/x-www-form-urlencoded', :content_length => encoded.bytesize})
  rescue => e
    puts e.message
  end
  render :text => 'hello'
end
I base this on a c# example http://msdn.microsoft.com/en-us/library/ff750952.aspx (note: http://www.goatly.net/2010/12/23/401-unauthorized-when-acquiring-an-access-token-windows-live-sdk.aspx shows the correct payload)
However, I keep getting 401 Unauthorized, so I am wondering whether I am using rest_client incorrectly. During a form post, is there something else I need to do?
Any hints will be really helpful :) Thanks in advance.
Found the problem. The C# code says it posts the byte array, but that's not true; just posting the encoded string directly is enough.
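As a sketch of building that string, Ruby's standard URI.encode_www_form can assemble the form body and percent-encode each value (for instance the callback URL, which the hand-built string leaves unencoded). The parameter values below are the placeholders from the question plus one made-up verification code; whether the service requires the encoding is not something the original post confirms.

```ruby
require 'uri'

# Placeholder values from the question; not real credentials
params = {
  wrap_client_id: 'some_id',
  wrap_client_secret: 'some_secret',
  wrap_verification_code: 'code_from_callback', # made-up value
  wrap_callback: 'http://my.callback.com/same/as/getting/verification_code'
}

# Percent-encodes each key and value and joins the pairs with "&"
body = URI.encode_www_form(params)
puts body

# With rest_client, the string would then be posted as-is, e.g.:
#   RestClient.post('https://consent.live.com/AccessToken.aspx', body,
#                   content_type: 'application/x-www-form-urlencoded')
```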

Feedzirra cannot parse atom feeds

The idea of having a single parser for any kind of feed is great and was hoping that it would work for me.
I have been trying to get feedzirra to parse atom feeds.
specifically:
http://pindancing.blogspot.com/feeds/posts/default
http://adam.heroku.com/feed
Those are just two that I tried. The problem is that feedzirra cannot parse the
entry URL; it always comes out nil.
feed = Feedzirra::Feed.fetch_and_parse(search.rss_feed_url)
p feed.entries.first.title
p feed.entries.first.url #=> returns nil
Is there anything I need to do to get it working?
thanks for your help
Hate to say "works for me", but, well, works for me:
require 'feedzirra'
urls = %w{
http://adam.heroku.com/feed
http://pindancing.blogspot.com/feeds/posts/default
}
urls.each do |url|
feed = Feedzirra::Feed.fetch_and_parse(url)
puts feed.entries.first.title
puts feed.entries.first.url
end
# => Memcached, a Database?
# => http://adam.heroku.com/past/2010/7/19/memcached_a_database/
# => The answer to "Will you mentor me?" is
# => http://pindancing.blogspot.com/2010/12/answer-to-will-you-mentor-me-is.html
It'd help to see the rest of your code, particularly the actual parameter you're using in the fetch_and_parse method.

Resources