All I'm looking for is something very simple. I want to check if the client has my cookie in their browser. If so then to load the last session ID from the cookie and access the session from the server side. If not I want to create the server side session and store the session id to the cookie.
So my question is two parts. What parts of the session code/variable/ID is relevant to retrieve it in the future? And what's the simple method for achieving a continuous session via cookie/session?
I am not using rails. It's a standard CGI script.
#!/usr/bin/ruby
require 'cgi'
require 'cgi/session'
require 'cgi/session/pstore'
cgi = CGI.new("html4")
# !! TODO !! if cookie has "session_id" then open sess from "session_id"
sess = CGI.Session.new( cgi,
'database_manager' => CGI::Session::PStore, # use PStore
'session_key' => '_rb_sess_id', # custom session key
'session_expires' => Time.now + 30 * 60, # 30 minute timeout
'prefix' => 'pstore_sid_')
if sess.has_key? "friend_list"
#friend_list = sess['friend_list']
sess['access_count'] += 1
else
#friend_list = JSON.parse( IO.read( "fbget.json" ) )
sess['friend_list'] = #friend_list
sess['access_count'] = 1
end
I wanted to save the json list on the server side via a session instance. I figure putting that in a cookie would be a very bad security mistake.
Through my searching online I haven't found any non-rails info on saving session access info via the cookie. Only how to use each individually.
Related
I'm trying to verify a link that will expire in a week. I have an activator_token stored in the database, which will be used to generate the link in this format: http://www.example.com/activator_token. (And not activation tokens generated by Devise or Authlogic.)
Is there a way to make this activator token expire (in a week or so) without comparing with updated_at or some other date. Something like an encoded token, which will return nil when decoded after a week. Can any existing modules in Ruby do this? I don't want to store the generated date in the database or in an external store like Redis and compare it with Time.now. I want it to be very simple, and wanted to know if something like this already exists, before writing the logic again.
What you want to use is: https://github.com/jwt/ruby-jwt .
Here is some boilerplate code so you can try it out yourself.
require 'jwt'
# generate your keys when deploying your app.
# Doing so using a rake task might be a good idea
# How to persist and load the keys is up to you!
rsa_private = OpenSSL::PKey::RSA.generate 2048
rsa_public = rsa_private.public_key
# do this when you are about to send the email
exp = Time.now.to_i + 4 * 3600
payload = {exp: exp, discount: '9.99', email: 'user#example.com'}
# when generating an invite email, this is the token you want to incorporate in
# your link as a parameter
token = JWT.encode payload, rsa_private, 'RS256'
puts token
puts token.length
# this goes into your controller
begin
#token = params[:token]
decoded_token = JWT.decode token, rsa_public, true, { :algorithm => 'RS256' }
puts decoded_token.first
# continue with your business logic
rescue JWT::ExpiredSignature
# Handle expired token
# inform the user his invite link has expired!
puts "Token expired"
end
I'm using token to access github api by Octokit client.
client = Octokit::Client.new(access_token: TOKEN)
It seems that is ok:
client.rate_limit
=> #<struct Octokit::RateLimit
limit=5000,
remaining=4998,
resets_at=2013-11-25 03:38:41 +0200,
resets_in=3533>
So now I want to get some info
repo = client.repo 'rails/rails'
repo.rels[:events]
repo.rels[:events].get.data
But when I'm getting next page
repo.rels[:events].get[:next]
I'm hitting the rate limit in 60 requests per hour.
It seems that next requests are not authorized by token.
How to make all request be authorized by token?
Maybe your token has been expired or your client variable lost scope.
Create a initializers/octokit.rb with authentication:
Octokit.configure do |c|
c.client_id = ENV['GITHUB_ID']
c.client_secret = ENV['GITHUB_SECRET']
end
I chose to do by id and secret.
And make your requests with:
repo = Octokit.repo 'rails/rails'
After create the initializer, you can test in rails c:
> Octokit.rate_limit
=> #<struct Octokit::RateLimit limit=5000, remaining=4927, resets_at=2016-04-22 12:24:52 -0300, resets_in=2102>
Taken from the Octokit documentation:
Note: While Octokit auto pagination will set the page size to the
maximum 100, and seek to not overstep your rate limit, you probably
want to use a custom pattern for traversing large lists.
I am trying to access the session cookie within a spider. I first login to a social network using in a spider:
def parse(self, response):
return [FormRequest.from_response(response,
formname='login_form',
formdata={'email': '...', 'pass':'...'},
callback=self.after_login)]
In after_login, I would like to access the session cookies, in order to pass them to another module (selenium here) to further process the page with an authentificated session.
I would like something like that:
def after_login(self, response):
# process response
.....
# access the cookies of that session to access another URL in the
# same domain with the autehnticated session.
# Something like:
session_cookies = XXX.get_session_cookies()
data = another_function(url,cookies)
Unfortunately, response.cookies does not return the session cookies.
How can I get the session cookies ? I was looking at the cookies middleware: scrapy.contrib.downloadermiddleware.cookies and scrapy.http.cookies but there doesn't seem to be any straightforward way to access the session cookies.
Some more details here bout my original question:
Unfortunately, I used your idea but I dind't see the cookies, although I know for sure that they exists since the scrapy.contrib.downloadermiddleware.cookies middleware does print out the cookies! These are exactly the cookies that I want to grab.
So here is what I am doing:
The after_login(self,response) method receives the response variable after proper authentication, and then I access an URL with the session data:
def after_login(self, response):
# testing to see if I can get the session cookies
cookieJar = response.meta.setdefault('cookie_jar', CookieJar())
cookieJar.extract_cookies(response, response.request)
cookies_test = cookieJar._cookies
print "cookies - test:",cookies_test
# URL access with authenticated session
url = "http://site.org/?id=XXXX"
request = Request(url=url,callback=self.get_pict)
return [request]
As the output below shows, there are indeed cookies, but I fail to capture them with cookieJar:
cookies - test: {}
2012-01-02 22:44:39-0800 [myspider] DEBUG: Sending cookies to: <GET http://www.facebook.com/profile.php?id=529907453>
Cookie: xxx=3..........; yyy=34.............; zzz=.................; uuu=44..........
So I would like to get a dictionary containing the keys xxx, yyy etc with the corresponding values.
Thanks :)
A classic example is having a login server, which provides a new session id after a successful login. This new session id should be used with another request.
Here is the code picked up from source which seems to work for me.
print 'cookie from login', response.headers.getlist('Set-Cookie')[0].split(";")[0].split("=")[1]
Code:
def check_logged(self, response):
tmpCookie = response.headers.getlist('Set-Cookie')[0].split(";")[0].split("=")[1]
print 'cookie from login', response.headers.getlist('Set-Cookie')[0].split(";")[0].split("=")[1]
cookieHolder=dict(SESSION_ID=tmpCookie)
#print response.body
if "my name" in response.body:
yield Request(url="<<new url for another server>>",
cookies=cookieHolder,
callback=self."<<another function here>>")
else:
print "login failed"
return
Maybe this is an overkill, but i don't know how are you going to use those cookies, so it might be useful (an excerpt from real code - adapt it to your case):
from scrapy.http.cookies import CookieJar
class MySpider(BaseSpider):
def parse(self, response):
cookieJar = response.meta.setdefault('cookie_jar', CookieJar())
cookieJar.extract_cookies(response, response.request)
request = Request(nextPageLink, callback = self.parse2,
meta = {'dont_merge_cookies': True, 'cookie_jar': cookieJar})
cookieJar.add_cookie_header(request) # apply Set-Cookie ourselves
CookieJar has some useful methods.
If you still don't see the cookies - maybe they are not there?
UPDATE:
Looking at CookiesMiddleware code:
class CookiesMiddleware(object):
def _debug_cookie(self, request, spider):
if self.debug:
cl = request.headers.getlist('Cookie')
if cl:
msg = "Sending cookies to: %s" % request + os.linesep
msg += os.linesep.join("Cookie: %s" % c for c in cl)
log.msg(msg, spider=spider, level=log.DEBUG)
So, try request.headers.getlist('Cookie')
This works for me
response.request.headers.get('Cookie')
It seems to return all the cookies that where introduced by the middleware in the request, session's or otherwise.
As of 2021 (Scrapy 2.5.1), this is still not particularly straightforward. But you can access downloader middlewares (like CookiesMiddleware) from within a spider via self.crawler.engine.downloader:
def after_login(self, response):
downloader_middlewares = self.crawler.engine.downloader.middleware.middlewares
cookies_mw = next(iter(mw for mw in downloader_middlewares if isinstance(mw, CookiesMiddleware)))
jar = cookies_mw.jars[response.meta.get('cookiejar')].jar
cookies_list = [vars(cookie) for domain in jar._cookies.values() for path in domain.values() for cookie in path.values()]
# or
cookies_dict = {cookie.name: cookie.value for domain in jar._cookies.values() for path in domain.values() for cookie in path.values()}
...
Both output formats above can be passed to other requests using the cookies parameter.
I'm just getting started with SocketStream. (v0.1.0) I created the file /app/server/auth.coffee with an exports.actions.login function. I'd like to access #session.setUserId in this file, but I'm have a hard time figuring out where #session lives and how to access it outside of /app/server/app.coffee
Here is my auth.coffee with comments where I'd like to access the session.
users = [
username: 'craig'
password: 'craig',
username: 'joe'
password: 'joe',
]
authenticate = (credentials, cb) ->
user = _.detect users, (user) ->
user.username == credentials.username and user.password == credentials.password
authenticated = true if user?
callback cb, authenticated
exports.actions =
login: (credentials, cb) ->
authenticate credentials, (user) ->
# here is where i'd like to set the userId like so:
# #session.setUserId credentials.username
callback cb user
Interesting you bring a question about sessions up at the moment as I've been re-writing a lot of this code over the last few days as part of SocketStream 0.2.
The good news is the #session variable will be back in 0.2 as I have found an efficient way to pass the session data through to the back end without having to use the ugly #getSession callback.
To answer your question specifically, the #session variable is simply another property which is injected into the export.actions object before the request is processed. Hence you cannot have an action called 'session' (though the name of this 'magic variable' will be configurable in the next release of 0.2).
The exports.authenticate = true setting does not apply in your case.
I'm interested to know how/why you'd like to use the #session object outside of your /app/server code.
I will be committing all the latest session code to the 0.2 preview branch on github in a few days time.
Hope that helps,
Owen
You get the current session only within your server-side code (app/server) using the #getCurrentSession method.
Also you have to add:
exports.authenticate = true
to that file.
I am doing a video crawler in ruby. In there I have to log in to a page by enabling cookies and download pages. For that I am using the CURL library in ruby. I can successfully log in, but I can't download the pages inside that with curl. How can I fix this or download the pages otherwise?
My code is
curl = Curl::Easy.new(1st url)
curl.follow_location = true
curl.enable_cookies = true
curl.cookiefile = "cookie.txt"
curl.cookiejar = "cookie.txt"
curl.http_post(1st url,field)
curl.perform
curl = Curl::Easy.perform(2nd url)
curl.follow_location = true
curl.enable_cookies = true
curl.cookiefile = "cookie.txt"
curl.cookiejar = "cookie.txt"
curl.http_get
code = curl.body_str
What I've seen in writing my own similar "post-then-get" script is that ruby/Curb (I'm using version 0.7.15 with ruby 1.8) seems to ignore the cookiejar/cookiefile fields of a Curl::Easy object. If I set either of those fields and the http_post completes successfully, no cookiejar or cookiefile file is created. Also, curl.cookies will still be nil after your curl.http_post, however, the cookies ARE set within the curl object. I promise :)
I think where you're going wrong is here:
curl = Curl::Easy.perform(2nd url)
The curb documentation states that this creates a new object. That new object doesn't have any of your existing cookies set. If you change your code to look like the following, I believe it should work. I've also removed the curl.perform for the first url since curl.http_post already implicitly does the "perform". You were basically http_post'ing twice before trying your http_get.
curl = Curl::Easy.new(1st url)
curl.follow_location = true
curl.enable_cookies = true
curl.http_post(1st url,field)
curl.url = 2nd url
curl.http_get
code = curl.body_str
If this still doesn't seem to be working for you, you can verify if the cookie is getting set by adding
curl.verbose = true
Before
curl.http_post
Your Curl::Easy object will dump all the headers that it gets in the response from the server to $stdout, and somewhere in there you should see a line stating that it added/set a cookie. I don't have any example output right now but I'll try to post a follow-up soon.
HTTPClient automatically enables cookies, as does Mechanize.
From the HTTPClient docs:
clnt = HTTPClient.new
clnt.get_content(url1) # receives Cookies.
clnt.get_content(url2) # sends Cookies if needed.
Posting a form is easy too:
body = { 'keyword' => 'ruby', 'lang' => 'en' }
res = clnt.post(uri, body)
Mechanize makes this sort of thing really simple (It will handle storing the cookies, among other things).