Skip part of execution on specific page - phoenix-framework

I have a plug that checks for data on every page, kind of like authentication. If the user doesn't have at least one entry of data, they get redirected to a page to add one. The problem is that the plug keeps redirecting them even when they're already on that page. I still need to check the data on every page, but skip the redirect when they're on that specific page. I've tried pipelines, but that seemed like overkill since I only need to skip one piece of the execution (the redirect). What's the best way to skip the redirect for a certain controller / action?
Plug
def add_default_team(conn, user, opts) do
  repo = Keyword.fetch!(opts, :repo)
  default_team = user.default_team_id

  if default_team do
    assign(conn, :current_team, repo.get!(Team, default_team))
  else
    conn
    |> redirect(to: Helpers.team_path(conn, :new))
    |> halt()
  end
end
The route I'm trying to avoid redirecting on would be the team_path(conn, :new)

You can check the value of conn.request_path and skip processing when it matches a certain string, by adding this clause above the existing clause you have (note the = conn binding, so the body can return the conn unchanged):
def add_default_team(%Plug.Conn{request_path: "/team/new"} = conn, _user, _opts), do: conn
It would be better to use path_info though, since it removes consecutive and trailing slashes:
def add_default_team(%Plug.Conn{path_info: ["team", "new"]} = conn, _user, _opts), do: conn

Related

Logstash: Handling a Configuration File for a Filter

I've written a filter and use its register function to load an external CSV file and fill a bunch of hash tables. The filter function then accesses the hash tables and adds fields to the event. While that's working nicely, the downside is that it only loads once, and I'd need to restart logstash to trigger the reload after a change in the CSV file. Maybe I should add that the filter currently consumes events coming from three different file inputs.
Writing an input doesn't seem to solve it, as the input is not tied to the filter in any way. Therefore, my plan is to somehow reload the CSV file every few hours or at a particular time, and to somehow block the entire filter during that, i.e. pause incoming events. That sounds like a weird thing to do, and I'm not sure whether logstash is actually meant to be used like this.
I'm a newbie regarding Ruby, and I'm actually quite amazed that the filter works this nicely. As Google let me down on the entire issue, I'm hoping that someone on here has experience with this, can post a link to an example, or can point me to another way of solving this.
For educational purposes I looked into the source of logstash and noticed that I could actually understand what's going on, and that things are much less complicated than I had thought.
There is a function filterworker in pipeline.rb and a class filterworker, and I don't know which one is actually used, but my findings seem to be true for both.
Basically, all filters seem to run in one thread unless it's configured otherwise. This means that I can reload the file anywhere in the filter function and the entire processing for all filters is paused (input and output might still do something, but that's handled by the queue for the events, which holds a maximum of 20 entries).
Therefore, this seems to do it for me:
public
def register
  @config_files_read_timestamps = {}
  read_config_files
end # def register

def filter(event)
  # return nothing unless there's an actual filter event
  return unless filter?(event)
  read_config_files
  :
  # filter_matched should go in the last line of our successful code
  filter_matched(event)
end # def filter

private
def read_config_files
  read_marker_file
  :
end

def check_for_changed_file?(filename)
  mtime = File.mtime(filename)
  @config_files_read_timestamps[filename] ||= Time.at(0)
  if @config_files_read_timestamps[filename] < mtime
    @config_files_read_timestamps[filename] = mtime
    return true
  end
end

def read_marker_file
  if !check_for_changed_file?("markers.txt")
    return
  end
  :
end
Obviously I don't need a separate thread for the parsing. It would become necessary if I planned to start the reload at a specific time; in that case I'd have to join the thread and then continue with event handling.
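For that scheduled variant, a background thread plus a mutex would keep filter from reading half-reloaded hash tables. A minimal sketch, assuming the read_config_files helper above (seconds_until_next_reload is a hypothetical helper you'd supply):

public
def register
  @config_files_read_timestamps = {}
  @reload_mutex = Mutex.new
  read_config_files
  # reload on a schedule instead of checking on every event
  Thread.new do
    loop do
      sleep seconds_until_next_reload          # hypothetical helper
      @reload_mutex.synchronize { read_config_files }
    end
  end
end # def register

def filter(event)
  return unless filter?(event)
  # hold the lock while reading the hash tables so a reload
  # can't swap them out mid-event
  @reload_mutex.synchronize do
    # hash-table lookups and event mutation go here
  end
  filter_matched(event)
end # def filter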
Let me know if there could be improvements...

How do I route based on a url parameter in sinatra?

I am using Sinatra and I want to use something like a referrer code in my urls that will somewhat control access and identify the provenance of a given URL.
/secret-code/rest/of/path
should be rejected if "secret-code" is not in a predetermined list.
I want to use route conditions
set(:valid_tag) { |tag| condition { tag === 'abcd' } }

get '/:tag', :valid_tag => params[:tag] do
  'Hello world!'
end
but params is not in scope. Do I need to dispatch in the block? What is the best way to handle multiple routes without having to duplicate the tag checking logic in each one?
/secret/route1/
/secret/route1/blah
/secret/route2/
Is there a way to chain handlers? Can I do
get /:tag/*
  # check :tag
  redirect_to_handler(params[:splat])
By the sounds of things it looks like you're trying to make use of Sinatra's named parameters. params is only in scope within the route block:
get '/:secret_code/*' do
  redirect_to_handler unless secret_codes.include? params[:secret_code]
end
The code above assumes you have a collection of 'secret_codes' that you're going to check against the secret_code from the URL.
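If the goal is to avoid repeating that check in every route, a Sinatra before filter with a pattern is one way to centralize it. A minimal sketch, assuming a predetermined VALID_CODES list (the name is an assumption):

require 'sinatra'

VALID_CODES = %w[abcd efgh]

# runs before any route whose path matches the pattern;
# named params from the filter pattern are available here
before '/:secret_code/*' do
  halt 404 unless VALID_CODES.include?(params[:secret_code])
end

get '/:secret_code/route1' do
  'hello world'
end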
(Answering my own question)
Sinatra matches the lexically first rule, and you can pass on to the next matching rule using 'pass'. So something like this works, as long as it is the first rule that would match:
get '/:tag/*' do
  halt_if_bad_tag params[:tag]
  pass
end

get '/:tag/route1' do
  'hello world'
end
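halt_if_bad_tag is left undefined above; a minimal definition might look like this (VALID_TAGS and the helper body are assumptions, not part of the original answer):

# a sketch of the undefined helper, assuming a predetermined VALID_TAGS list
helpers do
  def halt_if_bad_tag(tag)
    halt 403, 'invalid tag' unless VALID_TAGS.include?(tag)
  end
end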

My loop in scrapy is not running sequentially

I am scraping a sequence of urls. The code is working, but scrapy is not parsing the urls in sequential order. E.g. although I am trying to parse url1, url2, ..., url100, scrapy parses url2, url10, url1, etc.
It parses all the urls, but when a specific url does not exist (e.g. example.com/unit.aspx?b_id=10) Firefox shows me the result of my previous request. As I want to make sure that I don't have duplicates, I need to ensure that the loop is parsing the urls sequentially and not "at will".
I tried "for n in range(1,101)" and also a "while bID<100"; the result is the same (see below).
thanks in advance!
def check_login_response(self, response):
    """Check the response returned by a login request to see if we are
    successfully logged in.
    """
    if "Welcome!" in response.body:
        self.log("Successfully logged in. Let's start crawling!")
        print "Successfully logged in. Let's start crawling!"
        # Now the crawling can begin..
        self.initialized()
        bID = 0
        #for n in range(1,100,1):
        while bID < 100:
            bID = bID + 1
            startURL = 'https://www.example.com/units.aspx?b_id=%d' % (bID)
            request = Request(url=startURL, dont_filter=True, callback=self.parse_add_tables, meta={'bID': bID, 'metaItems': []})
            # print self.metabID
            yield request  #Request(url=startURL, dont_filter=True, callback=self.parse2)
    else:
        self.log("Something went wrong, we couldn't log in....Bad times :(")
        # Something went wrong, we couldn't log in, so nothing happens.
You could try something like this. I'm not sure if it's fit for purpose on the basis that I haven't seen the rest of the spider code but here you go:
# create a list of urls to be parsed, in reverse order (so we can easily pop items off);
# xrange(100, 1, -1) covers b_id 100 down to 2 -- b_id=1 is requested first below
crawl_urls = ['https://www.example.com/units.aspx?b_id=%s' % n for n in xrange(100, 1, -1)]

def check_login_response(self, response):
    """Check the response returned by a login request to see if we are successfully logged in.
    """
    if "Welcome!" in response.body:
        self.log("Successfully logged in. Let's start crawling!")
        print "Successfully logged in. Let's start crawling!"
        # Now the crawling can begin..
        self.initialized()
        return Request(url='https://www.example.com/units.aspx?b_id=1', dont_filter=True, callback=self.parse_add_tables, meta={'bID': 1, 'metaItems': []})
    else:
        self.log("Something went wrong, we couldn't log in....Bad times :(")
        # Something went wrong, we couldn't log in, so nothing happens.

def parse_add_tables(self, response):
    # parsing code here (builds the items list)
    if self.crawl_urls:
        next_url = self.crawl_urls.pop()
        # parse b_id back out of the query string; next_url[-1:] would only
        # grab the last digit and break for two- and three-digit ids
        return Request(url=next_url, dont_filter=True, callback=self.parse_add_tables, meta={'bID': int(next_url.split('=')[-1]), 'metaItems': []})
    return items
You can use the priority attribute on the Request object. Scrapy crawls the urls in DFO order by default, but it does not ensure that the urls are visited in the order they were yielded within your parse callback.
Instead of yielding Request objects, you want to return a list of Requests from which objects are popped until it is empty.
For more info, see here:
Scrapy Crawl URLs in Order

Expiring all caches on a controller

I've got a resourceful controller with a custom action. The action is pretty heavy, so I'm working on caching it:
class MyController < ApplicationController
  caches_action :walk_to_mordor

  # GET /my/:id/walk_to_mordor/:direction
  def walk_to_mordor
    # srz bzns
  end
end
It works very nicely: caching is done and the page is now fast. However, I want to allow the user to "bust" the cache by clicking a link on the page. At first I tried:
def bust_cache
  expire_action :action => :walk_to_mordor
end
Rails complained that no route matches my action, possibly because of the parameter. Hmm, let's give it one:
def bust_cache
  MyEntities.all.each do |e|
    expire_action walk_to_mordor_path(e, ??)
  end
end
Problem: I can't possibly identify all the choices of :direction.
Is there a way to clear all action caches that match a certain regular expression, or all action caches from a specific controller?
The secret is called expire_fragment:
expire_fragment(key, options = nil)
Removes fragments from the cache.
key can take one of three forms:
String - This would normally take the form of a path, like "pages/45/notes".
Hash - Treated as an implicit call to url_for, like {:controller => "pages", :action => "notes", :id => 45}
Regexp - Will remove any fragment that matches, so %r{pages/\d*/notes} might remove all notes. Make sure you don't use anchors in the regex (^ or $) because the actual filename matched looks like ./cache/filename/path.cache. Note: Regexp expiration is only supported on caches that can iterate over all keys (unlike memcached).
http://api.rubyonrails.org/classes/ActionController/Caching/Fragments.html#method-i-expire_fragment
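Applied to the controller above, the Regexp form can expire every cached variant of the action, whatever :id and :direction were. A sketch (the exact cache path depends on your routes, so treat the pattern as an assumption):

def bust_cache
  # matches every cached variant of walk_to_mordor under the file store;
  # per the docs above, this will NOT work with memcached
  expire_fragment(%r{walk_to_mordor})
  head :ok
end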
Sadly, it won't work with memcached (if I ever decide to use it). I'd have to be a lot more clever to avoid the cache in that circumstance. Maybe adding a serial parameter to the request, and incrementing it when the user presses the 'bust cache' button...
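That serial idea can be expressed with caches_action's :cache_path option, so bumping a counter invalidates every old key at once without deleting anything. A sketch, assuming a current_user helper and a hypothetical cache_serial column:

class MyController < ApplicationController
  # the serial becomes part of the cache key, so old entries are
  # simply never looked up again once the counter is bumped
  caches_action :walk_to_mordor, :cache_path => proc { |c|
    "walk_to_mordor/#{c.params[:id]}/#{c.params[:direction]}/#{c.current_user.cache_serial}"
  }

  def bust_cache
    current_user.increment!(:cache_serial)  # hypothetical column
    head :ok
  end
end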

testing a multi-step workflow in rspec

I'd like to know about idioms or best practices for testing a multi-step workflow using rspec.
Let's take as an example a "shopping cart" system, where the buying process might be
when user submits to basket and we are not using https, redirect to https
when user submits to basket and we are using https and there is no cookie, create and display a new basket and send back a cookie
when user submits to basket and we are using https and there is a valid cookie and the new item is for a different product than the first item, add a line to the basket and display both lines
when user submits to basket and we are using https and there is a valid cookie and the new item is for the same product as a previous one, increment that basket line's quantity and display both lines
when user clicks 'checkout' on the basket page and is using https and there is a cookie and the basket is non-empty and ...
...
I've read http://eggsonbread.com/2010/03/28/my-rspec-best-practices-and-tips/ which advises, among other things, that each "it block" should contain only one assertion: instead of doing the computation and then testing several attributes in the same block, use a "before" inside a context to create (or retrieve) the object under test and assign it to @some_instance_variable, then write each attribute test as a separate block. That helps a little, but in a case such as the one outlined above, where testing step n requires doing all the setup for steps [1..n-1], I find myself either duplicating setup code (obviously not good) or creating lots of helper functions with increasingly unwieldy names (def create_basket_with_three_lines_and_two_products) and calling them consecutively in each step's before block.
Any tips on how to do this less verbosely/tediously? I appreciate the general principle behind the idea that each example should not depend on state left behind by previous examples, but when you're testing a multi-step process and things can go wrong at any step, setting up the context for each step is inevitably going to require rerunning all the setup for the previous n steps, so ...
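One common way to cut the duplication without ever-longer helper names is to let nested contexts accumulate the setup: each context's before block runs after its parents', so every level adds exactly one step. A sketch of the basket example (the helper names are hypothetical):

describe "basket checkout" do
  before { use_https }                 # step 1 setup (hypothetical helper)

  context "with a valid cookie" do
    before { submit_first_item }       # outer before has already run

    context "adding a different product" do
      before { submit_second_item }    # ...and so on, one step per level

      it "shows two basket lines" do
        basket.lines.size.should == 2  # basket is a hypothetical helper
      end
    end
  end
end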
Here's one possible approach -- define an object that creates the necessary state for each step and pass it forward for each successive one. Basically you need to mock/stub the method calls for all the setup conditions:
class MultiStep
  def initialize(context)
    @context = context
  end

  def init_vars
    @cut = @context.instance_variable_get(:@cut)
  end

  def setup(step)
    init_vars
    method(step).call
  end

  def step1
    @cut.stub(:foo).and_return("bar")
  end

  def step2
    step1
    @cut.stub(:foo_bar).and_return("baz_baz")
  end
end

class Cut # Class Under Test
  def foo
    "foo"
  end

  def foo_bar
    "foo_bar"
  end
end

describe "multiple steps" do
  before(:each) do
    @multi_stepper = MultiStep.new(self)
    @cut = Cut.new
  end

  it "should setup step1" do
    @multi_stepper.setup(:step1)
    @cut.foo.should == "bar"
    @cut.foo_bar.should == "foo_bar"
  end

  it "should setup step2" do
    @multi_stepper.setup(:step2)
    @cut.foo.should == "bar"
    @cut.foo_bar.should == "baz_baz"
  end
end
Certainly too late for OP, but this could be handy for others - the rspec-steps gem seems to be built for this exact situation: https://github.com/LRDesign/rspec-steps
It might be worthwhile to look at https://github.com/railsware/rspec-example_steps and https://github.com/jimweirich/rspec-given as well. I settled on rspec-steps, but I was in a rush and these other options might actually be better for all I know.
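For flavor, a minimal sketch of the rspec-steps style (the DSL is quoted from memory of the gem's README, so double-check it; Basket and its API are hypothetical):

RSpec.steps "buying a product" do
  it "starts an empty basket" do
    @basket = Basket.new           # state persists into later steps
    @basket.lines.should be_empty
  end

  it "adds the first item" do
    @basket.add(:some_product)     # each step builds on the previous one
    @basket.lines.size.should == 1
  end
end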
