BDD / TDD with JSpec - Removing code duplication - refactoring

How do I refactor to remove the code duplication in this spec:
describe 'TestPlugins'
  describe '.MovieScanner(document)'
    before_each
      MoviePage_loggedIn = fixture("movie_logged_in.html") // Get logged-in movie page
      MoviePage_notloggedIn = fixture("movie_not_logged_in.html") // Get non logged-in movie page
      scanner = new MovieScanner() // Get movie scanner
    end
    it 'should scan logged-in movie page for movie data'
      doc = MoviePage_loggedIn // Get document to scan
      // Unit Tests
      // ------------------------------------------------------------
      // Test movie scanner's functions
      scanner.getMovieTitle(doc).should.eql "The Jacket"
      scanner.getMovieYear(doc).should.eql "2005"
      // Test movie scanner's main scan function
      scannedData = scanner.scan(doc)
      scannedData.title.should.eql "The Jacket"
      scannedData.year.should.eql "2005"
    end
    it 'should scan non logged-in movie page for movie data'
      doc = MoviePage_notloggedIn // Get document to scan
      // Unit Tests
      // ------------------------------------------------------------
      // Test movie scanner's functions
      scanner.getMovieTitle(doc).should.eql "The Jacket"
      scanner.getMovieYear(doc).should.eql "2005"
      // Test movie scanner's main scan function
      scannedData = scanner.scan(doc)
      scannedData.title.should.eql "The Jacket"
      scannedData.year.should.eql "2005"
    end
  end
end

In BDD, we want to describe the behaviour of our app or classes to make them easy to change. If removing duplication would also obscure the behaviour, don't remove the duplication. The code gets read 10x more than it's written, and IME even more for BDD scenarios and unit-level examples.
If you do decide to remove the duplication anyway, replace it with something readable. I'm not familiar with JSpec but I'd expect something like
scannedData.shouldMatch "The Jacket", "2005"
where all the relevant outcomes for title and year are checked.
To remove the duplication irrespective of whether you are logged in or not:
Separate the code into Givens (context where it doesn't matter how you got there), Whens (events through the app whose behaviour you actually want to test) and Thens (outcomes you're looking for). You're looking to describe the capabilities of the system and things a user can do with it, rather than whether it's a web-page or a window - it shouldn't matter. Put the lower-level calls at a lower level.
You can then have two different givens - logged in or not logged in - and reuse the other steps for the rest.

Any ar js multimarkers learning tutorial?

I have been searching for ar.js multimarkers tutorial or anything that explains about it. But all I can find is 2 examples, but no tutorials or explanations.
So far, I understand that it requires to learn the pattern or order of the markers, then it stores it in localStorage. This data is used later to display the image.
What I don't understand, is how this "learner" is implemented. Also, the learning process is only used once by the "creator", right? The output file should be stored and then served later when needed, not created from scratch at each person's phone or computer.
Any help is appreciated.
Since the question is mostly about the learner page, I'll try to break it down as much as I can:
1) You need to have an array of {type, URL} objects.
A sample of creating the default array is shown below (source code):
var markersControlsParameters = [
  {
    type : 'pattern',
    patternUrl : 'examples/marker-training/examples/pattern-files/pattern-hiro.patt',
  },
  {
    type : 'pattern',
    patternUrl : 'examples/marker-training/examples/pattern-files/pattern-kanji.patt',
  }
]
2) You need to feed this to the 'learner' object.
By default the above object is encoded into the URL (source) and then decoded by the learner site. The important part happens on the learner site:
for each object in the array, an ArMarkerControls object is created and stored:
// `array` stands for the decoded list of {type, patternUrl} objects from step 1
var subMarkersControls = []
array.forEach(function(markerParams){
  var markerRoot = new THREE.Group()
  scene.add(markerRoot)
  // create markerControls for our markerRoot
  var markerControls = new THREEx.ArMarkerControls(arToolkitContext, markerRoot, markerParams)
  subMarkersControls.push(markerControls)
})
The subMarkersControls is used to create the object used to do the learning. At long last:
var multiMarkerLearning = new THREEx.ArMultiMakersLearning(arToolkitContext, subMarkersControls)
The example learner site has multiple utility functions, but as far as I know, the most important here are the ArMultiMakersLearning members, which can be used in the following order (or any other):
// this method resets previously collected statistics
multiMarkerLearning.resetStats()
// this member flag enables data collection
multiMarkerLearning.enabled = true
// this member flag stops data collection
multiMarkerLearning.enabled = false
// To obtain the 'learned' data, simply call .toJSON()
var jsonString = multiMarkerLearning.toJSON()
That's all. If you store the jsonString as
localStorage.setItem('ARjsMultiMarkerFile', jsonString);
then it will be used as the default multimarker file later on. If you want a custom name or more areas - then you'll have to modify the name in the source code.
3) 2.1.4 debugUI
It seems that the debug UI is broken: the UI buttons do exist but are nowhere to be seen. A hot fix would be using the 'markersAreaEnabled' span style for the div containing the buttons (see this source bit).
It's all in this glitch, you can find it under the phrase 'CHANGES HERE' in the arjs code.

RASA FormAction ActionExecutionRejection doesn’t re-prompt for missing slot

I am trying to implement a FormAction here, and I’ve overridden validate method.
Here is the code for the same:
def validate(self, dispatcher, tracker, domain):
    logger.info("Validate of single entity called")
    document_number = tracker.get_slot("document_number")
    # Run regex on latest_message
    extracted = re.findall(regexp, tracker.latest_message['text'])
    document_array = []
    for e in extracted:
        document_array.append(e[0])
    # generate set for needed things and
    document_set = set(document_array)
    document_array = list(document_set)
    logger.info(document_set)
    if len(document_set) > 0:
        if document_number and len(document_number):
            document_array = list(set(document_array + document_number))
        return [SlotSet("document_number", document_array)]
    else:
        if document_number and len(document_number):
            document_array = list(set(document_array + document_number))
            return [SlotSet("document_number", document_array)]
        else:
            # Here it doesn't have previously set slot
            # So Raise an error
            raise ActionExecutionRejection(self.name(),
                                           "Please provide document number")
So, ideally as per the docs, when ActionExecutionRejection occurs, it should utter a template with name utter_ask_{slotname} but it doesn’t trigger that action.
Here is my domain.yml templates
templates:
  utter_greet:
  - text: "Hi, hope you are having a good day! How can I help?"
  utter_ask_document_number:
  - text: "Please provide document number"
  utter_help:
  - text: "To find the document, please say the ID of a single document or multiple documents"
  utter_goodbye:
  - text: "Talk to you later!"
  utter_thanks:
  - text: "My pleasure."
The ActionExecutionRejection doesn't by default utter a template with the name utter_ask_{slotname}, but rather leaves the form logic to allow other policies (e.g. FallbackPolicy) to take action. The utter_ask_{slotname} is the default for the happy path in which it's trying to get a required slot for the first time. This default implementation of the action rejection is there in order to handle certain unhappy paths such as if a user decides they want to exit the flow by denying, or take a detour by chatting, etc.
If you want to re-ask for the required slot using the utterance, you could replace the ActionExecutionRejection with dispatcher.utter_template(<desired template name>, tracker). However, this will leave you with no way to exit the form action without validation. I don't know what your intents are, but perhaps you also want to incorporate some logic based on the intent (i.e. if it's something like "deny", let the ActionExecutionRejection happen so it can exit; if it's an "enter data" type of intent, make sure it asks again).
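The intent-based branching suggested above can be sketched without any Rasa imports, which makes the decision logic easy to test on its own. Everything named here is an assumption for illustration: the function `decide_form_step`, the intent name `deny`, and the returned labels are hypothetical and not part of the Rasa SDK. In a real validate method, "set_slot" would correspond to returning SlotSet, "reask" to calling dispatcher.utter_template, and "reject" to raising ActionExecutionRejection.

```python
def decide_form_step(intent, extracted, previous):
    """Return (action, values) for one validation pass.

    intent    -- name of the latest intent (names are hypothetical)
    extracted -- document numbers found in the latest message
    previous  -- document numbers already stored in the slot, or None
    """
    # Merge new and previously stored numbers, dropping duplicates.
    merged = sorted(set(extracted) | set(previous or []))
    if extracted:
        # New numbers were found: store them together with the old ones.
        return ("set_slot", merged)
    if intent == "deny":
        # The user wants out: let the form reject so other policies can act.
        return ("reject", [])
    if merged:
        # Nothing new, but the slot was already filled earlier.
        return ("set_slot", merged)
    # Nothing found and nothing stored: ask again instead of rejecting.
    return ("reask", [])
```

Keeping the decision separate from the dispatcher and tracker calls means the exit path ("deny") and the re-ask path can each be checked with plain assertions before wiring them into the form.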

rails string substitution or similar solution in controller

I'm building a site with users in all 50 states. We need to display information for each user that is specific to their situation, e.g., the number of events they completed in that state. Each state's view (a partial) displays state-specific information and, therefore, relies upon state-specific calculations in a state-specific model. We'd like to do something similar to this:
@#{user.state} = #{user.state.capitalize}.new(current_user)
in the users_controller instead of
@illinois = Illinois.new(current_user) if (@user.state == 'illinois')
.... [and the remaining 49 states]
@wisconsin = Wisconsin.new(current_user) if (@user.state == 'wisconsin')
to trigger the Illinois.rb model and, in turn, drive the view defined in the users_controller by
def user_state_view
  @user = current_user
  @events = Event.all
  @illinois = Illinois.new(current_user) if (@user.state == 'illinois')
end
I'm struggling to find a better way to do this / refactor it. Thanks!
I would avoid dynamically defining instance variables if you can help it. It can be done with instance_variable_set, but it's unnecessary. There's no reason you need to define the variable as @illinois instead of just @user_state or something like that. Here is one way to do it.
First make a static list of states:
def states
  # %w builds an array of strings; a bare %{...} would build one single string
  %w{wisconsin arkansas new_york etc}
end
then make a dictionary which maps those states to their classes:
def state_classes
  states.reduce({}) do |memo, state|
    memo[state] = state.camelize.constantize
    memo
  end
end
# = { 'illinois' => Illinois, 'wisconsin' => Wisconsin, 'new_york' => NewYork, etc }
It's important that you hard-code a list of state identifiers somewhere, because it's not good practice to pass arbitrary values to constantize.
Then instantiating the correct class is a breeze:
@user_state = state_classes[@user.state].new(current_user)
There are definitely other ways to do this (for example, it could be added on the model layer instead).

XPath: Select Certain Child Nodes

I'm using XPath with Scrapy to scrape data off of a movie website BoxOfficeMojo.com.
As a general question: I'm wondering how to select certain child nodes of one parent node all in one Xpath string.
Depending on the movie web page from which I'm scraping data, sometimes the data I need is located at different children nodes, such as whether or not there is a link or not. I will be going through about 14000 movies, so this process needs to be automated.
Using this as an example. I will need actor/s, director/s and producer/s.
This is the XPath to the director. Note: the %s corresponds to a determined index where that information is found - in the Action Jackson example the director is found at [1] and the actors at [2].
//div[@class="mp_box_content"]/table/tr[%s]/td[2]/font/text()
However, would a link exist to a page on the director, this would be the Xpath:
//div[@class="mp_box_content"]/table/tr[%s]/td[2]/font/a/text()
Actors are a bit more tricky, as there are <br> elements included for subsequent actors listed, which may be the children of an /a or children of the parent /font, so:
//div[@class="mp_box_content"]/table/tr[%s]/td[2]/font//a/text()
gets almost all of the actors (except those with font/br).
Now, the main problem here, I believe, is that there are multiple //div[@class="mp_box_content"] - everything I have works EXCEPT that I also end up getting some digits from other mp_box_content. Also I have added numerous try:, except: statements in order to get everything (actors, directors, producers who both have and do not have links associated with them). For example, the following is my Scrapy code for actors:
actors = hxs.select('//div[@class="mp_box_content"]/table/tr[%s]/td[2]/font//a/text()' % (locActor,)).extract()
try:
    second = hxs.select('//div[@class="mp_box_content"]/table/tr[%s]/td[2]/font/text()' % (locActor,)).extract()
    for n in second:
        actors.append(n)
except:
    actors = hxs.select('//div[@class="mp_box_content"]/table/tr[%s]/td[2]/font/text()' % (locActor,)).extract()
This is an attempt to cover for the facts that: the first actor may not have a link associated with him/her and subsequent actors do, the first actor may have a link associated with him/her but the rest may not.
I appreciate the time taken to read this and any attempts to help me find/address this problem! Please let me know if any more information is needed.
I am assuming you are only interested in textual content, not the links to actors' pages etc.
Here is a proposition using lxml.html (and a bit of lxml.etree) directly.
First, I recommend you select td[2] cells by the text content of td[1], with expressions like .//tr[starts-with(td[1], "Director")]/td[2] to account for "Director" or "Directors".
Second, testing various expressions with or without <font>, with or without <a> etc., makes code difficult to read and maintain, and since you're interested only in the text content, you might as well use string(.//tr[starts-with(td[1], "Actor")]/td[2]) to get the text, or use lxml.html.tostring(e, method="text", encoding=unicode) on selected elements.
And for the <br> issue with multiple names, what I generally do is modify the lxml tree containing the targeted content to add a special formatting character to <br> elements' .text or .tail, for example a \n, with one of lxml's iter() functions. This can be useful on other HTML block elements, like <hr> for example.
You may see better what I mean with some spider code:
from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector
import lxml.etree
import lxml.html

MARKER = "|"

def br2nl(tree):
    for element in tree:
        for elem in element.iter("br"):
            elem.text = MARKER

def extract_category_lines(tree):
    if tree is not None and len(tree):
        # modify the tree by adding a MARKER after <br> elements
        br2nl(tree)
        # use lxml's .tostring() to get a unicode string
        # and split lines on the marker we added above
        # so we get lists of actors, producers, directors...
        return lxml.html.tostring(
            tree[0], method="text", encoding=unicode).split(MARKER)

class BoxOfficeMojoSpider(BaseSpider):
    name = "boxofficemojo"
    start_urls = [
        "http://www.boxofficemojo.com/movies/?id=actionjackson.htm",
        "http://www.boxofficemojo.com/movies/?id=cloudatlas.htm",
    ]

    # locate 2nd cell by text content of first cell
    XPATH_CATEGORY_CELL = lxml.etree.XPath('.//tr[starts-with(td[1], $category)]/td[2]')

    def parse(self, response):
        root = lxml.html.fromstring(response.body)
        # locate the "The Players" table
        players = root.xpath('//div[@class="mp_box"][div[@class="mp_box_tab"]="The Players"]/div[@class="mp_box_content"]/table')
        # we have only one table in "players" so the for loop is not really necessary
        for players_table in players:
            directors_cells = self.XPATH_CATEGORY_CELL(players_table,
                category="Director")
            actors_cells = self.XPATH_CATEGORY_CELL(players_table,
                category="Actor")
            producers_cells = self.XPATH_CATEGORY_CELL(players_table,
                category="Producer")
            # look up writers by their own label, not "Producer"
            writers_cells = self.XPATH_CATEGORY_CELL(players_table,
                category="Writer")
            composers_cells = self.XPATH_CATEGORY_CELL(players_table,
                category="Composer")
            directors = extract_category_lines(directors_cells)
            actors = extract_category_lines(actors_cells)
            producers = extract_category_lines(producers_cells)
            writers = extract_category_lines(writers_cells)
            composers = extract_category_lines(composers_cells)
            print "Directors:", directors
            print "Actors:", actors
            print "Producers:", producers
            print "Writers:", writers
            print "Composers:", composers
            # here you should of course populate scrapy items
The code can be simplified for sure, but I hope you get the idea.
You can do similar things with HtmlXPathSelector of course (with the string() XPath function for example), but without modifying the tree for <br> (how to do that with hxs?) it works only for non-multiple names in your case:
>>> hxs.select('string(//div[@class="mp_box"][div[@class="mp_box_tab"]="The Players"]/div[@class="mp_box_content"]/table//tr[contains(td, "Director")]/td[2])').extract()
[u'Craig R. Baxley']
>>> hxs.select('string(//div[@class="mp_box"][div[@class="mp_box_tab"]="The Players"]/div[@class="mp_box_content"]/table//tr[contains(td, "Actor")]/td[2])').extract()
[u'Carl WeathersCraig T. NelsonSharon Stone']
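To see the <br>-marker idea in isolation, here is a minimal sketch that uses only the standard library's xml.etree.ElementTree. It is not the spider's code: the `CELL` markup is a made-up, well-formed stand-in for one "Actors" cell, and `split_names` is a hypothetical helper. Real pages are rarely well-formed, so for actual scraping lxml.html is the better fit.

```python
import xml.etree.ElementTree as ET

MARKER = "|"

# Hypothetical, well-formed stand-in for one "Actors" table cell.
CELL = (
    '<td><font><a href="#">Carl Weathers</a><br/>'
    '<a href="#">Craig T. Nelson</a><br/>Sharon Stone</font></td>'
)

def split_names(cell_markup):
    """Return the names in a cell, using <br> elements as separators."""
    cell = ET.fromstring(cell_markup)
    for br in cell.iter("br"):
        # Put the marker in front of whatever text follows the <br>.
        br.tail = MARKER + (br.tail or "")
    # itertext() walks text and tails in document order, links included.
    text = "".join(cell.itertext())
    return [name.strip() for name in text.split(MARKER) if name.strip()]

names = split_names(CELL)
```

Because the marker is written into the tree before the text is flattened, names with and without a surrounding link are handled by the same code path, and the concatenated result shown above is avoided.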

Sharing data between Sinatra condition and request block

I am wondering whether a condition can pass information to the request block once it has run. I doubt conditions can do this, or that they are the right place even if they could, because the name implies they are only for conditional logic. However, the authorisation example also redirects, so the concerns are already blurred. An example would be something like:
set(:get_model) { |body| { send_to_request_body(Model.new(body)) } }
get '/something', :get_model => request.body.data do
  return "model called #{@model.name}"
end
The above is all pseudocode, so sorry for any syntax/spelling mistakes, but the idea is that I can have a condition which fetches the model and puts it into some local variable for the body to use, or does a halt with an error or something.
I am sure filters (before/after) would be a better way to do this if it can be done, however from what I have seen I would need to set that up per route, whereas with a condition I would only need to have it as an option on the request.
An example with before would be:
before '/something' do
  @model = Model.new(request.body.data)
end
get '/something' do
  return "model called #{@model.name}"
end
This is great, but let's say I now had 20 routes, and 18 of them needed these models created. I would basically need to duplicate the before filter for all 18 of them and write the same model logic for them all, which is why I am trying to find a better way to re-use this functionality. If I could write a catch-all before filter that was able to check whether the given route had an option set, then that could possibly work, but I'm not sure if you can do that.
In ASP MVC you could do this sort of thing with filters, which is what I am ideally after: some way to configure certain routes (at the route definition) to do some work beforehand and pass it into the calling block.
Conditions can set instance variables and modify the params hash. For an example, see the built-in user_agent condition.
set(:get_model) { |body| condition { @model = Model.new(body) } }
get '/something', :get_model => something do
  "model called #{@model.name}"
end
You should be aware that request is not available at that point, though.
Sinatra has support for before and after filters:
before do
  @note = 'Hi!'
  request.path_info = '/foo/bar/baz'
end
get '/foo/*' do
  @note #=> 'Hi!'
  params[:splat] #=> 'bar/baz'
end
after '/create/:slug' do |slug|
  session[:last_slug] = slug
end
