web2py A() link handling with multiple targets - ajax

I need to update multiple targets when a link is clicked.
This example builds a list of links.
When the link is clicked, the callback needs to populate two different parts of the .html file.
The actual application uses bokeh for plotting.
The user will click on a link, and 'linkDetails1' and 'linkDetails2' should then hold the script and div returned from a call to bokeh's components() function.
Obviously this naive approach does not work.
How can I make a list of links that when clicked on will populate two separate places in the .html file?
################################
#views/default/test.html:
{{extend 'layout.html'}}
{{=linkDetails1}}
{{=linkDetails2}}
{{=links}}
################################
# controllers/default.py:
def test():
    """
    example action using the internationalization operator T and flash
    rendered by views/default/index.html or views/generic.html
    if you need a simple wiki simply replace the two lines below with:
    return auth.wiki()
    """
    d = dict()
    links = []
    for ii in range(5):
        link = A("click on link %d"%ii, callback=URL('linkHandler/%d'%ii), )
        links.append(["Item %d"%ii, link])
    table = TABLE()
    table.append([TR(*rows) for rows in links])
    d["links"] = table
    d["linkDetails1"] = "linkDetails1"
    d["linkDetails2"] = "linkDetails2"
    return d

def linkHandler():
    import os
    d = dict()
    # request.url will be linked/N
    ii = int(os.path.split(request.url)[1])
    # want to put some information into linkDetails, some into linkDiv
    # this does not work:
    d = dict()
    d["linkDetails1"] = "linkHandler %d"%ii
    d["linkDetails2"] = "linkHandler %d"%ii
    return d

I must admit that I'm not 100% clear on what you're trying to do here, but if you need to update e.g. 2 div elements in your page in response to a single click, there are a couple of ways to accomplish that.
The easiest, and arguably most web2py-ish way is to contain your targets in an outer div that's a target for the update.
Another alternative, which is very powerful is to use something like Taconite [1], which you can use to update multiple parts of the DOM in a single response.
[1] http://www.malsup.com/jquery/taconite/

In this case, it doesn't look like you need the Ajax call to return content to two separate parts of the DOM. Instead, both elements returned (the script and the div elements) can simply be placed inside a single parent div.
# views/default/test.html:
{{extend 'layout.html'}}
<div id="link_details">
{{=linkDetails1}}
{{=linkDetails2}}
</div>
{{=links}}
# controllers/default.py
def test():
    ...
    for ii in range(5):
        link = A("click on link %d" % ii,
                 callback=URL('default', 'linkHandler', args=ii),
                 target="link_details")
    ...
If you provide a "target" argument to A(), the result of the Ajax call will go into the DOM element with that ID.
def linkHandler():
    ...
    content = CAT(SCRIPT(...), DIV(...))
    return content
In linkHandler, instead of returning a dictionary (which requires a view in order to generate HTML), you can simply return a web2py HTML helper, which will automatically be serialized to HTML and then inserted into the target div. The CAT() helper simply concatenates other elements (in this case, your script and associated div).
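To make the "one fragment, one target" idea concrete outside of web2py, here is a minimal plain-Python sketch. It assumes the handler ends up with a script string and a div string (as bokeh would provide); `fake_components` and `link_handler_fragment` are hypothetical names used only for illustration, and the markup they produce is made up.

```python
from html import escape


def fake_components(ii):
    """Hypothetical stand-in for the (script, div) pair obtained from bokeh."""
    script = "<script>console.log('plot %d');</script>" % ii
    div = '<div id="plot-%d">%s</div>' % (ii, escape("linkHandler %d" % ii))
    return script, div


def link_handler_fragment(ii):
    """Build the single HTML fragment destined for the one target div."""
    script, div = fake_components(ii)
    # Both pieces travel together, so only one target element is needed.
    return script + div


print(link_handler_fragment(3))
```

Because both pieces are returned as one fragment, the page only needs the single "link_details" container as the Ajax target.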

Related

Concept for recipe-based parsing of webpages needed

I'm working on a web-scraping solution that grabs totally different webpages and lets the user define rules/scripts in order to extract information from the page.
I started scraping from a single domain and build a parser based on Nokogiri.
Basically everything works fine.
I could now add a ruby class each time somebody wants to add a webpage with a different layout/style.
Instead I thought about using an approach where the user specifies elements where content is stored using xpath and storing this as a sort of recipe for this webpage.
Example: The user wants to scrape a table-structure extracting the rows using a hash (column-name => cell-content)
I was thinking about writing a ruby function for extraction of this generic table information once:
require 'nokogiri'

# extracts a table's rows as an array of hashes (column_name => cell content)
# html - the html-file as a string
# xpath_table - specifies the html table as xpath which hold the data to be extracted
def basic_table(html, xpath_table)
  html_doc = Nokogiri::HTML(html)
  # the header texts become the hash keys
  row_headers = html_doc.xpath("#{xpath_table}/thead/tr/th").map(&:inner_text)
  row_contents = []
  # double quotes are required here so that #{xpath_table} is interpolated
  html_doc.xpath("#{xpath_table}/tbody/tr").each do |table_row|
    cells = table_row.xpath('td').map(&:inner_text)
    row_content_hash = {}
    cells.each_with_index do |cell_string, column_index|
      row_content_hash[row_headers[column_index]] = cell_string
    end
    # append the hash itself, so the result is an array of hashes
    row_contents << row_content_hash
  end
  row_contents
end
The user could now specify a website-recipe-file like this:
<basic_table xpath='//div[@id="grid"]/table[@id="displayGrid"]' />
The function basic_table is referenced here, so that by parsing the website-recipe-file I would know that I can use the function basic_table to extract the content from the table referenced by the xPath.
This way the user can specify simple recipe-scripts and only has to dive into writing actual code if he needs a new way of extracting information.
The code would not change every time a new webpage needs to be parsed.
Whenever the structure of a webpage changes only the recipe-script would need to be changed.
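To illustrate the dispatch idea, here is a rough sketch in Python using only the standard library (not Nokogiri). It assumes the recipe is a small XML file whose element names match extractor functions; the `EXTRACTORS` registry, `run_recipe`, and the sample page are all made up for the example, and ElementTree only supports a limited XPath subset with relative paths.

```python
import xml.etree.ElementTree as ET


def basic_table(root, xpath_table):
    """Return the table's rows as a list of {column_name: cell_text} dicts."""
    table = root.find(xpath_table)
    headers = [th.text for th in table.findall("thead/tr/th")]
    rows = []
    for tr in table.findall("tbody/tr"):
        cells = [td.text for td in tr.findall("td")]
        rows.append(dict(zip(headers, cells)))
    return rows


# Hypothetical registry: recipe element name -> extractor function.
EXTRACTORS = {"basic_table": basic_table}


def run_recipe(recipe_xml, page_xml):
    """Apply every step of the recipe to the page and collect the results."""
    recipe = ET.fromstring(recipe_xml)
    root = ET.fromstring(page_xml)
    results = []
    for step in recipe:
        extractor = EXTRACTORS[step.tag]  # look up the function by tag name
        results.append(extractor(root, step.get("xpath")))
    return results


# Made-up, well-formed sample page (real HTML would need an HTML parser).
PAGE = """<html><body><div id="grid"><table id="displayGrid">
<thead><tr><th>Name</th><th>Price</th></tr></thead>
<tbody><tr><td>Apple</td><td>1.20</td></tr>
<tr><td>Pear</td><td>0.90</td></tr></tbody>
</table></div></body></html>"""

RECIPE = """<recipe>
<basic_table xpath=".//div[@id='grid']/table[@id='displayGrid']"/>
</recipe>"""

print(run_recipe(RECIPE, PAGE))
```

Adding a new extraction style then means registering one more function, while a layout change on a page only touches its recipe.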
I was thinking that someone might be able to tell me how he would approach this. Rules/rule engines pop into my mind, but I'm not sure if that really is the solution to my problem.
Somehow I have the feeling that I don't want to "invent" my own solution to handle this problem.
Does anybody have a suggestion?
J.

three dependent drop down list opencart

I want to make three dependent drop-down lists, where each drop-down depends on the previous one: when I select an item in the first drop-down, the matching data is fetched from the database and added to the second drop-down as its items.
I know how to do this in a normal PHP page using AJAX, but since OpenCart uses MVC, I don't know how to get the selected value.
Basically, you need two things:
(1) Handling list changes
Add an event handler to each list that gets its selected value when it changes (the part that you already know), detailed tutorial here in case someone needed it
Just a suggestion (for code optimization), instead of associating a separate JS function to each list and repeating the code, you can write the function once, pass it the ID of the changing list along with the ID of the depending list and use it anywhere.
Your HTML should look like
<select id="list1" onchange="populateList('list1', 'list2')">
...
</select>
<select id="list2" onchange="populateList('list2', 'list3')">
...
</select>
<select id="list3">
...
</select>
and your JS
function populateList(listID, depListID)
{
    // get the value of the changed list by fetching the element with ID "listID"
    var listValue = ...
    // get the values to be set in the depending list through AJAX
    var depListValues = ...
    // populate the depending list (element with ID "depListID")
}
(2) Populating the depending list
Send the value through AJAX to the appropriate PHP function and get the values back to update the depending list (the part you are asking for), AJAX detailed tutorial here
OpenCart uses the front controller design pattern for routing. The URL always looks like <your domain>/index.php?route=x/y/z&other parameters, where x is the name of the folder that contains a set of class files, y is the name of the file that contains a specific class, and z is the function to be called in that class (if omitted, index() is called).
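As a rough model of that x/y/z convention (written in Python for illustration; this is a simplification and not OpenCart's actual dispatcher), the route parameter can be thought of as being split like this. `parse_route`, `controller_class_name`, and `dispatch_target` are hypothetical helper names, and the class-name rule only covers simple names such as common/home.

```python
from urllib.parse import parse_qs


def parse_route(route):
    """Split 'x/y/z' into (folder, file, method); method defaults to 'index'."""
    parts = route.split("/")
    if len(parts) < 2:
        raise ValueError("route needs at least folder/file: %r" % route)
    folder, filename = parts[0], parts[1]
    method = parts[2] if len(parts) > 2 else "index"
    return folder, filename, method


def controller_class_name(folder, filename):
    """common/home -> ControllerCommonHome (simple names only)."""
    return "Controller" + folder.capitalize() + filename.capitalize()


def dispatch_target(query_string):
    """Describe which folder, file, method and class a query string points at."""
    params = parse_qs(query_string)
    folder, filename, method = parse_route(params["route"][0])
    return folder, filename, method, controller_class_name(folder, filename)


print(dispatch_target("route=common/home/populateList&token=abc"))
print(parse_route("common/home"))
```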
So the answer for your question is:
(Step 1) Use the following URL in your AJAX request:
index.php?route=common/home/populateList
(Step 2) Open the file <OC_ROOT>/catalog/controller/common/home.php , you will find class ControllerCommonHome, add a new function with the name populateList and add your logic there
(Step 3) To use the database object, I answered that previously here
Note: if you are at the admin side, there is a security token that MUST be present in all links along with the route, use that URL:
index.php?route=common/home/populateList&token=<?php echo $this->session->data['token'] ?>
and edit the file in the admin folder, not the catalog folder.
P.S: Whenever the user changes the selected value in list # i, you should update options in list # i + 1 and reset all the following lists list # i + 2, list # i + 3 ..., so in your case you should always reset the third list when the first list value is changed
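The reset rule is easy to get wrong, so here is a small language-neutral sketch of it in Python. The `CHILDREN` table stands in for the database lookups and, like the list names and values, is made up for the example.

```python
# Hypothetical data standing in for the database lookups.
CHILDREN = {
    ("list1", "fruit"): ["apple", "pear"],
    ("list2", "apple"): ["red", "green"],
}
ORDER = ["list1", "list2", "list3"]


def on_change(state, list_id, value):
    """Return the new state after list `list_id` changes to `value`."""
    i = ORDER.index(list_id)
    new_state = dict(state)
    new_state[list_id] = {"options": state[list_id]["options"], "selected": value}
    # list i + 1 gets fresh options that depend on the new value
    if i + 1 < len(ORDER):
        options = CHILDREN.get((list_id, value), [])
        new_state[ORDER[i + 1]] = {"options": options, "selected": None}
    # lists i + 2, i + 3, ... are reset because their parent has no selection
    for later in ORDER[i + 2:]:
        new_state[later] = {"options": [], "selected": None}
    return new_state


state = {
    "list1": {"options": ["fruit", "veg"], "selected": None},
    "list2": {"options": [], "selected": None},
    "list3": {"options": [], "selected": None},
}
state = on_change(state, "list1", "fruit")
state = on_change(state, "list2", "apple")
print(state["list3"]["options"])
state = on_change(state, "list1", "veg")
print(state["list2"]["options"], state["list3"]["options"])
```

In the browser, the same logic would live in the populateList function, with the lookup replaced by the AJAX call.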
P.P.S: A very good guide for OC 1.5.x => here (It can also be used as a reference for OC 2.x with some modifications)

Posting data on website using Mechanize Nokogiri Selenium

I need to post data on a website through a program.
To achieve this I am using Mechanize Nokogiri and Selenium.
Here's my code :
def aeiexport
# first Mechanize is submitting the form to identify yourself on the website
agent = Mechanize.new
agent.get("https://www.glou.com")
form_login_AEI = agent.page.forms.first
form_login_AEI.util_vlogin = "42"
form_login_AEI.util_vpassword = "666"
# this is suppose to submit the form I think
page_compet_list = agent.submit(form_login_AEI, form_login_AEI.buttons.first)
#to be able to scrap the page you end up on after submitting form
body = page_compet_list.body
html_body = Nokogiri::HTML(body)
#tds give back an array of td
tds = html_body.css('.L1').xpath("//table/tbody/tr[position()>1]/td")
# Checking my array of td with some condition
tds.each do |td|
link = td.children.first # Select the first children
if link.html = "2015 32 92 0076 012"
# Only consider the html part of the link, if matched follow the previous link
previous_td = td.previous
previous_url = previous_td.children.first.href
#following the link contained in previous_url
page_selected_compet = agent.get(previous_url)
# to be able to scrap the page I end up on
body = page_selected_compet.body
html_body = Nokogiri::HTML(body)
joueur_access = html_body.search('#tabs0head2 a')
# clicking on the link
joueur_access.click
rechercher_par_numéro_de_licence = html_body.css('.L1').xpath("//table/tbody/tr/td[1]/a[1]")
pure_link_rechercher_par_numéro_de_licence = rechercher_par_numéro_de_licence['href']
#following pure_link_rechercher_par_numéro_de_licence
page_submit_licence = agent.get(pure_link_rechercher_par_numéro_de_licence)
body_submit_licence = page_submit_licence.body
html_body = Nokogiri::HTML(body_submit_licence)
#posting my data in the right field
form.field_with(:name => 'lic_cno[0]') == "9511681"
1) What do you think of this code so far? Do you see any errors in it?
2) This part is the one I am really not sure about : I have posted my data in the right field but now I need to submit it. The problem is that the button I need to click is like this:
<input type="button" class="button" onclick="dispatchAndSubmit(document.JoueurRechercheForm, 'rechercher');" value="Rechercher">
It triggers a JavaScript function on click. I am trying to use Selenium to trigger the click event. I then end up on another page, where I need to click a few more times. I tried this:
driver.find_element(:value=> 'Rechercher').click
driver.find_element(:name=> 'sel').click
driver.find_element(:value=> 'Sélectionner').click
driver.find_element(:value=> 'Inscrire').click
But so far I have not succeeded in posting the data.
Could you please tell me whether Selenium will let me do what I need to do, and if so, how?
At a glance your code can use less indentation and more white space/empty lines to separate the internal logic of AEIexport (which should be changed to aei_export since Ruby uses snake case for method names. You can find more recommendations on how to style ruby code here).
Besides the style of your code, an error I found at the beginning of your method is using an undefined variable page when defining form_login_AEI.
For your second question, I'm not familiar with Selenium; however since it does use a real web browser it can handle JavaScript. Watir is another possible solution.
An alternative would be to view the page source (i.e. in Firebug) and understand what the JavaScript on the page does. Then use Mechanize to follow the link manually.

XPath: Select Certain Child Nodes

I'm using XPath with Scrapy to scrape data off of a movie website BoxOfficeMojo.com.
As a general question: I'm wondering how to select certain child nodes of one parent node all in one Xpath string.
Depending on the movie page I'm scraping, the data I need is sometimes located at different child nodes, for example depending on whether or not there is a link. I will be going through about 14000 movies, so this process needs to be automated.
Using this as an example. I will need actor/s, director/s and producer/s.
This is the XPath to the director. Note: the %s corresponds to a determined index where that information is found; in the Action Jackson example, the director is found at [1] and the actors at [2].
//div[@class="mp_box_content"]/table/tr[%s]/td[2]/font/text()
However, should a link exist to a page on the director, this would be the XPath:
//div[@class="mp_box_content"]/table/tr[%s]/td[2]/font/a/text()
Actors are a bit more tricky, as there are <br> elements included for subsequent actors listed, which may be the children of an /a or children of the parent /font, so:
//div[@class="mp_box_content"]/table/tr[%s]/td[2]/font//a/text()
This gets almost all of the actors (except those with font/br).
Now, the main problem here, I believe, is that there are multiple //div[@class="mp_box_content"] elements. Everything I have works EXCEPT that I also end up getting some digits from other mp_box_content divs. Also, I have added numerous try:, except: statements in order to get everything (actors, directors, producers who both have and do not have links associated with them). For example, the following is my Scrapy code for actors:
actors = hxs.select('//div[@class="mp_box_content"]/table/tr[%s]/td[2]/font//a/text()' % (locActor,)).extract()
try:
    second = hxs.select('//div[@class="mp_box_content"]/table/tr[%s]/td[2]/font/text()' % (locActor,)).extract()
    for n in second:
        actors.append(n)
except:
    actors = hxs.select('//div[@class="mp_box_content"]/table/tr[%s]/td[2]/font/text()' % (locActor,)).extract()
This is an attempt to cover for the facts that: the first actor may not have a link associated with him/her and subsequent actors do, the first actor may have a link associated with him/her but the rest may not.
I appreciate the time taken to read this and any attempts to help me find/address this problem! Please let me know if any more information is needed.
I am assuming you are only interested in textual content, not the links to actors' pages etc.
Here is a proposition using lxml.html (and a bit of lxml.etree) directly
First, I recommend you select td[2] cells by the text content of td[1], with expressions like .//tr[starts-with(td[1], "Director")]/td[2] to account for "Director", or "Directors"
Second, testing various expressions with or without <font>, with or without <a> etc., makes code difficult to read and maintain, and since you're interested only in the text content, you might as well use string(.//tr[starts-with(td[1], "Actor")]/td[2]) to get the text, or use lxml.html.tostring(e, method="text", encoding=unicode) on selected elements
And for the <br> issue for multiple names, the way I do is generally modify the lxml tree containing the targetted content to add a special formatting character to <br> elements' .text or .tail, for example a \n, with one of lxml's iter() functions. This can be useful on other HTML block elements, like <hr> for example.
You may see better what I mean with some spider code:
from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector
import lxml.etree
import lxml.html

MARKER = "|"

def br2nl(tree):
    for element in tree:
        for elem in element.iter("br"):
            elem.text = MARKER

def extract_category_lines(tree):
    if tree is not None and len(tree):
        # modify the tree by adding a MARKER after <br> elements
        br2nl(tree)
        # use lxml's .tostring() to get a unicode string
        # and split lines on the marker we added above
        # so we get lists of actors, producers, directors...
        return lxml.html.tostring(
            tree[0], method="text", encoding=unicode).split(MARKER)

class BoxOfficeMojoSpider(BaseSpider):
    name = "boxofficemojo"
    start_urls = [
        "http://www.boxofficemojo.com/movies/?id=actionjackson.htm",
        "http://www.boxofficemojo.com/movies/?id=cloudatlas.htm",
    ]

    # locate 2nd cell by text content of first cell
    XPATH_CATEGORY_CELL = lxml.etree.XPath('.//tr[starts-with(td[1], $category)]/td[2]')

    def parse(self, response):
        root = lxml.html.fromstring(response.body)
        # locate the "The Players" table
        players = root.xpath('//div[@class="mp_box"][div[@class="mp_box_tab"]="The Players"]/div[@class="mp_box_content"]/table')
        # we have only one table in "players" so the for loop is not really necessary
        for players_table in players:
            directors_cells = self.XPATH_CATEGORY_CELL(players_table,
                category="Director")
            actors_cells = self.XPATH_CATEGORY_CELL(players_table,
                category="Actor")
            producers_cells = self.XPATH_CATEGORY_CELL(players_table,
                category="Producer")
            # look up writers by their own category label
            writers_cells = self.XPATH_CATEGORY_CELL(players_table,
                category="Writer")
            composers_cells = self.XPATH_CATEGORY_CELL(players_table,
                category="Composer")
            directors = extract_category_lines(directors_cells)
            actors = extract_category_lines(actors_cells)
            producers = extract_category_lines(producers_cells)
            writers = extract_category_lines(writers_cells)
            composers = extract_category_lines(composers_cells)
            print "Directors:", directors
            print "Actors:", actors
            print "Producers:", producers
            print "Writers:", writers
            print "Composers:", composers
            # here you should of course populate scrapy items
The code can be simplified for sure, but I hope you get the idea.
You can do similar things with HtmlXPathSelector of course (with the string() XPath function for example), but without modifying the tree for <br> (how to do that with hxs?) it works only for non-multiple names in your case:
>>> hxs.select('string(//div[@class="mp_box"][div[@class="mp_box_tab"]="The Players"]/div[@class="mp_box_content"]/table//tr[contains(td, "Director")]/td[2])').extract()
[u'Craig R. Baxley']
>>> hxs.select('string(//div[@class="mp_box"][div[@class="mp_box_tab"]="The Players"]/div[@class="mp_box_content"]/table//tr[contains(td, "Actor")]/td[2])').extract()
[u'Carl WeathersCraig T. NelsonSharon Stone']
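To see in isolation why the names run together and how a marker at each <br> fixes it, here is a small sketch using only Python 3's standard library (not lxml or Scrapy). The cell markup is made up to resemble the actors cell, and the marker is attached to each <br> element's tail, which is one of the two options mentioned above.

```python
import xml.etree.ElementTree as ET

MARKER = "|"

# Made-up markup resembling the actors cell: names separated only by <br>.
cell = ET.fromstring(
    '<td><font><a href="#">Carl Weathers</a><br/>'
    'Craig T. Nelson<br/><a href="#">Sharon Stone</a></font></td>'
)

# Plain text extraction glues the names together, like string() does.
plain = "".join(cell.itertext())

# Put the marker at the start of each <br> tail, then split on it.
for br in cell.iter("br"):
    br.tail = MARKER + (br.tail or "")
names = "".join(cell.itertext()).split(MARKER)

print(plain)
print(names)
```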

Google AJAX Transliteration API: Is it possible to make all input fields in the page transliteratable?

I've used "Google AJAX Transliteration API" and it's going well with me.
http://code.google.com/apis/ajaxlanguage/documentation/referenceTransliteration.html
I currently have a project in which all input fields on every page (input and textarea tags) need to be transliteratable, and these input fields differ from page to page (they are dynamic).
As far as I know, I have to call the makeTransliteratable(elementIds, opt_options) method of the API to define which input fields to make transliteratable, and in my case I can't predefine those fields manually. Is there a way to achieve this?
Thanks in advance
Rephrasing what you are asking for: you would like to collect all the inputs on the page that match certain criteria, and then pass them into an API.
A quick look at the API reference says that makeTransliteratable will accept an array of id strings or an array of elements. Since we don't know the ids of the elements before hand, we shall pass an array of elements.
So, how to get the array of elements?
I'll show you two ways: a hard way and an easy way.
First, to get all of the text areas, we can do that using the document.getElementsByTagName API:
var textareas = document.getElementsByTagName("textarea");
Getting the list of inputs is slightly harder, since we don't want to include checkboxes, radio buttons etc. We can distinguish them by their type attribute, so let's write a quick function to make that distinction:
function selectElementsWithTypeAttribute(elements, type)
{
    var results = [];
    for (var i = 0; i < elements.length; i++)
    {
        if (elements[i].getAttribute("type") == type)
        {
            results.push(elements[i]);
        }
    }
    return results;
}
Now we can use this function to get the inputs, like this:
var inputs = document.getElementsByTagName("input");
var textInputs = selectElementsWithTypeAttribute(inputs, "text");
Now that we have references to all of the text boxes, we can concatenate them into one array, and pass that to the api:
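// getElementsByTagName returns an HTMLCollection, not an array, so convert it before concatenating
var allTextBoxes = Array.prototype.slice.call(textareas).concat(textInputs);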
makeTransliteratable(allTextBoxes, /* options here */);
So, this should all work, but we can make it easier with judicious use of library methods. If you were to download jQuery (google it), then you could write this more compact code instead:
var allTextBoxes = $("input[type='text'], textarea").toArray();
makeTransliteratable(allTextBoxes, /* options here */);
This uses a CSS selector to find all of the inputs with a type attribute of "text", and all textareas. There is a handy toArray method which puts all of the inputs into an array, ready to pass to makeTransliteratable.
I hope this helped,
Douglas

Resources