<div class="section-body" id="section-2"><p>Most people with aortic stenosis do not develop symptoms until the disease is advanced. The diagnosis may have been made when the health care provider heard a heart murmur and performed tests.</p><p>Symptoms of aortic stenosis include:</p><ul><li>Chest discomfort: The chest pain may get worse with activity and reach into the arm, neck, or jaw. The chest may also feel tight or squeezed.</li><li>Cough, possibly bloody.</li><li>Breathing problems when exercising.</li><li>Becoming easily tired.</li><li>Feeling the heartbeat (palpitations).</li><li>Fainting, weakness, or dizziness with activity.</li></ul><p>In infants and children, symptoms include:</p><ul><li>Becoming easily tired with exertion (in mild cases)</li><li>Failure to gain weight</li><li>Poor feeding</li><li>Serious breathing problems that develop within days or weeks of birth (in severe cases)</li></ul><p>Children with mild or moderate aortic stenosis may get worse as they get older. They are also at risk for a heart infection called bacterial endocarditis.</p></div></div></section>
I have the markup above and I want to scrape the data in the list, i.e. in
I have tried the following commands in Scrapy, but they are not working. I get '[]' as output.
response.css("article div.section-body p").extract() <-- this returns everything under every section-body, but I want only what is under section-2
response.css("article div.section-body.section-2 p::text").extract()
response.xpath("//article/*[contains(#id, 'setion-2')]").extract()
Please help me extract this. Thanks.
Try
response.css("article div.section-body#section-2 p::text").extract()
div.section-body#section-2 selects a div that has both the class section-body and the ID section-2.
Note that an ID is selected with # and a class with a dot (.). The selector in your question, div.section-body.section-2, looks for a class named section-2, which does not exist, so it matched nothing.
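The XPath attempt in the question fails for a similar reason: XPath refers to an attribute with @id, not #id. Here is a minimal sketch of that syntax using Python's standard-library ElementTree instead of Scrapy, so it runs without extra packages. The two-section document is made up for illustration. In Scrapy, the equivalent expression should be response.xpath("//div[@id='section-2']//p/text()").

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the page in the question.
html = """
<article>
  <div class="section-body" id="section-1"><p>First section.</p></div>
  <div class="section-body" id="section-2"><p>Second section.</p><p>More text.</p></div>
</article>
""".strip()

root = ET.fromstring(html)

# Attributes are addressed with @ in XPath: @id, not #id.
paragraphs = [p.text for p in root.findall(".//div[@id='section-2']/p")]
print(paragraphs)
```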
I'm wondering about the performance of Firebase when making n + 1 queries. Let's consider the example in this article https://www.firebase.com/blog/2013-04-12-denormalizing-is-normal.html where a link has many comments. If I want to get all of the comments for a link I have to:
Make 1 query to get the index of comments under the link
For each comment ID, make a query to get that comment.
Here's the sample code from that article that fetches all comments belonging to a link:
var commentsRef = new Firebase("https://awesome.firebaseio-demo.com/comments");
var linkRef = new Firebase("https://awesome.firebaseio-demo.com/links");
var linkCommentsRef = linkRef.child(LINK_ID).child("comments");
linkCommentsRef.on("child_added", function(snap) {
  commentsRef.child(snap.key()).once("value", function() {
    // Render the comment on the link page.
  });
});
I'm wondering if this is a performance concern compared to the equivalent with a SQL database, where I could make a single query on comments: SELECT * FROM comments WHERE link_id = LINK_ID.
Imagine I have a link with 1000 comments. In SQL this would be a single query, but in Firebase this would be 1001 queries. Should I be worried about the performance of this?
One thing to keep in mind is that Firebase works over web sockets (where available), so while there may be 1001 round trips, only one connection needs to be established. Also, a lot of the round trips will happen in parallel. So you might be surprised at how little time this takes.
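To get a feel for why parallel round trips matter, here is a rough simulation in Python. It does not talk to Firebase: fetch_comment is a hypothetical stand-in that sleeps for an assumed 10 ms latency. Issued one after another, 1000 such lookups would take about 10 seconds; issued concurrently, they finish in roughly one latency period plus overhead.

```python
import asyncio
import time

LATENCY = 0.01  # assumed round-trip time per lookup, in seconds


async def fetch_comment(comment_id):
    # Hypothetical stand-in for one lookup; only the waiting is simulated.
    await asyncio.sleep(LATENCY)
    return {"id": comment_id, "text": f"comment {comment_id}"}


async def fetch_all(comment_ids):
    # Start every lookup at once and wait for all of them together.
    return await asyncio.gather(*(fetch_comment(cid) for cid in comment_ids))


start = time.perf_counter()
comments = asyncio.run(fetch_all(range(1000)))
elapsed = time.perf_counter() - start
print(len(comments), f"{elapsed:.3f}s")
```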
Should I worry about this?
In general people over-estimate the amount of use they'll get. So (again: in general) I recommend that you don't worry about it until you actually have that many comments. But from day 1, ensure that nothing you do today precludes optimizing later.
One way to optimize is to further denormalize your data. If you already know that you need all comments every time you render an article, you can also consider duplicating the comments into the article.
A fairly common scenario:
/users
  twitter:4784
    name: "Frank van Puffelen"
    otherData: ....
/messages
  -J4377684
    text: "Hello world"
    uid: "twitter:4784"
    name: "Frank van Puffelen"
  -J4377964
    text: "Welcome to StackOverflow"
    uid: "twitter:4784"
    name: "Frank van Puffelen"
So in the above data snippet I store both the user's uid and their name with every message. While I could look up the name from the uid, having the name in the message means I can display the messages without the lookup. I'm also keeping the uid, so that I can provide a link to the user's profile page (or to their other messages).
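The trade-off can be sketched with plain Python dictionaries standing in for the database. The render and rename_user helpers and the /users/... link format are hypothetical, chosen only for illustration. Rendering needs no lookup in users, but a rename has to touch every copy of the name.

```python
users = {"twitter:4784": {"name": "Frank van Puffelen"}}
messages = {
    "-J4377684": {"text": "Hello world", "uid": "twitter:4784",
                  "name": "Frank van Puffelen"},
    "-J4377964": {"text": "Welcome to StackOverflow", "uid": "twitter:4784",
                  "name": "Frank van Puffelen"},
}


def render(message):
    # Everything needed is already on the message; no users lookup.
    return f'{message["name"]}: {message["text"]} (/users/{message["uid"]})'


def rename_user(uid, new_name):
    # The cost of denormalizing: every duplicated name must be updated.
    users[uid]["name"] = new_name
    for message in messages.values():
        if message["uid"] == uid:
            message["name"] = new_name


rendered = [render(m) for m in messages.values()]
print(rendered)
```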
We recently had a good question about this, where I wrote more about the approaches I consider for keeping the derived data up to date: How to write denormalized data in Firebase
I've got three models: Decks, Slots and Cards. I put them together like so...
Decks are made of many slots, each slot contains one card and any one card can show up in a number of different slots.
I modeled it after the "Order - Order Line Item - Product" structure, hope that makes sense.
Anyways, Decks have an integer field called :deck_type, and suppose I want to get all of the decks of a certain type and then see all of their cards. I EXPECT to be able to run this query but I get an error of undefined method 'cards':
Deck.where(:deck_type => 1).cards
To get all decks of type 1 and then spit out their cards. I have an association established of "deck has many cards through slots", and when I call ".cards" on a single deck it works fine to return the cards.
I feel like this should be a pretty basic query - what am I missing?
Thanks in advance for any insight.
The method cards is for one deck only. So the following should work:
Deck.where(deck_type: 1).first.cards
The first will fetch 1 deck.
If you want cards that belong to decks with deck_type 1, then you've got a few options:
Deck.where(deck_type: 1).map(&:cards).flatten.uniq
That will apply the cards method on each found deck and then get all cards. The flatten will make the results into a 1D array and then uniq will ensure that no duplicates are present, if any.
But the following might be faster:
deck_ids = Deck.where(deck_type: 1).pluck(:id)
Card.where(deck_id: deck_ids)
I think it's safe to assume your Card model has a deck_id attribute. From the above, you will fetch only those cards that have deck_id in the deck_ids variable.
Even better, however, would be the following, as it is a single database query. Assuming you have the right associations set up, you can do:
# replace 'decks' with Deck.table_name if necessary
Card.joins(:deck).where(decks: {deck_type: 1})
I hope that last one is self-explanatory.
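For readers who want to see what a single-query approach looks like at the SQL level, here is a sketch using Python's built-in sqlite3 module. It assumes the "deck has many cards through slots" layout from the question, so the join goes through a slots table. The table names, column names, and sample rows are assumptions, not taken from the asker's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE decks (id INTEGER PRIMARY KEY, deck_type INTEGER);
CREATE TABLE cards (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE slots (id INTEGER PRIMARY KEY, deck_id INTEGER, card_id INTEGER);
INSERT INTO decks VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO cards VALUES (1, 'Ace'), (2, 'King'), (3, 'Queen');
INSERT INTO slots (deck_id, card_id) VALUES (1, 1), (1, 2), (2, 2), (3, 3);
""")

# One query: join cards to decks through slots, filter on deck_type,
# and use DISTINCT because a card can sit in several slots.
rows = conn.execute("""
SELECT DISTINCT cards.id, cards.name
FROM cards
JOIN slots ON slots.card_id = cards.id
JOIN decks ON decks.id = slots.deck_id
WHERE decks.deck_type = ?
ORDER BY cards.id
""", (1,)).fetchall()
print(rows)
```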
I'm tinkering with Prolog and ran into the following problem.
Assume I want to create a little knowledge base about the courses of a university.
I need the following two relation schemes:
relation scheme for lecturer: lecturer(Name,Surname)
relation scheme for course: course(Topic,Lecturer,Date,Location).
I have a lecturer, John Doe:
lecturer(doe,john).
John Doe teaches the complexity class:
course(complexity,lecturer(doe,john),monday,roomA).
Now I have a redundancy in the information - not good!
Is there any way to achieve something like this:
l1 = lecturer(doe,john).
course(complexity,l1,monday,roomA).
Many thanks in advance!
The same normalization possibilities as in databases apply:
id_firstname_surname(1, john, doe).
and:
course_day_room_lecturer(complexity, monday, 'A', 1).
That is, we have introduced a unique ID for each lecturer, and use that to refer to the person.
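The same idea, sketched in Python for comparison: lecturers are stored once under an ID, and each course holds only that ID. The dictionary layout and the lecturer_of helper are illustrative assumptions.

```python
# Each lecturer is stored exactly once, keyed by a unique ID.
lecturers = {1: {"first_name": "john", "surname": "doe"}}

# Courses refer to the lecturer by ID instead of repeating the name.
courses = [
    {"topic": "complexity", "day": "monday", "room": "A", "lecturer_id": 1},
]


def lecturer_of(topic):
    # Resolve the ID back to the lecturer's name when it is needed.
    for course in courses:
        if course["topic"] == topic:
            person = lecturers[course["lecturer_id"]]
            return (person["first_name"], person["surname"])
    return None


print(lecturer_of("complexity"))
```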
For university exam revision, I came across a past paper question with a Prolog database with the following structures:
% The structure of a media production team takes the form
% team(Producer, Core_team, Production_assistant).
% Core_team is an arbitrarily long list of staff structures,
% but excludes the staff structures for Producer
% and Production_assistant.
% staff structures represent employees and take the form
% staff(Surname,Initial,file(Speciality,Grade,CV)).
% CV is an arbitrarily long list of titles of media productions.
team(staff(lyttleton,h,file(music,3,[my_music,best_tunes,showtime])),
[staff(garden,g,file(musical_comedy,2,[on_the_town,my_music])),
staff(crier,b,file(musical_comedy,2,[on_the_town,best_tunes]))],
staff(brooke-taylor,t,file(music,2,[my_music,best_tunes]))).
team(staff(wise,e,file(science,3,[horizon,frontiers,insight])),
[staff(morcambe,e,file(science,3,[horizon,leading_edge]))],
staff(o_connor,d,file(documentary,2,[horizon,insight]))).
team(staff(merton,p,file(variety,2,[showtime,dance,circus])),
[staff(smith,p,file(variety,1,[showtime,dance,circus,my_music])),
staff(hamilton,a,file(variety,1,[dance,best_tunes]))],
staff(steaffel,s,file(comedy,2,[comedians,my_music]))).
team(staff(chaplin,c,file(economics,3,[business_review,stock_show])),
[staff(keaton,b,file(documentary,3,[business_review,insight])),
staff(hardy,o,file(news,3,[news_report,stock_show,target,now])),
staff(laurel,s,file(economics,3,[news_report,stock_show,now]))],
staff(senate,m,file(news,3,[business_review]))).
One of the rules I have to write is the following:
Return the initial and surname of any producer whose team includes 2
employees whose CVs include a production entitled ‘Now’.
This is my solution:
recurseTeam([],0).
recurseTeam([staff(_,_,file(_,_,CV))|List],Sum):-
    member(now,CV),
    recurseTeam(List,Rest),
    Sum is Rest + 1.
query(Initial,Surname):-
    team(staff(Surname,Initial,file(Speciality,Grade,CV)),Core_team,Production_assistant),
    recurseTeam([staff(Surname,Initial,file(Speciality,Grade,CV)),Production_assistant|Core_team],Sum),
    Sum >= 2.
The logic here is that a recursive predicate takes each staff member in turn, and a match is found only if the CV list contains the production 'now'. As you can see, it should return the Initial and Surname of a Producer if at least 2 employees' CVs contain the 'now' production.
So, at least as far as I can see, it should return the c,chaplin producer, right? Because this team has staff members whose CVs contain the 'now' production.
But when I query it, e.g.
query(Initial,Surname).
It returns 'false'.
When I remove the "member(now,CV)" goal, it successfully returns all four producers. So it would seem the issue lies with this rule. member is the built-in predicate for querying the contents of lists, and 'CV' is the list that is contained within the file structure of a staff structure.
Any ideas why this isn't working as I had expected?
Any suggestions on what else I could try here?
You need one more clause for the recurseTeam predicate, namely for the case that the first argument is a non-empty list whose first element is a staff structure with a CV that does not contain now.
In the current version, recurseTeam simply fails as soon as it encounters such an element in the list.
One possible solution is to add the following third clause for recurseTeam:
recurseTeam([staff(_,_,file(_,_,CV))|List],Sum):-
    \+ member(now,CV),
    recurseTeam(List,Sum).
Alternatively, one can use a cut ! in the second recurseTeam clause after member(now,CV) and drop \+ member(now,CV) in the third clause. This is more efficient, since it avoids calling member(now,CV) twice. (Note, however, that this is a red cut – the declarative and the operational semantics of the program are no longer the same. Language purists may find this disturbing – "real programmers" don't care.)
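To make the three cases explicit, here is the same counting logic sketched in Python rather than Prolog. Staff members are modelled as nested tuples mirroring staff(Surname,Initial,file(Speciality,Grade,CV)), and only two of the four teams are reproduced to keep it short. The count_now name and the tuple layout are assumptions made for this illustration.

```python
def count_now(staff_list):
    # Case 1: empty list, nothing to count.
    if not staff_list:
        return 0
    (_surname, _initial, (_speciality, _grade, cv)), *rest = staff_list
    # Case 2: this CV contains 'now', so count it.
    if "now" in cv:
        return count_now(rest) + 1
    # Case 3 (the clause missing from the question): skip and carry on.
    return count_now(rest)


# (producer, core_team, production_assistant), as in team/3.
teams = [
    (("wise", "e", ("science", 3, ["horizon", "frontiers", "insight"])),
     [("morcambe", "e", ("science", 3, ["horizon", "leading_edge"]))],
     ("o_connor", "d", ("documentary", 2, ["horizon", "insight"]))),
    (("chaplin", "c", ("economics", 3, ["business_review", "stock_show"])),
     [("keaton", "b", ("documentary", 3, ["business_review", "insight"])),
      ("hardy", "o", ("news", 3, ["news_report", "stock_show", "target", "now"])),
      ("laurel", "s", ("economics", 3, ["news_report", "stock_show", "now"]))],
     ("senate", "m", ("news", 3, ["business_review"]))),
]

producers = [
    (producer[1], producer[0])  # (initial, surname)
    for producer, core_team, assistant in teams
    if count_now([producer, assistant, *core_team]) >= 2
]
print(producers)
```

Without the third case, the function would have nothing to return for a staff member whose CV lacks 'now', which corresponds to the Prolog predicate failing outright.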