Xpath text extraction between 2 keywords - xpath

Is there an XPath expression that returns the text found between 2 keywords?
For example we have a span like the following:
<b>Specialty: </b>PO<br/><b>Job Function: </b>RN<br/><br/><b>Qualifications/Duties</b><br/>Texas Health Presbyterian Allen is currently in search of a Registered Nurse to help meet the growing needs of our Day Surgery Department to work PRN in Day Surgery and also float to PACU.<br/><br/><b>Basic Qualifications:</b><br/><br/>*Graduate of an accredited school of nursing<br/>*Valid RN license in the state of Texas<br/>*BLS<br/>*ACLS<br/>*PALS within 6 months of hire<br/>*Minimum of 1 - 3 years experience as RN in Day Surgery, PACU, Outpatient Surgery, or ICU<br/>*Strong organizational skills and ability to function in a fast paced work environment<br/>*Ability to accept responsibility and show initiative to work without direct supervision<br/>*A high degree of confidentiality, positive interpersonal skills and ability to function in a fast-paced environment<br/><br/><b>Preferred Qualifications:</b><br/><br/>*Three years RN experience in Outpatient Surgery along with some ICU experience.<br/>*PALS<br/>*PACU , Endoscopy or Ambulatory setting<br/>*IV Conscious Sedation<br/><br/><b>Hours/Schedule:</b><br/><br/>*Variable<br/><br/>J2WPeriop<br/><br/><b>Entity Information</b><br/>Texas Health Presbyterian Hospital Allen is a 73-bed, acute-care hospital serving the northern Collin County area since 2000. Hospital services include women’s care, a Level II neonatal intensive care unit, orthopedics, pediatrics, wound care and sleep medicine. Texas Health Allen, a Pathway to Excellence® designated hospital by the American Nurses Credentialing Center, has more than 500 physicians on its medical staff practicing in more than 25 specialties. Texas Health Allen is a World Health Organization-designated "Baby-Friendly Hospital" and was the first hospital in Texas to receive the distinction. 
The hospital is a Level IV trauma center and an Accredited Chest Pain Center by the Society of Chest Pain Centers, which makes our facility intensely qualified to serve our community and your professional aspirations.<br/>
I would like to know if we can define an XPath expression that returns all the text between 2 keywords, say "Qualifications/Duties" and "Entity Information".

Yes, but don't expect nicely formatted output, because the markup is messy. The expression may also need slight tweaks depending on whether you want the text of nodes such as "Basic Qualifications:" as well. This version skips them and only takes "naked" text nodes, i.e. text that is not wrapped in its own element.
//text()[preceding-sibling::*[text()='Qualifications/Duties'] and following-sibling::*[text()='Entity Information']]
And it means:
//text()
SELECT EVERY TEXT NODE
[
THAT
preceding-sibling::*[text()='Qualifications/Duties']
IS PRECEDED BY A NODE WITH TEXT = "Qualifications/Duties"
and following-sibling::*[text()='Entity Information']
AND FOLLOWED BY A NODE WITH TEXT = "Entity Information"
]
the output for your example:
Texas Health Presbyterian Allen is currently in search of a Registered Nurse to help meet the growing needs of our Day Surgery Department to work PRN in Day Surgery and also float to PACU.
*Graduate of an accredited school of nursing
*Valid RN license in the state of Texas
*BLS
*ACLS
*PALS within 6 months of hire
*Minimum of 1 - 3 years experience as RN in Day Surgery, PACU, Outpatient Surgery, or ICU
*Strong organizational skills and ability to function in a fast paced work environment
*Ability to accept responsibility and show initiative to work without direct supervision
*A high degree of confidentiality, positive interpersonal skills and ability to function in a fast-paced environment
*Three years RN experience in Outpatient Surgery along with some ICU experience.
*PALS
*PACU , Endoscopy or Ambulatory setting
*IV Conscious Sedation
*Variable
J2WPeriop
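If no XPath engine is at hand, the same idea (keep the bare text that appears after one marker element and before another) can be sketched with Python's standard library. This is not XPath and not part of the answer above: the class name BetweenKeywords and the helper text_between are made up for illustration, it works in document order rather than on sibling axes, and it only treats <br> as a void element.

```python
from html.parser import HTMLParser


class BetweenKeywords(HTMLParser):
    """Collect bare text nodes that appear after `start` and before `end`."""

    def __init__(self, start, end):
        super().__init__()
        self.start, self.end = start, end
        self.depth = 0        # > 0 while inside an element such as <b>
        self.active = False   # True once the start keyword has been seen
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag != "br":       # <br> never wraps text, so it does not nest
            self.depth += 1

    def handle_endtag(self, tag):
        if tag != "br":
            self.depth = max(0, self.depth - 1)

    def handle_data(self, data):
        text = data.strip()
        if self.depth > 0:
            # Text inside an element only switches collecting on or off.
            if text == self.start:
                self.active = True
            elif text == self.end:
                self.active = False
        elif self.active and text:
            self.texts.append(text)


def text_between(html_text, start, end):
    """Hypothetical helper: return the bare text nodes between two keywords."""
    parser = BetweenKeywords(start, end)
    parser.feed(html_text)
    parser.close()
    return parser.texts
```

Like the XPath expression, this skips text wrapped in its own element (for example "Basic Qualifications:"), so it would need the same kind of tweak if those headings are wanted.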

Related

Google Job Search API

Does google have an API for this feature?
https://www.google.com/search?q=product+manager+jobs&oq=product+manager+jobs+&aqs=chrome..69i57j0l4j69i60l3.5823j1j7&sourceid=chrome&ie=UTF-8&ibp=htl;jobs&sa=X&ved=2ahUKEwjPuIDJhebnAhWTqp4KHTXeCB0QiYsCKAB6BAgGEAM#htivrt=jobs&htidocid=2YjfCdSoJeXy_7nXAAAAAA%3D%3D&fpstate=tldetail
I'd like an API where I can pass a keyword and get back open jobs related to that keyword.
Right now Google does not have such an API. They only have an API for getting a job indexed so that it appears as a result. If you want to retrieve job results, you can use a third-party solution.
I work at SerpApi and we have an API for Google Jobs.
You can check the playground and documentation to get a better idea of how it works.
Here is a part of a response sample for an individual job listing:
"title": "Staff Product Manager",
"company_name": "BuzzFeed",
"location": "New York, NY",
"via": "via Greenhouse.io",
"description": "The Role\n\nWe’re looking for an experienced product manager who is eager to help us drive retention and loyalty across our core BuzzFeed Products. You’ll be the product lead on a cross-functional team of engineers, designers, and data scientists that are focused on creating a differentiated and compelling site experience that our community of users will love.\n\nWhat You'll Do\n• Develop a strategy for driving retention and loyalty across our products by working with your other team leads and partners throughout the entire organization\n• Your purview over the core site experience would span BuzzFeed.com as well as our Google AMP, Facebook Instant Articles, and Apple News pages.\n• You would closely collaborate with our app team with the potential (but not requirement) to also manage an additional product manager\n• You would be responsible for thinking through how people are interacting with and coming to our various pages and empowered to create a strategy for the best way to drive... 
retention and a deeper level of engagement\n• You would be expected to help set the team strategy, prioritize the team’s work, write OKRs for individual products or projects, and communicate and coordinate the team’s plans with stakeholders within and outside of Tech\n\nYou Are\n• Experienced: You have hands-on experience launching and managing digital products, with a slant towards a consumer experience and how that drives the business and 5+ years of product management experience\n• Collaborative and communicative: You have a demonstrated ability to work well and communicate with engineers, designers, and data scientists -- experience working with business and editorial stakeholders is a plus\n• Curious, analytical, and proactive: You don’t merely accept outliers in data, but actively investigate and dive into the numbers and research to find novel insights and new product ideas\n• Comfortable in the spotlight: You will need to collaborate with and help influence senior leaders across the company to advance the team’s vision and product strategy in a fairly complex problem space\n\nA few examples of current team projects\n• Launching a new user profile experience\n• Creating new ways to reward engagement and establish habitual user behavior\n• Improving our notifications system\n• Improvements to the quiz taking experience\n• Creating ways to subscribe to topics to improve and personalize the site experience\n\nAbout BuzzFeed Tech\n\nBuzzFeed Tech is a group of about 150 product managers, engineers, data scientists, and designers that are focused on building great products and content experiences that bring our audience joy and truth. We’re a collaborative and friendly bunch that works in a typical agile way with sprints, JIRA, OKRs, and a close working relationship with management.\n\nLife at BuzzFeed\n\nAt BuzzFeed, we believe our work benefits from the diverse perspectives of our employees. 
As such, BuzzFeed celebrates inclusion and is committed to equal opportunity employment. At BuzzFeed, you can expect:\n• A supportive, inclusive atmosphere on a team that values your contributions\n• Opportunities for personal and professional growth via work experience, offerings from our in-house Learning # BuzzFeed team, our Employee Resource Groups, and more\n• An attractive and equitable compensation package, including salary and stock options\n• A generous and well-rounded benefits program featuring PTO, unlimited sick time, comprehensive medical benefits, a family leave policy, access to mental health platforms, retirement plans, gym and wellness discounts, and much more\n• Plenty of snacks (healthy and indulgent), catered lunches, beverages, etc..\n\nBuzzFeed is the world’s leading tech-powered media company, with a cross-platform news and entertainment network that reaches hundreds of millions of people globally. The company aims to spread truth and joy across the internet by producing articles, lists, quizzes, videos, original series; lifestyle content through brands including Tasty, the world’s largest social food network; original reporting and investigative journalism through BuzzFeed News; strategic partnerships, licensing and product development through BuzzFeed Marketing; and original productions across broadcast, cable, SVOD, film and digital platforms for BuzzFeed Studios.\n\nBuzzFeed is proud to be an equal opportunity workplace. All qualified applicants will receive consideration for employment without regard to, and will not be discriminated against based on age, race, gender, color, religion, national origin, sexual orientation, gender identity, veteran status, disability or any other protected category",
"extensions": [
"Over 1 month ago",
"Full-time"
]
},

How to support dynamically growing list of BusinessEvents

This applies to LUIS (MS Language Understanding)
Want to handle an utterance in the following format
"I met [a-PersonName] at [a-BusinessEvent] in [a-TimeReference]"
Sample utterances might be
I met Jane Allan at the Product Management Meetup in January
I met James at MS Build in April 2017
I met Lily Tomlin at Learning UX Meeting in June 2018
The challenge is that [a-BusinessEvent] (the bold bits) will grow over time. Sure there are a couple of recurring things such as MSBuild or Apple WWDC but over time I'll want to have the users extend the list of BusinessEvents available. (imagine having a voice interface that would allow 'add new event called seattle chatbot meetup').
Should this be a list? or something else?
Are there any examples I could learn from?
Thank you
If I understand your question correctly, you want to allow free-form event names and be able to extract the event name entity from the utterance consistently.
If so, you might want to take a look at the "Pattern.any" entity. It lets you extract data from utterances that are well-formatted but where the end of the data may easily be confused with the remaining words of the utterance.
Once the Pattern.any entity is set up, you'll have to add patterns that use it in order to improve accuracy.
Visit this documentation for more information: https://learn.microsoft.com/en-us/azure/cognitive-services/luis/luis-tutorial-pattern-any
For example, in your case:
I met Jane Allan at the "pattern.any" in January
I met James at "pattern.any" in April 2017
I met Lily Tomlin at "pattern.any" in June 2018
And finally, create a few patterns to improve accuracy.
I met {PersonName} at the {EventName} in {DateTime}
I met {PersonName} at {EventName} in {DateTime}
I met {PersonName} at {EventName} in {DateTime}

Search for the word and exporting 35 characters after that word using shell script

I have a file input.txt which has loads of weird characters, HTML tags and useful material. I want to write the 35 characters that follow the word "description" to a new file output.txt, excluding weird characters like $&lmp and without HTML tags.
Input sample:
</image>
<title>A Londoner Looks Back: Were The Olympics Awesome?</title>
<link>http://www.askmen.com/sports/fanatic/london-olympics-post-mortem.html</link>
<description rdf:parseType="Literal">
The other evening I walked out of London&rsquo;s <a
href="http://www.askmen.com/fashion/watch_100/135_olympic-watches.html">Olympic
stadium onto the new &ldquo;Javelin&rdquo; train into town. (The journey from east to
central London, quite recently still something of a commuter&rsquo;s nightmare, took just
six minutes.) A railway worker on the platform didn&rsquo;t just point everyone the way
onto the train; he did a dance for us. You don&rsquo;t usually get that on London
transport. These Olympics made the city happier.I now live in Paris, but I
consider myself a Londoner. I went to nursery school in London, spent 15 years of my life
in the city, speak in a London accent, visit my parents and siblings here, and, as someone
of mongrel origin who belongs nowhere, I feel at home in the world&rsquo;s most
cosmopolitan city. To steal a line from the 1980s film Sammy and Rosie Get Laid:
&ldquo;I&rsquo;m not English. I&rsquo;m a Londoner.&rdquo; But London is also a sprawling,
gray, wet, overpriced city where traveling anywhere always seems to take forever, and
Londoners are not positive people. In fact, we are whiners. Going into the
Olympics, the whining was at full blast. Landing in London days before <a
href="http://www.askmen.com/sports/bodybuilding/olympic-bobsledding.html">the Games
began, I found my friends and family full of dread. The Games&rsquo; organizers had
indicated that while the Olympics were on, traveling anywhere would take even longer than
forever. My sister had been told to be at her desk at 7 a.m. during the Games to avoid the
rush hour -- this in a city where many people start work nearer to 10 a.m. A friend showed
me a kind of war scenario prepared by the bank where he worked, full of ominous questions
like, &ldquo;What if your supply chain stopped?&rdquo; &ldquo;What if your technology
failed?&rdquo; &ldquo;What if your brand, image and reputation were impacted by any of the
above?&rdquo; And what was all this upheaval in aid of? To watch some doped-up
moustachioed Eastern European women win incomprehensible weightlifting events? In a YouGov
survey days before the opening ceremony, only 51% of Britons expressed an interest in the
Olympics -- and that was a lot better than earlier surveys.On the day of the
opening ceremony I happened to have a meeting down the street from my last <a
href="http://www.askmen.com/london/">London address (a shared flat above a now defunct
liquor store). I ran to Baker Street tube, as I&rsquo;d done a thousand times before. Then
I got on a media bus to the opening ceremony that passed Southwark Bridge with the
Financial Times building where I had worked in the 1990s. It was like a dream:
You move through a familiar landscape that has been transformed. The Olympics helped me
see London afresh.It was during the opening ceremony that the mood among
Londoners changed. I know foreigners didn&rsquo;t get all the references: the Windrush
ship that brought the first Jamaican immigrants to Britain in 1948, the BBC weather
forecaster Michael Fish assuring us there would be no hurricane the night before one
struck in 1987, the dance of the state-funded National Health Service nurses. But Londoners got it. Danny Boyle, the director, gave us a multicultural and funny
Britain that had finally shed its imperial delusions of grandeur. The Olympic torch was
run into the stadium not by an Aryan superman but by the pot-bellied middle-aged ex-rower
Steve Redgrave, who can&rsquo;t run. For the first time in my life, Boyle&rsquo;s Britain
made me feel a patriot. The opening ceremony remains the highlight of my Olympics.Then the sports began, and with it the instinctive expectation that the Brits
would fall flat on their faces. We may have invented modern sports, but England&rsquo;s
soccer team hasn&rsquo;t won a prize since 1966, and no British man has won Wimbledon
since 1936. Surely our Olympians would continue the tradition?&nbsp;It seemed
so on the first day, when Britain&rsquo;s much-hyped male cyclists failed to win a medal
or even to figure in the run-in in front of Buckingham Palace. Only on the second day did
our first medal arrive: A silver for cyclist Lizzie Armitstead, a polite young vegetarian
from the rural north so little-known that at the press conference she had to introduce
herself to the nation. &ldquo;I could never get my head around eating corpses,&rdquo; she
explained. On the fourth morning, Britain still had no golds. The more
excitable newspapers began demanding inquests. And then the golds came in a crazy rush,
won by a bunch of underpaid Britons of all colors whose frank delight was irresistible.
Above all, there was Mo Farah, the Somali-born runner, who had arrived in London&rsquo;s
suburbs as an eight-year-old barely able to speak English, and had really wanted to play
on the wing for Arsenal, but who won gold in the 10,000 and 5,000 meters instead. After
his first gold, an African journalist asked if he wouldn&rsquo;t rather have been running
for Somalia. &ldquo;Look, mate, this is my country,&rdquo; replied Farah. He was
Boyle&rsquo;s multicultural Britain. The second Saturday of the Olympics, when
Farah was among six Britons to win gold, was Britain&rsquo;s best sporting day since 1966.
It was our best single Olympic day since the Games were held in London in 1908. Of course,
we embarked on an orgy of patriotism. On BBC TV, the new &ldquo;British heroes&rdquo; were
feted much like &ldquo;heroes of the harvest&rdquo; on North Korean state TV. Foreigners
rightly accused the Britons of practically ignoring the other 200 nations. However,
that&rsquo;s what every country at the Olympics does. Each country watches its own Games.
</item>
<title>How Facial Hair Can Save You From Skin Cancer</title>
<link>http://www.askmen.com/sports/news/moustaches-and-skin-cancer.html</link>
<description rdf:parseType="Literal">
Output should be like this:
The other evening I walked out of London Olympic
stadium onto the new Javelin train into town. (The journey from east to
central London, quite recently still
If you thought moustaches were solely to distinguish regular males from porn stars and
hipsters, think again. A new study suggests that
I have tried:
sed 's/^.*<description>/<description>/
s/&lt;/</g
s/&gt;/>/g
s/&rsquo;/'"'"'/g
s/&ccedil;/c/g
s/<[^>]*>//g
s/^\(.\{35\}\).*/\1/' inputsample.txt
I don't believe it's possible with sed since sed doesn't understand XML entities. You need to use a programming language like Perl or Python for something like this.
The closest I can get you is:
$ sed -nE '/description/s/.*description(.{,35}).*>/\1/p' file_name
The -E means use extended regular expressions, so the {,35} interval works without backslashes. The -n suppresses automatic printing. I capture up to 35 characters after "description" and replace the whole line with them.
However, if the text contains any special entities, all bets are off.
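As the answer suggests, a scripting language handles the entities more comfortably. Here is a minimal Python sketch using only the standard library; the function name first_chars_after_description is made up for illustration, and stopping at </item> or </description> is an assumption about the feed layout. Note that it decodes entities (&rsquo; becomes ’) rather than deleting them, so the output differs slightly from the sample in the question.

```python
import html
import re


def first_chars_after_description(text, limit=35):
    """Return the first `limit` cleaned characters after each <description ...> tag."""
    results = []
    # Everything after each opening <description ...> tag is a candidate chunk.
    for chunk in re.split(r"<description[^>]*>", text)[1:]:
        # Assumption: a chunk ends at </description> or </item>, whichever comes first.
        chunk = re.split(r"</description>|</item>", chunk)[0]
        chunk = re.sub(r"<[^>]*>", "", chunk)   # drop tags, including multi-line <a href=...>
        chunk = html.unescape(chunk)            # decode entities such as &rsquo; and &ldquo;
        chunk = " ".join(chunk.split())         # collapse newlines and runs of whitespace
        results.append(chunk[:limit])
    return results


if __name__ == "__main__":
    with open("input.txt", encoding="utf-8") as src, \
         open("output.txt", "w", encoding="utf-8") as dst:
        for line in first_chars_after_description(src.read()):
            dst.write(line + "\n")
```

Reading the whole file at once keeps the regular expressions simple; for very large feeds a streaming XML parser would be a better fit.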

nokogiri: how to wrap html tags around given xpath elements?

I have an xpath to grab each text node which is not surrounded by any html tags. Instead, they are separated via <br>. I would like to wrap these with <span> tags.
Nokogiri::HTML(open("http://vancouver.en.craigslist.ca/van/swp/1426164969.html"))
.xpath("//br/following-sibling::text()|//br/preceding-sibling::text()").to_a
will return those text nodes.
Complete revised code below:
doc = Nokogiri::HTML(open("http://vancouver.en.craigslist.ca/van/swp/1426164969.html"))
.xpath("//br/following-sibling::text()|//br/preceding-sibling::text()").wrap("<span></span>")
puts doc
I expected to see a full html source code with those texts wrapped with <span> tags, but I got the following:
Date: 2009-10-17, 4:36PM PDT
Reply to:
This is a spectacular open plan 1000 sq. ft. loft is in a former Canada Post building. Upon entering the loft from the hallway you are amazed at where you have arrived.... a stunning, bright and fully renovated apartment that retains its industrial feel. The restoration of the interior was planned and designed by a famous Vancouver architect.
The loft is above a police station, so you're guaranteed peace and quite at any time of the day or night.
The neighborhood is safe and lively with plenty of restaurants and shopping. There's a starbucks across the street and plenty of other coffee shops in the area. Antique alley with its hidden treasures is one block away, as well as the beautiful mile long boardwalk. Skytrain station is one minute away (literally couple of buildings away). 15 minutes to Commercial drive, 20 minutes to downtown Vancouver and Olympic venues.
Apartment Features:
- Fully furnished
- 14 ft ceilings
- Hardwood floors
- Gas fireplace
- Elevator
- Large rooftop balcony
- Full Kitchen: Fully equipped with crystal, china and utensils
- Dishwasher
- Appliances including high-end juice maker, blender, etc.
- WiFi (Wireless Internet)
- Bathtub
- Linens & towels provided
- Hair dryer
- LCD Flat-screen TV with DVD player
- Extensive DVD library
- Music Library: Ipod connection
- Wii console with Guitar Hero, games
- Book and magazine library
- Non-smoking
We are looking to exchange for a place somewhere warm (California, Hawaii, Mexico, South America, Central America) or a place in Europe (UK, Italy, France).
Email for other dates and pictures of the loft.
Your doc variable is assigned the result of wrap, not the whole document. You should use:
doc = Nokogiri::HTML(open("http://vancouver.en.craigslist.ca/van/swp/1426164969.html"))
doc.xpath("//br/following-sibling::text()|//br/preceding-sibling::text()").wrap("<span></span>")
puts doc
Unfortunately that doesn't solve the problem, because Nokogiri places all the brs first and then all the spans with the text, like this:
<br><br><br><br><span>
text</span><span>
text</span><span>
text</span><span>
text</span>
But you can do it like this, replacing each text node in place:
doc = Nokogiri::HTML(open("http://vancouver.en.craigslist.ca/van/swp/1426164969.html"))
doc.search("//br/following-sibling::text()|//br/preceding-sibling::text()").each do |node|
node.replace(Nokogiri.make("<span>#{node.to_html}</span>"))
end
puts doc

Home loan calculation formula (algorithm)?

How does a bank calculate the payments on a home loan?
For example,
$1,000,000 at 5.00% over a 25 year period.
Monthly payment: $5,845.90
           Current Payment                       To Date
Payment ----------------------  ------------------------------------------
Number    Interest   Principal  Interest Paid  Principal Paid      Balance
1        $4,166.67   $1,679.23      $4,166.67       $1,679.23  $998,320.77
2        $4,159.67   $1,686.23      $8,326.34       $3,365.46  $996,634.54
3        $4,152.64   $1,693.26     $12,478.98       $5,058.72  $994,941.28
4        $4,145.59   $1,700.31     $16,624.57       $6,759.03  $993,240.97
5        $4,138.50   $1,707.40     $20,763.07       $8,466.43  $991,533.57
6        $4,131.39   $1,714.51     $24,894.46      $10,180.94  $989,819.06
7        $4,124.25   $1,721.65     $29,018.71      $11,902.59  $988,097.41
8        $4,117.07   $1,728.83     $33,135.78      $13,631.42  $986,368.58
9        $4,109.87   $1,736.03     $37,245.65      $15,367.45  $984,632.55
10       $4,102.64   $1,743.26     $41,348.29      $17,110.71  $982,889.29
I'm trying to do the same calculations in Excel, but I get different numbers...
The algorithms are well shown and discussed here (in Javascript) -- implement exactly the same algorithms in Excel's VBA, Javascript, Ruby, whatever, and you'll get pretty much the same results!-)
The magic words are "amortization schedule".
The difference you see in Excel probably has to do with the way the compound interest is calculated. Most banks add compound interest daily (gets them more money).
The wikipedia article has a nice example of the equation used by US banks. You can code that up.
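For reference, here is a minimal Python sketch of the standard fixed-payment amortization formula, payment = P * r / (1 - (1 + r)^-n), where r is the periodic rate and n the number of payments. It assumes monthly compounding at annual_rate / 12, which appears to be what the table in the question uses; a bank that compounds daily would give slightly different figures. The function names are made up for illustration.

```python
def monthly_payment(principal, annual_rate, years):
    """Fixed monthly payment for a fully amortizing loan, monthly compounding assumed."""
    r = annual_rate / 12      # periodic (monthly) interest rate
    n = years * 12            # total number of payments
    return principal * r / (1 - (1 + r) ** -n)


def amortization_schedule(principal, annual_rate, years, rows=3):
    """Return (number, interest, principal_part, balance) for the first `rows` payments."""
    payment = monthly_payment(principal, annual_rate, years)
    balance = principal
    schedule = []
    for number in range(1, rows + 1):
        interest = balance * annual_rate / 12   # interest accrued on the remaining balance
        principal_part = payment - interest     # the rest of the payment reduces the balance
        balance -= principal_part
        schedule.append((number, interest, principal_part, balance))
    return schedule


if __name__ == "__main__":
    print(f"Monthly payment: {monthly_payment(1_000_000, 0.05, 25):,.2f}")
    for number, interest, principal_part, balance in amortization_schedule(1_000_000, 0.05, 25):
        print(f"{number:>2}  {interest:>10,.2f}  {principal_part:>10,.2f}  {balance:>12,.2f}")
```

With the question's inputs ($1,000,000 at 5.00% over 25 years) the payment comes out within a cent of $5,845.90, and the first row should match the table. If Excel gives different numbers, check that the rate passed to it is the monthly rate (5%/12) and the period count is 300, not 25.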

Resources