tweepy streaming track filter results - filter

It seems not all the tweets I get using filter contain the item ("health" in this case). How could I get only tweets contain this specific item? Anyone can help me?
Thanks so much in advance!!
This is the line when I use filter:
sapi.filter(locations=[-79.55, 37.883, -75.067, 39.717],track = ["health"])

Unfortunately, the Streaming API does not allow filtering by both location and terms. From the docs:
Bounding boxes do not act as filters for other filter parameters. For example track=twitter&locations=-122.75,36.8,-121.75,37.8 would match any tweets containing the term Twitter (even non-geo tweets) OR coming from the San Francisco area.
So essentially the reason you are seeing some tweets that do not contain the word "health" is because you are receiving tweets containing the word "health", OR located within your bounding box (in this case, locations=[-79.55, 37.883, -75.067, 39.717]).
You can, however, try to filter by your term(s) then parse through the tweet data for the location, or alternately filter by location then search the tweet text for your term(s). I would probably suggest the latter if location is necessary to limit the scope of your tweet consumption.

It is very easy you just need to add this line in your code.
twitterStream.filter(track=["health"])

Related

IFTTT JavaScript filter - How to make case insensitive searches + How to search Include and Exclude sets of terms

First off I'm a total novice for Javascript, so please go gently. I'm aware of how people feel about having to now pay for IFTTT, but it's perfect for what I need.
I am using a more expansive version of this code below to capture certain keywords from Tweets to then generate emails if the search returns a positive result. This search works very nicely, except it is case sensitive which is a problem.
Yes, I know you can manipulate the twitter search to pick up specific words or phrases. I am very proficient in achieving searches this way. I am casting a wide net to pick up approx 120 search words or phrases which is too long to achieve through "OR" Twitter search parameters alone which is why I'm using this.
Q1 - I have tried adding item.toLowerCase() and just .toLowerCase() in various parts of the code so it wouldn't matter if the sentence case of the search term is different to that of the original tweet text case. I just can't get it to work though. I've seen various posts on here but I can't get any of them to work in IFTTT. I believe IFTTT doesn't accept REGEX either, which is annoying.
Any advice of how to get this code running so it's case-insensitive for text within IFTTT?
Q2 - I have approx 120 search terms for the tweet text to return positive results. There is a lot of junk that comes through with that. Does anyone know how to add a second layer of 'and exclude' search terms?
I have something like 300-400 words and specific phrases which would be used to stop the email from being triggered - so it'd be something like "IF tweet text contains a, b, c BUT text ALSO contains x, y, z... do not send the email"
let str=Twitter.newTweetFromSearch.Text;
let searchTerms=[
"Northbound",
"Westbound",
"Southbound",
"Eastbound"
]
let foundOne=0;
if(searchTerms.some(function(v){return str.indexOf(v)>=0;})){
foundOne=1;
}
if(foundOne==0){
Email.sendMeEmail.skip();
}
I have looked at the Twitter API, but that is a step too far for my coding ability which is why I'm using IFTTT.
Any help is very much appreciated
Thank you.
I'm playing with IFTTT Filter myself at the moment, so here are some thoughts about solving your solution.
If you want to do a case insensitive seatch on the original text, convert the original text to lowercase, then have all your search terms in lowercase.
Plus I think you want to iterate over the searchTerms array, and use the includes() method. Ok, just realised that .some() does the iteration for you, but I prefer includes() over indexof().
let str=Twitter.newTweetFromSearch.Text.toLowerCase();
let searchTerms=[
"northbound",
"westbound",
"southbound",
"eastbound"
]
let foundOne=0;
if(searchTerms.some(function(term){return str.includes(term);})){
foundOne=1;
}
if(foundOne==0){
Email.sendMeEmail.skip();
}
Or you could just skip having the foundOne variable, and do the search in the if() statement.
let str=Twitter.newTweetFromSearch.Text.toLowerCase();
let searchTerms=[
"northbound",
"westbound",
"southbound",
"eastbound"
]
if(!searchTerms.some(function(term){return str.includes(term);})){
Email.sendMeEmail.skip();
}

How to use the PubMed API to search for an article with an exact title?

I'm trying to use the PubMed API to search for articles with an exact title. As an example, I want to search for the title: The cost-effectiveness of mirtazapine versus paroxetine in treating people with depression in primary care.
I want up to 1000 results in JSON format, so I know that the first part of my URL should look like this:
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&retmode=json&retmax=1000&term=
How do I add a title search as a GET parameter?
I've been using the Pubmed advanced search constructor, and that suggests that the query should look like The cost-effectiveness of mirtazapine versus paroxetine in treating people with depression in primary care[Title].
But if I try just adding that to the URL term=, PubMed tries to break down the title into all kinds of peculiar queries:
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&retmode=json&retmax=1000&term=The%20cost-effectiveness%20of%20mirtazapine%20versus%20paroxetine%20in%20treating%20people%20with%20depression%20in%20primary%20care[Title]
How can I specify an exact title as a GET param?
Use field=title
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&retmode=json&retmax=1000&term=The%20cost-effectiveness%20of%20mirtazapine%20versus%20paroxetine%20in%20treating%20people%20with%20depression%20in%20primary%20care&field=title
Check out ESearch API for more information:
http://www.ncbi.nlm.nih.gov/books/NBK25499/#_chapter4_ESearch_
Use + instead of %20 (space).
For example:
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&retmode=json&retmax=1000&term=cost+effectiveness+of+mirtazapine[title]

Wiktionary/MediaWiki Search & Suffix Filtering

I'm building an application that will hopefully use Wiktionary words and definitions as a data source. In my queries, I'd like to be able to search for all Wiktionary entries that are similar to user provided terms in either the title or definition, but also have titles ending with a specified suffix (or one of a set of suffixes).
For example, I want to find all Wiktionary entries that contain the words "large dog", like this:
https://en.wiktionary.org/w/api.php?action=query&list=search&srsearch=large%20dog
But further filter the results to only contain entries with titles ending with "d". So in that example, "boarhound", "Saint Bernard", and "unleashed" would be returned.
Is this possible with the MediaWiki search API? Do you have any recommendations?
This is mostly possible with ElasticSearch/CirrusSearch, but disabled for performance reasons. You can still use it on your wiki, or attempt smart search queries.
Usually for Wiktionary I use yanker, which can access the page table of the database. Your example (one-letter suffix) would be huge, but for instance .*hound$ finds:
Afghan_hound
Bavarian_mountain_hound
Foxhound
Irish_Wolfhound
Mahound
Otterhound
Russian_Wolfhound
Scottish_Deerhound
Tripehound
basset_hound
bearhound
black_horehound
bloodhound
boarhound
bookhound
boozehound
buckhound
chowhound
coon_hound
coonhound
covert-hound
covert_hound
coverthound
deerhound
double-nosed_andean_tiger_hound
elkhound
foxhound
gazehound
gorehound
grayhound
greyhound
harehound
heckhound
hell-hound
hell_hound
hellhound
hoarhound
horehound
hound
limehound
lyam-hound
minkhound
newshound
nursehound
otterhound
powder_hound
powderhound
publicity-hound
publicity_hound
rock_hound
rockhound
scent_hound
scenthound
shag-hound
sighthound
sleuth-hound
sleuthhound
slot-hound
slowhound
sluthhound
smooth_hound
smoothhound
smuthound
staghound
war_hound
whorehound
wolfhound

I want to get nodes of parseTree

This is part of my code:
String sentence = "The system Does Not Require users to identify themselves to search for books according to certain criteria and to check the availability of a particular book. However to check out books, to check their respective book loan status, and to place holds on books that are already on loan, users must first identify themselves to the system.";
Parse topParses[] =ParserTool.parseLine(sentence, parser, /*numParses=*/ 3);
for (Parse parseTree: topParses){
parseTree.show();
How can I get verbs in the sentence? Please!
I mean, how can I get nodes of tree?
If only you need to get the verbs from the sentence , then POSTagger in opennlp is sufficient.All you have to do is to use a Opennlp tokenizer to get tokens in a array and feed it to the POSTaggerME.It will give you the corresponding POS tags..Then you can filter by tags for the Verb like VB, VBZ etc.
If you are looking for verb phrases then use the Chunker, if you need just verbs then use the POS tagger.
Check out this answer
How to extract the noun phrases using Open nlp's chunking parser

Google Places API query

I am using the Place Searches from the Google Places API and wanted to know how the JSON results are returned/ordered.
E.g. are the list of places returned in a random order within the results array; or are they returned in order of nearest distance to the specified location i.e. with the first result in the results array being the nearest place to the specified location?
This is determined by the rankby parameter, which lets you choose either prominence (best search result match) or distance.
More info: https://developers.google.com/maps/documentation/places/#PlaceSearchRequests
This has been added to the API, please use the 'rankby=distance' instead of 'radius' in your Place Search Request as described in the documentation here: https://developers.google.com/maps/documentation/places/#PlaceSearchRequests

Resources