1) Please explain what TextGetTargetedSentiment does.
2) Please provide a Java code snippet that calls TextGetTargetedSentiment.
EDIT
API info is at
http://www.alchemyapi.com/api/sentiment/textc.html#targetedsentiment
As answered by Zach below, the code snippet given by AlchemyAPI is:
AlchemyAPI_TargetedSentimentParams sentimentParams = new AlchemyAPI_TargetedSentimentParams();
sentimentParams.setShowSourceText(true);
doc = alchemyObj.TextGetTargetedSentiment("This car is terrible.", "car", sentimentParams);
System.out.print(getStringFromDocument(doc));
Result is
:
<totalTransactions>1</totalTransactions>
<language>english</language>
<text>This car is terrible.</text>
<docSentiment>
<score>-0.776261</score>
<type>negative</type>
</docSentiment>
If we change the statement to
"This car is superb."
Then result is
:
<totalTransactions>1</totalTransactions>
<language>english</language>
<text>This car is superb.</text>
<docSentiment>
<score>0.695491</score>
<type>positive</type>
</docSentiment>
TextGetTargetedSentiment finds the sentiment for a specific keyword within a text. This can be contrasted with document level sentiment (the endpoint TextGetTextSentiment), which looks at the whole text to determine sentiment.
The AlchemyAPI Java SDK can help you get up and running quickly with the targeted sentiment call.
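To make the contrast between document-level and targeted sentiment concrete, here is a toy Python sketch. It is not AlchemyAPI's algorithm and does not call the service: the two-word lexicon, the clause splitting on "but" and punctuation, and all function names are assumptions made purely for illustration.

```python
import re

# Toy lexicon (an assumption for illustration; real services use trained models).
LEXICON = {"superb": 1.0, "terrible": -1.0}


def _words(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def _score(words):
    """Sum the lexicon scores of the given words (unknown words count as 0)."""
    return sum(LEXICON.get(word, 0.0) for word in words)


def document_sentiment(text):
    """Document-level sentiment: score every word in the text."""
    return _score(_words(text))


def targeted_sentiment(text, target):
    """Targeted sentiment: score only the clauses that mention the target."""
    clauses = re.split(r"\bbut\b|[.;,]", text.lower())
    total = 0.0
    for clause in clauses:
        words = _words(clause)
        if target.lower() in words:
            total += _score(words)
    return total


text = "The car is superb but the service is terrible."
print(document_sentiment(text))             # 0.0
print(targeted_sentiment(text, "car"))      # 1.0
print(targeted_sentiment(text, "service"))  # -1.0
```

The whole sentence nets out to neutral, while each target picks up only the opinion expressed about it, which is the distinction between the two endpoints described above.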
I'm using elasticsearch==2.4.1 and django-haystack==3.0 with Django==2.2 using an Elasticsearch instance version 2.3 on AWS.
I'm trying to implement a "Did you mean...?" using a similarity search.
I have this model:
class EquipmentBrand(SafeDeleteModel):
    name = models.CharField(
        max_length=128,
        null=False,
        blank=False,
        unique=True,
    )
The following index:
class EquipmentBrandIndex(SearchIndex, Indexable):
    text = fields.EdgeNgramField(document=True, model_attr="name")

    def index_queryset(self, using=None):
        return self.get_model().objects.all()

    def get_model(self):
        return EquipmentBrand
And I'm searching like this:
results = SearchQuerySet().models(EquipmentBrand).filter(content=AutoQuery(q))
When name is "Example brand", these are my actual results:
q='Example brand' -> Found
q='bra' -> Found
q='xam' -> Found
q='Exmple' -> *NOT FOUND*
I'm trying to get the last example to work, i.e. finding the item if the word is similar.
My goal is to suggest items from the database in case of typos.
What am I missing to make this work?
Thanks!
I don't think you want to be using EdgeNgramField. "Edge" n-grams, from the Elasticsearch Docs:
emits N-grams of each word where the start of the N-gram is anchored to the beginning of the word.
It's intended for autocomplete. It only matches strings that are prefixes of the target. So, when the target document includes "example", searches that work would be "e", "ex", "exa", "exam", ...
"Exmple" is not one of those strings. Try using plain NgramField.
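The difference can be sketched in a few lines of plain Python. This is only an illustration of the two tokenization ideas, not Elasticsearch's or Haystack's implementation: the helper names and the gram size of 3 are assumptions, and the real analyzers also lowercase, split on whitespace, and use whatever min/max gram sizes the backend is configured with.

```python
def edge_ngrams(word, min_n=1):
    """N-grams anchored at the start of the word, i.e. its prefixes."""
    return {word[:n] for n in range(min_n, len(word) + 1)}


def ngrams(word, n=3):
    """All substrings of length n, wherever they start in the word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


target = "example"
query = "exmple"

# The misspelled query is not a prefix of the target, so no edge n-gram equals it.
print(query in edge_ngrams(target))  # False

# With plain n-grams, both words are broken into pieces and some pieces overlap.
shared = ngrams(target) & ngrams(query)
print(sorted(shared))  # ['mpl', 'ple']
```

Whether that partial overlap is enough to return the document depends on how the query is analyzed and combined on the search side, so treat this as intuition rather than a guarantee.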
Also, please consider upgrading. So much has been fixed and improved since ES 2.4.1.
I've been following the spaCy quick-start guide for text classification.
Let's say I have a very simple dataset.
TRAIN_DATA = [
    ("beef", {"cats": {"POSITIVE": 1.0, "NEGATIVE": 0.0}}),
    ("apple", {"cats": {"POSITIVE": 0, "NEGATIVE": 1}})
]
I'm training a pipe to classify text. It trains and has a low loss rate.
import random
from spacy.util import minibatch

# `nlp` is assumed to be an already loaded pipeline, as in the quick-start guide.
textcat = nlp.create_pipe("pytt_textcat", config={"exclusive_classes": True})
for label in ("POSITIVE", "NEGATIVE"):
    textcat.add_label(label)
nlp.add_pipe(textcat)
optimizer = nlp.resume_training()
for i in range(10):
    random.shuffle(TRAIN_DATA)
    losses = {}
    for batch in minibatch(TRAIN_DATA, size=8):
        texts, cats = zip(*batch)
        nlp.update(texts, cats, sgd=optimizer, losses=losses)
    print(i, losses)
Now, how do I predict whether a new string of text is "POSITIVE" or "NEGATIVE"?
This will work:
doc = nlp(u'Pork')
print(doc.cats)
It gives a score for each category we've trained to predict on.
But that seems at odds with the docs, which say I should call a predict method on the pipeline component itself.
That doesn't work though.
Trying textcat.predict('text') or textcat.predict(['text']) etc. throws:
AttributeError Traceback (most recent call last)
<ipython-input-29-39e0c6e34fd8> in <module>
----> 1 textcat.predict(['text'])
pipes.pyx in spacy.pipeline.pipes.TextCategorizer.predict()
AttributeError: 'str' object has no attribute 'tensor'
The predict methods of pipeline components actually expect a batch of Doc objects as input (the traceback shows .tensor being looked up on each element of the list you passed), so you'll need to do something like textcat.predict([nlp(text)]). The nlp used there does not necessarily have a textcat component. The result of that call then needs to be fed into a call to set_annotations() as shown here.
However, your first approach is just fine:
...
nlp.add_pipe(textcat)
...
doc = nlp(u'Pork')
print(doc.cats)
...
Internally, when calling nlp(text), first the Doc for the text will be generated, and then each pipeline component, one by one, will run its predict method on that Doc and keep adding information to it with set_annotations. Eventually the textcat component will define the cats variable of the Doc.
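The flow described above can be mimicked with a few toy classes. This is a sketch of the idea only, not spaCy's code: ToyDoc, ToyTextCat and ToyLanguage are hypothetical stand-ins, and the hard-coded "beef" rule replaces a trained model.

```python
class ToyDoc:
    """Hypothetical stand-in for a Doc: holds the text and the predicted cats."""

    def __init__(self, text):
        self.text = text
        self.cats = {}


class ToyTextCat:
    """Hypothetical stand-in for a text categorizer component."""

    def predict(self, docs):
        # Takes a batch of docs and returns one score dict per doc.
        # The "model" is a hard-coded rule, purely for illustration.
        scores = []
        for doc in docs:
            positive = 1.0 if "beef" in doc.text.lower() else 0.0
            scores.append({"POSITIVE": positive, "NEGATIVE": 1.0 - positive})
        return scores

    def set_annotations(self, docs, scores):
        # Writes the predictions back onto the docs.
        for doc, doc_scores in zip(docs, scores):
            doc.cats.update(doc_scores)

    def __call__(self, doc):
        # Calling the component is predict followed by set_annotations.
        scores = self.predict([doc])
        self.set_annotations([doc], scores)
        return doc


class ToyLanguage:
    """Hypothetical stand-in for the nlp object: builds a doc, runs each pipe."""

    def __init__(self):
        self.pipeline = []

    def add_pipe(self, component):
        self.pipeline.append(component)

    def __call__(self, text):
        doc = ToyDoc(text)
        for component in self.pipeline:
            doc = component(doc)
        return doc


toy_nlp = ToyLanguage()
toy_nlp.add_pipe(ToyTextCat())
doc = toy_nlp("Beef stew")
print(doc.cats)                         # {'POSITIVE': 1.0, 'NEGATIVE': 0.0}
print(max(doc.cats, key=doc.cats.get))  # POSITIVE
```

The last line shows one simple way to turn the per-category scores into a single predicted label: take the category with the highest score.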
The API docs you're citing for the other approach give you a look "under the hood", so they're not really conflicting approaches ;-)
AlchemyAPI provides these two functions: TextGetTextSentiment and TextGetRankedKeywords.
But TextGetTextSentiment gives only the sentiment, without the keywords that led the API to its sentiment conclusion, and TextGetRankedKeywords does not give sentiment.
Is there any API that gives both pieces of information and the correlation between them?
I tried all of these on a sample text, but none of them gave the required results.
TextGetRankedNamedEntities
TextGetRankedConcepts
TextGetRankedKeywords
TextGetLanguage
TextGetCategory
TextGetTextSentiment
TextGetTargetedSentiment
TextGetRelations
TextGetCombined
TextGetTaxonomy
EDIT:
As answered by Zach below, the code would look like this:
AlchemyAPI_KeywordParams param = new AlchemyAPI_KeywordParams();
param.setSentiment(true);
doc = alchemyObj.TextGetRankedKeywords(textToAnalyse,param);
System.out.println(getStringFromDocument(doc));
It provides output like this
:
:
<totalTransactions>2</totalTransactions>
<language>english</language>
<keywords>
<keyword>
<relevance>0.938195</relevance>
<sentiment>
<type>neutral</type>
</sentiment>
<text>OK Madam Mitch</text>
</keyword>
<keyword>
<relevance>0.915145</relevance>
<sentiment>
<score>0.492952</score>
<type>positive</type>
</sentiment>
<text>Clarence Knight</text>
</keyword>
:
:
TextGetRankedKeywords has a sentiment parameter that allows you to perform targeted sentiment analysis on each keyword that is extracted. You simply need to set sentiment=1.
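Once the response includes per-keyword sentiment, pairing each keyword with its sentiment is a small parsing job. Below is a minimal Python sketch using the two keyword entries from the output shown in the question. The enclosing results element is an assumption (the question elides the outer part of the response), and keyword_sentiments is a hypothetical helper name. Note that a neutral keyword carries no score element, so the sketch returns None for it.

```python
import xml.etree.ElementTree as ET

# Keyword entries copied from the question; the <results> wrapper is assumed.
SAMPLE = """<results>
  <keywords>
    <keyword>
      <relevance>0.938195</relevance>
      <sentiment><type>neutral</type></sentiment>
      <text>OK Madam Mitch</text>
    </keyword>
    <keyword>
      <relevance>0.915145</relevance>
      <sentiment><score>0.492952</score><type>positive</type></sentiment>
      <text>Clarence Knight</text>
    </keyword>
  </keywords>
</results>"""


def keyword_sentiments(xml_text):
    """Return (keyword text, sentiment type, score or None) for each keyword."""
    root = ET.fromstring(xml_text)
    rows = []
    for keyword in root.iter("keyword"):
        score = keyword.findtext("sentiment/score")
        rows.append((
            keyword.findtext("text"),
            keyword.findtext("sentiment/type"),
            float(score) if score is not None else None,
        ))
    return rows


for row in keyword_sentiments(SAMPLE):
    print(row)
```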
I would like to use Stanford CoreNLP for lemmatization, but there are some words that I do not want lemmatized. Is there a way to provide such an ignore list to the tool? I am following this code, and once the program calls this.pipeline.annotate(document); the lemmas are already assigned, so it would be hard to replace the occurrences afterwards. One solution is to create a mapping list in which each word to be ignored is paired with lemmatize(word), i.e. d = {(w1, lemmatize(w1)), (w2, lemmatize(w2)), ...}, and do the post-processing with this mapping list. But I guess there should be an easier way.
Thanks for the help.
I think I found the solution with my friend's help.
for (CoreMap sentence : sentences) {
    // Iterate over all tokens in a sentence
    for (CoreLabel token : sentence.get(TokensAnnotation.class)) {
        System.out.print(token.get(OriginalTextAnnotation.class) + "\t");
        System.out.println(token.get(LemmaAnnotation.class));
    }
}
You can get the original form of the word by calling token.get(OriginalTextAnnotation.class).
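With both the original text and the lemma available for every token, the ignore list becomes a simple per-token choice. Here is a language-neutral sketch of that selection step in Python. The token pairs are made-up sample data standing in for what the loop above prints, and apply_ignore_list is a hypothetical helper, not part of CoreNLP.

```python
def apply_ignore_list(tokens, ignore):
    """Keep the original text for ignored words, otherwise use the lemma.

    tokens: iterable of (original_text, lemma) pairs.
    ignore: set of lowercase words that must not be lemmatized.
    """
    return [
        original if original.lower() in ignore else lemma
        for original, lemma in tokens
    ]


# Sample (original, lemma) pairs, assumed for illustration.
tokens = [("The", "the"), ("leaves", "leaf"), ("were", "be"), ("falling", "fall")]
ignore = {"leaves"}

print(apply_ignore_list(tokens, ignore))  # ['the', 'leaves', 'be', 'fall']
```

The same check can be done inside the Java loop by comparing the original text against a set of ignored words before deciding which of the two annotations to use.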
I use Ruby 1.9.3p385, Nokogiri and XPath 1.0.
With help from awesome people on Stack Overflow I have come up with this XPath expression:
products = xml_file.xpath("/root_tag/middle_tag/item_tag")
to split this XML file:
<root_tag>
  <middle_tag>
    <item_tag>
      <headline_1>
        <tag_1>Product title 1</tag_1>
      </headline_1>
      <headline_2>
        <tag_2>Product attribute 1</tag_2>
      </headline_2>
    </item_tag>
    <item_tag>
      <headline_1>
        <tag_1>Product title 2</tag_1>
      </headline_1>
      <headline_2>
        <tag_2>Product attribute 2</tag_2>
      </headline_2>
    </item_tag>
  </middle_tag>
</root_tag>
into 2 products.
I now wish to go through each product and extract all the product information (by extracting its leaf nodes). For that purpose I am using this code:
products.each do |product|
  puts product #=> <item_tag><headline_1><tag_1>Product title 1</tag_1></headline_1><headline_2><tag_2>Product attribute 1</tag_2></headline_2></item_tag>
  product_data = product.xpath("//*[not(*)]")
  puts product_data #=> <tag_1>Product title 1</tag_1><tag_2>Product attribute 1</tag_2><tag_1>Product title 2</tag_1><tag_2>Product attribute 2</tag_2>
end
As you can see, this does exactly what I want, except for one thing: it reads through products instead of product.
How do I limit my search to product only? When answering, please note that the example is simplified. I would prefer that the solution "erase" the knowledge of products (if possible), because then it will probably work in all cases.
Instead of:
//*[not(*)]
Use:
(//product)[1]//*[not(*)]
This selects the "leaf nodes" only under the first product element in the XML document.
Repeat this for all product elements in the document. You can get their count by:
count(//product)
The answer is to simply add a . before //*[not(*)]:
product_data = product.xpath(".//*[not(*)]")
This tells the XPath expression to start at the current node rather than the root.
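The same scoping idea can be checked outside Ruby. The sketch below uses Python's standard-library ElementTree on the XML from the question. ElementTree supports only a subset of XPath and has no not(*) predicate, so the leaf test is done in Python with len(node) == 0 instead; leaf_texts is a hypothetical helper name. The point it illustrates is the same: a path starting with ".//" only visits descendants of the node it is called on.

```python
import xml.etree.ElementTree as ET

XML = (
    "<root_tag><middle_tag>"
    "<item_tag><headline_1><tag_1>Product title 1</tag_1></headline_1>"
    "<headline_2><tag_2>Product attribute 1</tag_2></headline_2></item_tag>"
    "<item_tag><headline_1><tag_1>Product title 2</tag_1></headline_1>"
    "<headline_2><tag_2>Product attribute 2</tag_2></headline_2></item_tag>"
    "</middle_tag></root_tag>"
)


def leaf_texts(element):
    """Text of every descendant of `element` that has no child elements."""
    # ".//*" is relative: the search starts at `element`, not at the document root.
    return [node.text for node in element.findall(".//*") if len(node) == 0]


root = ET.fromstring(XML)
products = root.findall("./middle_tag/item_tag")
for product in products:
    print(leaf_texts(product))
# ['Product title 1', 'Product attribute 1']
# ['Product title 2', 'Product attribute 2']
```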
Mr. Novatchev's answer, while technically correct, would not result in the parsing code being idiomatic Ruby.
You may just want:
product_data = product.xpath("*")
which will find all direct child elements of product.