I'm currently working on a sentiment analysis project, and this just came to mind:
Is it really necessary to use classifier models to detect sentiment when we already have ratings?
(1-2: Negative; 3: Neutral; 4-5: Positive)
I guess it's beneficial because models can give us a more accurate %? I just want to know what you guys think...
Thank you for your help!
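For reference, the rating rule from the question can be sketched in a few lines of Python. The function name and the sample reviews are made up for illustration. Labels produced this way are only available where a rating exists, so one possible use is as training labels for a classifier that then handles text with no rating attached.

```python
def rating_to_sentiment(rating: int) -> str:
    """Map a 1-5 star rating to the coarse labels from the question."""
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError(f"rating must be an integer from 1 to 5, got {rating!r}")
    if rating <= 2:
        return "negative"
    if rating == 3:
        return "neutral"
    return "positive"


# Hypothetical reviews, invented for this example.
reviews = [
    ("Great battery, awful screen", 4),
    ("Never again", 1),
    ("It works", 3),
]

# Pair each review text with the label derived from its rating.
labeled = [(text, rating_to_sentiment(stars)) for text, stars in reviews]
```

Note that the first sample review mixes praise and complaint but gets a single label from its rating, which is the kind of case where a text-level model might add information.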
We are trying to understand the underlying model of Rasa. The forums there still didn't get us an answer on two main questions:
1. We understand that the Rasa model is a transformer-based architecture. Was it pre-trained on any data set (e.g. Wikipedia)?
2. If we understand correctly, intent classification is a fine-tuning task on top of that transformer. How come it works with such small training sets?
Appreciate any insights!
The transformer model is not pre-trained on any dataset. We use quite a shallow stack of transformer layers, which is not as data-hungry as the deeper stacks used in large pre-trained language models.
Having said that, there isn't an exact number of data points that will be sufficient for training your assistant, as it varies by domain and problem. Usually a good estimate is 30-40 examples per intent.
I'm using a sentiment analysis API and want to know how the AI bias that gets in through the training data, along with other biases, can be quantified. Any help would be appreciated.
There are several tools developed to deal with it:
Fairlearn https://fairlearn.github.io/
Interpretability Toolkit https://learn.microsoft.com/en-us/azure/machine-learning/how-to-machine-learning-interpretability
In Fairlearn you can see how biased an ML model is after it has been trained on the data set, and choose a model that may be less accurate but performs better with respect to bias. The interpretability toolkit shows how inputs correlate with outputs; combined with Fairlearn, it can give an idea of the health of the ML model.
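To make "how biased a model is" concrete, here is a plain-Python sketch of the kind of per-group comparison such tools report. It does not call Fairlearn; the function names, the group labels, and the toy predictions are all assumptions made for illustration. It computes accuracy and selection rate (share of positive predictions) per group, then the gap in selection rate between groups.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Return accuracy and selection rate for each group value."""
    stats = defaultdict(lambda: {"n": 0, "correct": 0, "selected": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        s = stats[group]
        s["n"] += 1
        s["correct"] += int(truth == pred)
        s["selected"] += int(pred == 1)
    return {
        g: {"accuracy": s["correct"] / s["n"],
            "selection_rate": s["selected"] / s["n"]}
        for g, s in stats.items()
    }


def selection_rate_gap(rates):
    """Largest minus smallest selection rate across groups."""
    selection = [r["selection_rate"] for r in rates.values()]
    return max(selection) - min(selection)


# Toy data, invented for this example: 1 = "positive sentiment" predicted.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = group_rates(y_true, y_pred, groups)
gap = selection_rate_gap(rates)
```

A gap near zero means the model predicts the positive class at similar rates for each group. Whether that is the right fairness criterion depends on the application.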
I want to make a program that classifies news as real or fake. In this project I've used the Naive Bayes algorithm. I have categorized fake news with sentiment analysis and I already know how to classify it.
But now I want to add a constraint to this project: I want the amount of punctuation to help decide whether it's real or fake news, but I am confused about how to mix it in. In the likelihood, or what?
Thank you. Before this I already used n-grams and Laplace smoothing in the sentiment analysis of the news.
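One possible way to mix punctuation in, sketched below under stated assumptions: bucket the punctuation count into a small categorical feature and add its Laplace-smoothed log-likelihood to the usual word log-likelihoods, treating it as one more conditionally independent feature. The class name, the bucket thresholds, and the four training headlines are invented for illustration; this uses unigrams only and is not a tuned fake-news detector.

```python
import math
import string
from collections import Counter

BUCKETS = ("low", "mid", "high")


def punct_bucket(text):
    """Bucket the punctuation count (thresholds are arbitrary assumptions)."""
    n = sum(ch in string.punctuation for ch in text)
    if n <= 1:
        return "low"
    if n <= 4:
        return "mid"
    return "high"


def tokenize(text):
    """Lowercase words with leading/trailing punctuation stripped."""
    words = (w.strip(string.punctuation).lower() for w in text.split())
    return [w for w in words if w]


class PunctNaiveBayes:
    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        self.bucket_counts = {c: Counter() for c in self.classes}
        for text, c in zip(texts, labels):
            self.word_counts[c].update(tokenize(text))
            self.bucket_counts[c][punct_bucket(text)] += 1
        self.vocab = set().union(*self.word_counts.values())
        return self

    def log_scores(self, text):
        """Unnormalized log posterior per class."""
        total_docs = sum(self.class_counts.values())
        v = len(self.vocab)
        bucket = punct_bucket(text)
        scores = {}
        for c in self.classes:
            score = math.log(self.class_counts[c] / total_docs)  # prior
            total_words = sum(self.word_counts[c].values())
            for w in tokenize(text):
                if w in self.vocab:  # words never seen in training are skipped
                    score += math.log((self.word_counts[c][w] + 1) / (total_words + v))
            # Punctuation term: one extra Laplace-smoothed likelihood factor.
            score += math.log(
                (self.bucket_counts[c][bucket] + 1)
                / (self.class_counts[c] + len(BUCKETS))
            )
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy training set, invented for this example.
texts = [
    "SHOCKING!!! You won't believe this!!!",
    "Miracle cure found!!! Doctors hate it!!!",
    "Council approves new budget for schools",
    "Central bank holds interest rates steady",
]
labels = ["fake", "fake", "real", "real"]
model = PunctNaiveBayes().fit(texts, labels)
```

With this setup, `model.predict("Unbelievable!!! Shocking secret!!!")` comes out as "fake" on the toy data, and adding exclamation marks to an otherwise neutral headline shifts the score toward "fake" without changing the word terms.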
A group of people and I are developing a sentiment analysis algorithm. I would like to know which ones already exist, because I want to compare them. Is there any article that covers the main algorithms in this area?
Thanks in advance
Some of the papers on sentiment analysis may help you:
One of the earlier works by Bo Pang, Lillian Lee http://acl.ldc.upenn.edu/acl2002/EMNLP/pdfs/EMNLP219.pdf
A comprehensive survey of sentiment analysis techniques http://www.cse.iitb.ac.in/~pb/cs626-449-2009/prev-years-other-things-nlp/sentiment-analysis-opinion-mining-pang-lee-omsa-published.pdf
Study by Hang Cui, V Mittal, M Datar using 6-grams http://citeseerx.ist.psu.edu/viewdoc/download?doi=
For a quick implementation, Naive Bayes is recommended. You can find an example here http://nlp.stanford.edu/IR-book/
We did a statistical comparison of various classifiers and found SVM to be the most accurate, though for a dataset consisting of large contents ( http://ai.stanford.edu/~amaas/data/sentiment/ ) none of the methods worked well. Our study may not be accurate, though. Also, instead of treating sentiment analysis as a text classification problem, you can look at extracting meaning from text, though I do not know how successful it might be.
Apparently NLTK, a Python natural language processing library, has one.
Probably worth having a look at it.
I'd like some advice on how to tackle this problem. At college I've been solving opinion mining tasks, but with Twitter the approach is quite different. For example, I used an ensemble learning approach to classify users' opinions about a certain hotel in Spain. Of course, I was given a training set with positive and negative opinions, and then I tested with the test set. But now, with Twitter, I've found this kind of categorization very difficult.
Do I need to have a training set? And if so, don't you think Twitter is so time-dependent that, with a fixed set, my performance on future topics will be very poor?
I was thinking of getting a dictionary (mainly adjectives), crossing my tweets with it, and obtaining a term-document matrix, but I have no class assigned to any tweet. Also, positive and negative adjectives could vary depending on the topic and time. So, how to deal with this?
How to deal with the problem of languages? For instance, I'd like to study tweets written in English and those in Spanish, but separately.
Which programming languages do you suggest for something like this? I've been trying R packages like tm and twitteR.
Sure, I think the way sentiment is used will stay constant for a few months. Worst case, you relabel and retrain. Unsupervised learning has a shitty track record for industrial applications in my experience.
You'll need some emotion/adjective dictionary for sentiment stuff. There are some datasets out there, but I forget where they are. I may have answered previous questions with better info.
Just do English tweets. It's fairly easy to build a language classifier, but you want to start small, so take it easy on yourself.
Python (NLTK) if you want to do it easily in a small amount of code. Java has good NLP stuff, but Python and its libraries are way more user friendly.
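As a rough sketch of the two pieces mentioned above (keeping only English tweets, then scoring them against a sentiment dictionary), here is a plain-Python version with no external libraries. The word lists are tiny placeholders made up for illustration, not a real lexicon, and the stopword-count language guess is a crude heuristic rather than a trained language classifier.

```python
import re

# Placeholder word lists (assumptions); a real project would load a proper lexicon.
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in", "that", "it", "for", "with"},
    "es": {"el", "la", "de", "que", "y", "en", "los", "es", "por", "con"},
}
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "awful", "hate", "terrible", "sad"}
NEGATORS = {"not", "no", "never"}


def tokens_of(text):
    """Lowercase word tokens, keeping Spanish accented letters."""
    return re.findall(r"[a-záéíóúñü']+", text.lower())


def guess_language(text):
    """Pick the language whose stopwords occur most often; 'unknown' on ties."""
    tokens = tokens_of(text)
    hits = {lang: sum(t in words for t in tokens) for lang, words in STOPWORDS.items()}
    best = max(hits, key=hits.get)
    others = [n for lang, n in hits.items() if lang != best]
    if hits[best] == 0 or hits[best] in others:
        return "unknown"
    return best


def lexicon_score(text):
    """Sum +1/-1 per sentiment word, flipping the sign right after a negator."""
    tokens = tokens_of(text)
    score = 0
    for i, tok in enumerate(tokens):
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score


def label_english_tweets(tweets):
    """Keep tweets guessed to be English and attach a lexicon-based label."""
    results = []
    for tweet in tweets:
        if guess_language(tweet) != "en":
            continue
        score = lexicon_score(tweet)
        label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        results.append((tweet, label))
    return results


# Hypothetical tweets, invented for this example.
tweets = [
    "I love the staff and the room is great",
    "El hotel es terrible y la comida es mala",
    "The pool is not good",
]
labeled_tweets = label_english_tweets(tweets)
```

This needs no labeled training set, which is its main appeal, but as the question notes, word polarity can shift with topic and time, so the lists would need maintenance.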
This site: https://sites.google.com/site/miningtwitter/questions/sentiment provides 3 ways to do sentiment analysis using R.
The twitteR package is now updated to work with the new Twitter API. I'd suggest you download the source version of the package to avoid getting duplicated tweets.
I'm working on a Spanish dictionary for opinion mining, and will publish it somewhere accessible.
Sentiment analysis will give only 3 results, as said above: positive, negative and neutral. I found a tutorial on Twitter sentiment analysis and it's quite easy.
I found it here - https://www.ai-ml.tech/twitter-sentiment-analysis/
It has only 3 dependencies, which I downloaded, and needs little code. Just go through it and you will get the solution.