Rasa NLU for complex paragraphs - rasa-nlu

I have seen many examples showing that Rasa NLU is very good and efficient for building chatbot solutions, meaning it is good at finding the intent and entities in short chat messages.
But is it also suitable for finding the intent of complex paragraphs? For example, can it be used on long emails?

Rasa NLU is most suited for shorter messages since it was built for contextual AI assistants. You can still try running it on larger paragraphs, but you probably need to add a lot of data. Maybe some approaches such as Doc2Vec might be better suited here.

Related

FAQ chatbot without using QnA maker in Botframework V4

I've found no answer to this question and no similar question asked. Is it possible for me to build a FAQ bot without using QnA Maker in Bot Framework V4? Is there an alternative to QnA Maker other than another NLP AI service (unless it is free)? Or is it possible to build one myself without too much work?
It really depends on how many Q's are in the FAQ, and how varied the questions might be. Let's say you have just 4 different answers in your FAQ and every question has a different keyword: then you could just have a simple switch statement on Activity.Text.Contains(keyword) and return one of the 4 answers based on that.
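The keyword idea above can be sketched in a few lines. This is a minimal illustration in Python rather than Bot Framework code: the `FAQ_ANSWERS` table, its keywords and answers, and the `answer_for` function are all hypothetical, and in a real bot the `message` argument would be the text of the incoming activity.

```python
from typing import Optional

# Hypothetical FAQ table: one distinguishing keyword per answer.
FAQ_ANSWERS = {
    "price": "Our plans start at $10 per month.",
    "refund": "Refunds are available within 30 days.",
    "hours": "Support is open 9am to 5pm on weekdays.",
    "shipping": "Orders ship within two business days.",
}


def answer_for(message: str) -> Optional[str]:
    """Return the first answer whose keyword occurs in the message, else None."""
    text = message.lower()
    for keyword, answer in FAQ_ANSWERS.items():
        if keyword in text:
            return answer
    return None  # no keyword matched; the bot should fall back to a default reply
```

This only works while every question maps cleanly to one keyword; as soon as two answers share vocabulary, plain substring matching starts returning the wrong one.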
You could also design a sort of topic tree with cards and buttons that drills down into the topics, eventually providing an answer to the user's question without them typing the question.
Generally though, a QnA or FAQ Bot is expected to be intelligent and provide answers to varied question texts. Writing something that understands human language enough to interpret a generalized or varied word sentence, and grasp what exactly is being asked, is no small task. If it were something a developer could throw together without much work, there would not be paid services with this as a business model.

Sentiment Analysis of given text

There are many threads on this topic, but I am posting another one. The other posts may each describe a way to do sentiment analysis, but I could not find one I could follow.
I want to implement sentiment analysis myself, so I would like someone to show me a way. During my research, I found the following approach in use: I guess a Bayesian algorithm counts positive and negative words and, using a bag of words, calculates the probability of the sentence being positive or negative.
That covers only the words; I guess we have to do language processing too. Is there anyone who knows more about this? If so, can you point me to some algorithms, with links for reference, that I can implement? Anything in particular that may help my analysis is welcome.
Also, can you recommend a language to work with? Some say Java is comparatively time consuming, so they don't recommend it.
Any type of help is much appreciated.
First of all, sentiment analysis is done on various levels, such as document, sentence, phrase, and feature level. Which one are you working on? There are many different approaches to each of them. You can find a very good intro to this topic here. For machine-learning approaches, the most important element is feature engineering and it's not limited to bag of words. You can find many other useful features in different applications from the tutorial I linked. What language processing you need to do depends on what features you want to use. You may need POS-tagging if POS information is needed for your features for example.
For classifiers, you can try Support Vector Machines, Maximum Entropy, and Naive Bayes (probably as a baseline) and these are frequently used in the literature, about which you can also find a pretty comprehensive list in the link. The Mallet toolkit contains ME and NB, and if you use SVMlight, you can easily convert the feature formats to the Mallet format with a function. Of course there are many other implementations of these classifiers.
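As a rough illustration of the Naive Bayes baseline mentioned above, here is a minimal bag-of-words sketch in plain Python. It is not the Mallet or SVMlight implementation; the function names, the add-one smoothing choice, and the four toy training sentences are assumptions made for the example.

```python
import math
from collections import Counter


def train(examples):
    """Count words per label from (text, label) pairs."""
    word_counts = {}        # label -> Counter of word occurrences
    doc_counts = Counter()  # label -> number of training documents
    vocab = set()
    for text, label in examples:
        words = text.lower().split()
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(words)
        vocab.update(words)
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, float("-inf")
    for label in doc_counts:
        # log prior of the label
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # add-one (Laplace) smoothing so unseen word/label pairs are not zero
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, hand-made training data (an assumption for illustration only).
TRAINING = [
    ("great film loved it", "pos"),
    ("wonderful acting great story", "pos"),
    ("terrible film hated it", "neg"),
    ("boring story awful acting", "neg"),
]
model = train(TRAINING)
```

With real data the whitespace tokenizer would be replaced by proper preprocessing, and the features need not be limited to single words.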
For rule-based methods, Pointwise Mutual Information is frequently used, and some kinds of scoring-based methods, etc.
Hope this helps.
For text analysis there is no language stronger than SNOBOL. In SNOBOL-4 the Fortran interpreter, for example, takes only 60 lines.
NLTK offers really good algorithms for sentiment analysis. It is open source, so you can look at the source code and check out the algorithms used. You can even download the NLTK book, which is free and has some good material on sentiment analysis.
Coming to your second point, I don't think Java is that slow. I have been coding in C++ for years myself, but lately I have also started with Java, since a lot of very popular open-source software such as Lucene, Solr, Hadoop, and Neo4j is written in Java.

Spam prevention in Rails

I've got a Rails app where users can send messages to other users. The problem is, it's the type of site that draws many spammers who send bogus messages.
I'm already aware of a couple spam services like Akismet (via rakismet) and Defensio (via defender). The problem with these is that it looks like they don't take into account messages the user has already sent. The type of spam I'm seeing on my site is where the user sends the same (or very similar) messages to many other users. As such, I'd like to be able to compare to at least a handful of past messages to ensure they're different enough to not be considered spam.
So far, the best thing I've come across is the Text::Levenshtein distance implementation, which calculates the number of differences between two strings. I suppose I could calculate the number of differences divided by the string length, and if it's above a certain threshold, then it's not considered spam.
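The idea of dividing the edit distance by the string length could look roughly like this. The sketch is in Python rather than Ruby and does not use Text::Levenshtein; the `looks_like_repeat` helper and the 0.2 threshold are assumptions chosen only for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, keeping one row at a time."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def looks_like_repeat(new_message, past_messages, threshold=0.2):
    """True if the new message is nearly identical to any past message."""
    for old in past_messages:
        longest = max(len(new_message), len(old))
        if longest == 0:
            return True  # two empty messages are identical
        # ratio near 0 means almost the same text; near 1 means very different
        if levenshtein(new_message, old) / longest < threshold:
            return True
    return False
```

Comparing against only the sender's last handful of messages keeps the cost bounded, since each comparison is quadratic in message length.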
One other thing I've come across is Classifier::Bayes, which makes a best guess as to what category something falls into. Still pondering on this one.
I feel like I might just be looking in the wrong place, and maybe there's already a better solution for something like this out there. Perhaps I'm searching for the wrong words to find something a little more useful.
Don't try to roll your own solution for this; it's much more complex than you would expect. It is in fact one of those things, like encryption, where it is a much better idea to farm it out to someone/something that is really good at it. Here is some background for you.
Levenshtein distance is certainly a good thing to be aware of (you never know when a similarity metric will come in handy), but it is not the right thing to use for this particular problem.
A Bayesian classifier is much closer to what you're after. In fact, spam detection is pretty much the canonical example of where a naive Bayesian classifier can do a tremendous job. Having said that, you'd have to find a large collection of data (messages) that has been classified as spam and non-spam and that's similar to the types of messages you get on your site. You would then need to train your classifier and measure its performance. You'd need to tweak it and make sure you don't overfit it, etc. While Classifier::Bayes is a decent basic implementation, it will not give you a lot of support for this. In fact, Ruby does suffer from a lack of good natural language processing libraries. There is nothing in Ruby to compare to Python's NLTK.
Having said all of that, services like Akismet will certainly have a Bayesian classifier as one of the tools they use to determine whether what you send them is spam. This classifier will likely be much more sophisticated than what you can build yourself, if for no other reason than the fact that they also have access to so much data. They likely also have other types of classifiers/algorithms that they will use; this is their core business after all.
Long story short, if I were you I would give something like Akismet another look. If you build a facility into your site where you or your users can flag messages as spam (for example via rakismet's spam! method), you'll be able to send this data to Akismet, and the service should learn pretty quickly that a particular kind of message is spammy. So if your users are sending many similar spammy messages, even if Akismet doesn't pick this up straight away, after you flag a couple of these all the rest should be picked up automatically. If I were you I would concentrate my efforts on experimenting in this direction rather than trying to roll your own solution.

Artificial Intelligence/Rules to guess user taste in Apparel/Clothing

Are there standard rules engines or AI algorithms that would predict a user's taste in a particular kind of product, such as clothes?
I know it's the one thing every e-commerce website would kill for. But I am looking for established theoretical patterns that would help make that prediction better, if not accurately.
Two books that cover recommender systems:
Programming Collective Intelligence: uses Python and does a good job explaining the algorithms, but doesn't provide enough help IMO in terms of understanding how to scale.
Algorithms of the Intelligent Web: uses Java and is harder to follow, but also covers using persistence, in this case MySQL, to facilitate scaling, and identifies areas in the example code that will not scale as-is.
Basically there are two ways of approaching the problem: user-based or item-based. Netflix appears to use the former, while Amazon uses the latter. Typically the user-based approach requires more time and/or processing power to generate recommendations because you tend to have more users than items to consider.
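To make the item-based approach concrete, here is a minimal sketch that ranks items by the cosine similarity of their rating vectors. It is not taken from either book; the `RATINGS` data, the user and item names, and the function names are hypothetical.

```python
import math

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
RATINGS = {
    "ann": {"jeans": 5, "hoodie": 4, "blazer": 1},
    "bob": {"jeans": 4, "hoodie": 5},
    "cat": {"blazer": 5, "loafers": 4, "jeans": 1},
    "dan": {"blazer": 4, "loafers": 5},
}


def item_vectors(ratings):
    """Invert user -> item ratings into item -> {user: rating}."""
    vectors = {}
    for user, items in ratings.items():
        for item, rating in items.items():
            vectors.setdefault(item, {})[user] = rating
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    shared = set(u) & set(v)
    dot = sum(u[k] * v[k] for k in shared)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def most_similar_items(item, ratings):
    """Other items, ordered from most to least similar to the given item."""
    vectors = item_vectors(ratings)
    scores = [(cosine(vectors[item], vec), other)
              for other, vec in vectors.items() if other != item]
    return [other for _, other in sorted(scores, reverse=True)]
```

The user-based variant is the same computation with the roles swapped: compare users by the items they rated, then recommend what similar users liked. Item-to-item similarities can be precomputed offline, which is one reason the item-based form tends to scale better.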
Not sure how to answer this, as this question is overly broad. What you are describing is a Machine Learning kind of task, and thus would fall under that (very broad) umbrella. There are a number of different algorithms that can be used for something like this, but most texts would tell you that the definition of the problem is the important part.
What parts of fashion are important? What parts are not? How are you going to gather the data? How noisy is the data? All of these are important considerations to the problem space. Pandora does a similar type of thing with music, with their big benefit being that their users tell them initially what they like and don't like.
To categorize their music, they actually have trained musicians listening to the music to identify all sorts of stuff. See the article on Ars Technica here for more information about that. Based on what I know about fashion tastes, I would say that it is a similar problem space, and would probably require experts to "codify" the information before you could attempt to draw parallels.
Sorry for the vague answer - if you want more specifics, I would recommend asking a more specific question, about specific algorithms or data sets, etc.

Do you know any patterns for GUI programming? (Not patterns on designing GUIs)

I'm looking for patterns that concern coding parts of a GUI. Not as global as MVC, that I'm quite familiar with, but patterns and good ideas and best practices concerning single controls and inputs.
Let's say I want to make a control that displays some objects that may overlap. Now if I click on an object, I need to find out what to do (just finding the object I can do in several ways, such as a quad-tree and Z-order; that's not the problem). I might also hold down a modifier key, or some object may be active from the beginning, making the selection or whatever a bit more complicated. Should an object instance representing a screen object handle the user action when clicked, or should a master class do it, etc.? What kind of patterns or solutions are there for problems like this?
To be honest, I think you are better off just boning up on your standard design patterns and applying them to the individual problems that you face in developing your UI.
While there are common UI "themes" (such as dealing with modifier keys) the actual implementation may vary widely.
I have O'Reilly's Head First Design Patterns and The Poster, which I have found invaluable!
Object-Oriented Design and Patterns by Cay Horstmann has a chapter entitled "Patterns and GUI Programming". In that chapter, Horstmann touches on the following patterns:
Observer
Layout Managers and the Strategy Pattern
Components, Containers, and the Composite Pattern
Scroll Bars and the Decorator Pattern
I don't think the benefit of design patterns comes from trying to find a design pattern to fit a problem. You can, however, use some heuristics to help clean up your design quite a bit, like keeping the UI as decoupled as possible from the rest of the objects in your system.
There is a pattern that might help out in this case, the Observer Pattern.
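A minimal sketch of the Observer Pattern applied to the selection problem from the question might look like this. The `SelectionModel` class and its `subscribe` and `select` methods are hypothetical names, not part of any GUI toolkit.

```python
class SelectionModel:
    """Holds the currently selected object and notifies listeners on change."""

    def __init__(self):
        self._selected = None
        self._listeners = []

    def subscribe(self, listener):
        """Register a callable that receives the newly selected object."""
        self._listeners.append(listener)

    def select(self, obj):
        """Change the selection and notify every listener, if it changed."""
        if obj == self._selected:
            return  # nothing changed, so do not notify
        self._selected = obj
        for listener in self._listeners:
            listener(obj)
```

The control that handles the mouse click only calls `select`; a properties panel, a status bar, or anything else that cares subscribes separately. That keeps the click handling decoupled from whatever reacts to the selection.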
I know you said not as global as MVC, but there are some variations on MVC - specifically HMVC and PAC - which I think can answer questions such as the ones you pose.
Other than that, try to write new code "in the spirit" of existing patterns even if you don't apply them directly.
Perhaps you're looking for something like the 'MouseTrap', which I saw in some articles on CodeProject (search for UI Platform)?
I also found this series very useful: http://codebetter.com/jeremymiller/2007/07/26/the-build-your-own-cab-series-table-of-contents/ where you might have a look at embedded controllers etc.
Micha.
What you are looking at is professional application programming. I searched for tips and tricks for a long time, without success. Unfortunately you will not find anything useful; it is a complicated topic, and only with many years of experience will you be able to understand how to write an application efficiently. For example, almost every program opens a file, extracts information, shows it in different forms, allows processing, saving, ... but nobody explains exactly what the good strategy is and so on. Further, if you are writing a big application, you need to look at some strategies to reduce your compilation time (otherwise you will wait hours at every compilation). The Pimpl idiom in C++ helps here, for example. And then there is a lot more. For this reason software developers are well paid and there are so many jobs :-)

Resources