Defining metric and threshold values for one of the sub-characteristics from ISO 25010 for an "add new member" function - metrics

Is there any paper or research explaining how to set the metrics and thresholds for an "add new member" function? Is there any similar research that I can refer to?

Related

What is the default target metric that H2O models use for their predict() method? Can it be changed?

I am using an H2ORandomForestEstimator. What is the default target metric that H2O models use for their predict() method?
https://docs.h2o.ai/h2o/latest-stable/h2o-py/docs/modeling.html#h2o.automl.H2OAutoML.predict
Is there a way to set this (e.g. to use one of the other metric-maximizing thresholds that can be seen when looking at the results of the get_params() method)?
Currently I am doing something like:
df_preds = mymodel.predict(df)
activation_threshold = mymodel.find_threshold_by_max_metric('f1', valid=True)
# adjust the predicted label for the desired metric's maximizing threshold
df_preds['predict'] = df_preds['my_positive_class'].apply(lambda probability: 'my_positive_class' if probability >= activation_threshold else 'my_negative_class')
see
https://docs.h2o.ai/h2o/latest-stable/h2o-py/docs/model_categories.html?highlight=find_threshold#h2o.model.binomial.H2OBinomialModel.find_threshold_by_max_metric
https://docs.h2o.ai/h2o/latest-stable/h2o-py/docs/frame.html?highlight=apply#h2o.H2OFrame.apply
There's no concept of a "target metric" when generating predictions, since you're just predicting the response for a row of data (there's no scoring here).
Edit: Thanks for clarifying your question. If you want to change how the threshold is chosen, then what you're doing above is a good solution. If you have a suggestion for a utility function that would make this more straightforward, please file a JIRA with your idea (it could definitely be improved).
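To make the thresholding step from the question concrete, here is a minimal sketch in plain Python. It does not call H2O: the list of probabilities stands in for the positive-class column of the prediction frame, the helper name relabel is hypothetical, and the threshold value 0.42 is made up (in the question it would come from find_threshold_by_max_metric).

```python
def relabel(probabilities, threshold,
            positive="my_positive_class", negative="my_negative_class"):
    """Assign the positive label when the probability meets the threshold."""
    return [positive if p >= threshold else negative for p in probabilities]


# Hypothetical positive-class probabilities and a made-up threshold.
probs = [0.10, 0.42, 0.80]
labels = relabel(probs, threshold=0.42)
```

The comparison uses >=, matching the lambda in the question, so a probability exactly equal to the threshold is labelled positive.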

Google cloud natural language API adding own context classifier

I have been searching for how to create a new entity type in the Google Natural Language API and found nothing. Can anybody help me create a new classifier, so that if I pass in a sentence it detects, say, 'python' as a programming language? Currently the API labels 'python' as 'other'.
I have also looked into the Cloud AutoML API and tried to create and train a model, but it was only able to do sentiment analysis, not entity detection. It was giving me a score rather than telling me that Java is a programming language.
Thanks in advance. Your help will be appreciated.
AutoML content classification classifies your data into the labels specified in the training set. It does not do entity detection. But it seems that what you need is closer to content classification than to entity detection. My understanding from your description is that you have content (words, phrases, or short sentences) and you want to classify it into labels (e.g. programmingLanguage). If you put together a good training set, the AutoML model should be able to do this.
The number it provides in the eval is not a sentiment score; it is the probability of the predicted label. As you can see in the eval page you posted, it is telling you that java is a programmingLanguage with a probability of 1 (so it is very certain about it).

Understanding RASA-Core stories

I was trying to understand the examples given in the RASA Core git repository. I have seen this example story:
greet
utter_ask_howcanhelp
inform{"cuisine": "italian"}
utter_on_it
utter_ask_location
But I didn't understand what {"cuisine": "italian"} is. Is it the default value of the slot, or does the user have to provide "italian" in the input string? Can anybody help me understand how to write stories in RASA Core?
Regards
One of the most powerful features of any dialog flow is stories. This is how you tell the model what the possible flows of a conversation are.
To answer your question: "italian" is clearly not a default value.
inform{"cuisine": "italian"}
Here you are telling the machine learning engine that you expect an intent "inform" which carries a slot named cuisine. "italian" is only an example value; at runtime it can be anything. You can also have another story in which the inform intent arrives without the cuisine slot. That story might ask for the cuisine in the next turn.
Story definitions should not be confused with a programming language. They are just training examples for the machine learning algorithm.
More details about using slots can be found here and here
This story describes how the dialogue model would behave in the case the user said something like "I want to eat Italian food". As you note, the slot "cuisine" is set to the value "italian".
In the restaurant example, the cuisine slot is a simple TextSlot. This means that the dialogue model only gets to see whether the slot has a value or not. The behaviour would be exactly the same if the user had asked for Chinese food, Thai food, or anything else.
If you want the value of a slot to influence the dialogue going forward, you can use a different slot type, e.g. a categorical slot.
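The difference between the two slot types can be illustrated with a small plain-Python sketch. This is not Rasa's code: the function names and the list of allowed cuisines are hypothetical, and the sketch only mimics the idea that a text slot exposes "set or not set" while a categorical slot exposes which value was set.

```python
def featurize_text_slot(value):
    """A text slot only tells the model whether a value is present."""
    return [1.0 if value is not None else 0.0]


def featurize_categorical_slot(value, allowed_values):
    """A categorical slot is one-hot encoded over its allowed values."""
    return [1.0 if value == v else 0.0 for v in allowed_values]


# Hypothetical set of cuisines for the categorical variant.
cuisines = ["italian", "chinese", "thai"]

text_italian = featurize_text_slot("italian")
text_chinese = featurize_text_slot("chinese")
cat_italian = featurize_categorical_slot("italian", cuisines)
cat_chinese = featurize_categorical_slot("chinese", cuisines)
```

With the text slot, "italian" and "chinese" produce the same feature, so the dialogue model cannot tell them apart. With the categorical slot the features differ, so stories can branch on the value.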

Topic Modelling - Assign human readable labels to topic

I want to assign human readable labels to the results of my topic modelling.
Is there any software library or data set that takes these keywords as input and returns a title that describes the topic?
Example:
Input: ["Church","Priest","God","Prayer"]
Output: "Religion"
Note: I want automatic label creation - Not manual like others have asked before.
See this paper by Jey Han Lau. He describes how to automatically generate labels using different sources and features.
"We generate a set of label candidates from the top-ranking topic terms, titles of Wikipedia articles containing the top-ranking topic terms, and also a filtered set of sub-phrases extracted from the Wikipedia article titles. We rank the label candidates using a combination of association measures, lexical features and an Information Retrieval feature."
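The ranking step can be illustrated with a toy sketch. It is not the paper's implementation: it uses only one association measure (pointwise mutual information over document co-occurrence, with a small smoothing constant), and the corpus, the candidate labels, and the function names are all made up for illustration.

```python
import math


def pmi(label, term, docs, eps=0.01):
    """Smoothed pointwise mutual information from document co-occurrence."""
    n = len(docs)
    n_label = sum(1 for d in docs if label in d)
    n_term = sum(1 for d in docs if term in d)
    n_both = sum(1 for d in docs if label in d and term in d)
    if n_label == 0 or n_term == 0:
        return float("-inf")
    return math.log(((n_both + eps) / n) / ((n_label / n) * (n_term / n)))


def rank_labels(candidates, topic_terms, docs):
    """Rank candidate labels by their mean PMI with the topic terms."""
    scored = [
        (c, sum(pmi(c, t, docs) for t in topic_terms) / len(topic_terms))
        for c in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy corpus: each document is a set of tokens.
docs = [
    {"religion", "church", "priest", "god"},
    {"religion", "prayer", "god"},
    {"architecture", "church", "building"},
    {"sports", "football", "goal"},
]
topic_terms = ["church", "priest", "god", "prayer"]
ranked = rank_labels(["religion", "architecture", "sports"], topic_terms, docs)
best_label = ranked[0][0]
```

In practice the candidates would come from a large source such as Wikipedia article titles, as the quoted passage describes, rather than from a hand-written list.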

Mahout Content Based Recommendation Engine

I am working on a content-based recommendation problem. My data set is stored in MongoDB in JSON format.
Problem Statement
There are items which have their own properties, and users have preferences regarding each property. I want to predict how much item x will be liked by a user by comparing the properties of item x with the user's preferences for those same properties. I want to build a recommendation system that recommends items to users based on their preferences.
I am thinking of using Mahout and the CBAYES classifier algorithm to predict "how much item x will be liked by User A". But I haven't found any example or data set for implementing CBAYES with Mahout.
If you can suggest any other classifier algorithm, please recommend it.
You can calculate "how much item x will be liked by User A" by using cosine similarity. Please refer to the following link for more information.
Reference link: What's difference between Collaborative Filtering Item-based recommendation and Content-based recommendation
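As a minimal sketch of the cosine-similarity idea, assuming the user's preferences and each item's properties are encoded as numeric vectors over the same list of properties (the vectors and names below are made up for illustration):

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no information in a zero vector
    return dot / (norm_a * norm_b)


# Hypothetical vectors over three shared properties.
user_a_prefs = [1.0, 0.0, 1.0]
item_x = [1.0, 0.0, 1.0]
item_y = [0.0, 1.0, 0.0]

score_x = cosine_similarity(user_a_prefs, item_x)
score_y = cosine_similarity(user_a_prefs, item_y)
```

Items can then be sorted by score for each user, with the highest-scoring items recommended first.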
Regards,
Rajasekar
