The Rasa 0.15 changelog states that "SpacyEntityExtractor supports same entity filtering as DucklingHTTPExtractor", but if a sentence has a ‘time’ entity, the values identified by the spaCy entity extractor are just the text values (e.g. 7pm) and not the resolved time values (e.g. 2019-05-03T07:00:00) that Duckling identifies.
Duckling is indeed the recommended option for dates.
Is there any reason why you don't want to use Duckling?
You could also implement a custom NLU pipeline component if you want to use another library.
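If you do go that route, a rough sketch of such a component is below, assuming the Rasa 1.x-era custom component API; the DateparserExtractor name and the use of the third-party dateparser library are illustrative assumptions, not part of Rasa.

from rasa.nlu.components import Component

class DateparserExtractor(Component):
    # Hypothetical custom entity extractor that resolves date/time text
    # into ISO timestamps using a date-parsing library of your choice.
    name = "DateparserExtractor"
    provides = ["entities"]

    def process(self, message, **kwargs):
        import dateparser  # illustrative; any date-resolution library would do

        parsed = dateparser.parse(message.text)
        if parsed is None:
            return
        entity = {
            "entity": "time",
            "value": parsed.isoformat(),   # resolved value instead of the raw text
            "start": 0,                    # crude span covering the whole message
            "end": len(message.text),
            "extractor": self.name,
        }
        message.set("entities",
                    message.get("entities", []) + [entity],
                    add_to_output=True)

You would then reference the component (by its module path) in the pipeline section of your NLU config.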
I need to use the interaction variable feature of multiclass classification in H2OGradientBoostingEstimator in H2O in Python. I am not sure which parameter to use or how to use it. Can anyone please help me out with this?
Currently, I am using the code below:
pros_gbm = H2OGradientBoostingEstimator(nfolds=0, seed=1234, keep_cross_validation_predictions=False, ntrees=10, max_depth=3, learn_rate=0.01, distribution='multinomial')
hist_gbm = pros_gbm.train(x=predictors, y=target, training_frame=hf_train, validation_frame=hf_test, verbose=True)
GBM inherently creates interactions. You can extract information about feature interactions using the .feature_interaction() extractor method (for an H2O Model). More information is provided in the user guide and the Python docs.
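For example, a minimal sketch (assuming an H2O version recent enough to ship the extractor, and reusing the pros_gbm model from the question):

# Extract ranked feature-interaction information from the trained GBM.
interactions = pros_gbm.feature_interaction()
print(interactions)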
If you want to explicitly add a new column that is the interaction between two numerics, you could create that manually by multiplying the two (or more) columns together to get a new interaction column.
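For instance (the column names here are made up for illustration):

# Product of two assumed numeric columns becomes a new interaction feature.
hf_train["age_x_income"] = hf_train["age"] * hf_train["income"]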
For categorical interactions, there's also the h2o.interaction() method in Python to create interaction columns in the data (prior to sending it to GBM or any other algorithm).
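A rough sketch (the categorical column names are assumptions):

# Build pairwise interaction columns from two assumed categorical columns
# and append them to the training frame before modeling.
import h2o
inter = h2o.interaction(hf_train,
                        factors=["state", "product_type"],
                        pairwise=True,
                        max_factors=100,
                        min_occurrence=1)
hf_train = hf_train.cbind(inter)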
This is my situation. I have over 400 features, many of which are probably useless and often zero. I would like to be able to:
train a model with a subset of those features
query that model for the features actually used to build that model
build an H2OFrame containing just those features (I get a sparse list of non-zero values for each row I want to predict)
pass this newly constructed frame to H2OModel.predict() to get a prediction
I am pretty sure what I found is unsupported, but it works for now (v 3.13.0.341). Is there a more robust/supported way of doing this?
model._model_json['output']['names']
The response variable appears to be the last item in this list.
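So, as an unsupported workaround, the feature columns can be obtained by slicing the response off the end:

# Unsupported workaround: feature names used by the model
# (the response is assumed to be the last entry).
used_cols = model._model_json['output']['names'][:-1]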
In a similar vein, it would be nice to have a supported way of finding out which H2O version the model was built under. I cannot find the version number in the JSON.
If you want to know which feature columns the model used after you have built it, you can do the following in Python:
my_training_frame = your_model.actual_params['training_frame']
which will return some frame id
and then you can do
col_used = h2o.get_frame(my_training_frame)
col_used
EDITED (after comment was posted)
To get the columns use:
col_used.columns
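Putting those steps together (the model variable name is whatever you used when training):

# Recover the training frame a model was built on and list its column names.
frame_id = your_model.actual_params['training_frame']   # returns a frame id string
col_used = h2o.get_frame(frame_id)
print(col_used.columns)                                  # includes the response column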
Also, a quick way to check the version of a saved binary model is to try to load it into H2O: if it loads, it is the same version of H2O; if it isn't, you will get a warning.
You can also open the saved model file; the first line will list the version of H2O used to create it.
For a model saved as a MOJO, you can look at the model.ini file. It will list the version of H2O.
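Since a MOJO is just a zip archive, you can read model.ini directly; a small sketch (the file name is illustrative):

# Print model.ini from a MOJO zip to see which H2O version produced it.
import zipfile
with zipfile.ZipFile("my_gbm.zip") as mojo:
    print(mojo.read("model.ini").decode())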
I'd like to create a Bot using the FormFlow with JSON Schema approach. However, I need a bit more flexibility for displaying the answer options, since those need to be whole sentences and not only single words.
Is it possible to extend the enums specified inside the JSON file with descriptions that will be offered as options instead of the enum itself?
As I understand it, this is possible in code by using the Describe attribute.
You could use the "Define" property with a custom script. The Sandwich Bot example does it this way (from the json-schema example):
"Define": "field.SetType(null).AddDescription(\"cookie\", DynamicSandwich.FreeCookie).AddTerms(\"cookie\", Language.GenerateTerms(DynamicSandwich.FreeCookie, 2)).AddDescription(\"drink\", DynamicSandwich.FreeDrink).AddTerms(\"drink\", Language.GenerateTerms(DynamicSandwich.FreeDrink, 2)); return true;",
Is it possible to perform a NOT type query with chained methods using postgres_ext?
rules = Rule.where.overlap(:tags => ["foo"])
Basically, I want the inverse of the above. Thanks!
In regular ActiveRecord you can use .where.not, as described in this article: https://robots.thoughtbot.com/activerecords-wherenot. However, looking through the source code of postgres_ext, I don't know whether it is defined in that library. You may be able to construct your query in a way that uses the native ActiveRecord methods.
We would like to perform a spatial search on one geo field but distance sort the results based on a second geo field. It seems that Solr supports this for the LatLonType. Here we simply add parameters to the geodist function.
The geodist(param1,param2,param3) function supports (optional) parameters:
param1: the sfield
param2: the latitude (pt)
param3: the longitude (pt)
Unfortunately, this doesn't seem to work with the SpatialRecursivePrefixTreeFieldType. However, we have to use SpatialRecursivePrefixTreeFieldType since we have several locations for each document and this is not supported for the LatLonType. Is there any solution other than writing our own field type?
Finally, I figured it out. However, the solution is a bit of a hack. I've created a plugin jar that contains a modified version of the GeoDistValueSourceParser class. Within this class, I've modified the parseSfield method to simply use a constant sfield, which is then used for sorting. Then I hooked the class up by adding the line
<valueSourceParser name="customdist" class="bla.search.function.distance.CustomGeoDistValueSourceParser"/>
to the solrconfig.xml. So far, I don't understand why GeoDistValueSourceParser isn't configurable; it shouldn't be too difficult to write it in a way that a different geo field can be specified for sorting.