I am trying to change the policy in RASA from max_history: 5 to FullDialogueTrackerFeaturizer - rasa-nlu

I am new to using Rasa and I was able to follow this guide:
https://rasa.com/docs/rasa/user-guide/installation/
to set up Rasa on Ubuntu 18.04 on Windows. I am now following a second guide:
https://rasa.com/docs/rasa/user-guide/rasa-tutorial/
and am at step 3. Running the code in step 3 on Ubuntu came back with the following result:
# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: en
pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100

# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/core/policies/
policies:
  - name: MemoizationPolicy
  - name: TEDPolicy
    max_history: 5
    epochs: 100
  - name: MappingPolicy
Near the bottom, where it lists max_history: 5, I would like to change the policy to FullDialogueTrackerFeaturizer so that the entire dialogue is considered for the bot's predictions. I have tried reading this article:
https://rasa.com/docs/rasa/api/core-featurization/#featurization-conversations
While the article explains what the featurizer does, I could not work out how to switch to it.
My question is: how do I change the policy from max_history: 5 to FullDialogueTrackerFeaturizer?

FullDialogueTrackerFeaturizer is not a policy, it's a featurizer. What you're describing is an infinite max_history. That is not a good idea: your model will become enormous and very slow. Instead, set max_history high enough to account for any patterns in your training data. If you need to keep information around for use at an arbitrary point in a conversation, set a slot instead.
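For example, a minimal sketch of what that could look like in config.yml (the value 20 is an arbitrary placeholder; pick something that covers your longest training story):
# Keep TEDPolicy and simply raise max_history instead of switching featurizers
policies:
  - name: MemoizationPolicy
  - name: TEDPolicy
    max_history: 20
    epochs: 100
  - name: MappingPolicy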

Related

How to add Elastic APM integration from API/CMD/configuration file

I've created a docker-compose file with some configuration that deploys Elasticsearch, Kibana, and Elastic Agent, all version 8.7.0. In the Kibana configuration file I define the policies I need under xpack.fleet.agentPolicies; with a single command my whole environment comes up and all components connect successfully.
The only issue is one manual step: I had to go to Kibana -> Observability -> APM -> Add Elastic APM and then fill in the server configuration.
I want to automate this and manage it from the API/CMD/configuration file; I don't want to do it from the UI.
What is the way to do this? In which component? What path should the configuration live at?
I tried to look for APIs or commands to do that, but with no luck. I'm expecting help with automating the remaining step.
Update 1
I've tried to add it as below, but I still can't see the integration added:
package_policies:
  - name: fleet_server-apm
    id: default-fleet-server
    package:
      name: fleet_server
    inputs:
      - type: apm
        enabled: true
        vars:
          - name: host
            value: "0.0.0.0:8200"
          - name: url
            value: "http://0.0.0.0:8200"
          - name: enable_rum
            value: true
            frozen: true
Tl;dr:
Yes, I believe there is a way to do it, but I am pretty sure this is poorly documented. You can find some ideas in the apm-server repository.
Solution
In the kibana.yml file you can add some settings related to Fleet. The section below is taken from the repository above and helped me set up APM automatically. If you have specific settings you would like to enable, though, I am unsure where you would provide them:
xpack.fleet.packages:
  - name: fleet_server
    version: latest
xpack.fleet.agentPolicies:
  - name: Fleet Server (APM)
    id: fleet-server-apm
    is_default_fleet_server: true
    is_managed: false
    namespace: default
    package_policies:
      - name: fleet_server-apm
        id: default-fleet-server
        package:
          name: fleet_server
It is true that the Kibana Fleet API is very poorly documented at the moment. I think your problem is that you are trying to add the variables to the fleet_server package instead of the apm package. Your YAML should look like this:
package_policies:
  - name: fleet_server-apm
    id: default-fleet-server
    package:
      name: fleet_server
  - name: apm-1
    package:
      name: apm
    inputs:
      - type: apm
        keep_enabled: true
        vars:
          - name: host
            value: 0.0.0.0:8200
            frozen: true
          - name: url
            value: "http://0.0.0.0:8200"
            frozen: true
          - name: enable_rum
            value: true
            frozen: true

Can't checkout the same repo multiple times in a pipeline

I have self-hosted agents on multiple environments that I am trying to run the same build/deploy processes on. I would like to be able to deploy the same code from a single repo to multiple systems concurrently. Thus, I have created an "overhead" pipeline, and several "processes" pipeline templates. Everything seems to be going very well, except for when I try to perform checkouts of the same repo twice in the same pipeline execution. I get the following error:
An error occurred while loading the YAML build pipeline. An item with the same key has already been added.
I would really like to be able to just click ONE button to trigger a main pipeline that calls all the templates required and passes the parameters needed to get all my jobs done at once. I could of course define this "overhead" pipeline and then queue up as many instances of it as I need per system that I need to deploy to, but I'm lazy, hence why I'm using pipelines!
As soon as I remove the checkout from Common.yml, the validation succeeds without any issues. If I keep the checkout in there but only call Common.yml once for the entire Overhead pipeline, then it also succeeds without any issues. But the problem is: I need to pull the contents of the repo to EACH of my agents, which run in completely separate environments that are in no way able to talk to each other (I can't pull the information to one agent and have it do some sort of "copy" to all the other agent locations).
Any assistance is very much welcomed, thank you!
The following is my "overhead" pipeline:
# azure-pipelines.yml
trigger: none

parameters:
  - name: vLAN
    type: string
    default: 851
    values:
      - 851
      - 1105

stages:
  - stage: vLAN851
    condition: eq('${{ parameters.vLAN }}', '851')
    pool:
      name: xxxxx
      demands:
        - vLAN -equals 851
    jobs:
      - job: Common_851
        steps:
          - template: Procedures/Common.yml
      - job: Export_851
        dependsOn: Common_851
        steps:
          - template: Procedures/Export.yml
            parameters:
              Server: ABTS-01
  - stage: vLAN1105
    condition: eq('${{ parameters.vLAN }}', '1105')
    pool:
      name: xxxxx
      demands:
        - vLAN -equals 1105
    jobs:
      - job: Common_1105
        steps:
          - template: Procedures/Common.yml
      - job: Export_1105
        dependsOn: Common_1105
        steps:
          - template: Procedures/Export.yml
            parameters:
              Server: OTS-01
And here is the "Procedures/Common.yml":
steps:
  - checkout: git://xxxxx/yyyyy#$(Build.SourceBranchName)
    clean: true
    enabled: true
    timeoutInMinutes: 1
  - task: UsePythonVersion@0
    enabled: true
    timeoutInMinutes: 1
    displayName: Select correct version of Python
    inputs:
      versionSpec: '3.8'
      addToPath: true
      architecture: 'x64'
  - task: CmdLine@2
    enabled: true
    timeoutInMinutes: 5
    displayName: Ensure Python Requirements Installed
    inputs:
      script: |
        python -m pip install GitPython
And here is the "Procedures/Export.yml":
parameters:
  - name: Server
    type: string

steps:
  - task: PythonScript@0
    enabled: true
    timeoutInMinutes: 3
    displayName: xxxxx
    inputs:
      arguments: --name "xxxxx" --mode True --Server ${{ parameters.Server }}
      scriptSource: 'filePath'
      scriptPath: 'xxxxx/main.py'
I managed to make checkout work with variable branch names by using template expression variables ${{ ... }} instead of macro syntax $(...) variables.
The difference is that template expressions are processed at compile time, while macros are processed at runtime.
So in my case I have something like:
- checkout: git://xxx/yyy#${{ variables.BRANCH_NAME }}
For more information about variable syntax, see: Understand variable syntax
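Applied to the Common.yml from the question, a sketch of that change could look like the following (BRANCH_NAME is an assumed pipeline variable, not something defined in the original post):
# Procedures/Common.yml -- checkout ref resolved at compile time via a template expression
steps:
  - checkout: git://xxxxx/yyyyy#${{ variables.BRANCH_NAME }}
    clean: true
    enabled: true
    timeoutInMinutes: 1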
I couldn't get it to work with expressions, but I was able to get it working using repository resources, following the documentation at: https://learn.microsoft.com/en-us/azure/devops/pipelines/repos/multi-repo-checkout?view=azure-devops
resources:
  repositories:
    - repository: MyGitHubRepo # The name used to reference this repository in the checkout step
      type: git
      name: MyAzureProjectName/MyGitRepo
      ref: $(Build.SourceBranch)

trigger:
  - main

pool:
  vmImage: 'ubuntu-latest'

# some job
steps:
  - checkout: MyGitHubRepo

# some other job
steps:
  - checkout: MyGitHubRepo
  - script: dir $(Build.SourcesDirectory)

Why is my Fallback Intent and FallbackClassifier not working in Rasa?

I have mentioned in my pipeline in the config.yml file that I will be using the FallbackClassifier.
So my code looks like:
language: en
pipeline:
  - name: FallbackClassifier
    threshold: 0.7
    ambiguity_threshold: 0.1
However, I receive this error, when I try to run it:
InvalidConfigException: The pipeline configuration contains errors. The component 'FallbackClassifier' requires 'IntentClassifier' to be placed before it in the pipeline. Please add the required components to the pipeline.
The FallbackClassifier steps in when the IntentClassifier is not confident about the intent, so you can't use the FallbackClassifier without an IntentClassifier.
You can choose an IntentClassifier from https://rasa.com/docs/rasa/components/#intent-classifiers
The simplest IntentClassifier is the KeywordIntentClassifier, but it won't be a great choice if the bot is expected to handle complicated conversations.
Here is an example of a working pipeline using the FallbackClassifier:
language: "en"
pipeline:
- name: "WhitespaceTokenizer"
- name: "CountVectorsFeaturizer"
- name: "DIETClassifier"
- name: FallbackClassifier
threshold: 0.7
ambiguity_threshold: 0.1
Your FallbackClassifier needs an IntentClassifier, which in turn needs a Featurizer, and a Featurizer requires a Tokenizer.
So the easiest way to make your FallbackClassifier work is to take the config.yml file generated when you run rasa init on your CLI. Copy that config.yml code and remove the "#" comments from the entries under "pipeline".
Your pipeline code should then look like this:
language: en
pipeline:
# # No configuration for the NLU pipeline was provided. The following default pipeline was used to train your model.
# # If you'd like to customize it, uncomment and adjust the pipeline.
# # See https://rasa.com/docs/rasa/tuning-your-model for more information.
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
    constrain_similarities: true
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
    constrain_similarities: true
  - name: FallbackClassifier
    threshold: 0.7
    ambiguity_threshold: 0.1
Now your FallbackClassifier should work like a charm!
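One more note, as an assumption based on how FallbackClassifier is typically wired up in Rasa 2.x/3.x rather than something from the answer above: when confidence drops below the threshold, the classifier predicts the nlu_fallback intent, so you also need a rule (and a matching response in domain.yml) to react to it, along these lines:
# rules.yml -- minimal sketch; utter_please_rephrase is a hypothetical response name
rules:
  - rule: Ask the user to rephrase when NLU confidence is low
    steps:
      - intent: nlu_fallback
      - action: utter_please_rephrase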

Why doesn't the metricbeat index name change each day?

I am using a Metricbeat (7.3) docker container alongside several other docker containers, and sending the results to an Elasticsearch (7.3) instance. This works, and the first time everything spins up I get an index in Elasticsearch called metricbeat-7.3.1-2019.09.06-000001.
The initial problem is that I have a Grafana dashboard set up to look for an index with today's date, so it seems to ignore one created several days ago altogether. I could try to figure out what's wrong with those Grafana queries, but more generally I need those index names to roll at some point - the index that's there is already over 1.3GB, and at some point that will just be too big for the system.
My initial metricbeat.yml config:
- module: docker
  metricsets:
    - "container"
    - "cpu"
    - "diskio"
    - "memory"
    - "network"
  hosts: ["unix:///var/run/docker.sock"]
  period: 10s
  enabled: true

output.elasticsearch:
  hosts: ["${ELASTICSEARCH_URL}"]
Searching around a bit, it seems like the index field on the elasticsearch output should configure the index name, so I tried the following:
- module: docker
  metricsets:
    - "container"
    - "cpu"
    - "diskio"
    - "memory"
    - "network"
  hosts: ["unix:///var/run/docker.sock"]
  period: 10s
  enabled: true

output.elasticsearch:
  hosts: ["${ELASTICSEARCH_URL}"]
  index: "metricbeat-%{[beat.version]}-instance1-%{+yyyy.MM.dd}"
That throws an error about needing setup.template settings, so I settled on this:
- module: docker
  metricsets:
    - "container"
    - "cpu"
    - "diskio"
    - "memory"
    - "network"
  hosts: ["unix:///var/run/docker.sock"]
  period: 10s
  enabled: true

output.elasticsearch:
  hosts: ["${ELASTICSEARCH_URL}"]
  index: "metricbeat-%{[beat.version]}-instance1-%{+yyyy.MM.dd}"

setup.template:
  overwrite: true
  name: "metricbeat"
  pattern: "metricbeat-*"
I don't really know what the setup.template section does, so most of that is a guess from Google searches.
I'm not really sure if the issue is on the metricbeat side, or on the elasticsearch side, or somewhere in-between. But bottom line - how do I get them to roll the index to a new one when the day changes?
These are the settings/steps that worked for me:
metricbeat.yml file:
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["<es-ip>:9200"]
  index: metricbeat-%{[beat.version]}
  index_pattern: -%{+yyyy.MM.dd}
  ilm.enabled: true
Then, over in Kibana (i.e. :5601), go to "Stack Monitoring" and select "metricbeat-*".
Apply this kind of setting to begin with, and what follows later is self-explanatory too.
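As an alternative sketch (my own assumption, not part of the answer above): on Metricbeat 7.x the rollover is usually driven by index lifecycle management rather than by a date in the index name, so you can let Elasticsearch roll the backing index over for you with something like this in metricbeat.yml:
# ILM-based sketch (assumes Metricbeat/Elasticsearch 7.x); Metricbeat writes to a
# rollover alias and Elasticsearch creates the dated backing indices per the ILM policy
setup.ilm.enabled: true
setup.ilm.rollover_alias: "metricbeat-instance1"
setup.ilm.pattern: "{now/d}-000001"

output.elasticsearch:
  hosts: ["${ELASTICSEARCH_URL}"]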

Concourse: Resource not found?

I am trying to build a Concourse pipeline which is triggered by git, and then runs a script in that git repository.
This is what I have so far:
resources:
  - name: component_structure_git
    type: git
    source:
      branch: master
      uri: git@bitbucket.org:foo/bar.git

jobs:
  - name: component_structure-docker
    serial: true
    plan:
      - aggregate:
          - get: component_structure_git
            trigger: true
      - task: do-something
        config:
          platform: linux
          image_resource:
            type: docker-image
            source: { repository: ubuntu }
          inputs:
            - name: component_structure_git
          outputs:
            - name: updated-gist
          run:
            path: component_structure_git/run.sh
      - put: component_structure-docker
        params:
          build: component_structure/concourse
  - name: component_structure-deploy-test
    serial: true
    plan:
      - aggregate:
          - get: component_structure-docker
            passed: [component_structure-docker]
  - name: component_structure-deploy-prod
    serial: true
    plan:
      - aggregate:
          - get: component_structure-docker
            passed: [component_structure-docker]
When I apply this code with fly, everything is OK. When I try to run the build, it fails with the following error:
missing inputs: component_structure_git
Any idea what I am missing here?
I agree with the first answer. When running things in parallel (aggregate blocks) there are a few things to consider:
How many inputs do I have? If I have more than one, those get steps can run in an aggregate block.
If I have two tasks, is there a dependency between them that can change the outcome of a task run, e.g. an output from one task that is required in the next task? If so, they cannot run in parallel.
If I have a sequence of put statements with no dependency between them, those steps can also run in an aggregate block.
Just a guess, but aggregate is causing the issue. You can't have an input from something that is executing at the same time. Why do you have aggregate anyway? It is usually used for "get" steps to speed up the process.
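As a sketch of that suggestion (an assumption on my part, not a verified fix), the first job could drop the aggregate wrapper so the get finishes before the task that consumes it:
jobs:
  - name: component_structure-docker
    serial: true
    plan:
      - get: component_structure_git   # plain get step, no aggregate wrapper
        trigger: true
      - task: do-something
        config:
          platform: linux
          image_resource:
            type: docker-image
            source: { repository: ubuntu }
          inputs:
            - name: component_structure_git
          run:
            path: component_structure_git/run.sh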
