How to know if X-Pack is installed in Elasticsearch? - bash

I installed Elasticsearch with the Debian package and installed X-Pack in it.
Now, I want to verify that X-Pack was successfully installed.
Is there a simple way to verify this?

You can call
GET _cat/plugins?v
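The same cat API can also be called over HTTP with curl (assuming Elasticsearch is listening on localhost:9200):
curl -XGET 'http://localhost:9200/_cat/plugins?v'
If X-Pack was installed as a plugin, it should show up in the plugin list (as x-pack or its components, depending on the version).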

X-Pack comes pre-installed from Elasticsearch 6.3 onwards. Refer to https://www.elastic.co/what-is/open-x-pack for more info on this.
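On versions where X-Pack is bundled, you can also query the X-Pack info API directly (a hedged example, assuming the node is reachable on localhost:9200):
curl -XGET 'http://localhost:9200/_xpack'
The response lists the build, the license, and each X-Pack feature with its available/enabled flags.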
You can also check whether X-Pack is installed using: curl -XGET 'http://localhost:9200/_nodes'
The relevant output snippet looks like this:
"attributes": {
"ml.machine_memory": "67447586816",
"xpack.installed": "true",
"transform.node": "true",
"ml.max_open_jobs": "512",
"ml.max_jvm_size": "27917287424"
}
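Since the full _nodes response is fairly large, you can narrow it down to the flag you care about (one simple way, using pretty-printed output and grep):
curl -s 'http://localhost:9200/_nodes?pretty' | grep '"xpack.installed"'
Any line printed means at least one node reports X-Pack as installed.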

Related

Spark REST API, submit application NullPointerException on Windows

I used my PC as the Spark server and, at the same time, as the Spark worker, using Spark 2.3.1.
At first, I used Ubuntu 16.04 LTS.
Everything worked fine: I tried to run the SparkPi example (using spark-submit and spark-shell) and it ran without problems.
I also try to run it using REST API from Spark, with this POST string:
curl -X POST http://192.168.1.107:6066/v1/submissions/create --header "Content-Type:application/json" --data '{
  "action": "CreateSubmissionRequest",
  "appResource": "file:/home/Workspace/Spark/spark-2.3.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.1.jar",
  "clientSparkVersion": "2.3.1",
  "appArgs": [ "10" ],
  "environmentVariables" : {
    "SPARK_ENV_LOADED" : "1"
  },
  "mainClass": "org.apache.spark.examples.SparkPi",
  "sparkProperties": {
    "spark.jars": "file:/home/Workspace/Spark/spark-2.3.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.1.jar",
    "spark.driver.supervise": "false",
    "spark.executor.memory": "512m",
    "spark.driver.memory": "512m",
    "spark.submit.deployMode": "cluster",
    "spark.app.name": "SparkPi",
    "spark.master": "spark://192.168.1.107:7077"
  }
}'
After testing this and that, I had to move to Windows, since it will be done on Windows anyway.
I was able to run the server and worker (manually), add winutils.exe, and run the SparkPi example using spark-shell and spark-submit; everything ran fine there too.
The problem appears when I use the REST API, with this POST string:
curl -X POST http://192.168.1.107:6066/v1/submissions/create --header "Content-Type:application/json" --data '{
  "action": "CreateSubmissionRequest",
  "appResource": "file:D:/Workspace/Spark/spark-2.3.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.1.jar",
  "clientSparkVersion": "2.3.1",
  "appArgs": [ "10" ],
  "environmentVariables" : {
    "SPARK_ENV_LOADED" : "1"
  },
  "mainClass": "org.apache.spark.examples.SparkPi",
  "sparkProperties": {
    "spark.jars": "file:D:/Workspace/Spark/spark-2.3.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.1.jar",
    "spark.driver.supervise": "false",
    "spark.executor.memory": "512m",
    "spark.driver.memory": "512m",
    "spark.submit.deployMode": "cluster",
    "spark.app.name": "SparkPi",
    "spark.master": "spark://192.168.1.107:7077"
  }
}'
Only the path is a little different, but my worker always fails.
The logs say:
"Exception from the cluster: java.lang.NullPointerException
org.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:151)
org.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:173)
org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:92)"
I searched, but no solution has come up yet.
So, finally, I found the cause.
I read the source from:
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala
From inspecting it, I concluded that the problem is not in Spark itself; the parameter was simply not being read correctly, which means I used the wrong parameter format.
So, after trying out several things, this one is the right one:
"appResource": "file:D:/Workspace/Spark/spark-2.3.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.1.jar"
changed to:
"appResource": "file:///D:/Workspace/Spark/spark-2.3.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.1.jar"
And I did the same with the spark.jars parameter.
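In other words, the two corrected fields in the request body end up as (same values as above, only the URI scheme changed):
"appResource": "file:///D:/Workspace/Spark/spark-2.3.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.1.jar"
"spark.jars": "file:///D:/Workspace/Spark/spark-2.3.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.1.jar"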
That little difference cost me almost 24 hours of work...

Rasa NLU failing to classify intent

I'm running rasa-nlu in a Docker container.
I'm trying to train it on my data and then perform requests to the HTTP server, which always result in the following:
"intent": { "confidence": 1.0, "name": "None" }
I'm using a config file as follows:
{
  "name": null,
  "pipeline": "mitie",
  "language": "en",
  "num_threads": 4,
  "max_training_processes": 1,
  "path": "./models",
  "response_log": "logs",
  "config": "config.json",
  "log_level": "INFO",
  "port": 5000,
  "data": "./data/test/demo-rasa.json",
  "emulate": null,
  "log_file": null,
  "mitie_file": "./data/total_word_feature_extractor.dat",
  "spacy_model_name": null,
  "server_model_dirs": null,
  "token": null,
  "cors_origins": [],
  "aws_endpoint_url": null,
  "max_number_of_ngrams": 7,
  "duckling_dimensions": null,
  "entity_crf_BILOU_flag": true,
  "entity_crf_features": [
    ["low", "title", "upper", "pos", "pos2"],
    ["bias", "low", "word3", "word2", "upper", "title", "digit", "pos", "pos2", "pattern"],
    ["low", "title", "upper", "pos", "pos2"]]
}
What's the reason for that behaviour?
The models folder contains the trained model inside another nested folder; is that ok?
Thanks.
I already saw your GitHub issue; thanks for providing a bit more information here. You're still leaving a lot of details about the Docker container ambiguous, though.
A few others and I got a pull request merged into the rasa repo, available here on Docker Hub. There are several different builds now available, and the basic usage instructions can be found below or in the main repo README.
General Docker Usage Instructions
For the time being, though, follow the steps below:
docker run -p 5000:5000 rasa/rasa_nlu:latest-mitie
The demo data should already be loaded and can be parsed with the command below:
curl 'http://localhost:5000/parse?q=hello'
Trying to troubleshoot your specific problem
As for your specific install and why it is failing, my guess is that your trained data either isn't there or has a name that rasa doesn't expect. Run this command to see what models are available:
curl 'http://localhost:5000/status'
Your response should be something like:
{
  "trainings_queued" : 0,
  "training_workers" : 1,
  "available_models" : [
    "test_model"
  ]
}
If you have a model listed under available_models you can load/parse it with the below command replacing test_model with your model name.
curl 'http://localhost:5000/parse?q=hello&model=test_model'
Actually, I found that using Mitie always fails; thus, the model wasn't getting updated. Thanks for the info though.
Using Mitie-Sklearn fixed the issue.
Thank you.
There are some issues with the MITIE pipeline on Windows :( . Training on MITIE takes a lot of time, whereas spaCy trains the model very quickly (2-3 minutes, depending on your processor and RAM).
Here's how I resolved it:
[Note: I am using Python 3.6.3 x64 Anaconda and Windows 8.1 OS]
Install the following packages in this order:
Spacy Machine Learning Package: pip install -U spacy
Spacy English Language Model: python -m spacy download en
Scikit Package: pip install -U scikit-learn
Numpy package for mathematical calculations: pip install -U numpy
Scipy Package: pip install -U scipy
Sklearn Package for Intent Recognition: pip install -U sklearn-crfsuite
NER Duckling for better Entity Recognition with Spacy: pip install -U duckling
RASA NLU: pip install -U rasa_nlu==0.10.4
Note that RASA v0.10.4 uses a Twisted asynchronous server, which is not WSGI compatible. (More information on this here.)
Now make the config file as follows:
{
  "project": "Travel",
  "pipeline": "spacy_sklearn",
  "language": "en",
  "num_threads": 1,
  "max_training_processes": 1,
  "path": "C:\\Users\\Kunal\\Desktop\\RASA\\models",
  "response_log": "C:\\Users\\Kunal\\Desktop\\RASA\\log",
  "config": "C:\\Users\\Kunal\\Desktop\\RASA\\config_spacy.json",
  "log_level": "INFO",
  "port": 5000,
  "data": "C:\\Users\\Kunal\\Desktop\\RASA\\data\\FlightBotFinal.json",
  "emulate": "luis",
  "spacy_model_name": "en",
  "token": null,
  "cors_origins": ["*"],
  "aws_endpoint_url": null
}
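With the config saved (as config_spacy.json, per the "config" entry above), training and serving go through the rasa_nlu module entry points. A hedged sketch for rasa_nlu 0.10.x, using the paths from the config; verify the flags against the docs for your exact version:
python -m rasa_nlu.train -c C:\Users\Kunal\Desktop\RASA\config_spacy.json
python -m rasa_nlu.server -c C:\Users\Kunal\Desktop\RASA\config_spacy.json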
Then query the server with a URL like the following template:
http://localhost:5000/parse?q=&project=
You will get a JSON response structured like the LUISResult class of BotFramework C# (because of the "emulate": "luis" setting).
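For example, with the project name from the config above ("Travel"), a concrete query could look like this (the query text is just an illustrative placeholder):
curl 'http://localhost:5000/parse?q=book+a+flight&project=Travel'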

Programmatically set Kibana's default index pattern

A Kibana newbie would like to know how to set the default index pattern programmatically, rather than setting it in the Kibana UI through a web browser when viewing Kibana for the first time, as mentioned on https://www.elastic.co/guide/en/kibana/current/setup.html
Elasticsearch stores all Kibana metadata under the .kibana index. Kibana configuration such as defaultIndex and advanced settings is stored under index/type/id .kibana/config/4.5.0, where 4.5.0 is the version of your Kibana.
So you can set up or change defaultIndex with the following steps:
Add the index you want to set as defaultIndex to Kibana. You can do that by executing the following command:
curl -XPUT http://<es node>:9200/.kibana/index-pattern/your_index_name -d '{"title" : "your_index_name", "timeFieldName": "timestampFieldNameInYourInputData"}'
Change your Kibana config to set the index added earlier as defaultIndex:
curl -XPUT http://<es node>:9200/.kibana/config/4.5.0 -d '{"defaultIndex" : "your_index_name"}'
Note: Make sure you give the correct index_name everywhere, a valid timestamp field name, and the right Kibana version; for example, if you are using Kibana 4.1.1, replace 4.5.0 with 4.1.1.
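If you want to verify the change took effect, you can read the config document back with a plain document GET (same host and version placeholders as above):
curl -XGET http://<es node>:9200/.kibana/config/4.5.0
The _source in the response should now contain your defaultIndex.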
In kibana:6.5.3 this can be achieved by calling the Kibana API.
curl -X POST "http://localhost:5601/api/saved_objects/index-pattern/logstash" -H 'kbn-xsrf: true' -H 'Content-Type: application/json' -d'
{
  "attributes": {
    "title": "logstash-*",
    "timeFieldName": "@timestamp"
  }
}
'
The docs are here; they do mention that the feature is experimental.
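Creating the index pattern does not by itself mark it as the default. Kibana also has a UI settings endpoint that provisioning scripts commonly use for that; it is an internal, undocumented API, so treat this as a hedged sketch that may change between versions:
curl -X POST "http://localhost:5601/api/kibana/settings/defaultIndex" -H 'kbn-xsrf: true' -H 'Content-Type: application/json' -d '{"value": "logstash"}'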

elasticsearch bulk script does not work, not even with elasticsearch.yml change

When I try to run a curl command like:
curl -s -XPOST localhost:9200/_bulk --data-binary "@bulk_prova.elastic"; echo
Where bulk_prova.elastic is:
{ "update" : {"_id" : "1", "_type" : "type1", "_index" : "indexName"} }{ "script" : "ctx._source.topic = \"topicValue\""}
I got this error
{"took":19872,"errors":true,"items":[{"update":{"_index":"indexName","_type":"type1","_id":"1","status":400,"error":{"type":"illegal_argument_exception","reason":"failed to execute script","caused_by":{"type":"script_exception","reason":"scripts of type [inline], operation [update] and lang [groovy] are disabled"}}}}]}
I searched for a solution and edited the elasticsearch.yml file to enable dynamic scripting, but every time I change the file and stop Elasticsearch, the elasticsearch service does not start again when I restart it.
Due to this strange behavior I do not know how to solve the issue.
I have version 2.2.0 and my intention is to add a field to one index (for now) or to more than one index (once the problem is solved).
In Elasticsearch 2.3 the setting has been changed from:
script.disable_dynamic: false
to:
script.file: true
script.indexed: true
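Since the error above specifically says that scripts of type [inline] are disabled, you will most likely also need to allow inline scripts. A hedged sketch of the relevant elasticsearch.yml lines for 2.3+ (check the exact setting names for your version, and watch the startup log for YAML syntax errors, which are a common reason the service refuses to start after an edit):
# elasticsearch.yml - enable inline and indexed scripts (ES 2.3+ style settings)
script.inline: true
script.indexed: true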

how to configure logstash with elasticsearch for Windows?

I am struggling to configure and use logstash with elasticsearch. I downloaded logstash-1.2.0-flatjar.jar and created sample.conf with this content:
input { stdin { type => "stdin-type" } }
output {
  stdout {}
  elasticsearch { embedded => true }
}
and tried to run java -jar logstash-1.2.0-flatjar.jar agent -f sample.conf, which produces:
{:fix_jar_path=>["jar:file:/C:/Users/Rajesh/Desktop/Toshiba/logstach-jar/logstash-1.2.0-flatjar.jar!/locales/en.yml"]}
log4j, [2014-04-02T22:39:28.121] WARN: org.elasticsearch.discovery.zen.ping.unicast: [Chimera] failed to send ping to [[#zen_unicast_1#][inet[localhost/127.0.0.1:9300]]]
Could anyone please help? Do I need to install plugins? Please provide the link.
Thanks in advance.
Instead of using the embedded Elasticsearch in Logstash, you can download Elasticsearch and start it as a separate instance. Please refer to this page about how to set up Elasticsearch.
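Once Elasticsearch is running on its own, the output section only needs to point at it instead of the embedded instance. A hedged sketch for Logstash 1.x (the host value assumes a local install; adjust it to wherever Elasticsearch runs):
output {
  stdout {}
  elasticsearch { host => "127.0.0.1" }
}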
