Graph-Aided Search Result filtering example - elasticsearch

I've replicated the Neo4j Movie database into Elasticsearch, where it is indexed under the index nodes with two types, Movie and Person. I am trying to do simple result filtering with Graph-Aided Search using this curl command:
curl -X GET localhost:9200/nodes/_search?pretty -d '{
  "query": {
    "match_all" : {}
  },
  "gas-filter": {
    "name": "SearchResultCypherfilter",
    "query": "MATCH (p:Person)-[ACTED_IN]->(m:Movie) WHERE p.name= 'Parker Posey' RETURN m.uuid as id",
    "ShouldExclude": true,
    "protocol": "bolt"
  }
}'
But I get back all 171 nodes of both types, Movie and Person, from my nodes index. According to my query, though, I should only get back the Movie nodes it matches. So it looks as if the gas-filter part is simply ignored.
I also get the same results when I set shouldExclude to false.
[UPDATE]
I tried @Tezra's suggestion: I now return only the uuid as id and use shouldExclude instead of exclude, but I still get the same results.
I am working with:
Elasticsearch 2.3.2
graph-aided-search-2.3.2.0
Neo4j-community 2.3.2.10
graphaware-uuid-2.3.2.37.7
graphaware-server-community-all-2.3.2.37
graphaware-neo4j-to-elasticsearch-2.3.2.37.1
The result that should be returned:
The uuid of the movie titled You've Got Mail.
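One way to sanity-check the Cypher side on its own is Neo4j's transactional HTTP endpoint (a sketch, assuming the stock Movie dataset and the uuid property added by graphaware-uuid):
curl -u neo4j:mypassword -H "Content-Type: application/json" \
  -XPOST http://localhost:7474/db/data/transaction/commit -d '{
  "statements": [
    { "statement": "MATCH (p:Person)-[:ACTED_IN]->(m:Movie) WHERE p.name = \"Parker Posey\" RETURN m.title, m.uuid" }
  ]
}'
If this returns the expected uuid, the problem is on the Elasticsearch/plugin side rather than in the Cypher.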
I tried to follow this tutorial for the configuration and found that index.gas.enable was set to false, so I changed it and finished the configuration exactly as in the tutorial:
mac$ curl -XPUT http://localhost:9200/nodes/_settings?index.gas.neo4j.hostname=http://localhost:7474
{"acknowledged":true}
mac$ curl -XPUT http://localhost:9200/nodes/_settings?index.gas.enable=true
{"acknowledged":true}
mac$ curl -XPUT http://localhost:9200/indexname/_settings?index.gas.neo4j.user=neo4j
{"acknowledged":true}
mac$ curl -XPUT http://localhost:9200/indexname/_settings?index.gas.neo4j.password=mypassword
{"acknowledged":true}
After that I tried to add the boltHostname and bolt.secure settings, but it didn't work and I got this error:
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Can't update non dynamic settings[[index.gas.neo4j.boltHostname]] for open indices [[nodes]]"}],"type":"illegal_argument_exception","reason":"Can't update non dynamic settings[[index.gas.neo4j.boltHostname]] for open indices [[nodes]]"},"status":400}
So I closed my index to configure it and then opened it again:
mac$ curl -XPOST http://localhost:9200/nodes/_close
{"acknowledged":true}
mac$ curl -XPUT http://localhost:9200/nodes/_settings?index.gas.neo4j.boltHostname=bolt://localhost:7687
{"acknowledged":true}
mac$ curl -XPUT http://localhost:9200/nodes/_settings?index.gas.neo4j.bolt.secure=false
{"acknowledged":true}
mac$ curl -XPOST http://localhost:9200/nodes/_open
{"acknowledged":true}
After finishing the configuration I tried the same gas-filter query again (in Postman this time instead of curl), and now I get this error:
{
  "error": {
    "root_cause": [
      {
        "type": "runtime_exception",
        "reason": "Failed to parse a search response."
      }
    ],
    "type": "runtime_exception",
    "reason": "Failed to parse a search response.",
    "caused_by": {
      "type": "client_handler_exception",
      "reason": "java.net.ConnectException: Connection refused (Connection refused)",
      "caused_by": {
        "type": "connect_exception",
        "reason": "Connection refused (Connection refused)"
      }
    }
  },
  "status": 500
}
I don't know which connection the error is referring to. I am sure I passed the correct Neo4j password in the configuration. I have even stopped and restarted both the Elasticsearch and Neo4j servers, but I still get the same error.
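One way to narrow down which connection is being refused is to probe, from the Elasticsearch host, both endpoints the plugin needs (a sketch, assuming the default ports):
# HTTP endpoint configured as index.gas.neo4j.hostname
curl -I http://localhost:7474
# Bolt endpoint configured as index.gas.neo4j.boltHostname
nc -zv localhost 7687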
The settings of my nodes index look like this:
{
  "nodes" : {
    "settings" : {
      "index" : {
        "gas" : {
          "enable" : "true",
          "neo4j" : {
            "hostname" : "http://localhost:7474",
            "password" : "neo4j.",
            "bolt" : {
              "secure" : "false"
            },
            "boltHostname" : "bolt://localhost:7687",
            "user" : "neo4j"
          }
        },
        "creation_date" : "1495531307760",
        "number_of_shards" : "5",
        "number_of_replicas" : "1",
        "uuid" : "SdrmQKhXQmyGKHmOh_xhhA",
        "version" : {
          "created" : "2030299"
        }
      }
    }
  }
}
Any ideas?

I figured out that the Connection refused exception I was getting was caused by the Wi-Fi, so I had to disconnect from the internet to make it work. I know this is not the perfect solution, so if anyone finds a better way of doing it, please share it here.

Related

Getting unknown setting [index._id] error while adding data to Elasticsearch

I have created a mapping eventlog in Elasticsearch 5.1.1 and it was added successfully. However, while adding data under it, I am getting an Illegal_argument_exception with reason unknown setting [index._id]. The result from listing the indices is:
yellow open eventlog sX9BYIcOQLSKoJQcbn1uxg 5 1 0 0 795b 795b
My mapping is:
{
  "mappings" : {
    "_default_" : {
      "properties" : {
        "datetime" : {"type": "date"},
        "ip" : {"type": "ip"},
        "country" : { "type" : "keyword" },
        "state" : { "type" : "keyword" },
        "city" : { "type" : "keyword" }
      }
    }
  }
}
and I am adding the data using
curl -u elastic:changeme -XPUT 'http://localhost:8200/eventlog' -d '{"index":{"_id":1}}
{"datetime":"2016-03-31T12:10:11Z","ip":"100.40.135.29","country":"US","state":"NY","city":"Highland"}';
If I don't include the {"index":{"_id":1}} line, I get Illegal_argument_exception with reason unknown setting [index.apiKey].
The problem arose from sending the data from the command line as a string. Keeping the data in a JSON file and sending it as binary solved it. The correct command is:
curl -u elastic:changeme -XPUT 'http://localhost:8200/eventlog/_bulk?pretty' --data-binary @eventlogs.json
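For reference, the bulk endpoint expects newline-delimited JSON, with each action line followed by its document; a minimal sketch of what eventlogs.json might contain (the type name logs and the values are only illustrative, and the file must end with a newline):
{"index":{"_type":"logs","_id":1}}
{"datetime":"2016-03-31T12:10:11Z","ip":"100.40.135.29","country":"US","state":"NY","city":"Highland"}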

Bulk delete elasticsearch

I am using Elasticsearch 2.2.
Here is the count of documents:
curl 'xxxxxxxxx:9200/_cat/indices?v'
yellow open app 5 1 28019178 5073 11.4gb 11.4gb
In the "app" index we have two types of document.
"log"
"syslog"
Now i want to delete all the documents under type "syslog".
Hence, i tried using the following command
curl -XDELETE "http://xxxxxx:9200/app/syslog"
But I am getting the following error:
No handler found for uri [/app/syslog]
I have installed the delete-by-query plugin as well. Is there any way I can do a bulk delete operation?
For now, I am deleting records by fetching the id:
curl -XDELETE "http://xxxxxx:9200/app/syslog/A121312"
It took around 5 minutes to delete 10,000 records this way, and I have more than 1,000,000 docs that need to be deleted. Please help.
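For completeness, those per-document deletes can at least be batched through the _bulk API (a sketch with hypothetical ids), although it still requires fetching every id first:
curl -XPOST 'http://xxxxxx:9200/_bulk' -d '
{"delete":{"_index":"app","_type":"syslog","_id":"A121312"}}
{"delete":{"_index":"app","_type":"syslog","_id":"A121313"}}
'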
[EDIT 1]
I ran the query below to delete the syslog-type docs:
curl -XDELETE 'http://xxxxxx:9200/app/syslog/_query' -d'
{
  "query": {
    "bool": {
      "must": [
        {
          "match_all": {}
        }
      ]
    }
  }
}'
And the result is below:
{"found":false,"_index":"app","_type":"syslog","_id":"_query","_version":1,"_shards":{"total":2,"successful":1,"failed":0}}
I used a query to fetch this sample message from the index:
{
  "_index" : "app",
  "_type" : "syslog",
  "_id" : "AVckPMQnKYIebrQhF556",
  "_score" : 1.0,
  "_source" : {
    "message" : "some test message",
    "@version" : "1",
    "@timestamp" : "2016-09-13T15:49:04.562Z",
    "type" : "syslog",
    "host" : "1.2.3.4",
    "priority" : 0,
    "severity" : 0,
    "facility" : 0,
    "facility_label" : "kernel",
    "severity_label" : "Emergency"
  }
}
[EDIT 2]
Delete-by-query is listed as an installed plugin:
sudo /usr/share/elasticsearch/bin/plugin list
Installed plugins in /usr/share/elasticsearch/plugins/node1:
- delete-by-query
I had a similar problem after filling Elasticsearch with 77 million unwanted documents over the last couple of days. Setting a timeout in the query is your friend, as mentioned here. curl has a parameter to increase its timeout as well (-m 3600):
curl --request DELETE \
  --url 'http://127.0.0.1:9200/nadhled/tree/_query?timeout=60m' \
  --header 'content-type: application/json' \
  -m 3600 \
  --data '{
    "query": {
      "filtered": {
        "filter": {
          "range": {
            "timestamp": {
              "gt": 1564527660,
              "lt": 1564826247
            }
          }
        }
      }
    }
  }'
I know this is not your bulk delete, but I found this page during my research, so I'm posting it here. Hope it helps you too.
In the latest Elasticsearch (5.2), you can use _delete_by_query:
curl -XPOST "http://localhost:9200/index/type/_delete_by_query" -d'
{
"query":{
"match_all":{}
}
}'
The delete-by-query API is new and should still be considered
experimental. The API may change in ways that are not backwards
compatible
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html
I would suggest that you instead create a new index and reindex the documents you want to keep.
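On Elasticsearch 2.2 that means scan/scroll plus _bulk indexing, but from 2.3 onwards the _reindex API can do it in one call; a rough sketch, where app_v2 is a hypothetical target index:
curl -XPOST 'http://xxxxxx:9200/_reindex' -d '
{
  "source": { "index": "app", "type": "log" },
  "dest": { "index": "app_v2" }
}'
The destination index gets default settings unless you create it first with the mappings you want.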
But if you want to use delete-by-query, you should use this:
curl -XDELETE 'http://xxxxxx:9200/app/syslog/_query' -d '
{
  "query": {
    "bool": {
      "must": [
        {
          "match_all": {}
        }
      ]
    }
  }
}'
But then you'll be left with the mapping.

set default analyzer of index

First I wanted to set the default analyzer of ES, and failed. Then, following other questions and websites, I tried to set the default analyzer of a single index, but there are problems there too.
I have configured the ik analyzer, and I can set the analyzer of individual fields. Here is my command:
curl -XPUT localhost:9200/test
curl -XPUT localhost:9200/test/test/_mapping -d'{
  "test":{
    "properties":{
      "name":{
        "type":"string",
        "analyzer":"ik"
      }
    }
  }
}'
and get the message:
{"acknowledged":true}
Also, it works as I expect.
But if I try to set the default analyzer of the index:
curl -XPOST localhost:9200/test1?pretty -d '{
  "index": {
    "analysis" : {
      "analyzer" : {
        "default" : {
          "type" : "ik"
        }
      }
    }
  }
}'
I get this error message:
{
  "error" : {
    "root_cause" : [ {
      "type" : "index_creation_exception",
      "reason" : "failed to create index"
    } ],
    "type" : "illegal_argument_exception",
    "reason" : "no default analyzer configured"
  },
  "status" : 400
}
So strange, isn't it?
Looking forward to your opinions about this problem. Thanks! :)
You're almost there; you're simply missing /_settings in your path. Do it like this instead. Also note that you need to close the index first and then reopen it after updating the analyzers.
// close index
curl -XPOST 'localhost:9200/test1/_close'

// add /_settings to the path and update the analysis settings
curl -XPUT localhost:9200/test1/_settings?pretty -d '{
  "index": {
    "analysis" : {
      "analyzer" : {
        "default" : {
          "type" : "ik"
        }
      }
    }
  }
}'

// re-open index
curl -XPOST 'localhost:9200/test1/_open'
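Once the index is reopened, one way to check that the default analyzer is actually picked up is the _analyze API without naming an analyzer, so the index default is used (a sketch using the older text query parameter and some sample text):
curl -XGET 'localhost:9200/test1/_analyze?pretty&text=some+sample+text'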

Issues when replicating from couchbase bucket to elasticsearch index?

This issue seems to be related to using XDCR in Couchbase. If I have the following two simple objects:
1: { "name" : "Mark", "age" : 30}
2: { "name" : "Bill", "age" : "forty"}
and set up an Elasticsearch index as follows:
curl -XPUT 'http://localhost:9200/test/couchbaseDocument/_mapping' -d '
{
  "couchbaseDocument" : {
    "dynamic_templates": [
      {
        "store_generic": {
          "match": "*",
          "mapping": {
            "store": "yes"
          }
        }
      }
    ]
  }
}'
I can then add the two objects to this index using the REST API:
curl -XPUT localhost:9200/test/couchbaseDocument/1 -d '{
  "name" : "Mark",
  "age" : 30
}'
curl -XPUT localhost:9200/test/couchbaseDocument/2 -d '{
  "name" : "Bill",
  "age" : "forty"
}'
They are now both searchable (despite the fact that "age" is a long for one and a string for the other).
If, however, I store these two objects in a Couchbase bucket (rather than sending them straight to Elasticsearch) and set up XDCR, the first object replicates fine but the second fails with the following error:
failed to execute bulk item (index) index {[test][couchbaseDocument][2], source[{"doc":{"name":"Bill","age":"forty"},"meta":{"id":"2","rev":"8-00000b9360d0a0bf0000000000000000","expiration":0,"flags":0}}]}
org.elasticsearch.index.mapper.MapperParsingException: failed to parse [doc.age]
I can't figure out why it works via the REST API but not when Couchbase replicates the same objects.
I followed the answer and used the following mapping to get things to work via XDCR:
curl -XPUT 'http://localhost:9200/test/couchbaseDocument/_mapping' -d '
{
  "couchbaseDocument" : {
    "properties" : {
      "doc": {
        "properties" : {
          "name" : {"type" : "string", "store" : "yes"},
          "age" : {"type" : "string", "store" : "yes"}
        }
      }
    }
  }
}'
Now all the objects (despite having different types for the same fields) are replicated and searchable. I don't think there was any need to include the dynamic_templates approach I initially tried. The mapping works.
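A quick way to confirm that both documents made it across is to search the replicated index; note that the fields sit under doc because of how the transport plugin wraps the Couchbase source (a sketch):
curl -XGET 'http://localhost:9200/test/couchbaseDocument/_search?q=doc.name:Bill&pretty'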
It's something you have to solve on the Elasticsearch side.
If the same field name can contain both numeric and string values, you should create a mapping first that declares age as a string, so Elasticsearch won't try to auto-guess the type for this field.
Hope this helps.

No query registered for [match]

I'm working through some examples in the ElasticSearch Server book and trying to write a simple match query:
{
  "query" : {
    "match" : {
      "displayname" : "john smith"
    }
  }
}
This gives me the error:
{\"error\":\"SearchPhaseExecutionException[Failed to execute phase [query],
....
SearchParseException[[scripts][4]: from[-1],size[-1]: Parse Failure [Failed to parse source
....
QueryParsingException[[kb.cgi] No query registered for [match]]; }
I also tried:
{
  "match" : {
    "displayname" : "john smith"
  }
}
as per examples on http://www.elasticsearch.org/guide/reference/query-dsl/match-query/
EDIT: I think the remote server I'm using is not running the latest 0.20.5 version, because using "text" instead of "match" allows the query to work.
I've seen a similar issue reported here: http://elasticsearch-users.115913.n3.nabble.com/Character-escaping-td4025802.html
It appears the remote server I'm using is not running the latest 0.20.5 version of ElasticSearch; consequently the "match" query is not supported. Instead it is "text", which works.
I came to this conclusion after seeing a similar issue reported here: http://elasticsearch-users.115913.n3.nabble.com/Character-escaping-td4025802.html
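For reference, on those older releases the equivalent query just swaps the query name, since "text" was renamed to "match" around 0.19.9 (a sketch using the same field):
{
  "query" : {
    "text" : {
      "displayname" : "john smith"
    }
  }
}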
Your first query looks fine, but perhaps the way you use it in the request is not correct. Here is a complete example that works:
curl -XDELETE localhost:9200/test-idx
curl -XPUT localhost:9200/test-idx -d '{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 0
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "name": {
          "type": "string", "index": "analyzed"
        }
      }
    }
  }
}
'
curl -XPUT localhost:9200/test-idx/doc/1 -d '{
  "name": "John Smith"
}'
curl -XPOST localhost:9200/test-idx/_refresh
echo
curl "localhost:9200/test-idx/_search?pretty=true" -d '{
  "query": {
    "match" : {
      "name" : "john smith"
    }
  }
}
'
echo
