Elastic search csv river module not working - elasticsearch

I am trying to index csv file in elasticsearch
curl -XPUT localhost:9200/_river/my_csv_river/_meta -d '
{
"type" : "csv",
"csv_file" : {
"folder" : "/tmp",
"filename_pattern" : ".*\\.csv$",
"first_line_is_header" : "true",
"field_separator" : ",",
"field_id" : "_id"
},
"index" : {
"index" : "my_csv_data_1",
"type" : "csv_type_1",
"bulk_size" : 100,
"bulk_threshold" : 10
}
}'
after indexing while searching http://localhost:9200/my_csv_data_1/_search
got
{
error: "IndexMissingException[[my_csv_data_1] missing]",
status: 404
}
any thoughts or i missed any thing?

Related

Elasticsearch does not found an existing document using a the DSL

I dont know why, using the URI Search way to search a document is returning the right document, but the document is not found if I use the API DSL.
To reproduce the issue:
Without any index created, I insert this document:
curl http://localhost:9299/integrationtest-index/searchable/ID_XXXX2 -d '{ "ref" : "XXXX2", "field1" : "value1" }'
So the index is created automatically with the default mapping (type searchable):
curl http://localhost:9299/integrationtest-index?pretty
{
"integrationtest-index" : {
"aliases" : { },
"mappings" : {
"searchable" : {
"properties" : {
"field1" : {
"type" : "string"
},
"ref" : {
"type" : "string"
}
}
}
},
"settings" : {
"index" : {
"field1" : "value1",
"ref" : "XXXX2",
"number_of_shards" : "5",
"creation_date" : "1466780216631",
"number_of_replicas" : "1",
"uuid" : "GBj2VF-wQy6JP74AqoIn5g",
"version" : {
"created" : "2020099"
}
}
},
"warmers" : { }
}
}
This query return one document:
curl http://localhost:9299/integrationtest-index/searchable/_search?q=ref:XXXX2
But this other query response that does not exist:
curl -XPOST http://localhost:9299/integrationtest-index/searchable/_search/exists -d '
{
"query": {
"term" : {
"ref" : "XXXX2"
}
}
}'
Why the last query said that the document does not exist?
Environment:
ElasticSearch 2.2.0
Ubuntu 16.04 LTS
OpenJDK Runtime Environment (build 1.8.0_91-8u91-b14-0ubuntu4~16.04.1-b14)
I have the same problem every few months, so I decided response myself and share my stupids errors.
By default, elasticsearch use index:analyzed, so the query with term does not found any document.
If you use the URI Search way, elasticsearch is executing a query_string and not a term query.
This query is working:
curl -XPOST http://localhost:9299/integrationtest-index/searchable/_search/exists -d '
{
"query": {
"match" : {
"ref" : "XXXX2"
}
}
}'
More information in the documentation, in the section Why doesn’t the term query match my document?

Elasticsearch river - no _meta document found after 5 attempts

I am using elasticsearch version 1.3.0. when I create a river using wikipedia plugin version 2.3.0 as thus
PUT _river/my_river/_meta -d
{
"type" : "wikipedia",
"wikipedia" : {
"url" : "http://download.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2"
},
"index" : {
"index" : "wikipedia",
"type" : "wiki",
"bulk_size" : 1000,
"max_concurrent_bulk" : 3
}
}
the server responds with this message
{
"_index": "_river",
"_type": "my_river",
"_id": "_meta -d",
"_version": 1,
"created": true
}
however, I don't see the wikipedia documents when I run a search. also, when I restart my server I get river-routing no _meta document found after 5 attempts
Remove the -d at the end as it creates a document named _meta -d and not _meta.
PUT _river/my_river/_meta
{
"type" : "wikipedia",
"wikipedia" : {
"url" : "http://download.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2"
},
"index" : {
"index" : "wikipedia",
"type" : "wiki",
"bulk_size" : 1000,
"max_concurrent_bulk" : 3
}
}

Elastic search : Snapshot not creating

I am runnign this command
http://localhost:9200/_snapshot/titan_es_backup?pretty
output
{
"titan_es_backup" : {
"type" : "fs",
"settings" : {
"compress" : "true",
"location" : "/home/manish/Softwares/DataBackUp/ES"
}
}
}
again this command
http://localhost:9200/_snapshot/titan_es_backup/snapshot_1?wait_for_completion=false
OUTPUT
{"error":"SnapshotMissingException[[titan_es_backup:snapshot_1] is missing]; nested: FileNotFoundException[/home/manish/Softwares/DataBackUp/ES/snapshot-snapshot_1 (No such file or directory)]; ","status":404}
Where I am going wrong ?

Elasticsearch How do I get a metadata using Image Plugin

I defined matadata by the mapping of the Elasticsearch image Plugin.
Mapping:
"photo" : {
"mappings" : {
"scenery" : {
"properties" : {
"my_img" : {
"type" : "image",
"feature" : {"FCTH" : { }, ... },
"metadata" : {
"jpeg.image_height" : {"type" : "string","store" : true},
"jpeg.image_width" : {"type" : "string","store" : true}
}
}
}
}
}
}
After an index, although searched, metadata does not return.
How do I get a metadata?
I tried:
curl -XPOST 'localhost:9200/photo/scenery/_search' -d '{
"query":{
"image":{
"my_img":{
"feature":"CEDD",
"index":"photo",
"type":"scenery",
"id":"0",
"path":"my_img",
"hash":"BIT_SAMPLING"
}
}
}
}'
Result:
{"took":14,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":5,"max_score":1.0,"hits":[{"_index":"photo","_type":"scenery","_id":"0","_score":1.0, "_source" : {"file_name": "376423.jpg", "my_img": "/9j/4AAQSkZJRgABAQ...
Perhaps, the original data (base64 encoded image) will be returned _source field. You can use that instead, the fields option.
Try this query.
curl -XPOST 'localhost:9200/photo/scenery/_search' -d '{
"query":{
...
},
"fields": ["my_img.metadata.jpeg.image_height","my_img.metadata.jpeg.image_width" ]
}'

Specify Routing on Index Alias's Term Lookup Filter

I am using Logstash, ElasticSearch and Kibana to allow multiple users to log in and view the log data they have forwarded. I have created index aliases for each user. These restrict their results to contain only their own data.
I'd like to assign users to groups, and allow users to view data for the computers in their group. I created a parent-child relationship between the groups and the users, and I created a term lookup filter on the alias.
My problem is, I receive a RoutingMissingException when I try to apply the alias.
Is there a way to specify the routing for the term lookup filter? How can I lookup terms on a parent document?
I posted the mapping and alias below, but a full gist recreation is available at this link.
curl -XPUT 'http://localhost:9200/accesscontrol/' -d '{
"mappings" : {
"group" : {
"properties" : {
"name" : { "type" : "string" },
"hosts" : { "type" : "string" }
}
},
"user" : {
"_parent" : { "type" : "group" },
"_routing" : { "required" : true, "path" : "group_id" },
"properties" : {
"name" : { "type" : "string" },
"group_id" : { "type" : "string" }
}
}
}
}'
# Create the logstash alias for cvializ
curl -XPOST 'http://localhost:9200/_aliases' -d '
{
"actions" : [
{ "remove" : { "index" : "logstash-2014.04.25", "alias" : "cvializ-logstash-2014.04.25" } },
{
"add" : {
"index" : "logstash-2014.04.25",
"alias" : "cvializ-logstash-2014.04.25",
"routing" : "intern",
"filter": {
"terms" : {
"host" : {
"index" : "accesscontrol",
"type" : "user",
"id" : "cvializ",
"path" : "group.hosts"
},
"_cache_key" : "cvializ_hosts"
}
}
}
}
]
}'
In attempting to find a workaround for this error, I submitted a bug to the ElasticSearch team, and received an answer from them. It was a bug in ElasticSearch where the filter is applied before the dynamic mapping, causing some erroneous output. I've included their workaround below:
PUT /accesscontrol/group/admin
{
"name" : "admin",
"hosts" : ["computer1","computer2","computer3"]
}
PUT /_template/admin_group
{
"template" : "logstash-*",
"aliases" : {
"template-admin-{index}" : {
"filter" : {
"terms" : {
"host" : {
"index" : "accesscontrol",
"type" : "group",
"id" : "admin",
"path" : "hosts"
}
}
}
}
},
"mappings": {
"example" : {
"properties": {
"host" : {
"type" : "string"
}
}
}
}
}
POST /logstash-2014.05.09/example/1
{
"message":"my sample data",
"#version":"1",
"#timestamp":"2014-05-09T16:25:45.613Z",
"type":"example",
"host":"computer1"
}
GET /template-admin-logstash-2014.05.09/_search

Resources