Elasticsearch get by id doesn't work but document exists - elasticsearch

I'm seeing weird behaviours with ids on elasticsearch 1.2.0 (recently upgraded from 1.0.1).
A search retrieves my document, showing the correct value for _id:
[terminal]
curl 'myServer:9200/global/_search?q=someField:something'
result is
{
  "took": 79,
  "timed_out": false,
  "_shards": {
    "total": 12,
    "successful": 12,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 17.715034,
    "hits": [
      {
        "_index": "global",
        "_type": "user",
        "_id": "7a113e4f-44de-3b2b-a3f1-fb881da1b00a",
        ...
      }
    ]
  }
}
But a direct lookup by id doesn't:
[terminal]
curl 'myServer:9200/global/user/7a113e4f-44de-3b2b-a3f1-fb881da1b00a'
result is
{
  "_index": "global",
  "_type": "user",
  "_id": "7a113e4f-44de-3b2b-a3f1-fb881da1b00a",
  "found": false
}
This seems to affect documents that have previously been updated using custom scripting.
Any ideas?

I think you should upgrade to 1.2.1.
According to the release notes (http://www.elasticsearch.org/blog/elasticsearch-1-2-1-released/), there are some known problems, especially with get:
`There was a routing bug in Elasticsearch 1.2.0 that could have a number of bad side effects on the cluster. Possible side effects include:
documents that were indexed prior to the upgrade to 1.2.0 may not be accessible via get. A search would find these documents, but not a direct get of the document by ID.`
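If upgrading alone doesn't make previously indexed documents retrievable again, one workaround is to re-index each affected document under the same id, since search still finds them. A minimal sketch using the document above (the -d body is a placeholder for the hit's full _source):
[terminal]
# fetch the document via search, which still works:
curl 'myServer:9200/global/_search?q=someField:something'
# re-index the hit's _source under the same id so get-by-id works again:
curl -XPUT 'myServer:9200/global/user/7a113e4f-44de-3b2b-a3f1-fb881da1b00a' -d '{"someField": "something"}'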

Related

Elastic search storage: How to get the list of field names under _source?

I am very new to using Elasticsearch storage and am looking for a way to list all the fields under _source. So far, I have come across ways to find the values of the different fields defined under _source, but not a way to list all the field names. For example, I have the document below:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_product",
        "_type": "_doc",
        "_id": "B2LcemUBCkYSNbJBl-G_",
        "_score": 1,
        "_source": {
          "email": "123#abc.com",
          "product_0": "iWLKHmUBCkYSNbJB3NZR",
          "product_price_0": "10",
          "link_0": ""
        }
      }
    ]
  }
}
So, from the above example, I would like to get the field names email, product_0, product_price_0 and link_0, which are under _source. I have been retrieving the values by parsing the array returned from the ES API, but what should go in place of the ? to get the field names: $result['hits']['hits'][0]['_source'][?]
Note: I am using PHP to insert data into Elasticsearch and retrieve data from it.
If I understood correctly, you need array_keys:
array_keys($result['hits']['hits'][0]['_source'])
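For the sample hit above, this yields the four field names; a quick sketch, assuming $result holds the decoded response (e.g. from json_decode($response, true)):
$fieldNames = array_keys($result['hits']['hits'][0]['_source']);
print_r($fieldNames);
// Array ( [0] => email [1] => product_0 [2] => product_price_0 [3] => link_0 )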

How to insert data to elastic search from search query

I am trying to copy some data from one Elasticsearch database to another. Is there any way to insert data from query results?
Example of results:
{
  "took": 29,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "hits": {
    "total": 572,
    "max_score": 1,
    "hits": [
      {
        "_index": "ref",
        "_type": "dic",
        "_id": "12345",
        "_score": 1,
        "_source": {
          "name": "Test name"
        }
      },
      ...
    ]
  }
}
The mapping is the same in both databases.
I forked a project which generates bulk data for Elasticsearch: json-to-es-bulk.
It is compatible with ES version 5.6, and you can use it in two variants:
`node index.js -f inputdata.json --index newIndexName --type newIndexType --rewrite true`
or
`node index.js -f inputdata.json --index --type --rewrite false`
After running it, you'll see a file request-data.txt; just use its content with:
POST /_bulk
[request-data.txt content]
The input data JSON file must contain an array of search hits, like this:
[
  {
    "_index": "oldIndexName",
    "_type": "oldIndexName",
    "_id": "SOME_ID-HiTfm",
    "_score": 1,
    "_source": {
      "orderNumber": "2984",
      "refId": "SOME_VALUE"
    }
  },
  ...
]
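For reference, the generated request-data.txt follows the standard _bulk NDJSON format: one action line followed by one source line per document (plus a trailing newline at the end). A sketch based on the sample hit above:
POST /_bulk
{ "index": { "_index": "newIndexName", "_type": "newIndexType", "_id": "SOME_ID-HiTfm" } }
{ "orderNumber": "2984", "refId": "SOME_VALUE" }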

Elasticsearch 2.3 - delete documents by query

I'm using elasticsearch 2.3 & Sense and trying to delete documents by query.
I refer to these docs:
https://www.elastic.co/guide/en/elasticsearch/plugins/current/delete-by-query-usage.html
Request
DELETE /monitors/monitor/_query
{
  "term": { "ProcessName" : "myProcName" }
}
Response
{
  "found": false,
  "_index": "monitors",
  "_type": "monitor",
  "_id": "_query",
  "_version": 11,
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  }
}
As you can see, I'm not getting any results even though I have documents with ProcessName "myProcName".
The response also shows that the engine treated _query as a document _id.
EDIT 1:
Even when sending request:
DELETE /monitors/monitor/_query
{
  "query": {
    "term": { "ProcessName" : "tibapp_qflowfile" }
  }
}
I'm getting response:
{
  "found": false,
  "_index": "monitors",
  "_type": "monitor",
  "_id": "_query",
  "_version": 1,
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  }
}
The output you're getting means that you haven't installed the delete-by-query plugin, which isn't installed by default.
Install it first, restart your node, and it will work afterwards:
bin/plugin install delete-by-query
FYI - The Plugin [delete-by-query] is incompatible with Elasticsearch [2.3.5]. Was designed for version [2.3.4]
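If the version mismatch blocks the plugin install, a common workaround is to collect the matching ids with a search and delete them through the bulk API instead. A rough sketch (for large result sets you would page with the scroll API):
[terminal]
# 1. collect the ids of matching documents (every hit carries its _id):
curl -XGET 'localhost:9200/monitors/monitor/_search?q=ProcessName:myProcName&_source=false&size=1000'
# 2. delete each id, one bulk action line per document:
curl -XPOST 'localhost:9200/_bulk' -d '
{ "delete": { "_index": "monitors", "_type": "monitor", "_id": "ID_FROM_STEP_1" } }
'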

No results found in Kibana with ElasticSearch

I have set up an index in Elasticsearch, including its mapping, and added some data. When I make the GET request, I can check the contents as follows:
{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 9,
    "max_score": 1,
    "hits": [
      {
        "_index": "flights",
        "_type": "yatra",
        "_id": "AU5tQ5QxEVKx_FDBBqf9",
        "_score": 1,
        "_source": {
          "go_duration": 13.5,
          "return_arrival_time": "2015-09-26 09:55:00",
          "go_arrival_city": " NRT ",
          "return_departure_city": "NRT",
          "cost": 44594,
          "return_duration": 11.5,
          "_timestamp": "2015-07-08T19:43:42.254412",
          "return_departure_time": "2015-09-25 18:40:00",
          "return_arrival_city": " PNQ ",
          "go_departure_time": "2015-09-16 20:00:00",
          "go_arrival_time": "2015-09-17 13:20:00",
          "airline": "Jet Airways",
          "go_departure_city": "PNQ"
        }
      },
      {
        "_index": "flights",
        "_type": "yatra",
        "_id": "AU5tRPJuEVKx_FDBBqgF",
        "_score": 1,
        "_source": {
          "go_duration": 13.5,
          "return_arrival_time": "2015-09-26 09:55:00",
          "go_arrival_city": " NRT ",
          "return_departure_city": "NRT",
          "cost": 44594,
          "return_duration": 11.5,
          "_timestamp": "2015-07-08T19:45:11.917928",
          "return_departure_time": "2015-09-25 18:40:00",
          "return_arrival_city": " PNQ ",
          "go_departure_time": "2015-09-16 20:00:00",
          "go_arrival_time": "2015-09-17 13:20:00",
          "airline": "Jet Airways",
          "go_departure_city": "PNQ"
        }
      }
    ]
  }
}
Now, I have also configured Kibana to work with Elasticsearch. Following is the snapshot from Kibana.
I added "_timestamp" in Settings -> Advanced -> metaFields, then created the new index pattern with the "_timestamp" field and the "Index contains time-based events" option checked.
I have set the time filter to "Last 60 days", but I still cannot see the data. What am I missing?
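One thing worth checking is whether the _timestamp meta field is actually enabled in the index mapping; if it isn't, Kibana's time filter has nothing to match against. A quick sketch, assuming the index and type above:
[terminal]
# the mapping should show _timestamp with "enabled": true
curl 'localhost:9200/flights/_mapping/yatra?pretty'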
I had faced exactly the same issue.
Creating a new timestamp field didn't help.
So, my approach to the issue:
1. Looked at the server status to see whether it was running or not.
For me, the server was up and running.
2. I looked at the previous day's records to find out when Kibana went down.
I saw that after the latest deployment to the production environment, Kibana didn't get any logs.
3. Since the server was fine, making a new index didn't help, so I thought the problem might be with Elasticsearch. But Elasticsearch indexes the logs it gets from Logstash.
So I went into my Salt master and first checked whether all the services were running or not. They were all running. Next I stopped Logstash and Elasticsearch and killed our Java processes. After further investigating the indexes, I saw that the indexes were corrupted.
Restarting the services again worked and everything went well.
WHY DID THIS HAPPEN?
This happened because someone or something had caused an abrupt stop and restart of the instance.
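When you suspect corrupted or unhealthy indexes like this, a quick way to confirm before restarting services is to ask Elasticsearch directly; a sketch, assuming default ports:
[terminal]
# overall cluster state (green / yellow / red):
curl 'localhost:9200/_cluster/health?pretty'
# per-index health and document counts:
curl 'localhost:9200/_cat/indices?v'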

ElasticSearch mongo always 0 hits

I'm trying to make Elasticsearch run over my MongoDB server. Everything looks fine, but every query I run returns 0 hits. Always.
My installation and configuration log:
1. Installed MongoDB 2.6.4
Up and running, no problems here. I have about 7000 products inside the "products" collection.
2. Created the replica set
Confirmed with rs.status() in the Mongo shell that it's created and is the primary replica. Changed mongod.conf with replSet = rs0 and oplogSize = 100.
3. Restarted MongoDB
4. Initiated the replica set
In the Mongo shell: rs.initiate(). Everything fine.
5. Installed Elasticsearch 1.3.2
{
  "status": 200,
  "name": "Franz Kafka",
  "version": {
    "number": "1.3.2",
    "build_hash": "dee175dbe2f254f3f26992f5d7591939aaefd12f",
    "build_timestamp": "2014-08-13T14:29:30Z",
    "build_snapshot": false,
    "lucene_version": "4.9"
  },
  "tagline": "You Know, for Search"
}
6. Installed the Mapper plugin
7. Installed the River plugin
8. Created the index
curl -XPUT 'http://localhost:9200/indexprueba/products/_meta?pretty=true' -d '{
  "type": "mongodb",
  "mongodb": {
    "db": "test",
    "collection": "products"
  },
  "index": {
    "name": "probando1",
    "type": "products"
  }
}'
It returns:
{
  "_index": "indexprueba",
  "_type": "products",
  "_id": "_meta",
  "_version": 1,
  "created": true
}
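(For comparison, the river plugin's documentation registers the river under the special _river index rather than under a regular index; a sketch of that form with the same settings, not verified against this setup:)
[terminal]
curl -XPUT 'http://localhost:9200/_river/mongodb/_meta' -d '{
  "type": "mongodb",
  "mongodb": {
    "db": "test",
    "collection": "products"
  },
  "index": {
    "name": "probando1",
    "type": "products"
  }
}'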
--------EDIT---------
8.5. Restored the database
I hadn't done this yet. Once I'd created the index, I restored my database with mongorestore, and this is what I got:
connected to: 127.0.0.1:27017
2014-09-08T08:17:17.773+0000 /var/backup/bikebud/products.bson
2014-09-08T08:17:17.773+0000 going into namespace [test.products]
Restoring to test.products without dropping. Restored data will be inserted without raising errors; check your server log
6947 objects found
2014-09-08T08:17:18.043+0000 Creating index: { key: { _id: 1 }, name: "_id_", ns: "test.products" }
2014-09-08T08:17:18.456+0000 /var/backup/bikebud/retailers.bson
2014-09-08T08:17:18.457+0000 going into namespace [test.retailers]
Restoring to test.retailers without dropping. Restored data will be inserted without raising errors; check your server log
20 objects found
2014-09-08T08:17:18.457+0000 Creating index: { key: { _id: 1 }, name: "_id_", ns: "test.retailers" }
So I understand from this that my indexes are created and linked to the database.
--------EDIT---------
9. Created a simple query
curl -XGET 'http://127.0.0.1:9200/indexprueba/_search?pretty=true&q=*:*'
Always returns:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}
----------------EDIT-------------------
After the edit, this is what I get:
{
  "took": 14,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1.0,
    "hits": [
      {
        "_index": "testindex1",
        "_type": "products",
        "_id": "1",
        "_score": 1.0,
        "_source": {
          "type": "mongodb",
          "mongodb": {
            "servers": [
              {
                "host": "127.0.0.1",
                "port": 27017
              }
            ],
            "options": {
              "secondary_read_preference": true
            },
            "db": "test",
            "collection": "products"
          }
        }
      }
    ]
  }
}
So now I get a hit, but it's the index configuration document itself. I was expecting to get all the products from my database. I'm starting to think I don't understand at all what Elasticsearch does. Any clue?
----------------EDIT-------------------
I don't know what I'm missing here. Please, any advice?
----------------EDIT-------------------
It looks like it was a version problem. I had to downgrade ES to 1.2.2 (I was using 1.3.2).
"Resolved"
