Search particular document id in all available indices of Elasticsearch - elasticsearch

Is there any possibility where we can search a particular document id in all available indices. /_all/_search/ returns all documents but I tried it as /_all/_search/?q=<MYID> or
/_all/_search/_id/<MYID>
but I'm not getting any documents.
If Elasticsearch does not support this, how will we achieve this task ? The use case is centralized log system based on Logstash and Elasticsearch, having multiple indices of different running services.

You can use the terms query for this. Use _all to search on all indexes.Please refer here
here is the request I used
curl -XGET "http://localhost:9200/_all/_search" -H 'Content-Type: application/json' -d'
{
"query": {
"terms": {
"_id": [
"4ea288f192e2c8b6deb3cee00d7b873b",
"dcc2b9c4fb6d14b2d41dbc5fee801af3"
]
}
}
}'
_id is the id of the document

You can use multi get api
You will need to pass the index name , it won't work on all indices
GET /_mget
{
"docs" : [
{
"_index" : "index1",
"_id" : "1"
},
{
"_index" : "index2",
"_id" : "1"
}
]
}

Related

How to Query just all the documents name of index in elasticsearch

PS: I'm new to elasticsearch
http://localhost:9200/indexname/domains/<mydocname>
Let's suppose we have indexname as our index and i'm uploading a lot of documents at <mydoc> with domain names ex:
http://localhost:9200/indexname/domains/google.com
http://localhost:9200/indexname/domains/company.com
Looking at http://localhost:9200/indexname/_count , says that we have "count": 119687 amount of documents.
I just want my elastic search to return the document names of all 119687 entries which are domain names.
How do I achieve that and is it possible to achieve that in one single query?
Looking at the example : http://localhost:9200/indexname/domains/google.com I am assuming your doc_type is domains and doc id/"document name" is google.com.
_id is the document name here which is always part of the response. You can use source filtering to disable source and it will show only something like below:
GET indexname/_search
{
"_source": false
}
Output
{
...
"hits" : [
{
"_index" : "indexname",
"_type" : "domains",
"_id" : "google.com",
"_score" : 1.0
}
]
...
}
If documentname is a field that is mapped, then you can still use source filtering to include only that field.
GET indexname/_search
{
"_source": ["documentname"]
}

Attempting to use Elasticsearch Bulk API when _id is equal to a specific field

I am attempting to bulk insert documents into an index. I need to have _id equal to a specific field that I am inserting. I'm using ES v6.6
POST productv9/_bulk
{ "index" : { "_index" : "productv9", "_id": "in_stock"}}
{ "description" : "test", "in_stock" : "2001"}
GET productv9/_search
{
"query": {
"match": {
"_id": "2001"
}
}
}
When I run the bulk statement it runs without any error. However, when I run the search statement it is not getting any hits. Additionally, I have many additional documents that I would like to insert in the same manner.
What I suggest to do is to create an ingest pipeline that will set the _id of your document based on the value of the in_stock field.
First create the pipeline:
PUT _ingest/pipeline/set_id
{
"description" : "Sets the id of the document based on a field value",
"processors" : [
{
"set" : {
"field": "_id",
"value": "{{in_stock}}"
}
}
]
}
Then you can reference the pipeline in your bulk call:
POST productv9/doc/_bulk?pipeline=set_id
{ "index" : {}}
{ "description" : "test", "in_stock" : "2001"}
By calling GET productv9/_doc/2001 you will get your document.

ES 1.5 Delete By Query API not working

I am using an old version on ElasticSearch - 1.5.
Problem: I need to delete a lot of documents, like few hundred thousands up to few millions. I have all the info about the records, including it's _ids - so array of _ids is what I want to use.
Scale problem: I had this deletion in the loop before, but ES is inconsistent when performing a lot of subsequent operations in a high speed. Thus I decided to look for a bulk delete.
I am trying to make use of delete by query API.
Docs states:
curl -XDELETE 'http://localhost:9200/twitter/tweet/_query' -d '{
"query" : {
"term" : { "user" : "kimchy" }
}
}
'
What I'm doing:
curl -XDELETE 'http://localhost:9200/my_index/logs/_query' -d '{
"query" : {
"terms" : { "_id" : ["AVTD6fhLAn35BG25xbZz", "AVTD6fhLAn35BG25xbaC"] }
}
}
'
The response is:
{
"found":false,
"_index":"my_index",
"_type":"logs",
"_id":"_query",
"_version":1,
"_shards":{"total":2, "successful":1, "failed":0}
}
And it does not remove any of documents. How do I make it work and actually delete these records?
Not sure about the delete_by_query API in elasticsearch 1.5. Seems to me that elasticsearch is unable to understand your query as it is looking for "_id": "_query" (as evident from the response you posted).
What you can do is, use the Bulk API as documented here:
https://www.elastic.co/guide/en/elasticsearch/reference/1.5/docs-bulk.html
As in the example in the doc page, you can do:
curl -s -XPOST localhost:9200/_bulk --data-binary #requests; echo
{ "delete" : { "_index" : "test", "_type" : "type1", "_id" : "2" } }
{ "delete" : { "_index" : "test", "_type" : "type1", "_id" : "3" } }
...
You need to make a file by any name ("requests" here) and add individual delete requests, each separated by a newline character.

How to retrieve all document ids (_id) in a specific index

I am trying to retrieve all documents in an index, while getting only the _id field back.
Basically I want to retrieve all the document ids I have.
While using:
{
"query": {
"match_all": {}
},
"fields": []
}
The hits I get contain: "_index", "_type", "_id" , "_score", "_source"
which is way more then I need.
Edit(Answer):
So my problem was that I used KOPF to run the queries, and the results were not accurate (got the _source and some more..)! When using curl I got the correct results!
So the above query actually achieved what I needed!
You can also use:
{
"query": {
"match_all": {}
},
"_source": false,
}
or
{
"query": {
"match_all": {}
},
"fields": ["_id"]
}
For elasticsearch, only can specific _source fields by using fields array.
_index, _type, _id, _score must will be returned by elasticsearch.
there is no way to remove them from response.
I am assuming your _id is of your document in index not of index itself.
In new version of elastic search, "_source" is used for retrieving selected fields of your es document because _source fields contains everything you need in elastic search for a es record.
Example:
Let's say index name is "movies" and type is "movie" and you want to retrieve the movieName and movieTitle of all elastic search records.
curl -XGET 'http://localhost:9200/movies/movie/_search?pretty=true' -d '
{
"query" : {
"match_all" : {}
},
"_source": ["movieName","movieTitle"]
}'
OR http://localhost:9200/movies/movie/_search?pretty=true&_source=movieName,movieTitle
By default it return 10 results. For getting n number of records then put &size=n in url

Return document on update elasticsearch

Lets say I'm updating user data
curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{
"doc" : {
"name" : "new_name"
},
"fields": ["_source"]
}'
Heres an example of what I'm getting back when I perform an update
{
"_index" : "test",
"_type" : "type1",
"_id" : "1",
"_version" : 4
}
How do I perform an update that returns the given document post update?
The documentation is a little misleading with regards to returning fields when performing an Elasticsearch update. It actually uses the same approach that the Index api uses, passing the parameter on the url, not as a field in the update.
In your case you would submit:
curl -XPOST 'localhost:9200/test/type1/1/_update?fields=_source' -d '{
"doc" : {
"name" : "new_name"
}
}'
In my testing in Elasticsearch 1.2.1 it returns something like this:
{
"_index":"test",
"_type":"testtype",
"_id":"1","_version":9,
"get": {
"found":true,
"_source": {
"user":"john",
"body":"testing update and return fields",
"name":"new_name"
}
}
}
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-index_.html
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-update.html

Resources