How to query just the document names of all documents in an Elasticsearch index

PS: I'm new to Elasticsearch.
http://localhost:9200/indexname/domains/<mydocname>
Suppose indexname is our index and I'm uploading a lot of documents at <mydocname>, using domain names as the document names, for example:
http://localhost:9200/indexname/domains/google.com
http://localhost:9200/indexname/domains/company.com
http://localhost:9200/indexname/_count reports "count": 119687, so the index holds 119687 documents.
I just want Elasticsearch to return the document names (the domain names) of all 119687 entries.
How do I achieve that, and is it possible in a single query?

Looking at the example http://localhost:9200/indexname/domains/google.com, I am assuming your doc_type is domains and the doc id ("document name") is google.com.
The _id is the document name here, and it is always part of the response. You can use source filtering to disable _source, so that each hit shows only something like the following:
GET indexname/_search
{
"_source": false
}
Output
{
...
"hits" : [
{
"_index" : "indexname",
"_type" : "domains",
"_id" : "google.com",
"_score" : 1.0
}
]
...
}
If documentname is a field that is mapped, then you can still use source filtering to include only that field.
GET indexname/_search
{
"_source": ["documentname"]
}
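One caveat the queries above do not cover: a search returns only 10 hits by default, so listing all 119687 names needs a larger size or paging, for example with the scroll API. Below is a minimal Python sketch of the bookkeeping only. It assumes the response pages have already been fetched and parsed from JSON; the function names and the page size of 1000 are illustrative and not part of any Elasticsearch client.

```python
from typing import Any, Dict, Iterable, List


def build_ids_only_body(page_size: int = 1000) -> Dict[str, Any]:
    """Search body that matches every document and omits _source.

    Meant to be sent to indexname/_search?scroll=1m (scroll API);
    the page size is an arbitrary illustrative value.
    """
    return {"size": page_size, "_source": False, "query": {"match_all": {}}}


def extract_ids(response: Dict[str, Any]) -> List[str]:
    """Return the _id of every hit in one parsed search response."""
    return [hit["_id"] for hit in response["hits"]["hits"]]


def collect_ids(pages: Iterable[Dict[str, Any]]) -> List[str]:
    """Concatenate the ids of successive pages, stopping at the first empty page."""
    ids: List[str] = []
    for page in pages:
        page_ids = extract_ids(page)
        if not page_ids:
            break
        ids.extend(page_ids)
    return ids


# Hand-written sample pages standing in for real scroll responses.
sample_pages = [
    {"hits": {"hits": [{"_id": "google.com"}, {"_id": "company.com"}]}},
    {"hits": {"hits": []}},
]
all_ids = collect_ids(sample_pages)
```

With a real cluster, each page would come from the initial scroll search and then from repeated calls to _search/scroll with the returned scroll id, until a page comes back empty.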

Related

Elasticsearch - force a field to index only, avoid store

How do I force a field to be indexed only, without storing the data? This option is available in Solr, and I'm not sure whether it's possible in Elasticsearch.
From the documentation:
By default, field values are indexed to make them searchable, but they
are not stored. This means that the field can be queried, but the
original field value cannot be retrieved.
Usually this doesn’t matter. The field value is already part of the
_source field, which is stored by default. If you only want to retrieve
the value of a single field or of a few fields, instead of the whole
_source, then this can be achieved with source filtering.
If you don't want the field to be stored in _source either, you can exclude it from _source in the mapping.
Mapping:
{
"mappings": {
"properties": {
"title":{
"type":"text"
},
"description":{
"type":"text"
}
},
"_source": {
"excludes": [
"description"
]
}
}
}
Query (the description field is still searchable because it is indexed):
GET logs/_search
{
"query": {
"match": {
"description": "b"
}
}
}
Result (the description field is not returned in _source):
"hits" : [
{
"_index" : "logs",
"_type" : "_doc",
"_id" : "-aC9V3EBkD38P4LIYrdY",
"_score" : 0.2876821,
"_source" : {
"title" : "a"
}
}
]
Note:
Removing fields from _source has downsides similar to disabling _source. Features that depend on the original document may stop working correctly:
The update, update_by_query, and reindex APIs.
On the fly highlighting.
The ability to reindex from one Elasticsearch index to another, either to change mappings or analysis, or to upgrade an index to a new major version.
The ability to debug queries or aggregations by viewing the original document used at index time.
Potentially in the future, the ability to repair index corruption automatically.
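As a rough illustration of the mapping above, here is a small Python sketch that builds the same index-creation body from a field list. The helper name is hypothetical, the "text" type for description is an assumption (the original mapping leaves it out), and the sketch only handles exact field names, not the wildcard patterns that excludes also accepts.

```python
import json
from typing import Any, Dict, List


def build_index_only_mapping(
    field_types: Dict[str, str], excluded: List[str]
) -> Dict[str, Any]:
    """Build an index body whose listed fields are indexed but left out of _source.

    Hypothetical helper: it refuses excluded names that are not mapped,
    which is stricter than Elasticsearch itself.
    """
    unknown = [name for name in excluded if name not in field_types]
    if unknown:
        raise ValueError(f"excluded fields are not mapped: {unknown}")
    return {
        "mappings": {
            "properties": {
                name: {"type": field_type}
                for name, field_type in field_types.items()
            },
            "_source": {"excludes": list(excluded)},
        }
    }


# Assumed field types; only the field names come from the answer above.
body = build_index_only_mapping(
    {"title": "text", "description": "text"}, ["description"]
)
body_json = json.dumps(body, indent=2)  # what would be sent as: PUT logs
```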

Search particular document id in all available indices of Elasticsearch

Is it possible to search for a particular document id across all available indices? /_all/_search/ returns all documents, but when I tried /_all/_search/?q=<MYID> or
/_all/_search/_id/<MYID>
I didn't get any documents.
If Elasticsearch does not support this, how can we achieve it? The use case is a centralized log system based on Logstash and Elasticsearch, with multiple indices for the different running services.
You can use the terms query for this, with _all to search across all indices.
Here is the request I used:
curl -XGET "http://localhost:9200/_all/_search" -H 'Content-Type: application/json' -d'
{
"query": {
"terms": {
"_id": [
"4ea288f192e2c8b6deb3cee00d7b873b",
"dcc2b9c4fb6d14b2d41dbc5fee801af3"
]
}
}
}'
_id is the id of the document
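Since the same id can exist in more than one index, it can help to group the hits by index afterwards. A minimal Python sketch, assuming the response has already been parsed from JSON; the helper names and the index names in the sample response are made up for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def build_id_lookup_body(doc_ids: Iterable[str]) -> Dict[str, Any]:
    """Body for /_all/_search that matches documents by _id (terms query)."""
    return {"query": {"terms": {"_id": list(doc_ids)}}}


def group_hits_by_index(response: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each index name to the ids that were found in it."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for hit in response["hits"]["hits"]:
        grouped[hit["_index"]].append(hit["_id"])
    return dict(grouped)


lookup_body = build_id_lookup_body(["4ea288f192e2c8b6deb3cee00d7b873b"])

# Hand-written sample response; the index names are hypothetical.
sample_response = {
    "hits": {
        "hits": [
            {"_index": "service-a-logs", "_id": "4ea288f192e2c8b6deb3cee00d7b873b"},
            {"_index": "service-b-logs", "_id": "4ea288f192e2c8b6deb3cee00d7b873b"},
        ]
    }
}
by_index = group_hits_by_index(sample_response)
```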
You can also use the multi get API.
You will need to pass the index name for each document; it won't work across all indices.
GET /_mget
{
"docs" : [
{
"_index" : "index1",
"_id" : "1"
},
{
"_index" : "index2",
"_id" : "1"
}
]
}
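When the candidate indices are known, the _mget body can be generated from (index, id) pairs, and the response split into found and missing documents. A minimal Python sketch under the assumption that the response is already parsed JSON; both helper names are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Tuple

DocKey = Tuple[str, str]  # (index name, document id)


def build_mget_body(pairs: Iterable[DocKey]) -> Dict[str, Any]:
    """Body for GET /_mget listing one entry per (index, id) pair."""
    return {"docs": [{"_index": index, "_id": doc_id} for index, doc_id in pairs]}


def split_found(response: Dict[str, Any]) -> Tuple[List[DocKey], List[DocKey]]:
    """Separate the documents that were found from those that were not."""
    found: List[DocKey] = []
    missing: List[DocKey] = []
    for doc in response["docs"]:
        key = (doc["_index"], doc["_id"])
        # Entries without a true "found" flag are treated as missing.
        (found if doc.get("found") else missing).append(key)
    return found, missing


mget_body = build_mget_body([("index1", "1"), ("index2", "1")])

# Hand-written sample response for illustration.
sample_response = {
    "docs": [
        {"_index": "index1", "_id": "1", "found": True, "_source": {}},
        {"_index": "index2", "_id": "1", "found": False},
    ]
}
found_docs, missing_docs = split_found(sample_response)
```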

In Elasticsearch, how to get the "_id" value of a document by providing unique content present in the document

I have a few documents ingested in Elasticsearch. A sample document is shown below.
"_index": "author_index",
"_type": "_doc",
"_id": "cOPf2wrYBik0KF", -- automatically generated by Elasticsearch after ingestion
"_score": 0.13956004,
"_source": {
"author_data": {
"author": "xyz",
"author_id": "123", -- this is a unique id for each document
"publish_year" : "2016"
}
}
Is there a way to get the auto-generated _id by sending the author_id from the High Level REST Client?
I tried researching solutions, but all of them only fetch the document using its _id. I need the reverse operation.
Expected output: cOPf2wrYBik0KF
The SearchHit provides access to basic information such as the index, document ID, and score of each search hit, so with the Search API you can do it this way in Java:
String index = hit.getIndex();
String id = hit.getId();
Or, with the older transport client API, something like this:
SearchResponse searchResponse =
client.prepareSearch().setQuery(matchAllQuery()).get();
for (SearchHit hit : searchResponse.getHits()) {
String yourId = hit.getId();
}
See: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-search.html#java-rest-high-search-response
You can use source filtering. You can turn off _source retrieval, as you are interested in just the _id. The _source parameter accepts one or more wildcard patterns to control which parts of the _source should be returned (https://www.elastic.co/guide/en/elasticsearch/reference/7.0/search-request-source-filtering.html):
GET /author_index/_search
{
"_source" : false,
"query" : {
"term" : { "author_data.author_id" : "123" }
}
}
Another approach also returns the _id for the search. The stored_fields parameter is about fields that are explicitly marked as stored in the mapping, which is off by default and generally not recommended:
GET /author_index/_search
{
"stored_fields" : ["author_data.author_id", "_id"],
"query" : {
"term" : { "author_data.author_id" : "123" }
}
}
Output for both of the above queries:
"hits" : [
{
"_index" : "author_index",
"_type" : "_doc",
"_id" : "cOPf2wrYBik0KF",
"_score" : 6.4966354
}
]
More details here: https://www.elastic.co/guide/en/elasticsearch/reference/7.0/search-request-stored-fields.html
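For readers not using Java, the same reverse lookup can be sketched in Python: build the source-free term query, then pull the _id out of the parsed response. The helper names are hypothetical, and insisting on exactly one hit is a design choice of this sketch that rests on the question's statement that author_id is unique.

```python
from typing import Any, Dict


def build_author_lookup_body(author_id: str) -> Dict[str, Any]:
    """Term query on author_data.author_id with _source retrieval turned off."""
    return {
        "_source": False,
        "query": {"term": {"author_data.author_id": author_id}},
    }


def single_id(response: Dict[str, Any]) -> str:
    """Return the _id of the only hit; fail loudly on zero or several hits."""
    hits = response["hits"]["hits"]
    if len(hits) != 1:
        raise LookupError(f"expected exactly one hit, got {len(hits)}")
    return hits[0]["_id"]


lookup_body = build_author_lookup_body("123")

# Sample response shaped like the output shown above.
sample_response = {
    "hits": {
        "hits": [
            {
                "_index": "author_index",
                "_type": "_doc",
                "_id": "cOPf2wrYBik0KF",
                "_score": 6.4966354,
            }
        ]
    }
}
generated_id = single_id(sample_response)
```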

Given a Document ID find the matching Document in Elasticsearch

I have indexed some articles in Elasticsearch. Now suppose a user likes an article, and I want to recommend some matching articles to him. Assume the articles are precise, well written, and to the point. All articles are of the same type.
I know this amounts to getting all the tokens related to that article and searching all the other articles for them. Is there anything in Elasticsearch that does this for me?
Or is there any other way of doing this?
You can use the More Like This query.
According to the docs, it selects a set of representative terms from the input documents, forms a query using these terms, executes the query, and returns the results.
Usage:
{
"query": {
"more_like_this" : {
"fields" : ["title", "description"],
"like" : [
{
"_index" : "your index",
"_type" : "articles",
"_id" : "1" # your document id
}
],
"min_term_freq" : 1,
"max_query_terms" : 12
}
}
}
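The query above can also be generated from a liked document's id. A minimal Python sketch with a hypothetical helper name; it leaves out _type on the assumption of a 7.x or later cluster, where mapping types are deprecated, and the index name in the example is made up.

```python
from typing import Any, Dict, Iterable


def build_more_like_this_body(
    index: str,
    doc_id: str,
    fields: Iterable[str],
    min_term_freq: int = 1,
    max_query_terms: int = 12,
) -> Dict[str, Any]:
    """Search body asking for documents similar to one already indexed."""
    return {
        "query": {
            "more_like_this": {
                "fields": list(fields),
                # The liked document is referenced by index and id.
                "like": [{"_index": index, "_id": doc_id}],
                "min_term_freq": min_term_freq,
                "max_query_terms": max_query_terms,
            }
        }
    }


# "articles" is a hypothetical index name.
mlt_body = build_more_like_this_body("articles", "1", ["title", "description"])
```

The two defaults mirror the values used in the answer; lowering max_query_terms makes the generated query cheaper at the cost of a less specific match.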

Get all fields of a document in ElasticSearch search query

How can I get all fields in documents matched by search query? ES documentation on fields says that using *, one can get all fields: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-fields.html
Having this document and this query, I get a hit in the result, but no fields are returned:
Put document:
curl -XPUT http://localhost:9200/idx/t/doc1 -d '{
"f": "value"
}'
Search it:
curl -XPOST http://localhost:9200/idx/_search?pretty -d '{
"fields": "*",
"query": { "term" : { "f" : "value" }}
}'
I also tried ["*"], but the result is the same: only the default fields (_id and _type) are returned. The hits part of the response looks like this:
"hits" : {
"total" : 1,
"max_score" : 0.30685282,
"hits" : [ {
"_index" : "idx",
"_type" : "t",
"_id" : "doc1",
"_score" : 0.30685282
} ]
}
The doc actually says:
"* can be used to load all stored fields from the document."
The core types doc says that the default for storing fields is 'false'.
Since by default ElasticSearch stores all fields of the source document in the special _source field, this option is primarily useful when the _source field has been disabled in the type definition. Defaults to false.
If you don't specify 'fields' in your search, you can see what's in _source.
So, if you want to return it as a field, change your mapping to store the field.
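To make the distinction concrete: stored fields come back under a "fields" key of each hit, with values wrapped in arrays, while unstored values are only reachable through _source. The Python sketch below reads a value from either place. Both helper names are hypothetical, and the field type passed to the mapping helper is the caller's choice, since the question does not say how f is mapped.

```python
from typing import Any, Dict, List


def stored_field_mapping(name: str, field_type: str) -> Dict[str, Any]:
    """Mapping fragment that marks one field as stored ("store": true)."""
    return {"properties": {name: {"type": field_type, "store": True}}}


def field_values(hit: Dict[str, Any], name: str) -> List[Any]:
    """Read a field from a hit: stored fields first, then _source, else empty."""
    fields = hit.get("fields", {})
    if name in fields:
        return list(fields[name])
    source = hit.get("_source", {})
    if name in source:
        value = source[name]
        return value if isinstance(value, list) else [value]
    return []


# The hit from the question: neither "fields" nor "_source" is present.
hit_from_question = {"_index": "idx", "_type": "t", "_id": "doc1"}
# The same hit as it would look when _source is returned.
hit_with_source = {"_id": "doc1", "_source": {"f": "value"}}
# The same hit as it would look when f is a stored field.
hit_with_stored = {"_id": "doc1", "fields": {"f": ["value"]}}
```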
I am facing this problem, too.
I found out that if I just search the text or keyword fields, everything is OK.
Hope this may help you.
