python elasticsearch get field from given doc_id


My input is <index_name>, <doc_id>, and <field_name>; I want the value of that field.
I am looking for the Python client equivalent of:
GET <index_name>/_doc/<doc_id>/?_source_includes=<field_name>

I figured it out:
from elasticsearch import Elasticsearch
es = Elasticsearch()
result = es.get(
    index=<index_name>,
    id=<doc_id>,
    _source_includes=<field_name>
)
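The response from es.get is a dictionary-like object, and the requested field sits under its "_source" key. Here is a minimal sketch of pulling the value out, assuming a client that behaves like the Elasticsearch instance above. The helper name get_field and its dotted-path handling are illustrative and not part of the client library.

```python
def get_field(client, index_name, doc_id, field_name):
    """Return the value of one field of one document, or None if absent."""
    result = client.get(
        index=index_name,
        id=doc_id,
        _source_includes=[field_name],
    )
    value = result["_source"]
    # Nested fields come back as nested dicts, so walk the dotted path.
    # (Assumption: field names use "." only to separate nesting levels.)
    for part in field_name.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value
```

With the client from the answer, a call might look like get_field(es, "my-index", "1", "title"), where the index name, id, and field are placeholders.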

Related

ElasticSearch get only document ids, _id field, using search query on index

For a given query, I want to get only the list of _id values, without any other information (no _source, _index, _type, ...).
I noticed that using _source and requesting non-existent fields returns only minimal data, but can I get even less data in return?
Some answers suggest using the hits part of the response, but I do not want the other info.

It is better to use scroll and scan to get the result list, so that Elasticsearch doesn't have to rank and sort the results.
With the elasticsearch-dsl Python library this can be accomplished by:
from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search
es = Elasticsearch()
s = Search(using=es, index=ES_INDEX, doc_type=DOC_TYPE)
s = s.fields([]) # only get ids, otherwise `fields` takes a list of field names
ids = [h.meta.id for h in s.scan()]

I suggest using elasticsearch_dsl for Python; it has a nice API.
from typing import Union
from elasticsearch_dsl import Document
# s is a Search object, e.g. Search(using=es, index=ES_INDEX)
# don't return any fields, just the metadata
s = s.source(False)
results = list(s)
Afterwards you can get the id with:
first_result: Document = results[0]
doc_id: Union[str, int] = first_result.meta.id
Here is the official documentation to get some extra information: https://elasticsearch-dsl.readthedocs.io/en/latest/search_dsl.html#extra-properties-and-parameters
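With the low-level client, the ids can also be read straight from the raw search response. The sketch below assumes the response is a plain dictionary shaped like a standard search result; the function name extract_ids is illustrative, and the commented call is one possible request that has not been run against a cluster.

```python
def extract_ids(response):
    """Collect the _id of every hit in a raw search response dict."""
    # A response trimmed with filter_path may omit "hits" when nothing matched.
    hits = response.get("hits", {}).get("hits", [])
    return [hit["_id"] for hit in hits]


# Possible usage (assumed call, adjust to your client version):
# response = es.search(
#     index=ES_INDEX,
#     body={"query": {"match_all": {}}, "_source": False},
#     filter_path=["hits.hits._id"],
# )
# ids = extract_ids(response)
```

A single search request only returns one page of hits, so for large result sets the scan approach above remains the better fit.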

How can I find the true score from Elasticsearch query string with a wildcard?

My ElasticSearch 2.x NEST query string search contains a wildcard:
Using NEST in C#:
var results = _client.Search<IEntity>(s => s
.Index(Indices.AllIndices)
.AllTypes()
.Query(qs => qs
.QueryString(qsq => qsq.Query("Micro*")))
.From(pageNumber)
.Size(pageSize));
It produces a request like this:
$ curl -XGET 'http://localhost:9200/_all/_search?q=Micro*'
This code was derived from the Elasticsearch page on covariant search results. The results are covariant: they are of mixed types, coming from multiple indices. The problem I am having is that all of the hits come back with a score of 1.
This happens regardless of type or boosting. Can I boost by type or, alternatively, is there a way to reveal or "explain" the search result so I can order by score?

Multi-term queries such as the wildcard query are given a constant score equal to the query's boost by default. You can change this behaviour using .Rewrite().
var results = client.Search<IEntity>(s => s
.Index(Indices.AllIndices)
.AllTypes()
.Query(qs => qs
.QueryString(qsq => qsq
.Query("Micro*")
.Rewrite(RewriteMultiTerm.ScoringBoolean)
)
)
.From(pageNumber)
.Size(pageSize)
);
With RewriteMultiTerm.ScoringBoolean, the rewrite method first translates each matching term into a should clause in a bool query and keeps the scores as computed by the query.
Note that this can be CPU intensive, and there is a default limit of 1024 bool query clauses that can easily be hit for a large document corpus; running your query on the complete StackOverflow data set (questions, answers and users), for example, hits the clause limit for questions. As an alternative, you may want to index the text with an analyzer that uses an edge_ngram token filter, so that prefix matches become ordinary term matches.
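For readers working with the REST API or the Python client, the same option is the rewrite parameter of the query_string query. This is a minimal sketch of a request body that roughly corresponds to the NEST call above; the function name and its defaults are illustrative.

```python
def wildcard_query_body(text, page_from=0, page_size=10):
    """Build a query_string search body that keeps per-term scores."""
    return {
        "from": page_from,
        "size": page_size,
        "query": {
            "query_string": {
                "query": text,
                # Rewrite each matching term into a scored bool "should" clause.
                "rewrite": "scoring_boolean",
            }
        },
    }


body = wildcard_query_body("Micro*")
```

The resulting dictionary can be sent as the body of a _search request; the clause limit mentioned above applies in the same way.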

By default, wildcard searches return a constant score of 1.
You can boost by a particular type. See this:
How to boost index type in elasticsearch?

Fetch all the rows using elasticsearch_dsl

Currently I am using the following program to extract the id and its severity information from Elasticsearch.
from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search, Q
client = Elasticsearch(
[
#'http://user:secret#10.x.x.11:9200/',
'http://10.x.x.11:9200/',
],
verify_certs=True
)
s = Search(using=client, index="test")
response = s.execute()
for hit in response:
    print(hit.message_id, hit.severity, "\n\n")
I believe the query returns 10 rows by default. I have more than 10000 rows in Elasticsearch, and I need to fetch all of them.
Can someone guide me on how to run the same query to fetch all records?

You can use the scan() helper function in order to retrieve all docs from your test index:
from elasticsearch import Elasticsearch, helpers
client = Elasticsearch(
[
#'http://user:secret#10.x.x.11:9200/',
'http://10.x.x.11:9200/',
],
verify_certs=True
)
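docs = list(helpers.scan(client, index="test", query={"query": {"match_all": {}}}))
for hit in docs:
    # helpers.scan yields raw hit dicts, so the fields live under "_source"
    print(hit["_source"]["message_id"], hit["_source"]["severity"], "\n\n")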
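Because helpers.scan yields each hit as a plain dictionary, the field access can be wrapped in a small generator that also tolerates documents missing a field. The sketch below assumes hits shaped like the ones scan returns; the name iter_fields is illustrative.

```python
def iter_fields(hits, field_names):
    """Yield one tuple of field values per raw hit dictionary."""
    for hit in hits:
        source = hit.get("_source", {})
        # Missing fields become None instead of raising KeyError.
        yield tuple(source.get(name) for name in field_names)


# Possible usage, streaming results instead of building a list first:
# for message_id, severity in iter_fields(
#     helpers.scan(client, index="test", query={"query": {"match_all": {}}}),
#     ["message_id", "severity"],
# ):
#     print(message_id, severity)
```

Passing the scan generator directly avoids holding every document in memory at once, which matters with more than 10000 rows.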

Elasticsearch DSL: Bucket not working

Running the code,
from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search, Q, A
client = Elasticsearch(timeout=100)
s = Search(using=client, index="cms*")
s.aggs.bucket('ExitCode', 'terms', field='ExitCode').metric('avgCpuEff', 'avg', field='CpuEff')
for hit in s[0:20].execute():
    print(hit['ExitCode'])
yields several ExitCode = 0. I thought a terms bucket is supposed to group all the results that have the same exit code, in this case. What is actually going on?

You're iterating over the hits, you need to iterate over the aggregated buckets instead:
response = s.execute()
for code in response.aggregations.ExitCode.buckets:
    print(code.key, code.avgCpuEff.value)
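The same buckets are present in the raw response under the "aggregations" key. This is a minimal sketch of reading them from a plain dictionary (for example, what response.to_dict() returns); the function name bucket_metric is illustrative and the sample values are made up.

```python
def bucket_metric(response, agg_name, metric_name):
    """Map each bucket key of a terms aggregation to a sub-metric value."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket[metric_name]["value"] for bucket in buckets}


# Made-up sample shaped like a terms aggregation with an avg sub-metric.
sample = {
    "aggregations": {
        "ExitCode": {
            "buckets": [
                {"key": 0, "doc_count": 12, "avgCpuEff": {"value": 0.75}},
                {"key": 1, "doc_count": 3, "avgCpuEff": {"value": 0.25}},
            ]
        }
    }
}
averages = bucket_metric(sample, "ExitCode", "avgCpuEff")
print(averages)
```

Each exit code appears once as a bucket key, which is the grouping the question expected to see in the hits.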

How to do facet search with mpdreamz Nest

Does anybody know how to do a facet search with Nest?
My index is https://gist.github.com/3606852
I would like to search for a keyword in 'NumberEvent' and display the result if the keyword exists. Please help!

This assumes that the MyPoco class exists and maps to your Elasticsearch document. If it doesn't, you can use dynamic, but you'll have to swap the lambda-based field selectors for strings.
var result = client.Search<MyPoco>(s=>s
.From(0)
.Size(10)
.Filter(ff=>ff
.Term(f=>f.Categories.Types.Events.First().NumberEvent.event, "keyword")
)
.FacetTerm(q=>q.OnField(f=>f.Categories.Types.Facets.First().Person.First().entity))
);
result.Documents now holds your documents
result.Facet<TermFacet>(f => f.Categories.Types.Facets.First().Person.First().entity); now holds your facets
Your document seems a bit strange though in the sense that it already has Facets with counts in them.
