I am trying out basic commands in Elasticsearch and I am stuck with a basic search.
My script:
from elasticsearch import Elasticsearch

INDEX_NAME = 'person'
es = Elasticsearch([{'host': 'localhost', 'port': 9200}])

data = {
    'name': 'John'
}
response = es.index(index=INDEX_NAME, body=data)
print(response['result'])  #=> created

id = response['_id']
response = es.get(index=INDEX_NAME, id=id)
print(response['_source'])  #=> {'name': 'John'}

query = {
    'query': {
        'match_all': {}
    }
}
response = es.search(index=INDEX_NAME, body=query)
print(response)  #=> {'took': 0, 'timed_out': False, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}, 'hits': {'total': {'value': 0, 'relation': 'eq'}, 'max_score': None, 'hits': []}}
print(response['hits']['hits'])  #=> []
According to debug logs, a document is created, es.get can find it in the index, but es.search cannot. Does anyone know where the problem is?
I also tried update and delete, and both worked. The docs were not really helpful.
EDIT:
Search works in Kibana:
GET person/_search
{
  "query": {
    "match_all": {}
  }
}

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "person",
        "_type" : "_doc",
        "_id" : "3EZ1s3gBsIJkcgJzc150",
        "_score" : 1.0,
        "_source" : {
          "name" : "John"
        }
      }
    ]
  }
}
So it must be something Python specific.
It was apparently a bug in PyCharm. I restarted the IDE and it works now.
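A side note in case someone else hits the same symptom when no IDE issue is involved: Elasticsearch search is near real-time, so a freshly indexed document only becomes searchable after the next index refresh (once per second by default), whereas a GET by id is real-time. A minimal sketch of forcing a refresh before searching, assuming the same local cluster and index as the script above:

from elasticsearch import Elasticsearch

INDEX_NAME = 'person'
es = Elasticsearch([{'host': 'localhost', 'port': 9200}])

response = es.index(index=INDEX_NAME, body={'name': 'John'})

# GET by id is real-time, so this finds the document immediately
print(es.get(index=INDEX_NAME, id=response['_id'])['_source'])

# _search is near real-time: force a refresh so the new document becomes searchable
es.indices.refresh(index=INDEX_NAME)

print(es.search(index=INDEX_NAME, body={'query': {'match_all': {}}})['hits']['hits'])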
Related
I have two indices: 1. ther, 2. part.
The "ther" index has 24 fields and the "part" index has 19 fields. I have to enrich the "ther" index with fields from the "part" index.
The field "user_id" is common between the two indices. Using the enrich processor I tried creating a third index, "part_ther".
GET ther/_search
{
  "_index" : "ther",
  "_type" : "_doc",
  "_id" : "2",
  "_score" : 1.0,
  "_source" : {
    "Ahi" : 42.6,
    "Index" : 90,
    "ipapPressureAverage" : 49,
    "MinuteVentAverage" : 20,
    "DeviceSerialNumber" : "<>",
    "ClearAirwayApneaIndex" : 72.26135463,
    "PeriodicBreathingTime" : 9311,
    "cpapPressure90Pct" : 22,
    "epapPressureAverage" : 73,
    "ipapPressure90Pct" : 27,
    "Usage" : 93904,
    "epapPressure90Pct" : 10,
    "AverageBreathRate" : 65,
    "#timestamp" : "2021-08-29T00:00:00.000+05:30",
    "user_id" : "39,476",
    "TrigBreathsPct" : 93,
    "cpapPressureAverage" : 29,
    "AverageExhaledTidalVolume" : 20,
    "DayStorageDate" : "29-08-2021",
    "UnintendedLeakageAverage" : 67.58
  }
}
GET part/_search
{
  "_index" : "part",
  "_type" : "_doc",
  "_id" : "2",
  "_score" : 1.0,
  "_source" : {
    "email" : "<>",
    "program_id" : 1849,
    "program_name" : "<> ",
    "scheduled_program_id" : 4765,
    "scheduled_program_name" : "<>",
    "scheduled_program_status" : 0,
    "experience_id" : 10129,
    "experience_name" : "<>",
    "response_id" : 364482,
    "user_id" : 39476,
    "firing_time" : "2021-09-19T09:28:51.000-04:00",
    "opened_at" : null,
    "participation_percentage" : 0,
    "status" : "<>",
    "created_at" : "2021-09-18T12:26:02.455-04:00",
    "updated_at" : "2021-09-19T09:30:04.228-04:00",
    "last_frame_completed_id" : null,
    "realm" : "affective-part-364482",
    "realms_organization_ids" : [
      "<>"
    ]
  }
}
Using the enrich processor I tried creating a third index, "part_ther", so that the "part" documents get enriched with all the fields from the "ther" index.
Steps:
PUT /_enrich/policy/ther-policy
{
  "match": {
    "indices": "ther",
    "match_field": "user_id",
    "enrich_fields": ["Ahi", "AverageBreatheRate", "AverageExhaledTidalVolume", "ClearAirwayApneaIndex", "cpapPressureAverage", "cpapPressure90Pct", "DayStorageDate", "epapPressureAverage", "epapPressure90Pct", "ipapPressureAverage", "ipapPressure90Pct", "MinuteVentAverage", "PeriodicBreathingTime", "TrigBreathsPct", "UnintendedLeakageAverage", "Usage", "DeviceSerialNumber"]
  }
}
Output:
{
  "acknowledged" : true
}
POST /_enrich/policy/ther-policy/_execute
Output:
{
  "status" : {
    "phase" : "COMPLETE"
  }
}
PUT /_ingest/pipeline/ther_lookup
{
  "description" : "Enriching user details with tracks",
  "processors" : [
    {
      "enrich" : {
        "policy_name": "ther-policy",
        "field" : "user_id",
        "target_field": "tmp",
        "max_matches": "1"
      }
    },
    {
      "script": {
        "if": "ctx.tmp != null",
        "source": "ctx.putAll(ctx.tmp); ctx.remove('tmp');"
      }
    }
  ]
}
Output:
{
  "acknowledged" : true
}
POST _reindex
{
  "source": {
    "index": "part"
  },
  "dest": {
    "index": "part_ther",
    "pipeline": "ther_lookup"
  }
}
Fields are getting enriched in the part_ther index. The "part" and "ther" indices have only 2 docs each, as I am testing the enrich processor.
GET part_ther/_search
{
  "_index": "part_ther",
  "_type": "_doc",
  "_id": "1",
  "_score": 1,
  "_source": {
    "Ahi": 42.6,
    "response_id": 364482,
    "program_id": 1849,
    "program_name": "<> ",
    "created_at": "2021-09-18T12:26:02.455-04:00",
    "ipapPressureAverage": 49,
    "MinuteVentAverage": 20,
    "DeviceSerialNumber": "<>",
    "scheduled_program_id": 4765,
    "experience_id": 10129,
    "ClearAirwayApneaIndex": 72.26135463,
    "PeriodicBreathingTime": 9311,
    "updated_at": "2021-09-19T09:30:04.228-04:00",
    "scheduled_program_status": 0,
    "cpapPressure90Pct": 22,
    "email": "<>",
    "epapPressureAverage": 73,
    "ipapPressure90Pct": 27,
    "last_frame_completed_id": null,
    "Usage": 93904,
    "participation_percentage": 0,
    "epapPressure90Pct": 10,
    "firing_time": "2021-09-19T09:28:51.000-04:00",
    "opened_at": null,
    "realms_organization_ids": [
      "<>"
    ],
    "user_id": "39476",
    "scheduled_program_name": "<>",
    "TrigBreathsPct": 93,
    "cpapPressureAverage": 29,
    "experience_name": "<>",
    "AverageExhaledTidalVolume": 20,
    "DayStorageDate": "29-08-2021",
    "realm": "<>",
    "UnintendedLeakageAverage": 67.58,
    "status": "<>"
  }
}
In the original indices, the "therapy" index has 9k docs and the "participation" index has 80k docs.
When I try to create the third index on the original indices, I get an error:
Error 502, {"ok":false,"message":"backend closed connection"}
How can I avoid this error and enrich the indices successfully?
POST _reindex
{
  "source": {
    "index": "-participation"
  },
  "dest": {
    "index": "data-participation-therapy1",
    "pipeline": "therapy_lookup"
  }
}
Output:
{"ok":false,"message":"backend closed connection"}
Thanks, @Val, for clarifying this error.
"backend closed connection" simply means that the client (i.e. Kibana Dev Tools in your browser) timed out, but the reindex process is still ongoing in the background.
If your source index is big, odds are high that the operation will take longer than the timeout to terminate. So you should start your reindex to run asynchronously using
POST _reindex?wait_for_completion=false
The call will return immediately and give you a task ID which you can use to check the task status as it progresses using
GET _tasks/<task_id>
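Put together for this case, it would look something like the following (the same reindex body as above, just run asynchronously; poll the returned task id with the GET _tasks call shown above):

POST _reindex?wait_for_completion=false
{
  "source": {
    "index": "-participation"
  },
  "dest": {
    "index": "data-participation-therapy1",
    "pipeline": "therapy_lookup"
  }
}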
Ref link: Enrich fields in new index using enrich processor
I have this code to get the scroll_id after doing the first search:
var initSearch = client.LowLevel.Search<dynamic>(INDEX, TYPE, QUERY, x => x.AddQueryString("scroll", "1m").AddQueryString("size", "2"));
string scrollId = initSearch.Body["_scroll_id"].ToString();
Then I used the scroll_id in the second search, but it didn't return any hits:
var scrollSearch = client.LowLevel.ScrollGet<dynamic>(x =>
x.AddQueryString("scroll", "1m").AddQueryString("scroll_id", scrollId));
scrollId = scrollSearch.Body["_scroll_id"].ToString();
var searchHits = int.Parse(scrollSearch.Body["hits"]["total"].ToString());
searchHits is zero. What may be the cause of this? Also, when I loop over the scroll search again, I expect the scroll_id to change, but its value stays the same.
A size of 2 will return 2 documents in each response, including the first response. So, if the total number of documents for a given query is less than or equal to 2, all documents are returned in the first response. Take the following for example:
void Main()
{
    var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
    var defaultIndex = "messages";
    var connectionSettings = new ConnectionSettings(pool)
        .DefaultIndex(defaultIndex)
        .PrettyJson()
        .DisableDirectStreaming()
        .OnRequestCompleted(response =>
        {
            if (response.RequestBodyInBytes != null)
            {
                Console.WriteLine(
                    $"{response.HttpMethod} {response.Uri} \n" +
                    $"{Encoding.UTF8.GetString(response.RequestBodyInBytes)}");
            }
            else
            {
                Console.WriteLine($"{response.HttpMethod} {response.Uri}");
            }

            Console.WriteLine();

            if (response.ResponseBodyInBytes != null)
            {
                Console.WriteLine($"Status: {response.HttpStatusCode}\n" +
                    $"{Encoding.UTF8.GetString(response.ResponseBodyInBytes)}\n" +
                    $"{new string('-', 30)}\n");
            }
            else
            {
                Console.WriteLine($"Status: {response.HttpStatusCode}\n" +
                    $"{new string('-', 30)}\n");
            }
        });

    var client = new ElasticClient(connectionSettings);

    if (client.IndexExists(defaultIndex).Exists)
    {
        client.DeleteIndex(defaultIndex);
    }

    client.IndexMany(new[]
    {
        new Message { Content = "message 1" },
        new Message { Content = "message 2" },
        new Message { Content = "message 3" },
        new Message { Content = "message 4" },
        new Message { Content = "message 5" },
        new Message { Content = "message 6" },
    });

    client.Refresh(defaultIndex);

    var searchResponse = client.Search<Message>(s => s
        .Scroll("1m")
        .Size(2)
        .Query(q => q
            .Terms(t => t
                .Field(f => f.Content.Suffix("keyword"))
                .Terms("message 1", "message 2")
            )
        )
    );

    searchResponse = client.Scroll<Message>("1m", searchResponse.ScrollId);
}

public class Message
{
    public string Content { get; set; }
}
The search and scroll responses return
------------------------------
POST http://localhost:9200/messages/message/_search?pretty=true&scroll=1m
{
  "size": 2,
  "query": {
    "terms": {
      "content.keyword": [
        "message 1",
        "message 2"
      ]
    }
  }
}

Status: 200
{
  "_scroll_id" : "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAAADGFnM1SnhtUVdIUmgtM1YyZ2NQei1hZEEAAAAAAAAAxxZzNUp4bVFXSFJoLTNWMmdjUHotYWRBAAAAAAAAAMgWczVKeG1RV0hSaC0zVjJnY1B6LWFkQQAAAAAAAADJFnM1SnhtUVdIUmgtM1YyZ2NQei1hZEEAAAAAAAAAyhZzNUp4bVFXSFJoLTNWMmdjUHotYWRB",
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.6931472,
    "hits" : [
      {
        "_index" : "messages",
        "_type" : "message",
        "_id" : "AV8IkTSbM7nzQBTCbQok",
        "_score" : 0.6931472,
        "_source" : {
          "content" : "message 1"
        }
      },
      {
        "_index" : "messages",
        "_type" : "message",
        "_id" : "AV8IkTSbM7nzQBTCbQol",
        "_score" : 0.6931472,
        "_source" : {
          "content" : "message 2"
        }
      }
    ]
  }
}
------------------------------
POST http://localhost:9200/_search/scroll?pretty=true
{
  "scroll": "1m",
  "scroll_id": "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAAADGFnM1SnhtUVdIUmgtM1YyZ2NQei1hZEEAAAAAAAAAxxZzNUp4bVFXSFJoLTNWMmdjUHotYWRBAAAAAAAAAMgWczVKeG1RV0hSaC0zVjJnY1B6LWFkQQAAAAAAAADJFnM1SnhtUVdIUmgtM1YyZ2NQei1hZEEAAAAAAAAAyhZzNUp4bVFXSFJoLTNWMmdjUHotYWRB"
}

Status: 200
{
  "_scroll_id" : "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAAADGFnM1SnhtUVdIUmgtM1YyZ2NQei1hZEEAAAAAAAAAxxZzNUp4bVFXSFJoLTNWMmdjUHotYWRBAAAAAAAAAMgWczVKeG1RV0hSaC0zVjJnY1B6LWFkQQAAAAAAAADJFnM1SnhtUVdIUmgtM1YyZ2NQei1hZEEAAAAAAAAAyhZzNUp4bVFXSFJoLTNWMmdjUHotYWRB",
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.6931472,
    "hits" : [ ]
  }
}
------------------------------
Since there are only 2 matching documents for the given query and size was set to 2, both documents are returned in the first response and the following scroll response does not contain any hits.
You can use the total from the initial search response to determine whether you need to call the scroll API for more documents.
The actual _scroll_id value is an implementation detail which may or may not change between calls. I would not recommend basing any logic on its value; simply pass the _scroll_id returned by the most recent scroll response into the next scroll request.
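If there are more matching documents than fit in one response, the usual pattern is to keep calling the scroll API with the most recent _scroll_id until a response comes back with no documents. A rough sketch continuing the NEST example above (it reuses the same client and Message type; match_all is used here just to scroll over everything, and clearing the scroll at the end is optional housekeeping):

var response = client.Search<Message>(s => s
    .Scroll("1m")
    .Size(2)
    .Query(q => q.MatchAll())
);

while (response.Documents.Count > 0)
{
    foreach (var message in response.Documents)
    {
        Console.WriteLine(message.Content);
    }

    // always pass the scroll id from the latest response into the next call
    response = client.Scroll<Message>("1m", response.ScrollId);
}

// release the server-side scroll context once finished
client.ClearScroll(c => c.ScrollId(response.ScrollId));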
Below is the code I am using:
"_source" : {
"name" : "hn name",
"user_id" : 553,
"email_id" : "ns#gmail.com",
"lres_id" : "",
"hres_id" : "hn image",
"followers" : 0,
"following" : 1,
"mentors" : 2,
"mentees" : 2,
"basic_info" : "hn developer",
"birth_date" : 1448451985397,
"charge_price" : 3000,
"org" : "mnc pvt ltd",
"located_in" : "Noidasec51 ",
"position" : "jjunior ava developer",
"requests" : 0,
"exp" : 5,
"video_bio_lres" : "hn test lres url",
"video_bio_hres" : "hn hres url",
"ratings" : [ {
"rating" : 1,
"ratedByUserId" : 777
}, {
"rating" : 1,
"ratedByUserId" : 555
} ],
"avg_rating" : 0.0,
"status" : 0,
"expertises" : [ 3345, 1234, 2345 ],
"blocked_users" : [ ]
}
From the document above, I want to delete only the rating with ratedByUserId 555, but somehow I am unable to do so.
How can I do it?
This works for me:
curl -XPOST 'localhost:9200/mentorz/users/555/_update' -d '{
  "script": "ctx._source.ratings.remove(ratings)",
  "params": {
    "ratings": {
      "rating": 1,
      "ratedByUserId": 555
    }
  }
}'
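On newer Elasticsearch versions where Painless is the scripting language, the inline Groovy-style script above may not work; a hedged alternative for the same update, using Painless and List.removeIf (the endpoint and document id are simply taken from the curl above):

curl -XPOST 'localhost:9200/mentorz/users/555/_update' -H 'Content-Type: application/json' -d '{
  "script": {
    "lang": "painless",
    "source": "ctx._source.ratings.removeIf(r -> r.ratedByUserId == params.ratedByUserId)",
    "params": {
      "ratedByUserId": 555
    }
  }
}'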
Please, observe:
MongoDB shell version: 2.4.1
connecting to: test
> use dummy
switched to db dummy
> db.invoices.find({'items.nameTags': /^z/}, {_id: 1}).explain()
{
  "cursor" : "BtreeCursor items.nameTags_1_created_1_special_1__id_1_items.qty_1_items.total_1 multi",
  "isMultiKey" : true,
  "n" : 55849,
  "nscannedObjects" : 223568,
  "nscanned" : 223568,
  "nscannedObjectsAllPlans" : 223568,
  "nscannedAllPlans" : 223568,
  "scanAndOrder" : false,
  "indexOnly" : false,
  "nYields" : 86,
  "nChunkSkips" : 0,
  "millis" : 88864,
  "indexBounds" : {
    "items.nameTags" : [
      [
        "z",
        "{"
      ],
      [
        /^z/,
        /^z/
      ]
    ],
    "created" : [
      [
        {
          "$minElement" : 1
        },
        {
          "$maxElement" : 1
        }
      ]
    ],
    "special" : [
      [
        {
          "$minElement" : 1
        },
        {
          "$maxElement" : 1
        }
      ]
    ],
    "_id" : [
      [
        {
          "$minElement" : 1
        },
        {
          "$maxElement" : 1
        }
      ]
    ],
    "items.qty" : [
      [
        {
          "$minElement" : 1
        },
        {
          "$maxElement" : 1
        }
      ]
    ],
    "items.total" : [
      [
        {
          "$minElement" : 1
        },
        {
          "$maxElement" : 1
        }
      ]
    ]
  },
  "server" : "IL-Mark-LT:27017"
}
>
Here is the definition of the index:
> db.system.indexes.find({name : 'items.nameTags_1_created_1_special_1__id_1_items.qty_1_items.total_1'}).pretty()
{
  "v" : 1,
  "key" : {
    "items.nameTags" : 1,
    "created" : 1,
    "special" : 1,
    "_id" : 1,
    "items.qty" : 1,
    "items.total" : 1
  },
  "ns" : "dummy.invoices",
  "name" : "items.nameTags_1_created_1_special_1__id_1_items.qty_1_items.total_1"
}
>
Finally, here is an example invoice document (with just 2 items):
> db.invoices.findOne({itemCount: 2})
{
  "_id" : "85923",
  "customer" : "Wgtd Fm 91",
  "businessNo" : "314227928",
  "billTo_name" : "Wgtd Fm 91",
  "billTo_addressLine1" : "3839 Ross Street",
  "billTo_addressLine2" : "Kingston, ON",
  "billTo_postalCode" : "K7L 4V4",
  "purchaseOrderNo" : "boi",
  "terms" : "COD",
  "shipDate" : "2013-07-10",
  "shipVia" : "Moses Transportation Inc.",
  "rep" : "Snowhite",
  "items" : [
    {
      "qty" : 4,
      "name" : "CA 7789",
      "desc" : "3 pc. Coffee Table set (Silver)",
      "price" : 222.3,
      "total" : 889.2,
      "nameTags" : [
        "ca 7789",
        "a 7789",
        " 7789",
        "7789",
        "789",
        "89",
        "9"
      ],
      "descTags" : [
        "3",
        "pc",
        "c",
        "coffee",
        "offee",
        "ffee",
        "fee",
        "ee",
        "e",
        "table",
        "able",
        "ble",
        "le",
        "e",
        "set",
        "et",
        "t",
        "silver",
        "ilver",
        "lver",
        "ver",
        "er",
        "r"
      ]
    },
    {
      "qty" : 4,
      "name" : "QP 8681",
      "desc" : "Ottoman Bed",
      "price" : 1179.1,
      "total" : 4716.4,
      "nameTags" : [
        "qp 8681",
        "p 8681",
        " 8681",
        "8681",
        "681",
        "81",
        "1"
      ],
      "descTags" : [
        "ottoman",
        "ttoman",
        "toman",
        "oman",
        "man",
        "an",
        "n",
        "bed",
        "ed",
        "d"
      ]
    }
  ],
  "itemCount" : 2,
  "discount" : "10%",
  "delivery" : 250,
  "hstPercents" : 13,
  "subTotal" : 5605.6,
  "totalBeforeHST" : 5295.04,
  "total" : 5983.4,
  "totalDiscount" : 560.56,
  "hst" : 688.36,
  "modified" : "2012-10-08",
  "created" : "2014-06-25",
  "version" : 0
}
>
My problem is that MongoDB does not use the index only ("indexOnly" : false), according to the aforementioned explain() output. Why? After all, I only request the _id field, which is part of the index.
In general, I feel that I am doing something very wrong. My invoices collection has 65,000 invoices with a total of 3,291,092 items, and it took almost 89 seconds to explain() the query.
What am I doing wrong?
You are using arrays and subdocuments. Covered indexes don't work with either of these.
From the mongo docs:
An index cannot cover a query if:
any of the indexed fields in any of the documents in the collection includes an array. If an indexed field is an array, the index becomes a multi-key index and cannot support a covered query.
any of the indexed fields are fields in subdocuments. To index fields in subdocuments, use dot notation.
http://docs.mongodb.org/manual/tutorial/create-indexes-to-support-queries/
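For illustration, a query can only be covered when every filtered and projected field sits at the top level of the document and is never an array. A hypothetical sketch against the same invoices collection (the {customer: 1, _id: 1} index is an assumption made up for this example, not part of the original schema):

> db.invoices.ensureIndex({ customer: 1, _id: 1 })
> db.invoices.find({ customer: "Wgtd Fm 91" }, { _id: 1 }).explain().indexOnly
true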
I have a MongoDB query that's taking an unreasonably long time to run, but it:
is only scanning 6 objects
hits an index
consistently takes ~1500ms (wasn't paging or otherwise occupied)
index miss% is 0 in mongostat
It showed up in the profiler (without the explain()), and I don't understand why it's so slow. Any ideas?
gimmebar:PRIMARY> db.assets.find({ owner: "123", avatar: false, private: false }).sort({date: -1}).explain()
{
  "cursor" : "BtreeCursor owner_1_avatar_1_date_-1",
  "nscanned" : 6,
  "nscannedObjects" : 6,
  "n" : 6,
  "millis" : 1567,
  "nYields" : 0,
  "nChunkSkips" : 0,
  "isMultiKey" : false,
  "indexOnly" : false,
  "indexBounds" : {
    "owner" : [
      [
        "123",
        "123"
      ]
    ],
    "avatar" : [
      [
        false,
        false
      ]
    ],
    "date" : [
      [
        {
          "$maxElement" : 1
        },
        {
          "$minElement" : 1
        }
      ]
    ]
  }
}
Missing the index on the private key?
BtreeCursor owner_1_avatar_1_date_-1 vs. .find({ owner: "123", avatar: false, private: false }).sort({date: -1}): the private field is not part of the index, so it has to be checked against the documents themselves.
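If the goal is for the query to be satisfied entirely by an index, a sketch of the compound index this answer is hinting at (the exact field order is an assumption: equality fields first, then the sort field):

> db.assets.ensureIndex({ owner: 1, avatar: 1, private: 1, date: -1 })
> db.assets.find({ owner: "123", avatar: false, private: false }).sort({ date: -1 }).explain()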