AttributeError: 'Search' object has no attribute 'execute_suggest' - elasticsearch

Here is my code:
class SearchSuggest(View):
    def get(self, request):
        key_words = request.GET.get('s', '')
        re_datas = []
        if key_words:
            s = ArticleType.search()
            s = s.suggest('my_suggest', key_words, completion={
                "field": "suggest", "fuzzy": {
                    "fuzziness": 1
                },
                "size": 5
            })
            suggestions = s.execute_suggest()
            for match in suggestions.my_suggest[0].options:
                source = match._source
                re_datas.append(source["title"])
        return HttpResponse(json.dumps(re_datas),
                            content_type="application/json")
This is a view in my Django project. When I run the project, it raises:
File "/home/yixuan/PycharmProjects/Scrapy/LcvSearch/search/views.py", line 20, in get
suggestions = s.execute_suggest()
AttributeError: 'Search' object has no attribute 'execute_suggest'
I don't know where the error is. I would appreciate any help in solving it.
My versions are:
elasticsearch-dsl==6.1.0
elasticsearch==6.2.0

It looks like elasticsearch-dsl has removed the function execute_suggest from their Search object. Had to check the source code for this, since it appears it hasn't been documented in their changelogs or releases.
I assume you can just use execute and parse the response according to your needs, but just in case, here is the old source code for execute_suggest, in case you want to reimplement it yourself.
def execute_suggest(self):
    es = connections.get_connection(self._using)
    return SuggestResponse(
        es.suggest(
            index=self._index,
            body=self._suggest,
            **self._params
        )
    )
SuggestResponse is just an AttrDict it appears.
Source:
https://github.com/elastic/elasticsearch-dsl-py/blob/6.1.0/elasticsearch_dsl/search.py
https://github.com/elastic/elasticsearch-dsl-py/blob/5.4.0/elasticsearch_dsl/search.py
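As a rough sketch of the "use execute and parse the response" route: the helper below pulls the titles out of the suggest section of a search response body. The function name titles_from_suggest and the sample data are made up for illustration, and it assumes the response body carries the suggestions under a top-level "suggest" key with "_source" in each option.

```python
def titles_from_suggest(body, name="my_suggest"):
    """Collect the _source titles of every option of one named suggester."""
    titles = []
    for entry in body.get("suggest", {}).get(name, []):
        for option in entry.get("options", []):
            titles.append(option["_source"]["title"])
    return titles


# Hand-written stand-in for a response body; not real Elasticsearch output.
sample_body = {
    "suggest": {
        "my_suggest": [
            {
                "text": "pyth",
                "options": [
                    {"text": "python", "_source": {"title": "Python basics"}},
                    {"text": "pythons", "_source": {"title": "Pythons in the wild"}},
                ],
            }
        ]
    }
}

print(titles_from_suggest(sample_body))
```

In the view this would presumably be called as titles_from_suggest(s.execute().to_dict()) in place of the execute_suggest() loop; I have not verified that against every 6.x release, so treat it as a starting point.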
Hope this helps.

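Check your elasticsearch-dsl version (for example with pip show elasticsearch-dsl). The major version of the library should match the major version of Elasticsearch: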
Elasticsearch 6.x
elasticsearch-dsl>=6.0.0,<7.0.0
Elasticsearch 5.x
elasticsearch-dsl>=5.0.0,<6.0.0
Elasticsearch 2.x
elasticsearch-dsl>=2.0.0,<3.0.0


google-cloud-php Document AI: INVALID_ARGUMENT

I am trying to use google-cloud-php to send documents to Google Document AI for processing.
Here is an example of my code:
require 'vendor/autoload.php';
putenv('GOOGLE_APPLICATION_CREDENTIALS=[###].json');
use Google\Cloud\DocumentAI\V1\Document;
use Google\Cloud\DocumentAI\V1\DocumentProcessorServiceClient;
$document = array();
$document['mime_type'] = 'application/pdf';
$document['content'] = file_get_contents('file.pdf');
$inlineDocument = new Document($document);
$postBody = array();
$postBody['inlineDocument'] = $inlineDocument;
$postBody['skipHumanReview'] = true;
$documentProcessorServiceClient = new DocumentProcessorServiceClient();
$formattedName = $documentProcessorServiceClient->processorName('[###]', 'eu', '[###]');
$operationResponse = $documentProcessorServiceClient->processDocument($formattedName, $postBody);
I am passing my arguments according to the following documents:
processDocument Documentation
Document Documentation
However, I get the following response:
Fatal error: Uncaught Google\ApiCore\ApiException: { "message": "Request contains an invalid argument.", "code": 3, "status": "INVALID_ARGUMENT", "details": [] } thrown in \vendor\google\gax\src\ApiException.php on line 139
For some reason, the following document says to pass one argument as "mimeType" instead of "mime_type", unlike the documents linked above:
https://cloud.google.com/document-ai/docs/send-request
I tried that as well, but it throws an exception in the PHP class.
Any help would be greatly appreciated!
I was running into the same issue. The problem occurs when the processor is in a region other than 'us'. In that case, you have to specify the apiEndpoint:
$documentProcessorServiceClient = new DocumentProcessorServiceClient([
    'apiEndpoint' => 'eu-documentai.googleapis.com'
]);
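The endpoint follows the processor's location. Here is a small sketch of that rule in Python, in case it helps to see it spelled out. The helper name documentai_endpoint is made up, and the assumption is that 'us' uses the default host while every other location is prefixed to it, as with 'eu' above.

```python
def documentai_endpoint(location):
    """Return the API host for a processor location (hypothetical helper)."""
    # Assumption: 'us' is served by the default host, other regions by a
    # '<location>-' prefixed host, matching the 'eu' endpoint shown above.
    if location == "us":
        return "documentai.googleapis.com"
    return f"{location}-documentai.googleapis.com"


print(documentai_endpoint("eu"))
```

The important part is that the location in the endpoint agrees with the location passed to processorName; a mismatch between the two is what produces the INVALID_ARGUMENT response.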

Mocha and Chai: JSON contains/includes certain text

Using Mocha and Chai, I am trying to check whether a JSON array contains a specific text. I tried multiple things suggested on this site, but none worked.
await validatePropertyIncludes(JSON.parse(response.body), 'scriptPrivacy');
async validatePropertyIncludes(item, propertyValue) {
    expect(item).to.contain(propertyValue);
}
The error I am getting:
AssertionError: expected [ Array(9) ] to include 'scriptPrivacy'
My response from API:
[
{
"scriptPrivacy": {
"settings": "settings=\"foobar\";",
"id": "foobar-notice-script",
"src": "https://foobar.com/foobar-privacy-notice-scripts.js",
}
You can check whether the field is undefined.
If the field exists in an element of the JSON array, it will not be undefined; otherwise it will be.
Using a filter() expression, you can count how many elements have the field defined:
var filter = object.filter(item => item.scriptPrivacy != undefined).length
If the attribute exists in the JSON, the variable filter should be greater than 0.
var filter = object.filter(item => item.scriptPrivacy != undefined).length
//Comparison you want: equal(1), above(0) ...
expect(filter).to.equal(1)
Edit:
To use this from a function that receives the attribute name as a parameter, you can use item[propertyValue], because object properties in JavaScript can also be accessed with bracket notation, like array elements.
So the code could be:
//Call function
validatePropertyIncludes(object, 'scriptPrivacy')
function validatePropertyIncludes(object, propertyValue){
    var filter = object.filter(item => item[propertyValue] != undefined).length
    //Comparison you want: equal(1), above(0) ...
    expect(filter).to.equal(1)
}

How do I serialize objects during enrichment with stream-django and Django REST framework?

I'm using stream-django with Django REST framework, and the enriched activities throw "not JSON serializable" on the objects returned from enrichment. That is expected, as they have not gone through any serialization.
How do I customize the enrichment process so that it returns a serialized object from my DRF serializer and not the object itself?
Some example data, not enriched:
"is_seen": false,
"is_read": false,
"group": "19931_2016-04-04",
"created_at": "2016-04-04T08:53:42.601",
"updated_at": "2016-04-04T11:33:26.140",
"id": "0bc8c85a-fa59-11e5-8080-800005683205",
"verb": "message",
"activities": [
{
"origin": null,
"verb": "message",
"time": "2016-04-04T11:33:26.140",
"id": "0bc8c85a-fa59-11e5-8080-800005683205",
"foreign_id": "chat.Message:6",
"target": null,
"to": [
"notification:1"
],
"actor": "auth.User:1",
"object": "chat.Message:6"
}
The view:
def get(self, request, format=None):
    user = request.user
    enricher = Enrich()
    feed = feed_manager.get_notification_feed(user.id)
    notifications = feed.get(limit=5)['results']
    enriched_activities = enricher.enrich_aggregated_activities(notifications)
    return Response(enriched_activities)
I solved it by doing the following.
First, a property on the model that returns the serializer class:
@property
def activity_object_serializer_class(self):
    from .serializers import FooSerializer
    return FooSerializer
Then used this to serialize the enriched activities. Supports nesting.
@staticmethod
def get_serialized_object_or_str(obj):
    if hasattr(obj, 'activity_object_serializer_class'):
        obj = obj.activity_object_serializer_class(obj).data
    else:
        obj = str(obj)  # Could also raise exception here
    return obj

def serialize_activities(self, activities):
    for activity in activities:
        for a in activity['activities']:
            a['object'] = self.get_serialized_object_or_str(a['object'])
            # The actor is always an auth.User in our case
            a['actor'] = UserSerializer(a['actor']).data
    return activities
and the view:
def get(self, request, format=None):
    user = request.user
    enricher = Enrich()
    feed = feed_manager.get_notification_feed(user.id)
    notifications = feed.get(limit=5)['results']
    enriched_activities = enricher.enrich_aggregated_activities(notifications)
    serialized_activities = self.serialize_activities(enriched_activities)
    return Response(serialized_activities)
The enrich step replaces string references with full Django model instances.
For example: the string "chat.Message:6" is replaced with an instance of chat.models.Message (same as Message.objects.get(pk=6)).
By default DRF does not know how to serialize Django models and fails with a serialization error. Luckily serializing models is a very simple task when using DRF. There is a built-in serializer class that is specific to Django models (serializers.ModelSerializer).
The documentation of DRF explains this process in detail here: http://www.django-rest-framework.org/api-guide/serializers/#modelserializer.
In your case you probably need to use nested serialization and make the serialization of the object field smart (that field can contain references to different kinds of objects).
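One way to make the object field "smart" is to dispatch on the type of the enriched instance. The sketch below is framework-free so it can run on its own: Message, User and the SERIALIZERS registry are hypothetical stand-ins, and in a real project the registry values would presumably be ModelSerializer classes called as Serializer(obj).data.

```python
class Message:
    """Stand-in for a model such as chat.models.Message (hypothetical)."""
    def __init__(self, pk, text):
        self.pk = pk
        self.text = text


class User:
    """Stand-in for auth.User (hypothetical)."""
    def __init__(self, pk, username):
        self.pk = pk
        self.username = username


# Registry: one serializing callable per model type.
SERIALIZERS = {
    Message: lambda m: {"id": m.pk, "text": m.text},
    User: lambda u: {"id": u.pk, "username": u.username},
}


def serialize_any(obj):
    """Serialize obj with the registered callable, else fall back to str()."""
    serializer = SERIALIZERS.get(type(obj))
    if serializer is None:
        # Unenriched references such as "chat.Message:6" stay plain strings.
        return str(obj)
    return serializer(obj)


print(serialize_any(Message(6, "hello")))
print(serialize_any("auth.User:1"))
```

The str() fallback keeps the response serializable when enrichment could not resolve a reference; raising an exception instead would be just as reasonable.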
There is an open issue about this on GitHub: https://github.com/GetStream/stream-django/issues/38. Ideally this is something the library will provide as a helper or example, so any code contribution or input will help make that happen.

Ruby finding duplicates in MongoDB

I am struggling to get this working efficiently. I think map-reduce is the answer, but I can't get anything working. It is probably a simple answer, so hopefully someone can help.
Entry Model looks like this:
field :var_name, type: String
field :var_data, type: String
field :var_date, type: DateTime
field :external_id, type: Integer
If the external data source malfunctions, we get duplicate data. One way to stop this was to check, when consuming the results, whether a record with the same external_id already exists, meaning we have already consumed it. However, this slows the process down a lot. The plan now is to check for duplicates once a day, so we are looking to get a list of Entries with the same external_id, which we can then sort, deleting those no longer needed.
I have tried adapting the snippet from https://coderwall.com/p/96dp8g/find-duplicate-documents-in-mongoid-with-map-reduce as shown below, but I get:
failed with error 0: "exception: assertion src/mongo/db/commands/mr.cpp:480"
def find_duplicates
  map = %Q{
    function() {
      emit(this.external_id, 1);
    }
  }
  reduce = %Q{
    function(key, values) {
      return Array.sum(values);
    }
  }
  Entry.all.map_reduce(map, reduce).out(inline: true).each do |entry|
    puts entry["_id"] if entry["value"] != 1
  end
end
Am I way off? Could anyone suggest a solution? I am using Mongoid, Rails 4.1.6 and Ruby 2.1.
I got it working using the suggestion in the comments of the question by Stennie using the Aggregation framework. It looks like this:
results = Entry.collection.aggregate([
  { "$group" => {
    _id: { "external_id" => "$external_id" },
    recordIds: { "$addToSet" => "$_id" },
    count: { "$sum" => 1 }
  }},
  { "$match" => {
    count: { "$gt" => 1 }
  }}
])
I then loop through the results and delete any unnecessary entries.
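The loop itself is not shown here, so the following is only a sketch of the idea, written in Python rather than Ruby to keep it self-contained. It takes rows shaped like the aggregation output above and returns every id except one survivor per group. The function name ids_to_delete, the sample rows, and the rule "keep the smallest id" are all assumptions.

```python
def ids_to_delete(groups):
    """Return all record ids except one survivor per duplicate group."""
    doomed = []
    for group in groups:
        # Sort so the choice of survivor is deterministic (assumed rule:
        # keep the smallest id, delete the rest).
        record_ids = sorted(group["recordIds"])
        doomed.extend(record_ids[1:])
    return doomed


# Made-up rows in the shape produced by the $group / $match pipeline.
sample_groups = [
    {"_id": {"external_id": 7}, "recordIds": ["b", "a", "c"], "count": 3},
    {"_id": {"external_id": 9}, "recordIds": ["x", "y"], "count": 2},
]

print(ids_to_delete(sample_groups))
```

The returned ids can then be removed in a single delete query instead of one query per document.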

tire terms filter not working

I'm trying to achieve a "scope-like" function with tire/elasticsearch. Why is this not working, even when I have entries with status "Test1" or "Test2"? The results are always empty.
collection = @model.search(:page => page, :per_page => per_page) do |s|
  s.query {all}
  s.filter :terms, :status => ["Test1", "Test2"]
  s.sort {by :"#{column}", "#{direction}"}
end
The method works fine without the filter. Is something wrong with the filter method? I've checked the Tire docs and it should work.
Thanks! :)
Your issue is most probably being caused by using the default mappings for the status field, which would tokenize it -- downcase, split into words, etc.
Compare these two:
http://localhost:9200/myindex/_analyze?text=Text1&analyzer=standard
http://localhost:9200/myindex/_analyze?text=Text1&analyzer=keyword
The solution in your case is to use the keyword analyzer (or set the field to not_analyzed) in your mapping. If the field did not hold an "enum" type of data, you could use the multi-field feature instead.
A working Ruby version would look like this:
require 'tire'

Tire.index('myindex') do
  delete
  create mappings: {
    document: {
      properties: {
        status: { type: 'string', analyzer: 'keyword' }
      }
    }
  }
  store status: 'Test1'
  store status: 'Test2'
  refresh
end

search = Tire.search 'myindex' do
  query do
    filtered do
      query { all }
      filter :terms, status: ['Test1']
    end
  end
end

puts search.results.to_a.inspect
Note: It's rarely possible -- this case being an exception -- to offer reasonable advice when no index mappings, example data, etc. are provided.
