Not getting all comments near a geo position in MongoDB (Laravel)

I'm working on an app and want to get all comments for a specific post. I created a comment for each country with its lat/long (one post may have many comments), but when I execute the query I only get 17 results, even though there are 200+ comments.
My query is as follows:
db.posts.aggregate([
  { "$geoNear": {
      "near": [70.0000, 30.0000],
      "spherical": true,
      "distanceField": "distance",
      "distanceMultiplier": 6371
  }},
  {
    "$match": { "post_id": "5efb10c325a36375c97dbda4" }
  }
])
I picked the lat/long values from this gist:
https://gist.github.com/sindresorhus/1341699
Please help me. Thank you.
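A likely cause, offered as an assumption since no answer is shown here: $geoNear returns only the nearest documents (100 by default on older MongoDB versions) before the later $match stage runs, so most of this post's comments are cut off before they can be matched. Filtering by post_id inside $geoNear's own query option avoids that. A minimal PyMongo sketch:
# Minimal PyMongo sketch: move the post filter into $geoNear's "query"
# option so the nearest-N cutoff applies after filtering, not before.
# Database and collection names are assumptions based on the question.
from pymongo import MongoClient

db = MongoClient()["mydb"]
cursor = db.posts.aggregate([
    {"$geoNear": {
        "near": [70.0, 30.0],                 # [longitude, latitude]
        "spherical": True,
        "distanceField": "distance",
        "distanceMultiplier": 6371,           # radians -> kilometres
        "query": {"post_id": "5efb10c325a36375c97dbda4"},
    }},
])
for comment in cursor:
    print(comment["_id"], comment["distance"])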

Related

Google People API notes field

I'd like to know how to get the 'notes' field from the Google People API. There's no equivalent in the documentation. I tried "biographies", but that didn't return any notes when I called the API for names, emails, etc.
I've seen this question asked a few years ago, but I know the API has changed a bit and was hoping someone knows the answer.
Thanks!
Answer
The biographies field still represents the notes.
How to check it?
1. Create a contact, setting the notes to a concrete string, e.g. Hey, I'm a note
2. Call people.connections.list with:
   resourceName = people/me
   personFields = names
3. Get the resourceName to use it later
4. Call people.get with:
   resourceName = the resource from step 3
   personFields = biographies
Check the biographies field; it will look something like:
"biographies": [
  {
    "metadata": {
      "primary": true,
      "source": {
        "type": "CONTACT",
        "id": "randomId"
      }
    },
    "value": "Hey, I'm a note",
    "contentType": "TEXT_PLAIN"
  }
]
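For reference, a minimal Python sketch of steps 2 and 4 using google-api-python-client; the token.json path is a placeholder for whatever OAuth2 credentials you already have:
# Sketch of steps 2 and 4 above with google-api-python-client.
# The credentials file path is a placeholder (assumption).
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build

creds = Credentials.from_authorized_user_file("token.json")
service = build("people", "v1", credentials=creds)

# Step 2: list connections, requesting only names.
connections = service.people().connections().list(
    resourceName="people/me",
    personFields="names",
).execute()
resource_name = connections["connections"][0]["resourceName"]

# Step 4: fetch the contact, requesting biographies (the notes).
person = service.people().get(
    resourceName=resource_name,
    personFields="biographies",
).execute()
print(person.get("biographies"))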
References
people.connections.list
people.get

Get document by index position in Elasticsearch

I am working with Elasticsearch and I am getting a query error:
elasticsearch.exceptions.TransportError: TransportError(500, 'search_phase_execution_exception', 'script score query returned an invalid score: NaN for doc: 32894')
It seems like my metric is returning NaN for document 32894 (NaN for doc: 32894). Naturally, the next step is to look at that document to see if there is anything wrong with it.
The problem is that I upload my documents using my own ID, so "32894" is meaningless for me.
A query like
curl -X GET "localhost:9200/my_index/_doc/one_of_my_ids?pretty"
works fine, but this fails if I try with the doc number from the error message.
I expected this to be trivial, but some Googling has failed to help.
How can I then find this document? Or is using my own IDs not recommended and the unfixable source of this problem?
Edit: as requested, this is the query that fails. Note that obviously fixing this is my ultimate goal, but it is not the specific point of this question. Help is appreciated in either case.
I am using the elasticsearch library in Python.
self.es.search(index=my_index, body=query_body, size=number_results)
With
query_body = {
    "query": {
        "script_score": {
            "query": {"match_all": {}},
            "script": {
                "source": "cosineSimilaritySparse(params.queryVector, doc['embedding']) + 10.0",
                "params": {"queryVector": query_vector}
            }
        }
    }
}
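The doc number in that error is a Lucene-internal, per-shard identifier, so there is no API to look it up directly. One workaround, sketched here under the assumption that the embeddings are sparse vectors stored as token-to-weight maps and that the NaN comes from empty or all-zero vectors, is to scan the index client-side for suspect documents:
# Hypothetical diagnostic: scan every document and flag embeddings that
# could make cosineSimilaritySparse return NaN (empty or all-zero vectors).
# Index and field names follow the question; the zero-vector theory is an
# assumption.
from elasticsearch import Elasticsearch
from elasticsearch.helpers import scan

es = Elasticsearch("http://localhost:9200")
for hit in scan(es, index="my_index",
                query={"query": {"match_all": {}}},
                _source=["embedding"]):
    emb = hit["_source"].get("embedding") or {}
    if not emb or all(v == 0 for v in emb.values()):
        print("suspect doc:", hit["_id"])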

Can inclusion of specific fields change the elasticsearch result set?

I have an ES query that returns 414 documents if I exclude a specific field from results.
If I include this field, the document count drops to 328.
The documents that get dropped are consistent and this happens whether I scroll results or query directly.
The mapping for the field that reduces the result set looks like this:
"completion": {
  "type": "object",
  "enabled": false
}
Nothing special to it and I have other "enabled": false object type fields that return just fine in this query.
I tested against multiple indexes with the same data to rule out corruption (I hope).
This 'completion' object is a nested, ignored object with 4 or 5 levels of nesting, but once again, I have other similarly nested objects that return just fine for this query.
The query is a simple terms match for 414 terms (yes, this is terrible; we are rethinking our strategy on this):
var { _scroll_id, hits } = await elastic.search({
  index: index,
  type: type,
  body: shaQuery,
  scroll: '10s',
  _source_exclude: 'account,layout,surveydata,verificationdata,accounts,scores'
});
while (hits && hits.hits.length) {
  // Append all new hits
  allRecords.push(...hits.hits);
  var { _scroll_id, hits } = await elastic.scroll({
    scrollId: _scroll_id,
    scroll: '10s'
  });
}
The query is:
"query": {
"terms": {
"_id": [
"....",
"....",
"...."
}
}
}
In this example, I will only get back 328 results. If I add 'completion' to the _source_exclude then I get the full set back.
So, my question is: what are the scenarios where including a field in the result could limit the search, when that field is totally unrelated to the search?
The numbers are specific to this example but consistent across queries; I just include them for context on the overall problem.
Also important is that this completion field has the same data and format across both included and excluded records, I can't see anything that would cause a problem.
The problem was found, and it was obscure. It was always failing at the same point, and when we examined it a little more closely, the same error kept coming out:
{ took: 158,
  timed_out: false,
  _shards:
   { total: 5,
     successful: 4,
     skipped: 0,
     failed: 1,
     failures:
      [ { shard: 0,
          index: 'theindexname',
          node: '4X2vwouIRriYbQTQcHQ_sw',
          reason:
           { type: 'illegal_argument_exception',
             reason: 'cannot write xcontent for unknown value of type class java.math.BigInteger' } } ] } }
OK, well that's strange; we are not using BigIntegers at all. But, thanks to the power of Google, this issue in the Elasticsearch issue tracker was revealed:
https://github.com/elastic/elasticsearch/pull/32888
"XContentBuilder to handle BigInteger and BigDecimal" fixes a bug in 6.3 where fields that used BigInteger or BigDecimal would fail to serialize and thus break when source filtering was applied. We were running 6.3.
It is unclear why our systems were triggering this issue, but upgrading to 6.5 solved it entirely.
Obscure, obscure, obscure, but solved thanks to Javier's persistence.

Regarding Google Places Nearby Search

While searching for places using Google Nearby Search with any one of the types from ["hospital", "pharmacy", "doctor", "dentist", "physiotherapist", "clinic", "diagnostic center"], the response from Google Places includes places other than the mentioned types. Is there any way to avoid results other than the mentioned types? Example parameters: { location: [ '6.4504948', '3.3956989' ], type: 'Clinics', radius: 25000 }
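A hedged observation rather than a confirmed answer: Nearby Search accepts a single type per request, and it must come from the API's supported type list ("hospital", "pharmacy", "doctor", "dentist" and "physiotherapist" are supported; 'Clinics' and 'diagnostic center' are not, and an unrecognized type is effectively ignored, which would explain the mixed results). A sketch of one request per supported type, in Python, with the API key as a placeholder:
# Query Nearby Search once per supported type and merge the results.
# "Clinics" is not a supported Places type, which is the suspected cause
# of the unfiltered results. API_KEY is a placeholder.
import requests

API_KEY = "YOUR_API_KEY"
SUPPORTED_TYPES = ["hospital", "pharmacy", "doctor", "dentist", "physiotherapist"]

places = {}
for place_type in SUPPORTED_TYPES:
    resp = requests.get(
        "https://maps.googleapis.com/maps/api/place/nearbysearch/json",
        params={
            "location": "6.4504948,3.3956989",
            "radius": 25000,
            "type": place_type,   # exactly one type per request
            "key": API_KEY,
        },
    )
    for result in resp.json().get("results", []):
        places[result["place_id"]] = result  # de-duplicate across types

print(len(places), "unique places found")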

How do I architect my Elasticsearch indexes for rapidly changing data?

I have two models in my MySQL database: Users and Posts
users have geolocation attributes (lat/long)
posts simply have a body of text.
I want to use Elasticsearch to find all posts that match a string of text, plus use the user's location as a filter. The problem is that the user's location always changes (as people walk around the city). I will be frequently updating the lat/long of each user.
This is my current solution:
Index the posts and have a geolocation attribute in each document. When a user changes location, run an Elasticsearch batch update on all that user's posts and modify the geolocation attribute on those documents.
Obviously this is not a scalable solution: what if the user has 2000 posts and walks around the city? I'd have to update 2000 documents every minute.
Is there a way to "relationally map" the posts to the user's object and use it as a filter, so that when the location changes, I only need to update that user's object instead of all his posts?
Updating 2000 posts per minute is not a big deal with either the update-by-query plugin or the upcoming reindex API. However, if you have many users with many posts and you need to update them at short intervals (e.g. every minute), it might indeed not be that scalable. Say it takes 500 milliseconds to update all of a user's posts: you'd start to lag behind at around 120 users.
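For scale, the per-user bulk update being weighed here would look roughly like this with the Python client; index and field names are assumptions, and in current Elasticsearch versions update-by-query is the built-in _update_by_query API rather than a plugin:
# Rough sketch of "update the location on all of one user's posts".
# Index name, field names and the user id are assumptions.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
es.update_by_query(
    index="posts",
    body={
        "query": {"term": {"user_id": 42}},
        "script": {
            "source": "ctx._source.location = params.loc",
            "params": {"loc": {"lat": 32.5362723, "lon": -80.3654783}},
        },
    },
)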
Clearly, since the users' posts need to "follow" the user rather than keep the location the user had when she posted them, I would first query the users around a given location to get their IDs, and then run a second query on posts, filtered by those user IDs and the matching body text.
It is perfectly OK to keep both of your indices simple and only update the location in a single user's document every minute. The two queries I'm suggesting should be quite fast, and you should not be worried about running them. People are often worried when they need to run two or more queries to find their results; sometimes, trying to tie the documents too tightly together is not the solution, and simply running two queries over two indices is the key and works perfectly well.
The query to retrieve users would look like the first one below, where you only retrieve the _id property of the user. I'm assuming that your user documents have the id of the user as their ES doc _id, so you do not have to retrieve the _source at all (i.e. "_source": false), which is even faster, and you can simply return the _id with response filtering:
POST /users/_search?filter_path=hits.hits._id
{
  "size": 1000,
  "_source": false,
  "query": {
    "bool": {
      "filter": [
        {
          "geo_distance": {
            "distance": "100m",
            "location": {
              "lat": 32.5362723,
              "lon": -80.3654783
            }
          }
        }
      ]
    }
  }
}
You'll get back the _id values of all the users who are currently within 100 meters of the desired geographic location. The next query then filters the posts by those IDs while matching the body text.
POST /posts/_search
{
  "size": 50,
  "query": {
    "bool": {
      "must": {
        "match": {
          "body": "some text"
        }
      },
      "filter": [
        {
          "terms": {
            "user_id": [ 1, 2, 3, 4 ]
          }
        }
      ]
    }
  }
}
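Wired together with the Python client, the whole two-step flow is only a few lines (index and field names mirror the example queries; error handling omitted):
# Two-step flow: find nearby users, then match their posts by body text.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Step 1: ids of users currently within 100m of the target point.
user_resp = es.search(
    index="users",
    filter_path="hits.hits._id",
    body={
        "size": 1000,
        "_source": False,
        "query": {"bool": {"filter": [{"geo_distance": {
            "distance": "100m",
            "location": {"lat": 32.5362723, "lon": -80.3654783},
        }}]}},
    },
)
user_ids = [hit["_id"] for hit in user_resp.get("hits", {}).get("hits", [])]

# Step 2: posts matching the text, restricted to those users.
post_resp = es.search(
    index="posts",
    body={
        "size": 50,
        "query": {"bool": {
            "must": {"match": {"body": "some text"}},
            "filter": [{"terms": {"user_id": user_ids}}],
        }},
    },
)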
