includeStats in neo4j.rb - ruby

I'm using neo4j.rb, and when I run
MATCH (a {name:'apple'}) SET a.flag = true
I'd like to get the response data, which would be along the lines of:
{
"results": [
{
"columns": [],
"data": [],
"stats": {
"contains_updates": true,
"nodes_created": 0,
"nodes_deleted": 0,
"properties_set": 1,
"relationships_created": 0,
"relationship_deleted": 0,
"labels_added": 0,
"labels_removed": 0,
"indexes_added": 0,
"indexes_removed": 0,
"constraints_added": 0,
"constraints_removed": 0
}
}
],
"errors": []
}
Instead, I get nothing: the object is blank, presumably because I'm not asking for any nodes to be returned, only for metadata about the query.
There's a proposed solution here using py2neo (py2neo return number of nodes and relationships created), with includeStats: true. I've also tried appending it as ?includeStats=true to the address I use to run queries, which I saw somewhere else, but that resulted in a server not available error (response code 302 / RuntimeError). Is there any solution for this using neo4j.rb?

Unfortunately we don't keep the metadata when returning results in the neo4j-core gem. It might be something that's easy to add. Perhaps you could create an issue:
https://github.com/neo4jrb/neo4j-core/issues
Pull requests are welcome, of course!
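As a stopgap, the stats can be requested directly from the server's transactional HTTP endpoint, bypassing neo4j.rb. The following is a minimal sketch in Python, assuming a server that accepts a per-statement includeStats flag (the option mentioned in the question) and returns the response shape shown above. The helper names build_stats_request and extract_stats are hypothetical, and the endpoint path (for example /db/data/transaction/commit on older servers) is an assumption that depends on the Neo4j version.

```python
import json

def build_stats_request(cypher, parameters=None):
    """Build the JSON body for a transactional request with stats enabled."""
    statement = {"statement": cypher, "includeStats": True}
    if parameters:
        statement["parameters"] = parameters
    return json.dumps({"statements": [statement]})

def extract_stats(response):
    """Return the stats dict of the first result, or None if there is none."""
    if response.get("errors"):
        raise RuntimeError(response["errors"])
    results = response.get("results", [])
    return results[0].get("stats") if results else None

# Example with a response shaped like the one in the question.
sample = {
    "results": [{"columns": [], "data": [], "stats": {"contains_updates": True, "properties_set": 1}}],
    "errors": [],
}
stats = extract_stats(sample)
```

The body would be sent with a plain HTTP POST and Content-Type: application/json; the parsing step is the same whichever HTTP client is used.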

Related

Can inclusion of specific fields change the elasticsearch result set?

I have an ES query that returns 414 documents if I exclude a specific field from results.
If I include this field, the document count drops to 328.
The documents that get dropped are consistent and this happens whether I scroll results or query directly.
The field map for the field that reduces the result set looks like this:
"completion": {
"type": "object",
"enabled": false
}
Nothing special to it and I have other "enabled": false object type fields that return just fine in this query.
I tested against multiple indexes with the same data to rule out corruption (I hope).
This 'completion' object is a nested and ignored object that has 4 or 5 levels of nesting but once again, I have other similarly nested objects that return just fine for this query.
The query is a simple terms match for 414 terms (yes, this is terrible, we are rethinking our strategy on this):
var { _scroll_id, hits } = await elastic.search({
index: index,
type: type,
body: shaQuery,
scroll: '10s',
_source_exclude: 'account,layout,surveydata,verificationdata,accounts,scores'
});
while (hits && hits.hits.length) {
// Append all new hits
allRecords.push(...hits.hits)
var { _scroll_id, hits } = await elastic.scroll({
scrollId: _scroll_id,
scroll: '10s'
})
}
The query is:
"query": {
"terms": {
"_id": [
"....",
"....",
"...."
]
}
}
In this example, I will only get back 328 results. If I add 'completion' to the _source_exclude then I get the full set back.
So, my question is: in what scenarios could including a field in the result limit the search, when that field is totally unrelated to the search?
The numbers are specific to this example but consistent across queries. I just include them for context on the overall problem.
Also important is that this completion field has the same data and format across both included and excluded records, I can't see anything that would cause a problem.
The problem was found and it was obscure. What we saw was that it was always failing at the same point and when it was examined a little more closely, the same error was coming out:
{ took: 158,
timed_out: false,
_shards:
{ total: 5,
successful: 4,
skipped: 0,
failed: 1,
failures: [ [Object] ] },
[ { shard: 0,
index: 'theindexname',
node: '4X2vwouIRriYbQTQcHQ_sw',
reason:
{ type: 'illegal_argument_exception',
reason:
'cannot write xcontent for unknown value of type class java.math.BigInteger' } } ]
OK, well, that's strange: we are not using BigIntegers at all. But a Google search turned up this pull request in the Elasticsearch issue tracker:
https://github.com/elastic/elasticsearch/pull/32888
Its title is "XContentBuilder to handle BigInteger and BigDecimal", and it addresses a bug in 6.3 where fields that used BigInteger and BigDecimal would fail to serialize and thus break when source filtering was applied. We were running 6.3.
It is unclear why our systems trigger this issue, but upgrading to 6.5 solved it entirely.
Obscure obscure obscure but solved thanks to Javier's persistence.
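Since the missing documents came from a shard that failed silently, it can help to check the _shards section of every response before trusting the hit count. A minimal sketch in Python, assuming the response is a dict shaped like the output above; the helper name shard_failures is hypothetical.

```python
def shard_failures(response):
    """Return (shard, reason) pairs for failed shards, or [] if all succeeded."""
    shards = response.get("_shards", {})
    if shards.get("failed", 0) == 0:
        return []
    return [
        (failure.get("shard"), failure.get("reason", {}).get("reason"))
        for failure in shards.get("failures", [])
    ]

# Example with a response shaped like the one above.
sample = {
    "took": 158,
    "timed_out": False,
    "_shards": {
        "total": 5,
        "successful": 4,
        "skipped": 0,
        "failed": 1,
        "failures": [
            {
                "shard": 0,
                "index": "theindexname",
                "reason": {
                    "type": "illegal_argument_exception",
                    "reason": "cannot write xcontent for unknown value of type class java.math.BigInteger",
                },
            }
        ],
    },
}
failures = shard_failures(sample)
```

A non-empty list means the result set is partial, which is exactly the situation that produced 328 documents instead of 414.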

What data format/structure is this and how to handle it?

So I have run into this data format:
{
"i": {
"hid|15#aid|9305#h|Openjobmetis Varese#a|Germani Basket Brescia#h2|VARESE#a2|BRESCIA#round|1019#nat|ita#hcolors": {
"bg|851010#g1|920000#g2|ad0b0b#g3|800000#c|"
},
"acolors": {
"bg|037f43#g1|00582d#g2|0fb966#g3|037f43#c|"
},
"hp|33#vp|20"
},
"idor": 0,
"jr|1#t": 19,
"t2": 30,
"ip|#b": false,
"v": {
"h": 0,
"a": 0,
"t": 30,
"h2": 12,
"a2": 12
}
}
I have never seen such a structure and I could not find any sources that explain this format. Actually, I was not even sure how to search for it.
So yeah, my question is, what is this data format and how can I handle it?
Looks like JSON.
It appears some data was also encoded as pipe-delimited string values in the JSON.
Ok, right after posting this question, I was able to decode this JSON. It seemed like a JSON all along but these pipes were kind of intimidating. This is how I solved it eventually.
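function jsonDecode(json){
  if(!json) return null;
  // Turn the custom delimiters back into JSON syntax:
  // '#' separates pairs, '|' separates a key from its value, '%' separates objects.
  json = json.replace(/#/g, '","').replace(/\|/g, '":"').replace(/%/g, '"},{"');
  return JSON.parse(json);
}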
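For comparison, the same substitution can be sketched in Python. The raw input string below is a hypothetical, simplified example, since the question only shows the data as displayed and not the raw text that was received. Note that every value comes back as a string, and the approach breaks if the data itself contains '#', '|' or '%'.

```python
import json

def json_decode(raw):
    """Expand the '#', '|' and '%' shorthand into regular JSON and parse it."""
    if not raw:
        return None
    expanded = raw.replace("#", '","').replace("|", '":"').replace("%", '"},{"')
    return json.loads(expanded)

# Hypothetical raw input using the shorthand delimiters.
decoded = json_decode('{"hid|15#aid|9305"}')
```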
Great! This question is now answered. Thanks everybody!

Elasticsearch Multisearch only returns 1 set of results

I'm trying to return multiple "buckets" of results from Elasticsearch in one HTTP request.
I'm using the _msearch API.
I'm using the following query:
POST /_msearch
{"index" : "[INDEXNAME]", "type":"post"}
{"query" : {"match" : {"post_type":"team-member"}}, "from" : 0, "size" : 10}
{"index" : "[INDEXNAME]", "type": "post"}
{"query" : {"match" : {"post_type": "article"}}, "from" : 0, "size" : 10}
The query executes without error, but the results only return one object, where it seems there should be two (one for the 10 team-members, and one for the 10 articles):
{
"responses": [
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 4,
"successful": 4,
"failed": 0
},
"hits": {
"total": 191,
"max_score": 3.825032,
"hits": [
{...}
]
}
}, // second query should be here, no?
]
}
Is my query construction wrong, or am I misunderstanding how this should work?
The format of a _msearch request must follow the bulk API format. It must look something like this:
header\n
body\n
header\n
body\n
The header part includes which index / indices to search on, optional (mapping) types to search on, the search_type, preference, and routing. The body includes the typical search body request (including the query, aggregations, from, size, and so on).
NOTE: the final line of data must end with a newline character \n.
Make sure your query follows this format. In your code example you've added two new lines after POST /_msearch; depending on the environment, the query may or may not work that way, so you should add only one. If the responses array only has one result then, in your case, the last query is somehow being discarded; again, check its format.
I don't see any problem actually, but you should check "Bulk API", it's similar.
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html
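One way to avoid formatting mistakes is to generate the body programmatically. The sketch below, in Python, serializes each header and body onto its own line and appends the required final newline. The helper name build_msearch_body is hypothetical and the index name is the placeholder from the question.

```python
import json

def build_msearch_body(searches):
    """Serialize (header, body) pairs as newline-delimited JSON for _msearch."""
    lines = []
    for header, body in searches:
        lines.append(json.dumps(header))  # one line: index, type, routing, ...
        lines.append(json.dumps(body))    # one line: query, from, size, ...
    # The final line must also end with a newline character.
    return "\n".join(lines) + "\n"

payload = build_msearch_body([
    ({"index": "[INDEXNAME]", "type": "post"},
     {"query": {"match": {"post_type": "team-member"}}, "from": 0, "size": 10}),
    ({"index": "[INDEXNAME]", "type": "post"},
     {"query": {"match": {"post_type": "article"}}, "from": 0, "size": 10}),
])
```

With two searches the payload has exactly four lines, and the responses array should then contain two entries in the same order.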

Time taken by count query - Elastic search

I want to know the time taken by a count query in Elasticsearch, just like the search query response, which contains took (the time taken).
My query looks like this:
curl -XGET "http://localhost:9200/index1/type1/_count"
And result for that query -
{
"count": 136,
"_shards": {
"total": 15,
"successful": 15,
"failed": 0
}
}
Is there any way to get the time taken for a count query, just like with the search API?
Document for count API - Count API
At the time of writing this answer, this is still not supported by Elasticsearch. I have raised a feature request and will most likely work on adding support for it.
A trick that can help with that is to use _search
with size zero (so no results will be returned);
track_total_hits set to true (so it will count all hits, not only the ones in the result window); and
filter_path equal to took,hits.total.value.
For example, I executed the query above in a cluster of mine...
GET viagens-*/_search?filter_path=took,hits.total.value
{
"size": 0,
"track_total_hits": true,
"query": {
"match_all": {}
}
}
...and got this result:
{
"took": 2,
"hits": {
"total": {
"value": 2589552
}
}
}
It does not profile the Count API itself, unfortunately, but has a similar result. Can be very useful as an alternative in some situations!
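The same trick can be wrapped in two small helpers. This is a minimal sketch in Python that only builds the request and reads the response, so it works with any HTTP client; the helper names build_count_search and read_count are hypothetical.

```python
def build_count_search(index):
    """Return the (path, body) of a _search request that only counts hits."""
    path = f"/{index}/_search?filter_path=took,hits.total.value"
    body = {"size": 0, "track_total_hits": True, "query": {"match_all": {}}}
    return path, body

def read_count(response):
    """Return (count, took_ms) from a response filtered as above."""
    return response["hits"]["total"]["value"], response["took"]

path, body = build_count_search("viagens-*")
# Response copied from the example above.
count, took_ms = read_count({"took": 2, "hits": {"total": {"value": 2589552}}})
```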

Count query with PHP Elastica and Symfony2 FosElasticaBundle

I'm on a Symfony 2.5.6 project using FosElasticaBundle (@dev).
In my project, I just need to get the total hits count of a request on Elasticsearch. That is, I'm querying Elasticsearch with a normal request, but through the special "count" URL:
localhost:9200/_search?search_type=count
Note the "search_type=count" URL param.
Here's the example query:
{
"query": {
"filtered": {
"query": {
"match_all": []
},
"filter": {
"bool": {
"must": [
{
"terms": {
"media.category.id": [
681
]
}
}
]
}
}
}
},
"sort": {
"published_at": {
"order": "desc"
}
},
"size": 1
}
The result contains a normal JSON response, but without any documents in the hits part. From this response I can easily get the total count:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 81,
"max_score": 0,
"hits": [ ]
}
}
Okay, hits.total == 81.
Now, I couldn't find any way to do the same through FOSElasticaBundle, from a repository.
I tried this:
$query = (...) // building the Elastica query here
$count = $this->finder->findPaginated(
$query,
array(ES::OPTION_SEARCH_TYPE => ES::OPTION_SEARCH_TYPE_COUNT)
)->getNbResults();
But I get an Attempted to load class "Pagerfanta" error. I don't want Pagerfanta.
Then this:
$count = $this->finder->createPaginatorAdapter(
$query,
array(ES::OPTION_SEARCH_TYPE => ES::OPTION_SEARCH_TYPE_COUNT)
)->getTotalHits();
But it always gives me 0.
It would be easy if I had access to the Elastica Finder service from the repository (I could then get a ResultSet from the query search, and this ResultSet has a correct getTotalHits() method). But services from a repository... you know.
Thank you for any help or clue!
I faced the same challenge, getting access to the searchable interface from inside the repo. Here's what I ended up with:
Create AcmeBundle\Elastica\ExtendedTransformedFinder. This just extends the TransformedFinder class and makes the searchable interface accessible.
<?php
namespace AcmeBundle\Elastica;
use FOS\ElasticaBundle\Finder\TransformedFinder;
class ExtendedTransformedFinder extends TransformedFinder
{
/**
* @return \Elastica\SearchableInterface
*/
public function getSearch()
{
return $this->searchable;
}
}
Make the bundle use our new class; in service.yml:
parameters:
fos_elastica.finder.class: AcmeBundle\Elastica\ExtendedTransformedFinder
Then in a repo use the getSearch method of our class and do what you want :)
class SomeSearchRepository extends Repository
{
public function search(/* ... */)
{
// create and set your query as you like
$query = Query::create();
// ...
// then run a count query
$count = $this->finder->getSearch()->count($query);
}
}
Heads up: this works for me with version 3.1.x. It should work starting with 3.0.x.
Ok, so, here we go: it is not possible.
You cannot, as of version 3.1.x-dev (2d8903a), get the total matching document count returned by elastic search from FOSElasticaBundle, because this bundle does not expose this value.
The RawPaginatorAdapter::getTotalHits() method contains this code:
return $this->query->hasParam('size')
? min($this->totalHits, (integer) $this->query->getParam('size'))
: $this->totalHits;
which makes it impossible to get the correct $this->totalHits without actually requesting any documents. Indeed, if you set size to 0 to tell Elasticsearch to return only meta information and no documents, RawPaginatorAdapter::getTotalHits() will return 0.
So FosElasticaBundle doesn't provide a way to get this total hits count; you could only do that through the Elastica library directly. Of course, with the downside that Elastica finders are natively available in \FOS\ElasticaBundle\Repository. You'd have to make a new service, do some injection, and invoke your service instead of the FOSElasticaBundle one for repositories... ouch.
I chose another path: I forked https://github.com/FriendsOfSymfony/FOSElasticaBundle and changed the method code as follows:
/**
* Returns the number of results.
*
* @param boolean $genuineTotal make the function return the `hits.total`
* value of the search result in all cases, instead of limiting it to the
* `size` request parameter.
* @return integer The number of results.
*/
public function getTotalHits($genuineTotal = false)
{
if ( ! isset($this->totalHits)) {
$this->totalHits = $this->searchable->search($this->query)->getTotalHits();
}
return $this->query->hasParam('size') && !$genuineTotal
? min($this->totalHits, (integer) $this->query->getParam('size'))
: $this->totalHits;
}
The $genuineTotal boolean restores the Elasticsearch behaviour without introducing any BC break. I could also have named it $ignoreSize and used it the opposite way.
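The effect of the flag is easy to see in isolation. The following is a minimal sketch in Python of the same capping logic, not the bundle's actual code; the function name and signature are hypothetical and take the already-fetched total as an argument.

```python
def get_total_hits(total_hits, size=None, genuine_total=False):
    """Cap the total at the query's size unless the genuine total is requested."""
    if size is not None and not genuine_total:
        return min(total_hits, size)
    return total_hits

capped = get_total_hits(81, size=0)                        # original behaviour
genuine = get_total_hits(81, size=0, genuine_total=True)   # patched behaviour
```

With size set to 0 the capped value is 0, which is why the count query in the question always returned 0, while the genuine total stays at 81.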
I opened a Pull Request: https://github.com/FriendsOfSymfony/FOSElasticaBundle/pull/748
We'll see! If this helps just one person, I'll be happy already!
Alternatively, you can get the index instance as a service (fos_elastica.index.INDEX_NAME.TYPE_NAME) and call its count() method.
Joan

Resources