Arango single tree response using AQL only - d3.js

I have found several questions that are similar but no solution worked as needed, or used internal functions. This is the most relevant one:
Getting data for d3 from ArangoDB using AQL (or arangojs)
I'm unable to understand how to return a single response with a tree structure of parent + children, something that D3 can understand. Whatever I do, beyond the first iteration everything is a mess. I have tried MERGE and MERGE_RECURSIVE, but they just did not work the way I expected.
I'm clueless as to how to make it work. I'm used to Neo4j, and for some reason this one is just hard for me to understand.
Any help will do,
Thanks,
DD.

I found a simple solution. I just use AQL to get a flat list of results and their edges; after that, I sort and assemble it as I need in my code.
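For anyone who lands here later, here is a minimal sketch of that approach, assuming arangojs and a named graph called "tree"; the database, graph, and field names are placeholders, not anything from the original setup:

// Sketch only: "mydb", the "tree" graph, and the "name" field are assumptions.
import { Database, aql } from "arangojs";

interface FlatRow { id: string; name: string; parent: string | null }
interface TreeNode { name: string; children: TreeNode[] }

const db = new Database({ url: "http://localhost:8529", databaseName: "mydb" });

async function fetchTree(rootId: string): Promise<TreeNode | null> {
  // One traversal returns a flat list: each vertex plus the edge it was reached by.
  const cursor = await db.query(aql`
    FOR v, e IN 0..10 OUTBOUND ${rootId} GRAPH "tree"
      RETURN { id: v._id, name: v.name, parent: e ? e._from : null }
  `);
  const rows: FlatRow[] = await cursor.all();

  // Client-side assembly into the nested { name, children } shape d3.hierarchy() expects.
  const byId = new Map<string, TreeNode>();
  rows.forEach(r => byId.set(r.id, { name: r.name, children: [] }));
  let root: TreeNode | null = null;
  for (const r of rows) {
    const node = byId.get(r.id)!;
    if (r.parent && byId.has(r.parent)) {
      byId.get(r.parent)!.children.push(node);
    } else {
      root = node; // the traversal start vertex has no incoming edge
    }
  }
  return root;
}

The AQL stays a plain flat traversal; all the nesting into parent/children happens in application code, which is exactly the "sort it afterwards" idea above.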

Related

Simple arithmetic functions in Elasticsearch

I am starting to get acquainted with the use of ELK for work purposes, but I am struggling to find a way to run simple mathematical operations on my data.
As shown in the picture, my DB contains 16 available fields, but I would like to create others without doing it in Excel first and converting my file to CSV again.
For example, I would like to create a variable #Bugs/Release. I've heard that this is quite easy to do with no need for scripting, but I can't find the way to do it. Does anybody have a solution to this problem?
Huge thanks.

Parsing STDF Files to Compare results

I am new to this site and I would like to get some input regarding parsing STDF files. Generally speaking, I am trying to parse an STDF file to gather only the results (numbers) and not the rest of the line. If I am able to achieve this, I would then like to compare all the numbers together through a bubble sort or insertion sort and see if any numbers are equal to each other. I am capable of doing this in C/C++ and Java, but I have no experience parsing documents using scripts.
Could anyone push me in the right direction? What should I be reading to learn my way around this?
Are you already using an STDF library?
You did not mention one, so I assume not.
You should find a library you are comfortable with (the list changes over time, but you can find some by Googling or by looking at the STDF page on Wikipedia) rather than attempting to parse STDF yourself, unless you have a good reason to reinvent the STDF-parsing wheel.
An STDF file contains many tests. It generally does not make sense to compare the results for different tests, so I assume you are looking for matching values within the set of results for each test.
I would use your chosen STDF parser to read the value of each test for each part. Keep a set of the results for each test. As you read each new result, check the set to see if it already exists. If it does, you have found the case you were looking for; otherwise, add the result to the set.
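For what it's worth, a rough sketch of that bookkeeping, independent of whichever STDF library you end up choosing; the TestResult shape is an assumption standing in for whatever records your parser yields:

// Sketch only: the Iterable is assumed to come from your chosen STDF parser,
// yielding one { test, part, value } record per parametric result.
interface TestResult { test: string; part: string; value: number }

function findDuplicates(results: Iterable<TestResult>): TestResult[] {
  const seen = new Map<string, Set<number>>(); // one set of values per test
  const duplicates: TestResult[] = [];
  for (const r of results) {
    let values = seen.get(r.test);
    if (!values) {
      values = new Set<number>();
      seen.set(r.test, values);
    }
    if (values.has(r.value)) {
      duplicates.push(r); // this value was already reported for this test
    } else {
      values.add(r.value);
    }
  }
  return duplicates;
}

A set lookup per result replaces the bubble/insertion sort entirely: you never need the results ordered, only to know whether a value has already been seen for that test.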

Need help rewriting XQuery to avoid expanded tree cache full error in MarkLogic

I am new to XQuery and MarkLogic.
I am trying to update documents in MarkLogic and get the expanded tree cache full error.
Just to get the work done I have increased the expanded tree cache, but that is not recommended.
I would like to tune this query so that it does not need to simultaneously cache as much XML.
Here is my query
I have uploaded my query as an image because it was not so pretty when I pasted it into the editor. If anyone knows a better way, please suggest it.
Thanks in advance.
Expanded tree cache errors can be caused by executing queries that select too many XML nodes at once. In your example, this is likely the culprit: /tx:AttVal[tx:AttributeName/text()=$attributeName].
It's possible that calling text() is the source of your problem (and text() is probably not what you mean anyway; see this blog), since it causes MarkLogic to evaluate that function on all of those nodes. Simply using /tx:AttVal[tx:AttributeName=$attributeName] may solve your problem.
Next I would consider adding a path range index on /tx:AttVal/tx:AttributeName and querying those nodes using cts:search and cts:path-range-query. This will be substantially faster than XPath without a range index. It's also possible to use XPath with a range index: MarkLogic will automatically optimize the XPath expression to use the range index; however, there can be reasons it doesn't optimize the expression correctly, and you would want to check that using xdmp:plan.
Also note that the general best-practice recommendation for XML in MarkLogic is to use "semantic XML". E.g., when you mean an attribute, use an attribute: <some-node AttributeName="AttVal"/>. MarkLogic's indexes are optimized out of the box for semantic XML design. However, if you have no option but to work with XML that isn't semantic, that is what path range indexes were designed for.
I've just solved exactly this scenario. There are two things I did.
I put the node-replace and node-insert type calls (that is, any calls that modify the XML structure) into a separate module and then called that module using xdmp:invoke, passing in any parameters required, like this:
let $update := xdmp:invoke("/app/lib/update-attribute-node.xqy",
                           (xs:QName("newValue"), $new),
                           (: the options wrapper below is a reconstruction; the original snippet showed only the xdmp:modules-database() call :)
                           <options xmlns="xdmp:eval">
                             <modules>{xdmp:modules-database()}</modules>
                           </options>)
The reason why this works is that the call to xdmp:invoke happens in its own transaction, and once it completes, the memory is cleared up. If you don't do this, then each time you call the update or insert function it does not actually perform the write until the end of the request, in a single transaction, meaning your memory will fill up pretty quickly.
Any time I needed to loop over paths in MarkLogic (or documents, or whatever they are called; I've only been using MarkLogic for a few days) and there were a large number of them, I processed them only a few at a time, like below. I came up with an elaborate way of skipping and taking only a batch of documents at a time, but you can do it in any number of ways.
let $whatever := xdmp:directory("/whatever/")[$start to $end]
I also put this into a separate module so that it is processed immediately and not in a single transaction.
Putting all expensive calls into separate modules and taking only a subset of large data sets at a time helped me solve my expanded tree cache full errors.

Understanding X-Path Expression

I'm trying to get an understanding of XPath in order to parse a diffxml file. I skimmed over the w3schools site. Am I understanding these correctly?
Statement 1: /node()[1]/node()[3]
Selects the third child of the root node
Statement 2: /node()[1]/node()[1]/node()[1]
Selects the child of the first node of the root node
Statement 3: /node()[1]/node()[3]/node()[2]
Selects the second child of the third node under the root node.
Yes, you understand them correctly, but this is not how you'd use XPath. First, node() can be anything, not just elements. Second, the pure index is arguably the worst way of selecting things; you should really use names, and possibly predicates, for filtering the node-sets.
You'll find a lot of criticism of w3schools on this site. Personally I find it a useful resource, but only when I'm trying to remind myself of something I once knew. It's not really designed for teaching yourself things from scratch, and I suggest you need a different learning strategy. Call me old-fashioned, but when I'm learning a new technology I find there's nothing better than a good book.
You've understood your examples correctly as far as I can tell. But have you understood what a "node" is? For example, do you know under what circumstances whitespace text counts as a node? The key to understanding XPath is to understand the data model, and the way in which the data model relates to the lexical (angle-bracket) form of the XML.
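To make the whitespace point concrete, here is a small sketch using the npm xpath and @xmldom/xmldom packages (my own assumption of tooling; any XPath 1.0 engine behaves the same way):

// Sketch only: demonstrates why positional node() steps are fragile.
import * as xpath from "xpath";
import { DOMParser } from "@xmldom/xmldom";

const xml = "<root>\n  <a/>\n  <b/>\n</root>";
const doc: any = new DOMParser().parseFromString(xml, "text/xml");

// /node()[1] is the <root> element, but /node()[1]/node()[1] is the whitespace
// text node between <root> and <a/>, not the <a/> element.
const hit = xpath.select("/node()[1]/node()[1]", doc) as any[];
console.log(hit[0].nodeType); // 3 = text node, not 1 = element node

// Naming the steps avoids counting whitespace nodes at all.
console.log((xpath.select("/root/a", doc) as any[]).length); // 1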

MongoDB find and remove - the fastest way

I have a quick question: what is the fastest way to grab and delete an object from a Mongo collection? Here is the code I currently have:
$cursor = $coll->find()->sort(array('created' => 1))->limit(1);
$obj = $cursor->getNext();
$coll->remove(array('name' => $obj['name']));
As you can see above, it grabs one document from the database and deletes it (so it isn't processed again). However fast this may be, I need it to perform faster. The challenge is that we have multiple processes doing this and processing what they have found, but sometimes two or more of the processes grab the same document, therefore making duplicates. Basically I need to make it so a document can only be grabbed once. Any ideas would be much appreciated.
Peter,
It's hard to say what the best solution is here without understanding all the context - but one approach which you could use is findAndModify. This will query for a single document and return it, and also apply an update to it.
You could use this to find a document to process and simultaneously modify a "status" field to mark it as being processed, so that other workers can recognize it as such and ignore it.
There is an example here that may be useful:
http://docs.mongodb.org/manual/reference/command/findAndModify/
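Purely as an illustration, this is roughly what that claim-and-mark pattern looks like with the Node.js driver, whose findOneAndUpdate wraps the findAndModify command; the "status" field and the database/collection names are placeholders, not part of the original question:

// Sketch only: find the oldest unclaimed document and mark it in one atomic step,
// so no two workers can receive the same document.
import { MongoClient } from "mongodb";

async function claimNext() {
  const client = new MongoClient("mongodb://localhost:27017");
  await client.connect();
  const coll = client.db("mydb").collection("queue");
  const res = await coll.findOneAndUpdate(
    { status: { $ne: "processing" } },   // skip documents another worker already claimed
    { $set: { status: "processing" } },  // claim it
    { sort: { created: 1 }, returnDocument: "after" }
  );
  return res; // the claimed document (wrapped in { value } on older driver versions)
}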
Use the findAndRemove function as documented here:
http://api.mongodb.org/java/current/com/mongodb/DBCollection.html
The findAndRemove function retrieves an object from the Mongo database and deletes it in a single (atomic) operation.
findAndRemove(query, sort[, options], callback)
The query object is used to retrieve the object from the database (see collection.find())
The sort parameter is used to sort the results (in case many were found).
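For reference, the same idea with the current Node.js driver, where the equivalent call is findOneAndDelete; the names are placeholders, and the sort on created mirrors the PHP snippet in the question:

// Sketch only: fetch the oldest document and remove it in one atomic operation,
// so it can never be handed to two processes.
import { MongoClient } from "mongodb";

async function popOldest() {
  const client = new MongoClient("mongodb://localhost:27017");
  await client.connect();
  const doc = await client.db("mydb").collection("queue")
    .findOneAndDelete({}, { sort: { created: 1 } });
  return doc; // the removed document, or null if the collection was empty
}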
I'm writing a new answer to highlight the fact that, as commented by @peterscodeproblems on the accepted answer, the native way to do this in MongoDB right now is to use
findAndModify(query=<document>, remove=True)
as pointed out by the documentation.
As it is native and atomic, I expect this to be the fastest way to do this.
I am new to mongodb and not entirely sure what your query is trying to do, but here is how I would do it
// suppose the database is staging
// suppose the collection is data
use staging
db.data.remove(<your_query_criteria>)
where <your_query_criteria> is a map and can contain any search criteria you want.
Not sure if this would help you.
