Combining graphs using Statsd & Graphite - metrics

Using the statsd client, how do I combine two graphs and show them as a single one? Like this?

I'm just playing around with Graphite and statsd right now, so I am no expert, but I might be missing something here...why would you combine them in statsd? It seems you would want them combined in the Graphite UI, which lets you put multiple stats onto a single graph. In fact, I think you could create a graph with both of these metrics present and then save it to "My Graphs" to pull up later.
As far as statsd is concerned, you are just writing out two separate metrics: a counter called NumWarnings and another called Deployed. To get a graph like the one above, I suppose you'd need to assign an arbitrary value to Deployed as a counter for each deployment (100 seems like a good number).
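A minimal sketch of what those two writes might look like from Python, using the statsd package (the metric names come from the question; the host, port, and the stats. prefix in the render URL are assumptions based on common statsd/Graphite defaults):

    import statsd

    client = statsd.StatsClient('localhost', 8125)

    client.incr('NumWarnings')    # one more warning observed
    client.incr('Deployed', 100)  # arbitrary spike value to mark a deployment

Both land in Graphite as separate series; the UI (or a render URL with two target parameters, e.g. /render?target=stats.NumWarnings&target=stats.Deployed) can then overlay them on a single graph.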

For people discovering this later: with graphite if you drag a graph with the mouse and hold it over another one for a few seconds, a green box will appear saying "Drop to Merge". Drop the graph when that appears and it will merge the two graphs into one.

Related

CEP (Apama): increase performance for geofencing

Need to know if it is possible (and would have an effect) to implement a B-tree within the CEP (single correlator). The problem we face is that we cannot handle more than 1,000 messages per second. I think it is caused by the way the solution has been implemented.
We want to detect if a position is within a zone and raise an event on entering, stopping, starting and leaving a zone. We currently have around 500 zones and more than 1,000 positions per second, and we want to increase the number of zones. Messages are now backing up. I think the solution would be to introduce a B-tree within the CEP: first detect whether a position is in a head zone, then query whether the position is in the zones within this head zone. I think this could increase performance, but I am not really sure if it is possible or wise.
Has anyone had any experience?
Firstly, we're deprecating the community forum in favour of answering questions here, so you're in the right place.
Secondly, answering your question probably needs a bit more detail about what you're doing at the moment. How are you currently managing your geofencing? Apama has built-in support for matching locations against rectangular areas with the location type. Using that in a hypertree expression with a listener should be very fast.
To manage other shaped geofences we'd recommend starting by using the bounding box in a listener and then doing your specific geofence calculation on events that fall within the bounding box.
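To make the two-stage idea concrete, here is a language-agnostic sketch in Python (not Apama EPL): a cheap bounding-box test first, and the exact point-in-polygon calculation only for candidates that pass it. In Apama the first stage would be the location type matched in a hypertree listener.

    def in_bbox(x, y, bbox):
        # bbox = (min_x, min_y, max_x, max_y)
        min_x, min_y, max_x, max_y = bbox
        return min_x <= x <= max_x and min_y <= y <= max_y

    def in_polygon(x, y, polygon):
        # Ray-casting point-in-polygon test; polygon is a list of (x, y).
        inside = False
        j = len(polygon) - 1
        for i in range(len(polygon)):
            xi, yi = polygon[i]
            xj, yj = polygon[j]
            if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
                inside = not inside
            j = i
        return inside

    def match_zones(x, y, zones):
        # zones: list of (zone_id, bbox, polygon); the exact test runs
        # only for zones whose bounding box contains the point.
        return [zid for zid, bbox, poly in zones
                if in_bbox(x, y, bbox) and in_polygon(x, y, poly)]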
To answer your question about a hierarchical approach - if the above does not help enough, you could start with coarse-grained bounding boxes in an ingestion context, which then delegates to multiple secondary contexts with more detailed bounding boxes, again using the hypertree. These secondary contexts would be able to work in parallel.
On a single large machine we've managed to handle hundreds of thousands of location updates across thousands of geofences, although this will very much depend on what action you're taking when you get a match and what your match rate is.
HTH,
Matt

How can I visually compare refactoring options?

I have an existing system model, representing services calling APIs backed by one or more implementations in perhaps other services, in the form of a DAG. I've been rendering this with GraphViz dot files (digraph), which is fairly reasonable, though it is sometimes difficult to enforce layering.
While considering various possible refactorings of services and APIs, I'd like to be able to chart alternative routes towards an end goal. Each refactoring step would yield a different DAG – represented in terms of diffs from the previous DAG (e.g., convert service A's use of API x in service B to API y in service C) – and similarly renderable.
What tools are there to create such "refactoring paths" and then visualize flows between them and determine dependencies and parallelism? Extra bonus points for goal seeking (e.g., no dependencies from any service other than service A on service C; cheapest path based on weights) and for providing a loose ordering of refactorings that demonstrates their (presumably) monotonically increasing system value.
I am picturing two UI components:
a DAG diff that visually shows the nodes/edges that got replaced in the source graph with nodes/edges in the second
a controlling display, acting like D3's force layout, that lets you navigate the loosely connected DAG of refactorings and select the refactoring whose before/after picture to show in the DAG diff.
That said, I'm totally open to other tooling, formats, etc. I would just like to be able to produce these and display them to other people to show why what we're doing is valuable (goal assertion) and why it is taking as long as it does (dependencies and Gantt charts).
Would it be feasible for you to create a separate tool that, given two similar DAGs, outputs the "merge" of them? If that's possible, then visualizing the merge DAG will probably tell you a lot about both DAGs. You can color-code the nodes by whether they appear in both DAGs or in only one of them.
That's how we originally designed the visual diffs of workflow graphs in VisTrails, see here.
If you insist on showing the two DAGs side by side, creating the merge DAG might still be the right idea, because then dot can lay the merge graph out, and you can simply hide the appropriate nodes for each subgraph. In this way, the shared structure will be laid out consistently by construction.
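As a hypothetical sketch of the merge idea in Python (the names and colours are illustrative, not from any particular tool): union the edge sets of the before/after DAGs and emit DOT, colour-coding each node by which graph it appears in, so dot lays out the shared structure once.

    def merge_dag_dot(edges_a, edges_b):
        # edges_a / edges_b: lists of (src, dst) pairs. Returns DOT text.
        nodes_a = {n for e in edges_a for n in e}
        nodes_b = {n for e in edges_b for n in e}
        lines = ['digraph merged {']
        for n in sorted(nodes_a | nodes_b):
            if n in nodes_a and n in nodes_b:
                color = 'black'   # shared structure
            elif n in nodes_a:
                color = 'red'     # removed by the refactoring
            else:
                color = 'green'   # introduced by the refactoring
            lines.append('  "%s" [color=%s];' % (n, color))
        for src, dst in sorted(set(edges_a) | set(edges_b)):
            lines.append('  "%s" -> "%s";' % (src, dst))
        lines.append('}')
        return '\n'.join(lines)

    # The refactoring from the question: service A's use of API x (in
    # service B) converted to API y (in service C).
    before = [('ServiceA', 'API_x'), ('API_x', 'ServiceB')]
    after = [('ServiceA', 'API_y'), ('API_y', 'ServiceC')]
    print(merge_dag_dot(before, after))

Hiding the red or the green nodes then yields each individual DAG with a consistent layout.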

d3js force large number of nodes

Please help me with these noob questions. I want to show a network with a large number (70,000) of nodes and 2.1 million links in a force layout, and I'm looking for a good and scalable way to do this.
How do we actually show such a large network practically? Can we do some kind of approximation and show a semantically equivalent network (e.g. http://www.visualcomplexity.com/vc/project.cfm?id=76)?
How do we reduce such data in the back end (say using KDE? We cannot afford to use science.js in the front end as the volume is large)?
The initial view could be the network with pre-determined locations of the nodes or clusters. How do we predetermine the locations in the back end, before sending the data to d3js? Do we have to use topojson?
Are any such examples available using d3js (and a back end - say Java, Python etc.)?
Sorry about the question, but do you really need to show all that information in one shot?
If you really need it, first have a look at the data in Gephi to see what it looks like, then move on to the next step.
If you find that you can focus on specific nodes or patterns at the beginning and let the user explore the chart from there, that is probably the best solution from a performance point of view.
If the discovery approach works but you are still having trouble with many items on the screen, control the force layout with a time-based threshold. It's not perfect, but it will work for a few hundred nodes.
Next step
If you decide to go down this path anyway, I would recommend the following:
Aggregate: that's probably the most useful thing you can do here. Let the user interact with the data and dig into it to see more detail. This is the best solution if you have to serve many clients.
Do not run the force-directed layout on the front end with the entire network as-is: it will eat all the browser's resources for at least tens of minutes in any case.
Compute the layout on the back end - e.g. using JUNG or Gephi's core in Java, or NetworkX in Python - and then just display the result (see the sketch after this list).
Cache the results of the point above as well: the layouts are expensive even for the server if you have many clients, so cache them.
When the user drags the network, hide the links: it should speed up the computation (sigma.js uses this trick).
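A minimal sketch of the back-end layout point above, using NetworkX in Python: compute the positions server-side once, cache them, and ship plain JSON to d3 so the browser never runs the force simulation itself (the toy graph and file name are placeholders):

    import json
    import networkx as nx

    G = nx.karate_club_graph()          # stand-in for the real 70,000-node graph

    pos = nx.spring_layout(G, seed=42)  # force-directed layout, on the server

    payload = {
        'nodes': [{'id': n, 'x': float(x), 'y': float(y)}
                  for n, (x, y) in pos.items()],
        'links': [{'source': u, 'target': v} for u, v in G.edges()],
    }

    with open('layout.json', 'w') as f:  # cache this file and serve it as-is
        json.dump(payload, f)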

Is there a chart/listener in JMeter that would show data of different runs in same graph?

I have measured our application server's CPU separately with 5 users and 15 users. Both runs took 5 minutes. I would like to show the CPU of both runs in one graph. Currently the only way to do it is to export CSV to Excel and create custom graphs, which is silly because the graphs in JMeter itself are quite good.
The JMeter plugin extensions contain a graph called Composite Graph, but its use case seems to be adding many data points to one and the same graph from within the same run. It doesn't seem able to merge graphs from two separate thread runs, because the times of the two differ (the runs were executed in succession, and therefore the graphs are shown one after another, not at the same time).
Any ideas? Thanks!
It looks like it can be done using the information on the JMeter wiki, but it will still require some effort.
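Until then, the CSV route from the question can at least be scripted rather than done by hand in Excel. A sketch in Python with pandas and matplotlib, normalising each run's timestamps to seconds since the run started so the two runs overlay on one axis (the file and column names are assumptions; adjust them to your listener's output):

    import pandas as pd
    import matplotlib.pyplot as plt

    def load_run(path, label):
        df = pd.read_csv(path)
        # Normalise wall-clock time to elapsed seconds so runs executed
        # in succession line up on the same axis.
        df['elapsed_s'] = (df['timeStamp'] - df['timeStamp'].min()) / 1000.0
        df['run'] = label
        return df

    runs = [load_run('run_5users.csv', '5 users'),
            load_run('run_15users.csv', '15 users')]

    for df in runs:
        plt.plot(df['elapsed_s'], df['value'], label=df['run'].iloc[0])
    plt.xlabel('seconds since run start')
    plt.ylabel('CPU')
    plt.legend()
    plt.show()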

How can I store a graph in HBase and run PageRank-like analytics on it?

Sorry if this question seems a bit complex, but I think it's all related, so I wanted to try to get the answer in one shot. Basically I have a layered graph* with various sets of data that are connected only to the next set (so set1 has vertices that have edges to set2, and so on, but set1 has nothing connecting to set3 or anything other than set2; it might be relevant, not sure). Generally, you can think of my data as one massive family tree (each set adds about a billion nodes) into which I keep loading new generations with every new set (families create new families and no edges go backwards).
I have an HBase/Hadoop system running and I know how to use Java to add columns and values, but what I don't know how to do is:
Add data to HBase in a graph-type format (since it's HBase, I want to load it in a way that lets me add a ton of data and have it scale, unlike other databases that limit graphs to the size of the system). I know how to add data but don't understand how to do it in a scalable graph way.
Once the graph is loaded, apply some kind of analytics to it. PageRank is popular so I mention it, but really anything based on processing a graph.
I guess the simplified way of asking the question is: how do I specifically get a graph into HBase, and once it's there, how do I analyze it? Is there a tutorial? There's a lot of HBase information on the internet (I read the HBase book) but I could not find anything specific to graphs. I found Giraph, but I don't think it can connect to HBase (yet). Seeing how Hadoop/HBase are versions of MapReduce/Bigtable, I suspect there is a way to process graphs; I'm just not having luck finding anything.
*A layered graph is a directed graph with a level for each different set of vertices, like so: http://en.wikipedia.org/wiki/Layered_graph_drawing
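For the first part (getting the graph into HBase in a scalable way), one common pattern is an adjacency list: one row per vertex and one column per out-edge, which scales with HBase's region splitting. A sketch with the happybase Python client; the table and column-family names are assumptions:

    import happybase

    connection = happybase.Connection('localhost')
    table = connection.table('graph')   # assumed schema: column family 'e'

    def add_vertex(vertex_id, neighbours):
        # One row per vertex; each out-edge is a column in family 'e'.
        table.put(vertex_id.encode(), {
            b'e:' + n.encode(): b'1' for n in neighbours
        })

    add_vertex('set1_person42', ['set2_child7', 'set2_child8'])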
I think this question on SO could help:
https://stackoverflow.com/questions/9865738/is-it-possible-to-store-graphs-hbase-if-so-how-do-you-model-the-database-to-sup/9867563#9867563
This part of my answer to that question might be of use:
Using HBase/Accumulo as input to Giraph has recently been submitted (7 Mar 2012) as a new feature request to Giraph: HBase/Accumulo Input and Output formats (GIRAPH-153).
We use Giraph in this way: store only minimal data in each vertex, run the graph algorithm with Giraph, then assemble the result with the rich data using Pig. For the PageRank algorithm, each vertex only needs to store a vertex id and a rank, so it can scale to almost a billion vertices.
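For intuition, here is a minimal single-machine PageRank sketch over such an adjacency list in Python; Giraph does the same computation vertex-by-vertex in a distributed, message-passing fashion, which is why each vertex only needs its id and rank (this simplification ignores rank mass lost at dangling vertices):

    def pagerank(adj, damping=0.85, iterations=20):
        # adj: dict mapping vertex -> list of out-neighbours.
        n = len(adj)
        rank = {v: 1.0 / n for v in adj}   # the only per-vertex state
        for _ in range(iterations):
            new_rank = {v: (1.0 - damping) / n for v in adj}
            for v, neighbours in adj.items():
                if neighbours:
                    share = damping * rank[v] / len(neighbours)
                    for u in neighbours:
                        new_rank[u] += share
            rank = new_rank
        return rank

    print(pagerank({'a': ['b'], 'b': ['a', 'c'], 'c': ['a']}))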
