Datadog: METRIC.as_rate() vs. per_second(METRIC) - metrics

I'm trying to figure out the difference between the in-application modifier as_rate() and the rollup function per_second().
I want a table with two columns: the left column shows the total number of events submitted to a Distribution (in query-speak: count:METRIC{*} by {tag}), and the right column shows the average rate of events per second. The table visualization applies a sum rollup to the left column and an average rollup to the right column, so the left column should equal the right column multiplied by the total number of seconds in the selected time period (for example, 3,600 events over a one-hour window should show an average rate of 1 event per second).
From reading the docs I expected either of these queries to work for the right column:
count:DISTRIBUTION_METRIC{*} by {tag}.as_rate()
per_second(count:DISTRIBUTION_METRIC{*} by {tag})
But it turns out that these two queries are not equivalent. as_rate() is the only one that produces the expected average rate where left = right * num_seconds. The per_second() rollup also does something odd: metrics with fewer total events end up with higher average rates.
Is someone able to clarify why these two functions are not synonymous and what per_second() does differently?

Related

Dynamic Calculated Column on different report level SSAS DAX query

I'm trying to create a calculated column based on a derived measure in an SSAS cube. The measure counts the number of cases per order, so an order with 3 cases gets the value 3.
Now I'm trying to create a bucket attribute with the values 1caseOrder, 2caseOrder, 3caseOrder, and 3+caseOrder. I tried the following:
IF([nrofcase] = 1, "nrofcase[1]",
   IF([nrofcase] = 2, "nrofcase[2]",
      IF([nrofcase] = 3, "nrofcase[3]", "nrofcase[>3]")))
But it doesn't work as expected: when the level of the report is changed from quarter to week, it was supposed to recalculate at the new level.
Please let me know if this can work.
Calculated columns are static. When the column is added and when the table is processed, the value is calculated and stored. The only way for the value to change is to reprocess the model. If the formula refers to a DAX measure, it will use the measure without any of the context from the report (e.g. no row filters or slicers, etc.).
Think of it this way:
A calculated column is a fact about a row that doesn't change. It is known just by looking at a single row. An example of this is Cost = [Quantity] * [Unit Price]. Cost never changes and is known by looking at the Quantity and Unit Price columns. It doesn't matter what filters or context are in the report; Cost doesn't change.
A measure is a fact about a table. You have to look at multiple rows to calculate its value. An example is Total Cost = SUM(Sales[Cost]). You want this value to change depending on the context of time, region, product, etc., so its value is not stored but calculated dynamically in the report.
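In DAX terms, those two examples might look roughly like this (a minimal sketch, assuming a Sales table with Quantity and Unit Price columns, as in the examples above):
-- Calculated column on the Sales table: computed and stored per row when the table is processed
Cost = Sales[Quantity] * Sales[Unit Price]
-- Measure: computed at query time, in whatever filter context the report provides
Total Cost := SUM ( Sales[Cost] )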
It sounds like for your data, there are multiple rows that tell you the number of cases per order, so this is a measure. Use a measure instead of a calculated column.
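As a rough sketch of that direction (the Cases table name and its relationship to orders are hypothetical here, since the model isn't shown), the per-order case count and the bucket could both be measures, so they respond to the report context:
-- Hypothetical measure: number of case rows for whatever order(s) are in context
Nr of Cases := COUNTROWS ( Cases )
-- Bucket label built on top of the measure, using the buckets from the question
Case Bucket :=
SWITCH (
    TRUE (),
    [Nr of Cases] = 1, "1caseOrder",
    [Nr of Cases] = 2, "2caseOrder",
    [Nr of Cases] = 3, "3caseOrder",
    "3+caseOrder"
)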

Power Bi matrix subtotal is not the sum of the values in the column

I am trying to create a "meetingroom occupancy" matrix in Power BI. The raw data contains bookings per day per Room. The maximum daily available time per room is 12 hours. I have created a Date Dimension Table for the dates.
I have tried changing data types, adding the available time column in the query editor, and adding the available time as a DAX column and as a calculated measure, all with no success. When I changed the available time for Room B to 1, the subtotal became 13, so it looks like the subtotal is only summing unique values, but I do not know how to solve this.
Could someone please explain to me what is happening and how I could solve this?
The input data is as follows:
And my Date_Dimension is as follows:
This is the current and desired result:
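A common pattern for this kind of subtotal behaviour, shown here only as a hedged sketch (the Bookings table and its Date and Room columns are hypothetical, since the model is only visible in the screenshots), is to make the available time additive by iterating the room/day combinations in context with SUMX, so the subtotal sums 12 once per room per day instead of once per distinct value:
Available Time :=
SUMX (
    SUMMARIZE ( Bookings, Bookings[Date], Bookings[Room] ),
    12    -- 12 available hours per room per day
)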

DAX formula, crossfilter function not returning expected result

I'm obtaining wrong results from a DAX formula and I can't understand why.
In my database I have articles that are composed of multiple tools, which are produced from blank tools. One blank can be used to produce multiple tools. I need to calculate blank sales for 3 time periods: last 6, last 12 and last 24 months.
This is my Power BI model:
The time period table I used for the time period slicer and the measure look like this:
To obtain Blank's sales volumes, I created 3 measures:
When I use the last formula, which I thought would return the right amount of Blanks sold by article and time period, I get strange results.
When I select "last 24 months" time period, everything looks fine:
When I select "Last 12 months", the total is fine, but the total by article is wrong:
Finally, if I select "Last 6 months" time period, all the results are totally wrong:
The curious thing is that I checked the result by executing a SQL query on the database, and the DAX formula returns the right result (1466 for the selected time period), but only when used in a card visual, without filtering by article number.
I have no other filters that affect the visuals.
Could you help me understand why I'm not obtaining the right result, or suggest a better way to reach the desired results?
I'm guessing (at least part of) the problem is that you are backing up from different end dates because LASTDATE(Sales[DocumentDate]) can return different values for different ArticleNo.
I'm not sure what value you actually want for that date, possibly LASTDATE('Dates Table'[Date]), but I'm pretty sure you want it consistent across different ArticleNo.
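As a hedged sketch of what that change could look like (the real measures are only shown in the screenshots, so the Sales[Quantity] column and the 'Time Period'[Months] slicer column are assumptions), the idea is to anchor the period on the date table rather than on Sales[DocumentDate]:
Blank Sales Qty :=
VAR MonthsBack = SELECTEDVALUE ( 'Time Period'[Months], 24 )    -- hypothetical slicer column
RETURN
    CALCULATE (
        SUM ( Sales[Quantity] ),    -- hypothetical quantity column
        DATESINPERIOD (
            'Dates Table'[Date],
            LASTDATE ( 'Dates Table'[Date] ),    -- same end date for every ArticleNo
            -MonthsBack,
            MONTH
        )
    )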

How to create value over time line chart in Kibana 4?

I'm facing the following problem. In Kibana 4 I've created a line chart based on my input from Elasticsearch, but I can only display the average, min, or max instead of the actual value of the field over time, e.g. bytes sent.
Most answers to this question on Stack Overflow are about Kibana 3 (How to create value over time chart with Kibana 3?) and seem to involve a Histogram on the X axis, yet I can't find a way to apply them to Kibana 4. I was unable to find the histogram panel, and once I click on the Discover tab there is a constant "Searching..." loading state.
If I have the following fields in my _source:
{"timestamp":"2015-06-02T10:16:44.0855","time":587,"threadName":"Thread Group 1-957","byte":1372,"status":"false","latence":306,"registerCall":"404"}
then I would like to have the number of bytes on the Y-axis and my timestamp on the X-axis.
Any help in the right direction will be appreciated :)
To create a value over time line chart in Kibana, follow these steps:
Go to the Visualize tab and select Line chart.
For the X-axis, choose the Date Histogram aggregation and then select your timestamp field as the date field.
For the Y-axis, select Sum as the aggregation and then bytes as the field.
For the X axis, what Alcanzar said is good, but as you notice, the Y axis is problematic.
Sum (suggested by "Limit") works, but since it's aggregated, it shows the total used in each aggregated bucket, which may be meaningless depending on what you are trying to show. Your question isn't clear on what you want, so I'm just guessing here. One hour of requests, each of which ran for one minute and sent 1 megabyte, is indeed 60 megabyte-minutes if you are trying to show total capacity used over that hour (maybe you are paying a bill based on usage over time). On the other hand, if you are trying to show peak usage at each point in time, it would be wrong.
You said you already looked at Max and Min and they don't meet your needs. I don't suppose Standard Deviation would be any better?
I have the same concern. The best I've been able to do so far is to display Min and Max simultaneously on the Y axis. When they diverge, I know I'm zoomed out too far, so I zoom in until they align. That is how I know I'm seeing individual events.
In any case, I share your frustration. I too would like to be able to show time series as easily as I can in, say, Excel.

How to organise and rank observations of a variable?

I have this dataset containing world bilateral trade data for a few years.
I would like to determine which goods were the most exported ones in the timespan considered by the dataset.
The dataset is composed by the following variables:
"year"
"hs2", containing a two-digit number that tells which good is exported
"exp_val", giving the value of the export in a certain year, for that good
"exp_qty", giving the exported quantity of the good in a certain year
Basically, I would like to get the total sum of the quantity exported for a certain good, so an output like
hs2 exp_qty
01 34892
02 54548
... ...
and so forth. Right now, the column "hs2" gives me a very large number of observations and, as you can understand, the values repeat themselves multiple times (as the variables vary across both time and country of destination). So the task is to have every hs2 number just once, with the corresponding total value of exports.
Also (though that would just be a plus; I could check the numbers myself), it would be nice to get the result sorted by exp_qty, so as to have a ranking of the most exported goods by quantity.
The following might be a start at what you need.
collapse (sum) exp_qty, by(hs2)
gsort -exp_qty
collapse summarizes the data in memory to one observation per value of hs2, summing the values of exp_qty. gsort then sorts the collapsed data by descending value of exp_qty so the first observation will be the largest. See help collapse and help gsort for further details.
