I have data coming from Logstash that shows how much space is used by each table in a database and the maximum allocated capacity for each table. I want to create a gauge in Kibana for every table that shows how much space is currently occupied.
The problem is that the maximum available space sometimes changes, so the gauge's limit has to be dynamic, and I can't figure out how to do this. I also don't know how to restrict the dashboard's time range so it only shows data from the current day. The data coming from Logstash looks like this:
time | table_name | used_gb | max_gb
---------+------------+---------+--------
25.04.18 | table_1 | 1.2 | 10.4
25.04.18 | table_2 | 4.6 | 5.0
26.04.18 | table_1 | 1.4 | 14.6
26.04.18 | table_2 | 4.9 | 5.0
I want my gauge for every table to look something like that.
This problem can be solved with the Time Series Visual Builder (TSVB).
Choose the Gauge visualization, then under Panel options set the max value to 1. In the gauge's data settings you can then compute the dynamic ratio per table (used space divided by maximum space) with a Bucket Script aggregation; the reference below includes a screenshot of a similar setup.
In older versions of Kibana, use the Calculation aggregation instead of the Bucket Script.
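For the current-day requirement, the dashboard's time picker can simply be set to "Today". If you also want to check the ratio outside TSVB, the same computation can be expressed as an Elasticsearch aggregation with a bucket_script. Below is a minimal sketch in Python; the host, the logstash-* index pattern, and the field names (time, table_name, used_gb, max_gb, taken from the sample data above) are assumptions to adjust to your mapping.

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed host

query = {
    "size": 0,
    # restrict to the current day using Elasticsearch date math
    "query": {"range": {"time": {"gte": "now/d", "lte": "now"}}},
    "aggs": {
        "per_table": {
            "terms": {"field": "table_name.keyword"},
            "aggs": {
                "used": {"max": {"field": "used_gb"}},
                "capacity": {"max": {"field": "max_gb"}},
                "fill_ratio": {  # used / max, i.e. the 0..1 value the gauge should show
                    "bucket_script": {
                        "buckets_path": {"used": "used", "cap": "capacity"},
                        "script": "params.used / params.cap"
                    }
                }
            }
        }
    }
}

resp = es.search(index="logstash-*", body=query)
for bucket in resp["aggregations"]["per_table"]["buckets"]:
    print(bucket["key"], round(bucket["fill_ratio"]["value"], 3))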
Reference:
https://discuss.elastic.co/t/gauge-with-dynamic-maximum-value/130634/2
I set up an Elastic Stack and imported millions of log entries. Each log entry contains a timestamp and a session ID. Each session produces multiple log entries, so I have the following information available:
SessionID | Timestamp
1234 | stamp1
1234 | stamp2
2223 | stamp3
1234 | stamp4
5566 | stamp5
5566 | stamp6
2223 | stamp7
Now I would like to calculate the average/minimum/maximum session duration.
Does anyone know how to achieve this?
Thanks in advance
Doing exactly what you want isn't going to be simple; I'm not even convinced it's possible with your data in its current form.
I'm also not sure what having the average, minimum and maximum session lengths actually gives you in terms of actionable information - why do you need the max/min/avg session times?
Something that could be easily visualised using your data would be session count against a date histogram. From Kibana, create a line graph visualisation. On the y-axis do a unique count of the session ID, on the x-axis select a date histogram and use your timestamp field (a sketch of the underlying aggregation follows below).
I would have thought knowing the session count over a period of time would give you a better idea for capacity planning than knowing max/min session times - perhaps you have already done this? This assumes each session logs regularly... If you zoom in too far (i.e. between log events) the graph will look choppy, but it should smooth out as you zoom out, and there are options available for smoothing.
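For reference, the aggregation behind that line graph is a date histogram with a cardinality (unique count) sub-aggregation. A minimal sketch in Python, assuming fields named SessionID and Timestamp, an index pattern of logs-*, and Elasticsearch 7.x (older versions use interval instead of fixed_interval):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed host

query = {
    "size": 0,
    "aggs": {
        "over_time": {
            # one bucket per hour; pick an interval that matches how far you zoom in
            "date_histogram": {"field": "Timestamp", "fixed_interval": "1h"},
            "aggs": {
                # approximate unique count of sessions active in each bucket
                "active_sessions": {"cardinality": {"field": "SessionID"}}
            }
        }
    }
}

resp = es.search(index="logs-*", body=query)
for bucket in resp["aggregations"]["over_time"]["buckets"]:
    print(bucket["key_as_string"], bucket["active_sessions"]["value"])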
I want to accomplish something easy to understand (and maybe easy to do but I can't find a way...).
I have a table which represents the date when a client has bought something.
Let's have this example:
Purchase_id | Purchase_date | Client_id
------------+---------------+----------
1           | 2016/03/02    | 1
2           | 2016/03/02    | 2
3           | 2016/03/11    | 3
I want to create a single-number card that shows the average number of purchases per day.
So for this example, the result would be:
Result = 3 purchases / 2 different days = 1.5
I managed to do it by grouping my query by Purchase_date, with a new column that counts the number of rows.
That gives me the following result:
Purchase_date | Number of rows
--------------+---------------
2016/03/02    | 2
2016/03/11    | 1
Then I put the field Number of rows in a single number card, selecting "Average".
I should point out that I am using DirectQuery with SQL Server.
But the problem is that I want to be able to filter on Client_id, and once I do the grouping, I lose that column.
Is there a way to use Client_id as a parameter?
Maybe grouping is not even the right approach here.
Thank you in advance.
You can create a measure to calculate this average.
From Power BI's docs:
"The calculated results of measures are always changing in response to your interaction with your reports, allowing for fast and dynamic ad-hoc data exploration."
This means that filtering on Client_id will change the measure accordingly.
Here is an easy way of defining this measure:
Result = DISTINCTCOUNT(tableName[Purchase_id]) / DISTINCTCOUNT(tableName[Purchase_date])
I'm using a small collection of web scrapers to get the current GPS location of various devices. I also want to keep historical records. What's the best way of doing this without storing the data twice? For now I have two tables, both looking like this:
Column | Type | Modifiers | Storage | Description
---------+-----------------------------+---------------+----------+-------------
vehicle | character varying(20) | | extended |
course | real | | plain |
speed | real | | plain |
fix | smallint | | plain |
lat | real | | plain |
lon | real | | plain |
time | timestamp without time zone | default now() | plain |
One is named gps, and the other gps_log. The function that updates them does two things: first it performs an INSERT on gps_log, and then it does an UPDATE OR INSERT (a user-defined function) on gps. However, this seems to me like a pointless case of storing the data twice just to have easy SELECT access to the current data.
Is there a simple way of using only gps_log and having a function select just the newest entry for each vehicle? Keep in mind that gps_log currently has 1,397,150 rows and grows by roughly 150 rows every 15 minutes, so performance is likely to be an issue.
Using PostgreSQL 8.4 via Perl DBI.
If SELECT performance is paramount, your current solution with redundant storage might not be such a bad idea.
If you get rid of the redundant table, you can help SELECT performance with a multi-column index like:
CREATE INDEX gps_log_vehicle_time ON gps_log (vehicle, time DESC);
Assuming that vehicle is your primary key.
That would make the corresponding query pretty fast:
SELECT *
FROM gps_log
WHERE vehicle = 'foo'
ORDER BY time DESC
LIMIT 1;
To SELECT the last entry for several or all vehicles in one query, use a greatest-n-per-group technique such as DISTINCT ON (sketched below).
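A minimal sketch of that query, shown here with Python and psycopg2 purely for illustration (the original setup uses Perl DBI; the SQL is the relevant part and works on PostgreSQL 8.4; the connection string is an assumption):

import psycopg2  # assumed driver; any client that can run the SQL will do

conn = psycopg2.connect("dbname=tracking")  # assumed connection string
cur = conn.cursor()

# DISTINCT ON keeps the first row per vehicle according to the ORDER BY,
# i.e. the newest fix for every vehicle, in a single index-assisted query.
cur.execute("""
    SELECT DISTINCT ON (vehicle)
           vehicle, course, speed, fix, lat, lon, time
    FROM   gps_log
    ORDER  BY vehicle, time DESC
""")

for vehicle_row in cur.fetchall():
    print(vehicle_row)

cur.close()
conn.close()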
Total storage size would probably grow, though, because the index will be bigger than the redundant table (+ its index) if you have many rows per vehicle.
It might help storage and performance to add a serial column as a surrogate primary key instead of vehicle, especially if you have foreign keys pointing to it.
Aside: don't use time as a column name. It's a type name in PostgreSQL and a reserved word in every SQL standard, and it's also misleading to name a timestamp column time.
I'm almost completely new to HBase. I would like to move my current MySQL-based site tracking to HBase because MySQL simply doesn't scale anymore.
I'm totally lost at the first step...
I need to track different user actions and be able to aggregate them by various aspects (date, country they come from, product the action was performed on, etc.).
The way I currently store this is a table with a composite primary key made up of all these aspects (country, date, product, ...), while the rest of the fields are counters for the actions. When an action is performed, I insert a row, incrementing the action's column by one (ON DUPLICATE KEY UPDATE ...).
*date | *country | *product | visited | liked | put_to_basket | purchased
2011-11-11 | US | 123 | 2 | 1 | 0 | 0
2011-11-11 | GB | 123 | 23 | 10 | 5 | 4
2011-11-12 | GB | 555 | 54 | 0 | 10 | 2
I have a feeling that this is completely against the HBase way, that it doesn't really scale (with a growing number of keys, inserts get expensive), and that it isn't really flexible.
How can I track user actions with their attributes effectively in HBase? What should the table(s) look like? Where does MapReduce come into the picture?
Thanks for all suggestions!
Lars George's "HBase: The Definitive Guide" explains a design very similar to what you want to achieve in its introduction chapter.
This can be done as follows.
Use a composite row key in HBase:
rowkey = date + country + product (concatenate these into a single value and use it as the row key)
Then have the counters as columns. When you get an event, for example:
if (event == LIKED) {
    // increment the "liked" counter column by 1 for the corresponding row key
}
and so on for the other cases.
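A minimal sketch of that counter pattern, using the Python happybase client purely for illustration (the canonical HBase client is the Java API; the Thrift host, table name user_actions, and column family counters are assumptions):

import happybase  # Thrift-based Python client; requires a running HBase Thrift server

connection = happybase.Connection('localhost')  # assumed Thrift host
table = connection.table('user_actions')        # assumed table with a 'counters' column family

def record_event(date, country, product, action):
    # Row key follows the date + country + product scheme described above.
    row_key = '{}|{}|{}'.format(date, country, product).encode()
    # Atomic server-side counter increment (like incrementColumnValue in the Java API).
    table.counter_inc(row_key, 'counters:{}'.format(action).encode())

record_event('2011-11-11', 'GB', '123', 'liked')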
Hope this helps!!
I'm looking to add a column to my SSRS matrix that will give me the percentage of the row's total.
I'm using the following expression, but I keep getting 100% for my percentages. I assume this is because the total is evaluated last, so it's just doing Total/Total?
=FORMAT((Fields!ID.Value/SUM(Fields!ID.Value)), "P")
The field ID is calculated within SQL, not SSRS.
For example:
Site | Value 1 | %1 | Value2 | %2 | Total
1 | 20 | 50% | 20 | 50% | 40
This is probably happening because you need to define the right scope for the SUM function:
SUM(Fields!ID.Value,"group_name") instead of plain SUM(Fields!ID.Value)
Update: once I had Reporting Services available I put together an example showing the result and the field values.
It's hard to provide details without more info on how your groups are set up, but you should look at using the scope argument of aggregate functions like SUM or First:
=SUM(Fields!ID.Value, "NameOfRowGrouping") / SUM(Fields!ID.Value, "TopLevelGroupName")
Also, to keep things clean, you should move your formatting out of the expression and into the placeholder properties or textbox properties of the textbox that contains your value.