Kusto query remove double entries - distinct

I have a query that returns several entries, each with a timestamp and an operation_Id.
Entries 2 and 3 have the same operation_Id but different timestamps. How can I remove the duplicate operation_Id so that only the first entry is kept, while still displaying the timestamp?
| timestamp | operation_Id | name |
| :--- | :--- | :--- |
| 2022-10-28T06:13:05.789Z | 12d83416-0c94-4c98-9523-603b7e634a14 | iOS |
| 2022-10-28T03:50:44.249Z | 642bb5d7-69e5-437a-b086-d89eec93438b | iOS |
| 2022-10-28T03:50:42.662Z | 642bb5d7-69e5-437a-b086-d89eec93438b | iOS |
I know I can use "distinct".
| distinct operation_Id, OS;
| operation_Id | name |
| :--- | :--- |
| 12d83416-0c94-4c98-9523-603b7e634a14 | iOS |
| 642bb5d7-69e5-437a-b086-d89eec93438b | iOS |
| 642bb5d7-69e5-437a-b086-d89eec93438b | iOS |
But how do I now add the timestamp?
I cannot do something like this, because then I am back to my first problem :-)
| distinct timestamp, operation_Id, OS;
I also tried "summarize", but it aggregated across all rows even when the operation_Id values were different.

arg_min()
datatable(timestamp:datetime, operation_Id:string, name:string)
[
datetime(2022-10-28T06:13:05.789Z) ,"12d83416-0c94-4c98-9523-603b7e634a14" ,"iOS"
,datetime(2022-10-28T03:50:44.249Z) ,"642bb5d7-69e5-437a-b086-d89eec93438b" ,"iOS"
,datetime(2022-10-28T03:50:42.662Z) ,"642bb5d7-69e5-437a-b086-d89eec93438b" ,"iOS"
]
| summarize arg_min(timestamp, *) by operation_Id
| operation_Id | timestamp | name |
| :--- | :--- | :--- |
| 642bb5d7-69e5-437a-b086-d89eec93438b | 2022-10-28T03:50:42.662Z | iOS |
| 12d83416-0c94-4c98-9523-603b7e634a14 | 2022-10-28T06:13:05.789Z | iOS |
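For comparison, the same "one row per operation_Id, keeping the earliest timestamp" logic (what `arg_min` does) can be sketched in plain Python. This is just an illustrative equivalent using the sample rows above, not a replacement for the KQL:

```python
from datetime import datetime

# Rows mirroring the sample data above: (timestamp, operation_Id, name)
rows = [
    (datetime(2022, 10, 28, 6, 13, 5), "12d83416-0c94-4c98-9523-603b7e634a14", "iOS"),
    (datetime(2022, 10, 28, 3, 50, 44), "642bb5d7-69e5-437a-b086-d89eec93438b", "iOS"),
    (datetime(2022, 10, 28, 3, 50, 42), "642bb5d7-69e5-437a-b086-d89eec93438b", "iOS"),
]

# Keep the row with the minimum timestamp per operation_Id (arg_min equivalent)
earliest = {}
for ts, op_id, name in rows:
    if op_id not in earliest or ts < earliest[op_id][0]:
        earliest[op_id] = (ts, op_id, name)

deduped = sorted(earliest.values())  # two rows, one per operation_Id
```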

Creating dynamic data validation

Given the following dataset in one sheet, let's say 'Product Types' (columns and headers displayed):
| A | B | C |
| :----------: | :------: | :-----: |
| Product Type | Desktops | Laptops |
| Desktops | Dell | Dell |
| Laptops | HP | Apple |
In another sheet, let's say 'Assets', I've set column A to require a match to the data validation of an item listed in column A of 'Product Types' (not including the header). What I'm trying to do is that once column A is selected in 'Assets', I'd like to create a dynamic query data validation that will then present the values of the column with the header in 'Product Types'.
As an example, in the 'Assets' sheet, if column A has "Laptops" selected, column B will use data validation for values under the "Laptops" column in 'Product Types'; then giving the only options as "Dell" or "Apple". Alternatively, if ColA is changed to "Desktops", data validation is defined to only allow "Dell" or "HP" as options.
I'm unsure if this is possible. However, data validation in Google Sheets claims to allow a range or "formula".
I don't remember where I sourced this formula from, but it can present the values I need when running the query within a cell. However, I'm unable to use the same formula within a data validation field.
=ARRAYFORMULA(IFERROR(VLOOKUP(A2, TRANSPOSE({'Product Types'!A1:M1;
REGEXREPLACE(TRIM(QUERY(IF('Product Types'!A2:M<>"", 'Product Types'!A2:M&",", )
,,999^99)), ",$", )}), 2, 0)))
The above query presents the correct comma-separated values of the column I want in 'Product Types', but I'm not sure if this can be translated into something data validation can use or if there's altogether a different method to accomplish this.
P.S. I'm new here. The Markdown for the table seems to work when editing, but not when published.
The answer is no: data validation does not support direct input of complex formulae. You will need to run your formula in a helper column and then reference that column's range within the data validation rule.

Calculate average Time span in Azure Application Insights for Trace or Events

I'm currently evaluating a use case in Azure Application Insights, but I'm open to using any other framework or infrastructure that fits best.
Basically, I have a desktop application that logs some events or traces (I don't know exactly which one it should be). Examples of events (or traces?):
| timestamp | state | user |
| :--- | :--- | :--- |
| yyyy-mm-dd 12:00 | is_at_home | John |
| yyyy-mm-dd 15:00 | is_at_work | John |
| yyyy-mm-dd 18:00 | is_outside | John |
Users are considered to remain in the last state received until a new event comes in.
I need to extract data to answer questions like this:
I want to see if the total duration John is at home is growing or going down.
I want to get in which states the users pass most time.
I want the average duration of the state "is_at_work". And if it's going down or up over time.
So, can Application Insights output this kind of analysis? If not, which architecture/platform should I use? Am I using the right keywords to describe what I want?
Thank you
The AI/Log Analytics query language (KQL) supports all kinds of analysis like that. The trick will be getting your queries exactly right: here, you'll have to figure out exactly how to calculate the times between rows as the "state" changes.
Here's my first attempt:
let fakeevents = datatable (timestamp: datetime, state: string, user: string ) [
datetime(2021-08-02 12:00), "is_at_home" , "John" ,
datetime(2021-08-02 15:00), "is_at_work" , "John",
datetime(2021-08-02 18:00), "is_outside" , "John",
datetime(2021-08-02 11:00), "is_at_home" , "Jim" ,
datetime(2021-08-02 12:00), "is_at_work" , "Jim",
datetime(2021-08-02 13:00), "is_outside" , "Jim",
];
fakeevents | partition by user (
order by user, timestamp desc |
extend duration = prev(timestamp, 1, now()) - timestamp
)
gets me:
| timestamp | state | user | duration |
| :--- | :--- | :--- | :--- |
| 2021-08-02T18:00:00Z | is_outside | John | 06:20:23.1748874 |
| 2021-08-02T15:00:00Z | is_at_work | John | 03:00:00 |
| 2021-08-02T12:00:00Z | is_at_home | John | 03:00:00 |
| 2021-08-02T13:00:00Z | is_outside | Jim | 11:25:14.6912472 |
| 2021-08-02T12:00:00Z | is_at_work | Jim | 01:00:00 |
| 2021-08-02T11:00:00Z | is_at_home | Jim | 01:00:00 |
Before you send any real data, you can create "fake" data by using the datatable operator to make a fake table full of rows.
You can then apply operators like summarize to calculate things like which state had the maximum duration, etc. Note the use of partition by user to make sure each user is treated separately. In my example I use now() when there's no later event to end the duration of a state; you'll want to do something there, otherwise you'll have blank cells.
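The same "duration = time until the user's next event" idea can be sketched in plain Python, as a sanity check on the logic rather than a substitute for KQL. The `end` cutoff here is a hypothetical stand-in for the `now()` fallback in the query above:

```python
from datetime import datetime, timedelta
from itertools import groupby

# Fake events mirroring the datatable above: (timestamp, state, user)
events = [
    (datetime(2021, 8, 2, 12), "is_at_home", "John"),
    (datetime(2021, 8, 2, 15), "is_at_work", "John"),
    (datetime(2021, 8, 2, 18), "is_outside", "John"),
    (datetime(2021, 8, 2, 11), "is_at_home", "Jim"),
    (datetime(2021, 8, 2, 12), "is_at_work", "Jim"),
]

def durations(events, end=datetime(2021, 8, 2, 20)):
    """Per user, the duration of each state is the gap to the user's next
    event; `end` caps the open-ended last state (the KQL uses now())."""
    out = []
    by_user = lambda e: e[2]
    for user, group in groupby(sorted(events, key=by_user), key=by_user):
        rows = sorted(group)  # chronological within this user's partition
        for i, (ts, state, _) in enumerate(rows):
            nxt = rows[i + 1][0] if i + 1 < len(rows) else end
            out.append((user, state, nxt - ts))
    return out
```

From the result you can then aggregate, e.g. total or average duration per state, the way summarize would in KQL.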

How to bind values from CSV files with a query to database?

I'm trying to build a report using BIRT. I defined several data sources: two CSV files and a MySQL database. The query that retrieves data from the database looks like this:
SELECT applicationType, STATUS, COUNT(*)
FROM cards
GROUP BY applicationType, STATUS;
Then I created a table with three columns that outputs these values from the query:
So far so good. But I want to output values from CSV-files instead of applicationType and status. The first file, apptype.csv, has the following structure:
applicationType,apptypedescr
1,"Common Type"
2,"Type 1"
...
and the second one, statuscards.csv, has the following structure:
status,statuscards
1,"Blocked"
2,"Normal"
...
And instead of:

| Application Type | Card Status | Count |
| :--- | :--- | :--- |
| 1 | 2 | 55 |

I want to output the following:

| Application Type | Card Status | Count |
| :--- | :--- | :--- |
| Common Type | Normal | 55 |
I also created a New Joint Data Set to bind the MySQL dataset and the first file's dataset:
But I don't know how to change the table now. As far as I understand, [applicationType] in the first column should be replaced with [apptypedescr]:
but I'm not able to drag this field into the table, it's possible to add it to the report only outside the table. How can I bind these values from the CSV files to data from the MySQL query in the table?
I did this by setting a new dataset for the table in Properties -> Binding -> DataSet. After this, the report was built properly:
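Outside of BIRT, the underlying "replace codes from the query with descriptions from the CSV files" join can be sketched in a few lines of Python. The inline CSV strings are hypothetical stand-ins for the contents of apptype.csv and statuscards.csv shown above:

```python
import csv
import io

# Hypothetical contents of apptype.csv and statuscards.csv (see above)
apptype_csv = "applicationType,apptypedescr\n1,Common Type\n2,Type 1\n"
status_csv = "status,statuscards\n1,Blocked\n2,Normal\n"

def load_lookup(text, key, value):
    """Build a code -> description map from a CSV file's contents."""
    return {row[key]: row[value] for row in csv.DictReader(io.StringIO(text))}

app_types = load_lookup(apptype_csv, "applicationType", "apptypedescr")
statuses = load_lookup(status_csv, "status", "statuscards")

# A row as returned by the GROUP BY query: (applicationType, status, count)
query_rows = [("1", "2", 55)]
report_rows = [(app_types[a], statuses[s], n) for a, s, n in query_rows]
# report_rows == [("Common Type", "Normal", 55)]
```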

Using filter inside a calculated member

I'm having some MDX issues. I want to calculate how many products I have per type per version; this would be my output:
ProductID | QtyProductAVersionA | QtyProductAVersionB | QtyProductBVersionA | QtyProductBVersionB |
I have this MDX so far:
WITH MEMBER [Measures].[ProductAVersionA]
AS SUM([DimProduct].[ProductName].&[ProductA],[Measures].[ProductQty])
SELECT NON EMPTY (
[Measures].[ProductAVersionA]) ON COLUMNS,
NON EMPTY [DimOrg].[ProductID].[ProductID].MEMBERS ON ROWS
FROM [Sales]
WHERE([DimCustomers].[Customer Area].&[United States])
But this returns the total of product A; I want only product A filtered by version A. I can't use the version in the WHERE clause since not all my products have the same versions.
Is there any way I can achieve this with a Filter expression inside the calculated member? I tried to use one but I kept getting an error.
FYI, product version is in another dimension, [DimVersion].
Any help would be appreciated
You could try restricting the ProductAVersionA calculated member with a tuple that includes the DimVersion VersionA member, as follows:
member [Measures].[ProductAVersionA] as
sum(([DimProduct].[ProductName].&[ProductA], [DimVersion].[VersionName].&[VersionA]), [Measures].[ProductQty])

RDL/RDLC Matrix Report Column Header Formatting

I am trying to format column headers in the Matrix on an RDLC report. The columns are specified as DateTime in my dataset, and if I leave the column alone, e.g.:
=Fields!FinancialsTableMonthYear.Value
it displays fine, e.g.: 1/1/2009 | 2/1/2009 | 3/1/2009 etc.
But if I put any formatting on the column header, e.g.:
=MonthName(Fields!FinancialsTableMonthYear.Value, true)
it displays: #Error | #Error | #Error etc.
I have also tried:
=Year(Fields!FinancialsTableMonthYear.Value)
Any ideas?
You should use =MonthName(Month(Fields!FinancialsTableMonthYear.Value)) — MonthName expects a month number, not a date.
Alternatively, if you want to show the name of the month, you can use a format string:
=Format(Fields!FinancialsTableMonthYear.Value, "MMM")
Change the "MMM" to any format you want.