I am trying to get the count of the number of groups in my report. I know I could do it in the SQL, but I am trying to avoid adding redundant data to my dataset if I can.
I have a MainDataSet that could have multiple entries per distinct group item. All I want is the no. of groups not the count of items within the group.
For example, take words grouped by their first letter. Let's say I have only 2 groups, A and B (NB: the number of groups can change dynamically as I filter the MainDataSet based on user parameter selection):
Group | Data
------|-----
A | Apple
A | Ant
B | Balloon
B | Book
B | Bowl
Final Result:
Group | Index | NGroups
A | 1 | 2
B | 2 | 2
I know I can get the Index using an aggregate function as follows:
RunningValue(Fields!Group.Value, CountDistinct, "TablixName")
But how do I get the NGroups value?
I guess I could also create another dataset based on the MainDataSet (make use of a sql function) and do:
SELECT 'X' AS GroupCount, COUNT(Distinct Group) AS NGroups
FROM dbo.udf_MainDataSet()
WHERE FieldX = @Parameter1
Then use a LookUp:
Lookup("X", Fields!GroupCount.Value, Fields!NGroups.Value, "NewDataSet")
But is there a simple solution that I am not seeing?
CountDistinct(Fields!Group.Value, "TablixName")
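To make the difference between the two expressions concrete, here is a rough Python sketch (not SSRS code) that mimics what they compute on the sample rows from the question. The function name `group_index_and_count` is made up for illustration; the running distinct count plays the role of the Index, and the distinct count over the whole scope plays the role of NGroups.

```python
# Sample rows from the question: (Group, Data)
rows = [
    ("A", "Apple"),
    ("A", "Ant"),
    ("B", "Balloon"),
    ("B", "Book"),
    ("B", "Bowl"),
]

def group_index_and_count(rows):
    """Return one (group, index, n_groups) tuple per distinct group."""
    seen = []
    for group, _ in rows:
        if group not in seen:
            seen.append(group)  # first time this group shows up
    # Distinct count over the whole scope, like CountDistinct(..., "TablixName")
    n_groups = len(seen)
    # Running distinct count, like RunningValue(..., CountDistinct, "TablixName")
    return [(g, i, n_groups) for i, g in enumerate(seen, start=1)]

result = group_index_and_count(rows)
for row in result:
    print(row)
```

Because the count is taken over whatever rows are in scope, it follows any filtering applied to the dataset, with no extra dataset or Lookup needed.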
I have a data table that resembles the structure here:
| Prof | PI | Class |
|:----:|:------:|:-----:|
| Dr.K | Louisa | A |
| Dr.L | Jenny | B |
| Dr.X | Liu | C |
Filter 1: I'd like to create two single-selection dropdown parameter filters, the first of which contains the column headers. So, Filter 1 would offer the options Prof, PI, or Class.
Filter 2: The second filter would then dynamically change to represent values of the selected column. If a user chose "Prof" in Filter 1, Filter 2 would show: Dr. K, Dr. L, and Dr. X. The table in the dashboard would then reflect the chosen filters.
I believe choosing "only relevant values" on Filter 2 would take care of some of the issues, but I still don't understand how I can turn the column headers into a list while the values retain the integrity of the original columns. Thank you for any help you can provide!
IF [Parameter 1] = STR("Prof") THEN [Prof] ELSEIF [Parameter 1] = STR("PI") THEN [PI] ELSEIF [Parameter 1] = STR("Class") THEN [Class] END
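The logic behind that calculated field can be sketched outside Tableau as well. This is only an illustration in Python, using the table from the question; `filter_options` and `apply_filters` are hypothetical helper names, not Tableau functions. The first choice picks a column, the second list is built from that column's values, and the rows are then filtered on the chosen value.

```python
# The table from the question, one dict per row
table = [
    {"Prof": "Dr.K", "PI": "Louisa", "Class": "A"},
    {"Prof": "Dr.L", "PI": "Jenny", "Class": "B"},
    {"Prof": "Dr.X", "PI": "Liu", "Class": "C"},
]

def filter_options(table, column):
    """Values offered by the second filter once a column is chosen."""
    return sorted({row[column] for row in table})

def apply_filters(table, column, value):
    """Rows that remain after both filters are applied."""
    return [row for row in table if row[column] == value]

print(filter_options(table, "Prof"))
print(apply_filters(table, "PI", "Jenny"))
```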
In Google Sheets, I'd like to get the last five results for each team as a win/loss streak (e.g. W,L,W,L,L).
My data looks like this:
team A | Win | 21/02/2020 11:32:00
team B | Loss | 09/03/2020 09:38:00
team C | Win | 04/03/2020 14:07:00
team A | Loss | 09/03/2020 16:58:00
team B | Win | 29/01/2020 10:59:00
team C | Win | 16/04/2020 11:27:00
the output I'd like is
team A | W | L |
team B | W | L |
team C | W | W |
I suspect there will be a lookup for the team name, a sort on the date, perhaps an index to get the last 5 dates.
What formula (combination) will I need to get this output?
This formula should get you most of the way there.
If you have your data in A1:C, and a list of your unique team names in column D, put this formula in E1:
=TRANSPOSE(QUERY(QUERY(A$1:C,"select A,B,C
where A = '"&D1&"' order by C desc limit 5 " ,0),
"select Col2 order by Col3"))
The inner query matches the team name, orders the data in descending date order, and selects the first five records (the five latest).
The outer query reverses the date order - newest at the end - and selects only the Win/Loss column.
This "column" is then transposed to go across as a row of data, for that specific team.
I'm not good at REGEXREPLACE, but it could easily convert Win to W and Loss to L. SUBSTITUTE would work too.
And I'll see if I can make it into an ARRAYFORMULA. As it is now, you copy the formula down.
Here is a sample sheet.
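If it helps to see the two QUERY steps spelled out, here is a small Python sketch of the same idea (not a Sheets formula). It uses the sample rows from the question and assumes the dates are in day/month/year format; the function name `last_results` is just for illustration.

```python
from datetime import datetime

# Sample rows from the question: (team, result, date)
games = [
    ("team A", "Win", "21/02/2020 11:32:00"),
    ("team B", "Loss", "09/03/2020 09:38:00"),
    ("team C", "Win", "04/03/2020 14:07:00"),
    ("team A", "Loss", "09/03/2020 16:58:00"),
    ("team B", "Win", "29/01/2020 10:59:00"),
    ("team C", "Win", "16/04/2020 11:27:00"),
]

def last_results(games, team, n=5):
    """Last n results for a team, oldest first, shortened to W/L."""
    fmt = "%d/%m/%Y %H:%M:%S"  # assumption: day/month/year
    rows = [(datetime.strptime(d, fmt), r) for t, r, d in games if t == team]
    latest = sorted(rows, reverse=True)[:n]  # inner query: newest first, limit n
    # outer query: back to oldest-first, keep only the first letter of the result
    return [result[0] for _, result in sorted(latest)]

for team in ("team A", "team B", "team C"):
    print(team, last_results(games, team))
```

Taking the first letter of the result does the Win to W and Loss to L conversion in the same step.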
Does https://crate.io support facets (for faceted search)?
I didn't find anything in the docs. ElasticSearch replaced facets with aggregations in 2014, but the aggregation section in the crate docs only talks about SQL aggregation functions.
My use case:
I've got a list of web sites; each record has a domain and a language field. When displaying the search results, I want to get a list of all domains that the search results appear in, as well as a list of all languages, ordered by number of occurrences so search results can be narrowed down. The number of results for each facet value should also be given.
Screenshot with facets:
There is no way to get the facets I want from crate itself.
Instead we're enabling the ElasticSearch REST API in crate.yml now
es.api.enabled: true
.. and can use the ElasticSearch aggregation API.
Crate doesn't support facets or Elasticsearch aggregations directly. Like you suggested, you can always turn on the Elasticsearch API. However, there are other ways to get these aggregations.
1) Have you considered issuing multiple queries to the cluster? For example, if you load your page dynamically with JavaScript, you can first return the search results and load the facets later. This should also decrease the perceived response time of the application.
2) In CrateDB 2.1.x, there will be support for subqueries, which allow you to include the facets within your query:
select q1.id, q1.domain, q1.tag, q2.d_count, q3.t_count from websites q1,
(select domain, count(*) as d_count from websites where text like '%query%' group by domain) q2,
(select tag, count(*) as t_count from websites where text like '%query%' group by tag) q3
where q1.domain = q2.domain and q1.tag = q3.tag and q1.text like '%query%'
order by q1.id
limit 5;
This gives you a result table like this, where you have the search results alongside the domain and tag counts for the query:
+----+-------------+-------------+---------+---------+
| id | domain      | tag         | d_count | t_count |
+----+-------------+-------------+---------+---------+
| 1  | example.com | example     | 2       | 3       |
| 14 | crate.io    | software    | 1       | 4       |
| 17 | google.com  | search      | 5       | 2       |
| 29 | github.com  | open-source | 3       | 3       |
| 47 | linux.org   | software    | 2       | 4       |
+----+-------------+-------------+---------+---------+
Disclaimer: I'm new to Crate :)
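For what it's worth, the facet counts themselves are just "group by field, count, order by count" over the matching rows, so they can also be computed client-side. Below is a minimal Python sketch of that idea on made-up records; the field names `domain` and `language` come from the question, while the data and the function `search_with_facets` are purely hypothetical and do not use any Crate API.

```python
from collections import Counter

# Hypothetical records standing in for the websites table
websites = [
    {"id": 1, "domain": "example.com", "language": "en", "text": "query one"},
    {"id": 2, "domain": "example.com", "language": "de", "text": "query two"},
    {"id": 3, "domain": "crate.io", "language": "en", "text": "query three"},
    {"id": 4, "domain": "crate.io", "language": "en", "text": "other"},
    {"id": 5, "domain": "example.com", "language": "en", "text": "query four"},
]

def search_with_facets(records, term, fields=("domain", "language")):
    """Return matching records plus per-field value counts, most frequent first."""
    hits = [r for r in records if term in r["text"]]
    facets = {f: Counter(r[f] for r in hits).most_common() for f in fields}
    return hits, facets

hits, facets = search_with_facets(websites, "query")
print([r["id"] for r in hits])
print(facets)
```

This only makes sense when the full set of matching rows is small enough to pull back; for large result sets the counts should be computed by the database, for example with one GROUP BY query per facet.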
I've searched quite a bit for this and can't find a good solution anywhere to what seems to me like a normal problem for this product.
I've got an in-memory data table that comes from a rollup table (call it 'Ranges'). It looks basically like this:
id | name | f1 | f2 | totals
0 | Channel1 | 450 | 680 | 51
1 | Channel2 | 890 | 990 | 220
...and so on
Which creates a bar chart with Name on the X and Totals on the Y.
I have another table that is an external link to a large (500M+ rows) table. That table (call it 'Actuals') has a column ('Fc') that can fit inside the F1 and F2 values of Ranges.
I need a way for Spotfire Analyst (v7.x) to use the selection in the bar chart for Ranges to trigger this select statement:
SELECT * FROM Actuals WHERE Actuals.Fc between [Ranges].[F1] AND [Ranges].[F2]
But there aren't any relationships (Foreign keys) between the two data sources, one is in memory (Ranges) and the other is dynamic loaded.
TLDR: How do I use the selected rows from one visualization as a filter expression for another visualization's data?
My choice for the workaround:
Add a button which says 'Load Selected Data'
This will run the following code, which will store the values of F1 and F2 in a Document Property, which you can then use to filter your Dynamically Loaded table and trigger a refresh (either with the refresh code or by setting it to load automatically).
# Rows of IL_Ranges that are currently marked in the active marking
rowIndexSet=Document.ActiveMarkingSelectionReference.GetSelection(Document.Data.Tables["IL_Ranges"]).AsIndexSet()
if rowIndexSet.IsEmpty != True:
    # Store F1 and F2 of the first marked row in document properties
    Document.Properties["udF1"] = Document.Data.Tables["IL_Ranges"].Columns["F1"].RowValues.GetFormattedValue(rowIndexSet.First)
    Document.Properties["udF2"] = Document.Data.Tables["IL_Ranges"].Columns["F2"].RowValues.GetFormattedValue(rowIndexSet.First)
# Refresh the dynamically loaded table if it exists and is out of date
if Document.Data.Tables.Contains("IL_Actuals")==True:
    myTable=Document.Data.Tables["IL_Actuals"]
    if myTable.IsRefreshable and myTable.NeedsRefresh:
        myTable.Refresh()
This is currently operating on the assumption that you will not allow your user to view multiple ranges at a time, and simply shows the first one selected.
If you DO want to allow them to view multiple ranges, you can loop over the selected rows of your IL_Ranges table and do one of two things. Either get the overall Min and Max of the selected values and limit the Actuals to between that min and max, or build a string that essentially says 'Fc between 450 and 680 or Fc between 890 and 990', pass it to a stored procedure as a string, have the procedure execute the quasi-dynamic statement, and grab the resulting dataset.
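As a rough sketch of the string-building part only, here it is in plain Python with the F1/F2 pairs hard-coded. In the real script they would be read from the marked rows of IL_Ranges; the function name `build_range_filter` and the `selected` list are assumptions for illustration, not part of the Spotfire API.

```python
def build_range_filter(ranges, column="Fc"):
    """Build 'Fc between a and b or Fc between c and d ...' from (low, high) pairs."""
    # int() guards against anything non-numeric ending up in the SQL text
    parts = ["%s between %d and %d" % (column, int(lo), int(hi)) for lo, hi in ranges]
    return " or ".join(parts)

# Hypothetical selection: the F1/F2 values of two marked rows
selected = [(450, 680), (890, 990)]

filter_expr = build_range_filter(selected)
print(filter_expr)

# Alternative: a single overall range from the smallest F1 to the largest F2
overall = (min(lo for lo, _ in selected), max(hi for _, hi in selected))
print(overall)
```

Note that the single overall range also pulls in rows that fall in the gap between the selected ranges (681 to 889 here), so the OR-ed string is the more precise of the two options.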
I noticed that LIMIT queries will return more than the expected number of rows when they are executed against tables that contain nested or repeated data. For example, the following query run against the persons sample data set from the developer guide produces the following results:
% bq query 'SELECT fullName, children.name FROM [persons.person] LIMIT 1'
+----------+---------------+
| fullName | children_name |
+----------+---------------+
| John Doe | Jane |
| John Doe | John |
+----------+---------------+
It looks like BQL is applying the LIMIT operator before flattening the results as opposed to the other way around (which I think would make more sense).
Is this a bug in the BQL implementation or is this the expected behavior? If this is the expected behavior can someone please provide an explanation for why this makes sense?
This is expected given the way BigQuery flattens query results. When you run the query, the LIMIT 1 applies to the repeated record. Then the results get flattened in the output, and you get two rows. A workaround is to use an explicit flatten operation. For example:
SELECT fullName, children.name
FROM (FLATTEN([persons.person], children.name)) LIMIT 1
This will return only a single row.
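The order of operations is easier to see in a toy model. The Python sketch below is not how BigQuery is implemented; it just contrasts "limit, then flatten" with "flatten, then limit" on illustrative records with a repeated `children` field (the data and the `flatten` helper are made up for the example).

```python
from itertools import islice

# Illustrative records with a repeated field
persons = [
    {"fullName": "John Doe", "children": ["Jane", "John"]},
    {"fullName": "Mike Jones", "children": ["Earl", "Sam", "Kit"]},
]

def flatten(records):
    """Emit one (fullName, child) row per repeated value."""
    for person in records:
        for child in person["children"]:
            yield (person["fullName"], child)

# LIMIT 1 applied to the nested records, output flattened afterwards
limit_then_flatten = list(flatten(islice(persons, 1)))

# Explicit flatten first, LIMIT 1 applied to the flattened rows
flatten_then_limit = list(islice(flatten(persons), 1))

print(limit_then_flatten)
print(flatten_then_limit)
```

The first variant keeps one record but expands it into one row per child, which matches the two rows seen in the question; the second returns a single row.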