Kibana - Limit "sum" metric on data table - elasticsearch

In Kibana's visualize screen I'm trying to create a "sum" metric and I would like to limit the values of this sum (between 0.1 and X), but I only get to limit the field values before aggregation is performed.
I have designed a simple metric that consists in the aggregation of a field called Timestamp_Med. I don't know how to make the filter properly,
so I only get the filter applied on the field, but not on the sum metric.
For example, if I have these rows:
ID Session_Id Timestamp_Med
1 1 5
2 1 7
3 1 3
4 2 6
5 3 0
6 3 2
7 4 15
8 5 4
9 6 0
10 6 4
My metric must sum the Timestamp_Med field, group by the Session_Id
Session_Id Sum(Timestamp_Med)
1 15
2 6
3 2
4 15
5 4
6 4
I apply the filter: (Timestamp_Med: [1 TO 5]) I get the results for the original records with ID 1,3,5,6,8,9 and 10 with the next session values for the metric:
Session_Id Sum(Timestamp_Med)
1 8
3 2
5 4
6 4
But I want to filter values in "Sum(Timestamp_Med)", not in "Timestamp_Med", and get:
Session_Id Sum(Timestamp_Med)
3 2
5 4
6 4
Could I do this filter? Is it possible?
Thank you.

Related

Maximize the minimum score

Given a grid of dimensions A*B with values between 1-9, find a sequence of B numbers that maximizes the minimum number of values matched when compared with A rows.
Describe the certain steps you would take to maximize the minimum score.
Example:
Grid Dimension
A = 5 , B = 10
Grid Values
9 3 9 2 9 9 4 5 7 6
6 3 4 2 8 5 7 5 9 2
4 9 5 8 3 7 3 2 7 6
7 5 8 9 9 4 7 3 3 7
2 6 8 3 2 4 5 4 2 2
Possible Answer
6 3 8 2 9 4 7 5 7 4
Score Calculation
This answer scores
5 when compared with Row 1
5 when compared with Row 2
1 when compared with Row 3
4 when compared with Row 4
2 when compared with Row 5
And thus the minimal score for this answer is 1.
I would go for a local hill-climbing approach that you can complement with a randomization to avoid local minima. Something like:
1. Generate a random starting solution S
2. Compute its score score(S, row) for each row. We'll call min_score(S) the minimum score among all rows for S.
3. Attempt to improve the solution with:
For each digit i (1..B) in S:
If i belongs to a row such that score(S, row) > (min_score(S) + 1) then:
Change i to be the digit of a row with min_score(S). If there was only one row with min_score(S), then min_score(S) has improved by 1
Update the scores of all the rows.
If min_score(S) hasn't improved for more than N iterations of 3, go back to 1 and start with a new random solution.

Making a scatterplot with PCA and how to read results

Im a little newbie with R and not familiar with PCA. My problem is, from a survey I have a list with observations from nine variables, first one is the gender of the respondents, the next five (Q51_1_c,Q51_2_c,Q51_4_c,Q51_6_c,Q51_7_c) ask about entrepreneurial issues and the others ask about future expectations (Q56_1_c, Q56_2_c, Q56_3_c). Except gender, all this variables takes values between 1 and 5. I want to make a scatter plot with two axis. First one with "entrepreneurial variables" and second axis with "future expectations variables" and then define as points in the scatter plot the position of Male and Female. My data look like this:
x <- "Q1b Q51_1_c Q51_2_c Q51_4_c Q51_6_c Q51_7_c Q56_1_c Q56_2_c Q56_3_c
3 Male 5 4 4 4 4 5 4 4
4 Female 4 3 4 4 3 3 4 3
5 Female 1 1 1 1 1 3 1 1
7 Female 2 1 1 1 1 5 1 4
8 Female 4 4 5 4 4 5 4 4
9 Female 3 3 4 4 3 3 4 4
13 Male 4 4 4 4 5 3 3 3
15 Female 3 4 4 4 4 1 1 5
16 Female 4 1 4 4 4 3 3 3
19 Female 3 2 3 3 3 3 3 3
20 Male 1 1 1 1 1 3 1 5
21 Female 3 1 1 2 1 3 3 3
26 Female 5 5 1 2 1 4 4 3
27 Female 2 1 1 1 1 1 1 1
29 Male 2 2 2 2 1 4 4 4
31 Female 3 1 1 1 1 5 2 3
34 Female 4 1 1 4 3 3 1 4
36 Female 5 1 1 4 4 5 1 2
37 Male 5 1 2 4 4 5 4 5
38 Female 3 1 1 1 1 1 1 1"
To run PCA this is my code:
x <- na.omit(x) #Jus to simplyfy
resul <- prcomp(x[,-1], scale = TRUE)
x$PC1 <- resul$x[,1] #Saving Scores PC1
x$PC2 <- resul$x[,2] #Saving Scores PC2
The result axis are like this:
biplot(resul, scale = 0)
Finally, to make the scatter plot:
x %>%
group_by(Q1b) %>%
summarise(mean_PC1 = mean(PC1),
mean_PC2 = mean(PC2)) %>%
ggplot(aes(x=mean_PC1, y=mean_PC2, colour=Q1b)) +
geom_point() +
theme_bw()
Which gives me this:
I'm not sure how about read the results... Should I accept that Females in general get higher values in the dimension of future expectations than Males. And Males get higher values in the entrepreneurial dimension?
Thanks in advance!!
Your interpretation of the axes looks correct, i.e., PC1 is a gradient which from left to right represents decreasing "entrepreneurialness", while PC2 is a gradient which from bottom to top represents increasing future expectations (assuming that "5" in the original data means highest entrepreneurialness/expectations).
In terms of whether males and females are different, you probably need to plot more than the just the means for each group: even if males and females are truly identical in their entrepreneurialness/expectations, you'd never expect the means from two samples to sit right on top of each other on a scatter plot. To address this, you could plot the actual observations rather than their means (i.e., one point per row, coloured by gender) and see if they intermingle vs. separate in the plot space. Or, regress gender against the principal components.
Another issue is whether it's appropriate to use PCA on ordinal data - see here for discussion.

Using SAS to transfer data structure

I have a question on using SAS for data structure transfer. This is my old dataset
question answer
1 3
2 4
3 5
4 3
5 1
1 2
2 4
3 1
4 3
5 6
The ideal output dataset is
ques1 ques2 ques3 ques4 ques5
3 4 5 3 1
2 4 1 3 6
The solution is simple. Create a dummy column which stores the questions group and then transpose that data with by variable as that group causing 2 separate output rows. Check out the following code.
data have;
infile datalines missover;
input question answer ;
if question=1 then group+1;
datalines;
1 3
2 4
3 5
4 3
5 1
1 2
2 4
3 1
4 3
5 6
;;;;
run;
proc transpose data=have out=want prefix=ques;
by group;
var answer;
id question;
run;
proc print data=want;run;

Sort on specific columns, output only one of those identical but having the highest number in another column

I have records like these:
1 4 6 4 2 4 8
2 3 5 4 6 7 1
5 4 6 4 3 8 4
1 4 6 4 5 7 1
5 7 3 3 3 6 3
6 7 3 3 4 8 4
I want to sort them on columns 2,3,4, and 6 and keep just one of those identical in column 2,3,4 and having the biggest number in column 6 such as:
1 4 6 4 5 7 1
2 3 5 4 6 7 1
5 4 6 4 3 8 4
5 7 3 3 3 6 3
6 7 3 3 4 8 4
I have tried all kinds of combinations between sort and uniq but everything fails because uniq cannot be applied onto a specific column. The only thing I came up with is to change the order of the columns as to first sort as above then move records 2,3,and 4 to the end and then run uniq with -w as to focus only on the last 3 records. This seems quite inefficient to me.
Thanks for help!
You can achieve this with two passes of sort(assuming in the first place I understand your requirement correctly, seeing that the desired data snippet posted above does not match your description of it) . The first pass sorts by field 2 through 4 ascending and field 6 descending, the second pass sorts on fields 2 through 4 only but passing in the "stable sort" and unique flags in addition to pick out those rows for each combination of fields 2-4 that have the highest value from field 6
sort -k2,4n -k6,6nr file.txt | sort -k2,4n -s -u
2 3 5 4 6 7 1
5 4 6 4 3 8 4
6 7 3 3 4 8 4

How to Use Where with Many to Many Comparison

I have problem with LINQ Query in following scenario:
I have Activity and ActivityTeacher Two Table and List of Some Teachers.
Activity Table
ActivityID Date Class
1 4/4/2012 1
2 4/5/2013 2
3 4/6/2013 5
4 5/6/2013 2
5 5/16/2013 1
6 5/20/2013 8
7 5/21/2013 7
8 6/22/2013 6
9 8/10/2013 5
10 8/12/2013 4
ActivityTeacher Table
ActivityID TeacherID
1 2
1 3
1 4
2 6
3 6
3 6
3 4
2 5
4 2
4 3
4 6
5 8
5 7
5 6
6 6
6 7
6 9
6 10
6 1
6 2
7 2
7 8
7 9
7 10
8 3
8 4
8 6
8 7
9 10
9 3
9 2
10 1
10 2
List of Teachers={2,3,4}
Now I want to select records from Activity based on List of Teachers={2,3,4}
without using foreach loop.
The Activity entity should have a Teachers navigation property you can utilize:
context.Activities
.Where(x => listOfTeachers.Contains(x.Teachers.Select(t => t.TeacherId)));
If listOfTeachers contains the three IDs 2, 3, 4, this query should translate to SQL that is similar to the following:
select a.*
from Activity a
inner join ActivityTeacher at
on a.activityid = at.activityid
where at.teacherid in (2, 3, 4);

Resources