Remove duplicated rows from two datatables with different numbers of rows - uipath

I have two datatables: old_DataTable has 3 rows and Output has 4 rows. I have been trying the solutions suggested in forums:
Read the two Datatables
drCommonRows = (From d1 In old_DataTable.AsEnumerable
Join d2 In Output.AsEnumerable
On d1("StartDate").toString.Trim Equals d2("StartDate").toString.Trim
Select d1).ToList
drFilteredRows=old_DataTable.AsEnumerable.Except(drCommonRows, DataRowComparer.Default).toList
If drFilteredRows.Count > 0 Then dtFiltered = drFilteredRows.CopyToDataTable
Else dtFiltered = old_DataTable.Clone
When testing, drCommonRows is correct but drFilteredRows is not, and the If condition always takes the Else branch.
I even changed
drFilteredRows=**old_DataTable**.AsEnumerable.Except(drCommonRows, DataRowComparer.Default).toList
into
drFilteredRows=**Output**.AsEnumerable.Except(drCommonRows, DataRowComparer.Default).toList
but in that case the result is the entire Output datatable (all 4 rows), so that does not work either.
> The result that I am expecting is 1 row that is the one which is
> different and is in the Output Datatable.
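The expected result amounts to an anti-join keyed on StartDate: keep the rows of Output whose trimmed StartDate does not occur in old_DataTable. Note that drCommonRows holds rows taken from old_DataTable, so removing them from old_DataTable leaves nothing whenever every old row has a match, and DataRowComparer.Default compares all column values rather than only StartDate. The following is a minimal sketch of the intended logic in Python, not UiPath/VB.NET code; the sample rows, the Name column, and the helper `rows_only_in` are hypothetical.

```python
# Hypothetical sample data standing in for the two datatables.
old_rows = [
    {"StartDate": "2020-01-01", "Name": "a"},
    {"StartDate": "2020-01-02 ", "Name": "b"},  # trailing space on purpose
    {"StartDate": "2020-01-03", "Name": "c"},
]
output_rows = [
    {"StartDate": "2020-01-01", "Name": "a"},
    {"StartDate": "2020-01-02", "Name": "b"},
    {"StartDate": "2020-01-03", "Name": "c"},
    {"StartDate": "2020-01-04", "Name": "d"},  # the only new row
]


def rows_only_in(candidates, reference, key="StartDate"):
    """Return rows of `candidates` whose trimmed key is absent from `reference`."""
    seen = {str(row[key]).strip() for row in reference}
    return [row for row in candidates if str(row[key]).strip() not in seen]


new_rows = rows_only_in(output_rows, old_rows)
print(new_rows)
```

The comparison uses only the key column, so differences in other columns or in whitespace do not affect the result.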

Related

Business Objects 4.x: need to combine two queries similar to a UNION

I can't seem to figure out how to combine the result of 2 Business Objects queries.
Both queries return a set of codes and a number of hours. Query 1 can have codes that do not appear in Query 2, and Query 2 can have codes that do not appear in Query 1.
The resulting report should contain all codes from both Query 1 and Query 2, a column with the sum of hours from Query 1 for that code, and a column with the sum of hours from Query 2 for that code. If one of the queries doesn't contain a code, that column should show a blank or a 0 total.
Example:
Q1 results:
|Code|Value|
|:---|:----|
|A|15|
|A|17|
|B|12|
|D|22|
|D|35|
|E|16|
|E|9|
|E|11|
Q2 results:
|Code|Value|
|:---|:----|
|A|5|
|A|19|
|B|33|
|C|17|
|C|24|
|E|78|
|E|12|
Report:
|Code|Value1|Value2|
|----|------|------|
|A|32|24|
|B|12|33|
|C| |41|
|D|57| |
|E|36|90|
|Total|137|188|
When I create the Business Objects report table as normal, only the codes from Query 1 are used, and I miss the row for code C. If I flip the queries around, I miss the row for code D.
How do I set up my report to show all the code values?
Edit: Sorry for the formatting of the tables, in the preview it looks perfect. :(
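The report described above is a full outer join on Code of the per-query sums. The sketch below shows only that target logic in Python, using the sample data from the question; it does not show how to configure this in Business Objects, and the helper names `sum_by_code` and `combine` are hypothetical.

```python
from collections import defaultdict

# Sample rows from the question: (Code, Value).
q1 = [("A", 15), ("A", 17), ("B", 12), ("D", 22), ("D", 35),
      ("E", 16), ("E", 9), ("E", 11)]
q2 = [("A", 5), ("A", 19), ("B", 33), ("C", 17), ("C", 24),
      ("E", 78), ("E", 12)]


def sum_by_code(rows):
    """Sum the values per code."""
    totals = defaultdict(int)
    for code, value in rows:
        totals[code] += value
    return dict(totals)


def combine(q1_rows, q2_rows):
    """Full outer join of the two per-code totals; a missing side is None."""
    t1, t2 = sum_by_code(q1_rows), sum_by_code(q2_rows)
    codes = sorted(set(t1) | set(t2))
    return [(code, t1.get(code), t2.get(code)) for code in codes]


report = combine(q1, q2)
total1 = sum(v1 for _, v1, _ in report if v1 is not None)
total2 = sum(v2 for _, _, v2 in report if v2 is not None)

for row in report:
    print(row)
print(("Total", total1, total2))
```

Taking the union of the codes from both sides is what keeps rows C and D, which disappear when only one query drives the table.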

Strange behaviour when using FILTER to filter a different table with no direct relationship?

I have two fact tables, First and Second, and two dimension tables, dimTime and dimColour.
Fact table First looks like this:
and fact table Second looks like this:
Both dim-tables have 1:* relationships to both fact tables and the filtering is one-directional (from dim to fact), like this:
dimColour[Color] 1 -> * First[Colour]
dimColour[Color] 1 -> * Second[Colour]
dimTime[Time] 1 -> * First[Time]
dimTime[Time] 1 -> * Second[Time_]
Adding the following measure, I would expect the FILTER function not to have any effect on the calculation, since Second does not filter First, right?
Test_Alone =
CALCULATE (
SUM ( First[Amount] );
First[Alone] = "Y";
FILTER(
'Second';
'Second'[Colour]="Red"
)
)
So this should evaluate to 7, since only two rows in First have [Alone] = "Y", with values 1 and 6, and there is no direct relationship between First and Second. However, it evaluates to 6. If I remove the FILTER argument from the CALCULATE, it evaluates to 7.
There are three additional measures in the attached pbix file which show the same type of behaviour.
How can filtering one fact table affect a calculation done on another fact table when there is no direct relationship between the two?
Zipped Power BI file: PowerBIFileDownload
Evaluating the table reference 'Second' produces a table that includes the columns in both the Second table, as well as those in all the (transitive) parents of the Second table.
In this case, this is a table with all of the columns in dimColour, dimTime, Second.
You can't see this if you just run:
evaluate 'Second'
as when 'evaluate' returns the results to the user, these "Parent Table" (or "Related") columns are not included.
Even so, these columns are certainly present.
When a table is converted to a row context, these related columns become available via RELATED.
See the following queries:
evaluate FILTER('Second', ISBLANK(RELATED(dimColour[Color])))
evaluate 'Second' order by RELATED(dimTime[Hour])
Similarly, when arguments to CALCULATE are used to update the filter context, these hidden "Related" columns are not ignored; hence, they can end up filtering First in your example. You can see this by using a function that strips the related columns, such as INTERSECT:
Test_ActuallyAlone = CALCULATE (
SUM ( First[Amount] ),
First[Alone] = "Y",
//This filter now does nothing, as none of the columns in Second
//have an impact on 'SUM ( First[Amount] )'; and the related columns
//are removed by the INTERSECT.
FILTER(
INTERSECT('Second', 'Second'),
'Second'[Colour]="Red"
)
)
See these resources that describe the "Expanded Table" (an alternative but equivalent explanation of this behaviour):
https://www.sqlbi.com/articles/expanded-tables-in-dax/
https://www.sqlbi.com/articles/context-transition-and-expanded-tables/

How to filter clickhouse table by array column contents?

I have a clickhouse table that has one Array(UInt16) column. I want to be able to filter results from this table to only get rows where the values in the array column are above a threshold value. I've been trying to achieve this using some of the array functions (arrayFilter and arrayExists) but I'm not familiar enough with the SQL/Clickhouse query syntax to get this working.
I've created the table using:
CREATE TABLE IF NOT EXISTS ArrayTest (
date Date,
sessionSecond UInt16,
distance Array(UInt16)
) Engine = MergeTree(date, (date, sessionSecond), 8192);
Where the distance values will be distances from a certain point at a certain amount of seconds (sessionSecond) after the date. I've added some sample values so the table looks like the following:
Now I want to get all rows which contain distances greater than 7. I found the array operators documentation here and tried the arrayExists function but it's not working how I'd expect. From the documentation, it says that this function "Returns 1 if there is at least one element in 'arr' for which 'func' returns something other than 0. Otherwise, it returns 0". But when I run the query below I get three zeros returned where I should get a 0 and two ones:
SELECT arrayExists(
val -> val > 7,
arrayEnumerate(distance))
FROM ArrayTest;
Eventually I want to perform this select and then join it with the table contents to only return rows that have an exists = 1 but I need this first step to work before that. Am I using the arrayExists wrong? What I found more confusing is that when I change the comparison value to 2 I get all 1s back. Can this kind of filtering be achieved using the array functions?
Thanks
You can use arrayExists in the WHERE clause.
SELECT *
FROM ArrayTest
WHERE arrayExists(x -> x > 7, distance) = 1;
Another way is to use ARRAY JOIN, if you need to know which values are greater than 7:
SELECT d, distance, sessionSecond
FROM ArrayTest
ARRAY JOIN distance as d
WHERE d > 7
I think the reason you get 3 zeros is that arrayEnumerate returns the array indexes, not the array values. Since none of your rows has more than 7 elements, no index is greater than 7, and arrayExists returns 0 for all the rows.
To make your original query work, use each index to look up the value:
SELECT arrayExists(
val -> distance[val] > 7,
arrayEnumerate(distance))
FROM ArrayTest;
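The difference between testing indexes and testing values can be illustrated outside ClickHouse. This is a small Python sketch that mimics the two functions; `array_enumerate` and `array_exists` are stand-ins written for this example, and the three sample arrays are hypothetical because the original sample data is not shown.

```python
def array_enumerate(arr):
    """Mimic arrayEnumerate: the 1-based positions [1, 2, ..., len(arr)]."""
    return list(range(1, len(arr) + 1))


def array_exists(func, arr):
    """Mimic arrayExists: 1 if func is true for at least one element, else 0."""
    return int(any(func(x) for x in arr))


# Hypothetical 'distance' arrays, one per row.
rows = [[1, 2, 3], [4, 8, 2], [9, 1, 10]]

# Query from the question: the lambda receives indexes, never values.
on_indexes = [array_exists(lambda i: i > 7, array_enumerate(d)) for d in rows]

# With a threshold of 2, any array of 3 or more elements has an index above 2.
on_indexes_2 = [array_exists(lambda i: i > 2, array_enumerate(d)) for d in rows]

# Testing the values directly.
on_values = [array_exists(lambda x: x > 7, d) for d in rows]

# Using the index to look up the value (1-based, as in ClickHouse).
by_lookup = [array_exists(lambda i, d=d: d[i - 1] > 7, array_enumerate(d))
             for d in rows]

print(on_indexes, on_indexes_2, on_values, by_lookup)
```

This also accounts for the observation in the question that a comparison value of 2 returns all 1s.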

OBIEE TOPN On Result of Union Two SubjectArea

I used "Union" to combine two different subject areas in an analysis in OBIEE 11:
"A" is a column in the first subject area whose formula is calculated from "A_Dim" using CASE WHEN (so I have to use "A_Dim" in the first subject area and then exclude it from the result)
"A" equals zero in the second subject area
"B" is a column in the second subject area
"B" equals zero in the first subject area
"C" is a column in the result (added using Add Result Column) that has this formula:
SUM("A" BY sth ) / SUM("B" BY sth)
("A", "B", ... are replaced with saw_i in the result column formula, as you know)
The problem is that I cannot get the top 10 rows ordered by "C".
(I tried using RANK, TOPN, TOPN(RANK()), ... with no luck)
(One more thing: there are two problems with using a Narrative view instead of other views. First, they want a bar chart. Second, the Narrative view has no Exclude option, so I would have to use JavaScript to get the top 10 from thousands of repeated "C" values.)
The original question is old, but I figured I would add my solution in case anyone ends up here like I did.
I was able to create a "Result Column" with my TOPN formula. I did not have to create equivalent columns in the two queries that are being unioned (that is, there are 4 columns in each union query but 5 columns in the "Result Columns" overall). It seems to work as expected.

How can I group, sum and count with Sequel ORM and PostgreSQL?

This is too tough for me guys. It's for Jeremy!
I have two tables (although I can also envision needing to join a third table) and I want to sum one field and count rows in the same table, while joining with another table, and return the result in JSON format.
First of all, the data type of the field that needs to be summed is numeric(10,2), and the data is inserted as params['amount'].to_f.
The tables are expense_projects which has the name of the project and the company id and expense_items which has the company_id, item and amount (to mention just the critical columns) - the "company_id" columns are disambiguated.
So, the following code:
expense_items = DB[:expense_projects].left_join(:expense_items, :expense_project_id => :project_id).where(:project_company_id => company_id).to_a.to_json
works fine but when I add
expense_total = expense_items.sum(:amount).to_f.to_json
I get an error message which says
TypeError - no implicit conversion of Symbol into Integer:
so, the first question is why and how can this be fixed?
Then I want to join the two tables, get all the project names from the left (first) table, and sum the amounts and count the items in the second table. I have tried
DB[:expense_projects].left_join(:expense_items, :expense_items_company_id => expense_projects_company_id).count(:item).sum(:amount).to_json
and variations of this, all of which fail.
I would like a result which includes all the project names (even if there are no expense entries) and returns something like:
|project|item_count|item_amount|
|:---|:----|:----|
|pr 1|7|34.87|
|pr 2|0|0|
and so on. How can this be achieved with one query returning the result in JSON format?
Many thanks, guys.
Figured it out, I hope this helps somebody else:
DB[:expense_projects___p].where(:project_company_id=>user_company_id).
left_join(:expense_items___i, :expense_project_id=>:project_id).
select_group(:p__project_name).
select_more{count(:i__item_id)}.
select_more{sum(:i__amount)}.to_a.to_json
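For readers who want to see the shape of the underlying query, here is a sketch of the same grouping expressed as plain SQL and run from Python against an in-memory SQLite database rather than PostgreSQL and Sequel. The schema is a guess based on the column names mentioned above, the sample rows are hypothetical, and the COALESCE that turns an empty sum into 0 is an addition that the Sequel answer does not include.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each result row convert to a dict

# Assumed schema and hypothetical sample data.
conn.executescript("""
CREATE TABLE expense_projects (
    project_id INTEGER PRIMARY KEY,
    project_name TEXT,
    project_company_id INTEGER
);
CREATE TABLE expense_items (
    item_id INTEGER PRIMARY KEY,
    expense_project_id INTEGER,
    amount NUMERIC
);
INSERT INTO expense_projects VALUES (1, 'pr 1', 10), (2, 'pr 2', 10), (3, 'other', 99);
INSERT INTO expense_items VALUES (1, 1, 10.50), (2, 1, 4.25);
""")

# LEFT JOIN keeps projects without items; COUNT(i.item_id) ignores the
# NULLs produced for them, so their count is 0.
query = """
SELECT p.project_name AS project,
       COUNT(i.item_id) AS item_count,
       COALESCE(SUM(i.amount), 0) AS item_amount
FROM expense_projects AS p
LEFT JOIN expense_items AS i ON i.expense_project_id = p.project_id
WHERE p.project_company_id = ?
GROUP BY p.project_name
ORDER BY p.project_name
"""

rows = [dict(r) for r in conn.execute(query, (10,))]
result_json = json.dumps(rows)
print(result_json)
```

Counting a column from the joined table, rather than counting all rows, is what makes a project with no expense entries report 0 instead of 1.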
