I have duplicate rows in a dataset. How can I select the distinct rows from it?
From comments: My query is something like this:
select name, age
from student
When I load the output into a dataset, it contains rows with duplicate names. I have to select the distinct names from the dataset itself, because I need the same query, with its duplicate values, somewhere else.
select DISTINCT name, age from student
If you need both the distinct data as well as the full data (with duplicate values), then you'll either have to maintain two datasets or continue doing things as you are now.
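As a rough illustration of the two options, here is a minimal sketch in Python using the standard-library sqlite3 module. The in-memory database, the sample rows and the variable names are assumptions made for the example; the question does not say which dataset type or database is in use. It runs the original query once, then obtains the distinct rows either with a second DISTINCT query or by de-duplicating the rows already fetched.

```python
import sqlite3

# Hypothetical in-memory stand-in for the student table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO student (name, age) VALUES (?, ?)",
    [("Ann", 20), ("Bob", 21), ("Ann", 20), ("Cid", 22), ("Bob", 21)],
)

# Full result, duplicates included (the query from the question).
all_rows = conn.execute("SELECT name, age FROM student").fetchall()

# Option 1: a second query that lets the database remove the duplicates.
distinct_rows = conn.execute(
    "SELECT DISTINCT name, age FROM student"
).fetchall()

# Option 2: derive the distinct rows in memory from the full result.
# dict.fromkeys keeps the first occurrence of each row, in order.
distinct_in_memory = list(dict.fromkeys(all_rows))

print(len(all_rows), len(distinct_rows), len(distinct_in_memory))
```

Either way you end up holding two collections: the full rows for the place that needs duplicates, and the distinct rows for the place that does not.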
Related
I need to use columns from a couple of tables in the select statement. The question is: Create a View to display the employee id, first name and surname. In your query include the coin price and a 10% commission for the sales made by the employees.
The difficult part for me is that employees with the same employee number can make multiple coin sales, so in the view I need to add together all the coin sales for each respective primary key (employee_id).
As you can see in this image, emp101 has sold two different coins, with the coin_ids "7116" and "7112". In the view I want to tally the value of the coins that each employee_id has sold, if that makes sense.
There are multiple other tables, but too many to send, so I am just trying to get a general idea of how to do this. I understand the logic of the question; I just don't know how to implement the answer with the correct syntax and methods.
Since this appears to be a homework question, here is a discussion of how to solve it:
You want to CREATE a VIEW and give the view a name (something like employee_commissions) and include 5 columns (employee_id, first_name, surname, coin_price and commission - or maybe only 4 columns if they want the combined price plus commission).
To get the values for that view, you want to SELECT ... FROM existing tables; however, your image does not include first_name, surname or a value for a coin so I am assuming that you will have an employees table and a coins table that you will need to INNER JOIN to the invoice table on their respective primary keys.
To get the total value for the coins, you want to aggregate the values and this would be done with the SUM aggregation function and, so that you get the value for each employee, you would need to GROUP BY the employee_id primary key. You will either need to include the other columns you are not aggregating by in the GROUP BY clause or apply an aggregation function to those columns such as MAX(surname).
The syntax for CREATE VIEW is here.
The syntax for SELECT is here.
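To make the shape of such a view concrete, here is a small sketch run through Python's standard-library sqlite3 module. Everything about the schema is an assumption: the employees, coins and invoice tables, their columns, and the sample rows are invented for illustration, since the real tables are not shown. The view joins the three tables, sums the coin prices per employee and derives a 10% commission from that sum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical tables; the real schema is not shown in the question.
CREATE TABLE employees (employee_id TEXT PRIMARY KEY, first_name TEXT, surname TEXT);
CREATE TABLE coins (coin_id INTEGER PRIMARY KEY, coin_price REAL);
CREATE TABLE invoice (invoice_id INTEGER PRIMARY KEY, employee_id TEXT, coin_id INTEGER);

INSERT INTO employees VALUES ('emp101', 'Ada', 'Lee'), ('emp102', 'Ben', 'Ray');
INSERT INTO coins VALUES (7116, 100.0), (7112, 50.0), (7120, 200.0);
INSERT INTO invoice VALUES (1, 'emp101', 7116), (2, 'emp101', 7112), (3, 'emp102', 7120);

-- One row per employee: non-aggregated columns go in GROUP BY,
-- the coin prices are combined with SUM.
CREATE VIEW employee_commissions AS
SELECT e.employee_id,
       e.first_name,
       e.surname,
       SUM(c.coin_price)        AS total_coin_price,
       SUM(c.coin_price) * 0.10 AS commission
FROM invoice i
INNER JOIN employees e ON e.employee_id = i.employee_id
INNER JOIN coins c ON c.coin_id = i.coin_id
GROUP BY e.employee_id, e.first_name, e.surname;
""")

rows = conn.execute(
    "SELECT * FROM employee_commissions ORDER BY employee_id"
).fetchall()
for row in rows:
    print(row)
```

The CREATE VIEW statement itself is plain SQL, but details such as column types and quoting differ between database systems, so treat it as a starting point rather than a finished answer.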
I am trying to create a search against a field on a table, which can contain a combination of all sorts of string values such as single words or complete sentences.
What I am trying to do is create a query that searches this field and returns any rows that contain any of the words or combinations of words from a list passed in as my predicate parameter.
Is there a way of passing in this list as a wild card search?
As I understood, you have a list of keywords, and you need to filter all those rows for which a certain column contains one or more keywords.
Two possible ways that I can think of -
using REGEXP_LIKE
select * from tbl where regexp_like(col, '(ABC|EFG|IJK)');
Maintaining the list of keywords in a separate table and join the tables -
CREATE TABLE matching_patterns(
pattern VARCHAR(20)
);
INSERT INTO matching_patterns VALUES ('%ABC%'), ('%EFG%'), ('%IJK%');
SELECT t.* FROM your_table t JOIN matching_patterns p ON (t.your_column LIKE p.pattern);
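The pattern-table approach can be tried end to end with the sketch below, which uses Python's standard-library sqlite3 module. The sample rows and the keywords list are made up for the example, and SQLite stands in for whatever database is actually in use. Note that SQLite's LIKE is case-insensitive for ASCII letters by default, which may differ from your database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE your_table (id INTEGER PRIMARY KEY, your_column TEXT);
CREATE TABLE matching_patterns (pattern VARCHAR(20));

INSERT INTO your_table (id, your_column) VALUES
    (1, 'starts with ABC here'),
    (2, 'nothing relevant'),
    (3, 'has EFG and IJK');
""")

# Hypothetical keyword list; each keyword is wrapped in % wildcards
# so that LIKE matches it anywhere in the column.
keywords = ["ABC", "EFG", "IJK"]
conn.executemany(
    "INSERT INTO matching_patterns (pattern) VALUES (?)",
    [(f"%{kw}%",) for kw in keywords],
)

# DISTINCT stops a row from being returned once per matching pattern
# (row 3 matches both '%EFG%' and '%IJK%').
matched = conn.execute("""
    SELECT DISTINCT t.id, t.your_column
    FROM your_table t
    JOIN matching_patterns p ON t.your_column LIKE p.pattern
    ORDER BY t.id
""").fetchall()
print(matched)
```

Without DISTINCT, the join returns one row per (row, pattern) match, so a row containing several keywords would appear several times.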
I am looking for some help with the issue below:
I am getting column names from an XML column in one query, and I want to pass those column names into another query.
Below is the sample structure of my query:
with temp1 as (
select x
from table1)
select (select distinct x from temp1) from table2
When I write it as above, it displays the column name as text rather than the data from the other table. Can someone please help me solve this?
Thank you.
There are 62000 records in my fact table, which is not correct because I only have six records in my time dim, 240 records in my student dim and 140 records in my placement dim. Does it have something to do with my WHERE clause? Any help would be much appreciated.
INSERT INTO fact_placements (
report_id,
no_of_placements,
no_of_students,
fk1_time_id,
fk2_placement_id,
fk3_student_id )
SELECT
fact_seq.nextval,
no_of_placements,
no_of_students,
time_id,
placement_id,
student_id
FROM
time_dim,
placement_dim,
student_dim
WHERE
placement_dim.year = time_dim.year AND
student_dim.year = time_dim.year;
Unless you do a cartesian join, i.e. one without any WHERE clause, you will get fewer than 140 (placement) * 240 (student) * 6 (time) = 201600 fact records. Your current SQL joins the 3 tables on the year column, which filters the records down to the 62000 you are getting.
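The effect is easy to reproduce at a small scale. The sketch below, using Python's standard-library sqlite3 module, builds three tiny dimension tables with made-up rows (only the table names and the year column come from the question) and compares the row count of the cartesian product with the count after joining on year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim (time_id INTEGER, year INTEGER);
CREATE TABLE placement_dim (placement_id INTEGER, year INTEGER);
CREATE TABLE student_dim (student_id INTEGER, year INTEGER);

-- Made-up data: 3 time rows, 3 placement rows, 4 student rows.
INSERT INTO time_dim VALUES (1, 2020), (2, 2020), (3, 2021);
INSERT INTO placement_dim VALUES (1, 2020), (2, 2020), (3, 2021);
INSERT INTO student_dim VALUES (1, 2020), (2, 2020), (3, 2020), (4, 2021);
""")

# No join condition at all: every combination of rows (3 * 3 * 4).
cartesian = conn.execute(
    "SELECT COUNT(*) FROM time_dim, placement_dim, student_dim"
).fetchone()[0]

# Joining on year keeps only combinations that share a year:
# 2020 contributes 2 * 2 * 3 rows, 2021 contributes 1 * 1 * 1.
joined = conn.execute("""
    SELECT COUNT(*)
    FROM time_dim
    INNER JOIN placement_dim ON placement_dim.year = time_dim.year
    INNER JOIN student_dim ON student_dim.year = time_dim.year
""").fetchone()[0]

print(cartesian, joined)
```

The join on year removes cross-year combinations, but within each year it still pairs every time row with every placement and every student, which is why the count stays far above the size of any single dimension.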
Your question title says that even this is "too many". If that is the case, then you need to understand the granularity of your dimensions and of the fact before joining them on any criteria. Are these all at the "year" level? If so, do you have 1 record per year in each of these tables, with no duplicates based on year?
If not, you might need to rethink the fact table's granularity, or alternatively join unique records based on year in each dimension to get the smaller number of records you are expecting, which can also be done by summarizing these tables by year.
Ideally the fact table contains the combinations of the dimension keys plus additional columns, i.e. the factual metrics (in this case no_of_placements and no_of_students). But depending on the available data, not all combinations will be present in the fact table.
Also, you might want to change the SQL syntax to use the INNER JOIN clause instead of the implied joins using commas between table names in the FROM clause, as shown below:
FROM time_dim
INNER JOIN placement_dim
ON placement_dim.year = time_dim.year
INNER JOIN student_dim
ON placement_dim.year = student_dim.year
There's no relationship between placement and student; that's why you have so many records.
Your query is saying: Give me all the students and all the placements where year is the same.
I'm not sure that's what you want. What is really strange here is that you are loading a fact table from dimension tables.
Cognos by default suppresses duplicate/identical records. Duplicate rows do not appear in the report, but summaries are performed on all rows, including the duplicates that were eliminated.
To perform summaries on only the distinct rows, you must add the distinct keyword when creating the summary definition. For example, the following summary:
Total(MyColumn)
Would become...
Total (distinct MyColumn)
But I would like the Total of Column1 based on the distinct values of Column2. How can I do this?
I assume your report is built on top of relational model.
The short answer to your question is to use the FOR clause:
Using the AT and FOR Options with Relational Summary Functions
So you can do something like this:
Total(distinct MyColumn for Column2)
My question is: why would you think distinct on one column is different from distinct on another column?
Cognos "eliminates" duplicate rows only if two or more rows are completely identical.
If one of the columns is different, then it's a distinct row and is not eliminated.
You can use grouping instead, which groups together identical values in a single column.
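The difference between whole-row de-duplication and grouping on one column can be shown outside Cognos with a few lines of plain Python. This is only an analogy with made-up data, not Cognos syntax: each tuple is a hypothetical (Column2, Column1) row.

```python
from collections import defaultdict

# Hypothetical report rows as (Column2, Column1) pairs.
rows = [("A", 10), ("A", 10), ("A", 20), ("B", 5)]

# Whole-row distinct: only rows that are identical in every column collapse.
distinct_rows = list(dict.fromkeys(rows))

# Grouping on Column2: rows sharing a Column2 value are combined,
# here by totalling Column1 within each group.
totals = defaultdict(int)
for column2, column1 in rows:
    totals[column2] += column1

print(distinct_rows)
print(dict(totals))
```

The two ("A", 10) rows collapse into one under whole-row distinct, while ("A", 20) survives because it differs in one column; grouping instead produces a single entry per Column2 value.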