Oracle query to list all parents before children - oracle

I have a table which has child#, parent# like the following :
child# | parent#
------------------
10 | NULL
20 | NULL
2 | 1
1 | 10
50 | 10
6 | 5
5 | 2
There is no ordering of numbers, i.e. 1 can be parent of 10 and 10 can be parent of 20.
I want an ORACLE SQL query which lists all parents first, followed by their children.
I want a temporary table like the following:
child# | parent#
----------------
10 | NULL
20 | NULL
1 | 10
2 | 1
50 | 10
5 | 2
I want to traverse this temporary table and process each row, so I need to make sure that every parent is listed before its child rows.

select level, t2.child, t2.parent
from your_table t2
start with t2.parent is null
connect by prior t2.child = t2.parent
order by level
OUTPUT:
LEVEL CHILD PARENT
1 10 (null)
1 20 (null)
2 1 10
2 50 10
3 2 1
4 5 2
5 6 5
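To make the traversal order concrete, here is a minimal Python sketch of the same idea: start from the rows whose parent is NULL and walk down level by level. The function name `parents_first` and the tuple layout are assumptions for illustration; this is not how Oracle evaluates CONNECT BY internally, only a way to see why the ordering by level puts parents first.

```python
from collections import defaultdict, deque

def parents_first(rows):
    """Return (level, child, parent) tuples with every parent before its children."""
    # Index the rows by parent so that children can be looked up quickly.
    children = defaultdict(list)
    for child, parent in rows:
        children[parent].append(child)

    ordered = []
    # Roots are the rows whose parent is None (NULL in the table).
    queue = deque((1, child, None) for child in sorted(children[None]))
    while queue:
        level, child, parent = queue.popleft()
        ordered.append((level, child, parent))
        # Enqueue this row's own children one level deeper.
        for grandchild in sorted(children.get(child, [])):
            queue.append((level + 1, grandchild, child))
    return ordered

rows = [(10, None), (20, None), (2, 1), (1, 10), (50, 10), (6, 5), (5, 2)]
result = parents_first(rows)
for row in result:
    print(row)
```

Because the queue is processed breadth-first, the rows come out grouped by level, which is the same order as the query output above.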

order by not returning running count

In this question - why adding order by in the query changes the aggregate value? - I was told that "If the window clause contains both PARTITION BY and ORDER BY, it returns the running count within the partition. So, using the ORDER BY expression, how many rows have been counted so far within the partition."
Referring to this example - https://www.vertica.com/docs/11.0.x/HTML/Content/Authoring/AnalyzingData/SQLAnalytics/ReportingAggregates.htm?tocpath=Analyzing%20Data%7CSQL%20Analytics%7CWindow%20Framing%7C_____3
Why does the cumulative count show 4 (the last value of the count) for all rows with sal=109?
=> SELECT deptno, sal, empno, COUNT(sal) OVER (
-> PARTITION BY deptno ORDER BY sal
-> ) AS COUNT
-> FROM emp;
deptno | sal | empno | count
--------+-----+-------+-------
10 | 101 | 1 | 1
10 | 104 | 4 | 2
------------------------------
20 | 100 | 11 | 1
20 | 109 | 7 | 4<-
20 | 109 | 6 | 4<-
20 | 109 | 8 | 4<-
20 | 110 | 10 | 6<-
20 | 110 | 9 | 6<-
------------------------------
30 | 102 | 2 | 1
30 | 103 | 3 | 2
30 | 105 | 5 | 3
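You order by sal, and three rows within deptno 20 share sal = 109. As far as the ordering criterion is concerned, those three rows are tied, so they enter the window at the same time. One row (sal = 100) comes before them, so once all three are added the count jumps straight from 1 to 4, and all three rows get 4. The two rows with sal = 110 get 6 for the same reason.
You need distinct ordering values to get distinct running count results, for example by adding a tie-breaker such as empno to the ORDER BY.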
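The difference can be sketched in a few lines of Python. The first function treats tied ordering values as peers that are counted together, which is what a frame of RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW does in standard SQL; the second counts row by row, as a ROWS frame would. Both function names are made up for this illustration, and the input is just the sal values of deptno 20 from the example.

```python
from itertools import groupby

def running_count_range(sals):
    """Running count where rows with equal ordering values share one result."""
    ordered = sorted(sals)
    counts = []
    seen = 0
    for _, peers in groupby(ordered):
        peers = list(peers)
        seen += len(peers)              # all tied rows are added at once
        counts.extend([seen] * len(peers))
    return list(zip(ordered, counts))

def running_count_rows(sals):
    """Running count that advances by one for every physical row."""
    ordered = sorted(sals)
    return [(sal, i) for i, sal in enumerate(ordered, start=1)]

dept20 = [100, 109, 109, 109, 110, 110]
print(running_count_range(dept20))
print(running_count_rows(dept20))
```

The first call reproduces the 1, 4, 4, 4, 6, 6 pattern from the question; the second gives 1 through 6, but which of the tied rows gets which number is arbitrary unless the ordering has a tie-breaker.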

Populating Tables in Access

I'm new to Access and trying to create a simple Access tool where one Table gets data from another Table.
For example I have Table 1 with a Qty for each unique ID. Table 2A now needs to retrieve data from Table 1 based on the row's ID to finally look like Table 2B.
Table 1: | Table 2A: | Table 2B:
-----------------------------------------------------------
ID Name Qty | ID Name Qty | ID Name Qty
1 One 19 | 1 One | 1 One 19
2 Two 21 | 3 Three | 3 Three 10
3 Three 10 | 1 One | 1 One 19
4 Four 26 | 4 Four | 4 Four 26
5 Five 20 | 4 Four | 4 Four 26
I took a look at the "Lookup Wizard", but all the forums I've been to advise against using it. Can anyone please advise on how to do this in Access in the simplest way?

Row count by group

I am attempting to write the following query to get a row count by group.
select
a.employee, a.cov_option,
count(a.cov_option) over (partition by a.cov_option order by a.employee) as row_num
from wilson.benefit a
inner join wilson.bncategory b
ON a.plan_type = b.plan_type and a.plan_option = b.plan_option
inner join wilson.bncovopt c
ON a.company = c.company and a.plan_code = c.plan_code and a.cov_option = c.coverage_opt
where
a.plan_type = 'HL' and
to_char(a.stop_date, 'yyyy-mm-dd') = '1700-01-01'
order by a.employee, a.cov_option
The result set returned is:
employee | cov_option |row_num
-------------|--------------|--------------
429 | 1 | 1
429 | 3 | 2
429 | 3 | 2
1420 | 1 | 2
1420 | 3 | 4
1420 | 3 | 4
1537 | 2 | 2
1537 | 2 | 2
The result set I am attempting to return is:
429 | 1 | 1
429 | 3 | 2
429 | 3 | 2
1420 | 1 | 1
1420 | 3 | 2
1420 | 3 | 2
1537 | 2 | 1
1537 | 2 | 1
What you seem to want is dense_rank() rather than count(). Indeed, "count" simply means determining how many rows are in each group; it is not "counting" in the way we learn as children (first, second, third). That kind of counting is called "ranking".
dense_rank() over (partition by a.employee order by a.cov_option) as row_num
should do what you need.
There is also rank() - the difference is that if two rows are tied for first, with dense_rank() the third row gets rank 2; with simple rank() it gets rank 3 (rank 2 is "used up" by the first two rows).
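As a rough illustration of the difference between the two ranking functions, here is a small Python sketch for a single partition. The functions `dense_rank` and `rank` below are plain Python stand-ins written for this example, not database calls.

```python
def dense_rank(values):
    """Rank without gaps: each distinct value gets the next integer."""
    distinct = sorted(set(values))
    position = {v: i for i, v in enumerate(distinct, start=1)}
    return [position[v] for v in values]

def rank(values):
    """Rank with gaps: 1 + the number of rows that sort strictly before."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

# cov_option values for employee 429 from the question
print(dense_rank([1, 3, 3]))   # [1, 2, 2]
# two rows tied for first place
print(dense_rank([1, 1, 3]))   # [1, 1, 2]
print(rank([1, 1, 3]))         # [1, 1, 3]
```

Partitioning by employee corresponds to calling the function once per employee's list of cov_option values.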

Stata: need help creating a binary variable from panel data

I have a dataset in which a household id (hhid) and a member id (mid) identify a unique person. I have results from two separate surveys taken a year apart (surveyYear). I also have data on whether or not the individual was enrolled in school at the time.
I want a binary variable which signifies if the individual in question dropped out of school between the surveys (i.e. 1 if dropped and 0 if still in school)
I have a decent understanding of Stata but this coding challenge seems a little beyond me because I am not sure how to compare the in-school status of the later id with the earlier id and then propagate that result into a binary column.
Here is an example of what I need
Previously:
+----------------------------------+
| hhid mid survey~r inschool |
|----------------------------------|
1. | 1 2 3 1 |
2. | 1 2 4 1 |
3. | 1 3 3 1 |
4. | 1 3 4 1 |
5. | 2 1 3 1 |
6. | 2 1 4 0 |
7. | 2 2 3 0 |
8. | 2 2 4 0 |
+----------------------------------+
After:
+--------------------------------------------+
| hhid mid survey~r inschool dropped |
|--------------------------------------------|
1. | 1 2 3 1 0 |
2. | 1 2 4 1 0 |
3. | 1 3 3 1 0 |
4. | 1 3 4 1 0 |
5. | 2 1 3 1 1 |
6. | 2 1 4 0 1 |
7. | 2 2 3 0 0 |
8. | 2 2 4 0 0 |
+--------------------------------------------+
bysort hhid mid (surveyYear) : gen dropped = inschool[1] == 1 & inschool[2] == 0
The commentary is longer than the code:
Within blocks of observations with the same hhid and mid, sort by surveyYear (variable names are case-sensitive in Stata, so spell it as it is in your dataset).
You want students who were inschool in year 3 but not in year 4. So, inschool is 1 in the first observation and 0 in the second.
Here subscripting [1] and [2] refers to order within blocks of observations defined by the by: statement.
If further detail is needed see e.g. this article. Note that contrary to one tag, no loop is needed (or, if you wish, that the loop over possibilities is built in to the by: framework).
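For readers who do not use Stata, the same by-group logic can be sketched in Python. This is only an illustration of what the one-liner does; the function name `flag_dropped` and the tuple layout are assumptions, and the data are the eight observations from the question.

```python
from collections import defaultdict

def flag_dropped(records):
    """records: list of (hhid, mid, survey_year, inschool) tuples.

    Returns the same records with a fifth field, dropped, appended.
    """
    # Collect each person's observations, keyed by (hhid, mid).
    by_person = defaultdict(list)
    for hhid, mid, year, inschool in records:
        by_person[(hhid, mid)].append((year, inschool))

    dropped = {}
    for person, obs in by_person.items():
        obs.sort()  # sort by survey year within the person
        # Mirrors inschool[1] == 1 & inschool[2] == 0
        dropped[person] = int(len(obs) >= 2 and obs[0][1] == 1 and obs[1][1] == 0)

    # Every observation of a person receives that person's flag.
    return [(h, m, y, s, dropped[(h, m)]) for h, m, y, s in records]

records = [
    (1, 2, 3, 1), (1, 2, 4, 1),
    (1, 3, 3, 1), (1, 3, 4, 1),
    (2, 1, 3, 1), (2, 1, 4, 0),
    (2, 2, 3, 0), (2, 2, 4, 0),
]
flagged = flag_dropped(records)
for row in flagged:
    print(row)
```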

putting data into categories (buckets) based on a value

How can I assign a category based on a value?
For example, I have a table with values from 1-200. How do I assign a category to each record, like 1-5, 6-10, 11-15, etc.?
I can do it using the query below, but that seems like a bad solution.
Sorry, this is probably very basic, but I don't know what it's called, and googling "buckets" (as it's called in our company) didn't bring up any results.
Thank you.
SELECT DISTINCT CountOfSA,
CASE
WHEN CountOfSA BETWEEN 1 AND 5 THEN
'1-5'
WHEN CountOfSA BETWEEN 6 AND 10 THEN
'6-10'
WHEN CountOfSA BETWEEN 11 AND 15 THEN
'11-15'
WHEN CountOfSA BETWEEN 16 AND 20 THEN
'16-20'
WHEN CountOfSA BETWEEN 21 AND 25 THEN
'21-25'
WHEN CountOfSA BETWEEN 26 AND 30 THEN
'26-30'
END
AS diff
FROM NR_CF_212
Have a look at WIDTH_BUCKET function.
This divides the range into equal sized intervals and assigns a bucket number to each interval.
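with x as (
select CountOfSA,
-- the upper bound is exclusive: use 201 so that 200 lands in bucket 40, not in the overflow bucket 41
width_bucket(CountOfSA, 1, 201, 40) bucket_
from NR_CF_212
)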
select CountOfSA,
cast(1 + (bucket_ - 1)*5 as varchar2(4)) ||
'-' ||
cast( bucket_*5 as varchar2(4)) diff
from x
order by CountOfSA;
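To see how the bucket numbers come about, here is a small Python emulation of the WIDTH_BUCKET arithmetic, written as a sketch rather than a faithful copy of Oracle's implementation. It assumes integer inputs, a lower bound below the upper bound, and the usual convention that the lower bound is inclusive and the upper bound exclusive, with bucket 0 for underflow and bucket n+1 for overflow. The helper `label` is hypothetical and mirrors the string concatenation in the query.

```python
def width_bucket(value, low, high, num_buckets):
    """Emulate WIDTH_BUCKET for integer inputs with low < high."""
    if value < low:
        return 0                      # underflow bucket
    if value >= high:
        return num_buckets + 1        # overflow bucket
    return (value - low) * num_buckets // (high - low) + 1

def label(bucket, width=5):
    """Build the '1-5', '6-10', ... text for a bucket number."""
    return f"{1 + (bucket - 1) * width}-{bucket * width}"

for n in (1, 5, 6, 10, 11, 200):
    b = width_bucket(n, 1, 201, 40)
    print(n, b, label(b))
```

With an upper bound of 201, each of the 40 buckets is exactly 5 wide and 200 falls into bucket 40. With an upper bound of 200, the value 200 itself equals the exclusive upper bound and ends up in overflow bucket 41.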
I would put the range values and descriptions into a separate table, especially if you plan to use these for future queries, views, etc. Plus, it's easier to change the ranges or descriptions as needed. For example:
create table sales_ranges
(
low_val number not null,
high_val number not null,
range_desc varchar2(100) not null
)
cache;
insert into sales_ranges values (0,1000,'$0-$1k');
insert into sales_ranges values (1001,10000,'$1k-$10k');
insert into sales_ranges values (10001,100000,'$10k-$100k');
insert into sales_ranges values (100001,1000000,'$100k-$1mm');
insert into sales_ranges values (1000001,10000000,'$1mm-$10mm');
insert into sales_ranges values (10000001,100000000,'$10mm-$100mm');
commit;
create table sales
(
id number,
total_sales number
);
insert into sales(id, total_sales)
-- some random values for testing
select level, trunc(dbms_random.value(1,10000000))
from dual
connect by level <= 100;
commit;
select id, total_sales, range_desc
from sales s
left outer join sales_ranges sr
on (s.total_sales between sr.low_val and sr.high_val)
order by s.id
;
Output (just first 3 rows):
ID TOTAL_SALES RANGE_DESC
1 5122380 $1mm-$10mm
2 347726 $100k-$1mm
3 6564700 $1mm-$10mm
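The lookup-table approach can also be sketched outside the database. The following Python snippet is an illustration only: it copies the six ranges from the inserts above into a list and uses `bisect` to find the matching row, and the function name `range_desc` is made up for the example. Returning `None` plays the role of the NULL that the left outer join produces when no range matches.

```python
import bisect

# (low_val, high_val, range_desc), sorted by low_val, as in sales_ranges
RANGES = [
    (0, 1000, "$0-$1k"),
    (1001, 10000, "$1k-$10k"),
    (10001, 100000, "$10k-$100k"),
    (100001, 1000000, "$100k-$1mm"),
    (1000001, 10000000, "$1mm-$10mm"),
    (10000001, 100000000, "$10mm-$100mm"),
]
LOWS = [low for low, _, _ in RANGES]

def range_desc(total_sales):
    """Return the description of the range containing total_sales, or None."""
    i = bisect.bisect_right(LOWS, total_sales) - 1
    if i < 0:
        return None
    low, high, desc = RANGES[i]
    # BETWEEN is inclusive on both ends
    return desc if low <= total_sales <= high else None

print(range_desc(5122380))
print(range_desc(347726))
print(range_desc(1000.5))
```

The last call shows a consequence of the inclusive integer bounds: a non-integer value such as 1000.5 falls between two ranges and matches none of them.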
I guess you could concatenate some calculated values to create the bucket name dynamically. Note that this version relies on integer division (the output below is from PostgreSQL):
select countofsa
, ((countofsa - 1)/5) * 5 + 1
, ((countofsa - 1)/5 + 1) * 5
, ((countofsa - 1)/5) * 5 + 1 || '-' || ((countofsa - 1)/5 + 1) * 5 AS diff
from nr_cf_212
Some output:
countofsa | ?column? | ?column? | diff
-----------+----------+----------+-------
1 | 1 | 5 | 1-5
2 | 1 | 5 | 1-5
3 | 1 | 5 | 1-5
4 | 1 | 5 | 1-5
5 | 1 | 5 | 1-5
6 | 6 | 10 | 6-10
7 | 6 | 10 | 6-10
8 | 6 | 10 | 6-10
9 | 6 | 10 | 6-10
10 | 6 | 10 | 6-10
11 | 11 | 15 | 11-15
(11 rows)
UPDATE from comments, Oracle example, dynamically computing range:
create table nr_cf_212(countofsa number);
insert into nr_cf_212 values(1);
insert into nr_cf_212 values(2);
insert into nr_cf_212 values(3);
insert into nr_cf_212 values(4);
insert into nr_cf_212 values(5);
insert into nr_cf_212 values(6);
insert into nr_cf_212 values(7);
insert into nr_cf_212 values(9);
insert into nr_cf_212 values(10);
insert into nr_cf_212 values(11);
select countofsa
, TRUNC((countofsa - 1)/5) * 5 + 1
, (TRUNC((countofsa - 1)/5) + 1) * 5
, TRUNC((countofsa - 1)/5) * 5 + 1 || '-' || (TRUNC((countofsa - 1)/5) + 1) * 5 AS diff
from nr_cf_212;
| COUNTOFSA | TRUNC((COUNTOFSA-1)/5)*5+1 | (TRUNC((COUNTOFSA-1)/5)+1)*5 | DIFF |
|-----------|----------------------------|------------------------------|-------|
| 1 | 1 | 5 | 1-5 |
| 2 | 1 | 5 | 1-5 |
| 3 | 1 | 5 | 1-5 |
| 4 | 1 | 5 | 1-5 |
| 5 | 1 | 5 | 1-5 |
| 6 | 6 | 10 | 6-10 |
| 7 | 6 | 10 | 6-10 |
| 9 | 6 | 10 | 6-10 |
| 10 | 6 | 10 | 6-10 |
| 11 | 11 | 15 | 11-15 |
I tried it with sqlfiddle (http://sqlfiddle.com/#!4/b922e/4).
I broke it into parts to show the "from" column, the "to" column and then the range. If you divide your number by 5 and look at the quotient and the remainder, you will see a pattern:
1/5 = 0 remainder 1
2/5 = 0 remainder 2
3/5 = 0 remainder 3
4/5 = 0 remainder 4
5/5 = 1 remainder 0
6/5 = 1 remainder 1
7/5 = 1 remainder 2
8/5 = 1 remainder 3
9/5 = 1 remainder 4
10/5 = 2 remainder 0
11/5 = 2 remainder 1
The range for the number is almost "from" 5 times the quotient "to" 5 times the quotient plus 5, except that everything is offset by 1. So take your number, subtract 1, then do the division: the range runs from 5 times that quotient plus 1 to 5 times (that quotient plus 1).
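The arithmetic can be checked quickly in Python, where `//` is integer division and plays the role of TRUNC(.../5). The function name `bucket_label` and the `width` parameter are assumptions for this sketch.

```python
def bucket_label(n, width=5):
    """Return the 'from-to' label of the bucket that contains n (n >= 1)."""
    q = (n - 1) // width           # subtract 1 first, then take the quotient
    start = q * width + 1          # "from": 5 times the quotient, plus 1
    end = (q + 1) * width          # "to": 5 times (the quotient plus 1)
    return f"{start}-{end}"

for n in (1, 5, 6, 10, 11, 200):
    print(n, bucket_label(n))
```

Changing `width` gives buckets of another size without touching the rest of the logic.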
