My table looks like this in my Oracle DB:
ID | NI | NT | MB | ETC
-------------------------------------------
1 |1234567 | | | comments //valid
2 |9654875 | | | jhdsd //valid
3 |43gf543 | | | dd //in-valid
4 |123 | | | dfds //in-valid
5 |12654332 | | | dffd //in-valid
6 | |542 | | comments //valid
7 | |362 | | jhdsd //valid
8 | |9631 | | dd //in-valid
9 | |r45 | | dfds //in-valid
10 | |56 | | dffd // in-valid
11 | | |03121234567 | comments //valid
12 | | |03874514414 | jhdsd //valid
13 | | |05764544444 | dd //in-valid as not starts with 03
14 | | |30010101019 | dfds //in-valid
15 | | |038f5678543 | dffd //in-valid
I would like to select only the valid records with a SELECT query, where:
NI must be exactly 7 characters long and all digits
NT must be exactly 3 characters long and all digits
MB must be exactly 11 characters long, start with 03, and be all digits
The result should look like this:
1 |1234567 | | | comments
2 |9654875 | | | jhdsd
3 | |542 | | comments
4 | |362 | | jhdsd
5 | | |03121234567 | comments
6 | | |03874514414 | jhdsd
Try this:
NI must be exactly 7 digits:
REGEXP_LIKE(NI, '^\d{7}$')
NT must be exactly 3 digits:
REGEXP_LIKE(NT, '^\d{3}$')
MB must be exactly 11 digits and start with 03:
REGEXP_LIKE(MB, '^03\d{9}$')
Each row fills only one of the three columns, so combine the three conditions with OR in the WHERE clause.
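To see how the three patterns treat the sample rows, here is a minimal Python sketch. It only illustrates the regular expressions and is not Oracle code. The names `RULES` and `is_valid` are made up for this example, and `[0-9]` stands in for `\d` so that only ASCII digits match.

```python
import re

# One pattern per column; fullmatch() anchors the pattern at both ends.
RULES = {
    "NI": re.compile(r"[0-9]{7}"),    # exactly 7 digits
    "NT": re.compile(r"[0-9]{3}"),    # exactly 3 digits
    "MB": re.compile(r"03[0-9]{9}"),  # 11 digits, starting with 03
}

def is_valid(row):
    """Return True if any populated column fully matches its pattern."""
    return any(
        row.get(col) and pattern.fullmatch(row[col])
        for col, pattern in RULES.items()
    )

# A few rows copied from the question (empty columns omitted).
rows = [
    {"ID": 1, "NI": "1234567"},
    {"ID": 3, "NI": "43gf543"},
    {"ID": 5, "NI": "12654332"},
    {"ID": 6, "NT": "542"},
    {"ID": 8, "NT": "9631"},
    {"ID": 11, "MB": "03121234567"},
    {"ID": 13, "MB": "05764544444"},
]

valid_ids = [row["ID"] for row in rows if is_valid(row)]
print(valid_ids)
```

The `any(...)` call plays the role of the OR between the three conditions.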
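You could also use SUBSTR and LENGTH. Because each row fills only one of the three columns, the checks for each column are grouped and combined with OR:
select ID, NI, NT, MB, ETC
from my_table
where (length(NI) = 7
and REGEXP_LIKE(NI, '^[[:digit:]]+$'))
or (length(NT) = 3
and REGEXP_LIKE(NT, '^[[:digit:]]+$'))
or (length(MB) = 11
and substr(MB,1,2) = '03'
and REGEXP_LIKE(MB, '^[[:digit:]]+$'))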
Newbie question: I have a table with ID, ParentID, and Type. I want to create two new columns (StrategyID, SubstrategyID) that contain the row's own ID if its Type is 'Strategy' or 'Substrategy', respectively. Otherwise, I want to look at the parent row and return its ID if its Type matches. If not, repeat with the parent of the parent, and so on. I am struggling with the syntax for functions in general, and recursive functions in particular, in Power Query.
I've looked at many examples and videos and found some help, but nothing specific to what I am trying to do.
------------------------------------------------------------
| Existing columns New Columns |
------------------------------------------------------------
| ID | ParentID | Type | StrategyID | SubstrategyID |
| 1 | 0 | Strategy | 1 | |
| 2 | 1 | Substrategy | 1 | 2 |
| 3 | 2 | Feature | 1 | 2 |
| 4 | 3 | Story | 1 | 2 |
| 5 | 3 | Story | 1 | 2 |
| 6 | 1 | Substrategy | 1 | 6 |
| 7 | 6 | Feature | 1 | 6 |
| 8 | 7 | Story | 1 | 6 |
| 9 | 7 | Story | 1 | 6 |
| 10 | 0 | Strategy | 10 | |
| 11 | 10 | Substrategy | 10 | 11 |
| 12 | 11 | Feature | 10 | 11 |
| 13 | 12 | Story | 10 | 11 |
| 14 | 12 | Story | 10 | 11 |
| 15 | 12 | Story | 10 | 11 |
| 16 | 10 | Substrategy | 10 | 16 |
| 17 | 16 | Feature | 10 | 16 |
| 18 | 17 | Story | 10 | 16 |
| 19 | 17 | Story | 10 | 16 |
------------------------------------------------------------
Give this a try. It assumes the source data is in Table1 with three columns: "ID", "ParentID" and "Type".
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
ChangedType = Table.TransformColumnTypes(Source,{{"ID", type text}, {"ParentID", type text}}),
ID_List = List.Buffer( ChangedType[ID] ),
ParentID_List = List.Buffer( ChangedType[ParentID] ),
Type_List = List.Buffer( ChangedType[Type] ),
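// Walk up the parent chain from row n until a row of type searchfor is found.
// If the chain reaches a ParentID that is not in ID_List (e.g. "0"), the lookup raises an error.
Highest = (n as text, searchfor as text) as text =>
let
Spot = List.PositionOf( ID_List, n ),
ThisType = Type_List{Spot},
Parent_ID = ParentID_List{Spot}
in if Parent_ID = null or ThisType=searchfor then ID_List{Spot} else @Highest(Parent_ID,searchfor),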
FinalTable = Table.AddColumn( ChangedType, "StrategyID", each Highest( [ID],"Strategy" ), type text),
FinalTable2 = Table.AddColumn( FinalTable, "SubstrategyID", each Highest( [ID],"Substrategy" ), type text),
#"Replaced Errors" = Table.ReplaceErrorValues(FinalTable2, {{"SubstrategyID", null}})
in #"Replaced Errors"
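The lookup itself is a walk up the parent chain. As a rough illustration of that logic outside Power Query, here is a small Python sketch. It uses a loop instead of recursion; `HIERARCHY` and `nearest_ancestor` are names invented for this example, and the data is a subset of the table in the question.

```python
# Subset of the question's table: ID -> (ParentID, Type); 0 means "no parent".
HIERARCHY = {
    1: (0, "Strategy"),
    2: (1, "Substrategy"),
    3: (2, "Feature"),
    4: (3, "Story"),
    6: (1, "Substrategy"),
    7: (6, "Feature"),
    8: (7, "Story"),
}

def nearest_ancestor(row_id, wanted_type):
    """Walk up the parent chain; return the first ID whose Type matches, else None."""
    while row_id in HIERARCHY:
        parent_id, row_type = HIERARCHY[row_id]
        if row_type == wanted_type:
            return row_id
        row_id = parent_id
    return None  # ran off the top of the tree without finding a match

for row_id in HIERARCHY:
    print(row_id,
          nearest_ancestor(row_id, "Strategy"),
          nearest_ancestor(row_id, "Substrategy"))
```

A Strategy row has no Substrategy above it, so the function returns None there, which corresponds to the blank SubstrategyID cells in the expected output.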
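I think you want to use PATH and PATHITEM. These are DAX functions, so this applies to calculated columns in the data model rather than to Power Query.
Assuming your table is called 'Table', create a new calculated column: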
Path = PATH(Table[ID],Table[ParentID])
Then:
StrategyID = PATHITEM(Table[Path],1,1)
SubstrategyID = PATHITEM(Table[Path],2,1)
Very similar to my last question: now I want only the "full combination" for each group, in order of priority. So, from this source table:
+-------+-------+----------+
| GROUP | State | Priority |
+-------+-------+----------+
| 1 | MI | 1 |
| 1 | IA | 2 |
| 1 | CA | 3 |
| 1 | ND | 4 |
| 1 | AZ | 5 |
| 2 | IA | 2 |
| 2 | NJ | 1 |
| 2 | NH | 3 |
And so on...
I need a query that returns:
+-------+---------------------+
| GROUP | COMBINATION |
+-------+---------------------+
| 1 | MI, IA, CA, ND, AZ |
| 2 | NJ, IA, NH |
+-------+---------------------+
Thanks for the help, again!
Use listagg(), ordering by priority within each group.
SELECT "GROUP",
listagg("STATE", ', ') WITHIN GROUP (ORDER BY "PRIORITY")
FROM "ELBAT"
GROUP BY "GROUP";
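For readers who want to check the expected output by hand, the same group-then-join-in-order idea can be sketched in Python. This is only an illustration of what the aggregation does; `STATE_ROWS` and `combine_by_priority` are hypothetical names, and the rows are the ones from the question.

```python
from collections import defaultdict

# (group, state, priority) rows copied from the question.
STATE_ROWS = [
    (1, "MI", 1), (1, "IA", 2), (1, "CA", 3), (1, "ND", 4), (1, "AZ", 5),
    (2, "IA", 2), (2, "NJ", 1), (2, "NH", 3),
]

def combine_by_priority(rows):
    """Join each group's states with ', ', ordered by priority."""
    grouped = defaultdict(list)
    for group, state, priority in rows:
        grouped[group].append((priority, state))
    return {
        group: ", ".join(state for _, state in sorted(pairs))
        for group, pairs in sorted(grouped.items())
    }

print(combine_by_priority(STATE_ROWS))
```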
I want to pivot the following table
| ID | Code | date | qty |
| 1 | A | 1/1/19 | 11 |
| 1 | A | 2/1/19 | 12 |
| 2 | B | 1/1/19 | 13 |
| 2 | B | 2/1/19 | 14 |
| 3 | C | 1/1/19 | 15 |
| 3 | C | 3/1/19 | 16 |
into
| ID | Code | mth_1(1/1/19) | mth_2(2/1/19) | mth_3(3/1/19) |
| 1 | A | 11 | 12 | 0 |
| 2 | B | 13 | 14 | 0 |
| 3 | C | 15 | 0 | 16 |
I am new to Hive and am not sure how to implement this.
NOTE: I don't want to hard-code the mapping, because my month values change over time.
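To make the target transformation concrete, here is a small Python sketch of a pivot whose columns are derived from the data instead of being hard-coded. It is not a Hive solution, only an illustration of the intended result. `pivot_by_month` is a made-up name, and the date strings are assumed to be month/day/two-digit-year.

```python
from datetime import datetime

# (ID, Code, date, qty) rows copied from the question.
records = [
    (1, "A", "1/1/19", 11), (1, "A", "2/1/19", 12),
    (2, "B", "1/1/19", 13), (2, "B", "2/1/19", 14),
    (3, "C", "1/1/19", 15), (3, "C", "3/1/19", 16),
]

def pivot_by_month(rows):
    """Return (header, body) with one column per distinct date found in rows."""
    # Assumption: dates are formatted as month/day/two-digit-year.
    dates = sorted({date for _, _, date, _ in rows},
                   key=lambda d: datetime.strptime(d, "%m/%d/%y"))
    table = {}
    for row_id, code, date, qty in rows:
        # Every (ID, Code) pair starts with 0 for each date, then quantities are added.
        cells = table.setdefault((row_id, code), dict.fromkeys(dates, 0))
        cells[date] += qty
    header = ["ID", "Code"] + [f"mth_{i}({d})" for i, d in enumerate(dates, start=1)]
    body = [[row_id, code] + [cells[d] for d in dates]
            for (row_id, code), cells in sorted(table.items())]
    return header, body

header, body = pivot_by_month(records)
print(header)
for line in body:
    print(line)
```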
I run this query in the Sphinx SE console:
SELECT #distinct FROM all_ips GROUP BY ip1;
I get this result:
+------+--------+
| id | weight |
+------+--------+
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 9 | 1 |
| 15 | 1 |
| 16 | 1 |
| 17 | 1 |
| 20 | 1 |
| 21 | 1 |
| 25 | 1 |
| 26 | 1 |
| 27 | 1 |
| 31 | 1 |
| 32 | 1 |
| 38 | 1 |
| 39 | 1 |
| 40 | 1 |
| 46 | 1 |
| 50 | 1 |
| 51 | 1 |
+------+--------+
20 rows in set (0.57 sec)
How can I get the number of unique values? Why doesn't the #distinct column show up in the results?
1) I don't think that is SphinxSE. Do you really mean SphinxQL? That looks more like SphinxQL.
2) Distinct of what column? You need to tell Sphinx which attribute you want to count the distinct values of. In SphinxQL, use COUNT(DISTINCT column_name).
You need a simple SQL statement to get the count, something like this:
SELECT count(ip1),ip1
FROM all_ips
GROUP BY ip1;
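Note that a per-group count and a count of distinct values answer different questions. A short Python sketch of the difference, using made-up sample addresses (`ips` is hypothetical data, not taken from the question):

```python
from collections import Counter

# Hypothetical sample values for the ip1 attribute.
ips = ["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.3", "10.0.0.2", "10.0.0.1"]

# Rows per value: what a count with GROUP BY ip1 reports, one line per group.
per_value = Counter(ips)

# Number of unique values: what COUNT(DISTINCT ip1) reports, a single number.
unique_count = len(per_value)

print(dict(per_value))
print(unique_count)
```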
Assume a range of values has been inserted into a table, and at the end of the month I want to apply the following algorithm to these records (e.g. 2500 rows of numeric values): sort the values in ascending order (from the smallest to the highest value) and then find the value at the 80% position of the sorted column.
In my example, if each row increases by one starting from 1, the 80% value will be the value in row 2000 (= 2500 - 2500*20/100). This algorithm needs to be implemented in a procedure where the number of rows is not constant; for example, it can vary from 2,500 to 1,000,000 per month.
Hint: You can achieve this using Oracle's cumulative aggregate functions. For example, suppose your table looks like this:
MY_TABLE
+-----+----------+
| ID | QUANTITY |
+-----+----------+
| A | 1 |
| B | 2 |
| C | 3 |
| D | 4 |
| E | 5 |
| F | 6 |
| G | 7 |
| H | 8 |
| I | 9 |
| J | 10 |
+-----+----------+
At each row, you can sum the quantities so far using this:
SELECT
id,
quantity,
SUM(quantity)
OVER (ORDER BY quantity ROWS UNBOUNDED PRECEDING)
AS cumulative_quantity_so_far
FROM
MY_TABLE
Giving you:
+-----+----------+----------------------------+
| ID | QUANTITY | CUMULATIVE_QUANTITY_SO_FAR |
+-----+----------+----------------------------+
| A | 1 | 1 |
| B | 2 | 3 |
| C | 3 | 6 |
| D | 4 | 10 |
| E | 5 | 15 |
| F | 6 | 21 |
| G | 7 | 28 |
| H | 8 | 36 |
| I | 9 | 45 |
| J | 10 | 55 |
+-----+----------+----------------------------+
Hopefully this will help in your work.
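The running total in the table above can be reproduced with a few lines of Python, which may help when checking the SQL output by hand. This is just an illustrative sketch; `quantities` mirrors the sample table and the variable names are invented.

```python
from itertools import accumulate

# Sample table from above: ID -> QUANTITY.
quantities = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5,
              "F": 6, "G": 7, "H": 8, "I": 9, "J": 10}

# Order by quantity, then keep a running sum, like SUM(...) OVER (ORDER BY quantity).
ordered = sorted(quantities.items(), key=lambda item: item[1])
running = list(accumulate(qty for _, qty in ordered))

for (row_id, qty), total in zip(ordered, running):
    print(row_id, qty, total)
```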
Write a query using the percentile_disc function to solve your problem. It sounds like it does what you want.
An example would be:
select percentile_disc(0.8) within group (order by the_value)
from my_table
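As I understand it, a discrete percentile picks the first value in sort order whose cumulative share of the rows reaches the requested percentile. The Python sketch below mimics that rule so the 2500-row example from the question can be checked. The function is written for this illustration and is not Oracle's implementation; it takes the percentile as an integer percentage to avoid floating-point rounding.

```python
def percentile_disc(values, percent):
    """Smallest value whose cumulative share of rows is at least percent (integer 0-100)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    # ceil(n * percent / 100) in integer arithmetic, clamped to at least 1.
    rank = max(1, (len(ordered) * percent + 99) // 100)
    return ordered[rank - 1]

# The question's example: 2500 rows holding 1, 2, ..., 2500.
print(percentile_disc(range(1, 2501), 80))
```

Because the rank is recomputed from the row count on every call, the same function works whether the month has 2,500 rows or 1,000,000.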