how to use lookups - oracle

Lookup table:-
FROM_METER  TO_METER  COST
1           5         0.004
6           10        0.006
11          20        0.012
Needed output:- (for example, for meters 15 and 7 the cost should come from the lookup master)
METER  COST
15     0.012
7      0.006
etc....

A lookup table is normally used to provide code or reference values for regular data. Assuming your business table is the one you called MASTER, you would join its records to your lookup like this:
select m.meter
     , l.cost
  from master m
  join lookup l
    on ( m.meter between l.from_meter and l.to_meter )
Your data model appears to have a problem: it doesn't guarantee that every METER in the MASTER table will find a matching record in the lookup table. This is why a more common approach is to use types or categories (say 'cheap', 'reasonable', 'expensive').
Obviously, without knowing your business rules it is hard for me to say whether your model is correct, but you should probably consider what to do when a row in MASTER has no matching row in the lookup table. Perhaps the relation is enforced in the application, but in my experience that approach is hard to make bulletproof.
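If unmatched meters must not silently disappear, one option is an outer join that keeps them visible with a NULL cost (a minimal sketch, reusing the table names above):
select m.meter
     , l.cost        -- NULL when no band in LOOKUP covers the meter
  from master m
  left join lookup l
    on ( m.meter between l.from_meter and l.to_meter )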

select ? Meter, cost from Lookup
where ? >= FROM_METER
and ? <= TO_METER
replace the ? with your parameter.
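For example, with the parameter bound to 15 (one of the sample meters above), the query returns COST = 0.012 from the 11-20 band:
select 15 Meter, cost from Lookup
where 15 >= FROM_METER
and 15 <= TO_METER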

Related

PowerBI filter table based on value of measure_A OR measure_B [duplicate]

We are trying to implement a dashboard that displays various tables, metrics and a map where the dataset is a list of customers. The primary filter condition is the disjunction of two numeric fields. We want the user to be able to select a threshold for [field 1] and a separate threshold for [field 2] and then impose the condition [field 1] >= <threshold 1> OR [field 2] >= <threshold 2>.
After that, we want to also allow various other interactive slicers so the user can restrict the data further, e.g. by country or account manager.
Power BI naturally imposes AND between all filters and doesn't have a neat way to specify OR. Can you suggest a way to define a calculation using the two numeric fields that is then applied as a filter within the same interactive dashboard screen? Alternatively, is there a way to first prompt the user for the two threshold values before the dashboard is displayed -- so when they click Submit on that parameter-setting screen they are then taken to the main dashboard screen with the disjunction already applied?
Added in response to a comment:
The data can be quite simple: no complexity there. The complexity is in getting the user interface to enable a disjunction.
Suppose the data was a list of customers with customer id, country, gender, total value of transactions in the last 12 months, and number of purchases in last 12 months. I want the end-user (with no technical skills) to specify a minimum threshold for total value (e.g. $1,000) and number of purchases (e.g. 10) and then restrict the data set to those where total value of transactions in the last 12 months > $1,000 OR number of purchases in last 12 months > 10.
After doing that, I want to allow the user to see the data set on a dashboard (e.g. with a table and a graph) and from there select other filters (e.g. gender=male, country=Australia).
The key here is to create separate parameter tables and combine conditions using a measure.
Suppose we have the following Sales table:
Customer  Value  Number
------------------------
A           568       2
B          2451      12
C          1352       9
D           876       6
E           993      11
F          2208      20
G          1612       4
Then we'll create two new tables to use as parameters. You could do a calculated table like
Number = VALUES(Sales[Number])
Or something more complex like
Value = GENERATESERIES(0, ROUNDUP(MAX(Sales[Value]),-2), ROUNDUP(MAX(Sales[Value]),-2)/10)
Or define the table manually using Enter Data or some other way.
In any case, once you have these tables, name their columns whatever you want (I used MinCount and MinValue) and write your filtering measure:
Filter = IF(MAX(Sales[Number]) > MIN(Number[MinCount]) ||
            MAX(Sales[Value]) > MIN('Value'[MinValue]),
            1, 0)
Then put your Filter measure as a visual-level filter where Filter is not 0, and use the MinCount and MinValue columns as slicers.
If you select 10 for MinCount and 1000 for MinValue then your table should look like this:
Notice that E and G only exceed one of the thresholds and that A and D are excluded.
To my knowledge, there is no such built-in slicer feature in Power BI at the time of writing. There is, however, a suggestion in the Power BI forum that requests functionality like this. If you're willing to use the Power Query Editor, it's easy to obtain the values you're looking for, but only with hard-coded values for your limits or thresholds.
Let me show you how for a synthetic dataset that should fit the structure of your description:
Dataset:
CustomerID,Country,Gender,TransactionValue12,NPurchases12
51,USA,M,3516,1
58,USA,M,3308,12
57,USA,M,7360,19
54,USA,M,2052,6
51,USA,M,4889,5
57,USA,M,4746,6
50,USA,M,3803,3
58,USA,M,4113,24
57,USA,M,7421,17
58,USA,M,1774,24
50,USA,F,8984,5
52,USA,F,1436,22
52,USA,F,2137,9
58,USA,F,9933,25
50,Canada,F,7050,16
56,Canada,F,7202,5
54,Canada,F,2096,19
59,Canada,F,4639,9
58,Canada,F,5724,25
56,Canada,F,4885,5
57,Canada,F,6212,4
54,Canada,F,5016,16
55,Canada,F,7340,21
60,Canada,F,7883,6
55,Canada,M,5884,12
60,UK,M,2328,12
52,UK,M,7826,1
58,UK,M,2542,11
56,UK,M,9304,3
54,UK,M,3685,16
58,UK,M,6440,16
50,UK,M,2469,13
57,UK,M,7827,6
Desktop table:
Here you see an Input table and a subset table using two Slicers. If the forum suggestion gets implemented, it should hopefully be easy to change a subset like below to an "OR" scenario:
Transaction Value > 1000 OR Number of purchases > 10 using Power Query:
If you use Edit Queries > Advanced filter you can set it up like this:
The last step under Applied Steps will then contain this formula:
= Table.SelectRows(#"Changed Type2", each [NPurchases12] > 10 or [TransactionValue12] > 1000)
Now your original Input table will look like this:
Now, if only we were able to replace the hardcoded 10 and 1000 with a dynamic value, for example from a slicer, we would be fine! But no...
I know this is not what you were looking for, but it was the best 'negative answer' I could find. I guess I'm hoping for a better solution just as much as you are!

MAX() SQL Equivalent on Redis

I'm new to Redis, and now I have a problem improving my stats application. The current SQL to generate the statistic is:
SELECT MIN(created_at), MAX(created_at)
FROM (SELECT created_at FROM table ORDER BY id DESC LIMIT 10000) t
It returns the MIN and MAX values of the created_at field for the last 10000 records.
I have read about RANGE and SCORING on Redis; it seems they can be used to solve this problem. But I am still confused about SCORING for the last 10000 records. Can they be used to solve this problem, or is there another way to solve it using Redis?
Regards
Your target appears to be somewhat unclear - are you looking to store all the records in Redis? If so, what other columns does the table have and what other queries do you run against it?
I'll take your question at face value, but note that in most NoSQL databases (Redis included) you need to store your data according to how you plan on fetching it. Assuming that you want to get the min/max creation dates of the last 10K records, I suggest that you keep them in a Sorted Set. The Sorted Set's members will be the unique id and their scores will be the creation date (use the epoch value), for example, rows with ids 1, 2 & 3 were created at dates 10, 100 & 1000 respectively:
ZADD table 10 1 100 2 1000 3 ...
Getting the minimal creation date is easy now - just do ZRANGE table 0 0 WITHSCORES - and the max is just a ZRANGE table -1 -1 WITHSCORES away. The only "tricky" part is making sure that the Sorted Set is kept updated, so for every new record you'll need to remove the lowest id from the set and add the new one. In Python (using redis-py) this would look something like the following:
import redis

r = redis.Redis()

def updateMinMaxSortedSet(id, date):
    # keep the Sorted Set at the 10,000 most recent records
    if r.zcard('table') >= 10000:
        r.zremrangebyrank('table', 0, 0)  # drop the member with the lowest score (the oldest)
    r.zadd('table', {id: date})           # member = record id, score = creation date (epoch)

JPA best way to avoid n+1 when I need to make a calculation for each row

My application is used to find places in a city. Each place needs a score to be calculated, and this score cannot be predicted in advance (stored somewhere) as it is different for each user and changes over time. Here is what I'm doing at the moment, and it is TERRIBLY inefficient (15 times slower than if I mock the database call inside the loop):
1. SQL (native) query to fetch all the places that match the search (I select all the columns I need specifically)
2. I loop through the list and for each poi I make a DB call to get the info needed to calculate the scores (I need different values residing in different tables)
3. Make the calculation
4. Sort by score desc
5. Cut the list depending on the pagination setting (yes, I cannot put LIMIT directly in the query as I don't know the score yet....)
6. Return the list.
Well, this takes 15 seconds in total.
If I remove 2. and simply mock the DB call it only takes 600ms..
My tables look like this:
place_tag_count table:
place_id  tag_id  tag_count
1         100     15
1         200     25
1         300     35

user_tag_score table:
user_id  tag_id  score
1000      100    0.5
1000      200    0.3
As a simplified example, the place score is the sum of the user's tag scores multiplied by the tag counts found in place_tag_count:
score = 0.5 * 15 + 0.3 * 25 + ... (I won't complicate things, but if a tag score is missing I do another calculation that needs other DB calls....)
The query at 1. returns a distinct place, so because the calculation needs all the tag counts of the place and the user's tag scores, I need to make that extra DB call for each poi.
My question is, what would be the BEST way to avoid having n+1 calls in my situation? I have thought of some alternatives but I would prefer the opinion of a more experienced person before going in head first.
Instead of returning a distinct place in the query in 1., I return the same place grouped by place_id, tag_id for example, and in my Java code I just loop and when I see that the place_id changes it means I'm processing another place.
Make the query in 1. a bit more complicated and aggregate all the numbers I need in a comma-separated list (but that requires some kind of sub-select which might affect the speed of the query).
Other solution?
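For illustration, one such other solution is to push the whole calculation into the database: a single aggregate query returns the score per place, so the per-poi calls disappear. This is only a sketch against the simplified tables above (it ignores the missing-tag-score case) and uses a hypothetical :userId bind variable:
select p.place_id
     , sum(u.score * p.tag_count) as place_score  -- sum of tag score * tag count per place
  from place_tag_count p
  join user_tag_score u
    on u.tag_id = p.tag_id
   and u.user_id = :userId
 group by p.place_id
 order by place_score desc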

oracle PL/SQL with automatic values

Hi folks, I have a question regarding calculating the value of a column in Oracle.
So I have this table
NAME   PROCESS1  PROCESS2  WEIGHT  TOTAL_WEIGHT
ITEM1  0         0         10
ITEM2  1         1         10
ITEM3  1         1         15
So what I am trying to do here is generate the value in TOTAL_WEIGHT based on PROCESS1 and PROCESS2 in PL/SQL, because later on I need to show the sum of the total weight in a PHP page. So for ITEM2 the total weight should be 20 and for ITEM3 it should be 30. Should I use a procedure to generate the value in total weight? I want the value to be updated when the user changes the value in PROCESS1 or PROCESS2. Please help me, I am kind of a newbie here.
select name, process1, process2, weight, (process1+process2)*weight total_weight
from table
I see no reason that PL/SQL has to play a role in this, unless I misunderstood the requirement.
Declaring VIRTUAL (Computed) COLUMNS in Oracle Table Design
I'll agree with most of what has been said so far, with some additional elaboration. My starting table design looks similar, but it too is inaccurate for some use cases, as explained below.
CREATE TABLE "PROCESSED_PRODUCT_WEIGHT" (
"PRODUCT_NAME" VARCHAR2(40) NOT NULL,
"PROCESS1" NUMBER,
"PROCESS2" NUMBER,
"WEIGHT" NUMBER,
"TOTAL_WEIGHT" NUMBER GENERATED ALWAYS AS
((PROCESS1 + PROCESS2)*WEIGHT) VIRTUAL,
"RECORDED_DATE" DATE,
CONSTRAINT "PROCESSED_PRODUCT_WEIGHT_PK"
PRIMARY KEY ("PRODUCT_NAME", "RECORDED_DATE")
)
/
Previous Suggestions and Assumptions
Table Bound Attribute Properties: The table construct used by @Bob Jarvis is also known as a VIRTUAL COLUMN. It works well because the definition of TOTAL_WEIGHT is entirely dependent on other values contained within the same table.
SQL Query Associated Calculation: On the other hand, @Nishanthi Grashia and @OldProgrammer both recommend computing the value within each SQL query executed against the database.
BOTH Cases may work assuming that the mass per unit of the product does not change during the lifetime of the production cycle.
An example where this assumption is not flexible is if the products consist of units of varying mass per unit volume.
Since it was not mentioned in the OP, consider this possibility:
Products ITEM1, ITEM2 and ITEM3 have variable weights per unit.
They are all produced in a coffee packaging plant.
Each item can be a type of coffee bean and its source.
"Processes" could be bean "treatments" such as decaffeination, roasting type or flavor infusion.
The "units" could be packaging of varying sizes. This would mean that package volumes would have a direct effect on the mass (called "weight") per product unit counted.
Test Cases for Identifying the Effect of Changing Unit Sizes
Each test case shows how a virtual column does not satisfy the possibility of variations in the unit sizes and masses of each product over time.
Test Case One:
For production observations made 2/14/2015
Test Case Two:
The mass per unit processed on 3/14/2014 is increased only, skewing the total mass produced since the item quantities made previously are multiplied by a larger value through the virtual column definition.
Test Case Three:
Data Output and Results
Above are the test results associated with all three test cases. The resulting values are not correct for the use cases created. They demonstrate that, for a changing weight value, the virtual/calculated column formula and approach gives incorrect results.
A Discussion of Alternate Solutions
The trigger approach may work for maintaining calculated values for TOTAL_WEIGHT: incremental changes (updates) are applied to the current, existing value as each component varies.
Force all DML through a single DML operation contained in a CRUD package. The problem with defining an embedded SQL statement to enforce requirements is that other processes and their developers will need to be familiar with what your isolated PHP form/page does within your app in order to duplicate it for their own operation.
If there is a concern about overhead or possible locking of the main table, then consider introducing a composite key: PRODUCT_NAME + WEIGHT. This covers for the problem so that quantities of the same product name are multiplied by their correct weight and values already calculated remain unchanged even if the weight multiplier is modified.
SOMETIMES, ALWAYS, NEVER... are popular assumptions thrown around in developers' project circles. How likely is this to happen at all? It depends... if you're a coffee bean packaging outfit, I'd say it's quite possible.
Onward!
From what I understand, you want the field TOTAL_WEIGHT to be updated whenever the value in PROCESS1 or PROCESS2 changes. So, ideally, you would use TRIGGERS for this.
TRIGGERS are used to trigger an action based on an initial event.
So, for your case, the initial event is "value change of PROCESS1 or PROCESS2" and the expected action is "automatic update of the TOTAL_WEIGHT field based on the changed values".
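Such a trigger could look roughly like this (only a sketch, assuming TOTAL_WEIGHT is a regular stored column on the MY_TABLE layout used below; the trigger name is made up):
CREATE OR REPLACE TRIGGER trg_my_table_total_weight
BEFORE INSERT OR UPDATE OF process1, process2, weight ON my_table
FOR EACH ROW
BEGIN
  -- recompute the stored total whenever any component changes
  :NEW.total_weight := NVL((:NEW.process1 + :NEW.process2) * :NEW.weight, 0);
END;
/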
But for your requirement a trigger is a totally unnecessary overhead. So, instead of having an additional field in the table, use a select query as below, which calculates the value at run time and gives you the real-time value:
SELECT NAME,
PROCESS1,
PROCESS2,
WEIGHT ,
(WEIGHT * (PROCESS1 + PROCESS2)) AS TOTAL_WEIGHT
FROM MY_TABLE
The output would be:
NAME | PROCESS1 | PROCESS2 | WEIGHT | TOTAL_WEIGHT
------------------------------------------------------------
ITEM1 | 0 | 0 | 10 | 0
ITEM2 | 1 | 1 | 10 | 20
ITEM3 | 1 | 1 | 15 | 30
You can use this TOTAL_WEIGHT using something like resultSet.getLong("TOTAL_WEIGHT");
Or, if you are very particular in having the field, then you can modify your update query to include
UPDATE MY_TABLE SET FIELD1=VALUE1, FIELD2=VALUE2, ... ,
TOTAL_WEIGHT = (WEIGHT * (PROCESS1 + PROCESS2))
WHERE SOME_CONDITION;
If you're using 11g or later, the safest way to handle this would be to make TOTAL_WEIGHT a computed column. The CREATE TABLE statement would then become something like
CREATE TABLE MY_TABLE
(PROCESS1 NUMBER,
PROCESS2 NUMBER,
WEIGHT NUMBER,
TOTAL_WEIGHT NUMBER GENERATED ALWAYS AS (NVL((PROCESS1+PROCESS2)*WEIGHT, 0)));
Done this way applications don't need to know how to compute TOTAL_WEIGHT - it's always done correctly.
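As a quick sanity check (a sketch, reusing the sample values from the question, where 1, 1 and 10 should give a total of 20):
INSERT INTO MY_TABLE (PROCESS1, PROCESS2, WEIGHT) VALUES (1, 1, 10);
SELECT PROCESS1, PROCESS2, WEIGHT, TOTAL_WEIGHT FROM MY_TABLE;
-- TOTAL_WEIGHT comes back as 20 without the application computing it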
SQLFiddle here.
Share and enjoy.

How to convert multiple rows into a single row in Informatica for a large volume of data, need best solution

I have data in table A as below
Assetid  attribute  value
1546     Ins_date   05062011
1546     status     active
1546     X          10.4567
1546     Y          27.56
1546     size       17
675      X          4.778
675      Y          53.676
675      depth      5
675      st_date    06092010
I have data as above in table A. This table has many Assetids (1546, 675, etc.), and the attributes might vary from asset to asset.
I want output as below:
assetid  ins_date  status  X        Y       Size  depth  st_date
1546     05062011  active  10.4567  27.56   17    null   null
675      null      null    4.778    53.676  null  5      06092010
I have created a stored procedure and then called it in Informatica to achieve this output. However, since I have a large volume of data, it takes a long time to load.
Please suggest an easier and better way to load it.
Use a router to split the rows into separate groups depending on attribute and then use a set of joiners to merge the rows with the same assetid values.
Use an Aggregator transformation to condense the records into one record per assetid. Then for each attribute, create a port that returns MAX(value) where the attribute matches. Note that this method assumes that you know all possible attributes ahead of time.
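For clarity, the pivot that the Aggregator performs is equivalent to the following plain SQL (only a sketch, assuming the columns shown above and a hypothetical table name table_a):
select assetid
     , max(case when attribute = 'Ins_date' then value end) as ins_date
     , max(case when attribute = 'status'   then value end) as status
     , max(case when attribute = 'X'        then value end) as x
     , max(case when attribute = 'Y'        then value end) as y
     , max(case when attribute = 'size'     then value end) as item_size
     , max(case when attribute = 'depth'    then value end) as depth
     , max(case when attribute = 'st_date'  then value end) as st_date
  from table_a
 group by assetid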
As suggested in the previous answer, you can use the Aggregator. Since your data set is large, you can also use a technique based on a variable port in an Expression transformation, provided the data is sorted before it reaches the expression.
You can download sample mappings that demonstrate both techniques from the Informatica Marketplace app titled "PowerCenter Mapping: Convert Rows Into Columns".
