Default data in fact table - ETL

I have the following DB tables, 'proposal' and 'funds', that will be transformed into a dimension table (dim_proposal) and a fact table (fact_funds). The cardinality of 'proposal' to 'funds' is zero-to-many, and each 'funds' row belongs to exactly one 'proposal'.
My question is: should I generate fact rows representing proposals with zero funds? In the source system, no rows are generated for zero-funded proposals. Kindly advise on the best approach and practices for this kind of scenario.
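For concreteness, here is a minimal sketch of the source-side extract that exposes the gap; the column names (proposal_id, fund_id, amount) are assumptions, not taken from the actual schema. A LEFT JOIN surfaces zero-funded proposals as rows with NULL fund values, which is exactly the data the source system never materializes:

-- Hypothetical source-side extract; column names are assumed.
-- Proposals with no funds come back with NULL fund columns here,
-- even though the source system stores no such rows.
SELECT p.proposal_id,
       f.fund_id,
       COALESCE(f.amount, 0) AS amount
FROM   proposal p
LEFT JOIN funds f ON f.proposal_id = p.proposal_id;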
Thank you.

Related

How to normalize this UNF table to 3NF given these FDs - staff, patient, date, time, surgery

The following table is not normalized:
Assuming the following functional dependencies are in place, how would we normalize this table?
I can't seem to find a way to normalize the table while following all of the functional dependencies as well. I have the following (Modeled in Oracle SQL Developer Data Modeler):
What can I do to fully normalize the original table?
So Wikipedia’s entry for functional dependency includes this explanation:
a dependency FD: X → Y means that the values of Y are determined by the values of X. Two tuples sharing the same values of X will necessarily have the same values of Y.
So FD1 says that if you know the appointment date and time and the staffer, you can determine the individual patient; likewise for FD5, if you know the appointment date and time and the patient, you can determine the staffer.
FD2 is pretty obvious: a staffer ID needs to map to an individual dentist. That is why you have IDs.
Then it gets weird. FD3 indicates that from a patient number you can determine a single procedure. So if you’re required to abide by that, the surgery can go on the patient entity. Which is stupid, of course.
FD4 is puzzling too, because it says that a staffer can perform only one type of procedure in a given day. When you create data models in real life this is the kind of business rule you would not try to enforce through table design; you'd use a constraint, or enforce it with application code. If you did enforce this with tables you would get a weird intersection table with staffer ID, date, and procedure.
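For illustration, a rough sketch of what that intersection table could look like (the names here are made up); the composite key is what would enforce the one-procedure-per-staffer-per-day rule:

-- Hypothetical sketch of the table FD4 would imply.
CREATE TABLE staff_daily_procedure (
    staff_id       INTEGER     NOT NULL,
    work_date      DATE        NOT NULL,
    procedure_type VARCHAR(50) NOT NULL,
    PRIMARY KEY (staff_id, work_date)  -- one procedure type per staffer per day
);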
Assignments are not going to be totally realistic, but this seems far enough off that you should check with your instructor about whether you are on the right track.

Star Schema - External Identifier fact or dimension?

Here's a question I'm struggling with in a star schema design.
The outline is that we track packages with embedded globally unique identifiers (tags). Each of those tags gives rise to a series of chronological events. I consider the events to be the facts and am including the continuously variable values as columns in the fact table. Dimensions are things like the package type.
What I'm not sure about is whether the tag identifier should be in a dimension or directly on the fact table. We've currently got over 5 million unique tags we are tracking.
Is such a large dimension advisable?
It is a degenerate dimension and you should keep this column in the fact table.
http://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/dimensional-modeling-techniques/degenerate-dimension/
http://www.kimballgroup.com/2003/06/design-tip-46-another-look-at-degenerate-dimensions/
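As a hedged illustration (the table and column names below are invented, not from the question), a degenerate dimension simply means the tag identifier sits on the fact row itself, with no dim_tag table behind it:

-- Hypothetical fact table with the tag as a degenerate dimension.
CREATE TABLE fact_tag_event (
    event_date_key   INTEGER      NOT NULL,  -- FK to a date dimension
    package_type_key INTEGER      NOT NULL,  -- FK to dim_package_type
    tag_id           VARCHAR(64)  NOT NULL,  -- degenerate dimension: no lookup table
    measured_value   NUMERIC(18,4)           -- continuously variable measure
);

With over 5 million distinct tags, this avoids maintaining a dimension table that would be nearly as large as the fact table itself.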

SNMP table with dynamic number of columns

I want to have an SNMP table with a dynamic number of rows and columns.
The code which creates the OIDs in the snmpd is ready, but now I'm having problems with the MIB file.
The MIB file allows a dynamic number of rows (entries) but requires a constant number of columns.
I'm looking for a way to solve this problem. The following solutions might work, but I don't know if they are possible in a MIB file:
Optional columns: the number of columns is between 1 and 32. If I could define the columns as optional, that would solve my problem.
A dynamic number of tables: if I could define a template table with a template name and OID, this would let me split my table into smaller dynamic tables with a static number of columns.
Currently I can't find any documentation of such solutions.
SNMP does not allow a dynamic number of columns in a table. It requires that the MIB describe the table completely, so that a manager knows which columns are present before trying to contact the agent.
Defining tables dynamically is also not permitted.
If you edit your question to describe the data you are trying to model, perhaps we could figure out whether or not it's possible to model it in a MIB. I can certainly imagine situations where the capabilities of SNMP are insufficient to model a data set. It works best where data is either scalar, a tree, or a table with a fixed set of columns.
Edit: As k1eran posted in a comment, it is possible to simply not populate some columns with data, leaving a "sparse table". Please see his comment for a link.

How to insert into DB when a number has more digits than m for NUMBER(m,n) in Oracle

In a DB which I do not have the privilege to alter, a column is NUMBER(13,4). How is it possible to insert 999999999999999999, whose length is more than 13? It throws an exception. Is it possible to convert it into 1.23e3 format, and does the DB save this format?
No, it is not possible, because of the rules and limitations you mentioned yourself. The column has that precision; you cannot change it, so you cannot make it fit. Period.
No, it is not possible to insert a number that exceeds the precision and scale specified for the column.
You have to change the database.
If you don't have permissions to alter the table then simply ask someone who does; you have a valid "business" need to do so.
I would highly recommend not working out some way to "hack" around this limitation. Constraints such as this exist to enforce data quality. Though maybe misapplied in this situation, putting data in two different formats in the same column makes it immeasurably more difficult to retrieve data from the database. That is why you should always store numbers as numbers, and so on.
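To make the limit concrete, here is a minimal Oracle sketch: NUMBER(13,4) allows 13 significant digits with 4 after the decimal point, so at most 9 digits can precede it, and the 18-digit value is rejected.

CREATE TABLE t (amount NUMBER(13,4));

INSERT INTO t VALUES (999999999.9999);      -- OK: largest value that fits
INSERT INTO t VALUES (999999999999999999);  -- fails with ORA-01438: value larger
                                            -- than specified precision allowed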
No, unfortunately not. There is no way to achieve this.

Having more than 50 columns in a SQL table

I have designed my database in such a way that one of my tables contains 52 columns. All the attributes are tightly associated with the primary key attribute, so there is no scope for further normalization.
If the same kind of situation arises and you don't want to keep so many columns in a single table, what other options are there?
It is not odd in any way to have 50 columns. ERP systems often have 100+ columns in some tables.
One thing you could look into is ensuring most columns have valid default values (NULL, today's date, etc.). That will simplify inserts.
Also ensure your code always specifies the columns (i.e. no "select *"). Any kind of future optimization will include indexes with a subset of the columns.
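As a small illustration (the table and column names are invented), defaults let an insert name only the columns that matter, and naming columns explicitly keeps the statement valid if columns are later added:

-- Hypothetical wide table with defaults on most columns.
CREATE TABLE wide_table (
    id         INTEGER PRIMARY KEY,
    status     VARCHAR(20) DEFAULT 'new',
    created_at DATE        DEFAULT CURRENT_DATE,
    notes      VARCHAR(4000)              -- nullable, defaults to NULL
);

-- Explicit column list: the defaults fill in the rest, and no "select *".
INSERT INTO wide_table (id, notes) VALUES (1, 'first row');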
One approach we used once is to split your table into two tables. Both of these tables get the primary key of the original table. In the first table, you put your most frequently used columns, and in the second table you put the lesser-used columns. Generally the first one should be smaller. You can now speed things up in the first table with various indexes. In our design, we even had the first table running on the MEMORY engine (in RAM), since we only had read queries. If you need a combination of columns from table1 and table2, you join both tables on the primary key.
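A rough sketch of that split, with invented names; the "hot" table carries the frequently read columns and the "cold" table the rest, sharing the original primary key:

-- Hypothetical vertical split of one wide table into two.
CREATE TABLE main_hot (
    id     INTEGER PRIMARY KEY,
    name   VARCHAR(100),
    status VARCHAR(20)
);

CREATE TABLE main_cold (
    id               INTEGER PRIMARY KEY REFERENCES main_hot (id),
    long_description VARCHAR(4000),
    audit_comment    VARCHAR(4000)
);

-- When both halves are needed, join on the shared key.
SELECT h.name, c.long_description
FROM   main_hot h
JOIN   main_cold c ON c.id = h.id;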
A table with fifty-two columns is not necessarily wrong. As others have pointed out, many databases have such beasts. However I would not consider ERP systems as exemplars of good data design: in my experience they tend to be rather the opposite.
Anyway, moving on!
You say this:
"All the attributes are tightly associated with the primary key
attribute"
This means that your table is in third normal form (or perhaps BCNF). That being the case, it's not true that no further normalisation is possible. Perhaps you can go to fifth normal form?
Fifth normal form is about removing join dependencies. All your columns are dependent on the primary key, but there may also be dependencies between columns: e.g. there are multiple values of COL42 associated with each value of COL23. A join dependency means that when we add a new value of COL23 we end up inserting several records, one for each value of COL42. The Wikipedia article on 5NF has a good worked example.
I admit not many people go as far as 5NF. And it might well be that even with fifty-two columns your table is already in 5NF. But it's worth checking, because if you can break out one or two subsidiary tables you'll have improved your data model and made your main table easier to work with.
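Schematically, and reusing the hypothetical COL23/COL42 columns from above, the decomposition described looks like this: the repeating pairs move into their own table, so adding a COL23 value no longer multiplies rows in the main table.

-- Sketch only: break the COL23/COL42 relationship out of the wide table.
CREATE TABLE main_table (
    pk    INTEGER PRIMARY KEY,
    col23 VARCHAR(50)
);

CREATE TABLE col23_col42 (
    col23 VARCHAR(50) NOT NULL,
    col42 VARCHAR(50) NOT NULL,
    PRIMARY KEY (col23, col42)
);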
Another option is the "item-result pair" (IRP) design over the "multi-column table" (MCT) design, especially if you'll be adding more columns from time to time.
MCT_TABLE
---------
KEY_col(s)
Col1
Col2
Col3
...
IRP_TABLE
---------
KEY_col(s)
ITEM
VALUE
select * from IRP_TABLE;
KEY_COL  ITEM  VALUE
-------  ----  -----
1        NAME  Joe
1        AGE   44
1        WGT   202
...
IRP is a bit harder to use, but much more flexible.
I've built very large systems using the IRP design and it can perform well even for massive data. In fact it behaves somewhat like a column-organized DB: you only pull in the rows you need (less I/O) rather than an entire wide row when you only need a few columns (which would be more I/O).
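For example, getting one wide row back out of the IRP layout takes a pivot; conditional aggregation is one portable way to write it (the item names follow the sample data above):

-- Rebuild a wide row from item/value pairs.
SELECT key_col,
       MAX(CASE WHEN item = 'NAME' THEN value END) AS name,
       MAX(CASE WHEN item = 'AGE'  THEN value END) AS age,
       MAX(CASE WHEN item = 'WGT'  THEN value END) AS wgt
FROM   irp_table
GROUP  BY key_col;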
