Should I store US states as an array or create table columns? - ruby

I have an app that houses product data via a Product model and table. Each product has specific state availability (multiple states) that I will need to filter and/or search by in the future. I am hoping to find someone who can tell me the most efficient way to store this data. As I see it, I have two options.
The first is to simply create 50 columns in my table, titled with each state name and containing a boolean value. I can then simply filter by = "avail in California" if product.ca. While this certainly works, it seems a bit cumbersome, especially when searching for multiple state availability.
The second option would be to have one column ("states") that stores an array of available states and then filter by = "avail in California" if product.states.include? "CA". This seems like a better solution for two reasons. First, it allows for a cleaner DB table. Second, and more important, I can let my user search by saving the user's input as a variable (user_input) and then = "avail in California" if product.states.include? user_input. This solution does call for a little more work up front when saving the product in the DB, however, since I won't be able to simply check off a boolean value.
I think option two makes the most sense, but am hoping for some advice as to why or why not. I have found a few similar questions, but they do not seem to explain which solution would be better, just how to accomplish each.
What should I do?

You should normalize unless you have a really good reason not to, and I don't see one in your overview.
To normalize, you should have the following tables:
product table, one record per product
state table, one record per state
product_state table, one entry for every product that is in a state
The product_state schema looks like this:
(product_state_id PK, product_id FK, state_id FK)
UNIQUE INDEX(product_id,state_id);
This allows you to have a product in zero or more states.
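As a rough sketch of how the three normalized tables could be queried, the following uses Python's built-in sqlite3 module with an in-memory database. The column names (name, code), the sample rows, and the helper products_available_in are assumptions made for illustration; they are not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE state (
    state_id INTEGER PRIMARY KEY,
    code     TEXT NOT NULL UNIQUE
);
-- One row for every (product, state) pair in which the product is available.
CREATE TABLE product_state (
    product_state_id INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES product(product_id),
    state_id   INTEGER NOT NULL REFERENCES state(state_id),
    UNIQUE (product_id, state_id)
);
-- Sample data, invented for this sketch.
INSERT INTO product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO state VALUES (1, 'CA'), (2, 'NY');
INSERT INTO product_state (product_id, state_id) VALUES (1, 1), (1, 2), (2, 2);
""")


def products_available_in(conn, state_code):
    """Return the names of products available in the given state, sorted by name."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM product p
        JOIN product_state ps ON ps.product_id = p.product_id
        JOIN state s ON s.state_id = ps.state_id
        WHERE s.code = ?
        ORDER BY p.name
        """,
        (state_code,),
    )
    return [name for (name,) in rows]


print(products_available_in(conn, "CA"))
```

Because the state code is passed as a query parameter, the user's input can be used for filtering directly, which covers the search case described in the question.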

I assume that since you’re selling products, you will be charging taxes. There are different taxes by state, county, city. There are country taxes in some countries too.
So you need to abstract these entities into a common parent, usually called GeopoliticalArea, so that you can point a single foreign key (from, say, a tax rates table) at any subtype.
create table geopolitical_area (
  id bigint primary key,
  type text not null
);
create table country (
  id bigint primary key references geopolitical_area(id),
  name text not null unique
);
-- represents states/provinces:
create table region (
  id bigint primary key references geopolitical_area(id),
  name text not null,
  country_id bigint references country(id),
  unique (name, country_id)
);
insert into geopolitical_area values
  (1, 'Country'),
  (2, 'Region');
insert into country values
  (1, 'United States of America');
insert into region values
  (2, 'Alabama', 1);
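To illustrate the point about a single foreign key, here is a minimal sketch using Python's sqlite3 module. The tax_rate table, its columns, the rate values, and both helper functions are hypothetical; the rates are placeholders, not real tax data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE geopolitical_area (
    id   INTEGER PRIMARY KEY,
    type TEXT NOT NULL
);
CREATE TABLE country (
    id   INTEGER PRIMARY KEY REFERENCES geopolitical_area(id),
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE region (
    id         INTEGER PRIMARY KEY REFERENCES geopolitical_area(id),
    name       TEXT NOT NULL,
    country_id INTEGER REFERENCES country(id),
    UNIQUE (name, country_id)
);
-- Hypothetical table: one foreign key that can point at any subtype.
CREATE TABLE tax_rate (
    geopolitical_area_id INTEGER NOT NULL REFERENCES geopolitical_area(id),
    rate_percent         REAL NOT NULL
);
INSERT INTO geopolitical_area VALUES (1, 'Country'), (2, 'Region');
INSERT INTO country VALUES (1, 'United States of America');
INSERT INTO region VALUES (2, 'Alabama', 1);
-- Placeholder rates, invented for this sketch.
INSERT INTO tax_rate VALUES (1, 0.0), (2, 4.0);
""")


def area_tax_rates(conn):
    """List (area name, rate) pairs, whichever subtype each area belongs to."""
    return conn.execute(
        """
        SELECT COALESCE(c.name, r.name) AS area_name, t.rate_percent
        FROM tax_rate t
        JOIN geopolitical_area g ON g.id = t.geopolitical_area_id
        LEFT JOIN country c ON c.id = g.id
        LEFT JOIN region r ON r.id = g.id
        ORDER BY area_name
        """
    ).fetchall()


def can_add_tax_rate(conn, area_id, rate_percent):
    """Try to insert a rate; return False if the area does not exist."""
    try:
        with conn:
            conn.execute("INSERT INTO tax_rate VALUES (?, ?)", (area_id, rate_percent))
        return True
    except sqlite3.IntegrityError:
        return False


print(area_tax_rates(conn))
```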

Related

Laravel many-to-many relation with an additional foreign key?

I'm trying to add an extra column as a foreign key in a Laravel many-to-many relationship. Here is my table structure:
idcards
  id,
  name
quality
  id,
  name
idcard_quality
  id,
  idcard_id,
  quality_id,
  related_id
In the idcard_quality table above I want to add an extra foreign key, related_id. I want to retrieve results such that every idcard has many qualities, and each quality is itself another idcard. For example, suppose one idcard has one "normal" quality, and that normal quality is a different idcard.
Please help.
I think your table design should look like this:
idcards
  id,
  name
quality
  id,
  name,
  related_id
idcard_quality
  idcard_id,
  quality_id
This way you can retrieve the values of the parent/related id card by joining the tables.

Validate that value is unique over multiple tables access

Scenario: I have to create a database containing 3 different tables that hold information about people. Let's call them Members, Non_Members and Employees. Among the other information they may share, one item is the telephone number. The phone numbers are unique, each in its respective table.
My problem: I want to make sure the phone number is always unique across these 3 tables. Is there a way to create a validation rule for that? If not, and I need to redesign the database, what would be the recommended way to do it?
Additional info: While the 3 tables hold the same information (name, address, etc.), it's not always required to fill it all in. So I am not sure whether a generic table named Persons would work for my case.
Some ideas: I was wondering if and how I can use a query as a validation rule (that would make things easier). If I end up creating a table called Phone numbers, how would the relations between the 4 tables work in order to ensure that each of the 3 tables has a phone number?
ERD
I assume you are talking about a relational database.
I would go for a single person table with a "type" column (member, non_member, ...). That is much more flexible in the long run. It's easy to add new "person types" - what if you later want a "guest" type?
You would need to define the columns as nullable to cater for the "not all information is required" part.
With just a single table, it's easy to make the phone number unique.
If you do need to make it unique across different tables, you need to put the phone numbers in their own table (where the number is unique) and then reference that phone_number table from the other tables.
Edit
Here is an example of creating such a phone_number table:
create table phone_number
(
  id integer primary key,
  phone varchar(100) not null unique
);
create table member
(
  id integer primary key,
  name varchar(100),
  -- ... other columns
  phone_number_id integer references phone_number
);
The tables non_member and employee would have the same structure (which is a strong sign that they should be a single entity)
Edit 2 (2016-01-08 20:12)
As sqlvogel correctly pointed out, putting the phone numbers into a single table doesn't prevent a phone number from being used by more than one person (I had misunderstood the requirement as meaning that no phone number should be stored more than once).
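The single-table suggestion made earlier in this answer can be sketched as follows with Python's sqlite3 module. The column names, the CHECK list of person types, the NOT NULL on phone, and the add_person helper are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE person (
    id          INTEGER PRIMARY KEY,
    -- Assumed set of types; adding 'guest' later only means extending this list.
    person_type TEXT NOT NULL
                CHECK (person_type IN ('member', 'non_member', 'employee')),
    name        TEXT,           -- nullable: not all information is required
    address     TEXT,           -- nullable for the same reason
    phone       TEXT NOT NULL UNIQUE
)
""")


def add_person(conn, person_type, phone, name=None):
    """Insert a person; return False if the phone number is already taken."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO person (person_type, name, phone) VALUES (?, ?, ?)",
                (person_type, name, phone),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(add_person(conn, "member", "555-0100", "Ann"))
print(add_person(conn, "employee", "555-0100"))
```

With one table, the UNIQUE constraint on phone is enough to stop a member and an employee from sharing a number, which the separate phone_number table on its own does not guarantee.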

Star Schema: How the fact table aggregations are performed?

https://web.stanford.edu/dept/itss/docs/oracle/10g/olap.101/b10333/globdiag.gif
Assume that we have a star schema as above.
My question is: in practice, how do we populate the unit_price and unit_cost columns of the fact table?
Can anyone provide me with star schema tables containing real data?
I am having a hard time understanding star schemas.
Please help!
A star schema consists of two types of tables: fact tables and dimensions.
The idea of the star design is that you can split your data into two parts.
The static part is described with dimensions and the dynamic part (= transactions) in the fact table.
Each transaction is stored in the fact table as a new record and is connected to the surrounding dimensions, which define the context of the transaction.
The example in the link contains two fact tables: SHIPMENTS and PRODUCT_CONDITIONS.
Note that the fact tables in the link are named UNITS_HISTORY_FACT and PRICE_AND_COST_HISTORY_FACT, but I don't find these the best choice of names.
The SHIPMENTS table stores one record for each shipment of a PRODUCT to a CUSTOMER at some TIME, via a defined CHANNEL.
All the above information is defined using the corresponding keys of the respective dimensions.
The fact table also contains MEASURES describing the attributes of the transaction, here the number of UNITS shipped.
The structure of the fact table would be therefore
CUSTOMER_ID
PRODUCT_ID
TIME_ID
CHANNEL_ID
UNITS
The second fact table (bottom) is more interesting, because here you split the product into two parts:
PRODUCT: the dimension defining the ID, name and other more static attributes
PRODUCT_CONDITIONS: the fact table, designed with the expectation that the price and cost of the product will change over time.
With each change of the price or cost, you insert a new record in the fact table and connect it to the PRODUCT and the TIME (of the change).
The structure of this fact table would therefore be
PRODUCT_ID
TIME_ID
UNIT_PRICE
UNIT_COST
A final note on the design of the TIME dimension.
The best practice for connecting the fact table with the dimension tables is to use meaningless IDs (surrogate keys), but for the TIME dimension you should be careful. For a big (time-partitioned) fact table, the natural key (DATE format) is often used so that the partitioning features can be deployed. See more details in How I Defined a Time Dimension Using a Surrogate Key and other resources on the web.
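To make the two fact tables more concrete, here is a small sketch using Python's sqlite3 module. The sample rows are invented, the CUSTOMER, CHANNEL and TIME dimensions are left out for brevity, TIME_ID is assumed to be an ISO date string (a natural key), and shipment_revenue is a hypothetical helper. It prices each shipment with the PRODUCT_CONDITIONS record that was in effect at shipment time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
-- Fact table: one row per price/cost change of a product.
CREATE TABLE product_conditions (
    product_id INTEGER NOT NULL REFERENCES product(product_id),
    time_id    TEXT NOT NULL,   -- assumed ISO date, e.g. '2024-01-01'
    unit_price REAL NOT NULL,
    unit_cost  REAL NOT NULL,
    PRIMARY KEY (product_id, time_id)
);
-- Fact table: one row per shipment.
CREATE TABLE shipments (
    customer_id INTEGER NOT NULL,
    product_id  INTEGER NOT NULL REFERENCES product(product_id),
    time_id     TEXT NOT NULL,
    channel_id  INTEGER NOT NULL,
    units       INTEGER NOT NULL
);
-- Sample data, invented for this sketch.
INSERT INTO product VALUES (1, 'Widget');
INSERT INTO product_conditions VALUES
    (1, '2024-01-01', 10.0, 6.0),
    (1, '2024-02-01', 12.0, 7.0);
INSERT INTO shipments VALUES
    (100, 1, '2024-01-15', 1, 5),
    (100, 1, '2024-02-10', 1, 3);
""")


def shipment_revenue(conn):
    """Return (shipment date, units * price valid on that date) per shipment."""
    return conn.execute(
        """
        SELECT s.time_id, s.units * pc.unit_price AS revenue
        FROM shipments s
        JOIN product_conditions pc
          ON pc.product_id = s.product_id
         AND pc.time_id = (
                 -- latest price change on or before the shipment date
                 SELECT MAX(pc2.time_id)
                 FROM product_conditions pc2
                 WHERE pc2.product_id = s.product_id
                   AND pc2.time_id <= s.time_id
             )
        ORDER BY s.time_id
        """
    ).fetchall()


print(shipment_revenue(conn))
```

In this layout the price and cost columns are populated simply by inserting a new PRODUCT_CONDITIONS row whenever a price or cost changes; aggregations such as revenue are computed at query time by joining the two fact tables through the shared PRODUCT and TIME keys.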

When should I use a nested table and when a reference?

How should I decide whether to use a nested table or a reference?
For example:
We have an airline and a flights table:
CREATE TABLE airline OF airline_ty(
  token VARCHAR2(8),
  description VARCHAR2(20)
);
CREATE TABLE flights OF flights_ty(
  flightNumber NUMBER(10),
  securityLevel VARCHAR2(10)
);
Should I now make a reference in airline (flights REF flights_ty) or go for a nested table?
It depends on the requirements for usage of the data. In your example with airlines and flights a flight should have a foreign key to its airline. The main table is flights and airlines is a codebook.
An example case where a nested table would be a good choice:
A customer in a core banking application has several phone numbers, email addresses, etc. You need to hold this data for a customer, but you do not evaluate it (for example, finding all customers with a given email); you just display it together with the other customer details. You cannot have an extra table for each one-to-many property, because you have much more interesting data, like accounts, loans, credit cards, account statements, behavior score cards, etc.
You always have to take into account redundancy, reuse, importance, property vs. entity, aggregation vs. composition...

Entity Relation Design

I am trying to implement an entity relation for a hospital oracle database system.
I am rather confused about whether I should separate the tables below or merge them into one.
- Supply
ItemNo (PK) , Name, ItemDescription, QuantityInStock, BackOrderLevel, CostPerUnit
- PharmaceuticalSupply
DrugNo (PK) , Dosage, MethodOfAdmin
Basically, in my ERD, I pointed PharmaceuticalSupply to Supply as a subset which inherits its attributes but also has additional attributes. Am I wrong in doing that?
Ultimately, this is a design decision that has no right or wrong answer, but keeping them separate can be helpful. For example, there are many types of supplies that are not pharmaceutical. If you merge the tables, you make it possible to enter data that has no real meaning. For example, you can't have a dosage of bandages. The separate table makes it clear that dosage only applies to pharmaceuticals.
Note that there are a few variations on how to manage the PKs and FKs in PharmaceuticalSupply. It could have both an ItemNo and a DrugNo, where ItemNo is a foreign key. In that case, either one could be the primary key, but if DrugNo is the primary key, then ItemNo probably needs to be a unique index. However, unless DrugNo is needed due to some custom format, it might work well to simply use ItemNo as both PK and FK and completely eliminate DrugNo. This results in a "specialization" as the relational database world likes to refer to it.
It depends on your population. If it's a subset, add a foreign key to Supply to reduce redundancy. That way you'll be able to build a join that lists all the data.
I would still introduce a DrugNo key for indexing. Can an item number appear more than once in the PharmaceuticalSupply table? If it can, then you definitely need the DrugNo key.
PharmaceuticalSupply
DrugNo (PK) , ItemNo (FK), Dosage, MethodOfAdmin
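A minimal sketch of this layout and of the join that lists all data, using Python's sqlite3 module. The snake_case column names, the sample rows, and the list_supplies helper are assumptions for illustration. The UNIQUE constraint on item_no assumes each supply item has at most one pharmaceutical record; drop it if an item number may appear more than once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE supply (
    item_no           INTEGER PRIMARY KEY,
    name              TEXT NOT NULL,
    quantity_in_stock INTEGER NOT NULL DEFAULT 0
);
-- Subtype table: only pharmaceuticals get dosage and method of administration.
CREATE TABLE pharmaceutical_supply (
    drug_no         INTEGER PRIMARY KEY,
    item_no         INTEGER NOT NULL UNIQUE REFERENCES supply(item_no),
    dosage          TEXT,
    method_of_admin TEXT
);
-- Sample data, invented for this sketch.
INSERT INTO supply VALUES (1, 'Bandage', 200), (2, 'Ibuprofen', 50);
INSERT INTO pharmaceutical_supply VALUES (10, 2, '200 mg', 'oral');
""")


def list_supplies(conn):
    """List every supply item; dosage is None for non-pharmaceutical items."""
    return conn.execute(
        """
        SELECT s.name, p.dosage
        FROM supply s
        LEFT JOIN pharmaceutical_supply p ON p.item_no = s.item_no
        ORDER BY s.item_no
        """
    ).fetchall()


print(list_supplies(conn))
```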

Resources