Relational Algebra (sanity check): branches whose customers include all Tulsa customers

Given These Schemas:
Account: bname, acct_no, balance
Depositor: cname, acct_no
Customer: cname, street, city <-(all customers / both loan and account customers)
Loan: loan_no, amount, b_name
Borrower: cname, loan_no
Branch: bname, b_city, assets
Question: Find branches whose customers include all customers that live in Tulsa
My professor gave this solution:
Π cname, bname (Account ⋈ Depositor) ÷ Π cname (σ city = 'Tulsa' (Customer))
I don't think the part Π cname, bname (Account ⋈ Depositor) is correct,
because that ONLY includes the cname and bname of customers with accounts and does not include ALL customers (it leaves out those who only have loans). The question does not specifically say "Find branches whose customers with accounts include all customers that live in Tulsa".
What am I missing?

We can guess--per names, symmetry & your mention of "loan and account customers"--that there is a correct query involving (the union of projections of) (Account join Depositor) & (Loan join Borrower). So it seems like your take on a query is reasonable. But you don't give the base table predicates (criteria under which rows appear); you rely on us to guess.
Under constraints some queries return the same results as others that otherwise wouldn't. Maybe your professor thinks that (it is obvious that) a borrower must have an account. Under that constraint, if your take is correct then so is theirs. Without certain constraints like that, you are right that they are wrong. But you don't also give the constraints.
However, you are presumably both wrong. If Tulsa has no customers, then every branch trivially satisfies the specification, including a branch that has no customers of its own, so the result should hold that branch. But a quotient will not, because it can only return branches that appear in the dividend. The specification is only similar to one corresponding to a division. Your division returns "branches whose customers include all customers that live in Tulsa" and that have at least one customer. This is a case of classic errors & ambiguities in specification & implementation involving division & almost involving division. On the other hand, maybe there is a constraint that no branch has no customers. Then your query is correct, but not your reasoning.
Re relational querying. (Which you can use to justify your query & arguments precisely & soundly.)
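To make the two objections concrete, here is a minimal sketch in Python with relations modeled as sets of tuples. The `divide` helper, the branch names and the customer names are all made up for illustration; none of them come from the question or the professor's solution.

```python
# Relations modeled as sets of tuples; all data below is invented for illustration.

def divide(pairs, divisor):
    """Relational division: b values paired in `pairs` with every c in `divisor`.

    `pairs` is a set of (c, b) tuples and `divisor` is a set of c values.
    Only b values that occur in `pairs` can ever be returned.
    """
    candidates = {b for (_, b) in pairs}
    return {b for b in candidates if all((c, b) in pairs for c in divisor)}

# (cname, bname) pairs as produced by Account join Depositor
account_customers = {("ann", "Downtown"), ("bob", "Downtown"), ("ann", "Uptown")}
# (cname, bname) pairs as produced by Loan join Borrower
loan_customers = {("bob", "Uptown")}
all_customers = account_customers | loan_customers

all_branches = {"Downtown", "Uptown", "Airport"}  # Airport has no customers
tulsa = {"ann", "bob"}

# Dividing only the account pairs misses Uptown, where bob is a borrower.
accounts_only = divide(account_customers, tulsa)
accounts_and_loans = divide(all_customers, tulsa)

# If nobody lives in Tulsa, every branch trivially qualifies, but the quotient
# can only return branches that appear in the dividend, so Airport is lost.
quotient_when_empty = divide(all_customers, set())
spec_when_empty = {b for b in all_branches
                   if all((c, b) in all_customers for c in set())}
```

With this data the accounts-only quotient holds just Downtown, the union-based one holds Downtown and Uptown, and only the last expression returns the customerless Airport branch when the divisor is empty.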

Related

Need advice on laravel modeling (pivot of pivot)

I have a company auditing system, which is a bit difficult to explain, but I will try to keep it simple. Here are the main parts:
Company
Audits (certifications)
Clauses (These are questions specific to a certification and they all need to be answered in order for the audit to complete. The user can then add comments, status, mark it as complete etc )
There are lots of audits that a company can be tested for, so a company can have multiple audits. Each audit has some clauses that need to be completed in order to complete the audit. A user can also add some data to these clauses, e.g. progress of the clause, status, comments etc.
Right now I have these database tables:
Companies
Audits
Clauses
Companies - Audits (Pivot table)
Clauses - Audit (Pivot table)
Companies - Audits - Clauses (pivot of pivot table)
Now the first 5 are pretty standard, but I am not sure how to implement the last one. Right now, in the companies-audits pivot table, I have an auto-increment field called companies_audits_id, and I then use this id inside the Companies_Audits_Clauses table. The Companies_Audits_Clauses table also has fields like status, progress, comments etc.
I am not sure if this is a good idea, so I need your thoughts on it. Any help is appreciated.
EDIT :
A company can have many audits, and an audit can also have many companies. Think of audits as certifications, so a company can be certified for different certifications, e.g. ISO1000, ISO2000 etc. So it's necessary to have a pivot for audits_companies. Similarly, clauses belong to only one certification; in this case ISO2000 might have 10 clauses. And then each company will have certain comments on these clauses.

Trigger to enforce M-M relationship

Suppose I have following schema :
DEPARTMENT (DepartmentName, BudgetCode, OfficeNumber, Phone)
EMPLOYEE (EmployeeNumber, FirstName, LastName, Department, Phone, Email)
The problem I am facing is how to design a system of triggers to enforce the M-M relationship, assuming that departments with only one employee can be deleted. Also, I need to assign the last employee in a department to Human Resources.
I have no idea how to enforce an M-M relationship through a trigger. Please help.
Many-to-many conditions should not be enforced using a trigger. Many-to-many conditions are enforced by creating a junction table containing the keys in question, which are then foreign-keyed back to the respective parent tables.
If your intention is to allow many employees to be in a department, and to allow an employee to be a member of many departments, the junction table in question would look something like:
CREATE TABLE EMPLOYEES_DEPARTMENTS
  (DEPARTMENTNAME VARCHAR2(99)
     CONSTRAINT EMPLOYEES_DEPARTMENTS_FK1
       REFERENCES DEPARTMENT (DEPARTMENTNAME),
   EMPLOYEENUMBER NUMBER
     CONSTRAINT EMPLOYEES_DEPARTMENTS_FK2
       REFERENCES EMPLOYEE (EMPLOYEENUMBER));
This presumes that DEPARTMENT.DEPARTMENTNAME and EMPLOYEE.EMPLOYEENUMBER are either primary or unique keys on their respective tables. Get rid of the column EMPLOYEE.DEPARTMENT as it's no longer needed. Now by creating rows in the EMPLOYEES_DEPARTMENTS table you can relate multiple employees with a department, and you can relate a single employee with multiple departments.
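As a rough, runnable illustration of the same junction-table idea, here is a sketch using Python's built-in sqlite3 module rather than Oracle. The sample rows, the simplified column lists and the composite primary key on the junction table are my own assumptions, not part of the schema in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Simplified stand-ins for DEPARTMENT and EMPLOYEE plus the junction table.
conn.executescript("""
CREATE TABLE department (departmentname TEXT PRIMARY KEY);
CREATE TABLE employee (employeenumber INTEGER PRIMARY KEY, lastname TEXT);
CREATE TABLE employees_departments (
    departmentname TEXT NOT NULL REFERENCES department (departmentname),
    employeenumber INTEGER NOT NULL REFERENCES employee (employeenumber),
    -- assumption: one row per (department, employee) pair
    PRIMARY KEY (departmentname, employeenumber)
);
""")

conn.executemany("INSERT INTO department VALUES (?)",
                 [("Sales",), ("Human Resources",)])
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [(1, "Smith"), (2, "Jones")])
# Employee 1 belongs to two departments; Sales has two employees.
conn.executemany("INSERT INTO employees_departments VALUES (?, ?)",
                 [("Sales", 1), ("Sales", 2), ("Human Resources", 1)])

departments_of_1 = [row[0] for row in conn.execute(
    "SELECT departmentname FROM employees_departments "
    "WHERE employeenumber = 1 ORDER BY departmentname")]

# A row pointing at a department that does not exist is rejected by the foreign key.
try:
    conn.execute("INSERT INTO employees_departments VALUES ('Nowhere', 1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The many-to-many relationship lives entirely in the rows of the junction table, and the foreign keys, not a trigger, keep those rows consistent with the parent tables.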
The business logic requiring that only departments with one or fewer employees can be deleted should not be enforced in a trigger. Business logic should be performed by application code, NEVER by triggers. Putting business logic in triggers is a gateway to unending debugging sessions. Madness lies this way. Do not give in. Do not surrender. Business logic in triggers opens deep wounds in the fabric of the world, through which unholy beings of indeterminate form will cross the barrier between the spheres, carrying off the screaming souls of the developers to an eternity of pain and torment. Do not, as I have said, put business logic in triggers. Be firm. Resist. You must resist. The Temptations may sing hymns of unholy revelation but you must not listen. Only by standing firmly in the door between the worlds and blocking out the hideous radiance cast off by business logic in triggers, which perverts the very form of the world and calls ZALGO into being, can your application survive. Resist. Resist. There is no way but still you must resist. Your very soul is compromised by putting business logic in triggers. Tony The Pony shall rise from his dark stable and devour the soul of the virgin, and yet you must still not put business logic into triggers! It is too much to bear, we cannot stand!
Not even the children of light may put business logic into their triggers, for business logic in triggers is the very essence of darkness and devours the light! Yea, yea, the blank-faced ones rise from the font of flame and cast down the priests from their altars! Avert your eyes, though you cannot! He comes! He comes! Ph'nglui mglw'nafh Cthulhu R'lyeh wgah'nagl fhtagn!
Don't ask me how I know.
Best of luck.

What might be the purpose of this column in eTRM (Oracle eBusiness suite)

I realize this is quite a specialized question (about Oracle's eTRM + eBusiness suite). I'm trying to figure out the meaning of this:
REMIT_TO_ADDRESS_ID NUMBER (15)
which comes from the AR.RA_CUSTOMER_TRX_ALL table. The reason is that in a query I have, there's a bug like this where we say:
LEFT OUTER JOIN ra_customer_trx_all
ON rct.REMIT_TO_ADDRESS_ID = acct.REMIT_TO_ADDRESS_ID
(acct is from the table hz_cust_acct_sites_all, by the way)
My guess is that REMIT_TO_ADDRESS_ID is some kind of meta-data?
I really appreciate any pointers/tips. Thanks.
Little bit rusty, but did Oracle Apps for 10 years. From your question I understand that you are new to Oracle Apps technology. ra_customer_trx_all stands for:
"RA" => "Accounts Receivables" also known as "AR" (something you sell and want money for),
"customer" says it,
"trx" => "transactions",
"_all" => all records across all organisations (multi-org).
It is a nice table with lots of features :-)
When in Oracle Apps a column is listed with name ending in '_id' and data type of number(15, 0), it is generally a reference to a row in another table. Depending on the Oracle Apps module, you will sometimes find also a foreign key constraint. But generally most Oracle Apps modules rely on the frontend to enforce referential integrity.
So remit_to_address_id refers to another table. In this case address information. Also, the naming of the column tells us that the referred row is used in a special way (role) namely as "remit to".
You might want to join it to the address table of Apps. When you do so, please check the columns listed in the indexes. The multi-org field org_id may be listed first (probably not in AR). If you leave those columns out of the join, you will still get correct results since the IDs are unique across the system, but the index might not be used.
For end user queries, I generally recommend to use the multi-orged view instead of the _all table. This ensures that users only see their current organisation. Remember that you need to set up the client_identifier session variable (if I recall correctly) to store the current organisation ID in.
I hope this helps you.
I have no knowledge of eTRM, or any other Oracle business application.
That said, as a complete wild guess, I would say that the REMIT_TO_ADDRESS_ID is the ID of an address that a payment of some kind is sent to, and that the address is optional (thus the outer join). So, in an Accounts Payable system, you may have a vendor, who has a normal business address. But when you send actual monies, they have an optional Remit To Address, and the payment is sent there instead of the normal business address.
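To illustrate why an optional reference like this is typically outer-joined, here is a small sketch using Python's built-in sqlite3 module. The `trx` and `address` tables and their rows are hypothetical stand-ins, not the real Oracle eBusiness tables or columns (apart from the name remit_to_address_id).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-ins: a transaction may or may not carry a remit-to address.
conn.executescript("""
CREATE TABLE address (address_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE trx (trx_id INTEGER PRIMARY KEY, remit_to_address_id INTEGER);
INSERT INTO address VALUES (10, 'Tulsa');
INSERT INTO trx VALUES (1, 10), (2, NULL);
""")

# The outer join keeps transaction 2 even though it has no remit-to address;
# an inner join would silently drop it.
rows = conn.execute("""
SELECT t.trx_id, a.city
FROM trx t
LEFT OUTER JOIN address a ON a.address_id = t.remit_to_address_id
ORDER BY t.trx_id
""").fetchall()
```

Here `rows` contains both transactions, with the city missing for the one that has no remit-to address.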

Very slow search of a simple entity relationship

We use CRM 4.0 at our institution and have no plans to upgrade presently, as we've spent the last year and a half customising and extending the CRM to work with our processes.
A tiny part of the model is a simple hierarchy: we have a group of learning rooms that has a one-to-many relationship with another entity that describes the courses available for that learning room.
Another entity has a list of all potential and enrolled students who have expressed an interest in whichever course.
That bit's all straightforward and works pretty well and is modelled into 3 custom entities.
Now, we've got an Admin application that reads the rooms and then wants to show the courses for that room, but only where there are enrolled students.
In SQL this is simplified to:
SELECT DISTINCT r.CourseName, r.OtherInformation
FROM Rooms r
INNER JOIN Students S
ON S.CourseId = r.CourseId
WHERE r.RoomId = #RoomId
And this indeed is very close to the eventual SQL that CRM generates.
We use a Crm QueryEntity, a Filter and a LinkEntity to represent this same structure.
The problem now is that CRM normalizes a customized entity into a Base table, which has the standard CRM entity data that all entities share, and an ExtensionBase table, which has our customisations. To give flattened access to this, it creates a view that merges both tables.
This view is what is used by the Generated SQL.
Now the base tables have indices but the view doesn't.
The problem we have is that all we want to do is return Courses where the inner join is satisfied; it's enough to prove there are entries, and CRM makes it SELECT DISTINCT, so we only get one item back for Room.
At first this worked perfectly well, but now we have thousands of queries, it takes well over 30 seconds and of course causes a timeout in anything but SSMS.
I'm given to believe that we can create and alter indices on tables in CRM and that's not considered to be an unsupported modification; but what about Views ?
I know that if we alter an entity then its views are recreated, which would of course make us redo our indices when this happens.
Is there any way to hint to CRM4.0 that we want a specific index in place ?
Another source recommends that where you get problems like this, then it's best to bring data closer together, but this isn't something I'd feel comfortable in trying to engineer into our solution.
I had considered putting a new entity in that only has RoomId, CourseId and Enrolment Count in to it, but that smacks of being incredibly hacky too; After all, an index would resolve the need to duplicate this data and have some kind of trigger that updates the data after every student operation.
Lastly, whilst I know we're stuck on CRM4 at the moment, is this the kind of thing that we could expect to have resolved in CRM2011 ? It would certainly add more weight to the upgrading this 5 year old product argument.
Since views are "dynamic" (conceptually, their contents are generated on-the-fly from the base tables every time they are used), they typically can't be indexed. However, SQL Server does support something called an "indexed view". You need to create a unique clustered index on the view, and the query optimizer should be able to use it to speed up your join.
Someone asked a similar question here and I see no conclusive answer. The cited concerns from Microsoft are Referential Integrity (a non-issue here) and Upgrade complications. You mention the unsupported option of adding an index to the view and managing it over upgrades and entity changes. That is an option; as unsupported and hackish as it is, it should work.
FetchXml does have aggregation, but the query execution plan still uses the views. Here is the SQL generated from a simple select count from incident:
'select
top 5000 COUNT(*) as "rowcount"
, MAX("__AggLimitExceededFlag__") as "__AggregateLimitExceeded__" from (select top 50001 case when ROW_NUMBER() over(order by (SELECT 1)) > 50000 then 1 else 0 end as "__AggLimitExceededFlag__" from Incident as "incident0" ...
I don't see a supported solution for your problem.
If you are building an outside admin app and you are hosting CRM 4 on-premise you could go directly to the database for your query bypassing the CRM API. Not supported but would allow you to solve the problem.
I'm going to add this as a potential answer, although I don't believe it's a sustainable or indeed valid long-term solution.
After analysing the indexes that CRM had defined automatically, I realised that selecting more information in my query would be enough to fulfil the column requirements of an index, and now the query runs in less than a second.

Database design: Low overhead solution for managing daily inventories / capacities?

Here is the scenario: (MySQL 5.1+, PHP, Apache)
I am planning a SaaS application that will let CLIENTS visit SHOPS and book TRIPS. (ALL CAPS are entities). SHOPS offer TRIPS but they only have a certain number of EMPLOYEES to guide the TRIPS (a transactional record). Essentially it is an issue of managing a daily capacity for each SHOP based upon the number of available EMPLOYEES. What is the best DB design solution for delivering this functionality in a way that incurs the lowest amount of overhead?
Here is a simplified view of the database entities:
table.clients
client_id (pk, ai)
table.shops
shop_id (pk, ai)
table.employees
employee_id (pk, ai)
shop_id (fk)
table.trips
trip_id (pk, ai)
client_id (fk)
shop_id (fk)
trip_date (date)
SCENARIO 1
I could run a query on TRIPS for every request when a user wants to view the calendar, like:
SELECT COUNT(*),
trips.trip_date,
trips.shop_id
FROM trips
WHERE shop_id=1
GROUP BY trips.trip_date, trips.shop_id
SCENARIO 2
Create a summary table that stores info on every day, but this strategy seems nightmarish with overhead issues. For instance, imagine that there are 1000 shops, each booking 1000 trips per 365-day year, and the table should store info for the next 2 years (730 days). It seems like that would 1/ create a huge summary table (730,000 rows) that would 2/ be queried 1,000,000+ times per year (1000 shops * 1000 trips per shop). When a CLIENT booked a TRIP it would increment the number (or when a trip was cancelled the number would decrement), which would effectively create a daily inventory/capacity.
So, my question is this: Which method is the best? Or is there a better way to accomplish this?
Thanks!
Sounds like fun!
Firstly - I know you've given us a simplified version of the schema, so I assume there's a lot more elsewhere, but your "trips" table looks wrong - if shops have one and only one client, you don't need the client ID in the trips table.
However, you do need a "booked_trips" table, to record which trip is booked to which employee - you could store that against the "trips" table too, but typically a booking has lots of other stuff like an invoice, a booked date etc. so you may want to separate those things out.
I'd recommend something like your scenario 1 (use queries to derive data stored in normalized tables) rather than scenario 2, which is effectively a denormalization for speed.
It's worth defining "overhead" in your question. Pretty much all of these design questions trade space versus speed; if by overhead you mean disk space, you get a different answer than if you mean "time to run my queries".
Generally, my advice is to work with a normalized approach and measure performance; only denormalize if you know you have a problem.
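As a rough sketch of the normalized approach, the daily capacity can be derived on demand from the trips and employees tables. The example below uses Python's built-in sqlite3 module instead of MySQL; the sample rows, the composite index, the `remaining_capacity` helper and the rule that one employee guides one trip per day are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, shop_id INTEGER);
CREATE TABLE trips (trip_id INTEGER PRIMARY KEY, client_id INTEGER,
                    shop_id INTEGER, trip_date TEXT);
-- assumption: an index on (shop_id, trip_date) keeps the per-shop count cheap
CREATE INDEX trips_shop_date ON trips (shop_id, trip_date);
INSERT INTO employees (shop_id) VALUES (1), (1), (1);
INSERT INTO trips (client_id, shop_id, trip_date) VALUES
    (100, 1, '2024-06-01'), (101, 1, '2024-06-01'), (102, 1, '2024-06-02');
""")

def remaining_capacity(conn, shop_id):
    """Return {trip_date: free slots} for days that have at least one booking.

    Assumes each employee can guide exactly one trip per day.
    """
    (staff,) = conn.execute(
        "SELECT COUNT(*) FROM employees WHERE shop_id = ?", (shop_id,)).fetchone()
    booked = conn.execute(
        "SELECT trip_date, COUNT(*) FROM trips WHERE shop_id = ? GROUP BY trip_date",
        (shop_id,))
    return {day: staff - count for day, count in booked}

capacity = remaining_capacity(conn, 1)
```

Nothing is stored twice here, so there is no counter to keep in sync when a trip is booked or cancelled; whether the query is fast enough is something to measure before reaching for a summary table.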
