Does it matter for performance to move select-field values into a separate table? - laravel

I have a patients table.
It contains a smoking field whose values will be smoker, non-smoker, or ex-smoker.
My question is: should I save the values directly as strings in the patients table, or make another table for smoking and create a one-to-one relation between these tables?
Does it affect query performance?

When I have three options like this, I typically use a nullable bit/bool. A bit/bool column will be more performant than a string column.
So, in your situation, my table would have a "smoking" column where null would be non-smoker, 0/false would be ex-smoker, and 1/true would be smoker.
If you think you might add options later, like maybe "recently quit" or something, a nullable bool might not be ideal as you'd have to alter the column, then write a script to convert all of them into the appropriate string.
If you don't want to use the nullable bool, then you'd want a lookup table in a one-to-many relation where you list each option ("smoker, non-smoker, ex-smoker"), then add a foreign key to that "smoking" table in your patients table. Storing a small integer key would also perform better than throwing strings in the column.
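The lookup-table option can be sketched roughly as follows. This is a minimal illustration using Python's built-in sqlite3 module rather than Laravel migrations; the table and column names (smoking_statuses, smoking_status_id) and the sample patients are assumptions made for the example, not part of the original question.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE smoking_statuses (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE patients (
        id                INTEGER PRIMARY KEY,
        name              TEXT NOT NULL,
        smoking_status_id INTEGER NOT NULL REFERENCES smoking_statuses(id)
    );
    INSERT INTO smoking_statuses (id, name) VALUES
        (1, 'smoker'), (2, 'non-smoker'), (3, 'ex-smoker');
    INSERT INTO patients (name, smoking_status_id) VALUES
        ('Alice', 2), ('Bob', 3);
""")

# Join back to the lookup table whenever the label is needed.
rows = conn.execute("""
    SELECT p.name, s.name
    FROM patients AS p
    JOIN smoking_statuses AS s ON s.id = p.smoking_status_id
    ORDER BY p.id
""").fetchall()
print(rows)

# Adding a fourth option later is a single insert, not a column change.
conn.execute("INSERT INTO smoking_statuses (id, name) VALUES (4, 'recently quit')")
option_count = conn.execute("SELECT COUNT(*) FROM smoking_statuses").fetchone()[0]
```

The patients table stores only the integer key, so a new option such as "recently quit" needs no change to existing rows.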

Related

Saving float changes to a float or to a varchar2 column?

I need to save before and after value changes of certain fields of an items table to an items_log table. Changes are saved by an after change trigger on the items table.
Some of the items table columns are varchar2 type and some are number(*) type.
What is the better approach? Saving to two separate before and after number fields and two separate before and after varchar2 fields? Or conserving space by saving everything to two before and after varchar2 fields?
The purpose of this log table is to record which user changed a field and the before and after values.
Could saving a float value to a string field lead to an unexpected deviation from the original value?
Thanks in advance
"What is the better approach?"
There is no "better" approach. There is only an approach that's good enough for your application. If your table will have a few thousand rows in it, it doesn't really matter. If your table will have a few million rows, then space may be more of a concern.
If your goal is to display to a user what changes occurred to your item and it's not going to see a lot of activity, storing everything as a varchar may be good enough. You probably don't want to store rows for fields that did not change.
I use APC's approach often. The items_log table is the same as the item table, and includes a history id, timestamp, action (I, U, or D), and user along with all the columns of the item row. Everything is maintained by a trigger. There are also built-in Oracle auditing features to do auditing for you.
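A rough sketch of that history-table layout, using Python's sqlite3 module instead of Oracle so that it runs anywhere. The column names (price, updated_by, changed_at and so on) are assumptions for illustration, and the trigger here covers only updates and records the new row values; an Oracle version would use PL/SQL trigger syntax and could log inserts and deletes in the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical items table; updated_by stands in for the session user.
    CREATE TABLE items (
        id INTEGER PRIMARY KEY, name TEXT, price REAL, updated_by TEXT
    );
    -- History table: the item columns plus history id, timestamp, action, user.
    CREATE TABLE items_log (
        history_id INTEGER PRIMARY KEY AUTOINCREMENT,
        changed_at TEXT DEFAULT CURRENT_TIMESTAMP,
        action     TEXT CHECK (action IN ('I', 'U', 'D')),
        changed_by TEXT,
        id INTEGER, name TEXT, price REAL
    );
    -- After-update trigger copies the new row into the history table.
    CREATE TRIGGER items_after_update AFTER UPDATE ON items
    BEGIN
        INSERT INTO items_log (action, changed_by, id, name, price)
        VALUES ('U', NEW.updated_by, NEW.id, NEW.name, NEW.price);
    END;
    INSERT INTO items (id, name, price, updated_by) VALUES (1, 'widget', 0.1, 'ann');
    UPDATE items SET price = 0.3, updated_by = 'bob' WHERE id = 1;
""")

log = conn.execute(
    "SELECT action, changed_by, name, price FROM items_log"
).fetchall()
print(log)
```

Because each logged column keeps the type of the source column, numeric values are never converted to strings and back.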

Should one store a search_data tsvector in the same table or external table?

I am implementing full text search in postgres.
I would like to search all posts in my system. The posts fulltext index is an amalgamation of the post title and post body.
I have two ways of achieving this:
create a tsvector column in the posts table, trigger an update to it.
create a second table (posts_search) with a post_id and tsvector column containing the index data.
create a simple gin index ... (out of the question, cause my real world problem needs data in multiple tables for the index)
What is going to perform better, considering I sometimes need to filter down the search by other attributes in the table (like deleted_at is null and so on).
Is it a better approach to keep the tsvector column in the same table as the data (side effect select * now sucks) or a separate table (side effect, join required, index filtering is complicated)?
In my experiments, the typical size of a tsvector column is about 1% of the size of the text field it was computed from using to_tsvector().
With this in mind, storing the tsvector column in another table should provide a performance benefit. For example, even if you do not use SELECT * (and you shouldn't, really), any seqscan on the original single table will still have to load the pages that contain the original text. If you offload the tsvector field to a separate table, a scan over it reads roughly 1% as many pages, so page loading could be up to about 100x faster.
In other words, I would favor the second solution of offloading the tsvector field to a separate table. Or, alternatively, offloading posts (the original text) deeper into your table hierarchy (but I guess it is almost the same thing).
Note that for full text search to work, the original text is not necessary. You may even want to not store it in the database, or store it in a highly compressed format (and not necessarily easily accessible by SQL routines). It would work as long as something can create the tsvector based on the original text, or update it when the text changes.

Teradata: How to design table to be normalized with many foreign key columns?

I am designing a table in Teradata with about 30 columns. These columns are going to need to store several time-interval-style values such as Daily, Monthly, Weekly, etc. It is bad design to store the actual string values in the table since this would be an atrocious repeat of data. Instead, what I want to do is create a primitive lookup table. This table would hold Daily, Monthly, Weekly and would use Teradata's identity column to derive the primary key. This primary key would then be stored in the table I am creating as foreign keys.
This would work fine for my application since all I need to know is the primitive key value as I populate my web form's dropdown lists. However, other applications we use will need to either run reports or receive this data through feeds. Therefore, a view will need to be created that joins this table out to the primitives table so that it can actually return Daily, Monthly, and Weekly.
My concern is performance. I've never created a table with such a large amount of foreign key fields and am fairly new to Teradata. Before I go on the long road of figuring this all out the hard way, I'd like any advice I can get on the best way to achieve my goal.
Edit: I suppose I should add that this lookup table would be a mishmash of unrelated primitives. It would contain a group of values relating to time intervals as already mentioned above, but also time frames such as 24x7 and 8x5. The table would be designed like this:
ID   Type          Value
---  ------------  ------------
1    Interval      Daily
2    Interval      Monthly
3    Interval      Weekly
4    TimeFrame     24x7
5    TimeFrame     8x5
Edit Part 2: Added a new tag to get more exposure to this question.
What you've done should be fine. Obviously, you'll need to run the actual queries and collect statistics where appropriate.
One thing I can recommend is to have an additional row in the lookup table like so:
ID   Type          Value
---  ------------  ------------
0    Unknown       Unknown
Then in the main table, instead of having fields as null, you would give them a value of 0. This allows you to use inner joins instead of outer joins, which will help with performance.
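The effect of the extra "Unknown" row can be sketched as below, again with Python's sqlite3 module rather than Teradata. The main table (jobs) and its two lookup columns are hypothetical stand-ins for the roughly 30 columns described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE primitives (id INTEGER PRIMARY KEY, type TEXT, value TEXT);
    INSERT INTO primitives VALUES
        (0, 'Unknown',   'Unknown'),
        (1, 'Interval',  'Daily'),
        (4, 'TimeFrame', '24x7');
    -- Hypothetical main table: lookup columns default to 0 instead of NULL.
    CREATE TABLE jobs (
        id                INTEGER PRIMARY KEY,
        run_interval_id   INTEGER NOT NULL DEFAULT 0 REFERENCES primitives(id),
        support_window_id INTEGER NOT NULL DEFAULT 0 REFERENCES primitives(id)
    );
    INSERT INTO jobs (id, run_interval_id, support_window_id) VALUES (1, 1, 4);
    -- Support window not known for this row, so it falls back to 0.
    INSERT INTO jobs (id, run_interval_id) VALUES (2, 1);
""")

# Plain inner joins keep every row because 0 always matches the Unknown entry.
report = conn.execute("""
    SELECT j.id, i.value, w.value
    FROM jobs AS j
    JOIN primitives AS i ON i.id = j.run_interval_id
    JOIN primitives AS w ON w.id = j.support_window_id
    ORDER BY j.id
""").fetchall()
print(report)
```

With NULLs in place of 0, the second row would drop out of this report unless the joins were rewritten as outer joins.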

Having more than 50 columns in a SQL table

I have designed my database in such a way that one of my tables contains 52 columns. All the attributes are tightly associated with the primary key attribute, so there is no scope for further normalization.
Please let me know: if the same kind of situation arises and you don't want to keep so many columns in a single table, what other options are there?
It is not odd in any way to have 50 columns. ERP systems often have 100+ columns in some tables.
One thing you could look into is ensuring that most columns have valid default values (null, today, etc.). That will simplify inserts.
Also ensure your code always specifies the columns (i.e. no "select *"). Any kind of future optimization will include indexes with a subset of the columns.
One approach we used once, is that you split your table into two tables. Both of these tables get the primary key of the original table. In the first table, you put your most frequently used columns and in the second table you put the lesser used columns. Generally the first one should be smaller. You now can speed up things in the first table with various indices. In our design, we even had the first table running on memory engine (RAM), since we only had reading queries. If you need to get the combination of columns from table1 and table2 you need to join both tables with the primary key.
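A minimal sketch of such a split, using Python's sqlite3 module. The table names (orders_hot, orders_cold) and their columns are invented for illustration; which columns count as frequently used depends entirely on the real workload.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Frequently used columns; both tables share the original primary key.
    CREATE TABLE orders_hot (
        id INTEGER PRIMARY KEY, status TEXT, total INTEGER
    );
    -- Rarely used columns, keyed by the same id.
    CREATE TABLE orders_cold (
        id INTEGER PRIMARY KEY REFERENCES orders_hot(id),
        notes TEXT, legacy_code TEXT
    );
    CREATE INDEX orders_hot_status ON orders_hot (status);
    INSERT INTO orders_hot  VALUES (1, 'open', 250), (2, 'closed', 90);
    INSERT INTO orders_cold VALUES (1, 'call first', 'A-7'), (2, NULL, 'B-2');
""")

# Common queries touch only the narrow table.
open_orders = conn.execute(
    "SELECT id, total FROM orders_hot WHERE status = 'open'"
).fetchall()

# The full row is reassembled by joining on the shared primary key.
full_row = conn.execute("""
    SELECT h.id, h.status, c.notes
    FROM orders_hot AS h
    JOIN orders_cold AS c ON c.id = h.id
    WHERE h.id = 1
""").fetchall()
print(open_orders, full_row)
```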
A table with fifty-two columns is not necessarily wrong. As others have pointed out many databases have such beasts. However I would not consider ERP systems as exemplars of good data design: in my experience they tend to be rather the opposite.
Anyway, moving on!
You say this:
"All the attributes are tightly associated with the primary key attribute"
Which means that your table is in third normal form (or perhaps BCNF). That being the case it's not true that no further normalisation is possible. Perhaps you can go to fifth normal form?
Fifth normal form is about removing join dependencies. All your columns are dependent on the primary key but there may also be dependencies between columns: e.g., there are multiple values of COL42 associated with each value of COL23. A join dependency means that when we add a new value of COL23 we end up inserting several records, one for each value of COL42. The Wikipedia article on 5NF has a good worked example.
I admit not many people go as far as 5NF. And it might well be that even with fifty-two columns your table is already in 5NF. But it's worth checking, because if you can break out one or two subsidiary tables you'll have improved your data model and made your main table easier to work with.
Another option is the "item-result pair" (IRP) design over the "multi-column table" MCT design, especially if you'll be adding more columns from time to time.
MCT_TABLE
---------
KEY_col(s)
Col1
Col2
Col3
...
IRP_TABLE
---------
KEY_col(s)
ITEM
VALUE
select * from IRP_TABLE;
KEY_COL ITEM VALUE
------- ---- -----
      1 NAME Joe
      1 AGE  44
      1 WGT  202
...
IRP is a bit harder to use, but much more flexible.
I've built very large systems using the IRP design and it can perform well even for massive data. In fact it kind of behaves like a column-organized DB, as you only pull in the rows you need (i.e. less I/O) rather than an entire wide row when you only need a few columns (i.e. more I/O).
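To make the "harder to use" part concrete, here is a small sketch with Python's sqlite3 module. It reuses the IRP_TABLE layout shown above; the second key (2, Ann) is made-up sample data, and the pivot uses conditional aggregation, which is one common way to rebuild a wide row but not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE irp_table (
        key_col INTEGER, item TEXT, value TEXT,
        PRIMARY KEY (key_col, item)
    );
    INSERT INTO irp_table VALUES
        (1, 'NAME', 'Joe'), (1, 'AGE', '44'), (1, 'WGT', '202'),
        (2, 'NAME', 'Ann'), (2, 'AGE', '37');  -- made-up second key, no WGT
""")

# Reading one attribute touches only the rows for that item.
ages = conn.execute(
    "SELECT key_col, value FROM irp_table WHERE item = 'AGE' ORDER BY key_col"
).fetchall()

# Rebuilding a wide row needs a pivot, here via conditional aggregation.
wide = conn.execute("""
    SELECT key_col,
           MAX(CASE WHEN item = 'NAME' THEN value END) AS name,
           MAX(CASE WHEN item = 'AGE'  THEN value END) AS age,
           MAX(CASE WHEN item = 'WGT'  THEN value END) AS wgt
    FROM irp_table
    GROUP BY key_col
    ORDER BY key_col
""").fetchall()
print(ages, wide)
```

Note that every value is stored as text in this layout, so numeric items such as AGE have to be cast by the caller.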

Is it possible to traverse rowtype fields in Oracle?

Say I have something like this:
somerecord SOMETABLE%ROWTYPE;
Is it possible to access the fields of somerecord without knowing the field names?
Something like somerecord[i], such that the order of fields would be the same as the column order in the table?
I have seen a few examples using dynamic SQL but I was wondering if there is a cleaner way of doing this.
What I am trying to do is generate/get the DML (insert query) for a specific row in my table, but I haven't been able to find anything on this.
If there is another way of doing this I'd be happy to use it, but I would also be very curious to know how to do the former part of this question, since it's more versatile.
Thanks
This doesn't exactly answer the question you asked, but might get you the result you want...
You can query the USER_TAB_COLUMNS view (or the other similar *_TAB_COLUMNS views, such as ALL_TAB_COLUMNS) to get information like the column name (COLUMN_NAME), position (COLUMN_ID), and data type (DATA_TYPE) on the columns in a table (or a view) that you might use to generate DML.
You would still need to use dynamic SQL to execute the generated DML (or at least generate static SQL separately).
However, this approach won't work for identifying the columns in an arbitrary query (unless you create a view of it). If you need that, you might need to resort to DBMS_SQL (or other tools).
Hope this helps.
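The same idea, sketched outside Oracle: the example below uses Python's sqlite3 module, with SQLite's PRAGMA table_info standing in for USER_TAB_COLUMNS and SQLite's quote() function producing the SQL literals. The helper name insert_statement_for, the sample table, and the assumption that rows are identified by an id column are all illustrative; an Oracle version would read the column list from USER_TAB_COLUMNS ordered by COLUMN_ID and build the statement with dynamic SQL.

```python
import sqlite3

def insert_statement_for(conn, table, row_id):
    """Build an INSERT statement for one row from the table's column metadata.

    Hypothetical helper: assumes a trusted table name and an 'id' key column.
    """
    # Column names in table order (SQLite's analogue of a column catalog view).
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    # quote() renders each value as a SQL literal, escaping embedded quotes.
    literals = " || ', ' || ".join(f"quote({c})" for c in cols)
    (values,) = conn.execute(
        f"SELECT {literals} FROM {table} WHERE id = ?", (row_id,)
    ).fetchone()
    return f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({values});"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("INSERT INTO sometable VALUES (1, 'O''Brien', 44)")

stmt = insert_statement_for(conn, "sometable", 1)
print(stmt)
```

The column order comes from the metadata, so the generated statement never depends on hard-coded field names.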
As far as I know there is no clean way of referencing record fields by their index.
However, if you have a lot of different kinds of updates of the same table each with its own column set to update, you might want to avoid dynamic sql and look in the direction of statically populating your record with values, and then issuing update someTable set row = someTableRecord where someTable.id = someTableRecord.id;.
This approach has its own drawbacks, such as issuing an update to every column, even unchanged ones, and thus creating additional redo log data, but I believe it should be considered.

Resources