What's the difference between "table" and "directory" terminologies? - data-structures

Paging in the x86 architecture is multilevel. There is a page directory which contains pointers, each one to a page table. The concept of directory also appears in directory-based cache coherence mechanisms, but there are no tables there.
I was wondering what the difference is between a "table" and a "directory". Do they mean the same thing, with two names for one concept? Thanks.

Related

FileNet Content Engine - Database Table for Physical path

I realize this is possible with the FileNET P8 API, however I'm looking for a way to find the physical document path within the database. Specifically there are two level subfolders in the FileStore, like FN01\FN13\DocumentID but I can't find the reference to FN01 or FN13 anywhere.
You will not find the names of the folders anywhere in the FN databases. The folder structure is determined by a hashing function. Here is an excerpt from this page on filestores:
Documents are stored among the directories at the leaf level using a hashing algorithm to evenly distribute files among these leaf directories.
The IBM answer is correct only from a technical standpoint of intended functionality.
If you really really need to find the document file name and folder location, disable your actual file store(s) by making the file store(s) folder unavailable to Content Engine. I did that for each file store by simply changing the root FN#'s to FN#a. For instance, FN3 became FN3a. Once done, I changed the top tree folder back. I used that method so log files would not exceed the tool's maximum output. Any method that leaves a storage location (eg: drive, share, etc) accessible and searchable, but renders the individual files unavailable should cause the same results.
Then, run the Content Engine Consistency Checker. It will provide you with a full list of all files, IDs and locations.
After that, you can match the entries to the OBJECT_ID fields in the database tables. In non-MSSQL databases, the byte ordering is reversed for the first few octets of the UUID. You need to account for that and fix the byte ordering to match the CCC output.
...needs to be byte reversed so that it can be queried upon in Oracle.
When querying on GUIDs, GUIDs are stored in byte-reversed form in
Oracle and DB2 (not MS SQL), whereby the first three sections are pair
reversed and the last two are left alone.
Thus, the same applies in reverse. In order to use the output from the Content Consistency Checker to match output to database, one must go through the same byte ordering reversal.
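The reordering described above can be sketched in a few lines of Python. This is only an illustration of the rule quoted from the technote (first three sections pair-reversed, last two left alone); the function names and the sample GUID are made up for the example, and how your own database column renders the value (raw bytes versus a hex string) is something to verify against your schema.

```python
def _reverse_pairs(section: str) -> str:
    """Reverse a hex string two characters (one byte) at a time."""
    pairs = [section[i:i + 2] for i in range(0, len(section), 2)]
    return "".join(reversed(pairs))


def to_byte_reversed_hex(guid: str) -> str:
    """Convert '{AABBCCDD-EEFF-...}' to the byte-reversed hex form.

    The first three dash-separated sections are pair-reversed;
    the last two sections are copied unchanged.
    """
    parts = guid.strip("{}").split("-")
    if [len(p) for p in parts] != [8, 4, 4, 4, 12]:
        raise ValueError(f"not a GUID: {guid!r}")
    converted = [_reverse_pairs(p) for p in parts[:3]] + parts[3:]
    return "".join(converted).upper()


def from_byte_reversed_hex(hex_string: str) -> str:
    """Inverse of to_byte_reversed_hex: rebuild the displayed GUID."""
    if len(hex_string) != 32:
        raise ValueError(f"expected 32 hex characters: {hex_string!r}")
    h = hex_string
    sections = [h[0:8], h[8:12], h[12:16], h[16:20], h[20:32]]
    restored = [_reverse_pairs(s) for s in sections[:3]] + sections[3:]
    return "{" + "-".join(restored).upper() + "}"


# Hypothetical GUID, used only to show the reordering.
sample = "{A1B2C3D4-E5F6-0718-293A-4B5C6D7E8F90}"
stored = to_byte_reversed_hex(sample)
```

Because reversing the pairs twice restores the original order, the same routine works in both directions: from the displayed ID to the stored form, and from the stored form back to what the Consistency Checker prints.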
See this IBM Tech Doc and the answer linked below for details:
IBM Technote: https://www.ibm.com/support/pages/node/469173
Stack Answer: https://stackoverflow.com/a/53319983/1854328
More detailed information on the storage mechanisms is located here:
IBM Technote: "How to translate the unique identifier as displayed within FileNet Enterprise Manager so that it matches what is stored in the Oracle and DB2 databases"
I do not suggest using this for anything but catastrophic need, such as rebuilding and rewriting an entire file store that got horrendously corrupted by your predecessor when they destroyed an NTFS (or some similarly nasty situation).
It is a workaround to bypass the hashing that FileNet uses to obfuscate content information from those looking at the file system.

Does a page table entry represent a page or a linear address?

I am reading the book Understanding the Linux Kernel, and the topic of address translation confuses me. The book says each linear address has three fields: Directory, Table, and Offset. The Directory field relates to the Directory Table, and the Table field relates to the Page Table.
One thing it does not point out, or that I may have missed, is whether each entry in the tables relates to a page, which is a group of linear addresses, or to an individual linear address.
Can someone help me?
Ok, so there are (at least) two types of page tables: single-level, and multi-level.
In a single-level page table, each entry maps one whole page of virtual addresses directly to a page frame.
Multi-level page tables' entries can point to two different places:
They may point directly to a page frame (like single-level tables).
They may point to secondary (or tertiary, etc.) page tables.
Here's an example of a multi-level page table:
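Remember, each page table entry holds the address of a page frame (or of a lower-level table), not an individual linear address, so one entry covers every address within its page. The operating system sets up the tables, and the paging hardware uses them to translate virtual addresses to physical addresses (the benefits of which are outside of this particular topic).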
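To make the Directory / Table / Offset split concrete, here is a minimal sketch in Python. It assumes the classic 32-bit x86 layout with 4 KiB pages (10-bit Directory, 10-bit Table, 12-bit Offset) and models the tables as plain dictionaries; the names and the sample addresses are invented for illustration and this is not how a kernel stores them.

```python
OFFSET_BITS = 12   # 4 KiB pages: low 12 bits select a byte within the page
INDEX_BITS = 10    # 1024 entries per directory and per page table
INDEX_MASK = (1 << INDEX_BITS) - 1
OFFSET_MASK = (1 << OFFSET_BITS) - 1


def split_linear_address(addr: int) -> tuple[int, int, int]:
    """Split a 32-bit linear address into (directory, table, offset)."""
    offset = addr & OFFSET_MASK
    table = (addr >> OFFSET_BITS) & INDEX_MASK
    directory = (addr >> (OFFSET_BITS + INDEX_BITS)) & INDEX_MASK
    return directory, table, offset


def translate(page_directory: dict, addr: int) -> int:
    """Walk the two levels and return the physical address."""
    directory, table, offset = split_linear_address(addr)
    page_table = page_directory.get(directory)
    if page_table is None:
        raise LookupError("no page table for this directory entry")
    frame_base = page_table.get(table)
    if frame_base is None:
        raise LookupError("page not mapped")
    # The entry names a whole frame; the offset picks the byte inside it.
    return frame_base + offset


# Toy mapping: directory entry 1 -> a page table whose entry 2
# points at the frame starting at physical address 0x80000.
page_directory = {1: {2: 0x80000}}

first = translate(page_directory, 0x00402345)
last = translate(page_directory, 0x00402FFF)
```

Both sample addresses share the same Directory and Table fields, so they go through the same page table entry and differ only in the Offset. That is the sense in which an entry relates to a page rather than to a single linear address.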
Most paging systems also maintain a frame table that keeps track of used and unused frames. The frame table is traditionally a different data structure than the page table.
You can read more about paging tables here.
You can read about page tables here.

About the views start with v$ in oracle

I notice that many views, especially the data dictionary views, start with v$ in Oracle. Do they differ from other views in Oracle?
These are called dynamic performance views, and they are maintained by the Oracle server.
These views provide data on internal disk structures and memory structures. These views can be selected from, but never updated or altered by the user.
You can check more details at this link: http://docs.oracle.com/cd/E11882_01/server.112/e40402/dynviews_1001.htm#i1398692

Database efficiency: references/pointers from table to table

I am working on learning databases and am unsure about something that doesn't seem to make any sense to me. In the relational model you are able to combine through references but always require a global sort of key in each table to be able to combine this information. That is obviously required in most cases, but I feel like in a perfect tree hierarchy set up of a database this is inefficient.
To explain this better I shall use the example of storing products in a database. Products have main categories and sub categories and these are very clear. (ie. Milk is a subcategory of Dairy which is a subcategory of Food, etc.)
I thought in cases like this the ability to store single or a list of references/pointers to tables in fields would take away a lot of search querying and storage requirements.
Here is a link to a simple Paint layout I made to illustrate this:
Image (the table entry could have some command character like '|' after which it knows the following entry is a file directory so when the database initiates it knows to make a pointer there)
Since I am only learning to work with databases now I understand that I may just be missing some knowledge on the subject, but I don't seem to find anything when I try googling this problem. Any help explaining where to start or any confirmation that this may improve efficiency and where I could learn how to write this myself would be great.
The concept of "pointer" is useful only if the object you want to point to has a well-defined address that is at least as permanent as the pointer itself. If the address is less permanent, you could end up with a "dangling" pointer.
A row in the database does not necessarily have a permanent address. [1] By referencing the row through a logical value (instead of the physical address), the reference stays valid even when the row physically moves. [2] And to ensure that the value identifies exactly one row, it must be unique. [3]
As for storing the list of values (be it "pointers" or anything else) inside a single field, this violates the principle of atomicity and therefore the 1NF. There are very good reasons to avoid violating the 1NF, including the ability to maintain the referential integrity and utilize indexing. That being said, there are DBMSes that support arrays or even sub-tables within a single field, which may be useful on rare occasions.
[1] For example, Oracle ROWID is constant as long as the row is not physically moved on disk, but that can happen in many situations that are part of the normal database operation. So aside from putting severe restrictions on how your database is used, you couldn't rely on the ROWID staying constant over the lifetime of the rows that reference it (which could be as long as the lifetime of the database itself).
[2] I suppose it would be theoretically possible for a DBMS to keep track of all the pointers and update them when the row physically moves. However, I'm not aware of any DBMS that actually supports such "updatable" pointers in practice, probably because the underlying mechanism needed for that wouldn't be any more efficient than the standard "value-based" referencing.
[3] And must obviously be non-NULL. Saying that the attribute (or combination thereof) is "non-NULL and unique", is synonymous to saying it's a "key". Ideally, the key should also be immutable (so there is no need for a cascading referential action such as ON UPDATE CASCADE).
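The dangling-pointer argument can be shown with a toy model in Python. This is only a sketch: the "table" is a list whose positions stand in for physical row addresses, the dictionary stands in for an index on the key, and all names and rows are invented for the example. No real DBMS stores rows this way.

```python
def build_index(table: list) -> dict:
    """Map each key value to the row's current physical slot."""
    return {row["id"]: slot for slot, row in enumerate(table)}


def fetch_by_slot(table: list, slot: int):
    """Follow a physical 'pointer' (a slot number)."""
    return table[slot] if 0 <= slot < len(table) else None


def fetch_by_key(table: list, index: dict, key: int):
    """Follow a logical reference (a key value) through the index."""
    slot = index.get(key)
    return None if slot is None else table[slot]


rows = [
    {"id": 1, "name": "Bread"},
    {"id": 2, "name": "Cheese"},
    {"id": 3, "name": "Milk"},
]
index = build_index(rows)

# Two ways another row might refer to Cheese.
cheese_slot = 1   # physical: "the row stored in slot 1"
cheese_key = 2    # logical: "the row whose id is 2"

# Normal operation: Bread is deleted and storage is compacted,
# so the remaining rows move to new slots and the index is rebuilt.
rows = [row for row in rows if row["id"] != 1]
index = build_index(rows)

via_slot = fetch_by_slot(rows, cheese_slot)      # now lands on the wrong row
via_key = fetch_by_key(rows, index, cheese_key)  # still finds Cheese
```

After the move, the slot-based reference silently resolves to a different row, while the key-based reference keeps working because only the index had to be updated, not every row that refers to Cheese.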

Database design: Same table structure but different table

My latest project deals with a lot of "staging" data.
Like when a customer registers, the data is stored in "customer_temp" table, and when he is verified, the data is moved to "customer" table.
Before I start shooting off e-mails and go on a rampage about how I think this is wrong and that you should just put a flag on the row, there is always a chance that I'm the idiot.
Can anybody explain to me why this is desirable?
Creating 2 tables with the same structure, populating a table (table 1), then moving the whole row to a different table (table 2) when certain events occur.
I can understand it if table 2 stores archival, seldom-used data.
But I can't understand it if table 2 stores live data that can change constantly.
To recap:
Can anyone explain how wrong (or right) this seemingly counter-productive approach is?
If there is a significant difference between a "customer" and a "potential customer" in the business logic, separating them out in the database can make sense (you don't need to always remember to query by the flag, for example). In particular if the data stored for the two may diverge in the future.
It makes reporting somewhat easier and reduces the chances of treating both types of entities as the same one.
As you say, however, this does look redundant and would probably not be the way most people design the database.
There seem to be several possible explanations for why you would want "customer_temp".
One, as you noted, is archival: keeping the data around for analysis. In that case, though, the historical data should be aggregated according to some query of interest, and using the table for live data does not sound plausible.
As Oded noted, there could be business logic that differentiates between a customer and a potential customer.
Or it could be a security feature that requires logging all attempts to register a customer in addition to storing approved customers.
Any time I see a permanent table named "customer_temp" I see a red flag. This typically means that someone was working through a problem as they were going along and didn't think ahead about it.
As for the structure you describe there are some advantages. For example the tables could be indexed differently or placed on different File locations for performance.
But typically these advantages aren't worth the cost of keeping the structures in sync when things change (adding a column to both tables, searching for two sets of dependencies, etc.).
If you really need them to be treated differently, then it's better to handle that by adding a layer of abstraction with a view rather than creating two separate models.
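As a rough sketch of the view approach, the following Python script uses the standard sqlite3 module with an in-memory database. The base table name customer_all and the verified flag column are assumptions made for this example; only the names customer and customer_temp come from the question, and here they are views rather than tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- One physical table holds every registration (hypothetical name).
    CREATE TABLE customer_all (
        id       INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        verified INTEGER NOT NULL DEFAULT 0 CHECK (verified IN (0, 1))
    );

    -- Two views give callers the same split the two tables provided.
    CREATE VIEW customer AS
        SELECT id, name FROM customer_all WHERE verified = 1;
    CREATE VIEW customer_temp AS
        SELECT id, name FROM customer_all WHERE verified = 0;
    """
)

# Registration inserts an unverified row.
conn.execute("INSERT INTO customer_all (name) VALUES (?)", ("Alice",))
conn.execute("INSERT INTO customer_all (name) VALUES (?)", ("Bob",))

# Verification flips the flag instead of moving the row between tables.
conn.execute("UPDATE customer_all SET verified = 1 WHERE name = ?", ("Alice",))

verified = [r[0] for r in conn.execute("SELECT name FROM customer ORDER BY id")]
pending = [r[0] for r in conn.execute("SELECT name FROM customer_temp ORDER BY id")]
```

Code that reads from customer or customer_temp does not need to remember the flag, and a schema change such as a new column is made once, in the base table, and then exposed through whichever views need it.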
I would have used a single table design, as you suggest. But I only know what you posted about the case. Before deciding that the designer was an idiot, I would want to know what other consequences, intended or unintended, may have followed from the two table design.
For example, it may reduce contention between processes that are storing new potential customers and processes accessing the existing customer base. Or it may permit certain columns to be constrained to be not null in the customer table that are permitted to be null in the potential customer table. Or it may permit write access to the customer table to be tightly controlled, and unavailable to operations that originate from the web.
Or the original designer may simply not have seen the benefits you and I see in a single table design.
