Collection of stats in Oracle [closed]

Collecting stats in Oracle: how does it improve performance?
When you collect stats on fields/indexes, the system gathers information such as: the total row count of the table, how many distinct values a column has, how many rows exist per value, whether the column is indexed, and if so whether the index is unique or non-unique.
This information is known as statistics.
1. How does this improve performance?
2. How does the parsing engine / cost-based optimizer (CBO) use the statistics to make a query perform better?
3. Why do I need to collect stats on indexed columns, given that using indexed columns in the WHERE clause/joins already gives better performance?

This information is known as statistics, so how does performance improve?
Because more, and more accurate, information lets the optimizer decide on a better execution plan.
For example:
When you travel to a destination for the first time, you gather information about the routes, directions, landmarks and so on. Once you have reached it, you have all that information, and the next time you can take the shortest path, or the route that gets you there in the least time.
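In Oracle, these statistics are gathered with the DBMS_STATS package. A minimal sketch, assuming a hypothetical SALES table owned by APP_OWNER; CASCADE => TRUE also gathers statistics on the table's indexes, which is what lets the CBO judge how selective an index is and whether using it actually beats a full table scan:

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'APP_OWNER',   -- hypothetical schema
    tabname          => 'SALES',       -- hypothetical table
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    method_opt       => 'FOR ALL COLUMNS SIZE AUTO',
    cascade          => TRUE);         -- also gather index statistics
END;
/

With row counts, distinct values and index statistics in place, the optimizer can estimate how many rows each predicate will return and choose, for example, between an index range scan and a full table scan.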

Related

What's the best way to load huge volume tables using Informatica? [closed]

Currently, in our project, we are using Informatica for data loading.
We have a requirement to load 100 tables (the number will grow), each with around 100 million records, and we need to perform a delta operation on them. What is the most efficient way to do this?
If it's possible, try truncate and load. This way after each run you will have a full, fresh dump.
If you can't truncate the targets and need the delta, get some timestamp or counter that lets you read only the modified rows, i.e. new and updated ones - some "updated date" column. That limits the amount of data being read, but it will not catch deletes. So...
Create a separate flow for finding deleted rows that reads only the IDs, not the full rows. It still has to check every row, but only one column, so it should be quite efficient. Use it to delete the corresponding rows in the target - or just mark them as deleted.
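A rough SQL sketch of the two flows described above, assuming a hypothetical source table SRC_ORDERS with an UPDATED_DATE column, a target table TGT_ORDERS, and a control table ETL_CONTROL holding the last successful load time:

-- Main delta flow: new and updated rows only
SELECT o.*
  FROM src_orders o
 WHERE o.updated_date > (SELECT last_load_ts
                           FROM etl_control
                          WHERE flow_name = 'ORDERS');

-- Separate, ID-only flow for detecting deletes
SELECT t.order_id
  FROM tgt_orders t
 WHERE NOT EXISTS (SELECT 1
                     FROM src_orders o
                    WHERE o.order_id = t.order_id);

In Informatica this roughly maps to a timestamp filter in the Source Qualifier for the main flow, and an ID-only comparison (e.g. a lookup on the source keys) for the delete flow.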

Oracle - timestamp intervals [closed]

What is the best practice for managing time intervals in Oracle? For example: I have a room that will be rented from 8:15 until 9:00, so I suppose I have at least two fields: dt_start and dt_end. I cannot allow a rental to be entered from 8:45 until 9:20. What would be the best table structure for that? Thanks
There is no clear consensus on the best way to implement this. The answer certainly depends a great deal on your exact situation. The options are:
Table with a unique constraint on ROOM_ID and a block of time (sketched after this list). This is only realistic if the application allocates a reasonably small amount of time in reasonably large blocks. For example, if a room can be allocated for at most a week, 5 minutes at a time. But if reservations are to the second and can span over a year, one reservation would require 31 million rows.
Trigger. Avoid this solution if possible. The chance of implementing this logic in a trigger that is both consistent and concurrent is very low.
Materialized view. This is my preferred approach. For example, see my answer here.
Enforced by the application. This only works if the application can serialize access and if no ad hoc SQL is allowed.
Commercial Tool. For example, RuleGen.
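A minimal sketch of the first option, with a hypothetical ROOM_SLOT table and 5-minute slots: a booking is expanded into one row per slot, and the unique constraint rejects any overlapping booking.

CREATE TABLE room_slot (
  room_id    NUMBER NOT NULL,
  slot_start DATE   NOT NULL,   -- aligned to a 5-minute boundary
  booking_id NUMBER NOT NULL,
  CONSTRAINT room_slot_uq UNIQUE (room_id, slot_start)
);

-- Booking room 42 from 8:15 to 9:00 today inserts nine 5-minute slots;
-- a later 8:45-9:20 booking collides on (room_id, slot_start) and fails.
INSERT INTO room_slot (room_id, slot_start, booking_id)
SELECT 42,
       TRUNC(SYSDATE) + (8*60 + 15)/1440 + (LEVEL - 1) * 5/1440,
       1001
  FROM dual
CONNECT BY LEVEL <= 9;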
A BEFORE INSERT trigger is the best way to accomplish this.
In the trigger, check that the new interval does not conflict with the existing bookings for that particular room; if it does, raise an error, otherwise let the insert proceed.
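A sketch of such a trigger, with hypothetical table and column names. Note the caveats from the previous answer: a row-level trigger querying its own table only works for single-row INSERT ... VALUES statements (multi-row inserts hit the mutating-table error ORA-04091), and the check is not safe under concurrent sessions unless you also serialize access to the room.

CREATE OR REPLACE TRIGGER trg_room_booking_overlap
BEFORE INSERT ON room_booking
FOR EACH ROW
DECLARE
  l_conflicts PLS_INTEGER;
BEGIN
  -- count existing bookings for the same room that overlap the new interval
  SELECT COUNT(*)
    INTO l_conflicts
    FROM room_booking b
   WHERE b.room_id  = :NEW.room_id
     AND b.dt_start < :NEW.dt_end
     AND b.dt_end   > :NEW.dt_start;

  IF l_conflicts > 0 THEN
    RAISE_APPLICATION_ERROR(-20001, 'Room is already booked for this interval');
  END IF;
END;
/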

Disk/data reads increase after adding an index [closed]

I have a small query that runs pretty fast, and I thought adding an index to an unindexed column would make it faster, but it turned out it didn't. In fact, it increased my disk reads and execution time. Can someone explain in detail how the index works here and why it could decrease performance rather than increase it?
Thanks in advance!
PS: My RDBMS is Oracle.
Entirely possible on a small table. If the table is truly small, it may be read entirely into memory with a single read, and a full table scan can be performed entirely in memory. Adding an index here would require reading at least one index page, followed by the data page, doubling the I/Os. This is an unusual case, but not unheard of.
However, this is just guesswork on my part. To find out what's really going on, grab the execution plan for your query with the index in place, drop the index, and grab the execution plan without it. Compare the plans and decide whether you want to re-add the index.
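One way to do that comparison, with a hypothetical table T, column SOME_COL and index T_SOME_COL_IDX:

EXPLAIN PLAN FOR
  SELECT * FROM t WHERE some_col = :val;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- then drop the index, run EXPLAIN PLAN again, and compare
DROP INDEX t_some_col_idx;

Comparing the operations and estimated costs of the two plans usually makes it obvious whether the index is being used at all and whether it is worth keeping.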
Share and enjoy.

Ruby Object manipulation [closed]

We have an algorithm that compares Ruby objects coming from MongoDB. The majority of the time is spent taking the results (~1000), assigning a weight to each, and comparing them to a base object. This process takes ~2 seconds for 1000 objects. Afterwards, we order the objects by weight and take the top 10.
Given that the number of initial matches will continue to grow, I'm looking for more efficient ways to compare and sort matches in Ruby.
I know this is kind of vague, but let's assume they are User objects that have arrays of data about the person and we're comparing them to a single user to find the best match for that user.
Have you considered storing/caching the weight? This works well if the weight depends only on the attributes of each user and not on values external to that user.
Also, how complex is the weight calculation between a user and the "base" user? If it's complex, you may want to consider using a graph database, which can store data specific to the relationship between two nodes/objects.

Joining very large lists [closed]

Let's put some numbers first:
The largest of the lists is about 100M records (but is expected to grow to around 500M). The other lists (5-6 of them) are in the millions but will stay below 100M for the foreseeable future.
These are always joined on a single id, and never on any other parameters.
What's the best algorithm to join such lists?
I was thinking along the lines of distributed computing: have a good hash function (the consistent/circular kind, where adding a node doesn't cause a lot of data movement) and split these lists into several smaller files. Since they are always joined on the common id (which I would hash on), it would boil down to joining small files, perhaps using the *nix join command.
A DB (at least MySQL) would join using a merge join (since it would be on the primary key). Is that going to be more efficient than my approach?
I know it's best to test and see, but given the magnitude of these files, that's pretty time-consuming. I would like to do some theoretical calculation first and then see how it fares in practice.
Any insights on these or other ideas would be helpful. I don't mind if it takes slightly longer, but I would prefer the best utilization of the resources I have. I don't have a huge budget :)
Use a database. They are designed for performing joins (with the right indexes, of course!).
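An Oracle-flavoured sketch of that approach, with hypothetical table names: load each list into a table keyed on the shared id and let the optimizer pick the join strategy (on primary keys this typically ends up as a hash or merge join).

CREATE TABLE big_list   (id NUMBER PRIMARY KEY, payload VARCHAR2(4000));
CREATE TABLE small_list (id NUMBER PRIMARY KEY, payload VARCHAR2(4000));

-- the join itself: one pass over each table, driven by the id columns
SELECT b.id, b.payload AS big_payload, s.payload AS small_payload
  FROM big_list b
  JOIN small_list s ON s.id = b.id;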
