Disk / data read increase after adding an index [closed] - performance

I have a small query that runs pretty fast. I thought adding an index to an unindexed column would make it faster, but it turned out it didn't. In fact, it increased my disk reads and execution time. Can someone explain in detail how an index works and why it could decrease performance rather than increase it?
Thanks in advance!
PS: My RDBMS is Oracle.

Entirely possible on a small table. If the table is truly small, it may fit entirely into memory and be read with a single I/O, so a full table scan can be performed entirely in memory. Adding an index means reading at least one index page and then the data page, doubling the I/O. This is an unusual case, but not unheard of.
However, this is just guesswork on my part. To truly find out what's going on, grab the execution plan for your query with the index on, drop the index, and grab the execution plan without it. Compare the plans, and decide whether you want to re-add the index.
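A minimal sketch of that comparison in Oracle, using EXPLAIN PLAN and DBMS_XPLAN (the table and index names are placeholders; on 11g and later you can make the index invisible instead of dropping it outright):

    -- Plan with the index in place (orders / orders_cust_idx are hypothetical names)
    EXPLAIN PLAN FOR
      SELECT * FROM orders WHERE customer_id = :cust_id;
    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

    -- Take the index out of play (less destructive than dropping it), then re-check
    ALTER INDEX orders_cust_idx INVISIBLE;
    EXPLAIN PLAN FOR
      SELECT * FROM orders WHERE customer_id = :cust_id;
    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

If the second plan is cheaper, the optimizer agrees with what you observed and the index is not helping this query.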
Share and enjoy.

Related

What's the best way to load huge volume tables using Informatica? [closed]

Currently, in our project, we are using Informatica for Data loading.
We have a requirement to load 100 tables (this number will increase in the future), each with 100 million records, and we need to perform a delta operation on them. What might be the best way to perform this operation efficiently?
If it's possible, try truncate and load. This way, after each run you will have a full, fresh dump.
If you can't truncate the targets and need the delta, find some timestamp or counter that allows you to read only the modified rows - new and updated ones - such as an "updated date" column. This way you limit the amount of data being read. It will not let you handle deletes, though. So...
Create a separate flow for finding deleted rows, one that does not read the full row but IDs only. This still needs to check every row, but since it is limited to a single column it should be quite efficient. Use it to delete rows in the target - or just to mark them as deleted.
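As a rough SQL sketch of the two reads described above (all table and column names, and the :last_run_ts parameter, are hypothetical):

    -- 1) Incremental read: only rows new or updated since the last successful run,
    --    using the "updated date" style column mentioned above.
    SELECT *
      FROM src_orders
     WHERE updated_date > :last_run_ts;

    -- 2) Delete detection: read IDs only, and delete or flag target rows that no
    --    longer exist in the source.
    SELECT t.order_id
      FROM tgt_orders t
     WHERE NOT EXISTS (SELECT 1 FROM src_orders s WHERE s.order_id = t.order_id);

In Informatica these would typically become the source qualifier queries for the main delta mapping and for the separate delete-detection mapping.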

In HBase, "Try to minimize row and column sizes" - why? [closed]

Looking at: http://hbase.apache.org/1.2/book.html#rowkey.design
I cannot understand section 36.3, "Try to minimize row and column sizes" - why is that recommended? This chapter is difficult for me to understand. Can someone help me?
Thanks in advance.
The docs are talking about the length of row keys and column names, which was found to matter in the context of indexes in HBase (see the related JIRA issue).
If the key size is large, the index size also becomes large. Systems that depend on indexing prefer to keep the indexes in memory, since it would be really bad to hit the disk for index access. If the index size becomes unnecessarily large (resulting in a high JVM heap), it hurts performance.
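A rough, illustrative calculation (the numbers are assumptions, not from the docs): HBase stores the full row key with every cell, and block index entries carry keys as well, so with one billion cells a 100-byte row key contributes on the order of 100 GB of repeated key data, while a 16-byte key contributes roughly 16 GB - a sixfold difference before any actual values are stored. Keeping keys and column names short shrinks both the stored data and the indexes that need to stay in memory.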

Collection of stats in Oracle [closed]

Collecting stats in Oracle - how does performance get improved?
When you collect stats on columns/indexes, the system gathers information like: the total row count of the table, how many distinct values there are in a column, how many rows there are per value, whether the column is indexed, and if so whether the index is unique or non-unique, etc.
The above information is known as statistics.
1. How does performance get improved?
2. How does the parsing engine / cost-based optimizer (CBO) use the statistics to improve query performance?
3. Why do I need to collect stats on indexed columns, given that using indexed columns in the WHERE clause/joins already gives better performance?
The above information is known as statistics. So how does performance get improved?
Because more, and more accurate, information lets the optimizer decide on a better execution plan.
For example,
The first time you travel to a destination, you gather information about the routes, directions, landmarks, etc. Once you have gathered all that information, the next time you can reach your destination by the shortest path, or by the route that takes the least time.
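In Oracle, gathering statistics is typically done with the DBMS_STATS package; a minimal sketch (schema and table names are placeholders):

    BEGIN
      DBMS_STATS.GATHER_TABLE_STATS(
        ownname          => 'SALES_OWNER',                -- schema (placeholder)
        tabname          => 'ORDERS',                     -- table (placeholder)
        estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
        method_opt       => 'FOR ALL COLUMNS SIZE AUTO',  -- histograms where useful
        cascade          => TRUE                          -- also gather index stats
      );
    END;
    /

The CBO then reads these figures (visible in views such as USER_TAB_STATISTICS and USER_TAB_COL_STATISTICS) to estimate how many rows each step will return and to decide, for example, whether an index range scan or a full table scan is cheaper - which is also the answer to question 3: even on indexed columns, the optimizer needs statistics to judge whether using the index is actually the cheaper choice.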

Website Performance Issue [closed]

If a website is experiencing performance issues all of a sudden, what can be the reasons behind it?
In my opinion, the database could be one reason, or disk space on the server could be another, but I would like to know more about it.
There can be any number of reasons, and which ones apply depends on your setup.
Based on what you have specified, you can have a look at:
System counters of the web server/app server: CPU, memory, paging, I/O, disk.
Any changes you made to the application recently - were those changes costly for performance? Do a round of analysis on them to check whether any improvement is required.
If system counters are choking, check which one is the bottleneck and try to resolve it.
Check all layers/tiers of the application, i.e. app server, database, directory, etc.
If the database is the bottleneck, identify costly queries and apply indexes and other DB tuning (see the sketch after this list).
If the app server is choking, you need to identify and improve the methods that are resource-heavy.
Performance tuning is not a fast-track process; it takes time. Identify bottlenecks, try to solve them, and repeat the process until you get the desired performance.
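One way to spot the costly queries, assuming an Oracle backend as in the other questions on this page (other databases have equivalent facilities, e.g. pg_stat_statements in PostgreSQL):

    -- Top 20 statements by total elapsed time still cached in the shared pool
    SELECT *
      FROM (SELECT sql_id,
                   executions,
                   ROUND(elapsed_time / 1e6, 1) AS elapsed_sec,
                   buffer_gets,
                   disk_reads,
                   SUBSTR(sql_text, 1, 80)      AS sql_text
              FROM v$sql
             ORDER BY elapsed_time DESC)
     WHERE ROWNUM <= 20;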

Access query calculation vs. Excel calculation where the Access query is the data source [closed]

The setup is an Access back-end on a network drive, with a query as the data source for an Excel table. I want to know whether it is better to perform complex calculations in Excel after the data has been imported, or to have the calculation in the query itself.
For example:
The db collects quality control information with individual records for every component of a lot. One calculation checks that all components of each lot have been recorded and if so checks that the most recent component has been entered before the scheduled completion time.
Obviously this is a fairly intensive calculation in Excel, which leads to significant calculation time after the data has been imported. (It's very possible that the calculation isn't as efficient as it could be!)
So what I'd like to know is whether the Access query would be more or less efficient at doing this calculation (bearing in mind that the file is on a network drive).
I hope all that makes sense but if not let me know and I will try to clarify!
Thanks.
There is not a general rule for which platform will be faster. It depends on the size of the task and the method chosen to implement it.
MSAccess is absolutely great at collating information, but that is because it is flexible and systematic, which helps prevent errors. Not because collating information is fast. There is no general rule that says collating information will be faster in MSAccess, Excel, SQL Server or C#.
If you are using a code loop to compare all cells, that can take a long time however you do it. Post the code here to see if there is a suggestion on how to convert it to calculated cell expressions. To make Excel fast, you need to use calculated cell expressions.
If you aren't using a code loop, are you sure you aren't actually waiting for the database access?
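For comparison, here is a sketch of how that per-lot check might look if pushed into the Access query itself; all table and column names are hypothetical:

    -- Lots that have all components recorded, with the last one entered before the
    -- scheduled completion time (Lots / Components and their columns are made up).
    SELECT l.LotID
    FROM Lots AS l
    INNER JOIN Components AS c ON c.LotID = l.LotID
    GROUP BY l.LotID, l.ExpectedCount, l.ScheduledCompletion
    HAVING COUNT(c.ComponentID) = l.ExpectedCount
       AND MAX(c.EntryTime) <= l.ScheduledCompletion;

A set-based query like this is usually much cheaper than looping over imported cells in Excel, though with the back-end on a network drive the data still has to travel over the wire either way.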
