Slow Performance on Sql Express after inserting big chunks of data - performance

We have noticed that our queries are running slower on databases that had big chunks of data added (bulk insert) when compared with databases that had the data added on record per record basis, but with similar amounts of data.
We use Sql 2005 Express and we tried reindexing all indexes without any better results.
Do you know of some kind of structural problem on the database that can be caused by inserting data in big chunks instead of one by one?
Thanks

One tip I've seen is to turn off Auto-create stats and Auto-update stats before doing the bulk insert:
ALTER DATABASE databasename SET AUTO_CREATE_STATISTICS OFF WITH NO_WAIT
ALTER DATABASE databasename SET AUTO_UPDATE_STATISTICS OFF WITH NO_WAIT
Afterwards, manually creating statistics by one of 2 methods:
--generate statistics quickly using a sample of data from the table
exec sp_createstats
or
--generate statistics using a full scan of the table
exec sp_createstats #fullscan = 'fullscan'
You should probably also turn Auto-create and Auto-update stats back on when you're done.
Another option is to check and defrag the indexes after a bulk insert. Check out Pinal Dave's blog post.

Probably SQL Server allocated new disk space in many small chunks. When doing big transactions, it's better to pre-allocate much space in both the data and log files.

That's an interesting question.
I would have guessed that Express and non-Express have the same storage layout, so when you're Googling for other people with similar problems, don't restrict yourself to Googling for problems in the Express version. On the other hand though, bulk insert is a common-place operation and performance is important, so I wouldn't consider it likely that this is a previously-undetected bug.
One obvious question: which is the clustered index? Is the clustered index also the primary key? Is the primary key unassigned when you insert, and therefore initialized by the database? If so then maybe there's a difference (between the two insert methods) in the pattern or sequence of successive values assigned by the database, which affects the way in which the data is clustered, which then affects performance.
Something else: as well as indexes, people say that SQL uses statistics (which it created as a result of runing previous queries) to optimize its execution plan. I don't know any details of that, but as well as "reindexing all indexes", check the execution plans of your queries in the two test cases to ensure that the plans are identical (and/or check the associated statistics).

Related

Best approaches to UPDATE the data in tables - Teradata

I am new to Teradata & fortunately got a chance to work on both DDL-DML statements.
One thing I observed is Teradata is very slow when time comes to UPDATE the data in a table having large number of records.
The simplest way I found on the Google to perform this update is to write an INSERT-SELECT statement with a CASE on column holding values to be update with new values.
But what when this situation arrives in Data Warehouse environment, when we need to update multiple columns from a table holding millions of rows ?
Which would be the best approach to follow ?
INSERT-SELECT only OR MERGE-UPDATE OR MLOAD ?
Not sure if any of the above approach is not used for this UPDATE operation.
Thank you in advance!
At enterprise level, we expect volumes to be huge and updates are often part of some scheduled jobs/scripts.
With huge volume of data, Updates comes as a costly operation that involve risk of blocking table for some time in case the update fails (due to fallback journal). Although scripts are tested well, and failures seldom happen in production environments, it's always better to have data that needs to be updated loaded to a temporary table in required form and inserted back to same table after deleting matching records to maintain SCD-1 (Where we don't maintain history).

Advantages of temporary tables in Oracle

I've tried to figure out which performance impacts the use of temporary tables has on an Oracle database. We want to use these tables in our ETL process to save temporary results. At this time we are using physical tables for this purpose and truncating this tables at the beginning of the ETL process. I know that the truncate process is very expensive and therefore I thought if it would be better to use temporary tables instead.
Have anyone of you experiences if there is a performance boost by using temporary tables in this scenario?
There were only some answers on this question regarding to the SQL Server like in this question. But I don't know if these recommendations also fit for the Oracle db.
It would be nice if anyone could list the advantages and disadvanteges of this feature and also point out in which scenarios this feature could be applicable.
Thanks in advance.
First of all: truncate is not expensive, a delete with no condition is very expensive.
Second: do your temporary table have indexes? What about external keys?
That could affect performance.
The temporary table works more or less like Sql Server (of course the syntax is different, like global temporary table), and both are just table.
You won't get any performance gain with temporary tables against normal table, they are just the same: they have a definition on DB, can have indexes, and are logged.
The only difference is that temporary table are exclusive to your session (except for global table) and that means if multiple scripts from multiple sessions refer to the same table, every one is reading/writing a different table and they cannot locking each other (in this case you could gain performance, but I think it's rarely the case).

Is there a way to make selecting query faster?

I want to select multiple rows from multiple tables, one of them having billions of rows. It sometimes take 20 seconds and there are over thousands of users using it so it is pretty bad.
I looked into COLUMNSTORE and tried it in my local machine and the performance is x50 faster than usual! (note that I was clearing the cache to see the difference)
However, the downside is I can't update, insert and delete rows, which is being constantly done for that table with the billion rows.
Is there a way to optimize it? (Besides the (NOLOCK) dirty read, which security is not an issue btw)
There are already indexes in that table, but doesn't help.
Is there a way to perform BATCH EXECUTION (I see it does row execution)? Or any optimization advice?
Using Microsoft SQL Server 2012
When you get to the scale of billions of rows, you often need to take different approaches for handling the data. Separating the content into multiple databases and storing on different machines might be more effective, however the design is considerably more complex.
An alternative is to consider using a combination of partitioned tables with a column-based index. That way at least, you can stage the updated data for the partition and then swap the updated one for the existing one to perform updates. See: http://technet.microsoft.com/en-us/library/gg492088.aspx#Update
An alternative is to consider using three tables: one that is static -- and is perhaps using column-based storage -- the other one dynamic, holding only recent updates and inserts, and the third holding just a list of deleted rows identified by the primary key. You then have to use a view to reconcile the content for queries.

Doesn't Read-Only make a difference for SQL Server?

I’ve been tasked with optimizing a rather nasty stored procedure in a legacy system. It’s a database dedicated to search, and a new copy is being generate every day, with a lot of complex joins being de-normalized. No writes are being performed, only SELECTs, so I figured some easy improvements could be made by making the whole database read-only and changing the recovery model to “Simple”.
Much to my surprise, this didn’t help – at all! The stored procedure still takes the same amount of time of complete. If fact, I’m so surprised that I figured I did it wrong!
My questions:
Do I need to do anything other than setting “Database read-only” to “true”?
Am I wrong to expect significant performance improvement by making the database read-only?
Same for the recovery model: Shouldn’t “Simple” have some noticeable impact?
Are there other similar database-wide configurations that can improve performance in this scenario?
The stored procedure is huge, with temporary tables, 40+ tables joined in 20+ queries. But I’d like to optimize the database itself before I edit this proc.
Since no writes are performed by your SP, there is no reason to expect noticable performance improvement from changing recovery model and read-write mode.
As others mentioned, you should look into the query plan and optimize your queries.
Another hint: indexes in the database might get fragmented while the database is filled up. Since the data is not going to be modified any more, it might help to rebuild all the indexes with fillfactor 100 - this might help to get rid of fragmentation and to compact data.
Call this for each table in the database: ALTER INDEX ALL ON table_name REBUILD WITH (FILLFACTOR = 100).
Generally, I won't expect much of performance improvement from this, but it depends on the particular database.
Speaking of query optimization, there are very useful features in SQL Server 2005 and later: Execution Related and Index-Related Dynamic Management Views. In particular, sys.dm_exec_query_stats and missing indexes are of interest.
These give you almost the same information as Tuning Advisor, but using you real-life workload, so you don't need to simulate it and feed to the Advisor.
Have you tried using the Database Engine Tuning Advisor included in SQL Server? It will analyze your query and suggest new indexes that will improve the performance of the query. Some of them will be good, some will be bad (for example, I've seen it suggest adding every column in a table to an index, sometimes like 30 of them!), so I don't follow it blindly. Generally I'll add a few indexes and then retest, to find the suggestions that are the most important. I've used it to optimize many queries that I thought I had properly indexed, only to find I could get a lot more performance out of them.
I had a similar setup, large stored procedures with lots of large temp tables.
Our problem was that the joins with and between the temp tables was very slow.
I recommend that you look at your execution plan and try to add relevant indexes to the temp tables too if you have not already.

Performance of bcp/BULK INSERT vs. Table-Valued Parameters

I'm about to have to rewrite some rather old code using SQL Server's BULK INSERT command because the schema has changed, and it occurred to me that maybe I should think about switching to a stored procedure with a TVP instead, but I'm wondering what effect it might have on performance.
Some background information that might help explain why I'm asking this question:
The data actually comes in via a web service. The web service writes a text file to a shared folder on the database server which in turn performs a BULK INSERT. This process was originally implemented on SQL Server 2000, and at the time there was really no alternative other than chucking a few hundred INSERT statements at the server, which actually was the original process and was a performance disaster.
The data is bulk inserted into a permanent staging table and then merged into a much larger table (after which it is deleted from the staging table).
The amount of data to insert is "large", but not "huge" - usually a few hundred rows, maybe 5-10k rows tops in rare instances. Therefore my gut feeling is that BULK INSERT being a non-logged operation won't make that big a difference (but of course I'm not sure, hence the question).
The insertion is actually part of a much larger pipelined batch process and needs to happen many times in succession; therefore performance is critical.
The reasons I would like to replace the BULK INSERT with a TVP are:
Writing the text file over NetBIOS is probably already costing some time, and it's pretty gruesome from an architectural perspective.
I believe that the staging table can (and should) be eliminated. The main reason it's there is that the inserted data needs to be used for a couple of other updates at the same time of insertion, and it's far costlier to attempt the update from the massive production table than it is to use an almost-empty staging table. With a TVP, the parameter basically is the staging table, I can do anything I want with it before/after the main insert.
I could pretty much do away with dupe-checking, cleanup code, and all of the overhead associated with bulk inserts.
No need to worry about lock contention on the staging table or tempdb if the server gets a few of these transactions at once (we try to avoid it, but it happens).
I'm obviously going to profile this before putting anything into production, but I thought it might be a good idea to ask around first before I spend all that time, see if anybody has any stern warnings to issue about using TVPs for this purpose.
So - for anyone who's cozy enough with SQL Server 2008 to have tried or at least investigated this, what's the verdict? For inserts of, let's say, a few hundred to a few thousand rows, happening on a fairly frequent basis, do TVPs cut the mustard? Is there a significant difference in performance compared to bulk inserts?
Update: Now with 92% fewer question marks!
(AKA: Test Results)
The end result is now in production after what feels like a 36-stage deployment process. Both solutions were extensively tested:
Ripping out the shared-folder code and using the SqlBulkCopy class directly;
Switching to a Stored Procedure with TVPs.
Just so readers can get an idea of what exactly was tested, to allay any doubts as to the reliability of this data, here is a more detailed explanation of what this import process actually does:
Start with a temporal data sequence that is ordinarily about 20-50 data points (although it can sometimes be up a few hundred);
Do a whole bunch of crazy processing on it that's mostly independent of the database. This process is parallelized, so about 8-10 of the sequences in (1) are being processed at the same time. Each parallel process generates 3 additional sequences.
Take all 3 sequences and the original sequence and combine them into a batch.
Combine the batches from all 8-10 now-finished processing tasks into one big super-batch.
Import it using either the BULK INSERT strategy (see next step), or TVP strategy (skip to step 8).
Use the SqlBulkCopy class to dump the entire super-batch into 4 permanent staging tables.
Run a Stored Procedure that (a) performs a bunch of aggregation steps on 2 of the tables, including several JOIN conditions, and then (b) performs a MERGE on 6 production tables using both the aggregated and non-aggregated data. (Finished)
OR
Generate 4 DataTable objects containing the data to be merged; 3 of them contain CLR types which unfortunately aren't properly supported by ADO.NET TVPs, so they have to be shoved in as string representations, which hurts performance a bit.
Feed the TVPs to a Stored Procedure, which does essentially the same processing as (7), but directly with the received tables. (Finished)
The results were reasonably close, but the TVP approach ultimately performed better on average, even when the data exceeded 1000 rows by a small amount.
Note that this import process is run many thousands of times in succession, so it was very easy to get an average time simply by counting how many hours (yes, hours) it took to finish all of the merges.
Originally, an average merge took almost exactly 8 seconds to complete (under normal load). Removing the NetBIOS kludge and switching to SqlBulkCopy reduced the time to almost exactly 7 seconds. Switching to TVPs further reduced the time to 5.2 seconds per batch. That's a 35% improvement in throughput for a process whose running time is measured in hours - so not bad at all. It's also a ~25% improvement over SqlBulkCopy.
I am actually fairly confident that the true improvement was significantly more than this. During testing it became apparent that the final merge was no longer the critical path; instead, the Web Service that was doing all of the data processing was starting to buckle under the number of requests coming in. Neither the CPU nor the database I/O were really maxed out, and there was no significant locking activity. In some cases we were seeing a gap of a few idle seconds between successive merges. There was a slight gap, but much smaller (half a second or so) when using SqlBulkCopy. But I suppose that will become a tale for another day.
Conclusion: Table-Valued Parameters really do perform better than BULK INSERT operations for complex import+transform processes operating on mid-sized data sets.
I'd like to add one other point, just to assuage any apprehension on part of the folks who are pro-staging-tables. In a way, this entire service is one giant staging process. Every step of the process is heavily audited, so we don't need a staging table to determine why some particular merge failed (although in practice it almost never happens). All we have to do is set a debug flag in the service and it will break to the debugger or dump its data to a file instead of the database.
In other words, we already have more than enough insight into the process and don't need the safety of a staging table; the only reason we had the staging table in the first place was to avoid thrashing on all of the INSERT and UPDATE statements that we would have had to use otherwise. In the original process, the staging data only lived in the staging table for fractions of a second anyway, so it added no value in maintenance/maintainability terms.
Also note that we have not replaced every single BULK INSERT operation with TVPs. Several operations that deal with larger amounts of data and/or don't need to do anything special with the data other than throw it at the DB still use SqlBulkCopy. I am not suggesting that TVPs are a performance panacea, only that they succeeded over SqlBulkCopy in this specific instance involving several transforms between the initial staging and the final merge.
So there you have it. Point goes to TToni for finding the most relevant link, but I appreciate the other responses as well. Thanks again!
I don't really have experience with TVP yet, however there is an nice performance comparison chart vs. BULK INSERT in MSDN here.
They say that BULK INSERT has higher startup cost, but is faster thereafter. In a remote client scenario they draw the line at around 1000 rows (for "simple" server logic). Judging from their description I would say you should be fine with using TVP's. The performance hit - if any - is probably negligible and the architectural benefits seem very good.
Edit: On a side note you can avoid the server-local file and still use bulk copy by using the SqlBulkCopy object. Just populate a DataTable, and feed it into the "WriteToServer"-Method of an SqlBulkCopy instance. Easy to use, and very fast.
The chart mentioned with regards to the link provided in #TToni's answer needs to be taken in context. I am not sure how much actual research went into those recommendations (also note that the chart seems to only be available in the 2008 and 2008 R2 versions of that documentation).
On the other hand there is this whitepaper from the SQL Server Customer Advisory Team: Maximizing Throughput with TVP
I have been using TVPs since 2009 and have found, at least in my experience, that for anything other than simple insert into a destination table with no additional logic needs (which is rarely ever the case), then TVPs are typically the better option.
I tend to avoid staging tables as data validation should be done at the app layer. By using TVPs, that is easily accommodated and the TVP Table Variable in the stored procedure is, by its very nature, a localized staging table (hence no conflict with other processes running at the same time like you get when using a real table for staging).
Regarding the testing done in the Question, I think it could be shown to be even faster than what was originally found:
You should not be using a DataTable, unless your application has use for it outside of sending the values to the TVP. Using the IEnumerable<SqlDataRecord> interface is faster and uses less memory as you are not duplicating the collection in memory only to send it to the DB. I have this documented in the following places:
How can I insert 10 million records in the shortest time possible? (lots of extra info and links here as well)
Pass Dictionary<string,int> to Stored Procedure T-SQL
Streaming Data Into SQL Server 2008 From an Application (on SQLServerCentral.com ; free registration required)
TVPs are Table Variables and as such do not maintain statistics. Meaning, they report only having 1 row to the Query Optimizer. So, in your proc, either:
Use statement-level recompile on any queries using the TVP for anything other than a simple SELECT: OPTION (RECOMPILE)
Create a local temporary table (i.e. single #) and copy the contents of the TVP into the temp table
I think I'd still stick with a bulk insert approach. You may find that tempdb still gets hit using a TVP with a reasonable number of rows. This is my gut feeling, I can't say I've tested the performance of using TVP (I am interested in hearing others input too though)
You don't mention if you use .NET, but the approach that I've taken to optimise previous solutions was to do a bulk load of data using the SqlBulkCopy class - you don't need to write the data to a file first before loading, just give the SqlBulkCopy class (e.g.) a DataTable - that's the fastest way to insert data into the DB. 5-10K rows isn't much, I've used this for up to 750K rows. I suspect that in general, with a few hundred rows it wouldn't make a vast difference using a TVP. But scaling up would be limited IMHO.
Perhaps the new MERGE functionality in SQL 2008 would benefit you?
Also, if your existing staging table is a single table that is used for each instance of this process and you're worried about contention etc, have you considered creating a new "temporary" but physical staging table each time, then dropping it when it's finished with?
Note you can optimize the loading into this staging table, by populating it without any indexes. Then once populated, add any required indexes on at that point (FILLFACTOR=100 for optimal read performance, as at this point it will not be updated).
Staging tables are good! Really I wouldn't want to do it any other way. Why? Because data imports can change unexpectedly (And often in ways you can't foresee, like the time the columns were still called first name and last name but had the first name data in the last name column, for instance, to pick an example not at random.) Easy to research the problem with a staging table so you can see exactly what data was in the columns the import handled. Harder to find I think when you use an in memory table. I know a lot of people who do imports for a living as I do and all of them recommend using staging tables. I suspect there is a reason for this.
Further fixing a small schema change to a working process is easier and less time consuming than redesigning the process. If it is working and no one is willing to pay for hours to change it, then only fix what needs to be fixed due to the schema change. By changing the whole process, you introduce far more potential new bugs than by making a small change to an existing, tested working process.
And just how are you going to do away with all the data cleanup tasks? You may be doing them differently, but they still need to be done. Again, changing the process the way you describe is very risky.
Personally it sounds to me like you are just offended by using older techniques rather than getting the chance to play with new toys. You seem to have no real basis for wanting to change other than bulk insert is so 2000.

Resources