I have two SQL Server Agent jobs that run XMLA commands to process SSAS objects on a SQL Server 2012 SSAS instance.
ProcessCubeFull: This is a weekly job that runs after the related SQL database has been reloaded from scratch. It does the following:
1. Process Dimensions, ProcessFull, uses MaxParallel="4", Time: 20 minutes
2. Process Partitions, ProcessFull, uses MaxParallel="4", Time: 85 minutes
3. Process Cubes, ProcessFull, uses MaxParallel="4", Time: 100 minutes
ProcessCubeUpdate: This is a daily job that runs after the related SQL database has had its daily update. It does the following:
1. Process Dimensions, ProcessUpdate, uses MaxParallel="4", Time: 100 minutes
2. Process Partitions, ProcessData, uses MaxParallel="4", Time: 15 minutes
3. Process Indexes, ProcessIndexes, uses MaxParallel="4", Time: 55 minutes
4. Process Cubes, ProcessDefault, uses MaxParallel="4", Time: 1 minute
The performance of these jobs is very slow and getting slower.
It also seems odd that for the dimensions ProcessFull is a lot faster than ProcessUpdate.
I would like to know how I can speed these jobs up, or whether they need some additional steps.
If your processing time is reasonable for your data size, I suggest you use multiple partitions for your objects.
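For example, each measure-group partition can be bound to its own source query covering just a slice of the fact table, so the daily job only has to fully process the slice that actually changed. A rough sketch, with invented table and key names:

-- Hypothetical partition source queries, slicing a large fact table by month.
-- Partition "FactSales 2015-05":
SELECT *
FROM   dbo.FactSales
WHERE  OrderDateKey >= 20150501 AND OrderDateKey < 20150601;

-- Partition "FactSales 2015-06":
SELECT *
FROM   dbo.FactSales
WHERE  OrderDateKey >= 20150601 AND OrderDateKey < 20150701;

With that layout, the ProcessData/ProcessFull work can be limited to the newest partition(s), while untouched historical partitions stay processed and need no further work.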
I'm running a job using Spring Batch 4.2.0 with Postgres (11.2) as the backend, all wrapped in a Spring Boot app. I have 5 steps, and each one uses a simple partitioning strategy to divide the data by id ranges and read it into partitions (which are processed by separate threads). There are about 18M rows in the table; each step reads all 18M rows, changes a few fields, and writes them back. The issue I'm facing is that the queries that pull data into each thread scan by id range, like:
select field_1, field_2, field_66 from table where id >= 1 and id < 10000.
In this case each thread processes 10,000 rows at a time. When there's no traffic the query takes less than a second to read all 10,000 rows. But when the job runs there are about 70 threads reading data in, and it gets progressively slower, up to almost a minute and a half. Any ideas where to start troubleshooting this?
I do see autovacuum running in the background for almost the whole duration of the job. The app definitely has enough memory to hold all that data (about 6GB max heap). Postgres has shared_buffers at 2GB and max_wal_size at 2GB, but I'm not sure whether that in itself is sufficient. Another thing I see when checking pg_stat_activity is loads of COMMIT queries hanging around, usually as many as the number of partitions. So instead of 70 connections being used by 70 partitions, there are 140 connections in use, with 70 of them running COMMIT. As time progresses these COMMITs get progressively slower too.
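For reference, a pg_stat_activity check along these lines (just a sketch using standard PostgreSQL 11 columns) shows what those extra COMMIT sessions are actually waiting on:

-- List sessions for the current database with their state, wait event and
-- transaction age, oldest transactions first.
SELECT pid,
       state,
       wait_event_type,
       wait_event,
       now() - xact_start AS xact_age,
       left(query, 80)    AS current_query
FROM   pg_stat_activity
WHERE  datname = current_database()
ORDER  BY xact_age DESC NULLS LAST;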
You are probably hitting https://github.com/spring-projects/spring-batch/issues/3634.
This issue has been fixed and will be part of version 4.2.3 planned to be released this week.
A scheduler job runs at 11PM every night to delete around 500,000 records in the source Oracle Database 12c. During this time, replication lag on the target database suddenly increases from 4 seconds to 900 seconds and keeps increasing to 7500 seconds until 3AM (the end time of the scheduler job). After that the lag starts to gradually decrease and reaches 4 seconds again at 4AM. Is this normal behaviour for Oracle GoldenGate when housekeeping scheduler jobs run in production databases?
From your description it seems that the lag is caused by the 500k-row delete operation.
You need to check where the lag comes from. You might use the built-in heartbeat functionality in OGG; it helps a lot. Depending on the source of the lag:
Extract: If the delete goes through as one DML operation, divide it into smaller chunks (see the sketch after this list).
Network/hardware: Do some network tuning, e.g. increase the packet size, and check the hardware load.
Replicat: If the smaller chunks are still applied slowly, try a parallel form of replication such as Parallel Replicat or Integrated Replicat. You might also consider Coordinated Replicat mode if this delete can be applied independently of transaction order.
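For the Extract point, the chunked delete could look roughly like the PL/SQL below; the table name and retention predicate are invented, only the pattern of small DML plus a commit per chunk matters:

-- Hypothetical sketch: delete ~500k rows in 10,000-row chunks so each
-- transaction stays small enough for Extract and Replicat to keep up.
BEGIN
  LOOP
    DELETE FROM app_history                       -- invented table name
     WHERE created_at < ADD_MONTHS(SYSDATE, -12)  -- invented retention rule
       AND ROWNUM <= 10000;
    EXIT WHEN SQL%ROWCOUNT = 0;
    COMMIT;
  END LOOP;
  COMMIT;
END;
/

Smaller transactions also let the target apply the work as it arrives instead of waiting for one huge transaction to complete.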
We have an application running that makes intensive use of EF version 4.1. Right now an upgrade is not scheduled, but that is not the point.
We are going through a major optimization effort, as the workload on the SQL Server database is very high. I finally succeeded (after many failed attempts) in profiling the queries being sent (we are on SQL Azure), and I discovered that a lot of awful queries were generated because of careless use of Includes. We removed all the avoidable Includes, substituting them with direct queries on the sets (i.e. 10 queries instead of 1 as before), and the total workload on SQL Server has dropped dramatically.
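To make the change concrete, the shapes of the two approaches look roughly like this (the table names are invented; the point is that a multi-Include query joins everything into one wide, heavily duplicated result set, while the replacement is several narrow queries):

-- OLD: one wide query generated by the Includes; every parent row is
-- repeated for each combination of child rows, so the result set explodes.
SELECT o.*, c.*, l.*
FROM   Orders     o
JOIN   Customers  c ON c.Id      = o.CustomerId
JOIN   OrderLines l ON l.OrderId = o.Id
WHERE  o.Id = @orderId;

-- NEW: one narrow query per set; far fewer duplicated bytes on the wire,
-- at the cost of one extra round trip per set.
SELECT * FROM Orders     WHERE Id      = @orderId;
SELECT * FROM Customers  WHERE Id      = @customerId;
SELECT * FROM OrderLines WHERE OrderId = @orderId;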
But then I discovered that the overall code is running slower. How can this be possible? I started tracing and discovered that even when a query was processed in less than 50ms, EF takes more than 1 second to materialize the entities. That's awful: it's two orders of magnitude more than a plain SQL query through ADO.NET.
I tried disabling entity tracking by querying the set with AsNoTracking(), but nothing seems to have changed.
I had never had the chance to analyze how the total time splits between query processing and object materialization, but I never imagined the impact would be this big. Is there anything more I can try?
--
Update 2015/06/15 - Performance Analysis
I ran 7 different sets of tests, and each test was run 10 times. I tracked the total time to execute the .NET code using the old version (1 huge query with a lot of Includes) and the new version (1 small query for each old Include).
I would say that the problem is, as Jeroen Vannevel said in the comment, the latency of the multiple request/response round trips.
Here are the summarized results. Please consider that when I write "item" I mean a complex entity, composed of at least 10 different entities taken from 10 different DB tables.
GET of 1 item from the DB
OLD method: 3.5 seconds average
NEW method: 276 milliseconds average
GET of 10 items
OLD method: 3.6 seconds average
NEW method: 340 milliseconds average (+23%)
GET of 50 items
OLD method: 4.0 seconds average
NEW method: 440 milliseconds average (+30%)
GET of 100 items
OLD method: 4.5 seconds average
NEW method: 595 milliseconds average (+35%)
GET of 250 items
OLD method: 5.7 seconds average
NEW method: 948 milliseconds average (+60%)
GET of 500 items
OLD method: 5.9 seconds average
NEW method: 1.8 seconds average (almost +100%)
GET of 1000 items
OLD method: 6.9 seconds average
NEW method: 5.1 seconds average (over +180%)
So it seems that, while the OLD method is very slow from the very first execution because of the heavy query, its time grows "slowly" as I get more and more items from the DB.
I would say that as soon as the result set grows, the time to transfer it to the calling machine becomes higher than the DB time.
Could this really be the reason?
Thanks
Marco
I have a driver program that runs a set of 5 experiments - basically the driver program just tells the program which dataset to use (there are 5 of them, and they're very similar).
The first iteration takes 3.5 minutes, the second 6 minutes, the third 30 minutes and the fourth has been running for over 30 minutes.
After each run the SparkContext object is stopped and then re-started for the next run. I thought this approach would prevent the slowdown, as I was under the impression that when sc.stop is called the instances are cleared of all their RDD data - at least that's how it works in local mode. The dataset is quite small, and according to the Spark UI only 20MB of data on 2 nodes is used.
Does sc.stop not remove all data from a node? What would cause such a slowdown?
Call sc.stop only after all iterations are complete. Whenever we stop the SparkContext and create a new one, it takes time to load the Spark configuration and jars and to free the driver port before executing the next job.
and
Using the --executor-memory config you can speed up the process, depending on how much memory you have on each node.
Stupidly, I had used T2 instances. Their burstable performance means they only run at full power for a short amount of time. Read the documentation thoroughly - lesson learnt!
In the Client Statistics window in SQL Server Management Studio, I get the total execution time.
However, this time is often much less than the time the query actually took.
So what is the additional time spent on?
For example, here I got ~5.6 seconds of total execution time, but my query took 13 seconds to finish.
The total execution time is the time until the result is available for display. But then, depending on the result set size and the way you display the data, the time until everything has been rendered is usually much higher.
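One way to see that split for yourself is to measure the server-side portion separately, for example with SET STATISTICS TIME (the table below is just a placeholder for your own query):

-- Compare pure server-side time with the wall-clock time shown in the
-- status bar / Client Statistics window.
SET STATISTICS TIME ON;

SELECT *
FROM   dbo.SomeLargeTable;   -- placeholder: run the slow 13-second query here

SET STATISTICS TIME OFF;

The "SQL Server Execution Times" output in the Messages tab covers parsing and execution on the server; the remaining wall-clock time is mostly spent transferring the rows to the client and rendering them in the results grid. Ticking "Discard results after execution" in Query Options is another quick way to confirm how much of the 13 seconds is grid rendering.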