I have a fairly large tabular model. Most days it only takes 8-10 minutes to process, but every few days it takes 4-5 hours. Where would I even begin to troubleshoot this? I looked at the ETL logs, but all they show me is the step that asks for the model to be processed.
Someone mentioned checking whether it is processing in parallel or sequential mode, but I can't seem to find any setting for that in VS 2017, which is what I'm using.
I should also mention that when I process it manually, it takes the normal amount of time (8-10 minutes). It's only when the ETL job that processes it executes that I sometimes see this long processing time.
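One hedged suggestion for where to start: while a slow run is in progress, connect to the Analysis Services instance from SSMS and query its dynamic management views to see exactly what command the ETL sent and what the server is doing. The parallel-versus-sequential choice is usually part of the processing command itself (for example, the maxParallelism property of a TMSL sequence command) rather than a Visual Studio setting, so the captured command text should reveal it. A sketch, run in an MDX/DMV query window against the tabular instance:

SELECT * FROM $SYSTEM.DISCOVER_COMMANDS;   -- what is executing now, its start time and command text
SELECT * FROM $SYSTEM.DISCOVER_SESSIONS;   -- which sessions are open and how long they have been busy
SELECT * FROM $SYSTEM.DISCOVER_LOCKS;      -- whether the processing command is waiting on a lock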
Related
We have a batch process that executes every day. This week, a job that usually does not take more than 18 minutes of execution time (real time) is now taking more than 45 minutes to finish.
The FULLSTIMER option is already active, but we don't know why only the real time increased.
In older documentation there are FULLSTIMER statistics that could help identify the problem, but they do not appear in the batch log (statistics such as Page Faults, Context Switches, Block Operations, and so on).
It might be an I/O issue. Does anyone know how we can identify whether it really is an I/O problem or whether it could be some other issue (the network, for example)?
To be more specific, this is one of the queries whose run time has increased dramatically. It reads from a database (SQL Server, VAULT schema) and from WORK, and writes to the WORK directory.
The number of observations is almost the same.
We asked the customer about any change in network traffic, and they said it is still the same.
Thanks in advance.
For a process to complete, much more needs to be done than the actual calculations on the CPU.
Your data has to be read and your results have to be written.
You might have to wait for other processes to finish first, and if your process includes multiple steps, writing to and reading from disk each time, you will also have to wait for the CPU each time.
In our situation, if real time is much larger than CPU time, we usually see heavy traffic to our Network File System (NFS).
As a programmer, you might notice that storing intermediate results in WORK is more efficient than storing them in remote libraries.
You might save a lot of time by creating intermediate results as views instead of tables, IF you only use them once. That is not only possible in SQL, but also in data steps like the one below (a PROC SQL equivalent is sketched after it):
data MY_RESULT / view=MY_RESULT;  /* view: only the query is stored; rows are produced when MY_RESULT is read */
  set MY_DATA;
  where transaction_date between '1jan2022'd and '30jun2022'd;
run;
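A minimal PROC SQL sketch of the same idea, using the same placeholder names MY_DATA and MY_RESULT:

proc sql;
  /* create view stores only the query; it is evaluated when the view is read */
  create view MY_RESULT as
    select *
    from MY_DATA
    where transaction_date between '1jan2022'd and '30jun2022'd;
quit;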
I am working on a report to monitor certain things on Power BI Report Server. I was wondering what items others may be monitoring on reports and how they do it.
Some examples of things I want to monitor:
A. Whether the scheduled data refreshes failed or succeeded.
Would love to be able to get the failure message.
B. What is the average response time of a query.
Is there a way to determine when the report is first opened? I would like to calculate the initial load time.
C. What was the longest response time of a query per day.
D. How many times a query took longer than 5 seconds.
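For items B, C and D, one hedged starting point is the ExecutionLog3 view in the Report Server catalog database, which records per-execution timings and status (for item A, the catalog's subscription tables may also be worth a look, though I am less sure of the exact schema there). A sketch, assuming the default catalog database name ReportServer:

-- Daily roll-up of report response times from the execution log
USE ReportServer;
SELECT
    CAST(TimeStart AS date)                                  AS run_date,
    ItemPath,
    AVG(TimeDataRetrieval + TimeProcessing + TimeRendering)  AS avg_response_ms,  -- item B
    MAX(TimeDataRetrieval + TimeProcessing + TimeRendering)  AS max_response_ms,  -- item C
    SUM(CASE WHEN TimeDataRetrieval + TimeProcessing + TimeRendering > 5000
             THEN 1 ELSE 0 END)                              AS runs_over_5s      -- item D
FROM dbo.ExecutionLog3
GROUP BY CAST(TimeStart AS date), ItemPath
ORDER BY run_date, ItemPath;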
We have a big package that ALWAYS encounters performance issues. We get an average of 6-10 tickets raised for this issue in a month. Sometimes the program runs successfully in minutes; sometimes it runs for days only to fail with an unexplained error.
I started to look deeper into this and found that there are a number of possible causes of the performance issues, such as numerous untuned SQL statements, bad coding practices, etc.
One thing that struck me today is that the code calls Gather Table Statistics multiple times, in multiple places, before doing some big operation (such as a huge SELECT statement or a lot of DML statements).
This program is run on a daily, weekly and monthly basis, depending on the organization's practices.
Unfortunately, I am unable to replicate the performance issue to learn more about this, but I am guessing that running Gather Table Statistics on multiple tables, multiple times, can cause major performance issues in the program. I am unable to find any resources to back this idea up. Can someone confirm?
Yes, I can confirm: I have seen code that spends 80% of its runtime gathering stats. Given your constraints, I'd try the following, in this order:
1. Have a look at the DELETE statements to check whether they can be replaced by TRUNCATE TABLE.
2. Gather stats once the tables are filled, lock their stats, and comment out any other gather_table_stats calls (a sketch follows this list). The assumption is that the data will not differ widely enough from day to day or week to week to cause different query plans.
3. If that doesn't work, have a look at DBA_TAB_MODIFICATIONS to at least check whether the tables have changed enough since the last stats gathering.
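A minimal sketch of steps 2 and 3; SCHEMA_OWNER and BIG_TABLE are placeholder names:

-- Step 2: gather once after the load, then lock the stats.
-- Note: once stats are locked, any remaining gather_table_stats call on the table
-- raises ORA-20005 unless it passes force => TRUE, which is why those calls
-- should be commented out.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => 'SCHEMA_OWNER', tabname => 'BIG_TABLE');
  DBMS_STATS.LOCK_TABLE_STATS(ownname => 'SCHEMA_OWNER', tabname => 'BIG_TABLE');
END;
/

-- Step 3: flush the monitoring info, then see how much the tables have really changed
EXEC DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;
SELECT table_name, inserts, updates, deletes, truncated, timestamp
  FROM dba_tab_modifications
 WHERE table_owner = 'SCHEMA_OWNER';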
During my migration of CouchDB from 1.6.1 to 2.3.1, the couchup utility is taking a long time to rebuild views, and it has memory issues. The databases are in the 500 GB range and it is taking forever: it has been almost 5 to 6 days and it is still not complete. Is there any way to speed it up?
When trying to replicate, after 2-3 minutes of couchup running, CouchDB dies because of memory leak issues and then starts again. Replication will take around 10 days. For replication it shows a progress bar, but for rebuilding views it does not, so I don't know how much has been done.
CouchDB is installed on a RHEL Linux server.
Reducing backlog growth
As couchup encounters views that take longer than 5 seconds to rebuild, it carries on calling additional view URLs, triggering their rebuilds. Once a number of long-running view rebuilds are in flight, even rebuilds that would have been shorter will take at least 5 seconds, leading to a large backlog. If individual databases are large, or the map/reduce functions are very inefficient, it would probably be best to set the timeout to something like 5 minutes. If you see more than a couple of
Timeout, view is processing. Moving on.
messages, it is probably time to kill couchup and double the timeout.
Observing Index growth
By default view_index_dir is the same as the database directory, so if data is in /var/lib/couchdb/shards then /var/lib/couchdb is the configured directory and indexes are stored in /var/lib/couchdb/.shards. You can observe which index shard files are being created and growing, or move view_index_dir somewhere separate for easier observation.
What resources are running out?
You can tune CouchDB in general, but it is hard to say whether tuning will still be needed once the system is no longer rebuilding all indexes at once.
In particular, you would want to look for and disable any auto-compaction. Look at the files in /proc/[couchdb proc] to figure out the effective fd limits, how many files are open, and whether the crash happens around a specific number of open files. Due to sharding, the number of open files is usually a multiple of what it was in earlier versions.
Look at memory growth and figure out whether it stabilizes enough that adding swap could prevent the problem.
I have this problem that has been going on for months. I automate reports at my job; we use Oracle. I write a procedure, time it, and it runs in a few minutes. I then set it up for monthly runs.
And then every month, some report runs for hours. It's all the same queries that ran in a few minutes for months before, and all of a sudden they're taking hours to run.
I end up rewriting my procedures every now and then, and to me this defeats the purpose of automating. No one here can help me.
What am I doing wrong? How can I ensure that my queries will always take the same amount of time to run?
I did some research, and it says that in a correctly set up database with correct statistics you don't even have to use hints; everything should consistently run in about the same time.
Is this true? Or does everyone have this problem and everyone just rewrites their procedures whenever they run?
Sorry for 100 questions, I'm really frustrated about this.
My main question is: why does the same query take a drastically different amount of time (from minutes to hours) to run on different days?
There are three broad reasons that queries take longer at different times. Either you are getting different performance because the system is under a different sort of load, you are getting different performance because of data volume changes, or you are getting different performance because you are getting different query plans.
Different Data Volume
When you generate your initial timings, are you using data volumes that are similar to the volumes that your query will encounter when it is actually run? If you test a query on the first of the month and that query is getting all the data for the current month and performing a bunch of aggregations, you would expect that the query would get slower and slower over the course of the month because it has to process more and more data. Or you may have a query that runs quickly outside of month-end processing because various staging tables that it depends on only get populated at month end. If you are generating your initial timings in a test database, you'll very likely get different performance because test databases frequently have a small subset of the actual production data.
Different System Load
If I take a query and run it during the middle of the day against my data warehouse, there is a good chance that the data warehouse is mostly idle and therefore has lots of resources to give me to process the query. If I'm the only user, my query may run very quickly. If I try to run exactly the same query during the middle of the nightly load process, on the other hand, my query will be competing for resources with a number of other processes. Even if my query has to do exactly the same amount of work, it can easily take many times more clock time to run. If you are writing reports that will run at month end and they're all getting kicked off at roughly the same time, it's entirely possible that they're all competing with each other for the limited system resources available and that your system simply isn't sized for the load it needs to process.
Different system load can also encompass things like differences in what data is cached at any point in time. If I'm testing a particular query in prod and I run it a few times in a row, it is very likely that most of the data I'm interested in will be cached by Oracle, by the operating system, by the SAN, etc. That can make a dramatic difference in performance if every read is coming from one of the caches rather than requiring a disk read. If you run the same query later, after other work has flushed out most of the blocks your query is interested in, you may end up doing a ton of physical reads rather than being able to use the nicely warmed-up cache. There's not generally much you can do about this sort of thing: you may be able to cache more data or arrange for processes that need similar data to be run at similar times so that the cache is more efficient, but that is generally expensive and hard to do.
Different Query Plans
Over time, your query plan may also change because statistics have changed (or not changed depending on the statistic in question). Normally, that indicates that Oracle has found a more efficient plan or that your data volumes have changed and Oracle expects a different plan would be more efficient with the new data volume. If, however, you are giving Oracle bad statistics (if, for example, you have tables that get much larger during month-end processing but you gather statistics when the tables are almost empty), you may induce Oracle to choose a very bad query plan. Depending on the version of Oracle, there are various ways to force Oracle to use the same query plan. If you can drill down and figure out what the problem with statistics is, Oracle probably provides a way to give the optimizer better statistics.
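If you do need to force a particular plan and you are on 11g or later, one of those ways is a SQL plan baseline. A rough sketch (the SQL_ID 'abcd1234efgh5678' and plan hash value 1234567890 are placeholders for the statement and plan you actually want to keep):

DECLARE
  l_plans PLS_INTEGER;
BEGIN
  -- load the known-good plan from the cursor cache into a SQL plan baseline
  l_plans := DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE(
               sql_id          => 'abcd1234efgh5678',
               plan_hash_value => 1234567890);
  DBMS_OUTPUT.PUT_LINE('Plans loaded: ' || l_plans);
END;
/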
If you take a look at AWR/ASH data (if you have the appropriate licenses) or Statspack data (if your DBA has installed that), you should be able to figure out which camp your problems originate in. Are you getting different query plans for different executions? (You may need to capture a query plan from your initial benchmarks and compare it to the current plan, or you may need to increase your AWR retention to keep query plans for a few months, in order to see this.) Are you doing the same number of buffer gets over time but getting vastly different amounts of I/O waits? Do you see a lot of contention for resources from other sessions? If so, that probably indicates that the issue is different load at different times.
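As an example of the kind of check AWR makes possible (Diagnostics Pack license required; 'abcd1234efgh5678' is again a placeholder SQL_ID), this shows, per snapshot, whether the plan changed and how the elapsed time, buffer gets and disk reads moved:

SELECT sn.begin_interval_time,
       st.plan_hash_value,
       st.executions_delta,
       ROUND(st.elapsed_time_delta / 1e6) AS elapsed_seconds,
       st.buffer_gets_delta,
       st.disk_reads_delta
  FROM dba_hist_sqlstat  st
  JOIN dba_hist_snapshot sn
    ON sn.snap_id = st.snap_id
   AND sn.dbid = st.dbid
   AND sn.instance_number = st.instance_number
 WHERE st.sql_id = 'abcd1234efgh5678'
 ORDER BY sn.begin_interval_time;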
One possibility is that your execution plan is cached so it takes a short amount of time to rerun the query, but when the plan is no longer cached (like after the DB is restarted) it might take significantly longer.
I had a similar issue with Oracle a long while ago where a very complex query for a report ran against a very large amount of data, and it would take hours to complete the first time it was run after the DB was restarted, but after that it finished in a few minutes.
This is not an answer; it is a reply to Justin Cave. I couldn't format it in any readable way in the comments.
Different Data Volume
When ….. data.
Yes, I'm using the same archive tables that I then use for months to come. Of course the data changes, but it's a pretty consistent rise: for example, if a table has 10M rows this month, it might gain 100K rows the next, 200K the next, 100K the next, and so on. There are no drastic jumps as far as I know. And I'd understand if today the query took 2 minutes and next month it'd take 5, but not 3 hours. However, thank you for the idea; I will start counting rows in tables from month to month as well.
A question, though: how do people code to account for this? Let's say someone works with tables that get large amounts of data at random times. Is there a way to write the query to ensure the run times are at least in the ballpark? Or do people just put up with the fact that any month their reports might run 10-20 hours?
Different System Load
If I take a …. to process.
No, I run my queries on different days and times, but I have logs of the days and times, so I will see if I can find a pattern.
Different system load …hard to do.
So are you saying that the fast times I may be getting at report design time might be fast because of the things I ran on my computer previously?
Also, does the cache get stored on my computer, or on the database server under my login, or where?
Different Query Plans
Over time, your query plan … different load at different times.
Thank you for your explanations, you’ve given me enough to start digging.