Tableau running slow / queries taking a long time - performance

To speed up processing time in Tableau: is it best to combine multiple calculated fields into one calculated field, or to break the equation out into pieces?
Thanks!

Neither option is related to optimization. However, it is good practice to hide unused fields in the "Tables" sidebar.

Related

How to detect which fields are affecting the result of a feature with machine learning?

First, let me illustrate the scenario. I have a dataset like:
ProductID, ProductType, MachineID, MachineModel, MachineSpeed, RejectDate, RejectVolume, etc.
I want to find which field(s) are responsible for the increase in my RejectVolume. Also, in this scenario every product has a RejectVolume; the values are nonzero and continuous, but vary. With that knowledge I can identify the cause(s) and find a way to reduce RejectVolume.
Can you give me any ideas for creating the model?
Thank you.
You want to look at feature selection methods.
In this scenario you could start with linear regression using Lasso for feature selection. This works by successively increasing the Lasso regularization term, which drives the weights of unimportant features toward zero, leaving you with the features that have the most impact.
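A minimal sketch of that idea with scikit-learn, on synthetic data (the real dataset's columns like MachineSpeed would first need to be numeric or encoded; everything below is illustrative, not the asker's actual data):

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500
# Three stand-in numeric features (e.g. MachineSpeed plus two encoded fields)
X = rng.normal(size=(n, 3))
# Synthetic target: RejectVolume driven almost entirely by the first feature
y = 5.0 * X[:, 0] + 0.1 * rng.normal(size=n)

# Standardize so the L1 penalty treats all features on the same scale
X_scaled = StandardScaler().fit_transform(X)

# Increase the regularization term: unimportant coefficients hit zero first,
# and whatever survives at higher alpha is what matters most.
for alpha in (0.01, 0.1, 1.0):
    model = Lasso(alpha=alpha).fit(X_scaled, y)
    print(alpha, np.round(model.coef_, 3))
```

At each step the coefficients that Lasso zeroes out are candidates to drop; the features still holding large weights are the likely drivers of RejectVolume.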

SSAS incremental dimension processing

I have a large dimension, and it is taking me longer and longer to process. I would like to decrease the processing time as much as possible.
There are literally hundreds of articles on how to process SSAS objects as efficiently and quickly as possible.
There are lots of tips and tricks one can apply to speed up dimension and cube processing. I have managed to apply all, or at least a large majority, of them and I am still not happy with the result.
I have a large dimension built on top of a table.
It has around 60 million records and it keeps growing fast.
New rows are added and existing rows are deleted; there are no updates.
I am looking for a solution that will allow me to perform incremental processing of my dimension.
I know that the data from the previous month will not change. I would like to do something similar to partitioning a cube, but on the dimension.
I am using SQL Server 2012, and to my knowledge dimension partitioning is not supported.
I am currently using ProcessUpdate on my dimension. I tried processing by attribute and by table, but both yield almost the same result. I have hierarchies and relationships, some set to rigid, and I am only using the attributes that are truly needed, etc.
ProcessUpdate has to read all the records in a dimension, even those I know have not changed. Is there a way to partition a dimension? If I could tell SSAS to only process the last 3-4 weeks of data in my dimension and not touch the rest, it would greatly speed up my processing time.
I would appreciate your help.
OK, so I did a bit of research, and I can confirm that incremental dimension processing is not supported.
It is possible to do ProcessAdd on a dimension, but not if you have records that were deleted or updated.
It would be a useful feature, but MS hasn't developed it, and I don't think it will.
Incremental processing of any table is, however, possible in tabular cubes.
So if you have a similar requirement and your cube is not too complex, creating a tabular cube is the way to go.
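For the tabular route, an incremental (ProcessAdd-style) refresh can be issued as a TMSL command, which is just JSON (note TMSL requires SQL Server 2016+ tabular models; the database and table names below are placeholders, not from the question). Built here as a Python dict for illustration:

```python
import json

# Hypothetical TMSL refresh command: type "add" appends newly loaded rows
# to the table's existing data instead of reprocessing everything.
refresh_command = {
    "refresh": {
        "type": "add",                      # incremental: add new rows only
        "objects": [
            {
                "database": "SalesTabular",      # placeholder model name
                "table": "DimLargeDimension",    # placeholder table name
            }
        ],
    }
}
print(json.dumps(refresh_command, indent=2))
```

The resulting JSON would then be executed against the server, for example from an XMLA query window in SSMS or via PowerShell.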

How are you supposed to use the SSAS Tabular measures grid?

As far as I can tell it doesn't do anything, and the best thing to do is to add all measures as a list in one column - but then you'd have a "measures list" instead of a grid... so what does the "grid" part do?
The main use I can imagine is that you can put a measure below the column to which it somehow refers. This is obviously the case for the standard scenario of a measure being the sum of a specific column. Other than that, I agree there is no real use.

MongoDB text index search slow for common words in large table

I am hosting a MongoDB database for a service that supports full-text search on a collection with 6.8 million records.
Its text index includes ten fields with varying weights.
Most searches take less than a second. Some take two to three seconds. However, some searches take 15-60 seconds! Those 15-60 second cases are unacceptable for my application, and I need to find a way to speed them up.
Searching takes 15-60 seconds when words that are very common in the index are used in the search query.
It seems that the text search feature does not support lazy evaluation of parameters. My first thought was to cache a list of the 50 most common words in my text index and then ask MongoDB to evaluate those last ("lazily"), on top of the filtered results returned by the less common parameters. Hopefully people are still with me. For example, say I have the query "products chocolate", where "products" is common and "chocolate" is uncommon. I would like to ask MongoDB to evaluate "chocolate" first, and then filter those results with the "products" term. Does anyone know of a way to achieve this?
I can achieve the above by omitting the most common words (e.g. "products") from the db query and then reapplying the common-term filter on the application side after it has received the records found by the db. I would prefer all query logic to happen on the database, but I am open to application-side processing for a speed payoff.
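The application-side half of that workaround can be sketched in plain Python (the common-word list and field names here are made up for illustration; the db round-trip is stubbed with a static result list):

```python
# Hypothetical cached top-common-words list from the text index
COMMON_WORDS = {"products", "item", "shop"}

def split_terms(query):
    """Split a search string into uncommon terms (sent to the db)
    and common terms (re-applied locally)."""
    terms = query.lower().split()
    uncommon = [t for t in terms if t not in COMMON_WORDS]
    common = [t for t in terms if t in COMMON_WORDS]
    return uncommon, common

def filter_common(docs, common_terms, fields=("name", "description")):
    """Keep only docs whose indexed fields contain every common term."""
    def matches(doc):
        text = " ".join(str(doc.get(f, "")) for f in fields).lower()
        return all(t in text for t in common_terms)
    return [d for d in docs if matches(d)]

uncommon, common = split_terms("products chocolate")
# The db query would use only `uncommon` (just "chocolate"); pretend it
# returned these two documents:
docs = [
    {"name": "chocolate products", "description": "dark bar"},
    {"name": "chocolate recipe", "description": "homemade"},
]
print(filter_common(docs, common))  # only the first doc survives
```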
There are still some holes in this design. If a user searches only common terms, I have no choice but to hit the database with all of them. From preliminary reading, I gather that it is not recommended (or not supported) to have multiple text indexes (with different names) on the same collection. My plan is to create two identical collections, each with my 6.8M records, but with different indexes - one for common words and one for uncommon words. This feels kludgy and clunky, but I am willing to do it for a speed increase.
Does anyone have any insight and/or advice on how to speed up this system? I'd like as much processing as possible to happen on the database to keep it fast. I'm sure my little 6.8M-record collection is not the largest MongoDB has seen. Thanks!
Well, I worked around these performance issues by letting the MongoDB full-text search run in OR-based form. I prioritize my results by fine-tuning the weights on my indexed fields and simply ordering by rank. I do get more results than desired, but that's not a huge problem, because the highly weighted results at the top will most likely be consumed before the user reaches the less relevant results at the bottom.
If anyone is struggling with MongoDB text search performance using AND-only searching, just switch back to OR and control your results using weights. It performs leaps and bounds better.
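For reference, MongoDB's `$text` operator already treats space-separated terms as OR, so the approach above amounts to searching all terms and sorting by `textScore` (which reflects the index's field weights). A sketch of the query documents involved, built as plain dicts (the actual `find` call, commented out, would need a live server and a text index):

```python
def build_text_query(search_string):
    """Build the query, projection, and sort spec for a weighted OR search."""
    query = {"$text": {"$search": search_string}}        # terms OR'd by default
    projection = {"score": {"$meta": "textScore"}}       # expose the rank
    sort_spec = [("score", {"$meta": "textScore"})]      # best matches first
    return query, projection, sort_spec

query, projection, sort_spec = build_text_query("products chocolate")
print(query)
# With pymongo this would run as:
# cursor = collection.find(query, projection).sort(sort_spec)
```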
hth
This is the same issue as $all versus $in: $all only uses the index for the first keyword in the array. I believe you're seeing the same thing here, which is why OR (a.k.a. $in) works for you.
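The semantic difference between the two operators, expressed as plain Python predicates (a paraphrase for illustration, not MongoDB's implementation):

```python
def matches_all(tags, keywords):
    # Analogous to {"tags": {"$all": keywords}}: every keyword must match
    return all(k in tags for k in keywords)

def matches_in(tags, keywords):
    # Analogous to {"tags": {"$in": keywords}}: any one keyword suffices
    return any(k in tags for k in keywords)

doc_tags = ["chocolate", "dark"]
print(matches_all(doc_tags, ["chocolate", "products"]))  # False
print(matches_in(doc_tags, ["chocolate", "products"]))   # True
```

That "any match" behavior is what lets the OR-style search answer from the index quickly instead of intersecting every term's result set.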

SSAS aggregation not being used

So I have a fairly hefty cube that won't be much good without aggregations. I'm still in the dev phase, so I'm manually attempting usage-based aggregation design, aggregating some of the main queries we've written. However, every time I run these, it looks like SSAS reads through every partition the query hits (the biggest measure groups are partitioned monthly).
I decided to try to narrow it down - after all, it might just be the queries, or a blip, or what have you. So, using SQL Server Profiler and BIDS Helper, I created one and only one aggregation on one of my measure groups. I then ran the query and watched Profiler: it again hit every single partition and didn't read a thing from an aggregation.
My only guess is that this is because the measure being pulled back has a measure expression (currency conversion). Anybody got any ideas?
As pointed out in the Identifying Bottlenecks whitepaper, measure expressions invalidate aggregations. Once I removed all measure expressions from the measure group, the aggregations were again in use. Hoorah!
