Is there some convenient way to measure DynamoDB performance, preferably locally in unit tests? The aim is to avoid accidentally doing full table scans.
We are planning to use a CONTEXT index for full-text search in Oracle 12c Standard Edition.
The data the search will run on is a JSON document containing one channel post and its replies from a third-party tool, loaded into our database (basically, all the chats and replies, including attributes like timestamp and user, are stored in this table).
We are expecting about 50k rows of data per year and 100-150 DML operations per day. Our index is currently "SYNC ON COMMIT", so what are the recommendations for optimizing the Oracle Text index?
First, let me preface my response with a disclaimer: I am currently exploring Oracle Text as part of a POC, and my knowledge is somewhat limited as we're still in the research phase. Additionally, our datasets are in the tens of millions of rows with 100k DML operations daily.
From what I've read, the Oracle docs suggest scheduling both a FULL and a REBUILD optimization for indexes that incur DML, so I have implemented the following in our dev environment.
execute ctx_ddl.optimize_index('channel_json_ctx_idx', 'FULL'); --run daily
execute ctx_ddl.optimize_index('channel_json_ctx_idx', 'REBUILD'); --run weekly
I cannot imagine, with the dataset you've described, that your index will really become fragmented enough to cause performance issues. You could probably get away with less frequent optimizations than the ones above.
You could even forgo scheduling the optimization altogether and benchmark your performance. If you see it start to degrade, note the timespan and perhaps the count of DML operations for reference. Then run a 'FULL' optimization and test performance. If performance improves, create a schedule. If it does not improve, run a 'REBUILD' and test performance again. Assuming performance improves, you could schedule the 'REBUILD' for that time range and consider adding a more frequent 'FULL'. If you do end up scheduling either pass, a sketch of one way to automate it follows below.
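A minimal DBMS_SCHEDULER sketch, assuming the index name from above; the job name and the 2 AM window are placeholders:

begin
  dbms_scheduler.create_job(
    job_name        => 'optimize_channel_ctx_daily',  -- hypothetical job name
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'begin ctx_ddl.optimize_index(''channel_json_ctx_idx'', ''FULL''); end;',
    repeat_interval => 'FREQ=DAILY;BYHOUR=2',         -- run daily at 2 AM
    enabled         => TRUE);
end;
/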
I would like to know what performance impact I should expect when invoking a UDF (user-defined function) written in C every time a record is created or changed (assuming the UDF code itself takes no time; I will optimize that on my own).
Let's say I have hardware capable of running an SSD-persisted namespace at 200k writes/s. Can I expect at least 50k writes/s with the UDF invoked on every write?
Subquestion: what might limit the UDF's performance (context switching?)
The reason for asking is that Aerospike uses such UDFs e.g. for Large Data Types, but according to Aerospike staff those are not highly performant compared to KVS operations. My use case is to use UDFs to keep a broad range of secondary indices within a Redis Cluster up to date, allowing for much richer real-time queries (e.g. intersections/unions of 5-10 secondary indices).
The best thing is to run the test yourself; it's hard to predict. But I believe that you should be able to do 50k TPS.
The UDF's performance is mainly affected by the memory allocations that happen under the hood before calling the UDF. If you are using simple datatypes like int/string/blob, you are better off. If you use a list or map in the UDF, it will do more memory allocations, which will impact performance.
I am new to AWS Redshift. Although I have read up on the concepts, I want to know how to proceed with load testing in Redshift. I am very comfortable with The Grinder, but rather confused about how to use it with Redshift.
My basic requirement is to push a certain number of rows and measure query and server performance. I have done a lot of performance reviews in the cloud where MySQL, Cassandra etc. have been deployed. Please help me find a concept or tool to start the load testing with.
Grinder is inappropriate for Redshift, as it is designed to test the impact of multiple concurrent users reading and writing data; per the docs, Redshift performs poorly for single-row inserts.
For Redshift you should be evaluating performance at representative data sizes and query complexity. Look into the Star Schema Benchmark or TPC-H.
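To give a flavour of the workload those benchmarks exercise, here is an aggregation in the style of TPC-H query 1; the lineitem table belongs to the TPC-H schema and the cutoff date is illustrative:

-- Pricing-summary aggregation in the style of TPC-H query 1
select l_returnflag,
       l_linestatus,
       sum(l_quantity)                         as sum_qty,
       sum(l_extendedprice)                    as sum_base_price,
       sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
       avg(l_discount)                         as avg_disc,
       count(*)                                as count_order
from lineitem
where l_shipdate <= date '1998-09-02'
group by l_returnflag, l_linestatus
order by l_returnflag, l_linestatus;

Scanning, aggregating, and sorting with queries like this at different data scales tells you far more about Redshift than a stream of small concurrent writes would.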
I wanted to know if there is a way to measure the performance of a function, in parts.
I know that you can measure the total time it takes to complete the function, but is there a way to measure the individual queries within a function?
I ask because I cannot find the bottleneck in my function's performance.
Most of the time, when you see a major difference between the estimated and the actual execution plans, it is because your statistics have never been updated. SQL Server therefore has no idea which tables have little data, which ones are huge, and so on, and is more likely to generate bogus plans (both estimated and actual), or to miscalculate estimated plan costs. The actual plan is based on real, accurate costs, but when the plan is very far from an optimal one, this accuracy is of very little value for determining bottlenecks.
To correct this, issue the UPDATE STATISTICS statement or execute the sp_updatestats procedure.
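For example (dbo.Orders is a placeholder for one of your own tables):

-- Rebuild statistics for a single table, sampling every row
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;

-- Or refresh out-of-date statistics across the whole database
EXEC sp_updatestats;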
Seeing 100% actuals for your function might well be an effect of an empty or almost empty database, regardless of whether your statistics are up to date.
When optimizing for performance, make sure that your database is populated quasi-realistically with lots of data (put twice as many records into each table as you expect for production, but maintain the expected rough proportions). There is not much point in looking for a performance bottleneck using an empty or a disproportionately overblown database: query plans will be different, and even if the plan happens to be the same, the bottleneck may be elsewhere than in production. A sketch of one way to generate that volume follows below.
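As a rough sketch of what that population step can look like (dbo.Orders and its columns are hypothetical), a tally-style CROSS JOIN generates volume quickly:

-- Generate ~1M synthetic rows so plans are computed against realistic volumes
;with n as (
    select top (1000000) row_number() over (order by (select null)) as i
    from sys.all_objects a cross join sys.all_objects b
)
insert into dbo.Orders (CustomerId, OrderDate, Amount)
select i % 50000 + 1,                                     -- ~50k distinct customers
       dateadd(day, -(i % 730), cast(getdate() as date)), -- spread over ~2 years
       (i % 1000) + 0.99                                  -- varied amounts
from n;

Remember to refresh statistics after a bulk load like this, or the plans will again be based on stale row counts.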
Is there a way to force Oracle to "see" a table and its associated indexes as being bigger than they really are?
In other words, is there a way to "fake" database statistics, so the cost based optimizer would make decisions on a nearly-empty database, that are closer to decisions that would be made in a real, big production database?
The idea is to be able to experiment (vis-à-vis execution plans) with various indexing/querying/(de)normalization strategies very early in the database design process, without wasting time writing code that fills the database with representative test data (most of which will end up being discarded anyway, since the database design is still not settled).
Importing statistics is out of the question, since the production database does not even exist yet.
Sure. The DBMS_STATS package has a number of procedures that allow you to force statistics on objects; there are dbms_stats.set_table_stats and dbms_stats.set_index_stats procedures, for example.
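A minimal sketch, assuming a hypothetical table APP.ORDERS with an index ORDERS_PK, that makes the optimizer believe the table holds a million rows:

begin
  -- Pretend ORDERS holds 1M rows spread over 50k blocks
  dbms_stats.set_table_stats(
    ownname => 'APP', tabname => 'ORDERS',
    numrows => 1000000, numblks => 50000, avgrlen => 120);

  -- Pretend the index covers those rows with 1M distinct keys
  dbms_stats.set_index_stats(
    ownname => 'APP', indname => 'ORDERS_PK',
    numrows => 1000000, numlblks => 3000, numdist => 1000000);
end;
/

Once the fake statistics are in place, EXPLAIN PLAN on the near-empty schema should start producing the kind of decisions a production-sized database would; dbms_stats.lock_table_stats can keep an automatic statistics job from overwriting them.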