How to test the tps of the private ETH blockchain? - performance

I deploy the private ETH blockchain on my laptop, and I want to know how many transactions per second(tps) of it.
Are there some tools to test tps of the blockchain system?

The performance of the blockchain in tps depends on many conditions - both the mining settings of the blockchain itself and the kind of the transactions being executed. Mining settings include, for example, the block generation frequency and the gas limit per block. The "weight" of transactions can also be different - the "heaviest" are transactions for creating smart contracts, the "lightest" are simple Đ•ther transfer transactions.
The easiest way to check is to start generating transactions of a certain type with some frequency and track how many of them will end up in blocks. Since the time for sending a transaction at a node is usually quite long, it is better to create a stream of transactions from several nodes to create a significant load.
The exact mechanism for generating transactions depends on which software platform you want to use.

Related

How to load test 10k requests per second using jmeter?

I need to load test my website with 10k req/sec for 1 hour using JMeter. I am confused with the values of loop count, number of thread, ramp-up period and duration.
Also will my laptop (i5 8GB) be able to do that? If not what is the alternative.
PS: I checked every question/answer on stackoverflow for this but I couldn't find any help. Please dont mark it repeated question.
You can use "Constant Throughput Timer" and define target throughput and select throughput based on "all active threads".
Define maximum number of users count in your script so that it will be enough for 10K req/sec.
Also if you are using windows machine then I think you will face this issue "https://www.baselogic.com/2011/11/23/solved-java-net-bindexception-address-use-connect-issue-windows/"
I will recommend to use distributed testing or use more than 1 machine.
The easiest way of configuring JMeter to send X requests per second is using either Precise Troughput Timer or Throughput Shaping Timer in combination with the Concurrency Thread Group. The number of threads needs to be sufficient, the exact number mainly depends on your application response time, if response time is 1 second - you will need 10k threads, if it's 500ms - you will need 5k threads, if it is 2 seconds - you will need 20k threads, etc.
Only you can answer whether your laptop can kick off the required number of virtual users as there are too many factors to consider: nature of the test, the size of the requests/responses, number of pre/post processors and assertions, etc. Make sure to follow JMeter Best Practices and monitor CPU, RAM, Network, etc. usage using i.e. JMeter PerfMon Plugin as if your laptop will be overloaded - JMeter won't be able to send requests fast enough and you will not be able to conduct 10k requests per second even if the server supports it. If your laptop hardware specifications are too low for the test scenario - you will have to go for Distributed Testing
You have a number of issues in play
test design. Use more than one load generator. In fact, use no fewer than three, evenly matched in hardware. Take one and load only one user of each type. This is your control set. If this set degrades at the same rate as your other load generators then you have a common issue, likely the site. If the control set does not degrade, but the other load generators do, then you likely have an overloaded generator. On the commercial test tool side of the fence, generating all load from one host have never been considered a good practice in performance testing.
10K requests per second. This is substantial. I have worked on some top 20 eCommerce sites and I can tell you that even they do not receive this type of traffic to the origin servers. Why? Cache! Either this his a Content Delivery Network where the load is spread across the county, OR there is a cache node directly in front of the load balancer(S) for the site (thing varnishcache of equivalent), OR both for a multi-staged cache. You might want to look for an objective reference in production to pin this to as a validation poinnt, if and only if (IFF) your goal is to represent end user behavior. Running a count of requests grouped by second from the HTTP access logs should be able to validate this number. Also, check the cache plan for fixed assets - it could be poorly managed and load would drop significantly just by better managing the sites cache settings to the client. If your goal is simply to saturate a SOAP/REST interface to the point of destruction then you might have a better path.
If you are looking to take a particular SOAP or REST set of remote procedure calls to the point of destruction, consider a classical stress test. Start your test at zero load, increase with the smallest step interval possible over the longest possible period of time. The physical analogy to this would be the classical hospital style stress test where a nurse comes around every minute and increases the speed OR the incline on the treadmill OR both until some end of test condition is achieved. For a hospital style test that is moving into Oxygen debt, an inability to keep pace, etc... For your application/interface it could be the doubling of response times from what is acceptable, a saturation of resources in the finite resource pool (CPU, DISK, MEMORY, NETWORK) on the back end hosts, etc...

How to design loadprofile in performance testing using loadrunner?

There is no NFR data in my project and no performance testing has ever been performed.
data given is there are 136 concurrent users during peak hours and how many time times each activity has been done during the peak hours.
How to design load profile in performance testing
Go to the HTTP logs for an objective view of session duration, users online blocked by session duration length, and hourly hits on unique pages tied to business processes.
Make the business sign off on your analysis and own it. As QA, you cannot both own the creation of the requirement and the assessment of the requirement, particularly since the requirement was absent from the guidance provided to development. Otherwise, no matter what you discover development, ops, architecture will claim your test is wrong because your assumptions don't match theirs.

Creating real time datawarehouse

I am doing a personal project that consists of creating the full architecture of a data warehouse (DWH). In this case as an ETL and BI analysis tool I decided to use Pentaho; it has a lot of functionality from allowing easy dashboard creation, to full data mining processes and OLAP cubes.
I have read that a data warehouse must be a relational database, and understand this. What I don't understand is how to achieve a near real time, or fully real time DWH. I have read about push and pull strategies but my conclusions are the following:
The choice of DBMS is not important to create real time DWH. I mean that is possible with MySQL, SQL Server, Oracle or any other. As I am doing it as a personal project I choose MySQL.
The key factor is the frequency of the jobs scheduling, and this is task of the scheduler. Is this assumption correct? I mean, the key to create a real time DWH is to establish jobs every second for every ETL process?
If I am wrong can you provide me some help to understand this? And then, which is the way to create a real time DWH? Is the any open source scheduler that allows that? And any not open source scheduler which allows that?
I am very confused because some references say that this is impossible, others that is possible.
Definition
Very interesting question. First of all, it should be defined how "real-time" realtime should be. Realtime really has a very low latency for incoming data but requires good architecture in the sending systems, maybe a event bus or messaging queue and good infrastructure on the receiving end. This usually involves some kind of listener and pushing from the deliviering systems.
Near-realtime would be the next "lower" level. If we say near-realtime would be about 5 minutes delay max, your approach could work as well. So for example here you could pull every minute or so the data. But keep in mind that you need some kind of high-performance check if new data is available and which to get. If this check and the pull would take longer than a minute it would become harder to keep up with the data. Really depends on the volume.
Realtime
As I said before, realtime analytics require at best a messaging queue or a service bus some jobs of yours could connect to and "listen" for new data. If a new data package is pushed into the pipeline, the size of it will probably be very small and it can be processed very fast.
If there is no infrastructure for listeners, you need to go near-realtime.
Near-realtime
This is the part where you have to develop more. You have to make sure to get realtively small data packages which will usually be some kind of delta. This could be done with triggers if you have access to the database. Otherwise you have to pull every once in a while whereas your "once" will probably be very frequent.
This could be done on Linux for example with a simple conjob or on Windows with event planning. Just keep in mind that your loading and processing time shouldn't exceed the time window you have got until the next job is being started.
Database
In the end, when you defined what you want to achieve and have a general idea how to implement delta loading or listeners, you are right - you could take a relational database. If you are interested in performance and are modelling this part as Star Schema, you also could look into Column Based Engines or Column Based Databases like Apache Cassandra.
Scheduling
Also for job scheduling you could start with Linux or Windows standard planning tools. If you code in Java you could use later something like quartz. But this would only be the case for near-realtime. Realtime requires a different architecture as I explained above.

Performance test - approach

I have User Registration, Flight Search, Book Tickets modules in my application. I have created my JMeter test & I have different thread groups for each module in my test. I verified & it works well.
Thread Group 1: XX number of users - access the site - click on regression , enter the details & register. (bold -> loop- happening again and again)
Thread Group 2: XX number of users - access the site - login, - search for flights - (bold -> loop - happening again and again)
Thread Group 3: XX number of users - access the site - login, book ticket - (bold -> loop - happening again and again)
Issue:
My manager says we need to run all modules (all thread groups) together with appropriate users as that is how It is going to be in Production. Even though i can run them all together, - in case of issues - i would not know which feature of the application caused the issue.
My aim is to run each module separately & find its performance. I think that doing the module wise would be the correct approach to get the response time, resource utilization etc.
Clarify:
I do not have much experience in performance testing. What is the correct approach / How do you do your test for your application?
If i have to find server's optimal load (at which it performs better) - what should my approach be?
Intentionally tagging loadrunner as this question is not specific to JMeter & it is generic.
If your goal is to represent human behavior to assess the risk of deployment then testing each business process atomically will not accomplish your goal.
You appear to be engaging in a process that is more appropriately termed performance unit testing. This is very common with developers (as differentiated from performance testers) who seek to qualify the performance of an individual business process across some number of users. These are also typically classified with non-normal think times (often eliminated altogether), small data sets, smaller than useful test environments and extremely short test durations, such as 5-15 minutes.
You can mark this business scenario as transactions it means the HTTP requests for each module will be grouped for ex- Login requests in one group or transaction , Search flight as one group or transaction and similarly for Book tickets etc. By following this you will be testing it in a integrated manner and it would be a production like scenario too. After your run due to grouping you can easily find out which group of request taking more time either search, book tickets etc... In this way you will get the accurate performance statistics and you will achieve the production like scenario too
The approach really depends on what your goal for the testing exercise is. If you're looking to optimize or profile a particular module, it makes sense to test it in isolation.
However, if you're trying to check if your server scales, or if you have enough capacity, you should test all your modules at once, at or above your expected load levels.
A counter example to your isolated approach:
Say you have to modules A and B. They are both CPU intensive and take up 80% CPU when you run them. You first tested A, it used 80% CPU, you had 20% to spare and it performed fine. Now you test B alone, same result.
Now you go to production and users try to use both A and B modules, both are trying to use 80% CPU and suddenly you don't have sufficient CPU and your performance suffers.
I know this is kind of late but still.
I do not have much experience in performance testing. What is the
correct approach / How do you do your test for your application?
As James mentioned, the approach to conducting a performance test in a normal scenario would be to run all the critical business flows at the same time and not in an isolated fashion.
In order to identify issues, we would group the requests under transactions and name the business flows appropriately. This will help in identifying which requests have failed and which feature/portion of the application is at fault.
Running them individually will not provide you more insights simply because, a load testing tool will only be able to confirm the presence of a bottleneck but not the root cause irrespective of the number of business flows involved.
If i have to find server's optimal load (at which it performs better)
- what should my approach be?
In order to identify the optimal load for the server, it is mandatory to run all the scripts together as the end users are going to access the application (all critical scenarios) concurrently and not in a modularized manner.

How to decide on what hardware to deploy web application

Suppose you have a web application, no specific stack (Java/.NET/LAMP/Django/Rails, all good).
How would you decide on which hardware to deploy it? What rules of thumb exist when determining how many machines you need?
How would you formulate parameters such as concurrent users, simultaneous connections, daily hits and DB read/write ratio to a decision on how much, and which, hardware you need?
Any resources on this issue would be very helpful...
Specifically - any hard numbers from real world experience and case studies would be great.
Capacity Planning is quite a detailed and extensive area. You'll need to accept an iterative model with a "Theoretical Baseline > Load Testing > Tuning & Optimizing" approach.
Theory
The first step is to decide on the Business requirements: how many users are expected for peak usage ? Remember - these numbers are usually inaccurate by some margin.
As an example, let's assume that all the peak traffic (at worst case) will be over 4 hours of the day. So if the website expects 100K hits per day, we dont divide that over 24 hours, but over 4 hours instead. So my site now needs to support a peak traffic of 25K hits per hour.
This breaks down to 417 hits per minute, or 7 hits per second. This is on the front end alone.
Add to this the number of internal transactions such as database operations, any file i/o per user, any batch jobs which might run within the system, reports etc.
Tally all these up to get the number of transactions per second, per minute etc that your system needs to support.
This gets further complicated when you have requirements such as "Avg response time must be 3 seconds etc" which means you have to figure in network latency / firewall / proxy etc
Finally - when it comes to choosing hardware, check out the published datasheets from each manufacturer such as Sun, HP, IBM, Windows etc. These detail the maximum transactions per second under test conditions. We usually accept 50% of those peaks under real conditions :)
But ultimately the choice of the hardware is usually a commercial decision.
Also you need to keep a minimum of 2 servers at each tier : web / app / even db for failover clustering.
Load testing
It's recommended to have a separate reference testing environment throughout the project lifecycle and post-launch so you can come back to run dedicated performance tests on the app. Scale this to be a smaller version of production, so if Prod has 4 servers and Ref has 1, then you test for 25% of the peak transactions etc.
Tuning & Optimizing
Too often, people throw some expensive hardware together and expect it all to work beautifully. You'll need to tune the hardware and OS for various parameters such as TCP timeouts etc - these are published by the software vendors, and these have to be done once the software are finalized. Set these tuning params on the Ref env, test and then decide which ones you need to carry over to Production.
Determine your expected load.
Setup a machine and run some tests against it with a Load testing tool.
How close are you if you only accomplished 10% of the peak load with some margin for error then you know you are going to need some load balancing. Design and implement a solution and test again. Make sure you solution is flexible enough to scale.
Trial and error is pretty much the way to go. It really depends on the individual app and usage patterns.
Test your app with a sample load and measure performance and load metrics. DB queries, disk hits, latency, whatever.
Then get an estimate of the expected load when deployed (go ask the domain expert) (you have to consider average load AND spikes).
Multiply the two and add some just to be sure. That's a really rough idea of what you need.
Then implement it, keeping in mind you usually won't scale linearly and you probably won't get the expected load ;)

Resources