JMeter for concurrent users

I have been using the Ultimate Thread Group from JMeter Plugins for concurrent requests.
But now I'm finding it difficult to use because of the following scenario:
Each request carries a tracking number that is passed as a POST parameter in the HTTP request. The tracking numbers are generated in the system when a form is submitted, so I have to use the generated tracking numbers from the DB. They are unique, and I have configured a CSV Data Set Config to pass them. Once a tracking number has been used, it can't be used again (it would give me an error message). Can someone please suggest how to stress test this scenario, where I have to hit a particular URL (with a unique tracking number from the CSV file) for approximately 60/30 minutes (with a varying number of threads) until I reach the crash point of the system?

1st way:
You can pass the tracking numbers via a CSV file. The steps are:
Allocate all the tracking numbers to specific users (this is possible with a database query).
Copy and paste those tracking numbers into a CSV file.
Pass those tracking numbers as a parameter via CSV Data Set Config.
2nd way:
Fill in the form; the generated tracking number can be fetched with a regular expression.
Set the allocation logic to a specific user each time (disable other users).
Log in with this user and pass the fetched tracking number.
Hope this is helpful to you.
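For the 1st way, the CSV file could be prepared with a small script rather than by copy and paste. The following is only a sketch: it assumes the tracking numbers have already been exported from the database into a Python list, and the function names and the file name `tracking_numbers.csv` are hypothetical. It drops duplicates so that no value is sent twice. On the JMeter side, setting "Recycle on EOF?" to False and "Stop thread on EOF?" to True in the CSV Data Set Config keeps the test from reusing a value once the file is exhausted.

```python
import csv


def unique_tracking_numbers(numbers):
    """Return the tracking numbers without blanks or duplicates, keeping order."""
    seen = set()
    unique = []
    for number in numbers:
        value = str(number).strip()
        if value and value not in seen:
            seen.add(value)
            unique.append(value)
    return unique


def write_tracking_csv(numbers, path="tracking_numbers.csv"):
    """Write one tracking number per line for CSV Data Set Config to read."""
    rows = unique_tracking_numbers(numbers)
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for value in rows:
            writer.writerow([value])
    return len(rows)
```

The file needs at least as many rows as the total number of requests the test will send, otherwise the threads stop before the system reaches its crash point.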


Data integration for Magento to Quick Book

I'm new to Talend and I'm learning through videos and documentation, so I'm not sure how to approach or implement this following best practices.
Integrate Magento and QuickBooks using Talend.
My thoughts
My first thought was to set up a direct DB connection to Magento, take the relevant data I need, process it, and send it to QuickBooks using REST APIs (specifically the bulk APIs, in batches).
But then I thought it would be a bit tedious to query the Magento database (multiple joins), so another option is to use Magento's REST API.
As I'm not very familiar with the tool, I'm struggling a little to find the most suitable approach, so any help is appreciated.
What I've done so far
I've saved my auth (for QB) and DB (Magento) credentials in a file, and using tFileInputDelimited and tContextLoad I'm storing them in context variables so they are accessible globally.
I've successfully configured the database connection and DB input, but I haven't used metadata for the connection (should I use it, and if yes, how can I pass dynamic values there?). I've used my context variables in the DB connection settings.
I've taken the relevant fields for now, but if I want more fields a simple query is not enough, as Magento stores data for Customer etc. in multiple tables. It's not a big deal, but I think it might increase my work.
That's what I've built so far. My next step is to send the data to QB using REST, getting the access_token and saving it to a context variable, and then storing the QB reference back into the Magento DB.
I've also decided to use the QB bulk APIs, but I'm not sure how to process data in chunks in Talend (I checked multiple resources but had no luck). For example, if Magento returns 500 rows, I want to process them in chunks of 30, as the QB batch maximum is 30. I will send each chunk to QB using REST and, as I said, I also want to store the QB reference ID back in Magento (so I can update it later).
Also, all of this will run locally for now. How can I do the same in production, and how can I maintain separate development and production environments?
Resources I'm referring to
For REST and Auth best practices -
Nice example for batch processing here:
Redirect your input to a tFileOutputDelimited.
Enter the output filename, tick the option "Split output in several files" in the "Advanced settings", and enter the value 1000 in the field "Rows in each output file". This will create n files based on the filename, with 1000 rows in each.
In the next subjob, use a tFileList to iterate over this list of files and read the records from each file.
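The chunking that the split-file option performs can be illustrated outside Talend. This is a minimal Python sketch of the logic only, not Talend code. It assumes the 500 rows and the batch limit of 30 mentioned in the question, and the function name `chunked` is hypothetical.

```python
def chunked(rows, size):
    """Yield consecutive slices of `rows`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


# 500 rows in batches of 30: the last batch holds the leftover rows.
batches = list(chunked(list(range(500)), 30))
```

In the Talend job, the equivalent setting would be 30 in "Rows in each output file", so that each file read by tFileList maps to one batch request.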

How to run a test with distribution of load

I am new to JMeter and need your help with a problem.
I have 4 test scenarios and I need to run them with a load of 30 users, distributed as 30, 10, 30 and 30 percent. Of the 4 scenarios, one creates a customer ID, and that ID is used in the rest of the scenarios. To test this, I have created test data of customer IDs with the first scenario and saved it in a CSV file. My question is: when I run my test, how do I handle the customer IDs generated at run time, and how do I manage them alongside the test data I have already created? Please help me.
With regards to reusing the data generated at runtime: you can extract the required data, i.e. the customer ID, using a suitable JMeter Post-Processor and store it in a JMeter Variable. Once done, the variable can be reused in other scenarios. The process is known as correlation, and there is a lot of information on its implementation, with examples, on the web.
With regards to the distribution there are different approaches as well:
Throughput Controller
Switch Controller
Weighted Switch Controller
With regards to "manage test data you created": you can read the values from a CSV file using CSV Data Set Config or the __CSVRead() function.
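The controllers listed above split iterations inside one Thread Group. If the split is instead done with one Thread Group per scenario (an assumption, not something stated in the question), the number of threads for each group follows from simple arithmetic. A minimal sketch, with the hypothetical function name `allocate_threads`:

```python
def allocate_threads(total_users, percentages):
    """Split `total_users` across scenarios according to integer percentages.

    Uses the largest-remainder method so the counts always add up to
    `total_users`, even when the percentages do not divide evenly.
    """
    if sum(percentages) != 100:
        raise ValueError("percentages must add up to 100")
    counts = [total_users * p // 100 for p in percentages]
    remainders = [total_users * p % 100 for p in percentages]
    leftover = total_users - sum(counts)
    # Hand the leftover users to the scenarios with the largest remainders.
    order = sorted(range(len(percentages)), key=lambda i: remainders[i], reverse=True)
    for index in order[:leftover]:
        counts[index] += 1
    return counts


thread_counts = allocate_threads(30, [30, 10, 30, 30])
```

For 30 users and a 30/10/30/30 split this gives 9, 3, 9 and 9 threads.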

Visual Studio Load Test - Using Data Source with Multiple Agents

I'm using Visual Studio 2015 Load Test and running a Web Performance test that has a data source connected. The data source contains user login information for 250 users.
Running this in sequential order on a single agent works fine. However, I'm attempting to add 10 test agents to share the load. By design, the Load Test copies the data source to each agent and each agent runs the test. What ends up happening is that all 10 agents start the test using the row 1 user from the data source. I'm hoping there's a way to set up the Load Test to run sequentially across all agents (e.g. Agent 1 uses row 1, Agent 2 uses row 2, Agent 3 uses row 3, etc.).
I suspect there's not an option to set this up, but wondered if anyone ran into this and had workarounds to offer. I did find this info via
Multiple machines running as a rig
Sequential – This works the same as if you are on one machine. Each agent receives a full copy of the data and each starts with row 1 in the data source. Then each agent will run through each row in the data source and continue looping until the load test completes.
Random – This also works the same as if you run the test on one machine. Each agent will receive a full copy of the data source and randomly select rows.
Unique – This one works a little differently. Each row in the data source will be used once. So if you have 3 agents, the data will be spread across the 3 agents and no row will be used more than once. As with one machine, once every row is used, the web test will stop executing.
You can split the data set/CSV and distribute a part to each agent, i.e. in your case 25 rows per agent, and then execute the test.
Each agent can use its own data set/CSV.
CSV Split:
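One way such a split could be done is sketched below in Python. It assumes the 250 login rows are already loaded into a list and that the parts are handed out round-robin, so that agent 1 starts with row 1, agent 2 with row 2, and so on. The function name is hypothetical.

```python
def split_rows_for_agents(rows, agent_count):
    """Return one list of rows per agent, dealing the rows out round-robin."""
    if agent_count < 1:
        raise ValueError("agent_count must be at least 1")
    return [rows[offset::agent_count] for offset in range(agent_count)]


# 250 rows shared by 10 agents: every agent receives 25 distinct rows.
parts = split_rows_for_agents(list(range(1, 251)), 10)
```

Each part would then be saved as its own CSV file and deployed to the matching agent.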
The nearest you can get to what you want is to use the unique setting. However, each data source row will only be used once, then the test will stop. With a data source containing 250 lines, only 250 test executions will take place. I do not know the exact distribution of data source rows to agents when unique is specified.
If more than one execution per data source row is wanted, then another approach is to have one data source column per agent. Use the agent_id to select the column. Use the sequential data source access. A variation is to have just one set of data in the data source but append the agent_id to some of the values in the data source. This answer has some variations on these ideas and some code.
Another possibility is to use the MoveDataTableCursor method to set a specific row for each test execution. This could be called in a PreWebTest method of a WebTestPlugin. The code would use the context parameters $AgentId and $WebTestIteration. The call would be based on the following:
MoveDataTableCursor(..., ..., $AgentId * NumberOfAgents + $WebTestIteration);
The values of $AgentId and $WebTestIteration from the context are strings, they would need to be converted to numbers to do the multiply and add.
Would also need to check whether the two values are zero-based or one-based.
The documentation for MoveDataTableCursor is not very informative
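The row arithmetic can be checked independently of Visual Studio. The sketch below is Python, not plugin code. It converts the two string values to numbers and uses an interleaved variant, iteration * NumberOfAgents + agent, because the expression as written above can hand the same row to two agents once an agent has run more iterations than there are agents. It assumes both context values are one-based, which still needs to be verified as noted above, and the function name is hypothetical.

```python
def row_for_execution(agent_id, iteration, number_of_agents, row_count, one_based=True):
    """Map an agent and iteration to a zero-based data source row.

    `agent_id` and `iteration` arrive as strings, as the context parameters do.
    The result wraps around when every row has been used once.
    """
    offset = 1 if one_based else 0
    agent = int(agent_id) - offset
    run = int(iteration) - offset
    return (run * number_of_agents + agent) % row_count


# First iteration on agents 1, 2 and 3 picks rows 0, 1 and 2.
first_rows = [row_for_execution(str(agent), "1", 10, 250) for agent in (1, 2, 3)]
```

With 10 agents and 250 rows, 25 iterations per agent cover every row exactly once before the mapping wraps around.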

Simulating server-side group and sort in Azure table storage

I have a table to which I add records whenever the user views a particular resource. The key fields are
Date Viewed
On a history page of my app, I want to present a set number (e.g., top 5) of the user's most recently viewed Resources, but I want to group by Resource, so that if some were viewed several times, only the most recent of each one is shown.
To be clear, if the raw data looked like this:
UserA | ResourceA | Jan 1
UserA | ResourceA | Jan 2
UserA | ResourceB | Jan 3
UserA | ResourceA | Jan 4
...only the bottom two records would appear in the history page.
I know you can get server-side chronological sorting by using a string derived from the date in the PartitionKey or RowKey fields.
I also see that you could enable a crude grouping mechanism by using Username and Resource as your PartitionKey and RowKey fields, and then using Insert-or-update, to maintain a table in which you kept pointers for the most recent value for each combination. However, those records wouldn't be sorted chronologically.
Is there any way to design a set of tables so that I can get the data I need without retrieving tons of extra entities and sorting on the client? I'm willing to get elaborate with the design if that's what it takes. Thanks in advance!
First, I would strongly recommend that you read the excellent "Azure Storage Table Design Guide: Designing Scalable and Performant Tables" document from the Storage team.
Yes, I would agree that it is somewhat tricky with Azure Table Storage but it is doable :).
What you have to do is keep multiple copies of the same data. Each copy will serve a different purpose.
Considering the scenario where you want to fetch most recent lines for Resource A and B, here's what your entity structure would look like:
PartitionKey: Date/Time (in Ticks) reversed, i.e. DateTime.MaxValue.Ticks - LastAccessedDateTime.Ticks. Reversed ticks are required so that the most recent entries show up at the top of the table.
RowKey: Resource name.
AccessDate: Indicates the last access date/time.
User: Name of the user who accessed that resource.
So when you are interested in just finding out most recently used resources, you could start fetching records from the top.
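The reversed-ticks key can be illustrated with a short calculation. This is a Python sketch of the arithmetic only, not Azure SDK code. It assumes the key is stored as a zero-padded string so that string ordering matches numeric ordering, and the function names are hypothetical.

```python
from datetime import datetime

# Value of DateTime.MaxValue.Ticks in .NET (100 ns units since 0001-01-01).
DOTNET_MAX_TICKS = 3155378975999999999
DOTNET_EPOCH = datetime(1, 1, 1)


def to_ticks(moment):
    """Convert a naive datetime to .NET-style ticks using integer arithmetic."""
    delta = moment - DOTNET_EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 10_000_000 + delta.microseconds * 10


def reverse_ticks_key(moment):
    """Return a 19-digit key that sorts newest first as a plain string."""
    return f"{DOTNET_MAX_TICKS - to_ticks(moment):019d}"


older = reverse_ticks_key(datetime(2020, 1, 1))
newer = reverse_ticks_key(datetime(2020, 1, 4))
```

Because `newer` sorts before `older`, a query that reads from the top of the table returns the most recent views first.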
In short, your data storage approach should be primarily governed by how you want to fetch the data. It may even mean that you have to save the same data multiple times.
As discussed in the comments below, Table Service doesn't directly support Server Side Grouping. This is something that you would need to do on your own. What you could do is create a separate table to store the access counts. As and when the resources are accessed, you basically either insert a new record in that table or update the count for that resource in that table.
Assuming you're always interested in finding out resource access count within a date/time range, here's what your entity structure would look like:
PartitionKey: Date/Time (in Ticks). The precision would depend on your reporting requirement. For example, if you want to maintain access counts by day then your precision would be a day.
RowKey: Resource name.
AccessCount: This field will constantly update as and when a resource is accessed.
LastAccessDateTime: This field will denote when a resource was last accessed.
For updating access counts, I would recommend that you make use of a background process. In this approach, as a resource is accessed you add a message to a queue. This message holds the resource name and the date/time the resource was accessed. A background process then polls this queue and fetches messages. For each message received, you first get the current count and last access date/time for that resource. If no record is found, you simply insert a record in this table with a count of 1. If a record is found, you compare the date/time from the table with the date/time sent in the message. If the date/time from the table is earlier than the date/time in the message, you update both the count (increasing it by 1) and the last access date/time. If the date/time from the table is later than the date/time in the message, you only update the count.
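The update rule described above can be sketched as follows. This is a Python illustration in which a plain dictionary stands in for the table and no real queue or Azure SDK call is involved. The daily precision of the partition key and the function name are assumptions.

```python
from datetime import datetime


def apply_access_message(table, resource, accessed_at):
    """Apply one queue message to the in-memory stand-in for the counts table."""
    # Assumed precision: one partition per day, keyed as YYYYMMDD.
    key = (accessed_at.strftime("%Y%m%d"), resource)
    record = table.get(key)
    if record is None:
        # First access in this period: insert with a count of 1.
        table[key] = {"AccessCount": 1, "LastAccessDateTime": accessed_at}
    else:
        # Always bump the count; only move the timestamp forward.
        record["AccessCount"] += 1
        if record["LastAccessDateTime"] < accessed_at:
            record["LastAccessDateTime"] = accessed_at
    return table[key]


counts = {}
# Messages may arrive out of order; the later timestamp must win.
apply_access_message(counts, "ResourceA", datetime(2020, 1, 2, 10, 0))
apply_access_message(counts, "ResourceA", datetime(2020, 1, 2, 8, 0))
```

Against the real table the read and the write would be two separate requests, so concurrent workers would need an optimistic concurrency check between them.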
Now, to find the most accessed resources in a time span, you simply query this table. Assuming there is a limited number of resources (say in the 100s), you can get this information from the table with at least 1 request. Since you're dealing with a small amount of data, you can simply download this data on the client side and order it any way you see fit. However, to see the access details for a particular resource, you would have to fetch detailed data (1000 entities at a time).
Part of your brain might still be unconsciously trapped in relational-table design paradigms, I'm still getting to grips with that issue myself.
Rather than think of table storage as a database table (with the "query-ability" that goes with it) try visualizing it in more simple (dumb) terms.
A design problem I'm working on now is storing financial transaction data, and I want to know what the total $ amount of these transactions is. Because Azure table storage doesn't (yet?) offer aggregate functions, I can't simply call .Sum(). To get around that I'm going to:
Sum the values of the transactions in my app before I pass them to Azure.
I'll then pass the result of the sum into Azure as a separate piece of information, called RunningTotal.
Later on I can just return RunningTotal rather than pulling down all the transactions, and I can repeat the process by incrementing the value of RunningTotal each time I get new transactions.
Of course there are risks to this but the app is a personal one so the risk level is low and manageable, at least as a proof-of-concept.
Perhaps you can use a similar approach for the design of your system: compute useful values in advance. I'll almost be using table storage as a long-term cache rather than a database.

Redis multiple requests

I am writing a very simple social networking app that uses Redis.
Each user has a sorted set that contains ids of items in their feed. If I want to display their feed, I do the following steps:
use ZREVRANGE to get ids of items in their feed
use HMGET to get the feed (each feed item is a string)
But now, I also want to know if the user has liked a feed item or not. So I have a set associated with each feed item that contains the ids of users who have liked that feed item.
If I get 15 feed items, I now have to execute an additional 15 requests to Redis to find out, for each feed item, whether the current user has liked it or not (by checking if the id exists in the set for each feed item).
So that will take 15+1 requests.
Is this type of querying considered 'normal' when using Redis? Are there better ways I can structure the data to avoid this many requests?
I am using redis-rb gem.
You can easily refactor your code to collapse the 15 requests into one by using pipelines (which redis-rb supports).
You get the ids from the sorted set with the first request, and then you use them to get the many keys you need based on those results (using the pipeline).
With this approach you should have 2 requests in total instead of 16, while keeping your code quite simple.
As an alternative you can use a lua script and fetch everything in one request.
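The two-request pipeline approach might look like the sketch below. It is written in Python against a redis-py style client rather than redis-rb, purely for illustration. The key names `feed:<user>`, `items` and `likes:<item>` are hypothetical, and the client is assumed to return strings (for redis-py, one created with `decode_responses=True`).

```python
def fetch_feed_with_likes(client, user_id, count=15):
    """Return (item_id, body, liked) tuples using two round trips to Redis."""
    # Round trip 1: newest item ids from the user's sorted set.
    item_ids = client.zrevrange(f"feed:{user_id}", 0, count - 1)
    if not item_ids:
        return []

    # Round trip 2: queue every remaining command and send them together.
    pipe = client.pipeline()
    pipe.hmget("items", item_ids)
    for item_id in item_ids:
        pipe.sismember(f"likes:{item_id}", user_id)
    results = pipe.execute()

    # The first reply is the HMGET result, the rest are the SISMEMBER replies.
    bodies, liked_flags = results[0], results[1:]
    return [
        (item_id, body, bool(liked))
        for item_id, body, liked in zip(item_ids, bodies, liked_flags)
    ]
```

The replies come back in the order the commands were queued, which is what lets the function pair each like flag with its feed item.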
With this kind of database (non-relational), you have to make a trade-off between making multiple requests and including some data redundancy.
You should analyze each case separately and consider some aspects, like:
How frequently will this data be accessed?
How much space will this redundancy consume?
How many requests will I have to make in order to get all the data without redundancy?
Is performance an issue?
In your case, I would suggest keeping a Set/Hash, or just JSON-encoded data, for each user with a history of all recent user interactions, such as comments, likes, etc. Every time the user accesses the feed, you just have to read the feed and the history: only two requests.
One thing to keep in mind: on every user interaction, you must update all the redundant data as well.
