Does a partial instance-hour appear frequently with EC2 on-demand instances? [closed] - amazon-ec2

Pricing is per instance-hour consumed for each instance, from the time an instance is launched until it is terminated. Each partial instance-hour consumed will be billed as a full hour.
Here is my question:
Do partial instance-hours occur frequently or rarely?
In what kind of context do partial instance-hours occur frequently?
Does anyone have experience with this?

Partial hours happen most frequently when using systems that scale often. For example, in my system I launch 10-20 extra servers each Saturday and Sunday to handle the extra traffic. When these servers are stopped I will be charged a partial hour. Amazon has a new feature for Auto Scaling groups that tells it to terminate (if it has to) the servers closest to the hour marker in order to save money.
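If the group is managed with boto3, turning that policy on might look roughly like this (a sketch; the group name is made up):

# A minimal sketch, assuming boto3 credentials are configured and an
# existing Auto Scaling group named "weekend-web-asg" (hypothetical).
import boto3

autoscaling = boto3.client("autoscaling")

# Prefer terminating the instance closest to the next full instance-hour,
# so partial hours that are already paid for get used up before shutdown.
autoscaling.update_auto_scaling_group(
    AutoScalingGroupName="weekend-web-asg",
    TerminationPolicies=["ClosestToNextInstanceHour"],
)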
Another common case is services like MapReduce, where a large number of instances is started and then terminated when the job is complete.
My experience, though, is that the actual cost of partial hours is insignificant for me. Maybe if you're using larger servers it costs a lot, but I'm using the c1.medium and I barely notice the $5 I get charged on a weekend for partial hours.

Related

What will be the concurrent user count to load test? - I have a Google Analytics report [closed]

I am working on a web app whose peak is only once a year. The Google Analytics reports show that on the peak day, at the peak hour, there were around 3,000 sessions between the 2 peak hours (15,000 sessions at 12:00 pm & 18,000 at 13:00). Users spend almost 8 minutes on the site. Every year during that time our application fails. So at what concurrency should I test my application when I am doing a load test?
One article I read gives this formula: Concurrent Visitors = Hourly Visitors * Time Spent on Site / 3600. If this is right, we only need to test up to 400 users. But on Google Analytics I saw 3,800 users on its real-time monitor. Is this right?
3600 is the number of seconds in an hour, so if I understand you correctly your calculation should be as follows -
18000 users an hour * 8 minutes * 60 seconds in a minute / 3600 = 2400
Which is closer to 3800.
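As a quick sanity check, the same arithmetic as a small Python snippet (the 18,000 sessions per hour and 8-minute session time are the figures from the question):

# Little's-law style estimate: concurrent users ≈ arrival rate × session length.
sessions_per_hour = 18_000        # peak hour reported in Google Analytics
session_seconds = 8 * 60          # average time on site

concurrent = sessions_per_hour * session_seconds / 3600
print(concurrent)                 # 2400.0 -- in the same ballpark as the ~3800 seen live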
I guess that your visitors weren't evenly distributed over the hour, which would explain the peak. If the cost difference of preparing for 4k visitors is less than the potential losses that could be caused by downtime on that critical day, I would go for it.

Dealing with tons of queries and avoiding duplicates [closed]

My project involves concurrency and database management, meaning that I have to be editing a database simultaneously from all threads. To be more specific, I am reading a line from the database and then inserting a line to mark that I grabbed that line. This could work with transactions, but because I will be running this program on multiple machines, I will have a different database connection on each one. Is there a better way for me to accomplish the above task?
Applying optimistic concurrency using transactions and a version field/column (this could be a timestamp, a timestamp plus an actual version number that just increases, or some other versioning mechanism) is a must here.
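A minimal sketch of what that version check can look like, assuming a Python DB-API connection and a hypothetical jobs table with an integer version column (not your actual schema):

# Optimistic-concurrency claim: the UPDATE only succeeds if nobody has bumped
# the version since we read the row. Uses %s placeholders as in psycopg2/MySQL drivers.
def claim_row(conn, row_id, seen_version):
    cur = conn.cursor()
    cur.execute(
        "UPDATE jobs SET claimed = 1, version = version + 1 "
        "WHERE id = %s AND version = %s",
        (row_id, seen_version),
    )
    conn.commit()
    return cur.rowcount == 1  # False means another machine claimed the row first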
But since you are doing this on different machines, it's possible that a substantial amount of repetitive failed transactions occur.
To prevent this, you could use a queuing mechanism. A dispatcher program reads the non-processed records from database and dispatch them to workers - using a queue or a job dispatcher. Then each worker will take the id from that queue and process it in a transaction.
This way:
if a transaction fails, the dispatcher would queue it again
if a worker goes down, the other workers would continue (detecting that a worker has gone down is a matter of monitoring)
workers can easily scale out, and new workers can be added at any time (as long as your database is not your bottleneck)
A request/reply schema would do best in this case to prevent queue congestion. I've used NATS successfully (and happily) for similar cases. Of course you could use another tool of your choice, but remember that you have to take care of the request/reply part. Just throwing things at queues does not solve all problems, and the amount of queued work should be controlled!
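For what it's worth, a rough sketch of the request/reply idea with the nats-py client; the subject name and payloads are made up for illustration:

# pip install nats-py; a toy dispatcher/worker round-trip on one connection.
import asyncio
import nats
from nats.errors import TimeoutError

async def main():
    nc = await nats.connect("nats://localhost:4222")

    # Worker side: claim and process the row inside a transaction, then reply.
    async def handle(msg):
        row_id = msg.data.decode()
        # ... claim_row(...) and process row_id here ...
        await msg.respond(b"done")

    await nc.subscribe("rows.process", cb=handle)

    # Dispatcher side: request/reply, so unanswered work can be re-queued.
    try:
        await nc.request("rows.process", b"42", timeout=5)
    except TimeoutError:
        pass  # no worker confirmed in time; dispatch this id again later

    await nc.drain()

asyncio.run(main())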

What are the criteria to find out whether the web server can handle the load, using JMeter? [closed]

I have created a test plan using 100 threads. How can we conclude that the web server can handle the load? Which factors should be taken into account for the load test?
I personally think you need to define your own metrics for your test plan to get a load test pass.
Typical metrics I would use.
Each response should come back in less than 250 ms. (Adjust to what your customer would expect)
All responses should come back with a non error response.
The server should be in a 'good state' after the load test. (Check memory, threads,database connection leaks etc)
Too many resources being consumed is also a bad sign: database connections, memory, hard disk for log files. Define your own metrics here.
Successive 'soak tests' to complement your load tests would also be a good idea.
Basically, run a smaller set of JMeter tests every two hours (so the DBAs etc. don't complain) over the weekend and check on Monday.
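If it helps, here is a small Python sketch that checks the first two criteria against a JMeter results file saved as CSV; the file name is hypothetical and the column names assume JMeter's default CSV output (elapsed, success):

import csv

slow, failed, total = 0, 0, 0
with open("results.jtl", newline="") as f:          # hypothetical results file
    for row in csv.DictReader(f):
        total += 1
        if int(row["elapsed"]) > 250:               # response-time budget in ms
            slow += 1
        if row["success"].lower() != "true":        # any non-OK sample counts as a failure
            failed += 1

print(f"{total} samples, {slow} over 250 ms, {failed} failed")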
I would recommend that you first clarify your concepts about performance testing and its types (like load test, stress test, soak test, etc.). You can refer to the following links to get a basic understanding of performance testing and its types:
Load vs. Stress testing
http://www.testerlogic.com/performance-testing-types-concepts-issues/
Once you have a better understanding of the concepts, you will be in a better position to ask the right question. For now, you can focus on the following points:
What is the expected load on your web server (in normal and extreme scenarios)?
What are your acceptance criteria for response time, load time, etc.?
Once you know these numbers, you can create a JMeter test which runs for a specific time span (say 1 hour) and in which the number of threads increases step by step (100 users in the first 10 minutes, 200 users from 10-20 mins, 300 users from 20-30 mins, and so on). (Hint: you can use the ramp-up period to achieve this scenario.)
You can then perform these tests, check the reports, and compare the response time and other performance factors during the first 10 minutes (when the load was 100 users) and in the last 10 minutes when the load was at its maximum.
This is just to give you a high-level idea. As I said before, it will be better if you first clarify basic performance testing concepts and then design/perform the actual testing.
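Just to make the stepped load above concrete, a tiny Python illustration of how many users are active in each 10-minute slice (the numbers are simply the ones from the example):

def active_users(minute, step_users=100, step_minutes=10):
    # 100 users for the first 10 minutes, 200 for the next 10, and so on.
    return step_users * (minute // step_minutes + 1)

for start in range(0, 60, 10):
    print(f"{start:2d}-{start + 10} min: {active_users(start)} users")
# 0-10 min: 100 users ... 50-60 min: 600 users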
As rjdkolb said, you have to define your metrics and check what you require from your service/app.
It all depends on what service you are working with: do you have a stable load on the server or occasional peaks, do you expect around 100 users online or 10,000 at once, do you need fast answers or just correct answers in a reasonable time? Maybe the business foresees that the load will build gradually through the next year, starting with just 100 requests per minute but finishing with 1,000 per second?
If you think that, as mentioned in the other answer, you need an answer in less than 250 ms, then gradually increase the load to check how many users/requests you can handle while still responding on time. And maybe you need answers for 1,000 users working simultaneously: then try a load like this and check whether they get their answers and how fast the responses come back. A lot to think about, isn't there?
Try to read a bit about the types of performance testing - maybe here on SoapUI or this explanation of some metrics. A lot of texts on the internet can guide you on your way.
Have fun.

Is application caching in a server farm worth the complexity? [closed]

I've inherited a system where data from a SQL RDBMS that is unlikely to change is cached on a web server.
Is this a good idea? I understand the logic of it - I don't need to query the database for this data with every request because it doesn't change, so just keep it in memory and save a database call. But, I can't help but think this doesn't really give me anything. The SQL is basic. For example:
SELECT StatusId, StatusName FROM Status WHERE Active = 1
This gives me fewer than 10 records. My database is located in the same data center as my web server. Modern databases are designed to store and recall data. Is my application cache really that much more efficient than the database call?
The problem comes when I have a server farm and have to come up with a way to keep the caches in sync between the servers. Maybe I'm underestimating the cost of a database call. Is the performance benefit gained from keeping data in memory worth the complexity of keeping each server's cache synchronized when the data does change?
The benefits of caching are related to the number of times you need the cached item and the cost of getting it. Your status table, even though it is only 10 rows long, can be "costly" to get if you have to run a query every time: establish a connection if needed, execute the query, pass the data over the network, etc. If it is used frequently enough, the benefits add up and become significant. Say you need to check some status 1,000 times a second, or on every website request: you have saved 1,000 queries, your database can do something more useful, and your network is not loaded with chatter. For your web server, the cost of retrieving the item from the cache is usually minimal (unless you're caching tens or hundreds of thousands of items), so pulling something from the cache will be quicker than querying the database almost every time. If your database is the bottleneck of your system (which is the case in a lot of systems), then caching is definitely useful.
But the bottom line is, it is hard to say yes or no without running benchmarks or knowing the details of how you're using the data. I just highlighted some of the things to consider.
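As a rough sketch of the kind of in-process cache being discussed (the query_status function, the TTL, and the structure are assumptions, not the asker's actual code):

import time

_cache = {"data": None, "loaded_at": 0.0}
TTL_SECONDS = 300   # refresh the cached statuses every 5 minutes (arbitrary choice)

def get_active_statuses(query_status):
    """Return the cached status list, hitting the database only when the TTL expires."""
    now = time.time()
    if _cache["data"] is None or now - _cache["loaded_at"] > TTL_SECONDS:
        # query_status() would run: SELECT StatusId, StatusName FROM Status WHERE Active = 1
        _cache["data"] = query_status()
        _cache["loaded_at"] = now
    return _cache["data"]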
There are other factors which might come into play; for example, the use of EF can add considerable extra processing to a simple data retrieval. The quantity of requests, not just the volume of data, could be a factor.
Future design might influence your decision - perhaps the cache gets moved elsewhere and is no longer co-located.
There's no right answer to your question. In your case, maybe there is no advantage. Though there is already one disadvantage to not using the cache: you have to change existing code.

Two questions about terminology in computer clusters [closed]

I am a layman in computer science, and I have some questions about computer cluster terminology.
A cluster has 300 nodes.
Does it mean the cluster has 300 computers?
The CPU cores use hyperthreading, so there can be a total of 16 threads running simultaneously.
What are hyperthreading and threads here? A stream of data, or of logic?
Hyperthreading is an Intel technology for managing simultaneous multithreading. It doesn't necessarily have anything to do with cluster computing.
A thread is a single running task that can be executed asynchronously from everything else. So depending on what kind of processor your computer has, it may be able to execute a different number of things asynchronously.
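A toy Python illustration of that idea: two tasks run as separate threads and finish independently of the main program flow (purely illustrative):

import threading
import time

def worker(name):
    time.sleep(0.1)                      # pretend to do some work
    print(f"{name} finished")

# Start two threads; each is an independently scheduled task.
threads = [threading.Thread(target=worker, args=(f"task-{i}",)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()                             # wait for both tasks to complete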
Meanwhile, nodes in clusters are usually separate computers. However, they can be Virtual Machines as well. Within some contexts I have seen a single processor core regarded as a node too (in this case one machine with 2 cores will have 2 nodes on it). This kind of depends on the framework and what exactly you're doing.
But usually with cluster computing you have a given task that can be scaled, and every entity that does it is called a node.
