ec2 spot complete price history - amazon-ec2

I am using the API to get EC2 spot price history, but I cannot get anything beyond the last 90 or so days, and I cannot specify the frequency of observations. Is there a way to get a complete history of spot prices, preferably at minute or hourly frequency?

While not explicitly documented for the DescribeSpotPriceHistory API action, this restriction is at least mentioned for the AWS Management Console (which uses that API in turn); see Viewing Spot Price History:
You can view the Spot Price history over a period from one to 90 days based on the instance type, the operating system you want the instance to run on, the time period, and the Availability Zone in which it will be launched.
Since anybody could have retrieved and logged the entire spot price history ever since this API became available (and without a doubt quite a few users and researchers have done just that; the AWS blog even listed some dedicated Third-Party AWS Tracking Sites, though these all appear defunct at first sight), this restriction admittedly seems a bit arbitrary. It is certainly pragmatic from a strictly operational point of view, though: you have all the information you need to base future bids upon (especially given that AWS has so far only ever reduced prices, and does so regularly, much to the delight of its customers).
Likewise there's no option to change the frequency, so you'd need to resort to client side code for the hourly aggregation.
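For example, here is a minimal sketch of that client-side aggregation, assuming boto3 and pandas; the instance type, region and product description are placeholders you would replace with your own:

    # Pull whatever history the API will return (at most ~90 days) and
    # resample it to hourly observations per availability zone.
    import boto3
    import pandas as pd
    from datetime import datetime, timedelta, timezone

    ec2 = boto3.client("ec2", region_name="us-east-1")
    paginator = ec2.get_paginator("describe_spot_price_history")

    records = []
    for page in paginator.paginate(
        InstanceTypes=["m5.large"],                  # placeholder instance type
        ProductDescriptions=["Linux/UNIX"],
        StartTime=datetime.now(timezone.utc) - timedelta(days=90),
        EndTime=datetime.now(timezone.utc),
    ):
        records.extend(page["SpotPriceHistory"])

    df = pd.DataFrame(records)
    df["SpotPrice"] = df["SpotPrice"].astype(float)
    df = df.set_index("Timestamp").sort_index()

    # Prices are only recorded when they change, so forward-fill the gaps
    # after resampling to get one observation per hour and availability zone.
    hourly = (
        df.groupby("AvailabilityZone")["SpotPrice"]
          .resample("1H")
          .last()
          .groupby(level=0)
          .ffill()
    )
    print(hourly.head())

Since the API only reports price changes, the forward-fill is what turns the sparse event stream into a regular hourly series.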

This website has re-sampled EC2 spot price histories for some regions; you may access them via a simple API directly from your Python script:
http://ec2-spot-prices.ai-mmo-games.de/
I hope this helps.

AWS only provides 90 days of history. And the data is raw, i.e., not normalized by hour or even minute, so there are sometimes holes in the data.
One approach would be to pull the data into an IPython notebook and use pandas' excellent time-series tools to resample by minute, by 5 minutes, etc. Here's a short tutorial:
https://medium.com/cloud-uprising/the-data-science-of-aws-spot-pricing-8bed655caed2
Here are more details on using pandas for time-series resampling:
http://pandas.pydata.org/pandas-docs/stable/timeseries.html
Hope that helps...

Related

Using Graphite/Grafana for non time based data

I have been reading through documentation and have been able to get Graphite to receive data I have been sending it. I can definitely see the benefit in tracking things like concurrent users and network load.
But now I have been tasked with implementing it on the client to show things like RAM, and CPU usage. In addition to this, I must also track actions (users buying things). Maybe I am missing a large chunk of the picture here, but I am not sure how I would do these things. Do I need a timestamp? I've also seen plugins for pie charts and such and this would indicate I could perhaps create graphs from devices with different statistics.
What am I missing?
I don't think you're missing anything.
Any data you send on to, say, InfluxDB (as that's what I've used the most) will be timestamped automatically when it arrives - unless you specify an explicit one yourself.
If you're showing, for example, CPU load you can write your query to pick up the latest value, or perhaps an average (mean value) over time, or the max or min over a period of time as appropriate.
Pie charts can also be used successfully to graph relationships over time.
The key is to create a specific query (I use SQL directly) to craft the data used for the panel type. There are excellent examples in the documentation.
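As a rough illustration of the client side (not a drop-in solution - the host, port and metric paths below are placeholders), pushing both a sampled value like CPU usage and an action such as a purchase to Graphite's plaintext listener looks like this:

    import socket
    import time

    CARBON_HOST = "graphite.example.com"   # hypothetical Carbon host
    CARBON_PORT = 2003                     # default plaintext-protocol port

    def send_metric(path, value, timestamp=None):
        # One "path value timestamp" line per sample; Carbon stores it against that time.
        timestamp = int(timestamp or time.time())
        line = f"{path} {value} {timestamp}\n"
        with socket.create_connection((CARBON_HOST, CARBON_PORT), timeout=5) as sock:
            sock.sendall(line.encode("ascii"))

    # A periodically sampled gauge ...
    send_metric("clients.client42.cpu_percent", 37.5)
    # ... and an action recorded as a count of 1 at the moment it happened.
    send_metric("shop.purchases.count", 1)

The same idea carries over to InfluxDB's line protocol: every sample either carries a timestamp or gets one on arrival, and the aggregation (mean, max, sum per interval) is left to the dashboard query.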

I have many separate installs for magento can I merge them together?

I've just started a new job and we have several installs of Magento, all of different versions!
Now it really seems to me that we need to first upgrade them all and then get them all under one installation of Magento, using one database.
What is the best way (in general terms) of doing this?
Is it even possible, or is my best bet to rebuild the sites under one installation and import the products into it?
There is some talk by a fellow developer that having them under different installs helps with performance. Is this true?
Once we have them all under one install, things like stock control and orders, as well as putting products on multiple sites, should also be very straightforward - correct?
We are talking about quite a few stores, say around 15, and quite a few products, around 4000 or maybe more.
My first suggestion is to consider the reasons why you need all Magento instances to be moved under one installation. The reasons are not clear from your question, so the best developer's advice is "Does it work? Then don't touch it" :)
If there are no specific reasons, then you'd better leave it as is. All reorganization processes (upgrading, infrastructure configuration, access setup, etc.) for a software system are hard, costly, time-consuming, error-prone, usually have little value from a business point of view, and are a little boring. This is not a Magento-specific thing; it is a general characteristic of any software.
Also note that it is the holiday season, so it is better not to do anything with e-commerce stores until the middle of January.
If you see value in a reorganization of your Magento stores, then the best way to do it is to go gradually - step by step, store by store:
Take your most complex store. Prepare everything you need for the further steps - i.e. get the tools ready, write automatic scripts, and go through the process with a copy of the store on a testing server. Write a set of functional tests to cover it with at least smoke tests; you'll have to repeat such light checks many times to be sure the store appears to be working, and the automatic tests will save a lot of time. All these preparations will decrease your downtime.
Close public access to the store.
Upgrade the store to the Magento version you need. Move it to the new infrastructure.
Verify all the user scenarios manually and with automatic tests. Fix the issues, if any.
Open public access to the store. Monitor logs, load reports at the server machines. Fix issues, if any.
Take next store (let's call it NextStore). Make its copy at a sandbox server.
Make copy of your already converted store (let's call it ConvertedStore) at a sandbox server.
Export all the data from the copy of NextStore and import it to the copy of ConvertedStore. You can use Magento Dataflow or Import/Export modules to do that. Not all data can be imported/exported with those modules - just Catalog, Orders, Customers. You will need to develop custom scripts to import/export other entities, if you need them.
Verify the result manually and with automatic tests. Write automatic scripts that fix the issues you find; you will need those scripts later during the real conversion process.
Close NextStore.
Move it to the new infrastructure, using the already prepared procedures and scripts. You will need to consider whether to close ConvertedStore during the conversion process; it depends on whether you feel it is OK to keep it open. For safety reasons it is better to close it.
Verify, that everything works fine. Monitor logs, reports.
Fix issues, if any.
Proceed with the rest of your stores.
That is my (totally personal) view on the procedures.
There is some talk by a fellow developer that having them under different installs helps with performance. Is this true?
Yes, your friend is right. Separating Magento (actually, anything in this world) into smaller instances makes it lighter to handle. The performance difference is very small (for your size of 4000 products), but it is inevitable. Consider that after combining the instances (suppose there are ten of them with 400 products each) you'll be handling data for 10x more customers, reports, products, stores, etc. Therefore any search will have to go through ten times more products in order to return data. Of course, it doesn't matter if the search takes 0.00001 seconds, because 0.0001s for the combined instance is OK as well. But some things, like sorting or matching sets, grow non-linearly. Still, as said before, for 4000 products you won't see a big difference.
Once we have them all under one install, things like stock control and orders, as well as putting products on multiple sites, should also be very straightforward - correct?
You're right - after combining the stores, handling orders, stock and customers will be a much simpler and more straightforward process.
Good luck! :)
The most important thing to consider is what problem you're solving by having all these sites on one Magento "instance". What's more important to your business/team: having these sites share product and inventory or having the flexibility of independently modifying these sites? Any downtime or impacts to availability may affect all sites.
Further questions/areas of investigation:
How much does the product hierarchy (categories and attributes) differ?
Is pricing the same across each site or different?
Are any of these sites multi-regional and how is pricing handled for each region?
It's certainly possible to run multiple sites on one Magento instance, even if there are some rough edges within the platform.
Since there's no way to export all entities in Magento, there's no built-in functionality to merge stores. You'd have to write custom code - it would have to take all the records from the old store, assign them new IDs while preserving referential integrity, and insert them into the new store (this is what the product import does, but there is no equivalent for categories, orders, customers, etc.).
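Just to illustrate the shape of that work (this is not Magento code - the entities, columns and sample rows are made up), the core of it is assigning new IDs entity by entity, parents before children, and rewriting every foreign key through an old-to-new map:

    # Hypothetical data standing in for rows exported from the old store.
    old_customers = [{"id": 17, "email": "a@example.com"}]
    old_orders = [{"id": 99, "customer_id": 17, "total": 42.0}]

    def remap(records, id_map, foreign_keys):
        """Assign fresh IDs and rewrite foreign keys via maps built for parent entities."""
        migrated = []
        next_id = 1   # in reality you would start after the target store's current max ID
        for row in sorted(records, key=lambda r: r["id"]):
            new_row = dict(row)
            id_map[row["id"]] = next_id
            new_row["id"] = next_id
            next_id += 1
            for column, parent_map in foreign_keys.items():
                new_row[column] = parent_map[row[column]]   # fails loudly if a parent is missing
            migrated.append(new_row)
        return migrated

    # Parents first, children after, so the maps exist when they are needed.
    customer_ids, order_ids = {}, {}
    new_customers = remap(old_customers, customer_ids, {})
    new_orders = remap(old_orders, order_ids, {"customer_id": customer_ids})
    print(new_orders)   # [{'id': 1, 'customer_id': 1, 'total': 42.0}]

Multiply that by every entity type and relation Magento keeps, and the effort adds up quickly.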
The amount of code you'd be writing to do that would take almost longer than just starting over, in my opinion. You'd basically be writing the missing functionality for Magento. If it were easy they would have done it already.
However splitting two stores apart is very easy, since you don't have to worry about reassigning unique identifiers in the DB.

Scaling Redis for Online Friends List

I'm having trouble thinking of how to implement an online friends list with Ruby and Redis (or with any NoSQL solution), just like in any IM chat, e.g. Facebook Chat. My requirements are:
Approximately 1 million total users
DB stores only the user friends' ids (a Set of integer values)
I'm thinking of using a Redis cluster (which I actually don't know too much about) and implementing something along the lines of http://www.lukemelia.com/blog/archives/2010/01/17/redis-in-practice-whos-online/.
UPDATE: Our application really won't use Redis for anything else other than potentially for the online friends list. Additionally, it's really not write heavy (most of our queries, I anticipate, will be reads for online friends).
After discussing this question in the Redis DB Google Group, my proposed solution (inspired by this article) is to SADD all my online users into a single set, and, for each of my users, to create a user:#{user_id}:friends_list key and store their friends list as another set. Whenever a user logs in, I would SINTER the user's friends list and the current online-users set. Since we're read-heavy and not write-heavy, we would use a single master node for writes. To make it scale, we'd have a cluster of slave nodes replicating from the master, and our application will use a simple round-robin algorithm to spread the SINTER reads across them.
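For illustration, here is a minimal sketch of that scheme using the Python Redis client (the question is about Ruby, but the commands map one-to-one; the online_users key name is my own placeholder):

    import redis

    r = redis.Redis(host="localhost", port=6379, db=0)

    def add_friend(user_id, friend_id):
        r.sadd(f"user:{user_id}:friends_list", friend_id)

    def user_logs_in(user_id):
        r.sadd("online_users", user_id)

    def user_logs_out(user_id):
        r.srem("online_users", user_id)

    def online_friends(user_id):
        # Reads like this one can be pointed at any slave in round-robin fashion.
        return r.sinter("online_users", f"user:{user_id}:friends_list")

    add_friend(1, 2)
    add_friend(1, 3)
    user_logs_in(2)
    user_logs_in(1)
    print(online_friends(1))   # -> {b'2'}: only friend 2 is online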
Josiah Carlson suggested a more refined approach:
When a user X logs on, you intersect their friends set with the online users to get their initial online set, and you keep it with a Y-minute TTL (any time they do anything on the site, you could update the expire time to be Y more minutes into the future).
For every user in that 'online' set, you find their similar 'initial set', and you add X to the set.
Any time a user Z logs out, you scan their friends set, and remove Z from them all (whether they exist or not).
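A rough sketch of that refined approach, again with the Python client; the cached-set key name and the Y-minute TTL are my own placeholders:

    import redis

    r = redis.Redis()
    TTL_SECONDS = 15 * 60   # the "Y minutes" from the quote

    def friends_key(uid):
        return f"user:{uid}:friends_list"

    def online_key(uid):
        return f"user:{uid}:online_friends"   # the cached intersection

    def login(user_id):
        r.sadd("online_users", user_id)
        # Build the cached "which of my friends are online" set once, with a TTL.
        r.sinterstore(online_key(user_id), "online_users", friends_key(user_id))
        r.expire(online_key(user_id), TTL_SECONDS)
        # Tell friends who already hold a cached set that we just came online.
        for friend in r.smembers(online_key(user_id)):
            if r.exists(online_key(int(friend))):
                r.sadd(online_key(int(friend)), user_id)

    def touch(user_id):
        # Call on any site activity to push the cache's expiry Y more minutes out.
        r.expire(online_key(user_id), TTL_SECONDS)

    def logout(user_id):
        r.srem("online_users", user_id)
        r.delete(online_key(user_id))
        for friend in r.smembers(friends_key(user_id)):
            r.srem(online_key(int(friend)), user_id)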
What about XMPP/Jabber? It has built-in support for massive concurrent use and is highly reliable; all you need to do is make an adapter for the user login stuff.
You should consider In-Memory Data Grids; these platforms are targeted at providing exactly this kind of scalability,
and they can usually be deployed easily as a cluster on any hardware or cloud.
See: GigaSpaces XAP, VMware GemFire or Oracle Coherence.
If you're looking for a free edition, XAP provides a Community Edition.

Ways to enhance a trial user's first time experience

I am looking for some ideas on enhancing a trial user's experience when he uses a product for the first time. The product is aimed at a particular domain and has various features/workflows. Experienced users of the product naturally find interesting ways to combine features to get the results they want (somewhat like using an IDE from a programmer's perspective). Trial users get to use all features of the product in a limited fashion (for example, if there is a search functionality, the trial user might see only the top 20 results, or he may be allowed to search only 100 times). My question is: what are the best ways to help a trial user explore/understand the possibilities of the product in the trial period, especially in the first 20-60 minutes before the user gives up on the product?
Edit 1: The product is a desktop app (served via JNLP, so no install required) and as pointed out in the comments, the expectations can be different in this case. That said, many webapps do take a virtual desktop form and so, all suggestions are welcome.
Check out how blinksale.com handles this. It's an invoicing app, but to prevent it from looking too empty for a new account, they show static images in places where you'd actually have content if you used the app. Makes it look less barren at first until you get your own data in.
If you can, avoid feature-limiting a trial. It stops the user from experiencing what the product is ACTUALLY like. It also prevents a user from finding out whether a feature actually works like they want/expect/need it to.
If you have a trial version, and you can, optimise it for first-time use. Focus on / highlight the features that allow the user to quickly and easily get benefits or useful output from the system.
Allow users to export any data they enter into a trial system - and indicate that this is possible/easy. You don't want them to be put off from trying something because of the potential for wasted effort.
Avoid requiring users to do lots of configuration before using a trial. Prepopulate settings based on typical/common/popular settings. You may also want to consider having default settings for different types of usage, e.g. "If you want to see what the system is like for scenario X, use configuration J. If you want to see what the system is like for use case Y, use configuration K.", where J & K are collections of settings best suited to a particular type of usage.
I'll speak from personal experience while evaluating trial applications.
The most annoying trial applications are those which keep popping up nag screens or constantly remind me that I'm using a trial. Trials which act exactly like the real product from the beginning to the end of the trial period are just awesome. Limited features are annoying; the only exception I can think of where limiting makes sense is a rarely used feature which would allow people to exploit the trial (by using this "once-in-a-lifetime" needed feature and uninstalling). If you have, for example, a video editing software trial which puts a "trial" watermark on the output, I'd uninstall it as soon as I noticed it. In my opinion a trial should seamlessly integrate into the user's work-flow, so that once the trial ends they think "Hey, I have been using this awesome program almost every day since I got the trial, I absolutely have to buy it." Sure, some people will exploit it, but in the end you should target the group which will use your product in their daily work-flow instead of one-time users. Even if a user "trials" it twice per year, he will keep coming back to your product and might even buy it after the 2nd or 3rd "one-time use".
(Sorry for the wall of the text and rant)
As for how to improve the first session: I usually find my way around programs easily, but a one-time-only pop-up/screen (or one with a check-box to never show it again) with videos showing off the best features and intended work-flow is quite helpful. Links to sample documents can also help. If your application can present itself (for example, a slide-show about your slide-show program) you could include such a document. People don't like to read long and boring help files, but if you have a designer on your team, you could ask him to make a short, colourful intro PDF. Also, don't throw all the features at the user at the same time. Split information into simple categories, and if the user is interested in one specific category, keep feeding him more specific information. That's why videos are so good: with 3-6 videos of roughly 3-5 minutes each you can tell a lot. Depending on how complex your program is, you could also include a picture showing where specific things are located on the screen.
Just my personal opinion, I have never made a trial myself. Hope it helps.
An interactive walk-through/lab exercise that really highlights the major and exciting offerings of your application.
Example: Yahoo Mail does the same when users opt to use the new mail interface.
There are so many ways you can go with this. I still can't claim to have found the best approach.
However, my plan from the beginning with my online (Silverlight) software was to give away something thousands of people will find useful and can use for free. The free version is pretty well representative of the professional product, with only a few features missing that enhance productivity (I'm working on those professional features now). And then I do have a nag popup that comes up every 5 minutes suggesting that you should buy it. That popup can be dismissed as many times as you want. I know that popup will annoy some people but I suppose that's the trade off. There is no perfect plan. But I don't think the occasional nag popup scares that many people away, especially when it can be dismissed with a single click.
I was inspired by Balsamiq Mockups, which has been hugely successful over the past couple years. My trial/nag popup way of doing things was copied almost exactly from Balsamiq. I honestly don't know if this is the ideal plan, but it has obviously worked for them. By the way, I think another reason for Balsamiq's success is that the demo doesn't have to be downloaded & installed. Since the demo is in Flash, there's a very high conversion rate of users actually trying it and becoming addicted to it.

How many users can an Amazon EC2 instance serve?

The use will be to serve dynamic content from data on S3. You can make up any definition of "normal" you think is normal.
What about small, medium, and large instances?
OK. People want some data to work with, so here it is:
The webservice is about 100kb at start, and uses AJAX, so it doesn't have to reload the whole page much, if at all. When it loads the page, it will send between 20-30 requests to the database (S3) to get small chunks of text (like comments). The average user will stay on the page for 10 min, translating to about 100kb at the outset, and about 400kb more through requests. Assume that hit volume is the same at night and day.
Depends on with what and how you're serving the content, not to mention how often those users will be accessing it, the size and type of the content, etc. There's essentially not one bit of information you've provided that allows us to answer your question in any sort of meaningful way.
As others have said, this might require testing under your exact conditions. Fortunately, if you're willing to go as far as setting up a test version of your server setup, you can spawn instances that simulate users. Create a bunch of these test instances, and run Apache's ab benchmarking tool on them, directing them at your test site. If the instances are within the same availability zone as your test site, you won't be charged for bandwidth, just by the hour for the running instances. Run a test for under an hour, shutting down the test instances afterward, and it will cost you very little to organize this stress test.
As one data point, running the Apache ab tool locally on my small instance, which is serving up a database-heavy Drupal site, it reported the ability of the server to handle 45-60 requests per second. I'm assuming that ab is a reasonable tool for benchmarking, and I might be wrong there, but this is what I'm seeing.
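If you'd rather drive such a probe from code than from ab, a very rough sketch could look like this (the URL and concurrency level are placeholders, and it measures throughput only, not latency percentiles):

    import time
    import urllib.request
    from concurrent.futures import ThreadPoolExecutor

    URL = "http://test-site.example.com/"   # hypothetical test deployment
    CONCURRENCY = 20
    TOTAL_REQUESTS = 500

    def fetch(_):
        with urllib.request.urlopen(URL, timeout=10) as resp:
            resp.read()

    start = time.time()
    with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
        list(pool.map(fetch, range(TOTAL_REQUESTS)))
    elapsed = time.time() - start

    print(f"{TOTAL_REQUESTS / elapsed:.1f} requests/second at concurrency {CONCURRENCY}")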
As a suggestion, not knowing too much about your particular case, I'd move your database to an Elastic Block Store (EBS) volume. S3 is not really intended to host databases, and the latency it has might kill your performance. EBS volumes can easily be snapshotted to S3 for backup, if that's what you're worried about.
One can argue that properly designed, it doesn't matter how many users an instance can support. Ideally, when your instance is saturated, you fire up a new instance to manage the traffic.
Obviously, this grossly complicates the deployment and design.
But beyond that, an EC2 instance is effectively a low-end Linux box (depending on which model you choose).
Let's rephrase the question, how many users do you want to support?
