How to calculate the number of prorated days with the Stripe API? - ruby

I am using Stripe and would like to know how to calculate the number of prorated days.
I want to display something like this:
1 additional seat ($9/month each - prorated for 26 days)
In the API I don't see any field such as prorate_day.
Bolo

Is subscription_proration_date what you are looking for? If you pass it, Stripe will calculate the proration for you.
See more at https://stripe.com/docs/subscriptions/guide
An example of a prorated subscription in Ruby is as follows:
# Set your secret key: remember to change this to your live secret key in production
# See your keys here https://dashboard.stripe.com/account/apikeys
Stripe.api_key = "sk_test_9OkpsFpKa1HDHaZa7e0BeGaO"
proration_date = Time.now.to_i
invoice = Stripe::Invoice.upcoming(:customer => "cus_3R1W8PG2DmsmM9", :subscription => "sub_3R3PlB2YlJe84a",
  :subscription_plan => "premium_monthly", :subscription_proration_date => proration_date)
current_prorations = invoice.lines.data.select { |ii| ii.period.start == proration_date }
cost = 0
current_prorations.each do |p|
  cost += p.amount
end
# Display the cost of these prorations invoice items to the end user,
# and actually do the update when they agree.
# To make sure that the proration is calculated the same as when it was previewed,
# you need to pass in the proration_date parameter
# later...
subscription = Stripe::Subscription.retrieve("sub_3R3PlB2YlJe84a")
subscription.plan = "premium_monthly"
subscription.proration_date = proration_date
subscription.save
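The preview above returns amounts, not a day count, so the "prorated for 26 days" label has to be derived separately. A minimal sketch, assuming the end of the current billing period is available as a Unix timestamp (for example from the subscription's current_period_end, or from the period.end counterpart of the period.start used above; both are assumptions here). The helper name prorated_days and the example dates are made up for illustration:

```ruby
SECONDS_PER_DAY = 24 * 60 * 60

# Hypothetical helper: number of days left in the billing period,
# counted from the proration date. Both arguments are Unix timestamps
# in seconds, like the value produced by Time.now.to_i above.
def prorated_days(proration_date, period_end)
  remaining = period_end - proration_date
  return 0 if remaining <= 0
  # Round a partial day up so the label never understates the period.
  (remaining / SECONDS_PER_DAY.to_f).ceil
end

# Example values (assumed, not taken from a real subscription).
proration_date = Time.utc(2016, 3, 5).to_i
period_end     = Time.utc(2016, 3, 31).to_i
days = prorated_days(proration_date, period_end)
puts "1 additional seat ($9/month each - prorated for #{days} days)"
```

Rounding up is a display choice; Stripe's own proration works on timestamps, so the amount should still come from the invoice preview rather than from this day count.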

Related

How to improve Ruby structure for Shopify Script Performance

I'm using Ruby in the Shopify Script Editor to manage Gift With Purchase (GWP) promotions as a security measure.
The script currently:
Checks whether the customer is logged in as a Professional or is not logged in
Checks whether a minimum amount has been spent in the cart
Ensures that only one "Gift" product is added to the cart
Removes a "Gift" product if the checkout doesn't have a "Discount Code" or the minimum set in the GWP_SETTINGS = [] object.
The problem is that it's generating too many production errors like "Your script exceeded the time limit." and "Your script exceeded the cpu limit."
The current usage is CPU: 5% | Memory: 8%, and it increases sharply every time we add a new GWP promotion array.
Is there a better way to structure this logic so it takes less memory to process the entire order + GWP validation?
Here is the "Line Items" structure:
cart = Input.cart
PRO_TAG = 'professional-tag'
has_pro_tag = cart.customer && cart.customer.tags.include?(PRO_TAG)
GWP_SETTINGS = [
gwp_1 = {
"variant_id" => 98989898989898,
"discount_code" => "DISCOUNT_CODE_1",
"minimum_requirement" => Money.new(cents: 50 * 100),
"user_type" => "consumer"
},
gwp_2 = {
"variant_id" => 97979797979797,
"discount_code" => "DISCOUNT_CODE_1",
"minimum_requirement" => Money.new(cents: 50 * 100),
"user_type" => "consumer"
},
gwp_3 = {
"variant_id" => 96969696969696,
"discount_code" => "DISCOUNT_CODE_1",
"minimum_requirement" => Money.new(cents: 50 * 100),
"user_type" => "consumer"
}
]
def remove_GWP(cart, variant_id)
  cart.line_items.each do |item|
    next if item.variant.id != variant_id
    index = cart.line_items.find_index(item)
    cart.line_items.delete_at(index)
  end
end
def ensure_only_one_GWP_is_added(cart, variant_id)
  cart.line_items.each do |item|
    next if item.variant.id != variant_id
    item.instance_variable_set(:@quantity, 1)
  end
end
GWP_SETTINGS.each do |gwp_item_settings|
  customer_has_discount = cart.discount_code && cart.discount_code.code == gwp_item_settings["discount_code"]
  customer_has_minimum = cart.subtotal_price >= gwp_item_settings["minimum_requirement"]
  gwp_is_for_professional = gwp_item_settings["user_type"] == "professional-tag"
  #UNLOGGED
  if customer_has_discount && customer_has_minimum
    ensure_only_one_GWP_is_added(cart, gwp_item_settings["variant_id"])
  else
    remove_GWP(cart, gwp_item_settings["variant_id"])
  end
  #PRO
  if gwp_is_for_professional && has_pro_tag
    if customer_has_discount && customer_has_minimum
      ensure_only_one_GWP_is_added(cart, gwp_item_settings["variant_id"])
    else
      remove_GWP(cart, gwp_item_settings["variant_id"])
    end
  end
end
Output.cart = cart
You only have 3 settings, but a customer (an order) could have 100+ line items. You know there is only ever 1 customer, 1 order and, for you, 3 GWP settings to use.
Your business logic would be smarter if you looped through the line items only once. Then you have a "this is as fast as I can go, go to town" in terms of your algorithm. You cannot go faster than that.
With things like, "does this customer have an X or Y?", you do those once, not 3 times per line item!
As you check each line item, you can do your special logic for things that might AFFECT that line item.
Basically, this is basic algorithmics. You are doing the most work possible repetitively for no reason, and Shopify is puking because of it.
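The advice above can be sketched as follows. This is only an illustration of the single-pass idea, not a drop-in Shopify script: Variant and LineItem are stand-in structs so the code runs on its own, amounts are plain integer cents instead of Money, and the names GWP_BY_VARIANT and validate_gwp are hypothetical.

```ruby
# Stand-ins for the Shopify objects so the sketch runs on its own;
# in a real script these come from Input.cart.
Variant  = Struct.new(:id)
LineItem = Struct.new(:variant, :quantity)

# Index the settings by variant id once, so each line item needs
# a single hash lookup instead of a scan per setting.
GWP_BY_VARIANT = {
  98989898989898 => { "discount_code" => "DISCOUNT_CODE_1", "minimum_cents" => 5000 },
  97979797979797 => { "discount_code" => "DISCOUNT_CODE_1", "minimum_cents" => 5000 }
}

# Hypothetical helper: returns the line items that survive GWP validation.
def validate_gwp(line_items, discount_code, subtotal_cents)
  # Cart-level checks are evaluated once per setting, outside the item loop.
  eligible = GWP_BY_VARIANT.each_with_object({}) do |(variant_id, setting), memo|
    memo[variant_id] = discount_code == setting["discount_code"] &&
                       subtotal_cents >= setting["minimum_cents"]
  end

  # Single pass over the line items.
  line_items.each_with_object([]) do |item, kept|
    if eligible.key?(item.variant.id)
      next unless eligible[item.variant.id] # gift not earned: drop it
      item.quantity = 1                     # gift earned: cap it at one
    end
    kept << item
  end
end

items = [LineItem.new(Variant.new(1), 3), LineItem.new(Variant.new(98989898989898), 2)]
kept = validate_gwp(items, "DISCOUNT_CODE_1", 6000)
puts kept.map { |item| "#{item.variant.id} x #{item.quantity}" }
```

Building a new list also avoids deleting from the collection while iterating over it, which the remove_GWP helper in the question does.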

How to add a maximum travel time duration for the sum of all routes in VRP Google OR-TOOLS

I am new to programming and used Google OR-tools to create my VRP model. In my current model, I have included a general time window and a capacity constraint per vehicle, creating a capacitated vehicle routing problem with time windows. I followed the OR-tools guides, which contain a maximum travel duration for each vehicle.
However, I want to include a maximum travel duration for the sum of all routes, whereas the maximum travel duration for each vehicle does not matter (so I set it to 100000). Accordingly, I want to create something in the model/solution printer that tells me how many addresses could not be visited because of the constraint on the maximum travel duration for the sum of all routes. From the examples I have seen I think it should be fairly easy, but my programming knowledge is limited, so my attempts had no success. Can anyone help me?
import pandas as pd
import openpyxl
import numpy as np
import math
from random import sample
from ortools.constraint_solver import routing_enums_pb2
from ortools.constraint_solver import pywrapcp
from scipy.spatial.distance import squareform, pdist
from haversine import haversine
#STEP - create data
# import/read excel file
data = pd.read_excel(r'C:\Users\Jean-Paul\Documents\Thesis\OR TOOLS\Data.xlsx', engine = 'openpyxl')
df = pd.DataFrame(data, columns= ['number','lat','lng']) # create dataframe with 10805 addresses + address of the depot
#print (df)
# randomly sample X addresses from the dataframe and their corresponding number/latitude/longitude
df_sample = df.sample(n=100)
#print (df_data)
# read first row of the excel file (= coordinates of the depot)
df_depot = pd.DataFrame(data, columns= ['number','lat','lng']).iloc[0:1]
#print (df_depot)
# combine dataframe of depot and sample into one dataframe
df_data = pd.concat([df_depot, df_sample], ignore_index=True, sort=False)
#print (df_data)
#STEP - create distance matrix data
# determine distance between latitude and longitude
df_data.set_index('number', inplace=True)
matrix_distance = pd.DataFrame(squareform(pdist(df_data, metric=haversine)), index=df_data.index, columns=df_data.index)
matrix_list = np.array(matrix_distance)
#print (matrix_distance) # create table of distances between addresses including headers
#print (matrix_list) # converting table to list of lists and exclude headers
#STEP - create time matrix data
travel_time = matrix_list / 15 * 60 # divide distance by travel speed 15 km/h and multiply by 60 minutes
#print (travel_time) # converting distance matrix to travel time matrix
#STEP - create time window data
# create list for each sample - couriers have to visit this address within 0-X minutes of time using a list of lists
window_range = []
for i in range(len(df_data)):
    list = [0, 240]
    window_range.append(list) # create list of list with a time window range for each address
#print (window_range)
#STEP - create demand data
# create list for each sample - all addresses demand 1 parcel except the depot
demand_range = []
for i in range(len(df_data.iloc[0:1])):
    list = 0
    demand_range.append(list)
for j in range(len(df_data.iloc[1:])):
    list2 = 1
    demand_range.append(list2)
#print (demand_range)
#STEP - create fleet size data # amount of vehicles in the fleet
fleet_size = 6
#print (fleet_size)
#STEP - create capacity data for each vehicle
fleet_capacity = []
for i in range(fleet_size): # capacity per vehicle
    list = 20
    fleet_capacity.append(list)
#print (fleet_capacity)
#STEP - create data model that stores all data for the problem
def create_data_model():
    data = {}
    data['time_matrix'] = travel_time
    data['time_windows'] = window_range
    data['num_vehicles'] = fleet_size
    data['depot'] = 0 # index of the depot
    data['demands'] = demand_range
    data['vehicle_capacities'] = fleet_capacity
    return data
#STEP - creating the solution printer
def print_solution(data, manager, routing, solution):
    """Prints solution on console."""
    print(f'Objective: {solution.ObjectiveValue()}')
    time_dimension = routing.GetDimensionOrDie('Time')
    total_time = 0
    for vehicle_id in range(data['num_vehicles']):
        index = routing.Start(vehicle_id)
        plan_output = 'Route for vehicle {}:\n'.format(vehicle_id)
        while not routing.IsEnd(index):
            time_var = time_dimension.CumulVar(index)
            plan_output += '{0} Time({1},{2}) -> '.format(
                manager.IndexToNode(index), solution.Min(time_var),
                solution.Max(time_var))
            index = solution.Value(routing.NextVar(index))
        time_var = time_dimension.CumulVar(index)
        plan_output += '{0} Time({1},{2})\n'.format(manager.IndexToNode(index),
                                                    solution.Min(time_var),
                                                    solution.Max(time_var))
        plan_output += 'Time of the route: {}min\n'.format(
            solution.Min(time_var))
        print(plan_output)
        total_time += solution.Min(time_var)
    print('Total time of all routes: {}min'.format(total_time))
#STEP - create the VRP solver
def main():
    # instantiate the data problem
    data = create_data_model()
    # create the routing index manager
    manager = pywrapcp.RoutingIndexManager(len(data['time_matrix']),
                                           data['num_vehicles'], data['depot'])
    # create routing model
    routing = pywrapcp.RoutingModel(manager)
    #STEP - create demand callback and dimension for capacity
    # create and register a transit callback
    def demand_callback(from_index):
        """Returns the demand of the node."""
        # convert from routing variable Index to demands NodeIndex
        from_node = manager.IndexToNode(from_index)
        return data['demands'][from_node]
    demand_callback_index = routing.RegisterUnaryTransitCallback(
        demand_callback)
    routing.AddDimensionWithVehicleCapacity(
        demand_callback_index,
        0, # null capacity slack
        data['vehicle_capacities'], # vehicle maximum capacities
        True, # start cumul to zero
        'Capacity')
    #STEP - create time callback
    # create and register a transit callback
    def time_callback(from_index, to_index):
        """Returns the travel time between the two nodes."""
        # convert from routing variable Index to time matrix NodeIndex
        from_node = manager.IndexToNode(from_index)
        to_node = manager.IndexToNode(to_index)
        return data['time_matrix'][from_node][to_node]
    transit_callback_index = routing.RegisterTransitCallback(time_callback)
    # define cost of each Arc (costs in terms of travel time)
    routing.SetArcCostEvaluatorOfAllVehicles(transit_callback_index)
    # STEP - create a dimension for the travel time (TIMEWINDOW) - dimension keeps track of quantities that accumulate over a vehicle's route
    # add time windows constraint
    time = 'Time'
    routing.AddDimension(
        transit_callback_index,
        2, # allow waiting time (does not have an influence in this model)
        100000, # maximum total route length in minutes per vehicle (does not have an influence because of capacity constraint)
        False, # do not force start cumul to zero
        time)
    time_dimension = routing.GetDimensionOrDie(time)
    # add time window constraints for each location except depot
    for location_idx, time_window in enumerate(data['time_windows']):
        if location_idx == data['depot']:
            continue
        index = manager.NodeToIndex(location_idx)
        time_dimension.CumulVar(index).SetRange(time_window[0], time_window[1])
    # add time window constraint for each vehicle start node
    depot_idx = data['depot']
    for vehicle_id in range(data['num_vehicles']):
        index = routing.Start(vehicle_id)
        time_dimension.CumulVar(index).SetRange(
            data['time_windows'][depot_idx][0],
            data['time_windows'][depot_idx][1])
    #STEP - instantiate route start and end times to produce feasible times
    for i in range(data['num_vehicles']):
        routing.AddVariableMinimizedByFinalizer(
            time_dimension.CumulVar(routing.Start(i)))
        routing.AddVariableMinimizedByFinalizer(
            time_dimension.CumulVar(routing.End(i)))
    #STEP - setting default search parameters and a heuristic method for finding the first solution
    search_parameters = pywrapcp.DefaultRoutingSearchParameters()
    search_parameters.first_solution_strategy = (
        routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC)
    #STEP - solve the problem with the search parameters and print solution
    solution = routing.SolveWithParameters(search_parameters)
    if solution:
        print_solution(data, manager, routing, solution)
if __name__ == '__main__':
    main()
See @Mizux's answer, which goes under the hood of the solver to build a summation cost over all vehicle route lengths:
https://stackoverflow.com/a/68756570/13773745

RUBY - Currency Exchange Rate Calculator

I'm trying to create a currency converter in Ruby which will calculate the exchange rate between two currencies on a given date.
I have a data file containing test data (date, currency from, currency to). The test data is in EUR, so all rates are converted to EUR and then to the target currency.
So far I have 3 files (Exchange.rb, Test_Exchange.rb, rates.json):
Exchange.rb:
require 'json'
require 'date'
module Exchange
  # Return the exchange rate between from_currency and to_currency on date as a float.
  # Raises an exception if unable to calculate requested rate.
  # Raises an exception if there is no rate for the date provided.
  @rates = JSON.parse(File.read('rates.json'))
  def self.rate(date, from_currency, to_currency)
    # TODO: calculate and return rate
    rates = @rates[date] # get rates of given day
    from_to_eur = 1.0 / rates[from_currency] # convert to EUR
    from_to_eur * rates[to_currency] # convert to target currency
  end
end
Test_Exchange.rb:
require_relative 'Exchange.rb'
require 'date'
target_date = Date.new(2018,12,10).to_s
puts "USD to GBP: #{Exchange.rate(target_date, 'USD', 'GBP')}"
puts "USD to JPY: #{Exchange.rate(target_date, 'PLN', 'CHF')}"
puts "DKK to CAD: #{Exchange.rate(target_date, 'PLN', 'CHF')}"
rates.json:
{
"2018-12-11": {
"USD": 1.1379,
"JPY": 128.75,
"BGN": 1.9558,
"CZK": 25.845,
"DKK": 7.4641,
"GBP": 0.90228,
"HUF": 323.4,
"CHF": 1.1248,
"PLN": 4.2983
},
"2018-12-10": {
"USD": 1.1425,
"JPY": 128.79,
"BGN": 1.9558,
"CZK": 25.866,
"DKK": 7.4639,
"CAD": 1.5218,
"GBP": 0.90245,
"HUF": 323.15,
"PLN": 4.2921,
"CHF": 1.1295,
"ISK": 140.0,
"HRK": 7.387,
"RUB": 75.8985
},
"2018-12-05": {
"USD": 1.1354,
"JPY": 128.31,
"BGN": 1.9558,
"CZK": 25.886,
"DKK": 7.463,
"GBP": 0.88885,
"HUF": 323.49,
"PLN": 4.2826,
"RON": 4.6528,
"SEK": 10.1753,
"CHF": 1.1328,
"HRK": 7.399,
"RUB": 75.8385,
"CAD": 1.5076
}
}
I'm not sure what to add in the Exchange.rb file to allow the user to input a date and the two currencies to compare exchange rates.
Running Exchange.rb does nothing. I'm guessing it wants a date and currency parameters input?
Running Test_Exchange.rb works because the date and currencies are bootstrapped in.
I found almost the same question posted here a couple years ago, but the thread is now closed, and the solution was incomplete. Hoping someone can help me!
RUNNING EDIT:
Exchange.rb:
require 'json'
require 'date'
module Exchange
  # Return the exchange rate between from_currency and to_currency on date as a float.
  # Raises an exception if unable to calculate requested rate.
  # Raises an exception if there is no rate for the date provided.
  @rates = JSON.parse(File.read('rates.json'))
  #Grab Date and Currencies from User
  puts "Please enter a Date (YYYY-MM-DD)"
  input_date = gets.chomp
  puts "The Date you entered is: #{input_date}"
  puts "Please enter a 3-letter Currency Code (ABC):"
  input_curr_1 = gets.chomp
  puts "The 1st Currency you entered is: #{input_curr_1}"
  puts "Please enter a 2nd 3-letter Currency Code (XYZ):"
  input_curr_2 = gets.chomp
  puts "The 2nd Currency you entered is: #{input_curr_2}"
  def self.rate(input_date, input_curr_1, input_curr_2)
    # TODO: calculate and return rate
    rates = @rates[input_date] # get rates of given day
    from_to_eur = 1.0 / rates[input_curr_1] # convert to EUR
    from_to_eur * rates[input_curr_2] # convert to target currency
  end
end
I think I have to use a put and a get to capture the user date input? Then the same for each of the two currencies? Of course, my syntax for the date stuff is all wrong..
So I managed to use the gets and put functions. Now all that's left is somehow calling the rates.json file and comparing the user inputs to the existing data..
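For the remaining step, looking the inputs up in the rates data, a minimal sketch might look like this. It assumes the same rates.json layout as above (rates quoted per 1 EUR, keyed by date string); the data is inlined as a small excerpt only so the example runs on its own, and the RATES constant and the ArgumentError messages are choices made for this illustration:

```ruby
require 'json'

module Exchange
  # Inline excerpt so the sketch is self-contained; with the real file this
  # would be JSON.parse(File.read('rates.json')).
  RATES = JSON.parse(<<~DATA)
    {
      "2018-12-10": { "USD": 1.1425, "GBP": 0.90245, "PLN": 4.2921, "CHF": 1.1295 }
    }
  DATA

  # Rates are quoted per 1 EUR, so go from_currency -> EUR -> to_currency.
  # Raises ArgumentError if the date or either currency is missing.
  def self.rate(date, from_currency, to_currency)
    rates = RATES.fetch(date) { raise ArgumentError, "no rates for #{date}" }
    from  = rates.fetch(from_currency) { raise ArgumentError, "no #{from_currency} rate on #{date}" }
    to    = rates.fetch(to_currency) { raise ArgumentError, "no #{to_currency} rate on #{date}" }
    to / from
  end
end

puts "USD to GBP: #{Exchange.rate('2018-12-10', 'USD', 'GBP')}"
```

The gets/puts prompts would then sit outside the module (for example in Test_Exchange.rb), passing input_date, input_curr_1 and input_curr_2 to Exchange.rate. Note that EUR itself is not a key in the data, so converting to or from EUR would need a special case.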

How to retrieve entire cost for a SoftLayer machine, including any extra costs such as bandwidth overages?

I've been retrieving monthly invoice cost information on our SoftLayer accounts for quite some time using the Ruby softlayer gem. However, there is a concern in the team that we may be missing certain costs, such as any overages on network utilization. I'd like to have some peace of mind that what I'm doing is correctly gathering all costs and that we are not missing anything. Here is my code/query:
account = SoftLayer::Service.new("SoftLayer_Account",:username => user, :api_key => api_key, :timeout => 999999999)
softlayer_client = SoftLayer::Client.new(:username => user, :api_key => api_key, :timeout => 999999999)
billing_invoice_service = softlayer_client.service_named("Billing_Invoice")
object_filter = SoftLayer::ObjectFilter.new
object_filter.set_criteria_for_key_path('invoices.createDate', 'operation' => 'betweenDate', 'options' => [{'name' => 'startDate', 'value' => ["#{startTime}"]}, {'name' => 'endDate', 'value' => ["#{endTime}"]}])
# Set startDate and endDate around the beginning of the month in search of the "Recurring" invoice that should appear on the 1st.
invoices = account.result_limit(0,10000).object_filter(object_filter).object_mask("mask[id,typeCode,itemCount,invoiceTotalAmount,closedDate,createDate]").getInvoices
invoices.each do | invoice |
if invoice["typeCode"] == "RECURRING"
invoice_reference = billing_invoice_service.object_with_id(invoice["id"])
invoice_object = invoice_reference.object_mask("mask[itemCount]").getObject
billing_items_count = invoice_object["itemCount"]
billing_machines_map = Hash.new
all_billing_items = Array.new
# Search for billing items containing a hostName value.
# The corresponding billing item ID will become the key of a new hash.
# Child costs will be added to the existing costs.
billing_items_retrieval_operation = proc {
for i in 0..(billing_items_count/8000.0).ceil - 1
billing_items = invoice_reference.result_limit(i*8000, 8000).object_mask("mask[id,resourceTableId,billingItemId,parentId,categoryCode,hostName,domainName,hourlyRecurringFee,laborFee,oneTimeFee,recurringFee,recurringTaxAmount,setupFee,setupTaxAmount,location[name]]").getItems()
billing_items.each do | billing_item |
if billing_item["hostName"]
billing_machines_map[billing_item["id"]] = billing_item
end
end
all_billing_items.concat(billing_items)
end
}
# Look for items with parentIds or resourceTableIds.
# Both Ids represent a "parent" of the item.
# Give higher importance to parentId.
billing_items_retrieval_callback = proc {
cost_of_billing_items_without_parent = BigDecimal.new("0.00")
all_billing_items.each do | billing_item |
if billing_item["parentId"] != ""
parent_billing_machine = billing_machines_map[billing_item["parentId"]]
if parent_billing_machine
parent_billing_machine["recurringFee"] = (BigDecimal.new(parent_billing_machine["recurringFee"]) + BigDecimal.new(billing_item["recurringFee"])).to_s('F')
parent_billing_machine["setupFee"] = (BigDecimal.new(parent_billing_machine["setupFee"]) + BigDecimal.new(billing_item["setupFee"])).to_s('F')
parent_billing_machine["laborFee"] = (BigDecimal.new(parent_billing_machine["laborFee"]) + BigDecimal.new(billing_item["laborFee"])).to_s('F')
parent_billing_machine["oneTimeFee"] = (BigDecimal.new(parent_billing_machine["oneTimeFee"]) + BigDecimal.new(billing_item["oneTimeFee"])).to_s('F')
end
elsif billing_item["resourceTableId"] != ""
parent_billing_machine = billing_machines_map[billing_item["resourceTableId"]]
if parent_billing_machine
parent_billing_machine["recurringFee"] = (BigDecimal.new(parent_billing_machine["recurringFee"]) + BigDecimal.new(billing_item["recurringFee"])).to_s('F')
parent_billing_machine["setupFee"] = (BigDecimal.new(parent_billing_machine["setupFee"]) + BigDecimal.new(billing_item["setupFee"])).to_s('F')
parent_billing_machine["laborFee"] = (BigDecimal.new(parent_billing_machine["laborFee"]) + BigDecimal.new(billing_item["laborFee"])).to_s('F')
parent_billing_machine["oneTimeFee"] = (BigDecimal.new(parent_billing_machine["oneTimeFee"]) + BigDecimal.new(billing_item["oneTimeFee"])).to_s('F')
end
else
cost_of_billing_items_without_parent = (BigDecimal.new(cost_of_billing_items_without_parent) + BigDecimal.new(billing_item["recurringFee"])).to_s('F')
cost_of_billing_items_without_parent = (BigDecimal.new(cost_of_billing_items_without_parent) + BigDecimal.new(billing_item["setupFee"])).to_s('F')
cost_of_billing_items_without_parent = (BigDecimal.new(cost_of_billing_items_without_parent) + BigDecimal.new(billing_item["laborFee"])).to_s('F')
cost_of_billing_items_without_parent = (BigDecimal.new(cost_of_billing_items_without_parent) + BigDecimal.new(billing_item["oneTimeFee"])).to_s('F')
end
end
pp "INVOICE: Total cost of devices for account without a parent is:"
pp cost_of_billing_items_without_parent
end
end
end
After the above I make calls to getVirtualGuests and getHardware to get some additional meta information for each machine (I tie them together based on billingItem.id). Example:
billingItemId = billing_machine["billingItemId"]
account_service = softlayer_client.service_named("Account")
filter = SoftLayer::ObjectFilter.new {|f| f.accept("virtualGuests.billingItem.id").when_it is(billingItemId)}
virtual_guests_array = account_service.object_filter(filter).object_mask("mask[id, hostname, datacenter[name], billingItem[orderItem[order[userRecord[username]]]], tagReferences[tagId, tag[name]], primaryIpAddress, primaryBackendIpAddress]").getVirtualGuests()
As you can see, I don't make any calls to capture bandwidth overage charges. I have printed out the various "category" values I get from the above query, but I am not seeing anything specific to network utilization (it's possible there are no extra network utilization costs, but I am not certain).
Thank you.
Any extra costs such as bandwidth overages will be included in the billing item from the server. So you don't need to make any other call to the api to get it.

Calculate running/cumulative cost of EC2 spot instance

I often run spot instances on EC2 (for Hadoop task jobs, temporary nodes, etc.) Some of these are long-running spot instances.
It's fairly easy to calculate the cost for on-demand or reserved EC2 instances - but how do I calculate the cost incurred for a specific node (or nodes) that are running as spot instances?
I am aware that the cost for a spot instance changes every hour depending on market rate - so is there any way to calculate the cumulative total cost for a running spot instance? Through an API or otherwise?
OK I found a way to do this in the Boto library. This code is not perfect - Boto doesn't seem to return the exact time range, but it does get the historic spot prices more or less within a range. The following code seems to work quite well. If anyone can improve on it, that would be great.
import boto, datetime, time
# Enter your AWS credentials
aws_key = "YOUR_AWS_KEY"
aws_secret = "YOUR_AWS_SECRET"
# Details of instance & time range you want to find spot prices for
instanceType = 'm1.xlarge'
startTime = '2012-07-01T21:14:45.000Z'
endTime = '2012-07-30T23:14:45.000Z'
aZ = 'us-east-1c'
# Some other variables
maxCost = 0.0
minTime = float("inf")
maxTime = 0.0
totalPrice = 0.0
oldTimee = 0.0
# Connect to EC2
conn = boto.connect_ec2(aws_key, aws_secret)
# Get prices for instance, AZ and time range
prices = conn.get_spot_price_history(instance_type=instanceType,
    start_time=startTime, end_time=endTime, availability_zone=aZ)
# Output the prices
print "Historic prices"
for price in prices:
    timee = time.mktime(datetime.datetime.strptime(price.timestamp,
        "%Y-%m-%dT%H:%M:%S.000Z" ).timetuple())
    print "\t" + price.timestamp + " => " + str(price.price)
    # Get max and min time from results
    if timee < minTime:
        minTime = timee
    if timee > maxTime:
        maxTime = timee
    # Get the max cost
    if price.price > maxCost:
        maxCost = price.price
    # Calculate total price
    if not (oldTimee == 0):
        totalPrice += (price.price * abs(timee - oldTimee)) / 3600
    oldTimee = timee
# Difference b/w first and last returned times
timeDiff = maxTime - minTime
# Output aggregate, average and max results
print "For: one %s in %s" % (instanceType, aZ)
print "From: %s to %s" % (startTime, endTime)
print "\tTotal cost = $" + str(totalPrice)
print "\tMax hourly cost = $" + str(maxCost)
print "\tAvg hourly cost = $" + str(totalPrice * 3600/ timeDiff)
I've rewritten Suman's solution to work with boto3. Make sure to pass UTC datetimes with the tz set:
def get_spot_instance_pricing(ec2, instance_type, start_time, end_time, zone):
    result = ec2.describe_spot_price_history(InstanceTypes=[instance_type], StartTime=start_time, EndTime=end_time, AvailabilityZone=zone)
    assert 'NextToken' not in result or result['NextToken'] == ''
    total_cost = 0.0
    total_seconds = (end_time - start_time).total_seconds()
    total_hours = total_seconds / (60*60)
    computed_seconds = 0
    last_time = end_time
    for price in result["SpotPriceHistory"]:
        price["SpotPrice"] = float(price["SpotPrice"])
        available_seconds = (last_time - price["Timestamp"]).total_seconds()
        remaining_seconds = total_seconds - computed_seconds
        used_seconds = min(available_seconds, remaining_seconds)
        total_cost += (price["SpotPrice"] / (60 * 60)) * used_seconds
        computed_seconds += used_seconds
        last_time = price["Timestamp"]
    # Difference b/w first and last returned times
    avg_hourly_cost = total_cost / total_hours
    return avg_hourly_cost, total_cost, total_hours
You can subscribe to the spot instance data feed to get charges for your running instances dumped to an S3 bucket. Install the ec2 toolset and then run:
ec2-create-spot-datafeed-subscription -b bucket-to-dump-in
Note: you can have only one data feed subscription for your entire account.
In about an hour you should start seeing gzipped, tab-delimited files show up in the bucket that look something like this:
#Version: 1.0
#Fields: Timestamp UsageType Operation InstanceID MyBidID MyMaxPrice MarketPrice Charge Version
2013-05-20 14:21:07 UTC SpotUsage:m1.xlarge RunInstances:S0012 i-1870f27d sir-b398b235 0.219 USD 0.052 USD 0.052 USD 1
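Once the files are downloaded, the Charge column can be totalled per instance. A rough sketch in Ruby (to match the rest of this page), assuming the file has already been gunzipped, that fields are separated by tabs in the order given by the #Fields: header, and that each charge looks like "0.052 USD". The helper name spot_charges_by_instance is hypothetical, and the second data row is made up for illustration:

```ruby
# Hypothetical helper: sums the Charge column per instance id.
def spot_charges_by_instance(feed_text)
  totals = Hash.new(0.0)
  feed_text.each_line do |line|
    line = line.chomp
    next if line.empty? || line.start_with?("#") # skip header lines
    fields = line.split("\t")
    instance_id = fields[3]      # InstanceID column
    charge = fields[7].to_f      # Charge column; "0.052 USD".to_f => 0.052
    totals[instance_id] += charge
  end
  totals
end

sample = [
  "#Version: 1.0",
  "#Fields: Timestamp UsageType Operation InstanceID MyBidID MyMaxPrice MarketPrice Charge Version",
  "2013-05-20 14:21:07 UTC\tSpotUsage:m1.xlarge\tRunInstances:S0012\ti-1870f27d\tsir-b398b235\t0.219 USD\t0.052 USD\t0.052 USD\t1",
  # Made-up second row, one hour later, for the same instance.
  "2013-05-20 15:21:07 UTC\tSpotUsage:m1.xlarge\tRunInstances:S0012\ti-1870f27d\tsir-b398b235\t0.219 USD\t0.060 USD\t0.060 USD\t1"
].join("\n")

spot_charges_by_instance(sample).each do |instance_id, total|
  puts format("%s: $%.3f", instance_id, total)
end
```

Floats are used for brevity; for billing-grade totals BigDecimal would be the safer choice.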
I have recently developed a small python library that calculates the cost of a single EMR cluster, or for a list of clusters (given a period of days).
It takes into account Spot instances and Task nodes as well (that may go up and down while the cluster is still running).
In order to calculate the cost I use the bid price, which (in many cases) might not be the exact price that you end up paying for the instance.
Depending on your bidding policy however, this price can be accurate enough.
You can find the code here: https://github.com/memosstilvi/emr-cost-calculator
