How do I change the decimal place of the y-axis ticks on a seaborn barplot?

I am having trouble figuring out how to change the tick labels on my y-axis. For example, the highest tick reads 1.75 when it should read 17.5.
sns.barplot(data = tn_movie_budgets_df, x = 'release_date', y = 'worldwide_gross', order=['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
plt.xlabel('Month of Release')
plt.ylabel('Total Worldwide Gross in billions')
plt.title('Total Worldwide Gross VS release Month')
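One possible approach, sketched under the assumption that the 1.75 appears together with a 1e10 offset at the top of the axis (so the underlying value is 1.75e10, i.e. 17.5 billion), is to format the tick labels yourself instead of relying on matplotlib's automatic scale factor. The helper name `billions` is made up for illustration. `FuncFormatter` and `set_major_formatter` are standard matplotlib calls; they are shown only in comments here so the function can be run without a figure.

```python
def billions(value, pos=None):
    """Format a raw tick value as billions with one decimal place.

    `pos` is the tick position that matplotlib passes to formatter
    callbacks; it is not needed for this label.
    """
    return f"{value / 1e9:.1f}"


# Hypothetical wiring, assuming `ax` is the Axes returned by sns.barplot(...):
#   from matplotlib.ticker import FuncFormatter
#   ax.yaxis.set_major_formatter(FuncFormatter(billions))

print(billions(1.75e10))  # 17.5
```

With labels produced this way, the y-axis label 'Total Worldwide Gross in billions' matches the numbers shown on the ticks.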

Related

Joint plot with regression line and classes by hue

I have the following dataframe
df = pd.DataFrame({
'Product': ['AA', 'AA', 'BB', 'BB', 'AA', 'AA', 'BB', 'BB'],
'Sales': [ 200, 100, 400, 100, 300, 100, 200, 500],
'Price': [ 5, 3, 3, 6, 4, 7, 4, 1]})
I would like to plot the regression line of the overall data, but also the scatter points by hue (in this case by Product) in the same chart.
I can get the regression line by:
g = sns.jointplot(y='Sales', x='Price', data=df, kind='reg', scatter = False)
And I can get the scatter by:
g = sns.scatterplot(y='Sales', x='Price', data=df, hue='Product')
But these are two different charts. Is there any way I can combine the two commands?
You have to tell the scatterplot in which axis object you want to plot. Options for a seaborn jointplot are the main plot area ax_joint or the two minor plot areas ax_marg_x and ax_marg_y.
from matplotlib import pyplot as plt
import seaborn as sns
import pandas as pd
df = pd.DataFrame({
'Product': ['AA', 'AA', 'BB', 'BB', 'AA', 'AA', 'BB', 'BB'],
'Sales': [ 200, 100, 400, 100, 300, 100, 200, 500],
'Price': [ 5, 3, 3, 6, 4, 7, 4, 1]})
g = sns.jointplot(y='Sales', x='Price', data=df, kind='reg', scatter = False)
sns.scatterplot(y='Sales', x='Price', data=df, hue="Product", ax=g.ax_joint)
plt.show()
To extend on Mr. T's answer, you could also do something like this to keep the kde marginal plots of the jointplot:
from matplotlib import pyplot as plt
import seaborn as sns
import pandas as pd
df = pd.DataFrame({
'Product': ['AA', 'AA', 'BB', 'BB', 'AA', 'AA', 'BB', 'BB'],
'Sales': [ 200, 100, 400, 100, 300, 100, 200, 500],
'Price': [ 5, 3, 3, 6, 4, 7, 4, 1]})
g = sns.jointplot(y='Sales', x='Price', data=df, hue="Product", alpha=0.5, xlim=(0.5,7.5), ylim=(-50, 550))
g1 = sns.regplot(y='Sales', x='Price', data=df, scatter=False, ax=g.ax_joint)
regline = g1.get_lines()[0]
regline.set_color('red')
regline.set_zorder(5)
plt.show()

Any form for a year-to-date or rolling sum function in Power Query?

I'm quite new to Power Query. I have a date column called MyDate (format dd/mm/yy) and another column called TotalSales. Is there any way of obtaining a column TotalSalesYTD with the year-to-date sum of TotalSales for each row? I've seen that you can do this in Power Pivot or Power BI, but I didn't find anything for Power Query.
Alternatively, is there a way of creating a variable TotalSales12M, for the rolling sum of the last 12 months of TotalSales?
I wasn't able to test this properly, but the following code gave me your expected result:
let
    initialTable = Table.FromRows({
        {#date(2020, 5, 1), 150},
        {#date(2020, 4, 1), 20},
        {#date(2020, 3, 1), 54},
        {#date(2020, 2, 1), 84},
        {#date(2020, 1, 1), 564},
        {#date(2019, 12, 1), 54},
        {#date(2019, 11, 1), 678},
        {#date(2019, 10, 1), 885},
        {#date(2019, 9, 1), 54},
        {#date(2019, 8, 1), 98},
        {#date(2019, 7, 1), 654},
        {#date(2019, 6, 1), 45},
        {#date(2019, 5, 1), 64},
        {#date(2019, 4, 1), 68},
        {#date(2019, 3, 1), 52},
        {#date(2019, 2, 1), 549},
        {#date(2019, 1, 1), 463},
        {#date(2018, 12, 1), 65},
        {#date(2018, 11, 1), 45},
        {#date(2018, 10, 1), 68},
        {#date(2018, 9, 1), 65},
        {#date(2018, 8, 1), 564},
        {#date(2018, 7, 1), 16},
        {#date(2018, 6, 1), 469},
        {#date(2018, 5, 1), 4}
    }, type table [MyDate = date, TotalSales = Int64.Type]),
    ListCumulativeSum = (numbers as list) as list =>
        let
            accumulator = (listState as list, toAdd as nullable number) as list =>
                let
                    previousTotal = List.Last(listState, 0),
                    combined = listState & {List.Sum({previousTotal, toAdd})}
                in combined,
            accumulated = List.Accumulate(numbers, {}, accumulator)
        in accumulated,
    TableCumulativeSum = (someTable as table, columnToSum as text, newColumnName as text) as table =>
        let
            values = Table.Column(someTable, columnToSum),
            cumulative = ListCumulativeSum(values),
            columns = Table.ToColumns(someTable) & {cumulative},
            toTable = Table.FromColumns(columns, Table.ColumnNames(someTable) & {newColumnName})
        in toTable,
    yearToDateColumn =
        let
            groupKey = Table.AddColumn(initialTable, "$groupKey", each Date.Year([MyDate]), Int64.Type),
            grouped = Table.Group(groupKey, "$groupKey", {"toCombine", each
                let
                    sorted = Table.Sort(_, {"MyDate", Order.Ascending}),
                    cumulative = TableCumulativeSum(sorted, "TotalSales", "TotalSalesYTD")
                in cumulative
            }),
            combined = Table.Combine(grouped[toCombine]),
            removeGroupKey = Table.RemoveColumns(combined, "$groupKey")
        in removeGroupKey,
    rolling = Table.AddColumn(yearToDateColumn, "TotalSales12M", each
        let
            inclusiveEnd = [MyDate],
            exclusiveStart = Date.AddMonths(inclusiveEnd, -12),
            filtered = Table.SelectRows(yearToDateColumn, each [MyDate] > exclusiveStart and [MyDate] <= inclusiveEnd),
            sum = List.Sum(filtered[TotalSales])
        in sum
    ),
    sortedRows = Table.Sort(rolling, {{"MyDate", Order.Descending}})
in
    sortedRows
There might be more efficient ways to do what this code does, but if the size of your data is relatively small, then this approach should be okay.
For the year-to-date cumulative total, the data is grouped by year, each group is sorted in ascending date order, and then a running total column is added.
For the rolling 12-month total, each row gets its own window (the 12 months ending at that row's date, start exclusive and end inclusive), and the sales inside that window are summed. The totaling is a bit inefficient (since all rows are re-processed for every window, as opposed to only those which have entered or left it), but you might not notice it.
Table.Range could have been used instead of Table.SelectRows when creating the 12-month windows, but I figured Table.SelectRows makes fewer assumptions about the input data (e.g. whether it's sorted, whether any months are missing) and is therefore safer and more robust.
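To make the two calculations easier to follow outside of M, here is a minimal Python sketch of the same logic. It is an illustration, not a translation of the query: the function names are invented, it assumes one row per date, and `add_months` assumes the day of the month exists in the target month (true for the first-of-month dates in the sample data).

```python
from datetime import date


def add_months(d, months):
    """Shift a date by whole months (assumes d.day exists in the target month)."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, d.day)


def year_to_date(rows):
    """Map each date to the running sales total within its calendar year."""
    running = {}  # year -> total so far
    totals = {}   # date -> year-to-date total
    for d, sales in sorted(rows):
        running[d.year] = running.get(d.year, 0) + sales
        totals[d] = running[d.year]
    return totals


def rolling_12m(rows):
    """Map each date d to the sales summed over the window (d - 12 months, d]."""
    result = {}
    for d, _ in rows:
        start = add_months(d, -12)  # exclusive start, inclusive end
        result[d] = sum(s for other, s in rows if start < other <= d)
    return result


rows = [
    (date(2020, 2, 1), 84),
    (date(2020, 1, 1), 564),
    (date(2019, 12, 1), 54),
    (date(2019, 2, 1), 549),
    (date(2019, 1, 1), 463),
]
ytd = year_to_date(rows)
r12 = rolling_12m(rows)
print(ytd[date(2020, 2, 1)], r12[date(2020, 2, 1)])  # 648 702
```

Like the query, `rolling_12m` rescans every row for each window, so it is quadratic in the number of rows; that is fine for small tables.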

How to change jqplot series display order/logic

Can someone please advise how I can make jqplot charts display series a little differently (I am referring to the logic/order, not the design)?
For example, let's say I have something like this:
var series1 = [1, 2, 3];
var series2 = [4, 5, 6];
var series3 = [7, 8, 9];
var ticks = ['JAN','FEB','MAR'];
The standard plotting will place the first value of every series array in 'JAN', the second in 'FEB' and the third in 'MAR'.
I want to display the entire series1 in 'JAN', series2 in 'FEB' and series3 in 'MAR'.
The arrays may not even be of the same size (series2 and series3 may have more or less elements than series1).
Any advice is appreciated,
Thank you!
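Since a chart of this kind places the i-th value of every series at the i-th tick, one idea is to transpose the data before plotting: the k-th plotted series then holds the k-th element of each original list, so the 'JAN' tick receives all of series1, 'FEB' all of series2, and so on. The sketch below uses Python only to illustrate the reshaping (the same idea applies in JavaScript), and the name `regroup_by_tick` is made up. Whether jqplot draws the padded gaps the way you want has not been verified here.

```python
from itertools import zip_longest


def regroup_by_tick(*series):
    """Transpose the input lists, padding shorter ones with None.

    Row k of the result holds the k-th element of every input list, so
    plotting the rows as series puts each input list on a single tick.
    """
    return [list(column) for column in zip_longest(*series, fillvalue=None)]


series1 = [1, 2, 3]
series2 = [4, 5, 6, 7]  # lists may differ in length
series3 = [8, 9]

print(regroup_by_tick(series1, series2, series3))
# [[1, 4, 8], [2, 5, 9], [3, 6, None], [None, 7, None]]
```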

Different Data Lines on Kendo UI Graphs

I have a Kendo UI line chart.
This has x axis intervals of 28 days and data plotted against this every 28 days.
I want to know if it's possible to add a second line, but with data plotted daily rather than every 28 days.
Thanks
Yes, you can! This type of series is called scatterLine: for each series you provide an array of [x, y] pairs.
If you provide x values 0, 28, 56... for the first series and 0, 1, 2... for the second, you get what you want.
Example:
$("#chart").kendoChart({
series: [
{ type: "scatterLine", data: [[0, 4], [28, 2], [56, 3]] },
{ type: "scatterLine", data: [[1, 2], [2, 3]] }
]
});
Check it here: http://jsfiddle.net/U7SvD/
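If the values arrive as plain lists, the [x, y] pairs for each series can be generated rather than typed by hand. This is a small Python sketch of that data preparation only; `to_pairs` is a made-up helper, the sample values are invented, and it assumes evenly spaced points starting at x = 0.

```python
import json


def to_pairs(values, step):
    """Pair each value with an x position `step` days apart, starting at 0."""
    return [[i * step, v] for i, v in enumerate(values)]


every_28_days = to_pairs([4, 2, 3], step=28)  # [[0, 4], [28, 2], [56, 3]]
daily = to_pairs([2, 3, 5], step=1)           # [[0, 2], [1, 3], [2, 5]]

# JSON output can be pasted into the `data` option of each scatterLine series.
print(json.dumps([every_28_days, daily]))
```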

Pull in current month/year and previous month/year in Ruby [closed]

I am creating a 12 month line chart that has data for the previous 12 months. The current month, or 12th line item, should say OCT 13. The first line item should say OCT 12.
How can I write something to dynamically pull in the current month as well as the previous months, all the way back to the same month last year? The issue I am having is making sure that last October gets tagged as a month in 2012, while in 3 months I need January to be tagged as 2013 without me changing the code.
require "date"
a = [Date.today.prev_year]
12.times{a.push(a.last.next_month)}
a.map{|d| d.strftime("%^b %y")}
# => ["OCT 12", "NOV 12", "DEC 12", "JAN 13", "FEB 13", "MAR 13", "APR 13",
#     "MAY 13", "JUN 13", "JUL 13", "AUG 13", "SEP 13", "OCT 13"]
Use the << operator to shift the date back by a given number of months:
require "date"
12.downto(0).map{ |d| (Date.today << d).strftime("%^b %y") }
#=> ["OCT 12", "NOV 12", "DEC 12", "JAN 13", "FEB 13", "MAR 13", "APR 13",
# "MAY 13", "JUN 13", "JUL 13", "AUG 13", "SEP 13", "OCT 13"]
Used @Stefan's input to change the order.
require 'date'
def last_n_months(n, format='%^b %Y')
  (n+1).times.map { |i| (Date.today << i).strftime(format) }
end
last_n_months(12)
# => ["OCT 2013", "SEP 2013", "AUG 2013", "JUL 2013", "JUN 2013", "MAY 2013", "APR 2013", "MAR 2013", "FEB 2013", "JAN 2013", "DEC 2012", "NOV 2012", "OCT 2012"]
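For comparison, the year rollover can also be handled with plain integer arithmetic by counting months as year * 12 + month. The sketch below is in Python rather than Ruby and is only an illustration: the function name is invented, the reference date is passed in explicitly so the result is reproducible, and the labels come from a fixed English month list instead of a locale-dependent format string.

```python
from datetime import date

MONTHS = "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split()


def last_n_month_labels(n, today):
    """Labels for today's month and the n months before it, oldest first."""
    current = today.year * 12 + (today.month - 1)  # running month count
    labels = []
    for index in range(current - n, current + 1):
        year, month0 = divmod(index, 12)  # month0 is 0-based
        labels.append(f"{MONTHS[month0]} {year % 100:02d}")
    return labels


print(last_n_month_labels(12, date(2013, 10, 15)))
# ['OCT 12', 'NOV 12', 'DEC 12', 'JAN 13', 'FEB 13', 'MAR 13', 'APR 13',
#  'MAY 13', 'JUN 13', 'JUL 13', 'AUG 13', 'SEP 13', 'OCT 13']
```

Because the year is derived from the running month count, January is tagged with the new year automatically once the reference date moves past December.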
Yes, use of the Date#methods is the way to go here, and yes, they are needed to get today's date, but wouldn't it be more satisfying to roll-your-own? Here's one way:
# Assume start_year falls in current millennium. 1 <= start_month <= 12
MONTHS = %q[JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC].split
def doit(start_year, start_month)
  pairs = (((start_month-1)..11).to_a.product [start_year-1]) +
          ((0..(start_month-1)).to_a.product [start_year])
  pairs.map! {|p| MONTHS[p.first] + " #{p.last.to_s}"}.reverse
end
p doit(13, 10)
First create pairs =>
[[9, 12], [10, 12], [11, 12]] +
[[0, 13], [1, 13], [2, 13], [3, 13], [4, 13], [5, 13], [6, 13], [7, 13], [8, 13], [9, 13]]
which is
[[9, 12], [10, 12], [11, 12], [0, 13], [1, 13], [2, 13], [3, 13],
[4, 13], [5, 13], [6, 13], [7, 13], [8, 13], [9, 13]]
Then replace elements of pairs with a string containing the month abbreviation and year.
["OCT 13", "SEP 13", "AUG 13", "JUL 13", "JUN 13", "MAY 13", "APR 13",
"MAR 13", "FEB 13", "JAN 13", "DEC 12", "NOV 12", "OCT 12"]
I evidently need all the parentheses in calculating pairs. Can anyone explain why I need the outer ones on the second line?
