How to pick which coefficients get printed in statsmodels regression summary - statsmodels

Is there a way to pick which coefficients get printed from statsmodel's RegressionResults.summary function? For example, it would be great to suppress output for all the incidental parameters in a model with lots of fixed effects.
Thank you!

@djtm15 A little more context and what you have tried will help refine the question.
You are most probably looking for the params attribute:
model.params
A complete, simple example:
import statsmodels.api as sm
import numpy as np
xdata = np.array([0.0, 1.0, 3.0, 4.3, 7.0, 8.0, 8.5, 10.0, 12.0])
ydata = np.array([0.01, 0.02, 0.04, 0.11, 0.43, 0.7, 0.89, 0.95, 0.99])
model = sm.OLS(ydata, sm.add_constant(xdata)).fit()
print(model.params)
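If the goal is to hide the incidental fixed-effect terms, one option is to build a small coefficient table yourself and filter it by name instead of printing the full summary. The following is only a sketch under assumptions: the data, the column names (y, x, group) and the helper coef_table are made up for illustration, and it relies on params, bse and pvalues being pandas Series, which is the case when the model is fit through the formula interface.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data (hypothetical): one regressor of interest plus group dummies
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "x": rng.normal(size=200),
    "group": rng.integers(0, 20, size=200).astype(str),
})
df["y"] = 2.0 * df["x"] + rng.normal(size=200)

res = smf.ols("y ~ x + C(group)", data=df).fit()


def coef_table(results, keep):
    """Return coefficient, std. error and p-value for the names in `keep` only."""
    table = pd.DataFrame({
        "coef": results.params,
        "std err": results.bse,
        "P>|t|": results.pvalues,
    })
    return table.loc[keep]


# Keep every term that is not one of the fixed-effect dummies
keep = [name for name in res.params.index if not name.startswith("C(group)")]
print(coef_table(res, keep))
```

The filter here is a plain string test on the parameter names, so it can be swapped for any rule that matches the terms you want to see.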

Related

Plotly express boxplot image export - part of box colors missing in svg file

I have made a boxplot using plotly express (px.box), and the resulting plot looks good in my Google Chrome / Jupyter browser window. Here is a screenshot of two randomly selected boxes, and they look as I expect.
However, after exporting using pio.write_image, it looks like this (zoomed in):
Why does the fill not cover the whole box after export, and what can I do to avoid it? I have tried defining width and height as "size*300" to set the DPI to 300, I have tried with and without "scale", I have tried Orca as the image export engine, I have tried exporting as .PDF, and I have updated Plotly (version 5.1.0). Links to comprehensive guides on exporting high-quality Plotly plots for use as figures in scientific papers would also be much appreciated, as the export quality is very often not satisfying.
An example of the problem can be reproduced with this:
import plotly.express as px
import plotly.io as pio
import pandas as pd
import plotly.graph_objs as go
import sys
import plotly
pio.templates.default = "simple_white"
x = ['Cat1', 'Cat1', 'Cat2', 'Cat2', 'Cat3', 'Cat3', 'Cat4', 'Cat4', 'Cat5', 'Cat5', 'Cat6', 'Cat6', 'Cat6', 'Cat7', 'Cat7', 'Cat8', 'Cat8', 'Cat11', 'Cat11', 'Cat12', 'Cat12', 'Cat10', 'Cat10', 'Cat9', 'Cat9', 'Cat13', 'Cat13', 'Cat14', 'Cat14', 'Cat15', 'Cat15', 'Cat16', 'Cat16', 'Cat17']
y = [0.0, 0.0, 0.0, 0.0047, 0.0, 0.036, 0.0, 0.0, 0.12314, 0.02472495, 0.004, 0.0, 0.013, 0.0, 0.0, 0.184, 0.056, 0.0186, 0.005928, 0.340, 0.20335, 0.0, 0.0, 0.2481, 0.12, 0.0, 0.0, 0.0201, 0.050, 0.0, 0.0, 0.041, 0.0199, 0.0]
data = {"x": x, "y": y}
df = pd.DataFrame(data)
box_plot = px.box(df, x="x", y="y", points="all", width=800, height=400)
box_plot.update_yaxes(title="Random numbers", title_font=dict(size=18, family='Arial'), tickfont=dict(family='Arial', size=18))
box_plot.update_xaxes(title=None, tickangle=45, title_font=dict(size=18, family='Arial'), tickfont=dict(family='Arial', size=18), categoryorder='array', categoryarray=["Cat2", "Cat1", "Cat3", "Cat4", "Cat5", "Cat6", "Cat10", "Cat11", "Cat12", "Cat9", "Cat8", "Cat7", "Cat13", "Cat14", "Cat15", "Cat16", "Cat17"])
box_plot.update_layout(margin=dict(l=40, r=10, t=10, b=25), width=1100, height=400, font_family="Arial")
box_plot.update_traces(boxmean="sd", selector=dict(type='box'))
box_plot.update_traces(pointpos=-2, selector=dict(type='box'))
box_plot.update_traces(marker_symbol="circle-open", selector=dict(type='box'))
box_plot.show()
pio.write_image(box_plot, r"Boxplot_minimal_work_ex.svg")
I tested first with only two categories, and the exported file looked fine! But when I increase the number of categories, the exported graph has the bad quality shown above. I wondered whether setting the width has an influence, so I tried deleting the width and height settings from the px.box expression, but it gave the same bad result.

How to implement custom data generator for images in keras?

I have used Keras generators through ImageDataGenerator, however I would like to extend it to include some transformations that are currently not included (say Gaussian smoothing). For example,
datagen = ImageDataGenerator(
rotation_range = 5,
width_shift_range = 0.1,
my_smoothing_kernel = 0.3)
where obviously my_smoothing_kernel would be the function I would like to add. Does anyone have any idea how to do this? I would then like to use datagen.flow as an input into model.fit as normal. Any help would be greatly appreciated.
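One hook that may fit here is the preprocessing_function argument of ImageDataGenerator, which is called on each image after resizing and augmentation and must return an array of the same shape. The sketch below only shows the smoothing function itself, using scipy.ndimage.gaussian_filter; the names gaussian_smooth and smooth_03 and the sigma values are assumptions for illustration, and the Keras call is left as a comment because it depends on the installed Keras version.

```python
from functools import partial

import numpy as np
from scipy.ndimage import gaussian_filter


def gaussian_smooth(image, sigma=1.0):
    """Blur one image of shape (height, width, channels).

    sigma is applied along height and width only; a sigma of 0 along the
    channel axis keeps the colour channels from bleeding into each other.
    """
    image = np.asarray(image, dtype="float32")
    return gaussian_filter(image, sigma=(sigma, sigma, 0))


# Fix the kernel width so the function takes a single argument (the image)
smooth_03 = partial(gaussian_smooth, sigma=0.3)

# Hypothetical usage with Keras (not executed here):
# datagen = ImageDataGenerator(
#     rotation_range=5,
#     width_shift_range=0.1,
#     preprocessing_function=smooth_03)

# Quick check on a random image
img = np.random.default_rng(0).random((32, 32, 3)).astype("float32")
smoothed = gaussian_smooth(img, sigma=1.0)
print(img.shape, smoothed.shape)
print(smoothed.std() < img.std())
```

Because the function receives one image at a time, datagen.flow and model.fit would be used exactly as before.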

Resolving pytorch distributed execution printing multiple log statements for each process spawned?

I am running a PyTorch distributed environment to train some models, and in the same script I am also using logging to print the status of the program. The problem is that, since PyTorch distributed spawns multiple processes, I see my log statements printed n times, where n is the number of processes spawned. Here's an example of it:
1.0, 0.05, 2.1823, 0.1703, 1.9799, 0.2352
1.0, 0.05, 2.1804, 0.1674, 1.9767, 0.2406
1.0, 0.05, 2.1814, 0.1697, 2.0053, 0.2154
2.0, 0.05, 2.1593, 0.1741, 2.0935, 0.192
2.0, 0.05, 2.1526, 0.1779, 2.1166, 0.1908
2.0, 0.05, 2.1562, 0.1812, 2.0868, 0.2076
3.0, 0.05, 1.9319, 0.2473, 1.8041, 0.2903
3.0, 0.05, 1.9386, 0.2413, 1.8037, 0.3017
3.0, 0.05, 1.9286, 0.2443, 1.815, 0.2939
4.0, 0.05, 1.7522, 0.3153, 1.828, 0.3131
4.0, 0.05, 1.7504, 0.3207, 1.7613, 0.3245
4.0, 0.05, 1.7522, 0.3223, 1.7841, 0.3209
5.0, 0.05, 1.5815, 0.3951, 1.5559, 0.4307
5.0, 0.05, 1.5767, 0.3939, 1.5326, 0.4205
5.0, 0.05, 1.588, 0.3909, 1.5882, 0.3995
Any ideas on how to avoid or resolve this issue? Thanks!
You can choose to use the NVIDIA PyTorch scripts; they are optimized, which means they run fast and print logs normally.
Here is the link:
https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch
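Another common way to avoid the duplicates is to let only one process emit log records. This is a minimal sketch, assuming each process can learn its rank from the RANK environment variable (which torchrun sets; with torch.multiprocessing.spawn you would pass the rank in explicitly). The names RankZeroFilter and get_logger are hypothetical.

```python
import logging
import os


class RankZeroFilter(logging.Filter):
    """Drop every record unless this process has rank 0."""

    def __init__(self, rank):
        super().__init__()
        self.rank = rank

    def filter(self, record):
        return self.rank == 0


def get_logger(name, rank=None):
    # Assumption: the launcher exports RANK; default to 0 for single-process runs
    if rank is None:
        rank = int(os.environ.get("RANK", "0"))
    logger = logging.getLogger(f"{name}.rank{rank}")
    logger.setLevel(logging.INFO)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.addFilter(RankZeroFilter(rank))
        logger.addHandler(handler)
    # Do not pass records on to the root logger, which would bypass the filter
    logger.propagate = False
    return logger


log = get_logger("train")
log.info("epoch=%d loss=%.4f", 1, 2.1823)
```

Every process still calls log.info, but only the rank 0 process writes the line, so each epoch shows up once.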

How to rotate ylabel of pairplot in seaborn? [duplicate]

I have a simple factorplot
import seaborn as sns
g = sns.factorplot("name", "miss_ratio", "policy", dodge=.2,
linestyles=["none", "none", "none", "none"], data=df[df["level"] == 2])
The problem is that the x labels all run together, making them unreadable. How do you rotate the text so that the labels are readable?
I had a problem with the answer by @mwaskorn, namely that
g.set_xticklabels(rotation=30)
fails, because this also requires the labels. A bit easier than the answer by @Aman is to just add
plt.xticks(rotation=45)
You can rotate tick labels with the tick_params method on matplotlib Axes objects. To provide a specific example:
ax.tick_params(axis='x', rotation=90)
This is still a matplotlib object. Try this:
# <your code here>
locs, labels = plt.xticks()
plt.setp(labels, rotation=45)
Any seaborn plots supported by FacetGrid (e.g. catplot) won't work with
g.set_xticklabels(rotation=30)
however, barplot, countplot, etc. will work, as they are not supported by FacetGrid. The line below will work for them.
g.set_xticklabels(g.get_xticklabels(), rotation=30)
Also, in case you have 2 graphs overlaid on top of each other, try set_xticklabels on the graph which supports it.
If anyone wonders how to do this for clustermap CorrGrids (part of a given seaborn example):
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(context="paper", font="monospace")
# Load the dataset of correlations between cortical brain networks
df = sns.load_dataset("brain_networks", header=[0, 1, 2], index_col=0)
corrmat = df.corr()
# Set up the matplotlib figure
f, ax = plt.subplots(figsize=(12, 9))
# Draw the heatmap using seaborn
g=sns.clustermap(corrmat, vmax=.8, square=True)
rotation = 90
for i, ax in enumerate(g.fig.axes):  # getting all axes of the fig object
    ax.set_xticklabels(ax.get_xticklabels(), rotation=rotation)
g.fig.show()
You can also use plt.setp as follows:
import matplotlib.pyplot as plt
import seaborn as sns
plot=sns.barplot(data=df, x=" ", y=" ")
plt.setp(plot.get_xticklabels(), rotation=90)
to rotate the labels 90 degrees.
For a seaborn.heatmap, you can rotate these using (based on @Aman's answer)
pandas_frame = pd.DataFrame(data, index=names, columns=names)
heatmap = seaborn.heatmap(pandas_frame)
loc, labels = plt.xticks()
heatmap.set_xticklabels(labels, rotation=45)
heatmap.set_yticklabels(labels[::-1], rotation=45) # reversed order for y
One can do this with matplotlib.pyplot.xticks
import matplotlib.pyplot as plt
plt.xticks(rotation = 'vertical')
# Or use degrees explicitly
degrees = 70 # Adjust according to one's preferences/needs
plt.xticks(rotation=degrees)
Here one can see an example of how it works.
Use ax.tick_params(labelrotation=45). You can apply this to the axes figure from the plot without having to provide labels. This is an alternative to using the FacetGrid if that's not the path you want to take.
If the labels have long names it may be hard to get it right. A solution that worked well for me using catplot was:
import matplotlib.pyplot as plt
fig = plt.gcf()
fig.autofmt_xdate()
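To put the tick_params suggestion from the answers above into a self-contained form, here is a small sketch on a plain matplotlib bar chart. The data and label names are made up, and the Agg backend is selected only so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical data with long category names
names = ["first_long_label", "second_long_label", "third_long_label"]
values = [0.2, 0.5, 0.3]

fig, ax = plt.subplots()
ax.bar(names, values)

# Rotate the existing x tick labels; no label list has to be passed
ax.tick_params(axis="x", labelrotation=45)

fig.tight_layout()
fig.canvas.draw()
print([label.get_rotation() for label in ax.get_xticklabels()])
```

For a FacetGrid-based plot, the same call could be applied to each Axes in g.axes.flat.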

In Matplotlib, how do you add an Imagedraw object to a PyPlot?

I need to add a shape to a preexisting image generated using a pyplot (plt). The best way I know of to generate basic shapes quickly is using Imagedraw's predefined shapes. The original data has points with corresponding colors in line_holder and colorholder. I need to add a bounding box (or in this case ellipse) to the plot to make it obvious to the user whether the data is in an acceptable range.
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
from PIL import Image
...
lines = LineCollection(mpl.line_holder, colors=mpl.colorholder , linestyle='solid')
plt.axes().add_collection(lines)
plt.axes().set_aspect('equal', 'datalim')
plt.axes().autoscale_view(True,True,True)
plt.draw()
plt.show()
I tried inserting this before the show():
image = Image.new('1',(int(ceil(disc/conv))+2,int(ceil(disc/conv))+1), 1)
draw = ImageDraw.Draw(image)
box=(1, 1, int(ceil(disc/conv)), int(ceil(disc/conv))) #create bounding box
draw.ellipse(box, 1, 0) #draw circle in black
but I cannot find a way to then add this ellipse to the pyplot. Does anyone know how one would go about getting the images together? If it is not possible to add an imagedraw object to a pyplot, are there good alternatives for performing this type of operation?
Matplotlib has several patches (shapes) that appear to meet your needs (and remove PIL as a dependency). They are documented here. A helpful example using shapes is here.
To add an ellipse to a plot, you first create an Ellipse patch and then add that patch to the axes you're currently working on. Beware that Circles (or Ellipses with equal width and height) will appear elliptical if your aspect ratio is not equal.
In your snippet you call plt.axes() several times. This is unnecessary, as it is just returning the current axes object. I think it is clearer to keep the axes object and directly operate on it rather than repeatedly getting the same object via plt.axes(). As far as axes() is used in your snippet, gca() does the same thing. The end of my script demonstrates this.
I've also replaced your add_collection() line by plotting a single line. These essentially do the same thing, and it allows my snippet to be executed as a standalone script.
import matplotlib.pyplot as plt
import matplotlib as mpl
# set up your axes object
ax = plt.axes()
ax.set_aspect('equal', 'datalim')
ax.autoscale_view(True, True, True)
# adding a LineCollection is equivalent to plotting a line
# this will run as a stand alone script
x = range(10)
plt.plot( x, x, 'x-')
# add an ellipse to the axes
c = mpl.patches.Ellipse( (5, 5), 1, 6, angle=45)
ax.add_patch(c)
# you can get the current axes a few ways
ax2 = plt.axes()
c2 = mpl.patches.Ellipse( (7, 7), 1, 6, angle=-45, color='green')
ax2.add_patch(c2)
ax3 = plt.gca()
c3 = mpl.patches.Ellipse( (0, 2), 3, 3, color='black')
ax3.add_patch(c3)
print(id(ax), id(ax2), id(ax3))
plt.show()
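For the bounding-region use case in the question, an unfilled outline is probably closer to what is wanted than a solid shape. The following is a sketch only, keeping a single Axes object from plt.subplots(); the centre, size, angle and colour of the ellipse are placeholder values, and the Agg backend is selected only so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse

fig, ax = plt.subplots()

# Stand-in for the LineCollection data in the question
x = list(range(10))
ax.plot(x, x, "x-")

# Hypothetical acceptable range: centred on (5, 5), 8 wide, 3 high, tilted 45 degrees
bound = Ellipse((5, 5), width=8, height=3, angle=45,
                fill=False, edgecolor="red", linewidth=2)
ax.add_patch(bound)

# Equal aspect ratio so the ellipse is not distorted
ax.set_aspect("equal", "datalim")

print(len(ax.patches))
```

With fill=False the data underneath stays visible, so it is easy to see which points fall outside the outline.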
