Plot side by side box and whisker plots from two dataframes - seaborn

I'm hoping to take these two box plots and combine them into one image:
[![These are two data files I was able to make box and whisker charts for easily using Seaborn boxplot][1]][1]
The data file I am using comes from multiple Excel spreadsheets and looks like this:
0    1    2    3    4    5    6    ...
5    2    3    5    6    2    5    ...
2    3    4    6    1    2    1    ...
1    2    4    6    7    8    9    ...
...  ...  ...  ...  ...  ...  ...  ...
The column headers represent hours, and the column values are the data I want to show as box-and-whisker plots.
Currently my code is this:
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
xls = pd.ExcelFile('ControlDayVar.xlsx')
df1= pd.read_excel(xls, 'DE_ControlDays').assign(Location=1)
df2= pd.read_excel(xls, 'DE_FestDays').assign(Location=2)
DE_all =pd.concat([df1,df2])
DE= pd.melt(DE_all, id_vars=['Location'], var_name=['Hours'], value_name='Concentration')
ax= sns.boxplot(x='Hours', y= 'Concentration', hue= 'Location', data=DE)
plt.show()
The result I get is this:
[![Yikes][2]][2]
I expect my issue has to do with the format of my data files, but any help would be appreciated. Thanks!
[1]: https://i.stack.imgur.com/dXo6F.jpg
[2]: https://i.stack.imgur.com/NEpi7.jpg

This could happen if somehow the Concentration values are not properly recognized as a numerical data type anymore.
In that case, the y-axis can no longer be understood as continuous, which can lead to that "yikes" result.
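A minimal sketch of how one might check for and repair this, assuming the melted frame has the same column names as in the question. The sample values, including the unparseable "n/a" entry, are made up for illustration:

```python
import pandas as pd

# Hypothetical stand-in for the melted frame: the values arrived as strings,
# so the Concentration column is not numeric.
DE = pd.DataFrame({
    "Location": [1, 1, 2, 2],
    "Hours": [0, 1, 0, 1],
    "Concentration": ["5", "2", "n/a", "3"],
})
print(DE["Concentration"].dtype)  # a non-numeric dtype signals the problem

# Coerce to numbers; anything that cannot be parsed becomes NaN.
DE["Concentration"] = pd.to_numeric(DE["Concentration"], errors="coerce")
print(DE["Concentration"].dtype)
```

After the conversion, seaborn should treat the y-axis as continuous again. Rows that became NaN are worth inspecting, since they point at the cells that broke the type inference.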

Related

How to plot shaded areas on a seaborn countplot

I would like to show shaded areas on a seaborn countplot, as shown below. The idea is to show the COVID lockdown periods that fall within the data period. I have the countplot, but I can't figure out how to add the shaded area.
My current df contains dates of properties for sale (from domain.com.au), and I have a small dataset of COVID dates:
The code that generates the seaborn plot:
fig, ax = plt.subplots(figsize=(16,10));
sns.countplot(data=df, x='dates', ax=ax)
A representation of what I am looking to produce (created in Excel):
Assuming your property dataset looks like this:
df = pd.DataFrame({
"date": ["2022-07-01", "2022-07-01", "2022-07-01", "2022-07-02", "2022-07-02", "2022-07-03", "2022-07-04", "2022-07-05", "2022-07-05", "2022-07-05", "2022-07-06"],
"prop": ['test'] * 11
})
df['date'] = pd.to_datetime(df['date']).dt.date
          date  prop
0   2022-07-01  test
1   2022-07-01  test
2   2022-07-01  test
3   2022-07-02  test
4   2022-07-02  test
5   2022-07-03  test
6   2022-07-04  test
7   2022-07-05  test
8   2022-07-05  test
9   2022-07-05  test
10  2022-07-06  test
Then you could do a bar plot of the counts and shade date ranges with matplotlib's axvspan:
fig, ax = plt.subplots(figsize=(16, 10))
prop_count = df['date'].value_counts()
ax.bar(prop_count.index, prop_count.values)
# covid_df is the lockdown table from the question (start_date and end_date columns)
for start_date, end_date in zip(covid_df['start_date'], covid_df['end_date']):
    ax.axvspan(start_date, end_date, alpha=0.1, color='red')
plt.show()
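For a version that runs on its own, here is a sketch with a made-up `covid_df`. The two lockdown ranges are hypothetical, since the question does not show the actual dates, and the dates are kept as pandas timestamps for simplicity:

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "date": pd.to_datetime([
        "2022-07-01", "2022-07-01", "2022-07-01", "2022-07-02", "2022-07-02",
        "2022-07-03", "2022-07-04", "2022-07-05", "2022-07-05", "2022-07-05",
        "2022-07-06",
    ]),
    "prop": ["test"] * 11,
})

# Hypothetical lockdown periods, invented for this example.
covid_df = pd.DataFrame({
    "start_date": pd.to_datetime(["2022-07-01", "2022-07-05"]),
    "end_date": pd.to_datetime(["2022-07-02", "2022-07-06"]),
})

fig, ax = plt.subplots(figsize=(16, 10))
prop_count = df["date"].value_counts().sort_index()
ax.bar(prop_count.index, prop_count.values)

# One translucent vertical band per lockdown period.
for start_date, end_date in zip(covid_df["start_date"], covid_df["end_date"]):
    ax.axvspan(start_date, end_date, alpha=0.1, color="red")
```

Call `plt.show()` or `fig.savefig(...)` afterwards to see the result. Because `axvspan` works in data coordinates on the x-axis, the bands line up with the dated bars without any manual conversion.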

How to write a 3D image from a 3x3 matrix written in fortran 90?

I'm trying to write a 3D image in Fortran 90.
The code for the object I want in the image:
Here is the code for a cube in Fortran:
PROGRAM myimage
integer xmax,ymax,zmax
parameter (xmax=10,ymax=10,zmax=10)
INTEGER mytable(1:xmax,1:ymax,1:zmax)
do 1 i1=1,xmax
do 2 i2=1,ymax
do 3 i3=1,zmax
mytable(i1,i2,i3)=0
if ((i1.ge.3).and.(i1.le.6).and.(i2.ge.3).and.(i2.le.6).and.(i3.ge.3).and.(i3.le.6)) then
mytable(i1,i2,i3)=1
endif
3 continue
2 continue
1 continue
end
The type of image I'd like to get:
The type of image I want is like this:
The cube would be the pixels where mytable=1, surrounded by pixels where mytable=0.
What I tried:
I first tried to write code to make the image directly in Fortran, but the resulting image was not the 3D image I wanted (see Appendix 1).
The question:
Could you explain how to view that type of object in 3D, please?
For instance, following Vladimir F's comment, I downloaded ParaView.
I found this question, which is quite similar to where I stand now.
But I don't understand exactly what I have to write in the file if I choose the UCD format. I didn't find explanations online, and the link given in that question does not work.
Appendix 1:
Here is the code for a 2D image and for the 3D image I tried to write.
I first wrote a 2D image, which works, then tried to generalize it to 3D. I'd like to view the object where mytable=1 in 3D.
subroutine image2d(mytable,xmax,ymax,zmax)
integer xmax,ymax,zmax,mytable(1:xmax,1:ymax,1:zmax)
character*15 fname
WRITE(fname,'(a)')'myimage2d.ppm'
open (100,file=fname,form='formatted')
write(100,'(a)') 'P3'
write(100,*) '#'
write(100,*) xmax,ymax
write(100,*) 2
do 10 i10=1,xmax
do 20 i20=1,ymax
if (mytable(i10,i20,5).eq.0) then
write(100,*) '2 2 2'
else if (mytable(i10,i20,5).eq.1) then
write(100,*) '0 0 0'
end if
20 continue
10 continue
end
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
subroutine image3d(mytable,xmax,ymax,zmax)
integer xmax,ymax,zmax,mytable(1:xmax,1:ymax,1:zmax)
character*15 fname
WRITE(fname,'(a)')'myimage3d.ppm'
open (200,file=fname,form='formatted')
write(200,'(a)') 'P3'
write(200,*) '#'
write(200,*) xmax,ymax,zmax
write(200,*) 3
do 11 i10=1,xmax
do 21 i20=1,ymax
do 31 i30=1,zmax
if (mytable(i10,i20,i30).eq.0) then
write(200,*) '2 2 2'
else if (mytable(i10,i20,i30).eq.1) then
write(200,*) '0 1 2'
end if
31 continue
21 continue
11 continue
end
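One possible route, sketched here in Python for brevity rather than Fortran, is to skip PPM (which only describes 2D images) and write the array in the legacy ASCII VTK format, which ParaView can open. This uses VTK structured points instead of the UCD format mentioned above. The file name, function name, and scalar name are assumptions made for the example:

```python
def write_vtk_structured_points(path, table, name="mytable"):
    """Write a 3D integer array (nested lists indexed [x][y][z]) as legacy ASCII VTK."""
    nx, ny, nz = len(table), len(table[0]), len(table[0][0])
    with open(path, "w") as f:
        f.write("# vtk DataFile Version 3.0\n")
        f.write("cube example\n")
        f.write("ASCII\n")
        f.write("DATASET STRUCTURED_POINTS\n")
        f.write(f"DIMENSIONS {nx} {ny} {nz}\n")
        f.write("ORIGIN 0 0 0\n")
        f.write("SPACING 1 1 1\n")
        f.write(f"POINT_DATA {nx * ny * nz}\n")
        f.write(f"SCALARS {name} int 1\n")
        f.write("LOOKUP_TABLE default\n")
        # The format expects x to vary fastest, then y, then z.
        for k in range(nz):
            for j in range(ny):
                f.write(" ".join(str(table[i][j][k]) for i in range(nx)) + "\n")


# Same cube as the Fortran program: 1-based indices 3..6 along each axis are 1.
n = 10
table = [[[1 if all(3 <= c + 1 <= 6 for c in (i, j, k)) else 0
           for k in range(n)]
          for j in range(n)]
         for i in range(n)]

write_vtk_structured_points("mytable.vtk", table)
```

The same header lines and value ordering can be produced from Fortran with ordinary formatted `write` statements. In ParaView, a Threshold filter on the scalar should isolate the cells where mytable=1.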

Calculate features at multiple training windows in Featuretools

I have a table of customers and transactions. Is there a way to get features filtered to the last 3/6/9/12 months? I would like to generate these features automatically:
number of trans in last 3 months
....
number of trans in last 12 months
average trans in last 3 months
...
average trans in last 12 months
I've tried using training_window=["1 month", "3 months"], but it does not seem to return multiple features for each window.
Example:
import featuretools as ft
es = ft.demo.load_mock_customer(return_entityset=True)
window_features = ft.dfs(entityset=es,
                         target_entity="customers",
                         training_window=["1 hour", "1 day"],
                         features_only=True)
window_features
Do I have to compute each window separately and then merge the results?
As you mentioned, in Featuretools 0.2.1 you have to build the feature matrices individually for each training window and then merge the results. With your example, you would do that as follows:
import pandas as pd
import featuretools as ft
es = ft.demo.load_mock_customer(return_entityset=True)
cutoff_times = pd.DataFrame({"customer_id": [1, 2, 3, 4, 5],
                             "time": pd.date_range('2014-01-01 01:41:50', periods=5, freq='25min')})
features = ft.dfs(entityset=es,
                  target_entity="customers",
                  agg_primitives=['count'],
                  trans_primitives=[],
                  features_only=True)
fm_1 = ft.calculate_feature_matrix(features,
                                   entityset=es,
                                   cutoff_time=cutoff_times,
                                   training_window='1h',
                                   verbose=True)
fm_2 = ft.calculate_feature_matrix(features,
                                   entityset=es,
                                   cutoff_time=cutoff_times,
                                   training_window='1d',
                                   verbose=True)
new_df = fm_1.reset_index()
new_df = new_df.merge(fm_2.reset_index(), on="customer_id", suffixes=("_1h", "_1d"))
Then, the new dataframe will look like:
customer_id  COUNT(sessions)_1h  COUNT(transactions)_1h  COUNT(sessions)_1d  COUNT(transactions)_1d
1            1                   17                      3                   43
2            3                   36                      3                   36
3            0                   0                       1                   25
4            0                   0                       0                   0
5            1                   15                      2                   29
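With more than two windows (3/6/9/12 months, say), pairwise `merge` calls with `suffixes` get awkward. A sketch of one alternative for the combining step is below. It uses only pandas, the helper name `combine_window_matrices` is hypothetical, and the two small frames stand in for feature matrices already computed per window, with values copied from the table above:

```python
import pandas as pd


def combine_window_matrices(matrices):
    """Join feature matrices column-wise, tagging each column with its window label.

    `matrices` maps a window label (e.g. "1h") to a feature matrix; all
    matrices are assumed to share the same index (here customer_id).
    """
    tagged = [fm.add_suffix(f"_{label}") for label, fm in matrices.items()]
    return pd.concat(tagged, axis=1)


index = pd.Index([1, 2, 3, 4, 5], name="customer_id")
fm_1h = pd.DataFrame({"COUNT(sessions)": [1, 3, 0, 0, 1],
                      "COUNT(transactions)": [17, 36, 0, 0, 15]}, index=index)
fm_1d = pd.DataFrame({"COUNT(sessions)": [3, 3, 1, 0, 2],
                      "COUNT(transactions)": [43, 36, 25, 0, 29]}, index=index)

combined = combine_window_matrices({"1h": fm_1h, "1d": fm_1d})
print(combined)
```

In practice the dictionary could be filled in a loop that calls `ft.calculate_feature_matrix` once per training window, as in the answer above.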

How to fetch two associated Database values Using Rails 3

Hi, I have two tables in my DB. The first table is given below.
Table name-
t_hcsy_details
class name in model-
class THcsyDetails < ActiveRecord::Base
end
The values inside the table are given below.
HCSY_Details_ID  HCSY_ID  HCSY_Fund_Type_ID  Amount
1                2        1                  1125
2                2        2                  390
3                2        3                  285
4                2        4                  100
5                2        5                  60
6                2        6                  40
My second table is given below.
Table Name:
t_hcsy_fund_type_master
class in model:
class THcsyFundTypeMaster < ActiveRecord::Base
end
Table values are given below.
HCSY_Fund_Type_ID  Fund_Type_Code  Fund_Type_Name  Amount
1                  1               woods           1125
2                  2               Burning         390
3                  3               goods           285
4                  4               brahmin         100
5                  5               swd             60
6                  6               Photo           40
I know only the HCSY_ID value (i.e. 2) of the first table, but I need Fund_Type_Name and Amount from the second table. As you can see, one HCSY_ID has 6 different records, and I need every Fund_Type_Name and Amount for that HCSY_ID. Please help me resolve this by creating objects for both classes shown above.
You haven't specified any relationships between the models, so it would be easier to split this into two queries:
# you already have hcsy_id
fund_type_ids = THcsyDetails.where(hcsy_id: hcsy_id).pluck(:hcsy_fund_type_id)
fund_types = THcsyFundTypeMaster.where(id: fund_type_ids)
fund_types.group(:fund_type_name).sum(:amount)
If you had proper relationships set up, the above would simplify to:
THcsyDetails.
joins(association_name). # THcsyFundTypeMaster
where(hcsy_id: hcsy_id).
group("#{t = THcsyFundTypeMaster.table_name}.fund_type_name").
sum("#{t}.amount")

WINBUGS : adding time and product fixed effects in a hierarchical data

I am working on hierarchical panel data using WinBUGS. Assume data on school performance, logs, with independent variables logp and rank. All schools are divided into three categories (cat), and I need a beta coefficient for each category (hence the HLM). I want to account for time-specific and school-specific effects in the model. One way would be to add dummy variables to the list of variables under mu[i], but that would get messy because I have up to 60 schools. I am sure there must be a better way to handle that.
My data looks like the following:
school  time  logs  logp  cat  rank
1       1     4.2   8.9   1    1
1       2     4.2   8.1   1    2
1       3     3.5   9.2   1    1
2       1     4.1   7.5   1    2
2       2     4.5   6.5   1    2
3       1     5.1   6.6   2    4
3       2     6.2   6.8   3    7
#logs = log(score)
#logp = log(average hours of inputs)
#rank - rank of school
#cat = section red, section blue, section white in school (hierarchies)
My WinBUGS code is given below.
model {
  # N observations
  for (i in 1:n) {
    logs[i] ~ dnorm(mu[i], tau)
    mu[i] <- bcons + bprice*(logp[i])
             + brank[cat[i]]*(rank[i])
  }
  # C categories
  for (c in 1:C) {
    brank[c] ~ dnorm(beta, taub)
  }
  # priors
  bcons ~ dnorm(0,1.0E-6)
  bprice ~ dnorm(0,1.0E-6)
  bad ~ dnorm(0,1.0E-6)
  beta ~ dnorm(0,1.0E-6)
  tau ~ dgamma(0.001,0.001)
  taub ~ dgamma(0.001,0.001)
}
As you can see in the data sample above, I have multiple observations per school over time. How can I modify the code to account for time-specific and school-specific fixed effects? I have used Stata in the past, where the fe, be, and i.time options take care of fixed effects in panel data. But here I am lost.
