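I want to use a boxplot (currently via seaborn) to show the differences between two conditions. The plot would benefit from ordering, but I think the most value would come from ordering it by a different method. My idea is to calculate the difference in means in a separate dataframe, then use the column order of that re-ordered dataframe to drive the order of the boxes in the plot.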
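import matplotlib.pyplot as plt
import seaborn as sns

# custom palette: one colour per condition
my_pal = {"Year A": "#e42628", "Year B": "#377db6"}
plt.figure(figsize=(16, 10))
# expects long-format data with 'variable' and 'value' columns (the melt defaults)
ax = sns.boxplot(x="variable", y="value", hue="Condition", showmeans=True, data=df, palette=my_pal,
                 meanprops={"marker": "s", "markerfacecolor": "white", "markeredgecolor": "black"})
plt.ylabel("Temperature (\xb0C)")
plt.xlabel("Room")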
I have calculated the means and their differences using:
# calculate means based on condition
meansofgroups = df.groupby('Condition').mean()
# calculate difference in means between the two conditions
diff = meansofgroups.diff()
# convert all values to positive (abs() returns a new frame, so assign it)
diff = diff.abs()
# order columns by the 'Year B' row, descending
diff.T.sort_values('Year B', ascending=False).T
This gives:
                  E         D         G       I         B  etc.
Condition
Year A          NaN       NaN       NaN     NaN       NaN  etc.
Year B     3.213795  2.473751  1.802886  0.9225  0.527404  etc.
However, I'm not sure how to use this newly column-ordered dataframe to control the order of my boxplot. Thanks!
Posted on 2020-04-20 17:11:42
Cracked it! That said, more Pythonic approaches are very welcome!
# calculate means based on condition
meansofgroups = df.groupby('Condition').mean()
# calculate difference in means and then order
diff = meansofgroups.diff()
# convert all values to positive
diff2 = diff.abs()
# order values in descending order
diffordered = diff2.T.sort_values('Year B', ascending=False).T
# Reset the index to retain the condition column
diffordered2 = diffordered.reset_index()
# Sort old dataframe using new order
df_new = df[diffordered2.columns]
# Dataframe for plotting
df_new_melt = df_new.melt(id_vars=['Condition'])
'df_new_melt' can then be passed as the data for the boxplot, which draws the rooms in the new order.
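For what it's worth, a shorter route may be to leave the dataframe alone and hand the sorted room names to the `order` parameter of `sns.boxplot`. The sketch below is only an illustration under assumptions: the data is synthetic, and the room names `A` to `D`, the `offsets` dict, and the variable names `gap`, `order` and `long_df` are all made up for the example. It assumes the same wide layout as above (one column per room plus a `Condition` column).

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical wide-format data: one column per room plus a 'Condition' column.
rng = np.random.default_rng(0)
rooms = ["A", "B", "C", "D"]
offsets = {"A": 0.5, "B": 3.0, "C": 1.0, "D": 2.0}  # made-up shift applied in Year B
frames = []
for condition in ["Year A", "Year B"]:
    data = {
        room: rng.normal(20 + (offsets[room] if condition == "Year B" else 0), 0.1, 50)
        for room in rooms
    }
    frames.append(pd.DataFrame(data).assign(Condition=condition))
df = pd.concat(frames, ignore_index=True)

# Mean per room and condition, then the absolute gap between the two conditions.
means = df.groupby("Condition")[rooms].mean()
gap = (means.loc["Year B"] - means.loc["Year A"]).abs()

# Room names sorted by descending gap; this list is all the plot needs.
order = gap.sort_values(ascending=False).index.tolist()

# Melt once for plotting and let seaborn arrange the x categories via `order`.
long_df = df.melt(id_vars=["Condition"])
plt.figure(figsize=(16, 10))
ax = sns.boxplot(data=long_df, x="variable", y="value", hue="Condition", order=order)
```

Because the ordering lives in a plain list, the original dataframe never has to be re-indexed, and the same list can be reused for other plots. Call `plt.show()` afterwards if you are running this outside a notebook.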
https://stackoverflow.com/questions/61326279