I have a multi-index df and I want to extract only the rows and columns that contain null values or, if possible, the offset(?) location:
                        week_1  week_2  week_3  week_4  week_5  week_6 \
Year
2000 Arizona Cardinals   loser  winner   loser   loser  winner   loser
     Atlanta Falcons    winner   loser  winner   loser   loser   loser
     Baltimore Ravens   winner     NaN   loser  winner  winner  winner
     Buffalo Bills         NaN  winner   loser   loser   loser  winner
     Carolina Panthers   loser  winner   loser   loser  winner   loser
So the ideal output would be:
#the entire index and column location
(2000, Baltimore Ravens , Week_2)
Or, if that is not possible, just the rows that contain a NaN value:
                        week_1  week_2  week_3  week_4  week_5  week_6 \
Year
2000
     Baltimore Ravens   winner     NaN   loser  winner  winner  winner
     Buffalo Bills         NaN  winner   loser   loser   loser  winner
I tried something like this:
idx = pd.IndexSlice
x = df.loc[idx[:, :], idx['week_1':'week_16']].isnull()
and then df[x] or df.loc[x], but I get a DataFrame that contains only NaN values:
                        week_1  week_2  week_3  week_4  week_5  week_6  week_7 \
Year
2000 Arizona Cardinals     NaN     NaN     NaN     NaN     NaN     NaN     NaN
     Atlanta Falcons       NaN     NaN     NaN     NaN     NaN     NaN     NaN
     Baltimore Ravens      NaN     NaN     NaN     NaN     NaN     NaN     NaN
     Buffalo Bills         NaN     NaN     NaN     NaN     NaN     NaN     NaN
     Carolina Panthers     NaN     NaN     NaN     NaN     NaN     NaN     NaN
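A note on the all-NaN result: indexing a DataFrame with a boolean DataFrame keeps only the cells where the mask is True and blanks the rest, and here the True cells are exactly the missing ones. A minimal sketch of that behavior, and of reducing the mask to one boolean per row instead, is below. The small frame and the index names Year and Team are stand-ins assumed for illustration, not the asker's actual data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's frame (index names are assumed).
index = pd.MultiIndex.from_tuples(
    [
        (2000, "Arizona Cardinals"),
        (2000, "Baltimore Ravens"),
        (2000, "Buffalo Bills"),
    ],
    names=["Year", "Team"],
)
df = pd.DataFrame(
    {
        "week_1": ["loser", "winner", np.nan],
        "week_2": ["winner", np.nan, "winner"],
        "week_3": ["loser", "loser", "loser"],
    },
    index=index,
)

mask = df.isna()  # boolean DataFrame, True where a cell is missing

# A boolean DataFrame as indexer keeps cells where the mask is True
# (the NaN cells) and replaces every other cell with NaN.
all_nan = df[mask]

# One boolean per row selects whole rows that contain any NaN.
rows_with_nan = df[mask.any(axis=1)]

# Optionally also keep only the columns that contain at least one NaN.
rows_and_cols = df.loc[mask.any(axis=1), mask.any(axis=0)]
```

Here rows_with_nan keeps the Baltimore Ravens and Buffalo Bills rows with all of their columns, which matches the fallback output described in the question.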
Posted on 2019-09-01 16:22:53
Assuming you are on pandas 0.25 or later, you can use explode:
s = df.apply(lambda row: row[row.isna()].index, axis=1) \
.explode() \
.dropna()
Result:
Year  Team
2000  Baltimore Ravens    week_2
      Buffalo Bills       week_1
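What it does:
apply iterates over each row and collects the names of the columns that are NA. This returns a possibly empty list, because a row can have anywhere from zero to many NA columns.
explode turns the list of columns embedded in each row into rows of their own, repeating the index as needed.
dropna removes the rows that had no NA columns.
https://stackoverflow.com/questions/57750058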
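The three steps can be followed one at a time on a small stand-in frame. This is a sketch under assumptions, not the answerer's exact code: the sample data and the index names Year and Team are reconstructed from the printed tables, tolist() is added only so that each cell holds a plain Python list, and the final list comprehension is an illustrative way to get the (Year, Team, week) tuples the question asks for.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame modeled on the question's printout.
index = pd.MultiIndex.from_tuples(
    [
        (2000, "Arizona Cardinals"),
        (2000, "Baltimore Ravens"),
        (2000, "Buffalo Bills"),
    ],
    names=["Year", "Team"],
)
df = pd.DataFrame(
    {
        "week_1": ["loser", "winner", np.nan],
        "week_2": ["winner", np.nan, "winner"],
        "week_3": ["loser", "loser", "loser"],
    },
    index=index,
)

# Step 1: for each row, list the labels of the columns that hold NaN.
# Rows without missing values produce an empty list.
na_cols = df.apply(lambda row: row[row.isna()].index.tolist(), axis=1)

# Step 2: give every listed column its own row, repeating the index.
# Empty lists turn into a single NaN entry.
exploded = na_cols.explode()

# Step 3: drop the entries that came from rows with no missing columns.
s = exploded.dropna()

# Flatten each ((Year, Team), week) pair into a (Year, Team, week) tuple.
locations = [(year, team, week) for (year, team), week in s.items()]
```

With this sample, s has one entry per missing cell, and locations holds one tuple for Baltimore Ravens in week_2 and one for Buffalo Bills in week_1.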