
Python comparison that ignores nan

Although nan == nan is always False, in many cases people want to treat NaNs as equal, and this is reflected in pandas.DataFrame.equals:

NaNs in the same location are considered equal.
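
To make the quoted behaviour concrete, here is a minimal sketch contrasting element-wise == with DataFrame.equals. The frames `left` and `right` are made-up data for illustration only.

```python
import numpy as np
import pandas as pd

# Two frames with NaN in the same positions (illustrative data, not from the question).
left = pd.DataFrame({"a": [1.0, np.nan], "b": [np.nan, 4.0]})
right = left.copy()

# Element-wise == follows IEEE 754: NaN never compares equal to NaN,
# so the NaN cells make the overall result False.
elementwise_all_equal = bool((left == right).all().all())

# DataFrame.equals treats NaNs in the same location as equal.
frames_equal = bool(left.equals(right))

print(elementwise_all_equal)  # False
print(frames_equal)           # True
```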

Of course, I can write:

import math

def equalp(x, y):
    # Equal if == says so, or if both values are NaN.
    return (x == y) or (math.isnan(x) and math.isnan(y))

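However, this fails on containers such as [float("nan")], and isnan barfs on non-numbers (so the complexity increases).

So, how do people compare complex Python objects that may contain nan?

PS. Motivation: when comparing two rows in a pandas DataFrame, I would convert them into dicts and compare the dicts element by element.

PPS. When I say "compare", I am thinking of diff, not equalp.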

Solution:

Suppose you have a DataFrame with nan values:

In [10]: df = pd.DataFrame(np.random.randint(0, 20, (10, 10)).astype(float), columns=["c%d"%d for d in range(10)])

In [10]: df.where(np.random.randint(0,2, df.shape).astype(bool), np.nan, inplace=True)

In [10]: df
Out[10]:
     c0    c1    c2    c3    c4    c5    c6    c7   c8    c9
0   NaN   6.0  14.0   NaN   5.0   NaN   2.0  12.0  3.0   7.0
1   NaN   6.0   5.0  17.0   NaN   NaN  13.0   NaN  NaN   NaN
2   NaN  17.0   NaN   8.0   6.0   NaN   NaN  13.0  NaN   NaN
3   3.0   NaN   NaN  15.0   NaN   8.0   3.0   NaN  3.0   NaN
4   7.0   8.0   7.0   NaN   9.0  19.0   NaN   0.0  NaN  11.0
5   NaN   NaN  14.0   2.0   NaN   NaN   0.0   NaN  NaN   8.0
6   3.0  13.0   NaN   NaN   NaN   NaN   NaN  12.0  3.0   NaN
7  13.0  14.0   NaN   5.0  13.0   NaN  18.0   6.0  NaN   5.0
8   3.0   9.0  14.0  19.0  11.0   NaN   NaN   NaN  NaN   5.0
9   3.0  17.0   NaN   NaN   0.0   NaN  11.0   NaN  NaN   0.0

And you want to compare rows, say row 0 and row 8. Then just use fillna and do a vectorized comparison:

In [12]: df.iloc[0,:].fillna(0) != df.iloc[8,:].fillna(0)
Out[12]:
c0     True
c1     True
c2    False
c3     True
c4     True
c5    False
c6     True
c7     True
c8     True
c9     True
dtype: bool
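
One caveat with fillna(0): a NaN on one side and a genuine 0.0 on the other will compare as equal, so the fill value has to be something that cannot occur in the data. A possible alternative, sketched below, masks out the positions where both sides are NaN instead of filling them. The helper name `rows_differ` and the sample rows are hypothetical, chosen only to show the difference.

```python
import numpy as np
import pandas as pd

def rows_differ(a: pd.Series, b: pd.Series) -> pd.Series:
    """Per-label True where a and b differ, treating NaN vs NaN as equal.

    Assumes both Series share the same index (as two rows of one DataFrame do).
    """
    both_nan = a.isna() & b.isna()
    return (a != b) & ~both_nan

# Illustrative rows: c0 is NaN vs a real 0.0, c3 is NaN on both sides.
row_a = pd.Series({"c0": np.nan, "c1": 0.0, "c2": 14.0, "c3": np.nan})
row_b = pd.Series({"c0": 0.0, "c1": 0.0, "c2": 14.0, "c3": np.nan})

masked = rows_differ(row_a, row_b)
filled = row_a.fillna(0) != row_b.fillna(0)

print(masked.tolist())  # [True, False, False, False]
print(filled.tolist())  # [False, False, False, False]  (misses the c0 difference)
```

The result is a boolean Series with the same labels as the fillna version, so it can be used in the same way.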

If you only want to know which columns differ, you can use the resulting boolean array to index the columns:

In [14]: df.columns[df.iloc[0,:].fillna(0) != df.iloc[8,:].fillna(0)]
Out[14]: Index(['c0', 'c1', 'c3', 'c4', 'c6', 'c7', 'c8', 'c9'], dtype='object')
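
For objects that are not DataFrame rows, such as the dicts and nested containers mentioned in the question, a recursive comparison is one option. The following is only a sketch under stated assumptions: the names `is_nan`, `nan_equal`, and `dict_diff` are hypothetical, only dicts, lists, and tuples are descended into, and only Python floats (which includes numpy.float64) are recognised as NaN.

```python
import math

def is_nan(value):
    """True only for a float NaN; never raises on non-numeric values."""
    return isinstance(value, float) and math.isnan(value)

def nan_equal(x, y):
    """Recursive equality that treats NaN as equal to NaN."""
    if is_nan(x) and is_nan(y):
        return True
    if isinstance(x, dict) and isinstance(y, dict):
        return x.keys() == y.keys() and all(nan_equal(x[k], y[k]) for k in x)
    # Compare sequences element by element, but only if they are the same type.
    if type(x) is type(y) and isinstance(x, (list, tuple)):
        return len(x) == len(y) and all(nan_equal(a, b) for a, b in zip(x, y))
    return x == y

def dict_diff(d1, d2):
    """Sorted keys whose values differ, or that exist on one side only.

    Assumes the keys are mutually sortable (for example, all strings).
    """
    return sorted(
        k for k in d1.keys() | d2.keys()
        if k not in d1 or k not in d2 or not nan_equal(d1[k], d2[k])
    )

# Illustrative data standing in for two rows converted to dicts.
row0 = {"c0": float("nan"), "c1": 6.0, "c2": [float("nan"), 1]}
row8 = {"c0": float("nan"), "c1": 9.0, "c2": [float("nan"), 1]}

changed = dict_diff(row0, row8)
print(changed)  # ['c1']
```

This yields a diff (the list of differing keys) rather than a single True/False, which matches the "diff, not equalp" intent of the question.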
