python – pandas：多列的to_numeric

我正在使用以下df：

@H_404_5@c.sort_values('2005', ascending=False).head(3)
      GeoName ComponentName     IndustryId IndustryClassification Description                                2004 2005  2006  2007  2008  2009 2010 2011 2012 2013 2014
37926 Alabama Real GDP by state 9          213                    Support activities for mining              99   98    117   117   115   87   96   95   103  102  (NA)
37951 Alabama Real GDP by state 34         42                     Wholesale Trade                            9898 10613 10952 11034 11075 9722 9765 9703 9600 9884 10199
37932 Alabama Real GDP by state 15         327                    NonMetallic mineral products manufacturing 980  968   940   1084  861   724  714  701  589  641  (NA)

我想在所有年份强制数字：

@H_404_5@c['2014'] = pd.to_numeric(c['2014'], errors='coerce')

有没有一种简单的方法可以做到这一点,还是我必须全部输入？

解决方法:

更新：之后您无需转换您的值,您可以在阅读CSV时即时执行此操作：

@H_404_5@In [165]: df=pd.read_csv(url, index_col=0, na_values=['(NA)']).fillna(0)

In [166]: df.dtypes
Out[166]:
GeoName                    object
ComponentName              object
IndustryId                  int64
IndustryClassification     object
Description                object
2004                        int64
2005                        int64
2006                        int64
2007                        int64
2008                        int64
2009                        int64
2010                        int64
2011                        int64
2012                        int64
2013                        int64
2014                      float64
dtype: object

如果需要将多个列转换为数字dtypes – 请使用以下技术：

样本来源DF：

@H_404_5@In [271]: df
Out[271]:
     id    a  b  c  d  e    f
0  id_3  AAA  6  3  5  8    1
1  id_9    3  7  5  7  3  BBB
2  id_7    4  2  3  5  4    2
3  id_0    7  3  5  7  9    4
4  id_0    2  4  6  4  0    2

In [272]: df.dtypes
Out[272]:
id    object
a     object
b      int64
c      int64
d      int64
e      int64
f     object
dtype: object

将所选列转换为数字dtypes：

@H_404_5@In [273]: cols = df.columns.drop('id')

In [274]: df[cols] = df[cols].apply(pd.to_numeric, errors='coerce')

In [275]: df
Out[275]:
     id    a  b  c  d  e    f
0  id_3  NaN  6  3  5  8  1.0
1  id_9  3.0  7  5  7  3  NaN
2  id_7  4.0  2  3  5  4  2.0
3  id_0  7.0  3  5  7  9  4.0
4  id_0  2.0  4  6  4  0  2.0

In [276]: df.dtypes
Out[276]:
id     object
a     float64
b       int64
c       int64
d       int64
e       int64
f     float64
dtype: object

PS如果要选择所有字符串(对象)列,请使用以下简单技巧：

@H_404_5@cols = df.columns[df.dtypes.eq('object')]

python – pandas：多列的to_numeric

相关推荐