I'm having trouble reorganizing this DataFrame. I think I should use pd.pivot_table or pd.crosstab, but I'm not sure how to get the job done.
Here is my DataFrame:
vicro = pd.read_csv(vicroURL)
# .ix has been removed from pandas; .loc does the same label-based column selection
vicro_subset = vicro.loc[:, ['P1', 'P10', 'P30', 'P71', 'P82', 'P90']]
In [6]: vicro_subset.head()
Out[6]:
  P1 P10 P30 P71 P82 P90
0  -   I   -   -   -   M
1  -   I   -   V   T   M
2  -   I   -   V   A   M
3  -   I   -   T   -   M
4  -   -   -   -   A   -
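What I want to do is take every possible value in this DataFrame and turn those values into rows. The new cell values would be the counts per column. It would look like: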
Out[6]:
   P1  P10  P30  P71  P82  P90
I   0    4    0    0    0    0
V   0    0    0    2    0    0
A   0    0    0    0    2    0
M   0    0    0    0    0    4
T   0    0    0    1    1    0
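Any help would be greatly appreciated! Thanks.
Edit:
To elaborate on the answers that use melt: they all helped me understand pandas better, but with the "melt" answer I end up with more unknowns: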
In [8]: melted_df = pd.melt(vicro_subset)
In [9]: melted_df.head()
Out[9]:
  variable value
0       P1     -
1       P1     -
2       P1     -
3       P1     -
4       P1     -
In [13]: grouped_melt = melted_df.groupby(['variable','value'])['value'].count()
In [14]: grouped_melt.head()
Out[14]:
variable  value
P1        -        797
          .        269
P10       -        339
          .          1
          F        132
In [15]: unstacked_group = grouped_melt.unstack()
In [16]: unstacked_group.head()
Out[16]:
<class 'pandas.core.frame.DataFrame'>
Index: 5 entries, P1 to P82
Data columns:
- 5 non-null values
. 2 non-null values
A 1 non-null values
AITV 1 non-null values
AT 2 non-null values
In [17]: transpose_unstack = unstacked_group.T
In [18]: transpose_unstack.head()
Out[18]:
variable   P1  P10   P30  P71  P82  P90
value
-         797  339  1005  452  604  634
.         269    1   NaN  NaN  NaN  NaN
A         NaN  NaN   NaN  NaN  282  NaN
AITV      NaN  NaN   NaN  NaN    1  NaN
AT        NaN  NaN   NaN    1    2  NaN
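The melt, groupby, unstack and transpose steps above can probably be condensed into one chain, with the NaN cells replaced by zero counts. This is only a minimal sketch: the sample frame is retyped from the five rows shown in the question (not the real data), and it assumes '-' means "no value" and should be left out of the result.

```python
import pandas as pd

# Sample data retyped from the five rows shown in the question (an assumption,
# standing in for the real vicro_subset).
vicro_subset = pd.DataFrame({
    'P1':  ['-', '-', '-', '-', '-'],
    'P10': ['I', 'I', 'I', 'I', '-'],
    'P30': ['-', '-', '-', '-', '-'],
    'P71': ['-', 'V', 'V', 'T', '-'],
    'P82': ['-', 'T', 'A', '-', 'A'],
    'P90': ['M', 'M', 'M', 'M', '-'],
})

# Long format: one row per (column name, cell value) pair.
melted = pd.melt(vicro_subset)

# Count each (value, variable) pair, then spread 'variable' into columns.
# fill_value=0 puts 0 where a value never occurs in a column, instead of NaN.
counts = melted.groupby(['value', 'variable']).size().unstack(fill_value=0)

# Assumption: '-' is a placeholder, so drop its row; keep the original column order.
counts = counts.drop(index='-').reindex(columns=vicro_subset.columns)

print(counts)
```

Grouping by value first means no transpose is needed at the end, and the counts stay integers because no NaN is ever introduced.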
Solution:
Alternatively, something like this should work:
In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: df = pd.DataFrame(np.random.randint(0,5,12).reshape(3,4),
                          columns=list('abcd'))
In [4]: print(df)
   a  b  c  d
0  2  2  3  1
1  0  1  0  2
2  1  3  0  4
In [5]: new = pd.concat([df[col].value_counts() for col in df.columns], axis=1)
In [6]: new.columns = df.columns
In [7]: print(new)
     a    b    c    d
0    1  NaN    2  NaN
1    1    1  NaN    1
2    1    1  NaN    1
3  NaN    1    1  NaN
4  NaN  NaN  NaN    1
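If zeros are wanted instead of NaN, as in the desired output in the question, the same value_counts idea can be extended slightly. This is a sketch rather than the answer's exact code: it hard-codes the frame printed above instead of drawing random numbers, labels the columns with the keys argument of pd.concat rather than assigning new.columns afterwards, and sorts the index explicitly instead of relying on the order concat returns.

```python
import pandas as pd

# Fixed copy of the random frame printed above, so the result is reproducible.
df = pd.DataFrame({
    'a': [2, 0, 1],
    'b': [2, 1, 3],
    'c': [3, 0, 0],
    'd': [1, 2, 4],
})

# One value_counts Series per column, aligned side by side on the union of values.
counts = pd.concat(
    [df[col].value_counts() for col in df.columns],
    axis=1,
    keys=df.columns,   # label each Series with its source column name
)

# A value missing from a column shows up as NaN; treat that as a count of 0.
counts = counts.fillna(0).astype(int).sort_index()

print(counts)
```

For the string data in the question, the same pattern should apply to vicro_subset directly, with the '-' row dropped afterwards if it is only a placeholder.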