How do I get unique values from a DataFrame using pandas?

I have a DataFrame df:

2016-06-21 06:25:09 [email protected] GET HTTP/1.1    Mozilla/5.0 (iPhone; cpu iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53   200 application/json    2130    https://edge-chat.facebook.com/pull?channel=p_100006170407238&seq=27&clientid=1d67ca6e&profile=mobile&partition=-2&sticky_token=185&msgs_recv=27&qp=y&cb=1830997782&state=active&sticky_pool=frc3c09_chat-proxy&uid=100006170407238&viewer_uid=100006170407238&m_sess=&__dyn=1Z3p5wnE-4UpwDF3GAgy78qzoC6Erz8B0GxG9xu3Z0QwFzohxO3O2G2a1mwYxm48sxadwpVEy1qK78gwUx6&__req=79&__ajax__=AYlbtcBwGC2suZLI-J88V0PWa58vtQeG3YlQLydFRsAl6UwLSjsspD7peu8mGl6NsHvd2zxfDcB6A0-XunBugUsYZ1lMYmUu97R43iV7XSfpyg&__user=100006170407238
2016-06-22 06:25:20 [email protected] POST HTTP/1.1   Mozilla/5.0 (iPhone; cpu iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53   200 application/x-javascript    20248   https://m.facebook.com/stories.PHP?aftercursor=MTQ2NjY2MzEwNToxNDY2NjYzMTA1Ojg6NzM0ODg0MDExMjAyNDY1MzA5NToxNDY2NjYyNzk1OjA%3D&tab=h_nor&__m_log_async__=1
2016-06-23 06:25:25 [email protected] CONNECT HTTP/1.1    Mozilla/5.0 (iPhone; cpu iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53   200 -   0   scontent.xx.fbcdn.net:443
2016-06-23 06:25:25 [email protected] GET HTTP/1.1    Mozilla/5.0 (iPhone; cpu iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53   200 text/html   1105    https://m.facebook.com/xti.PHP?xt=2.qid.6299270070554694533%3Amf_story_key.343726573953754118%3Aei.AI%40ecf11fb3faf9c0b1f73ce2a74bc9f228
2016-06-24 06:25:25 [email protected] CONNECT HTTP/1.1    Mozilla/5.0 (iPhone; cpu iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53   200 -   0   scontent.xx.fbcdn.net:443
2016-06-25 06:25:25 [email protected] CONNECT HTTP/1.1    Mozilla/5.0 (iPhone; cpu iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53   200 -   0   scontent.xx.fbcdn.net:443
2016-06-25 06:25:25 [email protected] CONNECT HTTP/1.1    Mozilla/5.0 (iPhone; cpu iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53   200 -   0   scontent.xx.fbcdn.net:443

I need to get the unique dates (year, month, and day only) for each ID.
Desired output:

[email protected] - 2016-06-21, 2016-06-22, 2016-06-23
[email protected] - 2016-06-24, 2016-06-25

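How can I get these dates?

Solution:

You can first extract the part you need from the date, assuming the date column holds strings that start with YYYY-MM-DD: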

df['filtered date'] = [w[:10] for w in df['date']]
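
The slice above only works when the date column holds strings. If the column is a datetime, or can be parsed as one, something along these lines may be more robust. This is a minimal sketch: the sample frame and its ID values are placeholders made up for illustration, not data from the question.

```python
import pandas as pd

# Hypothetical sample data; the IDs are placeholders, not values from the question.
df = pd.DataFrame({
    "id": ["user_a", "user_a", "user_b"],
    "date": ["2016-06-21 06:25:09", "2016-06-22 06:25:20", "2016-06-24 06:25:25"],
})

# Parse the timestamps, then format them back to year-month-day strings.
df["filtered date"] = pd.to_datetime(df["date"]).dt.strftime("%Y-%m-%d")
```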

Then use drop_duplicates:

output = df[['id','filtered date']].drop_duplicates()

Then you can sort the DataFrame for readability:

output.sort_values(by=['id','filtered date'], inplace=True)

You end up with output like this:

    id               filtered date
0   [email protected]  2016-06-24
1   [email protected]  2016-06-25
3   [email protected]  2016-06-21
4   [email protected]  2016-06-22
5   [email protected]  2016-06-23
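
To get one line per ID, as in the desired output, the deduplicated rows can be grouped by ID and their dates joined. A minimal sketch, assuming the column names id and filtered date used above; the sample IDs are placeholders and per_id is a name introduced here for illustration:

```python
import pandas as pd

# Hypothetical sample data; the IDs are placeholders, not values from the question.
output = pd.DataFrame({
    "id": ["user_b", "user_a", "user_a", "user_a", "user_b"],
    "filtered date": ["2016-06-24", "2016-06-21", "2016-06-22",
                      "2016-06-21", "2016-06-25"],
})

# Drop duplicate (id, date) pairs, sort, then join each ID's dates into one string.
per_id = (
    output.drop_duplicates()
          .sort_values(by=["id", "filtered date"])
          .groupby("id")["filtered date"]
          .agg(", ".join)
)

# Print one "id - date, date, ..." line per ID.
for user_id, dates in per_id.items():
    print(f"{user_id} - {dates}")
```

Calling .agg(list) instead of .agg(", ".join) would keep the dates as a Python list per ID, which may be more convenient for further processing.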
