
2. Installing Spark and Python Practice

I. Installing Spark

1. Check the base environment: Hadoop and the JDK
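
The original post does not show how the environment was checked. As a minimal sketch, assuming the tools are exposed as the commands `hadoop` and `java` on the `PATH` (the helper name `find_tools` is hypothetical), a short Python script can report whether each one is found:

```python
import shutil


def find_tools(names=("hadoop", "java")):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


# Report where each tool was found, if at all.
for name, location in find_tools().items():
    print(f"{name}: {location or 'not found on PATH'}")
```

This only confirms that the commands can be located; it does not verify versions or that the Hadoop services are running.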

 

3. Configure the relevant files

4. Configure the environment

5. Run Python code

II. Python programming exercise: word-frequency count for an English text

1. Prepare the text (f1.txt)

Please send this message to those people who mean something to you,to those who have touched your life in one way or another,to those who make you smile when you really need it,to those that make you see the brighter side of things when you are really down,to those who you want to let them know that you appreciate their friendship.And if you don’t, don’t worry,nothing bad will happen to you,you will just miss out on the opportunity to brighten someone’s day with this message.

2. Insert the code


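# Path to the sample text prepared in step 1
path = '/home/hadoop/sb/f1.txt'
with open(path) as f:
    text = f.read()

# Split on whitespace; punctuation stays attached to the words
words = text.split()

# Count how many times each word occurs
sb = {}
for word in words:
    sb[word] = sb.get(word, 0) + 1

# Sort the (word, count) pairs by count, highest first
sblist = list(sb.items())
sblist.sort(key=lambda x: x[1], reverse=True)
print(sblist)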
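
Because the script above splits on whitespace only, tokens such as "you,to" or "friendship.And" are counted as single words, and "Please" and "please" would be counted separately. A possible refinement, offered as a sketch and not as the original author's method, lowercases the text and extracts runs of letters with a regular expression. The function name `count_words` and the short sample string are illustrative assumptions.

```python
import re
from collections import Counter


def count_words(text):
    """Count words case-insensitively, ignoring surrounding punctuation."""
    # A word is a run of letters, optionally joined by straight or curly
    # apostrophes, so "you,to" yields "you" and "to" while "don’t" stays whole.
    words = re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())
    return Counter(words)


# Short excerpt of f1.txt used here instead of reading the file.
sample = ("Please send this message to those people who mean something "
          "to you,to those who have touched your life.")
counts = count_words(sample)

# most_common() returns (word, count) pairs sorted by count, highest first.
print(counts.most_common(3))
```

To run it on the full text, read the file as in the script above and pass the contents to `count_words`.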


3. Output
