1. pom.xml
<!-- spark -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.3.4</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.3.4</version>
</dependency>
<!-- log4j 1.2.17 -->
<dependency>
    <groupId>log4j</groupId>
    <artifactId>log4j</artifactId>
    <version>1.2.17</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>14.0.1</version>
</dependency>
The Scala code has to be compiled and the result packaged as a Maven fat jar, so the build section needs two plugins:
<build>
    <!-- shorter jar name -->
    <finalName>myspark</finalName>
    <plugins>
        <!-- scala -->
        <plugin>
            <groupId>org.scala-tools</groupId>
            <artifactId>maven-scala-plugin</artifactId>
            <version>2.15.2</version>
            <executions>
                <execution>
                    <goals>
                        <goal>compile</goal>
                        <goal>testCompile</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
        <!-- fat jar -->
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-assembly-plugin</artifactId>
            <version>2.5.5</version>
            <configuration>
                <archive>
                    <manifest>
                        <mainClass>com.njbdqn.MySpark</mainClass>
                    </manifest>
                </archive>
                <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef>
                </descriptorRefs>
            </configuration>
            <executions>
                <execution>
                    <id>make-assembly</id>
                    <phase>package</phase>
                    <goals>
                        <goal>single</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
2. Scala configuration in the IDE: under Settings, set Java Compiler to 1.8; under Project Structure, set the language level to 8; under Libraries, add scala-sdk.
3. Directory layout
4. Spark code
package com.njbdqn

import org.apache.spark.{SparkConf, SparkContext}

object MySpark {
  def main(args: Array[String]): Unit = {
    // 1. SparkConf: local mode with 2 cores; local[*] would use all cores
    val conf = new SparkConf().setMaster("local[2]").setAppName("myjob")
    // 2. SparkContext: the entry point for submitting a Spark application
    val sc = new SparkContext(conf)
    // 3. Use sc to create an RDD, then run the transformations and the action
    sc.textFile("file:////root/sparkdata/wordcount.txt")
    // sc.textFile("file:////D:\\idea\\ideaProjects\\spark_projects\\myspark8\\src\\main\\scala\\com\\njbdqn\\wordcount.txt")
      .flatMap(_.split(" "))
      .map((_, 1))
      .reduceByKey(_ + _)
      .foreach(println(_))
    // 4. Stop sc to end the job
    sc.stop()
  }
}
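The flatMap / map / reduceByKey chain can be tried out without Spark at all. The following is a minimal sketch on plain Scala collections, not the author's code: `WordCountSketch`, `wordCount` and the sample lines are made up for illustration, and `groupBy` followed by a sum stands in for `reduceByKey(_ + _)`, which ordinary collections do not have.

```scala
// Plain Scala collections stand in for the RDD here; no Spark dependency is needed.
object WordCountSketch {
  // Hypothetical helper: counts space-separated words in in-memory lines.
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split(" "))   // one element per word, as in the RDD version
      .map((_, 1))             // pair every word with the count 1
      .groupBy(_._1)           // collect the pairs that share a word
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // add up the 1s per word

  def main(args: Array[String]): Unit = {
    val sample = Seq("hello spark", "hello scala") // made-up input
    wordCount(sample).foreach(println(_))
  }
}
```

For the sample input, "hello" ends up with a count of 2 and the other two words with 1 each. The order in which the pairs are printed is not guaranteed.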
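5. After running Maven package, take the fat jar from target (the one whose name ends in jar-with-dependencies), copy it to the Linux machine, and run it there with java -jar followed by the jar path, or submit it with spark-submit from Spark's bin directory: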
./spark-submit --class com.njbdqn.MySpark /root/myspark-jar-with-dependencies.jar
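One thing to keep in mind when submitting this way: properties set directly on a SparkConf take precedence over flags passed to spark-submit, so a hard-coded setMaster("local[2]") would win over a --master flag. A minimal sketch of one way around that, assuming local[2] is only wanted as a fallback; `ConfSketch` and `buildConf` are hypothetical names, and the Spark dependency from the pom is assumed to be on the classpath.

```scala
import org.apache.spark.SparkConf

object ConfSketch {
  // Hypothetical helper: builds the configuration without forcing a master.
  def buildConf(): SparkConf =
    new SparkConf()                              // picks up spark.* system properties
      .setAppName("myjob")
      .setIfMissing("spark.master", "local[2]")  // used only when no master is set yet
}
```

With this, `new SparkContext(ConfSketch.buildConf())` uses whatever master spark-submit supplies, while a run from the IDE or through java -jar, where nothing has set spark.master, falls back to local[2].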