java – Hadoop: Reduce does not produce the desired output; it is identical to the map output

Here is my mapper:

 public static class MapClass extends Mapper<LongWritable, Text, Text, Text> {

        public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException{
            String[] fields = value.toString().split(",", -20);
            String country = fields[4];
            String numClaims = fields[8];
            if (numClaims.length() > 0 && !numClaims.startsWith("\"")) {
                context.write(new Text(country), new Text(numClaims + ",1"));
            }
        }
    }

Here is my reducer:

public void reduce(Text key, Iterator<Text> values, Context context) throws IOException, InterruptedException {
            double sum = 0.0;
            int count = 0;

            while (values.hasNext()) {
                String[] fields = values.next().toString().split(",");
                sum += Double.parseDouble(fields[0]);
                count += Integer.parseInt(fields[1]);
            }

            context.write(new Text(key), new DoubleWritable(sum/count));
        }

Here is how the job is configured:

Job job = new Job(getConf());

            job.setJarByClass(AverageByAttributeUsingCombiner.class);
            job.setJobName("AverageByAttributeUsingCombiner");

            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);

            job.setMapperClass(MapClass.class);
    //        job.setCombinerClass(Combinber.class);
            job.setReducerClass(Reduce.class);

            job.setInputFormatClass(TextInputFormat.class);
            job.setOutputFormatClass(TextOutputFormat.class);

            FileInputFormat.setInputPaths(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

    //        job.setNumReduceTasks(0); // to not run the reducer
            boolean success = job.waitForCompletion(true);
            return success ? 0 : 1;

The input has the following form:

    "PATENT","GYEAR","GDATE","APPYEAR","COUNTRY","POSTATE","ASSIGNEE","ASSCODE","CLAIMS","NCLASS","CAT","SUBCAT","CMADE","CRECEIVE","RATIOCIT","GENERAL","ORIGINAL","FWDAPLAG","BCKGTLAG","SELFCTUB","SELFCTLB","SECDUPBD","SECDLWBD"
    3070801,1963,1096,,"BE","",,1,,269,6,69,,1,,0,,,,,,,
    3070802,1963,1096,,"US","TX",,1,,2,6,63,,0,,,,,,,,,
    3070803,1963,1096,,"US","IL",,1,,2,6,63,,9,,0.3704,,,,,,,
    3070804,1963,1096,,"US","OH",,1,,2,6,63,,3,,0.6667,,,,,,,

The output of the whole MapReduce job looks like this:

"AR"	5,1
"AR"	9,1
"AR"	2,1
"AR"	15,1
"AR"	13,1
"AR"	1,1
"AR"	34,1
"AR"	12,1
"AR"	8,1
"AR"	7,1
"AR"	23,1
"AR"	3,1
"AR"	4,1
"AR"	4,1

How can I debug and fix this? I am still learning Hadoop.

Solution:

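As already mentioned, the problem is that you are not overriding the default reduce method of the abstract Reducer class. That default implementation simply writes every incoming (key, value) pair straight through, which is why the job output is identical to the map output.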

More specifically, the problem (or at least one problem) so far is that your reduce method's signature is:

 public void reduce(Text key, **Iterator**<Text> values, Context context) 
             throws IOException, InterruptedException

Instead, it should be:

 public void reduce(Text key, **Iterable**<Text> values, Context context) 
             throws IOException, InterruptedException

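Your signature would be correct for the old API (org.apache.hadoop.mapred), where you implement the reduce() method of the Reducer interface, and there it works.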
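To see why the job still runs without any error, here is a minimal sketch of the mechanism that needs no Hadoop on the classpath. `MiniReducer` is a hypothetical stand-in for the new-API Reducer class (it is not the real class), and a plain `List<String>` stands in for the Context. The subclass mirrors the question's method, which takes an Iterator and is therefore only an overload that the framework never calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the new-API Reducer base class (not the real Hadoop class).
class MiniReducer {
    // Default behaviour: pass every value through unchanged (identity).
    protected void reduce(String key, Iterable<String> values, List<String> out) {
        for (String v : values) {
            out.add(key + "\t" + v);
        }
    }

    // The "framework" side: it only ever calls the Iterable version.
    public final List<String> run(String key, List<String> values) {
        List<String> out = new ArrayList<>();
        reduce(key, values, out);
        return out;
    }
}

// Mirrors the question: Iterator instead of Iterable makes this an overload, not an override.
class BrokenAverage extends MiniReducer {
    public void reduce(String key, Iterator<String> values, List<String> out) {
        double sum = 0.0;
        int count = 0;
        while (values.hasNext()) {
            String[] fields = values.next().split(",");
            sum += Double.parseDouble(fields[0]);
            count += Integer.parseInt(fields[1]);
        }
        out.add(key + "\t" + (sum / count));
    }
}

public class OverloadDemo {
    public static void main(String[] args) {
        List<String> mapOutput = Arrays.asList("5,1", "9,1", "2,1");
        // The averaging code is never reached; each value comes back untouched.
        for (String line : new BrokenAverage().run("\"AR\"", mapOutput)) {
            System.out.println(line);
        }
    }
}
```

Running the sketch returns the three input values unchanged under the same key, which matches the symptom in the question: the compiler accepts the extra method, and the inherited identity version does all the work.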

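A good safeguard against this kind of mistake is to use @Override, because it forces a compile-time check that the signature actually matches a method in the superclass.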
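The following sketch shows the corrected shape, again without Hadoop so that it runs on its own. `BaseReducer` is a hypothetical stand-in for the real Reducer class, and a `List<String>` replaces the Context; the sample values are the fourteen "AR" lines from the question. The point is only the combination of Iterable and @Override.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the new-API Reducer base class (not the real Hadoop class).
class BaseReducer {
    // Identity default, as in the real base class.
    protected void reduce(String key, Iterable<String> values, List<String> out) {
        for (String v : values) {
            out.add(key + "\t" + v);
        }
    }

    // The "framework" side: calls the Iterable version once per key.
    public final List<String> run(String key, List<String> values) {
        List<String> out = new ArrayList<>();
        reduce(key, values, out);
        return out;
    }
}

class AverageReducer extends BaseReducer {
    // Because of @Override, changing Iterable to Iterator here is a compile error
    // instead of a silently ignored overload.
    @Override
    public void reduce(String key, Iterable<String> values, List<String> out) {
        double sum = 0.0;
        int count = 0;
        for (String v : values) {
            String[] fields = v.split(",");   // "numClaims,1" as emitted by the mapper
            sum += Double.parseDouble(fields[0]);
            count += Integer.parseInt(fields[1]);
        }
        out.add(key + "\t" + (sum / count));
    }
}

public class OverrideDemo {
    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "5,1", "9,1", "2,1", "15,1", "13,1", "1,1", "34,1",
            "12,1", "8,1", "7,1", "23,1", "3,1", "4,1", "4,1");
        // One line per key: the average of the sample values (140 / 14).
        for (String line : new AverageReducer().run("\"AR\"", sample)) {
            System.out.println(line);
        }
    }
}
```

For the sample values the reducer now emits a single line for "AR" with the average 10.0. In the real job the same change would mean declaring the method with `Iterable<Text>` and annotating it with @Override, so that any future signature mismatch is caught by the compiler.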
