{"id":1435,"date":"2016-08-16T07:32:48","date_gmt":"2016-08-16T07:32:48","guid":{"rendered":"http:\/\/codethataint.com\/blog\/?p=1435"},"modified":"2016-08-16T16:29:21","modified_gmt":"2016-08-16T16:29:21","slug":"mapreduce-program-with-default-mapper-and-reducer","status":"publish","type":"post","link":"https:\/\/codethataint.com\/blog\/mapreduce-program-with-default-mapper-and-reducer\/","title":{"rendered":"MapReduce Program with Default Mapper and Reducer"},"content":{"rendered":"<p><strong>Default Mapper and Reducer are from<\/strong><\/p>\n<pre>\r\nimport org.apache.hadoop.mapreduce.Mapper;\r\nimport org.apache.hadoop.mapreduce.Reducer;\r\n<\/pre>\n<p>When the mapper and reducer are not set using <strong>job.setMapperClass()<\/strong> and <strong>job.setReducerClass()<\/strong>, the default <strong>Mapper.class<\/strong> and <strong>Reducer.class<\/strong> are used.<\/p>\n<p>The default Mapper.class is an identity mapper: it passes each input key-value pair through unchanged. With the default text input format, the key is the byte offset at which a line starts and the value is the line itself. The input and output of the default mapper and reducer are as shown.<\/p>\n<p><strong>Input<\/strong><\/p>\n<pre>\r\nTest\r\nTest\r\nTest\r\n<\/pre>\n<p><strong>Output<\/strong><\/p>\n<pre>\r\n0\tTest\r\n5\tTest\r\n10\tTest\r\n<\/pre>\n<p><em>Each key is the byte offset at which the line starts: the 4 characters of Test plus 1 line-terminator character make 5 bytes per line.<\/em><\/p>\n<pre class=\"brush: java; title: ; notranslate\" title=\"\">\r\npackage com.mugilmapred;\r\n\r\nimport org.apache.hadoop.conf.Configured;\r\nimport org.apache.hadoop.fs.FileSystem;\r\nimport org.apache.hadoop.fs.Path;\r\nimport org.apache.hadoop.mapreduce.lib.input.FileInputFormat;\r\nimport org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;\r\nimport org.apache.hadoop.mapreduce.Job;\r\nimport org.apache.hadoop.util.Tool;\r\nimport org.apache.hadoop.util.ToolRunner;\r\n\r\npublic class Test extends Configured implements Tool{\r\n\r\n\tpublic static void main(String&#x5B;] args) throws Exception {\r\n\t\t\/\/ Run the job through ToolRunner so that generic Hadoop options are parsed\r\n\t\tTest objTest = new Test();\r\n\t\tint result = 
ToolRunner.run(objTest, args);\r\n\t\tSystem.exit(result);\r\n\t}\r\n\r\n\tpublic int run(String&#x5B;] args) throws Exception {\r\n\t\t\/\/ No mapper or reducer is set, so the default Mapper.class and Reducer.class are used\r\n\t\tJob job = Job.getInstance(getConf());\r\n\t\tjob.setJarByClass(Test.class);\r\n\t\t\r\n\t\tPath inputFilepath = new Path(args&#x5B;0]);\r\n\t\tPath outputFilepath = new Path(args&#x5B;1]);\r\n\t\t\r\n\t\tFileInputFormat.addInputPath(job, inputFilepath);\r\n\t\tFileOutputFormat.setOutputPath(job, outputFilepath);\r\n\t\t\r\n\t\tFileSystem fs = FileSystem.newInstance(getConf());\r\n\t\t\r\n\t\t\/\/ Delete the output directory if it already exists; otherwise the job fails\r\n\t\tif(fs.exists(outputFilepath))\r\n\t\t{\r\n\t\t\tfs.delete(outputFilepath, true);\r\n\t\t}\r\n\t\treturn job.waitForCompletion(true)? 0:1;\r\n\t}\r\n}\r\n<\/pre>\n<p>If you do not call setJarByClass, the job fails with:<\/p>\n<pre>\r\nError: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.mugilmapred.Test$Map not found\r\n<\/pre>\n<p>When you run locally this does not need to be specified. When you run on a cluster, a class contained in the job jar must be specified; otherwise the system does not know which jar file contains the mapper.<\/p>\n<pre class=\"brush: java; title: ; notranslate\" title=\"\">\r\njob.setJarByClass(Test.class);\r\n<\/pre>\n<p>You can also use setJar, as below:<\/p>\n<pre class=\"brush: java; title: ; notranslate\" title=\"\">\r\njob.setJar(&quot;Test-0.0.1.jar&quot;);\r\n<\/pre>\n<p><strong>Using a Predefined Reducer in Program<\/strong><\/p>\n<pre class=\"brush: java; title: ; notranslate\" title=\"\">\r\n.\r\n.\r\n.\r\njob.setMapperClass(WordMapper.class);\r\njob.setMapOutputKeyClass(Text.class);\r\njob.setMapOutputValueClass(LongWritable.class);\r\n    \r\njob.setReducerClass(LongSumReducer.class);\r\njob.setNumReduceTasks(1);\r\n.\r\n.\r\n.\r\n.\r\n<\/pre>\n<p><strong>LongSumReducer.class takes the mapper output ([count,1] [count,1] [count,1] [count,1]) and sums the values for each key to give 
([count,4])<\/strong><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Default Mapper and Reducer are from import org.apache.hadoop.mapreduce.Mapper; import org.apache.hadoop.mapreduce.Reducer; When the mapper and reducer are not set using job.setMapperClass() and job.setReducerClass(), the default Mapper.class and Reducer.class are used. The default Mapper.class is an identity mapper that passes each input key-value pair through unchanged. The input and output of the default mapper and reducer are as shown. Input Test Test Test Output&hellip; <a href=\"https:\/\/codethataint.com\/blog\/mapreduce-program-with-default-mapper-and-reducer\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[150],"tags":[],"class_list":["post-1435","post","type-post","status-publish","format-standard","hentry","category-map-reduce"],"_links":{"self":[{"href":"https:\/\/codethataint.com\/blog\/wp-json\/wp\/v2\/posts\/1435","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/codethataint.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/codethataint.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/codethataint.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/codethataint.com\/blog\/wp-json\/wp\/v2\/comments?post=1435"}],"version-history":[{"count":5,"href":"https:\/\/codethataint.com\/blog\/wp-json\/wp\/v2\/posts\/1435\/revisions"}],"predecessor-version":[{"id":1440,"href":"https:\/\/codethataint.com\/blog\/wp-json\/wp\/v2\/posts\/1435\/revisions\/1440"}],"wp:attachment":[{"href":"https:\/\/codethataint.com\/blog\/wp-json\/wp\/v2\/media?parent=1435"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/codethataint.com\/blog\/wp-json\/wp\/v2\/categories?post=1435"},{"taxonomy":"post_tag","embeddable":true
,"href":"https:\/\/codethataint.com\/blog\/wp-json\/wp\/v2\/tags?post=1435"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}