Spark Error – java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries

           This error occurs on Windows when a Spark application cannot find the winutils.exe binary. Winutils is part of the Hadoop ecosystem and is not included in Spark. The "null" in the path means that the Hadoop home directory was not set: neither the hadoop.home.dir system property nor the HADOOP_HOME environment variable had a value. The application itself may still run correctly after the exception is thrown, but it is better to have the binary in place to avoid unnecessary problems. To avoid the error, download winutils.exe and point hadoop.home.dir at the directory whose bin folder contains it.

Error:

java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
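
The literal "null" in the message can be reproduced with a simplified sketch. It assumes, as an illustration and not as Hadoop's actual source, that the home directory is read from the hadoop.home.dir system property first and the HADOOP_HOME environment variable second, and that the path to the binary is built by string concatenation. The class and method names are hypothetical.

```java
public class WinutilsPathDemo {

    // Simplified lookup order (assumption): system property first, then environment variable.
    static String resolveHadoopHome(String propertyValue, String envValue) {
        return propertyValue != null ? propertyValue : envValue;
    }

    // Concatenating a null String with other text produces the literal "null".
    static String winutilsPath(String hadoopHome) {
        return hadoopHome + "\\bin\\winutils.exe";
    }

    public static void main(String[] args) {
        String home = resolveHadoopHome(
                System.getProperty("hadoop.home.dir"),
                System.getenv("HADOOP_HOME"));
        // With neither value set, this prints the same path as in the error message.
        System.out.println(winutilsPath(home));
    }
}
```

When either value is set, the printed path starts with that directory instead, which is why setting hadoop.home.dir makes the error go away.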

Download Link:

  1. Link 1
  2. Link 2

           Copy the downloaded file to ANY_DIRECTORY/bin/winutils.exe

Sample Code:

           The following program counts the number of lines containing the character ‘a’.

package spark;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;

public class SparkTestApp {

    public static void main(String[] args) {

        // Must be set before the Spark context is created.
        // Replace ANY_DIRECTORY with the folder that contains bin\winutils.exe.
        System.setProperty("hadoop.home.dir", "ANY_DIRECTORY");

        // Example
        // winutils.exe is copied to C:\winutil\bin\
        // System.setProperty("hadoop.home.dir", "C:\\winutil\\");

        String logFile = "C:\\sample_log.log";
        SparkConf conf = new SparkConf().setAppName("Simple Application").setMaster("local");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Typed RDD of lines; cache() keeps it in memory for reuse.
        JavaRDD<String> logData = sc.textFile(logFile).cache();

        // Count the lines that contain the character 'a'.
        long numAs = logData.filter(new Function<String, Boolean>() {
            @Override
            public Boolean call(String s) {
                return s.contains("a");
            }
        }).count();

        System.out.println("Lines with a: " + numAs);

        // Release the Spark context when done.
        sc.close();
    }
}

Output:

Lines with a: 376358
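
If the error still appears after setting the property, the directory layout is the first thing worth checking. The following is a minimal sketch, assuming the only requirement is that a file named winutils.exe exists in a bin folder under the chosen directory. The class and method names are hypothetical, and C:\winutil is only the example location used above.

```java
import java.io.File;

public class HadoopHomeCheck {

    // True only if <hadoopHome>/bin/winutils.exe exists as a regular file.
    static boolean hasWinutils(String hadoopHome) {
        if (hadoopHome == null || hadoopHome.isEmpty()) {
            return false;
        }
        return new File(new File(hadoopHome, "bin"), "winutils.exe").isFile();
    }

    // Sets hadoop.home.dir only when the layout looks valid; returns whether it was set.
    static boolean configure(String hadoopHome) {
        if (!hasWinutils(hadoopHome)) {
            return false;
        }
        System.setProperty("hadoop.home.dir", hadoopHome);
        return true;
    }

    public static void main(String[] args) {
        // Directory to check: first argument if given, otherwise the example location.
        String candidate = args.length > 0 ? args[0] : "C:\\winutil";
        if (configure(candidate)) {
            System.out.println("hadoop.home.dir set to " + candidate);
        } else {
            System.out.println("winutils.exe not found under " + candidate + File.separator + "bin");
        }
    }
}
```

Calling configure at the top of main, before the Spark context is created, makes a wrong path show up as a clear message instead of the IOException.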
