Rodney and Arlyn Ogden

Pig SET Keys

Pig uses the set command to assign values to keys. All keys and their corresponding values (for both Pig and Hadoop) are case sensitive. If the set command is used without a key/value pair argument, Pig prints all configurations and system properties.
The keys below are among those that can be set.
/*****************************************************************************/
/* You can turn off or turn on the debugging feature in Pig by passing       */
/* on/off to this key.                                                       */
/*****************************************************************************/
SET debug 'on';

/*****************************************************************************/
/* Set the default number of reducers for the MapReduce jobs that Pig        */
/* generates by passing a positive whole number as the value of this key.    */
/*****************************************************************************/
SET default_parallel 100;
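
/*****************************************************************************/
/* Illustrative sketch: default_parallel applies to every reduce-side        */
/* operator unless a PARALLEL clause overrides it for that operator.  The    */
/* paths, aliases, and schema below are assumptions for illustration.        */
/*****************************************************************************/
words   = LOAD '/data/input/words' AS (word:chararray);
grouped = GROUP words BY word;                 -- uses the default_parallel value
counts  = FOREACH grouped GENERATE group AS word, COUNT(words) AS n;
ranked  = ORDER counts BY n DESC PARALLEL 10;  -- PARALLEL overrides the default here
STORE ranked INTO '/data/output/word_counts';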

/*****************************************************************************/
/* Set the name of the job by passing a string value to this key.            */
/*****************************************************************************/
SET job.name 'my job';

/*****************************************************************************/
/* Set the priority of the job by passing one of the following values to     */
/* this key:                                                                 */
/*    very_low                                                               */
/*    low                                                                    */
/*    normal                                                                 */
/*    high                                                                   */
/*    very_high                                                              */
/*****************************************************************************/
SET job.priority very_high;

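/*****************************************************************************/
/* Disable speculative execution for map tasks, so Hadoop does not launch    */
/* duplicate attempts of slow-running map tasks.                             */
/*****************************************************************************/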
SET mapred.map.tasks.speculative.execution false;

/*****************************************************************************/
/* The following SET command will suppress the creation of the _SUCCESS file */
/* in the output directory.                                                  */
/*****************************************************************************/
SET mapreduce.fileoutputcommitter.marksuccessfuljobs false;

/*****************************************************************************/
/* The number of milliseconds before a task will be terminated if it neither */
/* reads an input, writes an output, nor updates its status string. A value  */
/* of 0 disables the timeout.  Note: 600,000 milliseconds equals 10 minutes, */
/* so the 1,800,000 set here equals 30 minutes.                              */
/*****************************************************************************/
SET mapreduce.task.timeout 1800000;

/*****************************************************************************/
/* The total amount of buffer memory to use while sorting files, in          */
/* megabytes.  By default, gives each merge stream 1MB, which should         */
/* minimize seeks.  Hadoop rejects values above 2047.                        */
/*****************************************************************************/
SET io.sort.mb 2047;

/*****************************************************************************/
/* Only disable multiquery as a temporary workaround for problems.           */
/* multiquery is on by default.                                              */
/*****************************************************************************/
SET opt.multiquery false;
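
/*****************************************************************************/
/* Illustrative sketch: with multiquery on, Pig plans both STORE statements  */
/* together and reads the shared input once; with it off, each STORE is      */
/* executed separately.  Paths and schema are assumptions for illustration.  */
/*****************************************************************************/
events = LOAD '/data/input/events' AS (user_id:chararray, nbytes:long);
large  = FILTER events BY nbytes > 1000L;
small  = FILTER events BY nbytes <= 1000L;
STORE large INTO '/data/output/large';
STORE small INTO '/data/output/small';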

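/*****************************************************************************/
/* Comma-separated list of directories that Pig searches when resolving      */
/* the files named in IMPORT statements.                                     */
/*****************************************************************************/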
SET pig.import.search.path '/usr/local/pig,/grid/pig';
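
/*****************************************************************************/
/* Illustrative sketch: with the search path above, a macro file can be      */
/* imported by name.  The file my_macros.pig is hypothetical and is assumed  */
/* to live in one of the listed directories.                                 */
/*****************************************************************************/
IMPORT 'my_macros.pig';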

/*****************************************************************************/
/* Path of the client-side log file that Pig writes.                         */
/*****************************************************************************/
SET pig.logfile mylogfile.log;

/*****************************************************************************/
/* For streaming, set the path from which data is not to be shipped by       */
/* passing the path as a string to this key.                                 */
/* The path below is a placeholder.                                          */
/*****************************************************************************/
SET stream.skippath '/path/not/to/ship';