Pig SET Keys
Pig uses the set command to assign values to keys. All keys and their corresponding values (for Pig and Hadoop) are case sensitive. If the set command is used without a key/value pair argument, Pig prints all the configurations and system properties.
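As a minimal sketch of the two forms described above, the lines below first assign a value to a key and then print the full configuration so the assignment can be checked. The value 10 is only an illustration, and the bare form is assumed to be available in the Pig release in use.

```pig
-- Assign a value to a key. The key must be spelled exactly, since keys are case sensitive.
SET default_parallel 10;

-- With no key/value pair, Pig prints all configurations and system properties.
SET;
```

Running the bare form from the Grunt shell is a quick way to confirm which values a script will actually see.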
The following keys can be set; each is shown with an example value.
/*****************************************************************************
 * Turn Pig's debugging feature on or off by passing 'on' or 'off' to this
 * key.
 *****************************************************************************/
SET debug 'on';

/*****************************************************************************
 * Set the default number of reducers for MapReduce jobs by passing any whole
 * number as a value to this key.
 *****************************************************************************/
SET default_parallel 100;

/*****************************************************************************
 * Set the job name by passing a string value to this key.
 *****************************************************************************/
SET job.name 'my job';

/*****************************************************************************
 * Set the job priority by passing one of the following values to this key:
 *   very_low
 *   low
 *   normal
 *   high
 *   very_high
 *****************************************************************************/
SET job.priority very_high;

/*****************************************************************************
 * Disable speculative execution of map tasks.
 *****************************************************************************/
SET mapred.map.tasks.speculative.execution false;

/*****************************************************************************
 * Suppress the creation of the _SUCCESS file in the output directory.
 *****************************************************************************/
SET mapreduce.fileoutputcommitter.marksuccessfuljobs false;

/*****************************************************************************
 * The number of milliseconds before a task will be terminated if it neither
 * reads an input, writes an output, nor updates its status string. A value
 * of 0 disables the timeout. Note: a value of 600,000 milliseconds equals
 * 10 minutes; the value below (1,800,000 milliseconds) equals 30 minutes.
 *****************************************************************************/
SET mapreduce.task.timeout 1800000;

/*****************************************************************************
 * The total amount of buffer memory to use while sorting files, in
 * megabytes. By default, gives each merge stream 1MB, which should
 * minimize seeks. Hadoop rejects values of 2048 or more, and the buffer
 * must fit within the task's heap.
 *****************************************************************************/
SET io.sort.mb 1024;

/*****************************************************************************
 * Only disable multiquery as a temporary workaround for problems.
 * multiquery is on by default.
 *****************************************************************************/
SET opt.multiquery false;

/*****************************************************************************
 * Comma-separated list of directories that Pig searches when resolving
 * IMPORT statements.
 *****************************************************************************/
SET pig.import.search.path '/usr/local/pig,/grid/pig';

/*****************************************************************************
 * Name of the file that Pig writes its log to.
 *****************************************************************************/
SET pig.logfile mylogfile.log;

/*****************************************************************************
 * For streaming, set the path from which data is not to be transferred by
 * passing the desired path as a string to this key.
 *****************************************************************************/
SET stream.skippath my.arbitary.value;
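
To show where these keys typically sit in practice, here is a small sketch of a script that sets a few of them before defining a job. The input path, output path, field name and job name are all hypothetical, and the word count itself is only a stand-in for whatever the script really does.

```pig
-- Settings placed at the top of the script, before any statement that launches a job.
SET job.name 'word count example';                           -- hypothetical job name
SET default_parallel 10;                                     -- reducers used when no PARALLEL clause is given
SET mapreduce.fileoutputcommitter.marksuccessfuljobs false;  -- no _SUCCESS file in the output

-- Hypothetical input: one line of text per record.
lines   = LOAD '/data/input/lines.txt' AS (line:chararray);
words   = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;

-- GROUP triggers a reduce phase, which picks up default_parallel.
grouped = GROUP words BY word;
counts  = FOREACH grouped GENERATE group AS word, COUNT(words) AS occurrences;

-- Hypothetical output directory.
STORE counts INTO '/data/output/word_counts';
```

A PARALLEL clause on an individual operator (for example GROUP words BY word PARALLEL 20) takes precedence over default_parallel for that operator, so the key works best as a script-wide fallback.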