Pig Recipes
Recipe 003
Goal
Copy a file from one location to another within the local file system. Use parameter substitution to specify the source directory.
Source File: argSourceDirectory/KeyValuePair.txt
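Target Directory: /home/hduser/data/Recipe003.out (the location the script's STORE statement writes to; Pig creates it as a directory of part files)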
Parameter Substitution
- argSourceDirectory (Required, no default value defined) - The directory name that contains the source file. Ensure that the value used in this parameter does not end with a slash (/).
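The trailing-slash rule can be enforced before Pig is ever started. The following is a minimal sketch, not part of the original recipe: the function names normalize_dir and pig_command are hypothetical, and the command line is the one shown in the script header. It prints the command instead of running it so it can be inspected first.

```shell
#!/bin/sh
# Hypothetical helper: strip trailing slashes from a directory path so that
# '$argSourceDirectory/KeyValuePair.txt' never expands to a double slash.
normalize_dir() {
    dir=$1
    # Remove trailing slashes one at a time, but keep a bare "/" intact.
    while [ "$dir" != "/" ] && [ "${dir%/}" != "$dir" ]; do
        dir=${dir%/}
    done
    printf '%s\n' "$dir"
}

# Hypothetical helper: print (rather than run) the recipe's pig command line
# with the normalized source directory filled in.
pig_command() {
    printf 'pig -x local -p argSourceDirectory=%s Recipe003.pig\n' \
        "$(normalize_dir "$1")"
}
```

For example, pig_command /home/hduser/data/ prints the command with argSourceDirectory=/home/hduser/data, which can then be pasted into the terminal.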
After execution, the output directory should look like this:
hduser> ls -la
total 94492
drwxrwxr-x. 2 hduser hduser     4096 Apr 21 10:42 .
drwxrwxr-x. 3 hduser hduser       52 Apr 21 10:42 ..
-rw-r--r--. 1 hduser hduser 32523640 Apr 21 10:42 part-m-00000
-rw-rw-r--. 1 hduser hduser   254100 Apr 21 10:42 .part-m-00000.crc
-rw-r--r--. 1 hduser hduser 32515459 Apr 21 10:42 part-m-00001
-rw-rw-r--. 1 hduser hduser   254036 Apr 21 10:42 .part-m-00001.crc
-rw-r--r--. 1 hduser hduser 30942031 Apr 21 10:42 part-m-00002
-rw-rw-r--. 1 hduser hduser   241744 Apr 21 10:42 .part-m-00002.crc
-rw-r--r--. 1 hduser hduser        0 Apr 21 10:42 _SUCCESS
-rw-rw-r--. 1 hduser hduser        8 Apr 21 10:42 ._SUCCESS.crc
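Because the output is a directory of part files rather than a single file, it can help to check the result and, if needed, join the parts back together. The sketch below is an illustration only: check_pig_output and merge_parts are hypothetical names, and the check assumes the layout in the listing above (a _SUCCESS marker plus one or more part-* files).

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the directory holds the _SUCCESS
# marker and at least one part-* file.
check_pig_output() {
    out_dir=$1
    if [ ! -f "$out_dir/_SUCCESS" ]; then
        echo "missing _SUCCESS in $out_dir" >&2
        return 1
    fi
    for part in "$out_dir"/part-*; do
        # If the glob matched nothing, $part is the literal pattern.
        if [ -f "$part" ]; then
            return 0
        fi
    done
    echo "no part files in $out_dir" >&2
    return 1
}

# Hypothetical helper: write the part files, in name order, to standard output.
# The hidden .crc files are skipped because the glob does not match dotfiles.
merge_parts() {
    cat "$1"/part-*
}
```

A typical use would be check_pig_output /home/hduser/data/Recipe003.out && merge_parts /home/hduser/data/Recipe003.out > merged.txt, where merged.txt is an arbitrary file name. Note that a STORE statement without a USING clause writes with PigStorage's default tab delimiter, so the merged data is expected to be tab-separated rather than comma-separated.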
The following is the script file.
/*****************************************************************************/
/* Recipe003.pig                                                             */
/*                                                                           */
/* Purpose:                                                                  */
/*   Copy the KeyValuePair.txt file to the Recipe003.out directory. Use      */
/*   parameter substitution to specify the input directory.                  */
/*                                                                           */
/* Parameter Substitution                                                    */
/*   argSourceDirectory (Required, no default value defined)                 */
/*     The directory name that contains the source file. Ensure that         */
/*     the value used in this parameter does not end with a slash (/).       */
/*                                                                           */
/* Pig Execution Mode: local                                                 */
/* Pig Batch Execution:                                                      */
/*   pig -x local -p argSourceDirectory=/home/hduser/data Recipe003.pig      */
/*   pig -x local -p argSourceDirectory=/home/hduser/data -dryrun Recipe003.pig */
/*                                                                           */
/* The target directory must not exist prior to executing this script. Use   */
/* this command to safely delete the target directory:                       */
/*   rm -rf /home/hduser/data/Recipe003.out                                  */
/*                                                                           */
/*****************************************************************************/
/* Date     Initials Description                                             */
/* -------- -------- ------------------------------------------------------- */
/* 20160421 Reo      Initial.                                                */
/*****************************************************************************/

/*****************************************************************************/
/* Read in the data using a comma (,) as the delimiter.                      */
/*****************************************************************************/
DVDData = LOAD '$argSourceDirectory/KeyValuePair.txt' USING PigStorage(',') AS
          (
            DVDName:chararray,
            AttributeName:chararray,
            AttributeValue:chararray
          );

/*****************************************************************************/
/* Time to STORE the data that was just read in.                             */
/*****************************************************************************/
STORE DVDData INTO '/home/hduser/data/Recipe003.out';
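The script header notes that the target directory must not exist before the script runs. A small pre-flight check can make that requirement explicit. This is a sketch under stated assumptions: prepare_target and its --force argument are hypothetical and not part of Pig, and TARGET_DIR defaults to the path hard-coded in the STORE statement.

```shell
#!/bin/sh
# Target directory hard-coded in Recipe003.pig (override via the environment).
TARGET_DIR=${TARGET_DIR:-/home/hduser/data/Recipe003.out}

# Hypothetical pre-flight check: fail if the target already exists, unless
# the caller passes "--force", in which case the target is deleted first.
prepare_target() {
    target=$1
    mode=${2:-}
    if [ -e "$target" ]; then
        if [ "$mode" = "--force" ]; then
            rm -rf -- "$target"
        else
            echo "target exists: $target" >&2
            return 1
        fi
    fi
    return 0
}
```

One way to use it is prepare_target "$TARGET_DIR" && pig -x local -p argSourceDirectory=/home/hduser/data Recipe003.pig, which starts Pig only when the target is absent. Requiring an explicit --force keeps the rm -rf from running by accident.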