PIG Installation steps

By | 28th July 2019

In the previous article, we saw how to install Apache Hive on an Ubuntu machine. Both articles assume that you have already installed the Hadoop framework on the machine. If not, please visit this post and install the Hadoop framework first.

Pig is another component of the Hadoop ecosystem for processing large volumes of data. Pig scripts are written in a language called Pig Latin. Let’s get the PIG installation started.

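  • First, download the PIG tarball. You can download that here.
  • After downloading the tarball, extract it into the home directory. The extracted folder normally carries a version suffix, so either rename it to pig or adjust the path below to match.
  • Now that we have PIG in our home directory, we need to set the paths in the .bashrc file. Add the lines below to the .bashrc file, replacing /home/user with your own home directory: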
export PIG_HOME=/home/user/pig
export PATH=$PIG_HOME/bin:$PATH
  • Reload the .bashrc file with the command “source ~/.bashrc”.
  • To check that the paths are set correctly, run “echo $PIG_HOME”; it should print the PIG directory in the terminal.
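The steps above can be sketched as a short shell session. This is only an illustration under stated assumptions: the tarball location `$HOME/Downloads/pig.tar.gz` is a hypothetical placeholder for whatever file you actually downloaded, and `$HOME/pig` stands in for the `/home/user/pig` path used above.

```shell
# Hypothetical tarball path (assumption); replace with the file you downloaded.
PIG_TARBALL="${PIG_TARBALL:-$HOME/Downloads/pig.tar.gz}"
# Install location assumed from the steps above; adjust to your machine.
export PIG_HOME="${PIG_HOME:-$HOME/pig}"

# Extract only if the tarball is really there.
if [ -f "$PIG_TARBALL" ]; then
    mkdir -p "$PIG_HOME"
    # --strip-components=1 drops the versioned top-level folder,
    # so the files land directly in $PIG_HOME.
    tar -xzf "$PIG_TARBALL" -C "$PIG_HOME" --strip-components=1
fi

# Same effect as the two export lines in .bashrc, for the current shell only.
export PATH="$PIG_HOME/bin:$PATH"
echo "PIG_HOME is $PIG_HOME"

# Confirm that the pig launcher is reachable before going further.
if command -v pig >/dev/null 2>&1; then
    pig -version
else
    echo "pig not found on PATH; check PIG_HOME and ~/.bashrc" >&2
fi
```

The exports here last only for the current terminal, which is why the lines still need to go into .bashrc to survive a new session.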

We have now installed PIG on our machine. There are two ways to work in PIG: local mode and MapReduce mode. In local mode, PIG works against the local file system and you don’t have to run the Hadoop daemons, but in MapReduce mode you do. Use the commands below to start PIG in the respective mode:

pig -x local

pig -x mapreduce
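As a quick smoke test of local mode, the sketch below writes a tiny Pig Latin script and runs it non-interactively. The sample data, the file names, and the `WORKDIR` variable are all made up for illustration; the run is skipped if the pig launcher is not on the PATH.

```shell
# Scratch directory for the demo; nothing here touches the PIG installation.
WORKDIR="$(mktemp -d)"

# Two-line sample input (made up for illustration).
printf 'hello pig\nhello hadoop\n' > "$WORKDIR/input.txt"

# Minimal Pig Latin script: load each line as one chararray field and print it.
cat > "$WORKDIR/sample.pig" <<EOF
lines = LOAD '$WORKDIR/input.txt' AS (line:chararray);
DUMP lines;
EOF

# Run in local mode only if the pig launcher is available.
if command -v pig >/dev/null 2>&1; then
    pig -x local "$WORKDIR/sample.pig"
else
    echo "pig not found on PATH; skipping the run" >&2
fi
```

Running the same script with `pig -x mapreduce` would need the Hadoop daemons to be up and the input file to be available in HDFS rather than on the local disk.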

While processing data in PIG’s grunt shell, you may see a lot of information displayed in the shell. These are the logs for the operation being run. To suppress the info-level logs and display only the errors, perform the steps below.

  • First, create the log4j.properties file under the conf directory of the PIG installation. This can be done by copying the log4j.properties.template file to log4j.properties:
cp $PIG_HOME/conf/log4j.properties.template $PIG_HOME/conf/log4j.properties
  • Next, change the logger properties in the file that are set to info so that they are set to error.

Replace
log4j.logger.org.apache.pig=info, A ===> log4j.logger.org.apache.pig=error, A

Add
log4j.logger.org.apache.hadoop = error, A
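The two edits can also be applied from the shell instead of by hand. The helper below, `quiet_pig_logs`, is a hypothetical name introduced only for this sketch; it assumes the properties file contains the `log4j.logger.org.apache.pig=info, A` line shown above.

```shell
# Hypothetical helper: apply the Replace and Add steps to a given properties file.
quiet_pig_logs() {
    props="$1"
    # Switch the Pig logger from info to error, keeping the appender name as is.
    sed 's/^\(log4j\.logger\.org\.apache\.pig[[:space:]]*=[[:space:]]*\)info/\1error/' \
        "$props" > "$props.tmp" && mv "$props.tmp" "$props"
    # Append the Hadoop logger line only if no such entry exists yet.
    if ! grep -q '^log4j\.logger\.org\.apache\.hadoop[[:space:]]*=' "$props"; then
        printf '\n%s\n' 'log4j.logger.org.apache.hadoop=error, A' >> "$props"
    fi
}

# Apply it to the file created in the first step, if it exists.
PROPS="${PIG_HOME:-$HOME/pig}/conf/log4j.properties"
if [ -f "$PROPS" ]; then
    quiet_pig_logs "$PROPS"
fi
```

Running the helper twice is harmless, since the Hadoop line is added only once. If the info messages still appear, PIG’s `-4` (`-log4jconf`) option can be pointed at the file explicitly when starting the shell.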

That’s it. You have successfully installed PIG. Enjoy processing data with PIG.