In this post, we will see how to install Hive on an Ubuntu machine. Hive is a tool for querying and processing data stored in HDFS. It uses HQL (Hive Query Language), whose syntax is close to MySQL's, so people from a SQL background will find it easy to work with. Let’s get into the installation steps.
- Download Hive to your local machine. Keep the downloaded tarball in the home directory of your Ubuntu machine.
- Uncompress the Hive tarball. You can either right-click and extract it or use the command “tar -xzvf apache-hive-1.2.1-bin.tar.gz”.
- Now we have to set the path for Hive in the .bashrc file under the home directory. If you are not able to see the file in the file manager, press Ctrl+H to make hidden files visible. Open the .bashrc file and add the lines below.
export HIVE_HOME=/home/user/apache-hive-1.2.1-bin
export PATH=$PATH:$HIVE_HOME/bin
- After appending these lines, save and close the file. To apply the new values to the current shell, source the .bashrc file by running the command “source ~/.bashrc”.
- Hive is now installed, and you can start the Hive client by running the “hive” command in your terminal.
- By default, the meta-information about Hive schemas, tables, and other objects is stored in an embedded Derby database. Derby creates its metastore_db folder in whichever directory you launch Hive from, so starting the client from a different directory gives you a new, empty metastore. We have to change it to MySQL so that the “metastore” details persist and are not created again and again whenever you log in to the Hive client.
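The .bashrc edit from the steps above can also be scripted. The following is a minimal sketch, not the only way to do it: add_hive_to_rc is a hypothetical helper name, and it assumes the tarball was extracted to the directory you pass in. It appends the two export lines only if HIVE_HOME is not already exported in the file, so running it twice does not duplicate them.

```shell
# Hypothetical helper: append the Hive exports to a shell rc file once.
# $1 = Hive installation directory, $2 = rc file (for example ~/.bashrc)
add_hive_to_rc() {
  hive_home="$1"
  rc_file="$2"
  # Skip if an export for HIVE_HOME is already present.
  if grep -q '^export HIVE_HOME=' "$rc_file" 2>/dev/null; then
    return 0
  fi
  {
    echo "export HIVE_HOME=$hive_home"
    # Single quotes keep $PATH and $HIVE_HOME literal in the rc file.
    echo 'export PATH=$PATH:$HIVE_HOME/bin'
  } >> "$rc_file"
}
```

For the layout used in this post you would run add_hive_to_rc /home/user/apache-hive-1.2.1-bin ~/.bashrc and then source ~/.bashrc.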
Steps to connect Hive with a MySQL metastore
- Here I assume that you have already installed MySQL on your machine. If not, please check this post for installing MySQL and then continue.
- Navigate to the mysql scripts folder inside the Hive installation directory using the command below.
cd $HIVE_HOME/scripts/metastore/upgrade/mysql
- Next, run the below commands to create the metastore for hive in MySQL
mysql -uroot -proot
CREATE DATABASE metastore;
USE metastore;
SOURCE hive-schema-1.2.0.mysql.sql;
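If you prefer not to type the statements at the mysql prompt, the same three statements can be piped to the client in one go. This is a sketch under assumptions: metastore_sql is a hypothetical helper name, and it relies on the mysql client processing SOURCE when it reads from standard input, with the schema path resolved relative to the directory you run it from.

```shell
# Hypothetical helper: print the statements that create the metastore.
# $1 = database name, $2 = schema script shipped with Hive
metastore_sql() {
  printf 'CREATE DATABASE %s;\n' "$1"
  printf 'USE %s;\n' "$1"
  printf 'SOURCE %s;\n' "$2"
}
```

From inside $HIVE_HOME/scripts/metastore/upgrade/mysql, running metastore_sql metastore hive-schema-1.2.0.mysql.sql | mysql -uroot -proot should have the same effect as the interactive steps.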
- After that, you have to download the MySQL connector (the MySQL JDBC driver JAR) and copy it to the lib directory inside the Hive installation directory ($HIVE_HOME/lib).
- Just as Hadoop has hdfs-site.xml, Hive has a hive-site.xml file that needs to be configured.
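Copying the connector can be sketched as below. The helper name install_connector is hypothetical, and the JAR path is whatever file you downloaded; the post does not fix a particular connector version, so none is assumed here.

```shell
# Hypothetical helper: copy a downloaded JDBC connector JAR into Hive's lib folder.
# $1 = path to the connector JAR, $2 = Hive installation directory
install_connector() {
  jar="$1"
  lib_dir="$2/lib"
  if [ ! -f "$jar" ]; then
    echo "connector JAR not found: $jar" >&2
    return 1
  fi
  if [ ! -d "$lib_dir" ]; then
    echo "Hive lib directory not found: $lib_dir" >&2
    return 1
  fi
  cp "$jar" "$lib_dir/"
}
```

The two checks are there so that a wrong path fails with a clear message instead of creating a file named lib in the wrong place.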
A few mandatory properties need to be added to the hive-site.xml file. The properties are:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost/metastore</value>
    <description>the URL of the MySQL database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root</value>
  </property>
</configuration>
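The same file can be generated from the shell instead of being typed by hand. This is only a sketch: write_hive_site is a hypothetical helper name, the property names and driver class are the ones listed above, and the properties are wrapped in the <configuration> root element that Hadoop-style site files use.

```shell
# Hypothetical helper: write a minimal hive-site.xml for a MySQL metastore.
# $1 = output file, $2 = JDBC URL, $3 = MySQL user, $4 = MySQL password
# Values are inserted verbatim, so the caller must escape XML special
# characters such as & in the URL.
write_hive_site() {
  cat > "$1" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>$2</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>$3</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>$4</value>
  </property>
</configuration>
EOF
}
```

With the values used in this post, the call would be write_hive_site hive-site.xml jdbc:mysql://localhost/metastore root root.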
- You can also download the hive-site.xml with the required properties from our site.
- Copy this hive-site.xml file to the conf folder inside the Hive installation directory ($HIVE_HOME/conf).
- We are all set to start Hive now. The Hive metastore service can be started from a terminal window on the Ubuntu machine with the following command.
nohup hive --service metastore &
nohup keeps the command running after you close the terminal, and the trailing & runs it in the background. If you want to run the metastore in the foreground instead, use the command “hive --service metastore” without nohup and &.
In case we want to access Hive from third-party tools like Hue, we need to start another Hive daemon, HiveServer2, using the command
hive --service hiveserver2 &
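When both daemons run in the background, it helps to keep their output and process IDs. The sketch below is one way to do that, not something Hive requires: start_service is a hypothetical helper name, and the log file names in the example are arbitrary.

```shell
# Hypothetical helper: start a command with nohup, send its output to a log
# file, and print the PID of the background process.
# $1 = log file, remaining arguments = command to run
start_service() {
  log_file="$1"
  shift
  nohup "$@" > "$log_file" 2>&1 &
  echo $!
}

# Example (commented out so that defining the helper starts nothing):
# metastore_pid=$(start_service metastore.log hive --service metastore)
# hs2_pid=$(start_service hiveserver2.log hive --service hiveserver2)
# kill -0 "$metastore_pid" && echo "metastore is running"
```

Keeping the PID lets you stop a daemon later with kill instead of searching the process list.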
That’s it. You have successfully installed hive. Happy querying 🙂