2 December 2018
In this blog post we will install Apache Hive on an Ubuntu machine (Ubuntu 16.04.5 LTS, GNU/Linux 4.4.0-36-generic x86_64).
Once the installation is complete, we will run queries using the Hive Query Language (HQL) to verify it.
Before installing Hive, we need to make sure that both Java and Hadoop are installed and configured on the cluster.
First, update Ubuntu with the latest software and patches, if available:
sudo apt-get update && sudo apt-get -y dist-upgrade
Use the command below to install the OpenJDK version of Java:
sudo apt-get -y install openjdk-8-jdk-headless
Next, download the latest stable Hive installation archive from an Apache mirror site:
cd /tmp
sudo wget https://www-eu.apache.org/dist/hive/stable-2/apache-hive-2.3.4-bin.tar.gz
Once the file is downloaded, uncompress the tar archive and move it to the installation location:
tar -xvf apache-hive-2.3.4-bin.tar.gz
mv apache-hive-2.3.4-bin /usr/local/hive
If you want to run Hive as a user other than root, you need to change the ownership of the Hive directory to the desired user and give it the proper permissions.
In my case, Apache Hive is being installed for user hduser at location /usr/local/hive:
## Give 755 permission to the folder
chmod -R 755 /usr/local/hive
## Change ownership
chown -R hduser /usr/local/hive
Skip this step if you are installing Hive as the default user.
We have now moved the Hive installation to /usr/local/hive. We need to add this path to the system PATH if we want to access Hive from anywhere on this Ubuntu machine.
On Debian-based systems, .bashrc is a shell script that Bash runs whenever it is started interactively; it initializes an interactive shell session. Use a text editor like nano to open and edit the file.
Set the Hive home path in the .bashrc file as shown below:
#HIVE Path
export HIVE_HOME=/usr/local/hive
export HIVE_CONF_DIR=/usr/local/hive/conf
export PATH=$HIVE_HOME/bin:$PATH
Now, to make the Hive path available, we need to reload the .bashrc file using the source command:
source ~/.bashrc
Before running Hive, we need to make sure that Apache Hadoop and Java are set up on the PATH and running properly. My .bashrc contains the following Hadoop and Java settings:
#HADOOP VARIABLES START
export HADOOP_HOME="/usr/local/hadoop"
export PATH="$HADOOP_HOME/bin:$PATH"
export PATH="$HADOOP_HOME/sbin:$PATH"
export HADOOP_MAPRED_HOME="$HADOOP_HOME"
export HADOOP_COMMON_HOME="$HADOOP_HOME"
export HADOOP_HDFS_HOME="$HADOOP_HOME"
export YARN_HOME="$HADOOP_HOME"
export HADOOP_COMMON_LIB_NATIVE_DIR="$HADOOP_HOME/lib/native"
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
#HADOOP VARIABLES END
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$JAVA_HOME/bin:$PATH
Now use the hadoop version command to check whether Apache Hadoop is available:
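Beyond hadoop version, a quick sanity check can confirm that both tools are reachable. This is a small sketch, assuming the exports above have been loaded into the current shell; it prints the resolved location of each tool, or a warning if one is missing from the PATH.

```shell
# Check that hadoop and java are resolvable from the current PATH.
for tool in hadoop java; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool found at $(command -v "$tool")"
  else
    echo "$tool is NOT on PATH" >&2
  fi
done
```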
Let’s create the directory in the Hadoop Distributed File System (HDFS) where Hive can store its data:
hdfs dfs -mkdir -p /user/hive/warehouse
Now give the proper permissions to the warehouse directory:
hdfs dfs -chmod 755 /user/hive/warehouse
Now let’s tell Hive which database it should use for its schema definitions. The command below initializes the metastore with Derby as its backing database. We can also specify this in the Hive configuration file, hive-site.xml.
$HIVE_HOME/bin/schematool -initSchema -dbType derby
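If you prefer the configuration-file route mentioned above, the embedded Derby metastore can be described in $HIVE_CONF_DIR/hive-site.xml. Below is a minimal sketch; the property values shown are the standard Hive defaults for an embedded Derby setup and should be adjusted to your environment.

```xml
<configuration>
  <!-- Location in HDFS where Hive stores table data -->
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <!-- Embedded Derby metastore (single-user; fine for a local install) -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
  </property>
</configuration>
```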
We will create a new database named niten_test and display all existing databases using the SHOW DATABASES command:
CREATE DATABASE IF NOT EXISTS niten_test;
SHOW DATABASES;
We have just created our own database, which we can use to create tables, so switch to the database you just created.
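Switching is done with the USE statement (niten_test is the database created above):

```sql
-- Make niten_test the current database for this session
USE niten_test;
```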
Now create a table inside this database with the fields below:
CREATE TABLE IF NOT EXISTS niten_table(
  id INT,
  first_name STRING,
  last_name STRING,
  website STRING);
Once the table is successfully created, we can list the tables and display the table's schema:
SHOW TABLES;
DESC niten_table;
Now insert a row into the table:
INSERT INTO TABLE niten_table VALUES(1,'Nitendra','Gautam','nitendragautam.com');
SELECT * FROM niten_table;
To conclude, we have installed and validated Apache Hive on an Ubuntu server.