cloud computing: Hive

Hive is a data warehouse system for Hadoop. It provides a SQL-like language called HiveQL. Because of this SQL-like interface, Hive is increasingly becoming the technology of choice for using Hadoop.

Prerequisites

The prerequisites for setting up Hive and running Hive queries are:
  • You should have the latest stable build of Hadoop
  • To install hadoop,
  • Your machine should have Java 1.6 installed
  • It is assumed you have some knowledge of Java programming and are familiar with concepts such as classes and objects, inheritance, and interfaces/abstract classes.
  • Basic knowledge of Linux will help you understand many of the Linux commands used in the tutorial
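
As a quick sanity check for the prerequisites above, the following is a minimal sketch that reports whether the java and hadoop commands can be found on your PATH. The helper name check_cmd is hypothetical, introduced only for this example. It checks for presence only, not for the Java or Hadoop version.

```shell
#!/bin/sh
# Sketch: report which prerequisite commands are available on PATH.
# check_cmd is a hypothetical helper, not part of Hadoop or Hive.

check_cmd() {
    # command -v succeeds only if the shell can resolve the command name
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Check both prerequisites; "|| true" keeps going after a missing one
for cmd in java hadoop; do
    check_cmd "$cmd" || true
done
```

If both are reported as found, running `java -version` and `hadoop version` will show which versions are installed.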

Setting up Hive

Platform

This tutorial assumes Linux. If using Windows, please install Cygwin. It is required for shell support in addition to the required software above.

Procedure

Download the most recent stable release of Hive as a tarball from one of the Apache download mirrors. This tutorial uses hive-0.9.0.tar.gz.
Unpack the tarball in the directory of your choice, using the following command:
  $ tar -xzvf hive-x.y.z.tar.gz  
Set the environment variable HIVE_HOME to point to the installation directory.
You can either run
  $ cd hive-x.y.z
  $ export HIVE_HOME=$(pwd)
 
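or set HIVE_HOME in $HOME/.profile so that it is set every time you log in.
Add the following line to it:
  export HIVE_HOME=<path_to_hive_home_directory>
For example, the following lines set HIVE_HOME and also add the Hadoop and Hive bin directories to PATH: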
  export HIVE_HOME='/Users/Work/hive-0.9.0'
  export PATH=$HADOOP_HOME/bin:$HIVE_HOME/bin:$PATH
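
If you script this step, it helps to avoid appending the same export line to your profile more than once. The following is a minimal sketch of that idea. The helper name add_line_once is hypothetical, and the Hive path in the usage comment is just the example path from this tutorial.

```shell
#!/bin/sh
# Sketch: append a line to a profile file only if it is not already there.
# add_line_once is a hypothetical helper introduced for this example.

add_line_once() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Usage (adjust the path to your own installation):
# add_line_once "$HOME/.profile" "export HIVE_HOME='/Users/Work/hive-0.9.0'"
# add_line_once "$HOME/.profile" 'export PATH=$HADOOP_HOME/bin:$HIVE_HOME/bin:$PATH'
```

The PATH line is single-quoted in the usage comment so that the variables are written to the profile literally and expanded at login rather than when the helper runs.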
Start Hadoop (refer to the Single-Node Hadoop Setup Guide for more information). The output should show the processes being started. You can then list the running processes with the jps command:
  $ start-all.sh
  << Starting various hadoop processes >>
  $ jps
  3097 Jps
  2355 RunJar
  2984 JobTracker
  2919 SecondaryNameNode
  2831 DataNode
  2743 NameNode
  3075 TaskTracker
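
To check the jps listing automatically rather than by eye, the following is a small sketch that reads jps output on standard input and prints the name of any expected daemon that is absent. The helper name missing_daemons is hypothetical, and the daemon list is simply the one shown in the listing above (a single-node Hadoop 1.x style setup); adjust it if your setup differs.

```shell
#!/bin/sh
# Sketch: read `jps` output on stdin and print any expected daemon that is absent.
# missing_daemons is a hypothetical helper; the daemon list is an assumption
# taken from the listing in this tutorial.

missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # jps prints "<pid> <name>", so compare the name against the whole second field
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# Usage: jps | missing_daemons
```

Empty output means every daemon in the list was found. Comparing the whole second field keeps NameNode from being matched by SecondaryNameNode.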
In addition, before a table can be created in Hive, you must create /tmp and /user/hive/warehouse (aka hive.metastore.warehouse.dir) in HDFS and set appropriate permissions on them, as shown below:
  $ hadoop fs -mkdir /tmp
  $ hadoop fs -mkdir /user/hive/warehouse
  $ hadoop fs -chmod g+w /tmp
  $ hadoop fs -chmod g+w /user/hive/warehouse
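
The four commands above can also be wrapped in a small function so that a failure stops the sequence instead of passing unnoticed. This is only a sketch: the names HADOOP_CMD and prepare_hive_dirs are hypothetical, and it assumes neither directory exists yet, since mkdir may report an error for an existing directory.

```shell
#!/bin/sh
# Sketch: create the HDFS directories Hive needs and make them group-writable.
# HADOOP_CMD and prepare_hive_dirs are hypothetical names introduced for this example.
# Assumption: the directories do not exist yet.

HADOOP_CMD=${HADOOP_CMD:-hadoop}

prepare_hive_dirs() {
    for dir in /tmp /user/hive/warehouse; do
        # Stop at the first failing command so errors are not silently ignored
        $HADOOP_CMD fs -mkdir "$dir" || return 1
        $HADOOP_CMD fs -chmod g+w "$dir" || return 1
    done
}

# Usage: prepare_hive_dirs
# Dry run that only prints the commands:
#   HADOOP_CMD="echo hadoop"; prepare_hive_dirs
```

The dry run is a convenient way to review exactly what would be executed before touching HDFS.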
