
HIVE Guide for Beginners

Birth of a language called HIVE

HIVE was the brainchild of Facebook. In Facebook's early days, all of its back-end data for analytics was stored in Oracle systems, which were loaded using Python scripts. As Facebook expanded and became more popular, the volume of data grew rapidly, and the team realized that the Oracle systems could not handle the load.

That is when they decided to migrate to Hadoop, an open-source platform that was not yet widely used at the time. Fetching data from a Hadoop system relies on the map-reduce model, and writing the Java jobs this required was difficult. So they started the HIVE project to develop a SQL-like language whose scripts are converted to map-reduce jobs.
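To make the map-reduce idea concrete, here is a minimal Python sketch of the three phases that a query such as SELECT dept, COUNT(*) FROM employees GROUP BY dept conceptually boils down to. The rows, the column names, and the function names are all made up for illustration. This is a toy model that runs in one process, not the jobs that HIVE actually generates for a Hadoop cluster.

```python
from collections import defaultdict

# Hypothetical input rows: (employee_name, dept). Not real data.
ROWS = [
    ("asha", "sales"),
    ("ben", "engineering"),
    ("chen", "sales"),
    ("dina", "engineering"),
    ("eli", "sales"),
]


def map_phase(rows):
    """Emit a (dept, 1) pair for every input row."""
    for _name, dept in rows:
        yield dept, 1


def shuffle_phase(pairs):
    """Group the emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values of each key to get the row count per dept."""
    return {key: sum(values) for key, values in grouped.items()}


def count_by_dept(rows):
    """Chain the three phases, mimicking a GROUP BY ... COUNT(*) query."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    print(count_by_dept(ROWS))
```

Even this tiny aggregation needs three hand-written functions, which hints at why a one-line SQL-like query was so attractive.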



What is HIVE?


  • Hive is a data warehouse system for Hadoop. It runs SQL-like queries written in HQL (Hive Query Language), which are internally converted to map-reduce jobs.
  • Hive supports Data Definition Language (DDL), Data Manipulation Language (DML), and user-defined functions.
  • Hive's metastore persists the schema, i.e. the table definition (table name, columns, types), the location of the table files, their row format, and their storage format.
  • Hive provides built-in functions to manipulate dates and strings, along with other data-mining tools. It also supports extending this set with user-defined functions (UDFs) to handle use cases that the built-in functions do not cover.
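As a rough illustration of what the metastore keeps, the Python sketch below models a table definition with the fields listed above and uses it to interpret one line of a delimited file. The class name, field names, type names, and warehouse path are all assumptions made for this example. This is not Hive's actual metastore API.

```python
from dataclasses import dataclass


@dataclass
class TableDefinition:
    """Hypothetical stand-in for the details a metastore entry holds."""
    name: str
    columns: list          # list of (column_name, type_name) pairs
    location: str          # where the table files live
    field_delimiter: str   # row format: how the fields of a line are split
    storage_format: str    # e.g. a plain text file


# Assumed mapping from type names to Python converters.
CASTS = {"int": int, "string": str, "double": float}


def read_row(table, line):
    """Apply the stored schema to one raw line of a table file."""
    values = line.rstrip("\n").split(table.field_delimiter)
    return {
        column: CASTS[type_name](raw)
        for (column, type_name), raw in zip(table.columns, values)
    }


EMPLOYEES = TableDefinition(
    name="employees",
    columns=[("id", "int"), ("name", "string"), ("salary", "double")],
    location="/warehouse/employees",  # made-up path
    field_delimiter=",",
    storage_format="TEXTFILE",
)

if __name__ == "__main__":
    print(read_row(EMPLOYEES, "7,asha,1200.5\n"))
```

The point of the sketch is that the files hold only raw data. The column names and types come from the stored definition at read time.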

What is HIVE not?

  • It is not designed for Online Transaction Processing (OLTP).
  • It is not built for low-latency queries. Even fetching a small amount of data takes time, since each query is converted to map-reduce jobs.
