Category Archives: BigData

Apache Hive architecture

One cannot avoid hearing the word “Hive” when it comes to distributed processing systems. In this article, we will look at the Hive architecture and its components. What is Hive? Apache Hive is a data warehouse tool for querying and processing large datasets stored in HDFS. It uses the MapReduce processing mechanism for… Read More »

Pig Installation steps

In the previous article, we saw how to install Apache Hive on an Ubuntu machine. Both articles assume that you have already installed the Hadoop framework on the machine. If not, please visit this post and install the Hadoop framework first. Pig is another component of the Hadoop ecosystem… Read More »

Hive Installation steps

In this post, we will see how to install Hive on your Ubuntu machine. Hive is a tool to query and process data stored in HDFS. Hive uses HQL (Hive Query Language) for processing data. Its syntax closely resembles MySQL's, so people with a SQL background will find Hive easy to work with. Let’s get into the… Read More »

What is HDFS?

What is HDFS? This is a common question that everyone encounters when they start learning about Hadoop. HDFS deals with the way data is stored and managed by the Hadoop framework. What is a Distributed File System? A distributed file system manages data (files and folders) across multiple nodes or computers. It serves the… Read More »

What is Hadoop?

By the time you read this article, Hadoop has already reached version 3.0. However, it is important to know its history and how it evolved. This will help people who are working on migration projects from Hadoop 1.0 to Hadoop 2.0. It will also help developers understand and consider future use cases… Read More »

What is BigData?

What is BigData? This is the question that comes to mind when someone wants to learn Hadoop and other distributed processing tools. There is no end to learning about any technology, as technology keeps growing each day. Companies like Facebook, Twitter, and Google are already generating petabytes of data every day.… Read More »