Thursday, March 2, 2017

Hadoop

What is Hadoop?

Hadoop is a Java-based framework that can be installed across thousands of servers to store and process data of all types.

Hadoop is open source and is one of the main systems for handling Big Data. It is developed at the Apache Software Foundation. Hadoop consists of many modules, which work together to provide storage and processing of data in a distributed computing environment.

A Hadoop cluster automatically handles hardware failure: data blocks are replicated across multiple nodes, so the failure of a single machine causes no data loss or downtime.
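The replication factor that makes this fault tolerance possible is configurable. As a sketch, HDFS keeps three copies of each block by default, and this can be changed through the `dfs.replication` property in `hdfs-site.xml`:

```xml
<!-- hdfs-site.xml: number of copies HDFS keeps of each data block -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
```

With three replicas, HDFS can lose any single node (or even two) holding a given block and still serve the data from a surviving copy.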

Hadoop is a very robust system for Big Data that can handle data of almost any size. Large companies such as Twitter, Amazon, Facebook, and Yahoo use Hadoop to handle huge quantities of data; Hadoop itself was inspired by Google's papers on MapReduce and the Google File System.

Hadoop uses a distributed file system known as the Hadoop Distributed File System (HDFS).

Apache Hadoop is composed of the following modules:

  1. Hadoop Common 
  2. Hadoop Distributed File System (HDFS) 
  3. Hadoop YARN
  4. Hadoop MapReduce
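The MapReduce module splits a job into a map phase, a shuffle that groups intermediate results by key, and a reduce phase. Real Hadoop jobs are written in Java against the Hadoop API, but the idea can be sketched in plain Python with a word-count example (the classic MapReduce demo; the function names here are illustrative, not Hadoop's own):

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def shuffle_phase(pairs):
    # Shuffle: group all values by key, as Hadoop does between map and reduce.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reduce: sum the counts for each word.
    return {word: sum(counts) for word, counts in grouped.items()}

docs = ["big data big storage", "big cluster"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
print(counts["big"])  # 3
```

In a real cluster, the map tasks run in parallel on the nodes where the data blocks live, and the shuffle moves grouped data over the network to the reduce tasks.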

Thanks
