Big Data is stored and processed using Hadoop, an open-source framework that works in a distributed environment across clusters of computers using simple programming models. Within this framework, the Hadoop Distributed File System (HDFS) replicates each data block onto two additional data nodes by default, so that the data remain available if any component fails. The data nodes carry out read and write operations on the file system according to instructions from the name node. The different blocks of a file are read from different data nodes fully in parallel; however, if the read of one block fails, the client must first obtain the location of a replicated copy of that block and then read it, which takes extra time. In this project, each data block is read in two different orders on two different data nodes: from the top to the middle on one node and from the bottom to the middle on the other. If either data node fails, the remaining half of the block is read from the other copy. The MapReduce technique is then used for the analysis.
Thus the same data block is read in two directions: copy 1 from top to middle and copy 2 from bottom to middle. The two halves are read in parallel, and if one data node does not respond, the missing half is read from the alternate copy of the block. This roughly halves the block read time compared with a sequential read of a single copy; a minimal sketch of the scheme is given below.
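As an illustration, the following Java sketch shows only the coordination logic of the bidirectional read. The Replica interface, its read(offset, length) method, and the class name BidirectionalBlockReader are hypothetical placeholders rather than part of the HDFS client API, and the traversal direction inside each half is abstracted into a plain range read, since only the division of work between the two copies and the failover behaviour matter here.

import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BidirectionalBlockReader {

    // Hypothetical handle to one replica of a data block; read(offset, length)
    // stands in for whatever positioned-read call the HDFS client exposes.
    interface Replica {
        byte[] read(long offset, int length) throws IOException;
    }

    // Reads one block by fetching its two halves from two replicas in parallel:
    // copy 1 supplies the top half (top-to-middle), copy 2 the bottom half
    // (bottom-to-middle). If one replica fails, the same byte range is re-read
    // from the surviving copy.
    public static byte[] readBlock(Replica copy1, Replica copy2, int blockSize)
            throws Exception {
        int mid = blockSize / 2;
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Future<byte[]> top = pool.submit(() -> copy1.read(0, mid));
        Future<byte[]> bottom = pool.submit(() -> copy2.read(mid, blockSize - mid));

        byte[] block = new byte[blockSize];
        try {
            System.arraycopy(top.get(), 0, block, 0, mid);
        } catch (ExecutionException e) {
            // Copy 1 failed: read the top half from copy 2 instead.
            System.arraycopy(copy2.read(0, mid), 0, block, 0, mid);
        }
        try {
            System.arraycopy(bottom.get(), 0, block, mid, blockSize - mid);
        } catch (ExecutionException e) {
            // Copy 2 failed: read the bottom half from copy 1 instead.
            System.arraycopy(copy1.read(mid, blockSize - mid), 0, block, mid,
                    blockSize - mid);
        }
        pool.shutdown();
        return block;
    }
}

Because each replica serves only half of the block, a successful parallel read takes roughly half as long as reading the whole block from one node, and on a node failure the scheme degrades gracefully to reading the missing half from the surviving copy.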
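The analysis stage then runs as an ordinary MapReduce job. The skeleton below uses the standard org.apache.hadoop.mapreduce API in a conventional word-count shape, only to show how the map and reduce phases are wired together; the class names BlockAnalysis, TokenMapper and SumReducer are illustrative, and the project's actual analysis logic would replace the token counting.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BlockAnalysis {

    // Map phase: emits (token, 1) for every token in the input split.
    public static class TokenMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    ctx.write(word, ONE);
                }
            }
        }
    }

    // Reduce phase: sums the counts emitted for each token.
    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            ctx.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "block analysis");
        job.setJarByClass(BlockAnalysis.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Such a job is packaged into a jar and submitted with, for example, hadoop jar analysis.jar BlockAnalysis <input path> <output path>.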