MapReduce Rendezvous Hashing Based Virtual Hierarchies: The Cassandra NoSQL Case Study

The steady growth in data volume has led to the emergence of Big Data: immense datasets that need to be stored. Traditional relational databases have difficulty meeting the volume and structural heterogeneity requirements of big data. NoSQL databases are designed with a novel data management system that can handle and process huge volumes of data. NoSQL systems provide horizontal scalability by supporting horizontal data partitioning across heterogeneous nodes. In this paper, a MapReduce Rendezvous Hashing Based Virtual Hierarchies (MR-RHVH) framework is proposed for scalable partitioning of Cassandra NoSQL databases. The MapReduce framework is used to implement MR-RHVH on Cassandra to enhance its scalability …
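To make the partitioning idea concrete, here is a minimal sketch of plain rendezvous (highest random weight) hashing, the building block named in the framework's title. It is not the authors' implementation: the function names `weight` and `choose_node`, the node labels, and the use of SHA-256 are all assumptions made for illustration, and the virtual-hierarchy refinement (which groups nodes into a tree so that a lookup need not score every node) is deliberately left out.

```python
import hashlib


def weight(node: str, key: str) -> int:
    """Deterministic pseudo-random weight for a (node, key) pair."""
    digest = hashlib.sha256(f"{node}|{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def choose_node(key: str, nodes: list[str]) -> str:
    """Return the node with the highest weight for this key."""
    if not nodes:
        raise ValueError("at least one node is required")
    return max(nodes, key=lambda node: weight(node, key))


if __name__ == "__main__":
    # Hypothetical node names and row keys, for illustration only.
    nodes = ["node-a", "node-b", "node-c", "node-d"]
    keys = [f"row-{i}" for i in range(1000)]
    before = {k: choose_node(k, nodes) for k in keys}

    # Remove one node and re-place every key.
    remaining = [n for n in nodes if n != "node-c"]
    after = {k: choose_node(k, remaining) for k in keys}

    moved = [k for k in keys if before[k] != after[k]]
    # Only keys that lived on the removed node should have moved.
    print(len(moved), "keys moved;",
          all(before[k] == "node-c" for k in moved))
```

The property that matters for horizontal scalability is visible in the demo: when a node leaves, only the keys that node owned are reassigned, because every other key's highest-weight node is unchanged.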
The MR-RHVH framework applies a distributed node structure using MapReduce, partitioning data equally among Cassandra nodes through mapper and reducer functions. The MR-RHVH framework consists of three main layers, as shown in Figure 2: Cassandra/Hadoop Cluster, MRRHVH, Cassandra/Hadoop Data Center and Hadoop/MapReduce applier.
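The division of labour between mapper and reducer functions can be sketched with a small in-process simulation. This is only an illustration under stated assumptions: `owner`, `mapper`, `reducer`, and `run_job` are hypothetical names, the placement rule is a simple hash-based stand-in rather than the MR-RHVH rule itself, and a real job would run on Hadoop rather than in a single Python process.

```python
import hashlib
from collections import defaultdict


def owner(key, nodes):
    """Hypothetical placement rule: the node with the highest hash score wins."""
    def score(node):
        return hashlib.sha256(f"{node}|{key}".encode("utf-8")).hexdigest()
    return max(nodes, key=score)


def mapper(record, nodes):
    """Map phase: emit (target_node, record) for one input record."""
    key, _value = record
    yield owner(key, nodes), record


def reducer(node, records):
    """Reduce phase: collect all records assigned to one node."""
    return node, sorted(records)


def run_job(records, nodes):
    """Tiny in-process stand-in for the shuffle between map and reduce."""
    grouped = defaultdict(list)
    for record in records:
        for node, rec in mapper(record, nodes):
            grouped[node].append(rec)
    return dict(reducer(node, recs) for node, recs in grouped.items())


if __name__ == "__main__":
    # Hypothetical node names and records, for illustration only.
    nodes = ["cassandra-1", "cassandra-2", "cassandra-3"]
    records = [(f"row-{i}", i) for i in range(1000)]
    partitions = run_job(records, nodes)
    for node, recs in sorted(partitions.items()):
        print(node, len(recs))
```

With a well-mixed hash the partition sizes come out roughly, not exactly, equal; how closely the real framework balances load is a property of its own placement rule and is not reproduced here.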
Cassandra client nodes are distributed in the Cassandra/Hadoop Data Center layer. Task Tracker and Data Node services run on each Cassandra client node in the data center. The Task Tracker accepts tasks from the Job Tracker and then retrieves the data it needs from the Data Node. The Data Node provides Task Trackers with the required data, using HDFS in the MapReduce layer.
Cassandra/Hadoop Cluster:
The Cassandra master node resides in this layer. In the Cassandra/Hadoop Cluster, the Job Tracker service, running on the master nodes, coordinates job requests sent to and from the Task Trackers on the client nodes using MapReduce. The Name Node in the Cassandra master node keeps a list of all files in the data center and searches for the node that holds a given file or has the capacity to store one. The Name Node is considered a single point of failure (SPOF) in Hadoop/MapReduce: when the Name Node fails, the whole system goes down. A Converter Name Node (CNN) module is used to solve this SPOF in next
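The Name Node's bookkeeping role described above can be pictured as a small in-memory index. The sketch below is an assumption-laden illustration, not Hadoop's NameNode API: the class `NameNodeIndex`, its methods, and the "most free slots" placement rule are all hypothetical.

```python
class NameNodeIndex:
    """Hypothetical in-memory file index; not Hadoop's actual NameNode API."""

    def __init__(self, capacity_by_node):
        self.free = dict(capacity_by_node)  # node -> free file slots
        self.location = {}                  # file name -> node holding it

    def locate(self, filename):
        """Return the node that keeps the file, or None if it is unknown."""
        return self.location.get(filename)

    def place(self, filename):
        """Record a new file on the node with the most free capacity."""
        if filename in self.location:
            return self.location[filename]
        if not self.free:
            raise RuntimeError("no nodes registered")
        node = max(self.free, key=self.free.get)
        if self.free[node] <= 0:
            raise RuntimeError("no node has capacity left")
        self.free[node] -= 1
        self.location[filename] = node
        return node


if __name__ == "__main__":
    # Hypothetical node names and capacities, for illustration only.
    index = NameNodeIndex({"client-1": 2, "client-2": 1})
    print(index.place("users.db"), index.locate("users.db"))
```

Because every lookup and placement goes through this one object, losing it makes all files unreachable even though the data nodes still hold them, which is the single-point-of-failure concern the text raises.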
