Advantages And Disadvantages Of Storage Systems For Big Data

Storage Systems for Big Data

The Internet age comes with a vast amount of data that requires efficient storage and processing capabilities. To address this need, we discuss data storage systems tailored to store and process big data effectively. While general-purpose RDBMSs remain a viable option for handling and analyzing structured data, they suffer from a variety of problems, including performance and scalability issues, when it comes to big data. To increase the performance of a DBMS for big data storage, partitioning the data across several sites and paying large license fees for an enterprise SQL DBMS may be the two possible options (Stonebraker, 2010); however, neither is without disadvantages, such as inflexible data…
Bigtable then returns the corresponding data, behaving much like a distributed hash table. Bigtable maintains data in lexicographic order by row string, and the data is partitioned by row range into units called tablets. By incorporating a timestamp value, Bigtable gains the ability to store multiple versions of the same data. For faster retrieval of tablets, Bigtable takes advantage of a three-level hierarchy analogous to…
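The data model just described can be sketched in a few lines: rows kept in lexicographic order, each cell holding multiple timestamped versions, and a tablet serving a contiguous row range. This is a toy illustration, not Bigtable's actual API; the class and method names are invented for the example.

```python
import bisect


class MiniTablet:
    """Toy sketch of a Bigtable-style tablet (illustrative names only):
    rows are kept in lexicographic order, and each cell stores multiple
    timestamped versions of its value."""

    def __init__(self):
        self._rows = []   # row keys, kept in sorted (lexicographic) order
        self._cells = {}  # row key -> {timestamp: value}

    def put(self, row, timestamp, value):
        if row not in self._cells:
            bisect.insort(self._rows, row)   # preserve lexicographic order
            self._cells[row] = {}
        self._cells[row][timestamp] = value  # retain every version

    def get(self, row, timestamp=None):
        versions = self._cells.get(row, {})
        if not versions:
            return None
        if timestamp is None:                # default to the newest version
            return versions[max(versions)]
        return versions.get(timestamp)

    def row_range(self, start, end):
        """Rows in [start, end) -- the contiguous range a tablet holds."""
        lo = bisect.bisect_left(self._rows, start)
        hi = bisect.bisect_left(self._rows, end)
        return self._rows[lo:hi]
```

Because rows are sorted, a lookup by row key and a scan over a row range are both cheap, which is what makes the tablet a natural unit of partitioning.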
In recent years, researchers have sought efficient ways to outperform legacy database systems. As RAM capacities increase, the technique of storing partitions of the data in the RAM of shared-nothing machines is more applicable than ever. NewSQL databases take advantage of modern techniques such as data sharding, data replication, and distributed in-memory storage, offering a scalable, high-performance alternative to disk-based legacy database systems. NewSQL databases provide an object-oriented database language that is considered easier to learn than standard SQL (Kumar et al., 2014).

H-Store (Kallman et al., 2008) divides the database into partitions, where each partition is replicated and resides in main memory. The H-Store system relies on distributed machines that share no data to improve the overall performance of the database.
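A minimal sketch of this scheme, assuming nothing about H-Store's actual implementation: keys are hashed to a partition, every partition lives entirely in main memory, and writes are applied to each replica of the owning partition so that reads can fall back to any surviving copy.

```python
import hashlib


class MiniPartitionedStore:
    """Illustrative sketch (not H-Store's real code) of a hash-partitioned,
    replicated, in-memory key-value store: each partition is a list of
    replicas, and each replica is an ordinary in-memory dict."""

    def __init__(self, num_partitions=4, num_replicas=2):
        # partition index -> list of replica dicts
        self._replicas = [
            [dict() for _ in range(num_replicas)]
            for _ in range(num_partitions)
        ]

    def _partition(self, key):
        # Deterministically map a key to one partition.
        digest = hashlib.sha1(key.encode()).digest()
        return digest[0] % len(self._replicas)

    def put(self, key, value):
        # A write is applied to every replica of the owning partition.
        for replica in self._replicas[self._partition(key)]:
            replica[key] = value

    def get(self, key):
        # A read is served by the first available replica of that partition.
        for replica in self._replicas[self._partition(key)]:
            return replica.get(key)
        return None
```

In a real shared-nothing deployment each replica would sit on a different machine and the read path would skip failed replicas; here the dicts simply stand in for per-machine main memory.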
