Data Integrity And Data Warehouse

Data integrity ensures that data is accurate and complete. This matters when analyzing data stored in a database or data warehouse: inaccurate data produces false results in reporting and analytics, which in turn distort the business decisions based on them. Data integrity also requires the complete set of data. If only part of the data is entered, then it cannot be considered accurate. Several mechanisms help enforce data integrity: constraints, primary and foreign keys, and the removal of duplicate data from the database. When two companies' data is merged into one data warehouse, data integrity needs to be checked using a technique known as data scrubbing. …
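The scrubbing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the field names `name` and `email`, and the choice of email as the duplicate-detection key are all hypothetical and are not taken from the essay. The sketch normalizes text, sets aside incomplete records, and drops duplicates that appear in both companies' data.

```python
# Hypothetical customer records from two merged companies; the field names
# ("name", "email") are illustrative assumptions, not taken from the essay.
company_a = [
    {"name": "Ann Lee", "email": "ann@example.com"},
    {"name": "Bob Ray", "email": "bob@example.com"},
]
company_b = [
    {"name": "ann lee ", "email": "ANN@example.com"},  # same person, different formatting
    {"name": "Cy Dunn", "email": None},                # incomplete record
]

REQUIRED_FIELDS = ("name", "email")


def normalize(record):
    """Trim whitespace and lowercase text fields so equal values compare equal."""
    return {
        key: value.strip().lower() if isinstance(value, str) else value
        for key, value in record.items()
    }


def scrub(*sources):
    """Return (clean, rejected): deduplicated complete records, and incomplete ones."""
    clean, rejected, seen = [], [], set()
    for source in sources:
        for record in map(normalize, source):
            if any(not record.get(field) for field in REQUIRED_FIELDS):
                rejected.append(record)   # incomplete data cannot be treated as accurate
                continue
            if record["email"] in seen:   # duplicate of a record already kept
                continue
            seen.add(record["email"])
            clean.append(record)
    return clean, rejected


clean, rejected = scrub(company_a, company_b)
print(len(clean), "clean,", len(rejected), "rejected")
```

In a real merge the duplicate key would need more care (for example, matching on several fields), and rejected records would normally be reviewed rather than discarded.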
A unique identifier is selected that relates to both sets of data. The simplest choice is CustomerID, since it should guarantee that each customer is given a unique numeric value and that the value auto-increments for new customer accounts. Once this is in place, the next step is to look at all the combined accounts and check their indexes and constraints. Even though the data entered may be similar, not having the same index structures for all tables and columns will create problems when using update statements and other queries. Constraints are placed on tables and columns; the most common are UNIQUE, NOT NULL, and FOREIGN KEY. Once these constraints are defined in the schema, the database checks every statement issued by the application against them before any change is allowed. This also helps prevent human error when entering or removing data. The UNIQUE constraint ensures that duplicate values cannot be entered in the constrained column or columns. The NOT NULL constraint is commonly combined with other constraints to specify whether NULL values are allowed. NULL values are often used legitimately for non-applicable attributes. However, when a NULL value is not expected it …
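As a minimal sketch of how these constraints behave, the following Python example uses the standard-library `sqlite3` module with an in-memory database. The `customer` and `claim` tables and their columns (other than CustomerID) are hypothetical names chosen for illustration; the essay does not specify a schema or a database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical schema: only CustomerID comes from the essay.
conn.executescript("""
CREATE TABLE customer (
    CustomerID INTEGER PRIMARY KEY AUTOINCREMENT,
    email      TEXT NOT NULL UNIQUE,
    name       TEXT NOT NULL
);
CREATE TABLE claim (
    ClaimID    INTEGER PRIMARY KEY AUTOINCREMENT,
    CustomerID INTEGER NOT NULL REFERENCES customer(CustomerID),
    amount     REAL NOT NULL
);
""")

conn.execute(
    "INSERT INTO customer (email, name) VALUES (?, ?)",
    ("ann@example.com", "Ann Lee"),
)


def try_insert(sql, params):
    """Run one statement; return 'ok' or the database's constraint error message."""
    try:
        conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)


results = {
    # Same email as an existing customer: rejected by UNIQUE.
    "duplicate email": try_insert(
        "INSERT INTO customer (email, name) VALUES (?, ?)",
        ("ann@example.com", "Ann L."),
    ),
    # Missing name: rejected by NOT NULL.
    "missing name": try_insert(
        "INSERT INTO customer (email, name) VALUES (?, ?)",
        ("bob@example.com", None),
    ),
    # Claim for a customer that does not exist: rejected by FOREIGN KEY.
    "unknown customer": try_insert(
        "INSERT INTO claim (CustomerID, amount) VALUES (?, ?)",
        (999, 120.0),
    ),
    # Claim for the existing customer (CustomerID 1): accepted.
    "valid claim": try_insert(
        "INSERT INTO claim (CustomerID, amount) VALUES (?, ?)",
        (1, 120.0),
    ),
}

for case, outcome in results.items():
    print(f"{case}: {outcome}")
```

The point of the sketch is that the checks live in the schema, so every statement is validated by the database itself regardless of which application issued it.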
Table data should not depend on anything other than the table's primary key; data that does can be removed and placed in its own table. Automobile, boat, RV, and home should not exist in the same table as customer, because these values rely on policy type as their own primary key to define their meaning. Policy types are therefore moved to their own table with a new primary key, which becomes a foreign key in the customer table. The last step is removing transitive dependencies, that is, data that relies on other non-key data to define itself. 3NF builds on the previous normal forms and removes the remaining redundant and duplicate data. There is now a separate table for customer, policy type, claims, prior insurance company, etc. These tables are linked to the customer table through foreign keys, which eliminates a lot of wasted space in the database. The merged company data should now be fully cleaned and accurate enough to use for
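The policy-type split described above might look like the following sketch, again using Python's standard `sqlite3` module. The table names `customer_flat` and `policy_type`, the column names, and the sample rows are assumptions made for illustration, and the sketch simplifies by giving each customer exactly one policy type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
-- Unnormalized starting point: the policy type text is repeated on every row.
CREATE TABLE customer_flat (
    CustomerID  INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    policy_type TEXT NOT NULL
);
INSERT INTO customer_flat (name, policy_type) VALUES
    ('Ann Lee', 'Automobile'),
    ('Bob Ray', 'Home'),
    ('Cy Dunn', 'Automobile');

-- Policy types get their own table and their own primary key.
CREATE TABLE policy_type (
    PolicyTypeID INTEGER PRIMARY KEY,
    description  TEXT NOT NULL UNIQUE
);
INSERT INTO policy_type (description)
    SELECT DISTINCT policy_type FROM customer_flat ORDER BY policy_type;

-- The customer table now stores only a foreign key to the policy type.
CREATE TABLE customer (
    CustomerID   INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    PolicyTypeID INTEGER NOT NULL REFERENCES policy_type(PolicyTypeID)
);
INSERT INTO customer (CustomerID, name, PolicyTypeID)
    SELECT f.CustomerID, f.name, p.PolicyTypeID
    FROM customer_flat AS f
    JOIN policy_type AS p ON p.description = f.policy_type;

DROP TABLE customer_flat;
""")

# Joining the two tables recovers the original view without storing the text twice.
rows = conn.execute("""
    SELECT c.name, p.description
    FROM customer AS c
    JOIN policy_type AS p ON p.PolicyTypeID = c.PolicyTypeID
    ORDER BY c.CustomerID
""").fetchall()
policy_count = conn.execute("SELECT COUNT(*) FROM policy_type").fetchone()[0]

print(rows)
print("distinct policy types stored:", policy_count)
```

Each policy description is now stored once, so renaming a policy type is a single-row update rather than a change to every matching customer row.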
