Data Integrity And Data Warehouse

Data integrity ensures that data is accurate and complete. This matters when analyzing data stored in a database or data warehouse: inaccurate data leads to false results in reporting and analytics, and those results affect any business decisions that need to be made. Having the complete set of data is a necessity; if only half or some of the data is entered, then it cannot be considered accurate. There are several ways to enforce data integrity: constraints, primary and foreign keys, and removing duplicate data from the database. When merging two companies' data into one data warehouse, data integrity needs to be checked using a technique known as data scrubbing. …
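As a rough illustration of the scrubbing step described above, the sketch below merges customer records from two companies, sets aside incomplete rows, and drops duplicates. The field names (`name`, `email`), the sample records, and the `scrub` helper are assumptions made for this example only; a real merge would pick its own matching rules.

```python
def scrub(records, required=("name", "email")):
    """Return (clean, rejected): de-duplicated records and incomplete ones."""
    seen = set()
    clean, rejected = [], []
    for rec in records:
        # Trim stray whitespace so near-identical entries compare equal.
        norm = {k: v.strip() if isinstance(v, str) else v for k, v in rec.items()}
        # A record missing a required field is incomplete and is set aside.
        if any(not norm.get(field) for field in required):
            rejected.append(rec)
            continue
        # Compare case-insensitively to catch duplicates across both sources.
        key = (norm["name"].lower(), norm["email"].lower())
        if key in seen:
            continue
        seen.add(key)
        clean.append(norm)
    return clean, rejected


# Hypothetical sample data from the two merging companies.
company_a = [{"name": "Ann Lee", "email": "ann@example.com"}]
company_b = [
    {"name": " ann lee ", "email": "ANN@example.com"},  # duplicate of company A's row
    {"name": "Bob Roy", "email": ""},                   # incomplete: no email
    {"name": "Cy Dunn", "email": "cy@example.com"},
]

clean, rejected = scrub(company_a + company_b)
```

Here two records survive the scrub and one is set aside for review, rather than being silently loaded into the warehouse.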
A unique identifier will be selected that relates to both sets of data. The simplest choice is the identifier CustomerID, as it should guarantee that a unique numeric value is given to each customer and will auto-increment for new customer accounts. Once this is in place, the next step is to look at all the combined accounts and check their indexes. Even though the data entered may be similar, not having the same index structures for all tables and columns will create problems when using update statements and other queries. Constraints are then placed on the tables and columns; the most common are UNIQUE, NOT NULL, and FOREIGN KEY. Once these constraints are defined in the database, every statement the application issues is checked against them before any change is allowed. This also helps prevent human error when entering or removing data. The UNIQUE constraint ensures that duplicate values cannot be entered in the constrained column or set of columns. The NOT NULL constraint is commonly combined with other constraints to specify whether NULL values are allowed. NULL values are often used for attributes that do not apply. However, when a NULL value is not expected it …
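The constraints named above can be tried out in a small sketch using Python's built-in `sqlite3` module. The `customer` and `claim` tables, their columns other than CustomerID, and the `is_rejected` helper are assumptions for illustration, not the essay's actual schema; other database engines use slightly different syntax for auto-incrementing keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (
    CustomerID INTEGER PRIMARY KEY AUTOINCREMENT,  -- unique, auto-incrementing identifier
    email      TEXT NOT NULL UNIQUE,               -- no NULLs, no duplicate values
    name       TEXT NOT NULL
);
CREATE TABLE claim (
    ClaimID    INTEGER PRIMARY KEY AUTOINCREMENT,
    CustomerID INTEGER NOT NULL REFERENCES customer(CustomerID),  -- FOREIGN KEY
    amount     REAL NOT NULL
);
""")

conn.execute(
    "INSERT INTO customer (email, name) VALUES (?, ?)",
    ("ann@example.com", "Ann Lee"),
)


def is_rejected(sql, params):
    """Run a statement and report whether a constraint blocked it."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False


insert_customer = "INSERT INTO customer (email, name) VALUES (?, ?)"
dup_email = is_rejected(insert_customer, ("ann@example.com", "Ann L."))  # UNIQUE
null_name = is_rejected(insert_customer, ("bob@example.com", None))      # NOT NULL
orphan_claim = is_rejected(
    "INSERT INTO claim (CustomerID, amount) VALUES (?, ?)", (999, 120.0)  # FOREIGN KEY
)
customer_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

Each of the three bad statements is refused by the database itself, so only the first customer row is stored. The check happens regardless of which application or person issued the statement.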
Table data should not depend on anything other than the table's primary key; data that does can be removed and placed in its own table. Automobile, boat, RV, and home should not exist in the same table as customer, because these values rely on policy type as their own primary key to define their meaning. Policy types are therefore moved to their own table with a new primary key, which becomes a foreign key in the customer table. The last step is removing transitive dependencies, that is, data that relies on other non-key data to define itself. 3NF builds on the previous normal forms and removes the remaining redundant and duplicate data. There is now a separate table for customer, policy type, claims, prior insurance company, etc. These are all linked to the customer table through foreign keys, which eliminates a lot of wasted space in the database. The merged company data should now be fully cleaned and accurate enough to use for …
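The policy type step can be sketched as follows, again with Python's built-in `sqlite3` module. The flat starting table `customer_flat`, the column names, and the sample rows are hypothetical; the point is only to show policy types moving into their own table and being referenced from the customer table by a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
-- Hypothetical pre-normalization table: the policy type text repeats on every row.
CREATE TABLE customer_flat (
    CustomerID  INTEGER PRIMARY KEY,
    name        TEXT,
    policy_type TEXT
);
INSERT INTO customer_flat VALUES
    (1, 'Ann Lee', 'Automobile'),
    (2, 'Bob Roy', 'Home'),
    (3, 'Cy Dunn', 'Automobile');

-- Policy types get their own table and their own primary key.
CREATE TABLE policy_type (
    PolicyTypeID INTEGER PRIMARY KEY AUTOINCREMENT,
    name         TEXT NOT NULL UNIQUE
);
-- The customer table keeps only a foreign key to the policy type.
CREATE TABLE customer (
    CustomerID   INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    PolicyTypeID INTEGER NOT NULL REFERENCES policy_type(PolicyTypeID)
);

-- Store each distinct policy type once, then point customers at it.
INSERT INTO policy_type (name)
    SELECT DISTINCT policy_type FROM customer_flat;
INSERT INTO customer (CustomerID, name, PolicyTypeID)
    SELECT f.CustomerID, f.name, p.PolicyTypeID
    FROM customer_flat AS f
    JOIN policy_type AS p ON p.name = f.policy_type;
""")

# A join recovers the original view of the data without storing the text twice.
rows = conn.execute("""
    SELECT c.name, p.name
    FROM customer AS c
    JOIN policy_type AS p USING (PolicyTypeID)
    ORDER BY c.CustomerID
""").fetchall()
type_count = conn.execute("SELECT COUNT(*) FROM policy_type").fetchone()[0]
```

Three customers now share two policy type rows, and renaming a policy type means updating a single row instead of every customer that holds it.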
