30 Cards in this Set

  • Front
  • Back

1) What % of alleles account for common variants that can lead to common diseases/traits?


2) What is the definition of GWAS?




s5,5

1) 1-5% of alleles


2) A study of common genetic variation across the entire human genome designed to identify genetic associations with observable traits

1) So what is GWAS good for? What is the null hypothesis?






s7

1) Scans a large number and variety of genetic markers e.g. SNPs


-> A large number of genetic variants can be captured across the pop'n


-> Goal is to identify SNPs/variants associated with a trait; association doesn't necessarily mean causality


-> Markers found more frequently in people with a specific trait assumed to be associated with trait


-> Null: Marker found at equal frequency in affected and non-affected people
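
The null hypothesis above can be checked for a single marker with a chi-square test on allele counts. This is only a minimal sketch: the function name `allelic_chi_square` and all counts are hypothetical, and real GWAS software typically adds covariates and corrections that are not shown here.

```python
import math

def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """2x2 chi-square test (1 degree of freedom) on allele counts.

    Returns (chi2, p_value). Under the null hypothesis the marker allele
    is equally frequent in affected and non-affected people.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical allele counts (not from the lecture)
chi2_null, p_null = allelic_chi_square(300, 700, 300, 700)    # equal frequency
chi2_assoc, p_assoc = allelic_chi_square(400, 600, 300, 700)  # enriched in cases
print(chi2_null, p_null)
print(chi2_assoc, p_assoc)
```

A small p-value only says the marker is associated with the trait; as noted above, it does not establish causality.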

1) What are the advantages of GWAS compared with Linkage Analysis?






s8

1) -> Population level study, vs. Family level study


-> Whole genome studied vs. pedigree, which may not provide clues about candidate genes, on which linkage mostly relies (since we only know the functional associations of about 8000 of the 20000 genes)


-> Complex traits vs. simple traits studied


-> Not hypothesis driven (no gene involved, objective associations) vs. hypothesis driven (single gene)

1) What are some aspects of genotype variations that can be examined through GWAS?






s9

1) -> Find gene variants associated with complex traits e.g. height

-> Study gene-gene interactions e.g. to find variants which in combination are associated with disease risk


-> Find gene variants associated with gene expression e.g. a SNP that leads to enhanced expression of a gene



1) What are the 4 steps of a GWAS study?








s10

1) Select study population: Large number of people with disease/trait + control group. Often focus on participants with genetic bias e.g. early onset in schizophrenia


-Proper selection of controls is important


2) Isolate DNA and genotype: SNP genotyping must pass quality control threshold


3) Statistical significance: Are genotype and phenotype really associated?


4) Replicate in additional cohort and perform functional experiments

1) Describe a case control GWAS design (most common way)




2) Describe a cohort design GWAS study




s12

1) Start with clinical records -> Select a case group that is affected by the disease phenotype, and a control group without it -> Isolate DNA and genotype, compare




2) Clinical records -> Have a bunch of patients come in based on chance of having disease/predisease -> isolate DNA/genotype -> and THEN put them in disease and non-disease groups -> compare

1) Describe a Trio design GWAS study.






s14



1) Clinic record -> Case group (phenotypically assessed) -> Do DNA isolation and genotyping of each individual and both parents -> If the null is true, there is no association between phenotype and genotype (allele transmission frequency is 50%) -> If transmission is higher than 50%, the allele is associated with the phenotype
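
The 50% transmission idea above is what the transmission disequilibrium test (TDT) measures. The sketch below uses the standard McNemar-style statistic; the function name `tdt` and the counts are hypothetical and only illustrate the calculation.

```python
import math

def tdt(transmitted, untransmitted):
    """Transmission disequilibrium test for one allele.

    transmitted / untransmitted: how often heterozygous parents did or did
    not pass the allele to an affected child. Under the null hypothesis
    the transmission rate is 50%.
    Returns (transmission_rate, chi2, p_value).
    """
    n = transmitted + untransmitted
    chi2 = (transmitted - untransmitted) ** 2 / n
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return transmitted / n, chi2, p_value

# Hypothetical counts: allele passed on 130 times out of 200 opportunities
rate, tdt_chi2, tdt_p = tdt(130, 70)
print(rate, tdt_chi2, tdt_p)
```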

1) Which of the three studies would you do in an ideal setting?




2) What are the assumptions of a case control study?




s14-15

1) all three


2) -Case & control participants are drawn from same pop'n


-Case participants are representative of all cases of the disease


-Genomic and epidemiologic data are collected similarly in cases and controls


-Differences in allele frequencies relate to the outcome of interest rather than differences in background pop'n

1) What are the assumptions in a cohort study?


2) What are the assumptions of a trio design?






s15

1) -Participants under study are more representative of pop'n they are drawn from.


-Diseases and traits are ascertained similarly in individuals with and without the gene variant




2) Disease related alleles are transmitted more than 50% of time to affected offspring from heterozygous parents

1) What are the advantages of each type of design?




s28

1) Case control: -> Short time frame, don't need people to come in, etc.


-> Large numbers of case and control participants can be assembled


-> Optimal epidemiologic design for studying rare diseases


Cohort: Cases are incident (they develop during observation), free of bias


-> Direct measure of risk


-> Fewer biases than case control studies


-> Continuum of health related measures available in pop'n samples not selected for presence of disease


Trio: Controls for population structure; immune to population stratification.


-> Allows checks for Mendelian Inheritance patterns in genotyping quality control


-> Logistically simpler, 3 people in every group


-> Does not need phenotyping of parents
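
The Mendelian inheritance check mentioned for trios can be sketched as below. Genotypes are assumed to be two-letter strings such as "AG"; the function name and the examples are hypothetical, and the check ignores sex chromosomes and new mutations.

```python
def is_mendelian_consistent(father, mother, child):
    """Return True if the child's genotype can be formed from one allele
    of each parent. Genotypes are two-character strings, e.g. "AG"."""
    first, second = child
    return (first in father and second in mother) or \
           (second in father and first in mother)

# Hypothetical trios (father, mother, child)
print(is_mendelian_consistent("AG", "GG", "AG"))  # possible: A from father, G from mother
print(is_mendelian_consistent("AA", "AA", "AG"))  # impossible: neither parent carries G
```

A trio that fails this check points to a genotyping error or a sample mix-up, which is why it is useful for quality control.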

1) What is an optimal method of collecting DNA sample?

1) Saliva: needs 5 mL; can use mouthwash and then have a clean sample. Saliva collection is more private and the person can give the sample themselves, rather than hair or a swab.

1) What are the steps after DNA collection?




s18

1) -> Isolate genomic DNA


-> Fragment & label DNA (can do PCR)


-> Hybridize DNA to chip containing SNP probes


-> Detect fluorescence to determine SNP genotypes for each locus tested (Level of fluorescence shows if properly genotyped)




80-90 % of SNPs should be successfully genotyped

1) What is a good call rate for SNPs? What are call rates?




S19

1) -> Proportion of total samples studied, for which a particular SNP is reliably genotyped. Means that DNA is binding to the SNP array in sufficient quantity to be able to 'call' a genotype




Quality control: SNP call rate > 95%
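
A call rate filter like the one above can be sketched in a few lines. The SNP names and genotypes are hypothetical, and a missing genotype is represented by `None` as an assumption of this example.

```python
def call_rate(genotypes):
    """Proportion of samples in which this SNP was reliably genotyped.
    A failed call is represented by None."""
    called = sum(1 for g in genotypes if g is not None)
    return called / len(genotypes)

# Hypothetical data: 20 samples per SNP
snp_genotypes = {
    "rs_demo_1": ["AA"] * 10 + ["AG"] * 8 + ["GG"] * 2,    # all 20 called
    "rs_demo_2": ["AA"] * 9 + ["AG"] * 9 + [None] * 2,     # 18 of 20 called
}

# Keep only SNPs that pass the quality control threshold (call rate > 95%)
passing = [name for name, g in snp_genotypes.items() if call_rate(g) > 0.95]
print(passing)
```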

1) What is the minor allele frequency (MAF) and what range is a good value?




s20

1) Proportion of the less common of 2 alleles in a population.


Range <1% to <50%


Quality control: MAF > 1% (Some really rare alleles can't even be reported, but we can identify that they exist)

1) Why are tests of significance important and what does the graph show?
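
The MAF definition above can be computed directly from genotypes. This sketch assumes a biallelic SNP with genotypes written as two-letter strings; the data are hypothetical.

```python
from collections import Counter

def minor_allele_frequency(genotypes):
    """Frequency of the less common of the two alleles at a biallelic SNP.
    Each genotype is a two-character string such as "AG"."""
    counts = Counter(allele for g in genotypes for allele in g)
    if len(counts) < 2:
        return 0.0  # only one allele observed, so no minor allele
    return min(counts.values()) / sum(counts.values())

# Hypothetical sample of 10 people: 15 copies of A and 5 copies of G
genotypes = ["AA"] * 6 + ["AG"] * 3 + ["GG"] * 1
maf = minor_allele_frequency(genotypes)
print(maf, maf > 0.01)  # second value: passes the MAF > 1% filter?
```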




s23

1) For evaluating number and extent of observed associations between SNPs and phenotype.


-> A straight line would show the relationship expected if there is no association.


-> Need to adjust for stratification, because there might be bias in starting pop'n
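
Assuming the graph on the slide is a quantile-quantile (Q-Q) plot, which these notes do not state, the straight line corresponds to observed p-values matching those expected by chance. The sketch below uses hypothetical p-values and a common choice of expected quantiles, rank / (n + 1).

```python
import math

def qq_points(p_values):
    """Return (expected, observed) pairs of -log10(p) for a Q-Q plot.
    With no association, points fall on the straight line y = x."""
    n = len(p_values)
    points = []
    for rank, p in enumerate(sorted(p_values), start=1):
        expected_p = rank / (n + 1)  # expected quantile under the null
        points.append((-math.log10(expected_p), -math.log10(p)))
    return points

# Hypothetical p-values that behave exactly as expected by chance
null_like = [i / 10 for i in range(1, 10)]
# Same set, but one SNP is far more significant than chance predicts
with_signal = [1e-6] + null_like[1:]

print(qq_points(null_like)[0])
print(qq_points(with_signal)[0])
```

Points that rise above the line only at the extreme end suggest true associations; points that rise above it everywhere suggest stratification or other bias.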

1) Explain Linkage Disequilibrium (D)






s24

1) The non-random association of alleles: SNPs located near each other on a chromosome tend to be inherited together more than by chance.


(Linkage disequilibrium is NOT the same as linkage; not necessarily on the same chromosome).




High D = high probability of genes being inherited together.

1) What do r^2 stats tell you?
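
D is usually defined as the difference between the observed haplotype frequency and the frequency expected if the two alleles were independent: D = p(AB) - p(A) p(B). The sketch below uses that standard definition with hypothetical haplotypes and allele labels.

```python
def linkage_disequilibrium_d(haplotypes):
    """D = p(AB) - p(A) * p(B) for allele "A" at locus 1 and "B" at locus 2.
    haplotypes: list of (allele_at_locus_1, allele_at_locus_2) tuples."""
    n = len(haplotypes)
    p_a = sum(1 for h in haplotypes if h[0] == "A") / n
    p_b = sum(1 for h in haplotypes if h[1] == "B") / n
    p_ab = sum(1 for h in haplotypes if h == ("A", "B")) / n
    return p_ab - p_a * p_b

# Hypothetical populations of 100 chromosomes each
always_together = [("A", "B")] * 50 + [("a", "b")] * 50
independent = ([("A", "B")] * 25 + [("A", "b")] * 25
               + [("a", "B")] * 25 + [("a", "b")] * 25)

d_linked = linkage_disequilibrium_d(always_together)
d_independent = linkage_disequilibrium_d(independent)
print(d_linked, d_independent)
```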






s25

1) r = correlation coefficient between loci




r^2 : measures correlation of linked SNPs in the population


-Proportion of variation of one SNP, explained by another SNP




0 : no association


1 : perfect association
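
A common way to compute r^2 between two biallelic SNPs is D^2 divided by the product of the four allele frequencies. The sketch below uses that standard formula; the frequencies are hypothetical.

```python
def r_squared(p_ab, p_a, p_b):
    """r^2 between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

r2_perfect = r_squared(0.5, 0.5, 0.5)   # A and B always travel together
r2_none = r_squared(0.25, 0.5, 0.5)     # A and B are independent
r2_partial = r_squared(0.4, 0.5, 0.5)   # somewhere in between
print(r2_perfect, r2_none, r2_partial)
```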

1) What do LD plots show and how is data measured?




s26

1) Can see how r^2 values for different SNPs line up.


Haplotype blocks are SNPs that are passed on together. The higher the r^2, the stronger the tendency to be passed on together

1) What are Bonferroni corrections done for and how are they calculated?






s27

1) Used to reduce false positive rate: Conventional significance level = p < 0.05




When analyzing a million SNPs at p < 0.05, about 50,000 would appear significant by chance alone. But only 2 or 3 may truly be associated with disease.




α = significance level


m1 = # of markers


α' = α / m1


e.g. 1 million SNPs, α = 0.05


α' = 0.05/1,000,000 = 5 x 10^-8


A SNP's p-value has to be smaller than 5 x 10^-8 to be considered significant

1) What graph do we use to display GWAS results and to determine which SNPs are statistically significant?


2) What are functional studies to follow up GWAS?
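
The correction above, α' = α / m1, is simple to reproduce. In this sketch the SNP names and p-values are hypothetical.

```python
def bonferroni_threshold(alpha, n_markers):
    """Corrected significance level: alpha' = alpha / number of markers."""
    return alpha / n_markers

alpha = 0.05
n_markers = 1_000_000

# Without correction, this many SNPs would look significant by chance
expected_false_positives = alpha * n_markers
threshold = bonferroni_threshold(alpha, n_markers)

# Hypothetical p-values for three SNPs
p_values = {"snp_demo_1": 3e-9, "snp_demo_2": 1e-4, "snp_demo_3": 0.03}
significant = [name for name, p in p_values.items() if p < threshold]
print(expected_false_positives, threshold, significant)
```

Note that `snp_demo_3` would pass the conventional p < 0.05 cutoff but fails the corrected one.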






s28

1) Manhattan plots. Data points = -log10 p-values of individual SNPs


The 2 or 3 SNPs above the threshold line are what matter. e.g. we don't care about SNPs that are about hair colour, height, etc.


2) When GWAS suggests a candidate disease gene, or a SNP that confers disease risk, can do experiments e.g. cell culture, KO, CRISPR/Cas9 etc.
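
For the Manhattan plot in answer 1, each point's height is -log10 of the SNP's p-value, and the threshold line sits at -log10 of the corrected significance level. The sketch below only computes the values that would be plotted; the chromosomes, positions and p-values are hypothetical, and the 5 x 10^-8 threshold assumes one million tested SNPs.

```python
import math

# Hypothetical results: (chromosome, position, p-value)
results = [
    ("chr1", 1_200_000, 0.2),
    ("chr2", 31_900_000, 1e-12),
    ("chr3", 4_000_000, 1e-5),
]

def manhattan_points(results):
    """Convert p-values to the heights plotted on a Manhattan plot."""
    return [(chrom, pos, -math.log10(p)) for chrom, pos, p in results]

# Height of the genome-wide significance line for a 5e-8 threshold
threshold_line = -math.log10(5e-8)

# SNPs whose points rise above the threshold line
hits = [(chrom, pos) for chrom, pos, height in manhattan_points(results)
        if height > threshold_line]
print(threshold_line, hits)
```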

1) What issues could arise in GWAS results?




s30

1) Might identify regions with no known/predicted genes. Could be introns, could be pseudo-genes, could be non-canonical start sequences, etc.


-Sometimes replication in a larger sample may not be possible if the disease is rare

1) So what SNPs were identified as associated with schizophrenia?


What has been the challenge?




s32

1) More than 100 loci in the human genome with SNP haplotypes that associate with risk. Functional alleles and mechanisms remain to be discovered. Studies suggested that the strongest association is with genetic markers across the MHC, which spans several megabases of chromosome 6.


-> Complex patterns of association to markers in the MHC locus, which spans hundreds of genes; the pattern doesn't correspond to LD.

1) Which gene did the authors of the article start with?




s34

1) CSMD1 (CUB and Sushi multiple domains-1), which encodes a regulator of the C4 gene. CSMD1 is on a different chromosome, but the C4 gene it regulates IS on chromosome 6.


So they looked at what is regulated by the chromosome 6 genes even if on different chromosomes.



1) What are the multiple forms of the human C4 gene?




s35

1) C4 gene can exist in multiple copies (from 1 to 4), on each copy of chromosome 6.


Human C4 exists as functionally distinct, paralogous genes (isotypes), C4A and C4B; they both vary in structure and copy number. Their proteins are distinguished at a key site that determines which molecular targets they bind. Each also comes in long and short forms.


-> C4A and C4B have different types of SNPs, so there are four forms: C4A L, C4A S, C4B L, C4B S.


-> When an endogenous retrovirus sequence (C4-HERV) is inserted in intron 9, it can regulate functions of nearby genes. This allowed them to distinguish L/S forms.



1) What is ddPCR and what were they trying to find when they used it in this experiment?






s37

1) Used droplet digital PCR (ddPCR), which allowed them to see the relative contribution of each variant. Did trio studies. ddPCR splits the PCR reaction into droplets and amplifies within each droplet. Can see the different PCR products and count which are more abundant, so don't need 4 different samples.


-Wanted to identify structural haplotypes -> copy number and L/S HERV status, present on 222 copies of human chromosome 6 in trio studies. They found that genomes contained varying numbers of C4A genes, C4B genes, long (L) C4 genes and short (S) C4 genes, revealing the copy number of each of the 4 forms in each genome.

1) So what were the results that they found?






s39

1) C4A seemed to be more


-> Both C4A and C4B increased proportionally with copy number and C4-HERV status.


-> HERV C4 variations may affect expression; they can act as enhancers


-> They assessed how the variation affected RNA expression levels

1) What correlations did they see and why might that have happened?




s40

1) RNA expression of C4A and C4B increased proportionally with copy number of C4A and C4B respectively, in line with earlier observations in human serum.


-> C4A levels 2-3 times greater than C4B levels, even after correcting for relative copy numbers.


-> Copy number of C4-HERV increased the ratio of C4A to C4B expression.

1) What did ddPCR show?




s41

1) More C4A copy number -> More C4A expression


More C4B copy number -> higher C4B expression


C4-HERV copy number -> higher C4A expression (So this expression of the long form somehow affects it)

1) What GWAS design study did they do next? What were the results?






s45

1) -> From 22 countries, analyzed whole genome for SNPs, across MHC locus. This locus had the strongest of more than 100 GWAS associations.


Thousands of affected and control cases, in many countries; predicted expression levels.


-> Clear association with C4A. In people with schizophrenia, stronger association with C4A than with C4B.

1) What other results did they see?






s46

1) -> Strong correlation of similarly associating SNPs, spanning over 2 Mb across the distal end of the extended MHC region.


-> Other peak of association centred at C4, where schizophrenia associated most strongly with the genetic predictor of C4A expression levels.


-> In the region near C4, the more strongly a SNP correlated with predicted C4A expression, the more strongly it associated with schizophrenia.