Functional

by Henrikhl, May 2017

114 Cards in this Set

Front	Back	3rd side (hint)

What is functional genomics?	The study of how things with different genotypes can have different functions, i.e. phenotypes!
How big is the human genome?	3 Gb (haploid). About 2 meters of DNA in each cell!
What different types of proteins are there?	Structural, regulatory__Housekeeping (general), Luxury (specific)
Why might RNA not be a good proxy for proteins?	asf
What are the 4 analyses you can do with RNA-seq?	Expression levels, DE, patterns in expression, splicing and isoforms
What is cDNA?	DNA strands produced from RNA via reverse transcriptase.
Name 2 ways to measure RNA	Microarrays__RNA-seq
Explain how microarrays work	Probes with solid support.__Hybridise labeled samples of mRNA to these.__Shoot with lasers.
Explain how RNA-seq works	asd
Name elements in transcriptional regulation	Gene/protein interactions__DNA methylation__Chromatin (epigenetic structure)__microRNA__Gene expression
Explain 2-color microarrays	Experimental culture (red) + control culture (green)__Measure colour
Explain 1-color microarrays and give examples of types	BeadArray: Beads coated with copies of probe__23 base address linked to 50-base gene specific probe__30 copies of each bead type per array: 47k genes => 1.3M probes
Explain the general workflow when working with microarrays	asf
Explain the three cornerstones of experimental design	asd
How to detect outliers?	log-2 normalisation__Positive/negative controls__Number of detected probes per sample__Signal distribution__Clustering / PCA
Explain how control probes work	Positive: Housekeeping genes -- should be there!__Negative: Things that should not be there!__Negative also used to determine background.
What do MA-plots show?	log2 ratio (M) vs. mean of the log intensities (A). Should be a straight line around M = 0.
Name some normalisation techniques	Simple scaling, LOESS, Quantile, robust spline
How does quantile normalisation work?	asf
What can be done to determine normalisation?	MA-plot / PCA__Search for patterns in data
Describe the RNA-seq workflow	Align to sequence (genome/transcriptome), get annotation, compare with databases
Describe why and how to filter probes	p = 0.01 => lots of finds! 50k probes -> 500 false pos.__Remove probes not present in all samples.__Use multiple probes per gene
Describe the three downstream analysis categories	Class discovery: clustering, PCA__Class comparison: groups known, genes / pathways associated?__Class prediction: SVM, survival
What is the cross-hybridisation problem and what does it limit?	In microarrays, DNA hybridises to the wrong probes.__Limits sensitivity, range, probe coverage
Benefits of RNA-seq	Can get isoform data -- not possible with microarray__Gives absolute abundance
Benefit of microarrays	Faster, easier, more mature analysis
How does sequencing work?	Reversible terminators__Sequencing by ligation__Nanopores
Describe the Solexa sequencing structure	15 steps or so
What are some modes of sequencing?	Paired-end: each end aligned separately__Multiplexing__Capture sequencing
What are some benefits of paired end sequencing?	Disambiguate non-uniquely mapped reads__Detect isoforms__Detect duplications, inversions, chromosomal rearrangements__Calculation of distribution of insert sizes
Describe multiplexing (probably not in it)	Look it up
Describe capture sequencing	Magnetic strip to specific sequences__Flush away rest.
Discuss problems with sequence alignment	Filtered / unfiltered (???)__Unique / non-unique__Duplicates (amplification bias)__Mismatches / indels__Adapters and index sequence
Sequencing trends (probably not)	sa
What questions are RNA and ChIP seq answering?	RNA: 1) WHAT is the sequence? 2) WHAT is the concentration?__ChIP: 1) WHERE does it bind? 2) HOW MUCH is bound?
Why is RNA-seq imperfect?	Data consists of 1-2 sequences per fragment__Base call qualities vary for each base in each read__RNA-data is meta-data. Read -> cDNA library__Reference genome is rarely the sample genome: SNPs, indels, structural variants__Reads prone to error (1/1000)
Describe the ChIP-seq protocol	1. Crosslink and shear__2. Add protein specific antibody and immunoprecipitate__3. Sequence one end of each fragment__4. Get coverage!
Describe the RNA-seq protocol	1. Select RNA of interest (e.g. mRNA)__2. Fragment and reverse-transcribe to dsDNA__3. Size select, denature to ss-cDNA__4. Sequence n bases from one/both ends of fragments (n ~ 50-100)
How to map RNA reads to transcripts?	De novo assembly(???)__Alignment + gene model assembly (map to DNA)__Transcriptome alignment (map to RNA)
Explain how de Bruijn graphs work (VERY POSSIBLE EXAM QUESTION)	asf
What is a kmer?	A substring of a read
Explain how to choose k in RNA-seq assembly	asf
How do SNPs/errors appear in de Bruijn graphs?	Bubbles! Take path with highest coverage.
Explain differences between genome and transcriptome alignment	G: Detection of novel genes. Spliced alignment is tricky. Insert sizes harder to interpret.__T: No need for spliced alignment. Simplifies read counting for each isoform. Simplifies discrimination between mappings using insert sizes. Novel genes go undetected.
What is TopHat? Describe workflow.	RNA-seq aligner.
What is a gene model?	asf
How does Cufflinks work?	asf
RNA-Skim, kallisto, Sailfish -- hash-table aligners	asf
Describe briefly how to filter alignments	Pick part SNP / variant with best coverage__Multiply matching sequences with outlying insert lengths__Take out repeats (sole peaks)
Why are isoforms interesting?	They give increased resolution of RNA-sequence (the variants)__We have two versions of each isoform sequence in diploid orgs
What is a Poisson distribution?	Independent events occurring at a given rate__Mean = Var = rate; rates add: pets_rate = dogs_rate + cats_rate
What are three determining factor for how many reads align to a transcript?	Total nr reads__Length of transcript__Abundance of transcript
What is a formula for estimated gene reads?	r_g ~ Poisson(b mu_g l_g)__mu_g: concentration, l_g: effective length, b: normalisation constant
What are some problems with Poisson models?	Gene length ambiguous due to isoforms__Cross-linkage__We don't get reads of isoforms
Explain the multinomial distribution	>2 categories. Good when there are multiple SNPs and isoforms.
MMSEQ	asf
Transcript amalgamation	asf
What is gene imprinting?	Genes are expressed in a parent-of-origin specific manner__Gene from father imprinted -- only mother version expressed
What are the aims of read count normalisation?	Comparable across features (e.g. genes)__Comparable across samples__Human-friendly scale
What is RPKM normalisation?	Set k_ig such that estimates of mu_ig are comparable between genes and samples.__muhat_ig = r_ig / k_ig
From Binomial to Poisson	awff
TMM normalisation	asf
Median log deviation normalisation	asf
What is a negative binomial distribution? When do we use it?	When rate of Poisson dist not fixed, but varies according to a gamma dist.__Variance is greater than mean.
How does ChIP-seq work?	1. Isolate chromatin__2. Cross-link and fragment__3. Antibodies, precipitate__4. Reverse cross-links, purify DNA__5. Ligate adaptors, sequence
What is ChIP-seq interested in?	Where is the protein bound? 90 % background!__Where are proteins bound differentially?__What do the sites mean biologically?
Describe the ChIP-seq workflow	asf
Why use a control track in ChIP-seq?	Get background. See tissue anomalies. Open chromatin???. Experimental / technical biases. Background distribution irregular.
Name three types of controls in ChIP-seq	Input__Vehicle__Non-specific antibody
Explain good experimental design in ChIP-seq	Technical replicates: multiple lanes per flowcell__Biological: patient samples, model organisms__Experimental: repeat procedures, (different antibody)
Name the important ChIP-seq parameters in experimental design	Single end / paired end, read length (50bp), read depth (20-30M), batches and randomisation, multiplexing (one pool is optimal)
What are blacklists for?	Duplicates etc.
How would you perform quality assessment in ChIP-seq?	Coverage histogram__Fragment length estimation (cross-correlation, cross coverage, normalised score)
How do you call peaks?	Identify maxima. Take region around it.
Name three peak-based metrics	1. Reads in peaks (reads that overlap peaks, dist of read density)__2. Peak profiles (mean density at each position relative to summit)__3. Clustering and PCA
What are the two types of differential binding analysis?	Overlap: peak/site occupancy__Quantitative: binding affinity__Binding site count density__Binding profile__Sliding windows
What are consensus peaks?	Peaks found in several samples from the same tissue/experiment
Explain the DiffBind workflow	1. Read in peaksets__2. Determine occupancy__3. Count reads__4. DBA__5. Plot and report, then re-evaluate.
Name plotting tools used in DBA	MA plots__Heatmaps / Clustering / PCA__Boxplots. Peak abundance / density.
Name as many as you can of the 7 regulatory elements:	TFs, Histone mods, Nucleosome pos, chromatin domains, Polycomb group (PcG) proteins, DNA methylation, non-coding RNA.
Name four types of TFs	Master regulators, General TFs, Pioneer factors, Tissue specific factors
What are the classes of regulatory elements?	Core promoters__Enhancers & super enhancers__Locus control regions__Silencers__Insulators
Core promoters	asf
Enhancers & super enhancers	asf
Locus control regions	asf
Silencers & insulators	asf
How to identify regulatory elements?	Motif analysis__Sequence binding sites
MEME	MEME-ChIP	HOMER
What is the difference between epigenetics and epigenomics?	Epigenetics: gene expression__Epigenomics: overall chromatin state of the full genome; an organism has a single genome, many epigenomes
What is a histone?	What the DNA coils around -> gives chromatin
How can histone be modified?	5 common ways -- see slide
What are chromatin marks?	af
Chromatin segmentation	asf
WHAT ARE CHROMATIN MARKS?	asf
Explain how Hi-C works	asf
What are some Hi-C technical and biological biases?	awf
A-B compartments	af
How to detect chromatin accessibility?	awf
ATAC-seq	asf
What is a pathway?	Series of consecutive interactions that give rise to a certain state
What are the three types of pathways?	Signalling__Genetic / transcriptional / regulatory__Metabolic (ana / cata / energy transport)
How would you determine enrichment of gene lists?	Fisher's / hypergeometric__Chi-squared__Binomial
How would you determine enrichment of ranked lists?	Kolmogorov-Smirnov__Minimum hypergeometric test__Wilcoxon rank sum
Where do the gene lists in PWA come from?	omics data: RNA / ChIP / proteomics
What is Gene Ontology?	Contains information on cellular component, molecular function, biological process
What are some causes of SNPs?	Replication errors__Repair error__Mutagens__Spontaneous
What are some causes of INDELs?	Strand slippage__Aberrant repair__Retrotransposons
What are some causes of Structural Variations?	Replication errors__Retrotransposition__Repair errors__Recombination errors
What is a somatic variation?	Mutations acquired post conception__SNVs, not SNPs__CNAs, not CNVs__INDELs__SVs
What is the difference between SNVs/SNPs and CNAs/CNVs?	???
Explain the copy number calling workflow	Quantify signal (depth / intensity)__Segment chromosome (HMM / smoothing)__Call changes (threshold, cluster, prob. models)
B-allele frequency	af
BAF-banding	fass
Some cancer types are mutation driven, some CNA driven	as
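
Worked example for the probe-filtering card: with a per-test threshold of p = 0.01 and 50k probes, about 500 false positives are expected if no probe is truly changed. The sketch below reproduces that arithmetic. The function names are made up for illustration, and the Bonferroni threshold is a standard correction that is not mentioned on the cards.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of false positives when every null hypothesis is true."""
    return n_tests * alpha


def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test threshold that keeps the family-wise error rate at alpha.

    Assumption: Bonferroni is used here only as a familiar example of a
    multiple-testing correction; the cards do not name one.
    """
    return alpha / n_tests


if __name__ == "__main__":
    print(expected_false_positives(50_000, 0.01))
    print(bonferroni_threshold(50_000, 0.01))
```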
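
Worked example for the MA-plot card: a minimal sketch of how the two axes are usually computed from a pair of intensity channels, assuming the common definitions M = log2(R) - log2(G) and A = (log2(R) + log2(G)) / 2. The function name and the example intensities are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def ma_values(red: Sequence[float], green: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Return (M, A) for paired, strictly positive intensities."""
    m = [math.log2(r) - math.log2(g) for r, g in zip(red, green)]
    a = [0.5 * (math.log2(r) + math.log2(g)) for r, g in zip(red, green)]
    return m, a


if __name__ == "__main__":
    # Hypothetical intensities for three probes.
    m, a = ma_values([8.0, 4.0, 16.0], [2.0, 4.0, 16.0])
    for m_i, a_i in zip(m, a):
        print(f"M={m_i:+.2f}  A={a_i:.2f}")
```

After a good normalisation most points should scatter around M = 0 across the whole range of A.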
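
Worked example for the k-mer card: a k-mer is a substring of a read, so listing them is a sliding window. The second function is a sketch of the usual textbook de Bruijn construction (nodes are (k-1)-mers, each k-mer adds one edge); the de Bruijn card itself is unanswered, so treat this as an assumption about what is meant, not as the course's definition.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def kmers(read: str, k: int) -> List[str]:
    """All length-k substrings of a read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def de_bruijn(reads: Iterable[str], k: int) -> Dict[str, List[str]]:
    """Adjacency list: prefix (k-1)-mer -> list of suffix (k-1)-mers."""
    graph: Dict[str, List[str]] = defaultdict(list)
    for read in reads:
        for kmer in kmers(read, k):
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


if __name__ == "__main__":
    print(kmers("ACGTAC", 3))
    print(de_bruijn(["ACGTAC"], 3))
```

In such a graph a SNP or sequencing error shows up as two alternative paths between the same pair of nodes, which is the "bubble" mentioned on the card.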
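
Worked example for the RPKM card: the card writes the estimate as muhat_ig = r_ig / k_ig. A minimal sketch, assuming the usual RPKM scaling where k_ig is the transcript length in kilobases times the library size in millions of reads. The function name and the numbers are hypothetical.

```python
def rpkm(read_count: float, length_bp: float, total_reads: float) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    Assumption: k_ig = (length in kb) * (library size in millions of reads).
    """
    k = (length_bp / 1e3) * (total_reads / 1e6)
    return read_count / k


if __name__ == "__main__":
    # Hypothetical gene: 500 reads, 2 kb long, 10 million reads in the sample.
    print(rpkm(500, 2_000, 10_000_000))
```

Dividing by length makes genes comparable, and dividing by library size makes samples comparable, which matches the aims listed on the read count normalisation card.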
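
Worked example for the gene-list enrichment card: a sketch of the one-sided hypergeometric test (equivalent to a one-sided Fisher's exact test) using only the standard library. The parameter names are my own: N genes in the background, K of them in the pathway, n genes in the list, k of those in the pathway.

```python
from math import comb


def hypergeom_enrichment_p(N: int, K: int, n: int, k: int) -> float:
    """P(X >= k) when drawing n genes without replacement from N, K of which are annotated."""
    total = comb(N, n)
    upper = min(n, K)
    # comb() returns 0 for impossible draws, so those terms vanish.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


if __name__ == "__main__":
    # Hypothetical numbers: 20,000 background genes, 100 in the pathway,
    # 200 genes in the list, 5 of them in the pathway.
    print(hypergeom_enrichment_p(20_000, 100, 200, 5))
```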
