71 Cards in this Set
- Front
- Back
Give an example of a conserved gene.
|
Thioredoxin protein coding gene.
-Facilitates reduction of other proteins (antioxidant) -Highly similar across different domains of life |
|
Why can we make effective genomic comparisons?
|
Unity of life
a. Central dogma (DNA --> RNA --> proteins)
b. All use the same stereoisomers of sugars and AAs (D-sugars and L-amino acids)
c. Universal genetic code |
|
What are some stop codons?
|
UAA
UAG
UGA |
|
What is the start AA? What is its codon?
|
Methionine
AUG |
|
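The start and stop codons on the two cards above can be checked against the standard genetic code. What follows is a minimal Python sketch, assuming the standard code (NCBI translation table 1) and an mRNA written in A/C/G/U. The names `CODON_TABLE` and `translate` are hypothetical helpers introduced only for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order.
# "*" marks a stop codon. (Assumption: NCBI translation table 1.)
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

STOP_CODONS = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")


def translate(mrna):
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(STOP_CODONS)
print(translate("GGAUGGCUUAAGG"))
```

Here the reading frame is set by the first AUG, so the two leading G's are skipped and translation ends at UAA.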
What is the basis of heritable material?
|
the genome
|
|
In genomics, homology is ...
|
the basis for most sequence similarity across genomes from different species
|
|
Define homology
|
Similar feature in two organisms because inherited from a common ancestor
|
|
Define molecular homology
|
similarity at DNA level because inherited from common ancestor;
sequences remain similar until mutation happens |
|
Phylogeny is...
|
-Historically based on one gene, but w/genomics potential for 100s of genes
-use of the characters in DNA and protein sequences to determine the relationships among organisms |
|
What determines the level of similarity/dis-sim when comparing 2+ genomes?
|
Depends on how distantly related the genomes are
-Short indels, domain (exon), gene, gene cluster, segment, chromosome, genome (<--- decreasing in frequency - recall graph, lecture 9) |
|
In comparative genomics, the greatest differences will be seen at the ____________ level.
|
nucleotide; fewer differences observed at the level of protein sequence (greater conservation)
|
|
In aligned protein sequences, the letters below each AA alignment indicate what?
|
Uppercase = 100% conservation
Lowercase = 75% conservation |
|
What does a dash (gap) in an alignment indicate?
|
Hypothesized insertion or deletion relative to ancestor
|
|
Why is there greater conservation retained at the level of proteins (amino acids)?
|
Because of the degeneracy of the genetic code
|
|
What is the degeneracy of the genetic code?
|
Most amino acids are encoded by more than one codon. Even in highly conserved proteins, mutations accumulate in third codon positions because they often do not affect the protein sequence and thus are not selected against.
|
|
Discuss synonymous, nonsense, and non-synonymous nt substitutions
|
Synonymous = same AA produced; neutral
Nonsense = stop; usually deleterious
Non-synonymous = different AA; unknown effect |
|
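The degeneracy and substitution cards above can be illustrated with a short Python sketch. It assumes the standard genetic code (NCBI table 1) and RNA codons. `classify_substitution` and `synonymous_fraction` are hypothetical helper names, and counting a stop-to-stop change as "synonymous" is a simplifying assumption of this sketch.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order ("*" = stop).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Degeneracy: how many codons encode each amino acid.
DEGENERACY = Counter(CODON_TABLE.values())


def classify_substitution(codon, position, new_base):
    """Classify a single-nucleotide change at a 0-based codon position."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    old_aa, new_aa = CODON_TABLE[codon], CODON_TABLE[mutated]
    if new_aa == old_aa:
        return "synonymous"      # same amino acid (or stop stays stop)
    if new_aa == "*":
        return "nonsense"        # a sense codon became a stop codon
    return "non-synonymous"      # a different amino acid is encoded


def synonymous_fraction(position):
    """Fraction of all single-base changes at this position that are synonymous."""
    same = total = 0
    for codon in CODON_TABLE:
        for base in BASES:
            if base != codon[position]:
                total += 1
                same += classify_substitution(codon, position, base) == "synonymous"
    return same / total


print(classify_substitution("GCU", 2, "C"))   # Ala -> Ala
print(classify_substitution("UAU", 2, "A"))   # Tyr -> stop
print(classify_substitution("GCU", 0, "U"))   # Ala -> Ser
print([round(synonymous_fraction(p), 2) for p in range(3)])
```

The per-position fractions make the point of the degeneracy card concrete: changes at the third codon position are far more likely to be silent than changes at the first or second.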
How are we able to use other species as model organisms?
|
Strong similarity of proteins across species
-44% of D. melanogaster genes have homologs -25% of C. elegans genes have human homologs |
|
Why is sequencing the fly and other organisms a major part of the HGP?
|
We can put human forms of disease genes in model organisms and study the phenotypic effects
|
|
What is a key focus of comparative genomics?
|
Evolution of gene families - derived from common ancestral gene that underwent duplication events
|
|
What is evidence for a gene family?
|
Sequence similarity in genes/proteins within a genome
|
|
Where do genes come from?
|
Vertical evolution: chromosomal segment duplication
-unequal crossing over during recombination -thru RNA (retrogene) |
|
How can you tell if a gene duplicate was due to unequal crossing over?
|
They would occur in tandem
|
|
What would indicate a gene duplication occurred through retrotransposition/is a retrogene?
|
-Far from parent copy
-No introns or regulatory regions relative to parent, but has a poly(A) tract -Usually pseudogenes (but can rarely be functional) - (but there are a lot in Arabidopsis?) |
|
Horizontal gene transfer example
|
Lettuce sea slug extracts chloroplasts from algae, then synthesizes chloroplast proteins to maintain their function
-Chloroplast genes acquired by horizontal transfer from the algal genome! |
|
Define orthologs
|
Genes separated by a speciation event; often perform same function in different species
|
|
Define paralogs
|
Genes separated by a duplication event; often perform different function within the same species
-Genome analyses often involve identification of all paralogs! |
|
What are the two steps to searching for gene family members/paralogs in the genome?
|
1) ID groups of similar sequences through BLAST searches
2) Construct phylogenetic trees from sequences, usually using sequences from other genomes too |
|
Large-scale gene duplications in genome are ...
|
-rare but important, e.g. Prader-Willi syndrome; differences between humans and chimps (2.7% due to duplications)
-Usually not maintained in germline |
|
What are some kinds of large scale genomic changes?
|
Whole genome duplication (polyploidy)
Loss or gain of chromosomes (aneuploidy)
Chromosomal fusions, translocations, inversions |
|
Polyploidy:
Aneuploidy: |
Extra set(s) of chromosomes
Abnormal number of chromosomes |
|
Human chromosome 2 is...
Evidence for this? |
Result of fusion of 2 ape chromosomes
-Two chimp chromosomes have near identical DNA sequences to human chromosome 2 -Presence of vestigial centromere in human chromosome 2 -Presence of vestigial telomeres found in middle of human chromosome 2 |
|
Synteny is...
|
Conservation of gene order across chromosomes in different species
-Mosaic evolution (some gene organization highly variable across spp; others highly conserved) |
|
Scans for selection in comparative genomics
|
Another major activity in the discipline
-Selection usually removes variation because purifying selection dominates |
|
Relationship between genomics and phylogenetics?
|
1) Species phylogen. tree helps to map/understand genomic changes
2) Phylogen. tree used to understand evolution of gene families
3) Genomic data can be used to create phylogen. tree from sequence alignments (relies on many genes) |
|
What's the main difference between a rooted and an unrooted phylogenetic tree?
|
Unrooted has no time dimension; cannot say order of branching events.
|
|
What data may be used for constructing phylogen. trees?
|
Morphological
-Cell bio, biochem, anatomy
Molecular
-DNA, RNA, protein seqs
-More abundant
-Simpler to model
-Easier to determine homology across spp |
|
Which seqs should be used in making phylogenetic trees?
|
Must have rate of evol. appropriate for the question being asked
-Fast changing for closely related organisms
-Slow changing for distantly related organisms
Sequences must be homologous
-Share a common ancestor
-Alignment of sequences
-Similarity alone does not imply homology |
|
Alignment of sequences is...
|
-Basic level of sequence comparison
-A hypothesis used to make decisions about function and evolution, homology |
|
What are some types of sequence alignments?
|
Pairwise
-Global or local
-Useful in genome annotation
Multiple seq. alignments
-Aligning more than 2 similar seqs
-Useful in phylogenetics and in determining functional regions in DNA or protein |
|
Dot plots are...
|
-Type of pairwise alignment
-One seq. on X axis of matrix, other on Y axis -Cells of identical characters are filled -Local similarity shown by broken diagonals (gaps shift the aligned regions; identical regions align diagonally) |
|
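The dot plot described above can be built in a few lines. This is a minimal Python sketch that marks only exact single-character matches (no window or threshold filtering, which real tools usually add). `dot_matrix` and `render` are hypothetical names used for illustration.

```python
def dot_matrix(seq_x, seq_y):
    """Rows follow seq_y, columns follow seq_x; True where characters are identical."""
    return [[x == y for x in seq_x] for y in seq_y]


def render(matrix, seq_x, seq_y):
    """Draw the matrix as text: '*' for a filled cell, '.' for an empty one."""
    lines = ["  " + " ".join(seq_x)]
    for label, row in zip(seq_y, matrix):
        lines.append(label + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


x, y = "GATTACA", "GATCACA"
print(render(dot_matrix(x, y), x, y))
```

Comparing a sequence against itself with the same function fills the whole main diagonal, and any additional parallel diagonals point to repeated segments.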
The optimal dot plot path is...
|
Through the
A. largest number of matching residues and B. smallest number of mismatching residues |
|
Multiple diagonals on a dotplot when comparing a sequence against itself indicates...
Criss-cross diagonals indicate... |
repetitive sequence
palindromic sequence |
|
What does a global pairwise alignment attempt to do?
|
Find the highest alignment "score" over the entire length of the 2 sequences.
-Reward for match, penalty for mismatch. -High cost to open a gap, low cost to extend one (e.g. -5 and -1, respectively) |
|
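To make the scoring scheme above concrete, here is a small Python sketch that scores an alignment that has already been made. It uses the card's example gap costs (-5 to open, -1 to extend). The match reward (+1), mismatch penalty (-1), the convention that the first gapped column pays only the opening cost, and the name `score_alignment` are all assumptions of this sketch.

```python
def score_alignment(a, b, match=1, mismatch=-1, gap_open=-5, gap_extend=-1):
    """Score two equal-length aligned strings that use '-' for gaps."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    total = 0
    prev_gap = None  # which sequence held a gap in the previous column
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            which = "a" if x == "-" else "b"
            # Continuing a gap in the same sequence is cheap; starting one is costly.
            total += gap_extend if prev_gap == which else gap_open
            prev_gap = which
        else:
            total += match if x == y else mismatch
            prev_gap = None
    return total


print(score_alignment("ACGTT", "A--TT"))  # one gap of length 2
print(score_alignment("ACGTT", "A-G-T"))  # two separate gaps of length 1
```

With these costs a single two-column gap scores higher than two separate one-column gaps, which is the behavior the high-open, low-extend scheme is meant to produce.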
What is the Needleman-Wunsch algorithm?
|
An algorithm for pairwise alignment
-Done with computer software |
|
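A compact Python sketch of Needleman-Wunsch global alignment follows. For brevity it assumes a single linear gap penalty (-2 per gapped column) rather than separate open and extend costs, with match +1 and mismatch -1. These values and the function name `needleman_wunsch` are illustrative choices, not taken from the cards.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for a global alignment of a and b."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))
```

The dynamic-programming table guarantees the best score under the chosen scoring scheme. When several alignments tie, the traceback order above simply picks one of them.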
What is BLOSUM62?
|
A common scoring matrix for pairwise alignment of protein sequences
|
|
How do multiple sequence alignments (MSAs) work?
|
-Uses heuristic progressive alignment starting with making all pairwise comparisons
-Order influences outcome, no guarantee of best global alignment |
|
What are the methods of calculating phylogenetic trees?
|
Maximum parsimony
Maximum likelihood
Bayesian methods
Distance methods (e.g. neighbor joining)
UPGMA (unweighted pair group method with arithmetic mean)
ALL REQUIRE COMPUTATIONAL SHORTCUTS due to the enormity of tree space (faster-than-exponential growth of options with each additional taxon) |
|
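The size of tree space mentioned above can be computed directly. For n taxa, the number of distinct unrooted, fully bifurcating tree topologies is the double factorial (2n-5)!! = 3 x 5 x ... x (2n-5). The Python sketch below evaluates it; `num_unrooted_trees` is a hypothetical helper name.

```python
def num_unrooted_trees(n_taxa):
    """Number of unrooted bifurcating topologies for n_taxa labeled tips: (2n-5)!!"""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for k in range(3, 2 * n_taxa - 5 + 1, 2):
        count *= k
    return count


for n in (4, 5, 10, 20):
    print(n, num_unrooted_trees(n))
```

Ten taxa already give about two million topologies, which is why exhaustive search is replaced by heuristics for realistic datasets.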
UPGMA works by...
|
Calculating a dissimilarity matrix from sequence alignment, then joining the 2 seqs (taxa) that have the least dissimilarity first.
-Then reduce the matrix and calc. the avg. dissimilarity between the joined pair and each of the remaining seqs; repeat until all are joined |
|
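The UPGMA steps above can be sketched in Python as follows. The sketch assumes dissimilarity is the simple proportion of differing alignment columns (p-distance) and returns only the tree topology as nested tuples, without branch lengths. `p_distance`, `upgma`, and the toy alignment are hypothetical and made up for illustration.

```python
from itertools import combinations


def p_distance(s1, s2):
    """Proportion of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(s1, s2)) / len(s1)


def upgma(names, dist):
    """dist maps frozenset({a, b}) -> dissimilarity. Returns a nested-tuple topology."""
    clusters = {name: 1 for name in names}  # cluster -> number of taxa it contains
    d = dict(dist)
    while len(clusters) > 1:
        # Join the two clusters with the smallest dissimilarity.
        a, b = min(combinations(clusters, 2), key=lambda pair: d[frozenset(pair)])
        na, nb = clusters.pop(a), clusters.pop(b)
        merged = (a, b)
        # Reduce the matrix: size-weighted average distance to every other cluster.
        for other in clusters:
            d[frozenset((merged, other))] = (
                na * d[frozenset((a, other))] + nb * d[frozenset((b, other))]
            ) / (na + nb)
        clusters[merged] = na + nb
    return next(iter(clusters))


# Toy alignment (made-up sequences).
aln = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "TCGAACTT", "D": "TCGAACTA"}
names = list(aln)
dist = {frozenset((a, b)): p_distance(aln[a], aln[b])
        for a, b in combinations(names, 2)}
print(upgma(names, dist))
```

Weighting the average by cluster size is what makes every original sequence count equally, which is the "unweighted" in the method's name.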
Maximum parsimony follows that...
|
Optimal tree is the one requiring the fewest mutations to explain the data
|
|
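One common way to count the mutations a tree requires is Fitch's algorithm, shown here as a Python sketch for a single alignment column. The card does not name a specific algorithm, so this choice is an assumption. Trees are written as nested tuples, and `fitch_changes` is a hypothetical helper name.

```python
def fitch_changes(tree, states):
    """Return (possible ancestral states, minimum changes) for one site.

    tree   -- a leaf name (str) or a 2-tuple of subtrees
    states -- dict mapping each leaf name to its character at this site
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_changes(left, states)
    right_set, right_cost = fitch_changes(right, states)
    shared = left_set & right_set
    if shared:
        # The children can agree on a state: no extra change needed here.
        return shared, left_cost + right_cost
    # The children disagree: one change is required on a branch below this node.
    return left_set | right_set, left_cost + right_cost + 1


site = {"A": "G", "B": "G", "C": "T", "D": "T"}
print(fitch_changes((("A", "B"), ("C", "D")), site)[1])
print(fitch_changes((("A", "C"), ("B", "D")), site)[1])
```

For this site the first tree needs one change and the second needs two, so parsimony prefers the first. A full analysis sums these counts over all sites and compares candidate trees.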
What is the most common method of tree support?
|
Bootstrapping - resampling the dataset with replacement to create replicate datasets of the same size.
--> Make trees and summarize |
|
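In phylogenetics the resampled units are usually alignment columns (sites). A minimal Python sketch of producing one bootstrap replicate is given below, assuming the alignment is stored as a dict of equal-length strings. `bootstrap_alignment` is a hypothetical name, and tree building on each replicate is left out.

```python
import random


def bootstrap_alignment(aln, rng):
    """Resample alignment columns with replacement; same size as the original."""
    length = len(next(iter(aln.values())))
    # Draw `length` column indices with replacement; some repeat, some are omitted.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns) for name, seq in aln.items()}


rng = random.Random(42)  # fixed seed so the sketch is repeatable
aln = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "TCGAACTT"}
replicates = [bootstrap_alignment(aln, rng) for _ in range(100)]
print(replicates[0])
```

Each replicate would then be used to build a tree, and the support for a clade is the fraction of replicate trees in which that clade appears.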
What is phylogenomics?
|
Using large amounts of molecular seq data to construct phylogenetic trees
-Useful for rapid speciation and deep evolutionary relationships |
|
What are prokaryotes?
|
Bacteria and Archaea (think of Woese's 16S rRNA tree)
-Unicellular organisms -No nucleus or membrane-bound organelles -Circular chromosome sometimes w/plasmids (DNA separate from chromosomal DNA) -NOT a monophyletic group |
|
Why are prokaryotes important?
|
6e30 cells; hold as much carbon as all plants in the world; essential component of global geochemical cycles; biomedical importance (humans as a community of bacteria)
|
|
What is the human microbiome?
|
Idea that the total number of microbial cells in a human could exceed human cells by a factor of 10
|
|
What is a "helpful bacteria" example from the human microbiome project?
|
Lactobacillus johnsonii, normally a gut bacterium, becomes abundant in the vagina during pregnancy
-Could transfer to baby on delivery --> inoculate gut in preparation for milk digestion |
|
What are key genomic differences between Bacteria and the Archaea and Eukaryota?
|
Bacteria have...
No introns
No DNA-associated proteins resembling histones
Unbranched fatty acid phospholipids |
|
What are Archaea?
|
The extremophiles (halophiles, thermophiles, acidophiles)
-Have interesting chemistries (methanogens, sulfate reducers, cellulose digestion) |
|
How do hyperthermophilic archaea do it?
|
Stabilize DNA with special ligands; shorter proteins, more charged residues and chaperones (proteins that assist in folding)
|
|
Name an important extremophile and what phylum it belongs to.
|
Thermus aquaticus (Taq). A bacterium, not an archaeon. Economically important (source of Taq polymerase); thermophile found in hot geysers of Yellowstone.
|
|
What are Bacteria?
|
Rarely have sub-cellular compartments
Diverse metabolisms
Simple morphology (no cytoskeleton)
Circular genomes
Can contain plasmids
Little non-coding DNA |
|
In bacterial genome replication, the lack of a nucleus means...
|
Replication, transcription and translation co-occur, therefore more genes are encoded on the leading strand (avoid molecular collisions between RNA polymerase and replication)
|
|
E. coli fun facts?
|
1997 - completely sequenced
4.6 Mb
4288 genes
87.8% coding
Genes are small (less than 1000 bp)
Lab strain closely related to highly pathogenic strains! |
|
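The E. coli figures above are consistent with each other, as a quick calculation shows. This Python sketch uses only the numbers on the card; because 4.6 Mb is rounded, the result is approximate.

```python
genome_bp = 4.6e6        # genome size from the card (rounded)
coding_fraction = 0.878  # 87.8% coding
gene_count = 4288

# Average amount of coding sequence per gene.
mean_coding_bp = genome_bp * coding_fraction / gene_count
# Gene density: genes per kilobase of genome.
genes_per_kb = gene_count / (genome_bp / 1000)

print(round(mean_coding_bp), round(genes_per_kb, 2))
```

The average comes out a little under 1000 bp of coding sequence per gene, at close to one gene per kilobase, which matches the "genes are small" and "little non-coding DNA" points.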
What is the difference between lab E. coli and pathogenic E. coli?
|
Pathogen genome larger and contains more genes (acquired by lateral transfer)
-Have toxins, adhesion factors, invasion factors, capsule and maybe plasmids |
|
What are eukaryotes?
|
Organisms with nucleus and membrane-bound organelles
-includes amoeba and paramecium (fka Protista) |
|
Origin of eukaryotic cell?
|
Symbiosis of archaea inside bacteria
|
|
What is the fusion hypothesis?
|
Origin of eukaryotes (archaea-bacteria fusion)
-Archaea-like genes retained for informational processes -Eubacteria-like genes retained for operational processes |
|
What are some hypotheses for the origin of Eukaryota?
|
Fusion
Engulfment
Symbiosis |
|
What is Endosymbiotic theory?
|
(Lynn Margulis - UMass A)
Theory for the origin of mitochondria and plastids (resemble bacteria in size and shape) -Contain small circular genome -Have their own ribosomes similar in size and seq to bacterial ribosomes |
|
Animal mt genomes are ...
Plant mt genomes are ... |
Small (13 genes in humans)
Larger, 50-60 genes |
|
What does a reduced genome (size) indicate?
|
Symbiotic/parasitic relationships where genes were contributed to the nuclear genome and lost in the symbiont (think α-proteobacterial endosymbiont --> mitochondrion)
|
|
How many times did multicellularity evolve in eukaryotes?
|
25 times!
-A key innovation to Eukaryota |