71 Cards in this Set

  • Front
  • Back
Give an example of a conserved gene.
Thioredoxin protein coding gene.
-Facilitate reduction of other proteins (antioxidant)
-Highly similar across different domains of life
Why can we make effective genomic comparisons?
Unity of life
a. Central dogma (DNA --> RNA --> proteins)
b. All use the same stereoisomers of sugars and AAs (D-sugars and L-amino acids)
c. Universal genetic code
What are some stop codons?
UAA
UAG
UGA
What is the start AA? What is its codon?
Methionine
AUG
What is the basis of heritable material?
the genome
In genomics, homology is ...
the basis for most sequence similarity across genomes from different species
Define homology
Similar feature in two organisms because inherited from a common ancestor
Define molecular homology
similarity at DNA level because inherited from common ancestor;
sequences remain similar until mutation happens
Phylogeny is...
-Historically based on one gene, but w/genomics potential for 100s of genes
-use of the characters in DNA and protein sequences to determine the relationships among organisms
What determines the level of similarity/dis-sim when comparing 2+ genomes?
Depends on how distantly related the genomes are
-Short indels, domain (exon), gene, gene cluster, segment, chromosome, genome
(<--- decreasing in frequency - recall graph, lecture 9)
In comparative genomics, the greatest differences will be seen at the ____________ level.
nucleotide; fewer differences observed at the level of protein sequence (greater conservation)
In aligned protein sequences, the letters below each AA alignment indicate what?
Uppercase = 100% conservation
Lowercase = 75% conservation
What does a dash (gap) in an alignment indicate?
Hypothesized insertion or deletion relative to ancestor
Why is there greater conservation retained at the level of proteins (amino acids)?
Because of the degeneracy of the genetic code
What is the degeneracy of the genetic code?
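Most amino acids are encoded by more than one codon, so many nucleotide changes leave the protein unchanged.
-Even in highly conserved proteins, mutations accumulate in third codon positions because they often do not affect the protein sequence and thus are not selected against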
Discuss synonymous, nonsense, and non-synonymous nt substitutions
Synonymous = same AA produced; neutral
Nonsense = stop; usually deleterious
Non-synonymous = different AA; unknown effect
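The three categories above can be checked mechanically against the standard genetic code. A minimal sketch, assuming DNA codons (T instead of U) and the standard code table; the function name `classify_substitution` is made up for illustration:

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon, position, new_base):
    """Classify a single-nucleotide change at `position` (0, 1 or 2) of `codon`."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        return "synonymous"      # same AA produced
    if after == "*":
        return "nonsense"        # new stop codon
    return "non-synonymous"      # different AA


print(classify_substitution("GGT", 2, "C"))  # third-position change in a Gly codon
print(classify_substitution("TGG", 2, "A"))  # Trp codon changed to TGA
print(classify_substitution("ATG", 0, "C"))  # Met codon changed to CTG
```

Trying all three positions of a few codons shows why third-position changes are so often synonymous.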
How are we able to use other species as model organisms?
Strong similarity of proteins across species
-44% of D. melanogaster genes have human homologs
-25% of C. elegans genes have human homologs
Why is sequencing the fly and other organisms a major part of the HGP?
We can put human forms of disease genes in model organisms and study the phenotypic effects
What is a key focus of comparative genomics?
Evolution of gene families - derived from common ancestral gene that underwent duplication events
What is evidence for a gene family?
Sequence similarity in genes/proteins within a genome
Where do genes come from?
Vertical evolution: chromosomal segment duplication
-unequal crossing over during recombination
-thru RNA (retrogene)
How can you tell if a gene duplicate was due to unequal crossing over?
They would occur in tandem
What would indicate a gene duplication occurred through retrotransposition/is a retrogene?
-Far from parent copy
-No introns/regulatory regions relative to parent, but has a poly(A) tract
-Usually pseudogenes (but can rarely be functional) - (but there are a lot in Arabidopsis?)
Horizontal gene transfer example
Lettuce sea slug extracts chloroplasts from algae, then synthesizes chloroplast proteins to maintain their function
-Chloroplast genes acquired by horizontal transfer from the algal genome!
Define orthologs
Genes separated by a speciation event; often perform same function in different species
Define paralogs
Genes separated by a duplication event; often perform different function within the same species
-Genome analyses often involve identification of all paralogs!
What are the two steps to searching for gene family members/paralogs in the genome?
1) ID groups of similar sequences through BLAST searches
2) Construct phylogenetic trees from sequences, usually using sequences from other genomes too
Large-scale gene duplications in genome are ...
-rare but important, e.g. Prader-Willi; differences between humans and chimps (2.7% due to duplications)
-Usually not maintained in germline
What are some kinds of large scale genomic changes?
Whole genome duplication (polyploidy)
Loss or gain of chromosomes (aneuploidy)
Chromosomal fusions, translocations, inversions
Polyploidy:
Aneuploidy:
Extra set(s) of chromosomes
Abnormal number of chromosomes
Human chromosome 2 is...
Evidence for this?
Result of fusion of 2 ape chromosomes
-Two chimp chromosomes have near-identical DNA sequences to human chromosome 2
-Presence of vestigial centromere in human chromosome 2
-Presence of vestigial telomeres found in middle of human chromosome 2
Synteny is...
Conservation of gene order across chromosomes in different species
-Mosaic evolution (some gene organization highly variable across spp; others highly conserved)
Scans for selection in comparative genomics
Another major activity in the discipline
-Selection usually removes variation because purifying selection dominates
Relationship between genomics and phylogenetics?
1) Species phylogen. tree helps to map/understand genomic changes
2) Phylogen. tree used to understand evolution of gene families
3) Genomic data can be used to create phylogen. tree from sequence alignments (relies on many genes)
What's the main difference between a rooted and an unrooted phylogenetic tree?
Unrooted has no time dimension; cannot say order of branching events.
What data may be used for constructing phylogen. trees?
Morphological
-Cell bio, biochem, anatomy
Molecular
-DNA, RNA, protein seqs
-More abundant
-More simple to model
-Easier to determine homology across spp
Which seqs should be used in making phylogenetic trees?
Must have rate of evol. appropriate for the question being asked
-Fast changing for closely related organisms
-Slow changing for distantly related organisms
Sequences must be homologous
-Share a common ancestor
-Alignment of sequences
-Similarity alone does not imply homology
Alignment of sequences is...
-Basic level of sequence comparison
-A hypothesis used to make decisions about function and evolution, homology
What are some types of sequence alignments?
Pairwise
-Global or local
-Useful in genome annotation
Multiple seq. alignments
-Aligning more than 2 similar seqs
-Useful in phylogenetics and in determining functional regions in DNA or protein
Dot plots are...
-Type of pairwise alignment
-One seq. on X axis of matrix, other on Y axis
-Cells of identical characters are filled
-Local similarity shown by broken diagonals (gaps shift the aligned regions; identical regions align diagonally)
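A dot plot is simple enough to build by hand. A minimal sketch of the matrix described above, with one sequence along the columns (X axis) and the other along the rows (Y axis); the names `dot_plot` and `render` are illustrative only:

```python
def dot_plot(seq_x, seq_y):
    """Return a matrix (list of rows); a cell is True where the characters are identical."""
    return [[x_char == y_char for x_char in seq_x] for y_char in seq_y]


def render(matrix, seq_x, seq_y):
    """Draw the matrix as text: '*' for a filled cell, '.' for an empty one."""
    lines = ["  " + seq_x]
    for y_char, row in zip(seq_y, matrix):
        lines.append(y_char + " " + "".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


seq = "GATTACA"
print(render(dot_plot(seq, seq), seq, seq))
```

Comparing a sequence against itself fills the main diagonal; any extra filled cells off the diagonal come from characters that recur elsewhere in the sequence. Real tools usually score a sliding window instead of single characters to cut down on this noise.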
The optimal dot plot path is...
Through the
A. largest number of matching residues and
B. fewest number of mismatching residues
Multiple diagonals on a dotplot when comparing a sequence against itself indicates...
Criss-cross diagonals indicate...
repetitive sequence
palindromic sequence
What does a global pairwise alignment attempt to do?
Find the highest alignment "score" over the entire length of the 2 sequences.
-Reward for match, penalty for mismatch.
-High cost to open a gap, low cost to extend one (e.g. -5 and -1, respectively)
What is the Needleman-Wunsch algorithm?
An algorithm for pairwise alignment
-Done with computer software
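A minimal sketch of Needleman-Wunsch global alignment by dynamic programming. To keep it short, it assumes a single linear gap penalty (every gap position costs the same) rather than the separate gap-open/gap-extend costs mentioned earlier, and the match/mismatch/gap values are arbitrary illustration values, not the ones from lecture:

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair_score(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),  # align two residues
                score[i - 1][j] + gap,                   # gap in b
                score[i][j - 1] + gap,                   # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


total, top, bottom = needleman_wunsch("GATTACA", "GATACA")
print(total)
print(top)
print(bottom)
```

For protein sequences, the fixed match/mismatch values would be replaced by a lookup in a scoring matrix such as BLOSUM62.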
What is BLOSUM62?
A common scoring matrix for pairwise alignment of protein sequences
How do multiple sequence alignments (MSAs) work?
-Uses heuristic progressive alignment starting with making all pairwise comparisons
-Order influences outcome, no guarantee of best global alignment
What are the methods of calculating phylogenetic trees?
Maximum parsimony
Maximum likelihood
Bayesian methods
Distance methods (e.g. neighbor joining)
UPGMA (unweighted pair group method with arithmetic mean)
ALL REQUIRE COMPUTATIONAL SHORTCUTS due to the enormity of tree space (exponential growth of options with each additional taxon)
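To get a feel for the size of tree space: a standard combinatorial result (not from the lecture notes) is that n taxa can be arranged into (2n-5)!! distinct unrooted, fully bifurcating trees, i.e. the product 3 x 5 x ... x (2n-5). A short sketch; the function name is illustrative:

```python
def unrooted_tree_count(n_taxa):
    """Number of unrooted bifurcating tree topologies for n_taxa labelled taxa."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5 (empty product for n = 3).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


for n in (3, 4, 5, 10, 20):
    print(n, unrooted_tree_count(n))
```

Ten taxa already give about two million topologies, which is why exhaustive searches are only practical for very small datasets.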
UPGMA works by...
Calculating a dissimilarity matrix from sequence alignment, then joining the 2 seqs (taxa) that have the least dissimilarity first.
-Then reduce the matrix and calculate the avg. dissimilarity between the 2 joined seqs and each of the remaining seqs; repeat until everything is joined
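The join-and-reduce loop can be sketched in a few lines. This assumes the dissimilarity matrix has already been computed from an alignment; it returns only the tree topology as nested tuples (no branch lengths), and the toy distances are made up:

```python
from itertools import combinations


def upgma(labels, matrix):
    """Cluster taxa by UPGMA; matrix[i][j] is the dissimilarity of labels[i] and labels[j]."""
    # Each cluster id maps to (subtree, number of taxa in the subtree).
    clusters = {i: (label, 1) for i, label in enumerate(labels)}
    dist = {(i, j): matrix[i][j] for i, j in combinations(range(len(labels)), 2)}
    next_id = len(labels)
    while len(clusters) > 1:
        i, j = min(dist, key=dist.get)  # least dissimilar pair is joined first
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        new_dist = {}
        for k in clusters:
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            # Size-weighted mean = arithmetic mean over all member pairs.
            new_dist[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        # Reduce the matrix: drop entries involving the two joined clusters.
        dist = {pair: d for pair, d in dist.items() if i not in pair and j not in pair}
        dist.update(new_dist)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree


toy = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 10],
    [10, 10, 10, 0],
]
print(upgma(["A", "B", "C", "D"], toy))
```

Here A and B are joined first, then C, then D.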
Maximum parsimony follows that...
Optimal tree is the one requiring the fewest mutations to explain the data
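One way to count the fewest mutations a given tree needs is Fitch's algorithm, which is not named in these notes and is used here only as an illustration. The sketch scores a single alignment column on a fixed tree written as nested tuples; a full parsimony search would repeat this over all columns and many candidate trees:

```python
def fitch(tree, states):
    """Return (possible ancestral states, minimum number of changes) for one site.

    tree   -- a leaf name (str) or a pair (left_subtree, right_subtree)
    states -- dict mapping each leaf name to its character at this site
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_changes + right_changes
    # Children disagree: one change must have happened on a branch below this node.
    return left_set | right_set, left_changes + right_changes + 1


site = {"A": "A", "B": "A", "C": "G", "D": "G"}
print(fitch((("A", "B"), ("C", "D")), site)[1])  # tree grouping A+B and C+D
print(fitch((("A", "C"), ("B", "D")), site)[1])  # tree grouping A+C and B+D
```

The first tree explains the site with one change and the second needs two, so parsimony prefers the first.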
What is the most common method of tree support?
Bootstrapping - resampling the dataset with replacement to create replicate datasets of the same size.
--> Make trees and summarize
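The resampling step can be sketched as follows, assuming the alignment is held as a dict of equal-length strings. Building a tree from each replicate and summarizing how often each grouping appears is left out:

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement; the replicate has the same size."""
    n_sites = len(next(iter(alignment.values())))
    # Draw n_sites column indices; some columns repeat, others are left out.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(seq[c] for c in columns)
        for name, seq in alignment.items()
    }


alignment = {"sp1": "ACGTAC", "sp2": "ACGTTC", "sp3": "ACCTTC"}
rng = random.Random(0)  # seeded only so the example is repeatable
replicates = [bootstrap_alignment(alignment, rng) for _ in range(3)]
for rep in replicates:
    print(rep)
```

Whole columns are resampled, so every replicate column is a copy of some original column and the taxa stay aligned with each other.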
What is phylogenomics?
Using large amounts of molecular seq data to construct phylogenetic trees
-Useful for rapid speciation and deep evolutionary relationships
What are prokaryotes?
Bacteria and Archaea (think of Woese's 16S rRNA tree)
-Unicellular organisms
-No nucleus or membrane-bound organelles
-Circular chromosome sometimes w/plasmids (DNA separate from chromosomal DNA)
-NOT a monophyletic group
Why are prokaryotes important?
6e30 cells; hold as much carbon as all plants in the world; essential component of global geochemical cycles; biomedical importance (humans as a community of bacteria)
What is the human microbiome?
The microbes living in and on a human; idea that the total number of microbial cells could exceed human cells by a factor of 10
What is a "helpful bacteria" example from the human microbiome project?
Lactobacillus johnsonii, normally a gut bacterium, becomes abundant in the vagina during pregnancy
-Could transfer to baby on delivery --> inoculate gut in preparation for milk digestion
What are key genomic similarities between Eukaryota and Archaea and Bacteria?
Bacteria have...
No introns
No DNA-associated proteins resembling histones
Unbranched fatty acid phospholipids
What are Archaea?
The extremophiles (halophiles, thermophiles, acidophiles)
-Have interesting chemistries (methanogens, sulfate reducers, cellulose digestion)
How do hyperthermophilic archaea do it?
Stabilize DNA with special ligands; shorter proteins, more charged residues and chaperones (proteins that assist in folding)
Name an important extremophile and what phylum it belongs to.
Thermus aquaticus (Taq), phylum Deinococcus-Thermus. A bacterium, not an archaeon. Economically important; thermophile found in hot geysers of Yellowstone.
What are Bacteria?
Rarely have sub-cellular compartments
Diverse metabolisms
Simple morphology (no cytoskeleton)
Circular genomes
Can contain plasmids
Little non-coding DNA
In bacterial genome replication, the lack of a nucleus means...
Replication, transcription and translation co-occur, therefore more genes are encoded on the leading strand (avoid molecular collisions between RNA polymerase and replication)
E. coli fun facts?
1997 - completely sequenced
4.6 Mb
4288 genes
87.8% coding
Genes are small (less than 1000 bp)
Lab strain closely related to highly pathogenic strains!
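The figures above can be cross-checked with quick arithmetic. This is a back-of-the-envelope sketch that assumes the whole 87.8% coding fraction is spread evenly over the 4288 genes:

```python
genome_bp = 4.6e6        # 4.6 Mb
gene_count = 4288
coding_fraction = 0.878  # 87.8% coding

coding_bp = genome_bp * coding_fraction
mean_gene_bp = coding_bp / gene_count
genes_per_kb = gene_count / (genome_bp / 1000)

print(round(mean_gene_bp))     # average coding length per gene, in bp
print(round(genes_per_kb, 2))  # gene density, genes per kb
```

The average comes out a little under 1000 bp, consistent with the "genes are small" point, at roughly one gene per kb.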
What is the difference between lab E. coli and pathogenic E. coli?
Pathogen genome larger and contains more genes (acquired by lateral transfer)
-Have toxins, adhesion factors, invasion factors, capsule and maybe plasmids
What are eukaryotes?
Organisms with nucleus and membrane-bound organelles
-includes amoeba and paramecium (fka Protista)
Origin of eukaryotic cell?
Symbiosis of archaea inside bacteria
What is the fusion hypothesis?
Origin of eukaryotes (archaea-bacteria fusion)
-Archaea-like genes retained for informational processes
-Eubacteria-like genes retained for operational processes
What are some hypotheses for the origin of Eukaryota?
Fusion
Engulfment
Symbiosis
What is Endosymbiotic theory?
(Lynn Margulis - UMass Amherst)
Theory for the origin of mitochondria and plastids (resemble bacteria in size and shape)
-Contain small circular genome
-Have their own ribosomes similar in size and seq to bacterial ribosomes
Animal mt genomes are ...
Plant mt genomes are ...
Small (13 protein-coding genes in humans)
Larger, 50-60 genes
What does a reduced genome (size) indicate?
Symbiotic/parasitic relationships where genes were contributed to the nuclear genome and lost in the symbiont (think α-proteobacterial endosymbiont --> mitochondrion)
How many times did multicellularity evolve in eukaryotes?
25 times!
-A key innovation to Eukaryota