193 Cards in this Set

  • Front
  • Back
prokaryote
mRNA is an exact replica of the gene; transcription and translation occur in the same compartment; transcription and translation of mRNA are coupled.
eukaryote
mRNA is not an exact replica of the gene; transcription takes place in the nucleus, translation in the cytoplasm; nuclear mRNA precursors (pre-mRNA) undergo RNA processing (capping, splicing, polyA addition) --> mRNAs that are transported to the cytoplasm for translation; transcription and mRNA processing events are coupled.
how much of each type of RNA are included in eukaryotic cell?
cytoplasmic RNAs: ribosomal RNA - 80%, transfer RNA - 15%, messenger RNA - 3%
nuclear RNA: precursor RNAs, small RNAs - 2%
discovery of mRNA processing
found that mRNA is synthesized as a longer precursor in the nucleus. the larger class of nuclear RNAs, referred to as heterogeneous nuclear RNA (hnRNA), has many different sizes. mRNA associated with cytoplasmic polyribosomes was smaller than hnRNA. breakthrough discoveries: mRNA and hnRNA were found to contain 5' caps and 3' polyA tails = unique tags marking mRNA and its nuclear RNA precursors (now called pre-mRNA)
RNAs separated by Rate-Zonal sedimentation thru sucrose gradient and pulse chase
pulse chase: 30 min pulse label, actinomycin D to block transcription, 4 hr chase, and isolate and analyze hnRNA and cytoplasmic mRNA in sucrose gradients. showed a lot of mRNA doesn't come out of nucleus. mRNAs have smaller molecular size than hnRNA.
capping on mRNA
occurs immediately after transcription initiates. new pre-mRNA is capped when only 20-40 nt long. cap has 7-methylguanosine in an unusual 5'-5' triphosphate linkage. addition of 5' cap requires only 3 proteins: RNA triphosphatase, guanylyltransferase, and a methyltransferase that adds the methyl group. the methylguanosine in triphosphate linkage gives resistance to nucleases, so the RNA doesn't get degraded.
signals for PolyA Addition
RNA cleavage exposes polyA addition site. transcription terminates 3' of polyA addition site. polyA signals flank the pre-mRNA cleavage site. pre-mRNA is cleaved and polyA is synthesized enzymatically on the new 3' end by polyA polymerase (PAP). polyA tails are not encoded in DNA; they are added post-transcriptionally, synthesized by PAP in a complex composed of 12 proteins. all euk mRNAs contain 3' polyA except histone mRNAs. then RNA splicing will occur. transcription doesn't always reach the polyA site before splicing begins; sometimes it takes too long.
functions of 5' cap and 3' polyA
resistance to degradation, transport out of nucleus, translation initiation, mRNA stability and turnover. determine exact function in lifetime. molecular biologists use polyA tail to purify mRNA away from rRNAs and tRNAs. polyA is used in cloning mRNA.
conclusions: eukaryotic mRNA processing
mRNA is derived from much larger hnRNA (pre-mRNA). 5' caps and 3' polyA are added first on hnRNA and are preserved through mRNA processing and transport to cytoplasm. size difference between pre-mRNA and mRNA later shown to be due to removal of internal RNA sequences (called introns). introns first identified in 1977 when "split" gene structures were discovered in eukaryotes. split genes are composed of exons (expressed sequences) and introns (intervening sequences). exons in pre-mRNA are joined together by RNA splicing, which precisely removes introns --> mRNA.
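A minimal sketch of "splicing precisely removes introns and joins exons", assuming the intron coordinates are already known. The toy sequences and the `splice` helper are made up for illustration; real splice-site recognition is done by the spliceosome, not by index lookup.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as half-open (start, end) index pairs and join the exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # keep the last exon
    return "".join(exons)


# Hypothetical toy pre-mRNA: exon 1 + intron (starts GU, ends AG) + exon 2
exon1 = "AUGGCCAG"
intron = "GUAAGUCCUACUAACUUUUCCCUUCAG"
exon2 = "GUUCUGA"
pre_mrna = exon1 + intron + exon2

intron_start = len(exon1)
intron_end = intron_start + len(intron)
mature_mrna = splice(pre_mrna, [(intron_start, intron_end)])
print(mature_mrna)
```

Shifting either coordinate by even one nucleotide changes the joined sequence, which is why the cut has to be exact.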
5' untranslated region
5'UT extends from 5' cap to first AUG codon where translation is initiated
coding region
multiple codons in open reading frame that specify the amino acid sequence of one polypeptide (monocistronic)
3' untranslated region
(3'UT) follows the STOP codon at the end of ORF - 3'UT contains AAUAAA polyA signal and multiple stop codons. also contains sequences that control mRNA stability and translation via small regulatory RNA (eg miRNA) binding.
eukaryotic genes vary enormously in both their size and number of exons and introns.
average human gene is 28,000 bp long and contains ~9 exons separated by ~8 introns. RNA splicing must occur with exact precision without the loss or addition of even a single nucleotide in exons.
4 conserved sequences for RNA splicing
5' splice site: AG][GU(A/G)AGU (exon][intron)
branch site A (in intron)
py (polypyrimidine) tract (in intron)
3' splice site: AG][G (intron][exon)
what interacts with splice sites?
small nuclear RNAs.
snRNA and splice sites
U1 snRNA base pairs to 5' splice site.
U2 snRNA base pairs to intron branch site and displaces BBP (at the end)
U6 replaces U1 at 5' splice site and base pairs to U2 to bring 5' splice site and branch site together.
self splicing
1st transesterification reaction: OH 2' on branch site forms a bond with Phosphorus at 5' splice site. the phosphorus breaks bond from G of 5' splice site in 5' exon.
2nd transesterification reaction: 3' OH of the 5' exon attacks the phosphorus at the 3' splice site of the intron, and the phosphorus breaks away from the guanine next to it. now you have spliced exons and an intron lariat (degraded immediately). splicing requires lots of energy, so needs ribozyme, RNA-enzyme, etc.
spliceosomes carry out mRNA splicing
done by the spliceosome (made of RNA and proteins).
composition: 5 snRNPs (small nuclear ribonucleoprotein particles, "snurps"). uridine-rich small nuclear RNAs: U1, U2, U4, U5, U6 snRNAs. 150 proteins.
roles of snRNAs in splicing
U1 binds 5' splice site, U2 binds intron branch site A
U4-U5-U6 complex brings 5' site and branch site A together. U6 catalyzes 5' splice site cleavage and lariat formation (displaces U1). U5 mediates 3' splice site cleavage and joining of 5' exon to 3' exon. snRNP functions are carried out by the spliceosome's uridine-rich snRNAs, rather than by proteins.
formation of spliceosome
binding of U1 to 5' splice site, BBP to branch point, U2AF to 3' polypyrimidine. release of BBP and binding of U2 to branch point. Binding of U4/U6, U5 and release of U2AF
splicing events
rearrangement releases U1 from 5' site and U4 from U6, U6 binds U2. U6 then catalyzes first transesterification reaction -> lariat. U5 assists in the second transesterification reaction to join exons.
capping, RNA splicing, and polyA addition are tightly coordinated with transcription
transcription and RNA splicing are coupled, which enhances efficiency and accuracy of splicing. spliceosome snRNPs are loaded on the "tail" (CTD) of RNA polymerase II. these snRNPs bind first to the 5' splice site in transcribed RNA. the next transcribed 3' splice site is usually used in splicing, thus lowering the chance of exon skipping. RNA sequences located in exons called Exonic Splicing Enhancers (ESE) and Exonic Splicing Suppressors (ESS) bind SR (Ser/Arg)-rich nuclear proteins that direct recruitment of spliceosome to splice sites. SR binding to ESS can inhibit splicing --> alternative splicing (eg exon skipping)
errors in RNA splicing
estimated that ~15% of mutations causing genetic diseases and cancer are due to mutations that affect RNA splicing. Most splicing errors due to changes in splicing sites. examples of splicing errors: exon skipping (skips one of the exons along with the introns) and pseudo splice-site selection (looks like splice site, but isn't, so it doesn't see one of the exons).
Simple splicing
one gene-> one mRNA.
alternative splicing
one gene -> multiple mRNAs: it generates different mRNAs from the same pre-mRNA; this produces different proteins from a single eukaryotic gene. Different permutations, like exons 1&3 only, or 1&4 only. 2 types: constitutive: two or more splice variant mRNAs are always made. Regulated: splice variant mRNAs are made only in certain cell types or at certain times of development. >70% of human genes are alternatively spliced to make 2-3 diff mRNA species. it's imp for increasing functional capability of eukaryotic genomes. occurs in higher euks, not in unicellular.
first example of alternative RNA splicing in euk cell gene
antibody heavy chain mRNAs. secreted H-chain mRNA (code for H-chains in antibodies secreted from B-lymphocytes) and membrane H-chain mRNA (code for H-chains in antibodies on membranes of B-lymphocytes)
example of alternatively spliced mRNAs and proteins.
Drosophila DSCAM gene: 38,016 alt spliced mRNAs and proteins...
what is advantage of having introns that need to be eliminated before mRNA can be translated?
original proposal: introns allow evolution to proceed at increased pace because exons with flanking splice sites are highly portable - presumes that introns are merely spacers.
-exons often encode functional protein domains
-alternative splicing allows a variety of related proteins to be synthesized from a single gene
-exons from different genes can be rearranged to generate new genes and proteins of new function
-exon duplication increases protein complexity and expands function of genes.

alternative proposal: introns function in controlling gene expression & the complexity of eukaryotic genomes.
what do you need to transport mRNAs out of nucleus?
PAB (3' polyA binding protein) protects mRNA from 3'->5' exonucleases and CBC (5' cap binding complex, aka eIF4) blocks 5'->3' exonucleases. they're RNA binding proteins and exon binding proteins. they're needed for transport; the proteins are recycled back into the nucleus and attach to more mRNA. Correctly spliced mRNAs with 5' caps and 3' polyAs are transported via nuclear pores to cytoplasm. hnRNA fragments, excised introns, and incorrectly spliced RNAs are retained and degraded in nucleus. GTP hydrolysis drives mRNA thru nuclear pores. mRNA can't go out of nucleus if no cap or tail.
efficiency in translation
single mRNA can be simultaneously translated by multiple ribosomes - polyribosome or polysome.
mRNA -> protein
3 base code (4^3 = 64 codons possible). in vitro translation of synthetic homopolymer RNAs was tested: proteins came out, and mRNA and protein correspondence was studied. alternating bases gave a protein of two alternating amino acids. 61 codons encode 20 amino acids, plus 3 stop signals. universal code used in essentially all organisms of all 3 kingdoms of life on earth (archaea, bacteria, and eukaryotes).
synonyms
more than one codon specifies the same amino acid
AUG
methionine initiates all proteins
name stop codons!!!!
UAA, UAG, UGA
genetic code has evolved to minimize deleterious effects of mutations in codons
mutations in first position often give a similar (if not the same) amino acid. mutations in second position that maintain purines (A,G) or pyrimidines (U,C) continue to specify a similar amino acid. mutations in third position rarely change the amino acid.
mutations
silent mutations, missense mutations, nonsense mutations, frameshift mutations.
silent mutation
change a codon for an amino acid to another codon for the same amino acid - do not change protein sequence.
missense mutations
change the codon for one amino acid to a codon for another amino acid: result in a single amino acid change in the protein.
nonsense mutations
change the codon for an amino acid to a stop codon: result in premature termination of the protein at the stop codon.
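A rough sketch of how the three point-mutation types above can be told apart with the standard genetic code. The function name `classify_point_mutation` is made up for illustration, and the compact table string assumes codons ordered with bases U, C, A, G at each position.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_point_mutation(original_codon: str, mutant_codon: str) -> str:
    """Label a single-codon substitution as silent, nonsense, or missense."""
    before = CODON_TABLE[original_codon]
    after = CODON_TABLE[mutant_codon]
    if before == after:
        return "silent"      # same amino acid, protein sequence unchanged
    if after == "*":
        return "nonsense"    # amino acid codon became a stop codon
    return "missense"        # one amino acid replaced by another
                             # (a stop -> amino acid change also lands here in this sketch)


print(classify_point_mutation("GAA", "GAG"))  # Glu -> Glu
print(classify_point_mutation("GAA", "GUA"))  # Glu -> Val
print(classify_point_mutation("UAC", "UAA"))  # Tyr -> stop
```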
frameshift mutation
when inserted or deleted bases change the open reading frame of mRNA: such shifts in the ORF alter the remainder of the protein sequence. frameshifts result from the insertion or deletion of 1 or 2 bases (any number not divisible by 3), but not 3 bases.
three rules govern the genetic code
1. codons in mRNA are read 5' -> 3'.
2. codons are non-overlapping and there are no gaps between codons in mRNA.
3. mRNA is translated in a fixed reading frame set by the initiation or start codon (AUG)

translation in both prokaryotes and eukaryotes initiates with an AUG, which encodes methionine. this is the START of translation, though additional signals are also required.
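The reading-frame rules can be sketched as a toy translator: start at the first AUG, read non-overlapping triplets 5' -> 3', stop at a stop codon. This is a simplification (it ignores the additional initiation signals mentioned above), and the example mRNAs are invented. It also shows why inserting 1 base shifts the frame while inserting 3 does not.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG in a fixed frame until a stop codon or the end."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated in this sketch
    protein = []
    for i in range(start, len(mrna) - 2, 3):  # non-overlapping codons, no gaps
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical mRNAs: 2-nt 5'UT, then AUG GCU UCU GAA UAA
wild_type = "GG" + "AUG" + "GCUUCUGAAUAA"
one_base_insertion = "GG" + "AUG" + "C" + "GCUUCUGAAUAA"      # shifts the frame
three_base_insertion = "GG" + "AUG" + "CCC" + "GCUUCUGAAUAA"  # frame preserved

print(translate(wild_type))
print(translate(one_base_insertion))
print(translate(three_base_insertion))
```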
purpose of tRNAs
adaptor molecules: each amino acid is attached to its specific adaptor. adaptors read the codons in mRNA by base pairing. deliver amino acid specified by the codon.

they read mRNA codons and deliver specific amino acid specified by each codon.
what are stop codons read by?
stop codons are not read by tRNAs - instead they are recognized by proteins that terminate translation.
characteristics of tRNA
70-80 nucleotides long. 30-40 diff species in bacteria. 50 diff species in euk. all tRNAs fold into highly similar tertiary structures. 3' ends of all tRNAs contain CCA which is added enzymatically (not transcribed from DNA). tRNA synthetases covalently attach specific amino acids to the 3' CCA of tRNAs. uncharged tRNA: tRNA without an amino acid. charged tRNA: tRNA with covalently attached amino acid.
secondary structure
resembles cloverleaf; appropriate amino acid is covalently attached to the acceptor arm. tRNA base pairs with mRNA at the anticodon loop. CCA at top. tRNAs are made as precursors transcribed from genes, cleaved to expose the end for enzymatic CCA addition, then certain bases are modified enzymatically. unmodified tRNAs function in translation, but E.coli without tRNA modifications grow poorly. modifications likely functional.
charging of tRNA
catalyzed by tRNA synthetases: amino acids attached to 3'-CCA-OH ends present on all tRNAs. two-step reaction requiring ATP: 1. ATP adenylylation: activation of amino acid.
2. transfer of amino acid to CCA-OH of tRNA. one tRNA synthetase for each amino acid (20 synthetases). same synthetase charges all tRNAs that carry the same amino acid (called isoaccepting tRNAs, have diff anticodons). accuracy of protein synthesis is absolutely dependent upon tRNA synthetase recognition of the specific amino acid and its charging of the correct tRNA - no proofreading is carried out by ribosome. tRNA structures recognized by synthetase: acceptor stem (major determinant), discriminator base (4th base from 3' end of CCA acceptor stem), anticodon loop.
recognition of tRNA by aminoacyl-tRNA synthetase
anticodon loop contributes to the recognition of tRNA (synthetase must recognize all isoaccepting tRNAs with diff anticodons for the same amino acid). synthetase has a pocket for the anticodon loop. tRNA acceptor stem plays major role in recognition. discriminator base: 4th base from 3' acceptor end. multiple synthetase contacts on acceptor stem.
3-dimensional tRNA structures differ significantly in only 2 places:
anticodon loop and 3' acceptor stem.
discriminator base is critical determinant for amino acyl synthetase recognition.
2 steps in charging of tRNAs by synthetases
1st step: adenylylation: ATP + amino acid -> aminoacyl-AMP + PPi
2nd step: transfer: aminoacyl-AMP + tRNA -> aminoacyl-tRNA + AMP (look at slides)
wobble pairing
there are 64 codons in the genetic code: 61 specify amino acids and 3 are stop codons. bacteria contain 30-40 tRNAs, eukaryotic cells contain only about 50 tRNAs. therefore, a single tRNA needs to be able to recognize more than one codon. this is possible because unusual base pairings are allowed between the third position (3' base) of mRNA codons and the first position (5' base) of tRNA anticodons. so the anticodon of one tRNA can bind to 2 or 3 diff bases in the third position of the mRNA codon.
the pairing combinations with wobble concept
5' base in tRNA anticodon first, then the 3' base(s) in mRNA codon it can pair with...

G : U or C
C : G
A : U
U : A or G
I : A, U, or C
(I, inosine, is a base found only in tRNAs)
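The pairing table above can be turned into a small lookup, as a sketch. The function name `codons_read_by` and the example anticodons are chosen for illustration; positions 2 and 3 of the anticodon are assumed to use ordinary Watson-Crick pairing only.

```python
# Wobble position: 5' base of the anticodon -> allowed 3' bases of the codon
WOBBLE = {"G": "UC", "C": "G", "A": "U", "U": "AG", "I": "AUC"}
# Standard pairing for the other two positions
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codons_read_by(anticodon: str) -> set[str]:
    """Return the mRNA codons (5'->3') that one anticodon (5'->3') can read."""
    wobble_base, middle_base, last_base = anticodon
    # Pairing is antiparallel: codon base 1 pairs with anticodon base 3.
    codon_first = WATSON_CRICK[last_base]
    codon_second = WATSON_CRICK[middle_base]
    return {codon_first + codon_second + third for third in WOBBLE[wobble_base]}


print(sorted(codons_read_by("GAA")))  # one tRNA reads two Phe codons
print(sorted(codons_read_by("IGC")))  # inosine lets one tRNA read three Ala codons
print(sorted(codons_read_by("CAU")))  # Met anticodon reads only AUG
```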
why has evolution restricted the genetic code to only 20 amino acids and is the code really universal?
1. all 3 kingdoms of life use same genetic code (diverged from common ancestor)
2. a few organisms that utilize the universal code also show variant codon use (green algae use stop codons, UGA & UAA to encode Gly)
3. some archaea and bacteria also use UGA, UAG to code for two nonstandard aa, selenocysteine and pyrrolysine.
4. mito in euk basically use the universal code but also have several diff codon assignments.
5. genetic code has been experimentally expanded to incorporate over 50 novel modified and non-natural aa into proteins in e.coli. requires engineering and selecting cognate pairs of a tRNA that recognizes the stop codon, UAG, and an amino acyl synthetase that charges that tRNA with the unusual aa.
what are ribosomes and different parts?
protein synthesis machines: cell free extract was fractionated using rate zonal centrifugation. prokaryotic ribosomes (70s) made up of two subunits (50s and 30s). (not truly based on mass, also on volume and density, so that's why 30+50 does not equal 70)
prok vs euk ribosomes
euk: 80s with 60s and 40s. RNA and protein bigger.
prok: 70s with 50s and 30s.
functions of ribosomal subunits:
small subunit: mediates interaction b/w mRNA codons and tRNA anticodons. reads message, interacts with mRNA more.
large subunit: catalyzes peptide bond formation. binding sites for G-proteins (GTP-binding proteins) that assist in initiation, elongation and termination. catalysis of polypeptide formation.
rRNA mediates protein synthesis in ribosomes
ribosome composition = 50% protein and 50% RNA in both prok and euk. 16s rRNA (in 30s subunit) mediates mRNA codon interactions with tRNA anticodons in "decoding center." tRNA makes direct contacts with 16s and 23s rRNAs. peptide bond formation is catalyzed by 23s rRNA (in 50s subunit) in "peptidyl transferase center." 80s euk ribosome carry out same functions as prok ribosomes.
prok translation is coupled to transcription.
each mRNA is translated by multiple ribosomes at the same time. in prok, transcription and translation happen together at about the same speed. in euk, no translation happens in the nucleus, and it is slower.
ends of proteins
N-terminus and C-terminus. amino group of the next amino acid to be added attacks the carboxyl C of the growing polypeptide chain, so that the polypeptide becomes covalently linked to the aminoacyl-tRNA. the tRNA that previously held the peptide (the peptidyl-tRNA) is now uncharged and exits from ribosome.
peptide bond formation
substrates: charged tRNAs.
energy: breaking the amino acyl tRNA bond releases more energy than making the peptide bond takes.
making a peptide bond between two free amino acids is energetically unfavorable.
ultimately, the energy for peptide bond formation is derived from the ATP used in tRNA charging reaction.
tRNA binding sites on ribosomes
A site: aminoacyl tRNA will enter at this site.
P site: new peptide bonds are formed at this site with aminoacyl tRNA. (P for peptidyl-tRNA)
E site: transiently occupied by the uncharged tRNA that is leaving ribosome.
tRNAs in A and P sites make contact with codons in mRNA
"kink" in mRNA promotes codon recognition by tRNAs.
translocation
when 1 codon moves in 5' to 3' direction when peptide bonds are made. exposes another codon in A site.
important features of protein synthesis in addition to ribosomes
proteins factors are required for phases of translation:
prokaryotic translation factors:
-initiation factors (IF-1, 2, and 3)
-Elongation factors (EF-Tu, EF-Ts, EF-G)
-Termination factors (RF-1, RF-2, RF-3-GTP)
GTP hydrolysis for conformational changes and regulation
ATP provides energy for peptide bond formation (from tRNA charging)
rRNAs play major roles in protein synthesis
rProteins mainly structural.
Three phases of translation
initiation: formation of initiation complex
elongation: peptide bond formation, translocation
termination: release of completed polypeptide, dissociation of ribosomal subunits.
how does cell distinguish between AUGs found in the middle of a coding region from those at the beginning of the gene?
initiation takes place only at AUGs with nearby ribosome binding sites. in prok, the Shine-Dalgarno (S-D) sequence (ribosome binding site), 5 to 10 nucleotides upstream of AUG on mRNA, forms base pairs with 16s rRNA. it's conserved, so 16s rRNA recognizes it and binds.
prok initiator tRNA
two types of tRNA^met present in bacterial cells recognize AUG codons. one tRNA^met is used to incorporate methionine within the growing protein chain. the other (initiator) tRNA^met is used to initiate protein synthesis. both tRNA species are charged with met by the same synthetase. in bacterial cells, the initiator met-tRNA^met is further modified by the addition of a formyl group. after protein synthesis the formyl group is frequently removed; often the met is removed from the N terminus as well. N-formylmethionyl tRNA^met resembles a peptide bond; presence of the formyl group allows binding directly to P site.
goal of initiation?
bind mRNA on ribosome such that AUG is in the P-site, bound to an initiator MET-tRNAi.
what do IFs do?
promote correct initiation on 30 s subunit.
peptide bond formation
23s rRNA catalyzes peptide bond formation 23s rRNA is a ribozyme. ribosome stripped of protein can still catalyze peptide bonds. no protein near active site (peptidyltransferase center).

each cycle of peptide bond formation in elongation uses 2 GTP and one ATP.
-ATP for charging tRNA by tRNA synthetase before ribosome and mRNA
-GTP for EF-Tu delivery of charged tRNA to A site
-GTP for EF-G translocation of peptidyl-tRNA and mRNA into P-site thereby positioning next mRNA codon in open A-site.
termination of protein synthesis
stop codon is recognized on mRNA in A site by protein termination factors (called Release Factors, RF). RFs trigger hydrolysis of polypeptide from tRNA in P site; completed protein released from ribosome. tRNAs and mRNA released, ribosome dissociates into subunits. GTP hydrolysis required for termination. prokaryote RFs: RF1 recognizes UAG and UAA. RF2 recognizes UGA and UAA. RF3 stimulates release of RF1 and RF2. stop codons recognized by 3 aa "anticodon" peptide sequences in RFs (via RF protein/mRNA interaction)
translation in eukaryotes
genetic code: identical
tRNAs: identical (no formyl-MET-tRNAi).
ribosomes: larger, but identical functions
factors: more, but similar functions
processes: significant difference in initiation, otherwise overall similar processes (don't use formyl-met, use methionine).
elongation similar to prokaryotes
euk termination requires more factors and is coordinated by protein factors that interact with 3'UT of eukaryotic mRNAs.
where does euk mRNA translation initiate?
at first AUG after 5' cap. (kozak box: facilitates translation, mRNA translates more efficiently)
euk translation initiation
euk initiator Met-tRNAi binds small 40S subunit before mRNA
binding of Met-tRNAi on 40S subunit forms 43S pre-initiation complex with Met-tRNAi in P-site. 43S pre-initiation complex now ready to bind 5' mRNA cap-bound eIF4 complex. (methionine added and attached before ribosome comes)
euk initiation factor complex (eIF4F complex aka Cap Binding Complex, CBC) does what??
is bound to 5' cap, and the eIF4 helicase (eIF4A, assisted by eIF4B) "melts" secondary structure in the adjacent mRNA sequence. required to get it out of nucleus. need it for ribosome to bind (need to iron out the hairpin).
43S pre-initiation complex then binds 5' end of mRNA with bound eIF4 complex. this complex then scans 5' to 3' along the mRNA to find the first AUG. at which point initiation factors dissociate and the large 60S subunit binds to form the complete 80S ribosome. open A-site in the 80S ribosome now ready for incoming aminoacyl tRNA-eEF1-GTP.
efficiency in euk initiation
euk mRNAs are held in circles that promote rapid re-initiation of translation. proteins bound to mRNA at 5' cap and at 3' polyA interact. as soon as it's done, start over.
examples of nonsense mutations:
Duchenne muscular dystrophy and cystic fibrosis (can't clear mucus from lungs, muscles die). drug: gets ribosome to ignore (read through) the first UAA. how does it know to stop at the second one? because stop codons are usually repeated in mRNA.
what is rate limiting step in translation?
initiation. prok mRNAs are mostly short-lived and regulation at level of translation is rare. euk mRNAs are mostly long-lived and translation control is imp regulatory mechanism: inhibition of translation initiation and mRNA turnover (degradation or decay).
translation initiation as well as euk mRNA stability and turnover are controlled by 5' cap and 3' polyA. euk mRNA degradation/turnover pathways:
-mRNA without cap degraded by 5' ->3' exonuclease
-mRNA without 3'polyA degraded by 3'->5'exonuclease
variations on the central dogma of molecular biology from...
viruses
agents causing diseases
studies of viruses have provided important discoveries in biological processes like: -RNA processing, splicing, polyA and -tumor viruses, oncogenes, and cancer.
provide useful tools for molecular biology.
life cycle of viruses in infected cells.
viruses have few genes and must rely on infected host cells for energy, enzymes and biosynthetic processes
viruses usually only infect specific host cells with interactive surface receptors.
first step in infection is attachment to host cell with either virus penetration and release of viral nucleic acid or injection of viral nucleic acid into cell.
second step is viral replication inside infected cell
third step is release of progeny virus from cell.
diff types of viruses have diff structures...
viral genomes can be DNA or RNA, single or double stranded, linear or circular, many diff mechanisms used for replication and expression.
viruses vary in structures some more...
virion (viral particle)
viral nucleocapsid: composed of nucleic acid (genome) and protein coat (capsid)
virus particles are organized in only two basic ways: *rod-shaped viruses are composed of capsid proteins surrounding the nucleic acid, forming a helical tube. *icosahedral (20 sided) nucleocapsids that appear spherical. some are surrounded by a lipid bilayer (envelope)
tobacco mosaic virus
plant virus
rod-shaped helical viral particles
contain single stranded RNA
first virus to be crystallized.
influenza virus
8 diff single-stranded RNA molecules (segmented RNA genome)
helical nucleocapsid
enveloped in lipid bilayer membrane containing viral and cell glycoproteins.
non-uniform shaped due to lipid envelope derived from infected cells
viral protein spikes protrude from envelope. two surface virion proteins, hemagglutinin (H) and neuraminidase (N) define flu strains (like H5N1) and are critical for flu virus infectivity.
bacteriophage and example
viruses that infect bacteria.
example: small icosahedral E.coli phage.
contain circular single-stranded DNA
first DNA to be completely sequenced.
very simple
another example, more complicated
T2, complex E.coli phage.

icosahedral head with rodlike tail, baseplate, and tail fibers. linear double stranded DNA. molecular syringe!
it shoots out DNA from bottom.
Lytic replication (in T4)

5 circular processes
1. adsorption/injection into bacteria
2. expression of viral early proteins
3. replication of viral DNA, expression of viral late proteins.
4. assembly
5. lysis/release of free virion!
bacteriophage plaques on a lawn of bacteria
each plaque originates from a single cell infected with a single virus.
all viruses in a plaque are identical to the parental virus and constitute a clone.
lysogeny
cell does not burst and release virus. instead the phage DNA integrates into the bacterial chromosome (as a prophage) and is replicated along with the bacterium each time it divides.
4 different kinds of animal virus infections:
transformation of normal cells to tumor cells, lytic infection, persistent infection, and latent infection
transformation of normal cells to tumor cells
virus goes into cell, transforms it into a tumor cell, tumor cell division... (like retroviruses, HPV)
lytic infection
penetration of virus, multiplication of virus within cell, death of cell and release of virus (like adenoviruses, polio, and flu)
persistent infection
virus penetrates, multiplication of virus within cell, then slow release of virus without cell death (hepatitis)
latent infection
penetration of virus into cell, multiplies just a little, and no more virus production, so don't grow. goes into cell, silent. can't tell even with virus scans. when activated, bad. like herpes or HIV.
viruses and other infectious agents cause how much of all human cancer cases?
~17%
HPV
human papillomaviruses: responsible for majority of ~11000 new cases of cervical cancer diagnosed each year in US
spread sexually
vaccine now FDA approved (gardasil)
vaccine protects against two HPV strains that cause 70% of cervical cancer (also against 2 HPV strains that cause genital warts)
major breakthru for preventing human cancer.
public health programs for mandatory vaccination of young women controversial
HPV also infects men, strongly implicated in genital cancers - Gardasil vaccine now being tested in men.
plaque assays are used to determine ...
the number of infectious particles, also can be used to isolate pure virus clones
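A small worked example of the usual titer arithmetic behind a plaque assay, assuming each plaque comes from one infectious particle. The formula (plaques divided by dilution times volume plated) is the standard calculation rather than something stated on the card, and the numbers are invented.

```python
def titer_pfu_per_ml(plaque_count: int, dilution: float, volume_plated_ml: float) -> float:
    """Plaque-forming units per mL of the undiluted stock."""
    return plaque_count / (dilution * volume_plated_ml)


# Hypothetical plate: 42 plaques from 0.1 mL of a 10^-6 dilution
example_titer = titer_pfu_per_ml(42, 1e-6, 0.1)
print(f"{example_titer:.2e} PFU/mL")
```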
how do virions bind to specific cells?
protein-protein interactions between virion proteins and specific receptor proteins on the surface of the target cells.

so... host range (specificity of infection) of viruses is determined by these interactions
how do viruses replicate their genomes?
dsDNA viruses can use host enzymes, dsDNA->dsDNA.

RNA viruses must encode their own replication enzymes (some viruses even carry their polymerase in the virion).
RNA-directed RNA polymerase (like flu): RNA->RNA
reverse transcriptase: RNA -> DNA (retroviruses)
reverse transcriptase
first, the virion contains 2 copies of the viral RNA genome and 2 copies of reverse transcriptase.
using the RNA template, it makes cDNA, with a tRNA as primer (CCA end of tRNA). after that, the RNA template is degraded by RNase H. the bits of RNA template that were not completely degraded by RNase H serve as primers to make the second DNA strand.
and then integration into host cell DNA
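At the sequence level, the two synthesis steps can be sketched as complement operations. This toy version ignores the tRNA primer, RNase H, and the template jumps of the real mechanism; the function names and the 6-nt example are made up.

```python
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(rna: str) -> str:
    """cDNA complementary to the RNA template, written 5'->3'."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna))


def second_strand(cdna: str) -> str:
    """DNA strand complementary to the first cDNA strand, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(cdna))


viral_rna = "AUGGCU"  # hypothetical fragment of an RNA genome
cdna = first_strand_cdna(viral_rna)
plus_strand = second_strand(cdna)
print(cdna, plus_strand)
```

The second strand has the same sequence as the original RNA with T in place of U, which is the double-stranded copy that gets integrated.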
HIV proviral genome
DNA->RNA->packed into virion or translated into polyprotein.
then protease cleavage to yield viral proteins.
but what about HIV, which uses reverse transcriptase? what are some reverse transcriptase inhibitors and why are they not very effective?
reverse transcriptase inhibitors: nucleoside chain terminators (AZT and ddI)
non-nucleoside inhibitors (bind RT and block activity)
new drug: can activate latent ones to kill them...

but not very effective because HIV kills cells that stimulate immune responses so it's hard to find vaccine. virus mutates fast and changes.
effectiveness of reverse transcriptase and RNA-directed RNA synthesis.
RNA genomes are more prone to accumulate mutations because reverse transcriptases and RNA-directed RNA polymerases do not proofread products -> resistance to anti-viral therapies, produce mutant viruses in diseases.
what is the largest known virus
mimivirus: 400 nm diameter, icosahedral virion, infects amoeba. 1.2 Mb, double-stranded DNA genome, 1300 putative genes.
major development of 60s
discovery of plasmids, hybridization (allowed cloning)
70s
development of plasmid, phage vectors. restriction enzymes (allow cloning),DNA sequencing (cut up DNA in manageable pieces)
80s
PCR
90s
genomics
00s
bioinformatics and systems biology
first recombinant DNA cloning experiment in history
EcoRI digestion of two circular plasmids, joining of the plasmid DNA fragments via ligase, and a hybrid created. Vector with new properties. 2 drug resistances in vector (identify the rare cell that has taken up DNA in transformation by virtue of the fact that it is now tet and streptomycin resistant)
problems with studying genes before recombinant DNA technology
genes are all connected in vast DNA molecules, not separate entities like proteins. DNA fragments have same shape and charge to mass ratio, so cannot be easily distinguished from one another. each gene is a very small fraction of entire genome. to attempt to isolate 1 mg of human gene, need to start with approx. 100 mg of genomic DNA.
discovery of primitive immune system: restriction/modification systems in prokaryotes
pairs of enzymes that recognize specific sequences in DNA: modification enzyme (methylase) adds a methyl group to one of the nucleotides at this specific site. restriction enzyme will cleave any DNA that is not methylated at this specific site. foreign DNA (bacteriophage) is not methylated. restriction enzyme (endonuclease) cleaves the foreign DNA at the restriction site, rendering it susceptible to other nucleases (exonucleases). hence, the restriction/modification system functions to protect bacteria from viruses and other foreign DNAs.
hemimethylated
still resistant to cleavage!
restriction enzymes
bacterial enzymes that recognize specific symmetric 4 to 8 base pair sequences in double stranded DNA (palindromes).
cleave both DNA strands at the restriction sites.
cut DNA into a reproducible set of fragments called restriction fragments.
aka restriction endonucleases
hundreds of restriction enzymes with different target sequences now available
essential tools for molecular bio (mapping, cloning, sequencing)
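A minimal sketch of what "palindrome" means for a restriction site: the site reads the same 5'->3' on both strands, i.e. it equals its own reverse complement. The EcoRI (GAATTC) and BamHI (GGATCC) recognition sequences are used as examples; the function names are made up for illustration.

```python
# Sketch: a restriction site is palindromic if it equals its reverse complement.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the 5'->3' sequence of the opposite strand."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(site: str) -> bool:
    """True if both strands read the same in the 5'->3' direction."""
    return site.upper() == reverse_complement(site)


for name, site in [("EcoRI", "GAATTC"), ("BamHI", "GGATCC"), ("random 6-mer", "GATTCA")]:
    print(name, site, is_palindromic(site))
```

This symmetry in the site matches the symmetrical way a homodimeric enzyme binds the DNA.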
kinds of ends and their names
blunt ends (flush ends)
sticky ends: overhang. (cohesive ends) (5' and 3' depending on which side the cut is)
most restriction enzymes are... and what are they?
homodimers: 2 identical subunits that lie along the DNA. interaction with DNA is symmetrical
G^GATCC
A^GATCT
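BamHI (G^GATCC) and BglII (A^GATCT): outside bases are diff, but both make the same GATC sticky end, so fragments cut by one can be joined to fragments cut by the other. the hybrid site made by joining them is not regenerated as either site, so neither enzyme can cut it again.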
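A rough sketch of why G^GATCC and A^GATCT ends are compatible but do not regenerate either site. It models only the top strand as a string, which is a simplification; the example sequences and the `cut` helper are hypothetical.

```python
# Sketch (top strand only): two enzymes that leave the same GATC overhang.
BAMHI_SITE = "GGATCC"  # cuts G^GATCC
BGLII_SITE = "AGATCT"  # cuts A^GATCT


def cut(seq: str, site: str, offset: int = 1):
    """Cut once at the first occurrence of site, `offset` bases into the site."""
    i = seq.index(site) + offset
    return seq[:i], seq[i:]


bam_left, bam_right = cut("TTTGGATCCAAA", BAMHI_SITE)
bgl_left, bgl_right = cut("CCCAGATCTGGG", BGLII_SITE)

# Both right-hand pieces start with the same 4-base overhang.
print(bam_right[:4], bgl_right[:4])

# Join a BamHI-cut left piece to a BglII-cut right piece.
hybrid = bam_left + bgl_right
print(hybrid)
print(BAMHI_SITE in hybrid, BGLII_SITE in hybrid)
```

The joined sequence contains GGATCT, which matches neither recognition sequence.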
isoschizomers
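restriction enzymes with the same recognition sequence. some cut at the same position; others cut at a different position within the site (also called neoschizomers), so one may leave an overhang and the other a blunt end (not interchangeable for cloning).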
which is more efficient? putting together 2 blunt ends or 2 sticky ends?
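2 sticky, because the complementary overhangs base-pair and hold the two ends together while ligase joins the 3'OH and 5'P. blunt ends have the same 3'OH and 5'P but nothing to hold them in place.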
how long should sequence be for cloning or mapping??
6 bp is ideal. the approx cut frequency is once every 4 kb (a site of n bp occurs on average once every 4^n bp, so 4^6 = 4096 bp, approx. 4 kb).
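The 4^n rule can be checked with a few lines of arithmetic. This sketch assumes a random sequence with equal frequencies of the four bases, which real genomes only approximate; the function name is made up.

```python
# Sketch: expected spacing between sites for an n-bp recognition sequence,
# assuming random sequence with 25% of each base.
def expected_fragment_length(site_length: int) -> int:
    """Average distance in bp between occurrences of one specific n-bp site."""
    return 4 ** site_length


for n in (4, 6, 8):
    print(f"{n}-bp site: about one cut every {expected_fragment_length(n)} bp")
```

A 4-bp cutter gives fragments that are often too small to map, and an 8-bp cutter gives very few cuts, which is why 6 bp is a convenient middle ground for cloning and mapping.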
restriction map
map showing the order of restriction sites and the distance between them.
how do you separate DNA? explain the technique, how it's used...
gel electrophoresis. DNA is negatively charged, so it moves from the - electrode toward the + electrode. small DNA fragments move further thru the gel than large fragments.
agarose gels for long pieces of DNA, acrylamide gels for short pieces.
how do you see DNA when it's in a gel?
fluorescent dye, ethidium bromide intercalates between base pairs.
radioactively labeled DNA can be visualized by autoradiography of the gel.
southern blot vs northern
southern for DNA, northern for RNA.
plasmids
circular, double stranded DNA
occur naturally in bacteria, yeast, higher eukaryotes
exist in a parasitic or symbiotic relationship with their host cells
replicate separately from host cell's chromosomal DNA due to presence of plasmid DNA replication origin
contain ancillary genes useful to plasmid hosts.

using plasmids as vectors for recombinant DNA cloning requires the presence of selectable markers and the ability to introduce the recombinant DNA into a cell
3 things you need to use plasmids
1. portion of plasmid that has origin of replication so it replicates itself
2. selectable marker. when you put DNA into cells by those methods, only about 1 in 4000 cells takes up DNA; the other cells don't pick it up. the marker lets the transformed cell live while the others die.
3. portion of plasmid with restriction site that can be used for cloning that doesn't affect function of plasmid to replicate or express selectable marker
3 essential features of plasmid vector
origin of replication
selectable marker (amp)
region into which DNA can be inserted
selectable markers
any gene whose product confers unique properties to the host cell to allow its isolation (cell cloning)
genes whose products are required for cell growth under certain metabolic conditions (lactose)
genes whose products allow cell growth in the presence of inhibitors (antibiotics)
uptake of DNA into cells in transformation is very inefficient. selection allows isolation of rare transformed cells.
antibiotics used as selectable markers
ampicillin (amp) inhibits bacterial cell wall synthesis
tetracycline (tet) inhibits protein synthesis by binding to 30s ribosomal subunit, blocks aminoacyl-tRNA entry into A-site
neomycin (neo) binds ribosomes and inhibits translocation
puromycin (pur) binds 50s peptidyl transferase center, mimics aminoacyl-tRNA in A-site, functions as chain terminator: the polypeptide chain in the P-site forms a covalent bond to puromycin and is released from the ribosome.
versatile plasmid cloning vector
contains polylinker with multiple unique restriction sites for inserting DNA fragments
example of expression of cloned genes encoding eukaryotic proteins in e.coli
granulocyte colony stimulating factor (G-CSF) increases granulocytes (type of white blood cell that fights infections) in cancer patients having chemotherapy
two methods used to express recombinant proteins
1. transient transfection (lasts few days)
2. stable transfection (aka transformation)

cloned genes or cDNAs can be expressed by either transfection method.
recombinant DNA
DNA made from 2 DNA molecules that are joined together.
vector
a DNA molecule that can replicate independently from the host cell genome. examples: phage, plasmids, viruses, and others
insert
DNA that is introduced into the vector
how do you insert DNA into a vector to make recombinant DNA?
ligation of restriction fragments with complementary sticky ends via T4 DNA ligase.
DNA cloning in a plasmid vector
there's a plasmid vector. You add the DNA fragment to be cloned by enzymatically inserting DNA into plasmid vector. then you get a recombinant plasmid. Then you mix e. coli with plasmids in presence of CaCl2; heat pulse. Then you culture on nutrient agar plates containing ampicillin.

Two possible results: transformed cell: survives... or cells that don't take up plasmid die on ampicillin plates (untransformed).

The ones that live (transformed) will go thru cell multiplication. there will be a colony of cells, each containing copies of the same recombinant plasmid. AMPLIFICATION!
5 ways of introduction of DNA into cells
1. treatment of cells with CaCl2 DNA "gel": precipitates DNA onto cell surface... uptake of DNA into bacterial cells
2. electroporation: applying electric current results in transient "pores" in the cell membrane... DNA uptake useful method in eukaryotic cells
3. lipid treatment: DNA packaged into lipid vesicles that fuse with cell membrane, releasing DNA into cells
4. mechanical methods: microinjection, "shotgun blast"
5. recombinant virus or phage infection of cells, usually most efficient of all recombinant DNA delivery methods.
options for cloning genes (3)
1. screen and isolate desired clone from a library (aka bank or pool) of recombinant DNA clones (two types: cDNA or genomic DNA libraries)
2. PCR amplify gene directly from the genomic DNA and then insert this DNA fragment into a vector for cloning - requires knowledge of sequence of gene of interest to design specific primers
3. order directly from a company or request from another researcher (clone-by-phone)
two types of libraries
genomic DNA library (restriction digest). restriction fragments of genomic DNA inserted into a vector. clones can represent entire genome.

cDNA library: mRNA copied by reverse transcriptase into cDNA and inserted into vector. represents only the expressed genes in cell.
vector choice (depends on 3 things)
size of inserts, expression of insert, cell viability or not.

bacteriophage lambda is the winner with 25 kb, very efficient.
cloning in phage lambda allows highly efficient introduction of DNA into e.coli cells
plasmid transformation is very inefficient: few recombinant plasmids get into cells, produces only 10^6 -10^7 transformed cells per ug DNA (10^11 molecules)

Phage lambda infection is much more efficient, about 1000x more efficient than plasmid DNA transformation.
-nearly all cells can be infected by phage
-can get 10^9 recombinant phage per ug of input DNA.

higher density of phage plaques/plate vs. transformed e. coli colonies/plate... streamlines screening.
phage lambda modified for use as cloning vector
some guy constructed early lambda vectors.
removed lambda genes controlling lysogenic state that are not required for lytic growth
allows replacement of 25kb of 49kb lambda DNA with insert exogenous DNA (hence are called replacement vectors)
recombinant lambda packaged in vitro into virus particles via cos (cohesive) ends of lambda DNA
recombinant lambda phage infection -> cell lysis and plaques - allows large scale, plaque screening for recombinant DNA clones.
assembling a genomic DNA library in phage lambda
cleave genomic DNA into 25 kb fragments by partial digestion with Sau3AI (its GATC sticky ends are compatible with BamHI ends)
remove replaceable central region of lambda phage genome by BamHI digest
ligate lambda arms to insert genomic DNA via complementary sticky ends
package recombinant lambda in vitro
infect E.coli -> plate for plaque screening with radioactive probe
typically screen 3-4 million plaques to isolate a gene from higher eukaryotic genomic DNA library.
cDNA cloning
cDNA: copy or complementary DNA derived from mRNA. cDNA library can contain cloned DNA copies of all mRNAs expressed in a cell.
only exons represented in cDNA = open reading frames
since only 3% of genome is expressed, cDNA libraries eliminate 97% of genomic DNA, significantly reduces complexity of library (reduces number of clones in screening)
can make libraries from different cell types to obtain cDNA clones of tissue-specific mRNAs.
isolation of mRNA on oligo dT
only 3% of RNA is mRNA...
need to isolate from rRNA and tRNA, etc.
only mRNAs have poly-A tails
poly-A anneals to oligo dT
mRNAs selectively bind to oligo dT resin.
after binding mRNA to resin, tRNA and rRNA are washed out.
mRNA is then eluted and used for cDNA cloning or other analysis
(wall invented)
reverse transcriptase allows the conversion of mRNA sequence into cDNA
oligo-dT hybridized to polyA as primer for cDNA synthesis -> first strand synthesis.
initial product is DNA-mRNA duplex
then degrade mRNA -> so you end up with single stranded DNA
then second DNA strand synthesized using random short oligonucleotide primers, DNA polymerase and dNTPs (by going opposite direction).
attach restriction site linkers to blunt-end cDNA, digest linkers and ligate sticky ends into plasmid vector, transform e.coli.

(all diff cDNAs possible all depending on where primers sat down)
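A small sketch of the sequence relationships in cDNA synthesis: the first strand is the reverse complement of the mRNA (so it starts with the oligo-dT primer), and the second strand has the same sequence as the mRNA with T in place of U. The mRNA string and function names are made-up examples, and the sketch assumes full-length copies of both strands.

```python
# Sketch: sequences of the two cDNA strands made from an mRNA (all written 5'->3').
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna: str) -> str:
    """First strand made by reverse transcriptase: reverse complement of the mRNA."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna.upper()))


def second_strand_cdna(mrna: str) -> str:
    """Second strand: same sequence as the mRNA, with T instead of U."""
    return mrna.upper().replace("U", "T")


mrna = "AUGGCCUAAAAAAAA"  # hypothetical short mRNA ending in a polyA tail
print(first_strand_cdna(mrna))   # begins with T's: the oligo-dT primer end
print(second_strand_cdna(mrna))
```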
cDNA libraries can be made more selectively than genomic libraries
tissue specific cDNA libraries: examples - brain, muscle, liver...
cDNA libraries from before and after differentiation
cancer cell cDNA libraries.
PCR
an alternative to cloning (or used in conjunction with it)
PCR: polymerase chain reaction
method to amplify specific regions of DNA of known sequences in test tubes
based on in vitro DNA synthesis: huge amplification of even tiny amounts of DNA
has many diff uses, revolutionized molecular biology and many other fields
single cell PCR for recombinant DNA cloning from 1-2 cells
RT-PCR = PCR carried out on cDNA made from RNA by reverse transcriptase.
PCR reaction
primers: short synthetic oligonucleotides (15-20 nt long) complementary to DNA sequences flanking the specific region to be amplified.
polymerase: Taq DNA polymerase, isolated from the thermophilic bacterium Thermus aquaticus; enzyme not denatured at high temperatures. improved Pfu DNA polymerase from Pyrococcus furiosus is stable at higher temperatures and makes fewer errors than Taq
reactions: cycles of DNA synthesis repeated many times to amplify a specific region from as little as one copy of DNA.
the PCR cycle
1. DNA denaturation at 95 C
2. primer hybridization to ss DNA, extension by Taq polymerase (DNA synthesis) at 50-70 C
3. DNA denaturation at 95 C; more primers from the excess in the tube anneal (new Taq not required because it survives the heat)
4. primer hybridization to ss DNA, extension by Taq polymerase (DNA synthesis) at 50-70C
5. PCR cycle repeated 25-35 times, yields over a billion copies of the amplified region from a single copy of DNA
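The "over a billion copies" figure follows from doubling at every cycle. A minimal sketch, assuming perfect doubling (efficiency 1.0), which real reactions only approach; the `efficiency` parameter is an illustrative addition, not from the lecture.

```python
# Sketch: idealized PCR yield, assuming each cycle multiplies copies by (1 + efficiency).
def copies_after(cycles: int, starting_copies: int = 1, efficiency: float = 1.0) -> float:
    """Number of copies of the target region after the given number of cycles."""
    return starting_copies * (1 + efficiency) ** cycles


for n in (25, 30, 35):
    print(f"{n} cycles: {copies_after(n):.3g} copies")
```

With perfect doubling, a single starting copy passes one billion at 30 cycles (2^30).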
what's good about PCR?
after the first few cycles, most products are the same length: the DNA between the two primers. that's the unit length product, and it dominates after the first few cycles.
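Why the unit length product takes over can be seen by counting strand types cycle by cycle. This is an idealized bookkeeping sketch (one starting duplex, every strand copied every cycle); the three category names are chosen for illustration.

```python
# Sketch: count strand types per PCR cycle, starting from one double-stranded template.
# - originals: the 2 long template strands
# - long_products: copies of an original (one end set by a primer, other end open)
# - unit_length: strands with both ends set by the primers
def strand_counts(cycles: int):
    originals, long_products, unit_length = 2, 0, 0
    for _ in range(cycles):
        new_long = originals                     # each original yields one long product
        new_unit = long_products + unit_length   # copying these yields unit-length strands
        long_products += new_long
        unit_length += new_unit
    return originals, long_products, unit_length


for n in (1, 2, 3, 10, 30):
    originals, long_products, unit_length = strand_counts(n)
    total = originals + long_products + unit_length
    print(n, long_products, unit_length, f"{unit_length / total:.4f}")
```

Long products grow only linearly (2 per cycle) while unit length strands grow exponentially, so the unit length fraction approaches 1.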
influenza virus
viral genome composed of 8 single stranded RNAs: 11 proteins including an RNA polymerase that replicates viral RNA.
two viral glycoproteins in spikes on virion envelope (H and N) determine infectivity of flu viruses and are major antigens for anti-flu virus antibodies.
activities of H and N viral proteins ...
-hemagglutinin (H, 15 types): trimeric protein spikes, bind to sialic acid containing sugars on host cell membranes... viral entry into infected cell. H-binding determines flu virus host specificity
-neuraminidase (N, 9 types): clips and releases sialic acid on cell membranes of infected cells and on new flu virus particles budding from infected cells. thereby releasing newly-replicated flu viruses that efficiently infect more host cells.
tracking down the mystery of the 1918 spanish flu pandemic using PCR
most deadly flu epidemic in recorded history
swept across world
killed 20-40 million
extremely virulent. fatal in only 1-2 days
unusual pattern of susceptibility: young people b/w 15-45 most vulnerable
nature of exceptional virulence and cause of shift to human host preference now known
PCR used to amplify flu (H1N1) virus RNA in tissue samples from flu victims: viral genome sequenced, infective flu virus reconstructed by reverse genetics, recombinant flu virus has human infectivity and high virulence like 1918 flu.
PCR: reconstruction of the
identified as avian flu (H1N1) by sequencing PCR-generated cloned DNA: total flu genome sequence differed by only 25-30 amino acids in 4000 total from non-virulent 1918 avian flu strain.
This flu was lethal in mice and produced severe lung pathology like that seen in pandemic fatalities
1918 H1 differed in only 1-2 amino acids from avian H1 sequence: shows that minimal mutations changed infectivity for new human host.
...
new flu strains arise via two mechanisms
1. mutations acquired during genomic viral RNA segment replication (RNA polym do not have proofreading). only 1-2 amino acid changes in avian H1 changed 1918 flu into pandemic virus able to efficiently infect human cells
2. reassortment of genomic viral RNA segments: occurs when two diff flu viruses infect and replicate in the same host cell (co-infection of avian and human flu viruses) -> swapping of RNA genome segments -> new virulent flu strain with ability to efficiently infect human cells, such reassortments can produce major changes in flu virus virulence and host specificity hybrid...
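A back-of-the-envelope sketch of how much variety reassortment can generate. It assumes two co-infecting viruses with 8 genome segments each and that every segment combination can be packaged, which overstates what actually happens since many combinations are not viable.

```python
# Sketch: possible segment combinations when two flu viruses co-infect one cell.
# Assumption (labeled): every mix of the 8 segments is possible.
from itertools import product

SEGMENTS = 8
genotypes = list(product(("avian", "human"), repeat=SEGMENTS))

total = len(genotypes)        # every way to pick each segment from either parent
parental = 2                  # all-avian and all-human
reassortants = total - parental

print(total, reassortants)
```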
threat of a new avian flu pandemic (H5N1)

flu pandemics vs epidemics
flu pandemics (due to new virus strain, no resistance in population) vs. epidemics (caused by flu strains usually closely related to previous years' viruses, so protective antibodies are present from previous infections or flu vaccines).
flu pandemics arise when 3 essential changes occur
1. new flu virus strains emerge from animal hosts by acquiring the capability to infect humans.
2. new flu virus is exceptionally virulent for humans
3. new flu virus is able to spread efficiently among humans
potential next flu pandemic strain (H5N1)
1. first appeared in chickens in 1997 in Asia, now seasonally infects and kills poultry, wildfowl, and some mammals worldwide
2. first human fatality in 2003; thru 2008 it has caused over 230 confirmed deaths worldwide
3. to date H5N1 strain is inefficient in human to human transmission (b/c infection is deep in lungs, so it isn't spread easily by coughing).
nature of H5N1
1. infections occur deep in lungs and other tissues
2. high lethality due to aberrant immune responses (cytokine storms)
3. young adults most susceptible (median 20 yrs old)
4. H5N1 still exhibits inefficient human to human transmission but could easily acquire this lethal capability.
most difficult part of recombinant DNA cloning is identifying clones containing the gene of interest
3 diff strategies used to identify and isolate clones in DNA libraries of phage plaques or bacterial colonies
1. select for expression of cloned gene in a mutant cell background (most commonly used in bacteria or yeast), called complementation (functional replacement of mutant gene)
2. purify the protein and use it to ID the gene (aka reverse genetics) - two general methods: (a) specific antibody probe used to screen for desired protein in recombinant DNA expression library; (b) synthetic DNA oligonucleotide probe set designed from protein coding sequence used to screen for desired clone in DNA library by hybridization
3. screening of recombinant DNA phage or plasmid library using either cloned gene or cDNA, enriched mRNA, or PCR product as radioactive labeled probe in hybridization
method 1: complementation
thousands of genes have been cloned by complementation of mutant bacterial or fungal strains. utilizes a cell strain defective in gene of interest (ex: lac- E. coli cells containing a mutation in the lacZ gene are unable to grow on minimal media containing lactose as sole carbon source).
transform competent lac- E.coli cells w/ an antibiotic resistant (ampr) plasmid containing wildtype lacZ gene
plate on minimal medium containing lactose and ampicillin -> survival of only lac+/ampr transformed recombinant bacteria. isolated colonies = recombinant clones.
method 2: use protein to identify desired gene clone (reverse genetics approach):
two general ways to do this:
1. identification of the gene using specific antibody made against the isolated, purified protein (involves expression cloning to produce recombinant protein, then uses labeled antibody for screening)
2. identification of gene using known sequence of protein and genetic code to synthesize DNA oligonucleotide probes for screening recombinant DNA libraries by hybridization
first approach: antibody screens - require expression of insert DNA
Use recombinant expression vectors that contain a promoter & linked bacterial protein-coding gene (e.g.,
bacteriophage vectors are useful because the expressed recombinant fusion protein is released in plaques
Cloned DNA inserts, usually cDNA (without introns) or prokaryotic genomic DNA, are inserted in-frame into the coding sequence of the bacterial protein-coding gene present in the expression vector
DNA insert is transcribed and translated in cells to produce “chimeric” fusion proteins present in the recombinant phage plaques (or bacterial colonies)
Plaques (or colonies) are screened with specific antibody or other molecules known to bind to the protein to identify the gene (cDNA) of interest
second approach:
synthesize DNA probes based on amino acid sequences of isolated peptides from purified protein. using genetic code to synthesize a DNA probe set containing all possible coding sequences for a peptide
oligonucleotide probes: chemically-synthesized short DNA probes (15-20 nt long), used as hybridization probes in library screening.
DNA probe set containing all possible coding sequences derived from genetic code is most reliable choice for screening (called fully degenerate probe)
choose the peptide region with the lowest degeneracy: find the least degenerate 20-base region and prepare a degenerate 20-mer probe to screen the genomic library. why use a 20-mer probe? a specific 20 nt sequence is predicted to occur once in 4^20 (approx. 10^12) bp, so a 20-mer is a unique probe even in a large genome like the human genome...
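A minimal sketch of how probe degeneracy can be estimated from a peptide sequence, using the number of codons per amino acid in the standard genetic code. The example peptide and function names are hypothetical, and the window is measured in amino acids rather than exactly 20 bases to keep the sketch short.

```python
# Sketch: degeneracy of an oligonucleotide probe set back-translated from a peptide.
# Codon counts per amino acid (standard genetic code, one-letter codes).
CODONS_PER_AA = {
    "M": 1, "W": 1,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2, "K": 2, "D": 2, "E": 2, "C": 2,
    "I": 3,
    "V": 4, "P": 4, "T": 4, "A": 4, "G": 4,
    "L": 6, "S": 6, "R": 6,
}


def degeneracy(peptide: str) -> int:
    """Number of different DNA sequences that could encode the peptide."""
    total = 1
    for aa in peptide.upper():
        total *= CODONS_PER_AA[aa]
    return total


def least_degenerate_window(protein: str, length: int) -> str:
    """Return the stretch of `length` amino acids needing the fewest probes."""
    windows = [protein[i:i + length] for i in range(len(protein) - length + 1)]
    return min(windows, key=degeneracy)


protein = "LSRMWKDLS"  # hypothetical peptide sequence
best = least_degenerate_window(protein, 3)
print(best, degeneracy(best))

# Chance argument for probe length: number of possible 20-nt sequences.
print(4 ** 20)
```

Regions rich in Met and Trp (one codon each) are favored, and regions rich in Leu, Ser and Arg (six codons each) are avoided.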
chemical synthesis of DNA
Single-stranded DNA molecules synthesized by sequential addition of reactive nucleotide derivatives in 3’ -> 5’ direction
Mixed oligonucleotide probes produced by adding more than one nucleotide derivative to a coupling reaction
preparation of radioactive probes for library screening
1. Phosphatase needed to remove 5’ phosphate groups from DNA restriction fragments (not necessary for chemically-synthesized DNA oligonucleotides which have 5’-OH)
2. Polynucleotide kinase used to add 32P-phosphate to the 5’ end of DNA molecules
3. Labeled oligonucleotide probes used to screen recombinant DNA libraries by plaque or colony hybridization
reverse genetics: cloning the gene (clotting factor VIII) responsible for hemophilia A
1. Essential role of Factor VIII in blood clotting established by biochemical & clinical studies
2. Lack of Factor VIII in hemophilia A patients results in severe uncontrolled bleeding
3. Factor VIII protein purified & amino acid sequence determined
4. Factor VIII peptides with least degeneracy used to synthesize fully degenerate oligonucleotide probe sets
5. Human genomic DNA library screened with probe oligo’s
6. Factor VIII gene clones isolated
7. Structure of gene determined from mapping & sequencing of amplified recombinant genomic DNA clones
8. Recombinant Factor VIII produced (from Genentech) & used to treat hemophilia A patients
9. Future Drugs from “Pharm” animals: transgenic goats bioengineered to produce human clotting factors in breast milk
method 3: screening libraries using labeled DNA or RNA probes in hybridization
Examples of hybridization probes:
enriched mRNA (often used in early work before cloned DNA probes became available)
cloned cDNA or gene (in early studies, cDNA clones often isolated first & then used for genomic gene isolation & characterization)
because of high sequence conservation of genes in evolution, cloned probes from different species can be used in screening to isolate a homologous gene or cDNA (e.g., use of mouse globin cDNA for isolating human globin cDNA or gene)
PCR product (now preferred method) due to availability of complete genome sequences from many organisms for designing PCR primers
PCR amplification of specific DNA sequences for direct cloning or for use as probes

cloned (or PCR-generated) DNA probes are used in many kinds of hybridization studies
Southern blots (DNA electrophoresis)
Determine gene structure & genomic organization (e.g., DNA rearrangements, deletions, insertions) map transcription control regions, detect mutations, analyze gene copy number, DNA fingerprinting, RFLP linkage analysis (LS4)
Northern blots (RNA electrophoresis)
Analyze mRNA species & RNA processing events, determine changes in mRNA splicing in biological processes
Microarrays/Gene Expression Profiling (RNA hybridized to DNA immobilized on chips or other surfaces)
Analyze quantitative changes in different mRNAs expressed from up to every gene in a genome - can be used to define global gene expression patterns in biological processes
southern blots can detect specific DNA fragments in complex mixtures of restriction fragments separated by gel electrophoresis

gene copy number estimated by southern blot
Neu oncogene is amplified in ~30% of advanced breast cancers & is correlated with poor clinical prognosis
Experiment: Southern blot comparing genomic DNA samples probed for Neu
Breast cancer sample with amplified Neu shows more intense hybridized band vs other tumor & normal samples
Recombinant antibody against Neu protein (Herceptin) from Genentech now used to successfully treat breast cancers with high Neu
northern blots
used to analyze mRNA expression, induction of B-globin gene expression when precursor cells differentiate into red blood cells
isolate mRNA at diff times after induction of red blood cell differentiation
separate polyA + RNA on gel
hybridize northern blot w/ B-globin probe
microarrays measure genome-wide changes in transcription in diff cells or tissues
Developed to quantitate & compare mRNA levels by hybridization to specific DNAs displayed in high density patterns suitable for microsensor scanning
Utilize fluorescent-labeled RNA or DNA probes (use of different color fluorescent labels allows simultaneous hybridization of different labeled probe samples to arrayed DNAs)
Provide rapid & reproducible large scale surveys comparing gene expression patterns of all known genes (e.g., used to define changes in mRNA in normal cellular processes vs. cancer & other diseases). Now expanded to analyze all types of RNA transcripts expressed from an entire genome (i.e., tiling arrays)
General types of DNA microarrays:
cDNA arrays (~10,000 different cloned cDNA spots/2x2cm grid)
synthetic DNA oligonucleotide arrays (typically 20-50 mers) on chips or slides, similar manufacturing methods as used for computer chips. One chip (1 cm^2) can contain 10^6 50-mers
PCR-generated exon-specific probes (>250,000 DNAs/slide), every gene in human genome can be arrayed on one microscope slide for genome-wide surveys of protein-coding sequence patterns
Genome tiling arrays - oligonucleotides (e.g., 50-mers) are synthesized at 150-200 bp intervals across entire genome for surveying all coding & non-coding RNAs. The set of all RNAs transcribed from the genome is called the transcriptome.
example: microarray of analysis of yeast cells grown under diff conditions:
A. if spot is yellow, expression of that gene is same in cells grown either on glucose or ethanol
B. if spot is green, expression of that gene is greater in cells grown in glucose
C. if spot is red, expression of that gene is greater in cells grown in ethanol
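A small sketch of how the spot colors above map onto a two-channel intensity ratio. The 2-fold cutoff (log2 ratio of 1) is an assumed threshold for illustration, and the function name and signal values are made up.

```python
# Sketch: call a two-color microarray spot from its two channel intensities.
# Green channel = cells grown on glucose, red channel = cells grown on ethanol.
import math


def spot_color(glucose_signal: float, ethanol_signal: float, threshold: float = 1.0) -> str:
    """Classify a spot by log2(ethanol / glucose); threshold=1.0 means 2-fold (assumed)."""
    log_ratio = math.log2(ethanol_signal / glucose_signal)
    if log_ratio >= threshold:
        return "red"      # expression greater in ethanol
    if log_ratio <= -threshold:
        return "green"    # expression greater in glucose
    return "yellow"       # similar expression in both conditions


for glucose, ethanol in [(100, 100), (100, 800), (800, 100)]:
    print(glucose, ethanol, spot_color(glucose, ethanol))
```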
1918 flu caused prolonged severe inflammation (aka cytokine storms) that produced its unusual mortality
Survey of changes in expression of 40 cytokine & pro-inflammatory genes.
Microarray data showing that both the reconstructed 1918 pandemic flu strain & a non-pandemic current flu strain (K173) increased expression of pro-inflammatory immune response genes (as shown by red bars) after 3 days infection. Most of these genes returned to normal or lower levels (as shown by black & green bars, respectively) after 8 days infection with the K173 flu strain, but the majority of the pro-inflammatory immune response genes remain dangerously active (red) with the pandemic 1918 flu virus
genomic medicine: new microarray screening can identify metastatic cancers in advance
Metastasis is the major cause of cancer death
Metastases previously thought to originate as rare cells in larger, more advanced tumors
New results show that metastatic gene programs are already activated throughout cancer cells in some early stage, small tumors
Early detection of these high risk tumors (using “signature gene sets”) is expected to lead to better results in treatment and fewer complications for patients with lower risk tumors
classical DNA sequencing
maxam gilbert sequencing: based on chemical degradation of DNA
Sanger sequencing (1977): based on termination of enzymatic DNA synthesis using modified NTPs called dideoxynucleotides
2',3'-dideoxyNTPs (ddNTPs) lack 3'OH of regular deoxyNTPs - block formation of further phosphodiester bonds when incorporated into replicating DNA strands and cause chain termination...
more on sanger or dideoxy sequencing
1. duplex DNA to be sequenced is denatured
2. labeled oligonucleotide primer of known sequence is hybridized to single-stranded template DNA.
3. template-primer is distributed in four diff tubes
4. to each tube are added all the substrates for DNA synthesis (dNTPs, DNA polymerase), plus one of the four dideoxynucleotide triphosphates (ddNTP, added at 1/100 conc. of dNTP)
5. in vitro DNA synthesis produces a collection of DNA fragments, each terminating with the dideoxynucleotide
6. fragments are denatured and separated on polyacrylamide gel. labeled bands are visualized by autoradiography.
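The steps above can be sketched as a toy simulation: each of the four ddNTP reactions yields fragments ending at every position of its base, and reading the gel from shortest to longest fragment gives the sequence of the newly made strand. The sketch is idealized (every termination position is represented, primer length is ignored), and the template is a made-up example.

```python
# Sketch: idealized dideoxy sequencing of a short template.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sanger_fragments(template_3to5: str) -> dict:
    """Return fragment lengths seen in each ddNTP lane (ddA, ddC, ddG, ddT).

    The template is written 3'->5', so the new strand (5'->3') is its
    base-by-base complement.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5.upper())
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)  # a ddNTP of this base can stop synthesis here
    return lanes


def read_gel(lanes: dict) -> str:
    """Read bands from shortest to longest fragment to recover the new strand."""
    by_length = {length: base for base, lengths in lanes.items() for length in lengths}
    return "".join(by_length[n] for n in sorted(by_length))


lanes = sanger_fragments("TACGGTCA")
print(lanes)
print(read_gel(lanes))
```

The sequence read off the gel is that of the synthesized strand, which is complementary to the template.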
automated sequencing uses fluorescent labels instead of radioactive labels
ddA: green tag, ddG: black, ddT: red tag, ddC: blue tag.
1. One reaction containing all four dNTPs and all four labeled (i.e., “color tagged”) ddNTPs.
2. PCR used to amplify sequencing products (only 1 primer)
3. Chain-terminated DNA Fragments separated by electrophoresis in one column in a “Sequenator”
4. A fluorescent detector records the color of the passing bands and generates a sequence.

using four diff fluorescent tags allows all four bases to be analyzed in a single lane.
shotgun (random cloning) strategy used in whole genome sequencing
7X sequence coverage conducted (i.e. each base sequenced seven times) to insure that all sequences are captured & correctly assigned in final assembly - 3 types of libraries made from isolated human chromosomes (note: average human chromosome = ~150 Mb of DNA)
“Paired End Sequences” of ~600 bases read in from both ends of the 1 kb, 5 kb & 100 kb DNA clones are aligned by computers into long DNA sequences
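A rough sketch of what 7X coverage means in numbers, using the figures above (~150 Mb per chromosome, ~600 base reads). The estimate of the fraction left unsequenced uses a Poisson model of random read placement, which is an assumption added here and not part of the lecture notes.

```python
# Sketch: reads needed for a target coverage, and expected gaps under a Poisson model.
import math


def reads_needed(genome_bp: int, read_bp: int, coverage: float) -> int:
    """Number of reads so that total bases read = coverage x genome size."""
    return math.ceil(coverage * genome_bp / read_bp)


def fraction_uncovered(coverage: float) -> float:
    """Expected fraction of bases never sequenced if reads land at random (Poisson)."""
    return math.exp(-coverage)


chromosome_bp = 150_000_000
print(reads_needed(chromosome_bp, 600, 7))
print(f"{fraction_uncovered(7):.6f}")
```

Under this model, about 0.1% of bases would still be missed at 7X, which is one reason redundancy well above 1X is needed.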
high throughput genome sequencing
1. Accomplished via New Strategy
“Shotgun” (i.e., random) genomic DNA cloning & automated sequencing coupled with computer assembly & alignment - most efficient & cost effective method for genome sequencing (proposed by C. Venter, 1998). Overcame dependence on mapping cloned DNAs which was the bottleneck in early genome sequencing. With shotgun approach, early computer assembly of DNA sequences was rate limiting, not automated DNA sequencing capability.
2. Final Complete Human Genome Sequence
Working draft of human genome compiled in 2001 with corrected final version completed in 2003, required 10 years work & at a total cost = ~$3 billion (sequencing cost alone = ~$300 million). The final sequence provides a powerful reference/scaffold for compiling/aligning future human genome sequences.
3. New Era: Personal Genome Sequences
J. Watson’s personal genome completed in just 3 months at cost of $1 million (presented on two DVDs to Watson, May 31, 2007). Second personal genome sequence (C. Venter) also now completed.
Note: human genomes are ~99.9% identical (i.e., they differ by ~3 million single base pair changes = ~1 bp /1000 bp). Individual genomes also differ by longer DNA insertions/deletions & by changes in copy number -these differences easily account for individual variations of traits in people
4. Future developments
Continuing advances now make a $5000 genome sequence feasible - ultimate aim is a complete genome sequence for $1000.

big question: how useful is a personal genome sequence?
genomics: analyses of whole sequenced genomes
>400 different genomes have been completed - many more in progress
Comparative gene & genome studies reveal how different organisms evolved
Enormous amount of complex new information to decipher, store & access
Sequences stored in public databases: GenBank (US/NIH), EMBL (European Molecular Biology Lab), DDBJ (DNA Data Bank of Japan)
BIOINFORMATICS: new science to analyze & compare genomic sequences
ANNOTATION: identification of all coding, non-coding & regulatory sequences in a genome
GENE FINDER programs detect protein-coding gene sequences: look for long (>100 aa) Open Reading Frames (ORF), exons/introns, splice sites, polyA signals
Programs that predict promoters, enhancers & other transcription control regions (VISTA, search for clustered known transcription factor motifs)
BLAST comparisons of protein sequences predict relatedness & function
-amino acid searches more efficient than DNA comparisons
-proteins with overall conserved sequences have similar structures & activities
-proteins are organized in domains (discrete functional regions) homologous to related domains in other proteins
ENCODE (Encyclopedia of DNA Elements) next “big biology” project: initiated after human genome finished in 2003, combines computational & experimental strategies to annotate (i.e., define the functions) all expressed sequences in the human genome! ENCODE analyses of 1% of human genome completed mid-2007.
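The ORF search used by gene finder programs can be illustrated with a very reduced sketch. It scans only the three forward reading frames for ATG...stop stretches and ignores introns, the reverse strand, splice sites and polyA signals, so it is closer to a prokaryotic ORF scan than to a real gene finder; the function name and test sequence are made up.

```python
# Sketch: find long open reading frames (ATG ... stop) on the forward strand.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 100):
    """Return (start, end) positions of ORFs with at least min_codons codons.

    start is the index of the A in ATG; end is one past the stop codon.
    The stop codon is not counted toward min_codons.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
print(find_orfs(toy, min_codons=5))
print(find_orfs(toy))  # default cutoff of 100 codons finds nothing in a toy sequence
```

The default cutoff mirrors the >100 aa rule of thumb: long stretches without a stop codon are unlikely to occur by chance.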
BLAST
basic local alignment search tool: high sequence similarity predicts NF-1 functions like Ira-GTPase that controls key cell cycle regulator, G-protein Ras.
prok and yeast genomes
prok genomes lack introns. more complex bacterial genomes are similar in size and gene number to yeast and also have similar numbers of genes for diff cellular metabolic processes.
like prok, most yeast protein-coding genes lack introns-only 4% have a single intron.
genomes in bacteria show much more variation in the number and types of genes than genomes in animals
Bacteria have long evolutionary history - over 3 billion years
Live in many diverse & extreme environments requiring many specialized types of metabolism
These adaptations reflected in large variation in numbers of genes (e.g. Mycoplasma = 500 genes, Streptomyces = 7000 genes)
Gene content (kinds of genes) also highly divergent (e.g. E. coli and Staphylococcus genomes have 25% unique genes in similar sized genomes)
comparisons of the number and types of expressed genes in the genomes of diff eukaryotes
>50% of expressed RNAs are of unknown function (EST = expressed sequence tags) - many EST RNAs likely are non-coding RNAs. note: defense and immunity only class of genes notably expanded in mammals vs other euk.
genome size, but not the number of protein coding genes increases in relation to biological complexity in multi-cellular eukaryotes.
comparisons of the chromosomal gene density in the genomes of diff organisms
Bacterial genomes have densely-packed protein-coding genes, with little non-coding DNA & intergenic DNA (sequences between genes).
Gene density decreases with increased biological complexity & genome size in eukaryotes.
Protein-coding genes of higher eukaryotes are usually larger in size & have more & larger introns than lower eukaryotes (only a few yeast genes even have a single intron).
Amount of intron sequence (non-coding transcribed DNA) & also intergenic sequence (both transcribed & untranscribed DNA) increases with genome size & biological complexity in eukaryotes. Repeated sequences (including transposons) also increase in relation to genome size in eukaryotes.
molecular origins of variation and diversity in evolution
1. Changes in Gene Number: Protein-coding gene duplication accounts for increased gene number & biological complexity between yeast (~6000 genes), invertebrates (~15,000 genes) & vertebrates (~25,000 genes). Most vertebrate genes have direct counterparts in invertebrates (many also in yeast). Conclusion: very few “new” protein-coding genes invented in evolution. Gene duplication/loss are important factors contributing to adaptation & evolution.
2. Gene Mutation: Changes in protein-coding genes usually generate related proteins with modified functions rather than entirely new activities. Accumulation of separate mutations in duplicated genes can lead to proteins with different activities or functions.
3. Changes in regulatory DNA sequences or Transcription Factors: Affect expression of a single gene or alter more complex gene expression & control networks (e.g. tissue- or organ-specific, developmental gene programs).
4. Alternative RNA splicing: Most human protein-coding genes are alternatively-spliced (~70%), this produces >100,000 proteins with altered functions & expands the proteome (i.e., total proteins expressed). Alternative splicing increases with biological complexity.
5. Non-protein coding RNA regulation: Large fraction of all eukaryotic genomes is non-coding DNA (coding DNA = exons) that is transcribed into non-coding RNA. Amount of non-coding DNA & non-coding RNA increases with genome size & biological complexity in eukaryotes. Non-coding RNA likely involved in controlling all biological processes -> novel mechanisms for gene regulation (more next lecture)
adaptation by variation in gene copy number (duplication)
Example: The salivary amylase gene (AMY1) encodes an enzyme responsible for starch digestion. This gene has undergone duplication in human evolution in adaptation to increasing starch in the diet. Chimps (who eat shoots & leaves) have only 1 copy. Human groups with moderate starch diets have ~5 copies, while humans in modern agricultural societies with high-starch diets have 7-10 copies. Increased amylase gene copies in all humans vs. chimps provides a selective advantage for the digestion of increased starch in the diet. Even further amylase gene duplications have occurred as an adaptation to diets of very starch-rich foods after the agricultural domestication of crops like maize, rice & wheat ~10,000 years ago (Sci. Am. January, 2009).
variation due to point mutation: myostatin gene in whippets
Example: Myostatin protein regulates (slows) muscle growth. A point mutation in the myostatin gene appeared spontaneously in these highly in-bred dogs. Selective breeding aimed at producing faster racers from crosses of moderately-muscled whippets unexpectedly produced some slow, muscle-bound dogs known as “bully whippets”
variation via changes in regulatory regions controlling gene expression
Example: Lactose tolerance in adults results from different enhancer mutations that allow transcription of the lactase gene (LCT) to continue after childhood. These mutations have arisen independently in different parts of the world and at different times over the last ~9000 years of human evolution. This is compelling evidence for ongoing human adaptation & natural selection since the estimated origins of different LCT mutations coincide with the earliest domestications of milk-producing herd animals (goats, cattle, sheep) in Europe, Africa & the Middle East (Sci. Am., January, 2009).
comparative genomics: understanding how diff organisms have evolved by comparing their genes and entire genomes
Example: The protein-coding genes & the genomes of mice & humans are remarkably similar even after 50 million years of evolution
Both species have ~25,000 protein-coding genes
Specific gene counterparts in both have similar exon/intron organization
~80% of protein-coding genes have a single direct counterpart in both species
~20% of mouse & human genes are result of lineage-specific duplication or loss
Most mouse & human genes are highly conserved (i.e. show an average of ~80% identity in amino acid sequences)
~5% of mouse & human genes are very highly conserved & thought to be basic “tool kit” for constructing mammals
Mouse & human genomes also show “synteny” = high conservation in the linkage (i.e., association & organization) of genes
No “new” protein-coding genes invented in the 50 million years of evolution since divergence of mice and humans from a common ancestor
human-chimp genome comparisons: insights in human origins
Humans & chimps diverged from a common ancestor ~6 million years ago. Chimps are our closest living relatives
Genomic DNA sequences of humans & chimps are ~98.8% alike (~1.2% single bp differences = ~35 million bp changes out of 3 billion bp total, occurring on average ~1/100 bp in the genome)
Such differences result from random genetic mutations (average rate of single bp substitutions = ~1 per 10^8 bp per generation) that accumulate in genomes at steady rate over time
Sequence differences are distributed throughout human & chimp genomes - most have no effect on human or chimp biology because only small fraction of the genomes are protein coding or regulatory sequences
Majority of human & chimp proteins have essentially identical amino acid sequences (i.e., differ by ~2-3 amino acids) & 29% have identical amino acid sequences - indicates purifying (negative) selection maintaining common protein functions
Gene duplication/loss is responsible for larger genomic differences between humans & chimps. Their genomes differ ~ 4% in gene number due to lineage-specific gene duplication or loss. Such changes account for significant differences in humans & chimps
Example: salivary amylase gene (AMY1) duplications in humans v. chimps - adaptation to increased starch-rich diets in humans
Changes in Regulatory DNA Sequences (transcription control regions) in human v. chimp genes
Example: regulatory region mutations that extend lactase (LCT) gene activity into adulthood -> adult lactose tolerance in humans descended from ancient herders
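The divergence numbers on this card can be checked with simple arithmetic. The sketch below takes the card's figures (3 billion bp genome, ~1.2% single bp differences, ~1 substitution per 10^8 bp per generation) as given; the function and constant names are made up for illustration.

```python
GENOME_SIZE_BP = 3_000_000_000   # ~3 billion bp, from the card
DIVERGENCE_FRACTION = 0.012      # ~1.2% single bp differences, from the card
MUTATION_RATE = 1e-8             # substitutions per bp per generation, from the card


def differing_sites(genome_bp, fraction):
    """Number of single bp differences implied by a divergence fraction."""
    return genome_bp * fraction


def mean_spacing_bp(fraction):
    """Average distance in bp between neighbouring differences."""
    return 1 / fraction


def new_substitutions_per_generation(genome_bp, rate):
    """Expected new single bp substitutions per genome copy per generation."""
    return genome_bp * rate


if __name__ == "__main__":
    print(round(differing_sites(GENOME_SIZE_BP, DIVERGENCE_FRACTION)))
    print(round(mean_spacing_bp(DIVERGENCE_FRACTION), 1))
    print(round(new_substitutions_per_generation(GENOME_SIZE_BP, MUTATION_RATE)))
```

With these inputs, 1.2% of 3 billion bp works out to about 36 million differing sites, or roughly one difference every 83 bp (the same order as the card's ~1/100 bp). The mutation rate implies about 30 new substitutions per genome copy per generation.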
comparison of FOXP2 gene, sequence in human, chimp, and mouse
FOXP2 (715 amino acids) transcription factor known to function in human speech because severe inherited speech defects are associated with point mutations in FOXP2 protein (i.e., mutations predict function).
Human speech is associated with two nonconservative FOXP2 amino acid changes (Asparagine303, Serine325) which occurred after humans & chimps diverged 6 million years ago. Chimps & mice differ by only one nonconservative amino acid change (Asparagine325 v. Alanine325) after 50 million years divergent evolution.
FOXP2 protein sequence is extremely conserved - predicts essential developmental/regulatory function for other animals besides humans
FOXP2 reported to be required for bird songs (complex vocalization like human speech) & FOXP2 KO mice are squeakless. FOXP2 KI mice with the human FOXP2 gene make novel sounds compared to wildtype mice.
Genome Regions That Have Changed the Most Since Humans and Chimps Diverged Give Clues to What Makes Us Human
Human genome DNA sequences with the most changes compared to chimp indicate sites of accelerated mutation from strong positive selection. These Human Accelerated Regions (HAR) are predicted to have shaped evolutionary changes between humans & chimps!
Sites involved in muscular/skeletal development
MYH16: mutant myosin gene found only in humans (no other primates!) limits jaw development -> smaller jaw & larger human brain size
HAR2: gene regulatory site controlling wrist/thumb fetal development, allowed for increased dexterity for tool making & use
Multiple loci involved in Brain development
ASPM, MCPH1, CDK5RAP2 & CENPJ: four protein coding genes known to control brain size because genetic mutations cause microcephaly in humans
HAR1 encodes short non-coding RNA that differs in 18/118 bases in humans vs. chimps (i.e., highest sequence variation in human vs. chimp) - known to function in human fetal neuron formation during development of cerebral cortex at 2-5 months in human embryos
Other HAR sites that do not encode protein or RNA
>200 identified, most predicted to control transcription of nearby genes
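To put the HAR1 figure in perspective, the short calculation below compares its 18 differences in 118 bases with the ~1.2% genome-wide human-chimp difference quoted on an earlier card. The helper name is hypothetical and the inputs are just the numbers from the cards.

```python
GENOME_WIDE_DIVERGENCE = 0.012  # ~1.2% single bp differences, human vs. chimp


def divergence(differences, length):
    """Fraction of positions that differ between two aligned sequences."""
    if length <= 0:
        raise ValueError("length must be positive")
    return differences / length


har1 = divergence(18, 118)                       # HAR1: 18 of 118 bases differ
fold_over_genome = har1 / GENOME_WIDE_DIVERGENCE  # how much faster than average

if __name__ == "__main__":
    print(f"HAR1 divergence: {har1:.1%}")
    print(f"Fold over genome-wide average: {fold_over_genome:.1f}")
```

HAR1 comes out at about 15% divergence, more than ten times the genome-wide average, which is why it stands out as an accelerated region.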