33 Cards in this Set

  • Front
  • Back
4 phases/quarters of scientific progress
Chromosomes = cellular basis of heredity

DNA double helix = molecular basis of heredity

Mechanism by which cells read information contained in genes = biological basis of heredity

With recombinant DNA technologies of cloning and sequencing, scientists can read the information contained in genes as well

Genomics ~ deciphering genes and entire genomes
Total codons
4^3 = 64
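The count can be checked by enumerating every triplet over the four bases. This is only an illustrative sketch; the names `BASES` and `codons` are chosen here for the example.

```python
from itertools import product

# The four DNA bases; each codon is an ordered triplet of them.
BASES = "ACGT"

# Enumerate every possible triplet: 4 * 4 * 4 = 64 codons.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

print(len(codons))
```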
Start codon
ATG

AUG
Stop codons
TAA, TAG, TGA

UAA, UAG, UGA
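A rough sketch of how the start and stop codons delimit a reading frame. It assumes a single forward strand and takes the first ATG it finds; the function names `find_orf` and `to_rna` are hypothetical helpers, not part of any library.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def to_rna(dna):
    """Transcribe a DNA coding strand to RNA by replacing T with U."""
    return dna.upper().replace("T", "U")

def find_orf(dna):
    """Return the first open reading frame (ATG ... stop), or None.

    Scans from the first ATG in steps of three bases, so stop triplets
    that are out of frame are ignored.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return None
    for i in range(start, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return dna[start:i + 3]
    return None  # no in-frame stop codon was found

orf = find_orf("CCATGAAATGA")
print(orf, to_rna(orf))
```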
Use of Fourier transform
DNA coding sequences exhibit 3-base periodicity ~ the periodicity can be detected with a Fourier transform

DNA non-coding sequences do not exhibit 3-base periodicity
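One common way to look for this signal is to turn the sequence into four 0/1 indicator tracks (one per base) and measure the Fourier power at a period of three bases. The sketch below assumes that approach and uses toy repeats rather than real genes; `period3_power` is a hypothetical name, and real coding sequences show a much weaker peak than a perfect repeat.

```python
import cmath

def period3_power(dna):
    """Fourier power at period 3, summed over the four base indicator
    sequences and divided by the sequence length."""
    dna = dna.upper()
    total = 0.0
    for base in "ACGT":
        # Fourier coefficient of the 0/1 indicator for this base
        # at frequency 1/3 cycles per base.
        coeff = sum(cmath.exp(-2j * cmath.pi * pos / 3)
                    for pos, b in enumerate(dna) if b == base)
        total += abs(coeff) ** 2
    return total / len(dna)

# A perfect codon-like repeat has strong period-3 power;
# a period-4 repeat has essentially none.
print(period3_power("ATG" * 10))
print(period3_power("ACGT" * 6))
```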
Method to measure gene expression
microarray
What do microarrays do
quantifies [mRNA]

measures expression levels of thousands of genes at once
Modernization of Sanger method
Automated

Capillary electrophoresis (no more gels)

ddNTPs with fluorophores (not radioactive)

Different bases ~ different colors

Chromatogram shows results (no more gels)

Provides information on quality and composition

Automated base calling
speed of best Sanger machines
440 kbp/day
speed of next gen sequencing machines
325 Mbp/day
steps in PCR
denature

anneal

extension

repeat

electrophorese
forces that change DNA over time
genetic drift

natural selection

mutation

(recombination)
natural selection description (4)
More organisms are born than can survive or reproduce (struggle for existence)

Some organisms possess phenotypes (morphological, behavioral, biochemical attributes etc.) that enable them to better survive and reproduce

Variation in phenotypes that affects survival and reproduction is heritable (genetically controlled)

Individuals with favorable traits/phenotypes will survive and reproduce, thus passing these traits (and the controlling genes) along to their offspring
phenotype altering mutations
missense

nonsense
fates of new mutation/allele
lost in population

become polymorphisms

undergo fixation
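The three fates can be illustrated with a small genetic drift simulation. This is a sketch under simplifying assumptions that are not in the card: a haploid Wright-Fisher population of constant size, no selection, and no further mutation. The function name and parameters are hypothetical.

```python
import random

def simulate_drift(pop_size, start_copies, generations, seed=0):
    """Follow one allele under pure drift and report its fate:
    'lost', 'fixed', or 'polymorphic' (still segregating)."""
    rng = random.Random(seed)
    copies = start_copies
    for _ in range(generations):
        if copies == 0 or copies == pop_size:
            break  # the allele has been lost or has reached fixation
        freq = copies / pop_size
        # Each individual in the next generation inherits the allele
        # with probability equal to its current frequency.
        copies = sum(1 for _ in range(pop_size) if rng.random() < freq)
    if copies == 0:
        return "lost"
    if copies == pop_size:
        return "fixed"
    return "polymorphic"

# A new mutation starts as a single copy; most such alleles are lost.
fates = [simulate_drift(20, 1, 10000, seed=s) for s in range(50)]
print(fates.count("lost"), fates.count("fixed"))
```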
what represents evolution at the molecular level
fates of new mutations or any alleles
rationale for studying sequences/alignment
Determine how pathogens harm hosts (~mechanisms of virulence)

Understand whether genes evolved due to genetic drift (GD) or natural selection (NS); this may facilitate vaccine development, since vaccines work best when they target antigens that change slowly

Study evolutionary history of organisms
rationale for sequence alignment
Available sequences often differ in length, both because sequencing efforts/methods are not standardized and because of indels

One may wish to know whether a sequenced genomic fragment is homologous to something else that has been previously sequenced and deposited into a large database (e.g. metagenomics)

Phylogenetics

Evolution of genes and genomes: rates of evolution (~ vaccine development); effects of selection on genomes; rates of transitions versus transversions; determining whether sequences are truly homologous (often ascertained by determining whether sequence similarity is greater than expected by chance): e.g. BLAST; assist in functional annotation (i.e. inferring a gene’s function based on homology to annotated genes)
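As an example of one quantity an alignment makes available, the sketch below counts transitions (purine to purine, or pyrimidine to pyrimidine) and transversions (purine to pyrimidine or the reverse) between two aligned DNA sequences. It assumes the inputs are already aligned, equal in length, and contain only A, C, G, T and "-"; `count_ts_tv` is a hypothetical helper.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def count_ts_tv(seq1, seq2):
    """Return (transitions, transversions) for two aligned sequences.
    Columns that match or contain a gap are skipped."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b or "-" in (a, b):
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    return transitions, transversions

print(count_ts_tv("AGCT", "GACA"))
```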
what is a sequence alignment
arrangement of two or more DNA or protein sequences that minimizes the number of differences between them
2 types of optimal alignment
percent sequence identity

percent sequence similarity
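Percent identity can be computed directly from two aligned sequences. Conventions for the denominator vary; the sketch below assumes the alignment length (gap columns included), which is only one of several definitions in use. `percent_identity` is a hypothetical helper.

```python
def percent_identity(aln1, aln2):
    """Percent of alignment columns holding the same residue in both
    sequences. Gap characters ('-') never count as a match."""
    if len(aln1) != len(aln2):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for a, b in zip(aln1, aln2) if a == b and a != "-")
    return 100.0 * matches / len(aln1)

# Four identical columns out of six aligned columns.
print(percent_identity("ACGT-A", "ACCTGA"))
```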
reason to give penalties
Point mutations/substitutions

AA (residue) changes

Gaps

Gap extensions - penalty for extending a gap once it has been "opened"; most indels are greater than 1 bp, therefore extension penalties are lower than gap-opening penalties
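The open/extend scheme described above is often called an affine gap penalty. A small sketch, with penalty values of 10 and 1 picked purely for illustration:

```python
def gap_penalty(length, open_penalty=10, extend_penalty=1):
    """Affine gap cost: pay open_penalty for the first gapped position
    and extend_penalty for each additional position in the same gap."""
    if length <= 0:
        return 0
    return open_penalty + (length - 1) * extend_penalty

# One 4-bp indel is penalized far less than four separate 1-bp indels,
# reflecting that a single longer indel is the more likely event.
print(gap_penalty(4))       # one gap of length 4
print(4 * gap_penalty(1))   # four gaps of length 1
```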
what is a substitution matrix
matrix that shows scores (+/- numerical values) applied for identities/similarities/differences between amino acids or identities/differences between nucleotides; add all individual scores, plus gap and extension penalties, to generate the total alignment score (used to distinguish between optimal and suboptimal alignments)
2 types of substitution matrices
identity matrix

protein substitution matrix
describe an identity matrix
Identical NTs are given the same score (e.g., +1)

Different NTs are penalized with identical negative values (e.g., -1,000)

Internal gaps are not part of the matrix; they are penalized separately with a value that you think to be “fair” based on your expectations for the evolution of that gene or locus
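A minimal sketch of a nucleotide identity matrix and of summing its scores over an alignment. The match, mismatch, and gap values used here (+1, -1, -2) are illustrative assumptions, and both function names are hypothetical.

```python
def identity_matrix(match=1, mismatch=-1, alphabet="ACGT"):
    """Build a nested dict: matrix[a][b] is match if a == b, else mismatch."""
    return {a: {b: (match if a == b else mismatch) for b in alphabet}
            for a in alphabet}

def score_alignment(aln1, aln2, matrix, gap=-2):
    """Total alignment score: matrix scores for residue pairs plus a
    separate gap penalty for every column that contains '-'."""
    total = 0
    for a, b in zip(aln1, aln2):
        if a == "-" or b == "-":
            total += gap          # gaps are scored outside the matrix
        else:
            total += matrix[a][b]
    return total

m = identity_matrix()
print(score_alignment("ACGT", "ACCT", m))   # 3 matches, 1 mismatch
print(score_alignment("AC-T", "ACGT", m))   # 3 matches, 1 gap
```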
2 types of protein substitution matrices
PAM

BLOSUM
describe BLOSUM62
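based on amino acid substitutions observed in conserved blocks of aligned proteins; sequences sharing 62% or more identity are clustered and counted as one, so the scores reflect substitutions between more divergent sequences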
describe PAM 120
extrapolated from alignments of closely related proteins to an evolutionary distance of 120 accepted point mutations (substitutions) per 100 residues
CDS
CoDing Sequence, region of nucleotides that corresponds to the sequence of amino acids in the predicted protein. The CDS includes start and stop codons, therefore coding sequences begin with an "ATG" and end with a stop codon. In SGD, unexpressed sequences, including the 5'-UTR, the 3'-UTR, introns, or bases not expressed due to frameshifting, are not included within a CDS. Note that the CDS does not correspond to the actual mRNA sequence.
average bacterial gene
~1000 bp
average eukaryotic gene
~1200-1500 bp
class ~ has one of the smallest genomes on the planet and is a bacterial endosymbiont
Hodgkinia cicadicola
Dynamic time warping
aligning speech waveforms and subsequently scoring them

You have both voices and you extract features distinct for each voice

Essentially find distances between the two and subsequently use Dynamic Time Warping

A path is then generated that allows one to match the sequences

If they were exactly the same, you'd get an exact diagonal line

If there is variation, the line is jagged

If there is a lot of variation, the line diverges significantly from the diagonal

Use varying scores for insertions/deletions

Use fixed score for gaps
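A bare-bones dynamic time warping sketch. It assumes each voice has already been reduced to a list of numeric features (plain numbers here) and uses the absolute difference as the distance; real speech systems use richer feature vectors. `dtw_distance` is a hypothetical name.

```python
def dtw_distance(a, b):
    """Cumulative cost of the best warping path between two numeric
    sequences. Identical sequences give 0 (a perfectly diagonal path)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # diagonal step
    return cost[n][m]

print(dtw_distance([1, 2, 3], [1, 2, 3]))     # same signal
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # same shape, slower
print(dtw_distance([1, 2, 3], [2, 3, 4]))     # shifted values
```

Because the path may pause on one sequence while advancing on the other, a slowed-down copy of the same signal still scores 0.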
2 types of dynamic programming
Needleman-Wunsch - global

Smith-Waterman - local
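The two algorithms differ mainly in whether scores may reset to zero. The sketch below computes only the optimal scores (no traceback) with a simple linear gap penalty; the scoring values of +1, -1, and -2 are assumptions chosen for illustration.

```python
def needleman_wunsch_score(s, t, match=1, mismatch=-1, gap=-2):
    """Global alignment score: both sequences are aligned end to end."""
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        cur = [i * gap]
        for j in range(1, len(t) + 1):
            diag = prev[j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]

def smith_waterman_score(s, t, match=1, mismatch=-1, gap=-2):
    """Local alignment score: best-scoring pair of subsequences.
    Cells are floored at 0 so an alignment can start anywhere."""
    prev = [0] * (len(t) + 1)
    best = 0
    for i in range(1, len(s) + 1):
        cur = [0]
        for j in range(1, len(t) + 1):
            diag = prev[j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best

print(needleman_wunsch_score("ACGT", "ACT"))
print(smith_waterman_score("TTTACGTTT", "GGACGGG"))
```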