7 Cards in this Set

  • Front
  • Back
-study of genomes.
-looks at: gene set interaction, genetic organization, chromatin architecture, evolutionary relationships (homology), gene regulation (transcriptome, proteome).
-Methods: seq. acquisition, storage, analysis.
-computational and mathematical analysis of biological (molecular) information; mostly seq. analysis (SEQ. COMPARISON) via alignment algorithms.
-FASTA (older method)
-BLAST (most common): Basic Local Alignment Search Tool.
*Expressed Seq. Tag (EST) - a short sub-seq. of a transcribed, spliced nucleotide sequence (either protein-coding or not).
*Single Nucleotide Polymorphism (SNP) - a DNA seq. variation that involves a change in a single nucleotide; ea. variant at a SNP site is an allele (an alternative form of a seq./gene).
-Open Reading Frame (ORF) - stretch of seq. that begins with a start codon & runs, uninterrupted, to a stop codon in the same reading frame; the part of a gene that can potentially be translated into protein.
*Homology - DNA/protein seq. similarity due to shared ancestry, b/w indiv.'s of same/diff. species.
*Paralogs - genes related via duplication w/in a genome; the copy is relieved of selective pressure, so it can evolve new functions distinct from the gene from which it was copied.
*Orthologs - genes in diff. species derived from a single ancestral gene via speciation; usually retain similar functions.
*Query - seq. being studied.
*FASTA format - text file format for sequences.
*FASTA definition line - first line of FASTA data set (seq.).
*Accession #/GI # - a seq.'s I.D. #.
*Hit - seq. that is similar to/'matches' query seq.
*Raw score - calc. #, includes I.D.'s, mismatches, gaps; depends on the scoring sys. used.
*Bit score (more informative) - raw score normalized for the scoring sys.; measure of similarity b/w hit & query seq.'s; high value = high degree of similarity.
*E-value (E) - 'expect value,' # of hits expected to match by 'chance'; low val. = similarity is significant; depends on bit score, query length, & db size.
*Annotated/Annotation - seq. analysis & characterization.
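
The FASTA and E-value terms above can be illustrated with a minimal Python sketch. The function names and the sample record are hypothetical. The E-value uses the standard relation E = m * n * 2^(-S'), with S' the bit score; as a simplifying assumption, raw query and db lengths stand in for the 'effective' lengths that BLAST itself uses, so values will not match real BLAST output exactly.

```python
def parse_fasta(text):
    """Split FASTA-formatted text into (definition_line, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A '>' starts a new record: store the previous one, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines are joined into one string per record.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def expect_value(bit_score, query_len, db_len):
    """Hits expected by chance at this bit score: E = m * n * 2^(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


if __name__ == "__main__":
    # Hypothetical record, not a real accession.
    sample = ">seq1 hypothetical example\nACGT\nTTGA\n"
    print(parse_fasta(sample))
    print(expect_value(bit_score=30, query_len=100, db_len=1_000_000))
```

A higher bit score, a shorter query, or a smaller db each lowers E, which is why the same alignment can look more or less significant depending on the db searched.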
Example: BLASTn of the NR GenBank db.
1. Obtain seq. of interest.
2. Submit query via NCBI interface.
a. Choose appropriate BLAST algorithm.
b. Choose db.
c. Choose parameters.
3. FORMAT query.
4. Analyze results.
-set of seq. comparison algorithms.
-breaks query & db's seq.'s into frags.
-seeks matches b/w frags.
-Initial search: for frags. of a set length ('words') that reach a certain threshold score.
-frag hits then extended in either direction to obtain greater score.
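
The fragment search and extension steps above can be sketched in Python as a toy 'seed and extend' routine. This is an illustration under stated assumptions, not BLAST's actual implementation: seeds are exact word matches only, extension is ungapped, and the names and default values (word_size, match, mismatch, drop_off) are hypothetical.

```python
def find_seeds(query, subject, word_size=4):
    """Return (query_pos, subject_pos) pairs where a word matches exactly."""
    # Index every word (fragment) of the subject by its start position.
    index = {}
    for j in range(len(subject) - word_size + 1):
        index.setdefault(subject[j:j + word_size], []).append(j)
    seeds = []
    for i in range(len(query) - word_size + 1):
        for j in index.get(query[i:i + word_size], []):
            seeds.append((i, j))
    return seeds


def extend_seed(query, subject, i, j, word_size=4,
                match=1, mismatch=-2, drop_off=3):
    """Extend a seed in both directions without gaps.

    Extension in a direction stops once the running score has fallen
    drop_off or more below the best score seen so far.
    Returns (best_score, query_start, query_end_exclusive).
    """
    score = best = word_size * match
    left = right = 0  # extra positions kept on each side of the seed

    # Extend to the right of the seed.
    qi, sj, steps = i + word_size, j + word_size, 0
    while qi < len(query) and sj < len(subject) and best - score < drop_off:
        score += match if query[qi] == subject[sj] else mismatch
        steps += 1
        if score > best:
            best, right = score, steps
        qi += 1
        sj += 1

    # Extend to the left, restarting from the best score so far.
    score = best
    qi, sj, steps = i - 1, j - 1, 0
    while qi >= 0 and sj >= 0 and best - score < drop_off:
        score += match if query[qi] == subject[sj] else mismatch
        steps += 1
        if score > best:
            best, left = score, steps
        qi -= 1
        sj -= 1

    return best, i - left, i + word_size + right


if __name__ == "__main__":
    query, subject = "TTACGTGG", "CCACGTGGAA"
    for qi, sj in find_seeds(query, subject):
        print((qi, sj), extend_seed(query, subject, qi, sj))
```

Indexing the subject's words first means each query word is looked up directly rather than compared against every position, which is the point of breaking seq.'s into frags. before aligning.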
*DNA: base-by-base alignment via Unitary Matrices (aka Identity Matrices) - a scoring sys. in which identical characters receive (+) score.
*protein: amino acid-by-amino acid alignment via Substitution Matrices - scoring sys. in which each possible a.a. residue subst. is given a score reflecting probability that it's related to corresponding residue in query.
*Gaps - positions at which letter is paired with null, give (-) score; gap presence more significant than gap length as a gap is the result of a single mutational insertion/deletion of ≥1 residue; initial gap position heavily penalized (gap opening), lesser penalty for ea. subsequent position (gap extension).
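
The scoring rules above can be sketched in Python for an alignment that has already been built, with '-' marking a gap. The function name and the penalty values are hypothetical, chosen only to show that opening a gap costs more than lengthening one; mismatches are given a (-) score here as an assumption.

```python
def score_alignment(aligned_query, aligned_hit,
                    match=1, mismatch=-1, gap_open=-5, gap_extend=-1):
    """Score two aligned seq.'s of equal length ('-' marks a gap)."""
    if len(aligned_query) != len(aligned_hit):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    open_gap = None  # which seq. currently has an open gap: 'q', 'h', or None
    for q, h in zip(aligned_query, aligned_hit):
        if q == "-" and h == "-":
            raise ValueError("a column cannot pair two gaps")
        if q == "-" or h == "-":
            side = "q" if q == "-" else "h"
            # First position of a gap is penalized heavily; later ones less.
            score += gap_extend if open_gap == side else gap_open
            open_gap = side
        else:
            # Identity-matrix scoring for aligned characters.
            score += match if q == h else mismatch
            open_gap = None
    return score


if __name__ == "__main__":
    # One gap of length 2 vs. two separate gaps of length 1.
    print(score_alignment("A--CGT", "ATACGT"))
    print(score_alignment("A-CG-T", "ATCGAT"))
```

With these values, one gap of two positions scores higher than two separate one-position gaps, matching the idea that gap presence matters more than gap length.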
*Gene 'off': chromatin condensed, inaccessible, more heterochromatic.
*Gene 'on': chromatin decondensed, accessible, more euchromatic.
-DNA-binding proteins promote, repress, etc. gene expression.