Understanding Multimedia Information Review

by msprentiss, Apr. 2012

Subjects: audio digital images physics psychoacoustics sound video

125 Cards in this Set

Front
Back

	how humans describe sound: loudness	related to physical concept of amplitude
	how humans describe sound: pitch	Related to Fundamental Frequency for periodic sounds. Some noise-like sounds have different pitch heights depending on where in the spectrum most of the energy is concentrated
	how humans describe sound: chroma	musical notes---cultural
	how humans describe sound: timbre	Quality or color of a sound; related to the physical concept of spectrum (amplitude of harmonics for periodic waves). Though noise has energy at all frequencies, some noise has more energy in certain parts of the spectrum.
	physics is related to	acoustics
	psychoacoustics is related to	hearing
	to understand sound in multimedia you need to know	how the physics and psychoacoustics of sound work
	sound	longitudinal waves travelling through a medium (typically air)
	sound waves	pressure variations in back and forth motion, in direction of sound propagation
	sinusoid waves	represent circular motion--considered pure waves
	a sound wave's amplitude is related to	loudness
	amplitude	peak value that a periodic wave achieves---shows how much pressure varies
	period (T)	amount of time it takes to complete one cycle
	a soundwave's frequency is related to...	pitch
	frequency's formula is	1/T (measures how many cycles are completed in a second---Hertz)
	real world sounds are	composite sounds
	Fourier analysis	can create any periodic waveform (e.g. a musical sound) by adding harmonically related sinusoids together
	fundamental frequency	lowest mode of vibration
	overtones	additional modes above fundamental frequency
	harmonics	overtones that obey a harmonic relationship to the fundamental (at integer multiples to the fundamental frequency) e.g. 220hz--its harmonics are 220, 440, 660, 880 hz
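A minimal sketch of the harmonic series and of adding harmonically related sinusoids together, as the Fourier and harmonics cards describe. The function names and the amplitude list are illustrative assumptions, not part of the original cards.

```python
import math

def harmonics(fundamental_hz, count):
    """Return the first `count` harmonics (integer multiples) of a fundamental."""
    return [fundamental_hz * n for n in range(1, count + 1)]

def additive_sample(t, fundamental_hz, amplitudes):
    """Sum harmonically related sinusoids at time t (in seconds).

    amplitudes[i] is the strength of harmonic i + 1. Changing these values
    changes the timbre, while the fundamental (the pitch) stays the same.
    """
    return sum(
        a * math.sin(2 * math.pi * fundamental_hz * (i + 1) * t)
        for i, a in enumerate(amplitudes)
    )

print(harmonics(220, 4))  # [220, 440, 660, 880]
```

Because every component is an integer multiple of 220 Hz, the summed waveform repeats every 1/220 of a second, which is why it still has a clear pitch.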
	waveform shows	how amplitude of a wave behaves over time.
	periodic waves (repeating waves) have a pitch because	they have a fundamental frequency
	when you represent noise as a sound wave, it	is aperiodic, random
	why do instruments sound different if a note has the same harmonics?	strength of harmonic amplitudes is different relative to the fundamental of each type of sound (timbre). A tuba could have a really strong 440hz harmonic, while a flute could have a really weak 440hz harmonic. The waveforms will look different.
	spectrum shows	the amplitude of different frequencies. (useful to show how different instruments sound different even when playing same note).
	wave forms are in what domain	time domain
	spectrum are in what domain	frequency domain
	when you plot a spectrum	you can look at frequency and amplitude at a particular moment in time.
	contain all frequencies	nonperiodic sounds, aka NOISE
	time-varying spectrums are a property of	real life sounds
	loudness increases	logarithmically with sound power, NOT linearly
	loudness depends on three things	mainly amplitude and also frequency and timbre
	our ears are more sensitive to	certain frequencies
	loudness is measured in	decibels
	threshold of hearing is how many dB	0
	to double loudness...	you need to make the power 10x stronger (+10 dB). Each further doubling takes another factor of 10: going from 1 watt (10⁰) to 10 watts (10¹) doubles the loudness, and going on to 100 watts (10²) doubles it again, to 4x the original.
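A small sketch of the decibel arithmetic behind this card. It assumes the common rule of thumb that each +10 dB (10x the power) roughly doubles perceived loudness; the function names are hypothetical.

```python
import math

def power_ratio_to_db(power, reference_power=1.0):
    """Decibel level of `power` relative to `reference_power`."""
    return 10 * math.log10(power / reference_power)

def loudness_factor(db_change):
    """Approximate perceived loudness multiplier (assumed: +10 dB is about 2x)."""
    return 2 ** (db_change / 10)

for watts in (1, 10, 100):
    db = power_ratio_to_db(watts)
    print(watts, "W ->", round(db, 1), "dB, about", round(loudness_factor(db), 2), "x as loud")
```

Under this rule, 10 watts sounds about twice as loud as 1 watt, and 100 watts about four times as loud.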
	Our ears are most sensitive in the	2000-4000 Hz range. means we hear these sounds as louder than ones outside the range.
	lowest/highest frequencies we can hear	20 hz (lowest)- 20,000 hz (highest)
	timbre affects loudness how?	Sounds that have frequency content spread over a wider area are perceived as louder (spread across multiple critical bands)
	Loudness affects pitch how?	•High pitches get higher the louder they are •Low pitches get lower the louder they are
	The Spectrogram shows	the spectrum over time, so we can see the spectrum of real sounds, which happen in time. Time is on one axis (usually horizontal), frequency is on the other axis (usually vertical), and the color measures amplitude/energy.
	to convert analog audio to digital audio, we need to	sample the amplitude at a set time interval, then convert those amplitude values into a form a computer can understand (quantize)
	aliasing occurs	when you sample too slowly
	aliasing is	when you've sampled too slowly and created a waveform at the wrong frequency. it sounds awful.
	Nyquist theorem	the sample rate must be at least 2 times the highest frequency we want to hear, or that frequency cannot be captured
	To be able to store audio in the whole range that humans can perceive, we need to sample	at more than 40,000 times per second (the highest frequency humans can hear is ~20,000 Hz)
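The sampling cards above can be sketched as follows, assuming a pure tone and ideal sampling. `alias_frequency` is a hypothetical helper that folds a tone back into the range the sample rate can represent.

```python
def nyquist_rate(max_freq_hz):
    """Minimum sample rate needed to capture frequencies up to max_freq_hz."""
    return 2 * max_freq_hz

def alias_frequency(signal_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after sampling at sample_rate_hz.

    Tones at or below half the sample rate come back unchanged;
    higher tones fold back to a lower, wrong frequency (aliasing).
    """
    folded = signal_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

print(nyquist_rate(20000))            # 40000
print(alias_frequency(1000, 44100))   # 1000
print(alias_frequency(30000, 44100))  # 14100
```

A 30,000 Hz tone sampled at 44,100 Hz shows up as 14,100 Hz, which is the "wrong frequency" the aliasing card refers to.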
	cd audio's sample rate	44,100 Hz
	quantize	convert amplitude values into a form a computer can understand, using a certain number of bits/bytes. A bit has 2 values; a byte is 8 bits and can represent 2^8 = 256 unique values. 2 bytes are 16 bits and can represent 2^16 = 65,536 unique values.
	quantization noise	We can’t perfectly represent any amplitude, so we “round” the amplitude to the closest quantization level--creating an error noise in our sound atop the perfect sound.
	to decrease quantization noise	add more bits. We gain 6 dB of signal to noise ratio per bit added.
	CD Audio uses how many bytes per sample	CD Audio – 16 bits (2 bytes) per sample – 96 dB of dynamic range
	Sample rate controls	the highest frequency that can be stored. Sample rate must be twice the highest frequency we wish to store per the Nyquist Theorem
	Quantization bit-depth	controls the dynamic range, i.e. How much the signal is above the noise level introduced by the quantizer
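A rough check of the "6 dB per bit" figure, assuming dynamic range is taken as the ratio between the largest and smallest amplitude steps, 20·log10(2^bits). The function names are illustrative.

```python
import math

def quantization_levels(bits):
    """Number of distinct amplitude values available with `bits` bits."""
    return 2 ** bits

def dynamic_range_db(bits):
    """Approximate dynamic range in dB for a given bit depth."""
    return 20 * math.log10(quantization_levels(bits))

print(quantization_levels(8))          # 256
print(quantization_levels(16))         # 65536
print(round(dynamic_range_db(1), 2))   # 6.02
print(round(dynamic_range_db(16), 1))  # 96.3
```

Each added bit doubles the number of levels, which adds about 6 dB, so 16 bits gives roughly the 96 dB quoted for CD audio.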
	channels	1 or 2, mono or stereo
	to calculate the size of a digital audio file	multiply sample rate by bit depth (# of bytes) by # of channels by duration (# of seconds): samples/sec x bytes/sample x # of channels x sec
	How many bytes is 5 minutes of CD audio?	44,100 SR x 2 bytes x 2 channels x 300 seconds = 52,920,000 bytes
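The file size formula on the two cards above, written out as a short sketch. The function and parameter names are assumptions chosen for readability.

```python
def audio_size_bytes(sample_rate_hz, bytes_per_sample, channels, seconds):
    """Raw (uncompressed) audio size: samples/sec x bytes/sample x channels x sec."""
    return sample_rate_hz * bytes_per_sample * channels * seconds

# 5 minutes of CD audio: 44,100 Hz, 2 bytes per sample, stereo.
print(audio_size_bytes(44100, 2, 2, 300))  # 52920000
```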
	perceptual coding of audio	exploits psychoacoustic masking
	masking is	phenomenon where 1 sound renders another sound inaudible
	masking threshold is	the cone of deaf
	if we reduce sample rate and bit depth to make an audio file smaller	We will lose high frequency information and have a noisier sound
	types of masking and their definitions	Simultaneous masking: Two sounds simultaneously occurring where one sound makes another inaudible –Forward masking: A sound makes another sound immediately following it inaudible –Backward masking: A sound makes another sound immediately preceding it inaudible (?!?!)
	to save space, digital audio drops	sounds outside the threshold of hearing and sounds in the cone of deaf
	compression ratio	raw size/compressed size cd audio / mp3 @ 128kbps= 1411/128= ~11
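The 1411 figure on this card is the raw bit rate of CD audio in kbps. A minimal sketch of where it comes from, with hypothetical function names:

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Raw bit rate of uncompressed audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000

def compression_ratio(raw_kbps, compressed_kbps):
    """Raw size divided by compressed size (here expressed as bit rates)."""
    return raw_kbps / compressed_kbps

cd_kbps = pcm_bitrate_kbps(44100, 16, 2)
print(cd_kbps)                                    # 1411.2
print(round(compression_ratio(cd_kbps, 128), 1))  # 11.0
```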
	name three lossless encoding schemes	run length, dictionary, and entropy encoders
	dictionary encoding	Symbols and sequences of symbols are then simply referenced as an index into the dictionary. only works well when symbols and sequences are sufficiently repetitive
	run length encoding	Given a sequence of symbols, encode the symbol and how many times it repeats –AAAABBBCCCCCAABBB = 4A 3B 5C 2A 3B only works well w/stuff that has a lot of repetition.
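A minimal run-length encoder and decoder for the example on this card. The (count, symbol) pair representation is an assumption; real formats pack the counts differently.

```python
from itertools import groupby

def rle_encode(text):
    """Collapse each run of identical symbols into a (count, symbol) pair."""
    return [(len(list(group)), symbol) for symbol, group in groupby(text)]

def rle_decode(pairs):
    """Expand (count, symbol) pairs back into the original string."""
    return "".join(symbol * count for count, symbol in pairs)

encoded = rle_encode("AAAABBBCCCCCAABBB")
print(" ".join(f"{count}{symbol}" for count, symbol in encoded))  # 4A 3B 5C 2A 3B
```

On text with no repetition, such as "ABCD", every run has length 1 and the encoded form is longer than the input, which is why the scheme only pays off with a lot of repetition.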
	entropy encoding	exploits the fact that some symbols occur more frequently than others (like letter e vs letter z). Does not work well when symbols are generally equally likely.
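One way to see why unequal symbol frequencies help is to measure Shannon entropy, the average number of bits per symbol an ideal entropy coder would need. This is an illustrative sketch and is not taken from the cards themselves.

```python
import math
from collections import Counter

def entropy_bits_per_symbol(text):
    """Shannon entropy of the symbol frequencies in `text`, in bits per symbol."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

print(entropy_bits_per_symbol("ABCD"))            # 2.0 (all four symbols equally likely)
print(round(entropy_bits_per_symbol("AAAB"), 3))  # 0.811 (skewed, so cheaper to code)
```

When all symbols are equally likely the entropy is at its maximum, so there is little for an entropy coder to gain.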
	lossy compression pros and cons	pro: substantial size shrinkage (high compression rate) cons: permanently lose data
	What is light?	a wave phenomenon w/spectra that usually comes from two sources –Thermal/black-body radiation –Emission (electron energy state changes)
	what is the visible spectrum	400-790 terahertz
	The actual spectrum of a source is	its physical color
	regulates color perception in human eye	cones
	short, medium, and long cones are most sensitive to	Blue, Green, and Red wavelengths, respectively
	a color like orange	may excite the red cone most strongly, but also excite the green and blue cones some as well
	we create perceived colors by	mixing together the correct amounts of red, green, and blue light--moving from an infinite-dimensional representation of color to a three-dimensional one
	Additive Color	Light--mixing all 3 (rgb) creates white
	Subtractive Color	Most objects reflect light, and do not generate it. A green notebook is green because it absorbs the other light wavelengths but reflects green. Art: RYB; printing: CMYK.
	how printing with cmyk colors works	Cyan Ink: Absorbs red, reflects blue and green •Magenta Ink: Absorbs green, reflects red and blue •Yellow Ink: Absorbs blue, reflects red and green •To make blue on paper: Apply cyan and magenta so that only blue is reflected
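An idealized sketch of the subtractive model on this card, assuming each ink absorbs exactly one primary and reflects the other two. Real inks are less clean; the names here are illustrative.

```python
# Which primary each ink absorbs in the idealized model from the card.
ABSORBS = {"cyan": "red", "magenta": "green", "yellow": "blue"}

def reflected_light(inks):
    """Return the set of primaries still reflected after applying the inks."""
    remaining = {"red", "green", "blue"}  # white paper reflects everything
    for ink in inks:
        remaining.discard(ABSORBS[ink])
    return remaining

print(reflected_light(["cyan", "magenta"]))  # {'blue'}
```

Applying all three inks leaves nothing reflected in this model, which corresponds to black.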
	why don't people use rgb values much when making stuff	hard to remember them!
	alternative to rgb values that is easier	hue, saturation, value (HSV). maps 1:1 to RGB.
	hue is	“Color”
	saturation is	"Colorfulness"
	value is	“Brightness”
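A short sketch of the RGB to HSV mapping using Python's standard `colorsys` module. The 0-255 input range and the hue in degrees are conventions assumed here; `colorsys` itself works with values between 0 and 1.

```python
import colorsys

def rgb255_to_hsv(r, g, b):
    """Convert 0-255 RGB to (hue in degrees, saturation 0-1, value 0-1)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return h * 360, s, v

def hsv_to_rgb255(hue_degrees, s, v):
    """Convert back from HSV to 0-255 RGB, rounding to whole numbers."""
    r, g, b = colorsys.hsv_to_rgb(hue_degrees / 360, s, v)
    return round(r * 255), round(g * 255), round(b * 255)

print(rgb255_to_hsv(255, 0, 0))    # (0.0, 1.0, 1.0) for pure red
print(hsv_to_rgb255(120, 1.0, 1.0))  # (0, 255, 0) for pure green
```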
	Color Vision	A spectrum analyzer with receptors that analyze how much red, green, and blue light is present
	Color Theory	Add together R, G, B light to create colors. Map RGB values to different representations like HSV to be more intuitive
	digital images use which color model	additive (rgb)
	what is a raster image?	a representation of an image using pixels, where each pixel takes on an RGB value
	how do you calculate a raster image's size?	resolution (pixels wide x pixels high) x color depth (bits or bytes per pixel) = raw file size
	1 bit color is	black and white (1 and 0) --(2 to the 1 power)
	1 byte (8 bit) color is	256 colors (2 to the 8th power)
	3 byte (24 bit) true color is	16,777,216 (2 to the 24th power)
	an 800 x 600 true color image's file size would be	800 pixels x 600 pixels x 3 bytes = 1,440,000 bytes
	the # of color possibilities available per # of bits available per pixel can be calculated like...	1 bit = 2 possibilities, 2 bits = 4, 8 bits = 256 (2 to the 1, 2 to the 2....2 to the 8---see the pattern here :)
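The image cards above reduce to two small formulas, sketched here with hypothetical function names.

```python
def color_count(bits_per_pixel):
    """Number of distinct colors representable with the given bit depth."""
    return 2 ** bits_per_pixel

def raster_size_bytes(width, height, bytes_per_pixel):
    """Raw (uncompressed) raster image size in bytes."""
    return width * height * bytes_per_pixel

print(color_count(1), color_count(8), color_count(24))  # 2 256 16777216
print(raster_size_bytes(800, 600, 3))                   # 1440000
```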
	GIF--color depth, compression type, good for	•One of earliest examples •Supports only 8-bit color (though each image can have its own 256-color palette) •Lossless compression using the once patented LZW algorithm •Good for logos, etc. •Supports animation
	PNG--color depth, compression type, good for, other features	Supports 24-bit color •Uses patent-free DEFLATE lossless compression good for logos and text supports alpha channel
	JPG-- compression type, good for, other features	lossy, photos,
	JPEG's compression exploits	that human eye is good at noticing slight changes in brightness over large areas (low frequency info) but far less sensitive to sharp transitions, e.g., edges (high frequency info)
	jpeg compression's steps	Break up image into 8x8 pixel blocks •Transform each block's data from spatial domain to frequency domain •Quantize the frequency domain coefficients, and possibly remove high frequency content (edges) •At extreme compression, colors are averaged within the blocks, and each block becomes clearly visible
	a sequence of pixels has a measurable	spectrum
	removing too much high frequency information in a jpeg (extreme compression) leads to	blocking artifacts
	SVG (vector graphics) is not	a raster image
	SVG uses	geometric formulas to draw an image. Great for items you need to scale; bad for pictures, good for fonts and some drawn images.
	video is	a sequence of still images
	frame rate is	number of images per second, unit is FPS (frames per second)
	how do you calculate the size of video	time (in seconds) x (resolution x bit depth of still image) x frame rate
	How large would one minute (60 seconds) of a 1080p (1920 x 1080) true-color (24 bits, or 3 bytes per pixel) video at 30 FPS be?	BIG: 11,197,440,000 bytes (89,579,520,000 bits), about 11.2 GB
	frame rate film and video standards	Film standard is 24 FPS •Video standard is 30 FPS •Movie theaters project at 72 FPS, displaying each image 3 times (for flicker and motion reasons)
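The same calculation as a short sketch. Note that with the color depth given as 3 bytes per pixel the product comes out in bytes; multiply by 8 for bits. The function name is an assumption.

```python
def video_size_bytes(width, height, bytes_per_pixel, fps, seconds):
    """Raw (uncompressed) video size: one frame's size x frame rate x duration."""
    frame_bytes = width * height * bytes_per_pixel
    return frame_bytes * fps * seconds

size = video_size_bytes(1920, 1080, 3, 30, 60)
print(size)      # 11197440000 (bytes)
print(size * 8)  # 89579520000 (bits)
```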
	“Trumotion” technologies do what	attempt to “create” frames between existing ones for more realistic motion
	How does intraframe video compression work?	works a lot like jpeg. Take 8 x 8 pixel blocks –Transform each block from spatial domain to frequency domain –Quantize frequency domain coefficients and possibly remove high frequency content
	How does interframe video compression work?	Exploit similarity among adjacent frames to achieve compression (e.g., if the background is the same, don't recode it over and over)
	when frames are the same in interframe compression	code with a short command to copy
	when frames are not the same in interframe compression	use motion compensation. (Uses previous frames to predict the current one, and notice/store the difference)
	Motion compensation uses	previous frames to predict the current one, and notice/store the difference
	frame types	I (intraframe compressed image), P (predictive frame), and B (bi-predictive frame)
	I frame	original intraframe compressed image--no motion compensation or prediction
	p frame	predictive. A Delta frame that depends on previous I and P frames
	b frame	bi-predictive. Uses both previous (past) and subsequent (future) frames
	we can compress motion further by	using prediction--If the prediction is decent, all that needs to be stored is the parameters of the prediction and the ERROR –With good prediction, the error is small –If the error is small, it can be stored very compactly
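A toy sketch of storing only the prediction error. It assumes the simplest possible predictor ("the current frame equals the previous frame") and treats a frame as a flat list of pixel values; real codecs use motion vectors on pixel blocks, which this does not attempt.

```python
def prediction_error(previous_frame, current_frame):
    """Per-pixel difference between the current frame and its prediction."""
    return [cur - prev for prev, cur in zip(previous_frame, current_frame)]

def reconstruct(previous_frame, error):
    """Rebuild the current frame from the prediction plus the stored error."""
    return [prev + err for prev, err in zip(previous_frame, error)]

previous = [10, 10, 10, 200, 200]
current = [10, 10, 12, 200, 200]

error = prediction_error(previous, current)
print(error)  # [0, 0, 2, 0, 0]
print(reconstruct(previous, error) == current)  # True
```

When the prediction is good, the error is mostly zeros, and that kind of data compresses well with schemes such as run-length or entropy encoding.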
	Advances in Video Coding	Variable pixel block sizes, using more reference frames (frames from way back or forward),Extremely complicated motion compensation systems. drawback: usually requires high computational cost, and thus better and faster computers
	compression scheme is referred to as	a codec
	codec stands for	coder/decoder
	common codecs	•MPEG-2/H.262: DVD standard •MPEG-4 AVC/H.264: Blu-ray. Widely gaining ground as the dominant standard. Very CPU intensive •VP6, VP7, VP8: Proprietary format. Was used extensively in Flash. •WMV: Proprietary Microsoft format.
	a video file type is NOT	a codec, though some share the same file names (ick)
	video containers	•AVI: Microsoft container format. Linked to no specific CODEC •MOV: Apple Quicktime Format - Supports all MPEG formats, amongst others. Became the basis of the MPEG-4 container format •MPEG: Container for MPEG-1 and MPEG-2 videos •MPEG-4/MP4: Based off of newest Quicktime MOV. Main container for H.264 encoded videos •FLASH: Common web streaming format. Was dominated with VP series codecs, now H.264 •REALMEDIA: Real's container format. Uses p
	a video container contains	Video compressed with some CODEC (e.g. H.264) –Audio compressed with some audio codec (e.g. mp3) –Perhaps text, menus, etc.
