115 Cards in this Set
- Front
- Back
What is statistics?
|
branch of math that transforms data into useful information for decision making
|
|
Why do we study statistics?
|
present and describe data
draw conclusions about large populations
make reliable forecasts
improve processes |
|
What is inferential statistics?
|
branch of stats that uses sample data to draw conclusions about an entire population
|
|
What is a variable?
|
characteristic of an item or individual
|
|
What are operational definitions?
|
universally accepted meanings that are clear to all associated with analysis
|
|
What is a sample?
|
portion of a population selected for analysis
|
|
What is a parameter?
|
numerical measure that describes a characteristic of a population
|
|
What is a statistic?
|
numerical measure that describes a characteristic of a sample
|
|
Name two data sources
|
Primary (the data collector is the one using the data for analysis) and Secondary (the person performing the analysis is not the data collector)
|
|
Four Sources of Data
|
data distributed by an organization or individual
designed experiment
survey
observational study |
|
What is a categorical variable?
|
"qualitative" values that can only be placed into categories such as "yes" and "no" or days of the week
|
|
What are the types of variables?
|
Categorical and Numerical (numerical then divided into discrete and continuous)
|
|
What is a numerical variable?
|
"quantitative": has values that represent quantities (subdivided into discrete or continuous)
|
|
What is a discrete variable?
|
numerical values that arise from a counting process
|
|
What is a continuous variable?
|
produces numerical responses that arise from a measuring process (no two continuous values will be exactly identical, given precise enough measurement)
|
|
What are the types of scales?
|
Nominal
Ordinal
Interval
Ratio |
|
What is a nominal scale?
|
Classifies data into distinct categories in which no ranking is implied (weakest form)
|
|
What is an ordinal scale?
|
classifies data into distinct categories in which ranking is implied (e.g. excellent, very good, fair, poor); a weak form of measurement
|
|
What is a ratio scale?
|
ordered scale in which the difference between measurements is a meaningful quantity and involves a true zero point (e.g. age, weight, salary)
|
|
What is an interval scale?
|
ordered scale in which the difference between measurements is a meaningful quantity but does not involve a true zero point
|
|
What is a pie chart?
|
circle broken up into slices that represent categories; size varies according to %
|
|
What is a summary table?
|
indicates the frequency, amount, or percentage of items in a set of categories so that differences between categories can be seen (categories in one column; frequency, amount, or percentage in a different column)
|
|
What is a bar chart?
|
displays each category as a bar whose length represents the amount, frequency, or percentage of values falling into that category
|
|
When do you use a bar chart?
|
when a comparison of categories is most important
|
|
When do you use a pie chart?
|
when observing the portion of the whole that is in a particular category is most important
|
|
What is the Pareto Principle?
|
Exists when the majority of items in a set of data occur in a small number of categories and the few remaining items are spread out over a large number of categories
|
|
What is a Pareto Diagram?
|
Categorized responses are plotted in descending order according to frequency and are combined with a cumulative percentage line on the same chart (useful for prioritizing improvements)
|
|
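As a rough illustration of the table behind a Pareto diagram, the sketch below sorts categories by descending frequency and accumulates the percentage line. The defect categories in `responses` are hypothetical, invented only for this example.

```python
from collections import Counter

# Hypothetical categorized responses (invented for illustration).
responses = ["scratch", "dent", "scratch", "chip",
             "scratch", "dent", "stain", "scratch"]

counts = Counter(responses)
total = sum(counts.values())

# most_common() yields categories in descending order of frequency;
# the running total gives the cumulative percentage line.
pareto_rows = []
cumulative = 0.0
for category, freq in counts.most_common():
    cumulative += 100.0 * freq / total
    pareto_rows.append((category, freq, round(cumulative, 1)))

for row in pareto_rows:
    print(row)
```

The first rows show which few categories account for most of the responses, which is where improvement effort would usually be focused first.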
What is an ordered array?
|
sequence of data, in rank order, from smallest to largest
|
|
What is a stem and leaf display?
|
organizes data into groups ("stems") so that the values in each group ("leaves") branch out to the right on each row
|
|
What is a frequency distribution?
|
summary table in which data are arranged into numerically ordered class groupings (used to draw conclusions about the major characteristics of the data)
|
|
What are class groupings?
|
you must establish an appropriate number of class groupings, with a suitable width and boundaries that avoid overlap
|
|
What are class boundaries?
|
place each value in one and only one class
|
|
What is class midpoint?
|
center of each class (halfway between the lower boundary and the upper boundary of the class)
|
|
What is the width of class interval?
|
width = range / number of desired class groupings
|
|
What is the range?
|
range = highest value - lowest value
|
|
What is relative frequency distribution?
|
obtained by dividing the frequency in each class of the frequency distribution by the total number of values
|
|
What is percentage distribution?
|
obtained by multiplying each relative frequency by 100%
|
|
What is cumulative % distribution?
|
provides a way of presenting information about the percentage of items that are less than a certain value
|
|
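The cards above on range, class width, and the frequency, relative frequency, percentage, and cumulative percentage distributions can be tied together in one short sketch. The data values and the choice of four classes are hypothetical, and rounding the width up to a whole number is an assumption made here for convenience, not a rule from the cards.

```python
import math

# Hypothetical data values (invented for illustration).
values = [12, 15, 21, 22, 25, 28, 31, 34, 38, 40, 44, 49]
num_classes = 4  # assumed number of desired class groupings

data_range = max(values) - min(values)       # highest value - lowest value
width = math.ceil(data_range / num_classes)  # range / classes, rounded up (assumption)

# Each value falls in exactly one class: [lower, lower + width), and so on.
lower = min(values)
frequencies = [0] * num_classes
for v in values:
    index = min((v - lower) // width, num_classes - 1)
    frequencies[index] += 1

relative = [f / len(values) for f in frequencies]  # frequency / total number of values
percentages = [100 * r for r in relative]          # relative frequency x 100%

# Running total of the percentages gives the cumulative percentage distribution.
cumulative = []
running = 0.0
for p in percentages:
    running += p
    cumulative.append(running)

for i in range(num_classes):
    start = lower + i * width
    print(f"{start} to under {start + width}: "
          f"{frequencies[i]}  {percentages[i]:.1f}%  {cumulative[i]:.1f}%")
```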
What is a histogram?
|
bar chart for grouped numerical data in which the frequency or percentage of each group is represented as an individual vertical bar (no gaps between bars)
|
|
When do you use histograms, polygons, and/or cumulative percentage polygons?
|
when analyzing a single numerical variable
|
|
What is a cumulative percentage polygon?
|
"Ogive": displays the variable of interest along the X axis and the cumulative percentage along the Y axis
|
|
What is a contingency table?
|
presents results of two categorical variables
|
|
What are the cells in a contingency table?
|
values located at the intersection of the rows/columns
|
|
What is a side by side bar chart?
|
useful way to visually display the result of cross-classification data
|
|
When do you use scatter plots and time series plots?
|
when analyzing two numerical variables
|
|
What is a scatter plot?
|
examines possible relationships between two numerical variables
|
|
What is a time series plot?
|
Studies patterns in the values of a numerical variable over time
|
|
What kinds of charts obscure data in Excel?
|
doughnut, radar, surface, bubble, cone and pyramid
|
|
What are guidelines for developing good graphs?
|
do not distort data; include a title; no chartjunk; two dimensional; scale on vertical axis should start at 0; simplest possible
|
|
What is central tendency?
|
extent to which all data values group around a typical or central value
|
|
What is variation?
|
amount of dispersion or scattering of values away from a central value
|
|
What is shape?
|
pattern of distribution of values from the lowest to highest values
|
|
What are the 3 measures of central tendency?
|
mean, median, mode
|
|
What is the arithmetic mean ("mean")?
|
most common measure of central tendency "balance point"
Mean = sum of values / # of values |
|
What is the summation notation?
|
all n values added together
|
|
What is the summation of the square?
|
in stats the squared values of variables are often summed
|
|
What is the square of the sum?
|
the values are summed first, and the resulting sum is then squared
|
|
What is the sample mean?
|
Sum of values divided by the # of values
|
|
What does n equal?
|
number of values or sample size
|
|
What is an outlier?
|
extreme value
|
|
What is the median?
|
middle value in a ranked set of data (if the number of values is odd, it is the middle value; if even, it is the average of the two middle values)
|
|
What is the mode?
|
the most frequently occurring value in a set of data
|
|
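A quick check of the three measures of central tendency, using Python's standard `statistics` module on a small hypothetical data set (the numbers are invented for illustration):

```python
import statistics

# Hypothetical data set with an even number of values.
values = [3, 5, 5, 7, 9, 10]

mean = statistics.mean(values)      # sum of values / number of values = 39 / 6
median = statistics.median(values)  # even count: average of the two middle values (5 and 7)
mode = statistics.mode(values)      # most frequently occurring value

print(mean, median, mode)  # 6.5 6.0 5
```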
What is the middle quartile?
|
the median
|
|
What are the rules for calculating quartiles?
|
if the position is a whole number, the quartile is that ranked value
if the position is a fractional half (e.g. 2.5), the quartile is the average of the two corresponding ranked values
otherwise, round the position to the nearest integer and select that ranked value |
|
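The three rules above can be sketched as a small function. The card does not say how the ranked position is obtained; the sketch assumes the common textbook convention of position (n + 1)/4 for Q1 and 3(n + 1)/4 for Q3, and the helper names `ranked_value` and `quartiles` are hypothetical. It expects at least two values.

```python
def ranked_value(sorted_values, position):
    """Apply the three quartile rules to a 1-based ranked position."""
    whole = int(position)
    fraction = position - whole
    if fraction == 0:
        # Rule 1: whole number, take that ranked value.
        return sorted_values[whole - 1]
    if fraction == 0.5:
        # Rule 2: fractional half, average the two corresponding ranked values.
        return (sorted_values[whole - 1] + sorted_values[whole]) / 2
    # Rule 3: otherwise round the position to the nearest integer.
    return sorted_values[round(position) - 1]


def quartiles(values):
    """Return (Q1, Q3), assuming positions (n + 1)/4 and 3(n + 1)/4."""
    data = sorted(values)
    n = len(data)
    q1 = ranked_value(data, (n + 1) / 4)
    q3 = ranked_value(data, 3 * (n + 1) / 4)
    return q1, q3


print(quartiles([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]))  # positions 2.75 and 8.25
print(quartiles([1, 2, 3, 4, 5, 6, 7, 8, 9]))      # positions 2.5 and 7.5
```

Other software (spreadsheets, statistics libraries) may use different interpolation conventions, so results can differ slightly from this rule-based approach.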
What is the geometric mean?
|
measures rate of change of variable over time
|
|
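One common use of the geometric mean is the average rate of return over several periods. The sketch below assumes that interpretation and uses hypothetical yearly returns; the formula is the n-th root of the product of (1 + rate) terms, minus 1.

```python
import math

# Hypothetical yearly rates of return: +10%, -5%, +20%.
returns = [0.10, -0.05, 0.20]

# Overall growth factor across all periods.
growth = math.prod(1 + r for r in returns)

# Geometric mean rate of return: the constant per-period rate
# that would produce the same overall growth.
geo_rate = growth ** (1 / len(returns)) - 1

print(f"{geo_rate:.4%}")
```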
What does variation measure?
|
spread or dispersion of values in a data set
|
|
What is the range?
|
measure of variation; difference between the largest and smallest values
simplest numerical descriptive measure of variation in a data set |
|
What are the two most common measures of variation?
|
variance and standard deviation
|
|
What is the shape of a data set?
|
represents the pattern of all values from lowest to highest
|
|
What is the interquartile range?
|
midspread "middle fifty" Q3-Q1 (cannot be affected by extreme values)
|
|
What are resistant measures?
|
summary measures such as the median, Q1, Q3, and interquartile range, which cannot be influenced by extreme values
|
|
What do variance and standard deviation measure?
|
the "average" scatter around the mean - how larger values fluctuate above and smaller values distribute below
|
|
What is the sum of squares (SS)?
|
squares the difference between each value and the mean and then sums these squared differences
|
|
What is the sample variance?
|
sum of the squared differences around the mean divided by the sample size minus one
|
|
What is the sample standard deviation?
|
square root of the sum of the squared differences around the mean divided by the sample size minus one
|
|
What is the coefficient of variation?
|
relative measure of variation always expressed as a % rather than in terms of units of the particular data **measures scatter relative to the mean**
|
|
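The sum of squares, sample variance, sample standard deviation, and coefficient of variation build on one another, as the sketch below shows. The data set is hypothetical, and the coefficient of variation is computed as standard deviation divided by the mean, times 100%, which is the usual definition but is not spelled out on the card.

```python
import math

# Hypothetical sample (invented for illustration); its mean is 5.
values = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(values)

mean = sum(values) / n

# Sum of squares: squared differences around the mean, summed.
ss = sum((x - mean) ** 2 for x in values)

variance = ss / (n - 1)       # sample variance: SS / (n - 1)
std_dev = math.sqrt(variance) # sample standard deviation
cv = std_dev / mean * 100     # coefficient of variation, as a percentage

print(ss, round(variance, 3), round(std_dev, 3), round(cv, 1))
```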
What is considered an outlier in Z-scores?
|
anything above 3.0 or below -3.0
|
|
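A small sketch of the Z-score rule, assuming the usual definition Z = (value - mean) / standard deviation and using the sample standard deviation. The data set is hypothetical, with one deliberately extreme value.

```python
import statistics

# Hypothetical sample with one extreme value (50).
values = [10, 12, 11, 9, 10, 11, 12, 10, 9, 11, 10, 12, 9, 11, 10, 50]

mean = statistics.mean(values)
std_dev = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)

# Z score: how many standard deviations a value lies from the mean.
z_scores = [(x - mean) / std_dev for x in values]

# Flag values whose Z score is above 3.0 or below -3.0.
outliers = [x for x, z in zip(values, z_scores) if abs(z) > 3.0]

print(outliers)
```

With very small samples no value can reach a Z score of 3 when the sample standard deviation is used, so the rule is most useful for larger data sets.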
What is the shape?
|
pattern of the distribution of data values throughout the entire range of all the values
|
|
What are the types of distribution?
|
symmetrical and skewed (values not symmetrical around the mean)
|
|
What does left skewed mean?
|
mean < median: negative
|
|
What does right skewed mean?
|
mean > median: positive
|
|
What does symmetrical mean?
|
mean = median: symmetrical
|
|
What are parameters?
|
summary measures for a population
|
|
What are population parameters?
|
population mean
population variance
population standard deviation |
|
What is a population mean?
|
sum of values in the population divided by the size N
|
|
What is the general addition rule?
|
The probability of A or B is equal to the probability of A plus the probability of B minus the probability of A and B
P(A or B) = P(A) + P(B) - P(A and B) |
|
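The general addition rule can be checked against a contingency table. The counts below are hypothetical; `Fraction` is used so the two ways of computing P(A or B) can be compared exactly.

```python
from fractions import Fraction

# Hypothetical 2x2 contingency table counts for events A and B.
a_and_b = 200
a_not_b = 50
b_not_a = 100
neither = 650
total = a_and_b + a_not_b + b_not_a + neither

p_a = Fraction(a_and_b + a_not_b, total)   # marginal probability of A
p_b = Fraction(a_and_b + b_not_a, total)   # marginal probability of B
p_a_and_b = Fraction(a_and_b, total)       # joint probability of A and B

# General addition rule: subtract the joint probability so it is not counted twice.
p_a_or_b = p_a + p_b - p_a_and_b

# Direct count of every outcome that is in A, in B, or in both.
p_direct = Fraction(a_and_b + a_not_b + b_not_a, total)

print(p_a_or_b, p_direct)  # 7/20 7/20
```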
What does collectively exhaustive mean?
|
when one of the events in a set of events must occur
|
|
What is mutually exclusive?
|
when both events cannot occur simultaneously
|
|
What is joint probability?
|
the probability of an occurrence involving two or more events
|
|
What is marginal probability?
|
"simple probability" because it enables you to compute the total # of successes from the appropriate margins of the contingency table
|
|
What is simple probability?
|
probability of occurrence of a simple event P(A)
|
|
What is a contingency table?
|
way to present a sample space
|
|
What is a sample space?
|
collection of all possible events
|
|
What is the complement of event A (A')?
|
all events that are not part of event A
|
|
What is a joint event?
|
an event that has two or more characteristics
|
|
What is a simple event?
|
a single characteristic
|
|
What is an event?
|
each possible outcome of a variable
|
|
What is subjective probability?
|
differs from person to person (useful in making decisions when you cannot use a priori or empirical classical probability)
|
|
What is the probability of occurrence?
|
number of ways in which the event occurs / total number of possible outcomes
|
|
What is empirical classical probability?
|
outcomes are based on observed data, not on prior knowledge of a process
|
|
What is a priori classical probability?
|
the probability of success is based on prior knowledge of the process involved
|
|
What are three approaches to the subject of probability?
|
a priori classical probability
empirical classical probability
subjective probability |
|
What is an impossible event?
|
event that has no chance of occurring; it has a probability of 0
|
|
What is a probability?
|
numeric value representing the chance, likelihood or possibility a particular event will occur
|
|
What is the empirical rule?
|
for bell-shaped distributions, approximately 68% of values are within 1 standard deviation of the mean, approximately 95% are within 2 standard deviations, and approximately 99.7% are within 3
|
|
What is the Chebyshev rule?
|
states that for any data set, regardless of shape, the percentage of values found within a distance of k standard deviations from the mean must be at least (1 - 1/k^2) x 100%, for k > 1
|
|
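The Chebyshev bound is easy to tabulate. The function name `chebyshev_lower_bound` is hypothetical; the formula is the one on the card, which only gives a useful bound for k greater than 1.

```python
def chebyshev_lower_bound(k):
    """Minimum percentage of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the Chebyshev rule is only informative for k > 1")
    return (1 - 1 / k ** 2) * 100


for k in (2, 3, 4):
    print(k, round(chebyshev_lower_bound(k), 2))
```

For k = 2 the bound is 75%, noticeably weaker than the roughly 95% the empirical rule gives for bell-shaped data, because the Chebyshev rule makes no assumption about shape.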
What is a five number summary?
|
X smallest, Q1, median, Q3, X largest
|
|
What is a box and whisker plot?
|
graphical representation of the data based on the five number summary
|
|
What is covariance?
|
measures strength of the linear relationship between 2 numerical variables (X & Y)
|
|
What is the coefficient of correlation?
|
measures the relative strength of a linear relationship between two numerical variables
|
|
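A minimal sketch of the sample covariance and the coefficient of correlation, assuming the usual sample formulas with an n - 1 divisor. The paired X and Y values are hypothetical.

```python
import math

# Hypothetical paired observations (invented for illustration).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Sample covariance: average cross-product of deviations, with n - 1 divisor.
cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

# Sample standard deviations of X and Y.
sd_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
sd_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))

# Coefficient of correlation: covariance rescaled to lie between -1 and +1.
r = cov / (sd_x * sd_y)

print(cov, round(r, 4))
```

Because the covariance depends on the units of X and Y, the correlation coefficient is the easier of the two to interpret as a relative measure.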
What is Bayes' theorem?
|
used to revise previously calculated probabilities based on new information
|
|
What is conditional probability?
|
probability of Event A given info about the occurrence of another Event B
|
|
What is a decision tree?
|
alternative to contingency table
|
|
What is statistical independence?
|
when the outcome of one event does NOT affect the probability of occurrence of another event
|
|
What is the general multiplication rule?
|
the probability of A and B is equal to the probability of A given B times the probability of B
P(A and B) = P(A | B) P(B) |
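The multiplication rule, conditional probability, and Bayes' theorem fit together as in the sketch below. All probabilities are hypothetical: D is an event with a prior probability of 3%, and T is a signal that occurs 90% of the time when D is true and 2% of the time when it is not.

```python
from fractions import Fraction

# Hypothetical inputs (invented for illustration).
p_d = Fraction("0.03")              # prior probability P(D)
p_t_given_d = Fraction("0.90")      # P(T | D)
p_t_given_not_d = Fraction("0.02")  # P(T | not D)

# General multiplication rule: P(T and D) = P(T | D) P(D).
p_t_and_d = p_t_given_d * p_d
p_t_and_not_d = p_t_given_not_d * (1 - p_d)

# T occurs either with D or without it (mutually exclusive cases).
p_t = p_t_and_d + p_t_and_not_d

# Bayes' theorem: revise P(D) after learning that T occurred.
posterior = p_t_and_d / p_t

print(posterior, float(posterior))
```

Here the prior of 3% is revised upward to roughly 58% once T is observed, which is the kind of update Bayes' theorem is meant to capture.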