118 Cards in this Set

  • Front
  • Back
Statistics
the science of conducting studies to organize, collect, summarize, analyze, and draw conclusions from data.
Variable
a characteristic that can assume different values
Data
the values that the variables can assume
Descriptive Statistics
the collection, organization, summarization, and presentation of data
Inferential Statistics
generalizing from samples to populations, performing estimations and hypothesis tests, determining relationships among variables, and making predictions
Population
all subjects that are being studied
Sample
group of subjects selected from a population
hypothesis testing
decision making process for evaluating claims about a population, based on information obtained from samples
qualitative variable
variables that can be placed in categories, according to a characteristic, not a number
quantitative variable
numerical and can be ordered or ranked
Discrete variable
values that can be counted
Continuous variable
assume an infinite number of values between any two specific values. They are obtained by measuring. They often include fractions and decimals
Nominal level of measurement
classifies data into mutually exclusive, exhaustive categories in which no order or ranking can be imposed on the data
ordinal level of measurement
classifies data into categories that can be ranked. precise differences between ranks don't exist
interval level measurement
ranks data, and precise differences between units of measure do exist. no meaningful zero
ratio level of measurement
possesses all the characteristics of interval measurement, and there exists a true zero. in addition, true ratios exist when the same variable is measured on two different members of the population
observational study
the researcher observes what is happening or what has happened in the past and tries to draw conclusions based on these observations
experimental study
the researcher manipulates one of the variables and tries to determine how the manipulation influences other variables
random sampling
selected by using chance methods or random numbers
systematic sampling
each subject of the population is numbered and then every kth subject is selected
stratified sampling
the population is divided into groups (strata) according to some characteristic that is important to the study, and then subjects are sampled from each group
cluster sampling
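the population is divided into groups called clusters by some means, such as geographic area; some of the clusters are then randomly selected, and all members of the selected clusters are used as the sample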
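As a rough illustration of the four sampling methods above, the following Python sketch draws each kind of sample from a made-up population of 100 numbered subjects. The function names, the strata, and the clusters are assumptions chosen for the example; the cluster version assumes that whole clusters are picked at random and every member of a chosen cluster is kept.

```python
import random

def simple_random_sample(population, n, rng):
    # Every subject has the same chance of being selected.
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    # Pick a random starting point among the first k subjects, then take every kth one.
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    # Draw a random sample from every group (stratum).
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    # Randomly choose whole clusters and keep all of their members.
    chosen = rng.sample(list(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical subjects numbered 1..100
strata = {"low": population[:50], "high": population[50:]}
clusters = {c: population[c * 10:(c + 1) * 10] for c in range(10)}

srs = simple_random_sample(population, 10, rng)
every_tenth = systematic_sample(population, 10, rng)
stratified = stratified_sample(strata, 5, rng)
clustered = cluster_sample(clusters, 2, rng)
```

Note the difference between the last two: stratified sampling takes a few subjects from every group, while cluster sampling takes every subject from a few groups.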
categorical frequency distribution
used for data that can be placed in specific categories, such as nominal- or ordinal-level data
grouped frequency distribution
used when the data must be grouped into classes that are more than one unit in width because the range of the data is large
class boundaries
used to separate the classes so that there are no gaps in the frequency distribution
class limits
represent the smallest and largest data value included in the class
histogram
a graph that displays the data by using contiguous vertical bars of various heights to represent the frequencies of the classes
cumulative frequency polygon
also called an ogive. a graph that represents the cumulative frequencies for the classes in a frequency distribution
stem and leaf plot
a data plot that uses part of the data as the stem and part of the data value as the leaf to form groups or classes
pareto chart (bar graph)
used to represent a frequency distribution for a categorical variable; the frequencies are displayed by the heights of vertical bars, which are arranged in order from highest to lowest
pie chart
a circle that is divided into sections or wedges according to the percentage of frequencies in each category of the distribution
scatter plot
a graph of ordered pairs of data values that is used to determine if a relationship exists between the two variables
statistic
a characteristic or measure obtained by using the data values from a sample
parameter
a characteristic or measure obtained by using all the data values from a specific population
mean
the arithmetic average. found by adding the values of the data and dividing by the total number of values
median
the halfway point in the data set. the data must first be arranged in order
mode
the value that occurs most often in the data set
range
the highest value minus the lowest value
variance
the average of the squares of the distance each value is from the mean
standard deviation
the square root of the variance
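The measures of central tendency and variation defined above can be checked on a small data set. This is a minimal sketch using Python's standard `statistics` module; the data values are made up, and the population forms (`pvariance`, `pstdev`) are used because the card describes variance as an average over all the values. For a sample, `statistics.variance` and `statistics.stdev` divide by n - 1 instead.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data values

mean = statistics.mean(data)           # sum of the values divided by how many there are
median = statistics.median(data)       # halfway point of the ordered data
mode = statistics.mode(data)           # value that occurs most often
data_range = max(data) - min(data)     # highest value minus lowest value
variance = statistics.pvariance(data)  # average squared distance from the mean
std_dev = statistics.pstdev(data)      # square root of the variance
```

For this data set the mean is 5, the median 4.5, the mode 4, the range 7, the variance 4, and the standard deviation 2.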
percentile
percentiles divide the data set into 100 equal parts
standard score (z score)
obtained by subtracting the mean from the value and dividing the result by the standard deviation
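The standard score can be written as a one-line function. The sketch below assumes a mean of 5 and a standard deviation of 2, which are illustrative numbers only; the function name `z_score` is also just a label for the example.

```python
def z_score(value, mean, std_dev):
    # Number of standard deviations the value lies above (+) or below (-) the mean.
    return (value - mean) / std_dev

z_high = z_score(9, mean=5, std_dev=2)  # 2 standard deviations above the mean
z_low = z_score(4, mean=5, std_dev=2)   # half a standard deviation below the mean
```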
outlier
an extremely high or an extremely low data value when compared with the rest of the data values
box plot
a graph of a data set obtained by drawing a horizontal line from the minimum data value to Q1, drawing a horizontal line from Q3 to the maximum data value, and drawing a box whose vertical sides pass through Q1 and Q3 with a vertical line inside the box passing through the median or Q2
Probability Experiment
a chance process that leads to well-defined results called outcomes
outcome
the result of a single trial of a probability experiment
sample space
the set of all possible outcomes of a probability experiment
tree diagram
a device consisting of line segments emanating from a starting point and also from the outcome point. It is used to determine all possible outcomes of a probability experiment
event
a set of outcomes of a probability experiment
classical probability
uses sample spaces to determine the numerical probability that an event will happen
empirical probability
relies on actual experience to determine the likelihood of outcomes. P(E)=f/n, the frequency of the class divided by the total number of observations
subjective probability
uses a probability value based on an educated guess or estimate, employing opinions and inexact information
four probability rules
the probability of any event E is a number between 0 and 1; if an event E cannot occur, its probability is 0; if an event E is certain, its probability is 1; the sum of the probabilities of all the outcomes in the sample space is 1
rule for complementary events
if the probability of an event or the probability of its complement is known, then the other can be found by subtracting the probability from 1
mutually exclusive
probability events that cannot occur at the same time
addition rule
when two events A and B are mutually exclusive, the probability that A or B will occur is P(A or B)=P(A)+P(B)
addition rule
if A and B aren't mutually exclusive, then P(A or B)=P(A)+P(B)-P(A and B)
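Both addition rules can be verified by counting outcomes for a single roll of a fair die. This is a small sketch under that assumption; the events and the helper `prob` are made up for illustration, and exact fractions are used so the results can be compared directly.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die

def prob(event):
    # Classical probability: outcomes in the event divided by outcomes in the sample space.
    return Fraction(len(event & sample_space), len(sample_space))

# Mutually exclusive events: rolling a 1 and rolling a 2 cannot both happen.
one, two = {1}, {2}
p_one_or_two = prob(one) + prob(two)

# Not mutually exclusive: 6 is both even and greater than 4,
# so its probability is subtracted once to avoid counting it twice.
even, above_four = {2, 4, 6}, {5, 6}
p_even_or_above_four = prob(even) + prob(above_four) - prob(even & above_four)
```

Here `p_one_or_two` is 1/3 and `p_even_or_above_four` is 2/3, which matches counting the outcomes in each union directly.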
dependent events
when the outcome or occurrence of the first event affects the outcome or occurrence of the second event in such a way that the probability is changed
independent events
two events A and B are independent if the fact that A occurs doesn't affect the probability of B occurring
multiplication rule when events are independent
P(A and B)=P(A)*P(B)
multiplication rule when events are dependent
P(A and B)=P(A)*P(B|A)
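The two multiplication rules differ only in the second factor. The sketch below assumes a standard 52-card deck with 4 aces and compares drawing two aces with replacement (independent draws) and without replacement (dependent draws); the variable names are illustrative.

```python
from fractions import Fraction

p_ace = Fraction(4, 52)  # probability that the first card is an ace

# Independent: the first card is put back, so the second draw is unchanged.
p_two_aces_with_replacement = p_ace * p_ace

# Dependent: the first ace is kept out, leaving 3 aces among 51 cards.
p_second_ace_given_first = Fraction(3, 51)
p_two_aces_without_replacement = p_ace * p_second_ace_given_first
```

The results are 1/169 with replacement and 1/221 without replacement.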
rule for "at least" problems
P(E)=1-P(E'), where E' is the complement of E; for example, P(at least one)=1-P(none)
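An "at least one" probability is usually easiest to find through the complement. The following sketch assumes independent trials that all have the same success probability, which is an assumption of the example rather than part of the rule itself; `p_at_least_one` is a hypothetical helper name.

```python
from fractions import Fraction

def p_at_least_one(p_success, trials):
    # Complement rule: P(at least one success) = 1 - P(no successes).
    p_none = (1 - p_success) ** trials
    return 1 - p_none

# At least one head in three tosses of a fair coin.
p_heads = p_at_least_one(Fraction(1, 2), 3)
```

For three tosses the probability of no heads is 1/8, so the probability of at least one head is 7/8.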
fundamental rule for counting
in a sequence of n events in which the first one has k1 possibilities, the second event has k2, the third has k3, and so forth, the total number of possibilities of the sequence will be k1*k2*k3...kn
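The counting rule can be checked by listing every possible sequence and comparing the count with the product k1*k2*k3. The clothing choices below are invented for the example.

```python
import itertools
import math

shirts = ["red", "blue", "green"]                    # k1 = 3
pants = ["jeans", "khakis"]                          # k2 = 2
shoes = ["boots", "sneakers", "sandals", "loafers"]  # k3 = 4

# Enumerate every (shirt, pants, shoes) sequence.
outfits = list(itertools.product(shirts, pants, shoes))

# The rule predicts k1 * k2 * k3 sequences without listing them.
predicted = math.prod([len(shirts), len(pants), len(shoes)])
```

Both approaches give 24 possible outfits.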
factorial formula
for any counting number n
n!=n(n-1)(n-2)...1
0!=1
permutation formula
nPr=n!/(n-r)!
combination formula
nCr=n!/((n-r)!r!)
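The permutation and combination formulas, nPr = n!/(n-r)! and nCr = n!/((n-r)! r!), translate directly into code. This sketch writes them out with `math.factorial`; the function names are illustrative, and Python 3.8+ also provides `math.perm` and `math.comb`, which return the same values.

```python
import math

def permutations(n, r):
    # Ordered arrangements of r objects chosen from n: n! / (n - r)!
    return math.factorial(n) // math.factorial(n - r)

def combinations(n, r):
    # Unordered selections of r objects chosen from n: n! / ((n - r)! * r!)
    return math.factorial(n) // (math.factorial(n - r) * math.factorial(r))

five_p_three = permutations(5, 3)
five_c_three = combinations(5, 3)
```

Choosing 3 objects from 5 gives 60 permutations but only 10 combinations, because each combination can be ordered in 3! = 6 ways.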
random variable
a variable whose values are determined by chance
probability distribution
the values a random variable can assume and the corresponding probabilities of the values
discrete probability distribution
consists of the values a random variable can assume and the corresponding probabilities of the values. the probabilities are determined theoretically or by observation
binomial experiment
a probability experiment in which there is a fixed number of trials; each trial can have only two outcomes, or outcomes that can be reduced to two outcomes; the outcomes of each trial must be independent of one another; and the probability of a success must remain the same for each trial
binomial distribution
the outcomes of a binomial experiment and the corresponding probabilities of these outcomes
probability problems using formula
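P(x)=n!/((n-x)!x!) * p^x * q^(n-x), where n is the number of trials, x is the number of successes, p is the probability of a success, and q=1-p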
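The binomial formula, with n trials, x successes, success probability p, and q = 1 - p, can be sketched in a few lines. The coin-toss numbers are illustrative and the function name is hypothetical.

```python
import math

def binomial_probability(n, x, p):
    # P(x) = n! / ((n - x)! x!) * p^x * q^(n - x), with q = 1 - p
    q = 1 - p
    return math.comb(n, x) * p**x * q**(n - x)

# Probability of exactly 2 heads in 3 tosses of a fair coin.
p_two_heads = binomial_probability(n=3, x=2, p=0.5)

# The probabilities for x = 0, 1, 2, 3 should add up to 1.
total = sum(binomial_probability(3, x, 0.5) for x in range(4))
```

Exactly 2 heads in 3 tosses has probability 3 * 0.25 * 0.5 = 0.375.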
probability problems using table b
in Table B (Appendix C), go to the section for the given n, find the row for the given x, and move across to the column for the given p; the entry there is the probability
normal distribution
when the data values are evenly distributed about the mean
skewed distribution
when the majority of the data values fall to the left or right of the mean
properties of theoretical normal distribution
a normal distribution curve is bell-shaped. the mean, median, and mode are equal and are located at the center of the distribution. the curve is unimodal. the curve is symmetric about the mean, which is equivalent to saying that its shape is the same on both sides of a vertical line passing through the center. the curve is continuous: no gaps or holes. the curve never touches the x axis. the total area under a normal distribution curve is equal to 1.00. the area under the part of a normal curve that lies within 1 standard deviation of the mean is approximately 0.68; within 2 standard deviations, about 0.95; and within 3 standard deviations, about 0.997.
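The areas quoted above (about 0.68, 0.95, and 0.997) can be reproduced with `statistics.NormalDist` from the Python standard library. The sketch uses a mean of 0 and a standard deviation of 1 for convenience; the same areas hold for any normal distribution when distances are measured in standard deviations.

```python
from statistics import NormalDist

curve = NormalDist(mu=0, sigma=1)

def area_within(k):
    # Area under the curve between k standard deviations below and above the mean.
    return curve.cdf(k) - curve.cdf(-k)

areas = {k: area_within(k) for k in (1, 2, 3)}
```

Rounded, `areas` holds roughly 0.683, 0.954, and 0.997 for 1, 2, and 3 standard deviations.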
standard normal distribution
a normal distribution with a mean of 0 and a standard deviation of 1
sampling distribution of sample means
a distribution using the means computed from all possible random samples of a specific size taken from a population
properties of the distribution of sample means
the mean of the sample means will be the same as the population mean; the standard deviation of the sample means will be smaller than the standard deviation of the population, and it will be equal to the population standard deviation divided by the square root of the sample size.
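Both properties can be checked exactly on a tiny population by listing every possible sample. The sketch below assumes a population of the six faces of a die and samples of size 2 drawn with replacement, so there are 36 equally likely samples; these choices are made up for the example.

```python
import itertools
import math
import statistics

population = [1, 2, 3, 4, 5, 6]  # hypothetical population
n = 2                            # sample size

# Every possible ordered sample of size n, drawn with replacement.
samples = itertools.product(population, repeat=n)
sample_means = [statistics.mean(sample) for sample in samples]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.pstdev(sample_means)

# Population standard deviation divided by the square root of the sample size.
expected_sd = statistics.pstdev(population) / math.sqrt(n)
```

The mean of the sample means equals the population mean of 3.5, and `sd_of_means` agrees with `expected_sd` (about 1.21, smaller than the population standard deviation of about 1.71).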
central limit theorem
as the sample size n increases without limit, the shape of the distribution of the sample means taken with replacement from a population with mean μ and standard deviation σ will approach a normal distribution
parameter compared to statistics
parameter is for population, statistic is for sample
properties of a good estimator
unbiased, consistent, relatively efficient
interval estimate
an interval or range of values used to estimate the parameter
point estimate
a specific numerical value estimate of a parameter
confidence level
the probability that the interval estimate will contain the parameter, assuming that a large number of samples are selected and that the estimation process on the same parameter is repeated
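One common interval estimate is the z interval for a mean, sample mean ± z * σ/√n, which applies when the population standard deviation σ is known. The cards do not give this formula, so the sketch below is an assumption added for illustration, with made-up numbers and a hypothetical function name.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    # Critical z value that leaves alpha / 2 in each tail of the standard normal curve.
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Margin of error: critical value times the standard error of the mean.
    margin = z_crit * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

low, high = z_confidence_interval(sample_mean=50, sigma=10, n=100)
```

With these numbers the point estimate is 50 and the 95% interval estimate runs from about 48.04 to 51.96.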
t distribution compared to z distribution
the variance of the t distribution is greater than 1; the t distribution is actually a family of curves based on the concept of degrees of freedom; as the sample size increases, the t distribution approaches the standard normal distribution
statistical hypothesis
a conjecture about a population parameter; may or may not be true
null hypothesis
H0; a statistical hypothesis that states that there is no difference between a parameter and a specific value, or between two parameters
alternative hypothesis
H1; a statistical hypothesis that states the existence of a difference between a parameter and a specific value, or states that there is a difference between two parameters
statistical test
uses the data obtained from a sample to make a decision about whether the null hypothesis should be rejected
test value
the numerical value obtained from a statistical test
Type I Error
if you reject the null hypothesis when it is true
Type II Error
occurs if you don't reject the null hypothesis when it's false
Level of significance
the maximum probability of committing a type I error. alpha
critical value
separates the critical region from the noncritical region
critical region
(rejection region) the range of values of the test value that indicates that there is a significant difference and that the null hypothesis should be rejected
noncritical region
the range of values of the test value that indicates that the difference was probably due to chance and that the null hypothesis shouldn't be rejected
one-tailed test
indicates that the null hypothesis should be rejected when the test value is in the critical region on one side of the mean
two-tailed test
the null hypothesis should be rejected when the test value is in either of the two critical regions
z-test
a statistical test for the mean of a population. used when n is greater than or equal to 30 or when the population is normally distributed and population standard deviation is known
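A two-tailed z-test ties together the test value, critical value, and critical region. The sketch below uses the usual test value z = (sample mean - hypothesized mean) / (σ/√n); the numbers, the 0.05 level of significance, and the function name are assumptions for the example.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    # Test value: how many standard errors the sample mean is from the hypothesized mean.
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # Two-tailed critical value: alpha is split evenly between the two tails.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Reject the null hypothesis when the test value falls in either critical region.
    reject = abs(z) > critical
    return z, critical, reject

z, critical, reject = z_test(sample_mean=52, mu0=50, sigma=10, n=100)
```

Here the test value is 2.0 and the critical values are about ±1.96, so the test value lies in the critical region and the null hypothesis is rejected.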
t-test
a statistical test for the mean of a population; used when the population is normally or approximately normally distributed and the population standard deviation is unknown
large independent samples
large samples that are not related
small independent samples
small samples that are not related
dependent samples
samples in which the subjects are paired or matched in some way, i.e. the samples are related
correlation
correlation is a statistical method used to determine whether a relationship between variables exists.
regression
regression is a statistical method used to describe the nature of the relationship between variables
simple relationships
(simple regression) there is one independent variable that is used to predict the dependent variable
multiple relationships (multiple regression)
two or more independent variables are used to predict one dependent variable
positive relationship
exists when both variables increase or decrease at the same time
negative relationship
as one variable increases, the other variable decreases, and vice versa
independent variable
explanatory variable or a predictor variable
dependent variable
a response variable
scatter plot
a graph of the ordered pairs (x,y) of numbers consisting of the independent variable x and the dependent variable y
PPMC (Pearson product moment correlation coefficient)
a statistic used to determine the strength of a relationship when the variables are normally distributed
Correlation coefficient
a statistic or parameter that measures the strength and direction of a linear relationship between two variables
regression line
the line of best fit of the data
coefficient of determination
a measure of the variation of the dependent variable that is explained by the regression line and the independent variable
standard error of estimate
the standard deviation of the observed y values about the predicted y' values in regression and correlation analysis
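The correlation and regression terms above can be computed from first principles on a small data set. This is a minimal sketch with made-up (x, y) pairs; it uses the least-squares line and divides by n - 2 for the standard error of estimate, which are the usual textbook conventions but are assumptions here since the cards do not state the formulas.

```python
import math

x = [1, 2, 3, 4, 5]  # independent (explanatory) variable, made-up values
y = [2, 4, 5, 4, 5]  # dependent (response) variable, made-up values
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Correlation coefficient: strength and direction of the linear relationship.
r = sxy / math.sqrt(sxx * syy)

# Regression line (line of best fit): predicted y = intercept + slope * x
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Coefficient of determination: share of the variation in y explained by the line.
r_squared = r ** 2

# Standard error of estimate: spread of observed y values about the predicted values.
predicted = [intercept + slope * xi for xi in x]
sse = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
std_error = math.sqrt(sse / (n - 2))
```

For this data r is about 0.775 (a positive relationship), the regression line is y' = 2.2 + 0.6x, the coefficient of determination is 0.6, and the standard error of estimate is about 0.894.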