43 Cards in this Set

  • Front
  • Back

There are four main components of statistics; what does the first component, or the Design component, refer to?

A plan on how to OBTAIN DATA needed to answer the question.

What does the Description component refer to?

Using tables, graphs, etc to summarize and analyze the data.

What does the third component (probability) refer to?

Determining how the sample differs from the population.

What does the fourth and final component (also called inference) refer to?

Make decisions and predictions.

What is the difference between Descriptive Statistics and Inferential Statistics?

Descriptive statistics uses graphs and numerical summaries and can be applied to samples or populations. Inferential statistics uses data from samples to make predictions about populations.

When is Inferential Statistics not necessary?

When we already have information about the whole population and there is no need to make a prediction (e.g., the mean age of a population).

A survey of all graduating students was taken, and numerical summaries were computed for the average starting salary and the percentage of students earning more than $30k/year.

1.) Are these summaries descriptive or inferential?

2.) Are the numerical summaries statistics or parameters?

1.) Descriptive - they summarize data from a population.

2.) Parameters - they refer to a population.

What is a Response Variable?

The variable we are interested in measuring (dependent variable; y-axis).

What is a Component?

What you are actually simulating through the use of a random device.

True or False? In a particular study, you could use descriptive statistics or inferential statistics, but you would rarely use both.

False. We often want to describe the sample and make inferences (predictions) about the population.

Inferential statistics are often used to draw a conclusion. What are some methods for drawing conclusions about a population and measuring their reliability?

Confidence intervals and hypothesis tests.

What is a categorical variable?

A qualitative, non-numerical variable with different categories. Ex: types of pets.

What is a Quantitative variable?

A numerical, measurable variable. Ex: GPA.

Which of Qualitative and Quantitative variables are further categorized into discrete and continuous?

Quantitative.

What is a discrete, quantitative variable?

Values that form a set of separate numbers, often whole numbers. Ex: shoe size, the result of rolling a die. Something we can count.

What is a continuous, quantitative variable?

Values that form a continuum over the real number line. Ex: height, weight, time, age. Something we can measure.

What kind of charts are used for Categorical data?

Pie, Bar, Pareto (specialized bar; ordered in relative frequency)

When would a data set have the same median and mode?

When the distribution is symmetric and unimodal.

If the mean is greater than the median the data are likely skewed which way?

Right. Mean>Median = right-skewed

If the median is larger than the mean the data are likely skewed...

Left. Median>Mean = left-skewed
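
The mean-versus-median rule on the two cards above can be checked on a small data set. This is a minimal sketch: the `incomes` values and the `skew_direction` helper are made up for illustration, and the comparison only suggests a likely skew direction rather than proving one.

```python
from statistics import mean, median

# Hypothetical data: most values are small and one is very large,
# which drags the mean upward but barely moves the median.
incomes = [30, 32, 35, 38, 40, 45, 200]

def skew_direction(data):
    """Guess the likely skew by comparing the mean with the median."""
    m, med = mean(data), median(data)
    if m > med:
        return "right-skewed"
    if m < med:
        return "left-skewed"
    return "roughly symmetric"

print(mean(incomes), median(incomes), skew_direction(incomes))
```

Here the single large value pulls the mean above the median, which is the pattern the "Mean>Median = right-skewed" card describes.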

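What percent of a data set falls within one standard deviation of the mean (in a normal distribution)? Within two standard deviations? Within three?

About 68% within one standard deviation.

About 95% within two standard deviations.

Nearly all (about 99.7%) within three standard deviations.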
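
These percentages can be reproduced from the standard normal curve. The sketch below uses `NormalDist` from Python's standard library (Python 3.8 or later); the helper name `share_within` is an illustrative choice, not part of the flashcard set.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} sd: {100 * share_within(k):.1f}%")
```

The rounded results are close to the 68%, 95%, and "nearly all" figures on the card, which are approximations of the exact normal probabilities.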

A graph of the 5 number summary is just a...?

Boxplot - min, Q1, median, Q3, max
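
As a rough illustration, the five numbers behind a boxplot can be computed with the standard library. Quartile conventions differ between textbooks and software; this sketch assumes the "inclusive" method of `statistics.quantiles`, and the function name and data are hypothetical.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    return (min(data), q1, med, q3, max(data))

values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(five_number_summary(values))
```

A boxplot draws the box from Q1 to Q3 with a line at the median; other quartile methods may give slightly different Q1 and Q3 on the same data.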

Why is the standard deviation (s) preferred over the range?

The range is more affected by an outlier, and the standard deviation uses all of the data.

Why is the IQR sometimes preferred to (s)?

The IQR is not affected by outliers.

What is the advantage of (s) over the IQR?

The standard deviation takes into account all observations.
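
The trade-offs among the range, the standard deviation (s), and the IQR show up when a single outlier is added to a data set. This is a sketch with made-up scores; `spread_summary` is a hypothetical helper, and the IQR assumes the "inclusive" quartile method of `statistics.quantiles`.

```python
from statistics import quantiles, stdev

def spread_summary(data):
    """Return the range, sample standard deviation, and IQR of the data."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    return {"range": max(data) - min(data), "s": stdev(data), "iqr": q3 - q1}

scores = [10, 12, 13, 14, 15, 16, 18]   # hypothetical data
scores_with_outlier = scores + [90]      # same data plus one extreme value

print(spread_summary(scores))
print(spread_summary(scores_with_outlier))
```

With these numbers the range and s grow sharply once the outlier is included, while the IQR changes only slightly.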

How do you describe data distributions?

SOCS - shape, outliers, central tendency, and spread.

What does shape refer to in SOCS?

Modality (number of peaks) - uni, bi, multimodal.


Skewness & Symmetry - left (negative), right (positive), symmetric (bell-shaped or normal)

What does Outliers refer to in SOCS?

Unusual values. A common rule of thumb flags a data value whose z-score is greater than 3 in absolute value (more than three standard deviations from the mean).
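
A z-score measures how many standard deviations a value lies from the mean. The sketch below assumes the common rule of thumb that a z-score beyond 3 in absolute value marks a potential outlier; the data, the threshold default, and the function name are illustrative assumptions.

```python
from statistics import mean, stdev

def flag_outliers(data, threshold=3):
    """Return the values whose z-score exceeds the threshold in absolute value."""
    m, s = mean(data), stdev(data)
    return [x for x in data if abs((x - m) / s) > threshold]

# Hypothetical measurements: 19 values near 50 and one far away.
measurements = [48, 49, 50, 50, 51, 52, 49, 50, 51, 50,
                48, 52, 50, 49, 51, 50, 50, 49, 51, 95]

print(flag_outliers(measurements))
```

Note that in very small samples no value can reach a z-score of 3, so this rule is only informative for moderately large data sets.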

What does the Central Tendency of SOCS refer to?

Mean for symmetric distributions.


Median for skewed distributions

The spread of SOCS refers to?

Standard deviation (used with the mean)

IQR and Range (used with the median)

What is an explanatory variable?

The variable being manipulated (x-axis; independent variable). The explanatory variable explains the response variable.

What is a lurking variable?

An unobserved variable that influences the association between the explanatory and response variables.

True or False? An observational study can establish causation.

False. An observational study can reveal association or correlation.

How is Margin of Error calculated?

≈ 1/[sqrt(n)] x 100%, where n is the sample size

What is the significance of the Margin of Error? Why is it used?

Used to give a range of plausible values for the population parameter.

How do you find the range of plausible values for a given parameter?

= sample statistic (+/-) margin of error




ex: 78%(+/-)2.9% = (75.1%, 80.9%)
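
The approximate margin of error and the resulting range of plausible values can be computed directly. In this sketch the sample size of 1189 is a hypothetical value chosen because it gives a margin close to the 2.9% in the example above; the function names are also illustrative.

```python
from math import sqrt

def approx_margin_of_error(n):
    """Approximate margin of error, in percentage points, for a sample of size n."""
    return 100 / sqrt(n)

def plausible_range(statistic_pct, margin_pct):
    """Range of plausible values: sample statistic plus or minus the margin."""
    return (statistic_pct - margin_pct, statistic_pct + margin_pct)

n = 1189                                  # hypothetical sample size
margin = round(approx_margin_of_error(n), 1)
low, high = plausible_range(78, margin)
print(f"margin = {margin}%, range = ({low:.1f}%, {high:.1f}%)")
```

Because the margin shrinks with the square root of n, quadrupling the sample size only halves the margin of error.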

What is sampling bias?

Sampling method that tends to obtain non-representative samples.

What is under-coverage?

The sampling frame does not represent all parts of the population; a portion is not sampled or has smaller representation in the sample than it has in the population.

What is over-coverage?

Members that are not in the population of interest are included in the sample.

What is non-response bias?

Sampled subjects cannot be reached or refuse to participate, or some do not respond to certain questions, resulting in missing data.

Name three poor ways to sample.

Convenience sample (individuals who are easy to sample, which may not represent the population)

Volunteer sample (a common type of convenience sample)

Large, non-representative sample

What is Cluster Random Sampling?

Divide the population into heterogeneous "clusters" that each resemble the population; randomly select some clusters and take a census of each selected cluster.


What is Stratified Random Sampling?

Divide the population into homogeneous groups, or strata (to reduce variability); then use simple random sampling to choose members from each stratum. Each stratum will be different from the others.
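
The difference between the two designs can be sketched in a few lines. Everything here is hypothetical: the class-year strata, the dorm clusters, the student labels, and the function names are made up for illustration, and a seeded `random.Random` is used only so the run is repeatable.

```python
import random

def stratified_sample(strata, k, rng):
    """Draw a simple random sample of size k from every stratum."""
    return {name: rng.sample(members, k) for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters, then keep every member of each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical strata: students grouped by class year (similar within a group).
strata = {
    "freshman": ["F1", "F2", "F3", "F4"],
    "senior": ["S1", "S2", "S3", "S4"],
}
# Hypothetical clusters: dorms, each assumed to be a mix resembling the campus.
clusters = {
    "dorm_a": ["F1", "S1", "F2"],
    "dorm_b": ["S2", "F3", "S3"],
    "dorm_c": ["F4", "S4"],
}

rng = random.Random(0)  # fixed seed so the example is repeatable
print(stratified_sample(strata, 2, rng))
print(cluster_sample(clusters, 1, rng))
```

Stratified sampling takes a few members from every group, while cluster sampling takes every member from a few randomly chosen groups.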