55 Cards in this Set
- Front
- Back
Statistics is concerned with:
|
1. processing and analysing data.
2. collecting, presenting and transforming data to assist decision makers. |
|
Population
|
A population consists of all the members of a group about which you want to draw a conclusion.
|
|
Sample
|
A sample is the portion of the population selected for analysis.
|
|
Parameter
|
A parameter is a numerical measure that describes a characteristic of a population.
|
|
Statistic
|
A statistic is a numerical measure that describes a characteristic of a sample.
|
|
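The difference between a parameter and a statistic can be illustrated with a minimal sketch. The population of weights below is simulated and purely hypothetical; the names `population`, `population_mean` and `sample_mean` are assumptions made for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: weights (kg) of 1000 individuals.
population = [random.gauss(100, 15) for _ in range(1000)]

# Parameter: a numerical measure of the whole population.
population_mean = statistics.mean(population)

# Statistic: the same measure computed from a sample only.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two values will usually be close but not identical, which is why a statistic is treated as an estimate of the parameter.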
What are the two branches of statistics?
|
1. Descriptive Statistics
2. Inferential Statistics |
|
What is descriptive statistics concerned with?
|
Collecting, summarising and presenting data.
|
|
What are some examples of descriptive statistics?
|
Collecting - survey
Presenting - tables and graphs
Characterising - sample mean |
|
What is inferential statistics?
|
Drawing conclusions about a population based only on sample data.
|
|
What are some examples of inferential statistics?
|
Estimation - estimate the population's mean weight using a sample mean weight.
Hypothesis testing - test the claim that the population's mean weight is 100 kg. |
|
What is univariate data?
|
Data that contains only one variable.
eg: mean salaries |
|
What is bivariate data?
|
Data that contains only two variables.
|
|
What is multivariate data?
|
Data that records several variables for each 'individual'.
eg: a table of salaries from each state. |
|
What are the two forms of data?
|
1. Primary Data
2. Secondary Data. |
|
What is primary data? Give some forms.
|
Primary data is data that is collected specifically to answer your question of interest.
Forms: observation, experiment, survey |
|
What is secondary data? Give its forms.
|
Secondary data is data that has been collected by somebody else for another purpose.
Forms: print or electronic (internet). |
|
What is a statistical experiment?
|
In an experiment, the researcher actively changes some characteristic of the units before the data is collected.
Therefore some of the variables' values are under the control of the experimenter. |
|
What is an observational study?
|
In an observational study, data is collected by passively recording (observing) values from each unit.
eg: a survey. |
|
What are the two types of data?
|
1. Categorical
2. Numerical |
|
What is categorical data and what are its levels of measurement?
|
Categorical data values are selected from a small group of categories.
Levels of measurement:
1. Ordinal
2. Nominal |
|
What are ordinal categorical variables?
|
Ordinal categorical variables have categories that can be meaningfully ordered.
eg: service quality rating. |
|
What are nominal categorical variables?
|
Nominal categorical variables have categories that are equally meaningful, with no natural order.
eg: a student's religion (Christian, atheist, Jewish) |
|
What are the two forms of numerical variables?
|
1. Discrete
2. Continuous |
|
What is a discrete numerical variable?
|
A variable whose values are whole numbers (counts)
eg: Number of items purchased by a customer at a supermarket. |
|
What is a continuous numerical variable?
|
A variable that may contain any value within some range.
Measured characteristics. eg: the amount of time that the customer spends in the supermarket. |
|
What is the categorical 'trick' that could be encountered?
|
Sometimes categorical variables are CODED as numbers when the data is recorded.
eg: 0 for male, 1 for female. The variable is still categorical despite the numbers. |
|
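A minimal sketch of the coding 'trick', using made-up codes: the numbers are only labels, so they are decoded and summarised with counts rather than averaged. The `labels` mapping and the data are hypothetical.

```python
from collections import Counter

# Hypothetical recorded data: 0 = male, 1 = female (codes, not quantities).
codes = [0, 1, 1, 0, 1]
labels = {0: "male", 1: "female"}

# Decode the numbers back into categories.
decoded = [labels[code] for code in codes]

# Counts are a sensible summary for a categorical variable;
# an average of the codes would depend on the arbitrary coding.
counts = Counter(decoded)
print(counts)
```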
What is unstacked data?
|
Data that is presented in a separate list for each group.
|
|
What is stacked data?
|
Data that is presented as a single list alongside a categorical variable.
|
|
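The two layouts can be sketched as follows. The state names and salary figures are invented for illustration only.

```python
# Unstacked: a separate list of (hypothetical) salaries for each group.
unstacked = {
    "NSW": [52, 61, 58],
    "VIC": [49, 55],
}

# Stacked: a single list of values, each paired with a categorical
# variable that records which group the value came from.
stacked = [
    (group, value)
    for group, values in unstacked.items()
    for value in values
]

for group, value in stacked:
    print(group, value)
```

Stacked data keeps every observation in one list, which tends to be the layout statistical software expects when comparing groups.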
What are the formats used to show the relationship between two numerical values?
|
- Correlation
- Least squares |
|
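As a rough illustration of both formats, the sketch below computes the correlation coefficient and the least squares line by hand for a small invented data set. The function name `correlation_and_line` and the data are assumptions, not part of the card set.

```python
from math import sqrt

def correlation_and_line(x, y):
    """Return (r, slope, intercept) for paired numerical data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-products.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / sqrt(sxx * syy)          # correlation coefficient
    slope = sxy / sxx                  # least squares slope
    intercept = mean_y - slope * mean_x
    return r, slope, intercept

# Hypothetical data: minutes in the supermarket and items purchased.
minutes = [10, 20, 30, 40]
items = [3, 5, 7, 9]
r, slope, intercept = correlation_and_line(minutes, items)
print(r, slope, intercept)
```

Correlation measures the strength of the linear relationship, while the least squares line describes its form.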
What are the formats used to show the relationship between two categorical values?
|
- Contingency table
- Conditional proportions |
|
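A contingency table and its conditional proportions can be sketched with a few invented records. The variable names and the (gender, rating) data are hypothetical.

```python
from collections import Counter

# Hypothetical records: (gender, service quality rating).
records = [
    ("male", "good"), ("male", "poor"), ("female", "good"),
    ("female", "good"), ("male", "good"),
]

# Contingency table: count of records in each (gender, rating) cell.
table = Counter(records)

# Row totals: number of records for each gender.
row_totals = Counter(gender for gender, _ in records)

# Conditional proportions: share of each rating within a gender.
conditional = {
    cell: count / row_totals[cell[0]]
    for cell, count in table.items()
}

for cell, proportion in conditional.items():
    print(cell, round(proportion, 2))
```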
What is a census?
|
Measurements that are made from every item in the target population.
|
|
Why is completing a census often not possible?
|
- the cost and time required
- recording some variables destroys the units, eg: testing the strength of seatbelts leaves them damaged. |
|
What are the two types of samples used?
|
1. non-probability sample
2. probability sample |
|
What is a non-probability sample?
|
Items included are chosen without regard to their probability of occurrence.
|
|
What are some examples of non-probability samples?
|
1. judgement
2. chunk
3. quota
4. convenience |
|
What is a probability sample?
|
Items in the sample are chosen on the basis of known probabilities.
|
|
What are some examples of probability samples?
|
1. simple random
2. systematic
3. stratified
4. cluster |
|
What is a simple-random sample?
|
Each unit has the same chance of being selected.
A random mechanism is used to determine which units are included in the sample. Selection may be with or without replacement. |
|
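A minimal sketch of simple random sampling with Python's standard `random` module, showing selection both without and with replacement. The frame of 200 numbered units is hypothetical.

```python
import random

rng = random.Random(42)          # seeded so the sketch is repeatable
frame = list(range(1, 201))      # hypothetical frame of 200 units

# Without replacement: a unit can appear at most once.
without_replacement = rng.sample(frame, 20)

# With replacement: a unit may be selected more than once.
with_replacement = rng.choices(frame, k=20)

print(sorted(without_replacement))
print(sorted(with_replacement))
```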
How do you calculate a systematic sample?
|
1. Decide on a sample size (n)
2. Divide the frame of N individuals into groups of k individuals, where k = N/n.
3. Randomly select one individual from the first group.
4. Select every kth individual thereafter. |
|
What does k = N/n mean?
|
N = the size of the population (eg: 200)
n = the sample size (eg: 20)
k = the sampling interval: every kth value is included in the sample (eg: 200/20 = 10, so every 10th value is included in the sample). |
|
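The systematic sampling steps, with k = N/n, can be sketched as below. This is only an illustration under the assumption that N is an exact multiple of n; the function name `systematic_sample` and the frame are hypothetical.

```python
import random

def systematic_sample(frame, n, rng):
    """Pick a random start in the first group, then every kth unit."""
    k = len(frame) // n              # k = N / n (assumes N divisible by n)
    start = rng.randrange(k)         # random position within the first group
    return [frame[start + i * k] for i in range(n)]

frame = list(range(1, 201))          # hypothetical frame, N = 200
sample = systematic_sample(frame, 20, random.Random(0))
print(sample)
```

With N = 200 and n = 20 the interval is k = 10, so consecutive sampled units are always 10 apart.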
What are strata?
|
Subgroups
|
|
How do you complete a stratified sample?
|
1. Divide the population into strata according to some common characteristic.
2. A simple random sample is selected from each stratum, with sample sizes proportional to strata sizes.
3. Samples from the strata are combined into one. |
|
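The stratified sampling steps might look like this in code. It is a sketch only: the strata, their sizes and the function name `stratified_sample` are invented, and the rounding used for the proportional sizes can make the total differ slightly from n in other cases.

```python
import random

def stratified_sample(strata, n, rng):
    """Draw a simple random sample from each stratum, sized proportionally."""
    total = sum(len(units) for units in strata.values())
    combined = []
    for name, units in strata.items():
        size = round(n * len(units) / total)   # proportional allocation
        combined.extend(rng.sample(units, size))
    return combined

# Hypothetical strata: 60 undergraduates and 40 postgraduates.
strata = {
    "undergraduate": [f"U{i}" for i in range(60)],
    "postgraduate": [f"P{i}" for i in range(40)],
}
sample = stratified_sample(strata, 10, random.Random(0))
print(sample)
```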
How do you perform a cluster sample?
|
1. Divide the population into several 'clusters', each representative of the population.
2. A simple random sample of clusters is selected.
NB: all items in the selected clusters can be used, or items can be chosen from a cluster using another probability sampling technique. |
|
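A minimal sketch of cluster sampling, using all items in the selected clusters. The store names and units are hypothetical.

```python
import random

rng = random.Random(0)

# Hypothetical clusters: customers grouped by store.
clusters = {
    "store_A": ["a1", "a2", "a3"],
    "store_B": ["b1", "b2"],
    "store_C": ["c1", "c2", "c3", "c4"],
    "store_D": ["d1", "d2"],
}

# Step 2: take a simple random sample of clusters, not of individuals.
chosen = rng.sample(sorted(clusters), 2)

# Here every item in each selected cluster is used.
sample = [unit for name in chosen for unit in clusters[name]]
print(chosen, sample)
```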
What is the advantage and disadvantage of simple random sampling and systematic sampling?
|
A: simple to use.
D: may not be a good representation of the population's underlying characteristics. |
|
What is the advantage of a stratified sample?
|
Ensures representation of individuals across the entire population.
|
|
What is the advantage and disadvantage of cluster sampling?
|
A: more cost-effective.
D: less efficient (need larger sample to acquire the same level of precision.) |
|
What are the five types of survey errors?
|
1. Coverage error
2. Non-response error
3. Sampling error
4. Instrument error
5. Interviewer error |
|
What is a coverage error?
|
It occurs when the sample is not selected from the target population but only a PART of the target population.
|
|
What is non-response error?
|
1. Failure to contact the individuals (eg: in a phone survey some numbers won't be answered)
2. Refusal to participate in the survey.
3. Refusal to answer specific questions (eg: salary) |
|
What do coverage and non-response errors have in common?
|
Both result in missing responses caused by failure to obtain information from some population members.
|
|
What is a sampling error?
|
The estimated mean or proportion is unlikely to be exactly the same as the underlying population parameter that is being estimated.
|
|
What is a non-sampling error?
|
eg: a person may refuse to be part of a sample. Non-sampling errors can be worse than sampling errors because:
- it is extremely difficult to assess their likely size
- they often distort estimates by pulling them in one direction, so the estimates are biased. |
|
What is an instrument error?
|
This results from poorly designed questions. Different wordings can lead to different answers being given.
eg: a leading question. |
|
What is an interviewer error?
|
This occurs when some characteristic of the interviewer (eg: gender, age) affects the way in which respondents answer questions.
|
|
What are the four ways to evaluate survey worthiness?
|
1. What is the purpose of the survey?
2. Are the questions appropriate and unambiguous?
3. Is the survey based on a probability sample?
4. Coverage error - is it an appropriate frame? |