22 Cards in this Set

  • Front
  • Back

What are the two types of data you can encounter?

1) Categorical


2) Continuous

What is referred to as categorical data?

Data points consist of distinct categories.



No inherent order or numerical meaning.

What is referred to as continuous data?

Measurements of any numerical value within a range.

What are the measurements for categorical data? (4)

1) Frequency count


2) Mode


3) Proportion


4) Cross-tabulation (contingency table)

What's cross-tabulation?

Also known as a contingency table, it relates two categorical variables to each other.



One variable forms the rows (e.g. male/female) and the other forms the columns (e.g. employed/unemployed).
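
As a rough illustration of the four categorical measurements, the sketch below uses only Python's standard library. The `records` list is made-up sample data, not taken from the cards.

```python
from collections import Counter

# Hypothetical survey records: (gender, employment status)
records = [
    ("female", "employed"),
    ("male", "employed"),
    ("female", "unemployed"),
    ("female", "employed"),
    ("male", "unemployed"),
]

genders = [gender for gender, _ in records]

# 1) Frequency count: how often each category occurs
freq = Counter(genders)

# 2) Mode: the most frequent category
mode = freq.most_common(1)[0][0]

# 3) Proportion: each count divided by the total number of observations
proportions = {category: count / len(genders) for category, count in freq.items()}

# 4) Cross-tabulation: counts for every (row, column) combination
crosstab = Counter(records)

print("frequency:", dict(freq))
print("mode:", mode)
print("proportions:", proportions)
for row in ("female", "male"):
    print(row, [crosstab[(row, col)] for col in ("employed", "unemployed")])
```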

What are the measurements for continuous data? (4)

1) Central Tendency


2) Spread


3) Shape (Skewness)


4) Correlation and Regression

What are the measures of Central Tendency? (3)

1) Mean


2) Median


3) Mode
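
A minimal sketch of the three measures using Python's `statistics` module. The `values` list is invented for illustration.

```python
import statistics

# Hypothetical sample
values = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(values)      # sum divided by count: 30 / 6
median = statistics.median(values)  # middle of the sorted data: (3 + 5) / 2
mode = statistics.mode(values)      # most frequent value

print(mean, median, mode)
```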

What are the measures of Spread? (5)

1) Range


2) Interquartile Range (IQR)


3) Variance


4) Standard Deviation


5) Coefficient of Variation
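
The five measures can be sketched with the `statistics` module as follows. The data is made up, and the population formulas (`pvariance`, `pstdev`) are an assumption chosen for round numbers; `statistics.variance` and `statistics.stdev` give the sample versions.

```python
import statistics

# Hypothetical sample with mean 5
values = [2, 4, 4, 4, 5, 5, 7, 9]

# 1) Range: largest value minus smallest value
value_range = max(values) - min(values)

# 2) IQR: third quartile minus first quartile
#    (quantiles() uses its default "exclusive" method here; other
#    quartile conventions give slightly different cut points)
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# 3) Variance: mean squared deviation from the mean (population version)
variance = statistics.pvariance(values)

# 4) Standard deviation: square root of the variance
std_dev = statistics.pstdev(values)

# 5) Coefficient of variation: standard deviation relative to the mean
cv = std_dev / statistics.mean(values)

print(value_range, iqr, variance, std_dev, cv)
```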

What are the measures of Shape? (2)

1) Skewness (symmetry)


2) Kurtosis (flatness)

Moments are statistical ___ used to describe ___ of a ___ distribution.



Why the term moment?

Measurements, characteristics, probability



A moment of a mathematical function describes its shape and behavior.

Enumerate the moments of statistics (4)

1) Central Tendency: Mean (mu)


2) Spread: Variance (sigma squared)


3) Symmetry: Skewness (gamma)


4) Flatness: Kurtosis (kappa)
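
One way to compute the third and fourth moments is from central moments, as sketched below. The helper names and data are hypothetical, and the formulas are the plain population versions without bias correction. Under this definition a normal distribution has kurtosis 3.

```python
def central_moment(xs, k):
    """k-th central moment: average of (x - mean) ** k."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** k for x in xs) / len(xs)

def skewness(xs):
    """Third standardized moment: 0 for perfectly symmetric data."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def kurtosis(xs):
    """Fourth standardized moment (not excess kurtosis)."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]

print(skewness(symmetric), kurtosis(symmetric))
print(skewness(right_skewed))
```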

Null hypothesis (H0) claims that there is NO significant ___ between ___, or NO ___ between ___.



The word "null" stands for the ___ of ___.

Correlation, variables, effect, treatments



Absence, relationship

The p-value is the ___ of data at least as extreme as the observed data to ___ if the null hypothesis were ___.



You choose a ___ level (a) in advance; if ___ < ___, then it's OK to ___ the null hypothesis.

Probability, occur, true



Significance, p < a, reject
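
To make the decision rule concrete, here is a small sketch with an invented scenario: a coin lands heads 60 times in 100 flips, and the null hypothesis is that the coin is fair. The function name and the one-sided test are assumptions made for this example.

```python
from math import comb

def binom_p_value_upper(k, n, p0=0.5):
    """One-sided p-value: P(X >= k) when X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))

alpha = 0.05                             # significance level chosen in advance
p_value = binom_p_value_upper(60, 100)   # probability of 60+ heads under H0
reject_null = p_value < alpha            # decision rule: reject H0 if p < alpha

print(p_value, reject_null)
```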

Correlation is the measure of ___ and ___ of a linear ___ between two ___.



Correlation doesn't imply ___; it simply indicates the ___ to which changes in one variable are ___ with changes in the other.

Strength and direction


Relationship, variables



Causation


Degree, associated
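
Strength and direction can both be read off the Pearson correlation coefficient. The sketch below computes it by hand on made-up data: the sign gives the direction, and the magnitude (between 0 and 1) gives the strength.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)

xs = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # moves with xs: perfect positive correlation
falling = [10, 8, 6, 4, 2]   # moves against xs: perfect negative correlation

print(pearson_r(xs, rising), pearson_r(xs, falling))
```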

Regression is used to ___ the relationship between a ___ variable and one or more ___ variables.



Regression helps ___ the value of a ___ variable based on the values of ___ variables.

Model, dependent, independent



Predict, dependent, independent

What are the measures of correlation? (2)

1. Correlation coefficients


2. Scatter plots

What are the measures of regression? (5)

1. Regression analysis


2. Coefficient of determination


3. Model evaluation


4. Residual analysis


5. Significance testing

Give an example of Regression Analysis

Regression analysis can help you build a mathematical model that, for instance, relates years of experience (independent variable) to salary (dependent variable) and make predictions for new individuals.


E.g. Salary = 1000 + 50 * Years / 4
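
A minimal sketch of how such a model could be fitted with ordinary least squares. The `fit_line` helper and the data are hypothetical; the salaries were chosen to lie exactly on a line so the fitted coefficients are easy to check.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope

years = [1, 2, 3, 4, 5]                    # independent variable
salary = [1050, 1100, 1150, 1200, 1250]    # dependent variable (made up)

intercept, slope = fit_line(years, salary)

# Predict the salary of a new individual with 6 years of experience
predicted_salary = intercept + slope * 6
print(intercept, slope, predicted_salary)
```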

Give an example of Coefficient of Determination

After performing a regression analysis, you calculate an R-squared value of let's say 0.85.


This means that 85% of the variability in the dependent variable can be explained by the independent variable, indicating a strong correlation between the two.
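
R-squared can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below assumes you already have a model's predictions; both lists are invented numbers.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

actual = [3, 5, 7, 9]
predicted = [2.5, 5.5, 6.5, 9.5]   # hypothetical model output

r2 = r_squared(actual, predicted)  # SS_res = 1, SS_tot = 20
print(r2)
```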

Give an example of Residual Analysis

Residual analysis involves plotting the differences between the predicted prices and actual prices to check for patterns or trends. If you notice a pattern in the residuals, it may indicate that your model has systematic errors that need to be addressed.
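
The sketch below shows what a pattern in the residuals can look like. The prices are made up: the actual values follow a curve, while the predictions are assumed to come from a straight-line model, so the residuals form a U shape instead of scattering randomly around zero.

```python
# Hypothetical data: actual prices are curved, predictions are linear
actual_prices = [1, 4, 9, 16, 25]
predicted_prices = [-1, 5, 11, 17, 23]

# Residual = actual value minus predicted value
residuals = [a - p for a, p in zip(actual_prices, predicted_prices)]

# Positive at both ends and negative in the middle: a systematic pattern,
# suggesting the straight-line model misses curvature in the data
print(residuals)
```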

Give an example of Significance Testing

In a drug trial, you perform a significance test, such as a t-test or ANOVA, to determine whether there is a statistically significant difference between the treatment and control groups.


This helps you determine whether the drug has a real effect or if the results could have occurred by chance.
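
The card names the t-test and ANOVA. The sketch below swaps in a different technique, an exact permutation test, because it needs only the standard library. It asks how often a random relabeling of the pooled measurements produces a difference in group means at least as large as the observed one. The function name and the measurements are hypothetical.

```python
from itertools import combinations

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference in means."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    total = sum(pooled)
    extreme = splits = 0
    # Enumerate every way of assigning n_a of the pooled values to group A
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        splits += 1
        if diff >= observed - 1e-12:   # tolerance for float rounding
            extreme += 1
    return extreme / splits

treatment = [8, 9, 10, 11]   # made-up outcome scores
control = [1, 2, 3, 4]

p_value = permutation_p_value(treatment, control)
print(p_value, p_value < 0.05)
```

Full enumeration is only practical for small groups; larger samples would typically use random relabelings instead.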

Give an example of Model Evaluation

You have trained a model to classify emails as spam or not. To evaluate its performance, you use a dataset of 1,000 emails with known labels. You measure its accuracy, precision, recall, and F1-score to assess how well it classifies emails correctly.
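
The four metrics can be derived from the counts of true/false positives and negatives. The sketch below uses ten made-up labels instead of 1,000 emails, with 1 meaning spam and 0 meaning not spam.

```python
# Hypothetical labels: 1 = spam, 0 = not spam
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # spam caught
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate mail flagged
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # spam missed
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate mail passed

accuracy = (tp + tn) / len(pairs)    # share of all emails classified correctly
precision = tp / (tp + fp)           # share of flagged emails that are spam
recall = tp / (tp + fn)              # share of spam emails that were flagged
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(accuracy, precision, recall, f1)
```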