forked from eggy/eifueo
math: add bias and types of data, expand descriptions
parent d104d80894
commit d060ad4322
@ -5,16 +5,55 @@ The course code for this page is **MHF4U7**.

## 4 - Statistics and probability

!!! note "Definition"
    - **Statistics:** The techniques and procedures used to analyse, interpret, display, and make decisions based on data.
    - **Descriptive statistics:** The use of charts and summary methods to organise, display, and describe data, reducing it to a manageable size.
    - **Inferential statistics:** The use of samples to make judgements about a population.
    - **Data set:** A collection of data with elements and observations, typically in the form of a table. It is similar to a map or dictionary in programming.
    - **Element:** The name of an observation or group of observations, similar to a key in a map/dictionary in programming.
    - **Observation:** The collected data linked to an element, similar to a value in a map/dictionary in programming.
    - **Raw data:** Data collected prior to processing or ranking.
    - **Population**: A collection of all elements of interest within a data set.
    - **Sample**: A selection of elements from a population, chosen to represent that population.

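The map/dictionary analogy used in the definitions above can be made concrete with a minimal Python sketch. The variable `heights_cm`, the names, and the values are all invented for illustration and do not come from the course material.

```python
# Hypothetical data set: each key is an element, each value is its observation.
heights_cm = {
    "Alice": 162.5,
    "Bob": 175.0,
    "Chen": 168.2,
}

elements = list(heights_cm.keys())        # the names attached to the observations
observations = list(heights_cm.values())  # the collected data itself

# Look up the observation linked to one element.
print(heights_cm["Bob"])  # 175.0
```

Here the whole dictionary plays the role of the data set, and taking only some of its keys would correspond to choosing a sample.
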
### Sampling

A good sample:

- represents the relevant features of the full population,
- is large enough to represent the full population reasonably well,
- and is random.

Common sampling methods include the following. Simple, systematic, and stratified sampling rely on random selection, while convenience and quota sampling do not.

- **Simple**: Choosing a sample completely randomly.
- **Convenience**: Choosing a sample based on ease of access to the data.
- **Systematic**: Choosing a random starting point, then choosing the rest of the sample at a consistent interval in a list.
- **Quota**: Choosing a sample whose members have specific characteristics.
- **Stratified**: Choosing a sample so that the proportion of specific characteristics matches that of the population.

??? example
    - Simple: Using a random number generator to pick items from a list.
    - Convenience: Asking the first 20 people met to answer a survey.
    - Systematic: Rolling a die and getting a 6, so choosing the 6th element and every 10th element after that.
    - Quota: Ensuring that all members of the sample wear red jackets.
    - Stratified: The population is 45% male and 55% female, so the sample is also 45% male and 55% female.

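The simple, systematic, and stratified methods described above could be sketched in Python roughly as follows. The function names, the `people` list, and the fixed seed are assumptions made for illustration. The systematic version picks its starting point within the first interval, which is one common variant and differs slightly from the die-roll example.

```python
import random

def simple_sample(population, k, rng):
    """Simple sampling: every element is equally likely to be chosen."""
    return rng.sample(population, k)

def systematic_sample(population, interval, rng):
    """Systematic sampling: random start, then every `interval`-th element."""
    start = rng.randrange(interval)
    return population[start::interval]

def stratified_sample(population, key, k, rng):
    """Stratified sampling: each group's share of the sample matches the population."""
    groups = {}
    for item in population:
        groups.setdefault(key(item), []).append(item)
    sample = []
    for members in groups.values():
        # Rounding means the total can differ slightly from k for awkward proportions.
        n = round(k * len(members) / len(population))
        sample.extend(rng.sample(members, n))
    return sample

rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical population: 45% labelled "M" and 55% labelled "F".
people = [("M", i) for i in range(45)] + [("F", i) for i in range(55)]
chosen = stratified_sample(people, key=lambda p: p[0], k=20, rng=rng)
```

With this population and `k=20`, the stratified sample contains 9 "M" entries and 11 "F" entries, mirroring the 45% and 55% split.
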
### Types of data

!!! note "Definition"
    - **Quantitative variable**: A variable that is numerical and can be sorted by value.
    - **Discrete variable**: A quantitative variable that is countable.
    - **Continuous variable**: A quantitative variable that can take an infinite number of values between any two values.
    - **Qualitative variable**: A variable that is not numerical, such as a category or label, and cannot be sorted by numerical value.
    - **Bias**: An unfair influence on data during the collection process, causing the data not to be truly representative of the population.

### Frequency distribution

A **frequency distribution** is a data set that lists ranges of values and the number of observations that fall within each range. It can be displayed using a frequency distribution table.

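As a rough illustration of the idea above, the following Python sketch groups values into equal-width ranges and counts how many fall in each. The function name `frequency_distribution`, its parameters, and the `marks` data are invented for this example. It assumes a non-empty list of integers that are all at least `start`.

```python
from collections import Counter

def frequency_distribution(values, start, width):
    """Count how many values fall in each range [lower, lower + width)."""
    counts = Counter((v - start) // width for v in values)
    table = []
    for i in range(max(counts) + 1):
        lower = start + i * width
        table.append((lower, lower + width, counts.get(i, 0)))
    return table

marks = [52, 67, 71, 58, 89, 94, 75, 63, 70, 81]  # made-up data
for lower, upper, freq in frequency_distribution(marks, start=50, width=10):
    # Label each range inclusively, e.g. 50-59, since the values are integers.
    print(f"{lower}-{upper - 1}: {freq}")
```

Each printed row corresponds to one row of a frequency distribution table: a range and the number of values within it.
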
!!! note "Definition"

## Resources