PBBSC SY INTRODUCTION TO NURSING RESEARCH AND STATISTICS UNIT 7

Descriptive Statistics

Descriptive statistics involves summarizing and organizing data to provide a clear understanding of its characteristics. It is the foundation of data analysis, offering insights into patterns, trends, and distributions within a dataset.


Purpose of Descriptive Statistics

  1. Summarize Data:
    • Reduces large datasets into concise, interpretable summaries.
  2. Highlight Patterns:
    • Identifies trends and distributions in the data.
  3. Facilitate Comparison:
    • Compares different datasets or groups.
  4. Prepare for Inferential Analysis:
    • Provides a foundation for further statistical testing.

Types of Descriptive Statistics

1. Measures of Central Tendency

  • Indicates the center or typical value of a dataset.
  • Mean (Average):
    • Sum of all data points divided by the total number.
    • Example: Average patient age = (25 + 30 + 35 + 40) ÷ 4 = 32.5 years.
  • Median:
    • The middle value when data is arranged in ascending order.
    • Example: Median of {25, 30, 35, 40} = 32.5 (average of middle two values).
  • Mode:
    • The most frequently occurring value in the dataset.
    • Example: Mode of {25, 25, 30, 35} = 25.
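
The three examples above can be checked with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative and not part of the original notes.

```python
import statistics

ages = [25, 30, 35, 40]              # data from the mean and median examples
ages_with_repeat = [25, 25, 30, 35]  # data from the mode example

mean_age = statistics.mean(ages)              # (25 + 30 + 35 + 40) / 4
median_age = statistics.median(ages)          # average of the two middle values, 30 and 35
mode_age = statistics.mode(ages_with_repeat)  # most frequently occurring value

print(mean_age, median_age, mode_age)
```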

2. Measures of Dispersion

  • Indicates the spread or variability of data.
  • Range:
    • Difference between the highest and lowest values.
    • Example: Range of {10, 20, 30, 40} = 40 – 10 = 30.
  • Variance:
    • Average squared deviation from the mean.
    • Example: Variance = Σ (X – Mean)² / N.
  • Standard Deviation (SD):
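    • Square root of the variance; it expresses the typical deviation from the mean in the original units of the data.
    • Example: SD of {10, 20, 30, 40} ≈ 11.18 using the population formula above (dividing by N); the sample formula (dividing by n – 1) gives ≈ 12.91.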
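
As a rough check of the dispersion measures above, the sketch below computes the range, variance, and standard deviation for {10, 20, 30, 40}. It shows both the population form (divide by N) and the sample form (divide by n – 1), because the two give different results for the same data. The variable names are illustrative.

```python
import statistics

values = [10, 20, 30, 40]

value_range = max(values) - min(values)       # highest minus lowest value
pop_variance = statistics.pvariance(values)   # sum of squared deviations divided by N
pop_sd = statistics.pstdev(values)            # square root of the population variance
sample_sd = statistics.stdev(values)          # same idea, but divides by n - 1

print(value_range, pop_variance, round(pop_sd, 2), round(sample_sd, 2))
```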

3. Measures of Distribution

  • Describes the shape and spread of data.
  • Frequency Distribution:
    • Counts the number of occurrences of each value or category.
    • Example:
      Age Group | Frequency
      20–30 | 10
      31–40 | 15
      41–50 | 5
  • Skewness:
    • Measures asymmetry in data distribution.
      • Positive skew: Tail is longer on the right.
      • Negative skew: Tail is longer on the left.
  • Kurtosis:
    • Describes how heavy the tails of a distribution are compared with a normal distribution; often described loosely as the peakedness or flatness of the curve.
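
One common way to put a number on skewness is the moment coefficient: the average cubed deviation divided by the cube of the population standard deviation. The sketch below assumes that definition; the helper name `skewness` and the three small datasets are hypothetical and chosen only to show the sign of the result.

```python
def skewness(values):
    """Moment coefficient of skewness (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # average squared deviation
    m3 = sum((x - mean) ** 3 for x in values) / n   # average cubed deviation
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]       # hypothetical data, no skew
right_skewed = [1, 2, 2, 3, 10]   # hypothetical data, long right tail
left_skewed = [1, 8, 9, 9, 10]    # hypothetical data, long left tail

for name, data in [("symmetric", symmetric),
                   ("right-skewed", right_skewed),
                   ("left-skewed", left_skewed)]:
    print(name, round(skewness(data), 2))
```

A value near zero suggests a roughly symmetric distribution, a positive value a longer right tail, and a negative value a longer left tail.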

4. Graphical Representation

  • Visual tools to summarize and present data.
  • Bar Graphs:
    • Displays categorical data.
  • Histograms:
    • Shows frequency distribution for continuous data.
  • Pie Charts:
    • Represents proportions or percentages.
  • Box Plots:
    • Highlights data spread, median, and outliers.

Steps in Using Descriptive Statistics

  1. Collect Data:
    • Gather raw data through surveys, experiments, or observations.
  2. Organize Data:
    • Arrange data in a logical order, such as tables or spreadsheets.
  3. Calculate Measures:
    • Compute central tendency, dispersion, and distribution metrics.
  4. Visualize Data:
    • Use graphs or charts to present insights.
  5. Interpret Results:
    • Analyze patterns and trends for meaningful conclusions.

Applications of Descriptive Statistics

In Nursing Research

  1. Patient Demographics:
    • Summarizing age, gender, or medical history of patients.
  2. Clinical Outcomes:
    • Analyzing treatment success rates or recovery times.
  3. Survey Results:
    • Presenting feedback on healthcare services.

In Education Research

  1. Student Performance:
    • Summarizing test scores or attendance records.
  2. Teacher Feedback:
    • Analyzing responses to professional development programs.

Advantages of Descriptive Statistics

  1. Simplifies Data:
    • Makes large datasets manageable.
  2. Enhances Understanding:
    • Provides insights into patterns and relationships.
  3. Supports Decision-Making:
    • Helps identify trends for actionable steps.

Limitations of Descriptive Statistics

  1. No Cause-and-Effect Analysis:
    • Describes associations in the data but cannot establish that one variable causes another.
  2. Limited to Summary:
    • Provides no inference about the population beyond the dataset.

Frequencies in Data Analysis

A frequency is the count of occurrences of a particular value or category in a dataset. Frequency counts are among the simplest and most fundamental tools in descriptive statistics and are often used to summarize and analyze data.


Types of Frequencies

  1. Absolute Frequency:
    • The actual count of occurrences for each value or category.
    • Example:
      • Number of patients in each age group:
        Age Group | Frequency
        18–30 | 10
        31–50 | 15
        51–70 | 5
  2. Relative Frequency:
    • The proportion or percentage of the total occurrences for each value or category.
    • Formula: $\text{Relative Frequency} = \frac{\text{Frequency of Category}}{\text{Total Frequency}} \times 100$
    • Example:
      • For age group 18–30: $\frac{10}{30} \times 100 = 33.33\%$
  3. Cumulative Frequency:
    • The running total of frequencies up to a certain value or category.
    • Example:
      Age Group | Frequency | Cumulative Frequency
      18–30 | 10 | 10
      31–50 | 15 | 25
      51–70 | 5 | 30

Frequency Distribution Table

A frequency distribution table is a structured way to display frequencies. It organizes data into categories or intervals, showing the count (absolute frequency), percentage (relative frequency), or cumulative count.

Example of a Frequency Distribution Table

  • Dataset: Patient ages in a clinic.
Age Interval | Frequency | Relative Frequency (%) | Cumulative Frequency
18–30 | 10 | 33.33 | 10
31–50 | 15 | 50.00 | 25
51–70 | 5 | 16.67 | 30
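
The columns of this table can be derived from the absolute frequencies alone. The following is a minimal sketch using only the standard library; the list names are illustrative.

```python
from itertools import accumulate

groups = ["18-30", "31-50", "51-70"]
counts = [10, 15, 5]                      # absolute frequencies from the clinic example

total = sum(counts)
relative = [round(c / total * 100, 2) for c in counts]   # percentage of the total
cumulative = list(accumulate(counts))                    # running total of frequencies

for group, count, rel, cum in zip(groups, counts, relative, cumulative):
    print(f"{group:<6} {count:>3} {rel:>6.2f}% {cum:>3}")
```

The last cumulative value always equals the total number of observations, which is a quick way to check the table.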

Graphical Representation of Frequencies

  1. Bar Graph:
    • Displays absolute or relative frequencies for categorical data.
    • Example: Number of patients in different age groups.
  2. Histogram:
    • Represents frequencies for continuous data, grouped into intervals.
    • Example: Distribution of patient weights.
  3. Pie Chart:
    • Represents relative frequencies as slices of a circle.
    • Example: Percentage of patients by gender.
  4. Line Graph (Cumulative Frequency Curve):
    • Shows cumulative frequencies over categories or intervals.

Steps to Calculate Frequencies

  1. Organize the Data:
    • Arrange data in ascending order (for continuous or ordinal data).
  2. Define Categories or Intervals:
    • Create appropriate categories or intervals for grouping data.
  3. Count Occurrences:
    • Count how often each value or interval occurs.
  4. Calculate Relative Frequencies (if needed):
    • Use the formula provided above.
  5. Calculate Cumulative Frequencies (if needed):
    • Add frequencies cumulatively.

Applications of Frequencies

In Nursing Research

  1. Patient Demographics:
    • Summarizing age, gender, or diagnosis frequencies.
  2. Clinical Outcomes:
    • Counting occurrences of specific recovery rates or side effects.

In Education Research

  1. Test Scores:
    • Summarizing the number of students in each score range.
  2. Attendance Records:
    • Counting the number of students with different attendance rates.

Advantages of Using Frequencies

  1. Simplicity:
    • Easy to calculate and understand.
  2. Highlights Trends:
    • Quickly identifies the most common values or categories.
  3. Facilitates Further Analysis:
    • Prepares data for more advanced statistical tests.

Class Interval

A class interval is a range of values used to group continuous data into categories for easier analysis and representation. It is commonly used in frequency distribution tables to summarize large datasets.


Key Components of a Class Interval

  1. Lower Limit:
    • The smallest value in the interval.
    • Example: In the interval 10–20, the lower limit is 10.
  2. Upper Limit:
    • The largest value in the interval.
    • Example: In the interval 10–20, the upper limit is 20.
  3. Class Width:
    • The difference between the upper and lower limits of a class.
    • Formula: $\text{Class Width} = \text{Upper Limit} - \text{Lower Limit}$
    • Example: For the interval 10–20, the class width is $20 - 10 = 10$.
  4. Class Boundaries:
    • The actual boundaries of a class interval, often adjusted to eliminate gaps.
    • Example: For 10–20, the boundaries may be 9.5–20.5.
  5. Class Midpoint:
    • The average of the lower and upper limits of a class.
    • Formula: $\text{Midpoint} = \frac{\text{Lower Limit} + \text{Upper Limit}}{2}$
    • Example: For 10–20, the midpoint is $\frac{10 + 20}{2} = 15$.
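
The components above can be computed together for any interval. The sketch below is illustrative: the function name `describe_interval` and the `gap` parameter are assumptions, where `gap` is the distance between the upper limit of one class and the lower limit of the next (1 for whole-number data such as 10–20, 21–30).

```python
def describe_interval(lower, upper, gap=1):
    """Return the width, boundaries, and midpoint of a class interval."""
    width = upper - lower                             # upper limit minus lower limit
    boundaries = (lower - gap / 2, upper + gap / 2)   # limits adjusted to close the gaps
    midpoint = (lower + upper) / 2                    # average of the two limits
    return width, boundaries, midpoint

width, boundaries, midpoint = describe_interval(10, 20)
print(width, boundaries, midpoint)
```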

Steps to Create Class Intervals

  1. Identify the Range of Data:
    • Calculate the range: $\text{Range} = \text{Maximum Value} - \text{Minimum Value}$
  2. Decide the Number of Classes:
    • The number of intervals depends on the dataset size, typically between 5 and 20.
  3. Determine Class Width:
    • Formula: $\text{Class Width} = \frac{\text{Range}}{\text{Number of Classes}}$
  4. Create Intervals:
    • Start with the minimum value and add the class width to define each interval.
  5. Assign Frequencies:
    • Count the number of data points falling into each interval.

Example: Creating Class Intervals

Dataset:

10, 12, 15, 18, 20, 22, 25, 27, 30, 32

  1. Range:
    • $32 - 10 = 22$
  2. Number of Classes:
    • Assume 5 classes.
  3. Class Width:
    • $\frac{22}{5} = 4.4$ (round up to 5)
  4. Class Intervals:
    • Start at 10 with a width of 5. Each interval includes its lower limit and excludes its upper limit, so that a value such as 15 is counted only once:
      • 10–15, 15–20, 20–25, 25–30, 30–35
  5. Frequency Table:
Class Interval | Frequency
10–15 | 2
15–20 | 2
20–25 | 2
25–30 | 2
30–35 | 2
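
The five steps can be sketched in a few lines of Python. This sketch assumes equal-width, half-open intervals (each class includes its lower edge and excludes its upper edge), so every class has a width of exactly 5. The counts depend on where the interval edges are placed, so a table built with a different edge convention will show different frequencies.

```python
import math

data = [10, 12, 15, 18, 20, 22, 25, 27, 30, 32]
num_classes = 5                                  # assumed number of classes

data_range = max(data) - min(data)               # step 1: 32 - 10
width = math.ceil(data_range / num_classes)      # step 3: 4.4 rounded up

start = min(data)
table = []
for i in range(num_classes):
    lower = start + i * width                    # step 4: add the width repeatedly
    upper = lower + width                        # upper edge, not included in the class
    count = sum(lower <= x < upper for x in data)  # step 5: count values in the class
    table.append((lower, upper, count))

for lower, upper, count in table:
    print(f"{lower} to under {upper}: {count}")
```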

Applications of Class Intervals

In Nursing Research

  1. Patient Data:
    • Grouping patient ages or lab test results into intervals.
  2. Recovery Times:
    • Analyzing recovery durations by intervals.

In Education Research

  1. Test Scores:
    • Summarizing students’ scores into ranges (e.g., 0–10, 11–20).
  2. Attendance:
    • Categorizing students based on attendance rates.

Advantages of Using Class Intervals

  1. Simplifies Large Datasets:
    • Reduces complexity for better visualization.
  2. Highlights Patterns:
    • Makes trends and distributions easier to identify.
  3. Facilitates Further Analysis:
    • Prepares data for histograms and statistical measures.

Graphical Methods of Describing Frequency

Graphical methods provide visual representations of frequency data, making it easier to identify trends, patterns, and distributions. Below are some commonly used graphical methods to describe frequency.


1. Bar Graph

  • Definition: A bar graph represents categorical frequency data using rectangular bars.
  • Application:
    • Used for discrete data, such as survey responses or patient categories.
  • Characteristics:
    • Bars are separated.
    • The height of the bar indicates the frequency.

Example:

Frequency of patients visiting different hospital departments:

Department | Frequency
Outpatient | 50
Emergency | 30
Surgery | 20

Bar Graph:

  • X-axis: Departments.
  • Y-axis: Frequency.
  • Each department is represented by a bar proportional to its frequency.

2. Histogram

  • Definition: A histogram represents the frequency distribution of continuous data using adjoining bars.
  • Application:
    • Used for data grouped into class intervals, such as patient ages or test scores.
  • Characteristics:
    • Bars touch each other to indicate continuous data.
    • X-axis represents class intervals; Y-axis represents frequency.

Example:

Patient ages grouped into intervals:

Age Group | Frequency
10–20 | 5
21–30 | 15
31–40 | 10

3. Frequency Polygon

  • Definition: A frequency polygon connects points plotted at the midpoints of class intervals, with the frequency on the Y-axis.
  • Application:
    • Used to show the shape of the distribution and compare multiple datasets.
  • Characteristics:
    • The polygon starts and ends at the baseline (X-axis).
    • Easier to compare datasets than a histogram.

Example:

Using the same data as the histogram example, plot the class midpoints (15, 25.5, and 35.5) against the frequencies (5, 15, and 10).


4. Pie Chart

  • Definition: A pie chart represents frequency data as slices of a circle, showing proportions.
  • Application:
    • Used for relative frequency or percentage data.
  • Characteristics:
    • Each slice represents a category, proportional to its frequency.

Example:

Distribution of disease cases in a hospital:

Disease | Frequency | Percentage
Diabetes | 40 | 40%
Hypertension | 30 | 30%
Others | 30 | 30%

The pie chart will have slices of 40%, 30%, and 30%.
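
When a pie chart is drawn by hand, each percentage is converted into a slice angle out of 360 degrees. The sketch below shows that conversion for the disease example; the dictionary and variable names are illustrative.

```python
cases = {"Diabetes": 40, "Hypertension": 30, "Others": 30}
total = sum(cases.values())

slices = {}
for disease, count in cases.items():
    percentage = count * 100 / total   # share of all cases
    angle = count * 360 / total        # the same share expressed in degrees
    slices[disease] = (percentage, angle)

for disease, (percentage, angle) in slices.items():
    print(f"{disease}: {percentage:.0f}% -> {angle:.0f} degrees")
```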


5. Line Graph

  • Definition: A line graph uses points connected by a line to show trends or changes over time.
  • Application:
    • Used for time-series data, such as weekly patient admissions.
  • Characteristics:
    • X-axis: Time intervals.
    • Y-axis: Frequency.

Example:

Weekly patient admissions:

Week | Admissions
Week 1 | 20
Week 2 | 30
Week 3 | 25

6. Ogive (Cumulative Frequency Graph)

  • Definition: An ogive represents cumulative frequency data, either less than or greater than a given value.
  • Application:
    • Used to determine percentiles or medians.
  • Characteristics:
    • X-axis: Class intervals.
    • Y-axis: Cumulative frequency.

Example:

Cumulative frequency data for patient ages:

Age Group | Frequency | Cumulative Frequency
10–20 | 5 | 5
21–30 | 15 | 20
31–40 | 10 | 30
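
To show how cumulative frequencies help locate a median, the sketch below finds the class that contains the middle observation for the table above. It only identifies the median class; it does not interpolate a value inside it. The variable names are illustrative.

```python
from itertools import accumulate

# (lower limit, upper limit, frequency) for each age group in the table above
classes = [(10, 20, 5), (21, 30, 15), (31, 40, 10)]

cumulative = list(accumulate(freq for _, _, freq in classes))  # heights plotted on the ogive
total = cumulative[-1]
half = total / 2                                               # position of the middle observation

# The median class is the first one whose cumulative frequency reaches half the total
median_class = next(
    (lower, upper)
    for (lower, upper, _), cum in zip(classes, cumulative)
    if cum >= half
)

print(cumulative, median_class)
```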

7. Scatter Plot

  • Definition: A scatter plot represents the relationship between two continuous variables.
  • Application:
    • Used to visualize correlations (positive, negative, or none).
  • Characteristics:
    • Each point represents an observation.

Example:

Relationship between BMI and blood pressure readings.

BMI | Blood Pressure
25 | 120
30 | 140
35 | 160
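
The direction and strength of the pattern in a scatter plot are often summarized with the Pearson correlation coefficient. The sketch below computes it by hand for the three illustrative points above; the function name `pearson_r` is an assumption, and three observations are far too few to support any real conclusion.

```python
import math

bmi = [25, 30, 35]
blood_pressure = [120, 140, 160]

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)   # sum of squared deviations of x
    ss_y = sum((y - mean_y) ** 2 for y in ys)   # sum of squared deviations of y
    return co_deviation / math.sqrt(ss_x * ss_y)

r = pearson_r(bmi, blood_pressure)
print(round(r, 2))
```

Because these three points lie exactly on a rising straight line, the coefficient is at its maximum of 1; real patient data would normally give a smaller value.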

8. Box Plot

  • Definition: A box plot shows the distribution of a dataset, including its median, quartiles, and outliers.
  • Application:
    • Used to identify variability and outliers.
  • Characteristics:
    • Box represents the interquartile range.
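    • Whiskers extend to the smallest and largest values that lie within 1.5 times the IQR of the box edges (the first and third quartiles); points beyond the whiskers are plotted as outliers.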
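
The numbers behind a box plot can be computed before any drawing is done. The sketch below uses hypothetical recovery times (not taken from the source) and the "inclusive" quartile method of Python's `statistics.quantiles`; other quartile conventions give slightly different box edges.

```python
import statistics

# Hypothetical recovery times in days, invented for illustration
recovery_days = [3, 4, 4, 5, 5, 6, 6, 7, 8, 20]

q1, median, q3 = statistics.quantiles(recovery_days, n=4, method="inclusive")
iqr = q3 - q1                          # height of the box
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers reach the most extreme values that still lie inside the fences
inside = [x for x in recovery_days if lower_fence <= x <= upper_fence]
whiskers = (min(inside), max(inside))
outliers = [x for x in recovery_days if x < lower_fence or x > upper_fence]

print(q1, median, q3, whiskers, outliers)
```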

Comparison of Graphical Methods

Graph Type | Best For | Example
Bar Graph | Categorical data | Frequency of diseases in departments.
Histogram | Continuous data | Age distribution of patients.
Frequency Polygon | Comparing distributions | Test score comparisons.
Pie Chart | Proportions/percentages | Disease case distribution.
Line Graph | Time-series data | Weekly admissions in a hospital.
Ogive | Cumulative frequencies | Median income levels.
Scatter Plot | Relationships between variables | Correlation between BMI and blood pressure.
Box Plot | Distribution and outliers | Recovery time variability in treatments.

Tips for Effective Graphical Representation

  1. Choose the Right Graph:
    • Match the graph type to the data and objectives.
  2. Label Clearly:
    • Use meaningful titles, axis labels, and legends.
  3. Simplify:
    • Avoid clutter; focus on key insights.
  4. Use Colors Judiciously:
    • Use consistent colors to differentiate categories or variables.
  5. Validate Data:
    • Ensure accuracy of data before plotting.

Measures of Central Tendency: Mode, Median, and Mean

Measures of central tendency describe the center or typical value of a dataset. These measures summarize data into a single value, which represents the “average” or “middle” of the distribution.


1. Mode

Definition

  • The mode is the value or category that appears most frequently in a dataset.

Key Characteristics

  1. For Categorical Data:
    • Identifies the most common category.
    • Example: In a survey of favorite colors: Red (4), Blue (5), Green (3). Mode = Blue.
  2. For Numerical Data:
    • Can have one mode (unimodal), two modes (bimodal), or more (multimodal).
    • Example: In {1, 2, 2, 3, 3, 4}, Modes = 2 and 3 (bimodal).
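
Both examples above can be reproduced with the standard library. This is a small sketch: `Counter` tallies the categorical responses, and `statistics.multimode` returns every value tied for the highest frequency. The variable names are illustrative.

```python
import statistics
from collections import Counter

# Survey responses rebuilt from the counts in the example
colors = ["Red"] * 4 + ["Blue"] * 5 + ["Green"] * 3
numbers = [1, 2, 2, 3, 3, 4]

top_color, top_count = Counter(colors).most_common(1)[0]  # most frequent category and its count
modes = statistics.multimode(numbers)                     # all values tied for most frequent

print(top_color, top_count, modes)
```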

Advantages

  • Easy to identify.
  • Useful for categorical data.

Disadvantages

  • May not exist in some datasets.
  • Not helpful for datasets with uniform frequencies.

2. Median

Definition

  • The median is the middle value of a dataset when arranged in ascending or descending order.

Calculation Steps

  1. Arrange the data in order.
  2. Find the middle value:
    • For odd numbers: The middle value is the median.
    • For even numbers: The average of the two middle values.

Example:

  1. Odd Dataset: {3, 7, 8, 12, 15}
    Median = 8.
  2. Even Dataset: {4, 6, 8, 10, 12, 14}
    Median = $\frac{8 + 10}{2} = 9$.

Key Characteristics

  • Insensitive to extreme values (outliers).
  • Represents the 50th percentile.

Advantages

  • Robust to outliers.
  • Suitable for ordinal data.

Disadvantages

  • Does not consider all values in the dataset.

3. Mean

Definition

  • The mean (average) is the sum of all values divided by the total number of values.

Formula

$\text{Mean} = \frac{\text{Sum of all values}}{\text{Number of values}}$

Example:

Dataset: {5, 10, 15, 20, 25}
Mean = $\frac{5 + 10 + 15 + 20 + 25}{5} = 15$.

Key Characteristics

  • Sensitive to extreme values (outliers).
  • Reflects the “balance point” of the dataset.

Advantages

  • Considers all values.
  • Suitable for further statistical analysis.

Disadvantages

  • Skewed by outliers.
  • Not suitable for categorical data.

Comparison of Mode, Median, and Mean

Aspect | Mode | Median | Mean
Definition | Most frequent value. | Middle value. | Average value.
Data Type | Categorical or numerical. | Ordinal or numerical. | Numerical.
Sensitivity to Outliers | Not affected. | Not affected. | Highly affected.
Ease of Calculation | Easiest. | Moderate. | Requires computation.
Use Case | Common categories. | Skewed distributions. | Normal distributions.
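
The difference in outlier sensitivity is easy to see numerically. In the sketch below, both datasets are hypothetical: the second is the first with its largest value replaced by an extreme one.

```python
import statistics

typical = [10, 15, 20, 25, 30]         # hypothetical values with no outlier
with_outlier = [10, 15, 20, 25, 100]   # same values, but the last one is extreme

mean_typical = statistics.mean(typical)
median_typical = statistics.median(typical)
mean_outlier = statistics.mean(with_outlier)      # pulled upward by the value 100
median_outlier = statistics.median(with_outlier)  # still the middle value

print(mean_typical, median_typical)
print(mean_outlier, median_outlier)
```

The mean rises from 20 to 34 while the median stays at 20, which is why the median is usually preferred for skewed data.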

Examples in Nursing Research

  1. Mode:
    • Identifying the most common symptoms reported by patients.
  2. Median:
    • Analyzing recovery times to find the “typical” patient experience.
  3. Mean:
    • Calculating the average heart rate of patients during treatment.

When to Use

Measure | Best Used When
Mode | Categorical data or finding the most common occurrence.
Median | Skewed data or when outliers are present (e.g., income distribution).
Mean | Normally distributed numerical data for advanced statistical analysis.

Measures of Variability: Range and Standard Deviation

Measures of variability describe the extent to which data values differ from each other or from the central tendency. They provide insights into the spread or dispersion of the data.


1. Range

Definition

  • The range is the difference between the largest and smallest values in a dataset.
  • Formula: $\text{Range} = \text{Maximum Value} - \text{Minimum Value}$

Example:

  • Dataset: {12, 15, 18, 22, 28}
    • Maximum = 28, Minimum = 12
    • Range = $28 - 12 = 16$

Key Characteristics

  • Simple measure of variability.
  • Only considers the extreme values.

Advantages

  1. Easy to calculate and interpret.
  2. Provides a quick estimate of data dispersion.

Disadvantages

  1. Sensitive to outliers.
    • Example: {10, 15, 20, 25, 100}
      • Range = $100 - 10 = 90$ (skewed by 100).
  2. Does not indicate the distribution of values within the dataset.

2. Standard Deviation (SD)

Definition

  • Standard deviation measures how far data points typically lie from the mean; formally, it is the square root of the average squared deviation.
  • It provides a more comprehensive measure of variability compared to the range.

Formula:

  1. For a Population: $\sigma = \sqrt{\frac{\Sigma (X - \mu)^2}{N}}$
    • $\sigma$: Standard deviation
    • $X$: Individual data point
    • $\mu$: Population mean
    • $N$: Total number of data points
  2. For a Sample: $S = \sqrt{\frac{\Sigma (X - \bar{X})^2}{n - 1}}$
    • $S$: Sample standard deviation
    • $\bar{X}$: Sample mean
    • $n$: Number of sample data points

Steps to Calculate Standard Deviation

  1. Calculate the mean ($\mu$ or $\bar{X}$).
  2. Subtract the mean from each data point ($X - \mu$).
  3. Square the deviations ($(X - \mu)^2$).
  4. Find the average of squared deviations:
    • For a population: Divide by $N$.
    • For a sample: Divide by $n - 1$.
  5. Take the square root of the result.

Example:

Dataset: {10, 12, 14, 16, 18}

  1. Mean:
    $\bar{X} = \frac{10 + 12 + 14 + 16 + 18}{5} = 14$
  2. Deviations:
    • $10 - 14 = -4$, $12 - 14 = -2$, $14 - 14 = 0$, $16 - 14 = 2$, $18 - 14 = 4$
  3. Squared Deviations:
    • $(-4)^2 = 16$, $(-2)^2 = 4$, $(0)^2 = 0$, $(2)^2 = 4$, $(4)^2 = 16$
  4. Sum of Squared Deviations:
    $16 + 4 + 0 + 4 + 16 = 40$
  5. Variance (for population):
    $\frac{40}{5} = 8$
  6. Standard Deviation:
    $\sqrt{8} \approx 2.83$
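
The same calculation can be followed step by step in code. The sketch below mirrors the five calculation steps for the dataset above and, for comparison, also computes the sample version that divides by n – 1. The variable names are illustrative.

```python
import math

data = [10, 12, 14, 16, 18]
n = len(data)

mean = sum(data) / n                           # step 1: the mean
deviations = [x - mean for x in data]          # step 2: subtract the mean
squared = [d ** 2 for d in deviations]         # step 3: square each deviation
sum_of_squares = sum(squared)

population_variance = sum_of_squares / n       # step 4 (population): divide by N
sample_variance = sum_of_squares / (n - 1)     # step 4 (sample): divide by n - 1

population_sd = math.sqrt(population_variance) # step 5: take the square root
sample_sd = math.sqrt(sample_variance)

print(round(population_sd, 2), round(sample_sd, 2))
```

For this dataset the sample standard deviation is about 3.16, slightly larger than the population value of about 2.83, because dividing by n – 1 gives a larger variance.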

Key Characteristics

  • Reflects data spread more accurately than the range.
  • Used for advanced statistical calculations.

Advantages

  1. Considers all data points.
  2. Suitable for comparing variability across datasets.

Disadvantages

  1. Complex to calculate manually for large datasets.
  2. Sensitive to outliers.

Comparison of Range and Standard Deviation

Aspect | Range | Standard Deviation
Definition | Difference between max and min. | Average deviation from the mean.
Sensitivity to Outliers | Highly sensitive. | Moderately sensitive.
Calculation Simplicity | Simple to compute. | Complex to calculate.
Information Provided | Limited (extreme values only). | Comprehensive (all data points).
Application | Quick dispersion estimate. | Detailed variability analysis.

Applications in Nursing Research

  1. Range:
    • Quick assessment of patient recovery times (e.g., shortest and longest durations).
  2. Standard Deviation:
    • Analyzing blood pressure readings to evaluate variability within patient groups.
  • Range is best for quick and simple variability estimates.
  • Standard deviation provides a deeper understanding of data dispersion and is crucial for detailed analysis.

Introduction to Normal Probability

The normal probability concept is rooted in the normal distribution, which is a fundamental statistical tool used to model a wide range of natural phenomena. The normal distribution is also known as the Gaussian distribution or bell curve due to its characteristic shape.


Key Characteristics of Normal Distribution

  1. Symmetry:
    • The curve is perfectly symmetrical around its mean.
    • Example: In a class test, if most students score around the average, the distribution of scores is likely symmetric.
  2. Mean, Median, and Mode:
    • All three measures of central tendency are equal and located at the center of the curve.
  3. Shape:
    • The curve is bell-shaped, with a peak at the mean and tails extending infinitely in both directions but never touching the X-axis.
  4. Standard Deviation:
    • Defines the spread or variability of the distribution.
    • A smaller standard deviation results in a narrower curve; a larger one results in a wider curve.
  5. Area Under the Curve:
    • The total area under the curve is 1 (or 100%), representing the probability of all outcomes.

Probability in Normal Distribution

The probability of a specific range of values in a normal distribution is determined by the area under the curve for that range.

Empirical Rule (68-95-99.7 Rule):

  • About 68% of data falls within 1 standard deviation of the mean.
  • About 95% of data falls within 2 standard deviations of the mean.
  • About 99.7% of data falls within 3 standard deviations of the mean.

Example:

  • Mean height of adults = 170 cm, standard deviation = 10 cm:
    • 68% of adults have heights between 160 cm and 180 cm.
    • 95% have heights between 150 cm and 190 cm.
    • 99.7% have heights between 140 cm and 200 cm.
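
The percentages in the empirical rule are rounded areas under the normal curve. As a rough check, the sketch below uses `statistics.NormalDist` with the mean and standard deviation from the height example and assumes heights are exactly normally distributed; the helper name `share_within` is illustrative.

```python
from statistics import NormalDist

MEAN_CM = 170
SD_CM = 10
heights = NormalDist(mu=MEAN_CM, sigma=SD_CM)

def share_within(k):
    """Proportion of the distribution within k standard deviations of the mean."""
    lower = MEAN_CM - k * SD_CM
    upper = MEAN_CM + k * SD_CM
    return heights.cdf(upper) - heights.cdf(lower)   # area under the curve between the limits

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k) * 100:.1f}%")
```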

Standard Normal Distribution

The standard normal distribution is a special case where:

  • Mean ($\mu$) = 0.
  • Standard deviation ($\sigma$) = 1.

Z-Score:

  • A z-score measures how many standard deviations a data point is from the mean.
  • Formula: $Z = \frac{X - \mu}{\sigma}$
    • $X$: Data point.
    • $\mu$: Mean.
    • $\sigma$: Standard deviation.

Example:

  • A student scored 85 in a test where the mean score is 75 and the standard deviation is 10: $Z = \frac{85 - 75}{10} = 1$
    • Interpretation: The student’s score is 1 standard deviation above the mean.
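
The z-score calculation, and what it implies under a normal model, can be sketched as follows. The function name `z_score` is illustrative, and the final step assumes the test scores are normally distributed, which the example does not state.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd

z = z_score(85, 75, 10)

# Under the assumption of normality, the standard normal curve gives
# the proportion of scores that fall below this z-score.
share_below = NormalDist().cdf(z)

print(z, round(share_below, 4))
```

Under that assumption, a z-score of 1 places the student above roughly 84% of scores.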

Applications of Normal Probability

  1. Healthcare:
    • Analyzing patients’ vital signs (e.g., blood pressure, heart rate) to detect abnormalities.
    • Example: Blood pressure measurements often follow a normal distribution.
  2. Education:
    • Evaluating student performance in exams, where scores typically form a bell curve.
  3. Quality Control:
    • Monitoring product dimensions in manufacturing to ensure they meet specifications.
  4. Finance:
    • Modeling stock returns or economic indicators.

Advantages of Normal Distribution

  1. Widely Applicable:
    • Many natural and social phenomena approximate a normal distribution.
  2. Simplifies Analysis:
    • Enables the use of standardized methods, such as z-scores and probability tables.
  3. Foundation for Inferential Statistics:
    • Central to hypothesis testing, confidence intervals, and regression analysis.

Limitations of Normal Distribution

  1. Assumption of Normality:
    • Not all datasets follow a normal distribution.
  2. Sensitive to Outliers:
    • Extreme values can distort the mean and standard deviation.