What are the Measures of Variability in Statistics?

Data is everything in AI today. Knowing the middle value, like the mean or median, is good, but we also need to know how much the data spreads. Let's take a look at the different measures of variability: range, variance, standard deviation, and interquartile range. This continues from understanding the central tendency of data (if you have not already read that post, have a quick look at “What is Central Tendency?”).

What is Variability?

Variability shows how far data points are from each other or from the center. For example, saying a data set has a mean of 11.4 doesn't tell us whether the numbers sit close to it or are spread all over. In AI, spread matters: big swings might mean bad data for a model. For example, a wide spread in the run times of an application's jobs could point to system issues.

We measure variability with:

  1. Range
  2. Variance
  3. Standard Deviation
  4. Interquartile Range

Let's use the same numbers from the last post to see how they work.

The Measures

Range

Range is easy—it’s the biggest number minus the smallest. It shows the full spread but not what’s in between.

Range = Biggest number - Smallest number

Data:

8, 10, 15, 6, 18, 10, 22, 16, 2, 7

Re-arranging this data in ascending order: 2, 6, 7, 8, 10, 10, 15, 16, 18, 22

Biggest = 22, Smallest = 2

Range = 22 - 2 = 20

Range is quick for AI, like checking data limits in a model, but one odd number messes it up. Say all of our data points are no higher than 22. What happens if a new entry of 200 shows up? The range becomes 198 (200 - 2), which no longer represents how the rest of the data spreads.
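
Here is a minimal Python sketch of the range calculation above. The function name value_range is just an illustrative choice, and the extra entry of 200 is the hypothetical odd number from the example, not part of the original data.

```python
def value_range(values):
    """Return the biggest value minus the smallest value."""
    return max(values) - min(values)


data = [8, 10, 15, 6, 18, 10, 22, 16, 2, 7]
print(value_range(data))  # 20

# Hypothetical extra entry of 200, to show how one odd number stretches the range.
with_outlier = data + [200]
print(value_range(with_outlier))  # 198
```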

Variance

Variance looks at how far each number is from the mean, squares it, then averages those squares. It’s exact but uses squared units.

Variance = (Sum of (each number - mean)²) / total numbers

The mean of the given data is 11.4 (from the last post). Let’s calculate:

Differences: (8-11.4), (10-11.4), (15-11.4), ..., (7-11.4)

= -3.4, -1.4, 3.6, -5.4, 6.6, -1.4, 10.6, 4.6, -9.4, -4.4

Square them: 11.56, 1.96, 12.96, 29.16, 43.56, 1.96, 112.36, 21.16, 88.36, 19.36

Sum = 11.56 + 1.96 + 12.96 + 29.16 + 43.56 + 1.96 + 112.36 + 21.16 + 88.36 + 19.36 = 342.4

Variance = 342.4 / 10 = 34.24

In AI, high variance can mean your data is messy and might need cleaning before training. The lower the variance, the closer the data sits to the mean.
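
The same steps can be sketched in Python. This follows the formula used here, which divides by the total count (often called the population variance). The variable names are illustrative, and pvariance from Python's standard statistics module is used only as a cross-check.

```python
from statistics import mean, pvariance

data = [8, 10, 15, 6, 18, 10, 22, 16, 2, 7]

mu = mean(data)  # 11.4
squared_diffs = [(x - mu) ** 2 for x in data]
variance = sum(squared_diffs) / len(data)  # divide by the total count

print(round(variance, 2))         # 34.24
print(round(pvariance(data), 2))  # 34.24, standard-library cross-check
```

Note that statistics.variance (without the "p") divides by n - 1 instead of n, so it gives a larger number for the same data.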

Standard Deviation

Standard deviation is the square root of variance. It’s like variance but in the same units as your data.

Standard Deviation = Square root of variance

Variance = 34.24

Standard Deviation = √34.24 ≈ 5.85

Most numbers (7 of the 10) fall within 11.4 ± 5.85, that is, between 5.55 and 17.25. In AI, it’s used to scale data or spot odd values.
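
A short Python sketch of this step, assuming the mean (11.4) and variance (34.24) worked out earlier. It also counts how many values land within one standard deviation of the mean. pstdev from the standard statistics module is used as a cross-check.

```python
import math
from statistics import pstdev

data = [8, 10, 15, 6, 18, 10, 22, 16, 2, 7]
mean_value = 11.4  # from the last post
variance = 34.24   # from the variance section

sd = math.sqrt(variance)
print(round(sd, 2))            # 5.85
print(round(pstdev(data), 2))  # 5.85, standard-library cross-check

# Count the values that fall within mean ± one standard deviation.
low, high = mean_value - sd, mean_value + sd
within = [x for x in data if low <= x <= high]
print(len(within), "of", len(data))  # 7 of 10
```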

Interquartile Range

Interquartile range (IQR) is the spread of the middle 50% of ordered data. It ignores extreme values.

IQR = Q3 - Q1

Q1 is the median of the lower half, Q3 is the median of the upper half. A quick way to find their positions in the ordered data, rounding to the nearest entry:

Q1 = (n+1)/4 th entry

Q3 = 3(n+1)/4 th entry

Ordered data: 2, 6, 7, 8, 10, 10, 15, 16, 18, 22

n = 10

Q1 position = (10+1)/4 = 2.75 ≈ 3rd entry = 7

Q3 position = 3(10+1)/4 = 8.25 ≈ 8th entry = 16

IQR = 16 - 7 = 9

IQR’s good for AI data prep—it skips outliers. 
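
Below is a minimal Python sketch using the "median of each half" definition given above. The helper name quartiles_by_halves is hypothetical. Keep in mind that library quartile functions usually interpolate between entries, so they can report slightly different values than this hand method.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumes at least two values. With an odd count, the middle value
    is left out of both halves.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]
    return median(lower), median(upper)


data = [8, 10, 15, 6, 18, 10, 22, 16, 2, 7]
q1, q3 = quartiles_by_halves(data)
iqr = q3 - q1
print(q1, q3, iqr)  # 7 16 9
```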

Example in Action

Let's say these are the job times (in seconds) of an application: 8, 10, 15, 6, 18, 10, 22, 16, 2, 7. The mean is 11.4, but variability shows:

  • Range = 20: Big gap, check extremes.
  • Variance = 34.24: Numbers vary a lot.
  • Standard Deviation = 5.85: Most times are 5.55–17.25 seconds.
  • IQR = 9: Middle half is tighter. 2 and 22 are the extremes, though neither is far enough out to count as an outlier under the usual 1.5 × IQR rule.

In AI, this helps clean data for models or flag issues in an application.
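
One way to flag unusual job times is the common 1.5 × IQR rule, sketched below. This rule is an assumption on my part, since nothing above commits to a specific cutoff. The function name iqr_fences is hypothetical, Q1 = 7 and Q3 = 16 come from the IQR section, and the 200-second job is a made-up value for illustration.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (low, high) cutoffs; values outside them get flagged."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


job_times = [8, 10, 15, 6, 18, 10, 22, 16, 2, 7]
low, high = iqr_fences(7, 16)  # Q1 and Q3 from the IQR section

flagged = [t for t in job_times if t < low or t > high]
print(low, high, flagged)  # -6.5 29.5 []

# Hypothetical slow job of 200 seconds, added only for illustration.
flagged_with_slow_job = [t for t in job_times + [200] if t < low or t > high]
print(flagged_with_slow_job)  # [200]
```

Under this particular rule none of the ten job times is flagged, while a 200-second job would be.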

Wrapping Up

Range, variance, standard deviation, and IQR help us understand how data spreads. They’re key for AI and coding. See my “What is Central Tendency?” post for more about central tendency (mean, mode and median). Got a stats tip or want another topic? Comment or use the contact form, and let’s keep it going!
