Random Variables and Probability Functions — 1

Pushpak Ruhil
7 min read · Jan 7, 2023


There will be 2 parts to this topic. This is the 1st part.

In this article, we will look at what random variables are and the different probability functions that exist.
I will assume that you already have a basic understanding of probability.

Random Variables (RVs)

A Random Variable is a function from a set of possible outcomes to a set of real numbers.
It assigns a real number to each possible outcome.

For example, suppose we roll a die, with our RV being X. Then X can take the values
X = {1, 2, 3, 4, 5, 6}

If we flip a coin,
X = {1, 0} (we map heads to 1 and tails to 0).

More mathematically:

X : Ω → ℝ

Here, X is a Random Variable and Ω is the domain (the set of possible outcomes).

X maps into ℝ, the set of real numbers. The range could also be a subset of ℝ.
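To make the mapping concrete, here is a minimal Python sketch of the coin-flip RV above (the outcome labels "H" and "T" are my own illustrative names):

# A random variable viewed as a plain mapping from outcomes to real numbers.
# Coin flip: heads -> 1, tails -> 0.
X = {"H": 1, "T": 0}

print(X["H"])  # 1
print(X["T"])  # 0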

Types of Random Variables

Types of RVs (Drawn on ExcaliDraw)

1) Discrete Random Variables

These random variables take finitely many or countably infinitely many values.

Countably infinite values are countable but infinite, hehe.
Here’s a more detailed definition for a countably infinite set —
A countably infinite set is a set that contains an infinite number of elements, but the elements can be counted one by one in a way that never ends. For example: {1, 2, 3, …}

Examples of Discrete RVs:

  • The sum of numbers on two dice throws
  • The outcome of a single die
  • The number of tosses after which a head appears (COUNTABLY INFINITE)

2) Continuous Random Variables

These random variables take uncountably many values.

This type of random variable can take an infinite number of values that cannot be counted one by one.

Examples of Continuous RVs:

  • The amount of rainfall in a state
  • The temperature of a surface
  • The density of a liquid

You might be wondering how these examples are classified as continuous random variables.
We can measure the rainfall or the temperature of a surface, so how do they fall under the category of “continuous RVs”?

The answer is that we can measure these quantities, but we only ever approximate them.
Rainfall readings are approximate values; we cannot measure them to infinite precision.
Infinite precision would mean measuring at a scale smaller than an electron, which isn’t possible. We always approximate!
The same logic applies to temperature, density, or any other example you can think of.

Probability Functions

There are many types of probability functions, but let’s focus on the 3 most basic types. Even here, we’ll only look at the PMF; the other 2 will be discussed in the next article.
We’ll see the names of some other types of probability functions at the end.

Types of Probability Functions (Drawn on ExcaliDraw)

1) Probability Mass Function

An assignment of probabilities to all the possible values that a discrete RV can take is called the Probability Mass Function (PMF) of the random variable.

It is also known as the probability distribution of the random variable.

It assigns a probability to each value that the random variable can take.

Let’s consider an example to understand it better.

Example: We want to find the probability of an event defined by the sum of numbers that appear on 2 dice throws.

Let X be the random variable denoting the sum of the numbers on the 2 dice throws.

Let x be an individual value taken by the random variable X.

The PMF can be used to answer questions such as: what is the probability that X takes a particular value x?

Let’s look at the possible outcomes for each X = x:

X = 2: (1,1)
X = 3: (1,2), (2,1)
X = 4: (1,3), (2,2), (3,1)
X = 5: (1,4), (2,3), (3,2), (4,1)
X = 6: (1,5), (2,4), (3,3), (4,2), (5,1)
X = 7: (1,6), (2,5), (3,4), (4,3), (5,2), (6,1)
X = 8: (2,6), (3,5), (4,4), (5,3), (6,2)
X = 9: (3,6), (4,5), (5,4), (6,3)
X = 10: (4,6), (5,5), (6,4)
X = 11: (5,6), (6,5)
X = 12: (6,6)

Now, we can calculate the probability of each event using the classical definition of probability:

P(X = x) = (number of favourable outcomes) / (total number of outcomes) = (outcomes with sum x) / 36

Resulting in the following probabilities:

P(X = 2) = 1/36
P(X = 3) = 2/36
P(X = 4) = 3/36
P(X = 5) = 4/36
P(X = 6) = 5/36
P(X = 7) = 6/36
P(X = 8) = 5/36
P(X = 9) = 4/36
P(X = 10) = 3/36
P(X = 11) = 2/36
P(X = 12) = 1/36

You may also see the PMF written as p_X(x) at times, a common shorthand for P(X = x).
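As a sanity check, here’s a short Python sketch (my own illustration) that enumerates all 36 outcomes and computes the same PMF:

from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes of two dice throws
# and count how many outcomes produce each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# P(X = x) = (favourable outcomes) / 36
pmf = {x: Fraction(c, 36) for x, c in sorted(counts.items())}

for x, p in pmf.items():
    print(f"P(X = {x}) = {p}")
# e.g. P(X = 7) = 1/6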

Properties of PMF

Here are a few properties you should be aware of:

1) Non-negativity: p(x) ≥ 0 for every value x that X can take.
2) Normalization: the probabilities over the support of X sum to 1, i.e., Σ p(x) = 1.

Discrete Distributions

Now, let’s examine a few discrete distributions commonly used in statistics.

1) Bernoulli Distribution

This type of distribution is used for experiments with two possible outcomes: failure and success.

Think True/False, On/Off, 1/0, and similar conditions.

Such experiments are called Bernoulli Trials.

P(success) = p
P(failure) = 1-p

Let’s see this in a compact form. Let X be an RV following the Bernoulli distribution and x be any possible value of X. Then, we can denote the PMF using the following notation:

P(X = x) = p^x * (1-p)^(1-x), for x ∈ {0, 1}

Here,
P(0) = 1-p
P(1) = p

How do we check whether a distribution is valid?
I won’t test the validity of each of these distributions here, but to check validity, you need to verify 2 things, i.e., the properties of the PMF:
1) Each probability should be ≥ 0.
2) The sum of probabilities over the support of the RV should be = 1.
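As a quick sketch, here’s how the 2 checks might look in Python, assuming the PMF is stored as a dict mapping each value to its probability (the helper name is my own):

def is_valid_pmf(pmf, tol=1e-9):
    """Check the two PMF properties: non-negativity and summing to 1."""
    nonnegative = all(p >= 0 for p in pmf.values())
    sums_to_one = abs(sum(pmf.values()) - 1) < tol
    return nonnegative and sums_to_one

# Bernoulli with p = 0.3: P(0) = 0.7, P(1) = 0.3
print(is_valid_pmf({0: 0.7, 1: 0.3}))  # True
print(is_valid_pmf({0: 0.7, 1: 0.4}))  # False (probabilities sum to 1.1)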

2) Binomial Distribution

This is one of the most important distributions.
In simple terms, a Binomial distribution is a Bernoulli trial repeated n times.

The Binomial distribution is a discrete distribution used to model the number of successes in a series of Bernoulli Trials. It is denoted by X ∼ Bin(n, p).

But there’s a catch. These conditions need to be followed for every trial:

  1. INDEPENDENCE: Success/Failure in one trial should not affect the outcome of any other trial.
  2. IDENTICAL: The probability of success in each trial is the same, i.e., p.

What is the probability of k successes in n trials?
This is the type of question that binomial distribution helps us answer.

Let X be the RV and k the number of successes.

Then, the probability of getting k successes in n trials, each with a probability of success equal to p, is given by the following formula:

P(X = k) = C(n, k) * p^k * (1-p)^(n-k), for k = 0, 1, …, n

If you look closely, this formula is similar to the one for the Bernoulli distribution; the difference is the extra combination term. For those who don’t remember, here’s a quick refresher on the formula for combinations:

C(n, k) = n! / (k! * (n-k)!)
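Putting the two together, here’s a minimal Python sketch of the Binomial PMF (the function name is my own; math.comb from the standard library handles the combination term):

from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p): C(n, k) * p^k * (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability of exactly 3 heads in 5 fair-coin tosses: 10/32 = 0.3125
print(binomial_pmf(3, 5, 0.5))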

3) Geometric Distribution

Instead of performing a trial a fixed number of times n, we keep performing the trial indefinitely, until the 1st success.

X: The number of trials until we get the 1st success.

Let k be the number of trials performed to get our 1st success. What does that mean? We performed the trial k times: (k-1) failures followed by our 1st success.
So, can you work out its PMF?

P(X = k) = (1-p)^(k-1) * p, for k = 1, 2, 3, …

We can derive this PMF using the simple multiplication rule for independent events: (k-1) failures, each with probability (1-p), followed by one success with probability p.
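Here’s the same multiplication-rule idea as a short Python sketch (the function name is my own):

def geometric_pmf(k, p):
    """P(X = k): (k - 1) failures, each with probability (1 - p), then one success."""
    return (1 - p) ** (k - 1) * p

# First head on the 3rd toss of a fair coin: (1/2)^2 * (1/2) = 0.125
print(geometric_pmf(3, 0.5))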

4) Uniform distribution

Experiments with equally likely outcomes.
For example, the outcomes of a die-throw.

Let X be the RV denoting the outcome of such an experiment.
We also assume that the support of X is a range of integers from a to b.

P(X = x) = 1/(b-a+1), for x ∈ {a, a+1, …, b}

Here, (b-a+1) can be explained using basic counting principles: it is simply the number of integers from a to b, inclusive.
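And a final short Python sketch of the discrete uniform PMF (again, the function name is my own):

def uniform_pmf(x, a, b):
    """P(X = x) = 1/(b - a + 1) if x is in {a, ..., b}, else 0."""
    return 1 / (b - a + 1) if a <= x <= b else 0

# A fair die: a = 1, b = 6, so each face has probability 1/6
print(uniform_pmf(4, 1, 6))  # 0.1666...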

Wrapping up

Well, thanks for reading through.
This is my 2nd article in my journey of writing about statistics. It is the 1st part of “Random Variables and Probability Functions”; the 2nd part is available as well.

The 2nd part extends the topic to CDFs (Cumulative Distribution Functions) and PDFs (not the file format :p, it stands for Probability Density Functions).

Also, I mentioned at the top that I’d refer to the other types of probability functions. Here are a few, other than the PMF, CDF, and PDF:

  • Joint probability mass function
  • Joint probability Density function
  • Moment-generating function

For now, I’ll not be focusing on these, but I might come up with an article in the future.

Hopefully, this article was beneficial for you in some way or another.

Do show your support by dropping in a follow and a comment :’)


Written by Pushpak Ruhil

Data Science | Machine Learning | Python | Tableau | Tech geek