
Basic Probability - Counting and Random Variables



(Ace the Data Science Interview: 201 Real Interview Questions Asked By FAANG, Tech Startups, & Wall Street)

Counting

Counting comes up frequently, either as a technique in its own right or as a tool for computing probabilities. (For example, what is the probability that I draw four cards of the same suit?)

Two counting techniques are generally relevant. If the order of selection matters when choosing $k$ out of $n$ items, then the number of possible permutations is counted:

$n \cdot (n-1) \cdots (n-k+1) = \frac{n!}{(n-k)!}$
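As a quick illustration (a sketch, not taken from the book), Python's `math.perm` computes this count directly; the loop below mirrors the product formula:

```python
from math import factorial, perm

n, k = 10, 3

# Permutations: n * (n-1) * ... * (n-k+1) = n! / (n-k)!
manual = 1
for i in range(n, n - k, -1):
    manual *= i

assert manual == factorial(n) // factorial(n - k) == perm(n, k)
print(manual)  # 720 ordered ways to select 3 of 10 items
```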



In contrast, if the order of selection does not matter, then the number of possible combinations is counted:

$\binom{n}{k} = \frac{n!}{k!(n-k)!}$
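Similarly, `math.comb` gives the combination count. Here is a sketch applying it to the card question from the introduction (assuming exactly four cards are drawn from a standard 52-card deck):

```python
from math import comb

# C(n, k) = n! / (k! * (n - k)!): unordered selections of k items out of n
assert comb(10, 3) == 120

# P(all four drawn cards share a suit):
# pick the suit (4 ways), pick 4 of its 13 cards, divide by all 4-card hands
p_same_suit = 4 * comb(13, 4) / comb(52, 4)
print(round(p_same_suit, 4))  # ~0.0106
```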



Knowing these concepts is essential for evaluating various probabilities related to counting procedures. Therefore, it's important to distinguish when the order of selection matters from when it does not.

Both permutations and combinations are frequently encountered in questions related to combinatorics and graph theory.

Random Variables

Random variables are a fundamental concept in probability, and it’s essential to grasp their principles and be able to manipulate them.

A random variable is a quantity with an associated probability distribution. It can be either discrete (i.e., have a countable range) or continuous (have an uncountable range).

The probability distribution associated with a discrete random variable is called a probability mass function (PMF), while for a continuous random variable it is called a probability density function (PDF). Both can be written as a function of $x$: $f_X(x)$.

In the discrete case, $X$ takes on specific values with particular probabilities. In the continuous case, the probability of any single value of $x$ is zero; instead, a "probability mass" per unit length around $x$ is measured (think of a small interval between $x$ and $x + \delta$). In both cases, the total probability must equal one; a quick numerical check follows the list below.

- Discrete: $ \sum_{x \in X}f_X(x) = 1$

- Continuous: $\int ^{\infty}_{-\infty} f_{X}(x)\,dx = 1$
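The check below is only a sketch; it assumes a fair six-sided die for the discrete case and a unit-rate exponential PDF for the continuous case, with a midpoint Riemann sum standing in for the integral:

```python
import math
from fractions import Fraction

# Discrete: the PMF of a fair six-sided die sums to exactly 1
pmf = {face: Fraction(1, 6) for face in range(1, 7)}
print(sum(pmf.values()))  # 1

# Continuous: the exponential PDF f(x) = exp(-x), x >= 0, integrates to 1
# (midpoint Riemann sum over [0, 50] approximates the integral)
dx = 1e-4
total = sum(math.exp(-(i + 0.5) * dx) * dx for i in range(int(50 / dx)))
print(round(total, 4))  # 1.0
```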


In practice, the cumulative distribution function (CDF) is often used rather than a variable's PMF or PDF, and it is defined the same way in both cases: $F_{X}(x) = P(X \leq x)$.

The CDF is computed as follows; a short numerical sketch appears after the list:

- Discrete: $F_{X}(x) = \sum_{k \leq x}f_X(k)$

- Continuous: $F_X(x) = \int^{x}_{-\infty}f_X(y)\,dy$
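As a sketch (reusing the unit-rate exponential and the fair die from above), the continuous CDF can be approximated by integrating the PDF up to $x$ and compared against the known closed form $1 - e^{-x}$, while the discrete CDF is a running sum of the PMF:

```python
import math

def exp_cdf_numeric(x, dx=1e-4):
    """Approximate F_X(x) = integral of exp(-y) dy over [0, x] (midpoint rule)."""
    return sum(math.exp(-(i + 0.5) * dx) * dx for i in range(int(x / dx)))

x = 2.0
print(round(exp_cdf_numeric(x), 4))  # ~0.8647
print(round(1 - math.exp(-x), 4))    # 0.8647, the closed-form CDF

# Discrete: F_X(3) for a fair die is the running sum of f_X(k) for k <= 3
pmf = {face: 1 / 6 for face in range(1, 7)}
print(sum(p for k, p in pmf.items() if k <= 3))  # 0.5 (up to float rounding)
```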


For example, a Poisson random variable is a common discrete random variable whose PMF is

$P(X=k) = \frac{e^{- \lambda}\lambda^{k}}{k!}$

mean: $\mu = \lambda$
variance: $\sigma^2 = \lambda$
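Here is a short sketch that checks the PMF sums to 1 and that the mean and variance both come out to $\lambda$ (the infinite sum is truncated at $k = 100$, which is more than enough for $\lambda = 4$):

```python
from math import exp, factorial

lam = 4.0

def poisson_pmf(k, lam):
    """P(X = k) = exp(-lambda) * lambda**k / k!"""
    return exp(-lam) * lam**k / factorial(k)

ks = range(100)  # truncated support; the tail beyond k = 100 is negligible here
total = sum(poisson_pmf(k, lam) for k in ks)
mean = sum(k * poisson_pmf(k, lam) for k in ks)
var = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)

print(round(total, 6), round(mean, 6), round(var, 6))  # ~1.0, ~4.0, ~4.0
```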


