Kolmogorov Consistency Theorem
How can one define a product probability measure on infinitely many probability spaces? More precisely, given a family of measurable spaces $(\Omega_t,\mathcal F_t)$, $t\in T$, suppose that for every nonempty finite subset $S\subset T$ a probability measure $\mathbb P_S$ has already been defined on $\prod_{t\in S}\mathcal F_t$. How can we then define a “suitable” probability measure $\mathbb P$ on the product measurable space
$$
\left(\prod_{t\in T}\Omega_t,\prod_{t\in T}\mathcal F_t\right)
$$
such that its marginal distribution measures are exactly $\mathbb P_S$?
Note: By “suitable”, we mainly mean the following aspects:
Is the probability measure $\mathbb P$, as a mapping, well-defined? That is, if $A,B\in \prod_{t\in T}\mathcal F_t$ and $A=B$, do we have $\mathbb P(A)=\mathbb P(B)$?
Does the probability measure $\mathbb P$ exist, and is it unique?
Is the probability measure $\mathbb P$ really a measure? Does it satisfy countable additivity?
Are the marginal distributions of $\mathbb P$ exactly $\mathbb P_S$? That is, if $A_S\in \prod_{t\in S}\mathcal F_t$, do we have
$$
\mathbb P\left(A_S\times \prod_{t\in T\backslash S}\Omega_t\right)=\mathbb P_S(A_S)?
$$
Note: The well-definedness issue arises because the same set in the product $\sigma$-algebra can be represented in different ways. For example, if $S_1\subset S_2\subset T$, then
$$
A_{S_1}\times \prod_{t\in T\backslash S_1}\Omega_t
=
\left(A_{S_1}\times \prod_{t\in S_2\backslash S_1}\Omega_t\right)\times \prod_{t\in T\backslash S_2}\Omega_t.
$$
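The identity above can be checked by brute force on a toy example. The following is a minimal sketch, assuming a hypothetical index set $T=\{0,1,2\}$ with every $\Omega_t=\{0,1\}$; the names `OMEGA` and `cylinder` are illustrative only.

```python
from itertools import product

# Hypothetical toy setup: index set T = {0, 1, 2}, every Omega_t = {0, 1}.
T = (0, 1, 2)
OMEGA = {t: (0, 1) for t in T}


def cylinder(base, S):
    """Return the cylinder A_S x prod_{t not in S} Omega_t as a set of points.

    `S` is a tuple of indices and `base` is a set of tuples indexed like `S`.
    A point of the full product is stored as a tuple (omega_0, omega_1, omega_2).
    """
    points = set()
    for omega in product(*(OMEGA[t] for t in T)):
        if tuple(omega[t] for t in S) in base:
            points.add(omega)
    return points


# Base over S1 = {0}: the first coordinate equals 1.
S1, A_S1 = (0,), {(1,)}
# The same event written over S2 = {0, 1}: A_S1 x Omega_1.
S2 = (0, 1)
A_S2 = {(a, b) for (a,) in A_S1 for b in OMEGA[1]}

same_set = cylinder(A_S1, S1) == cylinder(A_S2, S2)
print(same_set)  # True
```

Both representations describe the same four points, so any definition of $\mathbb P$ through a base set must assign them the same value.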
In fact, the Kolmogorov consistency theorem gives the answer to the above questions. In order to introduce this result as clearly as possible, we first briefly review some important results and the definition of product probability measures on finitely many probability spaces, then consider product probability measures on countably many probability spaces, and finally consider the case of uncountably many spaces.

Figure 1: Andrey Nikolaevich Kolmogorov (Russian: Андре́й Никола́евич Колмого́ров, April 25, 1903 -- October 20, 1987), Russian mathematician and founder of modern probability theory. At one time, some people even wondered whether “Kolmogorov” was the name of a person or of a research institute.
Review of Important Results
Definition 1 (Cartesian product)
Let $\Omega_t$, $t\in T$, be a family of sets, where $T$ is an index set (it may be finite, countably infinite, or uncountable). Then their Cartesian product is defined by
$$
\prod_{t\in T}\Omega_t
=
\left\{\omega:T\to \bigcup_{t\in T}\Omega_t\ ;\ \omega(t)=\omega_t\in \Omega_t,\ \forall t\in T\right\}.
$$
Note: The Cartesian product of finitely many sets is usually defined via ordered tuples; for two sets,
$$
\Omega_1\times \Omega_2
=
\{(\omega_1,\omega_2)\ ;\ \omega_1\in \Omega_1,\omega_2\in \Omega_2\}.
$$
In fact, each ordered pair $(\omega_1,\omega_2)$ can be regarded as such a mapping:
$$
\omega:\{1,2\}\to \Omega_1\cup \Omega_2
$$
satisfying
$$
\omega(1)=\omega_1\in \Omega_1,\quad \omega(2)=\omega_2\in \Omega_2.
$$
Moreover, although each ordered pair $(\omega_1,\omega_2)$ seems to depend on the order, this order is in fact not essential. Therefore, we may regard each ordered pair as a mapping, view the Cartesian product as the set of such mappings, and extend this to infinitely many sets.
Definition 2 (Coordinate projection mapping)
Define the coordinate projection mapping by
$$
\begin{aligned}
\pi_t:\prod_{s\in T}\Omega_s&\to \Omega_t\\
\omega&\mapsto \omega(t)=\omega_t
\end{aligned}
$$
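For a finite index set, Definitions 1 and 2 can be mirrored directly in code. The sketch below assumes a hypothetical family `OMEGA` of two finite sets; an element of the product is stored as a `dict`, that is, as a mapping $t\mapsto\omega_t$, and `projection` plays the role of $\pi_t$.

```python
from itertools import product

# Hypothetical finite family of sets, indexed by T = {"a", "b"}.
OMEGA = {"a": (0, 1), "b": ("x", "y", "z")}


def cartesian_product(omega_sets):
    """List every mapping t -> omega_t with omega_t in Omega_t, as a dict."""
    index = list(omega_sets)
    return [dict(zip(index, values))
            for values in product(*(omega_sets[t] for t in index))]


def projection(t):
    """Coordinate projection pi_t: omega -> omega(t)."""
    return lambda omega: omega[t]


points = cartesian_product(OMEGA)
pi_b = projection("b")
print(len(points))      # 6
print(pi_b(points[0]))  # x
```

Because dictionaries compare equal regardless of key order, `{"a": 0, "b": "x"} == {"b": "x", "a": 0}` holds, which reflects the remark that the order of the coordinates is not essential.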
Definition 3 (Product $\sigma$-algebra)
Let $( \Omega_t,\mathcal F_t),\ t\in T$ be a family of measurable spaces. Define the product $\sigma$-algebra on $\prod_{t\in T} \Omega_t$ to be the smallest $\sigma$-algebra making all $\pi_t$ measurable, namely
$$
\prod_{t\in T}\mathcal F_t=\sigma(\pi_t,t\in T),
$$
where $\pi_t$ are the corresponding coordinate projection mappings.
Regarding the structure of the product $\sigma$-algebra, we have the following very important result.
Theorem 1 (Structure theorem of the product $\sigma$-algebra)
$$
\prod_{t\in T}\mathcal F_t=\sigma(J_0)=\sigma(Z_0)=\sigma(J)=\sigma(Z)=Z
$$
where
$$
J_0
=
\bigcup_{S\subset T,\text{ nonempty finite}}
\left\{
\bigcap_{t\in S}\pi_t^{-1}(A_t)\ ;\ A_t\in \mathcal F_t
\right\}
=
\bigcup_{S\subset T,\text{ nonempty finite}}
\left\{
\prod_{t\in S}A_t\times \prod_{t\in T\backslash S}\Omega_t\ ;\ A_t\in \mathcal F_t
\right\}
$$
is a semialgebra, called the finite-dimensional base measurable rectangles;
$$
Z_0
=
\bigcup_{S\subset T,\text{ nonempty finite}}\sigma(\pi_t,t\in S)
=
\bigcup_{S\subset T,\text{ nonempty finite}}
\left\{
A_S\times \prod_{t\in T\backslash S}\Omega_t\ ;\ A_S\in \prod_{t\in S}\mathcal F_t
\right\}
$$
is an algebra, called the finite-dimensional base measurable cylinders;
$$
J
=
\bigcup_{S\subset T,\text{ nonempty countable}}
\left\{
\bigcap_{t\in S}\pi_t^{-1}(A_t)\ ;\ A_t\in \mathcal F_t
\right\}
=
\bigcup_{S\subset T,\text{ nonempty countable}}
\left\{
\prod_{t\in S}A_t\times \prod_{t\in T\backslash S}\Omega_t\ ;\ A_t\in \mathcal F_t
\right\}
$$
is a $\pi$-class, called the countable-dimensional base measurable rectangles;
$$
Z
=
\bigcup_{S\subset T,\text{ nonempty countable}}\sigma(\pi_t,t\in S)
=
\bigcup_{S\subset T,\text{ nonempty countable}}
\left\{
A_S\times \prod_{t\in T\backslash S}\Omega_t\ ;\ A_S\in \prod_{t\in S}\mathcal F_t
\right\}
$$
is a $\sigma$-algebra, called the countable-dimensional base measurable cylinders.
Note: The structure theorem of the product $\sigma$-algebra shows that even a product of uncountably many $\sigma$-algebras is determined by countably many coordinates: every set in it is a cylinder whose base lives on a countable index set. Therefore, once we understand the countable case, the uncountable case can be resolved naturally. Moreover, $\prod_{t\in T}\mathcal F_t=\sigma(J_0)=\sigma(Z_0)$ tells us that if we can define a countably additive set function on the relatively simple $J_0$ or $Z_0$ and then extend it to the generated $\sigma$-algebra, we obtain the probability measure we want. The Carathéodory extension theorem tells us when such an extension is possible.
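As a small illustration of the identity $\prod_{t\in T}\mathcal F_t=Z$, consider the extra assumptions (not needed elsewhere in this article) that $T$ is uncountable and every $\Omega_t$ contains at least two points. Every nonempty set in $Z$ has the form $A_S\times \prod_{t\in T\backslash S}\Omega_t$ with $S$ countable, so it leaves the coordinates in $T\backslash S\ne\varnothing$ completely free and therefore contains more than one point. Consequently, under these assumptions, no singleton is measurable:
$$
\{\omega\}\notin \prod_{t\in T}\mathcal F_t,\quad \forall \omega\in \prod_{t\in T}\Omega_t.
$$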
Theorem 2 (Carathéodory extension theorem)
Let $\mathcal{C}$ be a semiring on $\Omega$, and let $\mu$ be a countably additive nonnegative set function on $\mathcal{C}$. Then $\mu$ can be extended to a measure on $\sigma(\mathcal{C})$. If, in addition, the following uniqueness condition holds: $\mu$ is $\sigma$-finite on $\mathcal{C}$ and $\Omega\in \mathcal{C}_\sigma$, then the extension is unique.
Sometimes it is not easy to prove countable additivity of a set function, but it is often easy to know that it is finitely additive and continuous from above at the empty set. We have the following theorem.
Theorem 3 If $\mu$ is a finite nonnegative set function on an algebra $\mathcal{E}$, then $\mu$ is countably additive if and only if $\mu$ is finitely additive and continuous from above at the empty set.
Product Probability Measures on Finitely Many Probability Spaces
Suppose we have probability spaces $(\Omega_1,\mathcal F_1,\mathbb P_1)$ and $(\Omega_2,\mathcal F_2,\mathbb P_2)$. According to the structure theorem of the product $\sigma$-algebra,
$$
\mathcal F_1\times \mathcal F_2=\sigma(J_0)
$$
where
$$
J_0=\{A_1\times A_2\ ;\ A_1\in \mathcal F_1,A_2\in \mathcal F_2\}
$$
is a semialgebra. Define a nonnegative set function on it by
$$
\mathbb P(A_1\times A_2)=\mathbb P_1(A_1)\mathbb P_2(A_2).
$$
One can prove by means of sections that this is a countably additive nonnegative set function, and clearly the uniqueness condition is satisfied. Therefore, by the Carathéodory extension theorem, we can uniquely extend it to a measure on $\mathcal F_1\times \mathcal F_2=\sigma(J_0)$, and clearly it is a probability measure.
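On finite spaces the construction can be carried out by hand. The following is a minimal sketch, assuming two hypothetical finite probability spaces given by their point masses `P1` and `P2`; it defines $\mathbb P$ on rectangles by the product formula and extends it to arbitrary events by additivity, which is a finite stand-in for the Carathéodory extension rather than the general argument.

```python
from fractions import Fraction
from itertools import product

# Hypothetical finite probability spaces, given by their point masses.
P1 = {"H": Fraction(1, 3), "T": Fraction(2, 3)}
P2 = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}


def rectangle_measure(A1, A2):
    """P(A1 x A2) = P1(A1) * P2(A2) on a measurable rectangle."""
    return sum(P1[a] for a in A1) * sum(P2[b] for b in A2)


def product_measure(event):
    """Extension to an arbitrary subset of Omega_1 x Omega_2.

    On a finite space every event is a finite disjoint union of one-point
    rectangles {a} x {b}, so additivity forces this formula.
    """
    return sum(rectangle_measure({a}, {b}) for (a, b) in event)


omega = set(product(P1, P2))
non_rectangle = {("H", 1), ("T", 2)}  # an event that is not a rectangle

print(product_measure(omega))                    # 1
print(product_measure(non_rectangle))            # 1/3
print(product_measure({("H", b) for b in P2}))   # 1/3, the marginal P1({"H"})
```

Exact `Fraction` arithmetic is used so that the marginal and normalization checks are equalities rather than floating-point approximations.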
By induction, this construction yields suitable product probability measures on any finite number of probability spaces. Is the countable case equally easy? The answer is no.
Product Probability Measures on Countably Many Probability Spaces
The Kolmogorov consistency theorem gives a very good answer to this question. The Kolmogorov consistency theorem has many different forms. In this article, we will introduce a form that the author considers relatively practical, and appreciate its very elegant proof. Before that, we introduce some more basic definitions and conclusions.
Definition 4 (Polish space) Let $\Omega$ be a Hausdorff space. If there exists a metric $d$ on $\Omega$ compatible with its topology such that $(\Omega,d)$ is a complete separable metric space, then $\Omega$ is called a Polish space.
We have the following important result.
Theorem 4 Let $\Omega$ be a Polish space, let $\mathcal{B}(\Omega)$ be its Borel $\sigma$-algebra, and let $\mu$ be a finite measure on $\mathcal{B}(\Omega)$. Then $\mu$ is strongly inner regular, that is, for every $A\in \mathcal{B}(\Omega)$,
$$
\mu(A)=\sup\{\mu(K)\ ;\ K\subset A,K\text{ compact}\}.
$$
With the above preparations, we can now state the Kolmogorov consistency theorem.
Theorem 5 (Kolmogorov consistency theorem) Let $(\Omega_n,\mathcal F_n)$ be measurable spaces, where each $\Omega_n$ is a Polish space, $\mathcal F_n=\mathcal{B}(\Omega_n)$, and the index set is $n\in T=\mathbb{N}$. If for every nonempty finite subset $S\subset T$, there exists a probability measure $\mathbb P_S$ on $\prod_{n\in S}\mathcal F_n$, and they satisfy the following consistency condition: for every nonempty finite $S_1\subset S_2\subset T$, and every
$$
A_{S_1}\in \prod_{n\in S_1}\mathcal F_n
\Longrightarrow
A_{S_1}\times \prod_{n\in S_2\backslash S_1}\Omega_n\in \prod_{n\in S_2}\mathcal F_n,
$$
we have
$$
\mathbb P_{S_1}(A_{S_1})
=
\mathbb P_{S_2}\left(A_{S_1}\times \prod_{n\in S_2\backslash S_1}\Omega_n\right).
$$
Then there exists a unique probability measure $\mathbb P$ on $\left(\prod_{n\in T}\Omega_n,\prod_{n\in T}\mathcal F_n\right)$ such that for every nonempty finite $S\subset T$ and every $A_S\in \prod_{n\in S}\mathcal F_n$,
$$
\mathbb P\left(A_S\times \prod_{n\in T\backslash S}\Omega_n\right)=\mathbb P_S(A_S).
$$
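To make the consistency condition concrete, the sketch below checks it by brute force for the finite-dimensional distributions of a two-state Markov chain. The initial law `INIT`, the transition matrix `TRANS`, and all function names are hypothetical choices for illustration; finite state spaces are trivially Polish, so this family fits the setting of the theorem.

```python
from fractions import Fraction
from itertools import product

STATES = (0, 1)
# Hypothetical initial law and transition matrix of a two-state Markov chain.
INIT = {0: Fraction(1, 4), 1: Fraction(3, 4)}
TRANS = {0: {0: Fraction(1, 2), 1: Fraction(1, 2)},
         1: {0: Fraction(1, 3), 1: Fraction(2, 3)}}


def path_prob(path):
    """P_{1..n}({path}) = INIT(x_1) * prod TRANS(x_i, x_{i+1})."""
    prob = INIT[path[0]]
    for a, b in zip(path, path[1:]):
        prob *= TRANS[a][b]
    return prob


def measure(S, base):
    """P_S(base) for a finite tuple S of 1-based times and a set of tuples.

    Computed by summing path probabilities up to time max(S).
    """
    n = max(S)
    return sum(path_prob(path) for path in product(STATES, repeat=n)
               if tuple(path[t - 1] for t in S) in base)


def is_consistent(S1, S2):
    """Check P_S1(A) == P_S2(A x prod_{S2 \\ S1} Omega) for every base A."""
    points = list(product(STATES, repeat=len(S1)))
    for mask in product((False, True), repeat=len(points)):
        A = {p for p, keep in zip(points, mask) if keep}
        lifted = {q for q in product(STATES, repeat=len(S2))
                  if tuple(q[S2.index(t)] for t in S1) in A}
        if measure(S1, A) != measure(S2, lifted):
            return False
    return True


print(is_consistent((1, 3), (1, 2, 3, 4)))  # True
```

The check succeeds because each row of `TRANS` sums to one, so summing out the extra coordinates of $S_2$ leaves the probability unchanged.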
Proof: By the structure theorem of the product $\sigma$-algebra,
$$
\prod_{n\in T}\mathcal F_n=\sigma(Z_0).
$$
Step 1: According to the consistency condition, the set function $\mathbb P$ defined on $Z_0$ by
$$
\mathbb P\left(A_S\times \prod_{n\in T\backslash S}\Omega_n\right)=\mathbb P_S(A_S)
$$
is well-defined.
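A short sketch of why the consistency condition gives well-definedness, assuming every $\Omega_n$ is nonempty (which is automatic here, since each $\Omega_n$ carries a probability measure). Suppose the same set has two representations,
$$
B=A_{S_1}\times \prod_{n\in T\backslash S_1}\Omega_n=A_{S_2}\times \prod_{n\in T\backslash S_2}\Omega_n,
$$
and put $S=S_1\cup S_2$. Projecting $B$ onto the coordinates in $S$ shows that
$$
A_{S_1}\times \prod_{n\in S\backslash S_1}\Omega_n=A_{S_2}\times \prod_{n\in S\backslash S_2}\Omega_n,
$$
so applying the consistency condition to $S_1\subset S$ and to $S_2\subset S$ yields
$$
\mathbb P_{S_1}(A_{S_1})
=\mathbb P_S\left(A_{S_1}\times \prod_{n\in S\backslash S_1}\Omega_n\right)
=\mathbb P_S\left(A_{S_2}\times \prod_{n\in S\backslash S_2}\Omega_n\right)
=\mathbb P_{S_2}(A_{S_2}).
$$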
Step 2: The set function $\mathbb P$ defined on $Z_0$ satisfies finite additivity.
If $B_1,B_2\in Z_0$ and $B_1\cap B_2=\varnothing$, then there exist nonempty finite $S_1,S_2\subset T$ such that
$$
B_1=A_{S_1}\times \prod_{n\in T\backslash S_1}\Omega_n,\quad
B_2=A_{S_2}\times \prod_{n\in T\backslash S_2}\Omega_n
$$
where
$$
A_{S_1}\in \prod_{n\in S_1}\mathcal F_n,\quad
A_{S_2}\in \prod_{n\in S_2}\mathcal F_n.
$$
Let $S=S_1\cup S_2$, which is clearly nonempty and finite. Then
$$
\begin{aligned}
\mathbb P(B_1\uplus B_2)
&=\mathbb P\left(\left(\left(A_{S_1}\times \prod_{n\in S\backslash S_1}\Omega_n\right)\uplus \left(A_{S_2}\times \prod_{n\in S\backslash S_2}\Omega_n\right)\right)\times \prod_{n\in T\backslash S}\Omega_n\right)\\
&=\mathbb P_S\left(\left(A_{S_1}\times \prod_{n\in S\backslash S_1}\Omega_n\right)\uplus \left(A_{S_2}\times \prod_{n\in S\backslash S_2}\Omega_n\right)\right)\\
&=\mathbb P_S\left(A_{S_1}\times \prod_{n\in S\backslash S_1}\Omega_n\right)+\mathbb P_S\left(A_{S_2}\times \prod_{n\in S\backslash S_2}\Omega_n\right)\\
&=\mathbb P_{S_1}(A_{S_1})+\mathbb P_{S_2}(A_{S_2})\\
&=\mathbb P(B_1)+\mathbb P(B_2).
\end{aligned}
$$
Therefore finite additivity holds.
Step 3: The set function $\mathbb P$ defined on $Z_0$ is continuous from above at the empty set. That is, we need to prove that if $B_n\in Z_0$ and $B_n\downarrow \varnothing$, then
$$
\lim_{n\to\infty}\mathbb P(B_n)=0.
$$
We use proof by contradiction. Suppose that
$$
\lim_{n\to\infty}\mathbb P(B_n)\ne 0.
$$
Since $B_n\downarrow$, we have $\mathbb P(B_n)\downarrow$, and hence there exists $\varepsilon_0>0$ such that
$$
\mathbb P(B_n)\ge \varepsilon_0,\quad \forall n.
$$
Without loss of generality, assume that
$$
B_n=A_n\times \prod_{i=n+1}^{\infty}\Omega_i,\quad
A_n\in \prod_{i=1}^n\mathcal F_i.
$$
Since $\prod_{i=1}^n\Omega_i$ is a Polish space, by Theorem 4 there exists $K_n\subset A_n$, with $K_n$ compact, such that
$$
\mathbb P_{\{1,\cdots,n\}}(A_n\backslash K_n)\le \frac{\varepsilon_0}{2^{n+1}}.
$$
Let
$$
\widetilde{B_n}=K_n\times \prod_{i=n+1}^{\infty}\Omega_i,\quad
C_n=\bigcap_{i=1}^n \widetilde{B_i}\downarrow
$$
We claim that
$$
C_n\ne \varnothing,\quad \forall n.
$$
Indeed,
$$
\begin{aligned}
\mathbb P(B_n\backslash \widetilde{B_n})
&=\mathbb P\left((A_n\backslash K_n)\times \prod_{i=n+1}^{\infty}\Omega_i\right)\\
&=\mathbb P_{\{1,\cdots,n\}}(A_n\backslash K_n)\le \frac{\varepsilon_0}{2^{n+1}}.
\end{aligned}
$$
Hence
$$
\begin{aligned}
\mathbb P(B_n)-\mathbb P(C_n)
&=\mathbb P(B_n\backslash C_n)\\
&=\mathbb P\left(\bigcup_{i=1}^n (B_n\backslash \widetilde{B_i})\right)\le \mathbb P\left(\bigcup_{i=1}^n (B_i\backslash \widetilde{B_i})\right)\\
&\le \sum_{i=1}^n \mathbb P(B_i\backslash \widetilde{B_i})\le \frac{\varepsilon_0}{2}.
\end{aligned}
$$
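The final bound by $\varepsilon_0/2$ is a geometric sum: each term satisfies $\mathbb P(B_i\backslash \widetilde{B_i})\le \varepsilon_0/2^{i+1}$ by the previous display, and
$$
\sum_{i=1}^n \frac{\varepsilon_0}{2^{i+1}}
=\frac{\varepsilon_0}{2}\left(1-\frac{1}{2^n}\right)
<\frac{\varepsilon_0}{2}.
$$
Only monotonicity and finite subadditivity of $\mathbb P$ on the algebra $Z_0$ are used here, and both follow from the finite additivity established in Step 2.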
Combining this with $\mathbb P(B_n)\ge \varepsilon_0,\ \forall n$, we obtain
$$
\mathbb P(C_n)\ge \frac{\varepsilon_0}{2}>0.
$$
Therefore
$$
C_n\ne \varnothing,\quad \forall n.
$$
Thus for every $n\in T$, there exists
$$
\omega^n=(\omega_1^n,\omega_2^n,\cdots)\in C_n.
$$
Clearly, for every $m\in T$,
$$
\omega^{n+m}\in C_{n+m}\subset C_n\subset \widetilde{B_n}.
$$
By the definition of $\widetilde{B_n}$, we have
$$
(\omega_1^{n+m},\cdots,\omega_n^{n+m})\in K_n.
$$
Since $\omega_1^{1+m}\in K_1,\ \forall m$, and $K_1$ is compact and hence sequentially compact, there exists a subsequence $n_{1k}\ge 2$ such that
$$
\omega_1^{n_{1k}}\to \omega_1^\star\in K_1,\quad k\to\infty.
$$
For this subsequence, since
$$
(\omega_1^{n_{1k}},\omega_2^{n_{1k}})\in K_2,\quad \forall k\in T,
$$
there exists a further subsequence $n_{2k}\ge 3$ of $n_{1k}$ such that
$$
(\omega_1^{n_{2k}},\omega_2^{n_{2k}})\to (\omega_1^\star,\omega_2^\star)\in K_2,\quad k\to\infty.
$$
Proceeding inductively, for every $m\in T$, there exists a subsequence $n_{mk}\ge m+1$ of the previous one such that
$$
(\omega_1^{n_{mk}},\omega_2^{n_{mk}},\cdots,\omega_m^{n_{mk}})
\to
(\omega_1^\star,\omega_2^\star,\cdots,\omega_m^\star)\in K_m,\quad k\to\infty.
$$
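Therefore, since the subsequences are nested (so the limits obtained at different stages agree) and $(\omega_1^\star,\cdots,\omega_n^\star)\in K_n$ for every $n$,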
$$
\omega=(\omega_1^\star,\omega_2^\star,\cdots)\in \bigcap_{n=1}^{\infty}\widetilde{B_n}\subset \bigcap_{n=1}^{\infty}B_n=\varnothing.
$$
This is a contradiction. Hence the assumption is false, and
$$
\lim_{n\to\infty}\mathbb P(B_n)=0.
$$
Step 4: Extend $\mathbb P$ to $\prod_{n\in T}\mathcal F_n=\sigma(Z_0)$.
By Steps 1–3 and Theorem 3, we know that $\mathbb P$ is a countably additive nonnegative set function on the algebra $Z_0$, and clearly
$$
\mathbb P\left(\prod_{n\in T}\Omega_n\right)=1.
$$
Furthermore, the uniqueness condition is also clearly satisfied. Therefore, by the Carathéodory extension theorem (Theorem 2), $\mathbb P$ can be uniquely extended to a probability measure on $\prod_{n\in T}\mathcal F_n$, and by construction its finite-dimensional marginals are exactly the $\mathbb P_S$.
Product Probability Measures on Uncountably Many Probability Spaces
With the result for the countable case, we can rather surprisingly obtain the uncountable result very easily.
Theorem 6 (Kolmogorov consistency theorem) Let $(\Omega_t,\mathcal F_t)$ be measurable spaces, where each $\Omega_t$ is a Polish space, $\mathcal F_t=\mathcal{B}(\Omega_t)$, and the index set $t\in T$ is uncountable. If for every nonempty finite subset $S\subset T$, there exists a probability measure $\mathbb P_S$ on $\prod_{t\in S}\mathcal F_t$, and they satisfy the following consistency condition: for every nonempty finite $S_1\subset S_2\subset T$, and every
$$
A_{S_1}\in \prod_{t\in S_1}\mathcal F_t
\Longrightarrow
A_{S_1}\times \prod_{t\in S_2\backslash S_1}\Omega_t\in \prod_{t\in S_2}\mathcal F_t,
$$
we have
$$
\mathbb P_{S_1}(A_{S_1})
=
\mathbb P_{S_2}\left(A_{S_1}\times \prod_{t\in S_2\backslash S_1}\Omega_t\right).
$$
Then there exists a unique probability measure $\mathbb P$ on $\left(\prod_{t\in T}\Omega_t,\prod_{t\in T}\mathcal F_t\right)$ such that for every nonempty finite $S\subset T$ and every $A_S\in \prod_{t\in S}\mathcal F_t$,
$$
\mathbb P\left(A_S\times \prod_{t\in T\backslash S}\Omega_t\right)=\mathbb P_S(A_S).
$$
Proof: By the structure theorem of the product $\sigma$-algebra,
$$
\prod_{t\in T}\mathcal F_t=Z.
$$
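By Theorem 5, for every nonempty countable subset $S'\subset T$, we have a probability measure $\mathbb P_{S'}$ on $\prod_{t\in S'}\mathcal F_t$ whose finite-dimensional marginals are the given $\mathbb P_S$, $S\subset S'$ finite. Thus we define a set function $\mathbb P$ on $Z$ by
$$
\mathbb P\left(A_{S'}\times \prod_{t\in T\backslash S'}\Omega_t\right)=\mathbb P_{S'}(A_{S'}),
$$
where $S'\subset T$ is nonempty and countable. By the uniqueness part of Theorem 5, the measures $\mathbb P_{S'}$ are consistent with one another, so $\mathbb P$ is well-defined, just as in Step 1 of the proof of Theorem 5. Clearly $\mathbb P$ is a nonnegative set function and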
$$
\mathbb P\left(\prod_{t\in T}\Omega_t\right)
=
\mathbb P\left(\Omega_s\times \prod_{t\in T\backslash \{s\}}\Omega_t\right)
=
\mathbb P_{\{s\}}(\Omega_s)
=
1.
$$
Therefore we only need to prove countable additivity. If $B_n\in Z$ are pairwise disjoint and
$$
B=\biguplus_{n=1}^{\infty} B_n,
$$
let
$$
B_n=A_{S_n}\times \prod_{t\in T\backslash S_n}\Omega_t
$$
where $S_n$ is nonempty and countable. Let
$$
S=\bigcup_{n=1}^{\infty}S_n,
$$
which is clearly also nonempty and countable. Hence
$$
\begin{aligned}
\mathbb P\left(\biguplus_{n=1}^{\infty} B_n\right)
&=\mathbb P\left(\left(\biguplus_{n=1}^{\infty}\left(A_{S_n}\times \prod_{t\in S\backslash S_n}\Omega_t\right)\right)\times \prod_{t\in T\backslash S}\Omega_t\right)\\
&=\mathbb P_S\left(\biguplus_{n=1}^{\infty}\left(A_{S_n}\times \prod_{t\in S\backslash S_n}\Omega_t\right)\right)\\
&=\sum_{n=1}^{\infty}\mathbb P_S\left(A_{S_n}\times \prod_{t\in S\backslash S_n}\Omega_t\right)\\
&=\sum_{n=1}^{\infty}\mathbb P\left(A_{S_n}\times \prod_{t\in T\backslash S_n}\Omega_t\right)\\
&=\sum_{n=1}^{\infty}\mathbb P(B_n).
\end{aligned}
$$
To prove uniqueness, suppose two probability measures $\mathbb P$ and $\mathbb{Q}$ both satisfy the requirement. Then their restrictions to the algebra $Z_0$ coincide, and since $Z_0$ generates $\prod_{t\in T}\mathcal F_t$, the uniqueness part of Theorem 2 gives $\mathbb P\equiv \mathbb{Q}$.
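Since $\mathbb{R}$ is a Polish space with Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R})$, and a probability measure on $\mathcal{B}(\mathbb{R}^n)$ is determined by its distribution function, we may also state the theorem in terms of finite-dimensional distribution functions. Thus we have the following corollary.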
Corollary (Kolmogorov consistency theorem) Suppose that for every $n\in \mathbb{N}$ and every choice of distinct $t_1,\cdots,t_n\in T$, there exists a finite-dimensional distribution function $F_{t_1,\cdots,t_n}(x_1,\cdots,x_n)$ satisfying
(1) Symmetry: for every permutation $\sigma=(\sigma(1),\cdots,\sigma(n))$ of $\{1,2,\cdots,n\}$,
$$
F_{t_{\sigma(1)},\cdots,t_{\sigma(n)}}(x_{\sigma(1)},\cdots,x_{\sigma(n)})
=
F_{t_1,\cdots,t_n}(x_1,\cdots,x_n).
$$
(2) Consistency: for every $n>m$,
$$
F_{t_1,\cdots,t_m,t_{m+1},\cdots,t_n}(x_1,\cdots,x_m,+\infty,\cdots,+\infty)
=
F_{t_1,\cdots,t_m}(x_1,\cdots,x_m)
$$
Then there exists a unique probability measure $\mathbb P$ on $\prod_{t\in T}\mathcal{B}(\mathbb{R})$ such that under $\mathbb P$, for every $n\in \mathbb{N}$ and every $t_1,\cdots,t_n\in T$, $F_{t_1,\cdots,t_n}(x_1,\cdots,x_n)$ is exactly the distribution function of the projection $(\pi_{t_1},\cdots,\pi_{t_n})$.
Note: The symmetry in the above consistency theorem reflects the unordered nature of the Cartesian product of infinitely many sets. Together with the consistency condition, it is equivalent to satisfying the consistency condition of the previous theorem on a $\pi$-class. Therefore, by the monotone class theorem, the reader can easily prove that the conditions in the corollary are equivalent to the consistency condition, and hence the conclusion follows.
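The two conditions of the corollary can be spot-checked numerically. The sketch below assumes two hypothetical families of distribution functions: `F_iid` (independent logistic coordinates) and `F_constant` (the process $X_t=X$ for all $t$, with $X$ logistic). It tests symmetry and consistency only at a single sample point, so it is a sanity check rather than a proof.

```python
import math
from itertools import permutations


def logistic_cdf(x):
    """Logistic distribution function, written so that +/- infinity is allowed."""
    return 0.5 * (1.0 + math.tanh(x / 2.0))


def F_iid(ts, xs):
    """F_{t_1..t_n}(x_1..x_n) for i.i.d. logistic coordinates (ignores ts)."""
    return math.prod(logistic_cdf(x) for x in xs)


def F_constant(ts, xs):
    """F for the process X_t = X for all t: P(X <= min_i x_i)."""
    return logistic_cdf(min(xs))


def check_symmetry(F, ts, xs, tol=1e-12):
    """Compare F at (t_sigma, x_sigma) with F at (t, x) for every permutation."""
    n = len(ts)
    return all(
        abs(F([ts[i] for i in sigma], [xs[i] for i in sigma]) - F(ts, xs)) <= tol
        for sigma in permutations(range(n)))


def check_consistency(F, ts, xs, m, tol=1e-12):
    """Send the last n - m arguments to +infinity and compare with F_{t_1..t_m}."""
    padded = list(xs[:m]) + [math.inf] * (len(xs) - m)
    return abs(F(ts, padded) - F(ts[:m], xs[:m])) <= tol


ts, xs = [0.5, 2.0, 3.5], [-1.0, 0.3, 2.0]
for F in (F_iid, F_constant):
    print(check_symmetry(F, ts, xs), check_consistency(F, ts, xs, m=2))
```

A family that singles out one argument, such as `lambda ts, xs: logistic_cdf(xs[0])`, fails `check_symmetry`, which shows that the symmetry condition is not automatic.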
https://handsteinwang.github.io/2024/12/24/Kolmogorov-Consistency-Theorem/
