Bernstein Polynomial and Weierstrass Approximation Theorem

Bernstein Polynomial and Weierstrass Approximation Theorem

The Weierstrass Approximation Theorem tells us that any continuous function $f$ on a closed interval can be uniformly approximated by polynomial functions. In 1912, Bernstein, based on probability theory, constructed Bernstein polynomials and proved the Weierstrass Approximation Theorem. This paper will introduce the essence of how Bernstein polynomials can approximate continuous functions from both the probabilistic and pure analytic perspectives. First, we present the Weierstrass Approximation Theorem:

Weierstrass Approximation Theorem

Theorem 1 (Weierstrass Approximation Theorem): Let $f$ be a continuous function on the closed interval $[a,b]$. For any $\varepsilon > 0$, there exists a polynomial $p$ such that

$$
\max_{x \in [a,b]} |p(x) - f(x)| < \varepsilon
$$

Without loss of generality, we will assume that $[a,b] = [0,1]$ in the subsequent discussion.

Bernstein Polynomials

Definition 1 (Bernstein Basis and Bernstein Polynomial): For any positive integer $n$, define

$$
B_{n,k}(x) = \binom{n}{k} x^k (1-x)^{n-k}, \quad k = 0, 1, \dots, n
$$

We call $\{B_{n,k}; k = 0, 1, \dots, n\}$ the Bernstein basis of degree $n$. For any $f \in C[0,1]$, we define the $n$-th degree Bernstein polynomial of $f$ by

$$
B_n^f(x) = \sum_{k=0}^n f\left( \frac{k}{n} \right) B_{n,k}(x) = \sum_{k=0}^n f\left( \frac{k}{n} \right) \binom{n}{k} x^k (1-x)^{n-k}
$$

Probabilistic Perspective

Consider the success probability $x$ for $n$-th Bernoulli trials, where the number of successes $S_n$ follows a binomial distribution. By the law of large numbers, we know that as $n \to \infty$,

$$
\frac{S_n}{n} \stackrel{\mathbb{P}}{\to} x
$$

Hence, for any continuous function $f$, as $n \to \infty$, we have

$$
f\left( \frac{S_n}{n} \right) \stackrel{\mathbb{P}}{\to} f(x)
$$

However,

$$
\begin{aligned}
\mathbb{E}\left[ f\left( \frac{S_n}{n} \right) \right] &= \sum_{k=0}^n f\left( \frac{k}{n} \right) \mathbb{P}(S_n = k) \\\
&= \sum_{k=0}^n f\left( \frac{k}{n} \right) \binom{n}{k} x^k (1-x)^{n-k} \\\
&= B_n^f(x)
\end{aligned}
$$

which is precisely the Bernstein polynomial. Based on the law of large numbers and the uniqueness of the limiting convergence in probability, we know that the Bernstein polynomial $B_n^f$ can approximate the continuous function $f$. But is this approximation uniform? The answer is yes, and we will prove this below based on the above analysis and provide a proof of the Weierstrass Approximation Theorem.

Proof of Weierstrass Approximation Theorem (Probabilistic Perspective)

Proof: Since $f$ is continuous on the closed interval $[0,1]$, it is bounded and uniformly continuous. Let $M = \max_{x \in [a,b]} |f(x)|$. For any $\varepsilon > 0$, there exists $\delta > 0$ such that whenever $|y - x| < \delta$, we have

$$
|f(y) - f(x)| < \frac{\varepsilon}{2}.
$$

Thus,

$$
\begin{aligned}
|B_n^f(x) - f(x)| &= \left| \mathbb{E}[f\left( \frac{S_n}{n} \right) - f(x)] \right| \\\
&\le \mathbb{E}\left[ \left| f\left( \frac{S_n}{n} \right) - f(x) \right| (1_{\left| \frac{S_n}{n} - x \right| < \delta} + 1_{\left| \frac{S_n}{n} - x \right| \ge \delta}) \right] \\\
&< \frac{\varepsilon}{2} + 2M \mathbb{P}\left( \left| \frac{S_n}{n} - x \right| \ge \delta \right).
\end{aligned}
$$

By Chebyshev’s inequality,
$$
\mathbb{P}\left( \left| \frac{S_n}{n} - x \right| \ge \delta \right) \le \frac{1}{\delta^2} \operatorname{Var}\left( \frac{S_n}{n} \right) = \frac{1}{\delta^2} n^{-1} x (1 - x) \le \frac{1}{4n \delta^2}.
$$

Thus, take $N = \left[ \frac{1}{\varepsilon \delta^2} \right] + 1$. When $n \ge N$,

$$
|B_n^f(x) - f(x)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
$$

Therefore, as $n \to \infty$, $B_n^f$ uniformly converges to $f$ on $[0,1]$.

Pure Analytic Perspective

First, we calculate the corresponding Bernstein polynomials for some specific functions $f$.

Example 1: For $f(x) \equiv 1$, we have $B_n^f(x) \equiv 1$.

Example 2: For $f(x) = x$, we have $B_n^f(x) = x$.

Example 3: For $f(x) = x^2$, we have

$$
B_n^f(x) = x^2 - \frac{1}{n}x^2 + \frac{1}{n}x.
$$

The proofs for these three examples are left as exercises for the reader, and they can be easily verified. From Example 3, we can also see that even polynomial functions might not have their Bernstein polynomials exactly equal to themselves.

Now, we provide three important properties of Bernstein bases:

Important Properties of Bernstein Bases

Property 1 (Non-negativity): For any positive integer $n$, $B_{n,k} \ge 0$, for $k = 0, 1, \dots, n$.

Property 2 (Unity): For any positive integer $n$,

$$
\sum_{k=0}^n B_n^f(x) \equiv 1.
$$

Proof:

$$
\sum_{k=0}^n B_n^f(x) = \sum_{k=0}^n \binom{n}{k} x^k (1-x)^{n-k} = (1-x + x)^n = 1.
$$

Property 3 (Energy Concentration): For any positive integer $n$, the maximum value of $B_{n,k}(x)$ occurs at $x = \frac{k}{n}$, and

$$
\sum_{k; \left| \frac{k}{n} - x \right| \ge n^{-1/4}} B_{n,k}(x) \le \frac{1}{4\sqrt{n}} \to 0, \quad n \to \infty.
$$

Proof: By differentiation, the maximum value of $B_{n,k}(x)$ occurs at $x = \frac{k}{n}$. When $\left| \frac{k}{n} - x \right| \ge n^{-1/4}$, $|k - nx| \ge n^{3/4}$, so $(k - nx)^2 \ge n^{3/2}$. Thus,

$$
\begin{aligned}
n^{3/2} \sum_{k; \left| \frac{k}{n} - x \right| \ge n^{-1/4}} B_{n,k}(x) &\le \sum_{k; \left| \frac{k}{n} - x \right| \ge n^{-1/4}} (k - nx)^2 B_{n,k}(x) \\\
&\le \sum_{k=0}^n (k - nx)^2 B_{n,k}(x) \\\
& = \sum_{k=0}^n k^2 B_{n,k}(x) - 2 \sum_{k=0}^n nkx B_{n,k}(x) + \sum_{k=0}^n n^2 x^2 B_{n,k}(x) \\\
& = n^2 B_n^{x^2} - 2n^2 x B_n^x + n^2 x^2 B_n^1 \\\
& = n^2 (x^2 - \frac{1}{n} x^2 + \frac{1}{n} x) - 2n^2 x^2 + n^2 x^2 \\\
& = n x(1 - x) \le \frac{1}{4} n.
\end{aligned}
$$

Thus,
$$
\sum_{k; \left| \frac{k}{n} - x \right| \ge n^{-1/4}} B_{n,k}(x) \le \frac{1}{4\sqrt{n}} \to 0, \quad n \to \infty.
$$

With these properties, we can now give another proof of the Weierstrass Approximation Theorem.

Proof of Weierstrass Approximation Theorem (Pure Analytic Perspective)

Proof: Since $f$ is continuous on the closed interval $[0,1]$, it is bounded and uniformly continuous. Let $M = \max_{x \in [a,b]} |f(x)|$. For any $\varepsilon > 0$, there exists $\delta > 0$ such that whenever $|y - x| < \delta$, we have

$$
|f(y) - f(x)| < \frac{\varepsilon}{2}.
$$

Take $N = \max\left\{ \left[ \frac{1}{\delta^4} \right] + 1, \left[ \frac{M^2}{\varepsilon^2} \right] + 1 \right\}$, and when $n \ge N$, $n^{-1/4} < \delta$ and $2M \cdot \frac{1}{4\sqrt{n}} < \frac{\varepsilon}{2}$, so

$$
\begin{aligned}
|B_n^f(x) - f(x)| &= \left| \sum_{k=0}^n f\left( \frac{k}{n} \right) B_{n,k}(x) - \sum_{k=0}^n f(x) B_{n,k}(x) \right| \quad \text{(using unity)} \\\
&\le \sum_{k; \left| \frac{k}{n} - x \right| \ge n^{-1/4}} \left| f(x) - f\left( \frac{k}{n} \right) \right| B_{n,k}(x) + \sum_{k; \left| \frac{k}{n} - x \right| < n^{-1/4}} \left| f(x) - f\left( \frac{k}{n} \right) \right| B_{n,k}(x) \\\
&< 2M \sum_{k; \left| \frac{k}{n} - x \right| \ge n^{-1/4}} B_{n,k}(x) + \frac{\varepsilon}{2} \sum_{k=0}^n B_{n,k}(x) \quad \text{(using uniform continuity and non-negativity)} \\\
&\le 2M \cdot \frac{1}{4\sqrt{n}} + \frac{\varepsilon}{2} \quad \text{(using energy concentration)} \\\
&< \varepsilon.
\end{aligned}
$$

Thus, as $n \to \infty$, $B_n^f$ uniformly converges to $f$ on $[0,1]$.

Three Important Properties

The three properties of the Bernstein basis mentioned above are actually quite essential. In fact, these three properties are also important guarantees for convolution approximation, and when summarized, they form the definition of a Dirac sequence.

Definition 2 (Dirac Sequence): A sequence of functions $\{k_n\} \subseteq L^1(\mathbb{R})$ is called a Dirac sequence if it satisfies the following three properties:

  • (Non-negativity): $k_n(x) \ge 0, \ \forall n \in \mathbb{N}$.

  • (Unity): $\int_{\mathbb{R}} k_n(x) , dx = 1, \ \forall n \in \mathbb{N}$.

  • (Energy Concentration): For all $\varepsilon > 0$, and for all $r > 0$, there exists an $N \in \mathbb{N}$ such that when $n \ge N$,

    $$
    \int_{|x| > r} k_n(x) , dx < \varepsilon.
    $$

Similar to the proof of the Weierstrass Approximation Theorem above, we can prove the following convolution approximation theorem. When taking $k_n$ as a Landau sequence, $k_n \star f$ is a sequence of polynomials, and therefore it can also serve as another proof of the Weierstrass Approximation Theorem.

Theorem 2 (Convolution Approximation Theorem): Let $f$ be a continuous function on the closed interval $[a,b]$, and let the sequence of functions $\{k_n\} \subseteq L^1(\mathbb{R})$ be a Dirac sequence. Then, $k_n \star f$ uniformly converges to $f$ on $[a,b]$, where $k_n \star f (x) = \int_{\mathbb{R}} k_n(y) f(x-y) , dy$ denotes the convolution.

Proof: The proof of this theorem is left as an exercise. Based on the above analysis, the reader can easily prove it.

The cover image of this article was taken at the Rhine Falls in Switzerland.

Author

Handstein Wang

Posted on

2024-11-09

Updated on

2024-11-09

Licensed under