The Finite Element Method (FEM) is arguably the most powerful method known for the numerical solution of boundary- and initial-value problems characterized by partial differential equations. Consequently, it has had a monumental impact on virtually all areas of engineering and applied science.
There are two fundamental attributes of the method that are at the heart of its great utility and success. Firstly, it is based on the idea of partitioning bounded domains \(\Omega\) in \(\mathbb{R}^n\) into a number \(N\) of small, non-overlapping subdomains, the finite elements, over which functions are approximated by local functions, generally polynomials. Secondly, the boundary- and initial-value problems, to which the method is applied, are formulated in a so-called weak, or integral form, so that the contributions of each subdomain to the global integrals sum up to produce an integral characterizing the problem over the whole domain.
Some point to the first attribute of FEMs as its most important: the systemic representation of approximations of function spaces. From the viewpoint of approximation theory, the method provides a systematic approach to piecewise approximation over subdomains that produces sequences of functions that can approximate arbitrary members of, say, Sobolev spaces, arbitrarily closely in appropriate norms (e.g. Sobolev norms). This attribute is referred to as a property of "interpolation" for FEMs. It basically lifts the idea of approximate function representation from the classical notion of interpolation across values of functions or values of derivatives to a general approach for approximating functions with generalized derivatives in spaces such as \(L^p\left(\Omega\right)\) or \(W^{m,p}\left(\Omega\right)\ ,\) \((1\leq p \leq \infty,\ m \geq 0)\ .\)
But, were it only its ability to represent functions on quite general domains in \(\mathbb{R}^n\ ,\) the FEM would not have attained its popularity in science and engineering. It is the powerful notion of weak formulations of boundary- and initial-value problems that enables the summation of element contributions to integrals over the whole domain, a property that allows one to conceptualize a formulation of the problem at hand over a simple domain, the finite element, so as to approximate it locally. Then, these local subdomain approximations, as noted earlier, are summed up to determine the contributions from the full domain. This is the essence of the FEM.
The basic idea can be demonstrated by a simple model problem. Consider the Poisson problem on a two-dimensional domain which involves seeking a real-valued function \(u=u(x,y)\) such that
\[\tag{1} \left\{ \begin{array}{rll} -\Delta u = f & \;\; \text{in}\ \Omega \\ u = g & \;\; \text{on}\ \partial \Omega \end{array} \right. \]
where \(\Delta =\text{div}\left(\text{grad}\right) = \nabla^2\) is the Laplacian, \(f=f(x,y)\) is given source data, and \(\Omega\) is an open bounded domain in \(\mathbb R^2\) with boundary \(\partial \Omega\ .\) Formally, the differential equation in (1) is multiplied by a test function \(v=v(x,y)\ ,\) integrated by parts via Green's theorem, to obtain
\[ \int_\Omega (- \Delta u - f ) v\ dxdy = \int_\Omega \nabla u \cdot \nabla v\ dxdy - \int_\Omega fv\ dxdy - \oint_{\partial \Omega} \mathbf{n} \cdot \nabla u\ v\ ds = 0 \]
where \(\mathbf{n}\) is the outward unit vector normal to the boundary \(\partial\Omega\ .\) If now one demands that \(v=0\) on \(\partial\Omega\ ,\) (1) is formally equivalent to the following weak or integral form of the boundary-value problem:
Find \(u\in U\) such that
\[\tag{2} \int_\Omega \nabla u \cdot \nabla v \ dxdy = \int_{\Omega} f v\ dxdy, \qquad \forall v\in V. \]
Here \(U\) is called the space of trial functions and \(V\) the space of test functions. Conceptually, the formulation (2) is "weaker" than (1) as it imposes weaker conditions on the smoothness of the solutions \(u\) and test functions \(v\ .\) In fact
\[ U=\left\{w = w(x,y):\ w,\ \frac{\partial w}{\partial x},\ \frac{\partial w}{\partial y} \in L^2(\Omega),\ w = 0\ \text{on}\ \partial \Omega\right\} \] and, in this case, \(V=U\ .\) The space \(U\) is usually denoted
\[ U= H^1_0(\Omega) = \left\{w \in H^1(\Omega),\ w = 0\ \text{on}\ \partial \Omega\right\} \]
where \(H^1(\Omega)\) is the Sobolev space of functions on \(\Omega\) with generalized derivatives \(L^2(\Omega)\ ,\) a special case of the \(W^{m,p}(\Omega)\) of functions with derivatives of orders \(m\geq 0\) in \(L^p(\Omega)\ ,\) \(1 \leq p \leq \infty\ .\)
Problem (2) is a simple example of a broad class of linear boundary- and initial-value problems of the form
\[\tag{3} \text{Find}\ u \in U\ \text{such that} \quad B(u,v) = F(v), \qquad \forall v \in V, \]
where \(B(\cdot,\cdot)\) is a bilinear form mapping (reflexive Banach spaces) \(U\times V\) into \(\mathbb R\) and \(F\) is a continuous linear functional defined on \(V\ .\)
The standard finite element approximation of (2) amounts to the construction of Galerkin or weighted residual approximations of (1) over finite dimensional subspaces \(U^h\) and \(V^h\) of \(U\) and \(V\ ,\) respectively. That is, one seeks
\[\tag{4} u^h \in U^h\subset U\ \text{such that} \quad B(u^h,v^h) = F(v^h), \qquad \forall v^h \in V^h, \]
with \(V^h \subset V\ ,\) \(h\) being a discretization parameter generally chosen as a measure of the mesh size. The great utility of FEM's arises from its approach to constructing the approximation spaces \(U^h\) and \(V^h\) for functions defined on quite general domains \(\Omega\ .\)
Returning to the example (1) (or (2)), consider the case in which \(\Omega\) is the domain shown in Figure 1, partitioned with a mesh of triangular elements, as shown. This mesh is viewed as one generated by a sequence of maps \(F_k\) of a master triangle into the triangles so that, when connected along their edges and at their vertices, cover \(\Omega\) or produce a close approximation \(\Omega^h\) of \(\Omega\ .\) Suppose there are \(N\) elements in the mesh. Over a typical isolated triangle \(\Omega_k\ ,\) assume, for example, that the restriction \(v_k^h\) of the test function \(v^h \in V^h\) varies linearly. Then,
\[ v_k^h (x,y) = v^h|_{\Omega_k} (x,y) = a_1 + a_2 x + a_3 y, \quad (x,y) \in \overline{\Omega}_k. \]
The constants \(a_1\ ,\) \(a_2\ ,\) and \(a_3\) are determined so that \(v^h\) is completely determined by its values at the vertices of \(\Omega_k\ ,\) called the element nodes. If \(v_1\ ,\) \(v_2\ ,\) and \(v_3\) denote the values of \(V^h\) at nodes numbered 1, 2, and 3, then one easily determines that \(v^h\) can be written,
\[ v_k^h (x,y) = v_1 \psi_1^k(x,y) + v_2 \psi_2^k(x,y) + v_3 \psi_3^k(x,y), \quad k=1,2,\ldots,N, \]
where the \(\psi_i^k\) are the element shape functions,
\[ \begin{cases} \displaystyle \psi_1^k (x,y) = \frac{1}{2A_k} \left[ (x_2 y_3 - x_3 y_2) + (y_2-y_3) x + (x_3-x_2)y \right] \\ \displaystyle \psi_2^k (x,y) = \frac{1}{2A_k} \left[ (x_3 y_1 - x_1 y_3) + (y_3-y_1) x + (x_1-x_3)y \right] \\ \displaystyle \psi_3^k (x,y) = \frac{1}{2A_k} \left[ (x_1 y_2 - x_2 y_1) + (y_1-y_2) x + (x_2-x_1)y \right] \end{cases} \]
\((x_i,y_i)\) are the coordinates of node \(i\ ,\) and \(A_k\) is the area of \(\Omega_k\ .\) These linear functions have the property that
\[ \psi_i^k (x_j,y_j) = \begin{cases} 1 \quad \text{if}\ i=j \\ 0 \quad \text{if}\ i\neq j \\ \end{cases} \qquad 1 \leq i,j \leq 3. \]
All the elements are then connected together to form the full computational domain \(\Omega\) (or an approximation \(\Omega^h\) of it), and in so doing create global piecewise linear basis functions \(\phi_i\ ,\) corresponding to each node \(i\) in the connected mesh. These "pyramid" or "hat" functions result from patching the element shape functions together, as indicated in Figure 2. Now the (global) test functions are linear combinations of these basis functions: For a mesh with \(M\) interior nodes (boundary nodes are excluded since \(u=0\) on the boundary),
\[ v^h(x,y) = \sum_{i=1}^{M} v^h(x_i,y_i) \phi_i(x,y) \] \[ V^h = \text{span} \left\{ \phi_i(x,y) \right\}_{i=1}^M \]
In the finite element approximation of (2), one seeks
\[ u^h \in U^h = V^h: \quad u^h(x,y) = \sum_{i=1}^{M} u_i \phi_i(x,y) \]
that satisfies the weak problem (4) where
\[ \begin{cases} & \displaystyle B(u^h,v^h) = \int_\Omega \nabla u^h \cdot \nabla v^h\ dxdy = \sum_{i,j=1}^M v_i K_{ij} u_j\\ & \displaystyle F(v^h) = \sum_{i=1}^M f_i v_i \end{cases} \]
for arbitrary coefficients \(v_i\ .\) Here \(K_{ij}\) are the elements of the global stiffness matrix:
\[ K_{ij} = \int_\Omega \nabla \phi_j \cdot \nabla \phi_i\ dxdy \]
and \(f_{i}\) are the components of the load vector,
\[ f_{i} = \int_\Omega f \phi_i\ dxdy. \]
This leads to a system of linear equations for the coefficients \(u_j\ ,\) called the degrees of freedom:
\[\tag{5} \sum_{j=1}^M K_{ij} u_j = f_i, \qquad 1 \leq i \leq M. \]
The final solution is then,
\[ u^h(x,y) = \sum_{i=1}^M \left( \sum_{j=1}^M \left\{K^{-1}\right\}_{ij} f_j \right) \phi_i(x,y) \]
A property of fundamental importance is that this entire process can essentially be done at the local element level, and then summed up to apply to the global version of the problem. Thus, on element \(\Omega_k\ ,\) the local equalities for element \(k\) are
\[ \sum_{j=1}^3 K_{ij}^k u_j^k = f_i^k + g_i^k(u), \qquad 1 \leq i \leq 3. \]
where \(K_{ij}^k\) and \(f_i^k\) are the elements of the local stiffness matrix and of the local load vector for element \(k\ ,\) respectively, and \(g_i^k(u)\) are element interface flux terms:
\[\tag{6} \begin{cases} \displaystyle K_{ij}^k = \int_{\Omega_k} \nabla \psi_j^k \cdot \nabla \psi_i^k\ dxdy \\ \displaystyle f_{i}^k = \int_{\Omega_k} f \psi_i^k\ dxdy \\ \displaystyle g_i^k(u) = \oint_{\partial \Omega_k} \mathbf{n} \cdot \nabla u\ \psi_i^k\ ds \end{cases} \]
In the expression for \(g_i^k(u)\ ,\) \(u\) is the exact solution of the boundary-value problem (2); while it is unknown, it does not enter the discrete global problem because
\[ \sum_{k=1}^M g_i^k(u) = 0. \]
In practice, the local element matrices (6) are integrated using numerical quadrature methods when higher-order shape functions are used. In general, FEM computer programs are written so that element matrices such as those in (6) are calculated from a family of element types, and these are used to generate the global matrices of (5) when the elements are connected together to form the global element mesh. Of course, there are many other possible choices for finite elements, including for instance, quadratic approximations on triangles with 6 nodes, cubic with 10 nodes, quadrilateral elements, etc., and generalizations to tetrahedral and hexahedral elements for three-dimensional elements are commonly used.
The mathematical properties of FEM's and the fact that convergent and accurate approximations of solutions to problems defined on domains of quite general geometry are worth noting. Returning to the general weak boundary-value problem (3), one observes that \(B\left(\cdot,\cdot\right)\) is typically assumed to be continuous on \(U\times V\) into \(\mathbb{R}\) with continuity constant \(M\ ,\) and to satisfy an "inf-sup" condition
\[\tag{7} \inf_{u}\sup_{v}\frac{\left|B\left(u,v\right)\right|}{\left\|u\right\|_U\left\|v\right\|_V} \ge \gamma > 0, \]
where \(\gamma\) is a constant. Under the conditions that \(B\left(\cdot,\cdot\right)\) and \(F\left(\cdot\right)\) are continuous (on \(U\times V\) and \(V\ ,\) respectively), that \(\sup_u \left|B\left(u,v\right)\right| > 0\) for all \(v\in V\ ,\) and that (7) holds, it can be shown that (3) possesses a unique solution.
To determine the FE approximation, the domain \(\Omega\) is partitioned into \(N\) subdomains \(\Omega_k\) such that
\[ \overline{\Omega} = \text{closure of}\ \Omega = \cup_{k} \overline{\Omega}_k, \;\; \Omega_k\cap\Omega_j = \empty, \;\; k \ne j. \]
As noted earlier, each element \(\Omega_k\) is regarded as the image of a master subdomain (a master element) \(\hat{\Omega}\) under an invertible element map \(F_k:\hat{\Omega}\rightarrow\Omega_k\ ,\) \(1\le k\le N\ .\) In this way, a sequence of subspaces \(\left\{V^h\right\}\) is constructed such that
\[ V^h = \left\{v^h \in C^0(\overline{\Omega}):\left.v^h\right|_{\Omega_k} \circ F^{-1}_k = \hat{v} \in \mathbb{P}_k\right\}, \]
where \(\mathbb{P}_k\) is the space of local FEM approximations, e.g.
\[\tag{8} \mathbb{P}_k = \mathcal{P}_p (\hat{\Omega}) = \text{space of polynomials of degree}\ \le p\ \text{on}\ \hat{\Omega}, \]
or \[ \mathbb{P}_k = \mathcal{P}_p(\hat{I})\otimes \mathcal{P}_p (\hat{I}) = \text{space of tensor products of polynomials on}\ \hat{\Omega} = \hat{I} \otimes \hat{I}, \]
Then the FEM approximation of (3) reduces to the sequence of discrete problems (4). From (3) and (4), one can derive the standard "orthogonality condition"
\[ B\left(u-u^h,v^h\right) = 0, \qquad \forall \;v^h \in V^h. \]
So, in this sense, the FEM (Galerkin) approximation of (4) is optimal: the error \(e^h = u-u^h\) is "orthogonal" to the space \(V^h\) of approximations with respect to the bilinear form \(B\left(u,v\right)\ .\)
Under reasonable assumptions on the sequence of meshes producing the subspaces \(\{V^h\}\ ,\) one can approximate functions in \(W^{r,s}\left(\Omega\right)\) with interpolation error satisfying
\[ \|u-\Pi_h u \|_{r,s,\Omega_k} \le C h_k^{\min(p+1-r,t-r)} \|u\|_{W^{t,s}(\Omega)} \]
where \(p\) is the highest order of complete polynomials contained in the space of local FEM approximations (as in (8)), \(\Pi_h\) is a projection operator of \(V^h\) into \(W^{r,s}(\Omega)\) that preserves polynomials of degree \(\leq p\) for triangular elements, \(h_k = \text{diam}(\Omega_k)\ ,\) \(1 \le s \le \infty\ ,\) \(t \ge 0\ ,\) and \(C\) is a constant.
When interpolation properties such as this hold, one can typically obtain a priori error estimates for elliptic boundary-value problems of the form \[ \|u-u_h \|_{H^m (\Omega)} \le C h^{\min(p+1-m,q-m)} \|u\|_{H^q(\Omega)} \]
where now \(s=2\ ,\) \(r=m\ ,\) \(t=q\ ,\) and \(h=\max_{1\le h\le N}\left(h_k\right)\ .\) Thus, the rate of convergence of such methods in the \(H^m(\Omega)\)-norm is \(\min(p+1-m,q-m) > 0\ .\) The constant \(C\) is independent of \(u\) and \(h\) but depends upon the continuity constant \(M\) of \(B(\cdot,\cdot)\) and the constant \(\gamma\) of (7).
Many variations of such results can be obtained. Among them are cases in which the local spectral order \(p=p_k\) is an approximation parameter. These characterize the "\(p\)-versions" of the FEM. Likewise, both \(h_k\) and \(p_k\) can be varied, resulting in \(h\)-\(p\) FEMs. The "\(h\)-versions" of the FEM commonly refers to the case where \(p\) is uniform over all elements.
The field of FEM's, both as a tool for solving problems of science and engineering and as an area of research in numerical mathematics, is large and expanding and continues to be a vital area of study and practical application.