In almost all object-oriented languages (which includes C + +), there exists the concepts of class attributes and object attributes. Class attributes are those attributes shared by all object instances (both immediate and derived objects) of a particular class definition, and object attributes (sometimes called data members) are those attributes whose values vary from object to object and hence help to characterize (or make unique) a particular object. In C + +, object attributes are specified a header file containing class declarations; within a class declaration, attributes are grouped by their accessibility: public attributes, protected attributes and private attributes. A detailed discussion of the nuances of these categories are beyond the scope of this guide; we refer the interested reader to the following books for further details: [66, 57]. For our purposes, the main thing to appreciate is that categories dictate access patters within the inheritance hierarchy and to the “outside” world (i.e. access from outside the object). We have summarized the relationships between the categories and their accessibility in Tables 5.1, 5.2 and 5.3 1 .
Within the StdRegions directory of the library, there exists a class inheritance hierarchy designed to try to encourage re-use of core algorithms (while simultaneously trying to minimize duplication of code). We present this class hierarchy in Figure 5.5.
As is seen in Figure 5.5, the StdRegions hierarchy consists of three levels: the base level from which all StdRegion objects are derived is StdExpansion. This object is then specialized by dimension, yielding StdExpansion0D, StdExpansion1D, StdExpansion2D and StdExpansion3D. The dimension-specific objects are then specialized based upon shape.
The object attributes (variables) at various levels of the hierarchy can be understood in light of Figure 5.6. At its core, an expansion is a means of representing a function over a canonically-defined region of space evaluated at a collection of point positions. The various data members hold information to allow all these basic building blocks to be specified.
The various private, protected and public data members contained within StdRegions are provided in the subsequent sections.
Private: There are private methods but no private data members within StdExpansion.
Array of Basis Shared Pointers: m_base
Integer element id: m_elmt_id
Integer total number of coefficients in the expansion: m_ncoeffs
Matrix Manager: m_stdMatrixManager
Matrix Manager: m_stdStaticCondMatrixManager
IndexKeyMap Matrix Manager: m_IndexMapManager
Public: There are public methods but no public data members within StdExpansion.
Private: There are private methods but no private data members within StdExpansion$D.
0D and 1D: std::map<int, NormalVector> m_vertexNormals
2D: Currently does not have any data structure. It should probably have m_edgeNormals
3D: std::map<int, NormalVector> m_faceNormals
3D: std::map<int, bool> m_negatedNormals
Public: There are public methods but no public data members within StdExpansion$D.
Basis functions are stored in a 1D array indexed by both mode and quadrature point. The fast index runs over quadrature points while the slow index runs over modes. This was done to match the memory access pattern of the inner product, which is the most frequently computed kernel for most solvers.
Bases are built from the tensor product of three different equation types (subsequently called Type 1, Type II and Type III respectively):
(5.1) |
Here, P is the polynomial order of the basis and Ppα,β are the pth order jacobi polynomial.
Before going further it is worth reviewing the spatial shape of each node. The term is an increasing function which is equal to zero at z = -1 and equal to one at z = 1. Similarly, is a decreasing function which is equal to one at z = -1 and equal to zero at z = +1. These two functions are thus non-zero at one of each of the boundaries. If we need to maintain C0 continuity with surrounding elements (as we do in the continuous Galerkin method), then these local modes must be assembled together with the correct local modes in adjacent elements to create a continuous, global mode. For instance in the left element would be continuous with in the right element. The union of these two modes under assembly form a single “hat” function. By contrast, functions of the form
are zero at both end points z = ±1. As a result, they are trivially continuous with any other function which is also equal to zero on the boundary. These “bubble” functions may be treated entirely locally and thus are used to construct the interior modes of a basis. Only bases with p > 1 have interior modes.
All of this holds separately in one dimension. Higher dimensional bases are constructed via the tensor product of 1D basis functions. As a result, we end up with a greater number of possibilities in terms of continuity. When the tensor product is taken between two bubble functions from different bases, the result is still a bubble function. When the tensor product is taken between a hat function and a bubble function we get “edge” modes (in both 2D and 3D). These are non-zero along one edge of the standard domain and zero along the remaining edges. When the tensor product is taken between two hat functions, they form a “vertex” mode which reaches its maximum at one of the vertices and is non-zero on two edges. The 3D bases are constructed similarly.
Based upon this convention, the 1D basis set consists of vertex modes and bubble modes. The 2D basis function set consists of vertex modes, edge modes and bubble modes. The 3D basis set contains vertex modes, edge modes, face modes and bubble modes.
Quads have the simplest memory organization. The quadrilateral basis is composed of the tensor product of two Type I functions ϕp(ξ0,i)ϕq(ξ1,j). This function would then be indexed as
1basis0[p*nq0 + i] * basis1[q*nq1 + j]
where nq<b> is the number of quadrature points for the bth basis. Unlike certain mode orderings (e.g. Karniadakis and Sherwin [48]), the two hat functions are accessed as the first and second modes in memory with interior modes placed afterward. Thus,
1basis[i], basis[nq + i]
correspond to and respectively.
Due to the use of collapsed coordinates, triangular element bases are formed via the tensor product of one basis function of Type I, and one of Type II, i.e. ϕp(η0,iϕpq(η1,j)). Since ϕp is also a Type I function, its memory ordering is identical to that used for quads. The second function is complicated by the mixing of ξ0 and ξ1 in the construction of η1.
In particular, this means that the basis function has two modal indices, p and q. While p can run all the way to P, The number of q modes depends on the value of the p index q index such that 0 ≤ q < P - p. Thus, for p = 0, the q index can run all the way up to P. When p = 1, it runs up to P - 1 and so on. Memory is laid out in the same way starting with p = 0. To access all values in order, we write
1mode = 0 2for p in P: 3 for q in P - p: 4 out[mode*nq + q] = basis0[p*nq]*basis1[mode*nq + q] 5 mode += P-p
Notice the use of the extra “mode” variable. Since the maximum value of q changes with p, basis1 is not simply a linearized matrix and instead has a triangular structure which necessitates keeping track of our current memory location.
The collapsed coordinate system introduces one extra subtlety. The mode
represents the top right vertex in the standard basis. However, when we move to the standard element basis, we are dealing with a triangle which only has three vertices. During the transformation, the top right vertex collapses into the top left vertex. If we naively construct an operators by iterating through all of our modes, the contribution from this vertex to mode Φ01 will not be included. To deal with this, we add its contribution as a correction when computing a kernel. The correction is Φ01 = ϕ0ϕ01 + ϕ1ϕ10 for a triangle.
The hexahedral element does not differ much from the quadrilateral as it is the simply the product of three Type I functions.
Cross sections of a triangular prism yield either a quad or a triangle based chosen direction. The basis, therefore, looks like a combination of the two different 2D geometries.
Taking ϕpϕpr on its own produces a triangular face while taking ϕpϕq on its own produces a quadrilateral face. When the three basis functions are combined into a single array (as in the inner product kernel), modes are accessed in the order p,q,r with r being the fastest index and p the slowest. The access pattern for the prism thus looks like
1mode_pqr = 0 2mode_pr = 0 3for p in P: 4 for q in Q: 5 for r in P - p: 6 out[mode_pqr*nq + r] = basis0[p*nq]*basis1[q*nq]*basis2[mode_pr + r] 7 mode_pqr += P - p 8 mode_pr += P - p
As with the triangle, we have to deal with complications due to collapsed coordinates. This time, the singular vertex from the triangle occurs along an entire edge of the prism. Our correction must be added to a collection of modes indexed by q
The tetrahedral element is the most complicated of the element constructions. It cannot simply be formed as the composition of multiple triangles since η2 is constructed by mixing three coordinate directions. We thus need to introduce our first Type III function.
The r index is constrained by both p and q indices. It runs from P - p - q to 1 in a similar manner to the Type II function. Our typical access pattern is thus
1mode_pqr = 0 2mode_pq = 0 3for p in P: 4 for q in P - p: 5 for r in P - p - r: 6 out[mode_pqr*nq + r] = basis0[p*nq]*basis1[mode_pq + q]*basis2[mode_pqr + r] 7 mode_pqr += (P - p - r) 8 mode_pq += (P - p)
The tetrahedral element also has to add a correction due to collapsed coordinates. Similar to the prism, the correction must be applied to an entire edge indexed by r
Like the tetrahedral element, a pyramid contains a collapsed coordinate direction which mixes three standard coordinates from the standard region. Unlike the tetrahedra, the collapse only occurs along one axis. Thus it is constructed from two Type I functions and one Type III function
The product ϕpϕq looks like the a quad construction which reflects the quad which serves as the base of the pyramid. A typical memory access looks like
1mode_pqr = 0 2for p in P: 3 for q in P - p: 4 for r in P - p - r: 5 out[mode_pqr*nq + r] = basis0[p*nq]*basis1[q*nq]*basis2[mode_pqr*nq + r] 6 mode_pqr += (P - p - r)