Transformations for Piola-mapped elements

The Arnold-Winther element successfully discretizes the Hellinger-Reissner variational formulation of linear elasticity; its development was one of the key early breakthroughs of the finite element exterior calculus. Despite its great utility, it is not available in standard finite element software, because its degrees of freedom are not preserved under the standard Piola push-forward. In this work we apply the novel transformation theory recently developed by Kirby [SMAI-JCM, 4:197-224, 2018] to devise the correct map for transforming the basis on a reference cell to a generic physical triangle. This enables the use of the Arnold-Winther elements, both conforming and nonconforming, in the widely-used Firedrake finite element software, composing with its advanced symbolic code generation and geometric multigrid functionality. Similar results also enable the correct transformation of the Mardal-Tai-Winther element for incompressible fluid flow. We present numerical results for both elements, verifying the correctness of our theory.


Introduction
The use of a reference element is critical in the evaluation of finite element basis functions and their derivatives.One constructs, by some means, the basis functions on a particular fixed element and obtains the basis functions on an arbitrary cell K through a mapping.For scalar-valued function spaces, this is frequently a simple pullback (change of coordinates).However, vector-valued elements discretizing H(div) or H(curl), as well as their tensor-valued counterparts, typically use a different mapping in order to facilitate enforcing appropriate continuity of only normal or tangential components.
The reference element paradigm is typically employed for elements that satisfy a kind of equivalence, where the reference element basis functions map directly to the physical element basis functions under the coordinate change.However, for many elements (both classical and modern), this is not the case, as the degrees of freedom (DOFs) are 'mixed up' by the pullback in a cell-dependent manner.In [51], Kirby developed a general theory for obtaining the proper basis in the case of affinely mapped cells.This approach gives the correct nodal basis as a linear combination of the mapped reference element basis functions.This linear combination turns out to be sparse, meaning that applying the theory incurs only a small additional cost during the finite element computation, ensuring that the use of these more exotic elements composes neatly with existing high-level automatic code generation software.Kirby and Mitchell [53] generalized the Firedrake [72] code stack to generate and employ F. R. A. Aznaran, P. E. Farrell, & R. C. Kirby this transformation, giving results for Morley, Hermite, Argyris, and Bell elements.The basis on the reference cell for each of these elements is constructed using FIAT [50].
In this paper, we extend the transformation theory presented in [51] to Piola-mapped H(div) elements, both vector-and tensor-valued.Specifically, we focus on the Mardal-Tai-Winther [61] and Arnold-Winther [11,12] elements.The vector-valued Mardal-Tai-Winther element provides a discretization of H(div) which is nonconforming in H 1 , and works robustly for the scale of operators that interpolate between Stokes and Darcy flow.Thanks to its discrete Korn inequality, it serves as a locking-free elasticity element [62].It also provides a low-order divergence-free discretization of the incompressible Stokes and Navier-Stokes equations [47].It has been implemented for a small number of standalone numerical experiments [27], and implemented partially in Diffpack [57,Ch. 4] and SyFi [60,Ch. 15] (a symbolic alternative to FIAT) and envisioned as part of FIAT [60,Ch. 3].These latter implementations were not fully usable on arbitrary unstructured meshes, due to the lack of transformation theory which we now provide in this work.
The tensor-valued Arnold-Winther elements are used to discretize the Cauchy stress tensor in the stress-displacement formulation of planar linear elasticity.After decades of effort by many researchers, these elements solved a fundamental challenge of numerical elasticity in providing a stable and convergent discretization of the Hellinger-Reissner principle, which moreover enforces symmetry of the stress tensor exactly.Despite being conceived almost two decades ago, the prior implementations of these Arnold-Winther elements have either required the explicit element-by-element construction of the basis [25], or again, have been for one-off numerical experiments, such as for stress reconstruction or the design of error indicators [36,63,66,74].Again, FIAT has had partial support for this element, but the lack of reference mappings has prevented its use in practice [60,Ch. 3].
We begin in Section 2 with notation used to define geometry and Piola mappings, and summarize some results on how normal and tangential components are transformed.In Section 3, we survey the applications which motivate the use of the Mardal-Tai-Winther element, before describing several other vector-valued elements of interest; analogously in Section 4, we briefly review the role of the Cauchy stress tensor in linear elasticity and its approximation with finite elements, before defining the Arnold-Winther stress elements.We survey the transformation theory of Kirby [51] and show how it applies in the context of Piola elements in Section 5, which is the main contribution of the paper.A discussion of the application of the Arnold-Winther elements to linear elasticity follows in Section 6; in particular, we prove convergence of Nitsche's method for the imposition of traction boundary conditions.Section 7 offers an interpretation of Piola transformations in terms of the finite element exterior calculus (FEEC), and reviews the role of FEEC in the development of multigrid preconditioners, as discussed in Section 8, in which we apply patch-based multigrid algorithms to precondition the canonical Stokes-Darcy and Hellinger-Reissner systems for the Mardal-Tai-Winther and Arnold-Winther elements respectively.We briefly discuss the implementation within the Firedrake code stack and present some numerical examples for our newly-enabled elements in Section 9.

Notation and preliminaries
Let M = R d×d denote the space of d × d matrices, and S = R d×d sym its symmetric subspace.We shall employ the standard notation for the Sobolev space H k (Ω; X) (or L 2 (Ω; X) when k = 0), with domain Ω ⊂ R d and codomain X, and associated norm • k,Ω or simply • k , as well as the standard spaces H(div), H(div; M), H(div; S), H(rot), H(rot; S), H(curl), and H(curl; S), where the divergence, rotation, or curl of a tensor field is defined row-wise.Differential operators with subscript h, such as ∇ h , are meant element-wise.The symbol denotes inequality up to a constant which may depend on mesh regularity but not mesh spacing h.

PIOLA-MAPPED ELEMENTS
Let K ⊂ R 2 be a reference triangle with vertices {x i } 3 i=1 and let K be a typical element with vertices {x i } 3 i=1 .We assume the diffeomorphism F : K → K to be affine, as shown in Fig. 1.We number the edges of a triangle by the convention in [75] as follows.Edge i of any triangle excludes the vertex i and is oriented from the lower-numbered vertex to the higher-numbered one.Each edge e is a column vector running between two vertices of a triangle, and is also used to denote the edge as a set.Its magnitude is the edge length, and we define the unit tangent to edge e as t = e e .To each edge, we associate its normal n = Rt, where is clockwise rotation.Quantities and differential operators with are defined analogously for the reference element.Note that the normal to a triangle may point inward or outward according to this convention, and we allow F to reverse orientation, but triangles sharing an edge will automatically agree on the direction (so that local and global direction coincide).This greatly simplifies the required logic for assembly of finite element DOFs involving normal components.This convention can be extended to simplices of any dimension [10,75].
Given any function φ defined on K, we define its pullback by 2) The pullback can be used to evaluate φ at a point x = F (x) by For scalar-valued spaces, the pullback provides an isomorphism between H m ( K) and H m (K) for m ∈ Z ≥0 ; its use is very natural in the context of affine equivalent finite elements such as Lagrange or Crouzeix-Raviart.One can define its basis functions on a fixed reference element K and obtain the basis functions on any K by pullback.This can be adapted, with some technicalities, when affine equivalence fails [51].
Affine mapping from a reference cell K to a typical cell K.When working with finite elements discretizing Sobolev spaces such as H(div), it is frequently helpful to work with the contravariant Piola mapping.Piecewise polynomial vector fields will lie in H(div) if and only if they have continuous normal components across interfaces [17,Prop. 2.1.2];applying this row-wise gives an analogous statement for H(div; M) and H(div; S).Hence, finite elements for H(div)-based spaces tend to include normal components as degrees of freedom, and these will be preserved by using the appropriate mapping.Definition 2.1.Let K, K ⊂ R d , let F : K → K be a diffeomorphism, and let J(x) = ∇F (x) be its Jacobian.The contravariant Piola map takes Φ : K → R d to Φ : K → R d defined by The (double) contravariant Piola map takes τ : K → S to τ : K → S by The Piola transforms F div : H(div, K) → H(div, K), F div,div : H(div, K; S) → H(div, K; S) are welldefined and isomorphisms.Analogous considerations hold for H(rot), H(rot; S), H(curl), H(curl; S), tangential traces, and the covariant Piola mappings: Definition 2.2.The covariant Piola map takes Φ : (2.6) The (double) covariant Piola map takes τ : K → S to τ : K → S by Unlike in the usual definition of the Piola map, the absolute value of the determinant is not taken in (2.4); this is explained in [75], which gives a thorough exposition of transforming and assembling vector finite elements in the case that the spaces and degrees of freedom are exactly preserved under the Piola mappings.The central contribution of this paper is to adapt the techniques of [51] to handle H(div) elements that are not exactly preserved under contravariant Piola mapping.This occurs, for example, when elements use degrees of freedom involving edge tangents or vertex values in addition to edge/face normals and internal ones.This enables us to develop general and robust software implementations of these elements.

Transformation of components and their moments
Now, we review the relationship between the reference and physical element normals and tangents when F is affine.Suppose edge e = x i − x j connects vertices i and j of the physical element.Since ) and equivalently, e t = ê J t.
(2.9) In a similar way, we use the identity 1  det J RJR T = J −T for all 2 × 2 matrices to obtain e n = Re = RJ ê = (det J) J −T Rê = (det J) J −T ê n.
(2.10) In other words, (2.9) shows that the Jacobian maps reference tangents onto physical ones (up to scale factor), but a different mapping given in (2.10) is required to obtain the physical normal from the reference one.
These calculations directly inform how the Piola mapping works.Let us consider some Φ defined over K and its image under the Piola map Φ = F div ( Φ).At each point x on edge e ⊂ ∂K the Piola mapping preserves the normal component scaled by edge length, for we have .11)This calculation also specifies how integral moments of normal components against some function µ(x) = μ(F −1 (x)) along edges behave under Piola mapping.Changing variables from e to ê introduces a Jacobian factor of e / ê , so that On the other hand, the Piola map fails to preserve the tangential direction: It is useful to further resolve J T J t and hence e Φ(x)•t in terms of the reference normal and tangential components.We define Ĝ = n t , (2.14) which is an orthogonal matrix.Using I = Ĝ ĜT gives where Now, with α = α det J and β = β det J , we have e Φ(x) (2.17) So, tangential moments are not preserved under the Piola map, but we have (2.18) These properties show that vector-valued elements with boundary degrees of freedom involving only normal components can be directly mapped by the Piola transform.Such elements include the Raviart-Thomas [73], Brezzi-Douglas-Marini [20], and Brezzi-Douglas-Fortin-Marini [19] elements.On the other hand, the Mardal-Tai-Winther element includes tangential as well as normal boundary degrees of freedom, and we can expect complications in mapping them.
We can also use the calculations (2.9) and (2.10) to determine what happens to the various components of a tensor under the Piola map.For some symmetric tensor-valued function τ : K → S and τ = F div,div (τ ), the double-normal component is preserved in an analogous sense to the vector case: On the other hand, the normal/tangential and double tangential components are not preserved, as direct calculations verify: and The tangential-normal component e 2 t T τ (x)n follows from (2.20) by symmetry.Moreover, we can keep Ĝ as in (2.14) and further resolve these in terms of the reference normal/tangential directions by and e 2 t T τ (x)t = ê 2 α 2 nT τ (x)n + 2αβ nT τ (x) t + β 2 tT τ (x) t . (2.23) Remark 2.3.The tangential and tangential-tangential traces are well-defined at least on polynomial subspaces, and are included here for completeness.
Like in the vector-valued case, we can also show how moments transform.Suppose τ ∈ L 2 ( K; S) with τ = F div,div (τ ) and μ ∈ L 2 (ê).Integrating both sides of (2.19) and accounting for the Jacobian ds = e ê dŝ gives e It follows that tensor-valued elements with boundary DOFs involving only normal-normal components can be directly mapped by the Piola transform; this is for example true of the Tangential-Displacement Normal-Normal Stress element [77] and the Hellan-Herrmann-Johnson element [38,39,48].Since the Arnold-Winther elements also have boundary DOFs of other kinds, application of the Piola map alone will not result in the correct dual basis on each physical cell.
Remark 2.4.We briefly comment on the covariant case for H(curl) elements.As is well-known, in 2D, H(rot) and H(div) are isomorphic by simple rotation, although the spaces are quite different in 3D; indeed, in dimensions strictly greater than 2, the analogous identification is always impossible for the vector-valued spaces [59,Prop. 4.1].It follows that the transformation theory for H(rot) and H(curl) elements, at least in 2D, may be similar mutatis mutandis to what we present in this paper.

Motivations
The Mardal-Tai-Winther (MTW) element [61] provides a uniformly robust element for both H(div) and (nonconforming) H 1 .It is a single discretization that is stable and accurate for both Stokes and Darcy flow as well as the scale of operators interpolating between them, including systems arising from transient Stokes-like flow with small timesteps.Due to its lack of vertex degrees of freedom, it is also appropriate for the interface conditions arising in coupled Stokes-Darcy flow [49].In the context of elasticity, it admits a discrete Korn inequality [62] and serves as a locking-free element for primal linear elasticity [55], and for Biot's consolidation model in poroelasticity [58].
It is especially significant for the purposes of incompressible flow that the MTW space V h may be paired with the space Q h of piecewise constants, so that the resulting velocity-pressure pair V h × Q h satisfies div V h ⊂ Q h , a highly desirable property since it implies mass conservation (i.e.exactly divergence-free velocities) and pressure robustness [47].
The MTW element offers a nonconforming discretization of the two-dimensional Stokes complex, a smoother variant of the classical L 2 de Rham complex [79]: While it does not form a true subcomplex due to its nonconforming nature, this relates the MTW element to elements for other spaces and gives stability and convergence theory for mixed problems; indeed, the original motivation for the construction of the MTW space was that its divergence-free subspace is exactly the curl of a related H 2 -nonconforming element, studied in [67] and which we denote by W h : Both sequences (3.1), (3.2) are exact whenever Ω is simply connected.Additionally, this perspective guides the construction of robust smoothers for optimal-order multilevel algorithms [31], as we shall discuss in Sections 7 and 8.

The Mardal-Tai-Winther element
In this Section, we define several other vector-valued H(div) elements, to discuss their transformations: the Raviart-Thomas [73], Brezzi-Douglas-Marini [20], as well as the Mardal-Tai-Winther [61] elements.Schematics of these elements are shown in Figure 2. Raviart-Thomas elements (the original H(div) element) are based on the function space 3) This is somehow the smallest space of polynomials over K such that the divergence maps exactly onto P m (K).The edge degrees of freedom are taken as moments of the normal components up to degree m.The Brezzi-Douglas-Marini elements [20] and Brezzi-Douglas-Fortin-Marini elements [19] have larger spaces than RT, have similar edge degrees of freedom, and so transform in a very similar way.All these elements form equivalent families under the contravariant Piola map.
We let {µ i } m i=0 be some hierarchically ordered basis for polynomials of degree m on the unit interval (e.g.monomials or Legendre polynomials).(The hierarchical property will not be important for the Raviart-Thomas element, but we shall use it for the Mardal-Tai-Winther space.)We will assume that µ 0 = 1.Let µ k i be the mapping of µ i from the unit interval to edge e k of triangle K by standard pullback.
The Raviart-Thomas degrees of freedom include normal moments on each edge: for each edge e k .Now let { φi } m(m+1)/2 i=1 be some hierarchically ordered basis for P m−1 ( K) (e.g. the bivariate monomials or Dubiner basis) and φ i = φi • F −1 its pullback.We then let Φ k,i ∈ L 2 (K; R 2 ) be the vector-valued function with φ i in component k and 0 in the rest.Defining this, the rest of the degrees of freedom are Then, taking the union of all of these gives the nodes for the Raviart-Thomas element.If we construct the nodal basis { Ψi } for RT m ( K) or BDM m ( K) on a reference element, then a suitable basis for RT m or BDM m on any K is obtained by applying F div to each member of the space.This follows directly from the preservation of normal sense (2.10).
In this context, the Mardal-Tai-Winther element [61] is an example of an H(div) element that includes both normal and tangential degrees of freedom and so requires a more complex reference mapping.On each cell K, the MTW space is ) which includes all linear polynomials (and some higher-order terms) and has dimension exactly nine.Six degrees of freedom are the constant and linear moments of the normal components of each edge, as in (3.4).Instead of internal moments like (3.5) for Raviart-Thomas, the remaining three are the constant tangential moment on each edge e k : This element is somewhat more expensive than H(div) elements with comparable accuracy.Like BDM 1 , it pairs with piecewise constant pressures but has 50% more degrees of freedom (three per edge rather than two).It gives a second order approximation with only one more degree of freedom than RT 2 , and unlike RT 2 serves also as a H 1 -nonconforming element, but RT 2 pairs with piecewise linear pressures.

Tensor-valued elements
4.1.Elastic stress and the numerical enforcement of symmetry For practical applications of linear elasticity, one is often interested in computing the Cauchy stress tensor with at least as much accuracy as the displacement, and hence, in mixed stress-displacement formulations in which the stress is computed directly rather than, for example, via numerical differentiation after the fact.The mixed dual stress-displacement formulation achieves this, and offers a way to alleviate the well-known numerical phenomenon of volumetric locking in the incompressible limit.
Let Ω ⊂ R d be a polygonal elastic body; we focus on the planar case d = 2. Given Lamé parameters λ, µ > 0, the compliance tensor A : S → S is, in the homogeneous isotropic case, defined as Let Γ D , Γ N partition ∂Ω, and let and traction data g ∈ H −1/2 (Γ N ; R d ), we consider the Hellinger-Reissner system, which seeks a stress-displacement pair (σ, u) ∈ H(div; S) × L 2 (Ω; R d ) satisfying σn = g on Γ N in the trace sense, (4.2) and which are critical points of the Hellinger-Reissner functional A stationary point (σ, u) satisfies i.e. the saddle point system ) where ε denotes the symmetric gradient (or linearized strain).Here, the constitutive relation (4.5a) is associated with the minimization among stress fields of an appropriate complementary energy, and is formally equivalent to the standard "Hooke's law" σ = Cε(u) = 2µε(u) + λ(trε(u))I (where C = A −1 denotes the elasticity tensor), which in primal formulations is typically used to define (and thus eliminate) the stress.However, only the former stress-strain relation (4.5a) remains valid in the incompressible limit λ µ.That the Cauchy stress field σ is symmetric is equivalent to the conservation of angular momentum, and is highly desirable but notoriously difficult to preserve in finite element discretizations; a further challenge arises from the requirement of H(div; M)-conformity, which ensures that the traction forces on a mesh face shared between two elements are in equilibrium.Before defining the H(div; S)discretizing, exactly symmetric Arnold-Winther (AW) elements, we briefly review other approaches to the numerical enforcement of symmetry; for further references, see [17,.
Conventionally, symmetry is enforced weakly via L 2 (Ω; M)-orthogonality to a skew-symmetric subspace; this approach encapsulates the PEERS element [6], and the weakly symmetric successors to the AW elements [7].Weak imposition of symmetry also fits naturally into the FOSLS (first order system least-squares) method by penalization of asymmetry [22,65].
A very significant contribution are the TDNNS elements due to Schöberl and Pechstein (née Sinwel) [77]; the stress space consists of the symmetric tensor-valued polynomials of degree k ≥ 1 whose normal-normal component is continuous across each edge, paired with a displacement space of the same degree whose tangential component is continuous; schematics are provided in Figure 3. Another exactly symmetric element is the Hellan-Herrmann-Johnson (HHJ) element, consisting of symmetric matrix-valued polynomials of degree k ≥ 0 with continuity of the normal-normal components (that is, n T τ n is continuous across edges).A schematic is provided in Figure 4.  [38,39,48].The TDNNS elements are implemented in Netgen/NGSolve, while the HHJ element was introduced into FEniCS in [37] and later into Firedrake; both implementations were possible without the theory we develop in this paper since both have boundary degrees of freedom involving only the normal-normal components and hence, in light of (2.19), can be Piola-mapped in a standard way.

The Arnold-Winther elements
Arnold and Winther proposed two exactly symmetric elements discretizing H(div; S), one conforming [11] and the other nonconforming [12].We first consider the conforming element.In the lowest-order case, the space is the symmetric matrix-valued quadratic polynomials, augmented by solenoidal cubics: This space has dimension 24, and the degrees of freedom are: • the values of each component of τ at each vertex of K, • the moments of degree 0 and 1 of the normal-normal and normal-tangential components of τ on each edge, • the constant moment of each component of τ over K.
The associated local displacement space is V h (K) = P 1 (K; R 2 ) which is 6-dimensional, with DOFs given by, for example, the values of the 2 components at 3 non-collinear points interior to K. Element diagrams are provided in Figure 5.
Figure 5.The conforming Arnold-Winther element [11], and the discontinuous Lagrange element for displacement with which it is paired.Thick arrows refer to both components of a tensor in a given direction (e.g.τ n or τ t).A clutch of circles either indicates internal moments of each unique component or evaluation of each component at a point.
To introduce notation for the stress DOFs, we let denote evaluation of a particular component of the input τ at one of the vertices of triangle K. Similar to the nodes for vector-valued elements, we define the boundary moments by where s ∈ {n, t}; for the HHJ element, we thus have nodes { n,n,i,k } 1 i=0 for each edge e k of the triangle.For the interior nodes, denoting by {φ i } i a hierarchically ordered basis over K with φ 0 = 1, we let τ n,i,j denote the tensor with φ n in the i, j entry and 0 elsewhere -a natural injection into the space of tensor-valued orthogonal polynomials.We let Thus, DOFs for the AW c space are given in full by The nonconforming stress element introduced in [12] avoids the somewhat unusual feature of vertex degrees of freedom and gives a cheaper (though slightly less accurate) method based on the space This space, consisting of symmetric tensors of quadratic polynomials subject to a degree reduction in the normal-normal component on each edge, is 15-dimensional, and is determined by • the constant and linear moments of both the normal-normal and normal-tangential components on each edge, • the constant moments of the three unique components on the interior.
It is again paired with the displacement space P 1 (K; R 2 ); element diagrams are provided in Figure 6.

PIOLA-MAPPED ELEMENTS
Figure 6.The nonconforming Arnold-Winther mixed element [12].The Arnold-Winther elements were one of the first products of the then embryonic finite element exterior calculus [9].Their novelty is that they form an exact subcomplex of the stress elasticity complex in 2D, with commuting cochain projections: Here, if (Σ h , V h ) are either the conforming or nonconforming AW elements, Q h is either the Argyris space (another element requiring nonstandard transformations) or a certain nonconforming approximation of H 2 (Ω) respectively, airy denotes the Airy stress operator which we will define in (7.12),I h , Π h are appropriate densely defined interpolants, and P h is the L 2 -projection, then it is proved in [11,12] (see also [80]) that the above diagram is commuting, with exact rows whenever Ω is simply connected.
It is these structure-preserving properties which are used to prove unisolvency of dual bases, prove error estimates, and clarify links between their elements and H 2 -(non)conforming elements.Indeed, under the unifying framework, the elasticity system (4.5) is the mixed Hodge Laplacian boundary value problem for a specific choice of Hilbert complex -more precisely, a segment of the stress complex such that the base space L 2 (Ω; S) is endowed with the energy norm induced by the compliance tensor A [3, p. 107].Remark 4.1.Many methods proposed after the seminal papers of Arnold and Winther [11,12] were, in essence, attempts to construct improved versions of the Arnold-Winther elements.Among these, perhaps the most efficient are the Hu-Zhang elements [43,45,46], which strongly and stably enforce symmetry in any dimension; in the 2D conforming case, they have 18 local DOFs in comparison to AW c 's 24.The same authors recently provided a unified analysis of methods which strongly impose symmetry [42], using a generalized 4-field formulation of which both the Hu-Zhang and AW elements, among many others, are special cases.These elements also admit an interpretation in terms of the FEEC [26].The Hu-Zhang elements employ tangential DOFs, and the Piola transformation theory which they therefore require would be very similar to that for the AW elements which we present here.

Transforming Piola-inequivalent elements
5.1.Reviewing the transformation theory Equations (2.13), (2.18), (2.25), and (2.26) indicate that finite elements involving tangential degrees of freedom will not map nicely under the Piola transform -the reference element nodal basis will not map to the physical element one.In [51], Kirby developed a theory to address the simpler but analogous situation with standard pullback.The broad strokes of the theory go through essentially unchanged if one replaces the regular pullback F − * ( φ) = φ • F −1 with a Piola pullback; for self-containment, we summarize the key ideas here.We let Φ be a (column) vector of elements of a function space V over K, taking values in some space such as R d or S, with dim V = n.Let F : K → K be the mapping shown in Figure 1 and F. R. A. Aznaran, P. E. Farrell, & R. C. Kirby F * : V → V some pullback operator (e.g. one of the Piola maps) taking V into a function space V defined over K.We define F * ( Φ) to be the componentwise application of the pullback: (5.1) Then, if we take Ψ to contain the nodal basis functions of V on the reference element and Ψ to contain the nodal basis of V defined on K, an immediate consequence of equations such as (2.18) or (2.25) is that 2) and note that equality still fails to hold up to permutation or scaling.If the pullback operator preserves the function space, providing an isomorphism between the instantiations V and V , then F * ( Ψ) will be a basis -just not the nodal one.Consequently, there must exist a (cell-dependent) matrix M such that Ψ = M F * ( Ψ).
(5.3)In this case, one can simply compute the pullback of the reference basis and multiply by M .If M is quite sparse, this is a small additional cost in the finite element computation.With some effort, the theory can be extended to handle the situation when the pullback does not send V onto V , but none of our examples in the present work require this.
However, as we see from subsection 2.2, it is natural to consider the effect of the pullback on the DOFs, rather than directly on the nodal basis.To consider instead how the dual spaces are transformed, denote by N = { i } i and N = { ˆ i } i (row) vectors of functionals on V and V respectively, and define the dual or push-forward (5.4) Now identifying an n-vector Φ ∈ V n with its componentwise image Φ * * ∈ (V * * ) n under the canonical embedding into the bidual, it induces an evaluation operator on vectors of dual functionals via the 'outer product' N Φ := Φ * * ⊗ N , where (Φ * * ⊗ N ) i,j := Φ * * j ( i ) = i (Φ j ).
(5.5)This allows us to extend the push-forward (5.4) to vectors of functionals componentwise either in a manner analogous to (5.1), or equivalently as (5.6) Choosing N, N to be the DOFs, then the Kronecker property by which they are characterized is then simply expressed as Ψ * * ⊗ N = I = Ψ * * ⊗ N .
(5.7)Moreover, this clarifies the transformation of the primal basis: expanding the desired physical basis in terms of the mapped reference basis, we have so it follows from the Kronecker property that QΘ = I, (5.9) where (5.10) can be interpreted as a generalized Vandermonde matrix and has the vector of coefficients for Ψ j with respect to {F * ( Ψk )} k in column j.Now by (5.2) and unisolvency, we have .13) so in particular, F * (N ) = N , (5.14) but as before, there must be an invertible P with N = P F * (N ).
(5.15)An important result of [51,Theorem 3.1] is that M = P T (5.16) (as matrices of numbers).That is, it is sufficient to relate the physical nodes and their push-forwards.As we will see in our examples, it can be more natural to find P −1 and then invert it by hand.

Piola-mapped vector elements
The general approach of the transformation theory developed in [51] is to build the nodal basis on K out of a linear combination of the mapped reference element basis functions.As explained in the previous subsection, for us the transformation of basis functions works by duality, considering instead the push-forward of nodes, which follow readily from the calculations (2.12) and (2.18).
We will use block notation in defining our vectors of degrees of freedom.Let (5.17) contain the three MTW degrees of freedom associated with edge e k , and then N is block vector with three parts containing the nodes for each edge: (5.18) Similarly, we let N k and N contain the corresponding nodes on the reference element.Now, we need to compute the Piola push-forward F * of these nodes in terms of the reference element nodes, which effectively builds P −1 for the transformation theory.Let Φ be defined over K, 1 ≤ k ≤ 3, and i = 0 or 1.Then (2.12) gives So, the normal moment nodes are in fact pushed forward to the corresponding reference element normal moment nodes.For tangential moments, (2.18) can be rewritten as where we have labeled the α, β with a superscript k indicating they correspond to edge k.We define and this gives (5.22) Using this calculation for each edge, we have and the block matrix in this last equation is and hence obtain P by (5.25)

Piola-mapped tensor elements
Applying the abstract transformation theory to the Arnold-Winther stress elements introduced above follows a similar pattern, using the results in Section 2. We begin with the nonconforming element, and subsequently consider the inclusion of the vertex DOFs for the conforming element.The existing implementations of these elements [23,25] require a separate construction of the basis functions for each element, by inverting the Vandermonde matrix arising from the Piola pullback of the primal basis, as in equation (5.9).
For the nonconforming element, we define the vector of nodes associated with edge e k by N k = n,n,0,k n,t,0,k n,n,1,k n,t,1,k T . (5.26) That is, we store the normal and tangential moments of order 0, followed by those of order 1.We also collect the three internal degrees of freedom in the vector so that the nonconforming element has degrees of freedom with a similar definition and ordering of reference element degrees of freedom N k and N .Much as with the MTW element, we can use (2.24) and (2.25) to write where we now have and we readily invert W k by and have that in a manner very analogous to the MTW element.
Let us apply the push-forward to an internal degree of freedom, recalling that φ 0 = 1: where the asterisk indicates equality due to symmetry.Using this in (5.33) gives that PIOLA-MAPPED ELEMENTS and W = 1 det J W , we write (5.37)This gives the result which, again, is readily inverted blockwise to find P .As a remark, the W block is dense, but denoting its dependence on J by W = WJ , it is then easily checked that W −1 J = WJ −1 , i.e. the inverse can be found by reversing the roles of K and K.
We now turn to the full conforming Arnold-Winther element.It has the same edge and internal nodes to the nonconforming element just considered, but also includes vertex values.However, these transform in a very similar fashion to the internal moments just discussed.We collect the pointwise degrees of freedom for each vertex k into a small vector: (5.39) We also collect the edge degrees of freedom together in the same way as for the nonconforming element: N 1,k = n,n,0,k n,t,0,k n,n,1,k n,t,1,k T .
(5.40)Note that we have added extra superscripts of 0 and 1 to indicate the topological dimension associated with the degrees of freedom.We also include the internal degrees of freedom in a separate vector and hence order the degrees of freedom by (5.42) The edge and internal degrees of freedom are handled in exactly the same way as for the previous elements.The Piola push-forward of the vertex functional gives (5. 43) This suggests that (except for very special geometry), the Piola transform will not send vertex-oriented basis functions on the reference element to their physical counterpart.However, using the expansion of J τ J T given in (5.34), a very similar calculation as for the internal moments shows that (5.43) gives for each vertex where now W = 1 (det J) 2 W , for which again W −1 Combining all of the push-forwards gives that (5.45) Again, the block-diagonal structure of the matrix makes it straightforward to invert and find P .

Scale-invariance and conditioning
If global basis functions are of disparate size as the mesh is refined, this introduces an additional source of ill-conditioning into finite element systems.This phenomenon occurs for Hermite and certain other affinely-mapped elements [51], and similar conditions apply to our Piola-mapped elements as well.The solution is to globally re-scale degrees of freedom so that basis functions are of comparable size.
We illustrate these issues for the AW c element.Let h denote the typical diameter of a cell.Consider a basis function dual to a vertex DOF; it is unity at one vertex in one component and zero at other vertices, hence O(1) over the cell.A basis function dual to an internal DOF integrates to unity over the cell, hence must be of size O(h −2 ).Between these cases, a normal moment basis function over an edge must have size O(h −1 ) over that edge.Integrating the inner products which populate the global mass matrix, we obtain some entries (corresponding to pairs of vertex basis functions) of size O(h −2 ), others (corresponding to pairs of internal basis functions) of size O(h 2 ), and others in between.This leads to an O(h −4 ) condition number of the mass matrix rather than an O(1) condition number resulting from, say, standard Lagrange polynomials.
This issue can be readily addressed in a post hoc fashion by rescaling the global DOFs according to the local mesh size.For example, one can multiply the edge degrees of freedom by h and the internal degrees of freedom by h 2 , so long as cells sharing a DOF agree on the precise value used in the scaling.This choice was made in our implementation, but alternatively, one could include some power of facet measure in the definitions of degrees of freedom for the finite element.Such scale-invariance is implicit in the treatment of AW c by Carstensen et al. [25], the Morley-Wang-Xu elements, and related H 3nonconforming elements due to Wu and Xu [81] and would give a slightly different transformation matrix.

Preservation of the constraints
Our three spaces of primary interest can be characterized as the intersection of the kernels of several linear functionals.Note that the constraint on the normal component used to define the MTW space is preserved by the Piola map, due to (2.11).Similarly, the constraint on the normal-normal component of the nonconforming AW nc space is preserved in light of (2.19).The MTW element is moreover constrained by a loss of degree in its divergence; denoting by Φ = F div ( Φ) the vector Piola map applied to some Φ defined on the reference element, by a direct computation we have div Φ = 1 det J div Φ.Similarly, the conforming AW c element is also subject to a loss-of-degree constraint on its (now vectorvalued) divergence.Denoting τ = F div,div (τ ) for some symmetric τ defined on the reference element, then div τ = 1 (det J) 2 J div τ .It follows that the Piola maps preserve the degree of the divergence in both cases.
These constraints on the divergence, as well as constraints on the degree of the normal components, may be expressed by requiring integral moments against certain orthogonal polynomials to vanish.This is described on the reference element in some detail for Arnold-Winther elements in [50].All these functionals are preserved under the Piola push-forward.Hence, the physical element function space is actually constructed by pullback, even if the basis functions are not preserved.This need PIOLA-MAPPED ELEMENTS not be the case -for example, the constraints on the C 1 Bell element are not preserved under affine pullback, requiring a fuller version of the transformation theory [51].

Traction conditions in the Hellinger-Reissner principle with Nitsche's method
The Arnold-Winther discretization of the Hellinger-Reissner problem (4.5), in the pure displacement case |Γ N | = 0, for displacement data u 0 , has discrete weak form given by seeking (σ h , u h ) in one of the AW pairs Σ h × V h such that We now turn to the enforcement of traction conditions (4.2) for the Hellinger-Reissner problem, which for the dual formulation (4.4) are essential (i.e.strongly enforced); we consider a mixed boundary condition (0 < |Γ D |, |Γ N | < |∂Ω|), although the method may be extended to the pure traction case |Γ D | = 0 subject to a quotient of the displacement space by the rigid motions and a suitable compatibility condition.
The original AW papers [11,12] only treated the case of the elastic body clamped everywhere on the boundary, i.e. |Γ N | = 0 with homogeneous displacement data u 0 ≡ 0. Carstensen et al. [25] identified (in particular, inhomogeneous) traction conditions as a substantial practical difficulty associated with the AW elements, due to delicate interdependence between DOFs at the boundary, which moreover depend on the shape of the boundary at a given boundary vertex; they were able to enforce them using nodal interpolation [24], Lagrange multipliers [25], or elimination of the boundary DOFs by condensation [23].When the boundary condition is mixed but the traction data g is zero, the relevant trial and test space for the stress is [17, Remark 2.1.3].Wang [80, Lemma III.1] constructed an interpolant of Scott-Zhang type which preserves the traction-free condition on Γ N under an elliptic regularity assumption, and which was used to prove a discrete inf-sup condition for this case, but no details of the discrete enforcement of this condition were offered in Pasciak and Wang's application of AW c to the homogeneous pure traction problem in [70].
There is no clear way to enforce the traction condition in either the Arnold-Winther spaces or the weak formulation of Hellinger-Reissner.We therefore advocate a simpler approach, employing the classical Nitsche's method [68] to weakly enforce the condition.There is little literature on the application of Nitsche's method to the enforcement of essential boundary conditions in dual mixed problems; our approach is similar to [21,54] for the mixed Poisson problem.
Given traction data g, we augment the Hellinger-Reissner functional (4.3) over the discrete spaces Σ h × V h with a term incorporating the traction condition (to ensure consistency, since we do not impose the condition on the spaces), and a consistent, quadratic penalty term, seeking the critical point where h denotes the characteristic mesh size, γ > 0 is an h-independent penalty parameter, r h = σ h n − g denotes the traction residual, and may be interpreted as a least-squares term penalizing deviation from the traction condition.

F. R. A. Aznaran, P. E. Farrell, & R. C. Kirby
The exact traction satisfies σn = g in H −1/2 (Γ N ; R 2 ), but a penalty term δ h in terms of the dual norm in H −1/2 (Γ N ; R 2 ) would not aid the analysis, nor is it practical to equivalently penalize the Riesz representative of r h in H 1/2 00 (Γ N ; R 2 ), or to work with the linearization of such penalizations.To ensure consistency of the L 2 (Γ N ; R 2 )-penalization (6.4), we assume full elliptic regularity of the stress field by assuming that the solution (σ, u) to (4.4) satisfies (σ, u) ∈ H 1 (Ω; S) × H 2 (Ω; R 2 ), and that g ∈ L 2 (Γ N ; R 2 ).The latter assumption typically holds in practice for the traction data (or some discrete approximation thereof), while the former holds in the homogeneous isotropic case if Ω is a convex polygon and A is smooth.Remark 6.1.The exact displacement u formally satisfies an unmixed primal formulation of linear elasticity for which u ∈ H 1 (Ω; R 2 ), but there is no gain in regularity for the stress field by passing to an alternative formulation.Remark 6.2.The natural choice of exponent for h in (6.4) is +1 and not −1, by dimensionalization and since we expect r h 2 0,Γ N to converge one order slower than r h 2 −1/2,Γ N .However, the term with negative exponent is more naturally interpretable as a penalization, and was found to be more effective in preliminary numerical experiments.This also informs the choice of discrete norms (6.7) below.
Convergence will be proved only for the AW c element, but a computational example will be included also for AW nc .
Let T h denote a quasi-uniform triangulation of Ω, E h the set of its internal edges, and Λ h the set of its vertices. [1]Linearizing the augmented functional (6.3) over the AW c pair, we seek where where

and on an
exterior edge e ⊂ ∂Ω denotes the identity.We now prove well-posedness of the augmented discrete formulation (6.5) using the standard Brezzi conditions [17, Section 4.2.3].We take γ ≥ 1.
Proof.There exists By the scaling τ h n 0,e h − 1 2 τ h 0,K ∀ e ⊂ ∂K, we obtain 0,e ∀ e ⊂ Γ D , (6.13) It remains to show that u h h |||τ h ||| h .For every e ⊂ Γ N , since the DOFs associated to an edge and its endpoints determine τ h n on that edge, we have τ h n = 0 by (6.12a), (6.12c), so it suffices to show u h h τ h 0,Ω .By (6.12b), (6.12d), we have where π e : L 2 (e; R 2 ) → P 1 (e; R 2 ), π K : L 2 (K; S) → S are orthogonal projections.By equivalence of norms on Σ h on the reference cell K, we have τh K , so by a scaling argument we obtain τ h  [11], which enjoy the following approximation properties for all (τ, v) ∈ H 1 (Ω; S)×L 2 (Ω; R 2 ): (i) div Π h τ = P h div τ (as in a smoothed variant of the commuting diagram (4.11)); (ii) (Π h τ )n = π e (τ n) on all edges e; (iii Lemma 6.5.For each K ∈ T h and τ ∈ H 2 (Ω; S), we have where S K is a patch of cells neighbouring K such that {S K } K∈T h has the finite overlapping property.
Proof.We adapt the proof of (iii) in [11], from which we recall that the error in Π h may be written as .18) and S K is a patch of the required form, and Π 0 h : H 1 (Ω; S) → Σ h is the canonical interpolation operator except at the vertices, at which (Π 0 h τ )(x) := 0 ∀ x ∈ Λ h .It is easily checked that Π 0 K , the restriction of Π 0 h to a single cell K, is bounded from H 1 ( K; S) to H r ( K; S) for all r ≥ 0; choosing r = 1, by a scaling argument we obtain Π ) which gives the result when combined with (6.18).
We make an additional regularity assumption on the stress field for error analysis.
In particular, we expect the traction residual to converge as 2 ).Proof.By the Babuška condition associated with the well-posed system (6.5),applied to ( 21) which by consistency of the system (6.5) is equal to By (ii), the term ( * ) would vanish were it not for Remark 6.4.Employing multiplicative trace inequalities, approximation properties of P h , and Lemma 6.5, we have 1,K (6.27) (6.30)

An exterior calculus perspective
The finite element exterior calculus [9] is a framework for structure-preservation in finite element discretizations, and for constructing finite element spaces as subcomplexes of complexes of function spaces of differential forms.The inventions of the MTW and AW elements we have considered were motivated by the Stokes (3.1) and elasticity (4.11) complexes respectively.In this Section, we consider the application of FEEC to Piola transformation theory, and to the construction of multigrid smoothers.

Uniform construction of the pullbacks
The Piola transforms (2.4)-(2.7)may be regarded as the analogy of the standard pullback (2.2) for H(div)-and H(curl)-based spaces, but in fact the pullbacks may be defined uniformly in a manner guided by the FEEC [59, p. 35], [3, Section 6.2.5]; we here employ terminology for which we refer the reader to [3,Ch. 6]. [2] Let us regard the physical and reference cells K, K as submanifolds of dimension d in R d , with F : K → K a diffeomorphism.Denote by Alt k R d the space of alternating k-linear forms on R d , and for M ∈ {K, K} let Λ k (M ) denote the space of differential k-forms on M , of which functions in the Sobolev spaces we consider will be scalar, vector, and tensor proxies, and the operators of vector calculus will be proxies for the exterior derivative d : Λ k → Λ k+1 .Scalar fields may be identified with 0-forms or d-forms, and vector fields with 1-forms or (d − 1)-forms.We may specify L 2 -integrability of differential form coefficients with L 2 Λ k (Ω) := L 2 (Ω; Alt k R d ); Sobolev spaces of differential forms may be defined as HΛ k := {ω ∈ L 2 Λ k | dω ∈ L 2 Λ k+1 }, and correspond to the conventional spaces of vector calculus via HΛ 0 H 1 , HΛ 1 H(curl), HΛ d−1 H(div), and HΛ d L 2 . [3] An elastic stress field T on M may naturally be identified with an (Alt since when integrated over a codimension-1 submanifold (such as the boundary of a subdomain), it should give a vector representing force [9][35, p. 618].Applying the Hodge star gives an element τ ∈ Λ 1 (M ; R d ), i.e. a linear map R d → R d (hence, a matrix) at every point of M , which is the classical characterization of stress.Alternatively, T may be identified with a (symmetric) covariant 2-tensor field in Λ 1 (M ; Alt 1 R d ), which to each point assigns a (symmetric) bilinear form on R d [59, p. 10].
Given ω ∈ Λ k ( K), the derivative of the inverse diffeomorphism , the pullback of ω under F −1 , where * denotes the algebraic pullback of a linear map L : R d → R d given by For scalar fields representing 0-forms ω ∈ HΛ 0 ( K) (i.e. a constant map at each point), the scalar proxy for the resulting 0-form ω is easily seen to be given simply by precomposition with F −1 , because (J − * ω) x = ωF −1 (x) , which for the proxy gives exactly the standard pullback (2.2).For a vector field ŵ representing a 1-form ω ∈ HΛ 1 ( K), the canonical identification is ω = d i=1 ŵi dx i , so that ωx (v) = ŵ(x) • v (i.e.ŵ is the pointwise Riesz representative of ω), so hence the transformed proxy is given by F curl ( ŵ).A vector field ŵ can alternatively represent a (d − 1)-form via for which 3) The derivation of the tensor transforms are analogous; a (symmetric) covariant 2-tensor field τ ∈ Λ 1 ( K; Alt 1 R d ) has as proxy a (symmetric) matrix field T via the identification τx (v A (symmetric) matrix field T may instead serve as a proxy for a (symmetric where φ is defined by (7.2).By definition of φ it is easily checked that φ(J ), and hence Note that neither tensor transform is given by the row-wise application of the corresponding vector transform.
We hereafter assume that J is spatially constant, so that the pullback commutes with the exterior derivative.In this case, a crucial property is that moments of the exterior derivatives against appropriate fields are preserved, a primary motivation for the use of the Piola pullbacks in mixed finite elements [17, Lemmas 2.1.6,2.1.9].Let ω ∈ HΛ k (K), µ ∈ HΛ d−k−1 (K) with pullbacks ω, μ, then since the pullback respects the exterior product and commutes with the exterior derivative, Note that since we have chosen to allow F to reverse orientation, in general the moment preservation (7.6) will only be up to sign.By Stokes' theorem, it also follows that where the boundary integrands are meant in the trace sense.Combining these identities with the characterization of constrained finite element spaces by kernels of constraint functionals, as in subsection (5.5), also allows for a more elegant verification of their preservation by the Piola maps; the constraints for the MTW and AW c elements are preserved in light of (7.6), while the preservation of boundary constraints follows from the identity (7.7) applied to appropriate L 2 (∂K)-orthogonal polynomials, as constructed e.g. in [69, Section 3].
That the pullback commutes with the exterior derivative also implies the preservation of the kernel of the operators defining the spaces, a property implicit in the classical presentation of the Piola operators (e.g.[64, Section 3.9]).For example, if K, K ⊂ R 2 (respectively, R 3 ), note that if φ ∈ H 1 ( K) and φ ∈ H 1 (K) is its standard pullback given by (2.2), then by the chain rule we have ∇φ = J −T ∇ φ.Since rot ∇ ≡ 0 (resp.curl ∇ ≡ 0), certainly both ∇ φ ∈ H(rot, K) (resp.H(curl, K)) and ∇φ ∈ H(rot, K) (resp.H(curl, K)), and the covariant map (2.6) "transform[s] vectors of H(curl; Ω) like gradients" [17, p. 61], which, on simply connected domains, form the kernel of rot or curl by exactness of an appropriate de Rham sequence; hence PIOLA-MAPPED ELEMENTS ker(rot) = F curl ker rot in 2D and ker(curl) = F curl ker curl in 3D.(7.8) Analogously, by a direct computation, which is reflected in (2.4).We now extend the discussion in [64, Section 3.9] to the tensor-valued case in 2D.Let us define the matrix-valued curl of a vector field in 2D, for which and the Airy stress function associated with a scalar potential φ which is identically symmetric and divergence-free.Then for φ ∈ H 2 ( K) with pullback φ ∈ H 2 (K), so by exactness of the 2D stress complex (4.11), we obtain that ker(div) = F div,div (ker( div)).Similarly, for the rot of a symmetric matrix field applied row-wise, we calculate that ∇ 2 φ = F curl,curl ( ∇2 φ), where ∇ 2 H 2 forms the kernel of rot on H(rot; S) by exactness of the 2D Hessian complex (e.g.[71,Remark 3.16]).This observation of kernel preservation may also be connected to the (albeit trivial) topologies of K and K: the pullbacks are isomorphisms between appropriate complexes on K and K, which moreover commute with d, hence are cochain maps which preserve the cohomology of each domain.

Kernel-capturing and robust multigrid
It is now well-established that the characterization of the kernels of discretized differential operators is crucial for the design of robust multigrid schemes [8,31,33,76]; for both the MTW and AW-type elements, this is given precisely by their positions in the discrete exact complexes (3.2) and (4.11), as we now explain.We describe multigrid relaxation in the framework of subspace correction methods [83].Given a finite-dimensional Hilbert space V of functions on a mesh, consider a decomposition where the sum need not be direct.The variational problem to be solved over V often takes the form of a symmetric, coercive operator, perturbed by a positive semidefinite singular operator (such as a discretized divergence) which is scaled by some parameter α > 0, a physical or penalty parameter which, as it increases, renders the problem difficult to solve.The seminal work of Schöberl [76, Theorem 4.1] revealed sufficient conditions for α-robustness of the parallel subspace correction preconditioner induced by the decomposition (7.14); a key insight is that, if N denotes the kernel of the singular operator, the subspaces should satisfy the kernel-capturing property For an H(div)-based space V such as the MTW velocity space in the Stokes complex (3.1) or the AW stress space in the elasticity complex (4.11), one choice of relaxation method is given by the vertex star iteration [34], which is induced by the subspaces ) where K i denotes the patch of cells in the mesh sharing vertex i.Given v ∈ V with div v = 0, where now V ∈ {V h , Σ h } denotes one of the canonical MTW or AW spaces respectively, by exactness of the associated discrete complex when Ω is simply connected, we may write Cφ = v for some potential φ ∈ W , where (W, C) denote (W h /R, curl) or (Q h , airy h ) respectively.Let now {φ i } i denote a basis for W , and write φ = i c i φ i .Then a divergence-free decomposition of v is given by v i = c i Cφ i , since v = i v i and v i ∈ N for each i, so it suffices to find subspaces V i with Cφ i ∈ V i for each i.That the vertex star (7.16) fulfils this property follows from inspection of the basis functions of W .
In general, Schöberl's hypotheses also require that the splitting (7.14) be stable in the V -norm and that the kernel splitting (7.15) be stable in the energy norm induced by the Galerkin projection of the coercive form, which does not follow automatically from the exactness of the discrete complex; typically, such bounds hold for the infinite-dimensional spaces, and hold also for the discrete complex if bounded commuting projections exist.

Smoothers
To demonstrate the effectiveness of our mapping techniques in practical numerical simulations, it is necessary to consider preconditioners for the partial differential equations discretized by the MTW and AW elements, and to exhibit composability of our implementation with the associated software stack [52].As indicated in the previous subsection, the application of patch-based multigrid smoothers is natural for the H(div)-discretizing elements we consider in this paper.
Building on the work of Benzi and Olshanskii [15] and Schöberl [76] among others, Farrell and coauthors have successfully developed parameter-and mesh-robust preconditioners of augmented Lagrangian (AL) type, with highly specialized multigrid algorithms, for a host of nonlinear PDEs with saddle point structure [28-30, 32-34, 56, 82].We illustrate the AL method for AW elements, the application to MTW being analogous.It consists of augmenting the Hellinger-Reissner Lagrangian (4.3) with a penalty term for α ≥ 0. The AL term penalizes deviation from the set constrained by the momentum balance (4.5b), but does not change the exact solution.Its more significant benefit is that it allows the control of the Schur complement of the discretized system, as we now explain.The stationarity condition of the augmented energy (8.1) gives rise to a saddle point system for the stress-displacement pair, which, by a block factorization, admits the well-known solution formula [14, p. 14] where A, B T , B are discrete compliance, strain, and divergence operators respectively, S = −BA −1 B T is the (in general, dense) Schur complement, and the subscript α denotes the same quantities but of the augmented system (so that A 0 = A etc.).We wish to precondition this system for FGMRES Krylov iterations.In analogy to the velocity-pressure Stokes problem, for which the Schur complement is spectrally equivalent to the viscosity-weighted pressure mass matrix [78], it can be shown (denoting by M u the displacement mass matrix) that S−1 := −αM −1 u ∼ S −1 α (8.3) serves as an approximate inverse to the augmented Schur complement, at least for fixed mesh size, with the approximation improving as α → ∞.Preconditioning the Schur complement with (8.3) is however in tradeoff with the augmented stress solve A −1 α ; the AL term in A α has a large kernel

PIOLA-MAPPED ELEMENTS
consisting of the infinite-dimensional affine space of tensor fields with divergence f , rendering standard multigrid relaxation schemes ineffective.We propose the vertex star relaxation as an α-robust multigrid algorithm for this block.
Remark 8.1.All three of the finite element spaces we implement in this paper are non-nested under uniform refinement; we employ the default prolongation operator of Firedrake, which involves (i) lossless projection of the coarse function onto a DG space of the same degree, (ii) lossless natural injection of the coarse projection, from the coarse grid to the fine DG space, (iii) projection from the fine DG space to the fine finite element space.We conjecture that multigrid convergence behaviour observed in Section 9 may be improved by the construction of specialized prolongation operators for each element which preserve, at least approximately, the kernel of the discrete divergence, in the style of [32,34] (this being the other component of the parameter-robust multigrid framework of Schöberl), but we do not pursue this here.
Remark 8.2.Let τ h ∈ AW c (K) denote a local AW c basis function dual to a vertex DOF for a given cell K.Because all other nodes vanish at τ h , its full normal component vanishes along each edge, and its components have vanishing mean.Since ε(div τ h ) is a constant matrix, thus, compared to AW nc , the nodal AW c basis functions arising from the 'extra' vertex DOFs contribute only to the divergence-free subspace.It follows that similar multigrid transfer operators may work well for both these elements.

Numerical examples
For the case of affine rather than Piola transformations, we have described the inclusion of this theory in the Firedrake code stack in [53], and since this stack already understands Piola transformation the process is quite analogous.We must implement each new reference element in FIAT [50] and wrap it into FInAT [40].The FInAT wrapper also requires a function to construct abstract syntax for the basis transformation in terms of callbacks provided by the form compiler to obtain symbols for geometric quantities such as Jacobians, and physical and reference normal and tangent vectors.
The new elements must also be registered with UFL [1] and tsfc [41].We now consider several test problems to validate our implementation and demonstrate its capabilities.Linear systems in the manufactured solution examples, and on the coarsest grid in the multigrid examples, were solved by sparse LU factorization with MUMPS [2] and PETSc [13].

Mardal-Tai-Winther
We consider the following parameterized saddle point system: given H 1 methods degrade.The Mardal-Tai-Winther element, however, is stable for both H(div) and H 1 (though nonconforming) and for a scale of spaces in between; this is manifested in exhibiting stability and accuracy uniformly in for the system (9.1).
Figure 7.A warped mesh employed to check convergence under refinement.
On the unit square Ω = (0, 1) 2 , with Γ N = {y = 1}, Γ D = ∂Ω\Γ N , we approximate the manufactured solution u(x, y) = (2 1−y , 0) T , p(x, y) = cos(πx) cos(2πy).The essential boundary data (9.1c) is enforced weakly using the boundary DOFs and an interpolant of the given boundary data j.Denoting the resulting global constrained MTW space by V h,j , and the global DG(0) pressure space by Q h , we seek where now f, g, j, K are defined by appropriate residuals of the manufactured solution.In order to show that the mapping techniques apply on general unstructured meshes, we perturb the interior vertices of the coarsest mesh as pictured in Figure 7; further refinements are obtained uniformly.In Tables 1 and 2, we reproduce the approximately -independent expected orders of convergence (EOCs) in both velocity and pressure proved in [61] for the MTW-DG(0) pair, for a representative range of parameters .Notice that the errors themselves are essentially -independent, and a fortiori so are the convergence rates, demonstrating the robustness of the element between these contrasting regimes.

PIOLA-MAPPED ELEMENTS
Table 2. Errors and convergence rates in L 2 of pressure for (9.2).
We now demonstrate instead the robustness of our proposed multigrid solver for the MTW element applied to primal planar linear elasticity near the incompressible limit, a regime in which the MTW discretization will itself exhibit robust convergence; this example adapts the demonstration in [16].We consider a simple cantilever beam Ω = [0, 25] × [0, 1], fixed to a wall at the left-hand end Γ D , stress-free on the rest of its boundary, in plane stress conditions, and subject only to the force of its own weight f = (0, −10 −3 ) T .Employing a Union-Jack-crossed mesh, and the associated MTW space V h,0 , we seek u h ∈ V h,0 such that where the shear modulus of the material is fixed at µ = 3.8 × 10 4 and the Poisson ratio is initialized at ν = 0.3.In Table 3, for a discretization of 7.23×10 5 DOFs, we demonstrate robustness of the associated star relaxation, implemented via PCPATCH [31], as the Poisson ratio approaches the critical value 1 2 ; the solution at highest Poisson ratio, enlarged for the sake of illustration, is pictured in Figure 8.This was carried out with relative 2 -norm tolerance 10 −5 for the outermost Krylov iterations.Table 3. Moderate and approximately constant Krylov iteration counts for the solution of (9.3) near the incompressible limit, after preconditioning via the vertex star relaxation described in subsection 7.2.4. The approximation order m varies because in some cases, higher-order convergence may be obtained if the exact solution pair admits improved Sobolev regularity.
Table 4. Ranges of approximation order m by the AW elements of the elasticity variables in the L 2 -norm.
Having validated both AW elements, we consider a more complex example from the PhD thesis of Li [59] which moreover includes traction conditions.The domain, pictured in Figure 9, consists of the rectangle Ω = [0, 3] × [0, 1] with three disks removed, occupied by a material assumed to be isotropic and homogeneous with ν = 0.2 and Young's modulus E = 10: Figure 9. Domain with a coarser initial mesh than that employed in [59] so that multigrid may be performed.The prescribed displacement is fixed (0, 0) T at the left-hand end and compressed (−1, 0) T at the right end, which together form Γ D , with a stress-free condition σn = 0 on Γ N = ∂Ω \ Γ D given by the top, bottom, and the boundaries of the holes; there is no external force f .We combine the Nitsche and augmented Lagrangian penalties, seeking critical points (σ h , u h ) of H h,γ,α (σ h , u h ) := H h,γ (σ h , u h ) + α 2 Ω div σ h 2 dx.(9.5) Due to the Nitsche boundary terms, which at present cannot be treated with PCPATCH, the application of the vertex star iteration necessitated the use of PCASM.This was carried out with residual 2 -norm tolerance 10 −9 for the outermost Krylov iterations.Table 9 exhibits the behaviour of FGMRES for AW c , with multigrid for the augmented stress block employing vertex star relaxation with Chebyshev smoother on each multigrid level, with fixed Nitsche and AL parameters γ = 100, α = 1.We verify in Figure 10, which is colored by the size of the shear stress dev σ h , that the AW c solution on the finest mesh, with almost five million degrees of freedom, is free from numerical artifacts.Figure 11 shows convergence of the traction residual to zero in L 2 (Γ N ; R 2 ) as h → 0 using AW c with α = 1, and AW nc (using LU factorization) with α = 0 respectively, for various values of the Nitsche parameter γ.
Although we have proved convergence of the Nitsche penalty for any γ ≥ 1, in practice we find the solver most effective at γ = O(100), and that increasing γ further will enforce the traction condition more strongly (if desired), at the cost of a more ill-conditioned system.Remark 9.1.The observed order of convergence for the traction residual in Figure 11 is higher than that predicted by Proposition 6.6, which itself makes an artificial regularity assumption on the stress field, and which we therefore conjecture could be improved, for example via duality arguments.7.90×10 4 12 2 3  3.13×10 5 12 2 4  1.25×10 6 12 2 5  4.98×10 6  11   Table 9. Outer Krylov iteration counts using the vertex star relaxation for the AW c element.

Code availability
The Mardal-Tai-Winther and conforming and nonconforming Arnold-Winther elements, with the modifications suggested by subsection 5.4, were incorporated into the main branch of Firedrake as part of this work.For reproducibility, the exact software versions used to generate the numerical results in this paper are archived in https://zenodo.org/record/5596313[85]; the code, and scripts for the associated plots, are available at https://bitbucket.org/FAznaran/piola-mapped.

Conclusion
We have generalized Piola transformation theory to incorporate non-standard H(div) elements, developing techniques for representative finite elements discretizing H(div) and H(div; S) in two dimensions: the Mardal-Tai-Winther element for Stokes-Darcy flow, and two exotic, symmetry-enforcing elements for elastic stress due to Arnold and Winther.Numerical experiments verify the accuracy of implementations which are newly enabled by our approach.All elements have been implemented in the publicly available Firedrake library, within which we have demonstrated the composability of our implementations with existing patch-based smoothers.Of independent interest is the application of Nitsche's method to dual mixed problems; further investigation is merited by the interaction between the Nitsche and augmented Lagrangian penalties, whose efficacy when combined together we have observed numerically when applied to Arnold-Winther elements for linear elasticity.
We emphasize that our theory aims to demonstrate that unusual or non-standard elements, with desirable features but perhaps complicated transformation properties, can be used in an inexpensive, composable, and automated way, rather than to advocate for the use of MTW, AW c , or AW nc specifically.In particular, the 3D analogue of the AW spaces, namely the conforming and nonconforming Arnold-Awanou-Winther elements [4,5], are of local dimension 42 and 162 respectively, and are presented in the understanding that "[t]he complexity of the elements may very well limit their practical significance" [4, p. 1231].A cheaper alternative is provided by the conforming Hu-Zhang element, which in 3D is of dimension only 48, while many nonconforming efforts in 3D are rectangular [44,84]; in keeping with Remark 4.1, we conjecture that the mapping properties of these 3D elements would be analogous to our analysis in Section 5.

Figure 2 .
Figure 2. Some triangular vector H(div) elements.Arrows represent moments of a particular vector component (normal or tangential) along a facet.

Figure 3 .
Figure 3.The Tangential-Displacement and Normal-Normal-Stress mixed element [77].Thin arrows on the tensor diagram refer to a single tensor component in the given direction (e.g.n T τ n).Another exactly symmetric element is the Hellan-Herrmann-Johnson (HHJ) element, consisting of symmetric matrix-valued polynomials of degree k ≥ 0 with continuity of the normal-normal components (that is, n T τ n is continuous across edges).A schematic is provided in Figure4.

Figure 4 .
Figure 4.The Hellan-Herrmann-Johnson element[38,39,48].The TDNNS elements are implemented in Netgen/NGSolve, while the HHJ element was introduced into FEniCS in[37] and later into Firedrake; both implementations were possible without the theory we develop in this paper since both have boundary degrees of freedom involving only the normal-normal components and hence, in light of (2.19), can be Piola-mapped in a standard way.

Figure 10 .
Figure 10.A traction-free condition except at both ends.

Figure 11 .
Figure 11.Convergence to the traction-free condition in L 2 (Γ N ; R 2 ) for the conforming and nonconforming AW elements, with and without AL penalization respectively.

J
11 J 21 J 11 J 22 + J 12 J 21 J 12 J 22 Σ h denotes the Clément-like interpolation F. R. A. Aznaran, P. E. Farrell, & R. C. Kirby operator constructed by Arnold and Winther [21,54]mark 6.4.In analogy with the RT or BDM spaces considered in[21,54], we have the useful equilibrium property div Σ h ⊂ V h , but in disanalogy we in general have (Σ h | e )n ⊂ V h | e on edges e.For error estimation, we employ intermediate approximants Π h σ, P h u, where P h : L 2 (Ω; R 2 ) → V h denotes the orthogonal projection and Π h : H 1 (Ω; S) →

Table 1 .
Errors and convergence rates in L 2 of the MTW velocity for the model problem (9.2).Here and below, N denotes the uniform refinement factor with respect to the original mesh.