Hybrid high-order methods for flow simulations in extremely large discrete fracture networks

We investigate the computational performance of hybrid high-order methods applied to ﬂow simulations in extremely large discrete fracture networks (over one million of fractures). We study the choice of basis functions, the trade-oﬀ between increasing the polynomial order and reﬁning the mesh, and how to take advantage of polygonal cells to reduce the number of degrees of freedom


Introduction
It is well known that fractures play a major role in a wide range of applications (water resources, deep geological waste storage, geothermy, among others) and cannot be neglected in subsurface modelling as they are preferential flow paths [17].Fractures are ubiquitous and have a large range of sizes (from centimeters to kilometers) and apertures.In this paper, we consider flow simulations in crystalline rocks where the rock matrix is assumed to be impervious and the fractures to have large transmissivities.We do not consider the case of fractures acting like geological barriers and situations where flow occurs in the rock matrix.A more general model can be found in [37].A possible modelling strategy of the network of fractures is the Discrete Fracture Network (DFN) approach.Fractures are modelled as ellipses or disks whose properties (orientation, size, position, transmissivity) are governed by statistical laws derived from field observations [12,8,24].This model has been enriched in [18,17] to include the evolution of fracture network formation (nucleation, growth and arrest).The networks generated with an arrest rule contain large numbers of T-shape intersections (where a fracture stops on another one) compared to X-shape intersections (the two fractures cross each other).With no arrest rule, the networks rather contain X-shape intersections.In this paper, the DFNs we consider are generated with the software DFN.lab with no arrest rule.Examples of such networks are displayed in Figure 1.The largest test case contains 1,176,566 fractures embedded in a cubic domain of size  = 200m (labelled "L200": right panel).By decreasing the cube size, one obtains smaller (and embedded) DFNs:  = 100m (left panel, labelled "L100": 152,405 fractures) and  = 150m (central panel, labelled "L150": 508,339 fractures).To better emphasize the complexity of these DFNs, Table 1 gives the total number of intersections, the maximum number intersections per fracture, and the range of transmissivity values, given as input (one value per fracture).Table 1: Description of the three DFNs.
Figure 2 displays the histogram of the number of intersections per fracture for the test case L200.
As shown in Figure 2 and in agreement with natural field observations, there are mostly fractures with a few number of intersections (below 1000) and a few ones with a very large number of intersections. Figure 3 displays, for the test case L200, the fracture with the maximum number of intersections (11,710 intersections) and indicated on Figure 2 (right).Figure 3 (center and right) emphasizes the possible complexity of the network of intersections within a fracture.
Flow simulations are of primal importance to consider more involved physical and chemical couplings in these complex DFNs.Flows in fractured rocks have been extensively studied in the litterature and dedicated numerical methods have been developed to handle the challenging geometry of DFNs; see [28] for a review and the references therein.Considering   extremely large DFNs is challenging in terms of geometrical complexity, number of degrees of freedom and, therefore, computational ressources.The major difference between the proposed methods lies in the way the DFN is meshed.
• The first possibility is to consider a mesh that is conforming to the intersections between fractures.This can be realized in a matching way with nodes matching at the intersections [19,31,32,40,26] or in a nonmatching way with nodes from each fracture that may differ.This second situation arises, for example, when choosing a different mesh step from one fracture to another, and leads to the appearance of hanging nodes that can be managed either with mortar methods [41] or with polytopal (polygonal/polyhedral) discretization methods such as the virtual element method (VEM) [2] or the hybrid high-order (HHO) method [21,20].Flow simulations in DFNs using intersectionconforming meshes and polytopal discretization methods have been reported in [27] for VEM and [13,30] for HHO.
• The alternative is to use a mesh that does not take the intersections into account while the numerical methods do.In the litterature, different methods have been proposed to handle intersections that may cut the mesh cells: with an optimization approach [7,5], by means of the extended finite element method (XFEM) [25] or with VEM [3].In [4], a node insertion procedure ensures a global conformity of the mesh used with VEM.
In this work, we focus on intersection-conforming meshes that are constrained by the intersections (first item above).The advantage of keeping the intersections explicitely in the mesh generation is that it allows one to attach unknowns to the intersection edges so that continuity conditions at these intersections are easier to impose.Moreover, we focus on HHO methods since they possess, in our view, various interesting features for flow simulation in DFNs.Indeed, HHO methods are locally conservative, they support polytopal meshes, their order of accuracy can be easily increased, and they are computationally effective owing to the use of static condensation.Moreover, HHO methods have been bridged to hybridizable discontinuous Galerkin methods in [16].HHO methods have been applied to a broad range of applications in computational mechanics and beyond; we refer the reader to the two recent textbooks [22,15] and the references therein.The main goal of the present work is to study the computational performance of HHO methods in the context of extremely large DFNs (over one million fractures).More particularly, we focus on the choice of basis functions, the trade-off between increasing the polynomial order and refining the mesh, and how to take advantage of polygonal cells to reduce the number of degrees of freedom.
The rest of this paper is organized as follows.Section 2 briefly recalls the mathematical model of single-phase flow in DFNs.Section 3 presents the extremely large DFNs considered herein, as well as a brief assessment of the quality of some generated meshes.Section 4 shortly presents the HHO method applied to flow simulation in a DFN.Section 5 discusses our results concerning the computational performances of the HHO method on extremely large DFNs.Finally, conclusions and perspectives are given in Section 6.

The flow problem
In this work, fracture networks are modeled using the DFN approach proposed in [18,17].We consider a single-phase flow problem posed in a cubic domain of size , where there are   intersecting fractures that form the computational domain.The rock matrix is assumed to be impervious.Let   be the  th fracture,  = 1, ...,   .Every fracture is assumed to be planar.Let Γ  be the boundary of the fracture   (which may be truncated by the cube).Let x be the local 2D coordinates in the fracture   .Let   be the total number of intersections between fractures,   be the  th intersection,  = 1, ...,   , and   be the set of the indices of the fractures containing   .In our applications, it is unlikely to encounter intersection segments common to more than two fractures nor overlapping fractures.Therefore, the set   only consists of two indices, the indices of the two fractures that intersect.In each fracture   , we assume that the governing equations for the hydraulic head   (scalar-valued) and for the flux per unit length u  (vector-valued) are the mass conservation equation and Darcy's law [37]: The transmissivity field   (unit [m 2 .s−1 ]) is taken as a positive-definite tensor that may be different from one fracture to another (in where n denotes the outward normal unit vector on Γ  ,    the Dirichlet load, and    is the Neumann load.As the rock matrix is supposed impervious, for every fracture , one also imposes a homogeneous Neumann boundary condition on Additionally, coupling conditions hold at the intersections between the fractures [24,38] where   , is the trace on   from the side  of the hydraulic head in the fracture   .Condition (6) enforces the continuity of the hydraulic head at the fracture intersections and condition (7) imposes the conservation of mass.

Algorithm
The mesh of the DFN is generated according to the algorithm described in [10,35].It is implemented in the C software MODFRAC .Parallelism is achieved using posix-threads.The mesh step can be the same in the entire network, in which case the mesh is said to be matching at the intersections between fractures.One can also vary the mesh step from one fracture to another, in which case the mesh is generally nonmatching at the intersections between fractures.Figure 4 shows an example of a matching and an example of a nonmatching mesh.

Notation
Let T  be the mesh of the fracture   and let T =  T  be the mesh of the DFN.Let  denote a (boundary or interior) face of the mesh.Since the fractures are two-dimensional objects, a face  is actually an edge.For every fracture   , the interior faces are collected in the set F •  , the intersection faces in the set F   and the boundary faces in the set F   , so that the set MODFRAC is a proprietary software owned by Inria and the University of Technology of Troyes (UTT) collects all the mesh faces contained in the fracture   .Let F =  F  be the set of mesh faces.
Notice that every face that lies at the intersection between two fractures appears only once in the set F .Moreover, we assume that the mesh faces are compatible with the partitioning of the domain boundary related to the Dirichlet and Neumann boundary conditions, so that we can consider the partition F = F  ∪ F \ , where F  (resp., F \ ) is the set of faces located in (resp., outside) Γ  .For a generic mesh cell  ∈ T , its boundary is denoted by  and its unit outward normal by   , F  denotes the collection of the mesh faces composing its boundary  and for all  ∈ F  ,   | denotes the unit normal to  pointing outward .

Computational resources and mesh quality
We perform the mesh generation using the software MODFRAC for the three test cases of Figure 1 on a Laptop Intel Core i7 4 cores CPU 32GiB RAM with 4 posix-threads.The quality () of a triangle  with edge lengths   ,  = 1, 2, 3 and area | |, is defined as in [9]: The quality () is equal to 1 for an equilateral triangle.The lower (), the worse the quality of .Table 2 gives examples of a mesh and its properties for the three test cases L100, L150, L200 and also the mesh generation time.

The HHO method
In this section, we describe how to apply the HHO method to DFN flow simulations.We recall that the DFN is composed of fractures   ,  = 1, . . .,   , each with its mesh T  and 6 (tensor-valued) transmissivity   .The flow problem (1)-( 2) in the fracture   is a diffusion problem, and the fracture problems are coupled together according to the conditions ( 6)-( 7).

Local HHO spaces and operators
The HHO method attaches unknowns to the mesh faces and to the mesh cells.HHO uses one polynomial of order  ≥ 0 on each mesh face and one polynomial of order  on each mesh cell.At the algebraic level, the unknowns are therefore polynomial coefficients.We make the choice  :=  + 1 for the cell polynomial degree, which offers the advantage of leading to a simpler stabilization operator than the equal-order choice  :=  (see [16]).Let P  2 be the space composed of divariate (real-valued) polynomials of total degree at most .For every mesh cell  ∈ T , P +1 2 () denotes the space composed of the restriction to  of the polynomials in P +1 2 and P  1 () the univariate polynomial space attached to a mesh face  ∈ F  (defined using an affine geometric mapping from the reference interval in R to ).The local HHO space in the mesh cell  ∈ T is A generic element in V   is denoted by p := (   ,   ) with   := (   )  ∈ F  .The HHO method requires two main ingredients in every mesh cell  ∈ T : • a local reconstruction operator to reconstruct locally a gradient operator from the cell and the face unknowns; • a stabilization term to ensure that the trace of the cell unknows and the face unknowns weakly match.
Let  = 1, . . .,   be the index of the fracture to which  belongs, i.e.,  ∈ T  (notice that  is uniquely defined from  since every mesh cell belongs to one and only one fracture).Then, the local reconstruction operator   : V   → P +1 2 (), is such that for all p := (   ,   ) ∈ V   , the function   ( p ) ∈ P +1 2 () is uniquely defined by the following equations: where ( 11) holds for all where Finally, the local bilinear form where   := (  ) is the spectral radius of   and ℓ  := ℎ  is a local length scale associated with  and here taken equal to the diameter of .(It is also possible to consider a specific length scale for each face  ∈ F  , but our numerical experiments do not indicate any advantage for this alternative choice).

Assembly of the discrete problem in the DFN
The HHO space in every fracture   ,  = 1, . . .,   , is defined as follows: A generic element in V  ℎ, is denoted by pℎ, := (  T, ,  F, ) with  T, := (   )  ∈ T  and  F, := (   )  ∈ F  .As illustrated in Figure 5, the face unknowns are uniquely defined for all  ∈ F  .Similarly, the HHO space in the DFN is defined as follows: A generic element in V  ℎ is denoted by pℎ := (  T ,  F ) with  T := (  T, ) =1,...,   and  F := (  F, ) =1,...,   .Recall that every face  that lies at the intersection between two fractures appears only once in the set F , so that the discrete unknowns attached to the face  appear only once in   F .This automatically enforces at the discrete level the condition (6) at any intersection.Moreover, to account for the Dirichlet boundary conditions on Γ  , we consider the subspaces The discrete bilinear form  ℎ : V  ℎ × V  ℎ → R at the DFN level reads as follows: where for all  = 1, . . .,   and all  ∈ T  , p (resp., ŵ ) denotes the components of pℎ (resp., ŵℎ ) attached to the mesh cell  and to the faces in F  .The discrete problem is as follows: with ℓ( ŵℎ As mentioned above, the condition (6) at any intersection between fractures is automatically enforced owing to the choice of the face unknowns, whereas the condition ( 7) is a consequence of the recovery of equilibrated fluxes as detailed in Subsection 4.3 (see in particular ( 29)).
At the algebraic level, the components of pℎ attached to the Dirichlet boundary faces are eliminated, that is, we only seek for the components of pℎ in Let F \ be the corresponding component vectors of the discrete solution pℎ := (  T ,  F ) ∈  +1 T ×  F \ (with obvious notation).Ordering the cell unknowns and then the face unknowns that are not Dirichlet, the algebraic realization of ( 21) is and the right-hand side results from the sink/source terms and the Dirichlet/Neumann boundary conditions.The matrix A is symmetric positive-definite.As the submatrix A T T is block-diagonal, a computationally effective way to solve the linear system ( 23) is to eliminate locally the cell unknowns and solve first for the face unknowns only.Defining the Schur complement matrix the global transmission problem coupling all the face unknowns is This linear system is only of size   F \ .Once it is solved, one recovers locally the cell unknowns by using that The global transmission problem (25) being symmetric positive-definite, it can be solved with a direct solver based on Cholesky's factorization or an iterative solver based on the preconditioned Conjugate Gradient (CG) algorithm.

Remark 4.1
In practice, for extremely large DFNs, it may happen that the Cholesky algorithm fails with a "non positive-definite matrix" error.We suspect an effect of the use of floating-point numbers also linked to the possibly high condition number of the linear system matrix.An alternative which proved successful in all our test cases run so far is to choose solvers based on LU factorization, like the Intel Pardiso LU solver from Intel MKL library [33].

Recovery of equilibrated fluxes
HHO is a locally conservative method for which it is possible to define equilibrated fluxes at the boundary of every mesh cell  ∈ T .Let  = 1, . . .,   be the index of the unique fracture to which  belongs.Let pℎ ∈ V  ℎ, solve the discrete problem (21).We define the following fluxes associated with pℎ : As shown in [16], these fluxes satisfy the following properties: • Balance with the source term in every mesh cell  ∈ T  : • Equilibrium at every mesh interface  ∈ F • : (a) If  does not lie on the intersection between two fractures, then  =  − ∩  + ∈ F •  , and we have • Equilibrium with the Neumann boundary conditions: For all  = 1, . . .,   and all

Study of computational performance
In this section, we study the computational performance of HHO methods in the context of extremely large DFNs.We address the choice of the basis functions in Section 5.1, and the trade-off between increasing the polynomial order and refining the mesh in Section 5.2.In these two sections, triangular meshes are considered.We refer the reader to [30] for a comparison between HHO and lowest-order Raviart-Thomas methods on triangular meshes; here, we focus on the performance of HHO methods regarding basis functions and polynomial order.Section 5.3 contains numerical experiments in order to evaluate whether computational resources can be saved by means of polygonal cells.In this last section, the mesh in each fracture still matches the various intersections of the fracture, but its size is independent of the mesh of the other fractures, thus creating hanging nodes, and thereby polygonal cells, next to the fracture intersections.
The main numerical experiment to assess the accuracy of a simulation is the computation of the equivalent permeability of a DFN by running a permeameter test case.For example, to estimate the equivalent permeability   in the -direction, we proceed as shown in Figure 6, i.e., we apply a difference of hydraulic head,   1 and   2 (with Δ  :=   1 −   2 > 0), on the two opposite cube faces along the -direction and null flux boundary conditions on the other sides of the cube.The computed input flux is denoted  in,  (units m 3 .−1 ) and the computed output flux  out,  (units m 3 .−1 ).Recall that the fluxes are computed as discussed in Subsection 4.3 and that, in the absence of round-off errors, we have  in,  =  out,  since the HHO method is conservative.Then the equivalent permeability   is estimated as Our implementation is made in the Inria C++17 software called NEF++ [30] that relies on the Eigen library, which is a C++ template library for linear algebra [29] and the open-source HHO library Disk++ [14].Parametric studies have been carried out using the Python tool Prune (Inria).In all the test cases, the linear system is solved with the Intel Pardiso LU solver from Intel MKL library [33].Except where indicated otherwise, all the simulations are carried out using the Inria cluster CLEPS on a partition with 4 Intel Xeon E7-4860 v2 with 12 cores, 2.6-3.2GHz and 3TB RAM.In all the tables reported in this section, the number of dofs is the one after static condensation.

Choice of the HHO basis functions
In the HHO method, the unknowns are polynomials coefficients, and there are various (classical) possibilities for choosing the basis functions.On every mesh face , we choose the scaled monomials    with  = 0, . . .,  and where x denotes the midpoint of  and | | its length.On every mesh cell  ∈ T , we compare two choices of scaled monomials: where ( x , ȳ ) denote the coordinates of the barycenter of  and ℎ   , ℎ   denote the lengths of the bounding box of  according to the Cartesian axes; • the basis composed of rotated monomials [1], hereafter called [Rotated], and defined as       , with ,  = 0, . . .,  =  + 1,  +  ≤  and where    ,    denote the unit vectors aligned with the inertia axes of  and ℎ   , ℎ   the lengths of the bounding box of  according to the inertia axes.
Both the Cartesian and rotated monomials are independent of translation (since they use the barycenter of ) and of homothety (since they use the length of a suitable bounding box).Moreover, the rotated monomials are also independent of rotation.
To compare the two choices for the cell basis functions (Cartesian vs. rotated), we report in Table 3 the relative error on the global mass conservation indicator | in,  −  out,  |/ in,  for these two choices on the permeameter test case described above.We observe that both choices of basis functions behave well for both DFNs for  = 0 and for the smaller DFN, L100, up to  = 2.For L100 with  > 2 and for L150 with  > 0, global mass conservation (and also local mass conservation, omitted from the table for brevity) is ensured only for the rotated basis functions.The reason of the poor behavior of the Cartesian monomials is the presence of some highly deformed mesh cells which leads to highly ill-conditioned local matrices when computing the reconstruction and stabilization operators, as discussed, e.g., in [22].In all the following simulations, we employ the rotated monomials.

Remark 5.1 Other choices for the basis functions are possible, such as an 𝐿 2 -orthonormalised basis, which can help dealing with distorted elements (see [22, §B.1.1]). We have not tested this choice because it is computationally more demanding than the choice of rotated monomials, especially with a HHO library based on on-the-fly computations of the local matrices
(as the one we are using).

Trade-off between polynomial order and mesh refinement
We investigate the trade-off between polynomial order and mesh refinement in the context of the permeameter test case described above.As a first step, we perform a flow computation on a fine mesh to obtain a reference value for the equivalent permeability   of the DFN.Then, we use this reference value to compute errors on the equivalent permeability obtained using coarser meshes.In the first step of our study, two fine meshes are built with the software MODFRAC: one with 12,890,943 triangles for the DFN L150 and one with 20,522,575 triangles for the DFN L200.  4 and Table 5 report the number of dofs, the number of non-zeros of the matrix of the linear system (nnz), the relative error (in %) on the equivalent permeability defined as RelErr(  ):

𝑥
and the relative error (in %) on the global mass conservation | in,  −  out,  |/ in,  for the test cases L150 and L200 respectively.The computational times (total time  tot ) for the fine mesh flow simulations and the peak of RAM for the two test cases L150 and L200 are given in Table 6 and 7 respectively.The postprocessing encompasses two steps: (1) the mean hydraulic head computations; (2) the flux recovery detailed in Subsection 4.3.These two steps are known to be embarrassingly parallel, i.e. they can be performed independently in each mesh cell, which means a potentially large gain in computational ressources with respect to the present serial implementation.The only difference is that our application is not really 2D since the 2D coordinates change from one fracture to another.Our current version of the code is based on a loop over the fractures.We could save computational time by an efficient parallelization of this loop.Notice also that to save RAM, neither A −1 T T to compute P T (see Subsection 4.2), nor the local gradient or the stabilization matrix are stored.This is the reason why the postprocessing step has a sizable contribution to the costs.
Let us now investigate the trade-off between polynomial order and mesh refinement.To do so, simulations are performed on different coarser meshes of the DFN L150 (Table 8) and for each mesh, we compute the resulting equivalent permeability for different polynomial degrees  ∈ {0, 1, 2, 3, 4}.
Figure 8 displays the relative error using the reference value obtained above in the first step.Considering first the left panel based on the number of dofs (after static condensation), we observe that  = 1 is the most effective choice on the three coarsest meshes if an accuracy  between 2% and 4% is desired, whereas for smaller accuracies, going for  ≥ 2 is preferable.
Considering the right panel based on the computational time leads essentially to similar conclusions.Similar conclusions are reached as well if one considers the peak of memory (results not shown for brevity).Actually, the optimal choice depends on the initial mesh as shown on Figure 9.It is computationnally less demanding to go for  = 1 on a fine mesh and to go for higher order  ≥ 2 on a coarse mesh.In summary, as expected from the linear nature of the problem and the regularity of the solution, we recommend:  • to use a face polynomial order  at least equal to 1, instead of refining a mesh while keeping  = 0, • to choose  = 1 on a fine mesh and  > 1 on a coarser mesh.To provide some further insight on the relation between number of dofs, computational time and peak memory, we report in Table 9 the computational time and RAM required for approximately the same number of dofs with polynomial orders  = 0, 1, 2, 3.As expected, the time and RAM memory requirements increase with .This is shown in Figure 10 where we plot the computational time and the peak of RAM versus the number of dofs.From Table 9 and Figure 10, we observe that the extra time and RAM memory for  ≤ 2 seem negligible in comparison with the time and RAM at order  = 0.For  > 2, the slopes become somewhat higher.We believe that these increases in time and RAM for  > 2 are due to the solver since the matrices have more nonzero elements when  increases.

How to take advantage of polygonal cells?
So far, to get an accurate solution, the numerical simulations were performed on a fine mesh according to the standard procedure described in Algorithm 1 below (where the steps are named "FT" for fully triangular).In this section, we study how to change this procedure to take advantage of polygonal cells in order to (possibly) save computational resources.
Algorithm 1 Fully triangular mesh generation and flow computation 1: (FT1) Generate a fine mesh of the DFN with matching cells (in our case triangles) at the intersections 2: (FT2) Perform the corresponding flow simulation

A mesh refinement/coarsening procedure
In DFN, flows are highly channelled, meaning that only a small number of fractures carry most of the flow.A reasonable way to decrease the number of triangles in the mesh of a DFN is, therefore, to use a fine mesh only for those fractures that mostly contribute to the flow and to use a coarse mesh for the other fractures.To do so, one can use the fact that, by choosing a mesh step specific to each fracture, the mesh generator MODFRAC one to generate triangular meshes that are nonmatching at the intersections between fractures.Then, the resulting nonmatching mesh can be transformed into a polygonal matching mesh which is supported by the HHO method.The detailed steps of the new procedure are described in Algorithm 2. The resulting meshes are called "polygonal-triangular" meshes (since they consist of triangular cells away from the intersections and polygonal cells at the intersection between fractures meshes with different mesh steps).We use therefore the notation "PT" to name the steps of Algorithm 2. The polygonal-triangular mesh can be viewed as an intermediate mesh between the two fully triangular meshes where all the fractures are meshed with either a fine or a coarse triangular mesh.Notice that coarsening at step (PT1) is limited by the geometry and intersections of the fractures (above a given mesh step, no more coarsening is possible owing to geometric constraints).
Algorithm 2 Polygonal-triangular mesh generation and flow computation 1: (PT1) Generate a coarse mesh of the DFN 2: (PT2) Perform the coarse flow simulation 3: (PT3) Decide which fractures to refine according to their flow contribution 4: (PT4) Remesh the DFN according to step (PT3): only the selected fractures are refined and the mesh is therefore nonmatching at the intersections between fractures 5: (PT5) Run a node insertion algorithm to build a (matching) polygonal mesh at intersections: the mesh is therefore composed of triangles and polygons 6: (PT6) Perform the flow computation on the new polygonal mesh obtained at step (PT5) In view of Algorithm 2, it appears that we need two additional ingredients with respect to the standard procedure of Algorithm 1: • an algorithm to select the most contributing fractures at step (PT3); • a node insertion algorithm to create polygons from nonmatching triangles at step (PT5).
These two ingredients are now described.

Selection of the most contributing fractures (step (PT3))
As proposed in [36], we characterize each fracture   by a single flow value   defined as the total flow exchanged between   and its intersecting fractures: where    is the number of intersections in the fracture   , and  , is the flow in   exchanged through the intersection ,  = 1, . . .,    .
To know which fractures to refine/coarsen, we propose a basic idea consisting in two steps: 1. for each fracture   , compute   from the coarse flow simulation performed at step (PT2) and compute the maximum value 2. choose a fine mesh step for the fractures with   / max above given threshold.

Creation of polygons from nonmatching meshes (step (PT5))
The second ingredient is an algorithm to create a polygonal mesh from a nonmatching triangular mesh.As illustrated in Figure 11, choosing a mesh step that can be different from one fracture to another generally leads to nonmatching cells at the intersections between fractures (notice that the two fractures do not lie on the same plane).One way to transform these nonmatching cells into polygonal cells is to use a vertex insertion algorithm (added vertices are displayed in red in Figure 11).The vertex insertion algorithm is described in Algorithm 3.

Algorithm 3 Vertex insertion algorithm
1: Load the nonmatching triangular mesh, including the edges and vertices of each intersection on both sides 2: For each intersection, compute the linear coordinates of the vertices on both sides in order to know where to insert the vertices to create a matching discretization 3: Find the unique set of intersection vertices and create a new global numbering of the vertices 4: Add the extra vertices to create the polygons (red dots) and update the connectivity table that defines the polygons in terms of vertices 5: Compute the 2D coordinates of the extra vertices in the fracture to which they now belong 6: Store the polygonal mesh

Numerical tests
Let us test the procedure proposed in Algorithm 2 for the DFN L150 of Figure 1.As the generated mesh is conforming to the intersections between the fractures and as the DFN L150 and L200 contain many intersections (see Table 1), the coarsening with triangles is limited.Therefore the coarse simulation remains quite costly.For this reason, we exploit the fact that the DFN L150 is embedded into the DFN L200 (Figure 1) and we reuse, for the test case DFN L200, the set of fractures selected at step (PT3) for the DFN L150.Then, we save steps (PT1-PT3) for the DFN L200 and we start Algorithm 2 at step (PT4).The potential price to pay is that some preferential flow paths in the fractures that are present in L200 but not in L150 may be missed.Using this procedure is only considered here to save some computational resources, and is not a limitation of the present algorithms.
Step (PT1): Coarse mesh generation and mesh quality The first step consists in running a coarse mesh generation with MODFRAC for the DFN L150.Let us choose a coarse mesh step of size 10 m.Table 10 gives the properties of the coarse mesh and also the mesh generation time.Step (PT3): Selection of the most contributing fractures From the coarse simulation obtained at Step (PT2), we compute for each fracture   the quantity   .Figure 12 reports the number of fractures versus   / max (%) (recall that the maximum flux  max is defined in (36)).Table 11 reports the number of selected fractures according to different choices for the threshold parameter.Step (PT4): Generation of the nonmatching mesh (with triangles) Let us choose the threshold 0.01% in Table 11 (all other choices are possible).Then 97,277 fractures are selected as the most contributing fractures at Step (PT3).We use this set of fractures for the test case L150 (19% of the DFN) and also for the test case L200 (8% of the DFN) owing to the embedded feature of the considered DFNs.For the test L150, we set a mesh step of size 1m for the 97,277 selected fractures and we coarsen the other fractures with a mesh step of 2. For the test case L200, we set a mesh step of 1.5m for the 97,277 selected fractures and we coarsen the other fractures with a mesh step of 2m.Table 12 reports the properties of the nonmatching meshes and also the mesh generation time.The quality of the mesh is computed according to Subsection 3.3.For L150, we obtain a nonmatching mesh containing 10,461,407 triangles which is 19% triangles less than the fine mesh described in Table 2.For L200, the nonmatching mesh contains 18,648,084 triangles, which is 9% less that the fine mesh described in Table 2.

Remark 5.2
The mesh generation time is nearly the same for the nonmatching mesh as for the matching mesh (compare with Table 2) despite a lower number of triangles.This comes from the fact that the intersections are discretized twice in the nonmatching case versus once in the matching case.Moreover, as the contours of the ellipses are discretized by polygonal lines, some automatic corrections happen more frequently for coarse discretization to ensure that all discrete intersections lie within the polygons [35].Some code optimization/parallelization is currently under investigation in the nonmatching case to improve the mesh generation time.

Step (PT5): Vertex insertion algorithm to build the polygonal-triangular mesh
The vertex insertion procedure used in Algorithm 3 has been implemented in a Matlab code, named NEF-flow-polygons (Inria).It takes 00:17:11 to load the nonmatching mesh and to insert the vertices for the L150 test case and 01:11:47 for the L200 test case.In both cases, about half of the time is dedicated to loading the mesh data.Figure 13 shows the polygonal mesh with the red dots representing the inserted vertices.For L200, the red dots are rather inside the cube (since the selected fractures are the ones of the test L150 and therefore rather located inside the embedding cube).Hence, a smaller number of red dots are seen on the cube surface, and these dots belong to the largest fractures, common to the test cases L150 and L200.For L150, 3.5% cells of the polygonal-triangular mesh are polygons, whereas this number is 1% for L200.Hence, the computational time to run Algorithm 3 can be improved, especially as there are several loops.A  version included in the mesh generator would be much more efficient (also as there would be no need to read the mesh data files).
Step (PT6): Flow computation on the new polygonal-triangular mesh The last stage is to run the flow simulations on the polygonal-triangular mesh built at step (PT5) using the NEF++ software.Table 13 and Table 14 report the number of dofs, the number of nonzeros (nnz) in the linear system matrix, and the relative error on the equivalent permeability RelErr(  ) and the relative error (in %) on the global mass conservation | in,  − out,  |/ in,  for the test cases L150 and L200, respectively.We observe a slight loss in mass conservation by comparison with Table 4 and Table 5, respectively.This is due to a lower mesh quality of the nonmatching mesh by comparison with the initial fully triangular matching mesh (see the columns Mean() in Tables 2 and 12).The computational time for the new polygonal-triangular mesh flow simulations and the peak of RAM for the two test cases L150 and L200 are given in Table 15

Discussion
Let us compare the results obtained with Algorithm 1 with the ones obtained with Algorithm 2 on the two test cases L150 and L200.  Figure 14 displays the relative error on the -equivalent permeability with respect to the number of dofs (#dofs) for the test case L150 and for the three types of mesh: fully triangular coarse, polygonal-triangular and fully triangular fine.We observe that the intermediate polygonal-triangular mesh produces an approximation of the -equivalent permeability as accurate as the one obtained on the fully triangular fine mesh, while saving ( + 1) * 3,904,298 dofs.This roughly represents 15% to 25% time saved and 13% to 16% RAM saved for a given polynomial degree  (compare Table 6 with Table 15).
For the test case L200, as shown in Figure 15, we observe again that the intermediate polygonal-triangular mesh produces an approximation of the -equivalent permeability as accurate as the one obtained on the fully triangular fine mesh, while saving ( + 1) * 3,026,232 dofs.This represents roughly 12% to 20% time saved and 7% to 10% RAM saved for a given degree  (see Table 7 versus Table 16).
These time savings deserve some further comments since a fair comparison requires to compare the overall time spent on steps (PT1)-(PT6) to the overall time spent on steps (FT1)-(FT2).In that case, the polygonal strategy is not the most relevant in terms of  computational time, but the gain remains in terms of RAM memory requirements.Thus, the present results still provide a proof of concept that the polygonal feature of HHO helps in decreasing the number of dofs.Moreover, this study emphasizes that staying with triangles limits the coarsening possibilities (and therefore the computational gain) due to the geometry of the DFN and intersections.To further improve the results with HHO, general polygonal cells can be generated also away from the intersections between fractures, using a triangle agglomeration strategy for example.Also, a more advanced procedure based on local error estimators as proposed for example in [6] for VEM can be quite helpful.Furthermore, alternatives to reduce the computational cost can be considered.For example, steps (PT1-PT2-PT3) could be replaced by a single step that does not require any flow computation and based on recent applications of the graph theory for DFNs to know which fractures to coarsen and to refine [34,23].Also, step (PT5) could be included in the mesh generator (avoiding a costly mesh reload as it is the case now).

Conclusion
We have successfully tested the HHO method on large scale DFNs (more than 1 million of fractures).The implementation of the method is locally and globally conservative (whatever the polynomial degree).The use of basis functions based on hierarchical scaled rotated monomials is important to avoid roundoff errors due to ill-conditioning.Moreover, the use of high order polynomials is an advantage to compute a more accurate flow without much extra time compared to a low-order method, but the RAM memory demand increases with the polynomial order for the same number of dofs.We have also shown that the use of nonmatching meshes and the polygonal feature of the HHO method allow one to exploit the channelling effect of DFN flows by reducing the number of dofs, and therefore the RAM requirements.Regarding the computational time, the strategy based on a mesh/coarsening procedure involving an additional coarse flow simulation is still costly.Other promising methods will be tested in a near future, especially the ones based on graph theory for DFN [34,23] or a posteriori error estimates [42,6].Moreover, further reduction of the number of dofs by using the unfitted HHO method [11] and the possibility to merge the triangles in (possibly non convex) polygons can be considered.Finally, we are also interested in studying iterative solvers to reduce the RAM memory requirements [39].

Figure 2 :
Figure 2: L200: histogram of the number of intersections per fracture.

Figure 4 :
Figure 4: (left) A matching mesh at the fracture intersections; (right) A nonmatching mesh at the fracture intersections.The intersections are highlighted by red lines.

Figure 5 :
Figure 5: Discrete unknowns in three mesh cells.Here, the order of the cell polynomials is  =  + 1, and  is the order of the face polynomials.(left)  = 0 and (right)  = 1 where black dots represent face unknowns and green dots cell unknowns (without necessarily meaning point evaluation).

Figure 7
reports the -equivalent permeability versus the number of dofs for polynomials degrees  = 0, 1, 2, 3, 4, and for the two DFNs L150 and L200.The equivalent permeability reaches a plateau as  increases.The reference values for   are taken for  = 4 and are equal to  ref  = 3.51e−2 m 2 .s−1 for L150 and  ref  = 3.53e−2 m 2 .s−1 for L200.Table

Figure 11 :
Figure 11: (left) Nonmatching triangular mesh; (center) Vertex insertion (red dots) to create a polygonal mesh; (right) Example of a triangle (initial vertices are in blue) transformed into a quadrangle with two aligned sides (the extra vertex is in red).This triangle/quadrangle is a zoom of the encircled triangle in the image at the center.

Table 10 :
DFN #fractures #Triangles Mean  (| |) Min  (| |) Mean() Min() Time (hh:mm:ss) Matching coarse mesh for the DFN L150.Step (PT2): Coarse flow simulation The coarse flow simulation is performed on a Laptop Intel Core i7 4 cores CPU 32GiB RAM with 4 posix-threads.It takes 6 minutes to load the coarse mesh plus 16 minutes to solve the coarse flow problem.

Figure 12 :
Figure 12: L150: number of fractures versus their percentage of the maximum output flux.

Figure 14 :
Figure 14: L150: relative error on the -equivalent permeability versus #dofs.Example of gain obtained with a polygonal discretization.

Figure 15 :
Figure 15: L200: relative error on the -equivalent permeability versus #dofs.Example of gain obtained with a polygonal discretization.

Table
. For each intersection   ,  = 1, . . .,   , let us consider the associated intersecting fractures   ,  ∈   .For all  ∈   , the fracture   can have either one side (T-junction) or two sides (X-junction) at   .These sides are enumerated using a generic index  ∈ Σ , , where Σ , consists of a single element for a T-junction and of two elements for an X-junction.Let n  , be the unit vector normal to   , pointing outward the side  of   , and tangent to   .Then, for each intersection   ,  = 1, . . .,   , the following coupling conditions are enforced:

Table 2 :
Examples of a mesh for the three DFNs: L100, L150 and L200.

Table 3 :
Choice of the cell basis functions and its influence on the relative error on the global mass conservation | in,  −  out,  |/ in,  on two DFNs as , with ,  = 0, . . .,  =  + 1,  +  ≤  and

Table 4 :
L150: Flow computation on a fine mesh.

Table 5 :
L200: Flow computation on a fine mesh.

Table 6 :
L150 : Computational resources required for the flow computation on a fine mesh: total time  tot (hh:mm), time of the different steps: Input, Assembly, Solver, Postprocessing (hh:mm and % of  tot in parentheses) and peak of RAM (GiB).

Table 7 :
L200: Computational resources required for the flow computation on a fine mesh: total time  tot (hh:mm), time of the different steps: Input, Assembly, Solver, Postprocessing (hh:mm and % of  tot in parentheses) and peak of RAM (GiB).

Table 9 :
L150 : Computational resources required for approximately a fixed number of dofs and for different polynomial orders  = 0, 1, 2, 3.

Table 11 :
L150: Number of selected fractures according to different thresholds.

Table 12 :
Nonmatching meshes for the test cases L150 and L200.

Table 15 :
L150: Computational resources required for the flow computation on the new polygonal-triangular mesh: total time  tot (hh:mm), time of the different steps: Input, Assembly, Solver, Postprocessing (hh:mm and % of  tot in parentheses) and peak of RAM (GiB).

Table 16 :
L200: Computational resources required for the flow computation on the new polygonal-triangular mesh: total time  tot (hh:mm), time of the different steps: Input, Assembly, Solver, Postprocessing (hh:mm and % of  tot in parentheses) and peak of RAM (GiB).