Linear Operators ################ *µ*\Grid provides a hierarchy of linear operators for discretizing partial differential equations on structured grids. This chapter explains the different operator types and when to use each one. Overview ******** Linear operators in *µ*\Grid fall into three categories: 1. **Generic operators**: Flexible, user-defined stencils for arbitrary convolutions 2. **Gradient/divergence operators**: FEM-based operators for computing derivatives 3. **Fused operators**: Highly optimized operators for specific PDEs (Laplace, elasticity) The general recommendation is to use fused operators when available, as they provide the best performance. Generic operators are useful for prototyping or implementing custom stencils. .. list-table:: Operator Summary :header-rows: 1 :widths: 25 35 40 * - Operator - Use Case - Performance * - ``GenericLinearOperator`` - Custom stencils, prototyping - Baseline * - ``FEMGradientOperator`` - Gradient/divergence with FEM - 2-3× faster than generic * - ``LaplaceOperator`` - Scalar Poisson problems - 2-4× faster than generic * - ``IsotropicStiffnessOperator`` - Linear elasticity (isotropic) - 5-10× faster than unfused Generic Linear Operators ************************ The ``GenericLinearOperator`` class (formerly ``StencilGradientOperator``) implements arbitrary convolution stencils. It is the most flexible operator type but also the slowest due to indirect memory access patterns. Creating a generic operator --------------------------- A generic operator requires: 1. **Stencil coefficients**: A numpy array of any shape containing the weights 2. **Offset**: The position of the stencil origin relative to array indices Example: 5-point Laplacian in 2D: .. code-block:: python import numpy as np import muGrid # 5-point Laplacian stencil h = 0.1 # Grid spacing stencil = np.array([ [0, 1, 0], [1, -4, 1], [0, 1, 0] ]) / h**2 # Offset: stencil[0,0] corresponds to neighbor at (-1, -1) offset = [-1, -1] laplace = muGrid.GenericLinearOperator(offset, stencil) Multi-component stencils ------------------------ For operators with multiple input/output components (like gradients), use higher-dimensional coefficient arrays: .. code-block:: python # 2D gradient stencil: 2 output components, 1 input, 2×2 stencil # Shape: [nb_output, nb_quad, nb_input, stencil_x, stencil_y] gradient_coeffs = np.zeros((2, 1, 1, 2, 2)) # ∂/∂x: forward difference gradient_coeffs[0, 0, 0, 0, 0] = -1/h gradient_coeffs[0, 0, 0, 1, 0] = +1/h # ∂/∂y: forward difference gradient_coeffs[1, 0, 0, 0, 0] = -1/h gradient_coeffs[1, 0, 0, 0, 1] = +1/h gradient_op = muGrid.GenericLinearOperator([0, 0], gradient_coeffs) Using generic operators ----------------------- .. code-block:: python # Create field collection with ghost regions decomposition = muGrid.CartesianDecomposition( comm, (64, 64), nb_ghosts_left=(1, 1), nb_ghosts_right=(1, 1), ) input_field = decomposition.real_field("input") output_field = decomposition.real_field("output") # Fill ghost regions before applying operator decomposition.communicate_ghosts(input_field) # Apply the operator laplace.apply(input_field, output_field) FEM Gradient Operator ********************* The ``FEMGradientOperator`` computes gradients using linear finite element shape functions. Each pixel/voxel is subdivided into simplicial elements (triangles in 2D, tetrahedra in 3D) with multiple quadrature points. Properties ---------- .. list-table:: :header-rows: 1 :widths: 20 20 60 * - Dimension - Elements - Quadrature Points * - 2D - 2 triangles per pixel - 2 (one per triangle) * - 3D - 5 tetrahedra per voxel - 5 (one per tetrahedron) Creating the operator --------------------- .. code-block:: python # Grid spacing (can be non-uniform) grid_spacing = (0.1, 0.1) # 2D # grid_spacing = (0.1, 0.1, 0.1) # 3D gradient_op = muGrid.FEMGradientOperator(dim=2, grid_spacing=grid_spacing) # Query operator properties print(f"Quadrature points: {gradient_op.nb_quad_pts}") print(f"Quadrature weights: {gradient_op.quadrature_weights}") Gradient operation ------------------ The forward operation computes gradients at quadrature points: .. code-block:: python # Input: scalar or vector field at nodes # Output: gradient tensor at quadrature points # For a displacement field u with shape [dim, nx, ny]: # gradient_op.apply(u, grad_u) produces grad_u with shape [dim, dim, quad, nx, ny] decomposition.communicate_ghosts(u_field) gradient_op.apply(u_field, gradient_field) Divergence (transpose) operation -------------------------------- The transpose operation computes the (negative) divergence, weighted by quadrature weights: .. code-block:: python # Input: tensor field at quadrature points # Output: vector field at nodes gradient_op.transpose(stress_field, force_field, quad_weights) The quadrature weights are typically the element volumes divided by the number of quadrature points per element. Laplace Operator **************** The ``LaplaceOperator`` provides an optimized implementation of the discrete Laplacian using the standard 5-point (2D) or 7-point (3D) stencil. .. code-block:: python # Create Laplacian with scaling factor # Negative scale makes the operator positive-definite (for CG solver) scale = -1.0 / h**2 laplace = muGrid.LaplaceOperator(dim=2, scale=scale) # Apply to fields decomposition.communicate_ghosts(u_field) laplace.apply(u_field, result_field) The hard-coded implementation is significantly faster than an equivalent ``GenericLinearOperator`` because: - Memory access patterns are predictable - The compiler can vectorize the inner loops (SIMD) - GPU kernels are highly optimized Isotropic Stiffness Operator **************************** The ``IsotropicStiffnessOperator2D`` and ``IsotropicStiffnessOperator3D`` classes implement fused stiffness operators for isotropic linear elastic materials. They compute: .. math:: \mathbf{f} = \mathbf{K} \mathbf{u} = \mathbf{B}^T \mathbf{C} \mathbf{B} \mathbf{u} where: - :math:`\mathbf{u}` is the displacement field - :math:`\mathbf{B}` is the strain-displacement matrix (gradient operator) - :math:`\mathbf{C}` is the material stiffness tensor - :math:`\mathbf{f}` is the internal force vector Mathematical formulation ------------------------ For isotropic materials, the stiffness tensor :math:`\mathbf{C}` depends only on two Lamé parameters: - :math:`\lambda` (first Lamé parameter, related to bulk modulus) - :math:`\mu` (shear modulus) The element stiffness matrix decomposes as: .. math:: \mathbf{K}_e = 2\mu \mathbf{G} + \lambda \mathbf{V} where :math:`\mathbf{G}` and :math:`\mathbf{V}` are geometry-only matrices computed once at construction time. This decomposition enables: 1. **Memory efficiency**: Store only 2 scalars per element instead of a full stiffness matrix 2. **Computational efficiency**: Avoid explicit matrix assembly and storage Creating the operator --------------------- .. code-block:: python # 2D operator grid_spacing = (0.1, 0.1) stiffness_op_2d = muGrid.IsotropicStiffnessOperator2D(grid_spacing) # 3D operator grid_spacing = (0.1, 0.1, 0.1) stiffness_op_3d = muGrid.IsotropicStiffnessOperator3D(grid_spacing) Material fields --------------- The operator requires two material fields containing the Lamé parameters at each element (pixel/voxel): .. code-block:: python # Create decomposition for element-based fields (one fewer grid point in each direction) # because elements are defined between nodes element_grid_pts = tuple(n - 1 for n in nb_grid_pts) element_decomposition = muGrid.CartesianDecomposition( comm, element_grid_pts, nb_subdivisions=subdivisions, nb_ghosts_left=(1,) * dim, nb_ghosts_right=(1,) * dim, ) # Create material fields with shape [nx-1, ny-1] (2D) or [nx-1, ny-1, nz-1] (3D) lambda_field = element_decomposition.real_field("lambda") mu_field = element_decomposition.real_field("mu") Computing Lamé parameters from engineering constants ---------------------------------------------------- .. code-block:: python def lame_parameters(E, nu): """ Compute Lamé parameters from Young's modulus E and Poisson's ratio nu. Parameters ---------- E : float Young's modulus nu : float Poisson's ratio Returns ------- lam : float First Lamé parameter (λ) mu : float Shear modulus (μ) """ lam = E * nu / ((1 + nu) * (1 - 2 * nu)) mu = E / (2 * (1 + nu)) return lam, mu # Example: heterogeneous material with matrix and inclusions E_matrix, E_inclusion = 1.0, 10.0 nu = 0.3 lam_matrix, mu_matrix = lame_parameters(E_matrix, nu) lam_inclusion, mu_inclusion = lame_parameters(E_inclusion, nu) # Set material properties based on phase field (0 = matrix, 1 = inclusion) lambda_field.p[...] = lam_matrix * (1 - phase) + lam_inclusion * phase mu_field.p[...] = mu_matrix * (1 - phase) + mu_inclusion * phase # Fill ghost regions (only needs to be done once) element_decomposition.communicate_ghosts(lambda_field) element_decomposition.communicate_ghosts(mu_field) Applying the operator --------------------- .. code-block:: python # Displacement and force fields on node-based grid # Shape: [dim, nx, ny] for 2D, [dim, nx, ny, nz] for 3D u_field = decomposition.real_field("displacement", (dim,)) f_field = decomposition.real_field("force", (dim,)) # Apply stiffness operator: f = K @ u decomposition.communicate_ghosts(u_field) stiffness_op.apply(u_field, lambda_field, mu_field, f_field) # Increment form: f += alpha * K @ u stiffness_op.apply_increment(u_field, lambda_field, mu_field, alpha, f_field) Complete linear elasticity example ---------------------------------- .. code-block:: python import numpy as np import muGrid from muGrid.Solvers import conjugate_gradients # Parameters nb_grid_pts = (32, 32) dim = 2 E_matrix, E_inclusion = 1.0, 10.0 nu = 0.3 # Compute Lamé parameters lam_matrix = E_matrix * nu / ((1 + nu) * (1 - 2 * nu)) mu_matrix = E_matrix / (2 * (1 + nu)) lam_inclusion = E_inclusion * nu / ((1 + nu) * (1 - 2 * nu)) mu_inclusion = E_inclusion / (2 * (1 + nu)) # Grid spacing grid_spacing = tuple(1.0 / n for n in nb_grid_pts) # Create operator stiffness_op = muGrid.IsotropicStiffnessOperator2D(grid_spacing) # Domain decomposition for nodal fields comm = muGrid.Communicator() decomposition = muGrid.CartesianDecomposition( comm, nb_grid_pts, nb_ghosts_left=(1, 1), nb_ghosts_right=(1, 1), ) # Domain decomposition for element fields element_decomposition = muGrid.CartesianDecomposition( comm, tuple(n - 1 for n in nb_grid_pts), nb_ghosts_left=(1, 1), nb_ghosts_right=(1, 1), ) # Create fields u_field = decomposition.real_field("displacement", (dim,)) f_field = decomposition.real_field("force", (dim,)) lambda_field = element_decomposition.real_field("lambda") mu_field = element_decomposition.real_field("mu") # Set up material (circular inclusion) coords = element_decomposition.coords X, Y = coords[0], coords[1] phase = ((X - 0.5)**2 + (Y - 0.5)**2 < 0.25**2).astype(float) lambda_field.p[...] = lam_matrix * (1 - phase) + lam_inclusion * phase mu_field.p[...] = mu_matrix * (1 - phase) + mu_inclusion * phase element_decomposition.communicate_ghosts(lambda_field) element_decomposition.communicate_ghosts(mu_field) # Stiffness operator wrapper for CG solver def apply_stiffness(u, f): decomposition.communicate_ghosts(u) stiffness_op.apply(u, lambda_field, mu_field, f) # Solve (with appropriate RHS setup for homogenization...) # conjugate_gradients(comm, decomposition, rhs, u_field, hessp=apply_stiffness, ...) Performance Comparison ********************** The fused ``IsotropicStiffnessOperator`` provides significant performance advantages over manually computing :math:`\mathbf{B}^T \mathbf{C} \mathbf{B}`: .. list-table:: :header-rows: 1 :widths: 40 20 40 * - Approach - Relative Speed - Memory per Element * - Full stiffness matrix - 1× (baseline) - 576 floats (3D) * - Separate B, C, B^T ops - 2-3× - Full C tensor + intermediates * - Fused isotropic operator - 5-10× - 2 floats (λ, μ) The speedup comes from: 1. **Reduced memory traffic**: Only read 2 material values per element instead of full tensors 2. **No intermediate storage**: Strain and stress computed on-the-fly 3. **Optimized kernels**: Hand-tuned CPU loops and GPU kernels 4. **Better cache utilization**: Predictable access patterns GPU Performance --------------- On GPUs, the fused operators show even greater advantages: - Atomic-free implementation using gather patterns - Shared memory optimization for cooperative node loading - High occupancy due to low register pressure Typical GPU speedups are 5-10× over unfused approaches on modern NVIDIA and AMD GPUs. When to Use Each Operator ************************* .. list-table:: :header-rows: 1 :widths: 30 70 * - Operator - Recommended Use * - ``GenericLinearOperator`` - Prototyping, custom stencils, uncommon PDEs * - ``FEMGradientOperator`` - Computing gradients/divergence, anisotropic materials * - ``LaplaceOperator`` - Scalar Poisson problems, diffusion equations * - ``IsotropicStiffnessOperator`` - Linear elasticity with isotropic materials For production code solving standard PDEs, always prefer the fused operators when available. They provide the best performance while maintaining numerical accuracy identical to the unfused approach.