Implementing domain decomposition algorithms as solvers vs. preconditioners


Question

While learning parallel domain decomposition (DD) concepts, and subsequently becoming interested in their implementation, I keep running into the terms domain decomposition as a solver vs. as a preconditioner; see the excerpt below from "Numerical Solution of Partial Differential Equations on Parallel Computers" (Bruaset 2006):

Since the Schwarz methods introduced in Chapters 1 and 2 represent fixed-point iterations applied to preconditioned global problems, and consequently do not provide the fastest convergence possible, it is natural to apply Krylov methods instead. This provides the justification for using Schwarz methods as preconditioners rather than solvers.

Used as a solver, at least the parallel Schwarz method is quite clear (see the image below from this talk): one just solves in parallel on each subdomain and then exchanges boundary data. But how are parallel DD methods used as preconditioners for, say, the conjugate gradient method? And wouldn't the global synchronization required to apply the preconditioner at each step of the preconditioned conjugate gradient algorithm be a major computational bottleneck?

[Image from the talk: parallel Schwarz iteration, with independent subdomain solves followed by an exchange of boundary data.]
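
To make the solver viewpoint concrete, below is a minimal serial sketch (my own illustration, not code from the talk) of the parallel Schwarz method used as a solver on a 1D Poisson model problem with two overlapping subdomains. In an MPI implementation each local solve would run on its own rank, and writing the owned values back into the global iterate is the boundary-data exchange.

import numpy as np

# 1D Poisson: -u'' = 1 on (0, 1), u(0) = u(1) = 0, n interior points.
n = 60
h = 1.0 / (n + 1)
A = (np.diag(2.0 * np.ones(n))
     - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2
b = np.ones(n)

overlap = 5
subdomains = [np.arange(0, n // 2 + overlap),    # left overlapping subdomain
              np.arange(n // 2 - overlap, n)]    # right overlapping subdomain
# Non-overlapping "owned" indices used when writing local solutions back globally.
owned = [np.arange(0, n // 2), np.arange(n // 2, n)]

# Schwarz as a SOLVER: a fixed-point iteration in which every subdomain solves a
# local Dirichlet problem using the previous iterate as artificial boundary data.
x = np.zeros(n)
for it in range(200):
    x_old = x.copy()
    for idx, own in zip(subdomains, owned):
        other = np.setdiff1d(np.arange(n), idx)
        # Couplings to points outside the subdomain supply the boundary data.
        rhs_loc = b[idx] - A[np.ix_(idx, other)] @ x_old[other]
        u_loc = np.linalg.solve(A[np.ix_(idx, idx)], rhs_loc)
        # "Exchange boundary data": keep only the values this subdomain owns.
        x[own] = u_loc[np.isin(idx, own)]
    if np.linalg.norm(b - A @ x) < 1e-8 * np.linalg.norm(b):
        break

print("Schwarz-as-solver iterations:", it + 1)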

Attempt

I think the idea is that you solve locally on each subdomain and then reduce-sum these solutions to get the preconditioner (in this case, the preconditioner M_inverse can be saved) for each step of the conjugate gradient algorithm acting on the global problem. However, this seems odd, because in more complicated algorithms, e.g., balancing domain decomposition by constraints (BDDC), one would have to globally synchronize at each step of the conjugate gradient algorithm. Lastly, in solving the local problems you could use a preconditioner for those local linear solvers as well, so you end up with a sort of nested preconditioning algorithm (see the end for pseudocode that perhaps clarifies this).

Definition of the restricted additive Schwarz (RAS) method from "An Introduction to Domain Decomposition Methods: Algorithms, Theory, and Parallel Implementation" (Dolean 2016):

[Image: definition of the RAS preconditioner, M_RAS^{-1} = sum_{i=1}^N R_i^T D_i (R_i A R_i^T)^{-1} R_i, where R_i restricts to overlapping subdomain i and the D_i form a Boolean partition of unity.]
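
Since the definition is only shown as an image, here is a minimal serial sketch of applying the RAS operator above to a vector, reusing the overlapping index sets and the Boolean partition of unity ("owned" indices) from the previous sketch. This is my own notation, not Listing 3.2 from the book.

import numpy as np

def apply_ras(A, y, subdomains, owned):
    """Apply M_RAS^{-1} y = sum_i R_i^T D_i (R_i A R_i^T)^{-1} R_i y.

    subdomains[i]: overlapping index set of subdomain i (defines R_i).
    owned[i]:      non-overlapping subset of subdomains[i]; the Boolean
                   partition-of-unity matrix D_i keeps only these entries.
    """
    z = np.zeros_like(y)
    for idx, own in zip(subdomains, owned):
        A_i = A[np.ix_(idx, idx)]            # R_i A R_i^T
        u_i = np.linalg.solve(A_i, y[idx])   # local solve on R_i y
        keep = np.isin(idx, own)             # action of D_i
        z[idx[keep]] += u_i[keep]            # sum of R_i^T D_i u_i over i
    return z

The local solves are independent, so in parallel each one runs on the process owning that subdomain; communication is only needed to gather y on the overlap and to scatter the weighted local solutions back.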

Pseudocode for RAS as the preconditioner in a preconditioned conjugate gradient method controlled by the master process:

def RAS(A_global, b_global, subdomain_indices):
  # Listing 3.2 from "An Introduction to Domain Decomposition Methods Algorithms, 
  # Theory, and Parallel Implementation" (Dolean 2016)

  # Get this process's subdomain info
  subdomain_id = get_partition_id() # e.g., the MPI rank
  local = subdomain_indices[subdomain_id]
  A_subdomain = A_global[local][:, local] # restrict rows AND columns
  b_subdomain = b_global[local]

  # local solve can ALSO have a preconditioner
  jacobi_preconditioner_subdomain = diagonal(A_subdomain)
  x_subdomain = linear_solver(
    A_subdomain, b_subdomain, preconditioner=jacobi_preconditioner_subdomain)

  # add the solutions on each subdomain 
  M_inverse = reduce_sum(x_subdomain) # using MPI, this is a global sync step

  return M_inverse


def preconditioned_conjugate_gradient(
  A_global, b_global, x0, subdomain_indices, n_iters):
  # Convergence/iteration checks of PCG occur on the master process,
  # and its preconditioner is the RAS
  # Algorithm 11.2 from "Scientific Computing An Introductory Survey" (Heath 2018)

  # initial guess and residual
  xk = x0.copy()
  rk = b_global - A_global @ xk

  # Get the preconditioner via RAS and save it so that it may be reused at each
  # step of the PCG algorithm. NOTE: saving M_inverse like this is not always
  # possible, and applying the preconditioner anew at each step is the more
  # general algorithm!
  M_inverse = RAS(A_global, b_global, subdomain_indices)

  # apply the preconditioner to the initial residual
  sk = M_inverse @ rk

  # CG iterations
  for k in range(n_iters):
    alpha_k = (rk.T @ sk)/(sk.T @ A_global @ sk)

    # x_{k+1}
    xk += alpha_k*sk

    # r_{k+1}
    rkp1 = rk - alpha_k*(A_global @ sk)

    betakp1 = (rkp1.T @ M_inverse @ rkp1)/(rk.T @ M_inverse @ rk)

    # apply the preconditioner again to get s_{k+1}
    sk = M_inverse @ rkp1 + betakp1*sk

    # update rk
    rk = rkp1

  return xk
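
For contrast with the pseudocode above: in the more general setting referenced in the capitalized note, nothing resembling M_inverse is saved at all. PCG simply calls a routine that applies M^{-1} to the current residual once per iteration (for RAS, that call is the parallel local solves plus the combine step). Below is a minimal serial sketch along those lines; the names are mine, not Heath's or Dolean's, and apply_ras refers to the sketch given earlier.

import numpy as np

def pcg(A, b, apply_Minv, tol=1e-8, max_iters=500):
    """Preconditioned CG with the preconditioner given as a callable:
    apply_Minv(r) returns (an approximation to) M^{-1} r."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = apply_Minv(r)              # preconditioner applied to the initial residual
    p = z.copy()
    rz = r @ z
    for k in range(max_iters):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = apply_Minv(r)          # applied again at EVERY iteration
        rz_new = r @ z
        p = z + (rz_new / rz) * p  # beta_{k+1} = (r_{k+1}, z_{k+1}) / (r_k, z_k)
        rz = rz_new
    return x

# Usage with the RAS application sketched earlier (same 1D Poisson test problem):
# x = pcg(A, b, lambda r: apply_ras(A, r, subdomains, owned))

One caveat: M_RAS^{-1} is not symmetric, so with CG one typically uses the symmetric additive Schwarz variant instead (or switches to GMRES/BiCGStab); RAS is kept here only to stay consistent with the definition quoted above.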

Answer

Jared Frazier (accepted answer):

Based on reading "A Preconditioner for Substructuring Based on Constrained Energy Minimization" (Dohrmann 2006) and an implementation of "A method of Finite Element Tearing and Interconnecting (FETI) and its parallel solution algorithm" (Farhat 1991) available through AMfeti, I think the use of domain decomposition methods as preconditioners vs. as solvers essentially comes down to the following: after splitting the domain into pieces (known as substructures), you form a preconditioner on each subdomain (i.e., the result s of multiplying a preconditioning matrix M by a residual r, as in s = inv(M) @ r), use an iterative solver on that subdomain, and then ensure that the solutions at the boundaries of the subdomains don't "conflict" somehow. Getting the solutions at the subdomain boundaries not to "conflict" seems to be where the domain decomposition methods differ (e.g., FETI uses Lagrange multipliers).

So, in essence, a domain decomposition method can be a solver in its own right, OR the decomposition can supply a preconditioner on each subdomain to be used in conjunction with an iterative solver (e.g., preconditioned conjugate gradient). The figure below graphically summarizes the use of domain decomposition methods as per-subdomain preconditioners: separate iterative solvers generate the subdomain solutions u_{i}, and the boundary conflicts are subsequently resolved. Resolving the boundary conflicts enforces a global solution on the entire domain, and the process continues until some convergence criterion is met.

[Figure: per-subdomain preconditioners feed separate iterative solvers that produce the local solutions u_i; the boundary conflicts are then resolved to enforce a global solution.]
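
As a tiny illustration of what an interface "conflict" is (the numbers below are made up for illustration, not taken from the cited papers): each subdomain carries its own copy of the shared interface unknown, and a signed Boolean "jump" operator B measures the mismatch between those copies; FETI drives this jump to zero with Lagrange multipliers.

import numpy as np

# Two subdomains sharing one interface node; each has 4 local dofs and its own
# copy of the interface unknown (local dof 3 on subdomain 1, local dof 0 on 2).
u1 = np.array([0.1, 0.3, 0.6, 0.9])   # local solution on subdomain 1
u2 = np.array([1.1, 0.8, 0.5, 0.2])   # local solution on subdomain 2

# Signed Boolean jump operator B = [B1, B2]: it picks out the two interface
# copies with opposite signs, so B @ u measures the interface mismatch.
B1 = np.zeros((1, 4)); B1[0, 3] = +1.0
B2 = np.zeros((1, 4)); B2[0, 0] = -1.0
jump = B1 @ u1 + B2 @ u2               # nonzero means the local solutions conflict

print("interface jump:", jump)         # FETI enforces B u = 0 via Lagrange multipliers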

One last verification that the preconditioner is different on each subdomain (which, in a distributed setting, might each live on a separate compute node) comes from "PETSc for Partial Differential Equations: Numerical Solutions in C and Python" (Bueler 2021); I include it just as a concrete corroboration of the claims in this answer. The excerpt below shows bjacobi for block Jacobi and asm for the additive Schwarz method, respectively:

[Excerpt from the book describing the PETSc preconditioner types bjacobi (block Jacobi) and asm (additive Schwarz).]
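
For completeness, here is a small petsc4py sketch of the same idea (my illustration, not code from the book): CG as the global Krylov method with an additive Schwarz (or block Jacobi) preconditioner, where each process's subdomain solve can be configured separately, e.g. through the -sub_ksp_type and -sub_pc_type options.

from petsc4py import PETSc

# Distributed 1D Laplacian assembled row-by-row on each rank's owned rows.
n = 1000
A = PETSc.Mat().create()
A.setSizes([n, n])
A.setType(PETSc.Mat.Type.AIJ)
A.setUp()
rstart, rend = A.getOwnershipRange()
for i in range(rstart, rend):
    A.setValue(i, i, 2.0)
    if i > 0:
        A.setValue(i, i - 1, -1.0)
    if i < n - 1:
        A.setValue(i, i + 1, -1.0)
A.assemble()

x = A.createVecRight()
b = A.createVecLeft()
b.set(1.0)

ksp = PETSc.KSP().create()
ksp.setOperators(A)
ksp.setType(PETSc.KSP.Type.CG)       # global Krylov method
pc = ksp.getPC()
pc.setType(PETSc.PC.Type.ASM)        # additive Schwarz; PETSc's ASM defaults to the
                                     # restricted variant (-pc_asm_type basic keeps
                                     # it symmetric for CG)
# pc.setType(PETSc.PC.Type.BJACOBI)  # or block Jacobi, as in the excerpt
ksp.setFromOptions()                 # e.g. -sub_ksp_type preonly -sub_pc_type ilu
ksp.solve(b, x)
print("rank", PETSc.COMM_WORLD.getRank(), "iterations:", ksp.getIterationNumber())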