Users:General FEM Analysis/Solvers Reference/Linear solvers
Contents |
Best practice
In this section all available linear solvers are explained and an example for the input syntax is given. In-house solvers are implemented as a back-up if no external solver library is available. This is why they are used in almost every benchmark example.
It is strongly recommended to use the MKL or TRILINOS package to reduce equation solving time. Best practice for standard problems on a personal computer:
PC-SOLVER 1: MKL_PARDISO DATA_STORAGE = IN_CORE ! IN_CORE OUT_OF_CORE SWAP BANDWITH = CUTHILL_MCKEE ! can even be switched of by commenting this line
For more details see the following sections with detailed explanations for all solver settings
In-house solvers
Carat++ provides two direct linear equation solvers that do not depend on any 3rd part library. Sparse matrix algebra and equation solving are part of the in-house software development.
Crout Skyline
Crout Skyline is one of the most frequently used solvers. It turned out to be the standard solver for Windows users or Linux users that do not use the TRILINOS software package. As the name already tells, it is a Crout matrix decomposition algorithm derived from Gaussian elimination. It transforms the matrix K into the form K = L D L' where L is a triangular and D a diagonal matrix. Sparse matrix storage is done by a skyline method, ^{[1]} which is not very memory efficient, but provides quick matrix data access.
A Crout Skyline solver is defined in the input file as follows:
PC-SOLVER 1: CROUT_SKYLINE BANDWITH = CUTHILL_MCKEE
The last line in the input block activates a bandwidth optimization based on the Cuthill-McKee algorithm. If a skyline storage scheme is used a bandwidth optimization is strictly recommended, because it reduces the memory overhead generated by the skyline method dramatically.
LDL
This was the first linear solver available in Carat++. It works on a Cholesky decomposition based Matrix factorization of the form K = L D L'. The LDL solver is outperformed by the Crout Skyline solver, but it is based on a compressed sparsity column scheme (CSC) ^{[2]} which is a memory efficient and general storage method for sparse matrices.
An input sequence is shown below.
PC-SOLVER 1: SANDIA_LDL BANDWITH = CUTHILL_MCKEE
Although the memory request of the CSC storage scheme does not depend on the matrix's bandwidth an optimization should be performed anyway if a direct solver is used as it reduces the fill-in produced by the equation solving algorithm.
MKL-solvers
CG
General Info:
- Conjugate Gradient Iterative Solver with optional Jacobi preconditioner
- The CG method may fail to compute the solution or compute the wrong solution if the matrix of the system is not symmetric and positive definite.
An input sequence is shown below:
PC-SOLVER 1 : MKL_CG MAX_NUM_ITER = 5000 REL_TOLERANCE = 1E-25 PRECOND=JACOBI ! PRECOND=NONE BANDWITH = CUTHILL_MCKEE
PARDISO
General Info:
- State of the art solver
- The Pardiso solver is able to solve any kind of problems. Symmetric, Unsymmetric, etc. Up to know the solver can be used for this two kind of problems.
- Solver can be run in different data storage modes
- IN_CORE: All data is kept in main memory (default)
- OUT_OF_CORE All data is written to hard disk (ooc_* files, stored in working directory)
- SWAP: Keeps data in-core until main memory size is exceeded.
An input sequence is shown below:
PC-SOLVER 1: MKL_PARDISO BANDWITH = CUTHILL_MCKEE MATRIX_TYPE = FULL !SYMMETRIC or FULL -> Default Vaule 'SYMMETRIC' DATA_STORAGE = IN_CORE ! IN_CORE OUT_OF_CORE SWAP
TRILINOS-solvers
TRILINOS offers two packages to solve large and sparse linear equation systems: Amesos, which is an interface package to direct solution algorithms, and Aztec00, providing different iterative solution methods. All TRILINOS solvers are based on a compressed row storage (CRS) ^{[3]} scheme which is very similar to the CSC method, inheriting all its drawbacks and advantages.
Amesos
Although Amesos already includes an equation solving algorithm it is not narrowed to this algorithm. Amesos provides an interface to several of the most famous and powerful open source equation solver available at this point of time. All this algorithms can be plugged in into TRILINOS and used via the same input interface.
The input syntax is shown below the explanations of the particular solvers.
Amesos KLU
KLU is Amesos' native solver, working on a pivoting LU-factorization. It performs in the range of the Crout Skyline solver.
Amesos UMFPACK
This is probably the most famous algorithm accessible via Amesos. UMFPACK appears as a built-in routine in MATLAB and is a set of routines for solving unsymmetric sparse linear systems. ^{[4]} It is a very powerful solver and yielded computation times that were smaller by a factor of 7 compared to KLU on bigger benchmarks (<100.000 degrees of freedom). The UMFPACK solver package has to be installed separately and linked via the TRILINOS build.
A drawback of the usage of UMFACK via Amesos is that Amesos adresses the 32-bit routines of UMFPACK which are only able to adress 2GB of memory.
Amesos SuperLU distributed
TRILINOS supports data management for sparse matrices on distributed memory architectues. For this reason it is common-sense to provide a distributed memory system supporting equation solver (KLU and UMFPACK can be used for parallel computing, too, but the whole data will be shifted to one machine to perform the solution in serial). The SuperLU algorithm provided Xiaoye Sherry Li ^{[5]} turned out to be a good choice for this purpose, as it is available for serial architectures as well as for shared and distributed memory systems.
PC-SOLVER 1: TRILINOS_AMESOS TYPE=KLU ! choose one of the following: KLU UMFPACK SUPER_LU_DIST BANDWITH = CUTHILL_MCKEE
Aztec00
Aztec00 is the package of TRILINOS which supports iterative equation solving. It provides several Krylow subspace solution methods such as GMRES, CG, CGS, BiCG, BiCGStab or TFQMR. As the current applications within Carat++ only deliver equation systems which are symmetric positive definite, the CG algorithm is selected by hard-code.
As preconditioning has got a heavy influence onto the convergence behaviour of a Krylow method, Aztec provides several preconditioning approaches usch as Jacobi preconditioning, the most simple approach, or preconditioning by Neumann or least squares polynomials. Incomplete Cholesky or LU factorization can be applied, too. In case of parallel computing, these methods will precondition each process independent from the other processes. So the convergence behaviour might depend on the number of processes involved.
The following section will present the input sequence and the paramters of the Aztec00 solver.
Compulsory Parameters | ||
Parameter | Values, Default(*) | Description |
---|---|---|
PC-SOLVER | int | Solver ID |
PC-SOLVER | TRILINOS_AZTEC00 | |
REL_TOLERANCE | real | break criteria |
MAX_NUM_ITER | int | maximum number of iterations to be performed |
PRECOND | NONE, JACOBI, NEUMANN, LS, DOMAIN_DECOMP | select a pre-conditioner |
SUBDOMAIN | ICC, LU | preconditioning on domain decomposition level by incomplete factorization, either Cholesky or LU. Select PRECOND = DOMAIN_DECOMP to use a sub-domain preconditioner. |
REORDER | 0,1 | determines is RCM reordering is used within incomplete factorization (1) or not (0) |
TYPE_OVERLAP | SYMMETRIC, STANDARD | determines how to treat preconditioning when different processes have computes different value for shared variables STANDARD: Only concider the information concidered by the hosting process |
OVERLAP | int | augment each sub-process matrix with rows/columns of the neighboring processes in int steps. |
GRAPH_FILL | 0,1,2 | controlls the fill-in produced by incomplete factorization preconditioning. Higher vales will lead to a better preconditionier, but request more memory |
KEEP_INFO | 0,1 | controlls if matrix information is kept in memory after the solving (1) or not(0) |
REUSE | 0,1,2 | reuse information from previous solution 0: Don't reuse anything |
OUTPUT | 0,-1,int | Give information about iteration progress 0: No information |
DIAGNOSTICS | 0,1 | Print out diagnostic data (1) or not (0) |
BANDWITH | CUTHILL_MCKEE, NONE | Choose the bandwidth optimization method |
PC-SOLVER 1 : TRILINOS_AZTECOO REL_TOLERANCE=1E-7 MAX_NUM_ITER=2000 PRECOND=JACOBI !choose: NONE JACOBI NEUMANN LS DOMAIN_DECOMP SUBDOMAIN_SOLVE=ICC !choose: ICC LU REORDER=1 TYPE_OVERLAP=SYMMETRIC OVERLAP=1 GRAPH_FILL=0 KEEP_INFO=0 REUSE=0 OUTPUT=0 DIAGNOSTICS=0 BANDWITH = CUTHILL_MCKEE
References
- ↑ http://www.netlib.org/linalg/html_templates/node96.html
- ↑ http://www.netlib.org/linalg/html_templates/node92.html#SECTION00931200000000000000
- ↑ http://www.netlib.org/linalg/html_templates/node91.html#SECTION00931100000000000000
- ↑ http://www.cise.ufl.edu/research/sparse/umfpack/
- ↑ http://crd.lbl.gov/~xiaoye/SuperLU/index.html
Whos here now: Members 0 Guests 1 Bots & Crawlers 0 |