I'm new to Haskell. As a part of an academic course, I was requested to implement a function in Haskell that calculates the determinant, rank and inverse matrix of a given matrix.
I use gaussian elimination (performing same row operations on both the original matrix and another matrix which is initialized as the identity matrix). I'm using a one-pass approach to accumulate to rank and determinant "on the fly".
The function behaves as expected for input matrices up to 600x600 (execution time in this case is 63 seconds). Any input above that size causes excessive memory usage and impractical computation time.
The inputs are 3 different matrices, sizes 800x800, 800x800 and 1000x1000.
Worth mentioning I am not allowed to use any external libraries (such as HMatrix or Data.Matrix)
I did my best to optimize the code, but since I'm new I wasn't successful at making my code work for sizes larger than 600x600.
import Data.Time.Clock
import System.IO
type Matrix = [[Double]]
-- Row operations
swapRows :: Int -> Int -> Matrix -> Matrix
swapRows i j mat = take i mat ++ [mat !! j] ++ (drop (i+1) . take j $ mat) ++ [mat !! i] ++ drop (j+1) mat
scaleRow :: Int -> Double -> Matrix -> Matrix
scaleRow i k mat = take i mat ++ [map (* k) (mat !! i)] ++ drop (i+1) mat
addRows :: Int -> Int -> Double -> Matrix -> Matrix
addRows i j k mat = take i mat ++ [zipWith (+) (mat !! i) (map (* k) (mat !! j))] ++ drop (i+1) mat
-- Gaussian elimination
eliminate :: Matrix -> (Matrix, Matrix, Double, Int)
eliminate mat = foldr eliminateRow (mat, identityMatrix (length mat), 1.0, rank) [0..length mat-1]
where
rank = length mat -- Initialize rank as the number of rows
eliminateRow row (mat, invMat, detAccum, rank) = foldl eliminateEntry (mat, invMat, detAccum, rank) [0..length mat-1]
where
eliminateEntry (mat, invMat, detAccum, rank) col
| row == col = (scaleRow row (1 / pivot) mat, scaleRow row (1 / pivot) invMat, detAccum * pivot, rank)
| mat !! col !! row /= 0 = (addRows col row (-ratio) mat, addRows col row (-ratio) invMat, detAccum, rank)
| otherwise = (mat, invMat, detAccum, rank - 1)
where
pivot = mat !! row !! row
ratio = (mat !! col !! row) / pivot
-- Matrix operations
identityMatrix :: Int -> Matrix
identityMatrix n = [[fromIntegral (fromEnum (i == j)) | j <- [1..n]] | i <- [1..n]]
-- create sub matrix n x n from matrix mat
subMatrix :: [[Double]] -> Int -> [[Double]]
subMatrix mat n = take n (map (take n) mat)
-- Testing
main :: IO ()
main = do
--let submat = [[1, 2], [3, 4]] -- 3x3 invertible matrix
contenm_headMulScal <- readFile "det_matrix(800 x 800).txt"
let mat = map (map read . words) (lines contenm_headMulScal) :: [[Double]]
let piece_dim = 600
let submat = subMatrix mat piece_dim
tt_start <- getCurrentTime
let (refMat, inverse, determinant, rank) = eliminate submat -- Calculate the ref matrix, determinant, and rank
t_end <- getCurrentTime
--putStrLn "Original Matrix:"
--printMatrix submat
putStrLn "\nDeterminant:"
print determinant
putStrLn "\nInverse Matrix:"
--printMatrix inverse
putStrLn $ "First element in the inverse matrix: " ++ show (head (head inverse))
putStrLn "\nRank:"
print rank
tt_end <- getCurrentTime
print "Runtime:"
print (diffUTCTime tt_end tt_start)
printMatrix :: Matrix -> IO ()
printMatrix mat = mapM_ (putStrLn . unwords . map show) mat
For example, I take a "piece" of 600x600 out of the 800x800 input as you can see. It works. take a larger piece (700x700, or all the input) and it become impractical - about an hour of calculation in which the computer is totally stuck due to excessive memory usage.
Thanks to Daniel Wagner for pointing out: For folks who want to play along at home, try:
countingMatrix :: Int -> Matrix
countingMatrix n = [[fromIntegral (j*n+i) | j <- [1..n]] | i <-
[1..n]]
in place of the matrix loaded from disk.
I would appreciate any suggestions.
Looks like there's a space leak/laziness issue. I don't blame you: I found it surprisingly finicky to get the evaluation behavior I wanted, even when I was already pretty sure what the problem was!
The thing you want to make sure of is that when making a new matrix, you don't hold onto any references to the old matrix. If you do, then the GC can't collect the old matrix. In particular, references include latent calculations like
take i oldMatrixoroldMatrix !! ior similar. So! Let's discuss the changes I had to make to get this to happen.First: when we make a new matrix that's mostly a copy of rows from the old matrix, but one or two new rows, we'll want a way to say "walk through the entire matrix, and force enough of its computation that you copy a pointer to a specific row, rather than a calculation that looks up that row in the old matrix later". To get this, we'll make a function that forces each element of a list, but only to WHNF. Note that, since we're passing this a matrix, this isn't a full evaluation! We don't force all the way down to matrix elements, only down to the level of matrix rows. On my first go I got this wrong, forcing all the way down to the element level, and inadvertently introduced a very serious time performance regression.
We will call this at the start of each of the row operations.
This forms the "base case" of our manual forcing annotations. Unfortunately, we now need to propagate this all the way up the expression tree -- each caller needs to make sure that any data structures it makes with one of these in a field also ties evaluation of that data structure to evaluation of the field. In
eliminateEntry, that means we need a stricter version of quadruples.Finally, in
eliminateRow, I recommend switching fromfoldltofoldl'. It doesn't appear to make a difference in this case in my tests, but it's a good habit to get into; the extra laziness thatfoldlprovides is almost never needed and frequently detrimental.With these changes, I now see a much lower memory usage: ~50s, 96MiB for the 600x600 matrix and ~128s, 167MiB for the 800x800 matrix.
There are likely opportunities for significant further optimization, if that turns out to be needed; e.g. I expect switching to a mutable array representation for your matrix would be another big boost.