Amesos2: Reindexing of matrix #13205
Comments
As an aside, I don't think you are required to have contiguous GIDs so long as you set the corresponding flag. See, for example: https://github.com/trilinos/Trilinos/blob/master/packages/amesos2/test/solvers/Superlu_UnitTests.cpp#L372. The one big caveat is that if you dump the matrix to Matrix Market format, you'll need to remap those GIDs to 1-based indices to comply with the format.

Script:

#!/usr/bin/env python3
import os.path
import sys
from itertools import (takewhile, repeat)
import argparse

def warn_on_missing_file(fileName):
    errString = f"File {fileName} does not exist!"
    print(errString)
    return

def warn_on_existing_file(fileName):
    errString = f"Cowardly refusing to overwrite file {fileName}.\n"
    errString += "Please use --overwrite to allow overwriting to occur, or alter the output matrix filename.\n"
    print(errString)
    return

def draw_progress_bar(percentage, process="", num_bins=20):
    num_draw = int(percentage * num_bins)
    sys.stdout.write('\r')
    sys.stdout.write(
        f"{process} : [{'='*num_draw}{' ' * (num_bins-num_draw)}] {int(100 * percentage)}%")
    sys.stdout.flush()
    return

def grab_line_count(filename):
    # Count newlines in 1 MiB chunks; the with-block closes the file afterwards.
    with open(filename, 'rb') as f:
        bufgen = takewhile(lambda x: x, (f.raw.read(1024*1024)
                                         for _ in repeat(None)))
        return sum(buf.count(b'\n') for buf in bufgen)

def read_matrix_file(matrixFileName):
    gids = {}   # dict used as an ordered set of row GIDs (first appearance)
    A_coo = {}  # (row GID, column GID) -> value, kept as the original string
    readSizeLine = False
    update_line_freq = 1000
    num_lines = grab_line_count(matrixFileName)
    header_content = ""
    with open(matrixFileName, "r") as matrixFile:
        for line_num, line in enumerate(matrixFile):
            if line_num % update_line_freq == 0:
                draw_progress_bar((line_num+1)/num_lines, "reading matrix file")
            # Comment lines are copied into the header unchanged.
            if "%" in line:
                header_content += line
                continue
            # The first non-comment line is the size line; keep it as-is.
            if not readSizeLine:
                header_content += line
                readSizeLine = True
                continue
            i, j, A_ij = line.split()
            i, j = int(i), int(j)
            A_coo[(i, j)] = A_ij
            gids[i] = None
    draw_progress_bar(1, "reading matrix file")
    print()
    # Assign new 1-based ids in order of first appearance.
    rows = {}
    row = 1
    for gid in gids.keys():
        rows[gid] = row
        row = row + 1
    return rows, A_coo, header_content

def write_matrix_file(rows, A_coo, header, outputMatrixFileName):
    update_line_freq = 1000
    newFileContents = header
    num_matrix_entries = len(A_coo)
    for mat_entry, ((gid_i, gid_j), A_ij) in enumerate(A_coo.items()):
        if mat_entry % update_line_freq == 0:
            draw_progress_bar((mat_entry+1)/num_matrix_entries,
                              "creating new matrix file")
        i = rows[gid_i]
        j = rows[gid_j]
        newFileContents += f"{i} {j} {A_ij}\n"
    draw_progress_bar(1, "creating new matrix file")
    print()
    with open(outputMatrixFileName, "w") as newMatrixFile:
        newMatrixFile.write(newFileContents)
    return

def reassign_matrix_rows(matrixFileName, newMatrixFileName):
    rows, A_coo, header = read_matrix_file(matrixFileName)
    write_matrix_file(rows, A_coo, header, newMatrixFileName)
    return

if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description='Re-assign row ids in matrix market file to force them to be contiguous and 1-base indexed')
    parser.add_argument('-i', '--input-matrix',
                        help='Input matrix-market file', required=True)
    parser.add_argument('-o', '--output-matrix',
                        help='Output matrix-market file', required=True)
    parser.add_argument('--overwrite', action='store_true')
    args = vars(parser.parse_args())
    inputMatrix = args['input_matrix']
    outputMatrix = args['output_matrix']
    overwrite = args['overwrite']
    allGood = True
    if not os.path.isfile(inputMatrix):
        warn_on_missing_file(inputMatrix)
        print("Please check your --input-matrix/-i option.")
        allGood = False
    if os.path.isfile(outputMatrix) and not overwrite:
        warn_on_existing_file(outputMatrix)
        allGood = False
    # Only touch any file if every check above passed.
    if not allGood:
        sys.exit(1)
    reassign_matrix_rows(inputMatrix, outputMatrix)
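
The renumbering idea behind the script can be seen on a handful of in-memory entries. The following is only a minimal sketch, not part of the posted script: the function name `renumber_one_based` and the sample entries are made up for illustration. One deliberate difference is that it registers column GIDs as well as row GIDs; the script as posted collects GIDs from the row index only, so a GID that appears solely as a column index would fail the lookup when writing.

```python
def renumber_one_based(entries):
    """Map arbitrary GIDs to 1..n in order of first appearance.

    `entries` is a list of (row_gid, col_gid, value) triplets.
    (Hypothetical helper, for illustration only.)
    """
    new_id = {}
    for i, j, _ in entries:
        # Register row and column GIDs alike, so a GID that only
        # ever appears as a column index still gets a new number.
        for gid in (i, j):
            if gid not in new_id:
                new_id[gid] = len(new_id) + 1
    remapped = [(new_id[i], new_id[j], a) for i, j, a in entries]
    return new_id, remapped


# Made-up entries with GIDs 0, 3, 6 (gaps of three).
entries = [(0, 0, 4.0), (0, 3, -1.0), (3, 3, 4.0), (6, 6, 4.0)]
new_id, remapped = renumber_one_based(entries)
print(new_id)    # {0: 1, 3: 2, 6: 3}
print(remapped)  # [(1, 1, 4.0), (1, 2, -1.0), (2, 2, 4.0), (3, 3, 4.0)]
```

Because the new ids follow order of first appearance rather than sorted order, the mapping `new_id` is worth keeping if the original GIDs need to be recovered later.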
@MalachiTimothyPhillips Interesting! I will try to set the flag.
@MalachiTimothyPhillips I checked the flag, but it's not helping in our case ... I still get the same memory errors if I don't reindex the problem ...
@maxfirmbach Can you reproduce the issue with a matrix market file of your matrix and some Trilinos executable?
@cgcgcg I'm pretty sure the Tpetra matrix market readers require contiguous GID numbering at the moment. That said, @maxfirmbach could dump the (non-contiguously numbered) matrix, use the conversion script to switch to 1-based contiguous GID numbering, read in the matrix, and then alter the GID ordering to something like
@maxfirmbach Do you have a stack trace or something for the error you are hitting? We've never had to re-index for our application code, and we have some very non-contiguous GIDs.
@cgcgcg @MalachiTimothyPhillips Seems like with the respective flag set to
#0 0x55b124a14643 in Tpetra::Map<int, int, Tpetra::KokkosCompat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> >::getComm() const
#1 0x55b1032bccb9 in Teuchos::RCP<Tpetra::Map<int, int, Tpetra::KokkosCompat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > const> Tpetra::Details::computeGatherMap<Tpetra::Map<int, int, Tpetra::KokkosCompat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > >(Teuchos::RCP<Tpetra::Map<int, int, Tpetra::KokkosCompat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > const>, Teuchos::RCP<Teuchos::basic_FancyOStream<char, std::char_traits<char> > > const&, bool)
#2 0x55b123dbbd55 in Teuchos::RCP<Tpetra::Map<int, int, Tpetra::KokkosCompat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > const> const Amesos2::Util::getDistributionMap<int, int, unsigned long, Tpetra::KokkosCompat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> >(Amesos2::EDistribution, unsigned long, Teuchos::RCP<Teuchos::Comm<int> const> const&, int, Teuchos::RCP<Tpetra::Map<int, int, Tpetra::KokkosCompat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > const> const&)
#3 0x55b123f77d16 in Amesos2::Umfpack<Epetra_CrsMatrix, Epetra_MultiVector>::loadA_impl(Amesos2::EPhase)
...
Maybe let's clarify what contiguous GID numbering means ... does it mean to have something like [0, 1, 2, 6, 7, 8, 12, 13, 14, ...]? Btw, I'm using
It's weird though ... a
@maxfirmbach Can you pinpoint the exact location of the error inside the Amesos2 source code? Maybe some
Contiguous GID numbering as in [0, 1, 2, ..., nrow-1] as opposed to, e.g., [0, 3, 6, ..., 3*(nrow-1)]. At least, that's my understanding. I should clarify that I've only ever tested using Amesos2 + Tpetra, so maybe the discontinuous numbering does not work as expected with Epetra?
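
Under the definition given in this comment, a quick check for contiguity could look like the sketch below. The helper name `is_contiguous` and its `base` argument are made up for illustration; this is plain Python and does not use any Trilinos API.

```python
def is_contiguous(gids, base=0):
    """True if the distinct GIDs are exactly base, base+1, ..., base+n-1.

    (Hypothetical helper, for illustration only.)
    """
    unique = sorted(set(gids))
    return unique == list(range(base, base + len(unique)))


print(is_contiguous([0, 1, 2, 3]))           # True
print(is_contiguous([0, 3, 6, 9]))           # False
print(is_contiguous([0, 1, 2, 6, 7, 8]))     # False
print(is_contiguous([1, 2, 3], base=1))      # True
```

By this definition, a numbering with jumps such as [0, 1, 2, 6, 7, 8, ...] is not contiguous, even though each block of three ids is gap-free.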
You are right. I do not see testing for non-contiguous GIDs with Epetra. I'll take a look.
@iyamazaki @MalachiTimothyPhillips Could be that there's a problem with Epetra ... I know this is an edge case, but a working version with Epetra would make my/our life easier when switching to Tpetra without having to port all packages at once.
@mayrmt Yes, I can track it down till
I can reproduce the error, so hopefully I can fix it. I think we create a Tpetra::Map, but then convert it to an Epetra::Map for the Epetra backend.
Basically, the issue is in Trilinos/packages/amesos2/src/Amesos2_EpetraRowMatrix_AbstractMatrixAdapter_def.hpp, lines 305 to 317 (at commit 9b42dd2).
We could return
Question
Dear Amesos2-developers,
currently I'm trying to switch our in-house code from Amesos to Amesos2.
As, let's call it, a pre-processing step, we do a reindexing of our linear problem to have a contiguous GID numbering from 0 ... n, using EpetraExt (which in our case is necessary as we have GID jumps). Is there a similar feature implemented in Xpetra or Tpetra?
I've seen that Amesos2 has a reindex input parameter, but looking into the code, it seems to do ... basically nothing? From documentation:
Are there any intentions to implement this feature properly into Amesos2?
Best regards,
Max
@trilinos/amesos2
@mayrmt
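
The reindexing step described in the question can be pictured as a pair of maps: one from the original GIDs to 0 ... n-1 before the solve, and one back afterwards. The sketch below is plain Python and is not how EpetraExt or Amesos2 implement it; the names `build_reindex_maps`, `reindex_entries` and `restore_solution`, as well as the sample data, are made up for illustration.

```python
def build_reindex_maps(gids):
    """Return (forward, inverse): GID -> 0-based index, and index -> GID.

    GIDs are numbered in sorted order. (Hypothetical helper.)
    """
    inverse = sorted(set(gids))
    forward = {gid: idx for idx, gid in enumerate(inverse)}
    return forward, inverse


def reindex_entries(entries, forward):
    """Rewrite (row_gid, col_gid, value) triplets with contiguous indices."""
    return [(forward[i], forward[j], a) for i, j, a in entries]


def restore_solution(x, inverse):
    """Attach the original GIDs to a solution vector in contiguous order."""
    return {inverse[idx]: value for idx, value in enumerate(x)}


# Made-up GIDs with jumps, as in the thread: [0, 1, 2, 6, 7, 8].
gids = [0, 1, 2, 6, 7, 8]
forward, inverse = build_reindex_maps(gids)
entries = [(0, 0, 2.0), (6, 2, -1.0), (8, 8, 2.0)]
print(reindex_entries(entries, forward))
# [(0, 0, 2.0), (3, 2, -1.0), (5, 5, 2.0)]

# Pretend this came back from a solver working on indices 0..5.
x = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
print(restore_solution(x, inverse)[6])  # 13.0
```

Keeping the inverse map around is what lets the rest of the application continue to use its own GIDs while the solver only ever sees the contiguous range.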