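I have a distributed matrix stored in block-column format. I know I could reshape the matrix into one long vector and use an all_gatherv operation, but I would like to avoid the hassle of reshaping the matrix in my code. So I am wondering whether there is an MPI all-gather operation for matrices, so that in the end every processor holds an exact copy of the full matrix.
Is there an MPI All Gather operation for matrices?
Computational Science
matrix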
mpi
2021-12-06 01:17:02
2 Answers
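There are many possible solutions; I suggest you use MPI_Type_struct() (called MPI_Type_create_struct() in current MPI; the older name was removed in MPI-3.0). If you have several matrices of different sizes, you will need to commit a new datatype for each matrix (a datatype's size is static) or consider a more flexible approach.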
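One possible reading of "a more flexible approach" is the reshaping the question mentions: copy the locally owned columns into a contiguous buffer and exchange plain MPI_FLOAT values, so that no committed datatype is tied to the matrix size. The sketch below shows only the packing and unpacking, without any MPI calls. It assumes a row-major matrix stored contiguously; the helper names pack_columns and unpack_columns are hypothetical, not part of MPI.

```c
#include <assert.h>
#include <stddef.h>

/* Copy columns [start, start + ncols) of a row-major nrows x globalncols
   matrix into buf, one whole column after another (column-major order).
   buf must hold at least nrows * ncols floats. */
void pack_columns(const float *matrix, int nrows, int globalncols,
                  int start, int ncols, float *buf)
{
    assert(start >= 0 && ncols >= 0 && start + ncols <= globalncols);
    size_t k = 0;
    for (int col = start; col < start + ncols; col++)
        for (int row = 0; row < nrows; row++)
            buf[k++] = matrix[(size_t)row * globalncols + col];
}

/* Inverse of pack_columns: scatter buf back into the same columns. */
void unpack_columns(const float *buf, int nrows, int globalncols,
                    int start, int ncols, float *matrix)
{
    assert(start >= 0 && ncols >= 0 && start + ncols <= globalncols);
    size_t k = 0;
    for (int col = start; col < start + ncols; col++)
        for (int row = 0; row < nrows; row++)
            matrix[(size_t)row * globalncols + col] = buf[k++];
}
```

With this layout, each rank would contribute nrows * ncols floats at displacement nrows * start in an MPI_Allgatherv on MPI_FLOAT. The gathered buffer is then the full matrix in column-major order, and a single unpack_columns call with start 0 and ncols equal to globalncols restores the row-major matrix. The cost is one extra copy on each side of the communication.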
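If this is a dense matrix, it is straightforward: you use MPI_Type_create_subarray or something similar to define a single column as a datatype (you could build it yourself with MPI_Type_vector or anything else; MPI_Type_struct() is pretty much the most general option). (In Fortran you don't even have to do this, since you can simply send (nrows) values at a time, but you would hit the same problem if you wanted to send rows.) You then need to resize the datatype so that it "starts" and "ends" in the right place, and you are ready to start allgatherv'ing in units of columns: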
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

void printMatrix (float **m, int rows, int cols)
{
    for (int i = 0; i < rows; ++i) {
        printf ("%3d: ", i);
        for (int j = 0; j < cols; ++j)
            printf ("%2.0f ", m[i][j]);
        printf ("\n");
    }
}

/* allocate a rows x cols matrix as one contiguous block plus row pointers */
float **allocMat (int rows, int cols)
{
    float *data = (float *) malloc (rows * cols * sizeof(float));
    float **matrix = (float **) malloc (rows * sizeof(float *));
    for (int i = 0; i < rows; i++)
        matrix[i] = & (data[i * cols]);
    return matrix;
}

int main (int argc, char *argv[])
{
    int size, rank;
    int i, j;
    const int root = 0;
    const int globalncols = 10, globalnrows = 10;
    int ncols, start;
    int *allncols, *allstarts;
    float **matrix;
    MPI_Datatype columnunsized, column;

    MPI_Init (&argc, &argv);
    MPI_Comm_size (MPI_COMM_WORLD, &size);
    MPI_Comm_rank (MPI_COMM_WORLD, &rank);

    /* everyone's number of columns and offsets: one entry per process,
       so the arrays need `size` elements (not `rank`) */
    allncols = malloc(size * sizeof(int));
    allstarts = malloc(size * sizeof(int));

    /* everyone gets a global matrix */
    matrix = allocMat(globalnrows, globalncols);
    for (i = 0; i < globalnrows; i++)
        for (j = 0; j < globalncols; j++)
            matrix[i][j] = ( i == j ? 1. : 0.);

    /* rank 0 prints the initial matrix */
    if (rank == 0) {
        printf("Before:\n");
        printMatrix(matrix, globalnrows, globalncols);
    }

    /* how many columns are we responsible for? */
    ncols = (globalncols + rank)/size;
    MPI_Allgather(&ncols, 1, MPI_INT, allncols, 1, MPI_INT, MPI_COMM_WORLD);

    /* our first column is the sum of the counts of all lower ranks */
    start = 0;
    for (int i=0; i<rank; i++)
        start += allncols[i];
    MPI_Allgather(&start, 1, MPI_INT, allstarts, 1, MPI_INT, MPI_COMM_WORLD);

    /* create the data type for a column of data */
    int sizes[2] = {globalnrows, globalncols};
    int subsizes[2] = {globalnrows, 1};
    int starts[2] = {0,0};
    MPI_Type_create_subarray (2, sizes, subsizes, starts, MPI_ORDER_C,
                              MPI_FLOAT, &columnunsized);
    /* shrink the extent to one float so consecutive columns start one
       element apart */
    MPI_Type_create_resized (columnunsized, 0, sizeof(float), &column);
    MPI_Type_commit(&column);
    MPI_Type_free (&columnunsized);

    /* everyone update their columns by adding their rank to all values */
    for (int row=0; row<globalnrows; row++)
        for (int col=start; col<start+ncols; col++)
            matrix[row][col] += rank;

    /* gather the updated columns; each rank's own columns already sit in
       the right place in its receive buffer, so use MPI_IN_PLACE instead
       of overlapping send and receive buffers, which MPI does not allow */
    MPI_Allgatherv(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL,
                   &(matrix[0][0]), allncols, allstarts,
                   column, MPI_COMM_WORLD);

    /* rank 0 prints the results */
    if (rank == 0) {
        printf("After:\n");
        printMatrix(matrix, globalnrows, globalncols);
    }

    MPI_Type_free (&column);
    free (allncols);
    free (allstarts);
    free (matrix[0]);
    free (matrix);
    MPI_Finalize();
    return 0;
}
Running:
$ mpirun -np 4 ./columns2
Before:
  0:  1  0  0  0  0  0  0  0  0  0
  1:  0  1  0  0  0  0  0  0  0  0
  2:  0  0  1  0  0  0  0  0  0  0
  3:  0  0  0  1  0  0  0  0  0  0
  4:  0  0  0  0  1  0  0  0  0  0
  5:  0  0  0  0  0  1  0  0  0  0
  6:  0  0  0  0  0  0  1  0  0  0
  7:  0  0  0  0  0  0  0  1  0  0
  8:  0  0  0  0  0  0  0  0  1  0
  9:  0  0  0  0  0  0  0  0  0  1
After:
  0:  1  0  1  1  2  2  2  3  3  3
  1:  0  1  1  1  2  2  2  3  3  3
  2:  0  0  2  1  2  2  2  3  3  3
  3:  0  0  1  2  2  2  2  3  3  3
  4:  0  0  1  1  3  2  2  3  3  3
  5:  0  0  1  1  2  3  2  3  3  3
  6:  0  0  1  1  2  2  3  3  3  3
  7:  0  0  1  1  2  2  2  4  3  3
  8:  0  0  1  1  2  2  2  3  4  3
  9:  0  0  1  1  2  2  2  3  3  4
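To see why the column type is resized to an extent of one float, it may help to emulate the gather serially. In a row-major matrix, column c begins c floats after the start of the storage, and its successive elements are globalncols floats apart. With an extent of one float, the displacements passed to MPI_Allgatherv are therefore plain column indices. The sketch below is a serial stand-in and makes no MPI calls. It assumes contiguous row-major storage, as allocMat provides, and reuses the (globalncols + rank)/size rule from the program; all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Number of columns owned by `rank`, using the same rule as the program. */
int cols_for_rank(int globalncols, int rank, int size)
{
    assert(size > 0 && rank >= 0 && rank < size);
    return (globalncols + rank) / size;
}

/* First column owned by `rank`: the sum of the counts of all lower ranks. */
int start_for_rank(int globalncols, int rank, int size)
{
    int start = 0;
    for (int r = 0; r < rank; r++)
        start += cols_for_rank(globalncols, r, size);
    return start;
}

/* Offset, in floats from the first matrix element, of element `row` of
   column `col`: the column starts `col` floats in (the resized extent)
   and its successive elements are `globalncols` floats apart. */
size_t column_element_offset(int row, int col, int globalncols)
{
    return (size_t)col + (size_t)row * (size_t)globalncols;
}

/* Serial stand-in for the column-wise gather: src[r] is rank r's full-size
   matrix, and only the columns that rank r owns are copied into dst. */
void emulate_column_allgatherv(float *dst, float **src, int size,
                               int globalnrows, int globalncols)
{
    for (int r = 0; r < size; r++) {
        int start = start_for_rank(globalncols, r, size);
        int ncols = cols_for_rank(globalncols, r, size);
        for (int col = start; col < start + ncols; col++)
            for (int row = 0; row < globalnrows; row++) {
                size_t k = column_element_offset(row, col, globalncols);
                dst[k] = src[r][k];
            }
    }
}
```

For 4 ranks and 10 columns this rule gives counts 2, 2, 3, 3 and starts 0, 2, 4, 7, which matches the column blocks visible in the "After" output. Because the rule depends only on the rank and the communicator size, every process could also compute all counts and starts locally instead of exchanging them.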