Nektar++
Nektar::MultiRegions::AssemblyCommDG Class Reference

Implements communication for populating forward and backwards spaces across processors in the discontinuous Galerkin routines.

#include <AssemblyCommDG.h>

Public Member Functions

 ~AssemblyCommDG ()=default
 Default destructor.
 
 AssemblyCommDG (const ExpList &locExp, const ExpListSharedPtr &trace, const Array< OneD, Array< OneD, LocalRegions::ExpansionSharedPtr >> &elmtToTrace, const Array< OneD, const ExpListSharedPtr > &bndCondExp, const Array< OneD, const SpatialDomains::BoundaryConditionShPtr > &bndCond, const PeriodicMap &perMap)
 
void PerformExchange (const Array< OneD, NekDouble > &testFwd, Array< OneD, NekDouble > &testBwd)
 Perform the trace exchange between processors, given the forwards and backwards spaces.
 

Private Member Functions

void InitialiseStructure (const ExpList &locExp, const ExpListSharedPtr &trace, const Array< OneD, Array< OneD, LocalRegions::ExpansionSharedPtr >> &elmtToTrace, const Array< OneD, const ExpListSharedPtr > &bndCondExp, const Array< OneD, const SpatialDomains::BoundaryConditionShPtr > &bndCond, const PeriodicMap &perMap, const LibUtilities::CommSharedPtr &comm)
 Initialises the structure for the MPI communication.
 

Static Private Member Functions

static std::tuple< NekDouble, NekDouble, NekDouble > Timing (const LibUtilities::CommSharedPtr &comm, const int &count, const int &num, const ExchangeMethodSharedPtr &f)
 Timing of the MPI exchange method.
 

Private Attributes

ExchangeMethodSharedPtr m_exchange
 Chosen exchange method (either fastest parallel or serial).
 
int m_maxQuad = 0
 Max number of quadrature points in an element.
 
int m_nRanks = 0
 Number of ranks/processes/partitions.
 
std::map< int, std::vector< int > > m_rankSharedEdges
 Map of process to shared edge IDs.
 
std::map< int, std::vector< int > > m_edgeToTrace
 Map of edge ID to quad point trace indices.
 

Detailed Description

Implements communication for populating forward and backwards spaces across processors in the discontinuous Galerkin routines.

The AssemblyCommDG class constructs various exchange methods for performing the action of communicating trace data from the forwards space of one processor to the backwards space of the corresponding neighbour element, and vice versa.

This class initialises the structure for all exchange methods and then times each of them to determine the fastest method for the particular system configuration. If running in serial, it assigns the Serial exchange method instead and skips the timing. It then acts as a pass-through to the chosen exchange method for the PerformExchange function.
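The selection step described above can be sketched in isolation. This is a minimal illustration, not the Nektar++ implementation: `TimedMethod` and `PickFastest` are hypothetical names, and the average times are supplied directly rather than measured over MPI.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for a timed candidate; not a Nektar++ type.
struct TimedMethod
{
    std::string name;  // label of the exchange method
    double avgSeconds; // average time per exchange, already measured
};

// Return the index of the candidate with the smallest average time.
std::size_t PickFastest(const std::vector<TimedMethod> &candidates)
{
    assert(!candidates.empty());
    auto it = std::min_element(
        candidates.begin(), candidates.end(),
        [](const TimedMethod &a, const TimedMethod &b)
        { return a.avgSeconds < b.avgSeconds; });
    return static_cast<std::size_t>(std::distance(candidates.begin(), it));
}
```

Choosing on the average across ranks, rather than on the minimum or maximum, mirrors what the constructor listing does with `std::min_element` over `avg`.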

Definition at line 244 of file AssemblyCommDG.h.

Constructor & Destructor Documentation

◆ ~AssemblyCommDG()

Nektar::MultiRegions::AssemblyCommDG::~AssemblyCommDG ( )
default

Default destructor.

◆ AssemblyCommDG()

Nektar::MultiRegions::AssemblyCommDG::AssemblyCommDG ( const ExpList &  locExp,
const ExpListSharedPtr &  trace,
const Array< OneD, Array< OneD, LocalRegions::ExpansionSharedPtr >> &  elmtToTrace,
const Array< OneD, const ExpListSharedPtr > &  bndCondExp,
const Array< OneD, const SpatialDomains::BoundaryConditionShPtr > &  bndCond,
const PeriodicMap &  perMap 
)

Definition at line 339 of file AssemblyCommDG.cpp.

{
    auto comm = locExp.GetComm()->GetRowComm();

    // If serial then skip initialising graph structure and the MPI timing
    if (comm->IsSerial())
    {
        m_exchange =
            ...
    }
    else
    {
        // Initialise graph structure and link processes across partition
        // boundaries
        AssemblyCommDG::InitialiseStructure(locExp, trace, elmtToTrace,
                                            bndCondExp, bndCond, perMap, comm);

        // Timing MPI comm methods, warm up with 10 iterations then time over 50
        std::vector<ExchangeMethodSharedPtr> MPIFuncs;
        std::vector<std::string> MPIFuncsNames;

        // Toggle off AllToAll/AllToAllV methods if cores greater than 16 for
        // performance reasons unless override solver info parameter is present
        if (locExp.GetSession()->MatchSolverInfo("OverrideMPI", "ON") ||
            m_nRanks <= 16)
        {
            MPIFuncs.emplace_back(ExchangeMethodSharedPtr(
                ...
                    m_edgeToTrace)));
            MPIFuncsNames.emplace_back("AllToAll");

            MPIFuncs.emplace_back(ExchangeMethodSharedPtr(
                ...
            MPIFuncsNames.emplace_back("AllToAllV");
        }

        MPIFuncs.emplace_back(
            ...
        MPIFuncsNames.emplace_back("PairwiseSendRecv");

        // Disable neighbor MPI method on unsupported MPI version (below 3.0)
        if (std::get<0>(comm->GetVersion()) >= 3)
        {
            MPIFuncs.emplace_back(ExchangeMethodSharedPtr(
                ...
            MPIFuncsNames.emplace_back("NeighborAllToAllV");
        }

        int numPoints = trace->GetNpoints();
        int warmup = 10, iter = 50;
        NekDouble min, max;
        std::vector<NekDouble> avg(MPIFuncs.size(), -1);
        bool verbose = locExp.GetSession()->DefinesCmdLineArgument("verbose");

        if (verbose && comm->GetRank() == 0)
        {
            std::cout << "MPI setup for trace exchange: " << std::endl;
        }

        for (size_t i = 0; i < MPIFuncs.size(); ++i)
        {
            Timing(comm, warmup, numPoints, MPIFuncs[i]);
            std::tie(avg[i], min, max) =
                Timing(comm, iter, numPoints, MPIFuncs[i]);
            if (verbose && comm->GetRank() == 0)
            {
                std::cout << " " << MPIFuncsNames[i]
                          << " times (avg, min, max): " << avg[i] << " " << min
                          << " " << max << std::endl;
            }
        }

        // Gets the fastest MPI method
        int fastestMPI = std::distance(
            avg.begin(), std::min_element(avg.begin(), avg.end()));

        if (verbose && comm->GetRank() == 0)
        {
            std::cout << " Chosen fastest method: "
                      << MPIFuncsNames[fastestMPI] << std::endl;
        }

        m_exchange = MPIFuncs[fastestMPI];
    }
}
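The rules the constructor uses to decide which methods are timed can be summarised in a small standalone function. This is a sketch only: `CandidateMethods` is a hypothetical helper, and its three parameters stand in for `m_nRanks`, the `OverrideMPI` solver-info check and the major version returned by `comm->GetVersion()`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not part of Nektar++: lists the names of the exchange
// methods that would be timed for a given parallel configuration.
std::vector<std::string> CandidateMethods(int nRanks, bool overrideMPI,
                                          int mpiMajorVersion)
{
    assert(nRanks >= 1);
    std::vector<std::string> names;

    // AllToAll/AllToAllV are only considered on 16 ranks or fewer,
    // unless the override is switched on.
    if (overrideMPI || nRanks <= 16)
    {
        names.emplace_back("AllToAll");
        names.emplace_back("AllToAllV");
    }

    // Pairwise send/receive is always a candidate.
    names.emplace_back("PairwiseSendRecv");

    // Neighbourhood collectives require MPI 3.0 or later.
    if (mpiMajorVersion >= 3)
    {
        names.emplace_back("NeighborAllToAllV");
    }
    return names;
}
```

For example, 32 ranks on an MPI 3 library without the override would leave only the pairwise and neighbourhood methods to be timed.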

References Nektar::MultiRegions::ExpList::GetComm(), Nektar::MultiRegions::ExpList::GetSession(), InitialiseStructure(), m_edgeToTrace, m_exchange, m_maxQuad, m_nRanks, m_rankSharedEdges, and Timing().

Member Function Documentation

◆ InitialiseStructure()

void Nektar::MultiRegions::AssemblyCommDG::InitialiseStructure ( const ExpList &  locExp,
const ExpListSharedPtr &  trace,
const Array< OneD, Array< OneD, LocalRegions::ExpansionSharedPtr >> &  elmtToTrace,
const Array< OneD, const ExpListSharedPtr > &  bndCondExp,
const Array< OneD, const SpatialDomains::BoundaryConditionShPtr > &  bndCond,
const PeriodicMap &  perMap,
const LibUtilities::CommSharedPtr &  comm 
)
private

Initialises the structure for the MPI communication.

This function sets up the initial structure that allows the exchange methods to be created. This structure is contained within the member variable m_rankSharedEdges, which is a map from each rank to the vector of edges shared with that rank. It is filled as follows:

  • Create an edge to trace mapping, and realign periodic edges within this mapping so that they have the same data layout for ranks sharing periodic boundaries.
  • Create a list of all local edge IDs and calculate the maximum number of quadrature points used locally, then perform an AllReduce to find the maximum number of quadrature points across all ranks (for the AllToAll method).
  • Create a list of all boundary edge IDs except for those which are periodic.
  • Using the boundary ID list and the list of all local IDs, construct a unique list of IDs which are on a partition boundary (i.e. an ID that does not occur twice in the local list and does not occur in the boundary list is on a partition boundary). For a periodic edge, also check whether the other side is local; if it is not, add the minimum of the two periodic IDs to the unique list, since the numbering scheme must be consistent across ranks.
  • Send the unique list to all other ranks/partitions. Each rank's unique list is then compared with the local unique edge ID list; if a match is found, the member variable m_rankSharedEdges is filled with the matching rank and unique edge ID.
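The last two steps can be illustrated without MPI. The sketch below is an assumption-laden simplification: it ignores periodic edges, counts occurrences with a `std::map` rather than the pairwise scan used in the listing, and takes the per-rank lists as a plain vector instead of gathering them. `PartitionBoundaryIds` and `SharedEdges` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Edge IDs that occur exactly once locally and are not physical boundary
// edges. The result is sorted because std::map iterates in key order.
std::vector<int> PartitionBoundaryIds(const std::vector<int> &localEdgeIds,
                                      const std::set<int> &bndIdList)
{
    std::map<int, int> counts;
    for (int id : localEdgeIds)
    {
        ++counts[id];
    }

    std::vector<int> unique;
    for (const auto &kv : counts)
    {
        if (kv.second == 1 && bndIdList.count(kv.first) == 0)
        {
            unique.push_back(kv.first);
        }
    }
    return unique;
}

// For every other rank, keep the IDs from its list that also occur in the
// local (sorted) list; this plays the role of m_rankSharedEdges.
std::map<int, std::vector<int>> SharedEdges(
    const std::vector<int> &mine,
    const std::vector<std::vector<int>> &perRank, int myRank)
{
    std::map<int, std::vector<int>> shared;
    for (int r = 0; r < static_cast<int>(perRank.size()); ++r)
    {
        if (r == myRank)
        {
            continue;
        }
        for (int id : perRank[r])
        {
            if (std::binary_search(mine.begin(), mine.end(), id))
            {
                shared[r].push_back(id);
            }
        }
    }
    return shared;
}
```

With two local elements whose edges are {1, 2, 3} and {3, 4, 5}, and physical boundary edges 1 and 5, edge 3 is interior and only edges 2 and 4 remain as partition-boundary candidates.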

Definition at line 458 of file AssemblyCommDG.cpp.

{
    Array<OneD, int> tmp;
    int quad = 0, nDim = 0, eid = 0, offset = 0;
    const LocalRegions::ExpansionVector &locExpVector = *(locExp.GetExp());

    // Assume that each element of the expansion is of the same
    // dimension.
    nDim = locExpVector[0]->GetShapeDimension();

    // This sets up the edge to trace mapping and realigns periodic edges
    if (nDim == 1)
    {
        for (size_t i = 0; i < trace->GetExpSize(); ++i)
        {
            eid    = trace->GetExp(i)->GetGeom()->GetGlobalID();
            offset = trace->GetPhys_Offset(i);

            // Check to see if this vert is periodic. If it is, then we
            // need to use the unique eid of the two points
            auto it = perMap.find(eid);
            if (perMap.count(eid) > 0)
            {
                PeriodicEntity ent = it->second[0];
                if (!ent.isLocal) // Not sure if true in 1D
                {
                    eid = std::min(eid, ent.id);
                }
            }

            m_edgeToTrace[eid].emplace_back(offset);
        }
    }
    else
    {
        for (size_t i = 0; i < trace->GetExpSize(); ++i)
        {
            eid    = trace->GetExp(i)->GetGeom()->GetGlobalID();
            offset = trace->GetPhys_Offset(i);
            quad   = trace->GetExp(i)->GetTotPoints();

            // Check to see if this edge is periodic. If it is, then we
            // need to reverse the trace order of one edge only in the
            // edge to trace map so that the data are reversed w.r.t each
            // other. We do this by using the minimum of the two IDs.
            auto it      = perMap.find(eid);
            bool realign = false;
            if (perMap.count(eid) > 0)
            {
                PeriodicEntity ent = it->second[0];
                if (!ent.isLocal)
                {
                    realign = eid == std::min(eid, ent.id);
                    eid     = std::min(eid, ent.id);
                }
            }

            for (size_t j = 0; j < quad; ++j)
            {
                m_edgeToTrace[eid].emplace_back(offset + j);
            }

            if (realign)
            {
                // Realign some periodic edges in m_edgeToTrace
                Array<OneD, int> tmpArray(m_edgeToTrace[eid].size());
                for (size_t j = 0; j < m_edgeToTrace[eid].size(); ++j)
                {
                    tmpArray[j] = m_edgeToTrace[eid][j];
                }

                StdRegions::Orientation orient = it->second[0].orient;

                if (nDim == 2)
                {
                    AssemblyMapDG::RealignTraceElement(tmpArray, orient, quad);
                }
                else
                {
                    // Orient is going from face 2 -> face 1 but we want face 1
                    // -> face 2; in all cases except below these are
                    // equivalent. However below is not equivalent so we use the
                    // reverse of the mapping.
                    if (orient == StdRegions::eDir1FwdDir2_Dir2BwdDir1)
                    {
                        orient = StdRegions::eDir1BwdDir2_Dir2FwdDir1;
                    }
                    else if (orient == StdRegions::eDir1BwdDir2_Dir2FwdDir1)
                    {
                        orient = StdRegions::eDir1FwdDir2_Dir2BwdDir1;
                    }

                    AssemblyMapDG::RealignTraceElement(
                        tmpArray, orient, trace->GetExp(i)->GetNumPoints(0),
                        trace->GetExp(i)->GetNumPoints(1));
                }

                for (size_t j = 0; j < m_edgeToTrace[eid].size(); ++j)
                {
                    m_edgeToTrace[eid][j] = tmpArray[j];
                }
            }
        }
    }

    // This creates a list of all geometry of problem dimension - 1
    // and populates the maxQuad member variable
    std::vector<int> localEdgeIds;
    for (eid = 0; eid < locExpVector.size(); ++eid)
    {
        LocalRegions::ExpansionSharedPtr locExpansion = locExpVector[eid];

        for (size_t j = 0; j < locExpansion->GetNtraces(); ++j)
        {
            int id = elmtToTrace[eid][j]->GetGeom()->GetGlobalID();
            localEdgeIds.emplace_back(id);
        }

        quad = locExpansion->GetTotPoints();
        if (quad > m_maxQuad)
        {
            m_maxQuad = quad;
        }
    }

    // Find max quadrature points across all processes
    comm->AllReduce(m_maxQuad, LibUtilities::ReduceMax);

    // Create list of boundary edge IDs
    std::set<int> bndIdList;
    for (size_t i = 0; i < bndCond.size(); ++i)
    {
        // Don't add if periodic boundary type
        if ((bndCond[i]->GetBoundaryConditionType() ==
             SpatialDomains::ePeriodic))
        {
            continue;
        }
        else
        {
            for (size_t j = 0; j < bndCondExp[i]->GetExpSize(); ++j)
            {
                eid = bndCondExp[i]->GetExp(j)->GetGeom()->GetGlobalID();
                bndIdList.insert(eid);
            }
        }
    }

    // Get unique edges to send
    std::vector<int> uniqueEdgeIds;
    std::vector<bool> duplicated(localEdgeIds.size(), false);
    for (size_t i = 0; i < localEdgeIds.size(); ++i)
    {
        eid = localEdgeIds[i];
        for (size_t j = i + 1; j < localEdgeIds.size(); ++j)
        {
            if (eid == localEdgeIds[j])
            {
                duplicated[i] = duplicated[j] = true;
            }
        }

        if (!duplicated[i]) // Not duplicated in local partition
        {
            if (bndIdList.find(eid) == bndIdList.end()) // Not a boundary edge
            {
                // Check if periodic and if not local set eid to other side
                auto it = perMap.find(eid);
                if (it != perMap.end())
                {
                    if (!it->second[0].isLocal)
                    {
                        uniqueEdgeIds.emplace_back(
                            std::min(eid, it->second[0].id));
                    }
                }
                else
                {
                    uniqueEdgeIds.emplace_back(eid);
                }
            }
        }
    }

    // Send uniqueEdgeIds size so all partitions can prepare buffers
    m_nRanks = comm->GetSize();
    Array<OneD, int> rankNumEdges(m_nRanks);
    Array<OneD, int> localEdgeSize(1, uniqueEdgeIds.size());
    comm->AllGather(localEdgeSize, rankNumEdges);

    Array<OneD, int> rankLocalEdgeDisp(m_nRanks, 0);
    for (size_t i = 1; i < m_nRanks; ++i)
    {
        rankLocalEdgeDisp[i] = rankLocalEdgeDisp[i - 1] + rankNumEdges[i - 1];
    }

    Array<OneD, int> localEdgeIdsArray(uniqueEdgeIds.size());
    for (size_t i = 0; i < uniqueEdgeIds.size(); ++i)
    {
        localEdgeIdsArray[i] = uniqueEdgeIds[i];
    }

    // Sort localEdgeIdsArray before sending (this is important!)
    std::sort(localEdgeIdsArray.begin(), localEdgeIdsArray.end());

    Array<OneD, int> rankLocalEdgeIds(
        std::accumulate(rankNumEdges.begin(), rankNumEdges.end(), 0), 0);

    // Send all unique edge IDs to all partitions
    comm->AllGatherv(localEdgeIdsArray, rankLocalEdgeIds, rankNumEdges,
                     rankLocalEdgeDisp);

    // Find what edge Ids match with other ranks
    size_t myRank = comm->GetRank();
    Array<OneD, int> perTraceSend(m_nRanks, 0);
    for (size_t i = 0; i < m_nRanks; ++i)
    {
        if (i == myRank)
        {
            continue;
        }

        for (size_t j = 0; j < rankNumEdges[i]; ++j)
        {
            int edgeId = rankLocalEdgeIds[rankLocalEdgeDisp[i] + j];
            if (std::find(uniqueEdgeIds.begin(), uniqueEdgeIds.end(), edgeId) !=
                uniqueEdgeIds.end())
            {
                m_rankSharedEdges[i].emplace_back(edgeId);
            }
        }
    }
}
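The displacement array passed to AllGatherv in the listing above is an exclusive prefix sum of the per-rank counts. A minimal sketch of that bookkeeping, using plain `std::vector` in place of `Array<OneD, int>` and with no MPI involved, might look as follows; `Displacements` and `RankSlice` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Exclusive prefix sum: disp[i] is the offset at which rank i's data starts
// in the gathered buffer.
std::vector<int> Displacements(const std::vector<int> &counts)
{
    std::vector<int> disp(counts.size(), 0);
    for (std::size_t i = 1; i < counts.size(); ++i)
    {
        disp[i] = disp[i - 1] + counts[i - 1];
    }
    return disp;
}

// Extract the slice that belongs to one rank from the gathered buffer.
std::vector<int> RankSlice(const std::vector<int> &gathered,
                           const std::vector<int> &counts,
                           const std::vector<int> &disp, std::size_t rank)
{
    assert(rank < counts.size());
    auto first = gathered.begin() + disp[rank];
    return std::vector<int>(first, first + counts[rank]);
}
```

A rank that contributes no edges gets a zero count and shares its displacement with the next rank, so its slice is simply empty.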

References Nektar::StdRegions::eDir1BwdDir2_Dir2FwdDir1, Nektar::StdRegions::eDir1FwdDir2_Dir2BwdDir1, Nektar::SpatialDomains::ePeriodic, Nektar::StdRegions::find(), Nektar::MultiRegions::ExpList::GetExp(), Nektar::MultiRegions::PeriodicEntity::id, Nektar::MultiRegions::PeriodicEntity::isLocal, m_edgeToTrace, m_maxQuad, m_nRanks, m_rankSharedEdges, Nektar::MultiRegions::AssemblyMapDG::RealignTraceElement(), and Nektar::LibUtilities::ReduceMax.

Referenced by AssemblyCommDG().

◆ PerformExchange()

void Nektar::MultiRegions::AssemblyCommDG::PerformExchange ( const Array< OneD, NekDouble > &  testFwd,
Array< OneD, NekDouble > &  testBwd 
)
inline

Perform the trace exchange between processors, given the forwards and backwards spaces.

Parameters
testFwd  Local forwards space of the trace (which will be sent)
testBwd  Local backwards space of the trace (which will receive contributions)

Definition at line 268 of file AssemblyCommDG.h.

{
    m_exchange->PerformExchange(testFwd, testBwd);
}

References m_exchange.
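The pass-through design can be sketched with a small strategy-pattern example. All types here are hypothetical stand-ins rather than Nektar++ classes: `std::vector<double>` replaces `Array<OneD, NekDouble>`, `SerialSketch` assumes that a serial run has nothing to exchange, and `CopySketch` is a toy method that only exists to show the delegation.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical interface playing the role of an exchange method.
struct ExchangeSketch
{
    virtual ~ExchangeSketch() = default;
    virtual void PerformExchange(const std::vector<double> &fwd,
                                 std::vector<double> &bwd) = 0;
};

// Assumed serial behaviour: no neighbours, so bwd is left untouched.
struct SerialSketch : ExchangeSketch
{
    void PerformExchange(const std::vector<double> &,
                         std::vector<double> &) override
    {
    }
};

// Toy method: pretends every trace point is paired with itself.
struct CopySketch : ExchangeSketch
{
    void PerformExchange(const std::vector<double> &fwd,
                         std::vector<double> &bwd) override
    {
        bwd = fwd;
    }
};

// Wrapper that stores the chosen method once and forwards every call to it.
class CommSketch
{
public:
    explicit CommSketch(std::shared_ptr<ExchangeSketch> method)
        : m_exchange(std::move(method))
    {
    }

    void PerformExchange(const std::vector<double> &fwd,
                         std::vector<double> &bwd)
    {
        m_exchange->PerformExchange(fwd, bwd);
    }

private:
    std::shared_ptr<ExchangeSketch> m_exchange;
};
```

Callers only ever see the wrapper, so the choice of method made at construction time does not leak into the code that performs the exchange.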

◆ Timing()

std::tuple< NekDouble, NekDouble, NekDouble > Nektar::MultiRegions::AssemblyCommDG::Timing ( const LibUtilities::CommSharedPtr &  comm,
const int &  count,
const int &  num,
const ExchangeMethodSharedPtr &  f 
)
static private

Timing of the MPI exchange method.

Timing of the exchange method f, performing the exchange count times for an array of length num.

Parameters
comm   Communicator
count  Number of timing iterations to run
num    Number of quadrature points to communicate
f      ExchangeMethod to time
Returns
Tuple of loop times {avg, min, max}

Definition at line 709 of file AssemblyCommDG.cpp.

{
    Array<OneD, NekDouble> testFwd(num, 1);
    Array<OneD, NekDouble> testBwd(num, -2);

    LibUtilities::Timer t;
    t.Start();
    for (size_t i = 0; i < count; ++i)
    {
        f->PerformExchange(testFwd, testBwd);
    }
    t.Stop();

    // These can just be 'reduce' but need to setup the wrapper in comm.h
    Array<OneD, NekDouble> minTime(1, t.TimePerTest(count));
    comm->AllReduce(minTime, LibUtilities::ReduceMin);

    Array<OneD, NekDouble> maxTime(1, t.TimePerTest(count));
    comm->AllReduce(maxTime, LibUtilities::ReduceMax);

    Array<OneD, NekDouble> sumTime(1, t.TimePerTest(count));
    comm->AllReduce(sumTime, LibUtilities::ReduceSum);

    NekDouble avgTime = sumTime[0] / comm->GetSize();
    return std::make_tuple(avgTime, minTime[0], maxTime[0]);
}
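The three reductions above combine one per-iteration time from each rank into an {avg, min, max} tuple. The sketch below reproduces that arithmetic on a local vector, assuming the per-rank times have already been collected; `SummariseTimes` is a hypothetical name and no MPI calls are made.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <tuple>
#include <vector>

// Combine one time per rank into {avg, min, max}, standing in for the
// ReduceSum / ReduceMin / ReduceMax calls followed by division by the
// communicator size.
std::tuple<double, double, double> SummariseTimes(
    const std::vector<double> &perRank)
{
    assert(!perRank.empty());
    double sum = std::accumulate(perRank.begin(), perRank.end(), 0.0);
    auto mm    = std::minmax_element(perRank.begin(), perRank.end());
    double avg = sum / static_cast<double>(perRank.size());
    return std::make_tuple(avg, *mm.first, *mm.second);
}
```

A large gap between the minimum and maximum suggests load imbalance between ranks for that method, which the average alone would hide.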

References Nektar::LibUtilities::ReduceMax, Nektar::LibUtilities::ReduceMin, Nektar::LibUtilities::ReduceSum, Nektar::LibUtilities::Timer::Start(), Nektar::LibUtilities::Timer::Stop(), and Nektar::LibUtilities::Timer::TimePerTest().

Referenced by AssemblyCommDG().

Member Data Documentation

◆ m_edgeToTrace

std::map<int, std::vector<int> > Nektar::MultiRegions::AssemblyCommDG::m_edgeToTrace
private

Map of edge ID to quad point trace indices.

Definition at line 284 of file AssemblyCommDG.h.

Referenced by AssemblyCommDG(), and InitialiseStructure().

◆ m_exchange

ExchangeMethodSharedPtr Nektar::MultiRegions::AssemblyCommDG::m_exchange
private

Chosen exchange method (either fastest parallel or serial)

Definition at line 276 of file AssemblyCommDG.h.

Referenced by AssemblyCommDG(), and PerformExchange().

◆ m_maxQuad

int Nektar::MultiRegions::AssemblyCommDG::m_maxQuad = 0
private

Max number of quadrature points in an element.

Definition at line 278 of file AssemblyCommDG.h.

Referenced by AssemblyCommDG(), and InitialiseStructure().

◆ m_nRanks

int Nektar::MultiRegions::AssemblyCommDG::m_nRanks = 0
private

Number of ranks/processes/partitions.

Definition at line 280 of file AssemblyCommDG.h.

Referenced by AssemblyCommDG(), and InitialiseStructure().

◆ m_rankSharedEdges

std::map<int, std::vector<int> > Nektar::MultiRegions::AssemblyCommDG::m_rankSharedEdges
private

Map of process to shared edge IDs.

Definition at line 282 of file AssemblyCommDG.h.

Referenced by AssemblyCommDG(), and InitialiseStructure().