Nektar++
Nektar::MultiRegions::AssemblyCommDG Class Reference

Implements communication for populating forward and backwards spaces across processors in the discontinuous Galerkin routines.

#include <AssemblyCommDG.h>

Public Member Functions

 ~AssemblyCommDG ()=default
 Default destructor.
 
 AssemblyCommDG (const ExpList &locExp, const ExpListSharedPtr &trace, const Array< OneD, Array< OneD, LocalRegions::ExpansionSharedPtr >> &elmtToTrace, const Array< OneD, const ExpListSharedPtr > &bndCondExp, const Array< OneD, const SpatialDomains::BoundaryConditionShPtr > &bndCond, const PeriodicMap &perMap)
 
void PerformExchange (const Array< OneD, NekDouble > &testFwd, Array< OneD, NekDouble > &testBwd)
 Perform the trace exchange between processors, given the forwards and backwards spaces.
 

Private Member Functions

void InitialiseStructure (const ExpList &locExp, const ExpListSharedPtr &trace, const Array< OneD, Array< OneD, LocalRegions::ExpansionSharedPtr >> &elmtToTrace, const Array< OneD, const ExpListSharedPtr > &bndCondExp, const Array< OneD, const SpatialDomains::BoundaryConditionShPtr > &bndCond, const PeriodicMap &perMap, const LibUtilities::CommSharedPtr &comm)
 Initialises the structure for the MPI communication.
 

Static Private Member Functions

static std::tuple< NekDouble, NekDouble, NekDouble > Timing (const LibUtilities::CommSharedPtr &comm, const int &count, const int &num, const ExchangeMethodSharedPtr &f)
 Timing of the MPI exchange method.
 

Private Attributes

ExchangeMethodSharedPtr m_exchange
 Chosen exchange method (either fastest parallel or serial).
 
int m_maxQuad = 0
 Max number of quadrature points in an element.
 
int m_nRanks = 0
 Number of ranks/processes/partitions.
 
std::map< int, std::vector< int > > m_rankSharedEdges
 Map of process to shared edge IDs.
 
std::map< int, std::vector< int > > m_edgeToTrace
 Map of edge ID to quad point trace indices.
 

Detailed Description

Implements communication for populating forward and backwards spaces across processors in the discontinuous Galerkin routines.

The AssemblyCommDG class constructs various exchange methods for performing the action of communicating trace data from the forwards space of one processor to the backwards space of the corresponding neighbour element, and vice versa.

In a parallel run, this class initialises the structure shared by all exchange methods and then times each method to determine the fastest one for the particular system configuration. In a serial run it skips both steps and assigns the Serial exchange method. It then acts as a pass-through to the chosen exchange method for the PerformExchange function.
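The pass-through arrangement described above can be sketched in plain C++ as follows. This is a minimal illustration, not the Nektar++ implementation: the names ExchangeSketch, SerialSketch, CopySketch and CommSketch are hypothetical, std::vector stands in for Array< OneD, NekDouble >, and the serial method is assumed to leave the backwards space untouched.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for the exchange interface (not the Nektar++ class).
struct ExchangeSketch
{
    virtual ~ExchangeSketch() = default;
    virtual void PerformExchange(const std::vector<double> &fwd,
                                 std::vector<double> &bwd) = 0;
};

// Serial case: there is no neighbouring rank, so nothing is communicated.
// Assumption: the backwards space is left exactly as it was.
struct SerialSketch final : ExchangeSketch
{
    void PerformExchange(const std::vector<double> &,
                         std::vector<double> &) override
    {
    }
};

// Toy "parallel" method: copies fwd into bwd in place of real MPI traffic.
struct CopySketch final : ExchangeSketch
{
    void PerformExchange(const std::vector<double> &fwd,
                         std::vector<double> &bwd) override
    {
        assert(fwd.size() == bwd.size());
        bwd = fwd;
    }
};

// Owner that picks one method at construction and forwards every call to it.
class CommSketch
{
public:
    explicit CommSketch(bool isSerial)
    {
        if (isSerial)
        {
            m_exchange = std::make_shared<SerialSketch>();
        }
        else
        {
            m_exchange = std::make_shared<CopySketch>();
        }
    }

    void PerformExchange(const std::vector<double> &fwd,
                         std::vector<double> &bwd)
    {
        m_exchange->PerformExchange(fwd, bwd);
    }

private:
    std::shared_ptr<ExchangeSketch> m_exchange;
};
```

Because the caller only ever sees CommSketch::PerformExchange, the choice of method made at construction time does not leak into the calling code.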

Definition at line 244 of file AssemblyCommDG.h.

Constructor & Destructor Documentation

◆ ~AssemblyCommDG()

Nektar::MultiRegions::AssemblyCommDG::~AssemblyCommDG ( )
default

Default destructor.

◆ AssemblyCommDG()

Nektar::MultiRegions::AssemblyCommDG::AssemblyCommDG ( const ExpList &  locExp,
const ExpListSharedPtr &  trace,
const Array< OneD, Array< OneD, LocalRegions::ExpansionSharedPtr >> &  elmtToTrace,
const Array< OneD, const ExpListSharedPtr > &  bndCondExp,
const Array< OneD, const SpatialDomains::BoundaryConditionShPtr > &  bndCond,
const PeriodicMap &  perMap 
)

Definition at line 339 of file AssemblyCommDG.cpp.

346 {
347  auto comm = locExp.GetComm()->GetRowComm();
348 
349  // If serial then skip initialising graph structure and the MPI timing
350  if (comm->IsSerial())
351  {
352  m_exchange =
354  }
355  else
356  {
357  // Initialise graph structure and link processes across partition
358  // boundaries
359  AssemblyCommDG::InitialiseStructure(locExp, trace, elmtToTrace,
360  bndCondExp, bndCond, perMap, comm);
361 
362  // Timing MPI comm methods, warm up with 10 iterations then time over 50
363  std::vector<ExchangeMethodSharedPtr> MPIFuncs;
364  std::vector<std::string> MPIFuncsNames;
365 
366  // Toggle off AllToAll/AllToAllV methods if cores greater than 16 for
367  // performance reasons unless override solver info parameter is present
368  if (locExp.GetSession()->MatchSolverInfo("OverrideMPI", "ON") ||
369  m_nRanks <= 16)
370  {
371  MPIFuncs.emplace_back(ExchangeMethodSharedPtr(
374  m_edgeToTrace)));
375  MPIFuncsNames.emplace_back("AllToAll");
376 
377  MPIFuncs.emplace_back(ExchangeMethodSharedPtr(
380  MPIFuncsNames.emplace_back("AllToAllV");
381  }
382 
383  MPIFuncs.emplace_back(
386  MPIFuncsNames.emplace_back("PairwiseSendRecv");
387 
388  // Disable neighbor MPI method on unsupported MPI version (below 3.0)
389  if (std::get<0>(comm->GetVersion()) >= 3)
390  {
391  MPIFuncs.emplace_back(ExchangeMethodSharedPtr(
394  MPIFuncsNames.emplace_back("NeighborAllToAllV");
395  }
396 
397  int numPoints = trace->GetNpoints();
398  int warmup = 10, iter = 50;
399  NekDouble min, max;
400  std::vector<NekDouble> avg(MPIFuncs.size(), -1);
401  bool verbose = locExp.GetSession()->DefinesCmdLineArgument("verbose");
402 
403  if (verbose && comm->TreatAsRankZero())
404  {
405  std::cout << "MPI setup for trace exchange: " << std::endl;
406  }
407 
408  // Padding for output
409  int maxStrLen = 0;
410  for (size_t i = 0; i < MPIFuncs.size(); ++i)
411  {
412  maxStrLen = MPIFuncsNames[i].size() > maxStrLen
413  ? MPIFuncsNames[i].size()
414  : maxStrLen;
415  }
416 
417  for (size_t i = 0; i < MPIFuncs.size(); ++i)
418  {
419  Timing(comm, warmup, numPoints, MPIFuncs[i]);
420  std::tie(avg[i], min, max) =
421  Timing(comm, iter, numPoints, MPIFuncs[i]);
422  if (verbose && comm->TreatAsRankZero())
423  {
424  std::cout << " " << MPIFuncsNames[i]
425  << " times (avg, min, max)"
426  << std::string(maxStrLen - MPIFuncsNames[i].size(),
427  ' ')
428  << ": " << avg[i] << " " << min << " " << max
429  << std::endl;
430  }
431  }
432 
433  // Gets the fastest MPI method
434  int fastestMPI = std::distance(
435  avg.begin(), std::min_element(avg.begin(), avg.end()));
436 
437  if (verbose && comm->TreatAsRankZero())
438  {
439  std::cout << " Chosen fastest method: "
440  << MPIFuncsNames[fastestMPI] << std::endl;
441  }
442 
443  m_exchange = MPIFuncs[fastestMPI];
444  }
445 }

References Nektar::MultiRegions::ExpList::GetComm(), Nektar::MultiRegions::ExpList::GetSession(), InitialiseStructure(), m_edgeToTrace, m_exchange, m_maxQuad, m_nRanks, m_rankSharedEdges, and Timing().
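The selection logic in the constructor listing can be reduced to two small steps: build the candidate list under the rank-count and MPI-version gates, then take the index of the smallest average time. The sketch below mirrors those steps with plain standard-library types. CandidateNames and FastestIndex are hypothetical helper names, and the three parameters stand in for m_nRanks, the OverrideMPI solver info setting, and the major MPI version.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Candidate method names under the same gates as the constructor listing:
// AllToAll/AllToAllV only up to 16 ranks (unless overridden), and the
// neighbour collective only from MPI 3 onwards.
std::vector<std::string> CandidateNames(int nRanks, bool overrideMPI,
                                        int mpiMajorVersion)
{
    std::vector<std::string> names;
    if (overrideMPI || nRanks <= 16)
    {
        names.emplace_back("AllToAll");
        names.emplace_back("AllToAllV");
    }
    names.emplace_back("PairwiseSendRecv");
    if (mpiMajorVersion >= 3)
    {
        names.emplace_back("NeighborAllToAllV");
    }
    return names;
}

// Index of the smallest average time. std::min_element returns the first
// minimum, so ties resolve to the earliest candidate in the list.
std::size_t FastestIndex(const std::vector<double> &avg)
{
    assert(!avg.empty());
    return static_cast<std::size_t>(
        std::distance(avg.begin(), std::min_element(avg.begin(), avg.end())));
}
```

Since ties go to the first minimum, the order in which candidates are appended acts as the tie-breaker.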

Member Function Documentation

◆ InitialiseStructure()

void Nektar::MultiRegions::AssemblyCommDG::InitialiseStructure ( const ExpList &  locExp,
const ExpListSharedPtr &  trace,
const Array< OneD, Array< OneD, LocalRegions::ExpansionSharedPtr >> &  elmtToTrace,
const Array< OneD, const ExpListSharedPtr > &  bndCondExp,
const Array< OneD, const SpatialDomains::BoundaryConditionShPtr > &  bndCond,
const PeriodicMap &  perMap,
const LibUtilities::CommSharedPtr &  comm 
)
private

Initialises the structure for the MPI communication.

This function sets up the initial structure that allows the exchange methods to be created. The structure is held in the member variable m_rankSharedEdges, which maps each rank to the vector of edges shared with that rank. It is filled as follows:

  • Create an edge to trace mapping, and realign periodic edges within this mapping so that they have the same data layout on ranks sharing periodic boundaries.
  • Create a list of all local edge IDs and calculate the maximum number of quadrature points used locally, then perform an AllReduce to find the maximum number of quadrature points across all ranks (for the AllToAll method).
  • Create a list of all boundary edge IDs except for those which are periodic.
  • Using the boundary ID list and the local ID list, construct a unique list of IDs which are on a partition boundary (i.e. an ID that does not occur twice in the local list and does not occur in the boundary list is on a partition boundary). For a periodic edge, also check whether the other side is local; if it is not, add the minimum of the two periodic IDs to the unique list, because the numbering scheme must be consistent across ranks.
  • Send the unique list to all other ranks/partitions. Each rank's unique list is then compared with the local unique edge ID list; if a match is found, the member variable m_rankSharedEdges is filled with the matching rank and unique edge ID.
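The partition-boundary test in the steps above can be sketched with standard containers. This is an illustration under assumptions rather than the library code: PeriodicSketch is a hypothetical, reduced stand-in for PeriodicEntity, PartitionBoundaryIds is a hypothetical name, and occurrence counting in a std::map replaces the pairwise duplicate scan used in the listing.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical, reduced periodic record: the partner edge ID and whether the
// partner edge lives on this rank.
struct PeriodicSketch
{
    int id;
    bool isLocal;
};

// Returns the sorted IDs of edges that sit on a partition boundary.
std::vector<int> PartitionBoundaryIds(
    const std::vector<int> &localEdgeIds, const std::set<int> &bndIds,
    const std::map<int, PeriodicSketch> &perMap)
{
    // Count how often each edge ID appears among the local element traces.
    std::map<int, int> occurrences;
    for (int id : localEdgeIds)
    {
        ++occurrences[id];
    }

    std::vector<int> unique;
    for (const auto &entry : occurrences)
    {
        const int id = entry.first;
        // Seen twice locally: interior edge. In the boundary list: a
        // non-periodic domain boundary. Neither needs communication.
        if (entry.second > 1 || bndIds.count(id) > 0)
        {
            continue;
        }

        const auto it = perMap.find(id);
        if (it == perMap.end())
        {
            unique.push_back(id);
        }
        else if (!it->second.isLocal)
        {
            // Both sides of a periodic pair must agree on one ID.
            unique.push_back(std::min(id, it->second.id));
        }
    }

    // Sorted before sending, as in the listing.
    std::sort(unique.begin(), unique.end());
    return unique;
}
```

A periodic edge whose partner is local is dropped entirely, since both sides can be handled without communication.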

Definition at line 470 of file AssemblyCommDG.cpp.

477 {
478  Array<OneD, int> tmp;
479  int quad = 0, nDim = 0, eid = 0, offset = 0;
480  const LocalRegions::ExpansionVector &locExpVector = *(locExp.GetExp());
481 
482  // Assume that each element of the expansion is of the same
483  // dimension.
484  nDim = locExpVector[0]->GetShapeDimension();
485 
486  // This sets up the edge to trace mapping and realigns periodic edges
487  if (nDim == 1)
488  {
489  for (size_t i = 0; i < trace->GetExpSize(); ++i)
490  {
491  eid = trace->GetExp(i)->GetGeom()->GetGlobalID();
492  offset = trace->GetPhys_Offset(i);
493 
494  // Check to see if this vert is periodic. If it is, then we
495  // need use the unique eid of the two points
496  auto it = perMap.find(eid);
497  if (perMap.count(eid) > 0)
498  {
499  PeriodicEntity ent = it->second[0];
500  if (!ent.isLocal) // Not sure if true in 1D
501  {
502  eid = std::min(eid, ent.id);
503  }
504  }
505 
506  m_edgeToTrace[eid].emplace_back(offset);
507  }
508  }
509  else
510  {
511  for (size_t i = 0; i < trace->GetExpSize(); ++i)
512  {
513  eid = trace->GetExp(i)->GetGeom()->GetGlobalID();
514  offset = trace->GetPhys_Offset(i);
515  quad = trace->GetExp(i)->GetTotPoints();
516 
517  // Check to see if this edge is periodic. If it is, then we
518  // need to reverse the trace order of one edge only in the
519  // edge to trace map so that the data are reversed w.r.t each
520  // other. We do this by using the minimum of the two IDs.
521  auto it = perMap.find(eid);
522  bool realign = false;
523  if (perMap.count(eid) > 0)
524  {
525  PeriodicEntity ent = it->second[0];
526  if (!ent.isLocal)
527  {
528  realign = eid == std::min(eid, ent.id);
529  eid = std::min(eid, ent.id);
530  }
531  }
532 
533  for (size_t j = 0; j < quad; ++j)
534  {
535  m_edgeToTrace[eid].emplace_back(offset + j);
536  }
537 
538  if (realign)
539  {
540  // Realign some periodic edges in m_edgeToTrace
541  Array<OneD, int> tmpArray(m_edgeToTrace[eid].size());
542  for (size_t j = 0; j < m_edgeToTrace[eid].size(); ++j)
543  {
544  tmpArray[j] = m_edgeToTrace[eid][j];
545  }
546 
547  StdRegions::Orientation orient = it->second[0].orient;
548 
549  if (nDim == 2)
550  {
551  AssemblyMapDG::RealignTraceElement(tmpArray, orient, quad);
552  }
553  else
554  {
555  // Orient is going from face 2 -> face 1 but we want face 1
556  // -> face 2; in all cases except below these are
557  // equivalent. However below is not equivalent so we use the
558  // reverse of the mapping.
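559  if (orient == StdRegions::eDir1FwdDir2_Dir2BwdDir1)
560  {
561  orient = StdRegions::eDir1BwdDir2_Dir2FwdDir1;
562  }
563  else if (orient == StdRegions::eDir1BwdDir2_Dir2FwdDir1)
564  {
565  orient = StdRegions::eDir1FwdDir2_Dir2BwdDir1;
566  }
567 
568  AssemblyMapDG::RealignTraceElement(
569  tmpArray, orient, trace->GetExp(i)->GetNumPoints(0),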
570  trace->GetExp(i)->GetNumPoints(1));
571  }
572 
573  for (size_t j = 0; j < m_edgeToTrace[eid].size(); ++j)
574  {
575  m_edgeToTrace[eid][j] = tmpArray[j];
576  }
577  }
578  }
579  }
580 
581  // This creates a list of all geometry of problem dimension - 1
582  // and populates the maxQuad member variable for AllToAll
583  std::vector<int> localEdgeIds;
584  for (eid = 0; eid < locExpVector.size(); ++eid)
585  {
586  LocalRegions::ExpansionSharedPtr locExpansion = locExpVector[eid];
587 
588  for (size_t j = 0; j < locExpansion->GetNtraces(); ++j)
589  {
590  int id = elmtToTrace[eid][j]->GetGeom()->GetGlobalID();
591  localEdgeIds.emplace_back(id);
592  }
593 
594  quad = locExpansion->GetTotPoints();
595  if (quad > m_maxQuad)
596  {
597  m_maxQuad = quad;
598  }
599  }
600 
601  // Find max quadrature points across all processes for AllToAll method
602  comm->AllReduce(m_maxQuad, LibUtilities::ReduceMax);
603 
604  // Create list of boundary edge IDs
605  std::set<int> bndIdList;
606  for (size_t i = 0; i < bndCond.size(); ++i)
607  {
608  // Don't add if periodic boundary type
609  if ((bndCond[i]->GetBoundaryConditionType() ==
610  SpatialDomains::ePeriodic))
611  {
612  continue;
613  }
614  else
615  {
616  for (size_t j = 0; j < bndCondExp[i]->GetExpSize(); ++j)
617  {
618  eid = bndCondExp[i]->GetExp(j)->GetGeom()->GetGlobalID();
619  bndIdList.insert(eid);
620  }
621  }
622  }
623 
624  // Get unique edges to send
625  std::vector<int> uniqueEdgeIds;
626  std::vector<bool> duplicated(localEdgeIds.size(), false);
627  for (size_t i = 0; i < localEdgeIds.size(); ++i)
628  {
629  eid = localEdgeIds[i];
630  for (size_t j = i + 1; j < localEdgeIds.size(); ++j)
631  {
632  if (eid == localEdgeIds[j])
633  {
634  duplicated[i] = duplicated[j] = true;
635  }
636  }
637 
638  if (!duplicated[i]) // Not duplicated in local partition
639  {
640  if (bndIdList.find(eid) == bndIdList.end()) // Not a boundary edge
641  {
642  // Check if periodic and if not local set eid to other side
643  auto it = perMap.find(eid);
644  if (it != perMap.end())
645  {
646  if (!it->second[0].isLocal)
647  {
648  uniqueEdgeIds.emplace_back(
649  std::min(eid, it->second[0].id));
650  }
651  }
652  else
653  {
654  uniqueEdgeIds.emplace_back(eid);
655  }
656  }
657  }
658  }
659 
660  // Send uniqueEdgeIds size so all partitions can prepare buffers
661  m_nRanks = comm->GetSize();
662  Array<OneD, int> rankNumEdges(m_nRanks);
663  Array<OneD, int> localEdgeSize(1, uniqueEdgeIds.size());
664  comm->AllGather(localEdgeSize, rankNumEdges);
665 
666  Array<OneD, int> rankLocalEdgeDisp(m_nRanks, 0);
667  for (size_t i = 1; i < m_nRanks; ++i)
668  {
669  rankLocalEdgeDisp[i] = rankLocalEdgeDisp[i - 1] + rankNumEdges[i - 1];
670  }
671 
672  Array<OneD, int> localEdgeIdsArray(uniqueEdgeIds.size());
673  for (size_t i = 0; i < uniqueEdgeIds.size(); ++i)
674  {
675  localEdgeIdsArray[i] = uniqueEdgeIds[i];
676  }
677 
678  // Sort localEdgeIdsArray before sending (this is important!)
679  std::sort(localEdgeIdsArray.begin(), localEdgeIdsArray.end());
680 
681  Array<OneD, int> rankLocalEdgeIds(
682  std::accumulate(rankNumEdges.begin(), rankNumEdges.end(), 0), 0);
683 
684  // Send all unique edge IDs to all partitions
685  comm->AllGatherv(localEdgeIdsArray, rankLocalEdgeIds, rankNumEdges,
686  rankLocalEdgeDisp);
687 
688  // Find what edge Ids match with other ranks
689  size_t myRank = comm->GetRank();
690  for (size_t i = 0; i < m_nRanks; ++i)
691  {
692  if (i == myRank)
693  {
694  continue;
695  }
696 
697  for (size_t j = 0; j < rankNumEdges[i]; ++j)
698  {
699  int edgeId = rankLocalEdgeIds[rankLocalEdgeDisp[i] + j];
700  if (std::find(uniqueEdgeIds.begin(), uniqueEdgeIds.end(), edgeId) !=
701  uniqueEdgeIds.end())
702  {
703  m_rankSharedEdges[i].emplace_back(edgeId);
704  }
705  }
706  }
707 }

References Nektar::StdRegions::eDir1BwdDir2_Dir2FwdDir1, Nektar::StdRegions::eDir1FwdDir2_Dir2BwdDir1, Nektar::SpatialDomains::ePeriodic, Nektar::StdRegions::find(), Nektar::MultiRegions::ExpList::GetExp(), Nektar::MultiRegions::PeriodicEntity::id, Nektar::MultiRegions::PeriodicEntity::isLocal, m_edgeToTrace, m_maxQuad, m_nRanks, m_rankSharedEdges, Nektar::MultiRegions::AssemblyMapDG::RealignTraceElement(), and Nektar::LibUtilities::ReduceMax.

Referenced by AssemblyCommDG().
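The last part of the listing gathers every rank's unique edge IDs into one flat array and then scans each rank's slice for matches. A serial sketch of that bookkeeping, with no MPI involved, might look like the following. Displacements and SharedEdges are hypothetical names, and the gathered array is passed in directly rather than produced by AllGatherv.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Exclusive prefix sum of per-rank counts: the offset at which each rank's
// slice starts in the gathered array.
std::vector<int> Displacements(const std::vector<int> &counts)
{
    std::vector<int> disp(counts.size(), 0);
    for (std::size_t i = 1; i < counts.size(); ++i)
    {
        disp[i] = disp[i - 1] + counts[i - 1];
    }
    return disp;
}

// For every other rank, keep the gathered edge IDs that also appear in this
// rank's own unique list.
std::map<int, std::vector<int>> SharedEdges(const std::vector<int> &gathered,
                                            const std::vector<int> &counts,
                                            const std::vector<int> &mine,
                                            int myRank)
{
    const std::vector<int> disp = Displacements(counts);
    std::map<int, std::vector<int>> shared;
    for (std::size_t r = 0; r < counts.size(); ++r)
    {
        if (static_cast<int>(r) == myRank)
        {
            continue; // a rank never exchanges with itself
        }
        for (int j = 0; j < counts[r]; ++j)
        {
            const int id = gathered[static_cast<std::size_t>(disp[r] + j)];
            if (std::find(mine.begin(), mine.end(), id) != mine.end())
            {
                shared[static_cast<int>(r)].push_back(id);
            }
        }
    }
    return shared;
}
```

Because every rank sorts its list before the gather, the shared IDs for a given pair of ranks come out in the same order on both sides.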

◆ PerformExchange()

void Nektar::MultiRegions::AssemblyCommDG::PerformExchange ( const Array< OneD, NekDouble > &  testFwd,
Array< OneD, NekDouble > &  testBwd 
)
inline

Perform the trace exchange between processors, given the forwards and backwards spaces.

Parameters
testFwd  Local forwards space of the trace (which will be sent)
testBwd  Local backwards space of the trace (which will receive contributions)

Definition at line 268 of file AssemblyCommDG.h.

270  {
271  m_exchange->PerformExchange(testFwd, testBwd);
272  }

References m_exchange.

◆ Timing()

std::tuple< NekDouble, NekDouble, NekDouble > Nektar::MultiRegions::AssemblyCommDG::Timing ( const LibUtilities::CommSharedPtr &  comm,
const int &  count,
const int &  num,
const ExchangeMethodSharedPtr &  f 
)
static private

Timing of the MPI exchange method.

Times the exchange method f by performing the exchange count times for an array of length num.

Parameters
comm  Communicator
count  Number of timing iterations to run
num  Number of quadrature points to communicate
f  ExchangeMethod to time
Returns
tuple of loop times {avg, min, max}

Definition at line 720 of file AssemblyCommDG.cpp.

723 {
724  Array<OneD, NekDouble> testFwd(num, 1);
725  Array<OneD, NekDouble> testBwd(num, -2);
726 
727  LibUtilities::Timer t;
728  t.Start();
729  for (size_t i = 0; i < count; ++i)
730  {
731  f->PerformExchange(testFwd, testBwd);
732  }
733  t.Stop();
734 
735  // These can just be 'reduce' but need to setup the wrapper in comm.h
736  Array<OneD, NekDouble> minTime(1, t.TimePerTest(count));
737  comm->AllReduce(minTime, LibUtilities::ReduceMin);
738 
739  Array<OneD, NekDouble> maxTime(1, t.TimePerTest(count));
740  comm->AllReduce(maxTime, LibUtilities::ReduceMax);
741 
742  Array<OneD, NekDouble> sumTime(1, t.TimePerTest(count));
743  comm->AllReduce(sumTime, LibUtilities::ReduceSum);
744 
745  NekDouble avgTime = sumTime[0] / comm->GetSize();
746  return std::make_tuple(avgTime, minTime[0], maxTime[0]);
747 }

References Nektar::LibUtilities::ReduceMax, Nektar::LibUtilities::ReduceMin, Nektar::LibUtilities::ReduceSum, Nektar::LibUtilities::Timer::Start(), Nektar::LibUtilities::Timer::Stop(), and Nektar::LibUtilities::Timer::TimePerTest().

Referenced by AssemblyCommDG().
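The timing scheme above has two parts: a per-rank average time per call, and a reduction of those per-rank values to {avg, min, max}. The sketch below imitates both without MPI, using std::chrono in place of LibUtilities::Timer and a plain vector of per-rank times in place of the three AllReduce calls. TimePerCall and Summarise are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <numeric>
#include <tuple>
#include <vector>

// Average wall-clock seconds per call of f over count consecutive calls.
template <typename F> double TimePerCall(F &&f, int count)
{
    assert(count > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < count; ++i)
    {
        f();
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() / count;
}

// Serial stand-in for the sum, min and max reductions: one entry per rank in,
// {avg, min, max} out.
std::tuple<double, double, double> Summarise(const std::vector<double> &perRank)
{
    assert(!perRank.empty());
    const double sum = std::accumulate(perRank.begin(), perRank.end(), 0.0);
    const auto mm = std::minmax_element(perRank.begin(), perRank.end());
    return std::make_tuple(sum / perRank.size(), *mm.first, *mm.second);
}
```

The constructor calls the timing routine twice per method, once with the warm-up count and once with the measured count, and only the second result is kept.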

Member Data Documentation

◆ m_edgeToTrace

std::map<int, std::vector<int> > Nektar::MultiRegions::AssemblyCommDG::m_edgeToTrace
private

Map of edge ID to quad point trace indices.

Definition at line 284 of file AssemblyCommDG.h.

Referenced by AssemblyCommDG(), and InitialiseStructure().
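For the non-periodic case, the map is built by appending one trace index per quadrature point, starting at the element's physical offset. A minimal sketch of that construction follows. TraceElmtSketch and BuildEdgeToTrace are hypothetical names, and the three fields stand in for the values returned by GetGlobalID(), GetPhys_Offset(i) and GetTotPoints() in the listing.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical description of one trace element.
struct TraceElmtSketch
{
    int globalId;   // global ID of the edge geometry
    int physOffset; // first index of this element in the trace array
    int nPoints;    // number of quadrature points on this element
};

// Builds the edge ID -> trace index map without any periodic realignment.
std::map<int, std::vector<int>> BuildEdgeToTrace(
    const std::vector<TraceElmtSketch> &trace)
{
    std::map<int, std::vector<int>> edgeToTrace;
    for (const TraceElmtSketch &e : trace)
    {
        assert(e.nPoints >= 0);
        for (int j = 0; j < e.nPoints; ++j)
        {
            edgeToTrace[e.globalId].push_back(e.physOffset + j);
        }
    }
    return edgeToTrace;
}
```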

◆ m_exchange

ExchangeMethodSharedPtr Nektar::MultiRegions::AssemblyCommDG::m_exchange
private

Chosen exchange method (either fastest parallel or serial)

Definition at line 276 of file AssemblyCommDG.h.

Referenced by AssemblyCommDG(), and PerformExchange().

◆ m_maxQuad

int Nektar::MultiRegions::AssemblyCommDG::m_maxQuad = 0
private

Max number of quadrature points in an element.

Definition at line 278 of file AssemblyCommDG.h.

Referenced by AssemblyCommDG(), and InitialiseStructure().

◆ m_nRanks

int Nektar::MultiRegions::AssemblyCommDG::m_nRanks = 0
private

Number of ranks/processes/partitions.

Definition at line 280 of file AssemblyCommDG.h.

Referenced by AssemblyCommDG(), and InitialiseStructure().

◆ m_rankSharedEdges

std::map<int, std::vector<int> > Nektar::MultiRegions::AssemblyCommDG::m_rankSharedEdges
private

Map of process to shared edge IDs.

Definition at line 282 of file AssemblyCommDG.h.

Referenced by AssemblyCommDG(), and InitialiseStructure().
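One way the two maps could be combined is to pack a send buffer for a given rank: look up the edges shared with that rank, then gather the forwards-space values at the trace indices of each edge. The sketch below is an assumption about such usage and is not taken from the exchange method classes themselves. PackForRank and IdMap are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

using IdMap = std::map<int, std::vector<int>>;

// Collects the forwards-space values destined for one neighbouring rank, in
// the order given by that rank's shared edge list.
std::vector<double> PackForRank(int rank, const IdMap &rankSharedEdges,
                                const IdMap &edgeToTrace,
                                const std::vector<double> &fwd)
{
    std::vector<double> buffer;
    const auto shared = rankSharedEdges.find(rank);
    if (shared == rankSharedEdges.end())
    {
        return buffer; // no edges shared with this rank
    }
    for (int edgeId : shared->second)
    {
        // at() throws if an edge has no trace entry, which would indicate
        // inconsistent maps.
        for (int idx : edgeToTrace.at(edgeId))
        {
            assert(idx >= 0 && static_cast<std::size_t>(idx) < fwd.size());
            buffer.push_back(fwd[static_cast<std::size_t>(idx)]);
        }
    }
    return buffer;
}
```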