Nektar++
Nektar::MultiRegions::AssemblyCommDG Class Reference

Implements communication for populating forward and backwards spaces across processors in the discontinuous Galerkin routines.

#include <AssemblyCommDG.h>

Public Member Functions

 ~AssemblyCommDG ()=default
 Default destructor.
 
 AssemblyCommDG (const ExpList &locExp, const ExpListSharedPtr &trace, const Array< OneD, Array< OneD, LocalRegions::ExpansionSharedPtr >> &elmtToTrace, const Array< OneD, const ExpListSharedPtr > &bndCondExp, const Array< OneD, const SpatialDomains::BoundaryConditionShPtr > &bndCond, const PeriodicMap &perMap)
 
void PerformExchange (const Array< OneD, NekDouble > &testFwd, Array< OneD, NekDouble > &testBwd)
 Perform the trace exchange between processors, given the forwards and backwards spaces.
 

Private Member Functions

void InitialiseStructure (const ExpList &locExp, const ExpListSharedPtr &trace, const Array< OneD, Array< OneD, LocalRegions::ExpansionSharedPtr >> &elmtToTrace, const Array< OneD, const ExpListSharedPtr > &bndCondExp, const Array< OneD, const SpatialDomains::BoundaryConditionShPtr > &bndCond, const PeriodicMap &perMap, const LibUtilities::CommSharedPtr &comm)
 Initialises the structure for the MPI communication.
 

Static Private Member Functions

static std::tuple< NekDouble, NekDouble, NekDouble > Timing (const LibUtilities::CommSharedPtr &comm, const int &count, const int &num, const ExchangeMethodSharedPtr &f)
 Timing of the MPI exchange method.
 

Private Attributes

ExchangeMethodSharedPtr m_exchange
 Chosen exchange method (either fastest parallel or serial).
 
int m_maxQuad = 0
 Max number of quadrature points in an element.
 
int m_nRanks = 0
 Number of ranks/processes/partitions.
 
std::map< int, std::vector< int > > m_rankSharedEdges
 Map of process to shared edge IDs.
 
std::map< int, std::vector< int > > m_edgeToTrace
 Map of edge ID to quad point trace indices.
 

Detailed Description

Implements communication for populating forward and backwards spaces across processors in the discontinuous Galerkin routines.

The AssemblyCommDG class constructs various exchange methods for performing the action of communicating trace data from the forwards space of one processor to the backwards space of the corresponding neighbour element, and vice versa.

This class initialises the structure for all exchange methods and then times each one to determine the fastest for the particular system configuration. When running in serial, it skips the timing and assigns the Serial exchange method. It then acts as a pass-through to the chosen exchange method for the PerformExchange function.
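The selection step described above can be sketched in isolation. The following is a minimal illustration, not the Nektar++ implementation: `Candidate`, `AverageTime`, `FastestIndex` and `ChooseFastest` are hypothetical names, the exchange is modelled as a plain callable, and timing uses `std::chrono` on a single process rather than a reduction across MPI ranks.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for an exchange method: a name plus a callable.
struct Candidate
{
    std::string name;
    std::function<void()> exchange;
};

// Average wall-clock seconds per call over `iters` calls of f.
double AverageTime(const std::function<void()> &f, int iters)
{
    assert(iters > 0);
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iters; ++i)
    {
        f();
    }
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() / iters;
}

// Index of the smallest average time (first one wins on ties).
std::size_t FastestIndex(const std::vector<double> &avg)
{
    assert(!avg.empty());
    return static_cast<std::size_t>(std::distance(
        avg.begin(), std::min_element(avg.begin(), avg.end())));
}

// Warm each candidate up, time it, and return the index of the fastest.
std::size_t ChooseFastest(const std::vector<Candidate> &candidates,
                          int warmup, int iters)
{
    std::vector<double> avg;
    for (const auto &c : candidates)
    {
        AverageTime(c.exchange, warmup); // warm-up pass, result discarded
        avg.push_back(AverageTime(c.exchange, iters));
    }
    return FastestIndex(avg);
}
```

Once an index has been chosen, the owning object would keep only that candidate and forward every later exchange call to it, which is the pass-through role the description assigns to PerformExchange.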

Definition at line 246 of file AssemblyCommDG.h.

Constructor & Destructor Documentation

◆ ~AssemblyCommDG()

Nektar::MultiRegions::AssemblyCommDG::~AssemblyCommDG ( )
default

Default destructor.

◆ AssemblyCommDG()

Nektar::MultiRegions::AssemblyCommDG::AssemblyCommDG ( const ExpList &  locExp,
const ExpListSharedPtr &  trace,
const Array< OneD, Array< OneD, LocalRegions::ExpansionSharedPtr >> &  elmtToTrace,
const Array< OneD, const ExpListSharedPtr > &  bndCondExp,
const Array< OneD, const SpatialDomains::BoundaryConditionShPtr > &  bndCond,
const PeriodicMap &  perMap 
)

Definition at line 337 of file AssemblyCommDG.cpp.

{
    auto comm = locExp.GetSession()->GetComm()->GetRowComm();

    // If serial then skip initialising graph structure and the MPI timing
    if (comm->IsSerial())
    {
        m_exchange =
            // ... (source line 351 not shown in this listing)
    }
    else
    {
        // Initialise graph structure and link processes across partition
        // boundaries
        AssemblyCommDG::InitialiseStructure(locExp, trace, elmtToTrace,
                                            bndCondExp, bndCond, perMap, comm);

        // Timing MPI comm methods, warm up with 10 iterations then time over 50
        std::vector<ExchangeMethodSharedPtr> MPIFuncs;
        std::vector<std::string> MPIFuncsNames;

        // Toggle off AllToAll/AllToAllV methods if cores greater than 16 for
        // performance reasons unless override solver info parameter is present
        if (locExp.GetSession()->MatchSolverInfo("OverrideMPI", "ON") ||
            m_nRanks <= 16)
        {
            MPIFuncs.emplace_back(ExchangeMethodSharedPtr(
                // ... (source lines 370-371 not shown in this listing)
                m_edgeToTrace)));
            MPIFuncsNames.emplace_back("AllToAll");

            MPIFuncs.emplace_back(ExchangeMethodSharedPtr(
                // ... (source lines 376-377 not shown in this listing)
            MPIFuncsNames.emplace_back("AllToAllV");
        }

        MPIFuncs.emplace_back(
            // ... (source lines 382-383 not shown in this listing)
        MPIFuncsNames.emplace_back("PairwiseSendRecv");

        // Disable neighbor MPI method on unsupported MPI version (below 3.0)
        if (std::get<0>(comm->GetVersion()) >= 3)
        {
            MPIFuncs.emplace_back(ExchangeMethodSharedPtr(
                // ... (source lines 390-391 not shown in this listing)
            MPIFuncsNames.emplace_back("NeighborAllToAllV");
        }

        int numPoints = trace->GetNpoints();
        int warmup = 10, iter = 50;
        NekDouble min, max;
        std::vector<NekDouble> avg(MPIFuncs.size(), -1);
        bool verbose = locExp.GetSession()->DefinesCmdLineArgument("verbose");

        if (verbose && comm->GetRank() == 0)
        {
            std::cout << "MPI setup for trace exchange: " << std::endl;
        }

        for (size_t i = 0; i < MPIFuncs.size(); ++i)
        {
            Timing(comm, warmup, numPoints, MPIFuncs[i]);
            std::tie(avg[i], min, max) =
                Timing(comm, iter, numPoints, MPIFuncs[i]);
            if (verbose && comm->GetRank() == 0)
            {
                std::cout << " " << MPIFuncsNames[i]
                          << " times (avg, min, max): " << avg[i] << " " << min
                          << " " << max << std::endl;
            }
        }

        // Gets the fastest MPI method
        int fastestMPI = std::distance(
            avg.begin(), std::min_element(avg.begin(), avg.end()));

        if (verbose && comm->GetRank() == 0)
        {
            std::cout << " Chosen fastest method: "
                      << MPIFuncsNames[fastestMPI] << std::endl;
        }

        m_exchange = MPIFuncs[fastestMPI];
    }
}

References Nektar::MultiRegions::ExpList::GetSession(), InitialiseStructure(), m_edgeToTrace, m_exchange, m_maxQuad, m_nRanks, m_rankSharedEdges, and Timing().

Member Function Documentation

◆ InitialiseStructure()

void Nektar::MultiRegions::AssemblyCommDG::InitialiseStructure ( const ExpList &  locExp,
const ExpListSharedPtr &  trace,
const Array< OneD, Array< OneD, LocalRegions::ExpansionSharedPtr >> &  elmtToTrace,
const Array< OneD, const ExpListSharedPtr > &  bndCondExp,
const Array< OneD, const SpatialDomains::BoundaryConditionShPtr > &  bndCond,
const PeriodicMap &  perMap,
const LibUtilities::CommSharedPtr &  comm 
)
private

Initialises the structure for the MPI communication.

This function sets up the initial structure that allows the exchange methods to be created. This structure is held in the member variable m_rankSharedEdges, which maps each rank to the vector of edges shared with that rank. It is filled as follows:

  • Create an edge to trace mapping, and realign periodic edges within this mapping so that they have the same data layout for ranks sharing periodic boundaries.
  • Create a list of all local edge IDs and calculate the maximum number of quadrature points used locally, then perform an AllReduce to find the maximum number of quadrature points across all ranks (for the AllToAll method).
  • Create a list of all boundary edge IDs except for those which are periodic.
  • Using the boundary ID list and the list of all local IDs, construct a unique list of IDs which are on a partition boundary (i.e. an ID that does not occur twice in the local list and does not occur in the boundary list is on a partition boundary). For a periodic edge, also check whether the other side is local; if it is not, add the minimum of the two periodic IDs to the unique list, since the numbering scheme must be consistent across ranks.
  • Send the unique list to all other ranks/partitions. Each rank's unique list is then compared with the local unique edge ID list; if a match is found, the member variable m_rankSharedEdges is filled with the matching rank and unique edge ID.
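The fourth step above, picking out the partition-boundary IDs, can be sketched on plain containers. This is an illustration under stated assumptions rather than the library code: `PeriodicInfo` is a hypothetical, simplified stand-in for the periodic map entries (the real entries also carry an orientation), `PartitionBoundaryIds` is a made-up name, and occurrences are counted with a `std::map` instead of the pairwise comparison used in the listing.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical, simplified periodic record: the partner ID and whether
// the partner lives on this rank.
struct PeriodicInfo
{
    int id;
    bool isLocal;
};

// Returns the sorted IDs that lie on a partition boundary: they occur once
// locally, are not physical boundary IDs, and are not periodic with a
// local partner. Periodic IDs with a remote partner use the smaller of the
// two IDs so both ranks agree on the numbering.
std::vector<int> PartitionBoundaryIds(
    const std::vector<int> &localIds, const std::set<int> &bndIds,
    const std::map<int, PeriodicInfo> &periodic)
{
    std::map<int, int> occurrences;
    for (int id : localIds)
    {
        ++occurrences[id];
    }

    std::vector<int> unique;
    for (const auto &entry : occurrences)
    {
        const int id = entry.first;
        assert(entry.second >= 1); // every recorded ID was seen at least once
        if (entry.second > 1 || bndIds.count(id) > 0)
        {
            continue; // interior to this partition, or a physical boundary
        }

        auto it = periodic.find(id);
        if (it == periodic.end())
        {
            unique.push_back(id);
        }
        else if (!it->second.isLocal)
        {
            unique.push_back(std::min(id, it->second.id));
        }
    }

    std::sort(unique.begin(), unique.end());
    return unique;
}
```

For example, with local IDs {1, 2, 2, 3, 4, 7}, boundary ID 3, ID 4 periodic with a local partner and ID 7 periodic with remote partner 5, the function keeps 1 and replaces 7 by 5.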

Definition at line 458 of file AssemblyCommDG.cpp.

{
    Array<OneD, int> tmp;
    int quad = 0, nDim = 0, eid = 0, offset = 0;
    const LocalRegions::ExpansionVector &locExpVector = *(locExp.GetExp());

    // Assume that each element of the expansion is of the same
    // dimension.
    nDim = locExpVector[0]->GetShapeDimension();

    // This sets up the edge to trace mapping and realigns periodic edges
    if (nDim == 1)
    {
        for (size_t i = 0; i < trace->GetExpSize(); ++i)
        {
            eid = trace->GetExp(i)->GetGeom()->GetGlobalID();
            offset = trace->GetPhys_Offset(i);

            // Check to see if this vert is periodic. If it is, then we
            // need use the unique eid of the two points
            auto it = perMap.find(eid);
            if (perMap.count(eid) > 0)
            {
                PeriodicEntity ent = it->second[0];
                if (!ent.isLocal) // Not sure if true in 1D
                {
                    eid = std::min(eid, ent.id);
                }
            }

            m_edgeToTrace[eid].emplace_back(offset);
        }
    }
    else
    {
        for (size_t i = 0; i < trace->GetExpSize(); ++i)
        {
            eid = trace->GetExp(i)->GetGeom()->GetGlobalID();
            offset = trace->GetPhys_Offset(i);
            quad = trace->GetExp(i)->GetTotPoints();

            // Check to see if this edge is periodic. If it is, then we
            // need to reverse the trace order of one edge only in the
            // edge to trace map so that the data are reversed w.r.t each
            // other. We do this by using the minimum of the two IDs.
            auto it = perMap.find(eid);
            bool realign = false;
            if (perMap.count(eid) > 0)
            {
                PeriodicEntity ent = it->second[0];
                if (!ent.isLocal)
                {
                    realign = eid == std::min(eid, ent.id);
                    eid = std::min(eid, ent.id);
                }
            }

            for (size_t j = 0; j < quad; ++j)
            {
                m_edgeToTrace[eid].emplace_back(offset + j);
            }

            if (realign)
            {
                // Realign some periodic edges in m_edgeToTrace
                Array<OneD, int> tmpArray(m_edgeToTrace[eid].size());
                for (size_t j = 0; j < m_edgeToTrace[eid].size(); ++j)
                {
                    tmpArray[j] = m_edgeToTrace[eid][j];
                }

                StdRegions::Orientation orient = it->second[0].orient;

                if (nDim == 2)
                {
                    AssemblyMapDG::RealignTraceElement(tmpArray, orient, quad);
                }
                else
                {
                    // Orient is going from face 2 -> face 1 but we want face 1
                    // -> face 2; in all cases except below these are
                    // equivalent. However below is not equivalent so we use the
                    // reverse of the mapping.
                    // ... (source line 547 not shown in this listing)
                    {
                        // ... (source line 549 not shown in this listing)
                    }
                    else if (orient == StdRegions::eDir1BwdDir2_Dir2FwdDir1)
                    {
                        // ... (source line 553 not shown in this listing)
                    }

                    // ... (source line 556 not shown in this listing)
                        tmpArray, orient, trace->GetExp(i)->GetNumPoints(0),
                        trace->GetExp(i)->GetNumPoints(1));
                }

                for (size_t j = 0; j < m_edgeToTrace[eid].size(); ++j)
                {
                    m_edgeToTrace[eid][j] = tmpArray[j];
                }
            }
        }
    }

    // This creates a list of all geometry of problem dimension - 1
    // and populates the maxQuad member variable
    std::vector<int> localEdgeIds;
    for (eid = 0; eid < locExpVector.size(); ++eid)
    {
        LocalRegions::ExpansionSharedPtr locExpansion = locExpVector[eid];

        for (size_t j = 0; j < locExpansion->GetNtraces(); ++j)
        {
            int id = elmtToTrace[eid][j]->GetGeom()->GetGlobalID();
            localEdgeIds.emplace_back(id);
        }

        quad = locExpansion->GetTotPoints();
        if (quad > m_maxQuad)
        {
            m_maxQuad = quad;
        }
    }

    // Find max quadrature points across all processes
    comm->AllReduce(m_maxQuad, LibUtilities::ReduceMax);

    // Create list of boundary edge IDs
    std::set<int> bndIdList;
    for (size_t i = 0; i < bndCond.size(); ++i)
    {
        for (size_t j = 0; j < bndCondExp[i]->GetExpSize(); ++j)
        {
            eid = bndCondExp[i]->GetExp(j)->GetGeom()->GetGlobalID();
            if (perMap.find(eid) ==
                perMap.end()) // Don't add if periodic boundary
            {
                bndIdList.insert(eid);
            }
        }
    }

    // Get unique edges to send
    std::vector<int> uniqueEdgeIds;
    std::vector<bool> duplicated(localEdgeIds.size(), false);
    for (size_t i = 0; i < localEdgeIds.size(); ++i)
    {
        eid = localEdgeIds[i];
        for (size_t j = i + 1; j < localEdgeIds.size(); ++j)
        {
            if (eid == localEdgeIds[j])
            {
                duplicated[i] = duplicated[j] = true;
            }
        }

        if (!duplicated[i]) // Not duplicated in local partition
        {
            if (bndIdList.find(eid) == bndIdList.end()) // Not a boundary edge
            {
                // Check if periodic and if not local set eid to other side
                auto it = perMap.find(eid);
                if (it != perMap.end())
                {
                    if (!it->second[0].isLocal)
                    {
                        uniqueEdgeIds.emplace_back(
                            std::min(eid, it->second[0].id));
                    }
                }
                else
                {
                    uniqueEdgeIds.emplace_back(eid);
                }
            }
        }
    }

    // Send uniqueEdgeIds size so all partitions can prepare buffers
    m_nRanks = comm->GetSize();
    Array<OneD, int> rankNumEdges(m_nRanks);
    Array<OneD, int> localEdgeSize(1, uniqueEdgeIds.size());
    comm->AllGather(localEdgeSize, rankNumEdges);

    Array<OneD, int> rankLocalEdgeDisp(m_nRanks, 0);
    for (size_t i = 1; i < m_nRanks; ++i)
    {
        rankLocalEdgeDisp[i] = rankLocalEdgeDisp[i - 1] + rankNumEdges[i - 1];
    }

    Array<OneD, int> localEdgeIdsArray(uniqueEdgeIds.size());
    for (size_t i = 0; i < uniqueEdgeIds.size(); ++i)
    {
        localEdgeIdsArray[i] = uniqueEdgeIds[i];
    }

    // Sort localEdgeIdsArray before sending (this is important!)
    std::sort(localEdgeIdsArray.begin(), localEdgeIdsArray.end());

    Array<OneD, int> rankLocalEdgeIds(
        std::accumulate(rankNumEdges.begin(), rankNumEdges.end(), 0), 0);

    // Send all unique edge IDs to all partitions
    comm->AllGatherv(localEdgeIdsArray, rankLocalEdgeIds, rankNumEdges,
                     rankLocalEdgeDisp);

    // Find what edge Ids match with other ranks
    size_t myRank = comm->GetRank();
    Array<OneD, int> perTraceSend(m_nRanks, 0);
    for (size_t i = 0; i < m_nRanks; ++i)
    {
        if (i == myRank)
        {
            continue;
        }

        for (size_t j = 0; j < rankNumEdges[i]; ++j)
        {
            int edgeId = rankLocalEdgeIds[rankLocalEdgeDisp[i] + j];
            if (std::find(uniqueEdgeIds.begin(), uniqueEdgeIds.end(), edgeId) !=
                uniqueEdgeIds.end())
            {
                m_rankSharedEdges[i].emplace_back(edgeId);
            }
        }
    }
}

References Nektar::StdRegions::eDir1BwdDir2_Dir2FwdDir1, Nektar::StdRegions::eDir1FwdDir2_Dir2BwdDir1, Nektar::StdRegions::find(), Nektar::MultiRegions::ExpList::GetExp(), Nektar::MultiRegions::PeriodicEntity::id, Nektar::MultiRegions::PeriodicEntity::isLocal, m_edgeToTrace, m_maxQuad, m_nRanks, m_rankSharedEdges, Nektar::MultiRegions::AssemblyMapDG::RealignTraceElement(), and Nektar::LibUtilities::ReduceMax.

Referenced by AssemblyCommDG().
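The final stage of InitialiseStructure, computing gather displacements and matching edge IDs against other ranks, can be illustrated without MPI. In this sketch the gathered buffer is passed in directly instead of being produced by an AllGatherv call, `Displacements` and `SharedEdges` are hypothetical names, and the local list is assumed to be sorted so that `std::binary_search` can replace the linear `std::find` used in the listing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Exclusive prefix sum: disp[i] = counts[0] + ... + counts[i - 1].
std::vector<int> Displacements(const std::vector<int> &counts)
{
    std::vector<int> disp(counts.size(), 0);
    for (std::size_t i = 1; i < counts.size(); ++i)
    {
        disp[i] = disp[i - 1] + counts[i - 1];
    }
    return disp;
}

// Given every rank's IDs concatenated in rank order (`gathered`) and the
// number of IDs per rank (`counts`), return for each other rank the IDs it
// has in common with this rank's sorted list.
std::map<int, std::vector<int>> SharedEdges(
    const std::vector<int> &gathered, const std::vector<int> &counts,
    int myRank, const std::vector<int> &myIdsSorted)
{
    assert(std::is_sorted(myIdsSorted.begin(), myIdsSorted.end()));

    const std::vector<int> disp = Displacements(counts);
    std::map<int, std::vector<int>> shared;
    for (std::size_t r = 0; r < counts.size(); ++r)
    {
        if (static_cast<int>(r) == myRank)
        {
            continue; // a rank does not exchange with itself
        }

        for (int j = 0; j < counts[r]; ++j)
        {
            const int id = gathered[disp[r] + j];
            if (std::binary_search(myIdsSorted.begin(), myIdsSorted.end(),
                                   id))
            {
                shared[static_cast<int>(r)].push_back(id);
            }
        }
    }
    return shared;
}
```

With counts {2, 1, 2} the displacements are {0, 2, 3}, so rank 1's single ID sits at index 2 of the gathered buffer and rank 2's IDs start at index 3.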

◆ PerformExchange()

void Nektar::MultiRegions::AssemblyCommDG::PerformExchange ( const Array< OneD, NekDouble > &  testFwd,
Array< OneD, NekDouble > &  testBwd 
)
inline

Perform the trace exchange between processors, given the forwards and backwards spaces.

Parameters
testFwd    Local forwards space of the trace (which will be sent)
testBwd    Local backwards space of the trace (which will receive contributions)

Definition at line 270 of file AssemblyCommDG.h.

{
    m_exchange->PerformExchange(testFwd, testBwd);
}

References m_exchange.

◆ Timing()

std::tuple< NekDouble, NekDouble, NekDouble > Nektar::MultiRegions::AssemblyCommDG::Timing ( const LibUtilities::CommSharedPtr &  comm,
const int &  count,
const int &  num,
const ExchangeMethodSharedPtr &  f 
)
static private

Timing of the MPI exchange method.

Times the exchange method f by performing the exchange count times on an array of length num.

Parameters
comm     Communicator
count    Number of timing iterations to run
num      Number of quadrature points to communicate
f        ExchangeMethod to time
Returns
tuple of loop times {avg, min, max}
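How the returned tuple is assembled can be shown without a communicator. In this sketch, which is an assumption-laden illustration and not the library routine, the per-rank loop times are supplied as a vector and `SummariseTimes` (a hypothetical name) stands in for the three reductions (minimum, maximum, and sum divided by the number of ranks).

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <tuple>
#include <vector>

// Combine one timing per rank into {avg, min, max}, mirroring a min
// reduction, a max reduction, and a sum reduction divided by the rank count.
std::tuple<double, double, double> SummariseTimes(
    const std::vector<double> &perRank)
{
    assert(!perRank.empty());
    const double minTime = *std::min_element(perRank.begin(), perRank.end());
    const double maxTime = *std::max_element(perRank.begin(), perRank.end());
    const double sumTime =
        std::accumulate(perRank.begin(), perRank.end(), 0.0);
    const double avgTime = sumTime / static_cast<double>(perRank.size());
    return std::make_tuple(avgTime, minTime, maxTime);
}
```

The caller can unpack the result with `std::tie(avg, min, max)`, as the constructor does with the value returned by Timing.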

Definition at line 704 of file AssemblyCommDG.cpp.

{
    Array<OneD, NekDouble> testFwd(num, 1);
    Array<OneD, NekDouble> testBwd(num, -2);

    LibUtilities::Timer t;
    t.Start();
    for (size_t i = 0; i < count; ++i)
    {
        f->PerformExchange(testFwd, testBwd);
    }
    t.Stop();

    // These can just be 'reduce' but need to setup the wrapper in comm.h
    Array<OneD, NekDouble> minTime(1, t.TimePerTest(count));
    comm->AllReduce(minTime, LibUtilities::ReduceMin);

    Array<OneD, NekDouble> maxTime(1, t.TimePerTest(count));
    comm->AllReduce(maxTime, LibUtilities::ReduceMax);

    Array<OneD, NekDouble> sumTime(1, t.TimePerTest(count));
    comm->AllReduce(sumTime, LibUtilities::ReduceSum);

    NekDouble avgTime = sumTime[0] / comm->GetSize();
    return std::make_tuple(avgTime, minTime[0], maxTime[0]);
}

References Nektar::LibUtilities::ReduceMax, Nektar::LibUtilities::ReduceMin, Nektar::LibUtilities::ReduceSum, Nektar::LibUtilities::Timer::Start(), Nektar::LibUtilities::Timer::Stop(), and Nektar::LibUtilities::Timer::TimePerTest().

Referenced by AssemblyCommDG().

Member Data Documentation

◆ m_edgeToTrace

std::map<int, std::vector<int> > Nektar::MultiRegions::AssemblyCommDG::m_edgeToTrace
private

Map of edge ID to quad point trace indices.

Definition at line 286 of file AssemblyCommDG.h.

Referenced by AssemblyCommDG(), and InitialiseStructure().
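As an illustration of what m_edgeToTrace holds, the sketch below builds such a map from a flat description of the trace. `TraceElement` and `BuildEdgeToTrace` are hypothetical names, each trace element is reduced to its global ID, its offset into the trace array and its number of quadrature points, and the periodic realignment performed in InitialiseStructure is left out.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical, simplified view of one trace element.
struct TraceElement
{
    int edgeId;    // global ID of the edge
    int offset;    // index of its first quadrature point in the trace array
    int numPoints; // number of quadrature points on the edge
};

// Map each edge ID to the consecutive trace indices of its quadrature points.
std::map<int, std::vector<int>> BuildEdgeToTrace(
    const std::vector<TraceElement> &trace)
{
    std::map<int, std::vector<int>> edgeToTrace;
    for (const auto &e : trace)
    {
        assert(e.numPoints >= 0);
        for (int j = 0; j < e.numPoints; ++j)
        {
            edgeToTrace[e.edgeId].push_back(e.offset + j);
        }
    }
    return edgeToTrace;
}
```

An edge with ID 10, offset 0 and three points maps to indices {0, 1, 2}; a following edge with ID 11, offset 3 and two points maps to {3, 4}.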

◆ m_exchange

ExchangeMethodSharedPtr Nektar::MultiRegions::AssemblyCommDG::m_exchange
private

Chosen exchange method (either fastest parallel or serial)

Definition at line 278 of file AssemblyCommDG.h.

Referenced by AssemblyCommDG(), and PerformExchange().

◆ m_maxQuad

int Nektar::MultiRegions::AssemblyCommDG::m_maxQuad = 0
private

Max number of quadrature points in an element.

Definition at line 280 of file AssemblyCommDG.h.

Referenced by AssemblyCommDG(), and InitialiseStructure().

◆ m_nRanks

int Nektar::MultiRegions::AssemblyCommDG::m_nRanks = 0
private

Number of ranks/processes/partitions.

Definition at line 282 of file AssemblyCommDG.h.

Referenced by AssemblyCommDG(), and InitialiseStructure().

◆ m_rankSharedEdges

std::map<int, std::vector<int> > Nektar::MultiRegions::AssemblyCommDG::m_rankSharedEdges
private

Map of process to shared edge IDs.

Definition at line 284 of file AssemblyCommDG.h.

Referenced by AssemblyCommDG(), and InitialiseStructure().