14.2 Collections

The Collections library adds optimisations that perform certain elemental operations collectively, applying an operator through a single matrix-matrix operation rather than a sequence of matrix-vector multiplications. Certain operators benefit more than others from this treatment, so several implementations are available: StdMat, SumFac, IterPerExp and NoCollection.

All configuration relating to Collections is given in the COLLECTIONS XML element within the NEKTAR XML element.
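
As a placement sketch only, the fragment below shows where the COLLECTIONS element would sit in a session file. The comments stand in for content that is not specified here, and the other sections of the session file are omitted.

```xml
<NEKTAR>
    <!-- other sections of the session file omitted -->
    <COLLECTIONS>
        <!-- Collections settings described in this section go here -->
    </COLLECTIONS>
</NEKTAR>
```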

14.2.1 Default implementation

The default implementation for all operators may be chosen by setting the DEFAULT attribute of the COLLECTIONS XML element to one of StdMat, SumFac, IterPerExp or NoCollection. For example, the following uses the collated matrix-matrix type elemental operation for all operators and expansion orders:
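
A minimal sketch of such a setting, assuming that StdMat is the option name corresponding to the collated matrix-matrix type elemental operation (the mapping of name to strategy is inferred from the option list above):

```xml
<!-- Assumption: StdMat selects the collated matrix-matrix implementation -->
<COLLECTIONS DEFAULT="StdMat" />
```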


14.2.2 Auto-tuning

The choice of implementation for each operator, for the given mesh and expansion orders, can be selected automatically through auto-tuning. To enable this, add the following to the Nektar++ session file:
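
A minimal sketch of the auto-tuning setting, assuming it is requested by giving the DEFAULT attribute the value auto (this value is not listed among the options in this section and should be checked against the installed version):

```xml
<!-- Assumption: DEFAULT="auto" requests auto-tuning instead of a fixed implementation -->
<COLLECTIONS DEFAULT="auto" />
```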


This will collate elements from the given mesh and expansion orders, run and time each implementation strategy in turn, and select the fastest. Note that the selections are specific to the mesh and the expansion orders. The selections made by auto-tuning are printed if the --verbose command-line switch is given.

14.2.3 Manual selection

The choice of implementation for each operator may be set manually within the COLLECTIONS tag as shown in the following example. Different implementations may be chosen for different element shapes and expansion orders. Specifying * for ORDER sets the default implementation for any expansion orders not explicitly defined.

<COLLECTIONS>
    <OPERATOR TYPE="BwdTrans">
        <ELEMENT TYPE="T" ORDER="*"   IMPTYPE="IterPerExp" />
        <ELEMENT TYPE="T" ORDER="1-5" IMPTYPE="StdMat" />
    </OPERATOR>
    <OPERATOR TYPE="IProductWRTBase">
        <ELEMENT TYPE="Q" ORDER="*"   IMPTYPE="SumFac" />
    </OPERATOR>
</COLLECTIONS>

Manual selection is intended to record the optimal selections for a given hardware platform after extensive prior testing, so that auto-tuning does not need to be repeated on every run.

14.2.4 Collection size

The maximum number of elements within a single collection can be enforced using the MAXSIZE attribute.
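
A minimal sketch, assuming that MAXSIZE is given on the COLLECTIONS element alongside the other Collections settings. The value 64 and the choice of SumFac are arbitrary and purely illustrative:

```xml
<!-- Assumption: MAXSIZE is an attribute of COLLECTIONS; 64 is an illustrative limit -->
<COLLECTIONS DEFAULT="SumFac" MAXSIZE="64" />
```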