BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20190719T085744Z
LOCATION:HG F 1
DTSTART;TZID=Europe/Stockholm:20190613T171500
DTEND;TZID=Europe/Stockholm:20190613T174500
UID:submissions.pasc-conference.org_PASC19_sess140_msa252@linklings.com
SUMMARY:MFEM: Accelerating Efficient Solution of PDEs at Exascale
DESCRIPTION:Minisymposium\nComputer Science and Applied Mathematics, Engin
 eering\n\nMFEM: Accelerating Efficient Solution of PDEs at Exascale\n\nKol
 ev\n\nEfficient exploitation of exascale architectures requires rethinking
  of the numerical algorithms used in large-scale PDE-based applications. T
 hese architectures will favor algorithms, such as high-order finite elemen
 ts, that expose fine-grain parallelism and maximize the ratio of floating-
 point operations to energy-intensive data movement. In this talk we presen
 t an overview of MFEM (mfem.org), a scalable library for high-order finite
  element discretization of PDEs on general unstructured grids. We also rep
 ort on recent work in the Center for Efficient Exascale Discretizations (h
 ttp://ceed.exascaleproject.org), a co-design center in the US Exascale Com
 puting Project focused on next-generation discretization software and algo
 rithms. Our approach to efficiency is based on a "matrix-free" representat
 ion of the finite element operator that factors a bilinear form into a se
 ries of sparse and dense components corresponding to the parallelism, mesh
  topology, basis, geometry, and pointwise physics in the problem. The oper
 ator decomposition exposes several layers of parallelism, enables the use 
 of batched dgemms and tensor contractions, and only requires quadrature po
 int values to be assembled for computing the action. This "partial assembl
 y" formulation results both in less (nearly optimal) computation and less 
 (optimal) data movement compared to assembling a global sparse matrix, the
 refore increasing performance and reducing time to solution.
END:VEVENT
END:VCALENDAR

