BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20190719T085744Z
LOCATION:HG EO Nord
DTSTART;TZID=Europe/Stockholm:20190613T195000
DTEND;TZID=Europe/Stockholm:20190613T215000
UID:submissions.pasc-conference.org_PASC19_sess179_post110@linklings.com
SUMMARY:CHM04 - DBCSR: A Library for Sparse Linear Algebra
DESCRIPTION:Poster\n\n\nCHM04 - DBCSR: A Library for Sparse Linear Algebra
 \n\nSeewald\, Jakobovits\, Sivkov\, Lazzaro\, Müller...\n\nThe DBCSR lib
 rary provides an optimized implementation for carrying out dense and s
 parse matrix-matrix multiplication. Its multi-layered structure automa
 tically handles and optimizes several computational aspects\, such as p
 arallelism (MPI\, OpenMP\, GPU)\, data (cache) locality\, and on-the-f
 ly filtering. DBCSR was originally developed with a focus on iterati
 ve methods to compute matrix functions for linear-scaling electronic st
 ructure calculations. Recently\, the library was generalized to multidi
 mensional tensor contraction for low-scaling methods beyond density fu
 nctional theory. We report optimizations specifically targeting the ca
 se of rectangular matrix-matrix multiplication on non-square process g
 rids arising in our tensor implementation. We also report improvemen
 ts to our GPU implementation through just-in-time (JIT) compilation o
 f CUDA kernels for small matrix-matrix multiplication.
END:VEVENT
END:VCALENDAR

