Browsing by Author "Nowak, Fabian"
- Conference paper: Adaptive Cache Infrastructure: Supporting dynamic Program Changes following dynamic Program Behavior (9th workshop on parallel systems and algorithms – workshop of the GI/ITG special interest groups PARS and PARVA, 2008)
  Nowak, Fabian; Buchty, Rainer; Karl, Wolfgang
  Recent examinations of program behavior at run-time revealed distinct phases. Thus, it is evident that a framework for supporting hardware adaptation to phase behavior is needed. With the memory access behavior being most important and cache accesses being a very big subset of them, we herein propose an infrastructure for fitting cache accesses to a program’s requirements for a distinct phase.
- Journal article: An Architecture Framework for Porting Applications to FPGAs (PARS-Mitteilungen: Vol. 31, No. 1, 2014)
  Nowak, Fabian; Bromberger, Michael; Karl, Wolfgang
  High-level language converters help creating FPGA-based accelerators and allow to rapidly come up with a working prototype. But the generated state machines do often not perform as optimal as hand-designed control units, and they require much area. Also, the created deep pipelines are not very efficient for small amounts of data. Our approach is an architecture framework of hand-coded building blocks (BBs). A microprogrammable control unit allows programming the BBs to perform computations in a data-flow style. We accelerate applications further by executing independent tasks in parallel on different BBs. Our microprogram implementation for the Conjugate-Gradient method on our data-driven, microprogrammable, task-parallel architecture framework on the Convey HC-1 is competitive with a 24-thread Intel Westmere system. It is 1.2× faster using only one out of four available FPGAs, thereby proving its potential for accelerating numerical applications. Moreover, we show that hardware developers can change the BBs and thereby reduce iteration count of a numerical algorithm like the Conjugate-Gradient method to less than 0.5× due to more precise operations inside the BBs, speeding up execution time 2.47×.
- Journal article: Evaluating the Energy Efficiency of Reconfigurable Computing Toward Heterogeneous Multi-Core Computing (PARS-Mitteilungen: Vol. 31, No. 1, 2014)
  Nowak, Fabian
  Future exascale systems need to have a much better performance-to-power ratio than today’s systems. Accelerators are a promising approach to pave this path by more energy-efficient computing. We show some early results of our investigations toward energy efficiency of reconfigurable and heterogeneous computing against multi-core processors for special applications. The results are supported by a general framework and toolchain for early evaluation of potential benefits of reconfigurable hardware. As a result, heterogeneous systems based on reconfigurable hardware, efficient data exchange mechanisms, data-driven and component-based programming, and task-parallel execution can help achieve power-efficient exascale systems in future.
- Journal article: Parallel Prefiltering for Accelerating HHblits on the Convey HC-1 (PARS-Mitteilungen: Vol. 30, No. 1, 2013)
  Bromberger, Michael; Nowak, Fabian
  HHblits is a bioinformatics application for finding proteins with common ancestors. To achieve more sensitivity, the protein sequences of the query are not compared directly against the database protein sequences, but rather their Hidden Markov Models are compared. Thus, HHblits is very time-consuming and therefore needs to be accelerated. A multi-FPGA system such as the Convey HC-1 is a promising candidate to achieve acceleration. We present the design and implementation of a parallel coprocessor on the Convey HC-1 to accelerate HHblits after analyzing the application toward acceleration candidates. We achieve a speedup of 117.5× against a sequential implementation for FPGA-suitable data sizes per kernel and negligible speedup for the entire uniprot20 protein database against an optimized SSE implementation.
- Journal article: Parallel Prefiltering for Accelerating HHblits on the Convey HC-1 (PARS: Parallel-Algorithmen, -Rechnerstrukturen und -Systemsoftware: Vol. 30, No. 1, 2013)
  Bromberger, Michael; Nowak, Fabian
  HHblits is a bioinformatics application for finding proteins with common ancestors. To achieve more sensitivity, the protein sequences of the query are not compared directly against the database protein sequences, but rather their Hidden Markov Models are compared. Thus, HHblits is very time-consuming and therefore needs to be accelerated. A multi-FPGA system such as the Convey HC-1 is a promising candidate to achieve acceleration. We present the design and implementation of a parallel coprocessor on the Convey HC-1 to accelerate HHblits after analyzing the application toward acceleration candidates. We achieve a speedup of 117.5× against a sequential implementation for FPGA-suitable data sizes per kernel and negligible speedup for the entire uniprot20 protein database against an optimized SSE implementation.