Browsing by author "Boehm, Matthias"
1 - 8 of 8
- Conference paper: Berufsbegleitende Weiterbildung im Spannungsfeld von Wissenschaft und IT-Beratung: State-of-the-Art und Entwicklung eines Vorgehensmodells (INFORMATIK 2011 – Informatik schafft Communities, 2011). Boehm, Matthias; Stolze, Carl; Thomas, Oliver. This paper uses an integrated approach to part-time continuing education to show how academia and IT consulting can collaborate more closely. The approach is based on an extensive literature analysis and a state-of-the-art survey of university continuing-education offerings. A procedure model for cooperation between academia and practice is then described. Finally, the paper shows that researchers and IT consultants in particular can benefit from one another in jointly identifying, understanding, and analyzing trends.
- Text document: Efficient Data-Parallel Cumulative Aggregates for Large-Scale Machine Learning (BTW 2019, 2019). Boehm, Matthias; Evfimievski, Alexandre; Reinwald, Berthold. Cumulative aggregates are often overlooked yet important operations in large-scale machine learning (ML) systems. Examples are prefix sums and more complex aggregates, but also preprocessing techniques such as the removal of empty rows or columns. These operations are challenging to parallelize over distributed, blocked matrices, as commonly used in ML systems, due to recursive data dependencies. However, computing prefix sums is a classic example of a presumably sequential operation that can be efficiently parallelized via aggregation trees. In this paper, we describe an efficient framework for data-parallel cumulative aggregates over distributed, blocked matrices. The basic idea is a self-similar operator composed of a forward cascade that reduces the data size by orders of magnitude per iteration until the data fits in local memory, a local cumulative aggregate over the partial aggregates, and a backward cascade to produce the final result. We also generalize this framework for complex cumulative aggregates of sum-product expressions, and characterize the class of supported operations. Finally, we describe the end-to-end compiler and runtime integration into SystemML, and the use of cumulative aggregates in other operations. Our experiments show that this framework achieves both high performance for moderate data sizes and good scalability.
- Conference paper: Efficient in-memory indexing with generalized prefix trees (Datenbanksysteme für Business, Technologie und Web (BTW), 2011). Boehm, Matthias; Schlegel, Benjamin; Volk, Peter Benjamin; Fischer, Ulrike; Habich, Dirk; Lehner, Wolfgang. Efficient data structures for in-memory indexing gain in importance due to (1) the exponentially increasing amount of data, (2) the growing main-memory capacity, and (3) the gap between main-memory and CPU speed. In consequence, there are high performance demands for in-memory data structures. Such index structures are used, with minor changes, as primary or secondary indices in almost every DBMS. Typically, tree-based or hash-based structures are used, while structures based on prefix trees (tries) are neglected in this context. For tree-based and hash-based structures, the major disadvantages are inherently caused by the need for reorganization and key comparisons. In contrast, the major disadvantage of trie-based structures, their high memory consumption (created and accessed nodes), could be improved. In this paper, we argue for reconsidering prefix trees as in-memory index structures and we present the generalized trie, which is a prefix tree with variable prefix length for indexing arbitrary data types of fixed or variable length. The variable prefix length enables the adjustment of the trie height and its memory consumption. Further, we introduce concepts for reducing the number of created and accessed trie levels. This trie is order-preserving and has deterministic trie paths for keys, and hence, it does not require any dynamic reorganization or key comparisons. Finally, the generalized trie yields improvements compared to existing in-memory index structures, especially for skewed data. In conclusion, the generalized trie is applicable as a general-purpose in-memory index structure in many different OLTP or hybrid (OLTP and OLAP) data management systems that require balanced read/write performance.
- Conference paper: Enabling Integrated Data Analysis Pipelines on Heterogeneous Hardware through Holistic Extensibility (BTW 2023, 2023). Damme, Patrick; Boehm, Matthias. This submission is an extended abstract.
- Journal article: Internationaler Vergleich berufsbegleitender Weiterbildung im IT-Management und -Consulting (Wirtschaftsinformatik & Management: Vol. 5, No. 5, 2013). Boehm, Matthias; Stolze, Carl; Ewald, Sven; Thomas, Oliver.
- Conference paper: Offline design tuning for hierarchies of forecast models (Datenbanksysteme für Business, Technologie und Web (BTW), 2011). Fischer, Ulrike; Boehm, Matthias; Lehner, Wolfgang. Forecasting of time series data is crucial for decision-making processes in many domains as it allows the prediction of future behavior. In this context, a model is fit to the observed data points of the time series by estimating the model parameters. The computed parameters are then utilized to forecast future points in time. Existing approaches integrate forecasting into traditional relational query processing, where a forecast query requests the creation of a forecast model. Models of continued interest should be deployed only once and used many times afterwards. This, however, leads to additional maintenance costs as models need to be kept up-to-date. Costs can be reduced by choosing a well-defined subset of models and answering queries using derivation schemes. In contrast to materialized view selection, model selection opens a whole new problem area as results are approximate. A derivation scheme might increase or decrease the accuracy of a forecast query. Thus, a two-dimensional optimization problem of minimizing the model cost and model usage error is introduced in this paper. Our solution consists of a greedy enumeration approach that empirically evaluates different configurations of forecast models. In our experimental evaluation, with data sets from different domains, we show the superiority of our approach over traditional approaches from forecasting literature.
- Journal article: Towards Integrated Data Analytics: Time Series Forecasting in DBMS (Datenbank-Spektrum: Vol. 13, No. 1, 2013). Fischer, Ulrike; Dannecker, Lars; Siksnys, Laurynas; Rosenthal, Frank; Boehm, Matthias; Lehner, Wolfgang. Integrating sophisticated statistical methods into database management systems is gaining more and more attention in research and industry in order to cope with increasing data volume and increasing complexity of the analytical algorithms. One important statistical method is time series forecasting, which is crucial for decision-making processes in many domains. The deep integration of time series forecasting offers additional advanced functionalities within a DBMS. More importantly, however, it allows for optimizations that improve the efficiency, consistency, and transparency of the overall forecasting process. To enable efficient integrated forecasting, we propose to enhance the traditional 3-layer ANSI/SPARC architecture of a DBMS with forecasting functionalities. This article gives a general overview of our proposed enhancements and presents how forecast queries can be processed using an example from the energy data management domain. We conclude with open research topics and challenges that arise in this area.
- Conference paper: Understanding IT-management and IT-consulting teaching as product-service system: application of an engineering model (Enterprise modelling and information systems architectures (EMISA 2011), 2011). Boehm, Matthias; Stolze, Carl; Thomas, Oliver. In this research-in-progress paper we conceptualize the teaching of IT-Management and IT-Consulting as a hybrid package of products (resources) and services (teaching). This understanding offers a new perspective on teaching approaches and creates new opportunities for all stakeholders. Following a well-established procedure model for product-service systems (PSS) engineering, we derive the customer requirements for the hybrid package. Then a product model is developed. Based on these findings, an Education Integration Platform Solution (EIPS) is prototypically designed. Finally, a conclusion and outlook are given.
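
The abstract of "Efficient Data-Parallel Cumulative Aggregates for Large-Scale Machine Learning" above describes an operator built from a forward cascade, a local cumulative aggregate, and a backward cascade. The following is a minimal single-process sketch of that idea for plain prefix sums over a Python list. It is not the SystemML implementation; the function name `cumsum_blocked` and the parameters `block_size` and `local_limit` are hypothetical and chosen only for illustration.

```python
from itertools import accumulate


def cumsum_blocked(values, block_size=4, local_limit=4):
    """Prefix sums computed block-wise (illustrative sketch only).

    block_size and local_limit are hypothetical parameters: the first
    stands in for the size of a matrix block, the second for the amount
    of data that "fits in local memory".
    """
    if block_size < 2 or local_limit < 1:
        raise ValueError("block_size must be >= 2 and local_limit >= 1")

    # Local cumulative aggregate: the data is small enough to scan directly.
    if len(values) <= local_limit:
        return list(accumulate(values))

    # Split the input into blocks; in a distributed setting each block
    # could be processed independently.
    blocks = [values[i:i + block_size]
              for i in range(0, len(values), block_size)]

    # Forward cascade: reduce every block to one partial aggregate,
    # shrinking the data by a factor of block_size per level.
    block_sums = [sum(block) for block in blocks]

    # Recurse on the partial aggregates (the self-similar step).
    block_prefix = cumsum_blocked(block_sums, block_size, local_limit)

    # Backward cascade: each block adds the total of all preceding
    # blocks to its own local prefix sums.
    result = []
    for idx, block in enumerate(blocks):
        offset = block_prefix[idx - 1] if idx > 0 else 0
        result.extend(offset + x for x in accumulate(block))
    return result


if __name__ == "__main__":
    print(cumsum_blocked(list(range(1, 11)), block_size=3, local_limit=2))
```

In this sketch each level of recursion shrinks the input by roughly a factor of `block_size`, which mirrors the size reduction per iteration that the abstract attributes to the forward cascade; the per-block work in both cascades has no dependency between blocks beyond the offsets.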