Browsing by subject "Data discovery"
- Journal article: A Terminology Service Supporting Semantic Annotation, Integration, Discovery and Analysis of Interdisciplinary Research Data (Datenbank-Spektrum: Vol. 16, No. 3, 2016). Karam, Naouel; Müller-Birn, Claudia; Gleisberg, Maren; Fichtmüller, David; Tolksdorf, Robert; Güntsch, Anton.
  Research has become more data-intensive over the last few decades. Sharing research data is often a challenge, especially for interdisciplinary collaborative projects. One primary goal of a research infrastructure for data management should be to enable efficient data discovery and integration of heterogeneous data. In order to enable such interoperability, a lot of effort has been undertaken by scientists to develop standards and characterize their domain knowledge in the form of taxonomies and formal ontologies. However, these knowledge models are often disconnected and distributed. The work presented here provides a promising approach for integrating and harmonizing terminological resources to serve as a backbone for a platform. The component developed, called the GFBio Terminology Service, acts as a semantic platform for access, development and reasoning over internally and externally maintained terminological resources within the biological and environmental domain. We highlight the utility of the Terminology Service by practical use cases of semantically enhanced components. We show how the Terminology Service enables applications to add meaning to their data by giving access to the knowledge that can be derived from the terminologies and data annotated by them.
- Journal article: Enabling data-centric AI through data quality management and data literacy (it - Information Technology: Vol. 64, No. 1-2, 2022). Abedjan, Ziawasch.
  Data is being produced at an intractable pace. At the same time, there is an insatiable interest in using such data for use cases that span all imaginable domains, including health, climate, business, and gaming. Beyond the novel socio-technical challenges that surround data-driven innovations, there are still open data processing challenges that impede the usability of data-driven techniques. It is commonly acknowledged that overcoming heterogeneity of data with regard to syntax and semantics to combine various sources for a common goal is a major bottleneck. Furthermore, the quality of such data is always under question as the data science pipelines today are highly ad-hoc and without the necessary care for provenance. Finally, quality criteria that go beyond the syntactical and semantic correctness of individual values but also incorporate population-level constraints, such as equal parity and opportunity with regard to protected groups, play a more and more important role in this process. Traditional research on data integration was focused on post-merger integration of companies, where customer or product databases had to be integrated. While this is often hard enough, today the challenges aggravate because of the fact that more stakeholders are using data analytics tools to derive domain-specific insights. I call this phenomenon the democratization of data science, a process, which is both challenging and necessary. Novel systems need to be user-friendly in a way that not only trained database admins can handle them but also less computer science savvy stakeholders. Thus, our research focuses on scalable example-driven techniques for data preparation and curation. Furthermore, we believe that it is important to educate the breadth of society on implications of a data-driven world and actively promote the concept of data literacy as a fundamental competence.