Conference Proceeding
Refine
Year of publication
Institute
- Fachbereich Medizintechnik und Technomathematik (135) (remove)
Has Fulltext
- no (135) (remove)
Language
- English (135) (remove)
Document Type
- Conference Proceeding (135) (remove)
Keywords
The integration of product data from heterogeneous sources and manufacturers into a single catalog is often still a laborious, manual task. Especially small- and medium-sized enterprises face the challenge of timely integrating the data their business relies on to have an up-to-date product catalog, due to format specifications, low quality of data and the requirement of expert knowledge. Additionally, modern approaches to simplify catalog integration demand experience in machine learning, word vectorization, or semantic similarity that such enterprises do not have. Furthermore, most approaches struggle with low-quality data. We propose Attribute Label Ranking (ALR), an easy to understand and simple to adapt learning approach. ALR leverages a model trained on real-world integration data to identify the best possible schema mapping of previously unknown, proprietary, tabular format into a standardized catalog schema. Our approach predicts multiple labels for every attribute of an inpu t column. The whole column is taken into consideration to rank among these labels. We evaluate ALR regarding the correctness of predictions and compare the results on real-world data to state-of-the-art approaches. Additionally, we report findings during experiments and limitations of our approach.
The integration of frequently changing, volatile product data from different manufacturers into a single catalog is a significant challenge for small and medium-sized e-commerce companies. They rely on timely integrating product data to present them aggregated in an online shop without knowing format specifications, concept understanding of manufacturers, and data quality. Furthermore, format, concepts, and data quality may change at any time. Consequently, integrating product catalogs into a single standardized catalog is often a laborious manual task. Current strategies to streamline or automate catalog integration use techniques based on machine learning, word vectorization, or semantic similarity. However, most approaches struggle with low-quality or real-world data. We propose Attribute Label Ranking (ALR) as a recommendation engine to simplify the integration process of previously unknown, proprietary tabular format into a standardized catalog for practitioners. We evaluate ALR by focusing on the impact of different neural network architectures, language features, and semantic similarity. Additionally, we consider metrics for industrial application and present the impact of ALR in production and its limitations.
Biomechanical simulation of different prosthetic meshes for repairing uterine/vaginal vault prolapse
(2017)
The overall objective of this study is to develop a new external fixator, which closely maps the native kinematics of the elbow to decrease the joint force resulting in reduced rehabilitation time and pain. An experimental setup was designed to determine the native kinematics of the elbow during flexion of cadaveric arms. As a preliminary study, data from literature was used to modify a published biomechanical model for the calculation of the joint and muscle forces. They were compared to the original model and the effect of the kinematic refinement was evaluated. Furthermore, the obtained muscle forces were determined in order to apply them in the experimental setup. The joint forces in the modified model differed slightly from the forces in the original model. The muscle force curves changed particularly for small flexion angles but their magnitude for larger angles was consistent.
Reliable methods for automatic readability assessment have the potential to impact a variety of fields, ranging from machine translation to self-informed learning. Recently, large language models for the German language (such as GBERT and GPT-2-Wechsel) have become available, allowing to develop Deep Learning based approaches that promise to further improve automatic readability assessment. In this contribution, we studied the ability of ensembles of fine-tuned GBERT and GPT-2-Wechsel models to reliably predict the readability of German sentences. We combined these models with linguistic features and investigated the dependence of prediction performance on ensemble size and composition. Mixed ensembles of GBERT and GPT-2-Wechsel performed better than ensembles of the same size consisting of only GBERT or GPT-2-Wechsel models. Our models were evaluated in the GermEval 2022 Shared Task on Text Complexity Assessment on data of German sentences. On out-of-sample data, our best ensemble achieved a root mean squared error of 0:435.
In collaborative research projects, both researchers and practitioners work together solving business-critical challenges. These projects often deal with ETL processes, in which humans extract information from non-machine-readable documents by hand. AI-based machine learning models can help to solve this problem.
Since machine learning approaches are not deterministic, their quality of output may decrease over time. This fact leads to an overall quality loss of the application which embeds machine learning models. Hence, the software qualities in development and production may differ.
Machine learning models are black boxes. That makes practitioners skeptical and increases the inhibition threshold for early productive use of research prototypes. Continuous monitoring of software quality in production offers an early response capability on quality loss and encourages the use of machine learning approaches. Furthermore, experts have to ensure that they integrate possible new inputs into the model training as quickly as possible.
In this paper, we introduce an architecture pattern with a reference implementation that extends the concept of Metrics Driven Research Collaboration with an automated software quality monitoring in productive use and a possibility to auto-generate new test data coming from processed documents in production.
Through automated monitoring of the software quality and auto-generated test data, this approach ensures that the software quality meets and keeps requested thresholds in productive use, even during further continuous deployment and changing input data.