Conference Proceeding
Refine
Year of publication
Institute
- Fachbereich Medizintechnik und Technomathematik (152) (remove)
Has Fulltext
- no (152) (remove)
Document Type
- Conference Proceeding (152) (remove)
Keywords
The integration of product data from heterogeneous sources and manufacturers into a single catalog is often still a laborious, manual task. Especially small- and medium-sized enterprises face the challenge of timely integrating the data their business relies on to have an up-to-date product catalog, due to format specifications, low quality of data and the requirement of expert knowledge. Additionally, modern approaches to simplify catalog integration demand experience in machine learning, word vectorization, or semantic similarity that such enterprises do not have. Furthermore, most approaches struggle with low-quality data. We propose Attribute Label Ranking (ALR), an easy to understand and simple to adapt learning approach. ALR leverages a model trained on real-world integration data to identify the best possible schema mapping of previously unknown, proprietary, tabular format into a standardized catalog schema. Our approach predicts multiple labels for every attribute of an inpu t column. The whole column is taken into consideration to rank among these labels. We evaluate ALR regarding the correctness of predictions and compare the results on real-world data to state-of-the-art approaches. Additionally, we report findings during experiments and limitations of our approach.
The integration of frequently changing, volatile product data from different manufacturers into a single catalog is a significant challenge for small and medium-sized e-commerce companies. They rely on timely integrating product data to present them aggregated in an online shop without knowing format specifications, concept understanding of manufacturers, and data quality. Furthermore, format, concepts, and data quality may change at any time. Consequently, integrating product catalogs into a single standardized catalog is often a laborious manual task. Current strategies to streamline or automate catalog integration use techniques based on machine learning, word vectorization, or semantic similarity. However, most approaches struggle with low-quality or real-world data. We propose Attribute Label Ranking (ALR) as a recommendation engine to simplify the integration process of previously unknown, proprietary tabular format into a standardized catalog for practitioners. We evaluate ALR by focusing on the impact of different neural network architectures, language features, and semantic similarity. Additionally, we consider metrics for industrial application and present the impact of ALR in production and its limitations.
Human induced pluripotent stem cells (hiPSCs) have shown to be promising in disease studies and drug screenings [1]. Cardiomyocytes derived from hiPSCs have been extensively investigated using patch-clamping and optical methods to compare their electromechanical behaviour relative to fully matured adult cells. Mathematical models can be used for translating findings on hiPSCCMs to adult cells [2] or to better understand the mechanisms of various ion channels when a drug is applied [3,4]. Paci et al. (2013) [3] developed the first model of hiPSC-CMs, which they later refined based on new data [3]. The model is based on iCells® (Fujifilm Cellular Dynamics, Inc. (FCDI), Madison WI, USA) but major differences among several cell lines and even within a single cell line have been found and motivate an approach for creating sample-specific models. We have developed an optimisation algorithm that parameterises the conductances (in S/F=Siemens/Farad) of the latest Paci et al. model (2018) [5] using current-voltage data obtained in individual patch-clamp experiments derived from an automated patch clamp system (Patchliner, Nanion Technologies GmbH, Munich).
We compare four different algorithms for automatically estimating the muscle fascicle angle from ultrasonic images: the vesselness filter, the Radon transform, the projection profile method and the gray level cooccurence matrix (GLCM). The algorithm results are compared to ground truth data generated by three different experts on 425 image frames from two videos recorded during different types of motion. The best agreement with the ground truth data was achieved by a combination of pre-processing with a vesselness filter and measuring the angle with the projection profile method. The robustness of the estimation is increased by applying the algorithms to subregions with high gradients and performing a LOESS fit through these estimates.
Label-free sensing of biomolecules by their intrinsic molecular charge using field-effect devices
(2015)
Label-free Electrostatic Detection of DNA Amplification by PCR Using Capacitive Field-effect Devices
(2016)
A capacitive field-effect EIS (electrolyte-insulator-semiconductor) sensor modified with a positively charged weak polyelectrolyte of poly(allylamine hydrochloride) (PAH)/single-stranded probe DNA (ssDNA) bilayer has been used for a label-free electrostatic detection of pathogen-specific DNA amplification via polymerase chain reaction (PCR). The sensor is able to distinguish between positive and negative PCR solutions, to detect the existence of target DNA amplicons in PCR samples and thus, can be used as tool for a quick verification of DNA amplification and the successful PCR process.
In positron emission tomography improving time, energy and spatial detector resolutions and using Compton kinematics introduces the possibility to reconstruct a radioactivity distribution image from scatter coincidences, thereby enhancing image quality. The number of single scattered coincidences alone is in the same order of magnitude as true coincidences. In this work, a compact Compton camera module based on monolithic scintillation material is investigated as a detector ring module. The detector interactions are simulated with Monte Carlo package GATE. The scattering angle inside the tissue is derived from the energy of the scattered photon, which results in a set of possible scattering trajectories or broken line of response. The Compton kinematics collimation reduces the number of solutions. Additionally, the time of flight information helps localize the position of the annihilation. One of the questions of this investigation is related to how the energy, spatial and temporal resolutions help confine the possible annihilation volume. A comparison of currently technically feasible detector resolutions (under laboratory conditions) demonstrates the influence on this annihilation volume and shows that energy and coincidence time resolution have a significant impact. An enhancement of the latter from 400 ps to 100 ps leads to a smaller annihilation volume of around 50%, while a change of the energy resolution in the absorber layer from 12% to 4.5% results in a reduction of 60%. The inclusion of single tissue-scattered data has the potential to increase the sensitivity of a scanner by a factor of 2 to 3 times. The concept can be further optimized and extended for multiple scatter coincidences and subsequently validated by a reconstruction algorithm.
Detection of Adrenaline Based on Bioelectrocatalytical System to Support Tumor Diagnostic Technology
(2017)
In the research domain of energy informatics, the importance of open datais rising rapidly. This can be seen as various new public datasets are created andpublished. Unfortunately, in many cases, the data is not available under a permissivelicense corresponding to the FAIR principles, often lacking accessibility or reusability.Furthermore, the source format often differs from the desired data format or does notmeet the demands to be queried in an efficient way. To solve this on a small scale atoolbox for ETL-processes is provided to create a local energy data server with openaccess data from different valuable sources in a structured format. So while the sourcesitself do not fully comply with the FAIR principles, the provided unique toolbox allows foran efficient processing of the data as if the FAIR principles would be met. The energydata server currently includes information of power systems, weather data, networkfrequency data, European energy and gas data for demand and generation and more.However, a solution to the core problem - missing alignment to the FAIR principles - isstill needed for the National Research Data Infrastructure.
Due to the transition to renewable energies, electricity markets need to be made fit for purpose. To enable the comparison of different energy market designs, modeling tools covering market actors and their heterogeneous behavior are needed. Agent-based models are ideally suited for this task. Such models can be used to simulate and analyze changes to market design or market mechanisms and their impact on market dynamics. In this paper, we conduct an evaluation and comparison of two actively developed open-source energy market simulation models. The two models, namely AMIRIS and ASSUME, are both designed to simulate future energy markets using an agent-based approach. The assessment encompasses modelling features and techniques, model performance, as well as a comparison of model results, which can serve as a blueprint for future comparative studies of simulation models. The main comparison dataset includes data of Germany in 2019 and simulates the Day-Ahead market and participating actors as individual agents. Both models are comparable close to the benchmark dataset with a MAE between 5.6 and 6.4 €/MWh while also modeling the actual dispatch realistically.
Market abstraction of energy markets and policies - application in an agent-based modeling toolbox
(2023)
In light of emerging challenges in energy systems, markets are prone to changing dynamics and market design. Simulation models are commonly used to understand the changing dynamics of future electricity markets. However, existing market models were often created with specific use cases in mind, which limits their flexibility and usability. This can impose challenges for using a single model to compare different market designs. This paper introduces a new method of defining market designs for energy market simulations. The proposed concept makes it easy to incorporate different market designs into electricity market models by using relevant parameters derived from analyzing existing simulation tools, morphological categorization and ontologies. These parameters are then used to derive a market abstraction and integrate it into an agent-based simulation framework, allowing for a unified analysis of diverse market designs. Furthermore, we showcase the usability of integrating new types of long-term contracts and over-the-counter trading. To validate this approach, two case studies are demonstrated: a pay-as-clear market and a pay-as-bid long-term market. These examples demonstrate the capabilities of the proposed framework.
Useful market simulations are key to the evaluation of diferent market designs existing of multiple market mechanisms or rules. Yet a simulation framework which has a comparison of diferent market mechanisms in mind was not found. The need to create an objective view on different sets of market rules while investigating meaningful agent strategies concludes that such a simulation framework is needed to advance the research on this subject. An overview of diferent existing market simulation models is given which also shows the research gap and the missing capabilities of those systems. Finally, a methodology is outlined how a novel market simulation which can answer the research questions can be developed.
Conventional EEG devices cannot be used in everyday life and hence, past decade research has been focused on Ear-EEG for mobile, at-home monitoring for various applications ranging from emotion detection to sleep monitoring. As the area available for electrode contact in the ear is limited, the electrode size and location play a vital role for an Ear-EEG system. In this investigation, we present a quantitative study of ear-electrodes with two electrode sizes at different locations in a wet and dry configuration. Electrode impedance scales inversely with size and ranges from 450 kΩ to 1.29 MΩ for dry and from 22 kΩ to 42 kΩ for wet contact at 10 Hz. For any size, the location in the ear canal with the lowest impedance is ELE (Left Ear Superior), presumably due to increased contact pressure caused by the outer-ear anatomy. The results can be used to optimize signal pickup and SNR for specific applications. We demonstrate this by recording sleep spindles during sleep onset with high quality (5.27 μVrms).
In energy economy forecasts of different time series are rudimentary. In this study, a prediction for the German day-ahead spot market is created with Apache Spark and R. It is just an example for many different applications in virtual power plant environments. Other examples of use as intraday price processes, load processes of machines or electric vehicles, real time energy loads of photovoltaic systems and many more time series need to be analysed and predicted.
This work gives a short introduction into the project where this study is settled. It describes the time series methods that are used in energy industry for forecasts shortly. As programming technique Apache Spark, which is a strong cluster computing technology, is utilised. Today, single time series can be predicted. The focus of this work is on developing a method to parallel forecasting, to process multiple time series simultaneously with R and Apache Spark.
The progress in natural language processing (NLP) research over the last years, offers novel business opportunities for companies, as automated user interaction or improved data analysis. Building sophisticated NLP applications requires dealing with modern machine learning (ML) technologies, which impedes enterprises from establishing successful NLP projects. Our experience in applied NLP research projects shows that the continuous integration of research prototypes in production-like environments with quality assurance builds trust in the software and shows convenience and usefulness regarding the business goal. We introduce STAMP 4 NLP as an iterative and incremental process model for developing NLP applications. With STAMP 4 NLP, we merge software engineering principles with best practices from data science. Instantiating our process model allows efficiently creating prototypes by utilizing templates, conventions, and implementations, enabling developers and data scientists to focus on the business goals. Due to our iterative-incremental approach, businesses can deploy an enhanced version of the prototype to their software environment after every iteration, maximizing potential business value and trust early and avoiding the cost of successful yet never deployed experiments.
Supervised machine learning and deep learning require a large amount of labeled data, which data scientists obtain in a manual, and time-consuming annotation process. To mitigate this challenge, Active Learning (AL) proposes promising data points to annotators they annotate next instead of a subsequent or random sample. This method is supposed to save annotation effort while maintaining model performance.
However, practitioners face many AL strategies for different tasks and need an empirical basis to choose between them. Surveys categorize AL strategies into taxonomies without performance indications. Presentations of novel AL strategies compare the performance to a small subset of strategies. Our contribution addresses the empirical basis by introducing a reproducible active learning evaluation (ALE) framework for the comparative evaluation of AL strategies in NLP.
The framework allows the implementation of AL strategies with low effort and a fair data-driven comparison through defining and tracking experiment parameters (e.g., initial dataset size, number of data points per query step, and the budget). ALE helps practitioners to make more informed decisions, and researchers can focus on developing new, effective AL strategies and deriving best practices for specific use cases. With best practices, practitioners can lower their annotation costs. We present a case study to illustrate how to use the framework.
Multi-attribute relation extraction (MARE): simplifying the application of relation extraction
(2021)
Natural language understanding’s relation extraction makes innovative and encouraging novel business concepts possible and facilitates new digitilized decision-making processes. Current approaches allow the extraction of relations with a fixed number of entities as attributes. Extracting relations with an arbitrary amount of attributes requires complex systems and costly relation-trigger annotations to assist these systems. We introduce multi-attribute relation extraction (MARE) as an assumption-less problem formulation with two approaches, facilitating an explicit mapping from business use cases to the data annotations. Avoiding elaborated annotation constraints simplifies the application of relation extraction approaches. The evaluation compares our models to current state-of-the-art event extraction and binary relation extraction methods. Our approaches show improvement compared to these on the extraction of general multi-attribute relations.