Multi-attribute relation extraction (MARE): simplifying the application of relation extraction
(2021)
Natural language understanding’s relation extraction makes innovative and encouraging novel business concepts possible and facilitates new digitalized decision-making processes. Current approaches allow the extraction of relations with a fixed number of entities as attributes. Extracting relations with an arbitrary number of attributes requires complex systems and costly relation-trigger annotations to assist these systems. We introduce multi-attribute relation extraction (MARE) as an assumption-less problem formulation with two approaches, facilitating an explicit mapping from business use cases to the data annotations. Avoiding elaborate annotation constraints simplifies the application of relation extraction approaches. The evaluation compares our models to current state-of-the-art event extraction and binary relation extraction methods. Our approaches show improvement compared to these on the extraction of general multi-attribute relations.
Human induced pluripotent stem cells (hiPSCs) have been shown to be promising in disease studies and drug screenings [1]. Cardiomyocytes derived from hiPSCs (hiPSC-CMs) have been extensively investigated using patch-clamping and optical methods to compare their electromechanical behaviour relative to fully matured adult cells. Mathematical models can be used for translating findings on hiPSC-CMs to adult cells [2] or to better understand the mechanisms of various ion channels when a drug is applied [3,4]. Paci et al. (2013) [3] developed the first model of hiPSC-CMs, which they later refined based on new data [3]. The model is based on iCells® (Fujifilm Cellular Dynamics, Inc. (FCDI), Madison WI, USA), but major differences among several cell lines and even within a single cell line have been found and motivate an approach for creating sample-specific models. We have developed an optimisation algorithm that parameterises the conductances (in S/F = Siemens/Farad) of the latest Paci et al. model (2018) [5] using current-voltage data obtained in individual patch-clamp experiments with an automated patch clamp system (Patchliner, Nanion Technologies GmbH, Munich).
Market abstraction of energy markets and policies - application in an agent-based modeling toolbox
(2023)
In light of emerging challenges in energy systems, markets are prone to changing dynamics and market design. Simulation models are commonly used to understand the changing dynamics of future electricity markets. However, existing market models were often created with specific use cases in mind, which limits their flexibility and usability. This can impose challenges for using a single model to compare different market designs. This paper introduces a new method of defining market designs for energy market simulations. The proposed concept makes it easy to incorporate different market designs into electricity market models by using relevant parameters derived from analyzing existing simulation tools, morphological categorization and ontologies. These parameters are then used to derive a market abstraction and integrate it into an agent-based simulation framework, allowing for a unified analysis of diverse market designs. Furthermore, we showcase the usability of integrating new types of long-term contracts and over-the-counter trading. To validate this approach, two case studies are demonstrated: a pay-as-clear market and a pay-as-bid long-term market. These examples demonstrate the capabilities of the proposed framework.
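The difference between the two pricing rules named in the case studies can be illustrated with a minimal sketch. It is not taken from the paper or its toolbox: the `Order` class, the `clear_market` function, the inelastic demand and the example offers are all hypothetical, and the merit-order clearing shown here is only the textbook mechanism.

```python
from dataclasses import dataclass


@dataclass
class Order:
    """Hypothetical supply offer: who sells how much at which minimum price."""
    agent: str
    volume: float  # MWh
    price: float   # EUR/MWh


def clear_market(supply, demand_volume, pricing="pay_as_clear"):
    """Accept the cheapest offers until an inelastic demand is covered.

    pricing="pay_as_clear": every accepted offer is paid the marginal price.
    pricing="pay_as_bid":   every accepted offer is paid its own bid price.
    Returns the payment per agent in EUR.
    """
    accepted = []
    remaining = demand_volume
    for order in sorted(supply, key=lambda o: o.price):
        if remaining <= 0:
            break
        volume = min(order.volume, remaining)
        accepted.append((order, volume))
        remaining -= volume
    if not accepted:
        return {}
    # The last accepted (most expensive) offer sets the uniform price.
    marginal_price = accepted[-1][0].price
    payments = {}
    for order, volume in accepted:
        price = marginal_price if pricing == "pay_as_clear" else order.price
        payments[order.agent] = payments.get(order.agent, 0.0) + volume * price
    return payments


offers = [Order("wind", 50, 5.0), Order("gas", 40, 60.0), Order("coal", 30, 40.0)]
uniform = clear_market(offers, 100, "pay_as_clear")
discriminatory = clear_market(offers, 100, "pay_as_bid")
```

With the same accepted volumes, only the payment rule changes between the two calls, which is the kind of parameter a market abstraction would expose.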
Magnetic nanoparticles (MNP) are investigated with great interest for biomedical applications in diagnostics (e.g. imaging: magnetic particle imaging (MPI)), therapeutics (e.g. hyperthermia: magnetic fluid hyperthermia (MFH)) and multi-purpose biosensing (e.g. magnetic immunoassays (MIA)). What all of these applications have in common is that they are based on the unique magnetic relaxation mechanisms of MNP in an alternating magnetic field (AMF). While MFH and MPI are currently the most prominent examples of biomedical applications, here we present results on the relatively new biosensing application of frequency mixing magnetic detection (FMMD) from a simulation perspective. In general, we ask how the key parameters of MNP (core size and magnetic anisotropy) affect the FMMD signal: by varying the core size, we investigate the effect of the magnetic volume per MNP; and by changing the effective magnetic anisotropy, we study the MNPs’ flexibility to leave its preferred magnetization direction. From this, we predict the most effective combination of MNP core size and magnetic anisotropy for maximum signal generation.
Direct methods, comprising limit and shakedown analysis, are a branch of computational mechanics. They play a significant role in mechanical and civil engineering design. Direct methods aim to determine the ultimate load-bearing capacity of structures beyond the elastic range. For practical problems, direct methods lead to nonlinear convex optimization problems with a large number of variables and constraints. If strength and loading are random quantities, shakedown analysis becomes a stochastic programming problem. This paper presents chance constrained programming, an effective method of stochastic programming, for solving the shakedown analysis problem under random strength. In this investigation, the loading is deterministic and the strength is modeled as a normal or lognormal random variable.
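The core idea of chance constrained programming, replacing a probabilistic constraint by an equivalent deterministic one, can be sketched for a single scalar strength variable. This is a textbook illustration under stated assumptions, not the paper's shakedown formulation: the function names, the mean/standard-deviation parameterisation and the example numbers are hypothetical. The constraint P(strength >= stress) >= reliability becomes stress <= design strength, where the design strength is a quantile of the strength distribution.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def design_strength_normal(mean, std, reliability):
    """Largest stress s with P(R >= s) >= reliability for R ~ N(mean, std^2)."""
    z = NormalDist().inv_cdf(reliability)
    return mean - z * std


def lognormal_params(mean, std):
    """Convert mean/std of a lognormal R into mean/std of ln R."""
    sigma_ln = sqrt(log(1.0 + (std / mean) ** 2))
    mu_ln = log(mean) - 0.5 * sigma_ln ** 2
    return mu_ln, sigma_ln


def design_strength_lognormal(mean, std, reliability):
    """Largest stress s with P(R >= s) >= reliability for lognormal R."""
    mu_ln, sigma_ln = lognormal_params(mean, std)
    z = NormalDist().inv_cdf(reliability)
    return exp(mu_ln - z * sigma_ln)


# Hypothetical example: mean strength 100, standard deviation 10,
# required reliability 99 % (failure probability 1 %).
normal_limit = design_strength_normal(100.0, 10.0, 0.99)
lognormal_limit = design_strength_lognormal(100.0, 10.0, 0.99)
```

Once the random strength is replaced by such a quantile, the remaining optimization problem is deterministic and can be handed to a standard solver.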
Light-stimulated hydrogel actuators with incorporated graphene oxide for microfluidic applications
(2015)
Label-free sensing of biomolecules by their intrinsic molecular charge using field-effect devices
(2015)
Label-free Electrostatic Detection of DNA Amplification by PCR Using Capacitive Field-effect Devices
(2016)
A capacitive field-effect EIS (electrolyte-insulator-semiconductor) sensor modified with a positively charged weak polyelectrolyte of poly(allylamine hydrochloride) (PAH)/single-stranded probe DNA (ssDNA) bilayer has been used for a label-free electrostatic detection of pathogen-specific DNA amplification via polymerase chain reaction (PCR). The sensor is able to distinguish between positive and negative PCR solutions, to detect the existence of target DNA amplicons in PCR samples and thus, can be used as tool for a quick verification of DNA amplification and the successful PCR process.
Due to the transition to renewable energies, electricity markets need to be made fit for purpose. To enable the comparison of different energy market designs, modeling tools covering market actors and their heterogeneous behavior are needed. Agent-based models are ideally suited for this task. Such models can be used to simulate and analyze changes to market design or market mechanisms and their impact on market dynamics. In this paper, we conduct an evaluation and comparison of two actively developed open-source energy market simulation models. The two models, namely AMIRIS and ASSUME, are both designed to simulate future energy markets using an agent-based approach. The assessment encompasses modelling features and techniques, model performance, as well as a comparison of model results, which can serve as a blueprint for future comparative studies of simulation models. The main comparison dataset includes data for Germany in 2019 and simulates the day-ahead market and participating actors as individual agents. Both models are comparably close to the benchmark dataset, with a mean absolute error (MAE) between 5.6 and 6.4 €/MWh, while also modeling the actual dispatch realistically.
Effective training requires high muscle forces potentially leading to training-induced injuries. Thus, continuous monitoring and controlling of the loadings applied to the musculoskeletal system along the motion trajectory is required. In this paper, a norm-optimal iterative learning control algorithm for the robot-assisted training is developed. The algorithm aims at minimizing the external knee joint moment, which is commonly used to quantify the loading of the medial compartment. To estimate the external knee joint moment, a musculoskeletal lower extremity model is implemented in OpenSim and coupled with a model of an industrial robot and a force plate mounted at its end-effector. The algorithm is tested in simulation for patients with varus, normal and valgus alignment of the knee. The results show that the algorithm is able to minimize the external knee joint moment in all three cases and converges after less than seven iterations.
The human arm consists of the humerus (upper arm), the medial ulna and the lateral radius (forearm). The joint between the humerus and the ulna is called the humeroulnar joint and the joint between the humerus and the radius is called the humeroradial joint. Lateral and medial collateral ligaments stabilize the elbow. Statistically, 2.5 out of 10,000 people suffer from radial head fractures [1]. In these fractures the cartilage is often affected. As a result of the injured cartilage, degenerative diseases like posttraumatic arthrosis may occur. The resulting pain and reduced range of motion have an impact on the patient’s quality of life. Until now, there has not been a treatment which allows typical loads in daily life activities and offers good long-term results. A new surgical approach was developed with the motivation to slow the progress of posttraumatic arthrosis. Here, the radius is shortened by 3 mm in the proximal part [2]. In this way, the load on the radius is intended to be reduced by a load shift to the ulna. Since the radius is the most important stabilizer of the elbow, it has to be confirmed that the stability is not affected. In the first test (Fig. 1 left), pressure distributions within the humeroulnar and humeroradial joints with a native and a shortened radius were measured using resistive pressure sensors (I5076 and I5027, Tekscan, USA). The humerus was loaded axially in a tension testing machine (Z010, Zwick Roell, Germany) in 50 N steps up to 400 N. From the humerus the load is transmitted through both the radius and the ulna into the hand, which is fixed on the ground. In the second test (Fig. 1 right), the joint stability was investigated using a digital image correlation system to measure the displacement of the ulna. Here, the humerus is fixed at a desired flexion angle and the unconstrained forearm lies on the ground. A rope connects the load actuator with a hook fixed in the ulna. A guide roller is used so that the rope pulls the ulna horizontally when a tensile load is applied. This creates a moment about the elbow joint with a maximum value of 7.5 Nm. Measurements were performed with varying flexion angles (0°, 30°, 60°, 90°, 120°). For both tests and each measurement, seven specimens were used. Student's t-test was employed to determine whether the mean values of the measurements in native and operated specimens differ significantly.
Heavy metal detection with semiconductor devices based on PLD-prepared chalcogenide glass thin films
(2007)
Messenger apps like WhatsApp and Telegram are frequently used for everyday communication, but they can also be utilized as a platform for illegal activity. Telegram allows public groups with up to 200,000 participants. Criminals use these public groups for trading illegal commodities and services, which is a concern for law enforcement agencies, who manually monitor suspicious activity in these chat rooms. This research demonstrates how natural language processing (NLP) can assist in analyzing these chat rooms, providing an explorative overview of the domain and facilitating purposeful analyses of user behavior. We provide a publicly available corpus of text messages annotated with entities and relations from four self-proclaimed black market chat rooms. Our pipeline approach aggregates the product attributes extracted from user messages into profiles and uses these, together with the products sold, as features for clustering. The extracted structured information is the foundation for further data exploration, such as identifying the top vendors or fine-grained price analyses. Our evaluation shows that pretrained word vectors perform better for unsupervised clustering than state-of-the-art transformer models, while the latter are still superior for sequence labeling.
Useful market simulations are key to the evaluation of different market designs consisting of multiple market mechanisms or rules. Yet no simulation framework designed with a comparison of different market mechanisms in mind was found. The need for an objective view on different sets of market rules, while investigating meaningful agent strategies, shows that such a simulation framework is required to advance research on this subject. An overview of different existing market simulation models is given, which also shows the research gap and the missing capabilities of those systems. Finally, a methodology is outlined for developing a novel market simulation that can answer the research questions.
The structure of the female pelvic floor (PF) is an inter-related system of bony pelvis, muscles, pelvic organs, fascias, ligaments, and nerves with multiple functions. Mechanically, the pelvic organ support systems are of two types: (I) the supporting system of the levator ani (LA) muscle, and (II) the suspension system of the endopelvic fascia condensation [1], [2]. Significant denervation injury to the pelvic musculature and depolymerization of the collagen fibrils of the soft vaginal hammock, cervical ring and ligaments during pregnancy and vaginal delivery weaken the normal functions of the pelvic floor. Pelvic organ prolapse, incontinence and sexual dysfunction are some of the dysfunctions which increase progressively with age and menopause due to the weakened support system, according to the Integral theory [3]. An improved 3D finite element model of the female pelvic floor, as shown in Fig. 1, is constructed that: (I) considers the realistic support of the organs to the pelvic side walls, (II) improves on our previous FE model [4], [5] and uses patient-based geometries, (III) incorporates the realistic anatomy and boundary conditions of the endopelvic (pubocervical and rectovaginal) fascia, and (IV) considers varying stiffness of the endopelvic fascia in the craniocaudal direction [3]. Several computations are carried out on the presented computational model with healthy and damaged supporting tissues, and comparisons are made to understand the physiopathology of female PF disorders.
A new formulation to calculate the shakedown limit load of Kirchhoff plates under stochastic conditions of strength is developed. Direct structural reliability design by chance constrained programming is based on prescribed failure probabilities; it is an effective approach of stochastic programming if it can be formulated as an equivalent deterministic optimization problem. We restrict uncertainty to strength, while the loading remains deterministic. A new formulation is derived for the case of random strength with lognormal distribution. Upper bound and lower bound shakedown load factors are calculated simultaneously by a dual algorithm.
Messenger apps like WhatsApp or Telegram are an integral part of daily communication. Besides the various positive effects, those services extend the operating range of criminals. Open trading groups with many thousands of participants have emerged on Telegram. Law enforcement agencies monitor suspicious users in such chat rooms. This research shows that text analysis, based on natural language processing, facilitates this through a meaningful domain overview and detailed investigations. We crawled a corpus from such self-proclaimed black markets and annotated five attribute types: products, money, payment methods, user names, and locations. Based on each message a user sends, we extract and group these attributes to build profiles. Then, we build features to cluster the profiles. Pretrained word vectors yield better unsupervised clustering results than current state-of-the-art transformer models. The result is a semantically meaningful high-level overview of the user landscape of black market chat rooms. Additionally, the extracted structured information serves as a foundation for further data exploration, for example, identifying the most active users or preferred payment methods.
In recent years, the development of large pretrained language models, such as BERT and GPT, significantly improved information extraction systems on various tasks, including relation classification. State-of-the-art systems are highly accurate on scientific benchmarks. A lack of explainability is currently a complicating factor in many real-world applications. Comprehensible systems are necessary to prevent biased, counterintuitive, or harmful decisions.
We introduce semantic extents, a concept to analyze decision patterns for the relation classification task. Semantic extents are the most influential parts of texts concerning classification decisions. Our definition allows similar procedures to determine semantic extents for humans and models. We provide an annotation tool and a software framework to determine semantic extents for humans and models conveniently and reproducibly. Comparing both reveals that models tend to learn shortcut patterns from data. These patterns are hard to detect with current interpretability methods, such as input reductions. Our approach can help detect and eliminate spurious decision patterns during model development. Semantic extents can increase the reliability and security of natural language processing systems. Semantic extents are an essential step in enabling applications in critical areas like healthcare or finance. Moreover, our work opens new research directions for developing methods to explain deep learning models.
We compare four different algorithms for automatically estimating the muscle fascicle angle from ultrasonic images: the vesselness filter, the Radon transform, the projection profile method and the gray level cooccurence matrix (GLCM). The algorithm results are compared to ground truth data generated by three different experts on 425 image frames from two videos recorded during different types of motion. The best agreement with the ground truth data was achieved by a combination of pre-processing with a vesselness filter and measuring the angle with the projection profile method. The robustness of the estimation is increased by applying the algorithms to subregions with high gradients and performing a LOESS fit through these estimates.
This contribution presents results from the development of a modular solid-state-based sensor system for monitoring cell culture fermentations. To measure electrolyte conductivity, the layout of interdigitated electrodes was adapted for measurements in comparatively highly conductive electrolytes. By cross-linking glucose oxidase with glutaraldehyde and immobilizing it on a platinum electrode, an amperometric glucose sensor with a linear measuring range of up to 2 mM and a sensitivity of 168 nA/mM was realized.
In the expansion of sustainable, renewable energy supply, the conversion of organic biomass into biogas has great potential. The underlying complex biological process is still insufficiently understood and requires systematic investigation of the process parameters in order to achieve a high yield with good gas quality. The questions involved in deciphering the process concern both process engineering and microbiology. From a microbiological point of view, knowledge of the microorganisms that actually drive the process is of considerable importance; from a process engineering point of view, knowledge of the physical and chemical factors that control the microbiological processes is essential. The interplay of all these parameters promotes or hinders biogas formation, up to the point of process termination.
One possible control method is to measure the metabolic activity of the process-driving organisms.
Based on well-founded process data obtained from a parallel plant, this is to be realized with a light-addressable potentiometric sensor (LAPS) system. This sensor is able to detect pH changes caused by the metabolism of the organisms immobilized on the chip, in order to enable online monitoring of biogas plants.
Effectiveness of the edge-based smoothed finite element method applied to soft biological tissues
(2012)
Pulmonary arterial cannulation is a common and effective method for percutaneous mechanical circulatory support for concurrent right heart and respiratory failure [1]. However, limited data exist on how the positioning of the cannula affects oxygen perfusion throughout the pulmonary artery (PA). This study aims to evaluate, using computational fluid dynamics (CFD), the effect of different cannula positions in the PA on the oxygenation of the different branching vessels, so that an optimal cannula position can be determined. The four chosen cannula positions (see Fig. 1) are: in the lower part of the main pulmonary artery (MPA); in the MPA at the junction between the right pulmonary artery (RPA) and the left pulmonary artery (LPA); in the RPA at the first branch of the RPA; and in the LPA at the first branch of the LPA.
Conventional EEG devices cannot be used in everyday life and hence, past decade research has been focused on Ear-EEG for mobile, at-home monitoring for various applications ranging from emotion detection to sleep monitoring. As the area available for electrode contact in the ear is limited, the electrode size and location play a vital role for an Ear-EEG system. In this investigation, we present a quantitative study of ear-electrodes with two electrode sizes at different locations in a wet and dry configuration. Electrode impedance scales inversely with size and ranges from 450 kΩ to 1.29 MΩ for dry and from 22 kΩ to 42 kΩ for wet contact at 10 Hz. For any size, the location in the ear canal with the lowest impedance is ELE (Left Ear Superior), presumably due to increased contact pressure caused by the outer-ear anatomy. The results can be used to optimize signal pickup and SNR for specific applications. We demonstrate this by recording sleep spindles during sleep onset with high quality (5.27 μVrms).
DNA-hybridization detection using light-addressable potentiometric sensor modified with gold layer
(2014)
We propose a stochastic programming method to analyse limit and shakedown of structures under random strength with lognormal distribution. In this investigation a dual chance constrained programming algorithm is developed to calculate simultaneously both the upper and lower bounds of the plastic collapse limit or the shakedown limit. The edge-based smoothed finite element method (ES-FEM) using three-node linear triangular elements is used.
The discovery of human induced pluripotent stem cells reprogrammed from somatic cells [1] and their ability to differentiate into cardiomyocytes (hiPSC-CMs) has provided a robust platform for drug screening [2]. Drug screenings are essential in the development of new components, particularly for evaluating the potential of drugs to induce life-threatening pro-arrhythmias. Between 1988 and 2009, 14 drugs have been removed from the market for this reason [3]. The microelectrode array (MEA) technique is a robust tool for drug screening as it detects the field potentials (FPs) for the entire cell culture. Furthermore, the propagation of the field potential can be examined on an electrode basis. To analyze MEA measurements in detail, we have developed an open-source tool.
Detection of Adrenaline Based on Bioelectrocatalytical System to Support Tumor Diagnostic Technology
(2017)
Sexism in online media comments is a pervasive challenge that often manifests subtly, complicating moderation efforts as interpretations of what constitutes sexism can vary among individuals. We study monolingual and multilingual open-source text embeddings to reliably detect sexism and misogyny in German-language online comments from an Austrian newspaper. We observed that classifiers trained on text embeddings closely mimic the individual judgements of human annotators. Our method showed robust performance in the GermEval 2024 GerMS-Detect Subtask 1 challenge, achieving an average macro F1 score of 0.597 (4th place, as reported on Codabench). It also accurately predicted the distribution of human annotations in GerMS-Detect Subtask 2, with an average Jensen-Shannon distance of 0.301 (2nd place). The computational efficiency of our approach suggests potential for scalable applications across various languages and linguistic contexts.
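For readers unfamiliar with the metric reported for Subtask 2, the following is a minimal sketch of the Jensen-Shannon distance between a predicted and an observed distribution of annotator labels. It is not the challenge's scoring code: the function names are hypothetical, and the base-2 logarithm (which bounds the distance between 0 and 1) is an assumption.

```python
from math import log2, sqrt


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence of two distributions."""
    # Mixture distribution; it is positive wherever p or q is positive.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    divergence = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    # Guard against tiny negative values from floating-point rounding.
    return sqrt(max(divergence, 0.0))


# Hypothetical example: predicted vs. observed share of annotators per label.
predicted = [0.6, 0.3, 0.1]
observed = [0.5, 0.4, 0.1]
distance = jensen_shannon_distance(predicted, observed)
```

A distance of 0 means the predicted label distribution matches the annotators exactly, while 1 means the two distributions share no probability mass.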
Design and implementation aspects of a 3D reconstruction algorithm for the Jülich TierPET system
(1997)
In the research domain of energy informatics, the importance of open data is rising rapidly. This can be seen as various new public datasets are created and published. Unfortunately, in many cases, the data is not available under a permissive license corresponding to the FAIR principles, often lacking accessibility or reusability. Furthermore, the source format often differs from the desired data format or does not meet the demands to be queried in an efficient way. To solve this on a small scale, a toolbox for ETL processes is provided to create a local energy data server with open access data from different valuable sources in a structured format. So while the sources themselves do not fully comply with the FAIR principles, the provided toolbox allows for an efficient processing of the data as if the FAIR principles were met. The energy data server currently includes information on power systems, weather data, network frequency data, European energy and gas data for demand and generation, and more. However, a solution to the core problem - missing alignment to the FAIR principles - is still needed for the National Research Data Infrastructure.
Chemische Sensoren mit Bariumstrontiumtitanat als funktionelle Schicht zur Multiparameterdetektion
(2013)
The integration of product data from heterogeneous sources and manufacturers into a single catalog is often still a laborious, manual task. Small and medium-sized enterprises in particular face the challenge of integrating the data their business relies on in a timely manner to keep their product catalog up to date, due to format specifications, low data quality and the need for expert knowledge. Additionally, modern approaches to simplify catalog integration demand experience in machine learning, word vectorization, or semantic similarity that such enterprises do not have. Furthermore, most approaches struggle with low-quality data. We propose Attribute Label Ranking (ALR), an easy-to-understand and simple-to-adapt learning approach. ALR leverages a model trained on real-world integration data to identify the best possible schema mapping of a previously unknown, proprietary, tabular format into a standardized catalog schema. Our approach predicts multiple labels for every attribute of an input column. The whole column is taken into consideration to rank among these labels. We evaluate ALR regarding the correctness of predictions and compare the results on real-world data to state-of-the-art approaches. Additionally, we report findings during experiments and limitations of our approach.
The integration of frequently changing, volatile product data from different manufacturers into a single catalog is a significant challenge for small and medium-sized e-commerce companies. They rely on timely integrating product data to present them aggregated in an online shop without knowing format specifications, concept understanding of manufacturers, and data quality. Furthermore, format, concepts, and data quality may change at any time. Consequently, integrating product catalogs into a single standardized catalog is often a laborious manual task. Current strategies to streamline or automate catalog integration use techniques based on machine learning, word vectorization, or semantic similarity. However, most approaches struggle with low-quality or real-world data. We propose Attribute Label Ranking (ALR) as a recommendation engine to simplify the integration process of previously unknown, proprietary tabular format into a standardized catalog for practitioners. We evaluate ALR by focusing on the impact of different neural network architectures, language features, and semantic similarity. Additionally, we consider metrics for industrial application and present the impact of ALR in production and its limitations.
Biomechanical simulation of different prosthetic meshes for repairing uterine/vaginal vault prolapse
(2017)
The overall objective of this study is to develop a new external fixator, which closely maps the native kinematics of the elbow to decrease the joint force resulting in reduced rehabilitation time and pain. An experimental setup was designed to determine the native kinematics of the elbow during flexion of cadaveric arms. As a preliminary study, data from literature was used to modify a published biomechanical model for the calculation of the joint and muscle forces. They were compared to the original model and the effect of the kinematic refinement was evaluated. Furthermore, the obtained muscle forces were determined in order to apply them in the experimental setup. The joint forces in the modified model differed slightly from the forces in the original model. The muscle force curves changed particularly for small flexion angles but their magnitude for larger angles was consistent.
An automated, computer-assisted test system for functional testing and characterization of (bio-)chemical sensors at wafer level was developed and integrated into a conventional probe station. The system enables the characterization and identification of functional sensors already at wafer level between the individual fabrication steps, so that further, previously customary processing steps such as mounting, bonding and encapsulation are no longer necessary for defective or non-functional sensor structures. In addition, a specially designed miniaturized flow-through measuring cell makes it possible to characterize the sensitivity, drift, hysteresis and response time of the (bio-)chemical sensors already at wafer level. The system was tested, as an example, with capacitive, pH-sensitive EIS (electrolyte-insulator-silicon) structures and ISFET (ion-sensitive field-effect transistor) structures with different geometries and gate layouts.
Reliable methods for automatic readability assessment have the potential to impact a variety of fields, ranging from machine translation to self-informed learning. Recently, large language models for the German language (such as GBERT and GPT-2-Wechsel) have become available, allowing the development of deep-learning-based approaches that promise to further improve automatic readability assessment. In this contribution, we studied the ability of ensembles of fine-tuned GBERT and GPT-2-Wechsel models to reliably predict the readability of German sentences. We combined these models with linguistic features and investigated the dependence of prediction performance on ensemble size and composition. Mixed ensembles of GBERT and GPT-2-Wechsel performed better than ensembles of the same size consisting of only GBERT or GPT-2-Wechsel models. Our models were evaluated in the GermEval 2022 Shared Task on Text Complexity Assessment on data of German sentences. On out-of-sample data, our best ensemble achieved a root mean squared error of 0.435.
In collaborative research projects, researchers and practitioners work together to solve business-critical challenges. These projects often deal with ETL processes, in which humans extract information from non-machine-readable documents by hand. AI-based machine learning models can help to solve this problem.
Since machine learning approaches are not deterministic, the quality of their output may decrease over time. This leads to an overall quality loss in the application that embeds the machine learning models. Hence, software quality in development and in production may differ.
Machine learning models are black boxes. This makes practitioners skeptical and raises the barrier to early productive use of research prototypes. Continuous monitoring of software quality in production enables an early response to quality loss and encourages the use of machine learning approaches. Furthermore, experts have to ensure that they integrate possible new inputs into the model training as quickly as possible.
In this paper, we introduce an architecture pattern with a reference implementation that extends the concept of Metrics Driven Research Collaboration with automated software quality monitoring in productive use and the ability to auto-generate new test data from documents processed in production.
Through automated monitoring of the software quality and auto-generated test data, this approach ensures that the software quality meets and maintains the required thresholds in productive use, even during further continuous deployment and with changing input data.
Mathematical morphology is a part of image processing that has proven to be fruitful for numerous applications. Two main operations in mathematical morphology are dilation and erosion. These are based on the construction of a supremum or infimum with respect to an order over the tonal range in a certain section of the image. The tonal ordering can easily be realised in grey-scale morphology, and some morphological methods have been proposed for colour morphology. However, all of these have certain limitations.
In this paper we present a novel approach to colour morphology that extends previous work in the field based on the Loewner order. We propose to approximate the supremum by means of a log-sum exponentiation introduced by Maslov. We apply this to the embedding of an RGB image in a field of symmetric 2×2 matrices. In this way we obtain nearly isotropic matrices representing colours and the structural advantage of transitivity. In numerical experiments we highlight some remarkable properties of the proposed approach.
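The log-sum-exponential idea can be sketched as follows, assuming the common form (1/m) log(sum_i exp(m A_i)) applied to symmetric matrices through their eigendecomposition. The abstract does not give the exact formula, the value of m, or the RGB embedding, so the helper names, the parameter m and the example matrices are assumptions for illustration only.

```python
import numpy as np

def sym_fun(a, fun):
    """Apply a scalar function to a symmetric matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * fun(w)) @ v.T

def lse_supremum(matrices, m=50.0):
    """Approximate the supremum as (1/m) * log(sum_i exp(m * A_i))."""
    total = sum(sym_fun(a, lambda w: np.exp(m * w)) for a in matrices)
    return sym_fun(total, np.log) / m

# Two commuting (diagonal) matrices: the result should be close to the
# entrywise maximum of the diagonals.
a = np.diag([0.2, 0.9])
b = np.diag([0.7, 0.1])
s = lse_supremum([a, b])
print(np.round(s, 3))
```

Larger values of m tighten the approximation but increase the risk of overflow in the exponential, so m has to be chosen with the value range of the embedded colours in mind.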
An application of a scanning light-addressable potentiometric sensor for label-free DNA detection
(2013)
Supervised machine learning and deep learning require a large amount of labeled data, which data scientists obtain in a manual and time-consuming annotation process. To mitigate this challenge, Active Learning (AL) proposes promising data points for annotators to label next, instead of a sequential or random sample. This method is supposed to save annotation effort while maintaining model performance.
However, practitioners face many AL strategies for different tasks and need an empirical basis to choose between them. Surveys categorize AL strategies into taxonomies without performance indications. Presentations of novel AL strategies compare the performance to a small subset of strategies. Our contribution addresses the empirical basis by introducing a reproducible active learning evaluation (ALE) framework for the comparative evaluation of AL strategies in NLP.
The framework allows the implementation of AL strategies with low effort and a fair data-driven comparison through defining and tracking experiment parameters (e.g., initial dataset size, number of data points per query step, and the budget). ALE helps practitioners to make more informed decisions, and researchers can focus on developing new, effective AL strategies and deriving best practices for specific use cases. With best practices, practitioners can lower their annotation costs. We present a case study to illustrate how to use the framework.
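The experiment parameters named above can be illustrated with a minimal pool-based loop. This is a sketch only and not the ALE framework's API: `ExperimentConfig`, `random_strategy` and `run_experiment` are hypothetical names, the budget is assumed to include the initial dataset, and model training and evaluation are left out.

```python
import random
from dataclasses import dataclass

@dataclass
class ExperimentConfig:
    initial_size: int  # size of the initial labeled dataset
    step_size: int     # number of data points per query step
    budget: int        # total annotations allowed (assumed to include the initial set)
    seed: int = 0

def random_strategy(unlabeled, k, rng):
    """Baseline strategy: propose k random data points."""
    return rng.sample(unlabeled, k)

def run_experiment(pool, strategy, config):
    """Run one AL experiment and track the labeled-set size per step."""
    rng = random.Random(config.seed)
    unlabeled = list(pool)  # pool items are assumed to be unique
    rng.shuffle(unlabeled)
    labeled = [unlabeled.pop() for _ in range(config.initial_size)]
    history = [len(labeled)]
    while len(labeled) < config.budget and unlabeled:
        k = min(config.step_size, config.budget - len(labeled), len(unlabeled))
        for item in strategy(unlabeled, k, rng):
            unlabeled.remove(item)
            labeled.append(item)
        # A real run would retrain and evaluate the model here.
        history.append(len(labeled))
    return labeled, history

config = ExperimentConfig(initial_size=10, step_size=5, budget=30)
labeled, history = run_experiment(range(100), random_strategy, config)
print(history)
```

Passing the strategy as a function with a fixed signature is one way to compare strategies under identical parameters and seeds; a strategy that uses model uncertainty would need the current model as an additional argument.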