Conference Proceeding
Messenger apps like WhatsApp or Telegram are an integral part of daily communication. Besides their various positive effects, these services extend the operating range of criminals. Open trading groups with many thousands of participants have emerged on Telegram. Law enforcement agencies monitor suspicious users in such chat rooms. This research shows that text analysis based on natural language processing facilitates this monitoring through a meaningful domain overview and detailed investigations. We crawled a corpus from such self-proclaimed black markets and annotated five attribute types: products, money, payment methods, user names, and locations. Based on each message a user sends, we extract and group these attributes to build profiles. Then, we build features to cluster the profiles. Pretrained word vectors yield better unsupervised clustering results than current state-of-the-art transformer models. The result is a semantically meaningful high-level overview of the user landscape of black market chat rooms. Additionally, the extracted structured information serves as a foundation for further data exploration, for example, identifying the most active users or the preferred payment methods.
Messenger apps like WhatsApp and Telegram are frequently used for everyday communication, but they can also be used as a platform for illegal activity. Telegram allows public groups with up to 200,000 participants. Criminals use these public groups for trading illegal commodities and services, which is a concern for law enforcement agencies, who manually monitor suspicious activity in these chat rooms. This research demonstrates how natural language processing (NLP) can assist in analyzing these chat rooms, providing an explorative overview of the domain and facilitating purposeful analyses of user behavior. We provide a publicly available corpus of text messages annotated with entities and relations from four self-proclaimed black market chat rooms. Our pipeline approach aggregates the product attributes extracted from user messages into profiles and uses these, together with the products sold, as features for clustering. The extracted structured information is the foundation for further data exploration, such as identifying the top vendors or fine-grained price analyses. Our evaluation shows that pretrained word vectors perform better for unsupervised clustering than state-of-the-art transformer models, while the latter are still superior for sequence labeling.
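The aggregation step described above can be illustrated with a minimal sketch. It assumes the extraction stage has already produced (user, attribute type, value) triples; the sample records and the names `build_profiles` and `top_values` are hypothetical and not taken from the paper's pipeline.

```python
from collections import Counter, defaultdict

# Hypothetical output of the extraction stage: (user, attribute type, value).
annotations = [
    ("user_a", "product", "gift card"),
    ("user_a", "payment_method", "paypal"),
    ("user_b", "product", "account"),
    ("user_a", "product", "gift card"),
    ("user_b", "location", "berlin"),
]


def build_profiles(records):
    """Group extracted attributes per user and count each value's occurrences."""
    profiles = defaultdict(lambda: defaultdict(Counter))
    for user, attr_type, value in records:
        profiles[user][attr_type][value] += 1
    return profiles


def top_values(profiles, attr_type, n=3):
    """Return the n most frequent values of one attribute type over all users."""
    totals = Counter()
    for attrs in profiles.values():
        # .get avoids creating empty entries for users lacking this attribute.
        totals.update(attrs.get(attr_type, Counter()))
    return totals.most_common(n)


profiles = build_profiles(annotations)
print(top_values(profiles, "product"))
```

Per-user counters of this kind could then be turned into feature vectors for clustering, for example by averaging word vectors of the product mentions; that step is not shown here.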
A New Class of Biosensors Based on Tobacco Mosaic Virus and Coat Proteins as Enzyme Nanocarrier
(2016)
Fields of asymmetric tensors play an important role in many applications such as medical imaging (diffusion tensor magnetic resonance imaging), physics, and civil engineering (for example, the Cauchy-Green deformation tensor, the strain tensor with local rotations, etc.). However, such asymmetric tensors are usually symmetrized before further processing, which results in a loss of information. A new method for processing asymmetric tensor fields is proposed, restricting our attention to second-order tensors given by a 2×2 array or matrix with real entries. This is achieved by a transformation resulting in Hermitian matrices, which have an eigendecomposition similar to that of symmetric matrices. With this new idea, numerical results are given for real-world data arising from the deformation of an object by external forces. It is shown that the asymmetric part indeed contains valuable information.
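The abstract does not spell out the transformation. One natural candidate, shown here purely as an assumption, keeps the symmetric part as the real part and places the skew-symmetric part in the imaginary part, H = S + iK with S = (A + Aᵀ)/2 and K = (A − Aᵀ)/2. The result is Hermitian, has real eigenvalues, and loses no information, since A = Re(H) + Im(H). The function name `to_hermitian` and the sample matrix are illustrative.

```python
import numpy as np


def to_hermitian(a):
    """Map a real square matrix A to H = S + iK (assumed transformation).

    S is the symmetric part and K the skew-symmetric part of A.
    Since K^T = -K, the matrix iK equals its own conjugate transpose,
    so H is Hermitian.
    """
    a = np.asarray(a, dtype=float)
    sym = 0.5 * (a + a.T)
    skew = 0.5 * (a - a.T)
    return sym + 1j * skew


# Illustrative asymmetric 2x2 tensor (made-up values).
a = np.array([[2.0, 1.0],
              [-0.5, 1.0]])

h = to_hermitian(a)

# eigh is NumPy's solver for Hermitian matrices: it returns real
# eigenvalues and an orthonormal set of complex eigenvectors.
eigvals, eigvecs = np.linalg.eigh(h)

# The original tensor is recovered exactly from H.
recovered = h.real + h.imag
```

Symmetrizing A would keep only `h.real`; the imaginary part carries exactly the information that symmetrization discards.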
Light-stimulated hydrogel actuators with incorporated graphene oxide for microfluidic applications
(2015)
Clearance of blood components and fluid drainage play a crucial role in subarachnoid hemorrhage (SAH) and posthemorrhagic hydrocephalus (PHH). With the involvement of interstitial fluid (ISF) and cerebrospinal fluid (CSF), two pathways for the clearance of fluid and solutes in the brain have been proposed. Starting at the level of capillaries, ISF flows along the basement membranes in the walls of cerebral arteries out of the parenchyma to drain into the lymphatics and CSF [1]–[3]. Conversely, it has been shown that CSF enters the parenchyma between glial and pial basement membranes of penetrating arteries [4]–[6]. Nevertheless, the involved structures and the contribution of either flow pathway to fluid balance between the subarachnoid space and the interstitial space remain controversial. Low-frequency oscillations in vascular tone are referred to as vasomotion, and the corresponding vasomotion waves are modeled as the driving force for the flow of ISF out of the parenchyma [7]. Retinal vessel analysis (RVA) allows non-invasive measurement of retinal vessel vasomotion in terms of diameter changes [8]. Thus, the aim of the study is to investigate vasomotion in RVA signals of SAH and PHH patients.
Reliable methods for automatic readability assessment have the potential to impact a variety of fields, ranging from machine translation to self-informed learning. Recently, large language models for the German language (such as GBERT and GPT-2-Wechsel) have become available, enabling the development of deep-learning-based approaches that promise to further improve automatic readability assessment. In this contribution, we studied the ability of ensembles of fine-tuned GBERT and GPT-2-Wechsel models to reliably predict the readability of German sentences. We combined these models with linguistic features and investigated how prediction performance depends on ensemble size and composition. Mixed ensembles of GBERT and GPT-2-Wechsel performed better than ensembles of the same size consisting of only GBERT or only GPT-2-Wechsel models. Our models were evaluated in the GermEval 2022 Shared Task on Text Complexity Assessment on a data set of German sentences. On out-of-sample data, our best ensemble achieved a root mean squared error of 0.435.
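How an ensemble is scored with the root mean squared error can be sketched as follows. The abstract does not state how member predictions are combined; plain averaging is assumed here, and all scores are made up for illustration. The names `ensemble_predict` and `rmse` are hypothetical.

```python
import numpy as np


def ensemble_predict(member_predictions):
    """Combine per-model predictions by unweighted averaging (an assumption)."""
    return np.mean(np.asarray(member_predictions, dtype=float), axis=0)


def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Made-up readability scores for three sentences.
targets = [1.0, 2.0, 3.0]

# Made-up predictions of two hypothetical ensemble members.
member_predictions = [
    [1.5, 2.5, 2.0],
    [0.5, 1.5, 4.0],
]

combined = ensemble_predict(member_predictions)

for i, preds in enumerate(member_predictions):
    print(f"member {i}: RMSE = {rmse(targets, preds):.3f}")
print(f"ensemble: RMSE = {rmse(targets, combined):.3f}")
```

In this toy case the members err in opposite directions, so averaging cancels their errors entirely. Real ensembles benefit in the same way only to the extent that member errors are not strongly correlated, which is one plausible reason for mixing different model families.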