Tan Jerome, Nicholas
PhD thesis, Faculty of Electrical Engineering and Information Technology, Karlsruhe Institute of Technology, 2019.
Exploring large and complex data sets is a crucial factor in a digital library framework. To find a specific data set within a large repository, visualisation can help to validate the content apart from the textual description. However, even with the existing visual tools, the difficulty of large-scale data concerning their size and heterogeneity impedes building visualisation as part of the digital library framework, thus hindering the effectiveness of large-scale data exploration.
This research focuses on managing Big Data and, ultimately, visualising the core information of the data itself. Specifically, I study three large-scale experiments that feature two Big Data challenges, large data size (Volume) and heterogeneous data (Variety), and provide the final visualisation through the web browser, for which the size of the input data has to be reduced while preserving the vital information. Despite the intimidating size, i.e., approximately 30 GB, and the complexity of the data, i.e., about 100 parameters per timestamp, I demonstrate how to provide a comprehensive overview of each data set at an interactive rate, with a system response time of less than 1 s, both when visualising gigabytes of data and when visualising multifaceted data in a single representation. For better data sharing, I selected a web-based system, which serves as a ubiquitous platform for the domain experts. Because the system is intended as a collaborative tool, I also address the shortcomings related to limited bandwidth, latency, and varying client hardware.
In this thesis, I present a design of web-based Big Data visualisation systems based on the data state reference model. Also, I develop frameworks that can process and output multi-dimensional data sets. For any Big Data feature, I propose a standard design guideline that helps domain experts to build their data visualisation. I introduce the use of texture-based images as the primary data object where the images are loaded in the texture memory of the client’s GPU for final visualisation. The visualisation ensures high interactivity since the data resides in the client’s memory. In particular, the interactivity of the system enables domain experts to narrow their search or analysis by using a top-down methodological approach. Also, I provide four use case studies to examine the feasibility of the proposed design concepts: (1) analysing multi-spectral imagery, (2) Doppler wind lidar, (3) ultrasound computer tomography, and (4) X-ray computer tomography. These case studies show the challenges of dealing with Big Data such as large data size or dispersed data sets.
To this end, this dissertation contributes to a better understanding of web-based Big Data visualisation by using the proposed design guideline. I show that domain experts appreciate the WAVE, BORA, and 3D optimal viewpoint finder frameworks as tools to understand and explore their data sets. Mainly, the frameworks help them to build and customise their visualisation system. Although specific customisation is necessary for different applications, the effort is worthwhile, and it helps domain experts to understand their vast amounts of data better. The BORA framework fits perfectly in any time series data repository where no programming knowledge is required. The WAVE framework serves as a web-based data exploration system. The 3D optimal viewpoint finder framework helps to generate 2D images from 3D data, where the 2D image is based on the 3D scene with the optimal view angle. To cope with increasing data rates, a general hierarchical organisation of data is necessary to extract valuable information from data sets.
First assessor: Prof. Dr. M. Weber
Second assessor: Prof. Dr. W. Nahm
Amsbaugh, J.F. et al.
in Nuclear Instruments and Methods in Physics Research, Section A: Accelerators, Spectrometers, Detectors and Associated Equipment Volume 778, 1 April 2015, Pages 40-60
The focal-plane detector system for the KArlsruhe TRItium Neutrino (KATRIN) experiment consists of a multi-pixel silicon p-i-n-diode array, custom readout electronics, two superconducting solenoid magnets, an ultra high-vacuum system, a high-vacuum system, calibration and monitoring devices, a scintillating veto, and a custom data-acquisition system. It is designed to detect the low-energy electrons selected by the KATRIN main spectrometer. We describe the system and summarize its performance after its final installation. © 2015 Elsevier B.V. All rights reserved.
Phillips, D.G. et al.
in IEEE Nuclear Science Symposium Conference Record
2010, Article number 5874002, Pages 1399-1403
This article will describe the procedures used to validate and characterize the combined hardware and software DAQ system of the KATRIN experiment. The Mk4 DAQ Electronics is the latest version in a series of field programmable gate array (FPGA)-based electronics developed at the Karlsruhe Institute of Technology’s Institute of Data Processing and Electronics (IPE). This system will serve as the primary detector readout in the KATRIN experiment. The KATRIN data acquisition software is a MacOS X application called ORCA (Object-oriented Real-time Control and Acquisition), which includes a powerful scripting language called ORCAScript. This article will also describe how ORCAScript is used in the validation and characterization tests of the Mk4 DAQ electronics system. © 2010 IEEE.
Chilingaryan, S., et al.
in Journal of Physics: Conference Series. Vol. 219. No. 4. IOP Publishing, 2010.
During operation of high energy physics experiments, a large amount of slow control data is recorded. It is necessary to examine all collected data, checking the integrity and validity of measurements. With the growing maturity of AJAX technologies, it becomes possible to construct sophisticated interfaces using web technologies only.
Our solution for handling time series, generally slow control data, has a modular architecture: a backend system for data analysis and preparation, a web service interface for data access, and a fast AJAX web display. In order to provide fast interactive access, the time series are aggregated over time slices of a few predefined lengths. The aggregated values are stored in a temporary caching database and are then used to create generalizing data plots. These plots may include an indication of data quality and are generated within a few hundred milliseconds even if very high data rates are involved. The extensible export subsystem provides data in multiple formats including CSV, Excel, ROOT, and TDMS. The search engine can be used to find periods of time where the readings of selected sensors fall into the specified ranges. Use of the caching database allows most such lookups to be performed within a second. Based on this functionality, a web interface facilitating fast (Google-maps style) navigation through the data has been implemented.
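The aggregation and range lookup described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the slice lengths, the function names (`aggregate`, `build_cache`, `candidate_slices`), and the in-memory dictionary standing in for the caching database are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical slice lengths in seconds; the abstract only states
# that "a few predefined lengths" are used.
SLICE_LENGTHS = (60, 3600)


def aggregate(samples, slice_len):
    """Group (timestamp, value) samples into fixed-length time slices
    and keep min, max, mean, and sample count for each slice."""
    buckets = defaultdict(list)
    for t, v in samples:
        slice_start = int(t // slice_len) * slice_len
        buckets[slice_start].append(v)
    return {
        start: {
            "min": min(vs),
            "max": max(vs),
            "mean": sum(vs) / len(vs),
            "n": len(vs),
        }
        for start, vs in sorted(buckets.items())
    }


def build_cache(samples, slice_lengths=SLICE_LENGTHS):
    """Build one aggregate table per slice length (a stand-in for the
    caching database mentioned in the abstract)."""
    return {length: aggregate(samples, length) for length in slice_lengths}


def candidate_slices(cache, slice_len, lo, hi):
    """Return start times of slices whose [min, max] interval overlaps
    [lo, hi]. Overlap only marks a slice as a candidate; the raw data
    of that slice would still have to be checked for an exact match."""
    return [
        start
        for start, agg in cache[slice_len].items()
        if agg["max"] >= lo and agg["min"] <= hi
    ]


if __name__ == "__main__":
    samples = [(0, 1.0), (30, 3.0), (60, 10.0), (3600, 5.0)]
    cache = build_cache(samples)
    print(cache[60][0])                         # aggregate of the first minute
    print(candidate_slices(cache, 60, 4.0, 6.0))  # slices that may contain values in [4, 6]
```

A plot at a coarse zoom level would read only the aggregate table of the matching slice length, and a range search would scan the cached minima and maxima instead of the raw series, which is one plausible reason such lookups can stay fast.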
The solution is currently used by several slow control systems at the Test Facility for Fusion Magnets (TOSKA) and the Karlsruhe Tritium Neutrino (KATRIN) experiment.