Tan Jerome, Nicholas

PhD thesis, Faculty of Electrical Engineering and Information Technology, Karlsruhe Institute of Technology, 2019.

Abstract

Exploring large and complex data sets is a crucial capability of a digital library framework. When searching for a specific data set within a large repository, visualisation helps to validate the content beyond its textual description. However, even with existing visual tools, the size and heterogeneity of large-scale data impede building visualisation into the digital library framework, which hinders effective large-scale data exploration.
The scope of this research focuses on managing Big Data and ultimately visualising the core information of the data itself. Specifically, I study three large-scale experiments that feature two Big Data challenges, large data size (Volume) and heterogeneous data (Variety), and provide the final visualisation through the web browser, which requires reducing the size of the input data while preserving the vital information. Despite the intimidating size (approximately 30 GB) and complexity (about 100 parameters per timestamp) of the data, I demonstrate how to provide a comprehensive overview of each data set at an interactive rate, with a system response time below 1 s: visualising gigabytes of data, and visualising multifaceted data in a single representation. For better data sharing, I selected a web-based system, which serves as a ubiquitous platform for domain experts. To make it a useful collaborative tool, I also address the shortcomings related to limited bandwidth, latency, and heterogeneous client hardware.
In this thesis, I present a design of web-based Big Data visualisation systems based on the data state reference model. I also develop frameworks that can process and output multi-dimensional data sets. For each Big Data feature, I propose a standard design guideline that helps domain experts to build their data visualisation. I introduce the use of texture-based images as the primary data object, where the images are loaded into the texture memory of the client’s GPU for final visualisation. The visualisation ensures high interactivity since the data resides in the client’s memory. In particular, the interactivity of the system enables domain experts to narrow their search or analysis by using a top-down methodological approach. I also provide four case studies to examine the feasibility of the proposed design concepts: (1) multi-spectral imagery analysis, (2) Doppler wind lidar, (3) ultrasound computer tomography, and (4) X-ray computer tomography. These case studies illustrate the challenges of dealing with Big Data, such as large data sizes or dispersed data sets.
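As an illustration of the texture-based approach, the following minimal sketch (an assumption of this summary, not code from the thesis; it uses numpy and Pillow) packs a multi-parameter time series into a single greyscale image that a browser client could upload to GPU texture memory:

```python
import numpy as np
from PIL import Image

timestamps, parameters = 4096, 100             # ~100 parameters per timestamp
data = np.random.rand(parameters, timestamps)  # stand-in for real sensor data

# Normalise each parameter row to [0, 255] so it survives 8-bit encoding.
lo = data.min(axis=1, keepdims=True)
hi = data.max(axis=1, keepdims=True)
texture = ((data - lo) / (hi - lo + 1e-12) * 255).astype(np.uint8)

# One row per parameter, one column per timestamp: a single image the
# client can bind as a GPU texture and sample in a shader.
Image.fromarray(texture, mode="L").save("timeseries_texture.png")
```

Once such an image resides in the client’s texture memory, panning, zooming, and parameter selection reduce to cheap texture lookups, which is consistent with the sub-second interactivity described above.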
To this end, this dissertation contributes to a better understanding of web-based Big Data visualisation through the proposed design guideline. I show that domain experts appreciate the WAVE, BORA, and 3D optimal viewpoint finder frameworks as tools to understand and explore their data sets. Above all, the frameworks help them to build and customise their visualisation systems. Although each application requires specific customisation, the effort is worthwhile and helps domain experts to better understand their vast amounts of data. The BORA framework fits into any time-series data repository and requires no programming knowledge. The WAVE framework serves as a web-based data exploration system. The 3D optimal viewpoint finder framework helps to generate 2D images from 3D data, rendering the 3D scene from the optimal viewing angle. To cope with increasing data rates, a general hierarchical organisation of data is necessary to extract valuable information from data sets.

 

First assessor: Prof. Dr. M. Weber
Second assessor: Prof. Dr. W. Nahm

Stevanovic, Uros

PhD thesis, Faculty of Electrical Engineering and Information Technology, Karlsruhe Institute of Technology, 2017.

Abstract

This dissertation proposes a novel smart camera platform serving as a flexible data acquisition system for scientific applications. Current technological progress offers increasing performance in the areas we consider, namely data throughput, data processing, and detector performance. Prevalent data acquisition solutions typically focus on only one of these aspects. However, driven by science, experiments face increasing demands in terms of data throughput, speed, and flexibility. In this dissertation, we introduce a system which, in addition to providing high-speed data transfer, is also capable of interpreting the incoming information at an early stage.

In order to demonstrate the full potential of the smart camera platform, we focus on X-ray imaging with synchrotron light sources. X-ray imaging applications can investigate the traits of technological and biological processes over microseconds for radiography and milliseconds for tomography. These applications may require different sensors and include complex experiment operations. The new smart camera platform is part of a larger project, UFO, which introduces a new concept for X-ray imaging: on-line data assessment is used to provide data-driven feedback and active management of both the process and the data acquisition procedure. This is accomplished by using a GPU platform for fast reconstruction, embedding on-camera data processing, and integrating the smart camera into a high-throughput data acquisition system.

The final design of the smart camera platform consists of a custom high-performance FPGA board providing continuous data transfer, embedded image processing, and a flexible input stage. At the IMAGE beamline of ANKA, the camera is integrated into the new control system and used in real-life applications. A maximum data throughput of up to 8 GB/s is achieved. A custom image-based algorithm with stringent real-time requirements is implemented in the FPGA; it increases the native sensor speed up to five times while reducing the amount of transferred data. Several image sensors are used, with resolutions of up to 20 megapixels and frame rates of up to 5 kfps. Thanks to its flexible input stage, the smart camera platform was also used in non-imaging applications. The proposed camera architecture enables the user to adapt the system to any kind of high data-throughput application and to implement custom processing algorithms.
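The data-reduction idea can be made concrete with a small model. The following Python sketch is a conceptual stand-in for the FPGA algorithm (whose actual implementation is not reproduced here): only pixels above a noise threshold leave the camera, so static background is never transferred.

```python
import numpy as np

def reduce_frame(frame: np.ndarray, threshold: int):
    """Return sparse (flat_index, value) pairs for pixels above threshold."""
    flat = frame.ravel()
    idx = np.flatnonzero(flat > threshold)
    return idx.astype(np.uint32), flat[idx]

frame = np.random.poisson(5, size=(1024, 1024)).astype(np.uint16)
frame[100:110, 200:210] += 500                    # a bright moving feature
indices, values = reduce_frame(frame, threshold=50)
ratio = frame.size / max(len(indices), 1)
print(f"kept {len(indices)} of {frame.size} pixels (reduction ~{ratio:.0f}x)")
```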

 

First assessor: Prof. Dr. M. Weber
Second assessor: Prof. Dr.-Ing. Dr. h.c. J. Becker

Rota, Lorenzo

PhD thesis, Faculty of Electrical Engineering and Information Technology, Karlsruhe Institute of Technology, 2017.

Abstract

In modern particle accelerators, precise control of the particle beam is essential for the correct operation of the facility. The experimental observation of the beam behavior relies on dedicated techniques, often described by the term “beam diagnostics”. Cutting-edge beam diagnostics systems, in particular several experimental setups currently installed at KIT’s synchrotron light source ANKA, employ line scan detectors to characterize and monitor the beam parameters precisely. Up to now, the experimental resolution of these setups has been constrained by the line rate of existing detectors, which reaches only a few hundred kHz.

This thesis addresses this limitation with the development of a novel line scan detector system named KALYPSO – KArlsruhe Linear arraY detector for MHz rePetition-rate SpectrOscopy. The goal is to provide scientists at ANKA with a complete detector system that enables real-time measurements at MHz repetition rates. The design of both front-end and back-end electronics suitable for beam diagnostics experiments is a challenging task, because the detector must achieve low-noise performance at high repetition rates and with a large number of channels. Moreover, the detector system must sustain continuous data taking while keeping latency low. To meet these stringent requirements, the author of this thesis developed several novel components, including a new readout ASIC and a high-performance DAQ system.

The front-end ASIC has been designed to read out different types of microstrip sensors for the detection of visible and near-infrared light. The ASIC comprises 128 analog channels operated in parallel, plus additional mixed-signal stages that interface with external devices. Each channel consists of a Charge Sensitive Amplifier (CSA), a Correlated Double Sampling (CDS) stage, and a channel buffer. Moreover, a high-speed output driver has been implemented to drive an off-chip ADC directly. The first version of the ASIC, with a reduced number of channels, has been produced in a 110 nm CMOS technology. The chip is fully functional and achieves a line rate of 12 MHz with an equivalent noise charge of 417 electrons when connected to a detector capacitance of 1.3 pF.
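The CDS stage can be illustrated numerically. The sketch below (illustrative values only, not the ASIC’s actual parameters) shows how subtracting the reset sample from the signal sample cancels the offset and low-frequency noise common to both:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
offset_drift = rng.normal(0.0, 5.0, n)   # slow noise, common to both samples
reset = 100.0 + offset_drift + rng.normal(0.0, 0.5, n)
signal = 130.0 + offset_drift + rng.normal(0.0, 0.5, n)

cds = signal - reset                     # drift cancels, the charge remains
print(f"raw signal std:  {signal.std():.2f}")
print(f"CDS output std:  {cds.std():.2f}   mean: {cds.mean():.2f}")
```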

Moreover, a dedicated DAQ system has been developed to connect FPGA readout cards directly to GPU computing nodes. The data transfer is handled by a novel DMA engine implemented on the FPGA. The performance of the DMA engine compares favorably with the current state of the art, achieving a throughput of more than 7 GB/s and latencies as low as 2 µs. The high-throughput and low-latency performance of the DAQ system enables real-time data processing on GPUs, as demonstrated by extensive measurements. The DAQ system is currently integrated with KALYPSO and with other detector systems developed at the Institute for Data Processing and Electronics (IPE).
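The streaming pattern behind such a DAQ chain can be sketched in a few lines. The following Python model is a conceptual analogy only (the real DMA engine lives in FPGA logic): a double-buffered producer/consumer scheme keeps data taking from stalling while the consumer processes each buffer.

```python
import queue, threading
import numpy as np

free, full = queue.Queue(), queue.Queue()
for _ in range(4):                            # small ring of reusable buffers
    free.put(np.empty(1 << 20, dtype=np.uint8))

def dma_engine(n_transfers):
    """Stand-in for the FPGA DMA engine filling buffers."""
    for i in range(n_transfers):
        buf = free.get()                      # blocks if the consumer lags
        buf[:] = i & 0xFF                     # stand-in for one DMA transfer
        full.put(buf)
    full.put(None)                            # end-of-stream marker

def gpu_consumer():
    """Stand-in for GPU-side processing draining the buffer ring."""
    frames = 0
    while (buf := full.get()) is not None:
        frames += 1                           # real code would process on GPU
        free.put(buf)                         # recycle the buffer
    print(f"drained {frames} buffers without stalling data taking")

threading.Thread(target=dma_engine, args=(64,)).start()
gpu_consumer()
```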

In parallel with the development of the ASIC, a first version of the KALYPSO detector system has been produced. This version is based on a Si or InGaAs microstrip sensor with 256 channels and on the GOTTHARD chip. A line rate of 2.7 MHz has been achieved, and experimental measurements have established KALYPSO as a powerful line scan detector operating at high line rates. The final version of the KALYPSO detector system, which will achieve a line rate of 10 MHz, is anticipated for early 2018.

Finally, KALYPSO has been installed at two different experimental setups at ANKA during several commissioning campaigns. The KALYPSO detector system allowed scientists to observe the beam behavior with unprecedented experimental resolution. First exciting and widely recognized scientific results were obtained at ANKA and at the European XFEL, demonstrating the benefits brought by the KALYPSO detector system in modern beam diagnostics.

 

First assessor: Prof. Dr. M. Weber
Second assessor: Prof. Dr.-Ing. Dr. h.c. J. Becker

Farago, Tomas

PhD thesis, Faculty of Computer Science, Karlsruhe Institute of Technology, 2017.

Abstract

X-ray imaging experiments shed light on internal material structures. The success of an experiment depends on properly selected experimental conditions, the mechanics, and the behavior of the sample or process under study. Up to now, there has been no autonomous data acquisition scheme that would enable us to conduct a broad range of X-ray imaging experiments driven by image-based feedback. This thesis aims to close this gap by solving problems related to the selection of experimental parameters, fast data processing, and automatic feedback to the experiment based on image metrics applied to the processed data.

In order to determine the best initial experimental conditions, we study the principles of X-ray image formation and develop a framework for their simulation. It enables us to simulate a broad range of X-ray imaging experiments, taking into account many physical principles along the full light path from the X-ray source to the detector. Moreover, we model various sample geometries and motion, which allows simulations of experiments such as 4D time-resolved tomography.
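To give a flavour of such a forward model, the toy sketch below (a simplification assumed for this summary; it covers only Beer-Lambert absorption and ignores the source spectrum, free-space propagation, and detector response) computes one parallel-beam projection through a synthetic sample:

```python
import numpy as np

nx, nz, dz = 256, 256, 1e-6               # grid size and step size in metres
mu = np.zeros((nz, nx))                    # attenuation coefficients [1/m]
mu[96:160, 96:160] = 5e4                   # a strongly absorbing inclusion

# Beer-Lambert: intensity after the sample is I0 * exp(-integral of mu).
I0 = 1.0
projection = I0 * np.exp(-(mu * dz).sum(axis=0))  # one ray per detector pixel
print("transmission behind the inclusion:", round(float(projection[128]), 3))
```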

We further develop an autonomous data acquisition scheme which is able to fine-tune the initial conditions and control the experiment based on fast image analysis. We focus on high-speed experiments which require significant data processing speed, especially when the control is based on compute-intensive algorithms. We employ a highly parallelized framework to implement an efficient 3D reconstruction algorithm whose output is plugged into various image metrics which provide information about the acquired data. Such metrics are connected to a decision-making scheme which controls the data acquisition hardware in a closed loop.
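The closed loop can be summarised in pseudocode-like Python. All functions below are placeholders for the camera, GPU reconstruction, and beamline interfaces, which this summary does not reproduce; the decision rule is likewise an invented, minimal example.

```python
import numpy as np

def acquire(frame_rate):
    """Placeholder camera: slow sampling smears the moving scene."""
    blur = max(1.0, 500.0 / frame_rate)
    return np.random.rand(64, 64) / blur

def reconstruct(frames):
    """Placeholder for the GPU reconstruction step."""
    return frames

def sharpness(img):
    """Image metric: mean gradient energy."""
    gy, gx = np.gradient(img)
    return float((gx ** 2 + gy ** 2).mean())

frame_rate = 100.0
for step in range(6):
    metric = sharpness(reconstruct(acquire(frame_rate)))
    if metric < 0.05:                     # simple decision rule
        frame_rate *= 1.5                 # speed the camera up and retry
    print(f"step {step}: {frame_rate:6.0f} fps, metric {metric:.4f}")
```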

We demonstrate the accuracy of the simulation framework by comparing virtual and real grating interferometry experiments. We also examine the impact of imaging conditions on the accuracy of the filtered back projection algorithm and how this can guide the optimization of experimental conditions. We further show how simulation, together with ground truth data, can help to choose data processing parameters for motion estimation in a high-speed experiment.
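For reference, a minimal 2D parallel-beam filtered back projection looks as follows (a textbook sketch, not the thesis’ optimized implementation; production code would use a windowed ramp filter and proper interpolation):

```python
import numpy as np

def fbp(sinogram, angles):
    """Reconstruct a 2D slice from parallel-beam projections."""
    n_angles, n_det = sinogram.shape
    # Ramp filter applied to every projection in Fourier space.
    ramp = np.abs(np.fft.fftfreq(n_det))
    filtered = np.fft.ifft(np.fft.fft(sinogram, axis=1) * ramp, axis=1).real
    # Smear each filtered projection back across the image grid.
    xs = np.arange(n_det) - n_det / 2
    X, Y = np.meshgrid(xs, xs)
    recon = np.zeros((n_det, n_det))
    for theta, proj in zip(angles, filtered):
        t = X * np.cos(theta) + Y * np.sin(theta) + n_det / 2
        recon += proj[np.clip(t.astype(int), 0, n_det - 1)]
    return recon * np.pi / n_angles

angles = np.linspace(0, np.pi, 180, endpoint=False)
sino = np.zeros((180, 128))
sino[:, 64] = 1.0                          # sinogram of a single central point
recon = fbp(sino, angles)
print("peak at centre:", recon[64, 64] > recon.mean())
```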

We demonstrate the autonomous data acquisition system on an in-situ tomographic experiment, where it optimizes the camera frame rate based on tomographic reconstruction. We also use our system to conduct a high-throughput tomography experiment, where it scans many similar biological samples, finds the tomographic rotation axis for every sample and reconstructs a full 3D volume on-the-fly for quality assurance. Furthermore, we conduct an in-situ laminography experiment studying crack formation in a material. Our system performs the data acquisition and reconstructs a central slice of the sample to check its alignment and data quality.
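Finding the rotation axis from the data itself is commonly done by comparing opposed projections; the sketch below illustrates this standard technique (assumed here for illustration rather than taken from the thesis) by cross-correlating the 0° projection with the mirrored 180° projection:

```python
import numpy as np

def find_axis_offset(p0, p180):
    """Offset of the rotation axis from the detector centre, estimated
    from two opposed 1D projections (circular correlation via FFT)."""
    corr = np.fft.ifft(np.fft.fft(p0) * np.conj(np.fft.fft(p180[::-1]))).real
    shift = int(np.argmax(corr))
    if shift > len(p0) // 2:
        shift -= len(p0)                  # unwrap negative shifts
    return shift / 2.0                    # the axis moves half the mirror shift

x = np.arange(512)
p0 = np.exp(-((x - 300) ** 2) / 200.0)    # sample seen at 0 degrees
p180 = np.exp(-((x - 230) ** 2) / 200.0)  # same sample seen at 180 degrees
print("axis offset from centre:", find_axis_offset(p0, p180))  # ~9.5 pixels
```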

Our work enables the selection of optimal initial experimental conditions based on high-fidelity simulations, their fine-tuning during a real experiment, and automatic experiment control based on fast data analysis. Such a data acquisition scheme enables novel high-speed and in-situ experiments that cannot be controlled by a human operator due to the high data rates involved.

 

First assessor: Prof. Dr.-Ing. R. Dillmann
Second assessor: Prof. Dr. Tilo Baumbach

Vogelgesang, Matthias

PhD thesis, Faculty of Computer Science, Karlsruhe Institute of Technology, 2014.

Abstract

Moore’s law remains the driving force behind higher chip integration density and an ever-increasing number of transistors. However, the adoption of massively parallel hardware architectures widens the gap between the microprocessor performance that is potentially available and the performance a developer can actually exploit. This thesis tries to close this gap by solving the problems that arise in achieving optimal performance on parallel compute systems, allowing developers and end-users to tap this compute performance in a transparent manner, and using the compute performance to enable data-driven processes.

A general solution cannot realistically achieve optimal operation, which is why this thesis focuses on streamed data processing. Data streams lend themselves to describing high-throughput data processing tasks such as audio and video processing. With this specific use case, we can systematically improve existing designs and optimize execution from instruction-level parallelism up to node-level task parallelism. In particular, we focus on X-ray imaging applications used at synchrotron light sources. These large-scale facilities provide an X-ray beam that enables scanning samples at much higher spatial and temporal resolution than conventional X-ray sources. The increased data rate inevitably requires highly parallel processing systems as well as an optimized data acquisition and control environment.

To solve the problem of high-throughput streamed data processing, we developed, modeled, and evaluated system architectures to acquire and process data streams on parallel and heterogeneous compute systems. We developed a method to map general task descriptions onto heterogeneous compute systems and execute them with optimizations for local multi-device machines and clusters of multi-user compute nodes. We also proposed a source-to-source translation system to simplify the development of task descriptions.
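The mapping idea can be illustrated with a toy scheduler. The sketch below mirrors the concept only; the task names, device table, and greedy policy are assumptions of this summary, not the thesis’ actual scheduler or any framework API:

```python
# Each task advertises which device kinds can run it; a greedy pass
# assigns every task to the least-loaded compatible device.
tasks = [                                   # (name, runs_on, estimated cost)
    ("read",        {"cpu"},        1.0),
    ("filter",      {"cpu", "gpu"}, 4.0),
    ("backproject", {"gpu"},        8.0),
    ("write",       {"cpu"},        1.0),
]
devices = {"cpu0": "cpu", "gpu0": "gpu", "gpu1": "gpu"}

load = {d: 0.0 for d in devices}
placement = {}
for name, kinds, cost in tasks:
    candidates = [d for d, kind in devices.items() if kind in kinds]
    target = min(candidates, key=load.get)  # least-loaded compatible device
    placement[name] = target
    load[target] += cost

print(placement)  # e.g. {'read': 'cpu0', 'filter': 'gpu0', ...}
```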

We have shown that it is possible to acquire and compute tomographic reconstructions on a heterogeneous compute system consisting of CPUs and GPUs in soft real-time. The end-user’s only responsibility is to describe the problem correctly. With the proposed system architectures, we paved the way for novel in-situ and in-vivo experiments and a much smarter experiment setup in general. Whereas existing experiments depend on a static environment and a fixed process sequence, we established the possibility of controlling the experiment setup in a closed feedback loop.

 

First assessor: Prof. Dr. Achim Streit
Second assessor: Prof. Dr. Marc Weber