Tan Jerome, Nicholas

PhD thesis, Faculty of Electrical Engineering and Information Technology, Karlsruhe Institute of Technology, 2019.

Abstract

Exploring large and complex data sets is a crucial factor in a digital library framework. To find a specific data set within a large repository, visualisation can help to validate the content apart from the textual description. However, even with the existing visual tools, the difficulty of large-scale data concerning their size and heterogeneity impedes building visualisation as part of the digital library framework, thus hindering the effectiveness of large-scale data exploration.
The scope of this research focuses on managing Big Data and eventually visualising the core information of the data itself. Specifically, I study three large-scale experiments that feature two Big Data challenges: large data size (Volume) and heterogeneous data (Variety), and provide the final visualisation through the web browser in which the size of the input data has to be reduced while preserving the vital information. Despite the intimidating size, i.e., approximately 30 GB, and the complexity of the data, i.e., about 100 parameters per timestamp, I demonstrated how to provide a comprehensive overview of each data set at an interactive rate where the system response time is less than 1 s, both when visualising gigabytes of data and when visualising multifaceted data in a single representation. For better data sharing, I selected a web-based system which serves as a ubiquitous platform for the domain experts. Since the system is intended as a collaborative tool, I also address the shortcomings related to limited bandwidth latency and various client hardware.
In this thesis, I present a design of web-based Big Data visualisation systems based on the data state reference model. Also, I develop frameworks that can process and output multi-dimensional data sets. For any Big Data feature, I propose a standard design guideline that helps domain experts to build their data visualisation. I introduce the use of texture-based images as the primary data object where the images are loaded in the texture memory of the client’s GPU for final visualisation. The visualisation ensures high interactivity since the data resides in the client’s memory. In particular, the interactivity of the system enables domain experts to narrow their search or analysis by using a top-down methodological approach. Also, I provide four use case studies to examine the feasibility of the proposed design concepts: (1) analysing multi-spectral imagery, (2) Doppler wind lidar, (3) ultrasound computer tomography, and (4) X-ray computer tomography. These case studies show the challenges of dealing with Big Data such as large data size or disperse data sets.
To this end, this dissertation contributes to a better understanding of web-based Big Data visualisation by using the proposed design guideline. I show that domain experts appreciate the WAVE, BORA, and 3D optimal viewpoint finder frameworks as tools to understand and explore their data sets. Mainly, the frameworks help them to build and customise their visualisation system. Although specific customisation is necessary for different applications, the effort is worthwhile, and it helps domain experts to understand their vast amounts of data better. The BORA framework fits perfectly in any time series data repository where no programming knowledge is required. The WAVE framework serves as a web-based data exploration system. The 3D optimal viewpoint finder framework helps to generate 2D images from 3D data, where the 2D image is based on the 3D scene with optimal view angle. To cope with increasing data rates, a general hierarchical organisation of data is necessary to extract valuable information from data sets.

 

First assessor: Prof. Dr. M. Weber
Second assessor: Prof. Dr. W. Nahm

van de Kamp T., Schwermann A.H., dos Santos Rolo T., Losel P.D., Engler T., Etter W., Farago T., Gottlicher J., Heuveline V., Kopmann A., Mahler B., Mors T., Odar J., Rust J., Tan Jerome N., Vogelgesang M., Baumbach T., Krogmann L.

in Nature Communications, 9 (2018), 3325. DOI:10.1038/s41467-018-05654-y

Abstract

© 2018, The Author(s). About 50% of all animal species are considered parasites. The linkage of species diversity to a parasitic lifestyle is especially evident in the insect order Hymenoptera. However, fossil evidence for host–parasitoid interactions is extremely rare, rendering hypotheses on the evolution of parasitism assumptive. Here, using high-throughput synchrotron X-ray microtomography, we examine 1510 phosphatized fly pupae from the Paleogene of France and identify 55 parasitation events by four wasp species, providing morphological and ecological data. All species developed as solitary endoparasitoids inside their hosts and exhibit different morphological adaptations for exploiting the same hosts in one habitat. Our results allow systematic and ecological placement of four distinct endoparasitoids in the Paleogene and highlight the need to investigate ecological data preserved in the fossil record.

Schmelzle S., Heethoff M., Heuveline V., Losel P., Becker J., Beckmann F., Schluenzen F., Hammel J.U., Kopmann A., Mexner W., Vogelgesang M., Jerome N.T., Betz O., Beutel R., Wipfler B., Blanke A., Harzsch S., Hornig M., Baumbach T., Van De Kamp T.

in Proceedings of SPIE – The International Society for Optical Engineering, 10391 (2017), 103910P. DOI:10.1117/12.2275959

Abstract

© 2017 SPIE. Beamtime and resulting SRμCT data are a valuable resource for researchers of a broad scientific community in life sciences. Most research groups, however, are only interested in a specific organ and use only a fraction of their data. The rest of the data usually remains untapped. By using a new collaborative approach, the NOVA project (Network for Online Visualization and synergistic Analysis of tomographic data) aims to demonstrate that more efficient use of the valuable beam time is possible by coordinated research on different organ systems. The biological partners in the project cover different scientific aspects and thus serve as model community for the collaborative approach. As proof of principle, different aspects of insect head morphology will be investigated (e.g., biomechanics of the mouthparts, and neurobiology with the topology of sensory areas). This effort is accomplished by development of advanced analysis tools for the ever-increasing quantity of tomographic datasets. In the preceding project ASTOR, we already successfully demonstrated considerable progress in semi-automatic segmentation and classification of internal structures. Further improvement of these methods is essential for an efficient use of beam time and will be refined in the current NOVA project. Significant enhancements are also planned at PETRA III beamline p05 to provide all possible contrast modalities in x-ray imaging optimized to biological samples, on the reconstruction algorithms, and the tools for subsequent analyses and management of the data. All improvements made on key technologies within this project will in the long-term be equally beneficial for all users of tomography instrumentations.