Selected papers on our projects.
Cavadini P., Weinhold H., Tonsmann M., Chilingaryan S., Kopmann A., Lewkowicz A., Miao C., Scharfer P., Schabel W.
in Experiments in Fluids, 59 (2018), 61. DOI:10.1007/s00348-017-2482-z
© 2018, Springer-Verlag GmbH Germany, part of Springer Nature. To understand the effects of inhomogeneous drying on the quality of polymer coatings, an experimental setup to resolve the occurring flow field throughout the drying film has been developed. Deconvolution microscopy is used to analyze the flow field in 3D and time. Since the dimension of the spatial component in the direction of the line-of-sight is limited compared to the lateral components, a multi-focal approach is used. Here, the beam of light is equally distributed on up to five cameras using cubic beam splitters. Adding a meniscus lens between each pair of camera and beam splitter and setting different distances between each camera and its meniscus lens creates multi-focality and allows one to increase the depth of the observed volume. Resolving the spatial component in the line-of-sight direction is based on analyzing the point spread function. The analysis of the PSF is computational expensive and introduces a high complexity compared to traditional particle image velocimetry approaches. A new algorithm tailored to the parallel computing architecture of recent graphics processing units has been developed. The algorithm is able to process typical images in less than a second and has further potential to realize online analysis in the future. As a prove of principle, the flow fields occurring in thin polymer solutions drying at ambient conditions and at boundary conditions that force inhomogeneous drying are presented.
Hanschke D., Danilewsky A., Helfen L., Hamann E., Baumbach T.
in Physical Review Letters, 119 (2017), 215504. DOI:10.1103/PhysRevLett.119.215504
© 2017 American Physical Society. Correlated x-ray diffraction imaging and light microscopy provide a conclusive picture of three-dimensional dislocation arrangements on the micrometer scale. The characterization includes bulk crystallographic properties like Burgers vectors and determines links to structural features at the surface. Based on this approach, we study here the thermally induced slip-band formation at prior mechanical damage in Si wafers. Mobilization and multiplication of preexisting dislocations are identified as dominating mechanisms, and undisturbed long-range emission from regenerative sources is discovered.
Schmelzle S., Heethoff M., Heuveline V., Losel P., Becker J., Beckmann F., Schluenzen F., Hammel J.U., Kopmann A., Mexner W., Vogelgesang M., Jerome N.T., Betz O., Beutel R., Wipfler B., Blanke A., Harzsch S., Hornig M., Baumbach T., Van De Kamp T.
in Proceedings of SPIE – The International Society for Optical Engineering, 10391 (2017), 103910P. DOI:10.1117/12.2275959
© 2017 SPIE. Beamtime and resulting SRμCT data are a valuable resource for researchers of a broad scientific community in life sciences. Most research groups, however, are only interested in a specific organ and use only a fraction of their data. The rest of the data usually remains untapped. By using a new collaborative approach, the NOVA project (Network for Online Visualization and synergistic Analysis of tomographic data) aims to demonstrate, that more efficient use of the valuable beam time is possible by coordinated research on different organ systems. The biological partners in the project cover different scientific aspects and thus serve as model community for the collaborative approach. As proof of principle, different aspects of insect head morphology will be investigated (e.g., biomechanics of the mouthparts, and neurobiology with the topology of sensory areas). This effort is accomplished by development of advanced analysis tools for the ever-increasing quantity of tomographic datasets. In the preceding project ASTOR, we already successfully demonstrated considerable progress in semi-automatic segmentation and classification of internal structures. Further improvement of these methods is essential for an efficient use of beam time and will be refined in the current NOVAproject. Significant enhancements are also planned at PETRA III beamline p05 to provide all possible contrast modalities in x-ray imaging optimized to biological samples, on the reconstruction algorithms, and the tools for subsequent analyses and management of the data. All improvements made on key technologies within this project will in the long-term be equally beneficial for all users of tomography instrumentations.
PhD thesis, Faculty of Electrical Engineering and Information Technology, Karlsruhe Institute of Technology, 2017.
In modern particle accelerators, a precise control of the particle beam is essential for the correct operation of the facility. The experimental observation of the beam behavior relies on dedicated techniques, which are often described by the term “beam diagnostics”. Cutting-edge beam diagnostics systems, in particular several experimental setups currently installed at KIT’s synchrotron light source ANKA, employ line scan detectors to characterize and monitor the beam parameters precisely. Up to now, the experimental resolution of these setups has been limited by the line rate of existing detectors, which is limited to a few hundreds of kHz.
This thesis addresses this limitation with the development a novel line scan detector system named KALYPSO – KArlsruhe Linear arraY detector for MHz rePetition-rate SpectrOscopy. The goal is to provide scientists at ANKA with a complete detector system which will enable real-time measurements at MHz repetition rates. The design of both front-end and back-end electronics suitable for beam diagnostic experiments is a challenging task, because the detector must achieve low-noise performance at high repetition rates and with a large number of channels. Moreover, the detector system must sustain continuous data taking and introduce low-latency. To meet these stringent requirements, several novel components have been developed by the author of this thesis, such as a novel readout ASIC and a high-performance DAQ system.
The front-end ASIC has been designed to readout different types of microstrip sensors for the detection of visible and near-infrared light. The ASIC is composed of 128 analog channels which are operated in parallel, plus additional mixed-signal stages which interface external devices. Each channel consists of a Charge Sensitive Amplifier (CSA), a Correlated Double Sampling (CDS) stage and a channel buffer. Moreover, a high-speed output driver has been implemented to interface directly an off-chip ADC. The first version of the ASIC with a reduced number of channels has been produced in a 110 nm CMOS technology. The chip is fully functional and achieves a line rate of 12 MHz with an equivalent noise charge of 417 electrons when connected to a detector capacitance of 1.3 pF.
Moreover, a dedicated DAQ system has been developed to connect directly FPGA readout cards and GPU computing nodes. The data transfer is handled by a novel DMA engine implemented on FPGA. The performance of the DMA engine compares favorably with the current state-of-the-art, achieving a throughput of more than 7 GB/s and latencies as low as 2 us. The high-throughput and low-latency performance of the DAQ system enables real-time data processing on GPUs, as it has been demonstrated with extensive measurements. The DAQ system is currently integrated with KALYPSO and with other detector systems developed at the Institute for Data Processing and Electronics (IPE).
In parallel with the development of the ASIC, a first version of the KALYPSO detector system has been produced. This version is based on a Si or InGaAs microstrip sensor with 256 channels and on the GOTTHARD chip. A line rate of 2.7 MHz has been achieved, and experimental measurements have established KALYPSO as a powerful line scan detector operating at high line rates. The final version of the KALYPSO detector system, which will achieve a line rate of 10 MHz, is anticipated for early 2018.
Finally, KALYPSO has been installed at two different experimental setups at ANKA during several commissioning campaigns. The KALYPSO detector system allowed scientists to observe the beam behavior with unprecedented experimental resolution. First exciting and widely recognized scientific results were obtained at ANKA and at the European XFEL, demonstrating the benefits brought by the KALYPSO detector system in modern beam diagnostics.
First assessor: Prof. Dr. M. Weber
Second assessor: Prof. Dr.-Ing. Dr. h.c. J. Becker
Mohr H., Dritschler T., Ardila L.E., Balzer M., Caselle M., Chilingaryan S., Kopmann A., Rota L., Schuh T., Vogelgesang M., Weber M.
in Journal of Instrumentation, 12 (2017), C04019. DOI:10.1088/1748-0221/12/04/C04019
© 2017 IOP Publishing Ltd and Sissa Medialab srl. In this work, we investigate the use of GPUs as a way of realizing a low-latency, high-throughput track trigger, using CMS as a showcase example. The CMS detector at the Large Hadron Collider (LHC) will undergo a major upgrade after the long shutdown from 2024 to 2026 when it will enter the high luminosity era. During this upgrade, the silicon tracker will have to be completely replaced. In the High Luminosity operation mode, luminosities of 5-7 × 1034 cm-2s-1 and pileups averaging at 140 events, with a maximum of up to 200 events, will be reached. These changes will require a major update of the triggering system. The demonstrated systems rely on dedicated hardware such as associative memory ASICs and FPGAs. We investigate the use of GPUs as an alternative way of realizing the requirements of the L1 track trigger. To this end we implemeted a Hough transformation track finding step on GPUs and established a low-latency RDMA connection using the PCIe bus. To showcase the benefits of floating point operations, made possible by the use of GPUs, we present a modified algorithm. It uses hexagonal bins for the parameter space and leads to a more truthful representation of the possible track parameters of the individual hits in Hough space. This leads to fewer duplicate candidates and reduces fake track candidates compared to the regular approach. With data-transfer latencies of 2 μs and processing times for the Hough transformation as low as 3.6 μs, we can show that latencies are not as critical as expected. However, computing throughput proves to be challenging due to hardware limitations.
PhD thesis, Faculty of Computer Science, Karlsruhe Institute of Technology, 2017.
X-ray imaging experiments shed light on internal material structures. The success of an experiment depends on the properly selected experimental conditions, mechanics and the behavior of the sample or process under study. Up to now, there is no autonomous data acquisition scheme which would enable us to conduct a broad range of X-ray imaging experiments driven by image-based feedback. This thesis aims to close this gap by solving problems related to the selection of experimental parameters, fast data processing and automatic feedback to the experiment based on image metrics applied to the processed data.
In order to determine the best initial experimental conditions, we study the X-ray image formation principles and develop a framework for their simulation. It enables us to conduct a broad range of X-ray imaging experiments by taking into account many physical principles of the full light path from the X-ray source to the detector. Moreover, we focus on various sample geometry models and motion, which allows simulations of experiments such as 4D time-resolved tomography.
We further develop an autonomous data acquisition scheme which is able to fine-tune the initial conditions and control the experiment based on fast image analysis. We focus on high-speed experiments which require significant data processing speed, especially when the control is based on compute-intensive algorithms. We employ a highly parallelized framework to implement an efficient 3D reconstruction algorithm whose output is plugged into various image metrics which provide information about the acquired data. Such metrics are connected to a decision-making scheme which controls the data acquisition hardware in a closed loop.
We demonstrate the simulation framework accuracy by comparing virtual and real grating interferometry experiments. We also look into the impact of imaging conditions on the accuracy of the filtered back projection algorithm and how it can guide the optimization of experimental conditions. We also show how simulation together with ground truth can help to choose data processing parameters for motion estimation by a high-speed experiment.
We demonstrate the autonomous data acquisition system on an in-situ tomographic experiment, where it optimizes the camera frame rate based on tomographic reconstruction. We also use our system to conduct a high-throughput tomography experiment, where it scans many similar biological samples, finds the tomographic rotation axis for every sample and reconstructs a full 3D volume on-the-fly for quality assurance. Furthermore, we conduct an in-situ laminography experiment studying crack formation in a material. Our system performs the data acquisition and reconstructs a central slice of the sample to check its alignment and data quality.
Our work enables selection of the optimal initial experimental conditions based on high-fidelity simulations, their fine-tuning during a real experiment and its automatic control based on fast data analysis. Such a data acquisition scheme enables novel high-speed and in-situ experiments which cannot be controlled by a human operator due to high data rates.
First assessor: Prof. Dr.-Ing. R. Dillmann
Second assessor: Prof. Dr. Tilo Baumbach
Caselle M., Perez L.E.A., Balzer M., Kopmann A., Rota L., Weber M., Brosi M., Steinmann J., Brundermann E., Muller A.-S.
in Journal of Instrumentation, 12 (2017), C01040. DOI:10.1088/1748-0221/12/01/C01040
© 2017 IOP Publishing Ltd and Sissa Medialab srl. This paper presents a novel data acquisition system for continuous sampling of ultra-short pulses generated by terahertz (THz) detectors. Karlsruhe Pulse Taking Ultra-fast Readout Electronics (KAPTURE) is able to digitize pulse shapes with a sampling time down to 3 ps and pulse repetition rates up to 500 MHz. KAPTURE has been integrated as a permanent diagnostic device at ANKA and is used for investigating the emitted coherent synchrotron radiation in the THz range. A second version of KAPTURE has been developed to improve the performance and flexibility. The new version offers a better sampling accuracy for a pulse repetition rate up to 2 GHz. The higher data rate produced by the sampling system is processed in real-time by a heterogeneous FPGA and GPU architecture operating up to 6.5 GB/s continuously. Results in accelerator physics will be reported and the new design of KAPTURE be discussed.
M. Heethoff, V. Heuveline, H. Hartenstein, W. Mexner, T. van de Kamp, A. Kopmann
Final report, BMBF Programme: “Erforschung kondensierter Materie”, 2016.
Die Synchrotron-Röntgentomographie ist eine einzigartige Abbildungsmethode zur Untersuchung innerer Strukturen – insbesondere in undurchsichtigen Proben. In den letzten Jahren konnte die räumliche und zeitliche Auflösung der Methode stark verbessert werden. Die Auswertung der Datensätze ist allerdings bedingt durch ihre Größe und die Komplexität der abgebildeten Strukturen herausfordernd. Der Verbund für Funktionsmorphologie und Systematik hat sich mit dem Projekt ASTOR das Ziel gesetzt, den Zugang zur Röntgentomographie durch eine integrierte Analyseumgebung für biologische Nutzer zu erleichtern.
Durch den interdisziplinären Zusammenschluss von Biologen, Informatikern, Mathematikern und Ingenieuren war es möglich, die gesamte Datenverarbeitungskette zu betrachten. Es sind weitgehend automatisierte Datenverarbeitungs- und -transfermethoden entstanden. Die tomographischen Aufnahmen werden online rekonstruiert und in die ASTOR Analyseumgebung transferiert. Die Daten stehen anschließend über virtuelle Rechner den Nutzern sowohl bei ANKA als auch außerhalb zur Verfügung. Ein Autorisierungsschema für den Zugriff wurde erarbeitet. Die Analyseinfrastruktur besteht aus einem temporären Datenspeicher, dem Virtualisierungsserver, sowie der Anbindung an Beamlines und Langzeitarchiv. Die Analyseumgebung bietet neben kostenintensiven kommerziellen Programmen neu entwickelte Werkzeuge an. Hervorzuheben sind hier die ASTOR- Segmentierungsfunktionen, die den bislang sehr zeit- und arbeitsintensiven Arbeitsschritt um ein Vielfaches beschleunigen. Die automatische Segmentierung lässt sich transparent über in nur wenigen Schichten markierte Bereiche steuern und erzielt ein bislang unerreichtes automatisches Segmentierungsergebnis.
Die Analyseumgebung hat sich als sehr effizient für die Datenauswertung und Methodenentwicklung erwiesen. Neben den Antragstellern wird das System inzwischen von weiteren Nutzern erfolgreich eingesetzt. Im Verlauf des Projektes wurde in mehreren Strahlzeiten ein umfangreicher Satz an Beispielaufnahmen über einen breiten Bereich von Organismen aufgenommen. Ausgewählte Proben wurden als Vorlage für die Methodenentwicklung segmentiert und klassifiziert. Im Verlauf des Projektes konnte die Zahl der Aufnahmen innerhalb einer Messwoche auf zunächst 400 und zum Schluss sogar auf bis zu 1000 drastisch erhöht werden.
Mit ASTOR ist es gelungen, eine durchgehende Analyseumgebung aufzubauen, und damit den nächsten Schritt im Ausbau solcher Experimentiereinrichtungen aufzuzeigen. Für die gewählte Anwendung, die Funktionsmorphologie, ist es erstmals möglich, auch quantitative Reihenuntersuchungen an kleinen Organismen durchzuführen. Die Auswertesystematik ist nicht auf diese Anwendung beschränkt, sondern vielmehr ein generelles Beispiel für datenintensive Experimente. Das ebenfalls von der BMBF-Verbundforschung geförderte Projekt NOVA setzt die begonnenen Aktivitäten in diesem Sinne fort und beabsichtigt durch synergistische Zusammenarbeit einen offenen Datenkatalog für eine gesamte Community zu erstellen.
Steinmann J.L., Blomley E., Brosi M., Brundermann E., Caselle M., Hesler J.L., Hiller N., Kehrer B., Mathis Y.-L., Nasse M.J., Raasch J., Schedler M., Schonfeldt P., Schuh M., Schwarz M., Siegel M., Smale N., Weber M., Muller A.-S.
in Physical Review Letters, 117 (2016), 174802. DOI:10.1103/PhysRevLett.117.174802
© 2016 American Physical Society. Using arbitrary periodic pulse patterns we show the enhancement of specific frequencies in a frequency comb. The envelope of a regular frequency comb originates from equally spaced, identical pulses and mimics the single pulse spectrum. We investigated spectra originating from the periodic emission of pulse trains with gaps and individual pulse heights, which are commonly observed, for example, at high-repetition-rate free electron lasers, high power lasers, and synchrotrons. The ANKA synchrotron light source was filled with defined patterns of short electron bunches generating coherent synchrotron radiation in the terahertz range. We resolved the intensities of the frequency comb around 0.258 THz using the heterodyne mixing spectroscopy with a resolution of down to 1 Hz and provide a comprehensive theoretical description. Adjusting the electron’s revolution frequency, a gapless spectrum can be recorded, improving the resolution by up to 7 and 5 orders of magnitude compared to FTIR and recent heterodyne measurements, respectively. The results imply avenues to optimize and increase the signal-to-noise ratio of specific frequencies in the emitted synchrotron radiation spectrum to enable novel ultrahigh resolution spectroscopy and metrology applications from the terahertz to the x-ray region.
Master Thesis, Faculty for Physics, Karlsruhe Institute of Technology, 2016.
In this work we present an evaluation of GPUs as a possible L1 Track Trigger for the High Luminosity LHC, effective after Long Shutdown 3 around 2025.
The novelty lies in presenting an implementation based on calculations done entirely in software, in contrast to currently discussed solutions relying on specialized hardware, such as FPGAs and ASICs.
Our solution relies on using GPUs for the calculation instead, offering floating point calculations as well as flexibility and adaptability. Normally the involved data transfer latencies make GPUs unfeasible for use in low latency environments. To this end we use a data transfer scheme based on RDMA technology. This mitigates the normally involved overheads.
We based our efforts on previous work by the collaboration of the KIT and the English track trigger group [An FPGA-based track finder for the L1 trigger of the CMS experiment at the high luminosity LHC] whose algorithm was implemented on FPGAs.
In addition to the Hough transformation used regularly, we present our own version of the algorithm based on a hexagonal layout of the binned parameter space. With comparable computational latency and workload, the approach produces significantly less fake track candidates than the traditionally used method. This comes at a cost of efficiency of around 1 percent.
This work focuses on the track finding part of the proposed L1 Track Trigger and only looks at the result of a least squares fit to make an estimate of the performance of said seeding step. We furthermore present our results in terms of overall latency of this novel approach.
While not yet competitive, our implementation has surpassed initial expectations and are on the same order of magnitude as the FPGA approach in terms of latencies. Some caveats apply at the moment. Ultimately, more recent technology, not yet available to us in the current discussion will have to be tested and benchmarked to come to a more complete assessment of the feasibility of GPUs as a means of track triggering
at the High-Luminosity-LHC’s CMS experiment.
First assessor: Prof. Dr. Marc Weber
Second assessor: Prof. Dr. Ulrich Husemann
Supervised by Dipl.-Inform. Timo Dritschler