Publications of the computing group for ultrafast imaging
Cavadini P., Weinhold H., Tonsmann M., Chilingaryan S., Kopmann A., Lewkowicz A., Miao C., Scharfer P., Schabel W.
in Experiments in Fluids, 59 (2018), 61. DOI:10.1007/s00348-017-2482-z
© 2018, Springer-Verlag GmbH Germany, part of Springer Nature. To understand the effects of inhomogeneous drying on the quality of polymer coatings, an experimental setup to resolve the occurring flow field throughout the drying film has been developed. Deconvolution microscopy is used to analyze the flow field in 3D and time. Since the dimension of the spatial component in the direction of the line-of-sight is limited compared to the lateral components, a multi-focal approach is used. Here, the beam of light is equally distributed on up to five cameras using cubic beam splitters. Adding a meniscus lens between each pair of camera and beam splitter and setting different distances between each camera and its meniscus lens creates multi-focality and allows one to increase the depth of the observed volume. Resolving the spatial component in the line-of-sight direction is based on analyzing the point spread function. The analysis of the PSF is computational expensive and introduces a high complexity compared to traditional particle image velocimetry approaches. A new algorithm tailored to the parallel computing architecture of recent graphics processing units has been developed. The algorithm is able to process typical images in less than a second and has further potential to realize online analysis in the future. As a prove of principle, the flow fields occurring in thin polymer solutions drying at ambient conditions and at boundary conditions that force inhomogeneous drying are presented.
Farago T., Mikulik P., Ershov A., Vogelgesang M., Hanschke D., Baumbach T.
in Journal of Synchrotron Radiation, 24 (2017) 1283-1295. DOI:10.1107/S1600577517012255
© International Union of Crystallography, 2017. An open-source framework for conducting a broad range of virtual X-ray imaging experiments, syris, is presented. The simulated wavefield created by a source propagates through an arbitrary number of objects until it reaches a detector. The objects in the light path and the source are time-dependent, which enables simulations of dynamic experiments, e.g. four-dimensional time-resolved tomography and laminography. The high-level interface of syris is written in Python and its modularity makes the framework very flexible. The computationally demanding parts behind this interface are implemented in OpenCL, which enables fast calculations on modern graphics processing units. The combination of flexibility and speed opens new possibilities for studying novel imaging methods and systematic search of optimal combinations of measurement conditions and data processing parameters. This can help to increase the success rates and efficiency of valuable synchrotron beam time. To demonstrate the capabilities of the framework, various experiments have been simulated and compared with real data. To show the use case of measurement and data processing parameter optimization based on simulation, a virtual counterpart of a high-speed radiography experiment was created and the simulated data were used to select a suitable motion estimation algorithm; one of its parameters was optimized in order to achieve the best motion estimation accuracy when applied on the real data. syris was also used to simulate tomographic data sets under various imaging conditions which impact the tomographic reconstruction accuracy, and it is shown how the accuracy may guide the selection of imaging conditions for particular use cases.The flexible and efficient framework syris is presented and its capabilities for the simulation of four-dimensional X-ray imaging experiments are demonstrated by two exemplary applications.
Kopmann A., Chilingaryan S., Vogelgesang M., Dritschler T., Shkarin A., Shkarin R., Dos Santos Rolo T., Farago T., Van De Kamp T., Balzer M., Caselle M., Weber M., Baumbach T.
in 2016 IEEE Nuclear Science Symposium, Medical Imaging Conference and Room-Temperature Semiconductor Detector Workshop, NSS/MIC/RTSD 2016, 2017-January (2017), 8069895. DOI:10.1109/NSSMIC.2016.8069895
© 2016 IEEE. New imaging stations aim for high spatial and temporal resolution and are characterized by ever increasing sampling rates and demanding data processing workflows. Key to successful imaging experiments is to open up high-performance computing resources. This includes carefully selected components for computing hardware and development of advanced imaging algorithms optimized for efficient use of parallel processor architectures. We present the novel UFO computing platform for online data processing for imaging experiments and image-based feedback. The platform handles the full data life cycle from the X-ray detector to long-term data archives. Core components of this system are an FPGA platform for ultra-fast data acquisition, the GPU-based UFO image processing framework, and the fast control system “Concert”. Reconstruction algorithms implemented in the UFO framework are optimized for the latest GPU architectures and provide a reconstruction throughput in the GB/s-range. The control system “Concert” integrates high-speed computing nodes and fast beamline devices and thus enables image-based control loops and advanced workflow automation for efficient beam time usage. Low latencies are ensured by direct communication between FPGA and GPUs using AMDs DirectGMA technology. Time resolved tomography is supported by cutting edge regularization methods for high quality reconstructions with a reduced number of projections. The new infrastructure at ANKA has dramatically accelerated tomography from hours to second and resulted in new application fields, like high-throughput tomography, pump-probe radiography and stroboscopic tomography. Ultra-fast X-ray cine-tomography for the first time allows one to observe internal dynamics of moving millimeter-sized objects in real-time.
Schmelzle S., Heethoff M., Heuveline V., Losel P., Becker J., Beckmann F., Schluenzen F., Hammel J.U., Kopmann A., Mexner W., Vogelgesang M., Jerome N.T., Betz O., Beutel R., Wipfler B., Blanke A., Harzsch S., Hornig M., Baumbach T., Van De Kamp T.
in Proceedings of SPIE – The International Society for Optical Engineering, 10391 (2017), 103910P. DOI:10.1117/12.2275959
© 2017 SPIE. Beamtime and resulting SRμCT data are a valuable resource for researchers of a broad scientific community in life sciences. Most research groups, however, are only interested in a specific organ and use only a fraction of their data. The rest of the data usually remains untapped. By using a new collaborative approach, the NOVA project (Network for Online Visualization and synergistic Analysis of tomographic data) aims to demonstrate, that more efficient use of the valuable beam time is possible by coordinated research on different organ systems. The biological partners in the project cover different scientific aspects and thus serve as model community for the collaborative approach. As proof of principle, different aspects of insect head morphology will be investigated (e.g., biomechanics of the mouthparts, and neurobiology with the topology of sensory areas). This effort is accomplished by development of advanced analysis tools for the ever-increasing quantity of tomographic datasets. In the preceding project ASTOR, we already successfully demonstrated considerable progress in semi-automatic segmentation and classification of internal structures. Further improvement of these methods is essential for an efficient use of beam time and will be refined in the current NOVAproject. Significant enhancements are also planned at PETRA III beamline p05 to provide all possible contrast modalities in x-ray imaging optimized to biological samples, on the reconstruction algorithms, and the tools for subsequent analyses and management of the data. All improvements made on key technologies within this project will in the long-term be equally beneficial for all users of tomography instrumentations.
Mohr H., Dritschler T., Ardila L.E., Balzer M., Caselle M., Chilingaryan S., Kopmann A., Rota L., Schuh T., Vogelgesang M., Weber M.
in Journal of Instrumentation, 12 (2017), C04019. DOI:10.1088/1748-0221/12/04/C04019
© 2017 IOP Publishing Ltd and Sissa Medialab srl. In this work, we investigate the use of GPUs as a way of realizing a low-latency, high-throughput track trigger, using CMS as a showcase example. The CMS detector at the Large Hadron Collider (LHC) will undergo a major upgrade after the long shutdown from 2024 to 2026 when it will enter the high luminosity era. During this upgrade, the silicon tracker will have to be completely replaced. In the High Luminosity operation mode, luminosities of 5-7 × 1034 cm-2s-1 and pileups averaging at 140 events, with a maximum of up to 200 events, will be reached. These changes will require a major update of the triggering system. The demonstrated systems rely on dedicated hardware such as associative memory ASICs and FPGAs. We investigate the use of GPUs as an alternative way of realizing the requirements of the L1 track trigger. To this end we implemeted a Hough transformation track finding step on GPUs and established a low-latency RDMA connection using the PCIe bus. To showcase the benefits of floating point operations, made possible by the use of GPUs, we present a modified algorithm. It uses hexagonal bins for the parameter space and leads to a more truthful representation of the possible track parameters of the individual hits in Hough space. This leads to fewer duplicate candidates and reduces fake track candidates compared to the regular approach. With data-transfer latencies of 2 μs and processing times for the Hough transformation as low as 3.6 μs, we can show that latencies are not as critical as expected. However, computing throughput proves to be challenging due to hardware limitations.
Kaever P., Balzer M., Kopmann A., Zimmer M., Rongen H.
in Journal of Instrumentation, 12 (2017), C04004. DOI:10.1088/1748-0221/12/04/C04004
© 2017 IOP Publishing Ltd and Sissa Medialab srl. Various centres of the German Helmholtz Association (HGF) started in 2012 to develop a modular data acquisition (DAQ) platform, covering the entire range from detector readout to data transfer into parallel computing environments. This platform integrates generic hardware components like the multi-purpose HGF-Advanced Mezzanine Card or a smart scientific camera framework, adding user value with Linux drivers and board support packages. Technically the scope comprises the DAQ-chain from FPGA-modules to computing servers, notably frontend-electronics-interfaces, microcontrollers and GPUs with their software plus high-performance data transmission links. The core idea is a generic and component-based approach, enabling the implementation of specific experiment requirements with low effort. This so called DTS-platform will support standards like MTCA.4 in hard- and software to ensure compatibility with commercial components. Its capability to deploy on other crate standards or FPGA-boards with PCI express or Ethernet interfaces remains an essential feature. Competences of the participating centres are coordinated in order to provide a solid technological basis for both research topics in the Helmholtz Programme “Matter and Technology”: “Detector Technology and Systems” and “Accelerator Research and Development”. The DTS-platform aims at reducing costs and development time and will ensure access to latest technologies for the collaboration. Due to its flexible approach, it has the potential to be applied in other scientific programs.
Caselle M., Perez L.E.A., Balzer M., Dritschler T., Kopmann A., Mohr H., Rota L., Vogelgesang M., Weber M.
in Journal of Instrumentation, 12 (2017), C03015. DOI:10.1088/1748-0221/12/03/C03015
© 2017 IOP Publishing Ltd and Sissa Medialab srl. Modern data acquisition and trigger systems require a throughput of several GB/s and latencies of the order of microseconds. To satisfy such requirements, a heterogeneous readout system based on FPGA readout cards and GPU-based computing nodes coupled by InfiniBand has been developed. The incoming data from the back-end electronics is delivered directly into the internal memory of GPUs through a dedicated peer-to-peer PCIe communication. High performance DMA engines have been developed for direct communication between FPGAs and GPUs using “DirectGMA (AMD)” and “GPUDirect (NVIDIA)” technologies. The proposed infrastructure is a candidate for future generations of event building clusters, high-level trigger filter farms and low-level trigger system. In this paper the heterogeneous FPGA-GPU architecture will be presented and its performance be discussed.
Caselle M., Perez L.E.A., Balzer M., Kopmann A., Rota L., Weber M., Brosi M., Steinmann J., Brundermann E., Muller A.-S.
in Journal of Instrumentation, 12 (2017), C01040. DOI:10.1088/1748-0221/12/01/C01040
© 2017 IOP Publishing Ltd and Sissa Medialab srl. This paper presents a novel data acquisition system for continuous sampling of ultra-short pulses generated by terahertz (THz) detectors. Karlsruhe Pulse Taking Ultra-fast Readout Electronics (KAPTURE) is able to digitize pulse shapes with a sampling time down to 3 ps and pulse repetition rates up to 500 MHz. KAPTURE has been integrated as a permanent diagnostic device at ANKA and is used for investigating the emitted coherent synchrotron radiation in the THz range. A second version of KAPTURE has been developed to improve the performance and flexibility. The new version offers a better sampling accuracy for a pulse repetition rate up to 2 GHz. The higher data rate produced by the sampling system is processed in real-time by a heterogeneous FPGA and GPU architecture operating up to 6.5 GB/s continuously. Results in accelerator physics will be reported and the new design of KAPTURE be discussed.
Lautner S., Lenz C., Hammel J., Moosmann J., Kuhn M., Caselle M., Vogelgesang M., Kopmann A., Beckmann F.
in Proceedings of SPIE – The International Society for Optical Engineering, 10391 (2017), 1039118. DOI:10.1117/12.2287221
© 2017 SPIE. Water transport from roots to shoots is a vital necessity in trees in order to sustain their photosynthetic activity and, hence, their physiological activity. The vascular tissue in charge is the woody body of root, stem and branches. In gymnosperm trees, like spruce trees (Picea abies (L.) Karst.), vascular tissue consists of tracheids: elongated, protoplast- free cells with a rigid cell wall that allow for axial water transport via their lumina. In order to analyze the over-all water transport capacity within one growth ring, time-consuming light microscopy analysis of the woody sample still is the conventional approach for calculating tracheid lumen area. In our investigations at the Imaging Beamline (IBL) operated by the Helmholtz-Zentrum Geesthacht (HZG) at PETRA III storage ring of the Deutsches Elektronen-Synchrotron DESY, Hamburg, we applied SRμCT on small wood samples of spruce trees in order to visualize and analyze size and formation of xylem elements and their respective lumina. The selected high-resolution phase-contrast technique makes full use of the novel 20 MPixel CMOS area detector developed within the cooperation of HZG and the Karlsruhe data by light microscopy analysis and, hence, prove, that μCT is a most appropriate method to gain valid information on xylem cell structure and tree water transport capacity.
Bergmann T., Balzer M., Bormann D., Chilingaryan S.A., Eitel K., Kleifges M., Kopmann A., Kozlov V., Menshikov A., Siebenborn B., Tcherniakhovski D., Vogelgesang M., Weber M.
in 2015 IEEE Nuclear Science Symposium and Medical Imaging Conference, NSS/MIC 2015 (2016), 7581841. DOI:10.1109/NSSMIC.2015.7581841
© 2015 IEEE. The EDELWEISS experiment, located in the underground laboratory LSM (France), is one of the leading experiments using cryogenic germanium (Ge) detectors for a direct search for dark matter. For the EDELWEISS-III phase, a new scalable data acquisition (DAQ) system was designed and built, based on the ‘IPE4 DAQ system’, which has already been used for several experiments in astroparticle physics.