Publications of the IPE expert group for embedded parallel systems
Kopmann A., Chilingaryan S., Vogelgesang M., Dritschler T., Shkarin A., Shkarin R., Dos Santos Rolo T., Farago T., Van De Kamp T., Balzer M., Caselle M., Weber M., Baumbach T.
in 2016 IEEE Nuclear Science Symposium, Medical Imaging Conference and Room-Temperature Semiconductor Detector Workshop, NSS/MIC/RTSD 2016, 2017-January (2017), 8069895. DOI:10.1109/NSSMIC.2016.8069895
© 2016 IEEE. New imaging stations aim for high spatial and temporal resolution and are characterized by ever-increasing sampling rates and demanding data processing workflows. Key to successful imaging experiments is to open up high-performance computing resources. This includes carefully selected components for computing hardware and development of advanced imaging algorithms optimized for efficient use of parallel processor architectures. We present the novel UFO computing platform for online data processing for imaging experiments and image-based feedback. The platform handles the full data life cycle from the X-ray detector to long-term data archives. Core components of this system are an FPGA platform for ultra-fast data acquisition, the GPU-based UFO image processing framework, and the fast control system “Concert”. Reconstruction algorithms implemented in the UFO framework are optimized for the latest GPU architectures and provide a reconstruction throughput in the GB/s range. The control system “Concert” integrates high-speed computing nodes and fast beamline devices and thus enables image-based control loops and advanced workflow automation for efficient beam time usage. Low latencies are ensured by direct communication between FPGA and GPUs using AMD's DirectGMA technology. Time-resolved tomography is supported by cutting-edge regularization methods for high-quality reconstructions with a reduced number of projections. The new infrastructure at ANKA has dramatically accelerated tomography from hours to seconds and resulted in new application fields, like high-throughput tomography, pump-probe radiography and stroboscopic tomography. Ultra-fast X-ray cine-tomography for the first time allows one to observe internal dynamics of moving millimeter-sized objects in real time.
Aggleton R. et al.
in 2017 27th International Conference on Field Programmable Logic and Applications, FPL 2017 (2017), 8056825. DOI:10.23919/FPL.2017.8056825
© 2017 Ghent University. The Compact Muon Solenoid (CMS) experiment at CERN is scheduled for a major upgrade in the next decade in order to meet the demands of the new High Luminosity Large Hadron Collider. Amongst others, a new tracking system is under development including an outer tracker capable of rejecting low transverse momentum particles by looking at the coincidences of hits (stubs) in two closely spaced sensor layers in the same tracker module. Accepted stubs are transmitted off-detector for further processing at 40 MHz. In order to maintain the Level-1 trigger rate at 750 kHz under the increased luminosity, tracker data need to be included in the decision-making process. For this purpose, a system architecture has to be developed that will be able to identify particles with transverse momentum above 3 GeV/c by building tracks out of stubs, while achieving an overall processing latency of at most 4 μs. Targeting these requirements, the current paper presents an FPGA-based track finding architecture that identifies track candidates in real time and bases its functionality on a fully time-multiplexed approach. As a proof of concept, a hardware system has been assembled targeting the MP7 MicroTCA processing card that features a Xilinx Virtex-7 FPGA, demonstrating a realistic slice of the track finder. The paper discusses the algorithms’ implementation and the efficient utilisation of the available FPGA resources, outlines the system architecture, and presents some of the hardware demonstrator results.
Adam W. et al.
in Journal of Instrumentation, 12 (2017), P06018. DOI:10.1088/1748-0221/12/06/P06018
© 2017 CERN for the benefit of the CMS collaboration. The upgrade of the LHC to the High-Luminosity LHC (HL-LHC) is expected to increase the LHC design luminosity by an order of magnitude. This will require silicon tracking detectors with a significantly higher radiation hardness. The CMS Tracker Collaboration has conducted an irradiation and measurement campaign to identify suitable silicon sensor materials and strip designs for the future outer tracker at the CMS experiment. Based on these results, the collaboration has chosen to use n-in-p type silicon sensors and focus further investigations on the optimization of that sensor type. This paper describes the main measurement results and conclusions that motivated this decision.
Gentsos C., Fedi G., Magazzu G., Magalotti D., Modak A., Storchi L., Palla F., Bilei G.M., Biesuz N., Chowdhury S.R., Crescioli F., Checcucci B., Tcherniakhovski D., Galbit G.C., Baulieu G., Balzer M.N., Sander O., Viret S., Servoli L., Nikolaidis S.
in 2017 6th International Conference on Modern Circuits and Systems Technologies, MOCAST 2017 (2017), 7937676. DOI:10.1109/MOCAST.2017.7937676
© 2017 IEEE. The increase of the luminosity in the High Luminosity upgrade of the CERN Large Hadron Collider (HL-LHC) will require the use of Tracker information in the evaluation of the Level-1 trigger in order to keep the trigger rate acceptable (i.e. <1 MHz). In order to extract the track information within the latency constraints (<5 μs), a custom real-time system is necessary. We developed a prototype of the main building block of this system, the Pattern Recognition Mezzanine (PRM), which combines custom Associative Memory ASICs with modern FPGA devices. The architecture, functionality and test results of the PRM are described in the present work.
Mohr H., Dritschler T., Ardila L.E., Balzer M., Caselle M., Chilingaryan S., Kopmann A., Rota L., Schuh T., Vogelgesang M., Weber M.
in Journal of Instrumentation, 12 (2017), C04019. DOI:10.1088/1748-0221/12/04/C04019
© 2017 IOP Publishing Ltd and Sissa Medialab srl. In this work, we investigate the use of GPUs as a way of realizing a low-latency, high-throughput track trigger, using CMS as a showcase example. The CMS detector at the Large Hadron Collider (LHC) will undergo a major upgrade after the long shutdown from 2024 to 2026 when it will enter the high luminosity era. During this upgrade, the silicon tracker will have to be completely replaced. In the High Luminosity operation mode, luminosities of 5-7 × 10³⁴ cm⁻² s⁻¹ and pileups averaging at 140 events, with a maximum of up to 200 events, will be reached. These changes will require a major update of the triggering system. The demonstrated systems rely on dedicated hardware such as associative memory ASICs and FPGAs. We investigate the use of GPUs as an alternative way of realizing the requirements of the L1 track trigger. To this end we implemented a Hough transformation track finding step on GPUs and established a low-latency RDMA connection using the PCIe bus. To showcase the benefits of floating point operations, made possible by the use of GPUs, we present a modified algorithm. It uses hexagonal bins for the parameter space and leads to a more truthful representation of the possible track parameters of the individual hits in Hough space. This leads to fewer duplicate candidates and reduces fake track candidates compared to the regular approach. With data-transfer latencies of 2 μs and processing times for the Hough transformation as low as 3.6 μs, we can show that latencies are not as critical as expected. However, computing throughput proves to be challenging due to hardware limitations.
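To illustrate the kind of Hough transformation track finding step the abstract above refers to, the following is a minimal CPU sketch in Python. It is not the authors' GPU implementation and it uses ordinary rectangular bins, not the hexagonal bins described in the paper. The track model (phi = phi0 + curv * r in normalized, dimensionless units), the function names `hough_accumulate` and `find_candidates`, and all parameter values are assumptions made for illustration only.

```python
import math


def hough_accumulate(hits, n_curv=32, n_phi0=32,
                     curv_range=(-1.0, 1.0), phi0_range=(0.0, 1.0)):
    """Fill a rectangular Hough accumulator from hits given as (r, phi) pairs.

    Assumed track model (illustrative, normalized units): phi = phi0 + curv * r.
    Each hit therefore votes along the line phi0 = phi - curv * r
    in the (curv, phi0) parameter space.
    """
    curv_min, curv_max = curv_range
    phi0_min, phi0_max = phi0_range
    curv_step = (curv_max - curv_min) / n_curv
    phi0_step = (phi0_max - phi0_min) / n_phi0

    # acc[i][j] counts the hits compatible with curvature bin i and phi0 bin j.
    acc = [[0] * n_phi0 for _ in range(n_curv)]
    for r, phi in hits:
        for i in range(n_curv):
            curv = curv_min + (i + 0.5) * curv_step  # centre of curvature bin i
            phi0 = phi - curv * r
            j = math.floor((phi0 - phi0_min) / phi0_step)
            if 0 <= j < n_phi0:
                acc[i][j] += 1
    return acc


def find_candidates(acc, threshold):
    """Return the (curv_bin, phi0_bin) cells holding at least `threshold` votes."""
    return [(i, j)
            for i, row in enumerate(acc)
            for j, votes in enumerate(row)
            if votes >= threshold]


# Five hits generated from one hypothetical track (curv = 0.28125, phi0 = 0.328125).
hits = [(r, 0.328125 + 0.28125 * r) for r in (0.2, 0.4, 0.6, 0.8, 1.0)]
candidates = find_candidates(hough_accumulate(hits), threshold=5)
```

In this sketch every hit is processed independently of the others, which is the property that makes the accumulation step a natural fit for parallel hardware; how the work is actually distributed on a GPU is not described in the abstract and is not modelled here.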
Kaever P., Balzer M., Kopmann A., Zimmer M., Rongen H.
in Journal of Instrumentation, 12 (2017), C04004. DOI:10.1088/1748-0221/12/04/C04004
© 2017 IOP Publishing Ltd and Sissa Medialab srl. Various centres of the German Helmholtz Association (HGF) started in 2012 to develop a modular data acquisition (DAQ) platform, covering the entire range from detector readout to data transfer into parallel computing environments. This platform integrates generic hardware components like the multi-purpose HGF-Advanced Mezzanine Card or a smart scientific camera framework, adding user value with Linux drivers and board support packages. Technically, the scope comprises the DAQ chain from FPGA modules to computing servers, notably frontend electronics interfaces, microcontrollers and GPUs with their software plus high-performance data transmission links. The core idea is a generic and component-based approach, enabling the implementation of specific experiment requirements with low effort. This so-called DTS-platform will support standards like MTCA.4 in hard- and software to ensure compatibility with commercial components. Its capability to deploy on other crate standards or FPGA boards with PCI Express or Ethernet interfaces remains an essential feature. Competences of the participating centres are coordinated in order to provide a solid technological basis for both research topics in the Helmholtz Programme “Matter and Technology”: “Detector Technology and Systems” and “Accelerator Research and Development”. The DTS-platform aims at reducing costs and development time and will ensure access to the latest technologies for the collaboration. Due to its flexible approach, it has the potential to be applied in other scientific programs.
Caselle M., Perez L.E.A., Balzer M., Dritschler T., Kopmann A., Mohr H., Rota L., Vogelgesang M., Weber M.
in Journal of Instrumentation, 12 (2017), C03015. DOI:10.1088/1748-0221/12/03/C03015
© 2017 IOP Publishing Ltd and Sissa Medialab srl. Modern data acquisition and trigger systems require a throughput of several GB/s and latencies of the order of microseconds. To satisfy such requirements, a heterogeneous readout system based on FPGA readout cards and GPU-based computing nodes coupled by InfiniBand has been developed. The incoming data from the back-end electronics is delivered directly into the internal memory of GPUs through a dedicated peer-to-peer PCIe communication. High-performance DMA engines have been developed for direct communication between FPGAs and GPUs using “DirectGMA (AMD)” and “GPUDirect (NVIDIA)” technologies. The proposed infrastructure is a candidate for future generations of event building clusters, high-level trigger filter farms and low-level trigger systems. In this paper the heterogeneous FPGA-GPU architecture is presented and its performance discussed.
Caselle M., Perez L.E.A., Balzer M., Kopmann A., Rota L., Weber M., Brosi M., Steinmann J., Brundermann E., Muller A.-S.
in Journal of Instrumentation, 12 (2017), C01040. DOI:10.1088/1748-0221/12/01/C01040
© 2017 IOP Publishing Ltd and Sissa Medialab srl. This paper presents a novel data acquisition system for continuous sampling of ultra-short pulses generated by terahertz (THz) detectors. Karlsruhe Pulse Taking Ultra-fast Readout Electronics (KAPTURE) is able to digitize pulse shapes with a sampling time down to 3 ps and pulse repetition rates up to 500 MHz. KAPTURE has been integrated as a permanent diagnostic device at ANKA and is used for investigating the emitted coherent synchrotron radiation in the THz range. A second version of KAPTURE has been developed to improve the performance and flexibility. The new version offers a better sampling accuracy for a pulse repetition rate up to 2 GHz. The higher data rate produced by the sampling system is processed in real-time by a heterogeneous FPGA and GPU architecture operating up to 6.5 GB/s continuously. Results in accelerator physics will be reported and the new design of KAPTURE be discussed.
Lautner S., Lenz C., Hammel J., Moosmann J., Kuhn M., Caselle M., Vogelgesang M., Kopmann A., Beckmann F.
in Proceedings of SPIE – The International Society for Optical Engineering, 10391 (2017), 1039118. DOI:10.1117/12.2287221
© 2017 SPIE. Water transport from roots to shoots is a vital necessity in trees in order to sustain their photosynthetic activity and, hence, their physiological activity. The vascular tissue in charge is the woody body of root, stem and branches. In gymnosperm trees, like spruce trees (Picea abies (L.) Karst.), vascular tissue consists of tracheids: elongated, protoplast-free cells with a rigid cell wall that allow for axial water transport via their lumina. In order to analyze the overall water transport capacity within one growth ring, time-consuming light microscopy analysis of the woody sample is still the conventional approach for calculating tracheid lumen area. In our investigations at the Imaging Beamline (IBL) operated by the Helmholtz-Zentrum Geesthacht (HZG) at PETRA III storage ring of the Deutsches Elektronen-Synchrotron DESY, Hamburg, we applied SRμCT on small wood samples of spruce trees in order to visualize and analyze size and formation of xylem elements and their respective lumina. The selected high-resolution phase-contrast technique makes full use of the novel 20 MPixel CMOS area detector developed within the cooperation of HZG and the Karlsruhe […] data by light microscopy analysis and, hence, prove that μCT is a most appropriate method to gain valid information on xylem cell structure and tree water transport capacity.
Bergmann T., Balzer M., Hopp T., Van De Kamp T., Kopmann A., Jerome N.T., Zapf M.
in VISIGRAPP 2017 – Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, 3 (2017) 330-334.
Copyright © 2017 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved. The computer gaming industry is traditionally the driving force in the development of computer visualization hardware and software. This year, affordable and high-quality virtual reality headsets became available and the science community is eager to benefit from them. This paper describes first experiences in adapting the new hardware for three different visualization use cases. In all three examples, existing visualization pipelines were extended by virtual reality technology. We describe our approach, based on the HTC Vive VR headset, the open source software Blender and the Unreal Engine 4 game engine. The use cases are from three different fields: large-scale particle physics research, X-ray imaging for entomology research and medical imaging with ultrasound computer tomography. Finally, we discuss benefits and limits of the current virtual reality technology and present an outlook to future developments.