Gentsos C., Fedi G., Magazzu G., Magalotti D., Modak A., Storchi L., Palla F., Bilei G.M., Biesuz N., Chowdhury S.R., Crescioli F., Checcucci B., Tcherniakhovski D., Galbit G.C., Baulieu G., Balzer M.N., Sander O., Viret S., Servoli L., Nikolaidis S.
in 2017 6th International Conference on Modern Circuits and Systems Technologies, MOCAST 2017 (2017), 7937676. DOI:10.1109/MOCAST.2017.7937676
© 2017 IEEE. The increase of the luminosity in the High Luminosity upgrade of the CERN Large Hadron Collider (HL-LHC) will require the use of Tracker information in the evaluation of the Level-1 trigger in order to keep the trigger rate acceptable (i.e. < 1 MHz). In order to extract the track information within the latency constraints (< 5 μs), a custom real-time system is necessary. We developed a prototype of the main building block of this system, the Pattern Recognition Mezzanine (PRM), which combines custom Associative Memory ASICs with modern FPGA devices. The architecture, functionality and test results of the PRM are described in the present work.
Mohr H., Dritschler T., Ardila L.E., Balzer M., Caselle M., Chilingaryan S., Kopmann A., Rota L., Schuh T., Vogelgesang M., Weber M.
in Journal of Instrumentation, 12 (2017), C04019. DOI:10.1088/1748-0221/12/04/C04019
© 2017 IOP Publishing Ltd and Sissa Medialab srl. In this work, we investigate the use of GPUs as a way of realizing a low-latency, high-throughput track trigger, using CMS as a showcase example. The CMS detector at the Large Hadron Collider (LHC) will undergo a major upgrade after the long shutdown from 2024 to 2026 when it will enter the high luminosity era. During this upgrade, the silicon tracker will have to be completely replaced. In the High Luminosity operation mode, luminosities of 5-7 × 10³⁴ cm⁻² s⁻¹ and pileups averaging at 140 events, with a maximum of up to 200 events, will be reached. These changes will require a major update of the triggering system. The demonstrated systems rely on dedicated hardware such as associative memory ASICs and FPGAs. We investigate the use of GPUs as an alternative way of realizing the requirements of the L1 track trigger. To this end we implemented a Hough transformation track finding step on GPUs and established a low-latency RDMA connection using the PCIe bus. To showcase the benefits of floating point operations, made possible by the use of GPUs, we present a modified algorithm. It uses hexagonal bins for the parameter space and leads to a more truthful representation of the possible track parameters of the individual hits in Hough space. This leads to fewer duplicate candidates and reduces fake track candidates compared to the regular approach. With data-transfer latencies of 2 μs and processing times for the Hough transformation as low as 3.6 μs, we can show that latencies are not as critical as expected. However, computing throughput proves to be challenging due to hardware limitations.
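As a rough illustration of the Hough transformation step mentioned in the abstract above, the following minimal sketch votes in a regular rectangular grid (the "regular approach", not the hexagonal binning of the paper). It assumes the common small-angle parametrization in which a hit (r, φ) maps to the line φ0 = φ + k·r·(q/pT) in parameter space. The constant `k`, the function names and the grid layout are hypothetical choices for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def hough_accumulate(hits, k, qpt_bins, phi0_min, phi0_max, n_phi0):
    """Fill a rectangular (q/pT, phi0) accumulator from (r, phi) hits.

    Assumed model (not from the paper): phi0 = phi + k * r * (q/pT).
    Returns a Counter mapping (q/pT bin index, phi0 bin index) to votes.
    """
    acc = Counter()
    width = (phi0_max - phi0_min) / n_phi0
    for r, phi in hits:
        # Each hit votes once per q/pT bin, tracing a line in Hough space.
        for i, qpt in enumerate(qpt_bins):
            phi0 = phi + k * r * qpt
            j = math.floor((phi0 - phi0_min) / width)
            if 0 <= j < n_phi0:
                acc[(i, j)] += 1
    return acc


def find_candidates(acc, threshold):
    """Return the accumulator cells that collected at least `threshold` votes."""
    return sorted(cell for cell, votes in acc.items() if votes >= threshold)


# Toy data: five hits generated from one track with q/pT = 0.2, phi0 = 0.52.
K = 0.5  # hypothetical curvature constant
RADII = [0.2, 0.4, 0.6, 0.8, 1.0]
HITS = [(r, 0.52 - K * r * 0.2) for r in RADII]
QPT_BINS = [-0.4, -0.2, 0.0, 0.2, 0.4]

if __name__ == "__main__":
    accumulator = hough_accumulate(HITS, K, QPT_BINS, 0.0, 1.0, 20)
    print(find_candidates(accumulator, threshold=5))
```

Only the cell containing the true track parameters collects a vote from all five hits. Hits that fall near a bin edge can split their votes between neighbouring cells, which is one source of the duplicate candidates that the abstract says the hexagonal binning reduces.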
Kaever P., Balzer M., Kopmann A., Zimmer M., Rongen H.
in Journal of Instrumentation, 12 (2017), C04004. DOI:10.1088/1748-0221/12/04/C04004
© 2017 IOP Publishing Ltd and Sissa Medialab srl. Various centres of the German Helmholtz Association (HGF) started in 2012 to develop a modular data acquisition (DAQ) platform, covering the entire range from detector readout to data transfer into parallel computing environments. This platform integrates generic hardware components like the multi-purpose HGF-Advanced Mezzanine Card or a smart scientific camera framework, adding user value with Linux drivers and board support packages. Technically the scope comprises the DAQ-chain from FPGA-modules to computing servers, notably frontend-electronics-interfaces, microcontrollers and GPUs with their software plus high-performance data transmission links. The core idea is a generic and component-based approach, enabling the implementation of specific experiment requirements with low effort. This so-called DTS-platform will support standards like MTCA.4 in hard- and software to ensure compatibility with commercial components. Its capability to deploy on other crate standards or FPGA-boards with PCI Express or Ethernet interfaces remains an essential feature. Competences of the participating centres are coordinated in order to provide a solid technological basis for both research topics in the Helmholtz Programme “Matter and Technology”: “Detector Technology and Systems” and “Accelerator Research and Development”. The DTS-platform aims at reducing costs and development time and will ensure access to latest technologies for the collaboration. Due to its flexible approach, it has the potential to be applied in other scientific programs.
Caselle M., Perez L.E.A., Balzer M., Dritschler T., Kopmann A., Mohr H., Rota L., Vogelgesang M., Weber M.
in Journal of Instrumentation, 12 (2017), C03015. DOI:10.1088/1748-0221/12/03/C03015
© 2017 IOP Publishing Ltd and Sissa Medialab srl. Modern data acquisition and trigger systems require a throughput of several GB/s and latencies of the order of microseconds. To satisfy such requirements, a heterogeneous readout system based on FPGA readout cards and GPU-based computing nodes coupled by InfiniBand has been developed. The incoming data from the back-end electronics is delivered directly into the internal memory of GPUs through a dedicated peer-to-peer PCIe communication. High performance DMA engines have been developed for direct communication between FPGAs and GPUs using “DirectGMA (AMD)” and “GPUDirect (NVIDIA)” technologies. The proposed infrastructure is a candidate for future generations of event building clusters, high-level trigger filter farms and low-level trigger systems. In this paper the heterogeneous FPGA-GPU architecture is presented and its performance is discussed.
Caselle M., Perez L.E.A., Balzer M., Kopmann A., Rota L., Weber M., Brosi M., Steinmann J., Brundermann E., Muller A.-S.
in Journal of Instrumentation, 12 (2017), C01040. DOI:10.1088/1748-0221/12/01/C01040
© 2017 IOP Publishing Ltd and Sissa Medialab srl. This paper presents a novel data acquisition system for continuous sampling of ultra-short pulses generated by terahertz (THz) detectors. Karlsruhe Pulse Taking Ultra-fast Readout Electronics (KAPTURE) is able to digitize pulse shapes with a sampling time down to 3 ps and pulse repetition rates up to 500 MHz. KAPTURE has been integrated as a permanent diagnostic device at ANKA and is used for investigating the emitted coherent synchrotron radiation in the THz range. A second version of KAPTURE has been developed to improve the performance and flexibility. The new version offers a better sampling accuracy for a pulse repetition rate up to 2 GHz. The higher data rate produced by the sampling system is processed in real-time by a heterogeneous FPGA and GPU architecture operating up to 6.5 GB/s continuously. Results in accelerator physics are reported and the new design of KAPTURE is discussed.
Steinmann J.L., Blomley E., Brosi M., Brundermann E., Caselle M., Hesler J.L., Hiller N., Kehrer B., Mathis Y.-L., Nasse M.J., Raasch J., Schedler M., Schonfeldt P., Schuh M., Schwarz M., Siegel M., Smale N., Weber M., Muller A.-S.
in Physical Review Letters, 117 (2016), 174802. DOI:10.1103/PhysRevLett.117.174802
© 2016 American Physical Society. Using arbitrary periodic pulse patterns we show the enhancement of specific frequencies in a frequency comb. The envelope of a regular frequency comb originates from equally spaced, identical pulses and mimics the single pulse spectrum. We investigated spectra originating from the periodic emission of pulse trains with gaps and individual pulse heights, which are commonly observed, for example, at high-repetition-rate free electron lasers, high power lasers, and synchrotrons. The ANKA synchrotron light source was filled with defined patterns of short electron bunches generating coherent synchrotron radiation in the terahertz range. We resolved the intensities of the frequency comb around 0.258 THz using the heterodyne mixing spectroscopy with a resolution of down to 1 Hz and provide a comprehensive theoretical description. Adjusting the electron’s revolution frequency, a gapless spectrum can be recorded, improving the resolution by up to 7 and 5 orders of magnitude compared to FTIR and recent heterodyne measurements, respectively. The results imply avenues to optimize and increase the signal-to-noise ratio of specific frequencies in the emitted synchrotron radiation spectrum to enable novel ultrahigh resolution spectroscopy and metrology applications from the terahertz to the x-ray region.
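The link between a pulse pattern and the lines of its frequency comb, described in the abstract above, can be sketched with a plain discrete Fourier transform. The sketch assumes idealized, infinitely short pulses at equally spaced positions within one period; it is a textbook illustration rather than the theoretical description given in the paper, and the function name and the 8-slot patterns are hypothetical.

```python
import cmath


def comb_amplitudes(pattern):
    """Relative amplitude of each comb line for one period of pulse heights.

    pattern[m] is the height of the pulse in slot m (0 for a gap). Line k
    sits at k times the pattern repetition frequency. The lines repeat with
    period len(pattern) because the pulses are modelled as delta-like.
    """
    n = len(pattern)
    return [
        abs(sum(a * cmath.exp(-2j * cmath.pi * k * m / n)
                for m, a in enumerate(pattern)))
        for k in range(n)
    ]


# Hypothetical patterns: a uniform fill and a fill with a four-slot gap.
UNIFORM = [1.0] * 8
GAPPED = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]

if __name__ == "__main__":
    for name, pattern in (("uniform", UNIFORM), ("gapped", GAPPED)):
        lines = comb_amplitudes(pattern)
        print(name, [round(x, 3) for x in lines])
```

With the uniform fill only the line at k = 0 (and its periodic repeats at multiples of the slot frequency) survives. Introducing a gap redistributes power into intermediate lines, which is the sense in which a chosen pattern can enhance specific comb frequencies.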
Bergmann T., Balzer M., Bormann D., Chilingaryan S.A., Eitel K., Kleifges M., Kopmann A., Kozlov V., Menshikov A., Siebenborn B., Tcherniakhovski D., Vogelgesang M., Weber M.
in 2015 IEEE Nuclear Science Symposium and Medical Imaging Conference, NSS/MIC 2015 (2016), 7581841. DOI:10.1109/NSSMIC.2015.7581841
© 2015 IEEE. The EDELWEISS experiment, located in the underground laboratory LSM (France), is one of the leading experiments using cryogenic germanium (Ge) detectors for a direct search for dark matter. For the EDELWEISS-III phase, a new scalable data acquisition (DAQ) system was designed and built, based on the ‘IPE4 DAQ system’, which has already been used for several experiments in astroparticle physics.
Harbaum T., Seboui M., Balzer M., Becker J., Weber M.
in Proceedings – 24th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2016 (2016) 184-191, 7544775. DOI:10.1109/FCCM.2016.52
© 2016 IEEE. Modern high-energy physics experiments such as the Compact Muon Solenoid experiment at CERN produce an extraordinary amount of data every 25 ns. To handle a data rate of more than 50 Tbit/s a multi-level trigger system is required, which reduces the data rate. Due to the increased luminosity after the Phase-II-Upgrade of the LHC, the CMS tracking system has to be redesigned. The current trigger system is unable to handle the resulting amount of data after this upgrade. Because of the latency of a few microseconds the Level 1 Track Trigger has to be implemented in hardware. State-of-the-art pattern recognition filters the incoming data by template matching on ASICs with a content addressable memory architecture. An implementation on an FPGA, which replaces the content addressable memory of the ASIC, has not been possible so far. This paper presents a new approach to a content addressable memory architecture, which allows an implementation of an FPGA based design. By combining filtering and track finding on an FPGA design, there are many possibilities of adjusting the two algorithms to each other. There is more flexibility enabled by the FPGA architecture in contrast to the ASIC. The presented design minimizes the stored data by logic to optimally utilize the available resources of an FPGA. Furthermore, the developed design meets the strict timing constraints and possesses the required properties of the content addressable memory.
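The template matching performed by a content addressable memory, as described in the abstract above, can be modelled in software as a lookup from hit addresses to stored patterns. This is only a behavioural sketch under stated assumptions: the class name, the pattern layout (one coarse address per detector layer) and the majority threshold are hypothetical and do not reproduce the FPGA architecture of the paper. An inverted index stands in for the parallel comparison that the hardware performs.

```python
from collections import defaultdict


class PatternBank:
    """Behavioural model of a pattern bank matched by content, not by address."""

    def __init__(self, patterns):
        # Each pattern holds one coarse address per detector layer.
        self.patterns = [tuple(p) for p in patterns]
        # Inverted index: (layer, address) -> ids of patterns containing it.
        self.index = defaultdict(list)
        for pid, pattern in enumerate(self.patterns):
            for layer, address in enumerate(pattern):
                self.index[(layer, address)].append(pid)

    def match(self, hits, min_layers):
        """Return ids of patterns matched in at least `min_layers` layers.

        `hits` is an iterable of (layer, address) pairs from one event.
        """
        matched = defaultdict(set)  # pattern id -> set of matched layers
        for layer, address in hits:
            for pid in self.index.get((layer, address), ()):
                matched[pid].add(layer)
        return sorted(pid for pid, layers in matched.items()
                      if len(layers) >= min_layers)


# Hypothetical four-layer pattern bank and one event's hits.
BANK = PatternBank([(1, 2, 3, 4), (1, 2, 5, 6), (7, 8, 9, 10)])
EVENT = [(0, 1), (1, 2), (2, 3), (3, 4), (2, 9)]

if __name__ == "__main__":
    print(BANK.match(EVENT, min_layers=4))
    print(BANK.match(EVENT, min_layers=2))
```

Lowering `min_layers` accepts patterns with missing layers, which makes the filter more tolerant of detector inefficiency at the price of more matched patterns passed on to track finding.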
Amstutz C. et al.
in 2016 IEEE-NPSS Real Time Conference, RT 2016 (2016), 7543102. DOI:10.1109/RTC.2016.7543102
© 2016 IEEE. A new tracking system is under development for operation in the CMS experiment at the High Luminosity LHC. It includes an outer tracker which will construct stubs, built by correlating clusters in two closely spaced sensor layers for the rejection of hits from low transverse momentum tracks, and transmit them off-detector at 40 MHz. If tracker data is to contribute to keeping the Level-1 trigger rate at around 750 kHz under increased luminosity, a crucial component of the upgrade will be the ability to identify tracks with transverse momentum above 3 GeV/c by building tracks out of stubs. A concept for an FPGA-based track finder using a fully time-multiplexed architecture is presented, where track candidates are identified using a projective binning algorithm based on the Hough Transform. A hardware system based on the MP7 MicroTCA processing card has been assembled, demonstrating a realistic slice of the track finder in order to help gauge the performance and requirements for a full system. This paper outlines the system architecture and algorithms employed, highlighting some of the first results from the hardware demonstrator and discusses the prospects and performance of the completed track finder.
Rota L., Balzer M., Caselle M., Kudella S., Weber M., Mozzanica A., Hiller N., Nasse M.J., Niehues G., Schonfeldt P., Gerth C., Steffen B., Walther S., Makowski D., Mielczarek A.
in 2016 IEEE-NPSS Real Time Conference, RT 2016 (2016), 7543157. DOI:10.1109/RTC.2016.7543157
© 2016 IEEE. We developed a fast linear array detector to improve the acquisition rate and the resolution of Electro-Optical Spectral Decoding (EOSD) experimental setups currently installed at several light sources. The system consists of a detector board, an FPGA readout board and a high-throughput data link. InGaAs or Si sensors are used to detect near-infrared (NIR) or visible light. The data acquisition, the operation of the detector board and its synchronization with synchrotron machines are handled by the FPGA. The readout architecture is based on a high-throughput PCI-Express data link. In this paper we describe the system and we present preliminary measurements taken at the ANKA storage ring. A line-rate of 2.7 Mlps (lines per second) has been demonstrated.