Birk M., Balzer M., Ruiter N.V., Becker J.
in Computers and Electrical Engineering, 40 (2014) 1171-1185. DOI:10.1016/j.compeleceng.2013.11.033
In heterogeneous computing, application developers have to identify the best-suited target platform from a variety of alternatives. In this work, we compare the performance and architectural efficiency of Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs) for two algorithms taken from a novel medical imaging method named 3D ultrasound computer tomography. From the 40 nm and 28 nm generations, we use top-notch devices and those with similar power consumption values. For our two benchmark algorithms from the signal processing and imaging domains, the results show that, if power consumption is not considered, the GPU and FPGA from the 40 nm generation deliver similar performance and similar efficiency per transistor. In the 28 nm process, in contrast, the FPGA is superior to its GPU counterpart by 86% and 39%, depending on the algorithm. If power is limited, FPGAs outperform GPUs in each investigated case by at least a factor of four. © 2013 Elsevier Ltd. All rights reserved.
Brogna A.S., Balzer M., Smale S., Hartmann J., Bormann D., Hamann E., Cecilia A., Zuber M., Koenig T., Zwerger A., Weber M., Fiederle M., Baumbach T.
in Journal of Instrumentation, 9 (2014), C05047. DOI:10.1088/1748-0221/9/05/C05047
In this work we present novel readout electronics for an X-ray sensor based on a Si crystal bump-bonded to an array of 3 × 2 Medipix ASICs. The pixel size is 55 μm × 55 μm with a total number of ∼ 400k pixels and a sensitive area of 42 mm × 28 mm. The readout electronics operate Medipix-2 MXR or Timepix ASICs with a clock speed of 125 MHz. The data acquisition system is centered around an FPGA, and each of the six ASICs has a dedicated I/O port for simultaneous data acquisition. The settings of the auxiliary devices (ADCs and DACs) are also processed in the FPGA. Moreover, a high-resolution timer operates the electronic shutter to select the exposure time from 8 ns to several milliseconds. A sophisticated trigger is available in hardware and software to synchronize the acquisition with external electro-mechanical motors. The system includes a diagnostic subsystem to check the sensor temperature and to control the cooling Peltier cells, and a programmable high-voltage generator to bias the crystal. A network cable transfers the data, encapsulated in UDP packets and streamed at 1 Gb/s. Therefore, most notebooks or personal computers are able to process the data and to program the system without a dedicated interface. The data readout software is compatible with the well-known Pixelman 2.x running on both Windows and GNU/Linux. Furthermore, the open architecture encourages users to write their own applications. With a low-level interface library that implements all the basic features, a MATLAB or Python script can be written for special manipulations of the raw data. In this paper we present selected images taken with a microfocus X-ray tube to demonstrate the capability to collect data at rates of up to 120 fps, corresponding to 0.76 Gb/s. © 2014 IOP Publishing Ltd and Sissa Medialab srl.
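As a rough plausibility check of the quoted throughput, the following sketch recomputes the raw data rate at 120 fps. It assumes 256 × 256 pixels per ASIC (consistent with the 55 μm pitch and the 42 mm × 28 mm sensitive area) and 16 bits transferred per pixel. The 16-bit word size and the absence of protocol overhead are assumptions, not figures from the abstract, and the function name is hypothetical.

```python
# Assumptions (not stated in the abstract): 256 x 256 pixels per ASIC,
# 16 bits transferred per pixel, no UDP/header overhead.
ASICS = 3 * 2
PIXELS_PER_ASIC = 256 * 256
BITS_PER_PIXEL = 16
FRAME_RATE_FPS = 120


def raw_data_rate_gbps(fps, asics=ASICS, pixels_per_asic=PIXELS_PER_ASIC,
                       bits_per_pixel=BITS_PER_PIXEL):
    """Raw payload rate in Gb/s for a given frame rate."""
    return fps * asics * pixels_per_asic * bits_per_pixel / 1e9


total_pixels = ASICS * PIXELS_PER_ASIC      # about 400k pixels
rate = raw_data_rate_gbps(FRAME_RATE_FPS)   # raw payload only

print(f"pixels per frame: {total_pixels}")
print(f"raw rate at {FRAME_RATE_FPS} fps: {rate:.3f} Gb/s")
```

Under these assumptions the payload comes out slightly below the quoted 0.76 Gb/s and well within the 1 Gb/s link; the small difference may be due to overhead that the sketch does not model.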
Judin V., Brosi M., Caselle M., Hertle E., Hiller N., Kopmann A., Muller A.-S., Schuh M., Smale N.J., Steinmann J.L., Weber M.
in IPAC 2014: Proceedings of the 5th International Particle Accelerator Conference (2014) 225-227.
Copyright © 2014 CC-BY-3.0 and by the respective authors. The ANKA storage ring can generate brilliant coherent synchrotron radiation (CSR) in the THz range due to a dedicated low-α_c optics with reduced bunch lengths. At higher electron currents the radiation is not stable, but occurs in powerful bursts caused by micro-bunching instabilities. This intense THz radiation is very attractive for users. However, the reproducibility of the experimental conditions is very low due to those power fluctuations. Systematic studies of bursting CSR in multi-bunch operation were performed with fast THz detectors at ANKA using a dedicated, ultra-fast DAQ-FPGA board. The technique and preliminary results of these studies are presented in this paper.
Caselle M., Brosi M., Chilingaryan S., Dritschler T., Hertle E., Judin V., Kopmann A., Muller A.-S., Raasch J., Schleicher M., Smale N.J., Steinmann J., Vogelgesang M., Wuensch S., Siegel M., Weber M.
in IPAC 2014: Proceedings of the 5th International Particle Accelerator Conference (2014) 3497-3499.
Copyright © 2014 CC-BY-3.0 and by the respective authors. The commissioning of a new real-time and high-accuracy data acquisition system suitable for recording individual ultra-short coherent pulses detected by fast terahertz detectors will be presented. The Karlsruhe Pulse Taking Ultra-fast Readout Electronics (KAPTURE) is able to monitor all buckets turn-by-turn in streaming mode. KAPTURE is based on direct sampling of the pulse, with a minimum sampling time of 3 ps and a total time jitter of less than 1.7 ps. A very low-noise layout design combined with the wide dynamic range and bandwidth of the analog front-end enables the sampling of signals generated by different GHz/THz detectors. The system has already been used with NbN and YBCO superconductor film detectors as well as zero-biased Schottky diode detectors. The digitized data is transmitted to a DAQ system by an FPGA high-throughput board with data transfer rates of 4 GByte/s. The setup is complemented by a real-time data processing unit based on high-end graphics processing units (GPUs) for on-line analysis of the frequency behaviour of the coherent synchrotron emission. The system has been successfully used to study the beam properties of the ANKA synchrotron radiation source located at the Karlsruhe Institute of Technology.
Rota L., Caselle M., Hiller N., Muller A.-S., Weber M.
in International Beam Instrumentation Conference, IBIC 2014 (2014).
A new spectrometer system has been developed at ANKA for near-field single-shot Electro-Optical (EO) bunch profile measurements with a frame rate of 5 Mfps. The frame rate of commercial line detectors is limited to several tens of kHz, which is unsuitable for measuring fast dynamic changes of the bunch conditions. The new system aims to realize continuous data acquisition over long observation periods without dead time. InGaAs or Si linear array pixel sensors are used to detect the near-IR and visible spectrum radiation. The detector signals are fed via wire-bonding connections to the GOTTHARD ASIC, a charge-sensitive amplifier with analog outputs. The front-end board is also equipped with an array of fast ADCs. The digital samples are then acquired by an FPGA-based readout card and transmitted to an external DAQ system via a high-speed PCI-Express data link. The DAQ system uses high-end Graphics Processing Units (GPUs) to perform a real-time analysis of the beam conditions. In this paper we present the concept, the first prototype and the low-noise layout techniques used for fast linear detectors.
Caselle M., Brosi M., Chilingaryan S., Dritschler T., Hiller N., Judin V., Kopmann A., Muller A.-S., Raasch J., Rota L., Petzold L., Smale N.J., Steinmann J.L., Vogelgesang M., Wuensch S., Siegel M., Weber M.
in International Beam Instrumentation Conference, IBIC 2014 (2014).
The ANKA storage ring generates brilliant coherent synchrotron radiation (CSR) in the THz range due to a dedicated low-α_c optics with reduced bunch length. At higher electron currents the radiation is not stable but is emitted in powerful bursts caused by micro-bunching instabilities. This intense THz radiation is very attractive for users. However, the experimental conditions cannot be easily reproduced due to those power fluctuations. To study the bursting CSR in multi-bunch operation, an ultra-fast and high-accuracy data acquisition system for recording individual ultra-short coherent pulses has been developed. The Karlsruhe Pulse Taking Ultra-fast Readout Electronics (KAPTURE) is able to monitor all buckets turn-by-turn in streaming mode. KAPTURE provides real-time sampling of the pulse with a minimum sampling time of 3 ps and a total time jitter of less than 1.7 ps. The KAPTURE system, the synchrotron operation modes and beam test results are presented in this paper.
Birk M., Zapf M., Balzer M., Ruiter N., Becker J.
in Journal of Real-Time Image Processing, 9 (2014) 159-170. DOI:10.1007/s11554-012-0267-4
As today’s standard screening methods frequently fail to diagnose breast cancer before metastases have developed, earlier breast cancer diagnosis is still a major challenge. Three-dimensional ultrasound computer tomography promises high-quality images of the breast, but is currently limited by a time-consuming image reconstruction. In this work, we investigate the acceleration of the image reconstruction by GPUs and FPGAs. We compare the obtained performance results with a recent multi-core CPU. We show that both architectures are able to accelerate processing, with the GPU reaching the highest performance. Furthermore, we draw conclusions about the applicability of the accelerated reconstructions in future clinical applications and highlight general principles for speed-up on GPUs and FPGAs. © 2012 Springer-Verlag.
Stevanovic U., Caselle M., Balzer M., Cecilia A., Chilingaryan S., Farago T., Gasilov S., Herth A., Kopmann A., Vogelgesang M., Weber M.
in 2014 19th IEEE-NPSS Real Time Conference, RT 2014 – Conference Records (2014), 7097495. DOI:10.1109/RTC.2014.7097495
© 2014 IEEE. High-speed X-ray imaging applications such as radiography and tomography play a crucial role in non-destructive investigations in materials science and biology. For data-intensive applications, on-line analysis of the data is necessary for initial quality assurance and data-driven feedback. In this article we present a new smart camera platform with embedded FPGA processing that is able to stream and process data continuously in real time. It is used at the new imaging beamline IMAGE at ANKA. The new smart camera platform consists of a CMOS sensor and an FPGA readout card connected via a high-speed PCIe interface to the GPU-based readout computer. It is tightly coupled to a newly implemented control system called Concert. Concert enables efficient operation of the beamline by integrating devices and experiment process control, as well as data analysis. A key feature of smart cameras is embedded image processing. In this article we demonstrate the potential of this approach with the implementation of an image-based self-event trigger. The algorithm automatically restricts the readout to selected regions with changed content. Application-dependent trigger parameters are hidden by our control system, which sets them automatically according to experiment requirements and conditions.
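The idea of restricting readout to regions with changed content can be illustrated with a minimal software sketch. This is not the authors' FPGA implementation: the block-wise mean absolute difference, the function name `changed_row_blocks`, and the parameters `block_rows` and `threshold` are all assumptions chosen for illustration.

```python
def changed_row_blocks(reference, frame, block_rows=2, threshold=10.0):
    """Return (start_row, end_row) ranges whose mean absolute difference
    to the reference frame exceeds the threshold (hypothetical criterion)."""
    regions = []
    for start in range(0, len(frame), block_rows):
        end = min(start + block_rows, len(frame))
        # Absolute pixel differences for all rows in this block.
        diffs = [abs(a - b)
                 for r in range(start, end)
                 for a, b in zip(frame[r], reference[r])]
        if sum(diffs) / len(diffs) > threshold:
            regions.append((start, end))
    return regions


# Toy data: a flat 6 x 8 reference frame and a frame with a change in row 4.
reference = [[100] * 8 for _ in range(6)]
frame = [row[:] for row in reference]
frame[4][2:6] = [180, 180, 180, 180]

regions = changed_row_blocks(reference, frame)
print(regions)  # only the block containing row 4 is selected for readout
```

In a setup like the one described, the threshold and block size would correspond to the application-dependent trigger parameters that the control system sets automatically.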
Lytaev P., Hipp A., Lottermoser L., Herzen J., Greving I., Khokhriakov I., Meyer-Loges S., Plewka J., Burmester J., Caselle M., Vogelgesang M., Chilingaryan S., Kopmann A., Balzer M., Schreyer A., Beckmann F.
in Proceedings of SPIE – The International Society for Optical Engineering, 9212 (2014), 921218. DOI:10.1117/12.2061389
© 2014 SPIE. In this article we present the quantitative characterization of CCD and CMOS sensors used in the microtomography experiments operated by HZG at PETRA III at DESY in Hamburg, Germany. A standard commercial CCD camera is compared to a camera based on a CMOS sensor. This CMOS camera is modified for grating-based differential phase-contrast tomography. The main goal of the project is to quantify and optimize the statistical parameters of this camera system. Key performance parameters such as readout noise, conversion gain and full-well capacity are used to define an optimized measurement for grating-based phase contrast. First results will be shown.
Caselle M., Balzer M., Chilingaryan S., Hofherr M., Judin V., Kopmann A., Smale N.J., Thoma P., Wuensch S., Muller A.-S., Siegel M., Weber M.
in Journal of Instrumentation, 9 (2014), C01024. DOI:10.1088/1748-0221/9/01/C01024
The recording of coherent synchrotron radiation requires data acquisition systems with a temporal resolution of tens of picoseconds. This paper describes a new real-time and high-accuracy data acquisition system suitable for recording individual ultra-short pulses generated by a fast terahertz (THz) detector (e.g. YBCO, NbN, zero-biased Schottky diode). The system consists of a fast sampling board combined with a high data throughput readout. The first board is designed for sampling fast pulse signals with a full width at half maximum (FWHM) between a few tens and one hundred picoseconds, with a minimum sampling time of 3 ps. The high data throughput board consists of a PCIe bus master DMA architecture used for fast data transfer at up to 3 GByte/s. The full readout chain with fast THz detectors and the acquisition system has been successfully tested at the synchrotron ANKA. An overview of the electronics system and preliminary results with a multi-bunch filling pattern will be presented. © 2014 IOP Publishing Ltd and Sissa Medialab srl.