Shkarin A., Ametova E., Chilingaryan S., Dritschler T., Kopmann A., Vogelgesang M., Shkarin R., Tsapko S.

in Fundamenta Informaticae, 141 (2015) 259-274. DOI:10.3233/FI-2015-1275


© 2015 Fundamenta Informaticae 141. Recent developments in detector technology have made 4D (3D + time) X-ray microtomography with high spatial and time resolutions possible. The resolution and duration of such experiments are currently limited by destructive X-ray radiation. The algebraic reconstruction technique (ART) can incorporate a priori knowledge into the reconstruction model, which makes it possible to reduce the imaging dose while keeping sufficient reconstruction quality. However, these techniques are very computationally demanding. In this paper we present a framework for ART reconstruction based on OpenCL technology. Our approach treats an algebraic method as a composition of interacting blocks which perform different tasks, such as projection selection, minimization, projecting and regularization. These tasks are realised using multiple algorithms differing in performance, quality of reconstruction, and area of applicability. Our framework allows algorithms to be freely combined to build the reconstruction chain. All algorithms are implemented with OpenCL and are able to run on a wide range of parallel hardware. The framework also scales easily to clustered environments with MPI. We describe the architecture of the ART framework and evaluate the quality and performance on the latest generation of GPU hardware from NVIDIA and AMD.
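The basic idea behind ART can be illustrated with a minimal Kaczmarz-style sketch in plain Python. It is not the OpenCL framework described in the abstract; the function name `art_reconstruct`, the tiny 2x2 test image, the ray layout, and the non-negativity constraint (standing in for a priori knowledge) are all assumptions chosen for illustration.

```python
def art_reconstruct(rows, measurements, n_unknowns, sweeps=500, relaxation=1.0):
    """Kaczmarz-style ART: project the estimate onto one ray equation at a time.

    rows[i] holds the weights of ray i over all pixels, measurements[i] is the
    measured ray sum. All names here are illustrative, not the paper's API.
    """
    x = [0.0] * n_unknowns
    for _ in range(sweeps):
        for a, b in zip(rows, measurements):
            norm_sq = sum(v * v for v in a)
            if norm_sq == 0.0:
                continue  # a ray that hits no pixel carries no information
            residual = b - sum(ai * xi for ai, xi in zip(a, x))
            step = relaxation * residual / norm_sq
            x = [xi + step * ai for xi, ai in zip(x, a)]
            # Assumed a priori constraint: attenuation values are non-negative.
            x = [max(xi, 0.0) for xi in x]
    return x


# Hypothetical 2x2 image (flattened), probed by two row sums,
# two column sums, and one diagonal ray.
rows = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 0, 1],
]
truth = [1.0, 2.0, 3.0, 4.0]
measurements = [sum(a * t for a, t in zip(row, truth)) for row in rows]
estimate = art_reconstruct(rows, measurements, n_unknowns=4)
```

In this sketch the inner loop plays the role of the minimization block and the clipping step that of a simple regularization block; a real chain would swap in other algorithms for each.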

Shkarin R., Ametova E., Chilingaryan S., Dritschler T., Kopmann A., Mirone A., Shkarin A., Vogelgesang M., Tsapko S.

in Fundamenta Informaticae, 141 (2015) 245-258. DOI:10.3233/FI-2015-1274


© 2015 Fundamenta Informaticae 141. On-line monitoring of synchrotron 3D-imaging experiments requires very fast tomographic reconstruction. Direct Fourier methods (DFM) have the potential to be faster than standard Filtered Backprojection. We have evaluated multiple DFMs using various interpolation techniques. We compared reconstruction quality and studied the parallelization potential. A method using Direct Fourier Inversion (DFI) and a sinc-based interpolation was selected and parallelized for execution on GPUs. Several optimization steps were considered to boost the performance. Finally, we evaluated the achieved performance for the latest generation of GPUs from NVIDIA and AMD. The results show that tomographic reconstruction with a throughput of more than 1.5 GB/s on a single GPU is possible.
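Direct Fourier methods rest on the Fourier slice theorem: the 1D Fourier transform of a projection equals a central slice of the object's 2D Fourier transform. The following plain-Python sketch checks this numerically for the simplest case, a projection along the vertical axis of a small, made-up image. It does not implement the interpolation or GPU parallelization discussed in the abstract, and the helper names are hypothetical.

```python
import cmath


def dft(values):
    """Naive 1D discrete Fourier transform of a sequence."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * k * i / n) for i, v in enumerate(values))
        for k in range(n)
    ]


def dft2_row(image, ky):
    """Row `ky` of the 2D DFT of a square image given as a list of rows."""
    n = len(image)
    # Transform along y for the single frequency ky, then along x.
    column_terms = [
        sum(image[y][x] * cmath.exp(-2j * cmath.pi * ky * y / n) for y in range(n))
        for x in range(n)
    ]
    return dft(column_terms)


# Hypothetical 4x4 test image.
image = [
    [0, 1, 2, 0],
    [3, 5, 1, 0],
    [0, 2, 4, 1],
    [1, 0, 0, 2],
]

# Projection at angle zero: integrate (sum) along the vertical axis.
projection = [sum(row[x] for row in image) for x in range(len(image))]

slice_from_projection = dft(projection)
central_slice = dft2_row(image, ky=0)
max_error = max(abs(a - b) for a, b in zip(slice_from_projection, central_slice))
```

For other projection angles the slice no longer falls on the Cartesian frequency grid, which is why the choice of interpolation matters for reconstruction quality.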

Gehrke R., Kopmann A., Wintersberger E., Beckmann F.

in Synchrotron Radiation News, 28 (2015) 36-42. DOI:10.1080/08940886.2015.1013420


© Taylor & Francis. The Helmholtz Association is the largest scientific organization in Germany. It operates all major German research infrastructures involved in research with photons, neutrons, and ions. These are DESY in Hamburg; the Karlsruhe Institute of Technology (KIT); the Research Centre Jülich (FZJ); the Helmholtz Centres in Geesthacht (HZG), Berlin (HZB), and Dresden-Rossendorf (HZDR); and the GSI Centre for research with heavy ions in Darmstadt. All of these centers face similar challenges related to dramatically increasing data rates and volumes, generated by more and more powerful radiation sources together with larger and faster detectors. On the other hand, each center has its own specific portfolio of long-standing technical expertise in areas like data analysis, information technology, or hardware development. Therefore, it was natural to address the challenges by acting in concert. This was the main motivation in 2010 for the launch of a joint project among the partners called the “High Data Rate Processing and Analysis Initiative (HDRI).” The initiative is organized into three basic work packages: “Data Management,” “Real-time Data Processing,” and “Data Analysis, Modelling, and Simulation.” The aim is to develop methods, hardware components, and software for data acquisition, real-time and offline analysis, documentation and archiving, and for remote access to data. The solutions are ultimately meant to be integrated at the various experimental stations and thus have to be versatile and flexible to cope with the heterogeneous requirements of the different experiments. The aim of creating standard solutions makes it mandatory to collaborate closely with large international activities in the field of data handling, like the European PaNdata project (see article in this issue), but also with vendors of detectors, data evaluation software, etc., as well as with corresponding standardization bodies.

Caselle M., Brosi M., Chilingaryan S., Dritschler T., Hertle E., Judin V., Kopmann A., Muller A.-S., Raasch J., Schleicher M., Smale N.J., Steinmann J., Vogelgesang M., Wuensch S., Siegel M., Weber M.

in IPAC 2014: Proceedings of the 5th International Particle Accelerator Conference (2014) 3497-3499.


Copyright © 2014 CC-BY-3.0 and by the respective authors. The commissioning of a new real-time and high-accuracy data acquisition system suitable for recording individual ultra-short coherent pulses detected by fast terahertz detectors will be presented. The Karlsruhe Pulse Taking Ultra-fast Readout Electronics (KAPTURE) is able to monitor all buckets turn-by-turn in streaming mode. KAPTURE is based on direct sampling of the pulse with a minimum sampling time of 3 ps and a total time jitter of less than 1.7 ps. A very low noise layout design combined with the wide dynamic range and bandwidth of the analog front-end enables the sampling of signals generated by different GHz/THz detectors. The system has already been used with NbN and YBCO superconductor film detectors as well as zero biased Schottky diode detectors. The digitized data is transmitted to a DAQ system by an FPGA high-throughput board with data transfer rates of 4 GByte/s. The setup is completed by a real-time data processing unit based on high-end graphics processing units (GPUs) for on-line analysis of the frequency behaviour of the coherent synchrotron emission. The system has been successfully used to study the beam properties of the ANKA synchrotron radiation source located at the Karlsruhe Institute of Technology.

Caselle M., Brosi M., Chilingaryan S., Dritschler T., Hiller N., Judin V., Kopmann A., Muller A.-S., Raasch J., Rota L., Petzold L., Smale N.J., Steinmann J.L., Vogelgesang M., Wuensch S., Siegel M., Weber M.

in International Beam Instrumentation Conference, IBIC 2014 (2014).


The ANKA storage ring generates brilliant coherent synchrotron radiation (CSR) in the THz range due to a dedicated low-αc optics with reduced bunch length. At higher electron currents the radiation is not stable but is emitted in powerful bursts caused by micro-bunching instabilities. This intense THz radiation is very attractive for users. However, the experimental conditions cannot be easily reproduced due to those power fluctuations. To study the bursting CSR in multi-bunch operation, an ultra-fast and high-accuracy data acquisition system for recording individual ultra-short coherent pulses has been developed. The Karlsruhe Pulse Taking Ultra-fast Readout Electronics (KAPTURE) is able to monitor all buckets turn-by-turn in streaming mode. KAPTURE provides real-time sampling of the pulse with a minimum sampling time of 3 ps and a total time jitter of less than 1.7 ps. The KAPTURE system, the synchrotron operation modes and beam test results are presented in this paper.

Stevanovic U., Caselle M., Balzer M., Cecilia A., Chilingaryan S., Farago T., Gasilov S., Herth A., Kopmann A., Vogelgesang M., Weber M.

in 2014 19th IEEE-NPSS Real Time Conference, RT 2014 – Conference Records (2014), 7097495. DOI:10.1109/RTC.2014.7097495


© 2014 IEEE. High-speed X-ray imaging applications such as radiography and tomography play a crucial role in non-destructive investigations in materials science and biology. For data-intensive applications, on-line analysis of the data is necessary for initial quality assurance and data-driven feedback. In this article we present a new smart camera platform with embedded FPGA processing that is able to stream and process data continuously in real time. It is used at the new imaging beamline IMAGE at ANKA. The new smart camera platform consists of a CMOS sensor and an FPGA readout card connected through a high-speed PCIe interface to the GPU-based readout computer. It is tightly coupled to a newly implemented control system called Concert. Concert enables efficient operation of the beamline by integrating devices and experiment process control, as well as data analysis. A key feature of smart cameras is embedded image processing. In this article we demonstrate the potential of this approach with the implementation of an image-based self-event trigger. The algorithm automatically restricts the readout to selected regions with changed content. Application-dependent trigger parameters are hidden by our control system, which sets them automatically according to experiment requirements and conditions.
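The principle of restricting readout to regions with changed content can be sketched in a few lines of plain Python. This is only an illustration of block-wise change detection under assumed parameters; the abstract does not specify the trigger algorithm, and the function name, block size, and threshold below are hypothetical rather than taken from the FPGA implementation.

```python
def changed_regions(reference, frame, block=2, threshold=10):
    """Return (row, col) origins of blocks whose mean absolute difference
    to the reference frame exceeds `threshold` (illustrative criterion)."""
    height, width = len(frame), len(frame[0])
    hits = []
    for r0 in range(0, height, block):
        for c0 in range(0, width, block):
            diffs = [
                abs(frame[r][c] - reference[r][c])
                for r in range(r0, min(r0 + block, height))
                for c in range(c0, min(c0 + block, width))
            ]
            if sum(diffs) / len(diffs) > threshold:
                hits.append((r0, c0))
    return hits


# Hypothetical 4x4 frames: a flat reference and one changed pixel.
reference = [[100] * 4 for _ in range(4)]
frame = [row[:] for row in reference]
frame[2][3] = 180  # change inside the lower-right 2x2 block

regions = changed_regions(reference, frame)  # only these blocks would be read out
```

In a setup like the one described, values such as `block` and `threshold` would be the application-dependent parameters that the control system sets on the user's behalf.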

Lytaev P., Hipp A., Lottermoser L., Herzen J., Greving I., Khokhriakov I., Meyer-Loges S., Plewka J., Burmester J., Caselle M., Vogelgesang M., Chilingaryan S., Kopmann A., Balzer M., Schreyer A., Beckmann F.

in Proceedings of SPIE – The International Society for Optical Engineering, 9212 (2014), 921218. DOI:10.1117/12.2061389


© 2014 SPIE. In this article we present the quantitative characterization of CCD and CMOS sensors which are used in the microtomography experiments operated by HZG at PETRA III at DESY in Hamburg, Germany. A standard commercial CCD camera is compared to a camera based on a CMOS sensor. This CMOS camera is modified for grating-based differential phase-contrast tomography. The main goal of the project is to quantify and to optimize the statistical parameters of this camera system. Key performance parameters such as readout noise, conversion gain and full-well capacity are used to define an optimized measurement for grating-based phase contrast. First results will be shown.

Anzt H., Beglarian A., Chilingaryan S., Ferrone A., Heuveline V., Kopmann A.

in Computer Science – Research and Development, 29 (2014) 131-138. DOI:10.1007/s00450-012-0225-1


The focus in High-Performance Computing increasingly turns to energy efficiency. Therefore, concentrating purely on floating point operations and runtime performance is no longer sufficient. In terms of hardware, this change of paradigm has already taken place: the GREEN500 list has been established as a counterpart to the runtime-performance-oriented TOP500 list. The new metrics take runtime and energy consumption into account. Nevertheless, all these developments consider hardware only, which is still inadequate for facing the challenges of Energy-Efficient Exascale Computing. Optimizing simulation software with respect to power and energy draw requires detailed profiling of the power consumption during the calculations and a norm quantifying the respective efficiency. In this paper we propose a unified energy footprint for simulation software that enables a fast comparison between different models, implementations and hardware configurations. By way of example we provide the footprints for the tomographic reconstruction code PyHST optimized for CPU and GPU operation as well as the operational numerical weather prediction model COSMO. We then discuss the power and energy profiles and investigate the effects of scaling with respect to hardware resources and simulation parameters. © 2012 Springer-Verlag.
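To make the relation between a power profile and energy consumption concrete, the sketch below integrates sampled power over time with the trapezoidal rule. It is not the unified energy footprint proposed in the paper, whose definition is not given in the abstract; the function name and the sample values are hypothetical.

```python
def energy_joules(timestamps_s, power_w):
    """Integrate a sampled power profile (watts) over time (seconds)
    with the trapezoidal rule to obtain energy in joules."""
    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


# Made-up power samples for one run of a code, one sample per second.
timestamps = [0.0, 1.0, 2.0, 3.0]
power = [100.0, 200.0, 200.0, 100.0]

energy = energy_joules(timestamps, power)         # energy used by the run
runtime = timestamps[-1] - timestamps[0]          # wall-clock duration
average_power = energy / runtime                  # mean draw over the run
```

Comparing such energy and runtime figures across implementations or hardware configurations shows why runtime alone is an incomplete measure: a faster run on hardware with a higher power draw can still consume more energy.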

Khokhriakov I., Lottermoser L., Gehrke R., Kracht T., Wintersberger E., Kopmann A., Vogelgesang M., Beckmann F.

in Proceedings of SPIE – The International Society for Optical Engineering, 9212 (2014), 921217. DOI:10.1117/12.2060975


© 2014 SPIE. A new control system for high-throughput experiments (X-ray, neutron) is introduced in this article. The system consists of several software components which are required to make optimal use of the beamtime and to meet the requirement to implement the new standardized data format established within the Helmholtz Association in Germany. The main components are: PreExperiment Data Collector; Status server; Data Format Server. For tomography in particular, a concept for online reconstruction based on GPU computing is presented. One of the main goals of the system is to collect data that extends standard experimental data, e.g. the instrument’s hardware state, preinvestigation data, experiment description data etc. The collected data is stored together with the experiment data in the permanent storage of the user. The stored data is then used for post-processing and analysis of the experiment.

Caselle M., Balzer M., Chilingaryan S., Hofherr M., Judin V., Kopmann A., Smale N.J., Thoma P., Wuensch S., Muller A.-S., Siegel M., Weber M.

in Journal of Instrumentation, 9 (2014), C01024. DOI:10.1088/1748-0221/9/01/C01024


The recording of coherent synchrotron radiation requires data acquisition systems with a temporal resolution of tens of picoseconds. This paper describes a new real-time and high-accuracy data acquisition system suitable for recording individual ultra-short pulses generated by a fast terahertz (THz) detector (e.g. YBCO, NbN, Zero Biased Schottky Diode). The system consists of a fast sampling board combined with a high data throughput readout board. The first board is designed for sampling fast pulse signals with a full width at half maximum (FWHM) between a few tens and one hundred picoseconds with a minimum sampling time of 3 ps. The high data throughput board consists of a PCIe-Bus Master DMA architecture used for fast data transfer up to 3 GByte/s. The full readout chain with fast THz detectors and the acquisition system has been successfully tested at the synchrotron ANKA. An overview of the electronics system and preliminary results with a multi-bunch filling pattern will be presented. © 2014 IOP Publishing Ltd and Sissa Medialab srl.