Birk M., Zapf M., Balzer M., Ruiter N., Becker J.

in Journal of Real-Time Image Processing, 9 (2014) 159-170. DOI:10.1007/s11554-012-0267-4

Abstract

As today’s standard screening methods frequently fail to diagnose breast cancer before metastases have developed, earlier breast cancer diagnosis is still a major challenge. Three-dimensional ultrasound computer tomography promises high-quality images of the breast, but is currently limited by a time-consuming image reconstruction. In this work, we investigate the acceleration of the image reconstruction by GPUs and FPGAs. We compare the obtained performance results with a recent multi-core CPU. We show that both architectures are able to accelerate processing, whereas the GPU reaches the highest performance. Furthermore, we draw conclusions in terms of applicability of the accelerated reconstructions in future clinical application and highlight general principles for speed-up on GPUs and FPGAs. © 2012 Springer-Verlag.

Stevanovic U., Caselle M., Balzer M., Cecilia A., Chilingaryan S., Farago T., Gasilov S., Herth A., Kopmann A., Vogelgesang M., Weber M.

in 2014 19th IEEE-NPSS Real Time Conference, RT 2014 – Conference Records (2014), 7097495. DOI:10.1109/RTC.2014.7097495

Abstract

© 2014 IEEE. High-speed X-ray imaging applications such as radiography and tomography play a crucial role in non-destructive investigations in materials science and biology. For data-intensive applications, on-line analysis of the data is necessary for initial quality assurance and data-driven feedback. In this article we present a new smart camera platform with embedded FPGA processing that is able to stream and process data continuously in real time. It is used in the new imaging beamline IMAGE at ANKA. The smart camera platform consists of a CMOS sensor and an FPGA readout card connected via a high-speed PCIe interface to the GPU-based readout computer. It is tightly coupled to a newly implemented control system called Concert. Concert enables efficient operation of the beamline by integrating devices and experiment process control, as well as data analysis. A key feature of smart cameras is embedded image processing. In this article we demonstrate the potential of this approach with the implementation of an image-based self-event trigger. The algorithm automatically restricts the readout to selected regions with changed content. Application-dependent trigger parameters are hidden by our control system, which sets them automatically according to experiment requirements and conditions.
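
The self-event trigger in the abstract above restricts readout to regions whose content has changed. The abstract does not give the algorithm, so the following is only a minimal sketch of one plausible approach: tile-wise frame differencing in software. The function name `changed_regions`, the tile size and the threshold are hypothetical and are not taken from the paper or from the Concert control system.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[Sequence[int]]  # row-major grey values; an assumed representation


def changed_regions(prev: Frame, curr: Frame,
                    tile: int = 2, threshold: float = 10.0) -> List[Tuple[int, int]]:
    """Return the (row, col) origins of tiles whose mean absolute change exceeds threshold.

    Hypothetical illustration of a change-based trigger, not the FPGA implementation.
    """
    rows, cols = len(curr), len(curr[0])
    selected = []
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            r1, c1 = min(r0 + tile, rows), min(c0 + tile, cols)
            # Sum of absolute pixel differences inside this tile.
            total = sum(abs(curr[r][c] - prev[r][c])
                        for r in range(r0, r1) for c in range(c0, c1))
            if total / ((r1 - r0) * (c1 - c0)) > threshold:
                selected.append((r0, c0))
    return selected


previous = [[0] * 4 for _ in range(4)]
current = [row[:] for row in previous]
current[2][3] = 100  # one pixel changes in the lower-right tile
regions = changed_regions(previous, current)
```

Only the tiles listed in `regions` would then need to be read out. In a real system the threshold would presumably be set by the control system rather than by hand, as the abstract indicates.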

Lytaev P., Hipp A., Lottermoser L., Herzen J., Greving I., Khokhriakov I., Meyer-Loges S., Plewka J., Burmester J., Caselle M., Vogelgesang M., Chilingaryan S., Kopmann A., Balzer M., Schreyer A., Beckmann F.

in Proceedings of SPIE – The International Society for Optical Engineering, 9212 (2014), 921218. DOI:10.1117/12.2061389

Abstract

© 2014 SPIE. In this article we present the quantitative characterization of CCD and CMOS sensors which are used at the experiments for microtomography operated by HZG at PETRA III at DESY in Hamburg, Germany. A standard commercial CCD camera is compared to a camera based on a CMOS sensor. This CMOS camera is modified for grating-based differential phase-contrast tomography. The main goal of the project is to quantify and to optimize the statistical parameters of this camera system. These key performance parameters such as readout noise, conversion gain and full-well capacity are used to define an optimized measurement for grating-based phase-contrast. First results will be shown.

Caselle M., Balzer M., Chilingaryan S., Hofherr M., Judin V., Kopmann A., Smale N.J., Thoma P., Wuensch S., Muller A.-S., Siegel M., Weber M.

in Journal of Instrumentation, 9 (2014), C01024. DOI:10.1088/1748-0221/9/01/C01024

Abstract

The recording of coherent synchrotron radiation requires data acquisition systems with a temporal resolution of tens of picoseconds. This paper describes a new real-time and high-accuracy data acquisition system suitable for recording individual ultra-short pulses generated by a fast terahertz (THz) detector (e.g. YBCO, NbN, Zero Biased Schottky Diode). The system consists of a fast sampling board combined with a high data throughput readout. The first board is designed for sampling the fast pulse signals with a full width at half maximum (FWHM) between a few tens and one hundred picoseconds with a minimum sampling time of 3 ps. The high data throughput board consists of a PCIe Bus Master DMA architecture used for fast data transfer up to 3 GByte/s. The full readout chain with fast THz detectors and the acquisition system has been successfully tested at the synchrotron ANKA. An overview of the electronics system and preliminary results with multi-bunch filling pattern will be presented. © 2014 IOP Publishing Ltd and Sissa Medialab srl.

Judin V., Brosi M., Caselle M., Hertle E., Hiller N., Kopmann A., Muller A.-S., Schuh M., Smale N.J., Steinmann J.L., Weber M.

in IPAC 2014: Proceedings of the 5th International Particle Accelerator Conference (2014) 225-227.

Abstract

Copyright © 2014 CC-BY-3.0 and by the respective authors. The ANKA storage ring can generate brilliant coherent synchrotron radiation (CSR) in the THz range due to a dedicated low-α_c optics with reduced bunch lengths. At higher electron currents the radiation is not stable, but occurs in powerful bursts caused by micro-bunching instabilities. This intense THz radiation is very attractive for users. However, the reproducibility of the experimental conditions is very low due to those power fluctuations. Systematic studies of bursting CSR in multi-bunch operation were performed with fast THz detectors at ANKA using a dedicated, ultra-fast DAQ-FPGA board. The technique and preliminary results of these studies are presented in this paper.

Muller A.-S., Judin V., Balzer M., Caselle M., Hiller N., Hofherr M., Ilin K.S., Kehrer B., Marsching S., Naknaimueang S., Nasse M.J., Raasch J., Scheuring A., Schuh M., Schwarz M., Siegel M., Smale N.J., Steinmann J., Thoma P., Weber M., Wuensch S.

in IPAC 2013: Proceedings of the 4th International Particle Accelerator Conference (2013) 109-111.

Abstract

In the low-alpha operation mode of the ANKA synchrotron light source, coherent synchrotron radiation (CSR) is emitted from short electron bunches. Depending on the bunch current, the radiation shows bursts of high intensity. These bursts of high intensity THz radiation display a time evolution which can be observed only on long time scales with respect to the revolution period. In addition, long range wake fields can introduce a correlation between the bunches within a bunch train and thus modify the observed behaviour. A novel detection system consisting of an ultra-fast superconducting THz detector and data acquisition system was used to investigate correlations visible on the bursting pattern and to study the interactions of very short pulses in the ANKA storage ring. Copyright © 2013 by JACoW.

Caselle M., Balzer M., Chilingaryan S., Hofherr M., Judin V., Kopmann A., Il’in K., Menshikov A., Muller A.-S., Smale N.J., Thoma P., Wuensch S., Siegel M., Weber M.

in IPAC 2013: Proceedings of the 4th International Particle Accelerator Conference (2013) 2094-2096.

Abstract

This paper describes a new real-time and high-accuracy data acquisition system suitable for recording the individual ultra-short pulses generated by a fast terahertz (THz) detector (e.g. YBCO, NbN, Zero Biased Schottky Diode). The proposed system consists of a fast pulse sampling board and a high data throughput readout board. The first board is designed for sampling the fast pulse signals with a full width at half maximum (FWHM) between a few tens and a hundred picoseconds. For each THz pulse, four samples are acquired with a minimum sampling time of 3 ps. The high data throughput board consists of a PCIe Bus Master DMA architecture used for fast data transfer up to 3 GByte/s. A prototype setup with fast THz detectors and the acquisition system has been successfully tested at the synchrotron ANKA. An overview of the experimental setup and preliminary results with multi-bunch filling pattern will be shown. Copyright © 2013 by JACoW.

Spillmann U., Blumenhagen K.H., Badura E., Balzer M., Brauning H., Hoffmann J., Koch K., Kurz N., Martin R., Minami S., Ott W., Stohlker T., Weber G., Weber M.

in Physica Scripta, T156 (2013), 014103. DOI:10.1088/0031-8949/2013/T156/014103

Abstract

The future x-ray spectroscopy and polarimetry experiment program of the SPARC collaboration at GSI and FAIR relies strongly on the availability of two-dimensional position-sensitive, energy- and time-dispersive thick semiconductor detector systems, including the appropriate signal processing electronics. To meet these demands, the development of a compact and scalable data acquisition system that has higher rate acceptance compared to commercial VME electronics by employing digital pulse processing electronics was started. © 2013 The Royal Swedish Academy of Sciences.

Balzer M., Kleinert J., Obermayr M.

in Particle-Based Methods III: Fundamentals and Applications – Proceedings of the 3rd International Conference on Particle-based Methods: Fundamentals and Applications, Particles 2013 (2013) 920-930.

Abstract

In numerous industrial applications there is a need to model granular material realistically. For instance, simulating the interaction of vehicles and tools with soil is of great importance for the design of earth-moving machinery. The Discrete Element Method (DEM) has been successfully applied to this task [1, 2]. Large-scale problems require substantial computational resources. Hence, for application in the industrial engineering process, the computational effort is an issue. In DEM, parallelization is straightforward, since each contact between adjacent particles is resolved locally without regard to the other contacts. However, modelling a contact as a stiff spring imposes strong limitations on the time step size needed to maintain a stable simulation. The Non-Smooth Contact Dynamics Method (NSCD), on the other hand, models contacts globally as a set of inequality constraints on a system of perfectly rigid bodies [3]. At the end of every time step, all inequality constraints must be satisfied simultaneously, which can be achieved by solving a complementarity problem. This leads to a numerically stable method that is robust with respect to much larger time steps in comparison to DEM. Since a global problem must be solved, parallelization now strongly depends on the numerical solver that is used for the complementarity problem. We present our first massively parallel implementation of NSCD based on the projected Gauß-Jacobi (PGJ) iterative scheme presented in [4]. By focusing on one-sided asynchronous communication patterns with double buffering for data exchange, global synchronizations can be avoided. Only weak synchronization due to data dependencies of neighboring domains remains. The implementation is based on the Global address space Programming Interface (GPI), supplemented by the Multi Core Threading Package (MCTP) [5] on the processor level. This allows calculation and communication between processors to be overlapped efficiently.
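
To illustrate why a Jacobi-type scheme suits parallel execution, the following is a minimal sketch of a textbook projected Gauß-Jacobi iteration for a small linear complementarity problem (find λ ≥ 0 with Aλ + b ≥ 0 and λᵀ(Aλ + b) = 0). It is an assumption-laden illustration, not the scheme of [4]: the dense matrix, the function name `projected_gauss_jacobi`, the relaxation factor `omega` and the frictionless two-contact example are all hypothetical, and no GPI or MCTP communication is modelled.

```python
from typing import List


def projected_gauss_jacobi(A: List[List[float]], b: List[float],
                           iterations: int = 200, omega: float = 1.0) -> List[float]:
    """Approximate the LCP solution: lam >= 0, A lam + b >= 0, lam . (A lam + b) = 0.

    Assumes A has a positive diagonal and is diagonally dominant so that the
    plain Jacobi iteration converges (an assumption of this sketch).
    """
    n = len(b)
    lam = [0.0] * n
    for _ in range(iterations):
        # Every residual entry uses only the previous iterate, so all rows
        # could be evaluated independently (the source of parallelism).
        residual = [sum(A[i][j] * lam[j] for j in range(n)) + b[i] for i in range(n)]
        # Jacobi update followed by projection onto the constraint lam_i >= 0.
        lam = [max(0.0, lam[i] - omega * residual[i] / A[i][i]) for i in range(n)]
    return lam


# Hypothetical two-contact example; both constraints end up active.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [-1.0, -2.0]
lam = projected_gauss_jacobi(A, b)
```

Because each update reads only the previous iterate, a distributed version needs to exchange only the values shared between neighboring domains, which is consistent with the weak synchronization described in the abstract.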