Visualization in Parallel Manycore Environments
Foreseeable developments in high-performance computing hardware will allow the complexity of scientific simulations to increase considerably. Single workstations are being replaced by visualization clusters that provide adequate performance for post-processing and visualization of huge data sets. These clusters comprise multiple nodes connected by a high-bandwidth, low-latency network infrastructure such as InfiniBand. The nodes themselves are parallel systems, composed of multiple processors each containing multiple cores. These cores have access to shared memory, often with non-uniform latency and bandwidth.
Each of these nodes contains one or more graphics cards that can be used for rendering. Modern graphics cards can also perform highly complex parallel computations that are not directly related to rendering. Utilizing these graphics cards for post-processing introduces additional latency, because data must be copied between the host's main memory and the comparably small amount of memory available on the graphics board. To make this infrastructure available for interactive visualization, users have to be able to access the cluster remotely from their workstations, which are typically connected by a low-bandwidth, high-latency network. For a visualization environment based on the data-flow paradigm to make maximum use of the visualization infrastructure, several requirements have to be met:
- The data management has to be adapted to the distributed memory model with hierarchical memory access.
- The algorithms have to be optimized for parallel hybrid distributed/shared-memory architectures and have to be integrated with the data management.
- The algorithms have to be adapted to make use of GPUs and other accelerators such as FPGAs.
- The rendering has to be able to utilize several GPUs to drive immersive environments such as CAVEs and to provide remote rendering services with low latency and high detail.
- Due to the complexity of the post-processing, automatic scheduling of the required processes has to be provided. To make optimal use of the available resources and to keep the latency of user interaction tolerable, the scheduling algorithms have to take into account load balancing, a given decomposition of the data, and the amount of communication between the compute nodes.
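The automatic scheduling of data-flow processes can be illustrated with a minimal sketch: a greedy list scheduler that walks the module graph in dependency order and places each module on the currently least-loaded node. Module names, cost estimates, and the `schedule` function itself are hypothetical illustrations, not part of any actual framework; a real scheduler would additionally weigh data decomposition and inter-node communication.

```python
from collections import defaultdict, deque

def schedule(modules, edges, node_count):
    """Greedy list scheduling of a data-flow graph onto compute nodes.

    modules:    dict mapping module name -> estimated cost
    edges:      list of (producer, consumer) pairs
    node_count: number of available compute nodes
    Returns a dict mapping module name -> node index, always filling
    the least-loaded node first so that load stays balanced.
    """
    indeg = defaultdict(int)
    succ = defaultdict(list)
    for a, b in edges:
        indeg[b] += 1
        succ[a].append(b)
    # Modules without inputs can start immediately (topological order).
    ready = deque(m for m in modules if indeg[m] == 0)
    load = [0.0] * node_count
    placement = {}
    while ready:
        m = ready.popleft()
        target = min(range(node_count), key=lambda n: load[n])
        placement[m] = target
        load[target] += modules[m]
        for s in succ[m]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)
    return placement
```

For a toy pipeline `read -> iso -> render` on two nodes, the expensive `iso` module ends up alone on one node while the cheaper modules share the other.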
The High Performance Computing Centre Stuttgart (HLRS) is the leader of a consortium of research institutes and industry collaborating in this research project, which is partially funded by the German Federal Ministry of Education and Research (BMBF):
- to create a flexible framework for parallel and distributed visualization
- to develop scheduling strategies that are able to reorder the processing of data and dynamically adapt the accuracy of algorithms and the resolution of the data
- to provide algorithms for the visualization of data sets from different application areas in distributed manycore environments
- to assure the applicability of the developed concepts and the usability of the visualization environments with the help of users from different application areas
Algorithms for Clusters of Manycore Systems
For the first time, the availability of massively parallel manycore architectures – both within a visualization system and as remote computing resources – allows post-processing algorithms such as particle tracing to be used interactively even for extremely complex phenomena. One goal of this project is the development of methods for highly interactive particle tracing in complex flow fields and their implementation in a software framework, in order to offer engineers a tool for the intuitive exploration of flow phenomena. Besides supporting structured Cartesian grids, all developed approaches are required to work with unstructured, potentially time-varying grids as well.
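At its core, particle tracing integrates particle positions through a velocity field. The sketch below uses classic fourth-order Runge-Kutta integration with an analytic velocity field standing in for an interpolated grid; in a real system the field would be sampled from a partitioned, possibly time-varying (Cartesian or unstructured) grid. All names here are illustrative.

```python
def advect(field, seed, dt, steps):
    """Trace one particle through a steady velocity field with classic RK4.

    field: callable (x, y) -> (u, v) returning the local velocity
    seed:  (x, y) start position
    Returns the list of visited positions (the streamline).
    """
    x, y = seed
    path = [(x, y)]
    for _ in range(steps):
        k1 = field(x, y)
        k2 = field(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
        k3 = field(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
        k4 = field(x + dt * k3[0], y + dt * k3[1])
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        path.append((x, y))
    return path

# Rigid-body rotation: particles should stay on circles around the origin.
swirl = lambda x, y: (-y, x)
```

Interactive use amounts to calling `advect` for many seeds in parallel, which maps naturally onto the stream processors of a GPU since the particles are independent.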
Real-time interactive post-processing requires extremely low latency, which can be achieved by employing the massively parallel stream processing units available on the latest generation of graphics boards. To use GPUs efficiently, the data has to be partitioned and resampled because of the limited amount of GPU memory. To utilize remote visualization clusters for the rendering of polygonal and volume data, lossy as well as lossless low-latency compression schemes running directly on the GPU have to be developed, possibly exploiting intra-eye coherence in stereo image streams.
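The idea of exploiting intra-eye coherence can be sketched in a few lines: instead of coding both eye images independently, the right-eye image is stored as a byte-wise difference from the left-eye image before entropy coding, so the (usually small) residual between the two similar views compresses well. This is a CPU-side illustration using zlib, with hypothetical function names; the project targets GPU-side codecs.

```python
import zlib

def compress_stereo(left, right):
    """Compress a stereo frame pair, exploiting intra-eye coherence.

    left/right: raw pixel data as bytes of equal length.
    The right eye is stored as a byte-wise difference from the left,
    which entropy-codes well when the two views are similar.
    """
    delta = bytes((r - l) & 0xFF for l, r in zip(left, right))
    return zlib.compress(left), zlib.compress(delta)

def decompress_stereo(left_c, delta_c):
    """Invert compress_stereo: rebuild the right eye from left + delta."""
    left = zlib.decompress(left_c)
    delta = zlib.decompress(delta_c)
    right = bytes((l + d) & 0xFF for l, d in zip(left, delta))
    return left, right
```

The scheme is lossless; a lossy variant would quantize the residual before coding.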
The visualization systems COVISE and ViSTA FlowLib, which are enhanced and extended as a result of this project, are available to the scientific community. The partners involved collaborate with researchers and industry from different application domains to interpret large, transient data sets obtained from simulations and measurements. As a result of this project, tools will be made available that foster the understanding of processes in engineering (e.g. the reduction of CO2 emissions in coal combustion) and medicine (e.g. the formation of edema due to apoplexy) which, due to their complexity, could so far not be analyzed intuitively by means of interactive visualization.
VisPME is a project partially funded by the German Federal Ministry of Education and Research (BMBF) within the program “IKT 2020 – Forschung für Innovationen” in the call “HPC-Software für skalierbare Parallelrechner”. The consortium consists of seven project partners. The project runs for 36 months and started on 1 January 2009.
High Performance Computing Centre Stuttgart (HLRS) | Regional Computing Centre of the University of Cologne | Center for Computing and Communication of RWTH Aachen University | Max Planck Institute for Neurological Research | Aachen Institute for Computational Engineering Science | RECOM Services GmbH | NVIDIA GmbH
• Florian Niebling
University of Stuttgart, HLRS