MASSIVELY PARALLEL COMPUTING AND VISUAL COMPUTING
NVIDIA® Parallel Nsight™, in combination with Visual Studio, makes GPU application development for massively parallel computing easier than ever. Through its native GPU debugging and profiling feature set, Parallel Nsight provides the most efficient way to debug, profile, and optimize GPU code. In addition, Parallel Nsight provides visibility into the heterogeneous execution of the application with its Analysis trace to maximize multi-core CPU utilization and multi-GPU, multi-API acceleration.
Whether you are a scientist looking to solve your research problems 10 times faster, an application developer taking advantage of the GPU for advanced 3D graphics visualization and scientific processing, or a graphics developer pushing the limits of DirectX, Parallel Nsight enables you to reach these goals more efficiently than any other development environment.
CUDA DEVELOPMENT
NVIDIA Parallel Nsight for GPU Compute Development
NVIDIA Parallel Nsight software is the industry’s first development environment for massively parallel computing integrated into Microsoft Visual Studio, the world’s most popular development environment. Parallel Nsight is a powerful tool that allows programmers to develop for both GPUs and CPUs within Microsoft Visual Studio.
*NEW* for Parallel Nsight 2.1
- CUDA 4.1 Support.
- New CUDA Warp Watch view and CUDA Info page for an improved debugging experience in massively threaded applications.
- Advanced CUDA Profiler experiments for deeper performance analysis of kernels.
- Support for Optimus laptops, which enables a fully featured Parallel Nsight experience for CUDA developers on a single system.

CUDA DEBUGGER
> *NEW* CUDA Info page gives detailed information about the state of CUDA launches in the user’s application. Users can filter and find detailed information about exceptions, asserts, breakpoints, and MMU faults, and easily switch to a specific warp of interest to debug problems.
> *NEW* CUDA Warp Watch provides a more efficient way to navigate through the resident threads and visualize thread states across a warp.
> *NEW* System Information page makes system information available over the connection to the monitor and gives more detailed information about every CUDA device, such as driver model, GPU architecture, memory, and more.
> *NEW* GPU break when a CUDA assertion is encountered.
> Debug CUDA C/C++ and DirectCompute kernels directly on GPU hardware.
> Examine thousands of threads executing in parallel using the familiar Locals, Watch, Memory and Breakpoints windows in Visual Studio.
> View GPU memory directly using the standard Memory windows in Visual Studio.
> Use conditional breakpoints to quickly identify and correct errors in massively parallel code.
> Identify memory access violations using the CUDA C/C++ Memory Checker.
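
One common source of the memory access violations mentioned above is a launch that rounds the thread count up to a whole number of blocks while the kernel lacks an `if (i < n)` guard. The following is a minimal host-side C++ sketch of that arithmetic, not Parallel Nsight functionality; the helper names `blocks_for` and `out_of_range_threads` are hypothetical and introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of blocks needed to cover n elements
// with block_size threads per block (ceiling division).
std::size_t blocks_for(std::size_t n, std::size_t block_size) {
    assert(block_size > 0);
    return (n + block_size - 1) / block_size;
}

// Hypothetical helper: number of launched threads whose global index
// falls outside [0, n). Without an `if (i < n)` guard in the kernel,
// each of these threads would touch memory past an n-element buffer.
std::size_t out_of_range_threads(std::size_t n, std::size_t block_size) {
    const std::size_t launched = blocks_for(n, block_size) * block_size;
    return launched - n;
}
```

For example, covering 1000 elements with 256-thread blocks needs 4 blocks, which launches 1024 threads and leaves 24 of them out of range. A breakpoint conditioned on the global index being at or beyond the element count is one plausible way to stop on exactly those threads.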

CUDA PROFILER AND APPLICATION TRACE
> *NEW* CUDA profiling experiments allow developers to understand performance issues caused by the following factors:
- Thread divergence or code branches;
- Memory statistics;
- Statistics on stall reasons;
- Instruction issue efficiency;
- Achieved FLOPS.
> *NEW* From traced workloads, developers can now navigate dependencies and the call stack to follow GPU workloads back to the corresponding API calls and the host code that caused the activity.
> *NEW* CUDA Trace adds support for concurrent trace of memory copies and memory sets.
> *NEW* System trace adds support for capturing data from a 64-bit process launched from a 32-bit process.
> *NEW* OpenCL 1.1 API trace support.
> *NEW* NVTX and Direct3D Performance Marker report pages now support statistics display for all CUDA, OpenCL, Direct3D, and OpenGL API calls made during a range as well as for all GPU work submitted by the API calls.
> *NEW* Correlation pane allows mining of data selected in report tables or the timeline view.
> DirectCompute shader profiling.
> Capture CPU and GPU level events, including: API calls, kernel launches, memory transfers and custom application annotations.
> Single correlated timeline displays all captured events.
> Timeline inspection tools allow for the examination of workload dependencies.
> Filter and sort captured events using specialized reporting views.
> Profile CUDA kernels using GPU performance counters.
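
To make two of the quantities named in the profiling experiments more concrete, here is a small host-side C++ sketch. It does not reproduce how Parallel Nsight measures anything; it only models the ideas. The function names `divergent_warps` and `achieved_gflops` are hypothetical, and the sketch assumes a warp of 32 threads, with the operation count and elapsed time supplied by the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

constexpr std::size_t kWarpSize = 32;  // threads per warp on NVIDIA GPUs

// Counts the warps in which threads disagree on a branch predicate.
// Such a warp executes both sides of the branch one after the other,
// which is the cost that a thread-divergence experiment exposes.
template <class Pred>
std::size_t divergent_warps(std::size_t n_threads, Pred takes_branch) {
    std::size_t divergent = 0;
    for (std::size_t base = 0; base < n_threads; base += kWarpSize) {
        const std::size_t end = std::min(base + kWarpSize, n_threads);
        const bool first = takes_branch(base);
        for (std::size_t t = base + 1; t < end; ++t) {
            if (takes_branch(t) != first) {
                ++divergent;  // at least two threads in this warp differ
                break;
            }
        }
    }
    return divergent;
}

// Achieved throughput in GFLOPS: floating-point operations divided by
// elapsed seconds, scaled by 1e9.
double achieved_gflops(double flop_count, double seconds) {
    assert(seconds > 0.0);
    return flop_count / seconds / 1e9;
}
```

In this model, branching on `t % 2 == 0` makes every warp divergent, while branching on `(t / 32) % 2 == 0` makes none divergent, because all threads of a warp then take the same path. That contrast is the usual motivation for arranging data so that branch conditions are uniform within a warp.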
GRAPHICS DEVELOPMENT
NVIDIA Parallel Nsight for GPU Graphics Development
NVIDIA Parallel Nsight software is the world’s first graphics development environment integrated into Microsoft Visual Studio, the world’s most popular development environment, enabling DirectX 10 and DirectX 11 graphics development, with native GPU debugging and API debugging, as well as advanced performance optimization.
*NEW* for Parallel Nsight 2.1
- Dynamic Shader Editing enables editing and recompiling shaders while your application is still running. This can help with debugging rendering issues, as well as testing out optimizations on the fly.
- Frame Timings page allows you to run a quick profile on a captured frame, to see various timings on each draw call for fast profiling turnaround.
GRAPHICS INSPECTOR AND DEBUGGER
> *NEW* Dynamic Shader Editing during application execution.
> *NEW* Shader Inspector page shows constant buffer with HLSL variable names.
> *NEW* Nsight HUD for graphics debugging outside of Visual Studio.
> Real-time examination of DirectX rendering calls.
> Interactive examination of GPU pipeline state, including visualization of bound textures, geometry and compute buffers.
> Pixel History shows all operations that affect a given pixel.
> Debug all HLSL graphics shaders natively on the GPU hardware.
> Examine shaders executing in parallel using the familiar Locals, Watch, Memory and Breakpoints windows in Visual Studio.
> View and interact at the source code level with all shaders loaded by the application.
> Identify shaders that affect any given primitive or pixel using conditional breakpoints.

GRAPHICS PROFILER AND APPLICATION TRACE
> *NEW* Frame Timings page displays advanced draw call timing information.
> *NEW* Frame Profiler sessions can be saved and restored, allowing the profiling results to be shared.
> *NEW* System trace adds support for capturing data from a 64-bit process launched from a 32-bit process.
> *NEW* Trace support for DirectX and OpenGL workloads, memory transfers and correlation of these back to command buffers and API calls.
> Direct3D, OpenGL and Cg API trace.
> Frame Profiler identifies performance bottlenecks and GPU utilization.
> Save frame captures for offline collaboration and analysis.