Graphics Pipeline & APIs
1. Graphics Pipeline
Graphics pipelines may either be:
- Declarative: an abstract scene definition is provided, and the system decides how to render it (e.g. Three.js and other scene-graph libraries).
- Imperative: the programmer specifies each step of the rendering process (e.g. OpenGL, WebGL, DirectX, Vulkan, Metal).
The pipeline usually consists of 7 main stages:
- Modelling Transformations: 3D models are defined in their own coordinate system, so modelling transformations orient the models within a common coordinate frame (worldspace).
- Illumination (Shading): Vertices are lit (shaded) according to material properties, surface properties and light sources, using a local lighting model.
- Viewing Transformation: The scene is transformed from world space to camera space (view space or eye space). Viewing position is transformed to the origin, and viewing direction is aligned with the negative z-axis.
- Clipping: Portions of objects outside the view frustum (the boundaries of the image plane projected into 3D) are removed, so later stages only process potentially visible geometry.
- Projection: The 3D scene is projected onto a 2D image plane, using either perspective or orthographic projection. After the perspective divide, coordinates are in Normalized Device Coordinates (NDC), which are then mapped to screen space.
- Scan Conversion: Objects are rasterized into pixels, interpolating vertex values across each primitive.
- Display: Occlusion is handled using a depth buffer (z-buffer) and transparency using blending, and the final image is output to the screen.
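The transform chain through these stages can be sketched numerically. Below is a minimal pure-Python illustration (the 60° field of view, 16:9 aspect ratio and 1920x1080 viewport are assumptions for the example, not part of the notes) that pushes a single vertex through the modelling, viewing and projection transforms, the perspective divide into NDC, and a viewport mapping to screen space:

```python
import math

def mat_mul(a, b):
    # 4x4 matrix product
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform(m, v):
    # apply a 4x4 matrix to a homogeneous point
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def translate(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

def perspective(fov_y_deg, aspect, near, far):
    # symmetric perspective frustum, OpenGL-style (camera looks down -z)
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2)
    return [[f / aspect, 0, 0, 0],
            [0, f, 0, 0],
            [0, 0, (far + near) / (near - far), 2 * far * near / (near - far)],
            [0, 0, -1, 0]]

# Modelling transform: place the model 5 units in front of the camera;
# viewing transform is the identity since the camera already sits at the
# origin looking down -z.
model = translate(0, 0, -5)
view  = translate(0, 0, 0)
proj  = perspective(60, 16 / 9, 0.1, 100)
mvp   = mat_mul(proj, mat_mul(view, model))

# A vertex at the model's local origin
clip = transform(mvp, [0, 0, 0, 1])
ndc  = [c / clip[3] for c in clip[:3]]        # perspective divide -> NDC

# Viewport transform: NDC [-1, 1] -> a 1920x1080 window (y flipped)
w, h = 1920, 1080
screen = ((ndc[0] + 1) * w / 2, (1 - ndc[1]) * h / 2)
print(screen)  # (960.0, 540.0) -- the centre of the screen
```

The vertex sits on the camera axis, so it lands exactly in the middle of the viewport; moving `model` off-axis shifts the screen position accordingly.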
2. Hardware Implementation
To achieve real-time performance, rasterization is based on graphics primitives (e.g. lines, triangles) and implemented in hardware using a Graphics Processing Unit (GPU), controlled by an API such as OpenGL. Certain parts of the pipeline are programmable using shaders, which can be inserted at two points:
- Vertex Shaders: executed for each vertex, allowing custom transformations and lighting calculations.
- Fragment Shaders: executed for each pixel fragment, allowing custom color and texture calculations.
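A toy sketch of how these two programmable stages slot into a fixed pipeline skeleton. This is plain Python, not a real shading language; the function names, the colour varying and the midpoint interpolation are all hypothetical, chosen only to show the division of labour (per-vertex transform, per-fragment colour, with the fixed pipeline interpolating varyings in between):

```python
def vertex_shader(position, color, mvp):
    # runs once per vertex: transform to clip space, pass colour through
    clip = [sum(mvp[i][k] * position[k] for k in range(4)) for i in range(4)]
    return clip, {"color": color}

def fragment_shader(varyings):
    # runs once per fragment: here, just output the interpolated colour
    return varyings["color"]

def lerp(a, b, t):
    return [x + (y - x) * t for x, y in zip(a, b)]

identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
_, v0 = vertex_shader([0, 0, 0, 1], [1.0, 0.0, 0.0], identity)  # red vertex
_, v1 = vertex_shader([1, 0, 0, 1], [0.0, 0.0, 1.0], identity)  # blue vertex

# The fixed-function rasterizer interpolates varyings across the primitive;
# here we sample the midpoint of the edge by hand:
mid = {"color": lerp(v0["color"], v1["color"], 0.5)}
print(fragment_shader(mid))  # [0.5, 0.0, 0.5] -- halfway between red and blue
```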
2.1 Application Stage
The application stage maintains a database of scene objects, each defined in its own local coordinate system, along with acceleration structures (e.g. octrees). It also runs simulations, input event handlers, database traversal and utility functions.
2.2 Geometry Stage
First, we do vertex processing, taking an input vertex stream composed of arbitrary vertex attributes (position, normal, color, etc), transforming it into a stream of vertices mapped onto the screen (using ModelViewProjection Matrix), composed of their clip space coordinates and varying attributes (interpolated across the primitive). This is programmable by the vertex shader.
Afterwards, the geometry stage performs clipping (primitives outside the view volume are clipped), projection (clip space is projected onto the image plane) and the viewport transform (which maps normalized device coordinates to a rectangular window in the frame buffer).
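A minimal sketch of the clip test and the viewport transform, assuming the OpenGL convention that a clip-space point (x, y, z, w) lies inside the view frustum iff -w <= x, y, z <= w (function names are illustrative):

```python
def inside_frustum(clip):
    # OpenGL convention: (x, y, z, w) is inside the frustum
    # iff -w <= x, y, z <= w (with w > 0)
    x, y, z, w = clip
    return w > 0 and all(-w <= c <= w for c in (x, y, z))

def viewport(ndc_x, ndc_y, width, height):
    # map NDC [-1, 1] to window pixel coordinates (y flipped)
    return ((ndc_x + 1) * width / 2, (1 - ndc_y) * height / 2)

print(inside_frustum([0.0, 0.0, 0.5, 1.0]))   # True
print(inside_frustum([2.0, 0.0, 0.5, 1.0]))   # False: x > w
print(viewport(-1.0, 1.0, 800, 600))          # (0.0, 0.0) -- top-left corner
```

In practice primitives straddling the frustum boundary are not just discarded but split, generating new vertices on the clip planes; the per-point test above is only the acceptance criterion.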
Aside: Geometry Shader
Finally a Geometry Shader may optionally be applied, which has full knowledge of the primitive and can even generate procedural primitives.
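As a sketch of this amplification ability, the hypothetical function below expands a single input point into a two-triangle quad, in the spirit of a billboard-generating geometry shader (the 2D coordinates and function name are assumptions for the example):

```python
def expand_point(center, size):
    # geometry-shader-style amplification: one input point becomes
    # two triangles forming an axis-aligned quad (a "billboard")
    x, y = center
    h = size / 2
    tl, tr = (x - h, y + h), (x + h, y + h)
    bl, br = (x - h, y - h), (x + h, y - h)
    return [(tl, bl, br), (tl, br, tr)]

tris = expand_point((0.0, 0.0), 2.0)
print(len(tris))  # 2 triangles emitted from 1 input point
```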
2.3 Rasterization Stage
This stage starts with primitive assembly (backface culling is performed and each primitive is set up for traversal), then primitive traversal (scan conversion), which samples triangles into fragments and interpolates vertex attributes across them. A programmable fragment shader then computes fragment colors, which are later merged into pixel colors via pixel ownership tests, scissor tests, stencil tests, depth tests and blending.
Rasterization has fill rules for each primitive type. These rules must be unambiguous (e.g. a pixel on an edge shared by two triangles is covered by exactly one of them), but sampling at discrete pixel locations can generate aliasing artifacts.
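The traversal, interpolation and depth-test steps above can be sketched with an edge-function rasterizer in plain Python. This is an illustrative toy, not how hardware is organised: it samples pixel centres, uses barycentric weights to interpolate depth, culls clockwise (back-facing) triangles and keeps the nearest fragment in a z-buffer:

```python
def edge(a, b, p):
    # signed area of triangle (a, b, p); positive if p is left of a->b
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterize(tri, depths, width, height, zbuf, out):
    area = edge(tri[0], tri[1], tri[2])
    if area <= 0:                 # backface culling (CCW = front-facing)
        return
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)            # sample at the pixel centre
            w0 = edge(tri[1], tri[2], p)
            w1 = edge(tri[2], tri[0], p)
            w2 = edge(tri[0], tri[1], p)
            if w0 >= 0 and w1 >= 0 and w2 >= 0:        # inside all 3 edges
                b0, b1, b2 = w0 / area, w1 / area, w2 / area
                z = b0 * depths[0] + b1 * depths[1] + b2 * depths[2]
                if z < zbuf[y][x]:                     # depth test: nearer wins
                    zbuf[y][x] = z
                    out[y][x] = 1

w, h = 8, 8
zbuf = [[float("inf")] * w for _ in range(h)]
out  = [[0] * w for _ in range(h)]
rasterize([(0, 0), (8, 0), (0, 8)], (0.5, 0.5, 0.5), w, h, zbuf, out)
covered = sum(map(sum, out))
print(covered)  # 36 of the 64 pixels are covered by the triangle
```

The same barycentric weights (b0, b1, b2) would interpolate any other vertex attribute (colour, texture coordinates) exactly as they interpolate depth here.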
2.4 Display Stage
Finally, the display stage reads pixel values from the frame buffer and sends them to the display device. This may involve color space conversion and gamma correction.
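Gamma correction is a per-channel transfer function applied on the way to the display; a sketch using the standard sRGB encoding curve (the linear/power-law split at 0.0031308 is part of the sRGB specification):

```python
def linear_to_srgb(c):
    # standard sRGB transfer function: linear light -> display-encoded value
    if c <= 0.0031308:
        return 12.92 * c
    return 1.055 * c ** (1 / 2.4) - 0.055

mid_grey = linear_to_srgb(0.5)
print(round(mid_grey, 3))  # 0.735 -- linear mid-grey encodes well above 0.5
```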
3. Graphics Architecture
Graphics hardware is a shared resource, so a user mode driver (UMD) is mapped into each process to prepare command buffers for the hardware. The Graphics Kernel Subsystem schedules access to the hardware, and the Kernel Mode Driver (KMD) submits the prepared command buffers to it.
4. Graphics APIs
- OpenGL is a cross-language, cross-platform API.
- OpenGL ES is a subset of OpenGL for embedded systems (mobile, web).
- DirectX is a Microsoft API for Windows and Xbox.
- Vulkan is a successor to OpenGL, designed for high performance and low CPU usage.
- WebGL is a JavaScript API for rendering 3D graphics in web browsers.