Crash Course in HLSL

What does HLSL stand for? Why was it created? What does an HLSL effect file look like? What can you do with HLSL? What do float4x4, TEXCOORD0 or compile ps_3_0 mean? The answer to the first question is simple: HLSL means High Level Shader Language. This answer by itself might raise a few questions. The answers to these and a few other questions can be found in this article. You will first learn about the history of HLSL and why it came into existence. After that, you will see the basic structure of an HLSL effect file, and learn about the different elements of the language. Finally, after looking over the language’s basics, you will examine the template effect file that XNA provides.

History of HLSL

In more than one way, HLSL is to GPU programming what C is to CPU programming. You can use both of them well without caring much about what came before them. However, just as with C, knowing what came before brings a better understanding of what makes the language special, so, much like computer science students who learn about CPU history, you will start with a brief history of GPUs.

As you probably know already, a GPU (Graphics Processing Unit) is the component inside your PC or console that has a single purpose: to process and display computer graphics. The ancestors of modern GPUs are the graphics chips present in 80’s PCs, which had special circuits dedicated to combining two or more bitmaps into one image. However, the history of interest starts with the release of the first 3D hardware accelerator, in 1996. This was the 3dfx Voodoo card, a separate hardware component that implemented 3D functions. The 3D functionality was soon integrated with the classic 2D video cards to produce a single chip, the grandfather of current GPUs. The most popular video cards of that generation were the Voodoo, TNT, GeForce and Radeon series.

Another step towards current GPUs was the appearance of hardware support for Transform and Lighting, which meant that the video card could handle transformation of geometry in 3D space, clipping and lighting of the polygons, all in hardware. At this moment in time, all 3D capabilities were implemented as fixed-functions (thus called the Fixed Function Pipeline), which meant that you could only use the transformations and lighting models provided by the card. This caused many games of that time to look very similar.

To further enhance the realism and flexibility of computer graphics, the video card manufacturers could take one of two possible paths. One would have been to simply add more and more fixed functionality, to cover the many features that programmers desired. This would have required an exponential growth in the number of circuits on the board, as well as in the number of variables and states in the APIs. The second option, which is fortunately the one that was chosen, was to allow small programs to be written, which would be executed for each vertex or each pixel processed by the video card. This gives developers the flexibility to process graphics in any way they want (within the limits of the hardware, of course).

These small programs are called shaders, and they were introduced in DirectX 8.0. A set of specifications, called Shader Model 1, detailed their possible functionality. In this version of Direct3D, programmers had to write shaders in a language similar to assembly; you can see an example of a vertex shader written in this assembly-like language below.

vs_1_1
dcl_position v0
dcl_color v1
m4x4 oPos, v0, c0
mov oD0, v1

Shaders written in assembly were hard to read, understand and maintain. Besides this, Shader Model 1 allowed for very few instructions, and limited functionality. And just like CPU programming was blessed with higher level programming languages to make programming easier, DirectX 9.0 brought Shader Model 2.0, and together with it the High Level Shading Language.

HLSL is the language used to write shaders for GPUs in DirectX. It is syntactically similar to C, but has its own data types and program structure. It makes the graphics programmer’s life easier by providing elements of high-level programming languages, such as named variables, functions, expressions, statements, standard and user-defined data types, and preprocessor directives. This makes shaders easier to write, read, understand and debug.

Before we dive into HLSL syntax and structure, you need to look at the graphics-processing pipeline.

The DirectX Graphics Processing Pipeline

To be able to understand how shaders work and what each type of shader does, you need to see how 3D objects are processed and displayed. The Direct3D 9 Graphics Pipeline (also used by XNA) is outlined in the following figure.

[Figure: the Direct3D 9 graphics processing pipeline]

Each block in the figure has a specific purpose in the rendering process. Data comes in through the vertex data, primitive data and textures. The vertex data contains untransformed model vertices, which are stored in vertex buffers. The primitive data specifies geometric primitives, like points, lines, triangles, and polygons. These are defined by groups of vertices from the vertex data, indexed through index buffers. In theory, the tessellation unit converts higher-order primitives (like N-patches or displacement maps) to vertices and stores them in vertex buffers. However, few (if any) GPUs actually implement this functionality. In newer versions of Direct3D (Direct3D 11), this stage of the pipeline was reborn and is able to do wonderful things, but for now, this functionality is not available in XNA.

The vertex processing stage is one of interest. Here, the vertices stored in the vertex buffer are read, and various transformations are applied to them before they are sent forward to the geometry processor. This is where the first type of shader comes into play: the vertex shader. When you write vertex shaders, your task is to transform vertices given in object coordinates into vertices in screen-projected coordinates. You can learn about the different coordinate spaces in the Creator’s Club site’s Education section. For now, remember that vertices read from a vertex buffer are multiplied by some matrices, and end up being projected onto the 2D screen. Together with the projected position, other attributes are usually passed along, like color, normals and texture coordinates.

In the geometry processing stage, several algorithms are applied on the geometry formed by the transformed vertices. These algorithms include clipping (geometry which is off the screen is removed), back face culling (geometry facing away from the view direction is culled), rasterization (triangles defined by vertices are transformed into pixels). The rasterizer takes three points of a triangle together with their attributes, and interpolates these values for each pixel inside those triangles. These values are passed forward to the pixel processors.

The pixel processing stage is the home of the second type of shader you will learn about, namely the pixel shader. A pixel shader receives as input data coming from the rasterizer. This includes texture coordinates, normals, binormals, and many others. Inside the pixel shader, you need to use this data, together with data read from textures, to output a single color. This color is taken by the pixel rendering stage, and it is modified by alpha blending, depth, and stencil testing, before finally being written into the framebuffer.

The two blocks we haven’t talked about are textures and texture samplers. Textures are memory blocks of data, usually representing arrays of colors. These are accessed by pixel shaders (and in some cases by vertex shaders) through texture samplers which specify the addressing mode and filtering that are used when accessing a certain texture.

This is the general view of the graphics pipeline. It is good to have it in mind, in order to understand where and when each shader is executed. Now that we have this in place, let us look at the two types of shaders we have seen.

Vertex Shaders

As you saw earlier, vertex shaders are programs that are executed in the “vertex processing” stage of the pipeline. Thus, a vertex shader is responsible for a number of things:

  • Coordinate Space Transformations. This is normally composed of three transformations. The world transformation positions and rotates objects in space, relative to the world coordinate system. The view transformation moves all vertices into view space, i.e. the space relative to the camera. In view space, the camera is at the origin of the system. The last transformation is the projection transformation, which converts the 3D triangles and polygons in view space into 2D triangles and polygons, which can be rendered on the screen. A sketch of these three transformations follows this list.
  • Some animation techniques happen in the vertex shader. While this is actually part of the world transformation, it is worth mentioning separately, to keep it in mind.
  • Light and color computations can also be done in a vertex shader.
  • Vertex shaders should output any variables and values that might be needed by the pixel shader for advanced effects.
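As a quick illustration of the three coordinate space transformations, here is a minimal HLSL sketch. The parameter names World, View and Projection are assumptions, chosen to match the convention used by the XNA template shown at the end of this article.

float4x4 World;       // object space -> world space
float4x4 View;        // world space  -> view (camera) space
float4x4 Projection;  // view space   -> projection (screen) space

float4 TransformVertex(float4 objectPosition : POSITION0) : POSITION0
{
    float4 worldPosition = mul(objectPosition, World);
    float4 viewPosition = mul(worldPosition, View);
    return mul(viewPosition, Projection);
}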

Vertex shaders are implemented as functions containing a list of instructions. The maximum length of a shader is determined by the shader version used, and the code that results after the compilation of the HLSL code into assembly. The shader version also determines the instructions that are available inside a vertex shader. As a general rule, higher shader versions offer more features, but lower versions are supported on more system configurations. The Xbox 360 shader version is a super-set of vs_3_0, which means that it has the features of vs_3_0, and a few more besides them.

The simplest example of a vertex shader written in HLSL can be seen below.

float4x4 WorldViewProjection;
float4 VertexShaderFunction(float4 inputPosition : POSITION) : POSITION
{
    return mul(inputPosition, WorldViewProjection);
}

This shader transforms the vertex from model space into projection space, by multiplying it with the WorldViewProjection matrix. The application is responsible for setting this parameter to an appropriate value.

As you can see, the code is similar to C, and looks much better than the assembly one. Don’t worry if there are some things you don’t yet understand. Some of them will be explained later, and others you’ll understand with more practice. The data types float4x4 and float4 are some of the data types of HLSL; mul is an intrinsic function, and the two instances of POSITION are called semantics. There are also other semantics, for outputting different types of information, but we will get to them later. For now, we move forward to take a look at pixel shaders.

Pixel Shaders

As vertex shaders work on vertices, pixel shaders work on pixels. Before a pixel is written to the frame buffer, it is first passed through the pixel shader. This is represented by the pixel processing stage. As a rule, a pixel shader needs to output a single color. The value of this color may be computed in several ways, taking into account the texture, ambient light, directional lights, shadows, material type, etc. The inputs the shader uses for this are attributes coming from the vertex shader (interpolated by the rasterizer), shader parameters coming from the application, and texture data coming from the texture samplers. Pixel shader length and complexity are also limited by the shader version used when compiling.

Take a look at a very simple example of a pixel shader.

float4 PixelShaderFunction() : COLOR0
{
    return float4(1, 0, 0, 1);
}

This very simple shader makes each pixel red. The returned value, float4(1, 0, 0, 1), represents the color of the pixel, stored as an array of floats, each representing one element of the RGBA (Red-Green-Blue-Alpha) color mix. In a real example, a pixel shader is rarely this simple. It usually has some input parameters, and more complex computations, including reading from textures and computing lighting.

Effect Files

So far we have talked about shaders. After having the ability to write vertex and pixel shaders, the next logical step is to combine the two in one place. The combination of these types of shaders, together with a set of other states that control how the graphics pipeline functions, is called an effect.

If you look again at the graphics pipeline, you can imagine an effect as a block that replaces or controls a few of the blocks in the diagram: Vertex Processing, Geometry Processing, Pixel Processing, Texture Samplers, and a part of Pixel Rendering. The Vertex Processing and Pixel Processing are implemented by writing programmable shaders. The other blocks can be controlled by setting the pipeline state, i.e. setting values to a set of specific variables.

Effects also provide a convenient way to write shaders for multiple hardware versions. For this, effects have the concept of techniques. A technique is a collection of one or more passes. Each pass defines a certain way of rendering the object. To do this, a technique and its passes encapsulate global variables, pipeline states, texture sampler states and shader states. This way, you can write different shader versions, which rely on different functionality from the hardware, and encapsulate them in techniques. Then, at runtime, you select the proper technique based on the hardware the application is running on. On a high-end machine you will probably want to use a technique that contains more complex shaders, and on a low-end machine, a simpler and faster one. The shaders themselves are defined as functions, and assigned to two pipeline variables, named VertexShader and PixelShader.

The general layout of an effect file can be seen below.

//parameter declarations
[…]
//data type declarations
[…]
//function declarations
[…]
//technique declarations
[…]

This probably is not very descriptive, but in the rest of the article, I will explain each element.
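To make the layout a bit more concrete, here is a minimal sketch of a complete effect file. The names used (WorldViewProjection, SolidColor, SolidColorTechnique and so on) are made up for illustration; the XNA template at the end of the article is a fuller example.

//parameter declarations
float4x4 WorldViewProjection;
float4 SolidColor = float4(1, 1, 1, 1);

//function declarations
float4 VS(float4 position : POSITION0) : POSITION0
{
    return mul(position, WorldViewProjection);
}

float4 PS() : COLOR0
{
    return SolidColor;
}

//technique declarations
technique SolidColorTechnique
{
    pass Pass0
    {
        VertexShader = compile vs_1_1 VS();
        PixelShader = compile ps_1_1 PS();
    }
}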

Data Types

Let us take a quick look at the data types available in HLSL. After this, in the next section, you will see how to declare variables and parameters using these data types. The simplest types are the scalar types, listed below.

Type – Value
bool – true or false
int – 32-bit signed integer
half – 16-bit floating point
float – 32-bit floating point
double – 64-bit floating point

The float type is the native type used inside GPUs, and this is what you will most often use. Because of this, some GPUs that might not support other types (int, half, double) will emulate them using float. Integers usually fall into this category, and because of this emulation, the full range of 32-bit integers will not be covered by a 32-bit floating-point representation.

Right after scalar types, vector types come into play. A vector contains between one and four components of a scalar type. A vector type is composed of a scalar type immediately followed by a number that indicates the number of components of the vector. Some examples include: float3, half2, int4, double2. An advantage of vector types, besides the ease of use, is the fact that operations done on them are applied simultaneously to all components. For example, adding two float4 variables adds the corresponding components in a single instruction. To access individual components of a vector, you can use two sets of accessors: xyzw and rgba. As an example, for a variable of type float4 named var, writing var.x or var.r gives access to the first component of var; var.y and var.g to the second, and so on. Usually, when using a variable as a position, you would use the .xyzw accessors, and when using it as a color, the .rgba accessors.
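A short sketch of these accessors in use (the function and variable names are made up for illustration):

float4 AccessorExample() : COLOR0
{
    float4 color = float4(0.5, 0.25, 0.0, 1.0);
    float4 position = float4(1, 2, 3, 1);

    float red = color.r;            // the same component as color.x
    float3 xyz = position.xyz;      // take the first three components at once ("swizzling")
    float4 doubled = position * 2;  // the operation is applied to all components simultaneously
    float4 swapped = color.bgra;    // components can also be reordered

    return float4(xyz, red) + doubled + swapped;
}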

The next data structure that is common in computer graphics is the matrix type. A matrix type is defined similarly to vectors, but it uses two dimensions. In the definition, the scalar type is followed by the number of rows, an x, and the number of columns: int4x2, double1x3, float4x4. Usually, you’ll just use matrices in multiplications, but if you ever need to access a certain component of a matrix, there are a few ways of doing this: a zero-based notation, which has the form of an underscore, followed by the letter m, the row number, and then the column number; a one-based notation, which has the form of an underscore, followed by the row and column numbers; and finally, access can be done as in an array, by specifying the row and column numbers in square brackets, zero-based. Assuming we have a variable mvar of type float4x4, all of the following notations access the same component, the one on the second row and third column: mvar._m12, mvar._23, mvar[1][2].
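The three notations side by side, in a short sketch (mvar here is the same hypothetical float4x4 variable from the example above):

float4x4 mvar;

float4 MatrixAccessExample() : COLOR0
{
    float a = mvar._m12;   // zero-based notation: second row, third column
    float b = mvar._23;    // one-based notation: the same component
    float c = mvar[1][2];  // array-style access, zero-based: again the same component
    return float4(a, b, c, 1);
}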

A texture type is the data type that represents a texture object. The keyword for this data type is simply texture. Another data type linked to textures is the sampler, which contains a sampler state, i.e. the texture to be sampled, and the filters that should be used. The common sampler types are sampler, sampler1D, sampler2D, sampler3D and samplerCUBE. We will go into more details about using textures and samplers later.

The last important data type is the structure. A structure is a user-defined data type that has a collection of members, of other data types. The keyword struct is used to define a structure, after which the structure name may be used. To access components of a structure, the structure operator (.) is used. An example of a structure that contains two members, a position and a color, can be seen below.

struct demoStruct
{
    float3 position;
    float4 color;
};

These are the most important data types you will get to use when writing HLSL code.

Intrinsic Functions

HLSL comes with a large set of intrinsic functions (functions defined by the language) that offer access to commonly used functionality. The parameters for these functions depend on what the function does, but they are always matrices, samplers, scalars or vectors. You can see the full set of intrinsic functions here.
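A few of the intrinsics you will meet most often, in a short sketch (the surrounding function and variable names are made up for illustration; the intrinsics themselves, such as dot, normalize, saturate and lerp, are part of HLSL, as is the mul function used earlier):

float3 LightDirection;

float4 IntrinsicsExample(float3 normal : TEXCOORD0) : COLOR0
{
    float3 n = normalize(normal);                        // make the normal unit-length
    float diffuse = saturate(dot(n, -LightDirection));   // dot product, clamped to [0..1]
    // blend linearly between a dark and a bright color based on the diffuse term
    return lerp(float4(0.1, 0.1, 0.1, 1), float4(1, 1, 1, 1), diffuse);
}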

Writing HLSL Shaders

Until now, you saw the building blocks that you will use from now on to write shaders. As a reminder, there are two types of shaders. The vertex shader is responsible for transforming vertices, while the pixel shader is responsible for determining the final color of a pixel on the screen. In HLSL, you can use several data types, which were created for the specific purpose of computer graphics. The language also offers a set of functions that provide useful functionality in many cases. Now you will learn how to put these building blocks together.

Declaring Variables

As in any programming language, to declare a variable, you need to specify its type and name. The type can be any of the data types mentioned earlier. Arrays are defined using brackets. You can also initialize a variable during its declaration. Below, you can see some examples of declarations.

float float_variable1;
float float_variable2 = 3.4f;
float3 position = float3(0,1,0);
float4 color = 1.0f;
float4x4 BoneMatrices[58];

A storage class and/or a type modifier may precede a variable declaration. The storage class modifiers specify the scope and lifetime of the variable, while the type modifiers control how it can be used or stored. An example using a few of these modifiers follows the two lists below.

Storage class modifiers:

  • extern – a global variable is an external input to the shader; globals are considered extern by default
  • nointerpolation – do not interpolate the output of a vertex shader when passing it to a pixel shader
  • shared – the variable is shared between several effects
  • static – specifies that a local variable is initialized once, and keeps its value between function calls
  • uniform – the variable has a constant value throughout the execution of a shader; global variables are considered uniform by default
  • volatile – a variable that changes its value often; only for local variables

Type Modifiers:

  • const – the variable cannot be changed by the shader; it must be initialized in the declaration
  • row_major – the matrix is stored with each row in a single constant register
  • column_major – the matrix is stored with each column in a single constant register, which can optimize matrix math
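A minimal sketch of a few of these modifiers in use (the variable and function names are made up for illustration):

shared float4x4 ViewProjection;      // shared between several effects
const float PI = 3.14159265f;        // cannot be changed by the shader; initialized here
row_major float4x4 WorldMatrix;      // controls how the matrix is laid out in constant registers

float4 ScaleExample(float4 position : POSITION0, uniform float scale) : POSITION0
{
    // 'uniform' makes the scale parameter behave like a global input set by the application
    float4 scaled = float4(position.xyz * scale, position.w);
    return mul(scaled, mul(WorldMatrix, ViewProjection));
}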

Semantics and annotations can also follow variable names, but we will return to that a little later.

Shader Inputs

In order to generate some interesting output, shaders need some input data. This data can be of two forms: uniform inputs and varying inputs.

Uniform Inputs

Uniform inputs consist of data that has a constant value across multiple executions of a shader. Typically, uniform inputs hold data like material colors, textures, and world transformations.

There are two ways to specify uniform input. The first, and most commonly used, is through global variables. You declare these outside the shader functions, and use them inside the shader functions. The second way is to mark an input parameter of a function with the uniform storage class modifier.

float4x4 WorldMatrix : WORLD; //this is a global variable, with the WORLD semantic
float4 AFunction(PSIn input, uniform float4 another_var)
{
   //the values of WorldMatrix and another_var are constant through multiple executions
}

In the code above, we declared another_var with the uniform modifier. This makes it act like a global variable.

The uniform variables are stored in a constant table, which can be accessed from the application. We will usually access them from XNA using EffectParameters. The Effect class has a collection of parameters in its Parameters member. Thus, we can access the global variables by index, by name, or by semantic. Assuming we have an effect loaded inside a variable called effect, the code below shows the three ways of accessing the WorldMatrix variable (from the above example).

effect.Parameters[0]
effect.Parameters["WorldMatrix"]
effect.Parameters.GetParameterBySemantic("WORLD")

To set the value of a parameter from an application, you need to use the SetValue function. The type of the parameter given to the function has to match the type of the shader variable.

//correct
effect.Parameters["WorldMatrix"].SetValue(Matrix.CreateTranslation(10, 0, 10));
//incorrect – runtime error
effect.Parameters["WorldMatrix"].SetValue(Vector3.Zero);

If you ever need to (though it is not recommended, for performance reasons), you can read the value of a variable using the GetValueXXX functions, where XXX specifies the data type to be read.

effect.Parameters["WorldMatrix"].GetValueMatrix();

Varying Inputs

Varying inputs represent data that is unique to each execution of the shader. For example, each time a vertex shader is executed, it has as input a different position, different texture coordinates, and so on.

Varying input parameters are declared as input parameters in the shader functions. In order for the shader to compile, each parameter has to be marked with a semantic. If a parameter does not have an attached semantic (to make it a varying input), or a uniform modifier (to make it a uniform input), the compilation of the shader will fail.

So what is a semantic? A semantic is a name used to specify how data is passed from one part of the graphics pipeline to another. For example, the POSITION0 semantic specifies that a variable should be filled with data representing the position of a vertex. The exact process of how data is extracted from streams of bytes and put into variables with the corresponding semantics will be detailed later in this article, in the Vertex Declarations section. Vertex shaders use semantics to link to data from the vertex buffers sent by the application, while pixel shaders use semantics to link to data from the outputs of the vertex shader.

Semantics need to be specified both for input variables and for output variables, to let the graphics pipeline know how to move the data around. To assign semantics to variables, two methods are used: adding the semantic after the variable declaration, using a colon, or defining a data structure where each member has an attached semantic. You can also specify an output semantic by attaching it to the shader function name.

float4 PixelShaderFunction(float2 TexCoords:TEXCOORD0) : COLOR0
{}

In the above code, the TEXCOORD0 semantic specifies that the variable TexCoords should contain the first channel of texture coordinates. The semantic COLOR0 is attached to the function, which means that the value returned by PixelShaderFunction will be assigned to the color of that pixel. If you want to specify an output variable in the parameter list of the function, you need to add the out modifier to it. You will often encounter input data specified as structures, like the ones seen below.

struct VertexShaderInput
{
    float4 Position : POSITION0;
    float3 Normal : NORMAL0;
    float2 TexCoord : TEXCOORD0;
};
struct VertexShaderOutput
{
    float4 Position : POSITION0;
    float2 TexCoord : TEXCOORD0;
    float3 Normal : TEXCOORD1;
};

The VertexShaderInput structure holds three variables, each of them linked to a specific data in the input vertex stream: the position of a vertex, the normal, and the texture coordinates. If the vertex declaration and the vertex buffer do not contain all elements specified by the semantics, the corresponding variables are set to 0. The VertexShaderOutput structure outputs data from the shader, and puts it in place, so a pixel shader can read it. A vertex shader that uses these structures is declared below.

VertexShaderOutput VertexShaderFunction(VertexShaderInput input)
{
    VertexShaderOutput output;
    [...]//compute members of the output based on the input
    return output;
}

The data output by a vertex shader is interpolated before being given as input to the pixel shader. The minimum output of a vertex shader is the position of the vertex, using the POSITION semantic.

To use the above data, a pixel shader can have the following declaration.

float4 PixelShaderFunction(VertexShaderOutput input) : COLOR0

Inside the function you can use all members of the input structure, except Position, which is hidden from the pixel shader; using it would generate a compilation error. A pixel shader must always output a COLOR0 value of type float4.

You saw how data is passed from the application, to the vertex shader, and then to the pixel shader. However, when drawing objects on the screen, we also want to draw those using textures. Next, you will see how to use textures and samplers as inputs to shaders.

Textures and Samplers

To read data from a texture, it is not enough to have a variable of type texture. You also need a sampler, which specifies which texture to read from and how to do the reading, and a sampling instruction, which executes the actual read at some given coordinates.

A sampler declaration needs to contain a sampler state. You can see the most important elements of a sampler state below.

  • AddressU – specifies the addressing mode for the U texture coordinate
  • AddressV – specifies the addressing mode for the V texture coordinate
  • MagFilter – specifies the magnification filter to use when sampling from a texture
  • MinFilter – specifies the minification filter to use when sampling from a texture
  • MipFilter – specifies the mipmap filter to use when sampling from a texture

Below you can see an example of a sampler declaration.

Texture DiffuseTexture;
sampler TextureSampler = sampler_state
{
    Texture = (DiffuseTexture);
    MinFilter = Linear;
    MagFilter = Linear;
    MipFilter = Linear;
    AddressU = Wrap;
    AddressV = Wrap;
};

This sampler will be used to read data from the DiffuseTexture texture, using Linear filtering, and wrapping the texture coordinates if they are outside the [0..1] range.

Now that you have a texture and a sampler, you want to read data from that texture, using the sampler. To do this, use the sampling functions (mentioned earlier in the Intrinsic Functions section). The most common are tex2D, tex3D and texCUBE. They take as parameters a sampler variable and texture coordinates (of type float2 for tex2D, and float3 for tex3D and texCUBE). The example below reads from a 2D texture using the TextureSampler sampler, and the texture coordinates received from the vertex shader.

float4 PixelShaderFunction(float2 TexCoords : TEXCOORD0) : COLOR0
{
    return tex2D(TextureSampler, TexCoords);
}

Flow Control

Depending on the shader model, HLSL also supports flow control instructions, like branching and looping.

The simplest form of branching for the video hardware is static branching. This allows blocks of code to be executed or skipped based on a shader constant. You can set the value of that constant between draw calls to make the shader behave differently for each model you draw. However, the block of code is enabled or disabled for the object as a whole.

For programmers, branching as it is done on the CPU is more familiar. The comparison condition is evaluated for each pixel or vertex at run time, so different code paths may be taken for different parts of the model. This is called dynamic branching. While more flexible and convenient, dynamic branching incurs a performance cost, and is not available on all hardware.
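A minimal sketch of both forms (the variable and function names are made up for illustration; depending on the shader model and compiler, these branches may be executed as real jumps or flattened into predicated instructions):

bool UseTexture;            // set from the application between draw calls (static branching)
sampler TextureSampler;

float4 BranchingExample(float2 texCoords : TEXCOORD0, float4 color : COLOR0) : COLOR0
{
    float4 result = color;

    // static branching: the condition is a shader constant, so the whole
    // block is enabled or disabled for the entire draw call
    if (UseTexture)
    {
        result *= tex2D(TextureSampler, texCoords);
    }

    // dynamic branching: the condition depends on per-pixel data,
    // so different pixels may take different paths
    if (result.a < 0.5f)
    {
        result.rgb *= 0.5f;
    }

    return result;
}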

Short introduction to Semantics and Annotations

You saw earlier how semantics are added to variables to identify and link the inputs and outputs of the graphics pipeline. Thus, semantics are relevant in the context of the language, and are required for some variables.

You can use semantics to have a unified way of setting values to different variables in different shaders. For example, you could use the WORLD semantic to access the corresponding variable, even if one shader uses WorldMatrix as the variable name, another wrld, and another world_mat, as long as they are all declared with the WORLD semantic.
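For example (the variable names below are made up; what matters is the shared WORLD semantic):

// effect A declares:
float4x4 WorldMatrix : WORLD;

// effect B declares:
// float4x4 wrld : WORLD;

// in both cases, the application can reach the parameter the same way:
// effect.Parameters.GetParameterBySemantic("WORLD")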

In addition to semantics, HLSL also offers the concept of annotations. Annotations are user-defined data that can be attached to techniques, passes or parameters. Annotations are not relevant to the HLSL compiler, so having an annotation does not affect the final shader code. However, they are a flexible way to add information to parameters. An application can read and use this information in whatever way it chooses. Annotation declarations are delimited by angle brackets.

In the following example, we add an annotation to a texture, to specify the file that should be loaded in that texture.

texture Texture < string filename = "myTexture.bmp"; >;

The annotation is just a decoration: the file myTexture.bmp is not loaded automatically into the Texture variable. However, because we added the annotation, we can use this information in the application. You can read annotations by using the Annotations member of the EffectParameter class. The following code illustrates how to use the annotation added to the Texture variable.

Effect effect;
Texture2D texture;
//read the annotation
String filename = effect.Parameters["Texture"].Annotations["filename"].GetValueString();
[...]//load the texture specified in the filename in the texture variable
//set the texture to the effect parameter
effect.Parameters["Texture"].SetValue(texture);

Because annotations are so flexible, you can use them to interact with different effect files in the same way, both in a game and in some other tool.

Techniques and Passes

An effect file can contain one or more techniques. A technique encapsulates all information needed to determine a style of rendering, and contains one or more passes. Declaring a technique is done using the technique keyword.

technique Technique_0
{
    //list of passes
}
technique Technique_1
{
    //list of passes
}

From XNA code, the techniques are accessible through the Techniques member, and the currently active technique through CurrentTechnique. Techniques are often used to provide different versions of the same shader for different shader models. This way, when initializing an effect, you can pick, based on the hardware configuration, the technique to use from that point onwards.

As mentioned before, a technique is composed of one or more passes. A pass contains the state assignments required to render: the vertex and pixel shaders and the render states. Passes are usually used when the final rendering of a model or object requires several types of processing, whose results are then combined. For example, you could have a pass that evaluates directional lighting, and a pass that evaluates spotlights, and combine their results in the end. However, simply declaring more passes in a technique does not cause them to be executed automatically. The application code has to iterate through the list of passes of a technique, and issue a draw call for each pass.

The code below shows the skeleton of a technique containing multiple passes.

technique example_technique
{
    pass Pass0
    {
        ...
    }
    pass Pass1
    {
        ...
    }
}
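Filled in, such a technique might look like the following sketch. The shader functions and the render states used here are assumptions made for illustration, not part of the article's example.

float4x4 WorldViewProjection;

float4 CommonVertexShader(float4 position : POSITION0) : POSITION0
{
    return mul(position, WorldViewProjection);
}

float4 AmbientPixelShader() : COLOR0
{
    return float4(0.1, 0.1, 0.1, 1);   // a constant ambient term
}

float4 LightPixelShader() : COLOR0
{
    return float4(0.8, 0.8, 0.6, 1);   // placeholder for a real lighting computation
}

technique example_technique
{
    pass AmbientPass
    {
        AlphaBlendEnable = false;
        VertexShader = compile vs_1_1 CommonVertexShader();
        PixelShader = compile ps_2_0 AmbientPixelShader();
    }
    pass DirectionalLightPass
    {
        // add this pass's result on top of the previous one
        AlphaBlendEnable = true;
        SrcBlend = One;
        DestBlend = One;
        VertexShader = compile vs_1_1 CommonVertexShader();
        PixelShader = compile ps_2_0 LightPixelShader();
    }
}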

An XNA application would iterate over and execute the passes in the following way.

effect.Begin();
foreach (EffectPass pass in effect.CurrentTechnique.Passes)
{
    pass.Begin();
    //issue draw calls
    pass.End();
}
effect.End();

Vertex Declarations

You saw earlier how semantics determine how data is linked from one pipeline stage to the next. One thing that was not so clear was how the graphics pipeline links bytes from vertex streams into the corresponding variables. We know that, for example, VertexShaderInput.Position should contain the position of the vertex, but when the graphics card receives N bytes of data representing a vertex, it needs a way to decide which of these bytes represent the position, and which represent other things, like normals, or texture coordinates.

This is where vertex declarations come in. A vertex declaration is a structure initialized through application code, which is set on the graphics device to indicate how streams of bytes are organized.

A vertex declaration is represented by an array of vertex elements. A vertex element is the base structure that defines one element of a vertex. For example, a vertex which contains information on position, normal and texture coordinates, would have three vertex elements, one for position, one for the normal and one for the texture coordinates. Below you can see the attributes of a vertex element.

  • stream – stream index to use
  • offset – offset to the beginning of the data
  • element format – defines the vertex data type and size
  • element method – defines the tessellation method
  • element usage – defines how this data should be used
  • usage index – allows multiple elements with the same usage; the usage index differentiates between them

The first three attributes determine how and where the data corresponding to this element is positioned in the byte stream. The usage and usage index together specify the logical meaning of the data, which maps directly to a semantic. Consider the following declaration.

VertexElement texCoords2 = new VertexElement(0,
                                             12,
                                             VertexElementFormat.Vector2,
                                             VertexElementMethod.Default,
                                             VertexElementUsage.TextureCoordinate,
                                             2);

This creates a new vertex element indicating that, starting from the 12th byte in the stream, there are two floats that represent texture coordinates, with usage index 2. This translates into the semantic TEXCOORD2. So when a vertex declaration containing this vertex element is set on the device, the graphics hardware will read 8 bytes (two floats) starting from position 12 in the stream and assign them to a shader variable that has the semantic TEXCOORD2, if such a variable exists.
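On the HLSL side, a vertex shader input that would receive this element might be declared like this (the structure and member names are made up for illustration):

struct ExtendedVertexInput
{
    float4 Position : POSITION0;
    float2 LightmapCoords : TEXCOORD2; // filled from the vertex element with usage index 2
};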

XNA contains several predefined vertex types, each of which exposes its array of vertex elements through the VertexElements member. So, to create a vertex declaration that specifies vertices of type VertexPositionColorTexture, you would use the following line.

VertexDeclaration vdecl = new VertexDeclaration(
                                GraphicsDevice,
                                VertexPositionColorTexture.VertexElements);

Sometimes you will need to define your own vertex structures, together with their arrays of vertex elements, but this only becomes necessary in advanced situations.

The XNA Effect Template Explained

Now that the main theoretical aspects are in place, you should look at the default template for effect files provided by XNA Game Studio 3.1. To see this template, create a new XNA project, and add a new Effect File to your Content project. You should see something similar to the code below.

float4x4 World;
float4x4 View;
float4x4 Projection;
// TODO: add effect parameters here.
struct VertexShaderInput
{
    float4 Position : POSITION0;
    // TODO: add input channels such as texture
    // coordinates and vertex colors here.
};
struct VertexShaderOutput
{
    float4 Position : POSITION0;
    // TODO: add vertex shader outputs such as colors and texture
    // coordinates here. These values will automatically be interpolated
    // over the triangle, and provided as input to your pixel shader.
};
VertexShaderOutput VertexShaderFunction(VertexShaderInput input)
{
    VertexShaderOutput output;
    float4 worldPosition = mul(input.Position, World);
    float4 viewPosition = mul(worldPosition, View);
    output.Position = mul(viewPosition, Projection);
    // TODO: add your vertex shader code here.
    return output;
}
float4 PixelShaderFunction(VertexShaderOutput input) : COLOR0
{
    // TODO: add your pixel shader code here.
    return float4(1, 0, 0, 1);
}
technique Technique1
{
    pass Pass1
    {
        // TODO: set renderstates here.
        VertexShader = compile vs_1_1 VertexShaderFunction();
        PixelShader = compile ps_1_1 PixelShaderFunction();
    }
}

At the very beginning, the effect contains definitions for three parameters. These parameters are matrices (float4x4) and represent the world, view and projection matrices. Under them, two structure data types are defined: one that defines the input to the vertex shader, VertexShaderInput, and one that defines the output of the vertex shader, VertexShaderOutput. Each of these structures contains a channel named Position, linked to the semantic POSITION0, which lets the GPU know the meaning of this data.

Next, a function is defined, which will take the place of the vertex shader. As expected, its input is a VertexShaderInput structure, and it returns a VertexShaderOutput structure. The position given as input is multiplied by the three matrices specified earlier, and the resulting position is written to the output. This sequence of multiplications is the most common one, appearing in almost all vertex shaders.

The function destined to become the pixel shader returns a float4 as its result, and binds it to the COLOR0 semantic. The function takes a VertexShaderOutput structure as input. This could be written in many different ways, but organizing the data like this yields shader code that is cleaner and easier to maintain. The default XNA code simply returns the color red as the result.

And lastly, a technique is defined. This technique contains a single pass, which uses the two functions defined above as the vertex shader and the pixel shader.

You can use this template as a starting point for your effect files.

Ending Words

I hope this short introduction was helpful for those of you looking to learn HLSL and shader programming. If you have any questions, feel free to post them in the comments below.


    Oh man, thank you so much Catalinzima! I’m no student of these sorts of matters but I do have a need for the ability to do “simple” effects, such as lighting for a particular side, given what I do. This tutorial of yours will finally make my rendermonkey be of some use. This, along with David’s tutorial…oh I thank you so much. Know that I will forever credit you two whenever I use effects that I will create perhaps someday in the future…