Sgt. Conker We are "absolutely fine"

27 Nov 09

Article : Multi-threading your XNA

4. Example: Balls

The code sample shows a scene filled with a large number of balls, which bounce off each other and off the walls.


To be totally honest, my skills at writing physics code are not that great, so at times you will see some balls go through the walls around the scene and wander out into the blue, but otherwise the code does its job. What I did was extend the UpdateManager class and create a new class called BallsUpdater. In this class, I added code for camera control, code for input handling, and code that updates the physics of the world. Most of the code is collision handling. One interesting thing to notice is that the function UpdatePhysics(), which updates the physics of a single ball, returns a boolean value that tells us whether the position of that ball has changed this frame. So, in the Update function of the class, the following code can be seen.

    public override void Update(GameTime gameTime)
    {
        messageBuffer.Clear();
        HandleInput();

        for (int i = 0; i < GameDataOjects.Count; i++)
        {
            GameData gd = GameDataOjects[i];
            if (UpdatePhysics(gd, (float)gameTime.ElapsedGameTime.TotalSeconds))
            {
                Matrix newWorldMatrix = gd.rotation * Matrix.CreateTranslation(gd.position);

                ChangeMessage msg = new ChangeMessage();
                msg.ID = i;
                msg.MessageType = ChangeMessageType.UpdateWorldMatrix;
                msg.UpdatedWorldMatrix = newWorldMatrix;

                messageBuffer.Add(msg);
            }
        }

        UpdateCamera();
        base.Update(gameTime);
    }

As you can see, because we inherit from the UpdateManager class, we only need to write code in the Update() function, and we don't deal with any multi-threading code anymore. We just need to use the ChangeBuffer to transmit data to the rendering thread. The first thing we do is clear the messageBuffer. After this, we call the function that handles the input. Inside this function, if you press the B button, a new ball is created, and a message of type CreateNewRenderData is added to the buffer. Then, we go through each of the objects in the GameDataObjects list. Note that we didn't use foreach, because we want to have access to the index of each object. That index is used as an ID in this example. For each object we call UpdatePhysics, which moves the ball and updates its physics. Then, if the ball has moved this frame, we compute the new world matrix and create a message of type UpdateWorldMatrix. We then put this message into the buffer, to be consumed by the rendering thread at a later time.
If the ball didn't move this frame, no message is sent.
Here you can see that if our messages had been created as classes instead of structs, we would have generated quite a lot of garbage during the frames when balls move. This way, we generate no garbage. Lastly, we call the UpdateCamera() function, which computes the new position and orientation of the camera based on the position and orientation of the player's ball, creates a message of type UpdateCameraView, and puts it in the buffer.
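To make the point about structs a bit more concrete, here is a minimal sketch that runs without XNA. It does not use the sample's ChangeBuffer class: a plain List<ChangeMessage> stands in for it, and the struct below is a simplified, hypothetical version of ChangeMessage whose float fields are placeholders for the real matrix payload. MessageBufferDemo and RunFrames are names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-ins for the sample's types (assumption: the real
// ChangeMessage carries XNA matrices and vectors instead of floats).
public enum ChangeMessageType { UpdateCameraView, UpdateWorldMatrix, CreateNewRenderData }

public struct ChangeMessage
{
    public int ID;
    public ChangeMessageType MessageType;
    public float X, Y, Z; // placeholder payload
}

public static class MessageBufferDemo
{
    // Fills and clears one buffer for a number of frames and returns the
    // capacity of the list after the last frame.
    public static int RunFrames(int frames, int messagesPerFrame)
    {
        // Size the list once, up front, so it never has to grow mid-frame.
        List<ChangeMessage> buffer = new List<ChangeMessage>(messagesPerFrame);

        for (int frame = 0; frame < frames; frame++)
        {
            buffer.Clear(); // empties the list but keeps its backing array

            for (int i = 0; i < messagesPerFrame; i++)
            {
                // A struct is a value type, so this does not allocate on the heap.
                ChangeMessage msg = new ChangeMessage();
                msg.ID = i;
                msg.MessageType = ChangeMessageType.UpdateWorldMatrix;
                msg.X = frame;

                buffer.Add(msg); // the value is copied into the backing array
            }
        }

        return buffer.Capacity;
    }

    public static void Main()
    {
        Console.WriteLine(RunFrames(100, 198));
    }
}
```

Because the list is sized once and Clear() keeps its storage, filling it with struct messages every frame reuses the same memory. If ChangeMessage were a class, every `new ChangeMessage()` would create a separate heap object that becomes garbage once the buffer is cleared.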

For rendering, I extended the RenderManager class and created a new class called BallsRenderer. In the LoadContent() function, we load the assets we need, like a model for the balls, a model for the table, effects, and so on. The most important function in the context of this tutorial is the Draw() function. Here, at the beginning of the function, we need to consume the messages which were created in the previous frame by the update manager.

    public override void Draw(GameTime gameTime)
    {
        foreach (ChangeMessage msg in messageBuffer.Messages)
        {
            switch (msg.MessageType)
            {
                case ChangeMessageType.UpdateCameraView:
                    viewMatrix = msg.CameraViewMatrix;
                    break;
                case ChangeMessageType.UpdateWorldMatrix:
                    RenderDataOjects[msg.ID].worldMatrix = msg.UpdatedWorldMatrix;
                    break;
                case ChangeMessageType.CreateNewRenderData:
                    if (RenderDataOjects.Count == msg.ID)
                    {
                        RenderData newRD = new RenderData();
                        newRD.color = msg.Color;
                        newRD.worldMatrix = Matrix.CreateTranslation(msg.Position);
                        RenderDataOjects.Add(newRD);
                    }
                    break;
                default:
                    break;
            }
        }
        //draw the scene
        [...]
    }

So, taking each message from the buffer, we look at its type. If it is a message for setting the camera's view matrix, we take the value stored in the field CameraViewMatrix and assign it to our local variable. If the message is of type UpdateWorldMatrix, we modify the world matrix of the RenderData object identified by the ID stored in the message.
If the message is of type CreateNewRenderData, we create a new render data object and add it to the list of RenderDataObjects. Because we don't destroy balls in this sample, the new balls created by the update thread should always be added to the end of the list, with a new index. If we had code to destroy balls, we would need more complex logic to handle the IDs of the objects. But as it is, we don't need anything else. Now we can go ahead and render the scene. Another good thing to know is that not all the objects we render need to depend on the update thread. For example, the table never moves, so all the code that deals with it lives solely in the BallsRenderer class.

At first, we simply had 197 spherical balls and no other special effects. But as the Xbox has a very powerful GPU, and CPU floating point code is not that fast, the physics thread took much longer than the rendering thread, and the gains from using multi-threading were not that spectacular (only about 30%-40% lower frame times). So I decided to give that GPU some work. As a first step, I took the spherical balls and made a new model for them, which has about 9000 polygons. But even this (197 * 9000 polygons per frame) was a walk in the park for the Xbox's GPU.

So I began writing code and added a cartoon shader for the objects, which basically draws each object twice (once for normal+depth, once for color) and then applies some post-processing effects.

Now things were balanced, and the difference between the physics and rendering times was smaller.

So what are the numbers? The code was run on an Xbox 360 and the following numbers were obtained:

Mode             Physics time (ms)   Rendering time (ms)   Total frame time (ms)   Average frames per second
Single-threaded  28-38               23-24                 51-62                   16-19 FPS
Multi-threaded   30-40               25-26                 30-40                   26-33 FPS

So even though adding multi-threading added a couple of milliseconds to both the rendering time and physics time, the total frame time was significantly reduced, and the gain in the framerate was an important one. You can easily modify the number of balls (Game1.LoadContent()) and see what other numbers you get. Also, when running the sample, you can see the number of messages in the buffers varies from 1 (when all balls are still) to 198 (when all balls are moving).
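As a rough check on the table, here is a small sketch of the arithmetic behind the "total frame time" column. It assumes a simplified model that is not part of the sample: in the single-threaded case physics and rendering run one after the other, so their times add up, while in the multi-threaded case they overlap, so the slower of the two sets the frame time. The FrameTiming class and its method names are made up for this illustration, and synchronization overhead is ignored.

```csharp
using System;

public static class FrameTiming
{
    // Single-threaded: physics and rendering run back to back.
    public static double SingleThreadedMs(double physicsMs, double renderMs)
    {
        return physicsMs + renderMs;
    }

    // Multi-threaded: both run in parallel, so the slower one decides.
    public static double MultiThreadedMs(double physicsMs, double renderMs)
    {
        return Math.Max(physicsMs, renderMs);
    }

    // Converts a frame time in milliseconds to frames per second.
    public static double FramesPerSecond(double frameMs)
    {
        return 1000.0 / frameMs;
    }

    public static void Main()
    {
        // Best-case and worst-case timings taken from the table, in ms.
        double singleBest = SingleThreadedMs(28, 23);
        double singleWorst = SingleThreadedMs(38, 24);
        double multiBest = MultiThreadedMs(30, 25);
        double multiWorst = MultiThreadedMs(40, 26);

        Console.WriteLine("Single-threaded: {0}-{1} ms, {2:F1}-{3:F1} FPS",
            singleBest, singleWorst,
            FramesPerSecond(singleWorst), FramesPerSecond(singleBest));
        Console.WriteLine("Multi-threaded:  {0}-{1} ms, {2:F1}-{3:F1} FPS",
            multiBest, multiWorst,
            FramesPerSecond(multiWorst), FramesPerSecond(multiBest));
    }
}
```

With the numbers from the table, the sum gives 28 + 23 = 51 ms and 38 + 24 = 62 ms for the single-threaded case, and the maximum gives 30 ms and 40 ms for the multi-threaded case, which matches the measured totals. The model's worst-case multi-threaded framerate is 1000 / 40 = 25 FPS, slightly below the measured average of 26, so treat it as an approximation only.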

The link to the archive is at the end of the article. The controls are as follows:

  • Left Stick to turn the camera left or right
  • Keep A down to accelerate the player's ball
  • Press Y to give the player's ball a short burst of speed
  • Press X to give all the balls a short burst of speed
  • Press B to create a new ball, at the position of the player's ball, and shoot it forward

Comments (7) Trackbacks (6)
  1. Hi Catalin, I originally went down a very similar research path as what you are describing in your threaded-rendering system.

    However, I did some prototyping and came to the realization that the GPU is already running asynchronously, and only forces a CPU-side block if the previous frame hasn't finished when the next .Begin() is called. This result, plus the added complexity of caching game state, led me to abandon this system.

    What we are planning to do in our engine is to use a separate CPU thread to process vertex data (batch instancing), but that’s really no different from multithreading other CPU-side modules.

  2. Hi Jason.
    Yes, the GPU does indeed run asynchronously from the CPU. However, the speed of the GPU is rarely the main performance bottleneck. Most of the time, the problem is the communication between the CPU and the GPU, which happens at each DrawXXX() call. Having a high number of draw calls eats up lots of CPU time, and that is what I am trying to reduce in this article. So what I am threading is not actually the GPU drawing, but the CPU-side invoking of draw calls (which happens at driver level).
    In most multi-threading approaches, you either try to distribute the non-graphics CPU-work on multiple threads, or try to distribute the GPU-communication work done by the CPU.

    But yes, the approach is not what I’d recommend for a truly advanced system. It is good for learning purposes, and it is definitely a robust way to add threading to your game and obtain a good performance boost, but at the cost of extra memory needed for the two buffers.

    For a more complex threading system, there are some other ways to do it, as presented in some of the papers I linked to at the bottom of the article. Each method has its advantages, and you can’t always decide that one way is better than the others. It usually depends on the structure of the rest of the engine.

  3. Nice one. It will help me a lot.

    Thanks,
    Timo

  4. yah, I think your system is a great intro to the very complex world of multithreading. Like I said, I originally went down a very similar path to what you are describing, so I do feel that there are benefits, but in my situation the drawbacks outweighed them.

  5. a simpler approach might be:
    a) each GameComponent maintains two DrawState objects
    b) Draw() method ONLY uses information in current DrawState object to draw
    c) Update() keeps only ONE version of “UpdateState”
    d) single point of control for flip flop using Game.CurrentDrawStateIndex = 0 or 1
    e) GameComponent.Draw() { CurrentState = DrawState[Game.CurrentDrawStateIndex]; }
    this is easy to design – anything that Draw() requires goes into DrawState

  6. When I try using Multiple Threads, the controls don’t seem to work, however they do work if I use a single thread (uncomment the line in the draw method and comment out the one in load). Any idea why this is?

  7. I should note something: because of the way XNA handles input, keyboard input can be received only from the main thread, so you have to create your own input handler to fix this.

