Monday 20 January 2014

FireMonkey canvas classes and a bugfix to speed up your apps

Everything you need to know about FireMonkey canvases - and a performance boost bugfix for some people as well!

I recently posted my first real-world FireMonkey app, which gave a zoomable, scrollable, very interactive view of the Mandelbrot fractal using the precomputed DWS Mandelbrot tiles. It worked fine on my computer.

Those are famous last words.

Soon the comments on that page were filled with people saying it didn't work: the UI said tiles were downloading etc, but it drew only a blank solid colour where the fractal should have been. I made an educated guess that the problem only happened when using the Direct2D canvas, and put out a "fix" that restricted it to drawing using GDI+. This fix worked - it draws - but GDI+ is slow, and the app as it's currently available is not of a quality I feel personally comfortable having publicly available with my name attached. Clearly I need to fix it. But how?

This is a perfect example of why being aware of the different canvases in FireMonkey matters. You need to test with each one that your app could possibly end up using on an end-user's machine, which means you need to know what they are, when FMX chooses each one, and how to force a specific choice in order to test each case. Moreover, there is (IMO) a bug in FireMonkey's logic for choosing between them, which makes your apps render much more slowly than they need to in some cases, and you may want to tweak some code to fix this and make your app render faster.

What's in this article?

  • The role of canvases in FireMonkey rendering
  • Overview of each possible Windows canvas class: GDI+, Direct2D, and GPU
  • How does FireMonkey choose which canvas class to use?
    • Investigating when Direct2D is chosen vs GDI+, and we find a bug
    • Fixing the bug - three possible solutions
  • For testing: how to force the selection of a specific class
    • Checking what class you are using
  • Summary
This is a long article - two and a half thousand words - so let's get going.

The role of canvases in FireMonkey rendering


FireMonkey is a cross-platform UI toolkit. As such it needs to be able to render everything onscreen independent of the underlying graphics framework - it needs one API you and I can code against that runs on Windows and OSX and iOS and Android.

It achieves this by using a variety of different canvas classes.  That is, when you access a TCanvas such as Form.Canvas or TBitmap.Canvas, due to the wonderfulness of polymorphism the actual class you are using can vary widely.  Here are the possibilities:
  • TCanvasGDIPlus (Windows)
  • TCanvasD2D (Windows)
  • TCanvasGpu (Windows)
  • TCanvasQuartz (OSX)
  • TCanvasQuartz (iOS, implementation appears independent of the OSX class with the same name)
  • At least one more for Android in XE5+. 
Let's write off the platforms that only have a single canvas implementation - OSX, iOS, and probably Android. (I don't have XE5 and googling didn't show much about the underlying code.) If you're using one of those platforms, you are by default testing using the only canvas class and this is a non-issue. But that leaves three possible canvas classes that your app could end up using on Windows. (Even if you know about the Direct2D and GDI+ canvases, I bet you didn't know about the 'GPU' canvas. I sure didn't.)

Each Windows canvas class


GDI+


TCanvasGDIPlus is the default, fallback canvas. It uses GDI+, a software-only, fairly slow, API provided by Microsoft in the Windows XP days. It will run on anything, but your rendering performance may not be great. For example, in my fractal app which draws anywhere from four to a few dozen 256x256 tiles at various scales on the window with every paint, at the default small window size click-dragging to navigate is fast. But if you maximise the window, and the rendering area becomes much larger, scrolling around - which invalidates with every mouse movement, effectively drawing as fast as possible - is painfully laggy. This is not FireMonkey's fault. It is one of the problems with using GDI+, and I have experienced the same problem drawing complex interactive UIs with GDI+ before.

If your app is rendering using this class - I show how to find out which class later - I strongly recommend you find out why and do what you can to fix it. In general, avoid using this class if possible.

You will always end up using this class on Windows XP, since it's the only one supported there. On all other versions of Windows, Vista and above, 99% of the time you will be able to use TCanvasD2D instead (once you fix a problem with when it's chosen) and I highly recommend you do this.

Direct2D


TCanvasD2D uses Direct2D, a 2D API implemented over Direct3D, which is available on Vista SP2 and above. It is hardware-accelerated and fast, and theoretically the default. The quick answer is that you want to use this class if at all possible, but you may need to make some code changes to do it. Without some very small tweaks, there are cases where FireMonkey will choose a GDI+ canvas instead of a D2D one on hardware where D2D would run faster - much, much faster. This is rare, but my setup is one where it occurs.

GPU


TCanvasGpu is turned off by default, and is only used if the global FMX.Types.GlobalUseGPUCanvas is true. (Set this in your project file before Application.Initialize.) It's quite neat in that it uses a base class TContext3D to do its work, which has a very similar system for choosing which subclass is appropriate to instantiate as the canvas system. There are context classes for D3D9, D3D10, GLES and Quartz.

The first time I tried this out, it crashed immediately - FillText ends up calling TCharHelper.ConvertToUtf32 with an empty string, which raises an exception. Reading the preceding code, which seems to implement text wrapping, I don't understand why it's trying to do what it is.

[Screenshot] TCanvasGpu running on Windows 7 on DX9 hardware. Yes, there is a whole TTrackBar between those two buttons. (See it? Me either.) This class is turned off by default and I do not recommend manually enabling it.

On my machine, it uses a TDX9Context to draw. There are noticeable severe rendering bugs. In my fractal app one control, a TTrackBar, doesn't draw at all. Text draws 'bold', which looks similar to the effect you get drawing antialiased text over itself many times. Buttons had one-pixel-wide edges missing.

I don't know how much of this is due to the TDX9Context it was using, and how much is due to TCanvasGpu itself. Since on DX10-class hardware FireMonkey will use Direct2D, DirectX-9 class hardware is the only use case on Windows for this class. (As it turns out, we should normally use Direct2D even for DirectX9 hardware. More information below.) The severity of the bugs is, I suspect, why it is turned off by default. I do not recommend manually turning it on.

How does FireMonkey choose which canvas class to use?


In FMX.Types.pas is a method TCanvasManager.GetDefaultCanvas. This returns a metaclass which is used to instantiate the actual canvas class. The first time this method is called, it assembles a list of possible, valid canvases which the current platform supports and then from that list it chooses which one is best to instantiate. There are some complex if statements about whether a class is the default and whether to try to use a software canvas, but in my testing these didn't make any practical difference.

The key is in the TPlatformWin.RegisterCanvasClasses method, which tests which of the GPU, D2D and GDI+ canvas classes can be used and, where possible, adds them to this list. It only 'registers' (adds to the list) the GPU canvas if GlobalUseGPUCanvas is true, and by default it is false (see above.) That leaves D2D and GDI+.

Investigating when Direct2D is chosen over GDI+, and a FMX bugfix


First off, the easy case: GDI+ is the fallback, and is always available on machines that meet the FireMonkey requirements. It is always registered. This means that if the Direct2D class is not registered, your app will end up using GDI+.

Direct2D is trickier. And remember, any bug or quirk here that invalidly thinks D2D is not the right choice will cause the GDI+ canvas to be chosen instead, and that's bad.

FMX.Canvas.D2D.pas's RegisterCanvasClasses method checks the Direct3D 10 capabilities reported by DirectX, and registers the D2D canvas if the D3D10 driver type is either hardware or WARP. This latter is interesting: the Windows Advanced Rasterization Platform is a software rasterizer supporting Direct3D 9.1 through 10.1 feature levels, and by all accounts is a very good one. It is part of the DX11 runtime, which you need to have installed and which comes with the platform update for Vista or Windows 7. You should already have these automatically through Windows Update.
Direct2D applications benefit from hardware-accelerated rendering on modern mainstream GPUs. Hardware acceleration is also achieved on earlier Direct3D 9 hardware by using Direct3D 10-level-9 rendering. This combination provides excellent performance on graphics hardware on existing Windows PCs.
...
When rendering in software, applications that use Direct2D experience substantially better rendering performance than with GDI+ and with similar visual quality. 

- MSDN Direct2D page
In other words, on DirectX 9.1 hardware there is a high-performance hardware rasterizer available and on lesser hardware there is still a high-performance software rasterizer available. Now, for DirectX 10 and above, it's simple: Direct2D will be chosen. But for DirectX9-class hardware, as FireMonkey is written, there is a choice between two software renderers: GDI+, an old and slow API, or WARP, a speedy, very technically impressive API. Clearly, where possible, FireMonkey should choose WARP, falling back to GDI+ only if nothing else whatsoever is possible. As you've no doubt guessed if you've read this far, it doesn't, and this is what we need to investigate and fix.

The problem lies in TCustomDX10Context.CheckDevice. An edited version of the problematic portion of code is:
if ...{can create a D3D hardware device} then
begin
  FDriverType := D3D10_DRIVER_TYPE_HARDWARE;
end else if
  not TCustomDX9Context.HardwareSupported and
  Succeeded(D3D10CreateDevice1Ex(D3D10_DRIVER_TYPE_WARP, D3D10_CREATE_DEVICE_BGRA_SUPPORT, g_pd3dDevice)) then
begin
  // Switch to software mode
  FDriverType := D3D10_DRIVER_TYPE_WARP;
end;
It's this else if branch that is problematic. It basically says to use WARP if it's supported (fine) but only if Direct3D9-class hardware is not supported (not fine.) Almost all computers since about 2005 will support D3D9, and this API is available on Vista and above. The only reason I can think of for this is that TCanvasGpu with D3D9 support is expected to be the fallback here before GDI+. However, as we've seen, not only is that class buggy but it is disabled by default (probably because it's buggy.) This means that anyone with D3D9 hardware (but not D3D10+ hardware), and that includes people running on virtual machines like VMware Fusion, which only supports D3D9 emulation, will end up using GDI+ when they could be using the much faster WARP.
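The effect of that condition is easy to see in a toy model. The following console sketch is not FMX code: the type, the function names and the three boolean inputs are made up for illustration, and it only mirrors the shape of the if / else if quoted above, next to a variant with the DX9 check dropped.

```delphi
program DriverChoiceSketch;

{$APPTYPE CONSOLE}

type
  // dcNone models "no D2D canvas registered", which means GDI+ ends up being used.
  TDriverChoice = (dcHardware, dcWarp, dcNone);

const
  ChoiceNames: array[TDriverChoice] of string =
    ('D2D on D3D10 hardware', 'D2D on WARP', 'GDI+ fallback');

// Mirrors the shipped condition: WARP only when D3D9 hardware is absent.
function ShippedChoice(HasD3D10, HasD3D9, HasWarp: Boolean): TDriverChoice;
begin
  if HasD3D10 then
    Result := dcHardware
  else if (not HasD3D9) and HasWarp then
    Result := dcWarp
  else
    Result := dcNone;
end;

// Same logic with the D3D9 check removed: WARP whenever it can be created.
function FixedChoice(HasD3D10, HasWarp: Boolean): TDriverChoice;
begin
  if HasD3D10 then
    Result := dcHardware
  else if HasWarp then
    Result := dcWarp
  else
    Result := dcNone;
end;

begin
  // A D3D9-only machine (or VM) that has the WARP runtime installed.
  Writeln('Shipped: ', ChoiceNames[ShippedChoice(False, True, True)]);
  Writeln('Fixed:   ', ChoiceNames[FixedChoice(False, True)]);
end.
```

For the D3D9-only case the first line reports the GDI+ fallback and the second reports D2D on WARP, which is exactly the difference this article is about.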

How can we fix it?


A global switch

Normally the best way to force FireMonkey to choose a particular graphics path is via one of the globals at the top of FMX.Types.pas. There is a potentially suitable one: GlobalUseDX10Software. (Remember you need to set these in your project file before you call Application.Initialize.) It's false by default but if you set it to true, you will get WARP. Unfortunately, this means you will always get WARP when possible, even when hardware DX10 support is available. No matter how good WARP is we should choose the hardware-accelerated option when possible, and so this is a no go.

Edit the FireMonkey source


The second option is to edit the FMX source. To do this, make a local copy of FMX.Context.DX10.pas in your program's source folder. (I do not recommend editing RTL source directly and trying to recompile FMX - leave it alone and make your changes separately. If you add your local file to the project it will be used in preference to the RTL version. Just make sure you document what you've changed for future you.)

Add this local file to your project, and comment out the DX9 check in the else if statement above. (Removing only the 'not' would invert the test rather than drop it.) It should look something like this:
end else if {not TCustomDX9Context.HardwareSupported and} ...
Recompile and you should get a Direct2D canvas. If you were using the GDI+ canvas before, you should notice a significant difference.

Manual code


The final - and probably best - option is to add some code that tries to create a D3D10 device itself, and if hardware creation fails but WARP succeeds, turns on the GlobalUseDX10Software switch described above. The following slightly ugly method (it's 1AM...) does the trick; call it before Application.Initialize in the project file. This method depends on FMX.Types, Winapi.D3D10_1, Winapi.D3D10, and Winapi.Windows.

Because of the method's dependencies and for code cleanliness, I would suggest putting this in a separate unit from the main project source, and ifdef both the unit being included, and the method being called, out completely if you are not compiling for Windows (the MSWINDOWS constant.) As suggested by a commenter below, it is probably also a good idea to ifdef for your specific Delphi version in case this is fixed in future.

The below code doesn't check for D3D9 hardware support, assuming that if it can create a WARP device that's enough. Feel free to add back in additional checks.
procedure TryUseWARPCanvas;
var
  DX10Library : THandle;
  TestDevice : ID3D10Device1;
begin
  DX10Library := LoadLibrary(Winapi.D3D10_1.D3D10_1_dll);
  if DX10Library = 0 then Exit;

  try
    SaveClearFPUState; // Copy from FMX.Context.DX10
    try
      if GetProcAddress(DX10Library, 'D3D10CreateDevice1') = nil then Exit;

      // If there's no hardware D3D10 support, but there /is/ WARP (software support)
      // force that to be used. Don't bother checking DX9 support, just go for WARP.
      if not Succeeded(D3D10CreateDevice1(nil, D3D10_DRIVER_TYPE_HARDWARE, 0, D3D10_CREATE_DEVICE_BGRA_SUPPORT, D3D10_FEATURE_LEVEL_10_1, D3D10_1_SDK_VERSION, TestDevice)) and
        Succeeded(D3D10CreateDevice1(nil, D3D10_DRIVER_TYPE_WARP, 0, D3D10_CREATE_DEVICE_BGRA_SUPPORT, D3D10_FEATURE_LEVEL_10_1, D3D10_1_SDK_VERSION, TestDevice))
        then begin
          FMX.Types.GlobalUseDX10Software := true;
        end;
    finally
      TestDevice := nil;
      RestoreFPUState; // Copy from FMX.Context.DX10
    end;
  finally
    FreeLibrary(DX10Library);
  end;
end;
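
One way to wire this into the project file, following the ifdef suggestions above, is sketched here. The project name, the unit name WarpCanvasFix, the form unit MainForm and the CompilerVersion threshold of 26.0 are all placeholders assumed for illustration; substitute the compiler version you actually tested against.

```delphi
program MandelbrotViewer; // hypothetical project name

uses
  FMX.Forms,
  {$IFDEF MSWINDOWS}
  WarpCanvasFix in 'WarpCanvasFix.pas', // hypothetical unit containing TryUseWARPCanvas
  {$ENDIF}
  MainForm in 'MainForm.pas' {Form1};   // hypothetical main form unit

{$R *.res}

// Placeholder threshold: fail the build on compilers newer than the one the
// workaround was verified against, so the fix gets re-checked after upgrading.
{$IF CompilerVersion > 26.0}
  {$MESSAGE ERROR 'Re-check whether the WARP canvas workaround is still needed'}
{$IFEND}

begin
  {$IFDEF MSWINDOWS}
  TryUseWARPCanvas; // must run before Application.Initialize
  {$ENDIF}
  Application.Initialize;
  Application.CreateForm(TForm1, Form1);
  Application.Run;
end.
```

The IDE sometimes rewrites the uses clause of a project file when units are added or removed, so it is worth checking that the ifdef around the unit survives.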

Tweaks to this code


You might want to change a few things about this code:
  • Editing the FMX code: to match the manual code, I changed it to remove the DX9 check entirely. It either sees if it can create a D3D10 hardware device, or otherwise tries to create a WARP device. Thanks Remy for the suggestion.
  • Untested useful tweak: Direct2D is still hardware-accelerated on DX9-class hardware. Try changing the feature level to D3D10_FEATURE_LEVEL_9_1 to see if your computer has hardware support at this level. You will need to change both the manual code test (if you use it) and the FireMonkey code creating the devices in the same area as above to match. I haven't tested this and it's just an idea for further investigation; the current code goes with either D3D10 hardware or WARP, which I know for sure will work and are good, safe modifications to make. Changing this will always require editing the FMX source.
  • XP support: with Microsoft dropping XP support on April 8, 2014, it's quite possible the next version of Delphi will not need to support XP at all - or at least, will only do so as a legacy option. I would suggest that Embarcadero make some changes requiring the Platform Update be installed as prerequisite for FireMonkey apps, and then using one of the D3D10, D3D9-feature-level, or software WARP Direct2D canvases as the only option on Vista and above, and only using GDI+ on XP. There should never be a case on Vista or Windows 7 where GDI+ is the chosen canvas.

For testing: how to force the selection of a specific class


I stated at the beginning that you should test with each possible canvas type, in order to catch code that works with one and doesn't with another. How?
  • To force GDI+, set FMX.Types.GlobalUseDirect2D to false.
  • To force Direct2D (using the WARP software rasterizer even with hardware support - so only for testing) set FMX.Types.GlobalUseDX10Software to true.
  • To force the GPU canvas (unnecessary for testing, since it's off by default) set FMX.Types.GlobalUseGPUCanvas to true.
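
In a project file, the three switches above might be used like this. It is a minimal sketch: the project name and the MainForm unit are hypothetical, and only one of the three assignments should be active for a given test run.

```delphi
program CanvasTest; // hypothetical project name

uses
  FMX.Forms,
  FMX.Types,
  MainForm in 'MainForm.pas' {Form1}; // hypothetical main form unit

{$R *.res}

begin
  // Enable exactly one of these, and do it before Application.Initialize.
  FMX.Types.GlobalUseDirect2D := False;         // force GDI+
  // FMX.Types.GlobalUseDX10Software := True;   // force Direct2D on the WARP rasterizer
  // FMX.Types.GlobalUseGPUCanvas := True;      // force TCanvasGpu

  Application.Initialize;
  Application.CreateForm(TForm1, Form1);
  Application.Run;
end.
```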

How to check what class you are actually using


This is fairly simple. Find a valid normal canvas (such as Form.Canvas) and check its ClassName. It will be one of the above classes.
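
For example, a form event handler could append the class name to the window caption. This is a sketch that assumes a form class called TForm1 with its OnShow event wired to this handler, and that the form's canvas has been created by the time the event fires.

```delphi
// Assumes TForm1 declares this handler and OnShow is assigned to it.
procedure TForm1.FormShow(Sender: TObject);
begin
  // Canvas is the polymorphic TCanvas; ClassName reveals the concrete class,
  // e.g. TCanvasGDIPlus or TCanvasD2D.
  if Canvas <> nil then
    Caption := Caption + ' [' + Canvas.ClassName + ']';
end;
```

Showing it in the caption rather than a dialog makes it easy to leave in test builds while switching between the globals above.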

Summary


  • FireMonkey has several underlying graphics classes depending on the platform and, on Windows, on the capabilities of the platform
  • You need to test each one, because code that works on one can fail on another
  • On Windows, if you (or a user) have D3D9 hardware (but not D3D10 or higher hardware) FireMonkey will use GDI+ to render where it probably shouldn't, which will make your program noticeably slower when run (a) on D3D9-class hardware, or (b) in a virtual machine like VMware Fusion. It should use Direct2D's software rasterizer instead. Fix this in one of the three ways above; I recommend the sample code I showed.

14 comments:

  1. Since you're patching an issue with a specific version of Delphi which may be corrected in a future release, I also suggest surrounding the include with a version-specific ifdef against the current compiler version... possibly also a forced error for compiler versions greater than the current tested version, requiring modifications after verifying the problem still exists in the next version.

    1. Good point. I've edited to reflect this. Thanks.

  2. When editing FireMonkey source, wouldn't it make more sense to comment out the entire "TCustomDX9Context.HardwareSupported" check instead of just the offending "not" in front of it? If (can create a D3D hardware device) then ... else if (can create WARP) then ...

    1. Yes (and thanks.) Edited the post to reflect this and also to make it clearer when D2D can have hardware support or not - currently the code switches between hardware D3D10 and software WARP support, which is safe and works well. It is possible, though untested, that you can force it to be instantiated with a D3D9.1 feature level and it will still work, which will give hardware acceleration on that hardware as well. That change will always require editing the FMX source.

  3. I have one comment regarding this line - "It's this else statement that is problematic. It basically says to use WARP if it's supported (fine) but only if Direct3D9-class hardware is not supported (not fine.)"

    TContext3D is designed to use mostly for 3D purpose. In this case use hardware-DX9 is preferable than use WARP.

    1. Eugene, looks like I replied inline rather than directly to you, sorry. See the reply below.

  4. Hi Eugene - a reply from the architect himself!

    If I understand you correctly, are you saying that TContext3D (through TCanvasGpu) should be used for DX9 hardware? TContext3D - at least the DX9 version - doesn't work well at all, as you can see from the screenshot in the section about it. Direct2D should work on DX9-level hardware through the DX10-but-DX9.1-feature-level flag, and still be hardware accelerated. Is it possible for you to turn that on in FireMonkey, please?

    Either way, I think WARP is a lot better than GDI+ because of its quality and speed, and I'd really like to see it chosen where possible, ie Vista+, even when no hardware acceleration is available. If we could leave GDI+ as a legacy canvas for a legacy platform (XP) I think a lot of people would be happy.

    Thanks for your reply!

  5. Interesting post.

    Another side-effect of GDI+ is the "dancing text", due to a GDI+ bug when evaluating text dimensions for common fonts at small sizes (which everyone but FMX works around by using plain old GDI for ClearText output when DirectWrite is not available).


    Is there a new version available with the accelerated canvas?

    1. A new version of the Mandelbrot viewer? Not yet. Monday was the first day I've had time to look at it, and I spent the evening figuring out what I needed to know in order to fix it. I first had to figure out how to get it to use Direct2D to reproduce the issue, and then I became curious why it wasn't using D2D already on apparently perfectly fine hardware, and then... a long diversion that became this blog post.

      "Soon", where that means when I have enough spare time :(

  6. Dave, you should urgently get the latest Delphi version (XE5 upd2 as of today) since FireMonkey code evolves quickly. Did you see the update I made to your source code so that it compiles and works with XE5?

    1. François, I did just see that update! I will try to have a closer look tonight. Thanks very much for submitting changes, btw - I appreciate it!

      XE5: I would love to, but I don't have Software Assurance and can't justify buying it myself. You see, I have access to some versions of Delphi through work, but the other versions I have I own myself - and outside of work, stuff like this (this blog, investigating FireMonkey, etc) is a hobby, and I can't justify a couple of thousand euros / year on upgrades and SA for something I spend such a small amount of time on. I think I will upgrade to XE6 when that comes out and maybe re-examine SA then. I'm also thinking of writing and selling some components or tools, and I hope then I may be able to access all IDE versions through an Embarcadero partner scheme (I'm not sure, I haven't asked them yet, I'm speculating.)

    2. Hi Dave,
      Becoming a technology partner is an excellent way as they do give you access to all versions of Delphi for developing your components. At the time (~10 years ago), I remember it as being pretty easy to become a partner (and I have been one ever since). I don't know how easy it is now though. Without that perk, I could not afford to keep up on Delphi purchases.
      Tom

  7. Where can I find source code ?
    Thanks

    1. Hi Michael,

      You either have to patch the FMX source directly as in this article (not recommended) or you can use a unit I wrote described in the followup to this blog post: http://itinerantdeveloper.blogspot.com/2014/01/a-unit-to-enable-direct2d-in-firemonkey.html . There is a link to download the unit at the end of that post, and the post contains instructions about how to use it (you need to add something to the .dpr file source.)

      Hope that helps!
