Improving JavaScript code quality with JSLint integrated with Visual Studio

by Matt Perdeck 25. July 2012 11:29

JSLint is a tool that goes through your JavaScript code and points out bad coding practices, including undeclared variables. You can now easily integrate JSLint with Visual Studio.

JavaScript Weaknesses

An increasing number of web sites are using JavaScript not only for non-critical features such as client side validation, but also for mission critical core features, to the point where serious application frameworks such as Knockout and Backbone are now in widespread use.

The problem is that JavaScript was never meant for large scale programming:

  1. Using an undeclared variable doesn't result in an exception, but in a new global variable - making JavaScript very error prone.
  2. No compiler, so no compile time checking for syntax errors, etc.
  3. No support for strong typing - making it harder for developers to write correct code and for browser vendors to write efficient interpreters/run time compilers.
  4. No classes, no generics - classes can be simulated, but in roundabout ways.
  5. JavaScript programs travel over the wire as source code, rather than more efficient bytecode.
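
To make the first weakness concrete, here is a minimal sketch. The names assignSloppy, assignStrict, leakedCounter and anotherLeak are made up for illustration. The functions are built with the Function constructor so the example behaves the same whether or not the surrounding file runs in strict mode. Strict mode, introduced with ECMAScript 5, turns the silent global into an error:

```javascript
// Function bodies created with the Function constructor are non-strict
// unless they opt in to strict mode themselves.
var assignSloppy = new Function("leakedCounter = 1;");
var assignStrict = new Function("'use strict'; anotherLeak = 1;");

// Reports the type of a global variable without throwing if it is missing.
function typeOfGlobal(name) {
    return new Function("return typeof " + name + ";")();
}

function demoUndeclaredVariables() {
    var result = {};

    // No var keyword, no exception: a new global variable appears.
    assignSloppy();
    result.sloppyCreatedGlobal = typeOfGlobal("leakedCounter") === "number";

    // In strict mode the same mistake throws a ReferenceError.
    try {
        assignStrict();
        result.strictThrew = false;
    } catch (e) {
        result.strictThrew = e instanceof ReferenceError;
    }
    result.strictCreatedGlobal = typeOfGlobal("anotherLeak") !== "undefined";
    return result;
}
```

Calling demoUndeclaredVariables() returns an object whose flags show that only the non-strict assignment leaked a global.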

With the demise of Silverlight however, there are now no alternatives to JavaScript that solve these problems.

JSLint

JSLint takes a little bit of the sting out of JavaScript programming. It is essentially a command line tool that goes through your program and picks up likely bugs and bad coding practices. Those bad practices have been set out by its author, Douglas Crockford, on his site.

JSLint has 30 options determining what likely bugs or practices to report on, from white space styling issues to disallowing eval and ++. Those options can go on the command line or in your source code as a specially formatted comment line. The naming of some options such as Tolerate stupidity and Tolerate messy white space probably gives some insight into Mr. Crockford's personality.

JSLint will pick up undeclared variables, but not type violations.
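
As an illustration of options set in a source comment, the sketch below uses a /*jslint ... */ directive. The option names browser and plusplus are the ones JSLint used around the time of writing. Treat them as an assumption about your JSLint version and check its documentation. The functions countSeats and addPrices are hypothetical. The second one shows the kind of type violation that JSLint does not report:

```javascript
/*jslint browser: true, plusplus: true */

// plusplus: true asks JSLint to tolerate the ++ operator used below.
function countSeats(rows) {
    var i, total = 0;
    for (i = 0; i < rows.length; i++) {
        total += rows[i];
    }
    return total;
}

// JSLint has no type information, so it cannot tell that a caller
// might pass a string here and get concatenation instead of addition.
function addPrices(a, b) {
    return a + b;
}
```

Passing the string "199.95" instead of the number 199.95 to addPrices silently produces a concatenated string, which only a test or a type checker would catch.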

Other linting programs

JSLint is not the only linting program available. Others include:

Lint in Visual Studio 2010

You could do your linting from the command line, but having it integrated with Visual Studio is much better. A number of solutions set out to do just that, including:

Of these options, JSLint for Visual Studio 2010 is the only true Visual Studio plugin. It has some nice features, including:

  • Very easy to install - download the .vsix and double click it.
  • Easy to configure from the Tools menu.
  • Options to lint CSS, HTML and JavaScript.
  • Option to validate each time you do a build, and to break the build if JSLint complains. Sadly, this does not break TFS Builds.
  • Import / export settings to an .xml file, for backup or to share with your team.
  • Issues found by JSLint appear in the Error List, like C# compile errors.
  • Ability to skip linting for individual files and entire directories. Useful, because JSLint doesn't like minified code.
  • Ability to not lint certain sections of code, using special comment lines.

Conclusion

It makes sense to lint your JavaScript to at least catch undeclared variables. Unlike unit testing JavaScript from within Visual Studio, linting JavaScript is now very easy to achieve.

Cache busting using assembly version for RequireJS / ASP.NET MVC projects

by Matt Perdeck 19. July 2012 12:19

Shows how to improve your ASP.NET MVC web site's performance through far future client side caching of RequireJS modules, while still forcing the browser to refresh its cache the moment you introduce a new version of your modules.

Introduction

RequireJS generates script tags to load modules. To increase web site performance, you want to configure IIS to send response headers that tell the browser to cache those modules for up to a year - the maximum according to the HTTP standard (why; how).

However, when you update one or more modules, you don't want your users to keep using the old versions for up to a year. You want them to update their caches right away.

In this article we'll see how to do cache busting based on the version of the assembly running your web site code. The download contains a working example.

Cache Busting 101

It is very easy to get a browser to refresh its cache. Take this script tag:

<script src="/Scripts/main.js" type="text/javascript"></script>

All you need to do is add some sort of version in a query string:

<script src="/Scripts/main.js?v=1.0.0.0" type="text/javascript"></script>

Then when you update the version:

<script src="/Scripts/main.js?v=1.1.0.0" type="text/javascript"></script>

The browser will look for main.js?v=1.1.0.0 in its cache, won't find it, and request it from the server.

Cache busting this way has pros and cons:

  • Advantage - Very simple. Because you're using a query string, there is no need to change the name of main.js on the server.
  • Disadvantage - Loss of caching performance. When you introduce the query string, proxies and the IIS kernel cache no longer cache your file.

The optimal cache busting method involves changing the file name itself, such as main.1.1.0.0.js instead of main.js?v=1.1.0.0. There are several packages that will do this for you on the fly (example), but they don't integrate with RequireJS. So we'll stick with query strings in the rest of this article.
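
For comparison, here is a minimal sketch of the file name variant. The function name versionedFileName is hypothetical, and the sketch only builds the URL. The server still has to serve main.js under the versioned name, for example through a rewrite rule, which is not shown here:

```javascript
// Inserts a version before the file extension, for example
// "/Scripts/main.js" with version "1.1.0.0" becomes "/Scripts/main.1.1.0.0.js".
function versionedFileName(url, version) {
    var lastSlash = url.lastIndexOf("/");
    var lastDot = url.lastIndexOf(".");

    // No extension in the last path segment: append the version instead.
    if (lastDot <= lastSlash) {
        return url + "." + version;
    }
    return url.substring(0, lastDot) + "." + version + url.substring(lastDot);
}
```

Because the version becomes part of the path rather than the query string, caches that ignore URLs with query strings can still store the file.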

Cache Busting with RequireJS

RequireJS lets you add a query string to all script tags it generates with the urlArgs configuration option:

<script type="text/javascript"> var require = { urlArgs: "v=1.0.0.0" }; </script>
<script data-main="Scripts/main" 
        src="http://cdnjs.cloudflare.com/ajax/libs/require.js/2.0.2/require.min.js" 
        type="text/javascript"></script>
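
The effect of urlArgs can be approximated with a small helper. This is a sketch of the behaviour, not RequireJS's own code. The function moduleUrl and its parameters are hypothetical:

```javascript
// Hypothetical helper, not part of RequireJS: approximates how a module ID
// becomes the src of a generated script tag when urlArgs is configured.
function moduleUrl(baseUrl, moduleId, urlArgs) {
    var url = baseUrl.replace(/\/$/, "") + "/" + moduleId + ".js";
    if (urlArgs) {
        // Use "?" for the first query parameter, "&" for any further ones.
        url += (url.indexOf("?") === -1 ? "?" : "&") + urlArgs;
    }
    return url;
}
```

With a base URL of Scripts, module main and urlArgs v=1.0.0.0, the helper yields Scripts/main.js?v=1.0.0.0. Every module loaded through RequireJS gets the same suffix, so one configuration value busts the cache for all of them.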

If you run the sample site in the download in Chrome and inspect the DOM (right click anywhere on the page | Inspect element), you'll find that RequireJS has generated this:

Using the assembly version

So far so good. However, you don't want to change your HTML manually each time you do a new release - too clumsy and error prone.

It's much more attractive to use the assembly version of your site:

  • Easy to update manually - right click your web site project | Properties | Assembly Information.
  • Can be updated automatically by TFS Build (how, how, how).

You can get the assembly version via the currently executing assembly:

string version = 
    System.Reflection.Assembly.GetExecutingAssembly().GetName().Version.ToString();

However, this won't work if your site is hosted by a shared hosting company that uses Medium Trust, such as GoDaddy, because the call to GetExecutingAssembly is not allowed for security reasons in such an environment.

Putting it together with an Html Helper

Let's first create an Html Helper that returns the assembly version:

using System.Web.Mvc; // provides HtmlHelper

namespace RequireJSAndVersioning.Helpers
{
    public static class AssemblyVersionHelper
    {
        public static string AssemblyVersion(this HtmlHelper helper)
        {
            return System.Reflection.Assembly.GetExecutingAssembly()
                         .GetName().Version.ToString();
        }
    }
}

Now import your helper's namespace in your Razor file:

@using RequireJSAndVersioning.Helpers

Finally, use the helper with the RequireJS configuration section:

<script type="text/javascript">
    var require = {
        urlArgs: "v=@Html.AssemblyVersion()"
    };
</script>
<script data-main="Scripts/main" 
        src="http://cdnjs.cloudflare.com/ajax/libs/require.js/2.0.2/require.min.js" 
        type="text/javascript"></script>

As a result of all this, whenever the assembly version of your site changes, all your visitors' browsers will request the latest versions of your modules from your server.

How to use the RequireJS optimizer with jQuery loaded from CDN and plugins as shims

by Matt Perdeck 17. July 2012 09:44

Describes how to use the RequireJS optimizer with code that loads jQuery from a CDN and uses shims for jQuery plugins, avoiding the need to make the plugins into AMD modules.

Introduction

It is now very common for web sites to load jQuery from a CDN and to also use a number of jQuery plugins. Because most plugins are not available from a CDN, they need to be stored on the web server.

It would be great to use RequireJS to manage the dependencies of your code to the plugins and jQuery, so the plugins and jQuery are loaded automatically when your code loads. Because jQuery plugins are not AMD (Asynchronous Module Definition) modules, they need to be configured in RequireJS as shims.

However, the RequireJS documentation for shims points out that you can't use the RequireJS optimizer with code that uses both shims and libraries loaded from a CDN. This turns out to be true, but only to a certain extent. This article shows how to get around this limitation by loading your AMD modules code and your (non-AMD) plugins in separate stages.

Version 0 - simple site without RequireJS

To investigate this issue, I created a simple single page site without RequireJS at first, with a main.js file, two more custom JavaScript files and two jQuery plugins, Splatter and Toastmessage (version 0 in the download).

Here is the code:

html (note the long list of script tags!):

<h2>Index Page</h2>

<p>
<button onclick="model.save_click()">Save</button>
<button onclick="model.splat_click()">Splat!</button>
</p>

<div id="splatters"></div>

<script src="Scripts/dataAccess.js" type="text/javascript"></script>
<script src="Scripts/ticket.js" type="text/javascript"></script>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js" 
           type="text/javascript"></script>
<script src="Scripts/lib/jquery.splatter.js" type="text/javascript"></script>
<script src="Scripts/lib/jquery.toastmessage.js" type="text/javascript"></script>
<script src="Scripts/main.js" type="text/javascript"></script>

main.js:

function TicketsViewModel() {
    this.save_click = function () {
        var ticket = new Ticket("Economy", 199.95);
        var message = ticket.save();
        $().toastmessage('showNoticeToast', message);
    }

    this.splat_click = function () {
        $('#splatters').splatter({
            height: 250,
            width: 700,
            splat_count: 120
        });
    }
}

var model = new TicketsViewModel();

ticket.js:

var Ticket = function (name, price) {
    this.name = name;
    this.price = price;
    this.save = function () {
        return DataAccess.save(this.name + ' at ' + this.price);
    };
};

dataAccess.js:

var DataAccess = (function () {
    var my = {};
    my.save = function (data) {
        //TODO: Save data
        return('Saving: ' + data);
    };

    return my;
} ());

Next step is to start using RequireJS for dependency management.

Version 1 - first unsuccessful attempt

First let's make the two custom JavaScript files into AMD modules (details).

ticket.js:

define(['dataAccess'], function (DataAccess) {
    return function (name, price) {
        this.name = name;
        this.price = price;
        this.save = function () {
            return DataAccess.save(this.name + ' at ' + this.price);
        };
    };
});

dataAccess.js:

define({
        'save': function (data) {
            //TODO: Save data
            return('Saving: ' + data);
        }
});

We'll make main.js into an AMD module as well, by wrapping its code in a require call. Additionally, we'll introduce shims for the jQuery plugins in a requirejs.config call. Finally, we'll point RequireJS at the download path for jQuery.

Note that the plugins and the backup file for jQuery all live in the lib subdirectory. Also, RequireJS wants you to leave off the .js extension.

requirejs.config({
    paths: {
        jquery: [
            'https://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min',
            // If the CDN location fails
            'lib/jquery-1.7.2.min'
        ]
    },
    shim: {
        'lib/jquery.splatter': ['jquery'],
        'lib/jquery.toastmessage': ['jquery']
    }
});

// ----------------------------------------

require(['ticket', 'jquery', 'lib/jquery.splatter', 'lib/jquery.toastmessage'], function (Ticket, $) {
	function TicketsViewModel() {
		this.save_click = function () {
			var ticket = new Ticket("Economy", 199.95);
			var message = ticket.save();
			$().toastmessage('showNoticeToast', message);
		}

		this.splat_click = function () {
			$('#splatters').splatter({
				height: 250,
				width: 700,
				splat_count: 120
			});
		}
	}

	model = new TicketsViewModel();
});

Finally, we'll replace the long list of script tags with a single load of the require library from CDNJS:

<h2>Index Page</h2>

<p>
<button onclick="model.save_click()">Save</button>
<button onclick="model.splat_click()">Splat!</button>
</p>

<div id="splatters"></div>

<script data-main="Scripts/main" src="http://cdnjs.cloudflare.com/ajax/libs/require.js/2.0.2/require.min.js" type="text/javascript"></script>

Fails when optimized

When you run this, you'll find that this works well. However, when you then optimize the code and run the result, you'll find that the jQuery plugins get loaded before jQuery itself, causing a JavaScript error.

This is because the plugins have been combined with all other code into main.js - except for jQuery, which is loaded from the CDN. Because the plugins are not AMD modules and so have not been marked as dependent on jQuery, they get loaded before jQuery itself has been loaded.
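
The ordering problem can be simulated without a browser. In this sketch, fakeJQuery, fakePlugin and loadInOrder are made-up stand-ins, and the scope object plays the role of the global scope. In a real page the failure would typically be a ReferenceError on jQuery. The simulation produces a TypeError because it reads a missing property instead:

```javascript
// Stands in for jQuery arriving from the CDN.
function fakeJQuery(scope) {
    scope.jQuery = { fn: {} };
}

// Stands in for a typical plugin, which attaches itself to jQuery.fn
// as soon as its script executes.
function fakePlugin(scope) {
    scope.jQuery.fn.splatter = function () { return this; };
}

// Runs the "scripts" in the given order against a fresh global scope.
function loadInOrder(scripts) {
    var scope = {};
    try {
        scripts.forEach(function (script) { script(scope); });
        return "ok";
    } catch (e) {
        return "error: " + e.name;
    }
}
```

Running loadInOrder with the plugin first mirrors the optimized build and reports an error, while running it with jQuery first mirrors the unoptimized site and succeeds.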

There are two solutions to this:

  1. Make the plugins into AMD modules and make them dependent on jQuery. This way, their code will only be executed after jQuery has loaded. However, this has the disadvantage that you (and your team members) have to remember to convert new plugins and new versions of plugins whenever they are introduced to the site.
  2. Create a new AMD module that loads the plugins. By making that module dependent on jQuery, it will only load the plugins after jQuery has loaded. This way, you don't have to convert the plugins, but there will be at least one additional download.

I chose the second option because it makes maintenance a bit easier. Note that your site might be loading a lot more stuff besides JavaScript files, such as stylesheets and images. This will reduce the impact of the additional file load.

Version 2 - using a new AMD module that loads the plugins

Let's first create the new AMD module. This has a dependency on jQuery, so won't be executed until jQuery has loaded. It then calls require with the two plugins as dependencies. That will prompt RequireJS to load the plugins.

jQueryPlugins.js:

define(['jquery'], function ($) {
	// Only once jquery has loaded, load the jQuery plugins
	require(['lib/jquery.splatter', 'lib/jquery.toastmessage'], function () {
		// Do nothing
	});
});

Now change main.js to replace the dependency on the two plugins with a dependency on jQueryPlugins:

...
require(['ticket', 'jquery', 'jQueryPlugins'], function (Ticket, $) {
...

RequireJS tends to be vigorous when it comes to finding modules to combine into a single file. We need to tell RequireJS not to combine the plugins into main.js, by excluding them in the app.build.js file which controls the optimization process.

app.build.js:

({
    appDir: "../",
    baseUrl: "Scripts",
    dir: "../../RequireJSTrial.Site-build",
    modules: [
        {
            name: "main", exclude: [ 'lib/jquery.splatter', 'lib/jquery.toastmessage' ]
        }
    ],
    paths: {
        jquery: "empty:"
    }
})

Better, but not good enough

If you now optimize the code and run it, you'll find that it now runs well (see the RequireJSTrial.Site-build directory in version 2 in the downloads).

However, looking at a waterfall chart of the page load, it is clear that the plugins are loaded separately. Which is no wonder, seeing that we went out of our way to stop them from being combined into main.js.

Version 2 waterfall

The solution is to add another AMD module which is dependent on the jQuery plugins. By listing it as a module in app.build.js, that module and the plugins will be combined into one file. We can then get jQueryPlugins.js to load this one file, instead of the individual plugins.

This all leads to ...

Version 3 - using a second AMD module to load all plugins

First we need the second module that loads all plugins.

jQueryPluginsCollection.js:

define(['lib/jquery.splatter', 'lib/jquery.toastmessage'], function ($) {
	// Do nothing
});

Update jQueryPlugins.js so it loads jQueryPluginsCollection.js instead of loading the plugins individually.

jQueryPlugins.js:

define(['jquery'], function ($) {
	// Only once jquery has loaded, load the jQuery plugins
	require(['jQueryPluginsCollection'], function () {
		// Do nothing
	});
});

Finally, list jQueryPluginsCollection.js as a module in app.build.js. That way, the optimizer will combine and minify jQueryPluginsCollection.js and its dependencies (being the plugins). We also need to tell the optimizer not to combine it with main.js.

app.build.js:

({
    appDir: "../",
    baseUrl: "Scripts",
    dir: "../../RequireJSTrial.Site-build",
    modules: [
        {
            name: "main",
            // Exclude these, otherwise they get combined into main.js,
            // meaning they'll be loaded before jQuery comes in from
            // the CDN.
            exclude: [
                'jQueryPluginsCollection',
                'lib/jquery.splatter',
                'lib/jquery.toastmessage'
            ]
        },
        {
            name: "jQueryPluginsCollection"
        }
    ],
    paths: {
        jquery: "empty:"
    }
})

If you now run the optimizer and load the page, you'll see this waterfall. This shows how all plugins are loaded in one go after jQuery has loaded:

Version 3 waterfall

Unit testing JavaScript as part of TFS Build

by Matt Perdeck 11. July 2012 00:45

Introduction

Visual Studio makes it very easy to unit test your C# or Visual Basic code. Even better, TFS can run your unit tests as part of a build, providing a bit more confidence in the quality of your code.

However, these days, a large proportion of a site's functionality is often not coded in C# or Visual Basic, but in JavaScript. Unfortunately, there is no obvious support for unit testing JavaScript in Visual Studio as of July 2012.

This article shows how to integrate JavaScript unit testing in your ASP.NET MVC solution. The objective here is to have the JavaScript unit tests execute whenever you run your unit tests - whether from the Test View window in Visual Studio or as part of a TFS build.

Working sample code

To provide a working example of all this, the download contains:

  • A very simple ASP.NET MVC site without JavaScript testing
  • The same site, with JavaScript testing added

You may have to unblock the zip file before unzipping it, so Visual Studio will properly run the solution that's inside - right click zip file | Properties | Unblock.

Comparing QUnit versus Jasmine

There are many JavaScript unit testing frameworks out there (overview). The two most popular ones at the time of writing however are QUnit and Jasmine.

                                          QUnit 1.8.0                   Jasmine 1.2.0
Runs in                                   web page                      web page
Written in                                JavaScript                    JavaScript
Style                                     Similar to xUnit              Behavior Driven Development (BDD)
Grouping of tests                         module(...) before tests      describe(...) around tests
Groups can contain groups                 No                            Yes
ReSharper support                         Yes (version 6)               Yes (version 7)
Supports spies, stubbing, mocking         No [1]                        Yes
Supports async tests [2]                  Yes                           Yes
Supports testing for exceptions           Yes                           Yes
Built in assertions                       5 unique + 4 not versions     12 + separate not modifier
Supports custom assertions                No                            Yes
Setup/Teardown functions [3]              Yes                           Yes
Can mock the JavaScript clock [4]         No                            Yes
Can check for missing vars [5]            Yes                           No
Automatically resets test DOM area        Yes (qunit-fixture div)       No
  after each test
Optionally disables try catch [6]         Yes                           No

Style, QUnit:

test("increment a variable", function() {
  var a = 0;
  a++;

  equal(a, 1, "should be 1");
});

Style, Jasmine:

it('should increment a variable',
   function () {
     var a = 0;
     a++;

     expect(a).toEqual(1);
});

Grouping of tests, QUnit:

module("core");
test("increment a variable", function() {
   ...
});

test("decrement a variable", function() {
   ...
});

Grouping of tests, Jasmine:

describe('Calculator',
  function () {
     it('can increment',
        function () {
           ...
        });

     it('can decrement',
        function () {
           ...
        });
});

Test DOM area that QUnit resets:

<div id="qunit-fixture">
   content wiped before each test
</div>

[1] A separate package, SinonJs, can provide support for spies and stubbing and mocking for QUnit tests.
[2] Pause the runner to wait for an async response from the server, setTimeout, etc.
[3] Executed before/after each test.
[4] So you don't have to wait 10 seconds for a 10 second timeout.
[5] Missing var in variable declaration makes variable global.
[6] Depending on your browser, can make it easier to get a stack trace.

Both QUnit and Jasmine are competent, well supported packages. However, because Jasmine currently has the edge in features, the rest of this article focusses on adding Jasmine tests to an ASP.NET MVC solution.

Visual Studio 2012 and Chutzpah

Microsoft has made unit testing extensible in Visual Studio 2012. If you already use that version, you can use the Chutzpah package to make JavaScript unit testing a first class citizen inside Visual Studio (details).

The rest of this article assumes you use Visual Studio 2010, where things are a bit more complicated.

Integrating Jasmine unit tests in Visual Studio 2010 and TFS 2010

To make this work, we can take this approach:

  1. Install Jasmine and add JavaScript unit tests. This allows us to run the tests inside a browser, which is good. However, we also want to run those tests as part of a unit test inside Visual Studio, and have it fail that test when the JavaScript unit tests fail.
  2. Install Chutzpah. This allows us to run the Jasmine unit tests from the command line. It will write a failure message to the console if the tests fail.
  3. Create a unit test within Visual Studio that executes that command line. If the output contains the failure message, fail the test.

Let's make this work. You will find a worked out example in the download.

Install Jasmine and write some tests

  1. I'm assuming you have a project that looks a bit like this:
    MyProject.Site
       Controllers
       Models
       Views
       Scripts
       ...
    MyProject.Tests
       Controllers
       ...
    
  2. Download the Jasmine zip file.
  3. Unzip the file. This will result in a few directories and an html file. Create a new directory JsUnitTests in your Test project and move the Jasmine directories and files in there:
    MyProject.Site
       Controllers
       Models
       Views
       Scripts
       ...
    MyProject.Tests
       Controllers
       JsUnitTests
          spec
          src
          lib
          SpecRunner.html
       ...
    
  4. spec contains demo tests, and src contains demo JavaScript code that the demo tests run against. Open SpecRunner.html in a browser to see Jasmine in action.
  5. Remove the src directory. Update SpecRunner.html with script tags that load the actual JavaScript code you want to test. Use relative paths in your script tags, not absolute paths. That makes it much easier if you ever want to move or copy your solution, especially if you use TFS Build.
  6. Get rid of the demo tests in spec and write a few tests of your own against your own code. An excellent tutorial on writing Jasmine tests is on the Jasmine home page. Update SpecRunner.html with script tags that load your tests.
  7. Open SpecRunner.html again in a browser to see whether your code passes.
  8. Make sure that all your new files have been included in the project and that they are checked in.
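
A first test of your own could look like the sketch below. The function formatTicket is a hypothetical piece of code under test. In a real project it would live in its own file under Scripts and the spec in its own file under spec, both loaded by script tags in SpecRunner.html. They are combined here, with a guard around the spec, only so the sketch can be read and run on its own:

```javascript
// Hypothetical code under test.
function formatTicket(name, price) {
    return name + ' at ' + price.toFixed(2);
}

// Only define the spec when Jasmine's globals are present, so this file
// can also be loaded outside the spec runner without errors.
if (typeof describe === 'function') {
    describe('formatTicket', function () {
        it('shows the price with two decimals', function () {
            expect(formatTicket('Economy', 199.5)).toEqual('Economy at 199.50');
        });

        it('keeps the ticket name', function () {
            expect(formatTicket('Business', 10)).toContain('Business');
        });
    });
}
```

Each it call describes one expected behaviour, which keeps the failure messages in the spec runner readable.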

Install Chutzpah

Chutzpah allows you to run JavaScript unit tests from the command line.

  1. Install NuGet if you haven't done so before.
  2. In Visual Studio, open the Package Manager Console - click Tools | Library Package Manager | Package Manager Console.
  3. At the PM> prompt, enter
    Install-Package Chutzpah
    

    This creates a new packages directory in your solution and installs the Chutzpah files in there. At the time of writing, Chutzpah was at version 1.4.2, which results in:

    packages
       Chutzpah.1.4.2
          tools
             chutzpah.console.exe
             ...
    MyProject.Site
       Controllers
       Models
       Views
       Scripts
       ...
    MyProject.Tests
       Controllers
       JsUnitTests
          spec
          lib
          SpecRunner.html
       ...
    
  4. Now you can run the Jasmine tests from the command line (update this if the Chutzpah version number has changed):
    cd <directory containing SpecRunner.html>
    ..\..\packages\Chutzpah.1.4.2\tools\chutzpah.console.exe SpecRunner.html
    

Create unit test inside Visual Studio

Finally, we'll create a unit test that executes the command line you just saw.

  1. Install the NuGet package ExecConsoleProgram. This makes it easy to run a console program from managed code, such as a unit test. Open the PM> prompt again and enter:
    Install-Package ExecConsoleProgram
    

     

    NuGet stores package information in a packages directory in the root of your solution (where your .sln is located). Be sure to check in that packages directory, so the build controller can access it when you do a TFS Build.
  2. Add a reference to the ExecConsoleProgram dll to your Tests project. You'll find the dll in the packages directory.
  3. To your JsUnitTests directory, add a new unit test - right click JsUnitTests | Add | New Test | Basic Unit Test
  4. Make it look like this:
    [TestMethod]
    public void JsTest()
    {
        string stdOut;
        string stdErr;
    
        try
        {
            // Correct paths when running unit tests in Visual Studio
            ExecConsoleProgram.ConsoleProgram.Execute(
                @"..\..\JsUnitTests", 
                @"..\..\..\packages\Chutzpah.1.4.2\tools\chutzpah.console.exe", 
                @"SpecRunner.html",
                out stdOut, out stdErr);
        }
        catch (System.ComponentModel.Win32Exception)
        {
            // Correct paths when running unit tests as part of TFS Build
            // >>>>>> In the code below, replace JasmineMVC.Tests with the name 
            // of your own test project.
            ExecConsoleProgram.ConsoleProgram.Execute(
                @"..\Sources\JasmineMVC.Tests\JsUnitTests", 
                @"..\Sources\packages\Chutzpah.1.4.2\tools\chutzpah.console.exe", 
                @"SpecRunner.html",
                out stdOut, out stdErr);
        }
    
        Assert.IsFalse(stdOut.Contains("[FAIL]"));
        Assert.IsFalse(stdOut.Contains("ERROR OCCURRED"));
        Assert.IsTrue(string.IsNullOrEmpty(stdErr));
    }
    

    The Execute method runs the command line. It takes these parameters:

    • Working directory. This is relative to the directory where the currently running assembly is located.
    • Path to the executable (relative to the directory where the currently running assembly is located).
    • Parameters to the executable (relative to working directory).
    • stdOut and stdErr receive the output of the program. As you see, the unit test fails if Jasmine reported that the tests failed or if something went wrong.
  5. Done! When you now run your unit tests from the Test View window, they'll run the JavaScript tests as well. Change one of your tests to force it to fail and see what happens.
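
The pass/fail rule applied by the three assertions in the C# test can be written as a single function. This sketch restates it in JavaScript, for instance for use in a Node based build script. The name chutzpahRunPassed is hypothetical. The markers it looks for are the ones the C# test checks:

```javascript
// Mirrors the three assertions in the C# test: the run counts as failed
// when the output mentions a failed test or an error, or when anything
// was written to the error stream.
function chutzpahRunPassed(stdOut, stdErr) {
    var out = stdOut || '';
    var err = stdErr || '';
    return out.indexOf('[FAIL]') === -1 &&
        out.indexOf('ERROR OCCURRED') === -1 &&
        err.length === 0;
}
```

Keeping the rule in one place makes it easy to adjust if a later Chutzpah version changes the wording of its output.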

Speeding up database access - part 8 Fixing memory, disk and CPU issues

by Matt Perdeck 1. December 2011 20:16

This is part 8 of an 8 part series of articles about speeding up access to a SQL Server database. This series is based on chapter 8 "Speeding up Database Access" of my book ASP.NET Site Performance Secrets, available at amazon.com and other book sites.

In part 2, we saw how to pinpoint bottlenecks related to the database server hardware - memory, disks and CPU. In this last part 8, we'll look at fixing those hardware issues.

  • Part 1 Pinpointing missing indexes and expensive queries
  • Part 2 Pinpointing other bottlenecks
  • Part 3 Fixing missing indexes
  • Part 4 Fixing expensive queries
  • Part 5 Fixing locking issues
  • Part 6 Fixing execution plan reuse
  • Part 7 Fixing fragmentation
  • Part 8 Fixing memory, disk and CPU issues

Memory

These are the most common ways to relieve memory stress:

  • Add more physical memory.
  • Increase the amount of memory allocated to SQL Server. To see how much is currently allocated, run:
    EXEC sp_configure 'show advanced option', '1'
    RECONFIGURE
    EXEC sp_configure 'max server memory (MB)'

    If more physical memory is available on the server, increase the allocation. For example, to increase the allocation to 3000 MB, run:

    EXEC sp_configure 'show advanced option', '1'
    RECONFIGURE
    EXEC sp_configure 'max server memory (MB)', 3000
    RECONFIGURE WITH OVERRIDE

    Do not allocate all physical memory. Leave a few hundred MB free for the operating system and other software.

  • Reduce the amount of data read from disk. Each page read from disk needs to be stored and processed in memory. Table scans, aggregate queries and joins can read large amounts of data. Refer to parts 1 and 3 to see how to reduce the amount of data read from disk.
  • Promote reuse of execution plans, to reduce memory needed for the plan cache. See part 6.
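
The allocation advice comes down to simple arithmetic: physical memory minus a reserve for the operating system and other software. The sketch below generates the same T-SQL as shown earlier for a given server. The function name maxServerMemoryScript and the reserve parameter are hypothetical, and the right reserve for your server is something to measure rather than assume:

```javascript
// Builds the sp_configure statements for a given amount of physical memory.
// reserveMb is the memory left for the operating system and other software.
function maxServerMemoryScript(physicalMb, reserveMb) {
    var allocationMb = Math.floor(physicalMb - reserveMb);
    if (!(allocationMb > 0)) {
        throw new RangeError('Reserve must be smaller than physical memory');
    }
    return [
        "EXEC sp_configure 'show advanced option', '1'",
        "RECONFIGURE",
        "EXEC sp_configure 'max server memory (MB)', " + allocationMb,
        "RECONFIGURE WITH OVERRIDE"
    ].join('\n');
}
```

For a server with 4096 MB of physical memory and a 512 MB reserve, the generated script sets max server memory to 3584 MB.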

Disk usage

Here are the most common methods to reduce stress on the disk system:

  • Optimizing query processing.
  • Move the log file to a dedicated physical disk.
  • Reduce fragmentation of the NTFS file system.
  • Consider moving the tempdb database to its own disk.
  • Split the data over 2 or more disks, to spread the load.
  • Alternatively, move heavily used database objects to another disk.
  • Use the optimal RAID configuration.

Let's go through these options one by one.

Optimizing query processing

Make sure you have the correct indexes in place and optimize the most expensive queries. Refer to parts 1, 3 and 4.

Moving the log file to a dedicated physical disk

Moving the read/write head of a disk is a relatively slow process. The log file is written sequentially, which by itself requires little head movement. This doesn't help you though if the log file and data file are on the same disk, because then the head has to move between log file and data file.

However, if you put the log file on its own disk, head movement on that disk is minimized, leading to faster access to the log file. That in turn leads to quicker write operations, such as UPDATEs, INSERTs and DELETEs.

To move the log file to another disk for an existing database, first detach the database. Move the log file to the dedicated disk. Then reattach the database, specifying the new location of the log file.

Reduce fragmentation of the NTFS file system

When the actual NTFS database files become fragmented, the disk head has to hunt around the disk for the fragments when reading a file. To reduce fragmentation, set a large initial file size (for your database and log files) and a large increment size. Better still, set them large enough so neither file ever has to grow. You want to prevent growing and shrinking the files.

If you do need to grow and shrink the database or log files, consider using a 64KB NTFS cluster size to match SQL Server reading patterns.

Consider moving the tempdb database to its own disk

tempdb is used for sorting, subqueries, temporary tables, aggregation, cursors, and so on. It can be very busy. That means that it may be a good idea to move the tempdb database to its own disk, or to a disk that is less busy.

To check the level of activity of the database and log files of tempdb and the other databases on the server, use the sys.dm_io_virtual_file_stats dynamic management function:

SELECT d.name, mf.physical_name, mf.type_desc, vfs.*
FROM sys.dm_io_virtual_file_stats(NULL,NULL) vfs
JOIN sys.databases d ON vfs.database_id = d.database_id 
JOIN sys.master_files mf ON mf.database_id=vfs.database_id AND mf.file_id=vfs.file_id

To move the tempdb data and log files to, for example, the G: disk, setting their sizes to 10MB and 1MB, run this code. Then restart the server.

ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'G:\tempdb.mdf', SIZE = 10MB) 
GO
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'G:\templog.ldf', SIZE = 1MB) 
GO

To reduce fragmentation, prevent growing and shrinking of the tempdb data and log files by giving them as much space as they are likely to ever need.

Split the database data over two or more disks

By splitting the database's data file over two or more disks, you spread the load. And because you wind up with more but smaller files, this also makes backup and moving the database easier.

To make this happen, add a file to the PRIMARY filegroup of the database. SQL Server then spreads the data over the existing file(s) and the new file. Put the new file on a new disk or a disk that isn't heavily used. If you can, make its initial size big enough so it doesn't have to grow further, thereby reducing fragmentation.

For example, to add a file to database TuneUp, on the G: disk, with an initial size of 20GB, run this command:

ALTER DATABASE TuneUp 
ADD FILE (NAME = TuneUp_2, FILENAME = N'G:\TuneUp_2.ndf', SIZE = 20GB) 

Note that the file has extension .ndf - the recommended extension for secondary files.

Move heavily used database objects to another disk

You could move heavily used database objects such as indexes to a new disk, or to less busy disks. In part 1 "Pinpointing missing indexes and expensive queries" you saw how to use the DMV dm_db_index_usage_stats to determine the number of reads and writes executed on each index. There it was used to find unused indexes, but you can also use it to find the busiest indexes.

And if your server has multiple disks, in part 2 "Pinpointing other bottlenecks", you saw how to measure the usage of your disks. Use this information to decide which objects to move to which disk.
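
To find the busiest indexes in the current database, a query along these lines could be used. It is only a sketch: counting reads as seeks plus scans plus lookups, and sorting by reads plus writes, are choices made for illustration.

```sql
-- Reads and writes per index since the last server restart, busiest first.
SELECT OBJECT_NAME(s.object_id) AS TableName,
       i.name AS IndexName,
       s.user_seeks + s.user_scans + s.user_lookups AS Reads,
       s.user_updates AS Writes
FROM sys.dm_db_index_usage_stats s
JOIN sys.indexes i
  ON i.object_id = s.object_id AND i.index_id = s.index_id
WHERE s.database_id = DB_ID()
ORDER BY s.user_seeks + s.user_scans + s.user_lookups + s.user_updates DESC;
```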

To move an index to another disk, first create a new user defined filegroup. For example, this statement creates a filegroup FG2:

ALTER DATABASE TuneUp ADD FILEGROUP FG2

Then add a file to the filegroup:

ALTER DATABASE TuneUp 
ADD FILE (NAME = TuneUp_Fg2, FILENAME = N'G:\TuneUp_Fg2.ndf', SIZE = 200MB)
TO FILEGROUP FG2

Finally move the object to the filegroup. For example, here is how to move a non-clustered index IX_Title on column Title in table Book to filegroup FG2:

CREATE NONCLUSTERED INDEX [IX_Title] ON [dbo].[Book]([Title] ASC) 
WITH DROP_EXISTING ON FG2

You can assign multiple objects to a filegroup. And you can add multiple files to a filegroup, allowing you to spread for example a very busy table or index over multiple disks.

Have tables and their non-clustered indexes on separate disks, so one task can read the index itself while another task is doing key lookups in the table.

Use the optimal RAID configuration

To improve performance and/or fault tolerance, many database servers use RAID (Redundant Array of Inexpensive Disks) subsystems instead of individual drives. RAID subsystems come in different configurations. Choosing the right configuration for your data files, log file and tempdb files can greatly affect performance.

The most commonly used RAID configurations are:

RAID Configuration | Description
RAID 0 | Each file is spread ("striped") over each disk in the array. When reading or writing a file, all disks are accessed in parallel, leading to high transfer rates.
RAID 5 | Each file is striped over all disks. Parity information for each disk is stored on the other disks, providing fault tolerance. File writes are slow - a single file write requires 1 data read + 1 parity read + 1 data write + 1 parity write = 4 accesses.
RAID 10 | Each file is striped over half the disks. Those disks are mirrored by the other half, providing excellent fault tolerance. A file write requires 1 data write to a main disk + 1 data write to a mirror disk.
RAID 1 | This is RAID 10 but with just 2 disks, a main disk and a mirror disk. That gives you fault tolerance but no striping.

This translates to the following performance characteristics compared with an individual disk. N is the number of disks in the array.

Configuration | Read Speed | Write Speed | Fault Tolerant
Individual Disk | 1 | 1 | no
RAID 0 | N | N | no
RAID 5 | N | N/4 | yes
RAID 10 | N | N/2 | yes
RAID 1 | 2 | 1 | yes

So if you have a RAID 10 with 4 disks (2 main + 2 mirror), N = 4 and read performance will be 4 times better than an individual disk, while write performance will be 4 / 2 = 2 times better. This is assuming that the individual disk has the same speed as the disks in the RAID 10.
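
The same arithmetic can be tried for other array sizes with a small query. This is just a calculator for the factors in the table above, not a measurement; the variable name @N and the column names are chosen for illustration.

```sql
DECLARE @N decimal(9,2) = 4;  -- number of disks in the array

-- Read and write speed relative to an individual disk of the same speed.
SELECT Config, ReadSpeed, WriteSpeed
FROM (VALUES
    ('RAID 0',  @N, @N),      -- striping only
    ('RAID 5',  @N, @N / 4),  -- 4 accesses per write
    ('RAID 10', @N, @N / 2)   -- each write goes to a main and a mirror disk
) AS r(Config, ReadSpeed, WriteSpeed);
```

With @N set to 4, the write factor works out to 1 for RAID 5 and 2 for RAID 10, matching the example above.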

From this follows the optimal RAID configuration to use for your tempdb, data and log files:

Files | Performance related attributes | Recommended RAID configuration
tempdb | Requires good read and write performance for random access. Relatively small. Losing temporary data may be acceptable. | RAID 0, RAID 1, RAID 10
log | Requires very good write performance, and fault tolerance. Uses sequential access, so striping is no benefit. | RAID 1, RAID 10
data (writes make up less than 10% of accesses) | Requires fault tolerance. Random access means striping is beneficial. Large data volume. | RAID 5, RAID 10
data (writes make up over 10% of accesses) | Same as above, plus good write performance. | RAID 10

Having a battery backed caching RAID controller greatly improves write performance, because this allows SQL Server to hand over write requests to the cache without having to wait for the physical disk access to complete. The controller then executes the cached write requests in the background.

CPU

Common ways to resolve processor bottlenecks include:

  • Optimize CPU intensive queries. In part 1 "Pinpointing missing indexes and expensive queries", you saw how to identify the most expensive queries. The DMVs listed there give you the CPU usage of each query. See sections "Missing Indexes" and "Expensive Queries" on how to optimize these queries.
  • Building execution plans is highly CPU intensive. Refer to part 6 to improve reuse of execution plans.
  • Install more or faster processors, L2/L3 cache or more efficient drivers.

Conclusion

In this part, we looked at optimizing the use of the available hardware, including memory, disks and CPU.

This was the last part in this series. If you enjoyed reading these articles, consider buying my book ASP.NET Site Performance Secrets, available at amazon.com and other book sites. It shows how to fix performance issues in ASP.NET / SQL Server web sites in a structured and hands on manner, by first pinpointing the biggest bottlenecks and then fixing those bottlenecks. It covers not only the database, but also the web server and the browser.

Speeding up database access - part 7 Fixing fragmentation

by Matt Perdeck 1. December 2011 20:13

This is part 7 of an 8 part series of articles about speeding up access to a SQL Server database. This series is based on chapter 8 "Speeding up Database Access" of my book ASP.NET Site Performance Secrets, available at amazon.com and other book sites.

  • Part 1 Pinpointing missing indexes and expensive queries
  • Part 2 Pinpointing other bottlenecks
  • Part 3 Fixing missing indexes
  • Part 4 Fixing expensive queries
  • Part 5 Fixing locking issues
  • Part 6 Fixing execution plan reuse
  • Part 7 Fixing fragmentation
  • Part 8 Fixing memory, disk and CPU issues

In part 2, we looked at what fragmentation is and how to pinpoint excessive fragmentation. In this part 7, we'll look at fixing excessive fragmentation.

SQL Server provides two options to defragment tables and indexes, rebuild and reorganize. Here we'll examine their advantages and disadvantages.

Index Rebuild

Rebuilding an index is the most effective way to defragment an index or table. To do a rebuild, use the command:

ALTER INDEX myindex ON mytable REBUILD

This rebuilds the index physically, using fresh pages, to reduce fragmentation to a minimum.

If you rebuild a clustered index, that has the effect of rebuilding the underlying table, because the table effectively is part of the clustered index.

To rebuild all indexes on a table, use the command:

ALTER INDEX ALL ON mytable REBUILD

Index rebuilding has the disadvantage that it blocks all queries trying to access the table and its indexes. It can also be blocked by queries that already have access. You can reduce this with the ONLINE option:

ALTER INDEX myindex ON mytable REBUILD WITH (ONLINE=ON) 

This will cause the rebuild to take longer though.

Another issue is that rebuilding is an atomic operation. If it is stopped before completion, all defragmentation work done so far is lost.

Index Reorganize

Unlike index rebuilding, index reorganizing doesn't block the table and its indexes, and if it is stopped before completion, the work done so far isn't lost. However, this comes at the price of reduced effectiveness. If an index is between 20% and 40% fragmented, reorganizing the index should suffice.

To reorganize an index, use the command:

ALTER INDEX myindex ON mytable REORGANIZE

Use the LOB_COMPACTION option to consolidate columns with Large Object data (LOB), such as image, text, ntext, varchar(max), nvarchar(max), varbinary(max) and xml:

ALTER INDEX myindex ON mytable REORGANIZE WITH (LOB_COMPACTION=ON) 

Index reorganizing is much more geared towards being performed in a busy system than index rebuilding. It is non atomic, so if it fails not all defragmentation work is lost. It requests small numbers of locks for short periods while it executes, rather than blocking entire tables and their indexes. If it finds that a page is being used, it simply skips that page without trying again.

The disadvantage of index reorganization is that it is less effective, because of the skipped pages, and because it won't create new pages to arrive at a better physical organization of the table or index.
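
To decide per index between the two options, you could combine the fragmentation figures from sys.dm_db_index_physical_stats with the thresholds mentioned above. The sketch below applies the 20% to 40% range for reorganizing; treating everything above 40% as a rebuild candidate is an assumption made here, so adjust the cut-offs to your own situation.

```sql
-- Fragmentation per index in the current database, with a suggested action.
SELECT OBJECT_NAME(ips.object_id) AS TableName,
       i.name AS IndexName,
       ips.avg_fragmentation_in_percent,
       CASE
           WHEN ips.avg_fragmentation_in_percent > 40  THEN 'REBUILD'      -- assumed cut-off
           WHEN ips.avg_fragmentation_in_percent >= 20 THEN 'REORGANIZE'
           ELSE 'leave as is'
       END AS SuggestedAction
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ips
JOIN sys.indexes i
  ON i.object_id = ips.object_id AND i.index_id = ips.index_id
WHERE ips.index_id > 0   -- skip heaps, which have index_id 0
ORDER BY ips.avg_fragmentation_in_percent DESC;
```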

Heap Table Defragmentation

A heap table is a table without a clustered index. Because it doesn't have a clustered index, it cannot be defragmented with ALTER INDEX REBUILD or ALTER INDEX REORGANIZE.

Fragmentation in heap tables tends to be less of a problem, because records in the table are not ordered. When inserting a record, SQL Server checks whether there is space within the table, and if so, inserts the record there. If you only ever insert records, and never update or delete them, all records are written at the end of the table. If you update or delete records, you may still wind up with gaps in the heap table.

Since heap table defragmentation is not normally an issue, it is not discussed in this book. Here are a few options though:

  • Create a clustered index and then drop it.
  • Insert data from the heap table into a new table.
  • Export the data, truncate the table and import the data back into the table.
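
The first option could look like the sketch below. The table dbo.MyHeapTable, its column Id and the index name are hypothetical.

```sql
-- Creating a clustered index rewrites the table's rows in index order,
-- which removes the gaps.
CREATE CLUSTERED INDEX IX_Defrag_Temp ON dbo.MyHeapTable (Id);

-- Dropping the index turns the table back into a heap.
DROP INDEX IX_Defrag_Temp ON dbo.MyHeapTable;
```

Keep in mind that any non-clustered indexes on the table are rebuilt both when the clustered index is created and when it is dropped, so this can be expensive on a large table.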

Conclusion

In this part, we saw how to reduce fragmentation, by rebuilding or reorganizing indexes.

In the next part, we'll see how to fix hardware issues - related to memory, disks and CPU.

Speeding up database access - part 6 Fixing execution plan reuse

by Matt Perdeck 30. November 2011 21:32

This is part 6 of an 8 part series of articles about speeding up access to a SQL Server database. This series is based on chapter 8 "Speeding up Database Access" of my book ASP.NET Site Performance Secrets, available at amazon.com and other book sites.

In part 2, we saw how to identify suboptimal reuse of execution plans. In this part 6, we'll look at improving this.

You can boost execution plan reuse in your site by making it easier for SQL Server to work out which bits of a query's execution plan can be reused by a similar query.

Ad hoc queries

Take this simple ad hoc query:

SELECT b.Title, a.AuthorName 
FROM dbo.Book b JOIN dbo.Author a ON b.LeadAuthorId=a.Authorid 
WHERE BookId=5 

When SQL Server receives this query for the very first time, it will compile an execution plan, store the plan in the plan cache, and execute the plan.

If SQL Server then receives this query again, it will reuse the execution plan if it is still in the plan cache, provided that:

  • All object references in the query are qualified with at least the schema name - dbo.Book instead of Book. Adding the database would be even better.

  • There is an exact match between the text of the queries. This is case sensitive, and any white space differences also prevent an exact match.

As a result of the second rule, if you use the same query as above but with a different BookId, there will be no match:

SELECT b.Title, a.AuthorName 
FROM dbo.Book b JOIN dbo.Author a ON b.LeadAuthorId=a.Authorid 
WHERE BookId=9 -- Doesn't match query above, uses 9 instead of 5

Obviously, this is not a recipe for great execution plan reuse.

Simple Parameterization

To make it easier for ad hoc queries to reuse a cached plan, SQL Server supports simple parameterization. This automatically figures out the variable bit of a query. Because this is hard to get right and easy to get wrong, SQL Server attempts this only with very simple queries with one table. For example,

SELECT Title, Author FROM dbo.Book WHERE BookId=5

can reuse the execution plan generated for

SELECT Title, Author FROM dbo.Book WHERE BookId=9

sp_executesql

Instead of getting SQL Server to guess which bits of a query can be turned into parameters, you can use the system stored procedure sp_executesql to simply tell it yourself. Calling sp_executesql takes this form:

sp_executesql @query, @parameter_definitions, @parameter1, @parameter2, ... 

For example:

EXEC sp_executesql 
	N'SELECT b.Title, a.AuthorName
	  FROM dbo.Book b JOIN dbo.Author a ON b.LeadAuthorId=a.Authorid
	  WHERE BookId=@BookId',
	N'@BookId int',
	@BookId=5

Note that sp_executesql expects nvarchar values for its first two parameters, so you need to prefix the strings with N.
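
To see whether the plan is actually being reused, you could run the parameterized query with two different values and then look at the plan cache. This is a sketch that assumes you have permission to query the plan cache DMVs; the LIKE filters are just a convenient way to find the query text.

```sql
-- Same query text, different parameter values.
EXEC sp_executesql
	N'SELECT b.Title, a.AuthorName
	  FROM dbo.Book b JOIN dbo.Author a ON b.LeadAuthorId=a.Authorid
	  WHERE BookId=@BookId',
	N'@BookId int',
	@BookId=5
EXEC sp_executesql
	N'SELECT b.Title, a.AuthorName
	  FROM dbo.Book b JOIN dbo.Author a ON b.LeadAuthorId=a.Authorid
	  WHERE BookId=@BookId',
	N'@BookId int',
	@BookId=9

-- usecounts shows how often each cached plan has been used.
SELECT cp.usecounts, cp.objtype, st.text
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
WHERE st.text LIKE N'%LeadAuthorId%'
  AND st.text NOT LIKE N'%dm_exec_cached_plans%'   -- exclude this query itself
```

If the plan is reused, you would expect a single entry for the query whose usecounts goes up with each execution, rather than one entry per BookId.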

Stored Procedures

Instead of sending individual queries to the database, you can package them in a stored procedure that is permanently stored in the database. That gives you the following advantages:

  • Just as with sp_executesql, stored procedures allow you to explicitly define parameters to make it easier for SQL Server to reuse execution plans.

  • Stored procedures can contain a series of queries and T-SQL control statements such as IF...ELSE. This allows you to simply send the stored procedure name and parameters to the database server, instead of sending individual queries - saving networking overhead.

  • Stored procedures make it easier to isolate database details from your web site code. When a table definition changes, you may only need to update one or more stored procedures, without touching the web site.

  • You can implement better security, by only allowing access to the database via stored procedures. That way, you can allow users to access the information they need through stored procedures, while preventing them from taking unplanned actions.

To create a stored procedure in SQL Server Management Studio, expand your database, expand Programmability and then expand Stored Procedures. Right click Stored Procedures and choose New Stored Procedure. A new query window opens where you can define your new stored procedure.

A stored procedure to execute the query you saw in the previous section would look like this:

CREATE PROCEDURE dbo.GetBook
	@BookId int
AS
BEGIN
	SET NOCOUNT ON;

	SELECT Title, Author FROM dbo.Book WHERE BookId=@BookId
END
GO

This creates a stored procedure with name GetBook, and a parameter list with one parameter @BookId of type int. When SQL Server executes the stored procedure, occurrences of that parameter in the body of the stored procedure get replaced by the parameter value that you pass in.

Setting NOCOUNT to ON improves performance by preventing SQL Server from sending a message with the number of rows affected by the stored procedure.

To add the stored procedure to the database, press F5 to execute the CREATE PROCEDURE statement.

To verify that the stored procedure has been created, right click Stored Procedures and choose Refresh. Your new stored procedure should turn up in the list of stored procedures. To modify the stored procedure, right click the stored procedure and choose Modify.

To execute the stored procedure in a query window, use:

EXEC dbo.GetBook @BookId=5

or simply:

EXEC dbo.GetBook 5

Using a stored procedure from your C# code is similar to using an ad hoc query, as shown below.

string connectionString = "...";
using (SqlConnection connection =
    new SqlConnection(connectionString))
{
    string sql = "dbo.GetBook";
    using (SqlCommand cmd = new SqlCommand(sql, connection))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.Add(new SqlParameter("@BookId", bookId));
        connection.Open();

        // Execute database command ...
    }
}

Make sure that the command text has the name of the stored procedure, instead of the text of a query. Set the CommandType property of the SqlCommand object to CommandType.StoredProcedure, so SQL Server knows you're calling a stored procedure. Finally, add parameters to the command that match the parameters you used when you created the stored procedure.

Now that you've seen how to improve reuse of execution plans, let's see how to prevent plan reuse, and why you would want to do that.

Preventing Reuse

You may not always want to reuse an execution plan. When the execution plan of a stored procedure is compiled, that plan is based on the parameters used at the time. When the plan is reused with different parameters, the plan generated for the first set of parameters is now reused with the second set of parameters. However, this is not always desirable.

Take for example this query:

SELECT SupplierName FROM dbo.Supplier WHERE City=@City 

Assume that the Supplier table has an index on City. Now assume half the records in Supplier have City "New York". The optimal execution plan for "New York" will then be to use a table scan, rather than incurring the overhead of going through the index. If however "San Diego" has only a few records, the optimal plan for "San Diego" would be to use the index. A good plan for one parameter value may be a bad plan for another parameter value. If the cost of using a suboptimal query plan is high compared with the cost of recompiling the query, you would be better off telling SQL Server to generate a new plan for each execution.

When creating a stored procedure, you can tell SQL Server not to cache its execution plan with the WITH RECOMPILE option:

CREATE PROCEDURE dbo.GetSupplierByCity
	@City nvarchar(100)
	WITH RECOMPILE
AS
BEGIN
...
END

Or you can have a new plan generated for a specific execution:

EXEC dbo.GetSupplierByCity 'New York' WITH RECOMPILE

Finally you can cause a stored procedure to be recompiled the next time it is called with the system stored procedure sp_recompile:

EXEC sp_recompile 'dbo.GetSupplierByCity'

To have all stored procedures that use a particular table recompiled the next time they are called, call sp_recompile with that table:

EXEC sp_recompile 'dbo.Book'

Conclusion

In this part, we saw how to improve execution plan reuse, such as through simple parameterization and stored procedures.

In the next part, we'll see how to fix excessive fragmentation.

Speeding up database access - part 5 Fixing locking issues

by Matt Perdeck 29. November 2011 19:25

This is part 5 of an 8 part series of articles about speeding up access to a SQL Server database. This series is based on chapter 8 "Speeding up Database Access" of my book ASP.NET Site Performance Secrets, available at amazon.com and other book sites.

In part 2 we saw how to pinpoint bottlenecks that are due to locking. In this part 5, we'll look at fixing those locking issues. You'll see how to determine which queries are involved in excessive locking delays, and how to prevent those delays from happening.

Gather Detailed Locking Information

You can find out which queries are involved in excessive locking delays by tracing the event "Blocked process report" in SQL Server Profiler.

This event fires when the lock wait time for a query exceeds the "blocked process threshold". To set this threshold to, for example, 30 seconds, run the following lines in a query window in SSMS:

EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'blocked process threshold', 30
RECONFIGURE

Then start the trace in Profiler:

  1. Start SQL Profiler. Click Start | Programs | Microsoft SQL Server 2008 | Performance Tools | SQL Server Profiler.

  2. In SQL Profiler, click File | New Trace.

  3. Click the Events Selection tab.

  4. Select the Show all events checkbox to see all events. Also select Show all columns to see all the data columns.

  5. In the main window, expand Errors and Warnings and select the Blocked process report event. Make sure the checkbox in the TextData column is checked - scroll horizontally if needed to find it.

  6. If you need to investigate deadlocks, also expand Locks and select the Deadlock graph event. To get additional information about deadlocks, have SQL Server write information about each deadlock event to its error log, by executing this from an SSMS query window:

    DBCC TRACEON(1222,-1)

  7. Uncheck all the other events, unless you are interested in them.

  8. Click Run to start the trace.

  9. Save the template, so you don't have to recreate it next time. Click File | Save As | Trace Template. Fill in a descriptive name and click OK. Next time you create a new trace by clicking File | New Trace, you can retrieve the template from the Use the template dropdown.

  10. Once you have captured a representative sample, click File | Save to save the trace to a trace file for later analysis. You can load a trace file by clicking File | Open.

When you click a Blocked process report event in Profiler, you'll find in the lower pane information about the event, including the blocking query and the blocked query. You can get details about Deadlock graph events the same way.

To check the SQL Server error log for deadlock events:

  1. In SSMS expand the database server, expand Management, expand SQL Server Logs. Then double click a log.

  2. In the Log File Viewer, click Search near the top of the window and search for "deadlock-list". In the lines that chronologically come after the deadlock-list event, you'll find much more information about the queries involved in the deadlock.

Reduce Blocking

Now that you have identified the queries involved in locking delays, it's time to reduce those delays. The most effective way to do this is to reduce the length of time locks are held:

  • Optimize queries. The less time your queries take, the less time they hold locks. See Part 1 "Pinpointing missing indexes and expensive queries".

  • Use stored procedures rather than ad hoc queries. This reduces time spent compiling execution plans and time spent sending individual queries over the network. Part 6 "Fixing execution plan reuse" shows how to introduce stored procedures.

  • If you really have to use cursors, commit updates frequently. Cursor processing is much slower than set based processing.

  • Do not perform lengthy operations, such as sending emails, while locks are held. Do not wait for user input while keeping a transaction open. Instead, use optimistic locking.
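
One common way to implement optimistic locking is with a rowversion column, which SQL Server changes automatically on every update of the row. The sketch below assumes a hypothetical rowversion column RowVer has been added to dbo.Book; the variable names and the new title are also made up for illustration.

```sql
DECLARE @BookId int = 5;
DECLARE @OriginalVersion binary(8);
DECLARE @NewTitle nvarchar(200) = N'Oliver Twist (revised)';

-- Read the row and remember its version. No transaction is kept open
-- while the user is looking at or editing the data.
SELECT @OriginalVersion = RowVer FROM dbo.Book WHERE BookId = @BookId;

-- ... the user edits the record ...

-- Only update if nobody else changed the row in the meantime.
UPDATE dbo.Book
SET Title = @NewTitle
WHERE BookId = @BookId AND RowVer = @OriginalVersion;

IF @@ROWCOUNT = 0
    RAISERROR('The record was changed by another user. Please reload and try again.', 16, 1);
```

Locks are now only held for the duration of the UPDATE itself, not while waiting for the user.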

A second way to reduce lock wait times is to reduce the number of resources being locked:

  • Do not put a clustered index on frequently updated columns. This requires a lock on both the clustered index and all non-clustered indexes, because their row locator contains the value you are updating.

  • Consider including a column in a non-clustered index. This would prevent a query from having to read the table record, so it won't block another query that needs to update an unrelated column in the same record.

  • Consider row versioning. This SQL Server feature prevents queries that read a table row from blocking queries that update the same row and vice versa. Queries that need to update the same row still block each other.

    Row versioning works by storing rows in a temporary area (in tempdb) before they are updated, so reading queries can access the stored version while the update is taking place. This does create overhead in maintaining the row versions - test this solution before taking it live. Also, in case you set the isolation level of transactions, row versioning only works with the Read Committed isolation mode - which is the default isolation mode.

    To implement row versioning, set the READ_COMMITTED_SNAPSHOT option as shown in the code below. When doing this, you can have only one connection open - the one used to set the option. You can make that happen by switching the database to single user mode. Warn your users first. Be careful when applying this to a production database, because your web site won't be able to connect to the database while you are carrying out this operation.

    ALTER DATABASE mydatabase SET SINGLE_USER WITH ROLLBACK IMMEDIATE ;
    ALTER DATABASE mydatabase SET READ_COMMITTED_SNAPSHOT ON;
    ALTER DATABASE mydatabase SET MULTI_USER;

    To check whether row versioning is in use for a database, run:

    select is_read_committed_snapshot_on
    from sys.databases 
    where name='mydatabase'

Finally, you can set a lock timeout. For example, to abort statements that have been waiting for over 5 seconds (or 5000 milliseconds), issue the command:

SET LOCK_TIMEOUT 5000

Use -1 to wait indefinitely. Use 0 to not wait at all.
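
When the timeout expires, the statement fails with error 1222 ("Lock request time out period exceeded"), so your code needs to be ready for that. A minimal sketch of catching it in T-SQL follows; the UPDATE statement and the way the error is handled are only examples.

```sql
SET LOCK_TIMEOUT 5000;

BEGIN TRY
    UPDATE dbo.Book SET Title = N'Oliver Twist' WHERE BookId = 5;
END TRY
BEGIN CATCH
    IF ERROR_NUMBER() = 1222
        -- Gave up waiting for the lock; retry later or report to the caller.
        PRINT 'Could not get the lock within 5 seconds.';
    ELSE
    BEGIN
        -- Some other error: pass it on.
        DECLARE @msg nvarchar(2048) = ERROR_MESSAGE();
        RAISERROR(@msg, 16, 1);
    END
END CATCH
```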

Reducing Deadlocks

A deadlock is a situation where two transactions are waiting for each other to release a lock. In a typical case, transaction 1 has a lock on resource A and is trying to get a lock on resource B, while transaction 2 has a lock on resource B and is trying to get a lock on resource A. Neither transaction can now move forward.

One way to reduce deadlocks is to reduce lock delays in general, as shown in the last section. That reduces the time window in which deadlocks can occur.

A second way is to always lock resources in the same order. If you get transaction 2 to lock the resources in the same order as transaction 1 (first A, then B), then transaction 2 won't lock resource B before it starts waiting for resource A, and so doesn't block transaction 1.
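
In practice, locking in the same order means that every piece of code that updates the same set of tables touches them in the same sequence. The sketch below uses the Author and Book tables from this series; the values and the choice of Author first, Book second are arbitrary, as long as all code paths agree on it.

```sql
-- Code path 1: Author first, then Book.
BEGIN TRAN
	UPDATE dbo.Author SET AuthorName='Charles Dickens' WHERE AuthorId=1
	UPDATE dbo.Book SET Title='Oliver Twist' WHERE BookId=5
COMMIT
GO

-- Code path 2: needs the same two tables, so it also goes Author first, then Book.
-- Had it updated Book first, the two transactions could end up waiting for each other.
BEGIN TRAN
	UPDATE dbo.Author SET AuthorName='Herman Melville' WHERE AuthorId=2
	UPDATE dbo.Book SET Title='Moby-Dick' WHERE BookId=9
COMMIT
GO
```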

Finally, watch out for deadlocks caused by the use of HOLDLOCK, or of the Repeatable Read or Serializable isolation levels. Take for example this code:

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRAN 
	SELECT Title FROM dbo.Book
	UPDATE dbo.Book SET Author='Charles Dickens' WHERE Title='Oliver Twist'
COMMIT

Imagine two transactions running this code at the same time. Both acquire a Shared lock on the rows in the Book table when they execute the SELECT. They hold onto the lock because of the Repeatable Read isolation level. Now both try to acquire an Exclusive lock on a row in the Book table to execute the UPDATE. Each transaction is now blocked by the Shared lock the other transaction is still holding.

To prevent this from happening, use the UPDLOCK hint on the SELECT statement. This causes the SELECT to acquire an Update lock, so only one transaction can execute the SELECT. The transaction that did get the lock can then execute its UPDATE and free the locks, after which the other transaction comes through.

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRAN 
	SELECT Title FROM dbo.Book WITH(UPDLOCK)
	UPDATE dbo.Book SET Author='Charles Dickens' WHERE Title='Oliver Twist'
COMMIT

Conclusion

In this part, we saw how to reduce locking delays, by reducing the time locks are held and by reducing the number of resources being locked. We also looked at deadlocks.

In the next part, we'll see how to optimize execution plan reuse.

Speeding up database access - part 4 Fixing expensive queries

by Matt Perdeck 28. November 2011 19:42

This is part 4 of an 8 part series of articles about speeding up access to a SQL Server database. This series is based on chapter 8 “Speeding up Database Access” of my book ASP.NET Site Performance Secrets available at amazon.com and other book sites.

In part 1 we saw how to identify the most expensive queries. In this part 4, we'll look at fixing those expensive queries.

It makes sense to try and optimize those queries that are most expensive – because they are used heavily, or because each single execution is just plain expensive. You already saw how to identify and create missing indexes in part 3. Here are more ways to optimize your queries and stored procedures.

Cache aggregation queries

Aggregation statements such as COUNT and AVG are expensive, because they need to access lots of records. If you need aggregated data for a web page, consider caching the aggregation results in a table instead of regenerating them for each web page request. Provided you read the aggregates more often than you update the underlying columns, this will reduce your response time and CPU usage. For example, this code stores a COUNT aggregate in a table Aggregates:

DECLARE @n int
SELECT @n = COUNT(*) FROM dbo.Book
UPDATE Aggregates SET BookCount = @n

You could update the aggregations whenever the underlying data changes, using a trigger or as part of the stored procedure that makes the update. Or recalculate the aggregations periodically with a SQL Server Job.
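
As a sketch of the trigger approach, the trigger below keeps BookCount in step with the Book table. It assumes the Aggregates table lives in the dbo schema and holds a single row; the trigger name is made up for illustration.

```sql
CREATE TRIGGER dbo.trg_Book_MaintainCount ON dbo.Book
AFTER INSERT, DELETE
AS
BEGIN
	SET NOCOUNT ON;

	-- inserted and deleted hold the rows affected by the triggering statement,
	-- so this also works for multi-row inserts and deletes.
	UPDATE dbo.Aggregates
	SET BookCount = BookCount
		+ (SELECT COUNT(*) FROM inserted)
		- (SELECT COUNT(*) FROM deleted);
END
GO
```

The trade-off is that every insert and delete on Book now also updates the Aggregates row, which is why this only pays off if you read the aggregate more often than you change the underlying data.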

Keep records short

Reducing the amount of space taken per table record speeds up access. Records are stored in 8KB pages on disk. The more records fit on a page, the fewer pages SQL Server needs to read to retrieve a given set of records.

Here are ways to keep your records short:

  • Use short data types. If your values fit in a 1 byte TinyInt, don’t use a 4 byte Int. If you store simple ASCII characters, use varchar(n) which uses 1 byte per character, instead of nvarchar(n) which uses 2. If you store strings of fixed length, use char(n) or nchar(n) instead of varchar(n) or nvarchar(n), saving the 2 bytes length field.

  • Consider storing large rarely used columns off row. Large object fields such as nvarchar(max), varchar(max), varbinary(max) and xml fields are normally stored in row if smaller than 8000 bytes, and replaced by a 16 byte pointer to an off row area if larger than 8000 bytes. Storing off row means that accessing the field takes at least 2 reads instead of 1, but also makes for a much shorter record – which may be desirable if the field is rarely accessed. To force large object fields in a table to be always off row, use:

    EXEC sp_tableoption 'mytable', 'large value types out of row', '1'

  • Consider vertical partitioning. If some columns in a table are much more frequently accessed than others, put the rarely accessed columns in a separate table. Access to the frequently used columns will be faster, at the expense of having to JOIN to the second table when it does get used.

  • Avoid repeating columns. For example, don’t do this:

    AuthorId | Author | Country | Book Title 1 | Book Title 2
    1 | Charles Dickens | United Kingdom | Oliver Twist | The Pickwick Papers
    2 | Herman Melville | United States | Moby-Dick |
    3 | Leo Tolstoy | Russia | Anna Karenina | War and Peace

    This solution not only creates long records, it also makes it hard to update book titles, and makes it impossible to have more than two titles per author. Instead, store the book titles in a separate Book table, and include an AuthorId column that refers back to the Book’s author.

  • Avoid duplicate values. For example, don’t do this:

    BookId  Book Title           Author           Country
    1       Oliver Twist         Charles Dickens  United Kingdom
    2       The Pickwick Papers  Charles Dickens  United Kingdom
    3       Moby-Dick            Herman Melville  United States
    4       Anna Karenina        Leo Tolstoy      Russia
    5       War and Peace        Leo Tolstoy      Russia

    Here the author’s name and country are duplicated for each of their books. In addition to resulting in long records, updating author details now requires multiple record updates and an increased risk of inconsistencies. Store authors and books in separate tables, and have the Book records refer back to their Author records.
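The two-table layout recommended above can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for SQL Server, and the table and column names (Author, Book, Name, Country) are assumptions rather than a prescribed schema.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Author (
        AuthorId INTEGER PRIMARY KEY,
        Name     TEXT NOT NULL,
        Country  TEXT NOT NULL
    );
    CREATE TABLE Book (
        BookId   INTEGER PRIMARY KEY,
        Title    TEXT NOT NULL,
        AuthorId INTEGER NOT NULL REFERENCES Author(AuthorId)
    );
    INSERT INTO Author VALUES (1, 'Charles Dickens', 'United Kingdom');
    INSERT INTO Author VALUES (2, 'Herman Melville', 'United States');
    INSERT INTO Author VALUES (3, 'Leo Tolstoy', 'Russia');
    INSERT INTO Book VALUES (1, 'Oliver Twist', 1);
    INSERT INTO Book VALUES (2, 'The Pickwick Papers', 1);
    INSERT INTO Book VALUES (3, 'Moby-Dick', 2);
    INSERT INTO Book VALUES (4, 'Anna Karenina', 3);
    INSERT INTO Book VALUES (5, 'War and Peace', 3);
""")

# Author details live in one row, so a change touches a single record
# no matter how many books the author has.
conn.execute("UPDATE Author SET Country = 'England' WHERE AuthorId = 1")

# Reading a book together with its author now needs a JOIN.
rows = conn.execute("""
    SELECT b.Title, a.Name, a.Country
    FROM Book b
    JOIN Author a ON a.AuthorId = b.AuthorId
    ORDER BY b.BookId
""").fetchall()
```

The single UPDATE changes the country shown for both Dickens titles, which is the consistency benefit; the JOIN in the final query is the cost that the next section weighs against it.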

Considering Denormalization

Denormalization is essentially the reverse of the last two points in the previous section, “Avoid repeating columns” and “Avoid duplicate values”.

The issue is that while these recommendations improve update speed, consistency and record sizes, they do lead to data being spread across tables, meaning more JOINs.

For example, say you have 100 addresses, spread over 50 cities, with the cities stored in a separate table. This will shorten the address records and make updating a city name easier, but also means having to do a JOIN each time you retrieve an address. If a city name is unlikely to change and you always retrieve the city along with the rest of the address, then you may be better off including the city name in the address record itself. This solution implies having repeated content (the city name), but on the other hand you’ll have one less JOIN.

Be careful with triggers

Triggers can be very convenient, and great for data integrity. On the other hand, they tend to be hidden from view of developers, so they may not realize that an additional INSERT, UPDATE or DELETE carries the overhead of a trigger.

Keep your triggers short. They run inside the transaction that caused them to fire, so locks held by that transaction continue to be held while the trigger runs. Remember that even if you do not explicitly create a transaction using BEGIN TRAN, each individual INSERT, UPDATE or DELETE creates its own transaction for the duration of the operation.

When deciding what indexes to use, don’t forget to look at your triggers as well as your stored procedures and functions.

Use table variables for small temporary result sets

Consider replacing temporary tables in your stored procedures with table variables.

For example, instead of writing this:

CREATE TABLE #temp (Id INT, Name nvarchar(100))
INSERT INTO #temp 
... 

You would write this:

DECLARE @temp TABLE(Id INT, Name nvarchar(100))
INSERT INTO @temp 
... 

Table variables have these advantages over temporary tables:

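  • They require fewer locking resources, because a transaction involving a table variable lasts only for the duration of an update on that table variable. Note that, like temporary tables, table variables are backed by tempdb rather than being held purely in memory.

  • Less transaction log overhead, because changes to a table variable are not rolled back with the surrounding transaction.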

  • Fewer stored procedure recompilations.

However, there are disadvantages as well:

  • You can’t add indexes or constraints to a table variable after it has been created. If you need an index, it needs to be created as part of the DECLARE statement:

    DECLARE @temp TABLE(Id INT primary key, Name nvarchar(100))

  • They are less efficient than temporary tables when they have more than about 100 rows, because no statistics are created for a table variable. This makes it harder for the query optimizer to come up with an optimal execution plan.

Use Full Text Search instead of LIKE

You may be using LIKE to search for substrings in text columns, like so:

SELECT Title, Author FROM dbo.Book WHERE Title LIKE '%Quixote'

However, unless the search pattern starts with constant text, SQL Server will not be able to use any index on the column, and so will do a full table scan instead. In the query above the pattern starts with the % wildcard, so no index can be used. Not good.

To improve this situation, consider using SQL Server’s Full Text Search feature. This automatically creates an index for all words in the text column, leading to much faster searches. To see how to use Full Text Search, visit:

Replacing cursors with set based code

If you use cursors, consider replacing them with set based code. Performance improvements of 1000 times are not uncommon. Set based code uses internal algorithms that are much better optimized than you could ever hope to achieve with a cursor.

For more information about converting cursors to set based code, visit:

Minimise traffic from SQL Server to Web Server

Do not use SELECT *. This will return all columns. Instead, only list the specific columns you actually need.

If the web site needs only part of a long text value, only send that part, not the entire value. For example:

SELECT LEFT(longtext, 100) AS excerpt FROM Articles WHERE ...

Object Naming

Do not start stored procedure names with sp_. SQL Server assumes stored procedure names starting with sp_ belong to system stored procedures, and always looks in the master database first to find them – even when you prefix the name with your database name.

Prefix object names with the schema owner. This saves SQL Server time identifying objects, and improves execution plan reusability. For example, use:

SELECT Title, Author FROM dbo.Book

Instead of:

SELECT Title, Author FROM Book

Use SET NOCOUNT ON

Always include the command SET NOCOUNT ON at the start of stored procedures and triggers. This prevents SQL Server from sending the number of rows affected after execution of every SQL statement.

Use FILESTREAM for values over 1MB

Store BLOBs over 1MB in size in a FILESTREAM column. This stores the objects directly on the NTFS file system instead of in the database data file. To see how to make this work, visit:

Avoid functions on columns in WHERE clauses

Using a function on a column in a WHERE clause prevents SQL Server from using an index on that column.

Take this query:

SELECT Title, Author FROM dbo.Book WHERE LEFT(Title, 1)='D'

SQL Server doesn’t know what values the LEFT function returns, so has no choice but to scan the entire table, executing LEFT for each column value.

However, it does know how to interpret LIKE. If you rewrite the query to:

SELECT Title, Author FROM dbo.Book WHERE Title LIKE 'D%'

SQL Server can now use an index on Title, because the LIKE string starts with constant text.

Use UNION ALL instead of UNION

The UNION clause combines the results of two SELECT statements, removing duplicates from the final result. This is expensive – it uses a work table and executes a DISTINCT select to provide this functionality.

If you don’t mind duplicates, or if you know there will be no duplicates, use UNION ALL instead. This simply concatenates the SELECT results together.

If the optimizer determines there will be no duplicates, it chooses UNION ALL even if you write UNION. For example, the select statements in the following query will never return overlapping records, and so the optimizer will replace the UNION clause with UNION ALL:

SELECT BookId, Title, Author FROM dbo.Book WHERE Author LIKE 'J%'
UNION
SELECT BookId, Title, Author FROM dbo.Book WHERE Author LIKE 'M%'
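The difference in results between UNION and UNION ALL can be seen in a small sketch. It uses Python's sqlite3 module with an in-memory SQLite database purely as a stand-in for SQL Server, so it shows the semantics only, not SQL Server's execution plans; the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Book (BookId INTEGER PRIMARY KEY, Title TEXT, Author TEXT)"
)
conn.executemany(
    "INSERT INTO Book VALUES (?, ?, ?)",
    [
        (1, "Oliver Twist", "Charles Dickens"),
        (2, "Anna Karenina", "Leo Tolstoy"),
        (3, "War and Peace", "Leo Tolstoy"),
    ],
)

# Both SELECTs can return the same author, so the two operators differ here.
first = "SELECT Author FROM Book WHERE Author LIKE 'L%'"
second = "SELECT Author FROM Book WHERE Title LIKE 'W%'"

# UNION removes duplicates from the combined result.
union_rows = conn.execute(first + " UNION " + second).fetchall()

# UNION ALL simply concatenates the two results.
union_all_rows = conn.execute(first + " UNION ALL " + second).fetchall()
```

Here UNION returns one row for Leo Tolstoy, while UNION ALL returns three, one for each row produced by either SELECT.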

Use EXISTS instead of COUNT to find existence of records

If you need to establish whether there are records in a result set, don’t use COUNT:

DECLARE @n int
SELECT @n = COUNT(*) FROM dbo.Book
IF @n > 0
	print 'Records found'

This reads the entire table to find the number of records. Instead, use EXISTS:

IF EXISTS(SELECT * FROM dbo.Book)
	print 'Records found'

This allows SQL Server to stop reading the moment it finds a record.
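From application code, the same existence check might look like the sketch below. It runs against an in-memory SQLite database through Python's sqlite3 module as a stand-in for SQL Server; the helper name has_books is hypothetical.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Book (BookId INTEGER PRIMARY KEY, Title TEXT)")


def has_books(connection):
    """Return True if the Book table contains at least one row."""
    # EXISTS yields 1 or 0; the engine can stop at the first row it finds.
    row = connection.execute("SELECT EXISTS(SELECT 1 FROM Book)").fetchone()
    return bool(row[0])


empty_result = has_books(conn)   # table is still empty at this point
conn.execute("INSERT INTO Book (Title) VALUES ('Moby-Dick')")
filled_result = has_books(conn)  # table now holds one row
```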

Combine SELECT and UPDATE

Sometimes, you need to SELECT and UPDATE the same record. For example, you may need to update a ‘LastAccess’ column whenever you retrieve a record. One way to do this is with a SELECT and an UPDATE:

UPDATE dbo.Book
SET LastAccess = GETDATE()
WHERE BookId=@BookId

SELECT Title, Author
FROM dbo.Book
WHERE BookId=@BookId

However, you can combine the SELECT into the UPDATE, like this:

DECLARE @title nvarchar(50)
DECLARE @author nvarchar(50)

UPDATE dbo.Book
SET LastAccess = GETDATE(),
	@title = Title,
	@author = Author
WHERE BookId=@BookId

SELECT @title, @author

That saves you some elapsed time, and it reduces the time locks are held on the record.

Conclusion

In this part, we saw how to speed up expensive queries, such as through the proper use of normalization and denormalization, the use of full-text search and replacing cursors with set-based code.

In the next part, we'll tackle locking issues.

Generate CSS sprites and thumbnail images on the fly in ASP.NET sites - part 3

by Matt Perdeck 14. September 2011 20:59

Introduction

This is part 3 of a 3 part series about the ASP.NET CSS Sprite Generator package. It allows you to compress and resize the images in your ASP.NET web site on the fly, and to combine them into sprites. Simply add the ASP.NET CSS Sprite Generator dll and configure the package in your web.config - no need to change the site itself.

Image Groups

You use image groups to determine which images will be processed, and how they will be processed. For a full introduction, refer to the quick start section.

Each group can have the following properties:

Image Filtering

These properties determine which images go into which image group.

Sprite Generation

These properties influence how sprites are generated, such as their image type.

Image Processing

These properties let you manipulate the individual images in the group.

Group Creation

These properties make it easier to manage your image groups.


Image Filtering


maxSize

Sets the maximum size in bytes of images belonging to this group. Images whose file size is larger than this will not be added to this group.

Type Default
Int64 No size limit

Example

<cssSpriteGenerator ... >
      <imageGroups>
          <add maxSize="2000" ... />
      </imageGroups>
</cssSpriteGenerator>

If you use width and/or height attributes in your img tags that are different from the physical width and/or height of the images, read on. Otherwise, you can skip the rest of this section.

If you use width and/or height attributes that are different from the physical width and/or height of the images, the package will auto resize the physical image in memory (not on disk!) before adding it to the sprite, unless the disableAutoResize property is true (more details about this feature).

Because of this, the package will estimate the size in bytes of the resized image in order to work out whether to add it to a group. Take this situation:

                physical width  physical height  physical size
physical image  100px           200px            3000 bytes

If you set maxSize to 2000 for a group, then normally this image would not be added because its file size is too big.

Now if you use that image with this tag:

<img src="..." width="100" height="100" />

The image as shown on the page will now have half the height of the physical image. The package then makes a very rough estimate of the file size that this image would have had if it had been physically resized to the given width and height.

In this case, the area of the image (width x height) has been halved, so the package divides the physical size of the image by 2:

                 width in sprite  height in sprite  estimated size
image in sprite  100px            100px             1500 bytes

Because the estimated size is now only 1500 bytes, it will now be added to a group with maxSize set to 2000.
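The estimate described above can be sketched in a few lines of Python. This is not the package's actual code: the function names are hypothetical, and the only rule taken from the text is that the file size is scaled by the ratio of the displayed area to the physical area.

```python
def estimate_resized_size(physical_size, physical_width, physical_height,
                          shown_width, shown_height):
    """Rough size estimate: scale the file size by the change in area."""
    physical_area = physical_width * physical_height
    shown_area = shown_width * shown_height
    # Integer arithmetic; the real package may round differently (assumption).
    return physical_size * shown_area // physical_area


def fits_max_size(estimated_size, max_size):
    """An image is rejected only if it is larger than maxSize."""
    return max_size is None or estimated_size <= max_size


# The example from the text: 100px x 200px, 3000 bytes, shown at 100px x 100px.
estimated = estimate_resized_size(3000, 100, 200, 100, 100)
```

With the numbers from the example, the area is halved, so the estimate is 1500 bytes, which fits a group with maxSize set to 2000, whereas the physical size of 3000 bytes would not.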

One last twist here is that the size of the image as it goes into the sprite can not only be set by the width and height properties on the img tag, but also by the resizeWidth and resizeHeight properties of the image group. However, these are only applied after an image has been added to a group, so they are not used to determine whether to add an image to the group in the first place.

maxWidth

Sets the maximum width in pixels of images belonging to this group. Images whose width is larger than this will not be added to this group.

Type Default
Int32 No width limit

Example

<cssSpriteGenerator ... >
      <imageGroups>
          <add maxWidth="50" ... />
      </imageGroups>
</cssSpriteGenerator>

If you used a width attribute in an img tag, then that width will be used to decide whether the image is too wide, rather than the physical width of the image (provided you didn't set disableAutoResize to true).

For example, if maxWidth is 50 for a group, then an image that is 60px wide will normally not be included in that group. However, if you had the following image tag, the width attribute will be used and the image will be included:

<img src="width60px.png" width="50" />

This feature is not meant to encourage you to use width or height attributes that are inconsistent with the physical image size. But if you did, then this is how the package will handle it.

maxHeight

Sets the maximum height in pixels of images belonging to this group. Images whose height is larger than this will not be added to this group.

Type Default
Int32 No height limit

Example

<cssSpriteGenerator ... >
      <imageGroups>
          <add maxHeight="50" ... />
      </imageGroups>
</cssSpriteGenerator>

Similar to maxWidth, if you used a height attribute in an img tag, then that height will be used to decide whether the image is too high, rather than the physical height of the image (provided you didn't set disableAutoResize to true).

filePathMatch

This is a regular expression (tutorial). If this is set, images whose file path does not match this will not be included in the group.

Type                         Default
string (regular expression)  empty (no restriction)

Example

<cssSpriteGenerator ... >
      <imageGroups>
          <!--only include .gif and .png images in the group-->
          <add filePathMatch="(png|gif)$" ... />
      </imageGroups>
</cssSpriteGenerator>

Note that filePathMatch matches against the file path of the image on the web server, not against the url of the image on your site. To only include images in the icons directory, set filePathMatch to \\icons\\, not to /icons/. You need to double the backslashes (\\), because the backslash is a special character in regular expressions, so needs to be escaped with another backslash.
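To see how these patterns behave against a server file path, here is a small sketch using Python's re module. The package itself runs on .NET, so this assumes the patterns are applied as unanchored searches and that these simple patterns behave the same in both regex engines; the file paths are made up for illustration.

```python
import re


def matches_file_path(pattern, file_path):
    """Unanchored regex search, as the examples in the text imply (assumption)."""
    return re.search(pattern, file_path) is not None


# In a regular expression, \\ matches one literal backslash.
icons_pattern = r"\\icons\\"
# Matches paths ending in png or gif.
extension_pattern = r"(png|gif)$"

# Hypothetical paths: a file path on the web server versus a site url.
server_path = r"C:\inetpub\wwwroot\images\icons\cart.png"
site_url = "/images/icons/cart.png"

icons_hit = matches_file_path(icons_pattern, server_path)
url_hit = matches_file_path(icons_pattern, site_url)
```

The pattern matches the server path but not the url, which is why the forward-slash form /icons/ would not work for filePathMatch.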

pageUrlMatch

This is a regular expression. If this is set, then this group is only used if the url of the current page matches this. If the url of the current page does not match pageUrlMatch, the package acts as though the group doesn't exist.

Type                         Default
string (regular expression)  empty (no restriction)

Example

<cssSpriteGenerator ... >
      <imageGroups>
          <!--do not use this group if the current page has "/excludeme/" in its url-->
          <add pageUrlMatch="^((?!/excludeme/).)*$" ... />
      </imageGroups>
</cssSpriteGenerator>

Note that whereas filePathMatch matches against the file path of an image, pageUrlMatch matches against the absolute url of the current page. To use an image group only with pages in directory special, set pageUrlMatch to /special/, not to \\special\\.

The example above shows how to make sure that an image group is used for all pages, except those in a particular directory. As you see, making this happen in a regular expression is a bit awkward (details).
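The exclusion pattern can be tried out with a short Python sketch. The package uses .NET regular expressions, so this assumes the negative lookahead behaves the same way in Python's re module (it does for this pattern); the urls are hypothetical.

```python
import re

# Matches only if "/excludeme/" does not start at any position in the url.
exclude_pattern = r"^((?!/excludeme/).)*$"


def group_applies(pattern, page_url):
    """True if the group would be used for this page (unanchored search)."""
    return re.search(pattern, page_url) is not None


# Hypothetical absolute page urls.
normal_page = "http://www.example.com/products/list.aspx"
excluded_page = "http://www.example.com/excludeme/list.aspx"

normal_result = group_applies(exclude_pattern, normal_page)
excluded_result = group_applies(exclude_pattern, excluded_page)
```

The pattern walks the url one character at a time and refuses to step past any position where /excludeme/ begins, so the whole-string match fails for pages in that directory and succeeds for all others.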

The demo site DemoSite_Gallery in the solution in the download shows how pageUrlMatch can be used to resize images only on the home page, while keeping their sizes the same on all other pages.


Sprite Generation


maxSpriteSize

Sets the maximum size of a sprite in bytes.

Type Default
Int64 No limit

Example

<cssSpriteGenerator ... >
      <imageGroups>
          <add maxSpriteSize="10000" ... />
      </imageGroups>
</cssSpriteGenerator>

If you have a lot of images to put into sprites, it's better to spread them over a number of reasonably sized sprites, rather than one very big sprite. That allows the browser to load the sprites in parallel.

To achieve this, you can set maxSpriteSize. While adding images to a sprite, the package keeps track of the total file size of all images added. If that goes over maxSpriteSize, it writes the sprite and starts a new one. As a result, one group could generate multiple sprites.

Note that the package doesn't attempt to work out how big the sprite will be after it has been written to disk - that would take a lot of CPU cycles. It simply adds up the file sizes of the images going into the sprite.

You may have resized one or more images in the group with the resizeWidth and resizeHeight properties, or with the width and/or height attributes on the img tag. In that case, the package estimates the file size of the resized image and uses that to calculate the current size of the sprite.
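The way one group can produce several sprites can be sketched as follows. This is an illustration of the behaviour described above, not the package's implementation; in particular, it assumes that the image that pushes the running total over maxSpriteSize still goes into the sprite being written.

```python
def split_into_sprites(image_sizes, max_sprite_size=None):
    """Group image file sizes (in bytes) into sprites, in order."""
    sprites = []
    current = []
    current_total = 0
    for size in image_sizes:
        current.append(size)
        current_total += size
        # Once the running total goes over the limit, write the sprite
        # and start a new one (assumed to include the current image).
        if max_sprite_size is not None and current_total > max_sprite_size:
            sprites.append(current)
            current = []
            current_total = 0
    if current:
        sprites.append(current)
    return sprites


# Four images in one group, with maxSpriteSize set to 10000.
result = split_into_sprites([4000, 4000, 4000, 1000], 10000)
```

Because only the input file sizes (or their estimates after resizing) are added up, the sprites written to disk may be larger or smaller than maxSpriteSize.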

spriteImageType

Sets the image type of the sprite.

Value          Description
Png (default)  Sprite will be written to disk as a .png file. Recommended for sprites containing simple icons, drawings, etc.
Gif            Sprite will be written to disk as a .gif file. This option is included for completeness. PNG images tend to be more efficient than GIF images, so use Png if you can.
Jpg            Sprite will be written to disk as a .jpg file. Recommended for sprites containing compressed photos, etc.

Example

<cssSpriteGenerator ... >
      <imageGroups>
          <add spriteImageType="Jpg" ... />
      </imageGroups>
</cssSpriteGenerator>

Because sprites tend to be used to group simple icons, the default image type, Png, is most often what you want. However, if you are combining thumbnails of photos, you may want to set spriteImageType to Jpg. Another reason to use Jpg is if you are using the package to compress big .jpg images, using jpegQuality to compress the images and giveOwnSprite to give each image its own sprite.

giveOwnSprite

Lets you give all images in the group a sprite of their own.

Value            Description
false (default)  Images in the group are combined into sprites.
true             Instead of combining images into sprites, each image in the group gets its own sprite.

Example

<cssSpriteGenerator ... >
      <imageGroups>
          <add giveOwnSprite="true" ... />
      </imageGroups>
</cssSpriteGenerator>

The reason you combine images into sprites is to reduce the request/response overhead for the browser of loading each individual image. For bigger images however, the request/response overhead is not significant, so normally you wouldn't combine those into sprites. Otherwise you could wind up with very big sprites that take the browser a long time to load.

On the other hand, the package allows you to do all sorts of good things with sprites, such as compressing .jpg sprites, or resizing images to make thumbnails on the fly. It would be good if you could use those features with bigger images as well.

The solution is to add the bigger images to a group and to set giveOwnSprite to true. That way, the images in the group will all get a sprite of their own, so they are not combined with other images. Then you can use jpegQuality or pixelFormat to compress the resulting sprite and/or resizeWidth and resizeHeight to resize them - without winding up with massive sprites.

When you look at the html generated by the package, you will find that it generates normal img tags for sprites that contain only one image. This is because such a sprite is essentially a normal image, so there is no need for additional CSS.

The demo site DemoSite_CompressedJpeg in the downloaded solution uses the giveOwnSprite property to stop big images from being combined into sprites.


Image Processing


resizeWidth

Lets you set the width of all images in the group. Also see resizeHeight.

Type Default
Int32 Don't resize

Example

<cssSpriteGenerator ... >
      <imageGroups>
          <add resizeWidth="50" ... />
      </imageGroups>
</cssSpriteGenerator>

resizeWidth can be used to create thumbnails on the fly, so you don't have to make them yourself.

Take for example a page "thumbnails.aspx" where you want to show thumbnails of bigger images. You want each thumbnail to be 50px wide. Normally, you would have to create separate thumbnail images - but with resizeWidth you can simply use image tags that refer to the full sized images:

<!--thumbnails.aspx-->
<img src="bigimage1.jpg" />
<img src="bigimage2.jpg" />
<img src="bigimage3.jpg" />
<img src="bigimage4.jpg" />

To resize the big images on the fly so they are only 50px wide, you'd make sure that the .jpg images are included in a group. For that group, set resizeWidth to 50. And make sure that the group is only used for page thumbnails.aspx:

<cssSpriteGenerator ... >
      <imageGroups>
          <add filePathMatch="\.jpg" resizeWidth="50" pageUrlMatch="thumbnails\.aspx$" ... />
      </imageGroups>
</cssSpriteGenerator>

Note that the images are physically resized before they are added to the sprite, so you will get both a smaller image and savings in bandwidth. Your original image files will not be changed though - it all happens in memory.

If that is more convenient, you could also achieve the same smaller size and the same bandwidth savings without using resizeWidth, by simply adding a width attribute to the image tags (details):

<!--thumbnails.aspx-->
<img src="bigimage1.jpg" width="50" />
<img src="bigimage2.jpg" width="50" />
<img src="bigimage3.jpg" width="50" />
<img src="bigimage4.jpg" width="50" />

resizeHeight

Lets you set the height of all images in the group. Also see resizeWidth.

Type Default
Int32 Don't resize

Example

<cssSpriteGenerator ... >
      <imageGroups>
          <add resizeHeight="100" ... />
      </imageGroups>
</cssSpriteGenerator>

You would use this to generate thumbnails on the fly with a given height, in exactly the same way as you would generate thumbnails with a given width using resizeWidth.

You can combine resizeHeight and resizeWidth. If you use only one, then the package will adjust the other dimension so the image keeps the same aspect ratio. So if you cut the height in half (such as from 200px to 100px), it cuts the width in half as well. If you set both, it simply uses both. For example:

Original Image  Group                      Resulting Image
Width  Height   resizeWidth  resizeHeight  Width  Height
100    200      not set      not set       100    200
100    200      50           not set       50     100
100    200      not set      20            10     20
100    200      50           20            50     20

Note that if you set both resizeWidth and resizeHeight, you can easily change the aspect ratio of the image, which may not look good.
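The rules in the table above can be written down as a short Python sketch. The function name is hypothetical and the rounding to whole pixels is an assumption; the package may round differently.

```python
def resized_dimensions(width, height, resize_width=None, resize_height=None):
    """Return the (width, height) of an image after the group's resize settings."""
    if resize_width is None and resize_height is None:
        # Neither property set: the image keeps its size.
        return width, height
    if resize_height is None:
        # Only resizeWidth set: scale the height to keep the aspect ratio.
        return resize_width, round(height * resize_width / width)
    if resize_width is None:
        # Only resizeHeight set: scale the width to keep the aspect ratio.
        return round(width * resize_height / height), resize_height
    # Both set: use both as given, even if the aspect ratio changes.
    return resize_width, resize_height


# The four rows of the table, for a 100px x 200px original.
table_rows = [
    resized_dimensions(100, 200),
    resized_dimensions(100, 200, resize_width=50),
    resized_dimensions(100, 200, resize_height=20),
    resized_dimensions(100, 200, resize_width=50, resize_height=20),
]
```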

jpegQuality

Only works if the sprite generated via this group is a .jpg image. In that case, this lets you reduce the image quality, and thereby the file size, of the sprite.

Type Default
Int32 (between 0 and 100) No compression

Example

<cssSpriteGenerator ... >
      <imageGroups>
          <add jpegQuality="70" ... />
      </imageGroups>
</cssSpriteGenerator>

jpegQuality is a percentage of the quality of the sprite as it would have been if you hadn't specified jpegQuality. For example, if you set jpegQuality to 70, then the quality of the sprite will be reduced to 70% of its "natural" quality. This can dramatically reduce the file size of the sprite.

The optimal setting for jpegQuality depends on the sprite - you would determine this through experimentation. Setting quality higher than 90 may actually result in a greater file size. Values between 50 and 70 tend to give good reductions in size without being too noticeable to human eyes.

The best use of jpegQuality is probably to reduce the file size of photos. Image files produced by digital cameras tend to be very big, and can be easily compressed without visible loss of quality.

To effectively compress large .jpg images, you would use these properties:

  1. You may decide you don't want to combine these large images into sprites because the resulting sprites would be very big (although you certainly could). To achieve that you'd set giveOwnSprite to true.
  2. You want to set filePathMatch to "jpg$", so the group only picks up .jpg images. Because this is a regular expression, you can use this to only select images from particular directories as well.
  3. Finally, set spriteImageType to "Jpg". Otherwise the images will be converted to .png images, which for photos is not optimal.

This would result in something like:

<cssSpriteGenerator ... >
      <imageGroups>
          <add jpegQuality="70" giveOwnSprite="true" filePathMatch="jpg$" 
          spriteImageType="Jpg" ... />
      </imageGroups>
</cssSpriteGenerator>

Suppose you want to combine small .jpg images into sprites along with small .png and .gif images, while compressing the big .jpg images? You can do this by using the fact that the package matches images with whatever group comes first:

<cssSpriteGenerator ... >
      <imageGroups>
          <!--matches all images that are 200px by 300px or smaller-->
          <add maxWidth="200" maxHeight="300"/>

          <!--matches all remaining .jpg images. 
          These images will be bigger than 200px by 300px otherwise they would have 
          matched the preceding group.-->
          <add jpegQuality="70" giveOwnSprite="true" 
          filePathMatch="jpg$" spriteImageType="Jpg" ... />
      </imageGroups>
</cssSpriteGenerator>
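The "first matching group wins" rule used in this configuration can be sketched in Python. This is a simplified model, not the package's code: the Group class and its field names are hypothetical, only three of the filter properties are modelled, and the file paths are made up.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Group:
    """Simplified model of an image group with three filter properties."""
    name: str
    max_width: Optional[int] = None
    max_height: Optional[int] = None
    file_path_match: Optional[str] = None

    def accepts(self, file_path, width, height):
        if self.max_width is not None and width > self.max_width:
            return False
        if self.max_height is not None and height > self.max_height:
            return False
        if self.file_path_match and not re.search(self.file_path_match, file_path):
            return False
        return True


def first_matching_group(groups, file_path, width, height):
    """Return the first group, in configuration order, that accepts the image."""
    for group in groups:
        if group.accepts(file_path, width, height):
            return group
    return None


# Same order as the configuration above: small images first, then big .jpg files.
groups = [
    Group("small images", max_width=200, max_height=300),
    Group("big jpg", file_path_match="jpg$"),
]

small_jpg = first_matching_group(groups, r"C:\site\thumb.jpg", 100, 100)
big_jpg = first_matching_group(groups, r"C:\site\photo.jpg", 1600, 1200)
big_png = first_matching_group(groups, r"C:\site\banner.png", 1600, 1200)
```

A small .jpg lands in the first group and is combined into a sprite with the other small images, a big .jpg falls through to the second group and gets compressed on its own, and a big .png matches neither group.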

The demo site DemoSite_CompressedJpeg in the downloaded solution uses the jpegQuality property to reduce the quality of big jpeg files to 70%.

pixelFormat

Only applies if the sprite is generated as a PNG or GIF image. Sets the pixel format of the sprite.

Value                 Nbr. Colors            Bit Depth (bits per pixel)  Resulting pixel format
DontCare (default)                                                       Pixel format of the constituent image with the highest bits per pixel. Format48bppRgb when combining an image using Format16bppGrayScale with colored images.
Format1bppIndexed     2                      1                           Uses color table with 2 colors (black and white).
Format4bppIndexed     16                     4                           Uses color table with 16 colors.
Format8bppIndexed     256                    8                           Uses color table with 256 colors.
Format16bppGrayScale  65,536 shades of gray  16                          The color information specifies 65536 shades of gray.
Format16bppRgb555     32,768                 16                          5 bits each for the red, green, and blue components. The remaining bit is not used.
Format16bppRgb565     65,536                 16                          5 bits for the red component, 6 bits for the green component, and 5 bits for the blue component.
Format16bppArgb1555   32,768                 16                          5 bits each for the red, green and blue components, and 1 bit is alpha.
Format24bppRgb        16,777,216             24                          8 bits each for the red, green, and blue components.
Format32bppRgb        16,777,216             32                          8 bits each for the red, green, and blue components. Remaining 8 bits not used.
Format32bppArgb       4,294,967,296          32                          8 bits each for the alpha, red, green, and blue components.
Format32bppPArgb      4,294,967,296          32                          8 bits each for the alpha, red, green, and blue components. The red, green, and blue components are premultiplied, according to the alpha component.

Example

<cssSpriteGenerator ... >
      <imageGroups>
          <add pixelFormat="Format24bppRgb" ... />
      </imageGroups>
</cssSpriteGenerator>

Images in .png and .gif format can have different pixel formats. This helps in reducing the image file size. For example, if you produce a .png image in Photoshop that has no more than 16 colors, you would give it a color depth of no more than 16 colors, giving you an image that takes 4 bits per pixel (Format4bppIndexed). If you used more than 16, you'd have to give it the next higher color depth of 256 colors, taking 8 bits per pixel (Format8bppIndexed). The higher the color depth, the bigger the file size.

Normally, the package generates sprites with the right pixel format. But sometimes you want to override the pixel format:

  • If the images that went into the sprite have very different colors, you may wind up with a sprite whose color depth is too low - you'll find out because it looks bad. To fix that, you could try pixel formats with more bits per pixel, to get a greater color depth.

  • You can try pixel formats with fewer bits per pixel to reduce the file size of the sprite. This is especially useful if you combine .jpg images into .png sprites, because the package assumes that .jpg images use many colors and so gives the sprite a greater color depth than it really needs.

The demo site DemoSite_CompressedPng in the downloaded solution uses pixelFormat to reduce the color depth of a sprite.

Pixel format is quite a topic in its own right. Here we'll look at:

Finding out bits per pixel of an image

Finding out the bit depth, dimensions, etc. of an image doesn't require an image editor, at least if you use Windows 7 and possibly Vista:

  1. Right click the image file in Windows Explorer;
  2. Choose Properties;
  3. Open the Details tab.

Combining images with different pixel formats

When the package combines multiple images into the one sprite, those images may have different pixel formats - one image may use Format4bppIndexed because it has fewer than 16 colors, while another one may use Format8bppIndexed because it uses more than 16 colors, etc.

To ensure that the constituent images all look good on your page, by default the package sets the pixel format of the sprite to that of the constituent image with the highest pixel format - which would be Format8bppIndexed in the above example. You can override this by setting pixelFormat.
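The default choice can be sketched as "take the constituent format with the most bits per pixel, unless pixelFormat overrides it". The Python below is only an illustration under that reading: the function name is hypothetical, the bit depths are copied from the table of pixel formats, and the special case for Format16bppGrayScale is left out.

```python
# Bits per pixel for a few of the formats listed in the pixelFormat table.
BITS_PER_PIXEL = {
    "Format1bppIndexed": 1,
    "Format4bppIndexed": 4,
    "Format8bppIndexed": 8,
    "Format24bppRgb": 24,
    "Format32bppArgb": 32,
}


def sprite_pixel_format(image_formats, pixel_format_override=None):
    """Pick the sprite's pixel format from those of its constituent images."""
    if pixel_format_override is not None:
        # An explicit pixelFormat on the group wins.
        return pixel_format_override
    # Default: the constituent format with the highest bits per pixel.
    return max(image_formats, key=lambda fmt: BITS_PER_PIXEL[fmt])


default_choice = sprite_pixel_format(["Format4bppIndexed", "Format8bppIndexed"])
overridden = sprite_pixel_format(
    ["Format4bppIndexed", "Format8bppIndexed"],
    pixel_format_override="Format24bppRgb",
)
```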

Combining images with same bit depth but different palettes

When you create an image in Photoshop or some other image editing program and you give it a color depth of 16 or 256 colors (corresponding to pixel formats Format4bppIndexed and Format8bppIndexed), the program will create a palette of colors inside the image file with the colors you used in the image. The 4 or 8 bits for each pixel then form an index into the palette.

This means that an image with lots of shades of red would have a palette with lots of shades of red. Another image with the same color depth but with lots of shades of blue would have a palette with lots of shades of blue. So both images would have completely different palettes, even though their color depths are the same.

To cope with this, the package initially creates a sprite image with pixel format Format32bppArgb (32 bits per pixel) and then adds the constituent images. That way, even if those images have widely different palettes, they will all keep their colors in the initial sprite.

However, to reduce the file size of the sprite, the package then reduces the color depth of the sprite to that of the constituent image with the most colors. If the image with the highest color depth uses 4 or 8 bits, that means the sprite itself will also use 4 or 8 bits per pixel - meaning it uses a palette. The challenge then is to come up with a palette that suits the entire sprite, even if the original images had widely different palettes.

The package has a number of clever algorithms built in to work out the colors on the sprite's palette (choose one yourself or let the package choose one for you). But if the constituent images have widely different colors, this may not work well and you could wind up with images on the page that don't look right. In that case, you can force the package to go for a pixel format with more bits per pixel (such as Format24bppRgb), to keep image quality up at the expense of a bigger sprite.

Because of all this, if you have lots of images that use a palette (4 or 8 bits per pixel), it makes sense to group images that have similar colors together - all the "red" images in one sprite, all the "blue" images in another sprite, etc.

This issue doesn't arise with higher color depths, because in that case the image no longer has a palette. If you allow 24 bits per pixel or more (pixel format Format24bppRgb or higher), any palette would contain 16,777,216 colors or more - which doesn't make sense. So in that case the bits for each pixel represent a color directly, rather than a position in a palette.

Resizing changes the pixel format

If you use resizeWidth, resizeHeight or auto resizing, then the package has to resize the image on the fly.

The problem is that the resized image often requires more colors than the original, to give it good color transitions. Because this is a complicated issue that takes many CPU cycles to optimize, the package keeps it simple and generates a resized image with at least 24 bits per pixel (pixel format Format24bppRgb), to cater for all the possibly required colors.

As a result, because at least one of the constituent images now has 24 bits per pixel or more, you'll wind up with a sprite that itself has 24 bits per pixel or more as well. However, that could be more than actually necessary. You can set pixelFormat to a lower pixel format, such as Format8bppIndexed, to see if that reduces the file size of the sprite while maintaining an acceptable image quality.

Group images with similar pixel formats together

If you have 9 simple images that use no more than 16 colors (4 bits per pixel is enough) while a 10th more colorful image uses more than 16 colors (requires 8 bits per pixel or more), you may want to group only the 9 simple images together.

That way, the resulting sprite takes only 4 bits per pixel. If you added the 10th more colorful image to the group, that would force the sprite to have 8 bits per pixel - meaning it would have a greater file size.

Optimizing images

Some images on your site may be unoptimized - such as images that have a higher color depth than necessary, or that use the JPG format even though they have few colors.

If you combine these unoptimized images into a sprite, the sprite will wind up with a pixel format that is higher than necessary, causing it to have a higher file size. Rather than editing each icon yourself, you can get the package to effectively do this for you by setting pixelFormat to a lower pixel format.

Palette based pixel formats not used with some types of shared hosting

As you saw in this section, the package initially creates sprites in a pixel format that doesn't use a palette. It may then try to convert the sprite to a pixel format with only 4 or 8 bits to reduce its file size, meaning it will have a palette.

The problem is that the algorithms that calculate the palette use low level code to access all colors in the sprite, so they can work out the optimal palette. This needs to be low level code to make it fast enough.

However, running this low level code can only be done if your site runs in an environment with trust level Full.

That is not an issue in your development environment or if you have your own web server - there you always have trust level Full. It even works fine with many cheap shared hosting plans that give you trust level Full even though your site shares a web server with other sites.

However, some shared hosting plans, such as GoDaddy, only give your site trust level Medium, to ensure that the sites sharing the web server don't affect each other. That prevents the package from using the low level code to work out the palette.

As a result, if your site runs in an account with trust level Medium, the package doesn't convert sprites to any pixel format that requires a palette (Format1bppIndexed, Format4bppIndexed or Format8bppIndexed), even if you tell it to by setting pixelFormat. Instead, it will use Format24bppRgb, which doesn't use a palette.
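The fallback rule described above can be summarised in a few lines. This is a sketch in Python that only mirrors the rule as stated in the text; the helper name effective_pixel_format is hypothetical and is not part of the package.

```python
# Pixel formats that need a palette, as listed in the text above.
INDEXED_FORMATS = {"Format1bppIndexed", "Format4bppIndexed", "Format8bppIndexed"}

def effective_pixel_format(requested_format, trust_level):
    """Pixel format that ends up being used for the sprite.

    Hypothetical helper: it restates the documented rule and nothing more.
    """
    if trust_level != "Full" and requested_format in INDEXED_FORMATS:
        # The palette cannot be calculated without trust level Full,
        # so fall back to a format without a palette.
        return "Format24bppRgb"
    return requested_format

print(effective_pixel_format("Format8bppIndexed", "Medium"))  # Format24bppRgb
print(effective_pixel_format("Format8bppIndexed", "Full"))    # Format8bppIndexed
```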

paletteAlgorithm

Lets you choose which algorithm is used to find the optimal palette, when the package produces a sprite with a pixel format that uses palettes.

Algorithm Description
Windows Uses palette with standard Windows colors. Very fast. Best choice for images that only use safe colors. When target pixel format is Format1bppIndexed (1 bit per pixel), the package always uses this algorithm.
HSB Slower, excellent results with images with many colors
MedianCut Slower, often good results
Octree Quick, reasonable results
Popularity Very quick, but results are poor with images with many colors
Uniform Very quick, but results are poor with images with many colors. Only works when targeting 8 bits per pixel (Format8bppIndexed). For 4 bits per pixel, the package will use HSB instead of Uniform

More information about these algorithms is here.

Example

<cssSpriteGenerator ... >
      <imageGroups>
          <add paletteAlgorithm="Windows" ... />
      </imageGroups>
</cssSpriteGenerator>

For details related to the pixel format of the sprites generated from this group, see pixelFormat.

Say you have an image of 1000 colors, and you want to reduce the color depth so it takes only 8 bits per pixel, to reduce its file size. 8 bits per pixel means you'll be using a palette with only 256 colors. What 256 colors will you choose so the new image still looks like the original to human eyes?

This is obviously a tricky task, especially seeing that the algorithm also needs to minimize CPU usage. People have come up with different algorithms to achieve this. The issue is that an algorithm that is best in some situations is not necessarily the best in other situations. So rather than locking you into one algorithm, the package allows you to experiment with different algorithms if the default doesn't work for you.

The demo site DemoSite_CompressedPng in the downloaded solution uses paletteAlgorithm when reducing the color depth of a sprite.
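To make the problem concrete, here is a minimal sketch of the simplest idea, a popularity palette: keep the most frequent colors and map every pixel to the nearest color that was kept. It is illustrative Python that assumes pixels are plain (r, g, b) tuples. It is not the package's implementation of its Popularity option, and the function names are invented for this example.

```python
from collections import Counter

def popularity_palette(pixels, palette_size):
    """Return the palette_size most frequent colors in pixels."""
    counts = Counter(pixels)
    return [color for color, _ in counts.most_common(palette_size)]

def nearest_index(color, palette):
    """Index of the palette entry closest to color (squared RGB distance)."""
    def distance(entry):
        return sum((a - b) ** 2 for a, b in zip(color, entry))
    return min(range(len(palette)), key=lambda i: distance(palette[i]))

def quantize(pixels, palette_size):
    """Return (palette, indexes) where each pixel becomes a palette position."""
    palette = popularity_palette(pixels, palette_size)
    return palette, [nearest_index(p, palette) for p in pixels]

# Three pure red pixels, two pure blue pixels and one slightly off red pixel.
pixels = [(255, 0, 0)] * 3 + [(0, 0, 255)] * 2 + [(250, 5, 5)]
palette, indexes = quantize(pixels, 2)
print(palette)  # [(255, 0, 0), (0, 0, 255)]
print(indexes)  # [0, 0, 0, 1, 1, 0]
```

The off red pixel is folded into pure red. That loss is small here, but with many distinct colors the rarely used ones are all discarded, which is why this approach tends to give poor results on colorful images.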

disableAutoResize

Lets you switch off the Auto Resize feature (described below).

Value Description
false (default) Images automatically resized according to width and height image tag properties.
true Auto resizing switched off. Do not combine with giveOwnSprite="false".

Example

<cssSpriteGenerator ... >
      <imageGroups>
          <add disableAutoResize="true" ... />
      </imageGroups>
</cssSpriteGenerator>

As you have seen, when the package turns images into sprites, it replaces img tags with div tags - where the div has a width and height that match the width and height of the original image, so it can show precisely the area of the sprite taken by the original image.

However, you may have an img tag with width and/or height attributes that do not correspond with the width and height of the physical image. For example:

<!--physical image width is 100px, but shown on page 50px wide-->
<img src="physically100wide.png" width="50" />

The issue is that you cannot resize background images with CSS the way you can resize an image with the width and height attributes of an img tag. To overcome this, the package physically resizes the image before adding it to the sprite - a feature called Auto Resize. This happens in memory - your original image file is not affected. It also makes sure that the width and height of the div are as specified by the width and/or height properties of the img tag.

If you set only the width property in the img tag, or only the height property, the package will preserve the aspect ratio of the image - so it still looks the same on the page.
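A small sketch shows the size calculation this implies. It is illustrative Python, not the package's code: the function name displayed_size is made up, the physical height of 60px is an assumed value for the example, and rounding to the nearest pixel is an assumption.

```python
def displayed_size(physical_width, physical_height, width_attr=None, height_attr=None):
    """Size an image would be resized to, given optional img width/height attributes."""
    if width_attr is None and height_attr is None:
        # No attributes: nothing to resize.
        return physical_width, physical_height
    if height_attr is None:
        # Only width given: scale the height by the same factor.
        return width_attr, round(physical_height * width_attr / physical_width)
    if width_attr is None:
        # Only height given: scale the width by the same factor.
        return round(physical_width * height_attr / physical_height), height_attr
    # Both given: use them as they are, the aspect ratio may change.
    return width_attr, height_attr

# Physical image is 100px wide (and assumed 60px high), shown 50px wide.
print(displayed_size(100, 60, width_attr=50))  # (50, 30)
```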

If there are multiple img tags on the page referring to the same physical image but with different width and/or height attributes, then the package generates versions for each width/height before adding them to the sprite.

Auto Resize only works with the width and/or height attributes on an img tag. It doesn't work if you set the width or height in CSS.

Keep in mind that if you use resizeWidth and/or resizeHeight with your group, those override any width and/or height properties on the img tag - so the Auto Resize feature does not come into play then.

The DemoSite_AutoResized web site in the downloaded solution shows auto resizing in action.

If you want to, you can disable the Auto Resize feature by setting disableAutoResize to true. However, as shown above, that wouldn't work when combining images into sprites. So the package only allows you to do this if you also set giveOwnSprite to true, because in that case the sprite can be shown with an img tag with width and height attributes, rather than with a background image.

It would make sense to disable the Auto Resize feature if your page showed the same physical image a number of times, with different sizes. In that case, you would want to use one physical image rather than multiple resized images - so the browser needs to load only one physical image.


Group Creation


groupName

Sets the name of the group. You can't have more than one group with the same name (but you can have multiple groups without a name).

Type Default
string empty

Example

<cssSpriteGenerator ... >
      <imageGroups>
          <add groupName="largejpg" ... />
      </imageGroups>
</cssSpriteGenerator>

You would give a group a name for these reasons:

  • So another group can refer to it using subsetOf.
  • To give it a descriptive name, so it is easier to remember later what the group is for.

subsetOf

Lets groups inherit properties from other groups.

Type Default
string empty

Example

<cssSpriteGenerator ... >
      <imageGroups>
          <add subsetOf="basegroup" ... />
          <add groupName="basegroup" ... />
      </imageGroups>
</cssSpriteGenerator>

When you set subsetOf for a group to the name of some other group, the group inherits all properties from the other group - except for the ones that it sets itself.

Take for example:

<cssSpriteGenerator ... >
      <imageGroups>
          <add subsetOf="pnggroup" filePathMatch="\.jpg" />
          <add groupName="pnggroup" maxHeight="100" filePathMatch="\.png" />
      </imageGroups>
</cssSpriteGenerator>

The lowest group has maxHeight set to 100 and filePathMatch set to \.png. So it matches .png files that are not higher than 100px.

The group above it inherits from pnggroup. It doesn't set maxHeight itself, so it inherits that from pnggroup. But it does set filePathMatch to \.jpg, thereby overriding the filePathMatch it gets from pnggroup. As a result, it matches .jpg files that are no higher than 100px.
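The inheritance rule can be modelled as a simple dictionary merge. This is a sketch in Python, not the package's code. It handles one level of subsetOf only, and it assumes that a group's own name is not inherited; both are assumptions made for the example.

```python
def effective_properties(group, groups_by_name):
    """Properties of a group after applying subsetOf (one level only)."""
    parent_name = group.get("subsetOf")
    if parent_name is None:
        return dict(group)
    parent = groups_by_name[parent_name]
    # Assumption: the parent's groupName is not inherited.
    inherited = {k: v for k, v in parent.items() if k != "groupName"}
    # The group's own settings win over the inherited ones.
    return {**inherited, **group}

pnggroup = {"groupName": "pnggroup", "maxHeight": "100", "filePathMatch": r"\.png"}
jpggroup = {"subsetOf": "pnggroup", "filePathMatch": r"\.jpg"}

result = effective_properties(jpggroup, {"pnggroup": pnggroup})
print(result["maxHeight"])      # 100, inherited from pnggroup
print(result["filePathMatch"])  # \.jpg, set by the group itself
```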

CSS Background Images

The package has the ability to combine CSS background images into sprites, to compress them, etc. This is very useful because background images tend to be very small which means that their request/response overhead is proportionally high, so there is a lot to be gained from combining them into sprites.

The cssImages element

Whereas the package can interpret the current page to find all the images used there, it can't yet interpret CSS files. To overcome this, the package lets you specify the background images you want processed in cssImages elements.

Here is what this looks like:

<cssSpriteGenerator ... >
      <cssImages>
          <add imageUrl="css/img/logo.png" cssSelector=".backgrounglogo" ... />
          <add imageUrl="css/img/biglogo.png" cssSelector=".backgroungbiglogo" ... />
          ...
      </cssImages>
</cssSpriteGenerator>

As you see here, the package also needs the CSS selector where the background image is used. This is because it generates additional CSS statements that partly override the original CSS, to ensure that the background image still shows up correctly on the page after it has been put into a sprite.

After the images have been read from the cssImages elements, they are processed the same way as images read from the page. This means that some background images could be combined with images used in img tags. It also means they need to be matched to an image group before they can be put into a sprite.

For example, if you wanted to place the two "logo" background images in their own group, you could use a group with the filePathMatch property, like this:

<cssSpriteGenerator ... >
      <cssImages>
          <add cssSelector=".backgrounglogo" imageUrl="css/img/logo.png" ... />
          <add cssSelector=".backgroungbiglogo" imageUrl="css/img/biglogo.png" ... />
          ...
      </cssImages>
      <imageGroups>
          <!--only include images ending in logo.png in the group-->
          <add filePathMatch="logo\.png$" ... />
      </imageGroups>
</cssSpriteGenerator>
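If you want to check what a pattern like this picks up before putting it in web.config, you can try it against a few paths. The sketch below uses Python's re module purely for illustration. It assumes the pattern is a regular expression searched for anywhere in the image path, and it makes no claim about case sensitivity or about the exact path string the package matches against.

```python
import re

# Same pattern as the filePathMatch value in the example above.
pattern = re.compile(r"logo\.png$")

def matches_group(image_path):
    """True if the pattern is found anywhere in the image path."""
    return pattern.search(image_path) is not None

print(matches_group("css/img/logo.png"))      # True
print(matches_group("css/img/biglogo.png"))   # True
print(matches_group("css/img/logo.png.bak"))  # False
```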

In addition to the imageUrl and cssSelector properties, the combineRestriction property caters for background images that are repeated horizontally or vertically, and the backgroundAlignment property caters for background images used with the sliding door technique.

The description of the imageUrl property shows in detail how to create a cssImages element based on a CSS style.

runat=server in head tag

If you use cssImages elements to process CSS background images, the package will always generate a separate .css file to override existing CSS styles to make them work with the new sprites.

So that the package can add a link tag for the .css file to the head section, make sure that the head tags of your pages have runat="server":

<head runat="server">

Properties Index

cssImages elements have the following properties:

imageUrl (required)

The url of the background image to be processed. This can be an absolute url or a url relative to the root of the web site - but not a url that is relative to the directory of the CSS file.

Type Default
string none

Example

<cssSpriteGenerator ... >
      <cssImages>
          <add imageUrl="css/img/logo.png" ... />
      </cssImages>
</cssSpriteGenerator>

As an example, suppose you have a CSS file site.css in directory css with the following CSS:

.backgrounglogo 
{
    height: 32px; width: 32px;
    background: #000000 url(img/logo.png);
}

To have the logo.png image combined into a sprite, you would take these steps:

  1. Create a new entry in cssImages:
    <cssImages>
        ...
        <add />
    </cssImages>
    
  2. Add the CSS selector .backgrounglogo:
    <cssImages>
        <add cssSelector=".backgrounglogo" />
    </cssImages>
    

    If you have multiple selectors using the same background image, you need to create an entry for each selector.

  3. Add the url of the background image.

    Here the image url - img/logo.png - is relative to the directory of the CSS file, which is css. However, the package doesn't know where the CSS file is located, so it needs the image url relative to the root of the web site - css/img/logo.png:

    <cssImages>
        <add cssSelector=".backgrounglogo" imageUrl="css/img/logo.png" />
    </cssImages>
    
  4. If the style uses a repeating background image, or if you use the sliding door technique, you may need to add combineRestriction and backgroundAlignment properties - see their descriptions for more details.

  5. Finally, make sure there is an image group that matches the background image, otherwise it won't be combined into a sprite. If there is no such group yet, add one:
    <imageGroups>
        <add ... />
    </imageGroups>
    <cssImages>
        <add cssSelector=".backgrounglogo" imageUrl="css/img/logo.png" />
    </cssImages>
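Step 3 above, turning a url relative to the CSS file into a url relative to the root of the web site, can be sketched as follows. This is illustrative Python using the standard posixpath module; the helper name root_relative_url is made up, and the package itself does not do this conversion for you.

```python
import posixpath

def root_relative_url(css_file_url, url_in_css):
    """Resolve a url() value from a CSS file into a url relative to the site root."""
    css_directory = posixpath.dirname(css_file_url)
    return posixpath.normpath(posixpath.join(css_directory, url_in_css))

# site.css lives in directory css and refers to img/logo.png.
print(root_relative_url("css/site.css", "img/logo.png"))  # css/img/logo.png
```

The result is the value you would put in the imageUrl property.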
    

cssSelector (required)

The selector of the style that uses the background image.

Type Default
string none

Example

<cssSpriteGenerator ... >
      <cssImages>
          <add cssSelector=".backgrounglogo" ... />
      </cssImages>
</cssSpriteGenerator>

See the discussion at the imageUrl property.

combineRestriction (optional)

Sets restrictions on the way the background image can be combined with other images in a sprite.

Value Description
None (default) No combine restrictions
HorizontalOnly Image will only be combined horizontally, and only with images of same height. Use with styles that use repeat-y.
VerticalOnly Image will only be combined vertically, and only with images of same width. Use with styles that use repeat-x.

Example

<cssSpriteGenerator ... >
      <cssImages>
          <add combineRestriction="VerticalOnly" ... />
      </cssImages>
</cssSpriteGenerator>

Whether to use a combine restriction depends on whether you use a repeating background image:

When using  Example                            Use combineRestriction
repeat-x    background: url(bg.png) repeat-x   VerticalOnly
repeat-y    background: url(bg.png) repeat-y   HorizontalOnly

Example for vertically repeating background image

To see how this works, have a look at this screen shot:

This uses the following HTML:

<table cellpadding="10"><tr>
<td><div class="hor-gradient-lightblue">B<br />l<br />u<br />e</div></td>
<td><div class="hor-gradient-red">R<br />e<br />d</div></td>
</tr></table>

With this CSS:

.hor-gradient-lightblue
{
    width: 10px;
    background: #ffffff url(img/gradient-hor-lightblue-w10h1.png) repeat-y;
}

.hor-gradient-red
{
    width: 10px; 
    background: #ffffff url(img/gradient-hor-red-w10h1.png) repeat-y;
}

And these background images, which are both 10px wide - as wide as the div:

gradient-hor-lightblue-w10h1.png
(zoomed in 8 times)
gradient-hor-red-w10h1.png
(zoomed in 8 times)

Because each background image is tiny, it makes perfect sense to combine them into a sprite, so the browser needs to load only one image instead of two. However, you wouldn't want a sprite like this with the images stacked on top of each other:

sprite with images stacked vertically
(zoomed in 8 times)

Because that would produce this less than stellar result:

Both background images now show up in both backgrounds! We need to tell the package to only combine these background images horizontally. That can be done with the combineRestriction property:

<cssImages>
    ...
    <add ... combineRestriction="HorizontalOnly"/>
    ...
</cssImages>

Combining the background images horizontally gives us this sprite:

sprite with images combined horizontally
(zoomed in 8 times)

This allows the package to generate CSS that shifts the sprite over the visible area:

correct sprite area
shifted over visible area
(zoomed in 4 times)

Wrapping this up, you would use these entries in cssImages for your two background images:

<cssImages>
    ...
    <add imageUrl="css/img/gradient-hor-lightblue-w10h1.png" 
            cssSelector=".hor-gradient-lightblue" 
            combineRestriction="HorizontalOnly"/>
    <add imageUrl="css/img/gradient-hor-red-w10h1.png" 
            cssSelector=".hor-gradient-red" 
            combineRestriction="HorizontalOnly"/>
</cssImages>

Horizontally repeating background image

The story for background images that repeat horizontally instead of vertically is the same, except that you would set combineRestriction to VerticalOnly, so the images are guaranteed to be stacked vertically in the sprite.

Sprites and narrow background images

So far, the background images have been precisely as wide as the parent div element. What happens if we make the background images narrower, say 5px, while the div is still 10px wide?

Without sprites, the browser will show:

However, if we combine the two 5px wide background images into a sprite:

sprite with 5px wide images combined horizontally
(10px wide, zoomed in 8 times)

Then this is the result if we use that sprite as the background image:

The red background looks fine, but the blue background seems to have combined with the red background! The reason why is obvious when you consider how the CSS sprite technique works - the sprite is shifted over the div element so the correct image within that sprite becomes visible. The width and height of the div then ensure that only that correct image is visible.

However, that breaks down here for the blue background image, because the background image that we want to show is 5px wide, while the div is 10px wide. As a result, the red background image to its right shows up as well. It happens to work for the red background image, but only because here the sprite has been shifted 5px to the left and the sprite doesn't contain an image to the right of the red background image.

Moral of this story: If a background image is both lower and narrower than the div with which it is used, then it cannot be combined into a sprite.
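You can reproduce the effect with a little arithmetic. The sketch below is illustrative Python, not the package's code. It models a sprite whose images sit side by side, shifts it so that one image starts at the left edge of the div, and lists which images end up inside the div. The function name visible_images is invented for this example.

```python
def visible_images(image_widths, shown_index, div_width):
    """Indexes of the images in a horizontal sprite that fall inside the div
    when the sprite is shifted so image shown_index starts at the div's left edge."""
    start = sum(image_widths[:shown_index])
    end = start + div_width
    visible = []
    left = 0
    for i, width in enumerate(image_widths):
        right = left + width
        # The image is visible if it overlaps the range covered by the div.
        if left < end and right > start:
            visible.append(i)
        left = right
    return visible

# Two 5px wide images (0 = blue, 1 = red) in a sprite, div is 10px wide.
print(visible_images([5, 5], 0, 10))    # [0, 1] blue and red both show up
print(visible_images([5, 5], 1, 10))    # [1] only red, nothing to its right
# Two 10px wide images, as wide as the div: only the wanted image shows.
print(visible_images([10, 10], 0, 10))  # [0]
```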

backgroundAlignment (optional)

Determines how the CSS generated by the package aligns the background image.

Value Description
None (default) Image will not be aligned
Top Image will be top aligned
Bottom Image will be bottom aligned
Left Image will be left aligned
Right Image will be right aligned

Example

<cssSpriteGenerator ... >
      <cssImages>
          <add backgroundAlignment="Left" ... />
      </cssImages>
</cssSpriteGenerator>

In your CSS, you may be aligning background images, like this:

background: url(img/button-green-left.png) left;

To ensure that the package generates the correct sprite and CSS to cater for alignments, you need to add not only a combineRestriction property but also a backgroundAlignment property to your cssImages elements in the following cases:

Background Image                            Alignment  Example CSS                      combineRestriction  backgroundAlignment
Narrower and as high or higher than parent  Left       background: url(bg.png) left;    VerticalOnly        Left
As high or higher than parent               Right      background: url(bg.png) right;   VerticalOnly        Right
Lower and as wide or wider than parent      Top        background: url(bg.png) top;     HorizontalOnly      Top
As wide or wider than parent                Bottom     background: url(bg.png) bottom;  HorizontalOnly      Bottom

The parent is for example a div tag that uses the background image. Keep in mind that if the background image is both narrower and lower than the parent element, it can't be combined with other images into a sprite.

Let's look at a practical example of all this. Have a look at this screen shot:

Both buttons are normally green. When you hover over one, it goes orange.

Here is the HTML for the buttons. Note that rather than using an image per button, both buttons use the same CSS class flexible-width-button - only the text is different. This uses the sliding door technique, which relies on background image alignment.

<table cellpadding="10"><tr><td>
<div class="flexible-width-button"><a href="delivery.aspx">Delivery</a></div>
</td><td>
<div class="flexible-width-button"><a href="buy.aspx">Buy</a></div>
</td></tr></table>

Here is the CSS class flexible-width-button (some irrelevant bits have been left out). Note that the height of a button is 25px:

div.flexible-width-button {
    background: #ffffff url(img/button-green-left.png) top left no-repeat;
    ...
}

div.flexible-width-button a {
    background: transparent url(img/button-green-right.png) top right no-repeat;
    line-height: 25px;
    ...
}

div.flexible-width-button:hover, div.flexible-width-button:focus {
    background: #ffffff url(img/button-orange-left.png) top left no-repeat;
}

div.flexible-width-button:hover a, div.flexible-width-button:focus a {
    background: transparent url(img/button-orange-right.png) top right no-repeat;
}

The CSS uses these background images:

button-green-left.png button-green-right.png button-orange-left.png button-orange-right.png

All four images are 25px high, which is the same height as their parent elements. button-green-left.png and button-orange-left.png are also wider and they are left aligned, so according to the table above, there is no need to add combineRestriction and backgroundAlignment to their cssImages elements:

<cssSpriteGenerator ... >
    <cssImages>
        ...
        <add imageUrl="css/img/button-green-left.png" 
           cssSelector="div.flexible-width-button" />
        <add imageUrl="css/img/button-orange-left.png"
           cssSelector="div.flexible-width-button:hover, div.flexible-width-button:focus" />
    </cssImages>
</cssSpriteGenerator>

It's a different story for button-green-right.png and button-orange-right.png: they are both narrower than their parent elements, and they are right aligned. According to the table above, that's two reasons to add not only a combineRestriction but also a backgroundAlignment to their cssImages elements:

<cssSpriteGenerator ... >
    <cssImages>
        ...
        <add imageUrl="css/img/button-green-right.png" 
           cssSelector="div.flexible-width-button a" 
           combineRestriction="VerticalOnly" backgroundAlignment="Right" />
        <add imageUrl="css/img/button-orange-right.png" 
           cssSelector="div.flexible-width-button:hover a, div.flexible-width-button:focus a" 
           combineRestriction="VerticalOnly" backgroundAlignment="Right" />

        <add imageUrl="css/img/button-green-left.png" 
           cssSelector="div.flexible-width-button" />
        <add imageUrl="css/img/button-orange-left.png"
           cssSelector="div.flexible-width-button:hover, div.flexible-width-button:focus" />
    </cssImages>
</cssSpriteGenerator>

Conclusion

This was the last part of this 3 part series. If you haven't given the ASP.NET CSS Sprite Generator package a try yet, now would be a good time! Download it here.

Book: ASP.NET Site Performance Secrets If you enjoyed reading this series, consider buying my book ASP.NET Site Performance Secrets. In a structured approach towards improving performance, it shows how to first identify the most important bottlenecks in your site, and then how to fix them. Very hands on, with a minimum of theory.

History

Version Released Description
1.0 3 Aug 2011 Initial release.
1.1 15 Aug 2011 If an image's file size is greater than the maxSpriteSize of a group, it won't be added to that group, regardless of the group's maxSize.

Generate CSS sprites and thumbnail images on the fly in ASP.NET sites - part 2

by Matt Perdeck 14. September 2011 20:58

Introduction

This is part 2 of a 3 part series about the ASP.NET CSS Sprite Generator package. It allows you to compress and resize the images in your ASP.NET web site on the fly, and to combine them into sprites. Simply add the ASP.NET CSS Sprite Generator dll and configure the package in your web.config - no need to change the site itself.

Configuration

Configuration settings for the package fall into three categories:

  • Overall Configuration - settings that apply to the package as a whole, such as whether it is active or not, whether it processes page images, etc.
  • Image Groups - determine which images are processed and how. You've already come across groups in the quick start section.
  • cssImages elements - tell the package which CSS background images to process, and set restrictions on how they can be combined into sprites to cater for repeating background images.

These categories are discussed in the following sections.

Overall Configuration

The package supports these package wide configuration settings:

Overall Switches

Image Gathering

Options that determine how the package finds the images to be processed

Sprite Generation

Options that determine how the package generates sprites


Overall Switches


active

Determines when the package is active. When it is not active, it doesn't affect your site at all and none of the other attributes or child elements listed in this section have any effect.

Value Description
Never Package is never active, irrespective of debug mode
Always Package is always active, irrespective of debug mode
ReleaseModeOnly (default) Package is only active in Release mode
DebugModeOnly Package is only active in Debug mode

Example

<cssSpriteGenerator active="Always" >
    ...
</cssSpriteGenerator>

Whether your site is in Debug or Release mode depends on the debug attribute of the compilation element in your web.config file. If that attribute is set to false, your site is in Release mode (as it should be when live). When it is set to true, it is in Debug mode (as it should be in development). It looks like this in web.config:

<configuration>
    ...
    <system.web>
        <compilation debug="true">
            ...
        </compilation>
        ...
    </system.web>
    ...
</configuration>

By default, the package is only active in Release mode - so you won't see any effect while you're developing. To ensure that the package is active in both Release mode and in Debug mode, set active to Always, as shown in the example above.

Note that the active property acts as a master switch for the whole package. If it isn't active, none of the other properties have any effect.

exceptionOnMissingFile

Determines whether the package throws an exception when an image file is missing.

Value Description
Never (default) The package never throws an exception when an image file is missing.
Always The package always throws an exception when an image file is missing.
ReleaseModeOnly The package only throws an exception if the site is in Release mode.
DebugModeOnly The package only throws an exception if the site is in Debug mode.

Example

<cssSpriteGenerator exceptionOnMissingFile="DebugModeOnly" ... >
    ...
</cssSpriteGenerator>

In order to process an image, the package has to actually read that image. What happens if the image file cannot be found on the web server? That is determined by the exceptionOnMissingFile attribute:

  • If exceptionOnMissingFile is active (see table above) and the package finds that an image file cannot be found, it throws an exception with the path of the image. That makes it easier to find missing images.
  • If exceptionOnMissingFile is not active, the package doesn't throw an exception and recovers by not processing the image. For example, if the image was found via an img tag, it will leave the tag alone.

If all images should be present in your development environment, then it makes sense to set exceptionOnMissingFile to DebugModeOnly. That way, you quickly find broken images while developing your site, while preventing exceptions in your live site where you probably prefer a broken image over an exception.

Keep in mind that if you want exceptions while the site is in Debug mode, you have to ensure that the package is actually active in Debug mode - set active to Always to make that happen.
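The interplay between active and exceptionOnMissingFile is easy to get wrong, so here is a sketch of the logic as described in the text. It is illustrative Python with invented function names, not the package's code. The debug argument stands for the debug attribute of the compilation element.

```python
def mode_setting_is_on(setting, debug):
    """Evaluate a Never/Always/ReleaseModeOnly/DebugModeOnly setting."""
    if setting == "Always":
        return True
    if setting == "Never":
        return False
    if setting == "DebugModeOnly":
        return debug
    if setting == "ReleaseModeOnly":
        return not debug
    raise ValueError("unknown setting: " + setting)

def throws_on_missing_file(active, exception_on_missing_file, debug):
    """True only if the package is active and the exception setting is on."""
    return (mode_setting_is_on(active, debug)
            and mode_setting_is_on(exception_on_missing_file, debug))

# With the default active (ReleaseModeOnly), DebugModeOnly exceptions never fire.
print(throws_on_missing_file("ReleaseModeOnly", "DebugModeOnly", debug=True))  # False
# With active set to Always, they fire in Debug mode.
print(throws_on_missing_file("Always", "DebugModeOnly", debug=True))           # True
```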


Image Gathering


processPageImages

Determines whether images on the current page are processed

Value Description
Never Images on the current page are never processed
Always (default) Images on the current page are always processed
ReleaseModeOnly Images on the current page are processed if the site is in Release mode.
DebugModeOnly Images on the current page are processed if the site is in Debug mode.

Example

<cssSpriteGenerator processPageImages="Never" ... >
</cssSpriteGenerator>

Normally when a page on your site opens, the package finds all the img tags on the page, assigns them to image groups and processes the groups into sprites. It then replaces the img tags with div tags that show that part of the sprite matching the original image - so your page still looks the same but loads quicker (details).

When you switch off processPageImages, the package no longer goes through the page looking for img tags. It also won't replace any img tags.

You would switch processPageImages off if you only wanted to process CSS background images.

processCssImages

Determines whether the CSS background images listed in cssImages are processed

Value Description
Never CSS background images are never processed
Always (default) CSS background images are always processed
ReleaseModeOnly CSS background images are processed if the site is in Release mode.
DebugModeOnly CSS background images are processed if the site is in Debug mode.

Example

<cssSpriteGenerator processCssImages="Never" ... >
</cssSpriteGenerator>

If this option is active (the default), then the CSS background images you've listed in the cssImages element are processed by the package.

processFolderImages

Determines whether images stored in one or more folders on the web server are processed. See the sample web site DemoSite_FolderImages in the downloaded solution.

Value Description
Never (default) The package never processes images from folders on the web server
Always The package always processes images from folders on the web server
ReleaseModeOnly The package only processes images from folders on the web server if the site is in Release mode
DebugModeOnly The package only processes images from folders on the web server if the site is in Debug mode

Example

<cssSpriteGenerator processFolderImages="Always" ... >
</cssSpriteGenerator>

In addition to having the package find images on the current web page (if processPageImages is active) and from the cssImages element (if processCssImages is active), you can also get it to find images from one or more folders on the web server.

Why would you want to add images to a sprite that are not actually on the page? Take a very simple web site that uses 3 small icons on its pages, but not all icons appear on all pages:

Page            Icons used
default.aspx    contactus.png, cart.png, print.png
contactus.aspx  cart.png, print.png
cart.aspx       contactus.png, print.png

If you had the package only read images from the current web page, then it would create a different sprite for each page, because each page has a different set of icons. However, it would be far more efficient to have all 3 icons in the one sprite. That way, when a visitor moves from one page to the other, that single sprite is still in browser cache, so it doesn't have to be loaded again over the Internet. The way to do that is to get the package to read all icons when it creates a sprite, not just the ones that are on the current web page.
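To make the difference concrete, this small Python sketch counts the sprites for the three example pages. It assumes, purely for illustration, that each distinct set of icons ends up as its own sprite; the variable names are made up for this example.

```python
# Icons used on each page, taken from the example above.
pages = {
    "default.aspx": {"contactus.png", "cart.png", "print.png"},
    "contactus.aspx": {"cart.png", "print.png"},
    "cart.aspx": {"contactus.png", "print.png"},
}

# Per-page processing: every distinct set of icons becomes its own sprite.
per_page_sprites = {frozenset(icons) for icons in pages.values()}

# Folder processing: one sprite that holds the union of all icons.
shared_sprite = frozenset().union(*pages.values())

print(len(per_page_sprites), "sprites when each page is processed on its own")
print("1 shared sprite containing", sorted(shared_sprite))
```

With per-page processing a visitor who browses all three pages downloads three different sprites. With the shared sprite, only the first page view incurs a download.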

To get the package to find images from one or more folders on the web server, set processFolderImages active. You then also need to set imageFolderPathMatch to determine specifically which images should be read.

Just as with images taken from the current web page and from the cssImages elements, images taken from one or more folders are first assigned to image groups. Only images assigned to a group can become part of a sprite.

This means that if you want to add particular images to a sprite even if they are not on the current web page and not in the cssImages element, you have to take these steps:

  1. Set processFolderImages active;
  2. Make sure that the file paths of the desired images match imageFolderPathMatch;
  3. Make sure that the images match a group, so they can be worked into a sprite. The images taken from folders do not all have to match the same group.

For example, if you have a folder images\icons with .png and .gif images and you want all those images to be made into a sprite irrespective of whether they appeared on the current page, you could use:

<!--switch on processing of images from web server folders, but only process those 
that are in images\icons and that end in .png or .gif-->
<cssSpriteGenerator processFolderImages="Always" 
imageFolderPathMatch="\\images\\icons\\.*?(png|gif)$" ... >
      <imageGroups>
          ...
          <!--add .png images that live in images\icons to this group-->
          <add groupName="icons" filePathMatch="\\images\\icons\\.*?png$" />

          <!--add .gif images that live in images\icons to this second group-->
          <add groupName="icons" filePathMatch="\\images\\icons\\.*?gif$" />
      </imageGroups>
</cssSpriteGenerator>
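The configuration above applies two checks to each file: first imageFolderPathMatch decides whether the file is read at all, then the filePathMatch of each group decides where it goes. The Python sketch below mimics that with the same regular expressions. It rests on assumptions that the text above doesn't confirm: that paths are matched as full Windows file paths, that matching is unanchored, and that the first matching group wins. The sample paths are invented.

```python
import re

IMAGE_FOLDER_PATH_MATCH = r"\\images\\icons\\.*?(png|gif)$"
IMAGE_GROUPS = [
    ("icons", r"\\images\\icons\\.*?png$"),
    ("icons", r"\\images\\icons\\.*?gif$"),
]

def select_folder_images(paths):
    """Return (path, group_index) pairs for the paths that pass both checks."""
    selected = []
    for path in paths:
        if not re.search(IMAGE_FOLDER_PATH_MATCH, path):
            continue  # step 2: file is not read from the folder at all
        for index, (_name, pattern) in enumerate(IMAGE_GROUPS):
            if re.search(pattern, path):
                selected.append((path, index))
                break  # assumption: the first matching group wins
    return selected

sample_paths = [
    r"C:\site\images\icons\cart.png",
    r"C:\site\images\icons\flag.gif",
    r"C:\site\images\icons\photo.jpg",
    r"C:\site\images\banners\top.png",
]
selected = select_folder_images(sample_paths)
```

Here only cart.png and flag.gif are selected. photo.jpg fails the folder pattern because of its extension, and top.png lives in the wrong folder.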

imageFolderPathMatch

If the processFolderImages property is active, then this property is required. It determines which image files are read.

Type                         Default
string (regular expression)  none

Example

<cssSpriteGenerator imageFolderPathMatch="\\images\\img1" ... >
</cssSpriteGenerator>

For details on the folder images feature, see processFolderImages.

The imageFolderPathMatch property is a regular expression. If processFolderImages is active, then the package looks at the file paths of all .png, .gif and .jpg files in the root directory of the site and its subdirectories. Those image files whose paths match imageFolderPathMatch will be processed.

For example, to process all image files in directory images\img1, you would use:

<!--all images in images\img1-->
<cssSpriteGenerator imageFolderPathMatch="\\images\\img1\\" ... >
</cssSpriteGenerator>

(It uses double backslashes because the backslash is a special character in regular expressions, so it needs to be escaped with another backslash.)

If you wanted to process only the .png files in directory images\img1, you would use:

<!--all .png images in images\img1-->
<cssSpriteGenerator imageFolderPathMatch="\\images\\img1\\.*?png$" ... >
</cssSpriteGenerator>

If you want to process all image files in the root directory of the site and its subdirectories, use:

<!--all images in the site-->
<cssSpriteGenerator imageFolderPathMatch="\\" ... >
</cssSpriteGenerator>
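If you want to check a pattern before putting it in your web.config, you can try it against a few sample paths. The Python sketch below does that for the patterns shown above. Python's re module handles these simple patterns the same way .NET regular expressions do, but the sample paths are invented and the sketch assumes the package does an unanchored match against the full file path.

```python
import re

sample_paths = [
    r"C:\site\images\img1\logo.png",
    r"C:\site\images\img1\photo.jpg",
    r"C:\site\images\img10\icon.png",
    r"C:\site\css\bg.gif",
]

patterns = {
    "all images in images\\img1": r"\\images\\img1\\",
    "only .png in images\\img1": r"\\images\\img1\\.*?png$",
    "all images in the site": r"\\",
    "no trailing backslash": r"\\images\\img1",
}

# For each pattern, list the sample paths it matches.
matches = {
    label: [path for path in sample_paths if re.search(pattern, path)]
    for label, pattern in patterns.items()
}

for label, matched in matches.items():
    print(f"{label}: {len(matched)} of {len(sample_paths)} paths match")
```

Note the last pattern: without the trailing escaped backslash, \\images\\img1 also matches files in a sibling folder such as images\img10. End the pattern with \\ if you only want the one folder.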

If your site writes images to the root directory of the site or one of its subdirectories (for example if visitors can upload images), then make sure that those images are stored in a directory that isn't matched by imageFolderPathMatch, otherwise they could wind up in sprites. Those sprites would get bigger with every image added.

Additionally, if your site writes images frequently, such as more than once every 5 minutes, consider storing these images outside the root directory of the site. For faster processing, the package keeps the structure of the root directory and its subdirectories in cache. Each time an image is added to the root directory or one of its subdirectories, that cache entry is removed to ensure it isn't outdated, meaning that the package has to rebuild it.


Sprite Generation


copiedImgAttributes

Determines which attributes are copied over from an img tag when it is replaced by a div tag.

Type                         Default
string (regular expression)  all attributes copied (except the ones listed below)

Example

<cssSpriteGenerator copiedImgAttributes="onclick|id" ... >
</cssSpriteGenerator>

When the package replaces an img tag with a div tag, it copies over the attributes you had included with the img (with some exceptions, see below). For example, if you had:

<img id="..." onclick="..." onmouseout="..." ... />

That will be replaced with:

<div id="..." onclick="..." onmouseout="..." ... ></div>

Normally, this is what you want. But if it isn't, you can prevent certain attributes from being copied over through the copiedImgAttributes property. That property is interpreted as a regular expression. When you specify copiedImgAttributes, only those attributes that match the regular expression will be copied over (with some exceptions).

This means that if your img tag looks like this:

<img id="..." onclick="..." onmouseout="..." ... />

And you specified a copiedImgAttributes that only matches id and onclick but not onmouseout:

<cssSpriteGenerator copiedImgAttributes="id|onclick" ... >
</cssSpriteGenerator>

Then the resulting div tag will only have the id and onclick attributes:

<div id="..." onclick="..." ... ></div>

Keep in mind that if you don't specify copiedImgAttributes at all, all attributes (with some exceptions) will be copied over.

Exceptions

The package gives special treatment to some attributes:

  • src is used by the img tag to load the image, but this is now done via CSS, so it is never copied over.
  • width and height are set by the package itself.
  • alt is used as the contents of the span or div tag that replaces the img tag (details).
  • class and style are copied over. But if the package generates an additional class name, then it gets added to any class attribute you already had on the img tag (it will create a new class attribute if needed). The same applies to any style declarations generated by the package.
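The copying rules above can be summarised in a few lines of Python. This is a sketch of the described behaviour, not the package's own code. It assumes an unanchored regular expression match on the attribute name, and it reads the exceptions list as meaning that class and style are copied regardless of copiedImgAttributes. The function and constant names are hypothetical.

```python
import re

# Handled by the package itself rather than copied as attributes.
NEVER_COPIED = {"src", "width", "height", "alt"}
# Assumption: these are copied whatever copiedImgAttributes says.
ALWAYS_COPIED = {"class", "style"}

def copied_attributes(img_attrs, copied_img_attributes=None):
    """Return the img attributes that would end up on the replacement tag."""
    result = {}
    for name, value in img_attrs.items():
        if name in NEVER_COPIED:
            continue
        if (name in ALWAYS_COPIED
                or copied_img_attributes is None
                or re.search(copied_img_attributes, name)):
            result[name] = value
    return result

img = {"id": "cart", "onclick": "open()", "onmouseout": "hide()",
       "src": "cart.png", "class": "icon"}
filtered = copied_attributes(img, "id|onclick")
unfiltered = copied_attributes(img)
```

With "id|onclick", the onmouseout attribute is dropped while id, onclick and class survive. If the match really is unanchored, a pattern such as id would also match an attribute named data-id, so you may prefer an anchored pattern like ^(id|onclick)$.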

inlineSpriteStyles

Determines whether the styling needed with the div tags that replace the img tags is inline or in a separate .css file.

Value           Description
false           Additional styling is placed in a separate .css file
true (default)  Additional styling is inline

Example

<cssSpriteGenerator inlineSpriteStyles="false" ... >
</cssSpriteGenerator>

In the section about how sprites work, you saw how the div tags that replace your img tags use additional styling to show the correct area of the sprite - using background, width, height, etc.

If you leave inlineSpriteStyles at true, the package inlines all the additional styling. That gives you div tags such as this:

<div style="width: 32px; height: 32px; 
  background: url(/TestSite/___spritegen/2-0-0-90- ... -53.png) -200px -0px; 
  display:inline-block;"
  ></div>

However, if you set inlineSpriteStyles to false, the additional styling is placed in a separate .css file which is generated in the same directory as the sprite images themselves. The generated div tags will refer to the styling via CSS classes. This gives you div tags such as this:

<div class="c0___spritegen" >&nbsp;</div>

In that case, make sure that the head tags of your pages have runat="server", so the package can add a link tag for the .css file to the head section:

<head runat="server">

Here are the advantages/disadvantages of setting inlineSpriteStyles to false:

  • The advantage of using a separate .css file is that the next time the page is loaded, the .css file may still be in browser cache - so all that additional styling doesn't have to be loaded over the Internet. This would apply if your visitors tend to hang around on your site, hitting several pages in the one visit.
  • The drawback of using a separate .css file is that if that file is not in browser cache, the browser has one more file to load. Also, if your web server uses compression, the additional inlined styles don't add much to the number of bytes going over the Internet because they are highly compressible.

There are a few cases where it may make sense to place the additional CSS in a separate CSS file by setting inlineSpriteStyles to false:

  • You're pretty sure that on many visits the separate CSS file will be in browser cache.
  • You use the cssImages element to process CSS background images. In that case, the package always generates a separate CSS file with styles that override existing CSS styles to make them work with the new sprites. If you now set inlineSpriteStyles to false, the additional CSS for the div tags will go into the same CSS file as the CSS for the background images, rather than a separate CSS file. This means you've reduced the size of your .aspx pages by taking out the inlined styles, without incurring an additional CSS file load.
  • You use the Combine And Minify package to combine and minify CSS and JavaScript files on your site. In that case, Combine And Minify can combine the separate CSS file with the other CSS files, so you don't incur the extra CSS file load.
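To see what the two settings mean for the generated markup, here is a small Python sketch that renders the same sprite area both ways. The function name, the sprite file name and the exact shape of the CSS rule are assumptions made for this illustration. Only the two div formats are taken from the examples above.

```python
def render_sprite_div(width, height, sprite_url, x, y, inline=True, class_name=None):
    """Return (html, css_rule); css_rule is None when the styling is inlined."""
    style = (f"width: {width}px; height: {height}px; "
             f"background: url({sprite_url}) -{x}px -{y}px; display:inline-block;")
    if inline:
        # inlineSpriteStyles="true": all styling travels with the page itself.
        return f'<div style="{style}"></div>', None
    # inlineSpriteStyles="false": the page only carries a class name,
    # the styling goes into the generated .css file.
    rule = f".{class_name} {{ {style} }}"
    return f'<div class="{class_name}">&nbsp;</div>', rule

inline_html, inline_css = render_sprite_div(32, 32, "sprite.png", 200, 0)
class_html, class_css = render_sprite_div(
    32, 32, "sprite.png", 200, 0, inline=False, class_name="c0___spritegen")
```

The inlined version makes every page a bit bigger, while the class based version moves those bytes into a file that the browser can cache.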

generatedFolder

Sets the folder where the generated .css file (if there is one) and the sprites are stored.

Type    Default
string  ___spritegen

Example

<cssSpriteGenerator generatedFolder="generated\sprites" ... >
</cssSpriteGenerator>

This folder will be created (if it doesn't already exist) in the root folder of the site.

You can include one or more \ in this property. In that case, you'll get a folder and one or more subfolders.

classPostfix

Postfix used to ensure that generated class names are unique.

Type    Default
string  ___spritegen

Example

<cssSpriteGenerator classPostfix="___generated" ... >
</cssSpriteGenerator>

If you set inlineSpriteStyles to false, the package generates a .css file with the additional CSS needed to make the sprites work. To make sure that the names of the generated CSS classes do not clash with the other CSS classes in your site, the value of classPostfix is appended to the generated class names.

If you find that the names of the generated CSS classes clash with the names of other CSS classes in your site, you can change classPostfix to fix this.

Conclusion

This was part 2 of this 3 part series. In part 3 we'll find out about Image Groups, which allow you to manipulate your images on the fly, select images for inclusion into sprites, etc. We'll also go into handling of CSS background images.

Generate CSS sprites and thumbnail images on the fly in ASP.NET sites - part 1

by Matt Perdeck 14. September 2011 20:49

This Series

Introduction

On most web pages, images make up most of the files loaded by the browser to show the page. As a result, you can significantly reduce bandwidth costs and page load times by compressing large images, properly resizing thumbnails and combining small images into CSS sprites to cut down on request/response overhead. This is especially important if your site caters to users of iPhones and other smart phones, because of low download speeds, high network latencies and high bandwidth costs for your visitors.

However, doing all this image manipulation manually is a mundane task that takes a lot of time. As a result, it often doesn't get done due to time pressures, or simple lack of awareness.

It was to solve this problem that this ASP.NET CSS Sprite Generator package was born. Once set up, it does all the compressing, resizing and sprite generation for you. The compressed and/or resized image versions and the sprites are generated on the fly into a separate directory. This means that if you add an image to a page on your live site, it will pick that up, with no need to go through a separate build process. However, you still have full control over which images get compressed, resized or combined into CSS sprites through configuration in your web.config.

Adding the ASP.NET CSS Sprite Generator package to an existing site is easy - just add a single dll and a few lines in the web.config - no need to change the site itself. This sets it apart from its main counterpart, Microsoft's ASP.NET Sprite and Image Optimization Library, which requires that you move images to a special directory, replace img tags with special user controls or MVC helpers, etc.

On the fly processing of images inherently puts a greater load on the web server CPU. To overcome this, the package uses caching extensively, so once a sprite has been generated or an image compressed or resized, the package won't have to do this again unless the constituent files are changed or the generated files removed. Also, it uses image processing algorithms that have been optimized for speed.
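The rule "don't regenerate unless the constituent files changed or the generated file was removed" can be pictured with the Python sketch below. It compares file modification times, which is just one simple way to express the rule. How the package actually tracks changes is not described here, and the function name is hypothetical.

```python
def needs_regeneration(source_mtimes, sprite_mtime):
    """Decide whether a sprite has to be generated again.

    source_mtimes: modification times of the images that make up the sprite.
    sprite_mtime: modification time of the generated sprite, or None if the
    generated file no longer exists.
    """
    if sprite_mtime is None:
        return True  # the generated file was removed
    # Regenerate if any constituent image is newer than the sprite.
    return any(mtime > sprite_mtime for mtime in source_mtimes)
```

As long as this check returns False, a request can be served from what was generated earlier, so the expensive image processing only happens once per change.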

To make it easy to get started, there is a Quick Start section right after the Installation section. The "out of the box" default configuration means you'll reduce bandwidth and page load times even without configuring the package. And the solution in the download contains 10 simple but fully functional demo sites that show various features of the package.

Contents

Because CSS sprite generation is a surprisingly involved subject, this article is split into 3 parts.

Part 1

Part 2

Part 3

Requirements

  • ASP.NET 4 or higher
  • IIS 6 or higher for your live site

This package can run in shared hosting accounts that only provide Medium trust, such as those provided by GoDaddy. It can also run in accounts that provide Full trust.

Features compared with the ASP.NET Sprite and Image Optimization Library

This section compares the ASP.NET CSS Sprite Generator package against the other major package that generates sprites on the fly for ASP.NET sites, Microsoft's ASP.NET Sprite and Image Optimization Library.

The biggest difference between the two packages is that the ASP.NET CSS Sprite Generator package is much easier to install and configure:

  • With the ASP.NET Sprite and Image Optimization Library, you have to move all images that you want combined into sprites to a separate App_Sprites directory. You also have to manually replace the img tags of those images with special user controls (for ASP.NET sites) or special helpers (for ASP.NET MVC sites). Additional configuration involves adding special settings.xml files.

  • With the ASP.NET CSS Sprite Generator package, you just add a few lines to your web.config - no need to move images or change your pages.

Below is a more detailed feature comparison of the ASP.NET CSS Sprite Generator package and the ASP.NET Sprite and Image Optimization Library:

CSS Sprite Generator package    ASP.NET Sprite and Image Optimization Library
Combines images (except animated images) into sprites on the fly. When you add an image to a page, it will be picked up, without requiring additional build steps.
Processes dynamically generated image tags, such as from a database.
Images can optionally be combined based on file type, width, height and other properties (details).
Allows you to combine all images in a specific directory into sprites that are shared amongst your pages, to maximise browser caching (details).
Uses sophisticated algorithms to generate the smallest possible sprites.
Processes .png, .gif and .jpg images.
Processes CSS background images. Caters for repeating background images and background images used with the sliding door technique (such as used with flexible buttons and tabs) (details).
Very easy to install - just add a .dll to your site and add a few lines to your web.config (installation instructions). All configuration is done in your web.config. No need to reorganize your images or to use special controls.
Works seamlessly with the Combine And Minify package. Adding this (free) package further reduces page load times by combining and minifying CSS and JavaScript files, using cookieless domains, far future caching with file versioning, etc.
Allows you to switch features on or off per page or per group of pages (details).
Lets you compress .png and .gif files on the fly by reducing their color depth. Choose one of 6 built in algorithms or use the default (details).
Lets you compress .jpg files on the fly by reducing their image quality (details).
Lets you physically resize images on the fly (for example for thumbnails), either via configuration in web.config (details) or by setting width and/or height attributes on your img tags (details).
Lets you set the maximum file size for sprites, so sprites that grow too big are automatically split into smaller sprites (details).
Automatically replaces img tags with the HTML required to correctly show the generated sprites (details). Copies attributes such as id and onclick from your img tags to the generated html (configurable).
The additional generated CSS required to correctly show the generated sprites (details) can be inlined or automatically placed in a separate CSS file (details).
Can be used with sites hosted on low cost shared hosting plans, such as GoDaddy's (details).
Allows you to configure the package so it only kicks in in Release mode. That way, you see your individual unprocessed images while developing, and reap the performance improvement in your live site (details).
Has the option to throw an exception when an image is missing, so you quickly detect missing images (details).
Generates inlined images
To reduce CPU overhead and disk accesses caused by the package, caches intermediate results. Cache entries are removed when the underlying file is changed, so you'll never serve outdated files.

The features shown above can all be switched on and off individually in the ASP.NET CSS Sprite Generator package via the web.config file (full description). If you just install the package and don't do any further configuration, it combines all .png and .gif images (except animated images) that are no more than 100px wide and high into sprites - which is most commonly what you want. By default, it only kicks in when the site is in Release mode (details).

This package is just one way of improving the performance of your web site. My recently released book ASP.NET Performance Secrets shows how to pinpoint the biggest performance bottlenecks in your web site, using various tools and performance counters built into Windows, IIS and SQL Server. It then shows how to fix those bottlenecks. It covers all environments used by a web site - the web server, the database server, and the browser. The book is extremely hands on - the aim is to improve web site performance today, without wading through a lot of theory first.

Introduction to CSS Sprites

What are CSS sprites

Web pages often contain many small images, which all have to be requested by the browser individually and then served up by the server. You can save bandwidth and page load time by combining these small images into a single slightly larger image, for these reasons:

  • Each file request going from the browser to the server incurs a fixed overhead imposed by the network protocols. That overhead can be significant for small image files. Sending fewer files means less request/response overhead.
  • Each image file carries some housekeeping overhead within the file itself. Combining small images gives you a combined image that is a lot smaller than the sum of the individual images.

Take for example this HTML:

<img src="cart.png" width="50" height="50" />
<img src="contactus.png" width="50" height="50" />
<img src="print.png" width="46" height="50" />

Which loads these images:

Image          Size    Dimensions
cart.png       2.21KB  50px x 50px
contactus.png  2.02KB  50px x 50px
print.png      2.05KB  46px x 50px
Total size: 6.28KB

Combining these images gives you this combined image:

combined.png
Size: 3.94KB
146px x 50px

At 3.94KB, the size of this combined image is only 63% of the combined size of the three individual images.

Showing the individual images

Although the three images have now been combined into one single image, they can still be shown as individual images using a CSS technique called CSS Sprites. This involves replacing your img tags with the following div tags (done for you by the package):

<div style=
  "width: 50px; height: 50px; background: url(combined.png) -0px -0px; display:inline-block;"
  ></div>
<div style=
  "width: 50px; height: 50px; background: url(combined.png) -50px -0px; display:inline-block;"
  ></div>
<div style=
  "width: 46px; height: 50px; background: url(combined.png) -100px -0px; display:inline-block;"
  ></div>

Instead of using img tags, this HTML uses div tags with combined.png as the background image.

The CSS background property lets you specify an offset into the background image, which determines the point in that image from where it is shown as the background. That is used here to slide the combined image over each div tag, such that the correct area is visible:

  • Here, the cart image sits right at the left edge horizontally and right at the top vertically within the combined image, so it gets offset 0px 0px.
  • The contactus image sits 50px away from the left edge, and right at the top vertically, so it gets -50px 0px.
  • And finally the print image sits 100px away from the left edge, and right at the top vertically, so it gets -100px 0px.

Note that the div tags have a width and height equal to the width and height of the original individual images. That way, you only see the portions of the combined image that correspond with the individual images.
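The offsets follow directly from the widths of the images. This Python sketch computes them for the three example images, assuming they are simply laid out left to right in a single row, as in combined.png. The function name is made up for this example, and the package's real layout algorithm is more sophisticated.

```python
def horizontal_offsets(widths):
    """Return the left edge of each image in a single-row sprite, plus the total width."""
    offsets, x = [], 0
    for width in widths:
        offsets.append(x)
        x += width
    return offsets, x

# (file name, width, height) in the order the images sit in the sprite.
images = [("cart.png", 50, 50), ("contactus.png", 50, 50), ("print.png", 46, 50)]

offsets, sprite_width = horizontal_offsets([width for _, width, _ in images])

# The background offset is the negated position of the image within the sprite.
styles = [
    f"width: {width}px; height: {height}px; "
    f"background: url(combined.png) -{x}px -0px; display:inline-block;"
    for (_, width, height), x in zip(images, offsets)
]
```

The offsets come out as 0, 50 and 100 pixels and the sprite is 146 pixels wide, which matches the dimensions of combined.png above.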

Finally, the div tags use display:inline-block; to further mimic the behaviour of img tags. By default, when you place a number of img tags on a page, they appear horizontally after each other. By default, div tags however sit below each other. The display:inline-block; causes the div tags to sit after each other, just like img tags.

Although the additional CSS looks big, it is highly compressible because it has lots of repeating bits of text. So as long as your web server has compression switched on (details) the overhead of the inlined CSS should be minimal.
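You can get a feel for how compressible this markup is with the Python sketch below. It builds 20 sprite div tags and compresses them with zlib, which implements the same DEFLATE algorithm that gzip compression on a web server is based on. The number of tags and their offsets are made up for the test, and real pages will give different numbers.

```python
import zlib

# 20 div tags that differ only in their horizontal offset.
divs = "".join(
    f'<div style="width: 50px; height: 50px; '
    f'background: url(combined.png) -{i * 50}px -0px; display:inline-block;"></div>\n'
    for i in range(20)
)

raw = divs.encode("utf-8")
compressed = zlib.compress(raw)

print(f"uncompressed: {len(raw)} bytes")
print(f"compressed:   {len(compressed)} bytes")
print(f"ratio:        {len(compressed) / len(raw):.0%}")
```

Because the tags are nearly identical, the compressed version is only a small fraction of the original size.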

Additionally, instead of having the additional CSS inlined in the div tags, you can put this in an external .css file and use CSS classes to link the div tags with their CSS. The package supports both options via the inlineSpriteStyles property.

In case you're wondering about writing all this CSS and HTML, this is all done for you by the package. This section simply shows how it's done.

Accessibility and the alt attribute

The alt attribute of the img tag is used to describe the image in words. This enables vision impaired people who use text-only browsers to find out what the image is about. For example:

<img src="cart.png" alt="Shopping Cart" />

The div tag doesn't have an alt attribute. The solution is to simply use the image description as the contents of the div tag:

<div style="...">Shopping Cart</div>

However, the text now sits visibly on top of the background image. To make it invisible, you can apply a large negative indent to shift it out of the area bounded by the div tag:

<div style="text-indent:-9999px; ...">Shopping Cart</div>

If there is no alt attribute or if it is empty, you can't leave the div tag without content, because that will cause the browser to wrongly position the tag. In that case, use &nbsp; as a filler:

<div style="text-indent:-9999px; ...">&nbsp;</div>
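Put together, the rule for the contents of the div tag is simple: use the alt text if there is any, otherwise use a filler. A Python sketch of that rule, with a hypothetical function name and the style passed in as a plain string:

```python
from html import escape

def sprite_div(style, alt=None):
    """Build the div that replaces an img, using its alt text as hidden content."""
    # An empty div could be positioned wrongly by the browser, so fall back to &nbsp;.
    content = escape(alt) if alt and alt.strip() else "&nbsp;"
    return f'<div style="text-indent:-9999px; {style}">{content}</div>'

with_alt = sprite_div("width: 50px; height: 50px;", "Shopping Cart")
without_alt = sprite_div("width: 50px; height: 50px;")
```

The alt text is HTML escaped here because it moves from an attribute value into element content.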

Linked images

A case apart is images that are part of a hyperlink, such as:

<a href="Cart.aspx"><img src="images/cart.png" alt="Shopping Cart" /></a>

The first issue is that strictly speaking, in HTML you can't have a block level element such as div inside an inline element such as a. That's easily solved by using a span instead of a div tag:

<a href="Cart.aspx"><span style="..." >Shopping Cart</span></a>

Another issue is that although the text within the span will be indented out, many browsers still show the underlining for the hyperlink. That can be fixed by setting the text-decoration of the anchor to none:

<a href="Cart.aspx" style="text-decoration: none;" ><span style="..." >Shopping Cart</span></a>

A final problem is that instead of the entire image being clickable, only the area taken by the text within the span is now clickable. The anchor doesn't regard a background image as real content. The solution is to turn the anchor itself into a block level element with the same dimensions as the image itself. Also move the background image from the span to the anchor:

<a href="Cart.aspx" style=
  "text-decoration: none; width: 50px; height: 50px; 
    background: url(combined.png) -0px -0px; display:inline-block;" 
  ><span style="display:inline-block;text-indent:-9999px;" >Shopping Cart</span></a>
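The three fixes for linked images can be combined in one small function. The Python sketch below produces the same kind of markup as the final example above. The function name and parameters are hypothetical, and the package's real output may differ in detail.

```python
from html import escape

def sprite_link(href, width, height, sprite_url, x, y, alt=None):
    """Build an anchor that shows a sprite area and is clickable over its full size."""
    text = escape(alt) if alt else "&nbsp;"
    # The anchor carries the dimensions and the background image,
    # and switches off the underlining of the hidden text.
    anchor_style = (f"text-decoration: none; width: {width}px; height: {height}px; "
                    f"background: url({sprite_url}) -{x}px -{y}px; display:inline-block;")
    # A span (not a div) holds the text, indented out of view.
    return (f'<a href="{escape(href)}" style="{anchor_style}">'
            f'<span style="display:inline-block;text-indent:-9999px;">{text}</span></a>')

link = sprite_link("Cart.aspx", 50, 50, "combined.png", 0, 0, "Shopping Cart")
```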

Keep your sprites small

Combining images into sprites tends to only make sense for small images, such as icons and background images.

With bigger images, the request/response overhead is not significant, taking away the reason to combine them into sprites. Additionally, you might wind up with very big sprites. Now that most browsers in use today load at least 6 images concurrently, you may actually be better off having a couple of medium sized images rather than one big sprite.

To manage sprite sizes, you can determine the maximum width, height and file size of the images that will be combined. You can also set a maximum sprite size - so sprites that grow too big get split in two.
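One simple way to picture splitting by maximum size is a greedy pass over the images, starting a new sprite whenever the next image would push the current one over the limit. The Python sketch below does that. It is an assumption for illustration only, not the package's algorithm. It also uses the sum of the individual file sizes as a rough estimate, while a real sprite is usually smaller than that sum.

```python
def split_into_sprites(image_sizes, max_sprite_bytes):
    """Greedily assign (name, size_in_bytes) pairs to sprites under a size limit."""
    sprites, current, current_bytes = [], [], 0
    for name, size in image_sizes:
        # Start a new sprite if adding this image would exceed the limit.
        if current and current_bytes + size > max_sprite_bytes:
            sprites.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size
    if current:
        sprites.append(current)
    return sprites

example = split_into_sprites(
    [("a.png", 3000), ("b.png", 3000), ("c.png", 3000)], max_sprite_bytes=6000)
```

With a limit of 6000 bytes, the first two images share a sprite and the third one ends up in a sprite of its own.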

Using Shared Hosting

Your site may be using a shared hosting plan, which means that it shares a web server with many other web sites. Many companies that offer these plans, such as GoDaddy, restrict the operations that each web site can perform so they don't affect each other.

In technical terms, this means your live site has trust level Medium. In your development environment it has trust level Full, meaning it can do whatever it wants.

The package was specifically built so it can run with trust level Medium, so you should have no problem using it with your site in a shared hosting plan.

The only issue is that the sprites generated by the package may have somewhat greater file sizes in trust level Medium than in trust level Full. This is because some of the algorithms required to reduce the number of colors in an image are not available in trust level Medium. More details here.

Combine And Minify

Compressing/resizing images and combining them into sprites is a good way to reduce bandwidth and page load times, but there is a lot more that can be done. This includes combining and minifying CSS and JavaScript files, the use of cookieless domains, far future caching, etc.

To take care of this, the package can be installed side by side with another package, Combine And Minify, which has these features:

  • Minifies JavaScript and CSS files. Minification involves stripping superfluous white space and comments.

  • Combines JavaScript files and CSS files to save request/response overhead.

  • Processes .axd files as used by the ASP.NET AJAX toolkit.

  • Removes white space and comments from the HTML generated by your .aspx pages.

  • Makes it easy to use cookieless domains with CSS, JavaScript and image files, to stop the browser from sending cookies with those files.

  • Lets you configure multiple cookieless domains. This causes the browser to load more JavaScript, CSS and image files in parallel.

  • Optimizes use of the browser cache by allowing the browser to store JavaScript, CSS and image and font files for up to a year. Inserts version ids in file names to ensure the browser picks up new versions of your files right away, so visitors never see outdated files.

  • Preloads images immediately when the page starts loading, instead of when the browser gets round to loading the image tags - so your images appear quicker. You can give specific images priority.

The demo site DemoSite_CombineAndMinify in the downloaded solution demonstrates the Combine And Minify package and this package working together. For details, read the notes on the home page of DemoSite_CombineAndMinify.

There are no special requirements or installation steps when using this package together with the Combine And Minify package. Simply go through the installation steps of each package (pretty simple) and your site will be using both.

Installation

Take these steps to add the ASP.NET CSS Sprite Generator package to your web site:

  1. Compile the package:
    1. Download the zip file with the source code, and unzip in a directory.
    2. Open the CssSpriteGenerator.sln file in Visual Studio 2010 or later.
    3. You'll find that the sources are organized in a solution, with these elements:
      1. Project CssSpriteGenerator is the actual package.
      2. A number of demo sites with examples of the use of the package. You can ignore these if you want.
    4. Compile the solution. This will produce a CssSpriteGenerator.dll file in the CssSpriteGenerator\bin folder.

  2. Update your web site:
    1. Add a reference to CssSpriteGenerator.dll to your web site (in Visual Studio, right click your web site, choose Add Reference).
    2. Add the custom section cssSpriteGenerator to the configSections section of your web.config:
      <configuration>
          ...
          <configSections>
              ...
              <section 
                  name="cssSpriteGenerator" 
                  type="CssSpriteGenerator.ConfigSection" 
                  requirePermission="false"/>
              ...
          </configSections>
          ...
      </configuration>
    3. Add a folder called App_Browsers to the root folder of your web site: Right click your web site in Visual Studio, choose Add ASP.NET Folder, choose App_Browsers.
    4. Use your favorite text editor (such as Notepad) to create a text file in the App_Browsers folder. Call it PageAdapter.browser. Into that file, copy and paste this code:
      <browsers>
          <browser refID="Default">
              <controlAdapters>
                  <adapter controlType="System.Web.UI.Page"
                     adapterType="CssSpriteGenerator.CSSSpriteGenerator_PageAdapter" />
              </controlAdapters>
          </browser>
      </browsers>
      This tells ASP.NET to leave processing of the page to the class CssSpriteGenerator.CSSSpriteGenerator_PageAdapter (which is part of the package).

  3. The package is now installed in your web site. By default, it is only active when your site is in Release mode (debug="false" in your web.config). To make it active in Debug mode as well, add a cssSpriteGenerator element to your web.config, after the configSections section:

    <configuration>
        ...
        <configSections>
            ...
            <section 
                name="cssSpriteGenerator" 
                type="CssSpriteGenerator.ConfigSection" 
                requirePermission="false"/>
            ...
        </configSections>
    
        <cssSpriteGenerator active="Always">
        </cssSpriteGenerator>
        ...
    </configuration>

Quick Start

The full story about all the options and how to configure them is in the configuration section. But there are many options, so this section introduces you to the ones that may be most relevant to you, to get you started quickly.

The downloaded solution contains a series of demo sites with working examples. The demo site DemoSite_QuickStart was set up to make it easy to follow this section - apply the examples further on in this section to its web.config and then run the site to see how the changes lead to different sprites being generated.

This section looks at these scenarios:

Default configuration

If you just install the package and don't do any configuration, it will combine all .png and .gif images on the page with width and height less than 100px into sprites. But it only works when the site is in Release mode.

To give it a try:

  • Make sure that debug is set to false in your web.config.
  • Open a page on your site that has multiple .png or .gif images with width and height less than 100px.
  • View the source of the page to see the html for the sprite.
  • You will find a new directory ___spritegen in the root of your site with the generated sprite.

Using the package in Debug mode

To use the package in both Debug mode and Release mode, you need to do your first bit of configuration. You do that by adding a cssSpriteGenerator element to your web.config file, within the configuration section:

<configuration>
    ...
    <cssSpriteGenerator>
    </cssSpriteGenerator>
    ...
</configuration>

Then set the active property to Always to make the package active in both Debug and Release mode:

<configuration>
    ...
    <cssSpriteGenerator active="Always">
    </cssSpriteGenerator>
    ...
</configuration>

Processing bigger images

To increase the maximum width for images to be processed to 200px (from the default 100px) and the maximum height to 300px (from the default 100px), add a group to the imageGroups element, like so:

<configuration>
    ...
    <cssSpriteGenerator active="Always">
        <imageGroups>
            <add maxWidth="200" maxHeight="300"/>
        </imageGroups>
    </cssSpriteGenerator>
    ...
</configuration>

Image groups are a fundamental concept in the package. When an image is found on the current page, in a group of background images, or in a directory on the server, it is first added to a group. You can set limitations on the images that can be added to a group, such as maximum width, maximum height, file path, etc. The images in the group are then combined into one or more sprites. If an image doesn't get added to any group, it is simply ignored.

Your site can have multiple groups. The order of your groups is very important. When the package tries to match an image to a group, it starts at the first group and works its way to the last group. The first group that matches is the group that the image is added to. This means you need to list more specific groups before more general groups! In other words, don't write this:

<configuration>
    ...
    <cssSpriteGenerator active="Always">
        <imageGroups>
            <!--WRONG ORDER-->
            <!--matches all images with width 200px or less, irrespective of height-->
            <add maxWidth="200"/>

            <!--intended to match all images 200px wide or less and 300px high or less, 
            but won't match anything, because these images all match the preceding group-->
            <add maxWidth="200" maxHeight="300"/>
        </imageGroups>
    </cssSpriteGenerator>
    ...
</configuration>

Instead, put the more specific group first:

<configuration>
    ...
    <cssSpriteGenerator active="Always">
        <imageGroups>
            <!--matches all images 200px wide or less and 300px high or less-->
            <add maxWidth="200" maxHeight="300"/>

            <!--matches all images with width 200px or less and height greater than 300px-->
            <add maxWidth="200"/>
        </imageGroups>
    </cssSpriteGenerator>
    ...
</configuration>
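To see why the order matters, here is a minimal C# sketch of first-match-wins group selection. It is not the package's code: the ImageGroup class, the FindGroup method and the 150px by 250px test image are all made up for illustration, and a missing limit is modelled as null.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a group's size limits; null means "no limit".
public sealed class ImageGroup
{
    public string Label { get; }
    public int? MaxWidth { get; }
    public int? MaxHeight { get; }

    public ImageGroup(string label, int? maxWidth = null, int? maxHeight = null)
    {
        Label = label;
        MaxWidth = maxWidth;
        MaxHeight = maxHeight;
    }

    // An image fits when it stays within every limit that has been set.
    public bool Matches(int width, int height)
    {
        return (MaxWidth == null || width <= MaxWidth)
            && (MaxHeight == null || height <= MaxHeight);
    }
}

public static class GroupMatchingSketch
{
    // Returns the first group in list order that accepts the image,
    // or null when no group matches (the image is then ignored).
    public static ImageGroup FindGroup(IEnumerable<ImageGroup> groups, int width, int height)
    {
        foreach (ImageGroup group in groups)
        {
            if (group.Matches(width, height)) return group;
        }
        return null;
    }

    public static void Main()
    {
        var general = new ImageGroup("maxWidth=200", maxWidth: 200);
        var specific = new ImageGroup("maxWidth=200 maxHeight=300", maxWidth: 200, maxHeight: 300);

        var wrongOrder = new List<ImageGroup> { general, specific };
        var rightOrder = new List<ImageGroup> { specific, general };

        // A 150x250 image fits both groups, so list order decides.
        Console.WriteLine(FindGroup(wrongOrder, 150, 250).Label); // the general group
        Console.WriteLine(FindGroup(rightOrder, 150, 250).Label); // the specific group
    }
}
```

With the general group listed first, the more specific group never receives an image, because everything it could accept has already been claimed.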

So why does the package combine .png and .gif images into sprites if you haven't defined any groups? The reason is that if there are no groups at all, the package internally creates a default group. This default group only allows .png and .gif images, and then only those whose width and height are 100px or less. The moment you add your own groups however, that default group no longer gets created.

Jpg icons

If you have .jpg images on your page that are 200px by 300px or smaller, you will have noticed that they started getting combined into a sprite together with the other images after you added a new group in the previous section.

This is because groups allow all images, except if you explicitly add a limitation. The default group keeps out .jpg images, but your new group doesn't have such a limitation.

To allow only images with a certain file type, directory or name into a group, use the filePathMatch property of the group element. This is a regular expression. Images whose file paths don't match that regular expression are excluded.

To allow into your group only .gif and .png images that are 200px by 300px or smaller, you would write this:

<configuration>
    ...
    <cssSpriteGenerator active="Always">
        <imageGroups>
            <add maxWidth="200" maxHeight="300" filePathMatch="(gif|png)$"/>
        </imageGroups>
    </cssSpriteGenerator>
    ...
</configuration>
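If you want to check a filePathMatch pattern before putting it in your web.config, you could try it against a few sample paths. The sketch below assumes the package tests the pattern against the image's file path as an ordinary unanchored .NET regular expression. The IsAllowed helper and the sample paths are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FilePathMatchSketch
{
    // Assumption: an image is allowed into the group when the pattern
    // matches anywhere in its file path (plain Regex.IsMatch semantics).
    public static bool IsAllowed(string filePath, string filePathMatch)
    {
        return Regex.IsMatch(filePath, filePathMatch);
    }

    public static void Main()
    {
        string pattern = "(gif|png)$";
        string[] paths = { @"images\cart.png", @"images\arrow.gif", @"photos\beach.jpg" };

        foreach (string path in paths)
        {
            Console.WriteLine(path + " -> " + IsAllowed(path, pattern));
        }

        // The pattern above only looks at the last characters of the path.
        // Adding an escaped dot also requires a real file extension.
        Console.WriteLine(IsAllowed(@"images\notes-png", pattern));
        Console.WriteLine(IsAllowed(@"images\notes-png", @"\.(gif|png)$"));
    }
}
```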

Shared sprites

Your site probably has a directory with small images, such as a shopping basket, arrows, etc. Instead of generating sprites per page, it makes sense to combine all those small images into a single sprite that can then be shared by all pages. That way, when your visitor moves to another page on your site, the shared sprite is probably still in their browser cache, so their browser doesn't need to retrieve it from your server.

You can get the package to read images from a folder on the web server with the processFolderImages and imageFolderPathMatch properties. However, those images then still need to match a group in order to get combined into a sprite.

Assuming your images sit in the directory images on your web server, you'd write something like this:

<configuration>
    ...
    <cssSpriteGenerator active="Always" 
    processFolderImages="true" imageFolderPathMatch="\\images\\" >
        <imageGroups>
            <add maxWidth="200" maxHeight="300" filePathMatch="(gif|png)$"/>
        </imageGroups>
    </cssSpriteGenerator>
    ...
</configuration>

Because in this example the only group only allows .gif and .png images that are 200px by 300px or smaller, any .jpg images or images bigger than 200px by 300px in the images directory won't be processed.

Compressing big .jpg images

Many web sites have large photos that would take a lot less bandwidth if someone compressed them. Compressing images can save a lot of bytes without visible loss of image quality, but often people don't take the time to do it.

You can compress a .jpg image by reducing its quality with the jpegQuality property. Setting it to 70% often has good results. You can make sure that your photos don't get combined into sprites with other images with the giveOwnSprite property (why). Finally, you can ensure that the generated image has the JPG format (instead of the default PNG format) by setting the spriteImageType property.

Let's use this in a group:

<configuration>
    ...
    <cssSpriteGenerator active="Always" >
        <imageGroups>
             <add filePathMatch="\\photos\\" 
             jpegQuality="70" giveOwnSprite="true" spriteImageType="Jpg" />
        </imageGroups>
    </cssSpriteGenerator>
    ...
</configuration>

The compressed photos wind up in the ___spritegen directory along with the other generated images. Check their file sizes to see how many bytes you're saving. If you're trying this out on the DemoSite_QuickStart site, you'll find that the file size of the large photo is reduced from 79.3KB to 58.6KB - a 26% saving without visible loss of quality.

Generating thumbnails

Your site may have thumbnails that link to the full sized photos. To produce a thumbnail, you could simply use the original image and use the width and height attributes of the img tag to reduce its size on the page. However, this causes the browser to load the full sized original image file, which is not an optimal use of your bandwidth. You could physically reduce the image in your favorite image editing program, but that gets boring if you have many images.

All this gets a lot simpler after you've installed the ASP.NET CSS Sprite Generator package, because it physically resizes images whose width and height attributes are different from their physical width and height.

Or you could use the resizeWidth and/or resizeHeight property:

<configuration>
    ...
    <cssSpriteGenerator active="Always" >
        <imageGroups>
             <add filePathMatch="\\photos\\" resizeHeight="200" spriteImageType="Jpg" />
        </imageGroups>
    </cssSpriteGenerator>
    ...
</configuration>

You probably only want to resize the photos on your thumbnail pages, not on the pages with the original images. You could make this happen with the pageUrlMatch property. For example, if your thumbnail page is called thumbnail.aspx, you'd write:

<configuration>
    ...
    <cssSpriteGenerator active="Always" >
        <imageGroups>
             <add filePathMatch="\\photos\\" resizeHeight="200" 
             pageUrlMatch="thumbnail.aspx" spriteImageType="Jpg" />
        </imageGroups>
    </cssSpriteGenerator>
    ...
</configuration>

Conclusion

This was part 1 of this 3 part series. In part 2 we'll go in-depth into the configuration options of the ASP.NET CSS Sprite Generator package.

LINQ to CSV library

by Matt Perdeck 12. September 2011 22:39

Contents

Introduction

This library makes it easy to use CSV files with LINQ queries. Its features include:

  • Follows the most common rules for CSV files. Correctly handles data fields that contain commas and line breaks.
  • In addition to comma, most delimiting characters can be used, including tab for tab delimited fields.
  • Can be used with an IEnumerable of an anonymous class - which is often returned by a LINQ query.
  • Supports deferred reading.
  • Supports processing files with international date and number formats.
  • Supports different character encodings if you need them.
  • Recognizes a wide variety of date and number formats when reading files.
  • Provides fine control of date and number formats when writing files.
  • Robust error handling, allowing you to quickly find and fix problems in large input files.

Requirements

  • To compile the library, you need a C# 2008 compiler, such as Visual Studio 2008 or Visual C# 2008 Express Edition.
  • To run the library code, you need to have the .NET 3.5 framework installed.

Installation

These instructions apply to the download with sources and sample code. The NuGet package installs itself.

  1. Download the zip file with the source code, and unzip in a directory.
  2. Open the Code\LINQtoCSV.sln file in Visual Studio.
  3. You'll find that the sources are organized in a solution, with these elements:
    1. Project LINQtoCSV is the actual library.
    2. Project SampleCode has the sample code shown in this article.
    3. Project TestConsoleApplication is a working console application that exercises most of the features of the library. The code is heavily documented.
    4. The directory TestFiles within the TestConsoleApplication project contains test files - both CSV and tab delimited, and with both US and international (Dutch) dates and numbers.
  4. Compile the solution. This will produce a LINQtoCSV.dll file in the Code\LINQtoCSV\bin directory. You'll need that file to use the library in your own projects.

Quick Start

Reading from a file

    1. In your project, add a reference to the LINQtoCSV.dll you generated during Installation.
    2. The file will be read into an IEnumerable<T>, where T is a data class that you define. The data records read from the file will be stored in objects of this data class. You could define a data class along these lines:
using LINQtoCSV;
using System;

class Product
{
    [CsvColumn(Name = "ProductName", FieldIndex = 1)]
    public string Name { get; set; }

    [CsvColumn(FieldIndex = 2, OutputFormat = "dd MMM HH:mm:ss")]
    public DateTime LaunchDate { get; set; }

    [CsvColumn(FieldIndex = 3, CanBeNull = false, OutputFormat = "C")]
    public decimal Price { get; set; }

    [CsvColumn(FieldIndex = 4)]
    public string Country { get; set; }

    [CsvColumn(FieldIndex = 5)]
    public string Description { get; set; }
}

With this definition, you could read into an IEnumerable<Product>.

Although this example only uses properties, the library methods will recognize simple fields as well. Just make sure your fields/properties are public.

The optional CsvColumn attribute allows you to specify whether a field/property is required, how it should be written to an output file, etc. Full details are available here.

    3. Import the LINQtoCSV namespace at the top of the source file where you'll be reading the file:
using LINQtoCSV;
    4. Create a CsvFileDescription object, and initialize it with details about the file that you're going to read. It will look like this:
CsvFileDescription inputFileDescription = new CsvFileDescription
{
    SeparatorChar = ',', 
    FirstLineHasColumnNames = true
};

This allows you to specify what character is used to separate data fields (comma, tab, etc.), whether the first record in the file holds column names, and a lot more (full details).

    5. Create a CsvContext object:
CsvContext cc = new CsvContext();

It is this object that exposes the Read and Write methods you'll use to read and write files.

    6. Read the file into an IEnumerable<T> using the CsvContext object's Read method, like this:
IEnumerable<Product> products =
    cc.Read<Product>("products.csv", inputFileDescription);

This reads the file products.csv into the variable products, which is of type IEnumerable<Product>.

    7. You can now access products via a LINQ query, a foreach loop, etc.:
var productsByName =
    from p in products
    orderby p.Name
    select new { p.Name, p.LaunchDate, p.Price, p.Description };

// or ...
foreach (Product item in products) { .... }

To make it easier to get an overview, here is the code again that reads from a file, but now in one go:

CsvFileDescription inputFileDescription = new CsvFileDescription
{
    SeparatorChar = ',', 
    FirstLineHasColumnNames = true
};

CsvContext cc = new CsvContext();

IEnumerable<Product> products =
    cc.Read<Product>("products.csv", inputFileDescription);

// Data is now available via variable products.

var productsByName =
    from p in products
    orderby p.Name
    select new { p.Name, p.LaunchDate, p.Price, p.Description };

// or ...
foreach (Product item in products) { .... }

You'll find this same code in the SampleCode project in the sources.

Writing to a file

This is very similar to reading a file.

    1. In your project, add a reference to LINQtoCSV.dll.
    2. The Write method takes an IEnumerable<T> and writes each object of type T in the IEnumerable<T> as a data record to the file. The definition of your data class could look like this:
using LINQtoCSV;
using System;

class Product
{
    [CsvColumn(Name = "ProductName", FieldIndex = 1)]
    public string Name { get; set; }

    [CsvColumn(FieldIndex = 2, OutputFormat = "dd MMM HH:mm:ss")]
    public DateTime LaunchDate { get; set; }

    [CsvColumn(FieldIndex = 3, CanBeNull = false, OutputFormat = "C")]
    public decimal Price { get; set; }

    [CsvColumn(FieldIndex = 4)]
    public string Country { get; set; }

    [CsvColumn(FieldIndex = 5)]
    public string Description { get; set; }
}

The optional CsvColumn attribute allows you to specify such things as what date and number formats to use when writing each data field. Details for all CsvColumn properties (CanBeNull, OutputFormat, etc.) are available here.

Although this example only uses properties, you can also use simple fields.

The Write method will happily use an anonymous type for T, so you can write the output of a LINQ query right to a file. In that case, you obviously won't define T yourself. Later on, you'll see an example of this.

    3. Import the LINQtoCSV namespace at the top of the source file where you'll be writing the file:
using LINQtoCSV;
    4. Make sure the data is stored in an object that implements IEnumerable<T>, such as a List<T>, or the IEnumerable<T> returned by the Read method.
List<Product> products2 = new List<Product>();
// Fill the list with products
// ...
    5. Create a CsvFileDescription object, and initialize it with details about the file you will be writing, along these lines:
CsvFileDescription outputFileDescription = new CsvFileDescription
{
    SeparatorChar = '\t', // tab delimited
    FirstLineHasColumnNames = false, // no column names in first record
    FileCultureName = "nl-NL" // use formats used in The Netherlands
};
    6. Create a CsvContext object:
CsvContext cc = new CsvContext();
    7. Invoke the Write method exposed by the CsvContext object to write the contents of your IEnumerable<T> to a file:
cc.Write(
    products2,
    "products2.csv",
    outputFileDescription);

This writes the Product objects in the variable products2 to the file "products2.csv".

Here is the code again that writes a file, but now in one go:

List<Product> products2 = new List<Product>();
// Fill the list with products
// ...

CsvFileDescription outputFileDescription = new CsvFileDescription
{
    SeparatorChar = '\t', // tab delimited
    FirstLineHasColumnNames = false, // no column names in first record
    FileCultureName = "nl-NL" // use formats used in The Netherlands
};

CsvContext cc = new CsvContext();

cc.Write(
    products2,
    "products2.csv",
    outputFileDescription);

Writing an IEnumerable of anonymous type

If you have a LINQ query producing an IEnumerable of anonymous type, writing that IEnumerable to a file is no problem:

CsvFileDescription outputFileDescription = new CsvFileDescription
{
.....
};

CsvContext cc = new CsvContext();

// LINQ query returning IEnumerable of anonymous type
// into productsNetherlands
var productsNetherlands =
    from p in products
    where p.Country == "Netherlands"
    select new { p.Name, p.LaunchDate, p.Price, p.Description };

// Write contents of productsNetherlands to file
cc.Write(
    productsNetherlands,
    "products-Netherlands.csv", 
    outputFileDescription);

Here, a LINQ query selects all products for "Netherlands" from the variable products, and returns an IEnumerable holding objects of some anonymous type that has the fields Name, LaunchDate, Price, and Description. The Write method then writes those objects to the file products-Netherlands.csv.

CsvContext.Write Overloads

  • Write<T>(IEnumerable<T> values, string fileName)
  • Write<T>(IEnumerable<T> values, string fileName, CsvFileDescription fileDescription)
  • Write<T>(IEnumerable<T> values, TextWriter stream)
  • Write<T>(IEnumerable<T> values, TextWriter stream, CsvFileDescription fileDescription)

Some interesting facts about these overloads:

  • None of the overloads return a value.
  • Unlike the Read method, Write does not require that T has a parameterless constructor.
  • Overloads that take a stream write the data to the stream. Those that take a file name write the data to the file.
  • Overloads that do not take a CsvFileDescription object simply create one themselves, using the default values for the CsvFileDescription properties.

CsvContext.Read Overloads

  • Read<T>(string fileName)
  • Read<T>(string fileName, CsvFileDescription fileDescription)
  • Read<T>(StreamReader stream)
  • Read<T>(StreamReader stream, CsvFileDescription fileDescription)

Some interesting facts about these overloads:

  • Each overload returns an IEnumerable<T>.
  • T must have a parameterless constructor. If you do not define a constructor for T, the compiler will generate a parameterless constructor for you.
  • Overloads that take a stream read the data from the stream. Those that take a file name read the data from the file. However, see the section on deferred reading.
  • Overloads that do not take a CsvFileDescription object simply create one themselves, using the default values for the CsvFileDescription properties.

Reading Raw Data Rows

Sometimes it's easier to read the raw data fields from the CSV file, instead of having them processed into objects by the library. This is the case, for example, if different rows can have different formats, or if you don't know at compile time which field is going to hold what data.

You can make this happen by having your type T implement the interface IDataRow. This interface is included in the library, so you don't have to write it yourself. It essentially just describes a collection of DataRowItem objects:

public interface IDataRow
{
    // Number of data row items in the row.
    int Count { get; }

    // Clear the collection of data row items.
    void Clear();

    // Add a data row item to the collection.
    void Add(DataRowItem item);

    // Allows you to access each data row item with an array index, such as
    // row[i]
    DataRowItem this[int index] { get; set; }
}

The DataRowItem class is also defined in the library. It describes each individual field within a data row:

public class DataRowItem
{
    ...
    // Line number of the field
    public int LineNbr  { get { ... } }

    // Value of the field
    public string Value { get { ... } }
}

The line number is included in the DataRowItem class, because data rows can span multiple lines.

The easiest way to create a class that implements IDataRow is to derive it from List<DataRowItem>:

using LINQtoCSV;
using System.Collections.Generic; // needed for List<T>

internal class MyDataRow : List<DataRowItem>, IDataRow
{
}

Now you can read the CSV file into a collection of MyDataRow objects:

IEnumerable<MyDataRow> products =
    cc.Read<MyDataRow>("products.csv", inputFileDescription);

You can then access each individual field within each data row:

foreach (MyDataRow dataRow in products)
{
    string firstFieldValue = dataRow[0].Value;
    int firstFieldLineNbr = dataRow[0].LineNbr;

    string secondFieldValue = dataRow[1].Value;
    int secondFieldLineNbr = dataRow[1].LineNbr;

    ...
}

Deferred Reading

Here is how the Read overloads implement deferred reading:

  • When you invoke the Read method (which returns an IEnumerable<T>), no data is read yet. If using a file, the file is not yet opened.
  • When the Enumerator is retrieved from the IEnumerable<T> (for example, when starting a foreach loop), the file is opened for reading. If using a stream, the stream is rewound (seek to start of the stream).
  • Each time you retrieve a new object from the Enumerator (for example, while looping through a foreach), a new record is read from the file or stream.
  • When you close the Enumerator (for example, when a foreach ends or when you break out of it), the file is closed. If using a stream, the stream is left unchanged.

This means that:

  • If reading from a file, the file will be open for reading while you're accessing the IEnumerable<T> in a foreach loop.
  • The file can be updated in between accesses. You could access the IEnumerable<T> in a foreach loop, then update the file, then access the IEnumerable<T> again in a foreach loop to pick up the new data, etc. You only need to call Read once at the beginning, to get the IEnumerable<T>.
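The behaviour described above is the same as that of a C# iterator method. The following sketch is not the library's implementation: it is a stand-alone illustration using a made-up ReadLines method that yields raw lines instead of typed records, and a temporary file created just for the demo.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DeferredReadingSketch
{
    // Iterator method: none of this body runs until enumeration starts.
    public static IEnumerable<string> ReadLines(string fileName)
    {
        // The file is opened on the first MoveNext, i.e. when a foreach begins.
        using (var reader = new StreamReader(fileName))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return line; // one record is read per loop step
            }
        } // the file is closed when the loop ends or is left early
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // Calling the method does not open the file yet.
            IEnumerable<string> lines = ReadLines(path);

            File.WriteAllText(path, "first\n");
            foreach (string line in lines) Console.WriteLine(line);

            // Update the file, then enumerate the same IEnumerable again.
            File.WriteAllText(path, "second\nthird\n");
            foreach (string line in lines) Console.WriteLine(line);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Because the file is only held open while a foreach is running, it can be rewritten between the two loops, and the second loop picks up the new contents without another call to ReadLines.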

CsvFileDescription

The Read and Write methods need some details about the file they are reading or writing, such as whether the first record contains column names.

As shown in the Reading from a file and Writing to a file examples, you put those details in an object of type CsvFileDescription, which you then pass to the Read or Write method. This prevents lengthy parameter lists, and allows you to use the same details for multiple files.

A CsvFileDescription object has these properties:

SeparatorChar

Type: char
Default: ','
Applies to: Reading and Writing

Example:

CsvFileDescription fd = new CsvFileDescription();
fd.SeparatorChar = '\t'; // use tab delimited file

CsvContext cc = new CsvContext();
cc.Write(data, "file.csv", fd);

The character used to separate fields in the file. This would be a comma for CSV files, or a '\t' for a tab delimited file.

You can use any character you like, except for white space characters or the double quote (").

QuoteAllFields

Type: bool
Default: false
Applies to: Writing only

Example:

fd.QuoteAllFields = true; // forces quotes around all fields

When false, Write only puts quotes around data fields when needed, to avoid confusion - for example, when the field contains the SeparatorChar or a line break.

When true, Write surrounds all data fields with quotes.
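As a rough illustration of that rule, here is a sketch of a quoting helper. FormatField is a made-up name, not a library method. The sketch also assumes that fields containing a double quote get quoted and that embedded quotes are doubled, which is the usual CSV convention but is not spelled out above.

```csharp
using System;

public static class QuotingSketch
{
    // Hypothetical helper, not the library's code.
    public static string FormatField(string value, char separatorChar, bool quoteAllFields)
    {
        bool needsQuotes = quoteAllFields
            || value.IndexOf(separatorChar) >= 0
            || value.IndexOf('"') >= 0
            || value.IndexOf('\n') >= 0
            || value.IndexOf('\r') >= 0;

        if (!needsQuotes) return value;

        // Assumption: embedded quotes are doubled, as in common CSV practice.
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }

    public static void Main()
    {
        Console.WriteLine(FormatField("Widget", ',', false));        // Widget
        Console.WriteLine(FormatField("Widget, large", ',', false)); // "Widget, large"
        Console.WriteLine(FormatField("Widget", ',', true));         // "Widget"
    }
}
```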

FirstLineHasColumnNames

Type: bool
Default: true
Applies to: Reading and Writing

Example:

fd.FirstLineHasColumnNames = false; // first record does not have column headers

When reading a file, tells Read whether to interpret the data fields in the first record in the file as column headers.

When writing a file, tells Write whether to write column headers as the first record of the file.

EnforceCsvColumnAttribute

Type: bool
Default: false
Applies to: Reading and Writing

Example:

fd.EnforceCsvColumnAttribute = true; // only use fields with [CsvColumn] attribute

When true, Read only reads data fields into public fields and properties with the [CsvColumn] attribute, ignoring all other fields and properties. Likewise, Write only writes the contents of public fields and properties with the [CsvColumn] attribute.

When false, all public fields and properties are used.

FileCultureName

Type: string
Default: current system setting
Applies to: Reading and Writing

Example:

fd.FileCultureName = "en-US"; // use US style dates and numbers

Different cultures use different ways to write dates and numbers. 23 May 2008 is 5/23/2008 in the United States (en-US) and 23.05.2008 in Germany (de-DE). Use the FileCultureName field to tell Read how to interpret the dates and numbers it reads from the file, and to tell Write how to write dates and numbers to the file.

By default, the library uses the current language/country setting on your system. So, if your system uses French-Canadian (fr-CA), the library uses that culture unless you override it with FileCultureName.

The library uses the same culture names as the .NET "CultureInfo" class (full list of names).
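To get a feel for how much the culture matters, you can experiment with CultureInfo directly. The sketch below only uses the .NET framework, not the library. The helper names FormatShortDate and ParseDate are made up, and the exact strings printed depend on the culture data on your system.

```csharp
using System;
using System.Globalization;

public static class CultureSketch
{
    // Formats a date with the culture's short date pattern ("d").
    public static string FormatShortDate(DateTime date, string cultureName)
    {
        return date.ToString("d", CultureInfo.GetCultureInfo(cultureName));
    }

    // Parses date text using the conventions of the named culture.
    public static DateTime ParseDate(string text, string cultureName)
    {
        return DateTime.Parse(text, CultureInfo.GetCultureInfo(cultureName));
    }

    public static void Main()
    {
        var date = new DateTime(2008, 5, 23);
        foreach (string name in new[] { "en-US", "de-DE", "nl-NL" })
        {
            Console.WriteLine(name + ": " + FormatShortDate(date, name));
        }

        // The same text is a different date depending on the culture:
        // month first for en-US, day first for en-GB.
        foreach (string name in new[] { "en-US", "en-GB" })
        {
            DateTime parsed = ParseDate("1/2/2008", name);
            Console.WriteLine(name + ": " + parsed.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
        }
    }
}
```

The second loop shows why it pays to set FileCultureName explicitly when a file comes from a system with different regional settings: the text parses without any error under both cultures, but yields different dates.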

TextEncoding

Type: Encoding
Default: Encoding.UTF8
Applies to: Reading and Writing

Example:

fd.TextEncoding = Encoding.Unicode; // use Unicode character encoding

If the files that you read or write are in English, there is no need to set TextEncoding.

However, if you use languages other than English, the way the characters in your files are encoded may be an issue. You will want to make sure that the encoding used by the library matches the encoding used by any other programs (editors, spreadsheets) that access your files.

Specifically, if you write files with the Euro symbol, you may need to use Unicode encoding, as shown in the example.

DetectEncodingFromByteOrderMarks

Type: bool
Default: true
Applies to: Reading only

Example:

fd.DetectEncodingFromByteOrderMarks = false; // suppress encoding detection

Related to TextEncoding. The default normally works fine.

Tells Read whether to detect the encoding of the input file by looking at the first three bytes of the file. Otherwise, it uses the encoding given in the TextEncoding property.

MaximumNbrExceptions

Type: int
Default: 100
Applies to: Reading only

Example:

fd.MaximumNbrExceptions = -1; // always read entire file before throwing AggregatedException

Sets the maximum number of exceptions that will be aggregated into an AggregatedException.

To not have any limit and read the entire file no matter how many exceptions you get, set MaximumNbrExceptions to -1.

For details about aggregated exceptions, see the error handling section.

CsvColumn Attribute

As shown in the Reading from a file and Writing to a file examples, you can decorate the public fields and properties of your data class with the CsvColumn attribute to specify such things as the output format for date and number fields.

Use of the CsvColumn attribute is optional. As long as the EnforceCsvColumnAttribute property of the CsvFileDescription object you pass into Read or Write is false, those methods will look at all public fields and properties in the data class. They will then simply use the defaults shown with each CsvColumn property below.

The CsvColumn attribute has these properties:

Name

Type: string
Default: Name of the field or property
Applies to: Reading and Writing

Example:

[CsvColumn(Name = "StartDate")]
public DateTime LaunchDate { get; set; }

The Read and Write methods normally assume that the data fields in the file have the same names as the corresponding fields or properties in the class. Use the Name property to specify another name for the data field.

CanBeNull

Type: bool
Default: true
Applies to: Reading only

Example:

[CsvColumn(CanBeNull = false)]
public DateTime LaunchDate { get; set; }

If false, and a record in the input file does not have a value for this field or property, then the Read method generates a MissingRequiredFieldException exception.

FieldIndex

Type: int
Default: Int32.MaxValue
Applies to: Reading and Writing

Example:

[CsvColumn(FieldIndex = 1)]
public DateTime LaunchDate { get; set; }

This property is used for both reading and writing, but in slightly different ways.

Reading - The Read method needs to somehow associate data fields in the input file with fields and properties in the data class. If the file has column names in the first record, that's easy - Read simply matches the column names with the names of the fields and properties in the data class.

However, if the file does not have column names in the first record, Read needs to look at the order of the data fields in the data records to match them with the fields and properties in the data class. Unfortunately though, the .NET framework does not provide a way to reliably retrieve that order from the class definition. So, you have to specify which field/property comes before which field/property by giving the fields and properties a CsvColumn attribute with the FieldIndex property.

The FieldIndex values do not have to start at 1. They don't have to be consecutive. The Read and Write methods will simply assume that a field/property comes before some other field/property if its FieldIndex is lower.

Writing - The Write method uses the FieldIndex of each field or property to figure out in what order to write the data fields to the output file. Fields and properties without FieldIndex get written last, in random order.
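The ordering rule boils down to a sort on FieldIndex. Here is a small sketch of that idea, not the library's code: OrderColumns and NoFieldIndex are made-up names, and the columns are passed in as simple name/index pairs rather than being discovered via reflection.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FieldOrderSketch
{
    // Mirrors the documented default for FieldIndex.
    public const int NoFieldIndex = int.MaxValue;

    // Returns the column names sorted by ascending FieldIndex. Columns
    // without a FieldIndex carry the maximum value, so they sort last.
    public static List<string> OrderColumns(IEnumerable<KeyValuePair<string, int>> columns)
    {
        return columns.OrderBy(c => c.Value).Select(c => c.Key).ToList();
    }

    public static void Main()
    {
        var columns = new List<KeyValuePair<string, int>>
        {
            new KeyValuePair<string, int>("Description", NoFieldIndex),
            new KeyValuePair<string, int>("Price", 30),
            new KeyValuePair<string, int>("Name", 5),
            new KeyValuePair<string, int>("LaunchDate", 12),
        };

        // Prints: Name, LaunchDate, Price, Description
        Console.WriteLine(string.Join(", ", OrderColumns(columns)));
    }
}
```

Note that the indexes 5, 12 and 30 neither start at 1 nor are consecutive. Only their relative order counts.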

NumberStyle

Type: NumberStyles
Default: NumberStyles.Any
Applies to: Reading of numeric fields only

Example:

[CsvColumn(NumberStyle = NumberStyles.HexNumber)]
public int ProductCode { get; set; } // hypothetical numeric property; NumberStyle only applies to numeric fields

Allows you to determine what number styles are allowed in the input file (list of options).

By default, all styles are permitted, except for one special case. In order to accept hexadecimal numbers that do not start with 0x, use NumberStyles.HexNumber, as shown in the example.
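You can see the special case for yourself with the framework's own parsing methods, which take the same NumberStyles values. This sketch does not use the library. TryParseInt is a made-up helper around int.TryParse, and the invariant culture is used to keep the results independent of your system settings.

```csharp
using System;
using System.Globalization;

public static class NumberStyleSketch
{
    // Returns the parsed value, or null when the text is rejected
    // under the given number style.
    public static int? TryParseInt(string text, NumberStyles style)
    {
        int value;
        return int.TryParse(text, style, CultureInfo.InvariantCulture, out value)
            ? value
            : (int?)null;
    }

    public static void Main()
    {
        // NumberStyles.Any accepts thousands separators, signs, etc. ...
        Console.WriteLine(TryParseInt("1,234", NumberStyles.Any));

        // ... but not hexadecimal digits.
        Console.WriteLine(TryParseInt("1A", NumberStyles.Any).HasValue);

        // NumberStyles.HexNumber reads the same text as hexadecimal.
        Console.WriteLine(TryParseInt("1A", NumberStyles.HexNumber));
    }
}
```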

OutputFormat

Type: string
Default: "G"
Applies to: Writing only

Example:

[CsvColumn(OutputFormat = "dd MMM yy")]
public DateTime LaunchDate { get; set; }

Lets you set the output format of numbers and dates/times. The default "G" format works well for both dates and numbers most of the time.

When writing a date/time or number field, the Write method first determines the type of the field (DateTime, decimal, double, etc.) and then calls the ToString method for that type, with the given OutputFormat. So, in the example above, if LaunchDate is 23 November 2008, the field written to the file will be "23 Nov 08".

With many formats, the final result depends on the language/country of the file, as set in the FileCultureName property of the CsvFileDescription object. So, if LaunchDate is 23 November 2008 and you specify the short date format:

[CsvColumn(OutputFormat = "d")]
public DateTime LaunchDate { get; set; }

Then, the final value written to the output file will be "11/23/2008" if you use US dates (FileCultureName is set to "en-US"), but "23.11.2008" if you use German dates (FileCultureName is set to "de-DE").

Error Handling

When the Read and Write methods detect an error situation, they throw an exception with all information you need to solve the problem. As you would expect, all exceptions are derived from the .NET class Exception.

Retrieving error information

In addition to such properties as StackTrace and Message, the Exception class exposes the Data property. The Read and Write methods use that property to provide exception information in a way that is easy for your code to read, while they provide error messages targeted at humans via the Message property.

The description for each exception (further below) shows what information is stored in the Data property.

Aggregating exceptions

When the Read method detects an error while reading data from a file, it does not throw an exception right away, but stores it in a list of type List<Exception>. Then, after it has processed the file, it throws a single exception of type AggregatedException, with the list of exceptions in its Data["InnerExceptionsList"] property. This allows you to fix all problems with an input file in one go, instead of one by one.

You can limit the number of exceptions that get aggregated this way by setting the MaximumNbrExceptions property of the CsvFileDescription object that you pass to the Read method. By default, MaximumNbrExceptions is set to 100. When the limit is reached, the AggregatedException is thrown right away, with the list of exceptions aggregated so far.

Not all exceptions get aggregated! Before Read starts reading data from a file, it first processes column names, CsvColumn attributes, etc. If something goes wrong during that preliminary stage, it throws an exception right away.
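
The aggregation behaviour described above can be sketched independently of the library. The following Python sketch is not the library's implementation; AggregatedError, parse_row and read_rows are hypothetical names, used only to illustrate collecting errors up to a limit and then raising a single exception that carries the list.

```python
class AggregatedError(Exception):
    """Hypothetical stand-in for an aggregating exception."""

    def __init__(self, inner_errors):
        super().__init__(f"{len(inner_errors)} error(s) while reading")
        self.inner_errors = inner_errors


def parse_row(line_nbr, line):
    """Hypothetical row parser: expects 'name,price' with a numeric price."""
    name, price = line.split(",")
    try:
        return name, float(price)
    except ValueError:
        raise ValueError(f"line {line_nbr}: bad price {price!r}")


def read_rows(lines, max_errors=100):
    """Parse all lines, aggregating errors instead of stopping at the first."""
    rows, errors = [], []
    for line_nbr, line in enumerate(lines, start=1):
        try:
            rows.append(parse_row(line_nbr, line))
        except ValueError as e:
            errors.append(e)
            if len(errors) >= max_errors:
                # Limit reached: stop right away with the errors so far
                raise AggregatedError(errors)
    if errors:
        # File fully processed: report every problem in one go
        raise AggregatedError(errors)
    return rows
```

A caller would catch AggregatedError once and loop over inner_errors, which mirrors the way the C# example further down loops over Data["InnerExceptionsList"].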

Deferred reading

Keep in mind that due to deferred reading, you can get exceptions not only when you invoke the Read method, but also when you access the IEnumerable<T> that is returned by the Read method.

Example

The following code reads a file and processes exceptions. To show how to use the Data property, it includes some special processing for the DuplicateFieldIndexException - thrown when the Read and Write methods detect two fields or properties with the same FieldIndex.

public static void ShowErrorMessage(string errorMessage)
{
    // show errorMessage to user
    // .....
}

public static void ReadFileWithExceptionHandling()
{
    try
    {
        CsvContext cc = new CsvContext();

        CsvFileDescription inputFileDescription = new CsvFileDescription
        {
            MaximumNbrExceptions = 50
            // limit number of aggregated exceptions to 50
        };

        IEnumerable<Product> products =
            cc.Read<Product>("products.csv", inputFileDescription);

        // Do data processing
        // ...........

    }
    catch(AggregatedException ae)
    {
        // Process all exceptions generated while processing the file

        List<Exception> innerExceptionsList =
            (List<Exception>)ae.Data["InnerExceptionsList"];

        foreach (Exception e in innerExceptionsList)
        {
            ShowErrorMessage(e.Message);
        }
    }
    catch(DuplicateFieldIndexException dfie)
    {
        // name of the class used with the Read method - in this case "Product"
        string typeName = Convert.ToString(dfie.Data["TypeName"]);

        // Names of the two fields or properties that have the same FieldIndex
        string fieldName = Convert.ToString(dfie.Data["FieldName"]);
        string fieldName2 = Convert.ToString(dfie.Data["FieldName2"]);

        // Actual FieldIndex that the two fields have in common
        int commonFieldIndex = Convert.ToInt32(dfie.Data["Index"]);

        // Do some processing with this information
        // .........


        // Inform user of error situation
        ShowErrorMessage(dfie.Message);
    }
    catch(Exception e)
    {
        ShowErrorMessage(e.Message);
    }
}

BadStreamException

This exception exposes the same properties as Exception.

Thrown when the stream passed to Read is either null or does not support Seek. The stream has to support Seek, otherwise it cannot be rewound when the IEnumerable returned by Read is accessed.

CsvColumnAttributeRequiredException

This exception exposes the same properties as Exception.

Thrown when the CsvFileDescription object that has been passed to Read has both FirstLineHasColumnNames and EnforceCsvColumnAttribute set to false.

If there are no column names in the file, then Read relies on the FieldIndex of each field or property in the data class to match them with the data fields in the file. However, if EnforceCsvColumnAttribute is false, that implies that fields or properties without the CsvColumn attribute can also be used to accept data, while they do not have a FieldIndex.

DuplicateFieldIndexException

Additional Properties - This exception exposes the same properties as Exception, plus these additional properties:

Property Type Description
Data["TypeName"] string Name of the class with the offending fields/properties
Data["FieldName"] string First field or property with the duplicate FieldIndex
Data["FieldName2"] string Second field or property with the duplicate FieldIndex
Data["Index"] int Common FieldIndex

Thrown when two or more fields or properties have the same FieldIndex.

RequiredButMissingFieldIndexException

Additional Properties - This exception exposes the same properties as Exception, plus these additional properties:

Property Type Description
Data["TypeName"] string Name of the class with the offending field/property
Data["FieldName"] string Field or property without FieldIndex

When there are no column names in the first record in the file (FirstLineHasColumnNames is false), each required field (CanBeNull property of the CsvColumn attribute set to false) must have a FieldIndex, otherwise it cannot be read from the file.

ToBeWrittenButMissingFieldIndexException

Additional Properties - This exception exposes the same properties as Exception, plus these additional properties:

Property Type Description
Data["TypeName"] string Name of the class with the offending field/property
Data["FieldName"] string Field or property without FieldIndex

When writing a file without column names in the first record, you will want to make sure that the data fields appear in each line in a well defined order. If that order were random, it would be impossible for some other program to reliably process the file.

So, when the Write method is given a CsvFileDescription with FirstLineHasColumnNames as false, and it finds a field or property that doesn't have a FieldIndex, it throws a ToBeWrittenButMissingFieldIndexException.

NameNotInTypeException

Additional Properties - This exception exposes the same properties as Exception, plus these additional properties:

Property Type Description
Data["TypeName"] string Name of the class missing the field/property
Data["FieldName"] string Field or property that isn't found
Data["FileName"] string Name of the input file

If the Read method is given a CsvFileDescription with FirstLineHasColumnNames as true, and one of the column names in the first record in the file does not match a field or property, it throws a NameNotInTypeException.

MissingCsvColumnAttributeException

Additional Properties - This exception exposes the same properties as Exception, plus these additional properties:

Property Type Description
Data["TypeName"] string Name of the class with the offending field/property
Data["FieldName"] string Field or property without CsvColumn attribute
Data["FileName"] string Name of the input file

The Read method may throw this exception when it is given a CsvFileDescription with both FirstLineHasColumnNames and EnforceCsvColumnAttribute as true. When Read reads the column names from the first record, one of those column names may match a field or property that doesn't have a CsvColumn attribute, even though only fields and properties with a CsvColumn attribute can be used. When that happens, Read throws a MissingCsvColumnAttributeException.

TooManyDataFieldsException

Additional Properties - This exception exposes the same properties as Exception, plus these additional properties:

Property Type Description
Data["TypeName"] string Name of the data class
Data["LineNbr"] int Line in the input file with an excess data field
Data["FileName"] string Name of the input file

Thrown when a record in the input file has more data fields than there are public fields and properties in the data class.

TooManyNonCsvColumnDataFieldsException

Additional Properties - This exception exposes the same properties as Exception, plus these additional properties:

Property Type Description
Data["TypeName"] string Name of the data class
Data["LineNbr"] int Line in the input file with an excess data field
Data["FileName"] string Name of the input file

When only fields or properties that have a CsvColumn attribute are used (Read is given a CsvFileDescription with EnforceCsvColumnAttribute as true), and a record in the input file has more data fields than there are fields and properties with the CsvColumn attribute, a TooManyNonCsvColumnDataFieldsException is thrown.

MissingFieldIndexException

Additional Properties - This exception exposes the same properties as Exception, plus these additional properties:

Property Type Description
Data["TypeName"] string Name of the data class
Data["LineNbr"] int Line with offending field
Data["FileName"] string Name of the input file

If there are no column names in the first record of the input file (Read is given a CsvFileDescription with FirstLineHasColumnNames as false), then Read relies on the FieldIndex of the fields and properties in the data class to match them with the data fields in the file.

When a record in the input file has more data fields than there are fields and properties in the data class with a FieldIndex, then a MissingFieldIndexException is thrown.

MissingRequiredFieldException

Additional Properties - This exception exposes the same properties as Exception, plus these additional properties:

Property Type Description
Data["TypeName"] string Name of the class with the required field/property
Data["FieldName"] string Name of the required field/property
Data["LineNbr"] int Line where missing field should have been
Data["FileName"] string Name of the input file

Thrown when a record from the input file does not have a value for a required field or property (CanBeNull property of the CsvColumn attribute set to false).

Difference between null and empty string

Empty strings and strings consisting of only white space need to be surrounded by quotes, so they are recognized as something other than null.

These input lines both have the data fields "abc", null, and "def":

abc,,def
abc,   ,def

While this line has the data fields "abc", followed by the empty string, followed by "def":

abc,"",def

and this line has the data fields "abc", followed by a string with three spaces, followed by "def":

abc,"   ",def
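
The rule above can be illustrated with a small parser. This is a simplified Python sketch, not the library's parser: split_fields is a hypothetical helper, and it assumes there are no escaped quotes inside quoted fields.

```python
def split_fields(line):
    """Split one CSV line; unquoted blank fields become None.

    Simplified: does not handle escaped quotes inside quoted fields.
    """
    fields = []
    pos = 0
    while pos <= len(line):
        # Skip leading spaces to see whether the field starts with a quote
        start = pos
        while start < len(line) and line[start] == " ":
            start += 1
        if start < len(line) and line[start] == '"':
            end = line.index('"', start + 1)      # closing quote
            fields.append(line[start + 1:end])    # keep quoted content verbatim
            comma = line.find(",", end + 1)
        else:
            comma = line.find(",", pos)
            raw = line[pos:comma] if comma != -1 else line[pos:]
            # Empty or white-space-only unquoted fields count as null
            fields.append(raw if raw.strip() else None)
        if comma == -1:
            break
        pos = comma + 1
    return fields


print(split_fields('abc,,def'))       # ['abc', None, 'def']
print(split_fields('abc,   ,def'))    # ['abc', None, 'def']
print(split_fields('abc,"",def'))     # ['abc', '', 'def']
print(split_fields('abc,"   ",def'))  # ['abc', '   ', 'def']
```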

WrongDataFormatException

Additional Properties - This exception exposes the same properties as Exception, plus these additional properties:

Property Type Description
Data["TypeName"] string Name of the class with the field/property
Data["FieldName"] string Name of the field/property
Data["FieldValue"] string The offending data value
Data["LineNbr"] int Line with offending data value
Data["FileName"] string Name of the input file

Thrown when a field has the wrong format. For example, a numeric field with the value "abc".

AggregatedException

Additional Properties - This exception exposes the same properties as Exception, plus these additional properties:

Property Type Description
Data["TypeName"] string Name of the data class used by Read
Data["FileName"] string Name of the input file
Data["InnerExceptionsList"] List<Exception> List of Exceptions

Used to aggregate exceptions generated while reading a file (see the Aggregating exceptions section above for details).

Load testing web sites with StresStimulus, part 2 - Advanced features

by Matt Perdeck 29. August 2011 10:43

Introduction

Part 1 of this 2 part series showed the features of StresStimulus 1.0 and how to get started running load tests. In part 2, we'll go into the more advanced features of StresStimulus, such as parameterization.

Increase load in steps

During a load test, it is often useful to increase the load in steps. That makes it easier to find out at what load your site starts getting into trouble. You'd apply little load for a while, then run at a greater load for a while, etc. until you reach maximum load.

To make this happen:

  1. Click Load Pattern under Test Configuration.

  2. Select Step Load.

  3. Select the number of virtual users the test should start off with, the increase in virtual users for each step, how long each step should last in seconds, and the maximum number of virtual users.

    When setting these numbers, keep in mind how long it will take to get to maximum load. If you have 9 steps of 5 seconds each, it will take 9 x 5 seconds = 45 seconds before StresStimulus finally starts applying maximum load.

  4. Click the Start Debug or Start Test button to start the test.
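
To get a feel for how long the ramp-up takes with your own numbers, you can work it out up front. The Python sketch below assumes a simple model in which the load goes up by a fixed number of virtual users after each step; step_load_schedule is a hypothetical helper, and the way StresStimulus schedules steps internally may differ.

```python
def step_load_schedule(start_users, step_users, step_seconds, max_users):
    """Return (seconds_elapsed, virtual_users) pairs for a step load pattern."""
    schedule = []
    elapsed, users = 0, start_users
    while users < max_users:
        schedule.append((elapsed, users))
        elapsed += step_seconds
        users = min(users + step_users, max_users)
    schedule.append((elapsed, users))  # maximum load starts here
    return schedule


# 9 steps of 5 seconds each before maximum load is reached
schedule = step_load_schedule(start_users=10, step_users=10,
                              step_seconds=5, max_users=100)
ramp_up_seconds = schedule[-1][0]
print(ramp_up_seconds)  # 45
```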

Think time

When real users access your site, they spend time reading your instructions, deciding what to do next, etc. To simulate this, you can get the virtual users to pause for a while, both between page requests and between iterations.

To set think times, expand Test Configuration and then click Think Time.

You get these options to set think times in between page requests (all in seconds):

  • Use recorded think times - The virtual users use the same time thinking as you did when you recorded the test case. This is a good option if you decide that your think times are typical of those of visitors to your site. Or you could simply pause a bit longer while recording to simulate visitors who are not as familiar with the site as you are. An advantage of this option is that you can vary the think time for each page.

  • Use constant think time(s) (Pro 14 day trial version and Pro paid version only) - If you have decided that on average your visitors spend the same amount of time thinking at each page in your test case, this would be the easiest option.

  • Do not use think time (Pro 14 day trial version and Pro paid version only) - Allows you to throw as much traffic at your site as possible. You would use this to find bugs caused by high load, rather than to simulate expected traffic.

There are two more think time related options:

  • Think time between iterations (s) - Allows you to add additional think time in between iterations.

  • Total iteration time (s) (Pro versions only) - This nifty feature makes it easier to compare the performance of different versions of your site.

    Assume you've tested the original version of your site, and now you are testing a new faster version. You want to know how much faster it responds to requests from the virtual users. The issue is that because the site responds more quickly, the virtual users send more requests, creating a greater load on the site. As a result, the performance improvement appears less than it actually is.

    To fix this, set Total iteration time (s) to the average iteration time of the first, slower version. This way, the faster version will receive the same load as the slower version, so you can compare the average response times between the two.
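
The effect of a fixed total iteration time can be shown with a little arithmetic. The Python sketch below uses a deliberately simplified model (every page has the same response time and think time), and iterations_per_minute is a hypothetical helper rather than anything StresStimulus exposes.

```python
def iterations_per_minute(response_seconds, pages, think_seconds,
                          total_iteration_seconds=None):
    """Iterations one virtual user completes per minute.

    Simplified model: an iteration takes (response + think) per page,
    padded with idle time up to total_iteration_seconds when that is set.
    """
    natural = pages * (response_seconds + think_seconds)
    if total_iteration_seconds is not None:
        natural = max(natural, total_iteration_seconds)
    return 60 / natural


# Without a fixed iteration time, the faster site receives more load
old = iterations_per_minute(response_seconds=2, pages=5, think_seconds=1)
new = iterations_per_minute(response_seconds=1, pages=5, think_seconds=1)
print(old, new)  # 4.0 6.0

# With the total iteration time fixed at the old site's 15 seconds, load is equal
old_fixed = iterations_per_minute(2, 5, 1, total_iteration_seconds=15)
new_fixed = iterations_per_minute(1, 5, 1, total_iteration_seconds=15)
print(old_fixed, new_fixed)  # 4.0 4.0
```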

Set warm up period

At the start of a load test, your site is unlikely to run at maximum speed because the web server is still filling its caches. You want to exclude this temporary slowness from your test results, otherwise they underestimate the performance of your site during normal operation.

To make this happen, you can set a warm up duration for your test. This way, when you start a test, StresStimulus first applies load for the number of seconds you set as the warm up duration, without keeping track of your site's performance. Only after the warm up period has finished does it start the real test, where it does keep track of your site's performance.

To set a warm up duration:

  1. Under Test Configuration, click Test Duration.
  2. Set Test completion criteria to either Run Duration or Reaching Max Users.
  3. You can now set the Warm-up Duration to the number of seconds the warm up period should last.

Old browsers, slow networks

If many of your visitors still use older browsers, such as IE6, or slow network connections, such as dial-up, you'll want to take this into account when establishing response times of your site.

Setting the simulated browser

When StresStimulus runs a load test, each virtual user effectively uses a simulated browser to access your site. Which type of browser it simulates can influence the performance of your site during load testing. For example, when IE6 loads the images on a page, it loads only 2 images concurrently per domain. More modern browsers load at least 6 images concurrently, potentially meaning faster page load times.
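
As a rough back-of-the-envelope illustration of why the connection limit matters, the Python sketch below assumes that all images take equally long to load and that they are fetched in full rounds. Real browsers behave less neatly, and image_load_seconds is a hypothetical helper.

```python
import math


def image_load_seconds(nbr_images, seconds_per_image, concurrent_per_domain):
    """Time to load all images when only a few can load at the same time."""
    rounds = math.ceil(nbr_images / concurrent_per_domain)
    return rounds * seconds_per_image


print(image_load_seconds(12, 0.5, 2))  # 3.0 - two images at a time
print(image_load_seconds(12, 0.5, 6))  # 1.0 - six images at a time
```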

To set the type of browser simulated by StresStimulus:

  1. Under Test Configuration, click Browser Type.
  2. Select the browser to simulate.

Setting the simulated network

To simulate a slow dial-up connection:

  1. Under Test Configuration, click Network Type.
  2. Choose the network connection to simulate. LAN would be the fastest, Dial-Up 56k the slowest.

Parameterization

Your site may have online forms, such as sign up forms or login forms. Your server has to process those forms. When you load test pages with those forms, you'll want the virtual users to enter realistic values into those forms, so you get test results that reflect how your site will perform after it has gone live.

When you are creating a test case by visiting pages and submitting forms, Fiddler not only records which pages you visited, but also the form values you submitted (if you leave a field blank, the value is an empty string). However, if you rely on this, during a load test all virtual users will always enter the exact same values into the forms, which is not very realistic. In a sign up form where users have to enter a unique user name, it wouldn't work at all.

To solve this, StresStimulus supports parameterization. This allows you to set up a file with the values that you want the virtual users to enter into form fields for multiple form submissions. Let's see how this works step by step.

Single page, multiple values

To keep things simple, let's begin with a test case consisting of a single form. It has a few text boxes, radio buttons and a checkbox. We'll get a single virtual user to submit that form 10 times. It will get the values to put into the form from an input file.

Step 1: Create input file

StresStimulus expects the input file to be a CSV (Comma Separated Values) file. This is a text file with all values separated by commas - hence its name. To see what this looks like, look in the download that came with this article for the file parameterization1.csv and open it with your favorite text editor.

If you have Microsoft Excel or some other spreadsheet program such as the one in the free OpenOffice, you can use that to create a CSV file. Enter the following data into a spreadsheet, and use Save As to save it as a CSV file:

textbox1  textbox2  Radio         CheckBox1
value1a   value2a
value1b   value2b   RadioButton1  on
value1c   value2c   RadioButton2
value1d   value2d   RadioButton3  on
value1e   value2e

The first row in this file has the headers. You can use any headers you like, but in practice it's easiest to use the names of the controls where the virtual users will enter the values.

In this first simple example, each row has all the values that go into the form. They will be used sequentially - the first time a virtual user submits the form, it will use the values in the first row. The second time a virtual user sends a form, it uses the second row, etc. When the rows are exhausted, the first row is used again, etc.
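
The wrap-around behaviour described above boils down to a modulo on the row number. The Python sketch below uses the same data as the spreadsheet shown earlier, written out as CSV text; row_for_submission is a hypothetical helper that only mimics what Sequential selection does, not StresStimulus code.

```python
import csv
import io

# Same data as the spreadsheet above, in CSV form
CSV_TEXT = """textbox1,textbox2,Radio,CheckBox1
value1a,value2a,,
value1b,value2b,RadioButton1,on
value1c,value2c,RadioButton2,
value1d,value2d,RadioButton3,on
value1e,value2e,,
"""

# The first row supplies the headers; the rest are data rows
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))


def row_for_submission(submission_nbr):
    """Row used for the Nth form submission (0-based), wrapping around."""
    return rows[submission_nbr % len(rows)]


# 10 submissions: the 6th one starts again at the first data row
for n in range(10):
    print(n + 1, row_for_submission(n)["textbox1"])
```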

Step 2: Create test case

Take these steps to create a new test case that submits a single form:

  1. Close all web sites that regularly generate traffic, such as your favorite newspaper, Gmail, etc. That way, they won't interfere with creating the test case.

  2. Switch to Fiddler and remove all sessions in the left hand pane: click Edit | Remove | All Sessions.

  3. Run the test site you found in the download with this article. You'll find a form on the home page. You can leave the text boxes blank, but do check one of the radio buttons and the checkbox. Otherwise, the browser won't send them when you submit the form (as per the HTML standard) and StresStimulus won't know they are there.

  4. Click the Send Form button to submit the form to the server.

  5. Switch back to Fiddler. You'll see two sessions in the left hand pane: the initial request when you opened the page for the first time, and the subsequent request where you submitted the form with the Send Form button. Parameterization only works with form submissions, so it is that second request that you want to make into a test case.

  6. Click the second session in the left hand pane to select it.

  7. In the StresStimulus pane, click Test Case (at the top of the tree with features).

  8. Click the dropdown in the Set Test Case button and select Set Test Case with Selected sessions.

  9. If StresStimulus asks whether to overwrite an existing test case, confirm this. You've now created a test case with a single online form.

Step 3: Load the input file

Before you can use the values in the input file, you have to load it first.

  1. Expand Test Case, expand Parameterization and click Data Sources.

  2. Click the Add button to add the file parameterization1.csv - you'll find that file in the download with this article. You'll see the content of the file appear in the bottom pane.

Step 4: Bind input file to form fields

To associate the values in the input file to the text boxes, radio buttons and checkbox on the form, take these steps:

  1. Click Requests (right under Data Sources).

  2. You should see a grid with the fields on the form. If you don't, make sure you selected the second session in the left hand pane.

  3. In the Select Data Source column, select parameterization1 for the text boxes, radio buttons and checkbox. As you see here, you can load multiple CSV files (with the Add button in the previous step) and bind different fields to different CSV files. You might want to experiment with this later.

  4. In the Select Field column, select the headers of the columns with values for each form field.

  5. Finally, in the Databinding column, set all input fields to Sequential. This way, the first time a virtual user submits the form, the first row in the input file is used. The second time a virtual user submits a form, the second row is used, etc. If you pick Random, rows will be picked randomly. The VU_Bound option will be discussed further down.

  6. Done! You've now bound the input file to the online form and thereby created a parameterized load test. Be sure to save your work by clicking StresStimulus | Save Test.

Step 5: Run parameterized load test

To make it easier to interpret the results, let's run this first parameterized load test with just one virtual user and 10 iterations:

  1. Expand Test Configuration and click Load Pattern. Select Constant Load and set number of virtual users to 1.

  2. Also under Test Configuration, click Test Duration. Select Number of Iterations and set this to 10.

  3. Click the Start Test button to start the load test.

Step 6: Check the trace

Time to check whether StresStimulus really used the values in the input file:

  1. Go back to the test site and click the Open trace log link near the bottom of the page. The trace page will open in a new window or tab. This traces all requests received by the server.

  2. You'll see that the trace page has a line for each request sent by StresStimulus. Click the View Details links for each request.

  3. Compare the values received by the server for each form field with the values in the input file. They should be the same.

Parameterization with multiple pages

Now that we've seen how to do parameterization with a test case consisting of only a single page, let's expand this to multiple pages. To keep it simple, the following example uses only two pages, the Default page and a very simple Login page with username and password text boxes. The principle however is the same no matter how many pages you have in your test case.

Step 1: Create additional input file

The previous input file parameterization1.csv had values for the Default page. To make it easy, the download with this article includes an input file for the Login page, named parameterization2.csv. It has these values:

tbUsername  tbPassword
username-a  password-a
username-b  password-b

This input file has only 2 data rows, while the previous one had 5! That's ok. It simply means that StresStimulus goes back to the beginning of the file after every 2 iterations, while with the first input file it went back after every 5 iterations. There is no need to have the same number of rows in every input file.

Step 2: Create test case

Creating a test case with multiple pages goes along the same lines as a test case with a single page:

  1. Clear the decks - Close all web sites that regularly generate traffic. Then switch to Fiddler and remove all sessions in the left hand pane: Edit | Remove | All Sessions.

  2. Run the test site again. On the home page, check one of the radio buttons and the checkbox. Click the Send Form button to submit the form to the server.

  3. Now click the Login.aspx link to open the Login page. Click the Login button to submit the form. It's ok to leave the fields empty, because they will be filled by the virtual users with values taken from parameterization2.csv.

  4. Switch back to Fiddler. Looking at the sessions in the left hand pane, you'll see:

    1. The initial request for the Default page
    2. The second request for the Default page where you submitted the form
    3. The initial request for the Login page
    4. The second request for the Login page where you submitted the form

    Whereas in the previous example you only included the request where you submitted the form in the test case, here we'll include the initial requests as well. That makes it a bit more realistic and it's easier as well.

  5. In the StresStimulus pane, click Test Case.

  6. Click the dropdown in the Set Test Case button and select Set Test Case with All sessions. When asked, confirm that you want to overwrite an existing test case.

Step 3: Load the input files

You've already seen how this works. The only difference is that here you load two files:

  1. Expand Test Case, expand Parameterization and click Data Sources.

  2. If parameterization1.csv is no longer loaded, click the Add button to load it again. Then click Add to load parameterization2.csv.

    You can show the contents of either file in the bottom pane by selecting it in the upper pane (in the list of files).

Step 4: Bind input file to form fields

To associate the values in the input files to the form fields of both pages:

  1. Click Requests (right under Data Sources).

  2. In the left hand pane, select the second request for Default.aspx (the one where you submitted the form). This will again show the grid with input elements that you saw last time. Fill it out the same way you did last time, with values from parameterization1.csv.

    Why select the second request to Default.aspx and not the first one? During the first request no form values were sent to the server - it was just a request for the page itself. StresStimulus can only find out about form fields if they were submitted as part of the request, so it won't find any if you select the first request. The second request, however, did have the form fields (even if most were empty), so StresStimulus can figure out what they are.

  3. In the left hand pane, click the second request for the Login page. You'll see a grid with the form fields on the Login page. Enter parameterization2.csv in the Select Data Source column and its headers in the Select Field column. Also set the fields in the Databinding column.

  4. Done! You've now bound the two input files to the two online forms in your test case. Be sure to save your work by clicking StresStimulus | Save Test.

Step 5: Run parameterized load test

  1. Make sure that the current trace is empty, to make it easier to check the trace after the load test. On the test site, click the Open trace log link to open the trace page and clear the current trace.

  2. In StresStimulus, click the Start Test button to start the load test.

Step 6: Check the trace

Go back to the test site and refresh the trace page. Clicking the View Details links for each request, you should find that the Login page did get its values from parameterization2.csv.

Setting values per virtual user

When binding form fields to values from input files in the Requests page (under Test Case | Parameterization), you will have noticed that StresStimulus gives you three options in the Databinding column:

  1. Sequential - StresStimulus goes to the next row in the input file when it starts a new iteration.
  2. Random - StresStimulus picks rows at random.
  3. VU_Bound - Rows are tied to virtual users.

In this section we'll see why you would want to tie rows to particular virtual users and we'll see the VU_Bound option in action.

VU_Bound option

The VU_Bound option makes sense for form fields that are unique to each user, such as usernames and passwords. In the real world, if you have 21 users placing orders on your site, some may be ordering the same products but they all use their very own usernames and passwords. So it makes sense to tell StresStimulus to always use the same value for a given virtual user. The VU_Bound option allows you to do that.

How would this work in practice? Assume your site sells 10 products and you want to simulate a situation where 21 users are continuously buying random products. Then you could set up your parameterization like this:

  • Create an input file for the products, with one row per product, making 10 rows in total (for example, parameterization3.csv in the download).

  • Create a second input file for the users, with one row per user, making 21 rows in total (for example, parameterization4.csv in the download).

  • Load both files in StresStimulus: Test Case | Parameterization | Data Sources | Add button.

  • In forms that let users select a product, bind the form fields to the products related input file, using Sequential or Random data binding.

  • In forms that let users enter information about themselves (such as login or sign up forms), bind the form fields to the users related input file, using VU_Bound data binding.

Make sure that the number of users is not a multiple of the number of products. For example, 20 is a multiple of 10, but 21 is not. Otherwise, if you use Sequential data binding, the same virtual users may wind up buying the same products over and over again, which wouldn't be very realistic.
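
To see why the two numbers matter, the Python sketch below simulates a simplified model. It assumes the virtual users submit in strict turn order and that every submission takes the next product row, wrapping around at the end. products_per_user is a hypothetical helper; real load tests interleave requests less predictably.

```python
def products_per_user(nbr_users, nbr_products, rounds):
    """Product rows each virtual user gets under a simplified model.

    Assumption: users submit in strict turn order, and every submission
    takes the next product row (Sequential), wrapping at the end.
    """
    seen = {user: set() for user in range(nbr_users)}
    submission = 0
    for _ in range(rounds):
        for user in range(nbr_users):
            seen[user].add(submission % nbr_products)
            submission += 1
    return seen


# 20 users, 10 products: the first user is stuck on a single product
print(len(products_per_user(20, 10, rounds=10)[0]))  # 1

# 21 users, 10 products: the first user works through all 10 products
print(len(products_per_user(21, 10, rounds=10)[0]))  # 10
```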

Assigning input file rows to virtual users

How do you assign virtual users to rows in the input file with user specific values?

The answer is that StresStimulus makes the assignments - the first virtual user to spring to life gets the first row, the second virtual user the second row, etc. So you don't have to worry about this.

Creating matching user accounts

If you have virtual users accessing the site with usernames and passwords taken from an input file, it makes sense to set up accounts on your test site with the same usernames and passwords.

You could do this manually, but that gets a bit laborious if you use hundreds of virtual users. An easier way is to run a load test on your user sign up page. Use the input file with usernames and passwords that you're going to use for other load tests, extended with the additional information required by your sign up form such as name, email, etc.

In this special case, you want to submit the sign up forms once for each row in the input file. If you have 21 rows, you want the sign up form submitted to the server 21 times, each with a different row.

You could make this happen this way:

  1. Create a single test case that sends sign up information once to the server (open the sign up form in your browser, fill in bogus data and hit submit). If a new user gets logged in automatically after they sign up, you want to tell the server to log you out as part of the test case. For example, clicking a log out button results in a new request to the server, which Fiddler will record.

  2. To make sure that each row in the input file is used only once, set Number of Virtual Users to 1 on the Test Configuration | Load Pattern page.

  3. Set Number of Iterations on the Test Configuration | Test Duration page to the number of rows in the input file. That way, the one virtual user will hit the sign up page once for each row in the input file.

  4. When binding the input file to the fields in the sign up form on the Test Case | Parameterization | Requests page, use Sequential data binding. That way, the one virtual user goes to the next row in the input file for each iteration.

Authentication

If your site sits on an Intranet, it may be using Integrated Windows authentication to allow users to log in with their Windows accounts. To cater for this, StresStimulus virtual users can log into your site using NTLM, a common way of performing integrated Windows authentication (Pro versions only).

To make this happen, you can enter one or more domains, usernames and passwords in StresStimulus. The virtual users use these in round robin fashion to log into your site during load testing.
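Round robin here simply means that the list of credentials is handed out in order and starts again from the top when it runs out. A minimal Python sketch of that idea, with made-up domain and account names and a hypothetical assign_credentials helper (it models the behaviour described above, not the actual StresStimulus code):

```python
from itertools import cycle, islice


def assign_credentials(credentials, num_virtual_users):
    """Hand out credentials to virtual users in round robin order."""
    if not credentials:
        raise ValueError("at least one set of credentials is required")
    return list(islice(cycle(credentials), num_virtual_users))


if __name__ == "__main__":
    # (domain, username, password) tuples; all values are made up.
    logins = [
        ("TESTDOMAIN", "alice", "secret1"),
        ("TESTDOMAIN", "bob", "secret2"),
        ("TESTDOMAIN", "carol", "secret3"),
    ]
    for user_number, login in enumerate(assign_credentials(logins, 7), start=1):
        print("virtual user", user_number, "logs in as", login[1])
```

With 3 logins and 7 virtual users, the fourth and seventh virtual users log in with the first account again.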

If your site uses Basic authentication, you use the same feature to allow virtual users to log in (StresStimulus will pick the correct protocol). If you use forms authentication, the username and password are entered via standard HTML text boxes, so you can use Parameterization to let virtual users log into your site.

To enter a list of domains, usernames and passwords:

  1. Expand the Test Case tree and click Authentication.

  2. Enter the login domains, usernames and passwords in the grid, or click the Import button to import them from a CSV file.

  3. Don't forget to save your work: from the StresStimulus menu in the top left of the Fiddler screen, choose Save Test.

Performance Counters

IIS (the web server) and SQL Server have dozens of performance counters that show where bottlenecks may be occurring. For example, the ASP.NET/Request Execution Time counter shows the time in milliseconds it took to execute the most recent request. My book ASP.NET Site Performance Secrets shows how to use these counters in detail.

You can get StresStimulus to poll one or more counters during a load test, so you can see how the server behaves under load. It reports the values of the counters in graphical and numerical form.

To make this happen:

  1. Under Test Configuration, click Other Options.
  2. Change the sample rate, or just accept the default of once every 5 seconds.
  3. Click the Select Performance Counters button to select the counters you want to keep track of.

Conclusion

StresStimulus has a number of advanced features that can make your load tests a lot more realistic, such as warm up time, NTLM authentication and parameterization.

Book: ASP.NET Site Performance Secrets Seeing that you are interested in your site's performance, consider purchasing my book ASP.NET Site Performance Secrets. In a structured approach towards improving performance, it shows how to first identify the most important bottlenecks in your site, and then how to fix them. Very hands on, with a minimum of theory.

Load testing web sites with StresStimulus, part 1 - Getting started

by Matt Perdeck 29. August 2011 10:42

Introduction

It is a good idea to load test your web site before taking it live, to make sure it won't break or perform poorly under load. Fairly recently, a new load tester called StresStimulus was introduced. It is affordable and has some unique features that make it well suited to load testing ASP.NET web sites, which is why this article discusses it. You'll also find a (far from exhaustive) list of other popular load testers, to help you decide which load tester you want to use.

This article first lists the features of StresStimulus, and then shows how to get started with your first load test. Part 2 goes into more advanced topics such as parameterization.

Other Popular Load Testers

A simple Google search will give you lots of load testing tools besides StresStimulus. The list below shows some of the more popular load testers available.

  • WCat - Simple and fast load tester. Free, but lacks some basic features, such as support for ViewState.
  • Apache JMeter - Open source Java based load tester. Steep learning curve.
  • WAPT - Reasonably affordable load tester. Easy to use.
  • Load tester built into Visual Studio 2010 Ultimate - Excellent full featured product, powerful yet easy to use. Very expensive though.

StresStimulus Requirements

To run StresStimulus, you need:

  • Fiddler 2.3.4.4 or later (free download).
  • Windows 2000 to 2008 / XP / Vista / 7 with Microsoft .NET Framework 2.0 or later.
  • 5 megabytes of available disk space / 800 MHz processor / 256 megabytes RAM.

Versions

StresStimulus comes in these versions:

  • Free version - Allows you to do real load testing. No time limit, use as long as you want. Limited in key areas, including:
    • Reduced reporting of test results. Only summary information available. No reporting on performance counters.
    • No ability to save or load a test case. Next time you run StresStimulus, you need to recreate your test case.
    • Number of virtual users limited to 100.
    • Test duration limited to 10 minutes or 999 iterations.
    • Reduced think time options.

    The rest of this article will make it clear when a particular feature is only available in a Pro version (that is, not in the free version).

  • Pro evaluation version - All features enabled, but only suitable for very light load testing:
    • No time limit - use as long as you want.
    • Number of virtual users limited to 10.
    • Reduced think time options.
  • Pro trial version - 14 day trial with all features enabled.
  • Pro paid version - All features enabled. Maximum number of virtual users depends on your license.

Features

  • An extension of Fiddler, the popular HTTP debugging proxy.
  • Very easy to create simple tests.
  • Allows you to record a test case with your browser and then have that test case repeated by multiple virtual users.
  • Shows test results in graphs and in text based reports (restricted in free version).
  • Supports parameterization, allowing you to load form field values from input files.
  • Supports Basic and NTLM authentication, allowing simulated users to login to your site using Windows accounts (Pro versions only).
  • Simulates thousands of virtual users (capped at 10 in Pro evaluation version, 100 in free version).
  • Lets you increase the load in steps, to find the breaking point of your site.
  • Virtual users keep track of cookies and hidden fields, such as ViewState.
  • Set think times of virtual users.
  • Lets you set simulated browser (IE version, Firefox version, Chrome, etc.) and network (LAN, Dial-Up, etc.)
  • Lets you set warm up time, to allow web site caches to fill before testing starts.
  • Keeps track of performance counters (Pro versions only).
  • Able to save test cases to file (Pro versions only).

Limitations

Because of its tight integration with Fiddler, StresStimulus lacks some features available in more complex load testers:

  • No ability to mix different scenarios in the same test case (browsing, purchasing, searching, etc.)
  • You can't have multiple load generating machines controlled by a single controller. Obviously you can run StresStimulus on multiple computers at the same time, but you would have to coordinate them manually.
  • No easy way to compare test runs, for example of different versions of your site.

Installation

Take these steps to install StresStimulus on your machine:

  1. If you do not already have Fiddler installed, first download Fiddler and install it.
  2. Then download StresStimulus and install that too.
  3. Finally download this simple ASP.NET test site. This makes it easier to follow the examples in the article.

Load testing basics

Before creating your first load test, let's do a quick overview of Fiddler and StresStimulus.

One of the challenges in developing web sites is that they get used by many users at the same time. Unfortunately, when you're testing your site, there is only one user - you. Your site may be fine with just one or a few users, but break or perform badly with dozens or hundreds of users. You want to find out about any problems before you go live, not after you go live.

How do you get lots of users to hit your new site before it even has gone live? You could ask co-workers or friends to hit your site, but that requires a lot of effort. Much easier to use a program such as StresStimulus that unleashes dozens or hundreds (or thousands) of virtual users hitting the new version of your site. Additionally, you need to tell those virtual users which pages to visit. And if you have forms on your site, you need to tell them what values to fill in those forms.

Here is how Fiddler and StresStimulus allow you to make this happen:

  • When you run Fiddler, all requests you send from your browser to a web site and all responses back to your browser are recorded by Fiddler. A request might be simply asking for a web page, or sending values you entered into a form to the web site.
  • Each request/response pair is called a session. When you run Fiddler, they turn up one by one in the left hand pane in Fiddler.
  • This means you can record sessions by simply visiting pages in your site yourself with your browser, and by submitting forms to the site.
  • You can then select one or more sessions within Fiddler and tell StresStimulus to make those into a test case. This tells the virtual users what pages to request and what form values to send.
  • You'll probably want each virtual user to go through the test case more than once. So in addition to setting the number of virtual users, you can also set the number of iterations - that is, the number of times you want the test case to be repeated.
  • To make the test more realistic, you can have StresStimulus use new form values for each iteration using the parameterization feature, set think times, etc. You can see how well your site performed by looking at the test results.
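To make the relationship between test case, virtual users and iterations concrete, here is a rough Python model of what a load tester does. It is a sketch under stated assumptions, not how StresStimulus is implemented: every virtual user is a thread, each one replays the whole test case for the given number of iterations, and the hypothetical send_request callback stands in for the real HTTP traffic.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_test(send_request, test_case, virtual_users, iterations):
    """Replay test_case `iterations` times per virtual user.

    Assumption of this sketch: the iteration count applies to each
    virtual user separately. Returns (request, seconds) samples.
    """
    timings = []
    lock = threading.Lock()

    def virtual_user(user_id):
        for _ in range(iterations):
            for request in test_case:
                started = time.perf_counter()
                send_request(user_id, request)
                elapsed = time.perf_counter() - started
                with lock:
                    timings.append((request, elapsed))

    with ThreadPoolExecutor(max_workers=virtual_users) as pool:
        # list() waits for all users and re-raises any exception they hit.
        list(pool.map(virtual_user, range(virtual_users)))
    return timings


if __name__ == "__main__":
    def fake_send(user_id, request):
        time.sleep(0.001)  # stand-in for a real HTTP round trip

    results = run_load_test(fake_send, ["/", "/buy"], virtual_users=10, iterations=20)
    print(len(results), "requests sent")
```

With a test case of 2 requests, 10 virtual users and 20 iterations, this model sends 400 requests.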

Creating a simple load test

To create your first load test:

  1. First close all web sites you have open, so they don't interfere when you record your first session.
  2. Run Fiddler.
  3. To toggle between the free version and the Pro evaluation version, open the StresStimulus menu in the top left of the Fiddler window and click Use free edition or Use Pro Edition.
  4. Run the test site in the download with this article. Or run your own site. You can run the site locally, from within Visual Studio if you want. Press some buttons to generate page loads.
  5. If you now switch back to Fiddler, you'll see that the requests you generated and their responses now show up in the left hand pane.
  6. Click the Inspectors tab in the right hand pane, and then click one of the sessions in the left hand pane. This shows the headers and content of the request and its response.

  7. Select the requests to play back:
    1. Click the StresStimulus tab.
    2. In the left hand pane, click those requests you want to play back while holding down the Ctrl button.
    3. Click the dropdown arrow on the Set Test Case button and select Set Test Case with Selected sessions. Option Set Test Case with All sessions would simply select all sessions in the left hand pane.

  8. Now play back the selected requests once by clicking the Run Test Case Once button. You'll see the requests generated by StresStimulus appear in the left hand pane.

    If you have been using the test site in the download, there is a Open trace log link on the bottom of the page. Click that to see a trace of all traffic to the site, both yours and that generated by StresStimulus.

  9. In the top left of the Fiddler window, open the StresStimulus menu and click Save Test to save your test settings (Pro versions only).

Increasing the load

Playing back a test case once doesn't really qualify as a load test. Let's increase the number of times the test case is executed. While at it, we'll increase the number of virtual users as well:

  1. Expand the Test Configuration tree and click Load Pattern.
  2. Select Constant Load and set Number of Virtual Users to 10. This gives you 10 virtual users sending requests to your site.
  3. With the Test Configuration tree still expanded, click Test Duration.
  4. Set Test completion criteria to Number of Iterations and set the number of iterations to 20.
  5. Click the Start Test button to start the load test.
  6. If you're using the test site that came in the download with this article, and you still have the trace page open, you can see the requests and responses as they happen by refreshing the page.
  7. You can also see the requests generated by StresStimulus in the Fiddler window itself, but to do this you have to run a debug test instead of a load test. After the previous test has finished, click the drop down button on the Start Test button and select Debug Test. Then click the button to start the test again.

    Debug tests are just load tests except that they show generated sessions during the test. When the rest of this article refers to load tests, it refers to debug tests as well.

  8. After a load test or debug test has completed, you can get rid of all the requests in the left hand pane except for the ones making up your load test by clicking the Reset button.

Test Results

StresStimulus provides these types of test results:

Graphs while a test is running

As soon as you start a load test, StresStimulus switches to the graphs pane, showing graphs of requests per second, average response times, etc.

A second set of graphs in the same pane shows the values of performance counters, such as % Processor Time (Pro versions only). StresStimulus allows you to add performance counters to this graph.

After the load test has finished, the graphs stay in place until you run another test. To navigate back to the graphs, expand Test Results and click Graphs.

Detailed reports

After the load test has finished, more detailed reports become available. These are accessible by clicking options under Test Results:

  • Test Summary - detailed summary of how well your site performed, and details of the load test itself (duration, number of virtual users, etc.)
  • Page Details (Pro versions only) - details on how well each individual .aspx page performed, such as minimum, maximum and average response times.
  • Request Details (Pro versions only) - shows details for all requested files, rather than only the .aspx files. Not as detailed as Page Details. Shows how quickly your site serves up images, etc.
  • Iteration Details (Pro versions only) - fairly sparse summary of the iterations executed during the test.
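If you copy raw response times out of a tool and want the same kind of per page figures yourself, the arithmetic is simple. The Python sketch below groups (page, seconds) samples by page and computes the minimum, maximum and average. The summarize function and the sample data are made up for illustration and are not the StresStimulus report format.

```python
from collections import defaultdict
from statistics import mean


def summarize(timings):
    """Group (page, seconds) samples by page and report min, max and average."""
    by_page = defaultdict(list)
    for page, seconds in timings:
        by_page[page].append(seconds)
    return {
        page: {
            "min": min(samples),
            "max": max(samples),
            "avg": mean(samples),
            "count": len(samples),
        }
        for page, samples in by_page.items()
    }


if __name__ == "__main__":
    # Made-up samples: (page, response time in seconds).
    samples = [("/default.aspx", 0.12), ("/default.aspx", 0.30), ("/buy.aspx", 0.45)]
    for page, stats in summarize(samples).items():
        print(page, stats)
```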

Saving test results to a file

There are two ways to save test results to a file. Firstly, by clicking Test Results and then clicking the Create Report button (Pro versions only). This gives you an html file you can open in your browser.

Secondly, you can simply copy and paste the results to a file. The Test Summary view can be copied into a text file - just select the text with your mouse and copy and paste. The grids in the Graphs, Page Details, Request Details and Iteration Details views can be copied to a spreadsheet. If you don't have Microsoft Excel, try the spreadsheet in the free Open Office suite.

Requests generated by StresStimulus

During a load test (as opposed to a debug test), StresStimulus doesn't show the generated requests in the left hand pane for performance reasons - there could be many thousands. To see them in the left hand pane after a load test has finished, click the drop down on the Results button:

  • Errors and Timeouts - requests where things went wrong.
  • Primary Requests - only requests for .aspx files, not for images, etc.
  • All Requests - all requests, including those for images, etc.
  • User's Iteration - shows a particular iteration by a particular virtual user.

One issue here is that when you run a normal load test (as opposed to a debug test), StresStimulus doesn't store details for the generated requests beyond the fact that they actually happened. That makes sense, given the space required by possibly tens of thousands of generated requests. But it doesn't help much when you want to know the contents or headers of the requests and their responses from the server.

You can overcome this by running a Debug Test, as described under Increasing the load. This stores the details of all generated requests and their responses, so you can inspect their contents, headers, etc. on the Inspectors tab.

Conclusion

StresStimulus is a simple but capable and above all very easy to use load tester. Unlike other load testing products, it allows you to have your first load test going in a few minutes.

If you are responsible for a small to medium sized site and need to do load testing to ensure that new versions perform well after they've gone live, you should consider giving StresStimulus a try.

Part 2 of this series goes into more advanced features of StresStimulus, such as NTLM authentication and parameterization.

Package that speeds up loading of JavaScript, CSS and image files

by Matt Perdeck 20. August 2011 13:14

Introduction

Most web pages include one or more JavaScript files, CSS files and/or images (loaded from image tags or from CSS files). If your pages use ASP.NET AJAX toolkit controls, they will be loading .axd files as well. The CombineAndMinify package presented here automatically speeds up the loading of these files into your pages, and reduces the bandwidth spent on loading those files. You'll see how it does that when reading the list of features. The result can be a dramatic improvement in web site performance.

There is no need to change the code of your web site to use CombineAndMinify. You only need to add a dll and update your web.config. Step by step installation instructions are in the installation section.

Requirements

  • ASP.NET 4 or higher.
  • To compile the included source code, you need any version of Visual Studio 2010. You don't need this if you simply use the binaries.
  • IIS 6 or higher for your live site.

Features

The features below can all be switched on and off individually via the web.config file (full description). If you just install CombineAndMinify and don't do any further configuration, it only minifies and combines JavaScript and CSS files.

  • Minifies JavaScript and CSS files. Minification involves stripping superfluous white space and comments. Only JavaScript and CSS files that are loaded from the head sections of your pages are minified.

     

  • Correctly processes files that contain non-English text such as Chinese. CombineAndMinify uses the efficient YUI minifier to minify JavaScript and CSS files that are all in English (that is, contain only ASCII characters). Because the YUI minifier doesn't handle non-ASCII characters (such as Chinese), CombineAndMinify automatically uses the JSMIN algorithm for files containing such characters. JSMIN is less efficient, but handles non-ASCII characters well.

     

  • Combines JavaScript files and CSS files. Loading a single large file is often much quicker than loading a series of small files, because that saves the overhead involved in all the request and response messages.

    If a CSS file contains image urls that are relative to the folder containing the CSS file itself, those urls are fixed up by CombineAndMinify. That way, they continue to work even if CSS files from several different folders are combined.

  • Can be used with sites hosted on low cost shared hosting plans, such as GoDaddy's.

     

  • Processes .axd files as used by the ASP.NET AJAX toolkit (details).

     

  • Allows you to configure CombineAndMinify so it only kicks in in release mode. That way, you see your individual files complete with white space and comments while developing, and reap the performance improvement in your live site.

     

  • Reduces the size of the HTML generated by your .aspx pages by removing white space and comments. Note that the .aspx files themselves are not affected, only the HTML sent to the browser.

     

  • Allows you to configure cookieless domains from which to load JavaScript files, CSS files and images. This way, the browser no longer sends cookies when it requests those files, reducing wait times for the visitor.

     

  • Lets you configure multiple cookieless domains. This causes the browser to load more JavaScript, CSS and image files in parallel.

     

  • Optimizes use of the browser cache by allowing the browser to store JavaScript, CSS, image and font files for up to a year. Uses version ids in file names to ensure the browser picks up new versions of your files right away, so visitors never see outdated files (details on how this works).

     

  • Unlike similar packages, doesn't add query strings when combining files or when inserting versions. This optimizes caching by proxy servers (many proxy servers won't cache files with query strings).

     

  • Converts image file names to lower case, to make it easier for those proxy and browser caches that do case sensitive file name comparisons to find your file in their caches - so they don't request the same file again.

     

  • Preloads images immediately when the page starts loading, instead of when the browser gets round to loading the image tags - so your images appear quicker. You can give specific images priority.

     

  • Helps you detect missing files by throwing an exception when a JavaScript file, CSS file or image is missing. By default, CombineAndMinify handles missing files silently, without throwing an exception.

     

  • To reduce CPU overhead and disk accesses caused by CombineAndMinify, it caches intermediate results, such as minified files. A cache entry is removed when the underlying file is changed, so you'll never serve outdated files.
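The url fix up mentioned under combining CSS files is easy to get wrong, so here is a minimal Python sketch of the idea. It is not the package's own code (which is .NET). The function names fix_css_urls and combine_css are made up, and it assumes the relative urls are rewritten to root relative urls. A relative image url only makes sense relative to the folder of the CSS file it came from, so it has to be resolved before files from different folders are glued together.

```python
import posixpath
import re

# Matches url(...) with optional single or double quotes around the address.
URL_PATTERN = re.compile(r"""url\(\s*(['"]?)([^'")]+)\1\s*\)""")


def fix_css_urls(css_text, css_folder):
    """Rewrite relative url(...) references so they no longer depend on css_folder."""
    def replace(match):
        quote, url = match.group(1), match.group(2).strip()
        if url.startswith(("/", "http://", "https://", "data:")):
            return match.group(0)  # already absolute, leave untouched
        absolute = posixpath.normpath(posixpath.join(css_folder, url))
        return f"url({quote}{absolute}{quote})"

    return URL_PATTERN.sub(replace, css_text)


def combine_css(files):
    """files: list of (site_relative_path, css_text) pairs, in load order."""
    return "\n".join(
        fix_css_urls(text, posixpath.dirname(path)) for path, text in files
    )


if __name__ == "__main__":
    combined = combine_css([
        ("/css/theme/site.css", "a{background:url(../images/bg.png)}"),
        ("/shop/css/shop.css", "b{background:url('img/cart.png')}"),
    ])
    print(combined)
```

Here ../images/bg.png from /css/theme/site.css becomes /css/images/bg.png, so it keeps pointing at the same image wherever the combined file ends up.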

This CombineAndMinify package is just one way of improving the performance of your web site. My recently released book ASP.NET Site Performance Secrets shows how to pinpoint the biggest performance bottlenecks in your web site, using various tools and performance counters built into Windows, IIS and SQL Server. It then shows how to fix those bottlenecks. It covers all environments used by a web site - the web server, the database server, and the browser. The book is extremely hands on - the aim is to improve web site performance today, without wading through a lot of theory first.

Server compression

If you are interested in web site performance, you may be interested in this short digression into server compression.

IIS 6 and 7, and Apache as well, provide the option to gzip compress all text files (html, JavaScript, CSS, etc.) sent to the browser. All modern browsers know how to decompress those files. Compression can save you a lot of bandwidth and download time. It is not uncommon to reduce file sizes by way over 50%.
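You can get a feel for the savings without touching your server. The Python sketch below gzips a piece of made-up, repetitive HTML and reports the size before and after. The compression_saving helper is hypothetical, and the exact saving depends entirely on your own content.

```python
import gzip


def compression_saving(text):
    """Return (raw bytes, gzipped bytes, fraction saved) for a piece of text."""
    raw = text.encode("utf-8")
    compressed = gzip.compress(raw)
    return len(raw), len(compressed), 1 - len(compressed) / len(raw)


if __name__ == "__main__":
    # Made-up markup; real pages are less repetitive and compress less well.
    html = "<tr><td class='price'>10.00</td><td class='name'>Widget</td></tr>\n" * 200
    raw_size, gzip_size, saving = compression_saving(html)
    print(f"{raw_size} bytes raw, {gzip_size} bytes gzipped, {saving:.0%} saved")
```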

In IIS, compression is switched off by default for dynamic files, such as .aspx files. This is because it increases the load on the CPU. However, with the overabundance of CPU cycles on modern server hardware, switching on compression for dynamic files on your server is almost always a great idea. Also, IIS 6 and 7 allow you to set the compression level, so you can choose a level that you're comfortable with. Finally, IIS 7 can automatically switch off compression when CPU usage goes over a predetermined level (set by you), and switch it back on after CPU usage has dropped below a second level (also set by you). It even lets you cache compressed dynamic files, which makes compression extremely attractive.

Switching on basic compression on IIS 7 is easy, but getting the most out of it is a bit tricky. Switching on compression in IIS 6 is just tricky. Good places to find out more would be here (for IIS 7) and here (for IIS 6).

Or you could read chapter 10 of my book ASP.NET Site Performance Secrets where it is all spelt out in one place (believe me, this will save you a lot of time).

How it works

When a page is generated by the server, ASP.NET invokes CombineAndMinify while processing the HEAD element of the page. Depending on how you configured CombineAndMinify in web.config, it then minifies and combines CSS and JavaScript files, rewrites image urls to include version ids and cookieless domains, etc.
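As an illustration of the image url rewriting, here is a Python sketch that puts a version id in the file name (not in a query string) and picks one of several cookieless domains. Everything specific in it is an assumption made for the example: the double underscore naming scheme, deriving the version id from a content hash, choosing the domain from a checksum of the path, and the example.com domain names. The package itself may do all of these differently.

```python
import hashlib
import posixpath
import zlib


def versioned_name(path, content):
    """Insert a short content hash before the extension of the file name."""
    version = hashlib.sha256(content).hexdigest()[:8]
    stem, ext = posixpath.splitext(path)
    return f"{stem}__{version}{ext}"


def pick_domain(path, domains):
    """Map a given file to the same domain every time."""
    return domains[zlib.crc32(path.encode("utf-8")) % len(domains)]


def rewrite_image_url(path, content, domains):
    """Build a url with a cookieless domain and a version id in the file name."""
    return f"http://{pick_domain(path, domains)}{versioned_name(path, content)}"


if __name__ == "__main__":
    domains = ["static1.example.com", "static2.example.com"]  # made-up domains
    print(rewrite_image_url("/images/logo.png", b"image bytes, version 1", domains))
    print(rewrite_image_url("/images/logo.png", b"image bytes, version 2", domains))
```

The design choice in this sketch is that a changed file gets a new name, so a far future cache expiry can never cause the browser to show the old version, while an unchanged file keeps both its name and its domain and stays cached.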

The minified and/or combined CSS and JavaScript files are written to a separate folder under your web site's root. CombineAndMinify will modify the html being sent to the browser, so link and script tags referring to the original files are automatically updated to point to these generated files. Similarly, if you let CombineAndMinify insert version ids in image file names to enable far future caching, CombineAndMinify makes copies of the images with the new file names.

Note that your source code does not get changed by CombineAndMinify. It kicks in while ASP.NET processes the page and modifies the final html that is going to be sent to the browser.

The folder for the generated files will be created if it doesn't exist. By default, it is named "___generated". You can change that name with the generatedFolder attribute.

To minimise CPU overhead, CombineAndMinify uses caching to remember whether it needs to generate new files, etc. To ensure you never serve outdated content, it uses file dependencies to ensure that when any of your files are changed, any dependent generated files are updated immediately. Also, if a generated file is somehow deleted, it is automatically regenerated.
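The general idea of a cache entry that depends on a file can be sketched in a few lines of Python. This is a simplified stand-in rather than the package's mechanism: instead of being notified when a file changes, the hypothetical FileDependentCache class below compares the file's timestamp and size on every lookup and redoes the expensive work (for example minification) only when they differ.

```python
import os


class FileDependentCache:
    """Cache derived content per file; rebuild when timestamp or size changes."""

    def __init__(self, transform):
        self._transform = transform  # expensive step, e.g. a minifier
        self._entries = {}  # path -> (fingerprint, result)

    def get(self, path):
        stat = os.stat(path)
        fingerprint = (stat.st_mtime_ns, stat.st_size)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == fingerprint:
            return cached[1]  # file unchanged, reuse the earlier result
        with open(path, encoding="utf-8") as f:
            result = self._transform(f.read())
        self._entries[path] = (fingerprint, result)
        return result


if __name__ == "__main__":
    def collapse_whitespace(text):
        return " ".join(text.split())  # crude stand-in for minification

    cache = FileDependentCache(collapse_whitespace)
    with open("example.css", "w", encoding="utf-8") as f:
        f.write("body   {  color: red;  }")
    print(cache.get("example.css"))
    os.remove("example.css")
```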

If you do not want CombineAndMinify to generate files on the file system, you can tell it to keep the files in cache using the enableGeneratedFiles attribute. You then need to configure an HTTP Handler (built into CombineAndMinify) in your web.config to serve this cached content. The description of the enableGeneratedFiles attribute shows how to do this.

Installation

  1. Compile CombineAndMinify:
    1. Download the zip file with the source code, and unzip in a directory.
    2. You will find the files CombineAndMinify.dll, EcmaScript.NET.modified.dll and Yahoo.Yui.Compressor.dll in the CombineAndMinify\bin folder.
    3. If you're interested in the source code, open the CombineAndMinify.sln file in Visual Studio 2010 or later. You'll find that the sources are organized in a solution, with these elements:
      1. Project CombineAndMinify is the actual CombineAndMinify package.
      2. Web site Testsite contains a lot of (functional but rather disorganised) test cases. Ignore this unless you want to test CombineAndMinify.
  2. Update your web site:
    1. Add a reference to CombineAndMinify.dll to your web site (in Visual Studio, right click your web site, choose Add Reference)
    2. Add the custom section combineAndMinify to your web.config:
      <configuration>
          ...
          <configSections>
              ...
              <section name="combineAndMinify" 
                       type="CombineAndMinify.ConfigSection" 
                       requirePermission="false"/> 
              ...
          </configSections>
          ...
      </configuration>

     

  3. Allow CombineAndMinify to process the head sections of your pages. That way, it can replace the tags loading individual JavaScript and CSS files with tags loading combined files:

    1. Make sure that the head tags of your pages have runat="server", like so:

      <head runat="server">

      When you create a new page in Visual Studio, it creates a head section with runat="server", so you probably don't have to change anything here.

    2. Add a folder called App_Browsers to the root folder of your web site.
    3. Use your favorite text editor (such as Notepad) to create a text file in the App_Browsers folder. Call it HeadAdapter.browser. Into that file, copy and paste this code:
      <browsers>
        <browser refID="Default">
          <controlAdapters>
            <adapter controlType="System.Web.UI.HtmlControls.HtmlHead"
               adapterType="CombineAndMinify.HeadAdapter" />
          </controlAdapters>
        </browser>
      </browsers>
      This tells ASP.NET to leave processing of all HtmlHead controls (which represent the head tags) to the class CombineAndMinify.HeadAdapter (which is part of CombineAndMinify).

     

Combining .axd files

You can skip this section if you do not use the ASP.NET AJAX toolkit.

How .axd files are loaded

Before discussing how you can combine and minify .axd files with CombineAndMinify, let's recap quickly where .axd files come in when a visitor loads your pages.

When you use controls provided by the ASP.NET AJAX toolkit on a page, the toolkit ensures that a number of .axd files are loaded along with the page. These contain the resources needed by the controls:

 

  1. CSS - To load CSS definitions required by the controls you used on the page, it inserts link tags in the head of the page. These all include the file WebResource.axd and a query string to identify which CSS definitions to send to the browser.

  2. JavaScript - To load the JavaScript required by the controls on the page, it inserts script tags in the body of the page. These may load a mixture of WebResource.axd files, ScriptResource.axd files and the .aspx page itself. Again, a query string is used to identify what JavaScript to send to the browser.

  3. Images - To load any images required by the controls, the JavaScript dynamically creates image tags that load the images. The sources of these image tags are WebResource.axd files, with a query string to identify the image to send to the browser.

     

 

Instead of storing the .axd files on the file system of your server, the ASP.NET AJAX toolkit keeps them in the .dll files that you added to your site when you installed the toolkit. An HTTP Handler intercepts requests for .axd files from the browser and reads the required CSS, JavaScript or image data from the .dll files, based on the query string sent with the request.

How .axd files are combined

Here is how to combine and minify the .axd files:

 

  1. CSS - CombineAndMinify will process the .axd files with CSS definitions along with the other CSS files.

    Because these files do not physically exist on the web server's file system, CombineAndMinify retrieves their content via a request to the web server instead of reading from the server's file system. Also, when the insertVersionIdInImageUrls configuration option is used to get the browser to re-request script and CSS files that have been updated on the server, a dummy version id is used for .axd files - again because they are not physical files. Apart from that, .axd files are first class citizens that benefit from minification, far future cache expiry and the other features provided by CombineAndMinify.

    This means that CombineAndMinify may combine .axd files with other CSS files. Also, because CombineAndMinify generates its own link tags which all use .js files, you will find no more link tags with .axd files in the head of your pages. The CSS required for the controls will still be loaded, but it will be requested by the browser as .js files rather than .axd files.

    To prevent CombineAndMinify from processing .axd files, set the configuration option enableAxdProcessing to false (it is true by default).

  2. JavaScript - From ASP.NET 3.5 onwards, you can combine most .axd files with JavaScript by simply replacing the ScriptManager control with the ToolkitScriptManager control:

     

    <%@ Page ... %>
    <%@ Register Assembly="AjaxControlToolkit" Namespace="AjaxControlToolkit" 
                    TagPrefix="ajaxToolkit" %>
    ...
    <head runat="server">
    ...
    </head>
    <body>
        ...
        
        <%-- Use ToolkitScriptManager instead of ScriptManager --%>    
        <ajaxToolkit:ToolkitScriptManager ID="scriptManager" runat="server" 
                                             ScriptMode="Release" CombineScripts="true"  />

     

    If you use .NET 4, you'll find that this minifies the JavaScript as well. Note that the ToolkitScriptManager control only combines the .axd files with JavaScript, not those with CSS definitions. It is the CombineAndMinify package that takes care of the .axd files with CSS definitions.

    More information about the ToolkitScriptManager control is at:

     

     

  3. Images - To reduce the number of requests for images, the JavaScript used to implement the controls would need to be modified, for example to use CSS sprites. Hopefully, Microsoft will do this one day in the not too distant future.

Additional configuration when using a shared hosting plan

If you use a low cost shared hosting plan for your site, you may find that controls from the ASP.NET AJAX toolkit no longer work well when you use CombineAndMinify.

This is because when CombineAndMinify retrieves .axd files to combine them, it does so by sending web requests to the web server - it has to, because .axd files are not actually stored as files in the file system (they are packed in the toolkit dll files). However, many shared hosting plans run sites in trust level Medium, preventing CombineAndMinify from sending those requests.

You can solve this by adding the following line to the <system.web> section of your web.config (replace yourdomain\.com with your own domain):

<system.web>
    <trust level="Medium" originUrl="http://yourdomain\.com/.*" />
</system.web>

The value of originUrl is a regular expression. The backslash before the .com is not a typo - it is part of the regular expression. The .* at the end matches all files in your site.

This will only make your ASP.NET AJAX toolkit controls and CombineAndMinify work well together when your site is live though. In your development environment, your site may be known by something like localhost:32100, which doesn't match the originUrl.

This is normally not an issue, because by default CombineAndMinify is only active in release mode when you have debug="false" in your web.config (see configuration setting active).

If you decide to activate CombineAndMinify in your development environment - with debug="true" in your web.config and active set to Always - you need to set originUrl to a regular expression that matches your site's domain in your development environment:

<system.web>
    <trust level="Medium" originUrl="http://localhost:\d{1,5}/.*" />
</system.web>

This matches domain localhost with a port that has 1 to 5 digits.
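
If you want to try out an originUrl pattern before putting it in web.config, you can run it against a few urls. The sketch below uses JavaScript's RegExp as a stand-in for the .NET regular expression engine, which should treat these two simple patterns the same way. The helper name matchesOrigin is made up for this example, and anchoring the pattern with ^ and $ is my assumption, not something taken from the ASP.NET documentation.

```javascript
// Hypothetical helper: tests whether a url matches an originUrl pattern.
// Anchoring with ^ and $ is an assumption made here so that the whole url
// has to match the pattern.
function matchesOrigin(pattern, url) {
    return new RegExp('^' + pattern + '$').test(url);
}

// The two patterns from the web.config examples above.
var livePattern = 'http://yourdomain\\.com/.*';
var devPattern = 'http://localhost:\\d{1,5}/.*';

console.log(matchesOrigin(livePattern, 'http://yourdomain.com/WebResource.axd?d=abc')); // true
console.log(matchesOrigin(livePattern, 'http://localhost:32100/page.aspx'));            // false
console.log(matchesOrigin(devPattern, 'http://localhost:32100/page.aspx'));             // true
console.log(matchesOrigin(devPattern, 'http://localhost:123456/page.aspx'));            // false
```

The last call fails because the port has 6 digits, which \d{1,5} does not allow.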

Using shared hosting

From version 1.6, CombineAndMinify can be used with hosting companies that run your site with trust level Medium, such as GoDaddy. As a result, you should be able to use it with any hosting company that supports ASP.NET.

What is trust level Medium? All ASP.NET sites are given a trust level. If your site has trust level Full - such as in your development environment - it can basically do anything it likes on the web server. Because this may impact other sites on the same web server, many hosting companies give your site trust level Medium if you use one of their shared hosting plans (where your site shares a server with other sites).

This means that your C# or VB.Net code can't do all sorts of things that in Microsoft's opinion may impact on other sites on the same server. As a result, you may find that code that runs fine in your local development environment will no longer work when you try to take it live. Unless you develop with trust level Medium in mind from the start, you could spend a while making it work in the shared hosting environment.

To prevent these problems, if you are developing a site that will use a shared hosting plan, you should give your site trust level Medium even in your development environment. That way, you catch any issues early. You can achieve this by including a <trust> tag in the <system.web> section of your web.config file:

<system.web>
    <trust level="Medium" />
</system.web>

Configuration

By default, CombineAndMinify minifies and combines JavaScript and CSS files. To use the other features, or to switch off minification or combining of JavaScript or CSS files, add a combineAndMinify element to your web.config file, like so:

<configuration>
    ...
    <combineAndMinify ... >

    </combineAndMinify>
    ...
</configuration>

The combineAndMinify element supports these attributes and child elements:

 

active

Determines when CombineAndMinify is active. When it is not active, it doesn't affect your site at all and none of the other attributes or child elements listed in this section have any effect.

Value              Description
Never              Package is never active, irrespective of debug mode
Always (default)   Package is always active, irrespective of debug mode
ReleaseModeOnly    Package is only active in release mode
DebugModeOnly      Package is only active in debug mode

Example

<configuration>
    ...
    <combineAndMinify active="ReleaseModeOnly" >

    </combineAndMinify>
    ...
</configuration>

Whether your site is in debug or release mode depends on the debug attribute of the compilation element in your web.config file. If that attribute is set to false, your site is in release mode (as it should be when live). When it is set to true, it is in debug mode (as it should be in development). It looks like this in web.config:

<configuration>
    ...
    <system.web>
        <compilation debug="true">
            ...
        </compilation>
        ...
    </system.web>
    ...
</configuration>

You may find it easier to debug your site with CombineAndMinify deactivated - minified JavaScript files are not easy to read. To ensure that CombineAndMinify is active only in Release mode, set active to ReleaseModeOnly, as shown in the example above.

Note that the active attribute acts as a master switch for the whole CombineAndMinify package. It's like the ignition in a car - if you don't turn on the ignition (or at least the battery), pressing any other buttons on the dashboard won't do anything.
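
To make the master switch idea concrete, here is a small JavaScript sketch of how a mode-valued setting such as active could be evaluated against the debug flag. The function names isEnabled and featureApplies are hypothetical and only model the behaviour described in this section. They are not taken from the CombineAndMinify source.

```javascript
// Hypothetical helper: decides whether a mode-valued setting is "on".
// setting is one of 'Never', 'Always', 'ReleaseModeOnly', 'DebugModeOnly'.
// debug mirrors the debug attribute of the compilation element.
function isEnabled(setting, debug) {
    switch (setting) {
        case 'Never': return false;
        case 'Always': return true;
        case 'ReleaseModeOnly': return !debug;
        case 'DebugModeOnly': return debug;
        default: throw new Error('Unknown setting: ' + setting);
    }
}

// The master switch: a feature only applies when the package itself is active.
function featureApplies(active, featureSetting, debug) {
    return isEnabled(active, debug) && isEnabled(featureSetting, debug);
}

console.log(featureApplies('ReleaseModeOnly', 'Always', true));  // false
console.log(featureApplies('ReleaseModeOnly', 'Always', false)); // true
console.log(featureApplies('Always', 'DebugModeOnly', true));    // true
```

The first call shows the ignition effect: the feature itself is set to Always, but with the package inactive in debug mode nothing happens.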

combineCSSFiles

Determines whether CombineAndMinify combines CSS files.

Value                Description
None                 CSS files are never combined
PerGroup (default)   CSS files are combined per group. See explanation below.
All                  All CSS files are combined into a single CSS file.

Example

<configuration>
    ...
    <combineAndMinify combineCSSFiles="All" >

    </combineAndMinify>
    ...
</configuration>

To see what is meant by "per group", have a look at this example:

<link rel="Stylesheet" type="text/css" href="/css/site1.css" />
<link rel="Stylesheet" type="text/css" href="/css/site2.css" />

<script type="text/javascript" src="js/script1.js"></script> 

<link rel="Stylesheet" type="text/css" href="/css/site3.css" />
<link rel="Stylesheet" type="text/css" href="/css/site4.css" />

By default, CombineAndMinify tries to ensure that the order of the JavaScript and CSS definitions doesn't change when JavaScript and CSS files are combined. It does this by grouping CSS files whose tags directly follow each other. In this case, there are 2 groups - site1.css and site2.css, and site3.css and site4.css. This causes the browser to load the CSS and JavaScript definitions in exactly the same order as when the files were not combined:

 

  1. Combined file with all CSS definitions in site1.css and site2.css
  2. script1.js
  3. Combined file with all CSS definitions in site3.css and site4.css

 

You get this behaviour when you set combineCSSFiles to PerGroup (which is the default). CombineAndMinify supports grouping for JavaScript files as well, which is controlled by the combineJavaScriptFiles attribute.
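
The grouping rule can be sketched in a few lines of JavaScript. This is only a model of the behaviour described above, not the package's own code: the tag objects and the function name groupConsecutive are made up for the example.

```javascript
// Hypothetical model: each tag is { type: 'css' | 'js', url: string }.
// Consecutive tags of the same type form one group, so the relative order
// of CSS and JavaScript is preserved after combining.
function groupConsecutive(tags) {
    var groups = [];
    tags.forEach(function (tag) {
        var last = groups[groups.length - 1];
        if (last && last.type === tag.type) {
            last.urls.push(tag.url);   // same type as the previous tag: extend the group
        } else {
            groups.push({ type: tag.type, urls: [tag.url] });   // type changed: start a new group
        }
    });
    return groups;
}

// The tags from the example above, in page order.
var tags = [
    { type: 'css', url: '/css/site1.css' },
    { type: 'css', url: '/css/site2.css' },
    { type: 'js', url: 'js/script1.js' },
    { type: 'css', url: '/css/site3.css' },
    { type: 'css', url: '/css/site4.css' }
];

groupConsecutive(tags).forEach(function (group) {
    console.log(group.type + ': ' + group.urls.join(' + '));
});
```

This gives three groups: site1.css with site2.css, then script1.js on its own, then site3.css with site4.css.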

Combining all CSS files into one file

If you set combineCSSFiles to All, all CSS files get combined into a single file. The link tag to load that combined CSS file gets placed where the link tag of the first CSS file used to be. In our example, that causes this load order:

  1. Combined file with all CSS definitions in site1.css, site2.css, site3.css and site4.css
  2. script1.js

As you see, the CSS definitions in site3.css and site4.css now get loaded before script1.js, instead of after script1.js. As a result, the number of CSS files that need to be loaded is reduced from two to one (as compared with grouping), but you also change the order in which CSS and JavaScript definitions are loaded. It depends on your site whether that is an issue or not.

Loading CSS files from another site

If you decide to set combineCSSFiles to All, be sure that your site doesn't load CSS files from another web site. This is uncommon, but if yours is one of the rare sites that does this, consider this example:

<link rel="Stylesheet" type="text/css" href="/css/site1.css" />
<link rel="Stylesheet" type="text/css" href="/css/site2.css" />

<link rel="Stylesheet" type="text/css" href="http://anothersite.com/css/other.css" />

<link rel="Stylesheet" type="text/css" href="/css/site3.css" />
<link rel="Stylesheet" type="text/css" href="/css/site4.css" />

CombineAndMinify never combines CSS files from other sites - not with CSS files from your site, not with CSS files from the other site. As a result, if you change combineCSSFiles to All, CombineAndMinify will combine all CSS files loaded from your site (but not those from the other site), causing the following load order:

  1. Combined file with all CSS definitions in site1.css, site2.css, site3.css and site4.css
  2. http://anothersite.com/css/other.css

The definitions in other.css now come after those in site3.css and site4.css, meaning they may take precedence - which could break your CSS.

combineJavaScriptFiles

Determines whether CombineAndMinify combines JavaScript files.

Value                Description
None                 JavaScript files are never combined
PerGroup (default)   JavaScript files are combined per group. See explanation below.
All                  All JavaScript files are combined into a single JavaScript file.

Example

<configuration>
    ...
    <combineAndMinify combineJavaScriptFiles="All" >

    </combineAndMinify>
    ...
</configuration>

As you saw in the description of combineCSSFiles, CombineAndMinify groups JavaScript files in the same way as CSS files. Similarly, if you set combineJavaScriptFiles to All, all JavaScript files that are loaded from your site get combined into a single JavaScript file.

However, there is one major difference with combining CSS files: When you combine all JavaScript files into a single file (combineJavaScriptFiles is All), the script tag for the combined file winds up at the location where the last original script tag used to be. This is in contrast to a combined CSS file, which winds up at the location of the first original CSS tag.

This makes life easier if you load JavaScript libraries from external sites. If you use a popular library such as jQuery, you can load it from the free Google Content Delivery Network (CDN) or from the free Microsoft CDN - which saves you bandwidth and download time (details about those CDNs are in chapter 13 of my book).

Take this example:

<script type="text/javascript" src="/js/script1.js" ></script>
<script type="text/javascript" src="/js/script2.js" ></script>

<!-- load jQuery from free Google CDN -->
<script type="text/javascript" 
           src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"></script>

<script type="text/javascript" src="/js/script3.js" ></script>
<script type="text/javascript" src="/js/script4.js" ></script>

script3.js and script4.js may well be dependent on jquery.min.js. If combineJavaScriptFiles is set to PerGroup, you wind up with this load order:

  1. Combined file with all JavaScript definitions in script1.js and script2.js
  2. jquery.min.js
  3. Combined file with all JavaScript definitions in script3.js and script4.js

So all definitions load in the same order as before the files were combined, as you'd expect. Now when you set combineJavaScriptFiles to All, you wind up with this load order:

  1. jquery.min.js
  2. Combined file with all JavaScript definitions in script1.js, script2.js, script3.js and script4.js

This would probably still work well, because the definitions in script3.js and script4.js still get loaded after those in jquery.min.js. script1.js and script2.js now get loaded after jquery.min.js, but then jquery.min.js wouldn't have been dependent on those files anyway.
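
Here is a rough JavaScript model of the All setting, showing the difference between placing the combined file at the first tag (as with CSS) and at the last tag (as with JavaScript). The tag objects, the external flag and the function combineAll are hypothetical and only illustrate the load orders described above.

```javascript
// Hypothetical model: each tag is { type: 'css' | 'js', url: string, external: boolean }.
// All local tags of the given type collapse into one combined entry, placed
// where the first ('first') or the last ('last') of those tags used to be.
// Files from other sites are never combined and stay where they are.
function combineAll(tags, type, placeAt) {
    var isLocal = function (tag) { return tag.type === type && !tag.external; };
    var indexes = [];
    tags.forEach(function (tag, i) { if (isLocal(tag)) { indexes.push(i); } });
    if (indexes.length === 0) { return tags.map(function (t) { return t.url; }); }

    var anchor = placeAt === 'first' ? indexes[0] : indexes[indexes.length - 1];
    var combined = 'combined(' + indexes.map(function (i) { return tags[i].url; }).join(', ') + ')';

    var order = [];
    tags.forEach(function (tag, i) {
        if (i === anchor) {
            order.push(combined);      // the combined file takes this tag's place
        } else if (!isLocal(tag)) {
            order.push(tag.url);       // untouched tag keeps its position
        }
    });
    return order;
}

var scripts = [
    { type: 'js', url: '/js/script1.js', external: false },
    { type: 'js', url: '/js/script2.js', external: false },
    { type: 'js', url: 'http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js', external: true },
    { type: 'js', url: '/js/script3.js', external: false },
    { type: 'js', url: '/js/script4.js', external: false }
];

// JavaScript rule: combined file lands at the last original script tag, after jQuery.
console.log(combineAll(scripts, 'js', 'last'));
// CSS-style rule, for comparison: combined file would land before jQuery.
console.log(combineAll(scripts, 'js', 'first'));
```

With 'first', script3.js and script4.js would run before jQuery is loaded, which is exactly the problem that placing the combined script at the last tag avoids.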

minifyCSS

Determines whether CombineAndMinify minifies CSS files.

Value            Description
true (default)   CSS files get minified
false            CSS files do not get minified

Example

<configuration>
    ...
    <combineAndMinify minifyCSS="false" >

    </combineAndMinify>
    ...
</configuration>

Minifying a file means removing redundant white space and comments. This not only reduces your bandwidth usage and download times, but also makes it harder for outsiders to reverse engineer your web site. It also encourages developers to add comments to their CSS files (and especially their JavaScript files), now that those comments do not travel over the wire to the browser.

It is normally safe to minify files, so this feature is enabled by default. However, just in case removing white space or comments breaks your file, the option exists to switch it off.

Remember that CombineAndMinify doesn't minify your source files. It reads your source files, and minifies their content before sending them to the browser. To save CPU cycles, it caches the minified versions, using file dependencies to ensure cached versions are removed the moment you change the underlying files.
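
To give an idea of what removing redundant white space and comments means for a CSS file, here is a deliberately naive minifier in JavaScript. It is not the minifier that CombineAndMinify uses. The function name minifyCss is made up, and a few regular expressions like these will break on less ordinary CSS.

```javascript
// Naive CSS minifier, for illustration only. It does not handle strings or
// url(...) values that contain comment markers or significant spaces, nor
// selectors where white space before a colon matters.
function minifyCss(css) {
    return css
        .replace(/\/\*[\s\S]*?\*\//g, '')     // remove /* ... */ comments
        .replace(/\s+/g, ' ')                 // collapse runs of white space
        .replace(/\s*([{};:,])\s*/g, '$1')    // drop spaces around punctuation
        .trim();
}

var source = '/* main heading */\nh1 {\n    color: red;\n    margin: 0 auto;\n}\n';
console.log(minifyCss(source)); // h1{color:red;margin:0 auto;}
```

Note that the space in "0 auto" survives, because it separates two values and is therefore not redundant.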

minifyJavaScript

Determines whether CombineAndMinify minifies JavaScript files.

Value            Description
true (default)   JavaScript files get minified
false            JavaScript files do not get minified

Example

<configuration>
    ...
    <combineAndMinify minifyJavaScript="false" >

    </combineAndMinify>
    ...
</configuration>

cookielessDomains

To have your JavaScript, CSS and image files loaded from one or more cookieless domains, specify those domains using a cookielessDomains child element.

Example

<configuration>
    ...
    <combineAndMinify ... >
        <cookielessDomains>
            <add domain="http://static1.mydomain.com"/>
            <add domain="http://static2.mydomain.com/"/>
        </cookielessDomains>
    </combineAndMinify>
    ...
</configuration>

Cookieless domains help you boost web site performance in two ways. They let you cut down on the overhead created by cookies. And they let you get the browser to load more JavaScript, CSS and image files in parallel. Let's look at the cookie overhead first, and then the parallel loading.

Cookie overhead

If your site sets a cookie on the browser, then each time the browser sends a request to your site, that request contains the cookie. The issue is that the cookie is not only sent when requesting an .aspx file, but also when requesting static files, such as JavaScript files, CSS files and images. In most cases, this is a waste of bandwidth (the exception would be if you have a handler that processes for example JavaScript files and that uses the cookie).

However, the browser won't send the cookie to another domain or subdomain. So if your page is at http://www.mydomain.com/page.aspx (using subdomain www), and you put your images and other static files on http://static1.mydomain.com (using subdomain static1), then the browser won't send cookies when requesting static files.

As an aside, if your site uses cookies (or ASP.NET sessions, which use cookies), you should never allow visitors to access your pages without a subdomain. That is, don't allow them to access http://mydomain.com/page.aspx. Otherwise, if a visitor first accesses http://www.mydomain.com/page.aspx (using subdomain www), sets a cookie, and then comes back via http://mydomain.com/page.aspx (no subdomain), the browser won't send the cookie! IIS 7 makes it very easy to redirect requests to http://mydomain.com to http://www.mydomain.com using an entry in web.config. See Microsoft's iis.net, or the image control adapter included in chapter 12 of my book ASP.NET Site Performance Secrets.

Parallel loading

When a browser loads a page, it loads the static files (images, JavaScript files, CSS files) in parallel to reduce the visitor's wait time. However, the browser limits the number of files that are loaded in parallel. Modern browsers (IE8 and later, Firefox, Chrome, Safari) have a limit of about 6, while older browsers (such as IE6 and IE7) have a limit of 2.

However, this limit is per (sub)domain. It isn't per IP address. This means that if you spread your static files over for example 2 cookieless domains, you allow the browser to load up to two times more files in parallel.

Using cookieless domains on your site

If you add a cookielessDomains element with one or more domains to the combineAndMinify element, CombineAndMinify adds those domains to the urls of all static files in your site. This includes images referenced from CSS files.

This means that if you use:

<configuration>
    ...
    <combineAndMinify ... >
        <cookielessDomains>
            <add domain="http://static1.mydomain.com"/>
        </cookielessDomains>
    </combineAndMinify>
    ...
</configuration>

Then for example

<img src="images/ball3.png" height="10" width="10" />

is replaced by

<img src="http://static1.mydomain.com/images/ball3.png" height="10" width="10" />

If you define multiple domains, such as:

<configuration>
    ...
    <combineAndMinify ... >
        <cookielessDomains>
            <add domain="http://static1.mydomain.com"/>
            <add domain="http://static2.mydomain.com"/>
        </cookielessDomains>
    </combineAndMinify>
    ...
</configuration>

Then CombineAndMinify attempts to spread the files over the available domains (note that you can add more than 2 domains if you want). This to get the browser to load more files in parallel. So if you have these image tags:

<img src="images/ball3.png" />
<img src="images/woodentoy4.png" />

You would wind up with:

<img src="http://static2.mydomain.com/images/ball3.png" />
<img src="http://static1.mydomain.com/images/woodentoy4.png" />

Log into your DNS name server to create the subdomains. If your site is hosted by an external hosting company, their control panel will probably let you create subdomains - if in doubt, ask them. Create the static1, static2, etc. subdomains (you can use any subdomain names you like). Make sure they point to the same IP address as your www subdomain. This way, you don't have to physically move your static files. Note that every subdomain that is not your www subdomain acts as a "cookieless" subdomain - it isn't like there are special "cookieless" subdomains as such.

Contrary to what you may think, if you have say 2 cookieless domains, CombineAndMinify won't use one domain for half the static files and the other domain for the other half. This is because it needs to make sure that on every page, a given static file is always given the same domain. If images/ball3.png were turned into http://static1.mydomain.com/images/ball3.png on one page, but into http://static2.mydomain.com/images/ball3.png on a second page, then the browser wouldn't find ball3.png in its cache when it hits the second page, even if it stored ball3.png when it accessed the first page. Because of the different domains, it would regard the two urls as different, even though they actually point at the same resource.

Because of this requirement, CombineAndMinify uses the hash code of the file name to work out which domain to use. So if there are two domains, then if the hash is even it uses the first domain, and if it is odd it uses the second domain. Because it is unlikely that exactly 50% of file names have an even hash code, you are unlikely to get a perfect distribution of the static files over the available domains.
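
The idea of picking a domain from a hash can be sketched as follows. The hash function here is a simple one I made up for the example. It is not the hash code that CombineAndMinify calculates, so the domain chosen for a given file will differ from what the package produces. The names hashCode, pickDomain and toCookielessUrl are hypothetical as well.

```javascript
// Hypothetical string hash (not the hash code the package uses).
// Any deterministic hash works: the same file name always gives the same value.
function hashCode(text) {
    var hash = 0;
    for (var i = 0; i < text.length; i++) {
        hash = (hash * 31 + text.charCodeAt(i)) >>> 0;   // keep it an unsigned 32-bit integer
    }
    return hash;
}

// The remainder after dividing by the number of domains selects the domain.
// With two domains: even hash gives the first domain, odd hash the second.
function pickDomain(domains, fileName) {
    return domains[hashCode(fileName) % domains.length];
}

function toCookielessUrl(domains, path) {
    var domain = pickDomain(domains, path).replace(/\/$/, '');   // tolerate a trailing slash
    return domain + '/' + path.replace(/^\//, '');
}

var domains = ['http://static1.mydomain.com', 'http://static2.mydomain.com/'];
console.log(toCookielessUrl(domains, 'images/ball3.png'));
console.log(toCookielessUrl(domains, 'images/woodentoy4.png'));
```

Because the domain depends only on the file name, ball3.png gets the same url on every page, which is what keeps the browser cache effective.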

enableCookielessDomains

Determines whether cookieless domains are used.

Value              Description
Never              Cookieless domains are never used.
Always (default)   Cookieless domains are always used, provided that 1) CombineAndMinify is active, and 2) you have defined a cookielessDomains element with cookieless domains.
ReleaseModeOnly    Cookieless domains are only used in release mode.
DebugModeOnly      Cookieless domains are only used in debug mode.

Example

<configuration>
    ...
    <combineAndMinify active="Always" enableCookielessDomains="ReleaseModeOnly" >
        <cookielessDomains>
            <add domain="http://static1.mydomain.com"/>
            <add domain="http://static2.mydomain.com"/>
        </cookielessDomains>
    </combineAndMinify>
    ...
</configuration>

This option is really only useful if you decide to activate CombineAndMinify in your development environment. In that case, you may decide to only use cookieless domains in release mode, while using all the other features of CombineAndMinify in both release and debug mode.

The reason for this is that if you have new images in your development environment that are not yet on your live site, then they won't show up in your development environment if you use the cookieless domains - which point to your live site.

If you want to take that route, set enableCookielessDomains to ReleaseModeOnly.

The default value for enableCookielessDomains is Always. However, keep in mind that for CombineAndMinify to use cookieless domains, it has to be active. And by default, it is only active in release mode.

preloadAllImages

Determines whether CombineAndMinify inserts code that preloads all images when the page starts loading.

Value             Description
true              All images are preloaded
false (default)   No images are preloaded (except for those specified using child element prioritizedImages, described further below)

Example

<configuration>
    ...
    <combineAndMinify preloadAllImages="true" >

    </combineAndMinify>
    ...
</configuration>

Normally, the browser only starts loading an image after it has encountered its image tag in the HTML or in a CSS file. If the image tag is towards the end of a big page, or if it takes a while to load the CSS file, it can take a while before image loading starts.

To get the browser to start loading all images immediately when the page itself starts loading, set the preloadAllImages attribute to true. CombineAndMinify then generates JavaScript at the start of the page head to load each image into the browser cache, along these lines:

<script type="text/javascript">
var img0=new Image();img0.src='/images/ball3.png';
var img1=new Image();img1.src='/css/chemistry.png';
var img2=new Image();img2.src='/images/toy4.png';
...
</script>

Now when the browser encounters an image tag, the image is already in browser cache, so the browser can show the image right away.
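
If you are curious how such a preload script could be generated, here is a minimal JavaScript sketch. The function buildPreloadScript is hypothetical and simply reproduces the shape of the script shown above. It assumes the urls contain no single quotes.

```javascript
// Hypothetical generator for a preload script like the one shown above.
// Assumes the urls contain no single quotes, so no escaping is done.
function buildPreloadScript(urls) {
    var lines = urls.map(function (url, i) {
        // One Image object per url; setting src makes the browser fetch the file.
        return 'var img' + i + '=new Image();img' + i + ".src='" + url + "';";
    });
    return '<script type="text/javascript">\n' + lines.join('\n') + '\n</script>';
}

console.log(buildPreloadScript([
    '/images/ball3.png',
    '/css/chemistry.png',
    '/images/toy4.png'
]));
```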

prioritizedImages

Allows you to prioritize certain images for preloading.

Example

<combineAndMinify ... >
    <prioritizedImages>
        <add url="/images/salesbutton.png"/>
        <add url="/images/logo.png"/>
    </prioritizedImages>
</combineAndMinify>

If you have many images on your pages, you may want to prioritize certain images. For example, if your "order now" button is an image, you want that image in front of your visitors as soon as possible.

You can use prioritizedImages without setting preloadAllImages to true. Here is how these two attributes interact:

prioritizedImages      preloadAllImages   Images preloaded
Empty or not present   true               All the images referred to from CSS files and all the images on the page are preloaded. They are loaded in the order in which their tags appear in the CSS or in the HTML. Images referred to from CSS are preloaded before images on the page itself.
Empty or not present   false              None
One or more urls       true               All urls listed in prioritizedImages are preloaded first, then all the other images.
One or more urls       false              Only the urls listed in prioritizedImages are preloaded

You can list any image urls in prioritizedImages, not just urls of images that are used on the page. If you know which page visitors are likely to go to next, you could use this to preload the images used on that next page.
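
The interaction between the two settings can be modelled with a short JavaScript function. This is a sketch of the table above, not the package's implementation. The function name imagesToPreload and its parameters are made up, and dropping duplicate urls is my own assumption.

```javascript
// Hypothetical model of the table above.
// prioritized: urls from prioritizedImages (may be empty).
// preloadAll: mirrors the preloadAllImages attribute.
// cssImages / pageImages: urls in the order in which their tags appear.
function imagesToPreload(prioritized, preloadAll, cssImages, pageImages) {
    var candidates = prioritized.slice();   // prioritized images always go first
    if (preloadAll) {
        // CSS images come before the images on the page itself.
        candidates = candidates.concat(cssImages, pageImages);
    }
    // Assumption: keep only the first occurrence of each url,
    // so nothing is preloaded twice.
    return candidates.filter(function (url, i) {
        return candidates.indexOf(url) === i;
    });
}

var cssImages = ['/css/chemistry.png'];
var pageImages = ['/images/logo.png', '/images/ball3.png'];
var prioritized = ['/images/salesbutton.png', '/images/logo.png'];

console.log(imagesToPreload([], true, cssImages, pageImages));
console.log(imagesToPreload([], false, cssImages, pageImages));
console.log(imagesToPreload(prioritized, true, cssImages, pageImages));
console.log(imagesToPreload(prioritized, false, cssImages, pageImages));
```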

makeImageUrlsLowercase

Determines whether CombineAndMinify makes all image urls lowercase.

Value             Description
true              All image urls are converted to lowercase
false (default)   Image url casing is left as it is

Example

<configuration>
    ...
    <combineAndMinify makeImageUrlsLowercase="true" >

    </combineAndMinify>
    ...
</configuration>

You may be using inconsistent casing in your web pages to refer to the same image. For example:

<img src="/images/woodentoy4.png" height="10" width="10" />
...
<img src="/images/WoodenToy4.png" height="10" width="10" />

Assume a browser or proxy loads woodentoy4.png and stores it in its cache. When it then needs to load WoodenToy4.png, it may not recognize it is the same as the woodentoy4.png that it already has in cache, and send a superfluous request for WoodenToy4.png.

To prevent this, set makeImageUrlsLowercase to true. This way, all image urls in the HTML and CSS sent to the browser will be lowercase, so there is no inconsistent casing. Note that CombineAndMinify doesn't change your source files. Instead, it changes the HTML that gets generated from your source files and sent to the browser.
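
As an illustration of the transformation, this naive JavaScript sketch lowercases the src attribute of image tags in a piece of HTML. It is not how CombineAndMinify does it. The function lowercaseImageUrls is hypothetical, and a regular expression like this only copes with double-quoted src attributes.

```javascript
// Naive sketch: lowercases the src attribute of img tags in an HTML string.
// A regular expression is not a full HTML parser. This one only handles
// double-quoted src attributes and leaves everything else untouched.
function lowercaseImageUrls(html) {
    return html.replace(/(<img\b[^>]*?\bsrc=")([^"]*)(")/gi, function (match, before, url, after) {
        return before + url.toLowerCase() + after;
    });
}

console.log(lowercaseImageUrls('<img src="/images/WoodenToy4.png" height="10" width="10" />'));
// <img src="/images/woodentoy4.png" height="10" width="10" />
```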

insertVersionIdInImageUrls

Determines whether CombineAndMinify inserts version ids in image file names.

Value             Description
true              Version ids are inserted in image file names.
false (default)   No version ids are inserted in image file names

Example

<configuration>
    ...
    <combineAndMinify insertVersionIdInImageUrls="true" >

    </combineAndMinify>
    ...
</configuration>

When a browser receives an image, it stores it in its browser cache. That way, if it needs to load the image again, it may still be in cache, so there is no need to send a request to the server. The result is less waiting for the visitor and less bandwidth used by your server.

One issue here is how long the browser should keep the image in its cache. Too long, and you may have visitors looking at outdated images. Too short, and the browser sends more requests for the image than necessary.

By setting insertVersionIdInImageUrls to true, you get the best of both worlds:

  • It causes CombineAndMinify to insert a version id into the image file name as used in image tags (both in the page HTML and in the CSS). CombineAndMinify calculates that version id from the last update time of the image file - so if you update an image, the version id changes. That way, when you update an image, the browser immediately picks up the new image. It won't pick up the image it has in cache, because that has a file name with the old version id.
  • Because of this, you can now tell the browser to cache images for up to a year - the maximum you can ask for according to the HTTP specification - without ever presenting outdated images to the visitor.

If you use IIS 7, you can tell the web server to cache all static files - images, CSS files, JavaScript files, etc. - for a year by adding a staticContent element to the system.webServer element in your web.config:

<configuration>
    ...
    <system.webServer>
        ...
        <staticContent>
            <clientCache cacheControlCustom="public"
                cacheControlMode="UseMaxAge"
                cacheControlMaxAge="365.00:00:00"/>
        </staticContent>
        ...
    </system.webServer>
    ...
</configuration>

Doing this for IIS 6 is a bit more complicated. Because most people will be using IIS 7 by now, I'll just refer to chapter 12 of my book ASP.NET Site Performance Secrets for instructions on how to set caching for static files in IIS 6.

A few more details about how this feature works:

  • There is no need to change your image file names manually. CombineAndMinify will create copies of the image files, with names that include the version ids (unless you changed this with the enableGeneratedFiles attribute). It also updates the page html on the fly before it is sent to the browser, so the browser will request those copies. Note that your actual source code or image file names are not changed.
  • To find out the version id, CombineAndMinify needs to access the file system to find out the last update time of the image file. You don't want this to happen for each request, so CombineAndMinify stores the version ids in server cache. These cache entries are invalidated the moment the underlying file changes, so the cache is never outdated.
  • Why insert the version id in the file name? Why not just add it as a query string? That would be a bit easier to handle. However, proxies (intermediate servers that pass on messages on the Internet) are less likely to cache files with query strings. So by inserting the version id in the file name instead of using a query string, you gain maximum proxy caching.
  • There is no counterpart of insertVersionIdInImageUrls for JavaScript and CSS files, because CombineAndMinify always uses version ids for those files.
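
To show the principle of putting a version id in the file name rather than in a query string, here is a minimal JavaScript sketch. The naming scheme (name__v followed by the id) and the way the id is derived from the last update time are assumptions for illustration only. The real format used by CombineAndMinify may well differ.

```javascript
// Hypothetical naming scheme: "name.ext" becomes "name__v<id>.ext".
// The id is derived from the file's last update time, so it changes
// whenever the file is updated.
function insertVersionId(url, lastModified) {
    var versionId = Math.floor(lastModified.getTime() / 1000).toString(36);
    var dot = url.lastIndexOf('.');
    var slash = url.lastIndexOf('/');
    if (dot <= slash) {
        return url + '__v' + versionId;   // no extension: just append the id
    }
    return url.slice(0, dot) + '__v' + versionId + url.slice(dot);
}

var before = insertVersionId('/images/ball3.png', new Date(2012, 6, 25, 11, 29));
var after = insertVersionId('/images/ball3.png', new Date(2012, 6, 26, 9, 0));
console.log(before);
console.log(after);
console.log(before !== after); // true: an updated file gets a new url
```

Because the url itself changes when the file changes, the old cached copy is simply never asked for again, however long its cache expiry is.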

For a lot more on browser caching and proxy caching, see www.iis.net or chapter 12 of my recently released book ASP.NET Site Performance Secrets.

insertVersionIdInFontUrls

Determines whether CombineAndMinify inserts version ids in font file names loaded with the @font-face rule. If you don't use the @font-face rule in your CSS files, you can safely skip this attribute.

Value             Description
true              Version ids are inserted in font file names.
false (default)   No version ids are inserted in font file names

Example

<configuration>
    ...
    <combineAndMinify insertVersionIdInFontUrls="true" >

    </combineAndMinify>
    ...
</configuration>

CSS allows you to load font files using the @font-face rule. insertVersionIdInFontUrls lets you insert version ids in the names of those files. It is to font files what insertVersionIdInImageUrls is to image files. See the description of insertVersionIdInImageUrls for the principle behind these two attributes.

exceptionOnMissingFile

Determines whether CombineAndMinify throws an exception when an image file is missing.

Value             Description
Never (default)   CombineAndMinify never throws an exception when an image file is missing.
Always            CombineAndMinify always throws an exception when an image file is missing.
ReleaseModeOnly   CombineAndMinify only throws an exception if the site is in release mode.
DebugModeOnly     CombineAndMinify only throws an exception if the site is in debug mode.

Example

<configuration>
    ...
    <combineAndMinify exceptionOnMissingFile="DebugModeOnly" 
                         insertVersionIdInImageUrls="true" >

    </combineAndMinify>
    ...
</configuration>

Assume insertVersionIdInImageUrls is set to true, so CombineAndMinify inserts a version id in all image urls. This means it has to access each image file to find its last updated time. What happens if the image file cannot be found? That is determined by the exceptionOnMissingFile attribute:

  • If exceptionOnMissingFile is active (see table above) and CombineAndMinify finds that an image file cannot be found, it throws an exception with the path of the image. That makes it easier to find missing images.
  • If exceptionOnMissingFile is not active, CombineAndMinify doesn't throw an exception but recovers by not inserting a version id in the image url.

If all images should be present in your development environment, then it makes sense to set exceptionOnMissingFile to DebugModeOnly. That way, you quickly find broken images while developing your site, while preventing exceptions in your live site where you probably prefer a broken image over an exception.

What about JavaScript and CSS files? CombineAndMinify accesses these files when combining and / or minifying them:

  • If exceptionOnMissingFile is active and a JavaScript or CSS file can't be found, you'll get an exception, just as with images.
  • If exceptionOnMissingFile is not active and a JavaScript or CSS file can't be found, it just writes a comment in the combined and / or minified file, specifying the full name of the file that couldn't be found.

Keep in mind that if you want exceptions while the site is in debug mode, you have to ensure that CombineAndMinify is actually active in debug mode - set active to Always to make that happen.
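The four values of exceptionOnMissingFile boil down to a small decision. The Python sketch below shows that decision for a missing image; the function names are hypothetical and this is not the package's own code, just a restatement of the table and the two bullet points about images.

```python
def exception_is_active(setting: str, debug_mode: bool) -> bool:
    """Decide whether a missing file should raise, per the table above."""
    if setting == "Always":
        return True
    if setting == "Never":
        return False
    if setting == "DebugModeOnly":
        return debug_mode
    if setting == "ReleaseModeOnly":
        return not debug_mode
    raise ValueError(f"Unknown exceptionOnMissingFile value: {setting}")


def url_for_missing_image(url: str, setting: str, debug_mode: bool) -> str:
    """Handle an image whose file could not be found on disk."""
    if exception_is_active(setting, debug_mode):
        # Active: report the path so the broken image is easy to find.
        raise FileNotFoundError(f"Image not found: {url}")
    # Not active: recover by leaving the url without a version id.
    return url


print(url_for_missing_image("/images/logo.png", "DebugModeOnly", debug_mode=False))
```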

removeWhitespace

Determines whether CombineAndMinify removes superfluous white space and comments from the HTML of the page.

Value | Description
true | Superfluous white space and comments are removed.
false (default) | No superfluous white space and comments are removed.

Example

<configuration>
    ...
    <combineAndMinify removeWhitespace="true" >

    </combineAndMinify>
    ...
</configuration>

When you set removeWhitespace to true, CombineAndMinify removes all HTML comments from the page and collapses all runs of white space into a space. However, if a run of white space contains one or more line breaks, it is collapsed into a line break. That way, inline JavaScript will not be broken.
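The collapsing rule can be sketched in a few lines of Python. This is an illustration of the rule as described, not the package's implementation: the function name is made up, and the simple comment pattern used here would also strip conditional comments, which a real implementation may treat differently.

```python
import re

# Assumption: a plain HTML comment is anything between <!-- and -->.
_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
_WHITESPACE_RUN = re.compile(r"\s+")


def remove_superfluous_whitespace(html: str) -> str:
    """Strip HTML comments, then collapse every run of white space."""
    without_comments = _COMMENT.sub("", html)

    def collapse(match: re.Match) -> str:
        run = match.group(0)
        # Keep a line break if the run had one, so that inline JavaScript
        # relying on line ends (for example after a // comment) still works.
        return "\n" if ("\n" in run or "\r" in run) else " "

    return _WHITESPACE_RUN.sub(collapse, without_comments)


print(remove_superfluous_whitespace("<p>a   b</p>\n\n   <p>c</p>"))
```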

enableAxdProcessing

Determines whether CombineAndMinify processes .axd files. For more information about this feature, see the section Combining .axd files.

Value | Description
true (default) | .axd files are processed.
false | .axd files are not processed.

Example

<configuration>
    ...
    <combineAndMinify enableAxdProcessing="true" >

    </combineAndMinify>
    ...
</configuration>

headCaching

Determines how tags for combined JavaScript files and CSS files are cached.

Value | Description
None (default) | Caching of replacement tags is switched off.
PerSite | There is a single cache entry for the entire site.
PerFolder | There is a cache entry per folder.
PerPage | There is a cache entry per page (ignoring any query string).
PerUrl | There is a cache entry per url (including any query string).

Example

<configuration>
    ...
    <combineAndMinify headCaching="PerSite" >

    </combineAndMinify>
    ...
</configuration>

Even though CombineAndMinify caches all minified and combined files, there is still some work involved in replacing tags of individual JavaScript files and CSS files with tags of combined files. Without further caching, this needs to be done for each page request.

To reduce the CPU usage involved in this, CombineAndMinify provides the option to cache the replacement tags. The recommended way to do this depends on the way you load your JavaScript and CSS files:

Situation | Recommended Value
The additional CPU usage of CombineAndMinify is not an issue. Or the tags to load JavaScript and CSS files are totally ad hoc per page. | None (default)
All pages load the same JavaScript and CSS files in the same order. For example, all pages use a single master page, and the master page has all the script and link tags to load the JavaScript and CSS files. | PerSite
Your pages are split over folders, and the JavaScript and CSS files you load depend on the folder. For example, pages in the admin folder all load the same JavaScript and CSS files in the same order, but those files are different from the ones loaded by pages in the products folder. | PerFolder
Each page loads different JavaScript and / or CSS files. However, the query string doesn't influence which files are loaded. So toys.aspx?id=1 and toys.aspx?id=2 load the same files in the same order, but categories.aspx loads different files. | PerPage
The JavaScript and CSS files used by a page depend on the entire url, including its query string. So toys.aspx?id=1 and toys.aspx?id=2 load different files. | PerUrl

The headCaching attribute really comes into its own if you load JavaScript and CSS files from a master page or a shared user control. This is because CombineAndMinify caches entire groups of tags, and the tags to replace those groups with. This process is sensitive to details such as white space in between tags. That means that

<script type="text/javascript" src="/js/script1.js" ></script>
<script type="text/javascript" src="/js/script2.js" ></script>

and

<script type="text/javascript" src="/js/script1.js" ></script>

<script type="text/javascript" src="/js/script2.js" ></script>

are not the same, due to the extra blank line in the second block.
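One way to picture the five headCaching values is as different ways of deriving a cache key from the requested url. The Python sketch below does just that. The function name and the shape of the keys are assumptions for illustration, not how CombineAndMinify builds its cache entries internally.

```python
import posixpath
from typing import Optional
from urllib.parse import urlsplit


def head_cache_key(mode: str, url: str) -> Optional[str]:
    """Return the cache key for a page url, or None when caching is off."""
    parts = urlsplit(url)
    if mode == "None":
        return None
    if mode == "PerSite":
        return "site"                         # one entry for every page
    if mode == "PerFolder":
        return posixpath.dirname(parts.path)  # e.g. /admin
    if mode == "PerPage":
        return parts.path                     # query string ignored
    if mode == "PerUrl":
        return f"{parts.path}?{parts.query}" if parts.query else parts.path
    raise ValueError(f"Unknown headCaching value: {mode}")


# toys.aspx?id=1 and toys.aspx?id=2 share an entry under PerPage only.
print(head_cache_key("PerPage", "/products/toys.aspx?id=1"))
print(head_cache_key("PerUrl", "/products/toys.aspx?id=1"))
```

The broader the key, the fewer cache entries there are and the more often an entry is reused, which is why PerSite suits sites where every page loads the same files.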

generatedFolder

Determines the name of the folder into which generated files are stored.

Type | Default
String | ___generated

Example

<configuration>
    ...
    <combineAndMinify generatedFolder="anotherGeneratedFolder" >

    </combineAndMinify>
    ...
</configuration>

As you saw earlier on, unless you set enableGeneratedFiles to false, CombineAndMinify writes the processed versions of files (such as minified JavaScript files) to a separate folder. It also modifies the html that is being sent to the browser, so script tags, etc. pick up these generated files.

If this folder doesn't exist, it is created automatically by CombineAndMinify.

By default, the name of this folder is ___generated, and it lives in the root folder of your site. Normally, this name doesn't clash with anything. But if it does, the generatedFolder attribute allows you to specify another name.

enableGeneratedFiles

Determines whether processed versions of files are stored on disk or in memory.

Value | Description
true (default) | Processed versions of files are stored on disk.
false | Processed versions of files are stored in memory.

Example

<configuration>
    ...
    <combineAndMinify enableGeneratedFiles="false" >

    </combineAndMinify>
    ...
</configuration>

CombineAndMinify processes files into new files - it minifies CSS and JavaScript files and combines them, it inserts version ids in image files and font files, etc. Because doing all this processing for each request is inefficient, the processed versions of these files need to be stored somehow so they can be reused for many requests.

CombineAndMinify provides two ways to store processed files:

  • On disk, in a separate folder. This is the default. Not only is this a simple method, it also means that the processed files can be treated as simple static files. As a result, IIS will read these files into fast kernel cache - which is faster than the normal cache used by the second method below. Essentially, the advantage of this option is better performance.

    Additionally, cookieless domains can only be used when processed files are stored on disk.
  • In memory, in the ASP.NET cache. This is slower than the first method, and is more complicated because it means using an HTTP Handler to process all requests for CSS, JavaScript, image and font files, to read them from memory rather than disk. The advantage however is that no disk space is used for processed files.

    Note that Cassini, the web server built into Visual Studio, does not support HTTP Handlers. To use CombineAndMinify with enableGeneratedFiles set to false while debugging, install IIS 7 on your development machine and get Visual Studio to use IIS 7 instead of Cassini.

If you decide to store processed files in memory instead of on disk, set enableGeneratedFiles to false and follow the Additional Installation instructions below to configure the required HTTP Handler.

Additional Installation for IIS 7

If you use IIS 7, use the instructions in this section. Skip to the next section if you use IIS 6 or IIS 7 in classic mode.

To configure the HTTP Handler, add the following to your web.config:

<configuration>
    ...
    <system.webServer>
        <validation validateIntegratedModeConfiguration="false"/>
        ...
        <handlers>
            ...
            <add name="JavaScriptHandler" verb="*" path="*.js"
                 type="CombineAndMinify.HttpHandler, CombineAndMinify" 
                 resourceType="Unspecified"/>
            <add name="CssHandler" verb="*" path="*.css" 
                 type="CombineAndMinify.HttpHandler, CombineAndMinify"  
                 resourceType="Unspecified"/>

            <!-- required if you use the insertVersionIdInImageUrls attribute -->
            <add name="GifHandler" verb="*" path="*.gif" 
                 type="CombineAndMinify.HttpHandler, CombineAndMinify"  
                 resourceType="Unspecified"/>
            <add name="PngHandler" verb="*" path="*.png" 
                 type="CombineAndMinify.HttpHandler, CombineAndMinify"  
                 resourceType="Unspecified"/>
            <add name="JpegHandler" verb="*" path="*.jpg" 
                 type="CombineAndMinify.HttpHandler, CombineAndMinify"  
                 resourceType="Unspecified"/>

            <!-- required if you use the insertVersionIdInFontUrls attribute -->
            <add name="WoffHandler" verb="*" path="*.woff" 
                 type="CombineAndMinify.HttpHandler, CombineAndMinify"  
                 resourceType="Unspecified"/>
            <add name="TtfHandler" verb="*" path="*.ttf" 
                 type="CombineAndMinify.HttpHandler, CombineAndMinify"  
                 resourceType="Unspecified"/>
            <add name="SvgHandler" verb="*" path="*.svg" 
                 type="CombineAndMinify.HttpHandler, CombineAndMinify"  
                 resourceType="Unspecified"/>
            <add name="EotHandler" verb="*" path="*.eot" 
                 type="CombineAndMinify.HttpHandler, CombineAndMinify"  
                 resourceType="Unspecified"/>

        </handlers>
        ...
    </system.webServer>
    ...
</configuration>

Additional Installation for IIS 6

  1. Configure the HTTP Handler in your web.config:

    <configuration>
        ...
        <system.web>
            ...
            <httpHandlers>
                ...
                <add verb="*" path="*.js" 
                        type="CombineAndMinify.HttpHandler, CombineAndMinify" />
                <add verb="*" path="*.css" 
                        type="CombineAndMinify.HttpHandler, CombineAndMinify" />
    
                <!-- required if you use the insertVersionIdInImageUrls attribute -->
                <add verb="*" path="*.gif"  
                        type="CombineAndMinify.HttpHandler, CombineAndMinify" />
                <add verb="*" path="*.png"  
                        type="CombineAndMinify.HttpHandler, CombineAndMinify" />
                <add verb="*" path="*.jpg"  
                        type="CombineAndMinify.HttpHandler, CombineAndMinify" />
    
                <!-- required if you use the insertVersionIdInFontUrls attribute -->
                <add verb="*" path="*.woff"  
                        type="CombineAndMinify.HttpHandler, CombineAndMinify" />
                <add verb="*" path="*.ttf"  
                        type="CombineAndMinify.HttpHandler, CombineAndMinify" />
                <add verb="*" path="*.svg"  
                        type="CombineAndMinify.HttpHandler, CombineAndMinify" />
                <add verb="*" path="*.eot"  
                        type="CombineAndMinify.HttpHandler, CombineAndMinify" />
    
            </httpHandlers>
            ...
        </system.web>
        ...
    </configuration>
        
  2. Send all requests for JavaScript and CSS files (and optionally image and font files) to the ASP.NET handler. The ASP.NET handler will pass these on to the HTTP Handler (because of the lines you added to your web.config), allowing it to serve them from memory.

    1. Open the IIS Manager - click Start | Administrative Tools | Internet Information Services (IIS) Manager.
    2. Expand your server. Expand Web Sites. Right click your web site and choose Properties.
    3. Click the Home Directory tab, and then click the Configuration button.
    4. Get the path to the ASP.NET handler:
      1. Click the line with the .aspx extension.
      2. Click the Edit button.
      3. Copy the contents of the Executable field. For .Net 2 and .Net 3.5, it will be C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\aspnet_isapi.dll. For .Net 4, it will be C:\Windows\Microsoft.NET\Framework\v4.0.30319\aspnet_isapi.dll.
      4. Click Cancel to dismiss the dialog.
    5. Tell IIS 6 to let ASP.NET handle all requests for files with extension .js:

      1. Click the Add button.
      2. Paste the path to the ASP.NET handler you just found into the Executable field.
      3. Enter .js into the Extension field.
      4. Uncheck the Verify that file exists box.
      5. Click OK.
    6. Repeat the last step for extension .css.
    7. If you set insertVersionIdInImageUrls to true, repeat the last step for these extensions:
      • .gif
      • .png
      • .jpg
    8. If you set insertVersionIdInFontUrls to true, repeat the last step for these extensions:
      • .woff
      • .ttf
      • .svg
      • .eot
    9. Click OK to dismiss the Configuration dialog.
    10. Click the Apply button, and then click OK to dismiss the Properties dialog.

Conclusion

The CombineAndMinify package will improve the performance of any web site that loads JavaScript, CSS, image or font files. Please give it a try with your site. If you find any bugs or have any trouble with the documentation, let me know and I'll try to fix the issue. Feature requests would be welcome too.

History

 

Version Released Description
1.0 6 Nov 2010 Initial release.
1.1 13 Nov 2010 Bug fixes. CombineAndMinify now correctly handles inlined images and image urls in CSS files surrounded by quotes.

CombineAndMinify can now be used in conjunction with Microsoft's Sprite and Image Optimization Framework. That framework combines several small images into one, reducing overall load times.

1.2 23 Jan 2011 Several bug fixes and enhancements, including:

If a JavaScript or CSS file contains non-ASCII characters (such as Chinese or Greek), it no longer uses the YUI minifier, because this mangles non-ASCII characters. Instead, it uses a port of JSMIN (class JsminCs in the sources). Note that for JavaScript or CSS files that only have ASCII characters (which would almost always be the case for sites in English), CombineAndMinify continues to use the more efficient YUI minifier.

CSS files with different media (such as 'screen', 'print', etc.) are now processed correctly (CombineAndMinify used to incorrectly combine CSS files with different media).

CombineAndMinify now processes script tags and link tags (for CSS files) with absolute URLs, as long as the URL points to a file on the site itself.

CombineAndMinify no longer attempts to process images generated by an HTTP Handler.

1.2.1 30 Jan 2011 Fixed a bug related to link tags that have a media property and also load an external stylesheet. Also fixed a bug that affected CSS files that both contain non-ASCII characters and use external background images.
1.3 12 Feb 2011 Added support for the @font-face rule. Added the insertVersionIdInFontUrls web.config attribute.
1.3.1 19 Mar 2011 Minor bug fix to prevent the page crashing if it contains a LiteralControl with null Text.
1.3.2 10 May 2011 When minifying a CSS file containing non-English characters, too many spaces were removed thereby breaking descendant selectors. This has now been fixed.
1.4 15 May 2011 CombineAndMinify now processes .axd files. This will improve page load times when you use ASP.NET AJAX toolkit controls.
1.4.1 22 May 2011 Two related bug fixes:

1) Script and link tags commented out with <!-- --> are no longer processed.

2) Conditional includes of the form

<!--[if IE 7]>
<link href="css/ie7Fixes.css" rel="stylesheet" type="text/css" />
<![endif]-->
will no longer be combined with other script or css files. Note that conditional includes do not get processed at all by CombineAndMinify (they get ignored as comments), so there is no minification, etc.
1.4.2 16 June 2011 CombineAndMinify no longer crashes when confronted with a static file with a query string, such as "script1.js?fp789". It now ignores any query string on static files, because query strings normally do not change the contents of these files. Note that this does not include .axd files, where the query string does determine the content of the file.
1.5 1 July 2011 CombineAndMinify now sends ETags with generated content.
1.6 6 July 2011 CombineAndMinify can now run with trust level medium. This means it can now be used on most (if not all) shared hosting plans that support ASP.NET.

If you are upgrading from an earlier version:

1) You need to modify your web.config, to include requirePermission="false" in the <section> definition:

<section name="combineAndMinify" type="CombineAndMinify.ConfigSection"
         requirePermission="false" /> 

2) This version uses versions of the Yahoo.Yui.Compressor.dll and EcmaScript.NET.modified.dll binaries that can run with medium trust, so you need to replace those as well along with the CombineAndMinify.dll (you'll find them in the bin directory of the CombineAndMinify project).

3) If you use a shared hosting plan and controls from the ASP.NET AJAX toolkit, also read section Additional configuration when using a shared hosting plan.

1.7 22 Sep 2011 Protocol relative urls are now handled correctly. Code generated for preloadAllImages now wrapped in an anonymous function to reduce pollution of the name space.
1.8 11 Dec 2011 img tags within Repeaters with data bound image urls are now processed.
2.0 12 Mar 2012 By default, CombineAndMinify now writes combined files to disk, instead of keeping them in cache. Cookieless domains now work properly. Installation of CombineAndMinify has been greatly simplified.
2.1 29 Mar 2012 Fixed a bug kicking in when headCaching="None".

Speeding up database access - Part 3: Fixing missing indexes

by Matt Perdeck 20. August 2011 12:54

This is part 3 of an 8 part series of articles about speeding up access to a SQL Server database.

In part 1 we saw how to pinpoint any missing indexes. In this part 3, we'll look at fixing those missing indexes. This will include an in depth look at how indexes work under the hood and when and when not to use them.

  • Part 1 Pinpointing missing indexes and expensive queries
  • Part 2 Pinpointing other bottlenecks
  • Part 3 Fixing missing indexes
  • Part 4 Fixing expensive queries
  • Part 5 Fixing locking issues
  • Part 6 Fixing execution plan reuse
  • Part 7 Fixing fragmentation
  • Part 8 Fixing memory, disk and CPU issues

Missing Indexes

Just as using an index in a book to find a particular bit of information is often much faster than reading all pages, so SQL Server indexes can make finding a particular row in a table dramatically faster by cutting down the number of read operations.

This section first discusses the two types of indexes supported by SQL Server, clustered and non-clustered. It also goes into included columns, a feature of non-clustered indexes. After that, we'll look at when to use each type of index.

Clustered Index

Take the following table:

CREATE TABLE [dbo].[Book](
 [BookId] [int] IDENTITY(1,1) NOT NULL,
 [Title] [nvarchar](50) NULL,
 [Author] [nvarchar](50) NULL,
 [Price] [decimal](4, 2) NULL)

Because this table has no clustered index, it is called a heap table. Its records are unordered, and to get all books with a given title, you have to read all the records, which is just not efficient. It has a very simple structure:

Let's see how long it takes to locate a record in this table. That way, we can compare against the performance of a table with an index. To do that in a meaningful way, first insert a million records into the table. Tell SQL Server to show I/O and timing details of each query we run:

SET STATISTICS IO ON
SET STATISTICS TIME ON

Also, before each query, flush the SQL Server memory cache:

CHECKPOINT
DBCC DROPCLEANBUFFERS

Now run the query below with a million records in the Book table:

SELECT Title, Author, Price FROM dbo.Book WHERE BookId = 5000

The results on my machine - reads: 9564, CPU time: 109 ms, elapsed time: 808 ms. SQL Server stores all data in 8KB pages. This shows it read 9564 pages - the entire table. Now add a clustered index:

ALTER TABLE Book 
ADD CONSTRAINT [PK_Book] PRIMARY KEY CLUSTERED ([BookId] ASC)

This puts the index on column BookId, making WHERE and JOIN statements on BookId faster. It sorts the table by BookId, and adds a structure called a B-tree to speed up access:

BookId is now used the same way as a page number in a book. Because the pages in a book are sorted by page number, finding a page by page number is very fast.

Now run the same query again to see the difference:

SELECT Title, Author, Price FROM dbo.Book WHERE BookId = 5000

The results - reads: 2, CPU time: 0 ms, elapsed time: 32 ms. The number of reads of 8KB pages has gone from 9564 to 2, CPU time from 109 ms to less than 1 ms, and elapsed time from 808 ms to 32 ms. That's a dramatic improvement.
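Why the number of page reads drops so sharply can be shown with a toy model in Python. This is not how SQL Server lays out its pages or B-tree: the page capacity of 100 rows, the single index level and all function names are assumptions chosen to keep the sketch short. It only shows the principle that an unordered table has to be scanned in full, while a sorted table with an index lets you go straight to the right page.

```python
from bisect import bisect_right

ROWS_PER_PAGE = 100  # assumed page capacity, chosen only for this illustration


def build_pages(sorted_ids):
    """Split a sorted list of ids into fixed-size 'pages'."""
    return [sorted_ids[i:i + ROWS_PER_PAGE]
            for i in range(0, len(sorted_ids), ROWS_PER_PAGE)]


def heap_lookup(pages, wanted):
    """No index: every page has to be read, as nothing says where the row is."""
    reads, found = 0, False
    for page in pages:
        reads += 1
        if wanted in page:
            found = True
    return found, reads


def indexed_lookup(pages, wanted):
    """Sorted pages plus one 'index page' holding the first key of each page."""
    first_keys = [page[0] for page in pages]
    reads = 1  # reading the index page
    position = bisect_right(first_keys, wanted) - 1
    if position < 0:
        return False, reads
    reads += 1  # reading the one data page that can hold the row
    return wanted in pages[position], reads


pages = build_pages(list(range(1, 10001)))
print(heap_lookup(pages, 5000))     # (True, 100)
print(indexed_lookup(pages, 5000))  # (True, 2)
```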

Non-clustered Index

Now let's select by Title instead of BookId:

SELECT Title, Author FROM dbo.Book WHERE Title = 'Don Quixote'

The results - reads: 9146, CPU time: 156 ms, elapsed time: 1653 ms. These results are pretty similar to what we got with the heap table. Which is no wonder, seeing that there is no index on Title.

The solution obviously is to put an index on Title. However, because a clustered index involves sorting the table records on the index field, there can be only one clustered index. We've already sorted on BookId, and the table can't be sorted on Title at the same time.

The solution is to create a non-clustered index. This is essentially a duplicate of the table records, this time sorted by Title. To save space, SQL Server leaves out the other columns, such as Author and Price. You can have up to 249 non-clustered indexes on a table in SQL Server 2005, and up to 999 in SQL Server 2008 and later.

Because we still want to access those other columns in queries though, we need a way to get from the non-clustered index records to the actual table records. The solution is to add the BookId to the non-clustered records. Because BookId has the clustered index, once we have found a BookId via the non-clustered index, we can use the clustered index to get to the actual table record. This second step is called a key lookup.

Why go through the clustered index? Why not put the physical address of the table record in the non-clustered index record? The answer is that when you update a table record, it may get bigger, causing SQL Server to move subsequent records to make space. If non-clustered indexes contained physical addresses, they would all have to be updated when this happens. It’s a trade off between slightly slower reads and much slower updates.

If there is no clustered index then non-clustered index records do have the physical address. If there is a clustered index but it is not unique, then SQL Server does use the clustered index key in the non-clustered index records, but it adds a uniquifier to each record to distinguish records with the same clustered key.
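The two-step access path, first the non-clustered index and then the key lookup, can be mimicked with ordinary Python data structures. This is a conceptual sketch only: the sample rows are made up, a dict stands in for the clustered index, and a sorted list of (Title, BookId) pairs stands in for the non-clustered index.

```python
from bisect import bisect_left

# Stand-in for the table with its clustered index: rows reachable by BookId.
books = {
    1: ("Moby Dick", "Herman Melville", 9.99),
    2: ("Don Quixote", "Miguel de Cervantes", 12.50),
    3: ("Emma", "Jane Austen", 7.25),
}

# Stand-in for the non-clustered index: sorted by Title, carrying only BookId.
title_index = sorted(
    (title, book_id) for book_id, (title, _author, _price) in books.items()
)


def find_by_title(title):
    """Step 1: seek in the title index. Step 2: key lookup via BookId."""
    rows = []
    i = bisect_left(title_index, (title,))
    while i < len(title_index) and title_index[i][0] == title:
        book_id = title_index[i][1]
        rows.append(books[book_id])  # the key lookup
        i += 1
    return rows


print(find_by_title("Don Quixote"))
```

Because the index entry stores BookId rather than a physical position, the rows in books could be rearranged without touching title_index, which is the trade-off described above.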

To see what a non-clustered index will do for us, first create it:

CREATE NONCLUSTERED INDEX [IX_Title] ON [dbo].[Book]([Title] ASC)

Now run the same query again:

SELECT Title, Author FROM dbo.Book WHERE Title = 'Don Quixote'

The results - reads: 4, CPU time: 0 ms, elapsed time: 46 ms. The number of reads has gone from 9146 to 4, CPU time from 156 ms to less than 1 ms, and elapsed time from 1653 ms to 46 ms. This means that having a non-clustered index is not quite as good as having a clustered index, but still dramatically better than having no index at all.

Included Columns

You can squeeze a bit more performance out of a non-clustered index by cutting out the key lookup - the second step where SQL Server uses the clustered index to find the actual record.

Have another look at the test query - it simply returns Title and Author. Title is already in the non-clustered index record. If you were to add Author to the non-clustered index record as well, there would no longer be any need for SQL Server to access the table record, enabling it to skip the key lookup. It would look like this

This can be done by including Author in the non-clustered index:

CREATE NONCLUSTERED INDEX [IX_Title] ON [dbo].[Book]([Title] ASC)
INCLUDE ([Author])
WITH (DROP_EXISTING = ON)

Now run the query again:

SELECT Title, Author FROM dbo.Book WHERE Title = 'Don Quixote'

The results - reads: 2, CPU time: 0 ms, elapsed time: 26 ms. The number of reads has gone from 4 to 2, and elapsed time from 46 ms to 26 ms. That's almost a 50% improvement. In absolute terms, the gain isn't all that great, but for a query that is executed very frequently this may be worthwhile. Don't overdo this - the bigger you make the non-clustered index records, the fewer fit on an 8KB page, forcing SQL Server to read more pages.
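The saving from an included column is that the key lookup disappears. The Python sketch below counts key lookups for the test query against two stand-in indexes: one holding (Title, BookId) and one that also carries Author. The sample data and function name are made up, and real index records are of course not Python tuples.

```python
from bisect import bisect_left

table = {
    1: {"Title": "Don Quixote", "Author": "Miguel de Cervantes", "Price": 12.50},
    2: {"Title": "Emma", "Author": "Jane Austen", "Price": 7.25},
}

plain_index = sorted((row["Title"], book_id) for book_id, row in table.items())
covering_index = sorted(
    (row["Title"], book_id, row["Author"]) for book_id, row in table.items()
)


def select_title_author(index, title):
    """Return (rows, key_lookups) for: SELECT Title, Author ... WHERE Title = title."""
    rows, key_lookups = [], 0
    i = bisect_left(index, (title,))
    while i < len(index) and index[i][0] == title:
        entry = index[i]
        if len(entry) == 3:
            # Author is included in the index entry: no table access needed.
            rows.append((entry[0], entry[2]))
        else:
            # Author is missing from the index entry: fetch the table row.
            rows.append((entry[0], table[entry[1]]["Author"]))
            key_lookups += 1
        i += 1
    return rows, key_lookups


print(select_title_author(plain_index, "Don Quixote"))     # 1 key lookup
print(select_title_author(covering_index, "Don Quixote"))  # 0 key lookups
```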

Selecting columns to give an index

Because indexes do create overhead, you want to carefully select the columns to give indexes. Before starting the selection process, keep in mind that:

  • Putting a Primary Key on a column by default results in it having a clustered index (you can give it a unique non-clustered index instead). So you may already have many columns in your database with an index. As you'll see later in section "When to use a clustered index", putting the clustered index on the ID column of a record is almost always a good idea.
  • If you made one or more columns unique (with the UNIQUE constraint), SQL Server will already have created a UNIQUE index to enforce the uniqueness requirement of the UNIQUE constraint.
  • Putting an index on a table column can slow down queries that modify that table (UPDATE, INSERT, DELETE). Don't focus on just one query.
  • Before introducing an index on your live database, test the index in development to make sure it really does improve performance.

Let's look at when and when not to use an index, and when to use a clustered index.

When to use an index

You can follow this decision process when selecting columns to give an index:

  • Start by looking at the most expensive queries. You identified those in Part 1 "Pinpointing missing indexes and expensive queries". There you also saw indexing suggestions generated by Database Engine Tuning Advisor.
  • Look at putting an index on foreign keys, especially if they are used in JOINs. This may help SQL Server to identify matching records quicker.
  • Consider columns used in ORDER BY and GROUP BY clauses. If there is an index on such a column, then SQL Server doesn't have to sort the column again - because the index already keeps the column values in sorted order.
  • Consider columns used in WHERE clauses, especially if the WHERE will select a small number of records.
  • The MIN and MAX functions benefit from working on a column with an index. Because the values are sorted, there is no need to go through the entire table to find the minimum or maximum.
  • Think twice before putting an index on a column that takes a lot of space. If you use a non-clustered index, the column values will be duplicated in the index. If you use a clustered index, the column values will be used in all non-clustered indexes. The increased sizes of the index records means fewer fit in each 8KB page, forcing SQL Server to read more pages. The same applies to including columns in non-clustered indexes.
  • A WHERE clause that applies a function to the column value can't use an index on that column to find records, because the output of the function is not in the index. Take for example:
    SELECT Title, Author FROM dbo.Book WHERE LEFT(Title, 3) = 'Don'
    	   

    Putting an index on just the Title column won't make this query any faster. However, if you use a non-clustered index that includes both the Title and Author columns, SQL Server is able to scan that index instead of the table itself - using the index to access the data rather than locating records in the table itself. In Part 4 "Fixing expensive queries" you'll see how this may be quicker when the index records are smaller than the table records.

  • Likewise, SQL Server can't use an index to locate the records if you use LIKE in a WHERE clause with a wild card at the start of the search string, such as this:
    SELECT Title, Author FROM dbo.Book WHERE Title LIKE '%Quixote'
    	   

    However, if the search string starts with constant text instead of a wild card, an index can be used to locate records:

    SELECT Title, Author FROM dbo.Book WHERE Title LIKE 'Don%'
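The difference between 'Don%' and '%Quixote' comes down to sort order. In the Python sketch below, a sorted list of made-up titles stands in for the index on Title. A search on a starting text can jump to the first candidate and stop at the first non-match, while a search on an ending has to examine every entry. The function names are hypothetical, and Python compares strings case-sensitively, which may differ from your database collation.

```python
from bisect import bisect_left

sorted_titles = sorted(
    ["Anna Karenina", "Don Juan", "Don Quixote", "Emma", "Ulysses"]
)


def starts_with(titles, prefix):
    """LIKE 'prefix%': seek to the first candidate, stop at the first miss."""
    matches, examined = [], 0
    for title in titles[bisect_left(titles, prefix):]:
        examined += 1
        if not title.startswith(prefix):
            break
        matches.append(title)
    return matches, examined


def ends_with(titles, suffix):
    """LIKE '%suffix': the sort order doesn't help, so check every entry."""
    matches = [title for title in titles if title.endswith(suffix)]
    return matches, len(titles)


print(starts_with(sorted_titles, "Don"))    # (['Don Juan', 'Don Quixote'], 3)
print(ends_with(sorted_titles, "Quixote"))  # (['Don Quixote'], 5)
```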
    	   

When not to use an index

Having too many indexes can actually hurt performance. Here are the main reasons not to use an index on a column:

  • The column gets updated often.
  • The column has low specificity - meaning it has lots of duplicate values.

Let's look at each reason in turn.

Column Updated Often

When you update a column without an index, SQL Server needs to write one 8KB page to disk - provided there are no page splits. However, if the column has a non-clustered index, or if it is included in a non-clustered index, SQL Server needs to update the index as well - so it has to write at least one additional page to disk. It also has to update the B tree structure used in the index, potentially leading to more page writes.

If you update a column with a clustered index, the non-clustered index records that use the old value need to be updated too, because the clustered index key is used in the non-clustered indexes to navigate to the actual table records. Secondly, remember that the table records themselves are sorted based on the clustered index - if the update causes the sort order of a record to change, that may mean more writes. Finally, the clustered index needs to keep its B-tree up to date.

This doesn't mean you cannot have indexes on columns that get updated - just be aware that indexes slow down updates. Test the effect of any indexes you add.

If an index is critical but rarely used, for example only for overnight report generation, consider dropping the index and recreating it when it is needed.

Low Specificity

Even if there is an index on a column, the query optimizer won't always use it. Think of the index in a book - great if you are trying to find a word that is used on only a few pages, but not so great if you're trying to find all occurrences of a commonly used word such as "the". You'd be better off going through each page, rather than going back and forth to the index. In this context, it is said that "the" has low specificity.

You can use a simple query to determine the average selectivity of the values in a column. For example, to find the average selectivity of the Price column in the Book table, use:

SELECT 
 COUNT(DISTINCT Price) AS 'Unique prices',
 COUNT(*) AS 'Number of rows',
 CAST((100 * COUNT(DISTINCT Price) / CAST(COUNT(*) AS REAL)) 
 AS nvarchar(10)) + '%' AS 'Selectivity'
FROM Book

If every book has a unique price, selectivity will be 100%. However, if every price is shared by exactly two books, average selectivity will be only 50%. If the selectivity is 85% or less, an index is likely to incur more overhead than it would save.
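The calculation done by the query is simply the number of distinct values divided by the number of rows. Here it is restated as a small Python function, applied to a few made-up price lists; the function name is hypothetical.

```python
def average_selectivity(values):
    """Percentage of distinct values among all rows: 100 * distinct / total."""
    if not values:
        raise ValueError("Cannot compute selectivity of an empty column")
    return 100.0 * len(set(values)) / len(values)


print(average_selectivity([10, 20, 30, 40]))  # 100.0 - every price unique
print(average_selectivity([20, 20, 30, 30]))  # 50.0
print(average_selectivity([20, 20, 20, 20]))  # 25.0 - one price for all rows
```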

Some prices may occur a lot more often than other prices. To see the specificity of each individual price, you would run:

DECLARE @c real
SELECT @c = CAST(COUNT(*) AS real) FROM Book
SELECT 
 Price, 
 COUNT(BookId) AS 'Number of rows',
 CAST((100 - (100 * COUNT(BookId) / @c)) 
 AS nvarchar(20)) + '%' AS 'Selectivity'
FROM Book
GROUP BY Price
ORDER BY COUNT(BookId)

The query optimizer is unlikely to use a non-clustered index for a price whose specificity is below 99%. It figures out the specificity of each price by keeping statistics on the values in the table.

In the section on included columns, we saw how SQL Server not only uses indexes to find records, but also to get table data right out of the index. SQL Server only looks at specificity when deciding whether to use an index for finding records. It could profitably get the data out of an index even if that index has very bad specificity.

When to use a clustered index

You saw that there are two types of indexes, clustered and non-clustered. And that you can have only one clustered index. How do you determine the lucky column that will have the clustered index?

To work this out, let's first look at the characteristics of a clustered index against a non-clustered index.

| Characteristic | Clustered index compared to a non-clustered index |
| --- | --- |
| Reading | Faster - Because there is no need for key lookups. No difference if all required columns are included in the non-clustered index. |
| Updating | Slower - Not only the table record, but also all non-clustered index records potentially need to be updated. |
| Inserting / Deleting | Faster - With a non-clustered index, inserting a new record in the table means inserting a new record in the non-clustered index as well. With a clustered index, the table is effectively part of the index, so there is no need for the second insert. The same goes for deleting a record. On the other hand, when the record is inserted at any place in the table but the very end, the insert may cause a page split where half the content of the 8KB page is moved to another page. Having a page split in a non-clustered index is less likely, because its records are smaller (they normally don't have all columns that a table record has), so more records fit on a page. When the record is inserted at the end of the table, there won't be a page split. |
| Column Size | Needs to be kept short and fast - Every non-clustered index contains a clustered index value, to do the key lookup. Every access via a non-clustered index has to use that value, so you want it to be fast for the server to process. That makes a column of type int a lot better to put a clustered index on than a column of type nvarchar(50). |

If only one column requires an index, this comparison shows that you'll want to always give it the clustered index. If multiple columns need indexes, you'll probably want to put the clustered index on the primary key column:

  • Reading - The primary key tends to be involved in a lot of WHERE and JOIN clauses, making read performance important.
  • Updating - The primary key should never or rarely get updated, because that would mean changing referring foreign keys as well.
  • Inserting / Deleting - Most often you'll make the primary key an IDENTITY column, so each new record is assigned a unique, ever increasing number. This means that if you put the clustered index on the primary key, new records are always added at the end of the table without page splits.
  • Size - Most often the primary key is of type int - which is short and fast.

Indeed, when you set the Primary Key on a column in the SSMS table designer, SSMS by default gives that column the clustered index unless another column already has the clustered index.
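
To illustrate, here is a sketch of a table that follows this advice. The table name BookDemo, the Title column and the constraint and index names are all made up for this example:

```sql
-- Hypothetical table: the names are illustrative only.
CREATE TABLE dbo.BookDemo (
    -- Short, ever increasing key: new records go at the end of the table.
    BookId int IDENTITY(1,1) NOT NULL,
    Title nvarchar(50) NOT NULL,
    Price money NOT NULL,
    -- The primary key gets the one and only clustered index.
    CONSTRAINT PK_BookDemo PRIMARY KEY CLUSTERED (BookId)
);

-- Any other column that needs an index gets a non-clustered index.
-- Its records carry the BookId value to do the key lookup.
CREATE NONCLUSTERED INDEX IX_BookDemo_Title ON dbo.BookDemo (Title);
```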

Maintaining Indexes

Do the following to keep your indexes working efficiently:

  • Defragment indexes. Repeated updates cause indexes and tables to become fragmented, decreasing performance. To measure the level of fragmentation and to see how to defragment indexes, refer to parts 2 and 7.
  • Keep statistics updated. SQL Server maintains statistics to figure out whether to use an index for a given query. These statistics are normally kept up to date automatically, but this can be switched off. If you switched it off, make sure statistics are kept up to date some other way.
  • Remove unused indexes. As you saw, indexes speed up read access, but slow down updates. In Part 1 "Pinpointing missing indexes and expensive queries", you saw how to identify unused indexes.
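
As a sketch of the statistics item above, the following checks whether automatic statistics maintenance is switched on for the current database, and shows two ways to refresh statistics by hand. The table dbo.Book is used here only as an example:

```sql
-- Check whether statistics are created and updated automatically
-- for the current database (1 = switched on).
SELECT name, is_auto_create_stats_on, is_auto_update_stats_on
FROM sys.databases
WHERE name = DB_NAME();

-- If automatic updating is switched off, refresh statistics yourself,
-- either for a single table (dbo.Book is just an example)...
UPDATE STATISTICS dbo.Book;

-- ...or for all tables in the current database.
EXEC sp_updatestats;
```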

Conclusion

In this part, we saw how indexes work, the difference between clustered and non-clustered indexes, and when and when not to use indexes.

In the next part, we'll see how to fix expensive queries.

Speeding up database access - Part 2: Pinpointing other bottlenecks

by Matt Perdeck 20. August 2011 12:49

This is part 2 of an 8 part series of articles about speeding up access to a SQL Server database.

In part 1, we saw how to pinpoint missing indexes and expensive queries. In this part 2, we'll pinpoint a number of other bottlenecks, including locking issues, lack of execution plan reuse, fragmentation and hardware issues. Parts 3 through 8 will show how to fix the bottlenecks we found in parts 1 and 2:

  • Part 1 Pinpointing missing indexes and expensive queries
  • Part 2 Pinpointing other bottlenecks
  • Part 3 Fixing missing indexes
  • Part 4 Fixing expensive queries
  • Part 5 Fixing locking issues
  • Part 6 Fixing execution plan reuse
  • Part 7 Fixing fragmentation
  • Part 8 Fixing memory, disk and CPU issues

Locking

In a database with lots of queries executing, some queries may try to access the same resource, such as a table or index. You wouldn't want one query to read a resource while another is updating it, otherwise you could get inconsistent results.

To stop a query from accessing a resource, SQL Server locks the resource. This will inevitably lead to some delays as queries wait for a lock to be released. To find out whether these delays are excessive, check the following performance counters on the database server with perfmon:

Category: SQLServer:Latches

  • Total Latch Wait Time (ms) - Total wait time in milliseconds for latches in the last second.

Category: SQLServer:Locks

  • Lock Timeouts/sec - Number of lock requests per second that timed out. This includes requests for NOWAIT locks.
  • Lock Wait Time (ms) - Total wait time in milliseconds for locks in the last second.
  • Number of Deadlocks/sec - Number of lock requests per second that resulted in a deadlock.

Perfmon comes with Windows, so you already have it on your computer. Issue the command "perfmon" from the command prompt. After it has loaded, click the Plus button in the toolbar, then select the category - such as SQLServer:Latches - in the Performance object drop down, and then add the counter - such as Total Latch Wait Time (ms).

A high number for Total Latch Wait Time (ms) indicates that SQL Server is waiting too long for its own synchronization mechanism. Lock Timeouts/sec should be 0 during normal operation and Lock Wait Time (ms) very low. If they are not, queries are waiting too long for locks to be released.

Finally, Number of Deadlocks/sec should be 0. If not, you have queries waiting on each other to release a lock, preventing either from moving forward. SQL Server eventually detects this condition and resolves it by rolling back one of the queries, which means wasted time and wasted work.
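
If you would rather read these counters from a query window than from perfmon, the DMV sys.dm_os_performance_counters exposes SQL Server's own counters. The sketch below is one way to do that. Keep in mind that the DMV returns raw values: for per-second counters these are running totals since server start up, so take two samples some time apart and compare them rather than reading a single value as a rate.

```sql
-- Raw values of the locking counters discussed above.
-- object_name starts with SQLServer: for a default instance,
-- or MSSQL$<instance name>: for a named instance, hence the LIKE.
SELECT
    RTRIM(object_name) AS object_name,
    RTRIM(counter_name) AS counter_name,
    RTRIM(instance_name) AS instance_name,
    cntr_value
FROM sys.dm_os_performance_counters
WHERE (object_name LIKE '%:Locks%'
        AND instance_name = '_Total'
        AND counter_name IN ('Lock Timeouts/sec',
                             'Lock Wait Time (ms)',
                             'Number of Deadlocks/sec'))
   OR (object_name LIKE '%:Latches%'
        AND counter_name = 'Total Latch Wait Time (ms)');
```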

If you found locking issues, in part 5 you'll see how to determine which queries cause the excessive lock wait times, and how to fix the problem.

Execution Plan Reuse

Before a query is executed, the SQL Server query optimizer compiles a cost effective execution plan. This takes many CPU cycles. Because of this, SQL Server caches the execution plan in memory, in the plan cache. It then tries to match incoming queries with those that have already been cached.

In this section you'll see how to measure how well the plan cache is being used. If there is room for improvement, you'll see in part 6 how to fix this.

Performance Counters

Start by checking the following performance counters on the database server with perfmon:

Category: Processor (_Total)

  • % Processor Time - The percentage of elapsed time that the processor is busy.

Category: SQLServer:SQL Statistics

  • SQL Compilations/sec - Number of batch compiles and statement compiles per second. Expected to be very high after server start up.
  • SQL Re-Compilations/sec - Number of recompiles per second.

These counters will show high values at server start up as every incoming query needs to be compiled. The plan cache sits in memory, so doesn't survive a restart. During normal operation in a system where the data doesn't change much, you would expect compilations per second to be less than 100, and re-compilations per second to be close to zero.

However, in a system with very volatile data, it would be normal for these numbers to be much higher. As you'll see in Part 3 Fixing missing indexes, the optimal execution plan for a query depends on the actual data in the tables the query accesses. So when that data changes often, it makes sense for the execution plans to be recompiled often as well so they remain optimal.

Additionally, when you change the schema, you would expect the execution plans affected by that change to be recompiled as well.

dm_exec_query_optimizer_info

Alternatively, look at the time spent by the server optimizing queries. Because query optimization is heavily CPU bound, almost all of this time is CPU time.

The dynamic management view (DMV) sys.dm_exec_query_optimizer_info gives you the number of query optimizations since the last server restart, and the elapsed time in seconds it took on average to complete them:

SELECT 
  occurrence AS [Query optimizations since server restart], 
  value AS [Avg time per optimization in seconds],
  occurrence * value AS [Time spent optimizing since server restart in seconds]
 FROM sys.dm_exec_query_optimizer_info
 WHERE counter='elapsed time'

Run this query, wait a while and then run it again, to find the time spent optimizing in that period. Be sure to measure the time between runs, so you can work out what proportion of time the server spends optimizing queries.
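
The two measurements can also be taken in a single batch. The sketch below samples the DMV, waits for a minute, samples again and then works out the proportion. The one minute interval and the column aliases are arbitrary choices for this example, so pick a period that is representative for your server:

```sql
DECLARE @before float, @after float, @start datetime;

SET @start = GETDATE();

-- Total optimization time so far, in seconds.
SELECT @before = occurrence * value
FROM sys.dm_exec_query_optimizer_info
WHERE counter = 'elapsed time';

-- Measurement period (hh:mm:ss) - adjust to taste.
WAITFOR DELAY '00:01:00';

SELECT @after = occurrence * value
FROM sys.dm_exec_query_optimizer_info
WHERE counter = 'elapsed time';

SELECT
    @after - @before AS [Seconds spent optimizing],
    DATEDIFF(second, @start, GETDATE()) AS [Seconds elapsed],
    -- NULLIF guards against a division by zero with very short periods.
    100.0 * (@after - @before)
        / NULLIF(DATEDIFF(second, @start, GETDATE()), 0)
        AS [Percentage of elapsed time];
```

On a server where several queries are optimized at the same time, the optimization times add up, so read the percentage as a rough indication rather than an exact share of one CPU.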

sys.dm_exec_cached_plans

The DMV sys.dm_exec_cached_plans provides information on all execution plans in the plan cache. You can combine this with the DMV sys.dm_exec_sql_text to find out how often the plan for a given query has been reused. If you get little reuse for an otherwise busy query or stored procedure, you are getting too little benefit out of the plan cache:

SELECT ecp.objtype, ecp.usecounts, ecp.size_in_bytes, 
  REPLACE(REPLACE(est.text, char(13), ''), char(10), ' ') AS querytext
 FROM sys.dm_exec_cached_plans ecp
   cross apply sys.dm_exec_sql_text(ecp.plan_handle) est
 WHERE cacheobjtype='Compiled Plan'

The column objtype is 'Proc' for stored procedures and 'Adhoc' for ad hoc queries, while column usecounts shows how often a plan has been used.
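
To get a quick overall picture before looking at individual queries, you could summarize the plan cache per object type. This sketch counts the plans that have been used only once, which take up cache space without ever having been reused. The column aliases are made up for this example:

```sql
SELECT
    ecp.objtype,
    COUNT(*) AS plans,
    -- Plans that were compiled and cached, but never reused.
    SUM(CASE WHEN ecp.usecounts = 1 THEN 1 ELSE 0 END) AS single_use_plans,
    SUM(CAST(ecp.size_in_bytes AS bigint)) / 1024 AS total_size_kb
FROM sys.dm_exec_cached_plans ecp
WHERE ecp.cacheobjtype = 'Compiled Plan'
GROUP BY ecp.objtype
ORDER BY total_size_kb DESC;
```

A large number of single use plans with objtype 'Adhoc' would suggest that similar queries are not sharing plans.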

In part 1 you saw how to identify busy queries and stored procedures.

Fragmentation

The data and indexes in a database are organized on disk in 8KB pages. A page is the smallest unit that SQL Server uses to transfer data to or from disk.
When you insert or update data, a page may run out of room. SQL Server then creates another page, and moves half of the contents of the existing page to the new page. That leaves free space in not only the new page but the original page as well. That way, if you keep inserting or updating data in the original page, it doesn't split again and again.

This means that after many updates and inserts, and deletes as well, you'll wind up with lots of pages with empty space. This takes more disk space than needed, but more importantly also slows down reading, because SQL Server now has to read more pages to access data. The pages may also wind up in a different physical order on disk than the logical order in which SQL Server needs to read them. As a result, instead of simply reading each page sequentially right after each other, it needs to wait for the disk head to reach the next page - meaning more delays.

To establish the level of fragmentation for each table and index in the current database, use the following query which uses the DMV dm_db_index_physical_stats:

SELECT o.name AS TableName, i.name AS IndexName, ips.index_type_desc,
 ips.avg_fragmentation_in_percent, ips.page_count,  
 ips.avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(
 DB_ID(), NULL, NULL, NULL, 'Sampled') ips
JOIN sys.objects o ON ips.object_id = o.object_id
JOIN sys.indexes i ON (ips.object_id = i.object_id) AND (ips.index_id = i.index_id)
WHERE (ips.page_count >= 1000) AND (ips.avg_fragmentation_in_percent > 5) AND
      (ips.alloc_unit_type_desc <> 'LOB_DATA') AND
      (ips.alloc_unit_type_desc <> 'ROW_OVERFLOW_DATA')
ORDER BY o.name, i.name

This gives you all tables and indexes that take over 1000 pages and that are more than 5% fragmented. Looking at fragmentation of tables and indexes taking less than a few thousand pages tends to be a waste of time. Fragmentation of 5% may sound small, but the fragmentation might be in a heavily used area of the table or index.

When you see index type CLUSTERED INDEX in an entry, it really refers to the actual table, because the table is part of the clustered index. Index type HEAP refers to a table without a clustered index. This is explained further in Part 3 Fixing missing indexes.

If you found any tables or indexes that are over 5% fragmented and take over a few thousand pages, this may be an issue. Part 7 will show how to defragment tables and indexes. Note that defragmenting is not a silver bullet - it will not necessarily solve any or all performance issues.

Memory

To see whether lack of memory is slowing down the database server, check these counters in perfmon:

Category: Memory

  • Pages/sec - When the server runs out of memory, it stores information temporarily on disk, and then later reads it back when needed - which is very expensive. This counter indicates how often this happens.

Category: SQLServer:Buffer Manager

  • Page Life Expectancy - Number of seconds a page will stay in the buffer pool without being used. The greater the life expectancy, the greater the chance that SQL Server will be able to get a page from memory instead of having to read it from disk.
  • Buffer Cache Hit Ratio - Percentage of pages that were found in the buffer pool, without having to read from disk.

These counters can tell you whether SQL Server has too little memory. This will cause it to store more data on disk, leading to excessive disk I/O, causing greater stress on the CPU and disk.

If Pages/sec is consistently high, or Buffer Cache Hit Ratio is consistently below, say, 90%, adding memory may lighten the load on the disk system - see part 8.

If Page Life Expectancy suddenly becomes much lower than what it normally is on your system and stays low, that would be a cause for concern. You should be able to ignore temporary dips in Page Life Expectancy.

A low Page Life Expectancy is often caused by, for example, queries doing table scans. During a table scan, SQL Server reads the pages making up a table into memory to scan every record, and expels the pages from memory when they are no longer needed - this page churning leads to shorter page life expectancies. Note that the solution here would not be to add memory, but to avoid the table scans by, for example, adding indexes - see Part 3 Fixing missing indexes.

However, if your system is getting busier over time with Page Life Expectancy decreasing over time, it may be time to add memory - see part 8.
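
The two Buffer Manager counters can also be read with a query, via the DMV sys.dm_os_performance_counters. The sketch below assumes the counter names as they appear in that DMV. Note that the DMV stores Buffer cache hit ratio as a raw value that only becomes a percentage after dividing it by the accompanying "base" counter:

```sql
-- Page Life Expectancy, in seconds.
SELECT cntr_value AS [Page life expectancy in seconds]
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Buffer Manager%'
  AND counter_name = 'Page life expectancy';

-- Buffer Cache Hit Ratio as a percentage: raw value divided by its base.
SELECT 100.0 * r.cntr_value / NULLIF(b.cntr_value, 0)
        AS [Buffer cache hit ratio in percent]
FROM sys.dm_os_performance_counters r
JOIN sys.dm_os_performance_counters b
    ON r.object_name = b.object_name
WHERE r.object_name LIKE '%Buffer Manager%'
  AND r.counter_name = 'Buffer cache hit ratio'
  AND b.counter_name = 'Buffer cache hit ratio base';
```

Because a single reading says little, run these at intervals and look at the trend, as described above.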

In addition to the performance counters shown above, you could use the DMV dm_os_sys_memory (SQL Server 2008 and higher):

SELECT 
  total_physical_memory_kb, available_physical_memory_kb, 
  total_page_file_kb, available_page_file_kb, 
  system_memory_state_desc
FROM sys.dm_os_sys_memory

The column system_memory_state_desc shows in human readable form the state of physical memory.

Disk usage

SQL Server is heavily disk bound, so solving disk bottlenecks can make a big difference. If you found memory shortages in the previous section, fix those first, because a memory shortage can lead to excessive disk usage in itself. Otherwise check these counters for each of your disks to see if there is a disk bottleneck for some other reason.

Categories: PhysicalDisk and LogicalDisk

  • Avg. Disk sec/Read - Average time, in seconds, of a read of data from the disk.
  • % Disk Time - Percentage of elapsed time that the selected disk was busy reading or writing.
  • Avg. Disk Queue Length - Average number of read and write requests queued during the sample interval.
  • Current Disk Queue Length - Current number of requests queued.

If Avg. Disk sec/Read is consistently higher than 0.02 seconds, your disk system would need attention, especially if it is consistently higher than 0.05 seconds. If it goes higher during short periods, that could simply be a case of a lot of work coming in at the same time, so wouldn't be an immediate cause of concern.

If your system has only 1 disk and % Disk Time is consistently over 85%, the disk system is severely stressed. However, if you use a RAID array or SAN, % Disk Time could be consistently over 100% without it being a problem.

Avg. Disk Queue Length and Current Disk Queue Length refer to the number of tasks that are queued at the disk controller or are being processed. These counters are less useful for SANs, where queue lengths are difficult to interpret as okay or too high. Otherwise, counter values that are consistently higher than 2 would be an issue. If you use a RAID array where the controller is attached to several disks, multiply this by the number of individual disks in the array - so if there are 2 disks in the array, look for counter values that are consistently higher than 4. Keep in mind that some operations dump a lot of IO transactions in the queue in one go and then do something else while waiting for the disk system.
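
The perfmon counters above are measured per disk. If you want a rough idea of read latency per database file instead, a different technique is to query the dynamic management function sys.dm_io_virtual_file_stats. This is not the same measurement as Avg. Disk sec/Read: the figures are averages over the whole period since the server was started, so they won't show short spikes. A sketch, with made up column aliases:

```sql
SELECT
    DB_NAME(vfs.database_id) AS DatabaseName,
    mf.physical_name,
    vfs.num_of_reads,
    vfs.io_stall_read_ms,
    -- Average wait per read in seconds; NULL if the file was never read.
    vfs.io_stall_read_ms / 1000.0 / NULLIF(vfs.num_of_reads, 0)
        AS [Avg sec per read]
FROM sys.dm_io_virtual_file_stats(NULL, NULL) vfs
JOIN sys.master_files mf
    ON vfs.database_id = mf.database_id AND vfs.file_id = mf.file_id
ORDER BY [Avg sec per read] DESC;
```

Files that stand out here tell you which database or log file to look at first when the perfmon counters point to a disk bottleneck.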

Part 8 will show how to fix disk issues.

CPU

If you found memory or disk issues in the previous sections, fix those first because they will stress the CPU as well. Otherwise check the counters below to see whether the CPU is stressed for another reason.

Category: Processor

  • % Processor Time - Proportion of time that the processor is busy.

Category: System

  • Processor Queue Length - Number of threads waiting to be processed.

If % Processor Time is consistently over 75%, or Processor Queue Length consistently greater than 2, the CPU is probably stressed. Part 8 will show how to resolve this.

Conclusion

In this part, we saw how to pinpoint a number of bottlenecks, including locking issues, lack of execution plan reuse, fragmentation issues and memory, disk and CPU issues. In part 3, we'll start to fix bottlenecks by fixing any missing indexes. We'll have an in depth look at how indexes work under the hood and when and when not to use them.

Speeding up database access - Part 1: Missing indexes

by Matt Perdeck 20. August 2011 12:38

This is part 1 of an 8 part series of articles about speeding up access to a SQL Server database. In parts 1 and 2 we'll see how to pinpoint any bottlenecks, such as missing indexes, expensive queries, locking issues, etc. This allows you to prioritize the biggest bottlenecks. Parts 3 through 8 then show how to fix those bottlenecks:

  • Part 1 Pinpointing missing indexes and expensive queries
  • Part 2 Pinpointing other bottlenecks
  • Part 3 Fixing missing indexes
  • Part 4 Fixing expensive queries
  • Part 5 Fixing locking issues
  • Part 6 Fixing execution plan reuse
  • Part 7 Fixing fragmentation
  • Part 8 Fixing memory, disk and CPU issues

Missing Indexes and Expensive Queries

You can greatly improve the performance of your queries by reducing the number of reads executed by those queries. The more reads you execute, the more you potentially stress the disk, CPU and memory. Secondly, a query reading a resource normally blocks another query from updating that resource. If the updating query has to wait while holding locks itself, it may then delay a chain of other queries. Finally, unless the entire database fits in memory, each time data is read from disk, other data is evicted from memory. If that data is needed later, it then needs to be read back from the disk again.

The most effective way to reduce the number of reads is to create enough indexes on your tables. Just as an index in a book, a SQL Server index allows a query to go straight to the table row(s) it needs, rather than having to scan the entire table. Indexes are not a cure all though - they do incur overhead and slow down updates - so they need to be used wisely.

In this part, you'll see:

  • How to identify missing indexes that would reduce the number of reads in the database.
  • How to identify those queries that create the greatest strain - either because they are used very often or because they are just plain expensive.
  • How to identify superfluous indexes that take resources but provide little benefit.

Missing Indexes

SQL Server allows you to put indexes on table columns, to speed up WHERE and JOIN statements on those columns. When the query optimizer optimizes a query, it stores information about those indexes it would have liked to have used, but weren't available. You can access this information with the dynamic management view (DMV) dm_db_missing_index_details:

select d.name AS DatabaseName, mid.*
 from sys.dm_db_missing_index_details mid
  join sys.databases d ON mid.database_id=d.database_id

The most important columns returned by this query are:

 
| Column | Description |
| --- | --- |
| DatabaseName | Name of the database this row relates to. |
| equality_columns | Comma separated list of columns used with the equals operator, such as: column = value |
| inequality_columns | Comma separated list of columns used with a comparison operator other than the equals operator, such as: column > value |
| included_columns | Comma separated list of columns that could profitably be included in an index. Included columns will be discussed in part 3 "Fixing missing indexes". |
| statement | Name of the table where the index is missing. |

This information is not persistent - you will lose it after a server restart.

Don't take the results of dm_db_missing_index_details as gospel. Not all indexes that it suggests would actually be useful, some are duplicates, etc.
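
One way to decide which suggestions to look at first is to join to the related DMVs dm_db_missing_index_groups and dm_db_missing_index_group_stats, which record how often each suggested index would have been used and the optimizer's estimate of the improvement. The sketch below does that. The estimated_benefit formula is only a rough heuristic for ordering the results, not an official measure:

```sql
SELECT TOP 20
    mid.statement AS TableName,
    mid.equality_columns,
    mid.inequality_columns,
    mid.included_columns,
    migs.user_seeks,
    migs.user_scans,
    migs.avg_total_user_cost,
    migs.avg_user_impact,
    -- Rough ranking only: cost saved per use, times number of uses.
    migs.avg_total_user_cost * migs.avg_user_impact
        * (migs.user_seeks + migs.user_scans) AS estimated_benefit
FROM sys.dm_db_missing_index_details mid
JOIN sys.dm_db_missing_index_groups mig
    ON mid.index_handle = mig.index_handle
JOIN sys.dm_db_missing_index_group_stats migs
    ON mig.index_group_handle = migs.group_handle
ORDER BY estimated_benefit DESC;
```

Like dm_db_missing_index_details itself, these DMVs lose their contents after a server restart.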

An alternative is to use Database Engine Tuning Advisor, which comes with SQL Server 2008 (except for the Express version). This tool analyzes a trace of database operations and attempts to identify an optimal set of indexes that takes the requirements of all queries into account. It even gives you the SQL statements needed to create the missing indexes it identified.

This all sounds great, but keep in mind that Database Engine Tuning Advisor will sometimes come up with the wrong indexes and miss good indexes. Before you make any changes, read part 3 "Fixing missing indexes" first.

The first step is to get a trace of database operations during a representative period. If your database is busiest during business hours, that is probably when you want to run the trace:

1. Start SQL Profiler. Click Start | Programs | Microsoft SQL Server 2008 | Performance Tools | SQL Server Profiler.

2. In SQL Profiler, click File | New Trace.

3. Click the Events Selection tab.

4. You want to minimize the number of events captured, to reduce the load on the server. Deselect every event, except SQL:BatchCompleted and RPC:Completed. It is those events that contain resource information for each batch, and so are used by Database Engine Tuning Advisor to analyze the workload. Make sure that the TextData column is selected for both events.

5. To only capture events related to your database, click the Column Filters button. Click DatabaseName in the left column, expand Like in the right hand pane and enter your database name. Click OK.

 

6. To further cut down the trace and only trace calls from your web site, put a filter on ApplicationName, so only events where this equals the Application Name you set in the connection string will be recorded. If you didn't set the Application Name in the connection string, use ".Net SqlClient Data Provider".

7. Click the Run button to start the trace. You will see batch completions scrolling through the window. At any stage, you can click File | Save or press ctrl-S to save the trace to a file.

8. Save the template, so you don't have to recreate it next time. Click File | Save As | Trace Template. Fill in a descriptive name and click OK. Next time you create a new trace by clicking File | New Trace, you can retrieve the template from the Use the template dropdown.

Sending all these events to your screen takes a lot of server resources. You probably won't be looking at it all day anyway. The solution is to save your trace as a script and then use that to run a background trace. You'll also be able to reuse the script later on.

9. Click File | Export | Script Trace Definition | For SQL Server 2005 - 2008. Save the file with a .sql extension. You can now close SQL Server Profiler, which will also stop the trace.

10. In SQL Server Management Studio, open the .sql file you just created. Find the string InsertFileNameHere and replace it with the full path of the file where you want the log stored. Leave off the extension; the script will set it to .trc. Press ctrl-S to save the .sql file.

11. To start the trace, press F5 to run the .sql file. It will tell you the traceid of this trace.

12. To see the status of this trace and any other traces in the system, in a query window execute the command:

select * from ::fn_trace_getinfo(default)

Find the row with property 5 for your traceid. If the value column in that row is 1, your trace is running. The trace with traceid 1 is a system trace.

13. To stop the trace after it has captured a representative period, assuming your traceid is 2, run the command:

exec sp_trace_setstatus 2,0

To restart it, run:

exec sp_trace_setstatus 2,1

14. To stop and close it, so you can access the trace file, run:

exec sp_trace_setstatus 2,0
exec sp_trace_setstatus 2,2
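
As a variation on step 12, the sketch below shows the file and status of each trace on a single row. It relies on property 2 of fn_trace_getinfo being the trace file name and property 5 the status, with the column aliases made up for this example:

```sql
SELECT
    traceid,
    -- Property 2 holds the trace file name.
    MAX(CASE WHEN property = 2
             THEN CAST(value AS nvarchar(260)) END) AS TraceFile,
    -- Property 5 holds the status: 1 = running, 0 = stopped.
    MAX(CASE WHEN property = 5
             THEN CAST(value AS int) END) AS IsRunning
FROM sys.fn_trace_getinfo(default)
GROUP BY traceid;
```

Once a trace has been closed with status 2, it no longer shows up in this list.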

Now run Database Engine Tuning Advisor:

1. Start Database Engine Tuning Advisor. Click Start | Programs | Microsoft SQL Server 2008 | Performance Tools | Database Engine Tuning Advisor.

2. In the Workload area, select your trace file. In the Database for workload analysis dropdown, select the first database you want analyzed.

3. Under Select databases and tables to tune, select the databases for which you want index recommendations.

4. Especially with a big trace, Database Engine Tuning Advisor may take a long time to do its analysis. On the Tuning Options tab, you can tell it when to stop analyzing. This is just a limit; if it finishes sooner, it will produce results as soon as it is done.

5. To start the analysis, click the Start Analysis button in the toolbar.

Keep in mind that Database Engine Tuning Advisor is just a computer program. Consider its recommendations, but make up your own mind. Be sure to give it a trace with a representative workload, otherwise its recommendations may make things worse rather than better. For example, if you provide a trace that was captured at night when you process few transactions but execute lots of reporting jobs, its advice is going to be skewed towards optimizing reporting, not transactions.

Expensive Queries

If you use SQL Server 2008 or higher, you can use the activity monitor to find recently executed expensive queries. In SSMS, right click your database server (normally in the top left corner of the window) and choose Activity Monitor.

You can get a lot more information by using the DMV dm_exec_query_stats. When the query optimizer creates the execution plan for a query, it caches the plan for reuse. Each time a plan is used to execute a query, performance statistics are kept. You can access those statistics with dm_exec_query_stats:

SELECT
 est.text AS batchtext,
 SUBSTRING(est.text, (eqs.statement_start_offset/2)+1, 
 ((CASE eqs.statement_end_offset WHEN -1 
 THEN DATALENGTH(est.text) 
 ELSE eqs.statement_end_offset END - 
 eqs.statement_start_offset)/2) + 1) AS querytext,
 eqs.creation_time, eqs.last_execution_time, eqs.execution_count, 
 eqs.total_worker_time, eqs.last_worker_time, 
 eqs.min_worker_time, eqs.max_worker_time, 
 eqs.total_physical_reads, eqs.last_physical_reads, 
 eqs.min_physical_reads, eqs.max_physical_reads, 
 eqs.total_elapsed_time, eqs.last_elapsed_time, 
 eqs.min_elapsed_time, eqs.max_elapsed_time, 
 eqs.total_logical_writes, eqs.last_logical_writes, 
 eqs.min_logical_writes, eqs.max_logical_writes,
 eqs.query_plan_hash 
FROM
 sys.dm_exec_query_stats AS eqs
 CROSS APPLY sys.dm_exec_sql_text(eqs.sql_handle) AS est
ORDER BY eqs.total_physical_reads DESC

A limitation of this DMV is that when you run it, not all queries that have run since the last server restart will have a plan in cache. Some plans may have expired due to disuse. Plans that were very cheap to produce, but not necessarily cheap to run, may not have been stored at all.

Another limitation is that as it stands, this query is only suitable for stored procedures. If you use ad hoc queries, the parameters are embedded in the query, so the query optimizer produces a plan for each set of parameters, unless the query has been parameterized. This is further discussed in part 6 "Fixing execution plan reuse".

To get around this, dm_exec_query_stats returns a column query_plan_hash which is the same for each query that has the same execution plan. By aggregating on this column using GROUP BY, you can get aggregate performance data for queries that share the same logic.
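
A sketch of such an aggregation, ordered by physical reads like the query above. The column aliases and the choice of the top 20 are arbitrary:

```sql
SELECT TOP 20
    eqs.query_plan_hash,
    -- Number of cached entries that share this execution plan.
    COUNT(*) AS cached_entries,
    SUM(eqs.execution_count) AS total_executions,
    SUM(eqs.total_worker_time) AS total_worker_time,
    SUM(eqs.total_physical_reads) AS total_physical_reads,
    SUM(eqs.total_logical_writes) AS total_logical_writes,
    SUM(eqs.total_elapsed_time) AS total_elapsed_time
FROM sys.dm_exec_query_stats AS eqs
GROUP BY eqs.query_plan_hash
ORDER BY SUM(eqs.total_physical_reads) DESC;
```

To see what a given query_plan_hash stands for, look up one of its rows in the output of the detailed query above.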

It makes sense to not only look at the cost of running a query once, but also at how often it runs. A very expensive query that runs once a week is less important than a moderately expensive query that runs 100 times per second.

The query returns this information:

| Column | Description |
| --- | --- |
| batchtext | Text of the entire batch or stored procedure containing the query. |
| querytext | Text of the actual query. |
| creation_time | Time that the execution plan was created. |
| last_execution_time | Last time the plan was executed. |
| execution_count | Number of times the plan was executed after it was created. This is not the number of times the query itself was executed - its plan may have been recompiled at some stage. |
| total_worker_time | Total amount of CPU time, in microseconds, that was consumed by executions of this plan since it was created. |
| last_worker_time | CPU time, in microseconds, that was consumed the last time the plan was executed. |
| min_worker_time | Minimum CPU time, in microseconds, that this plan has ever consumed during a single execution. |
| max_worker_time | Maximum CPU time, in microseconds, that this plan has ever consumed during a single execution. |
| total_physical_reads | Total number of physical reads performed by executions of this plan since it was compiled. |
| last_physical_reads | Number of physical reads performed the last time the plan was executed. |
| min_physical_reads | Minimum number of physical reads that this plan has ever performed during a single execution. |
| max_physical_reads | Maximum number of physical reads that this plan has ever performed during a single execution. |
| total_logical_writes | Total number of logical writes performed by executions of this plan since it was compiled. |
| last_logical_writes | Number of logical writes performed the last time the plan was executed. |
| min_logical_writes | Minimum number of logical writes that this plan has ever performed during a single execution. |
| max_logical_writes | Maximum number of logical writes that this plan has ever performed during a single execution. |
| total_elapsed_time | Total elapsed time, in microseconds, for completed executions of this plan. |
| last_elapsed_time | Elapsed time, in microseconds, for the most recently completed execution of this plan. |
| min_elapsed_time | Minimum elapsed time, in microseconds, for any completed execution of this plan. |
| max_elapsed_time | Maximum elapsed time, in microseconds, for any completed execution of this plan. |

An alternative to using dm_exec_query_stats is to analyze the trace you made with SQL Server Profiler. After all, this contains performance data for every completed batch. A batch corresponds to a stored procedure, or a query if you use ad hoc queries.

To investigate this a bit further, load the trace file into a table. You can use Profiler to do this:

1. Start SQL Server Profiler. Click Start | Programs | Microsoft SQL Server 2008 | Performance Tools | SQL Server Profiler.

2. To open the trace file, click File | Open | Trace File, or press Ctrl+O. If you want, you can now analyze the trace in Profiler.

3. To save the trace to a table, click File | Save As | Trace Table. If the table you specify does not yet exist, Profiler will create it.

Alternatively, use fn_trace_gettable, like this:

SELECT * INTO newtracetable
FROM ::fn_trace_gettable('c:\trace.trc', default)

The most obvious way to find the most expensive queries or stored procedures is to aggregate the performance data in the table by query or stored procedure, using GROUP BY. However, when you have a look at the TextData column in the table with trace results, you'll find all queries or stored procedure calls are listed with actual parameter values. To aggregate them, you'll have to filter out those values.

If you send stored procedure calls to the database, good on you. In that case, it isn't too hard to remove the parameters, because they always come after the stored procedure name. Here is a SQL script that does exactly that and then aggregates the performance data per stored procedure (stored procedures are discussed further in part 6, which is about execution plan reuse):

-- This code assumes that table newtracetable holds trace information
-- produced by SQL Server Profiler.
-- It also assumes that all lines have a stored procedure call in the
-- TextData field, of the form
-- exec sprocname param1 param2 ...
--
-- This code first produces an intermediate table sprocdata, with the
-- parameters removed from the stored procedure calls.
-- It then produces a table sprocaggregateddata with aggregated CPU,
-- reads, writes and duration info per stored procedure.
IF EXISTS(select * FROM sys.objects WHERE name='fnSprocName' AND type='FN')
 DROP FUNCTION fnSprocName
GO
CREATE FUNCTION fnSprocName
(
 @textdata nvarchar(4000)
)
RETURNS nvarchar(100)
AS
BEGIN
 DECLARE @spaceidx int
 SET @spaceidx = CHARINDEX(' ', @textdata, 6)

 IF @spaceidx > 0
 -- The name starts at position 6 (after 'exec ') and ends just before the space.
 RETURN SUBSTRING(@textdata, 6, @spaceidx - 6)
 RETURN RIGHT(@textdata, LEN(@textdata) - 5)
END
GO
IF EXISTS(select * FROM sys.objects WHERE name='sprocdata' AND type='U')
 DROP TABLE sprocdata
GO
SELECT dbo.fnSprocName(TextData) AS SprocName, CPU, Reads, Writes, Duration
INTO sprocdata
FROM newtracetable
CREATE CLUSTERED INDEX [IX_sprocdata] ON [dbo].[sprocdata]([SprocName] ASC)
IF EXISTS(select * FROM sys.objects WHERE name='sprocaggregateddata' AND type='U')
 DROP TABLE sprocaggregateddata
GO
SELECT
 SprocName, 
 MIN(CPU) AS MinCpu, MAX(CPU) AS MaxCpu, AVG(CPU) AS AvgCpu, SUM(CPU) AS SumCpu, 
 MIN(Reads) AS MinReads, MAX(Reads) AS MaxReads, AVG(Reads) AS AvgReads, 
 SUM(Reads) AS SumReads, 
 MIN(Writes) AS MinWrites, MAX(Writes) AS MaxWrites, AVG(Writes) AS AvgWrites,
 SUM(Writes) AS SumWrites, 
 MIN(Duration) AS MinDuration, MAX(Duration) AS MaxDuration, AVG(Duration) AS AvgDuration,
 SUM(Duration) AS SumDuration 
INTO sprocaggregateddata
FROM sprocdata
GROUP BY SprocName
SELECT * FROM sprocaggregateddata

If you send ad hoc queries, removing the variable bits of the queries will be a lot harder, because their locations are different for each query. The following resources may make your job a bit easier:
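If you want to try this yourself, one rough approach is to replace every literal value in the query text with a placeholder, so that queries differing only in their values end up with the same key. The sketch below does this in Python with regular expressions. It is not a SQL parser, the function names normalize_query and aggregate are made up for this illustration, and it assumes you have already fetched the TextData, CPU, Reads, Writes and Duration columns from the trace table into tuples.

```python
import re
from collections import defaultdict

# Patterns for the "variable bits": quoted strings first, then bare numbers.
_STRING_LITERAL = re.compile(r"N?'(?:[^']|'')*'")
_NUMBER_LITERAL = re.compile(r"(?<![\w@])\d+(?:\.\d+)?\b")
_WHITESPACE = re.compile(r"\s+")


def normalize_query(text_data):
    """Replace literal values with '?' so similar queries share one key."""
    text = _STRING_LITERAL.sub("?", text_data)
    text = _NUMBER_LITERAL.sub("?", text)
    return _WHITESPACE.sub(" ", text).strip().lower()


def aggregate(rows):
    """Sum the counters per normalized query.

    rows: iterable of (TextData, CPU, Reads, Writes, Duration) tuples.
    Returns {normalized query: [executions, cpu, reads, writes, duration]}.
    """
    totals = defaultdict(lambda: [0, 0, 0, 0, 0])
    for text_data, cpu, reads, writes, duration in rows:
        entry = totals[normalize_query(text_data)]
        entry[0] += 1
        for i, value in enumerate((cpu, reads, writes, duration), start=1):
            entry[i] += value
    return dict(totals)
```

Expect to tweak the patterns for your own queries, for example if they contain date literals or IN lists of varying length.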

Once you've identified the most expensive queries, you can find out whether adding indexes would speed up their execution.

1. Open a query window in SSMS.

2. From the Query menu, choose Include Actual Execution Plan. Or press Ctrl+M.

3. Copy an expensive query into the query window and execute it. Above the results pane, you will see a tab Execution plan. Click that tab.

4. If the query optimizer found that an index was missing, you will see a message in green.

5. For more information, right click in the lower pane and choose Show Execution Plan XML. In the XML, look for the MissingIndexes element.

If you identified missing indexes, in part 3 you'll see how indexes work and how to create them. If you found any particularly expensive queries, in part 4 you'll find how to fix those.

Unused Indexes

Indexes not only speed up reads, they also slow down updates and take storage space. If an index slows down updates but is used little or not at all for reading, you're better off dropping it.

Use the DMV dm_db_index_usage_stats to get usage information on each index:

SELECT d.name AS 'database name', t.name AS 'table name', i.name AS 'index name', ius.*
 FROM sys.dm_db_index_usage_stats ius
 JOIN sys.databases d ON d.database_id = ius.database_id AND ius.database_id=db_id()
 JOIN sys.tables t ON t.object_id = ius.object_id
 JOIN sys.indexes i ON i.object_id = ius.object_id AND i.index_id = ius.index_id
 ORDER BY user_updates DESC

This gives you the name and table of each index in the current database that has seen activity since the last server restart, and the number of updates and reads since the last server restart. Specifically, the DMV sys.dm_db_index_usage_stats shows how many times the query optimizer used an index in an execution plan. You'll find the number of updates caused by INSERT, UPDATE or DELETE operations in column user_updates, while the number of reads is in columns user_seeks, user_scans and user_lookups. If the number of updates is high in relation to the number of reads, consider dropping the index, like this:

DROP INDEX IX_Title ON dbo.Book
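If the query returns many rows, you may want to filter them rather than eyeball them. Here is a minimal Python sketch that does so, assuming each row has been loaded into a dictionary keyed by the column names used in the query above. The function name and both thresholds (min_updates and max_read_ratio) are arbitrary values chosen for this illustration, not recommendations.

```python
def find_update_heavy_indexes(rows, min_updates=1000, max_read_ratio=0.1):
    """Return (table, index, updates, reads) for indexes that are mostly written to.

    rows: iterable of dicts with the keys 'table name', 'index name',
    'user_updates', 'user_seeks', 'user_scans' and 'user_lookups'.
    """
    candidates = []
    for row in rows:
        reads = row["user_seeks"] + row["user_scans"] + row["user_lookups"]
        updates = row["user_updates"]
        # Flag the index if it is updated a lot but read comparatively rarely.
        if updates >= min_updates and reads <= updates * max_read_ratio:
            candidates.append((row["table name"], row["index name"], updates, reads))
    # Most heavily updated first, mirroring ORDER BY user_updates DESC.
    return sorted(candidates, key=lambda c: c[2], reverse=True)
```

Treat the result as a list of candidates to investigate. Because the counters restart with the server, an index that looks unused may simply not have been needed yet, for example by a month-end report.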

You may see clustered indexes being updated. In part 3, we'll see how the table itself is part of the clustered index - which means that any table update is also an update of the clustered index.

Conclusion

In this part, we saw how to pinpoint missing indexes and expensive queries through a number of DMVs and SQL Server Profiler. In part 2, we'll see how to pinpoint additional bottlenecks, such as locking and fragmentation issues.

Making the most out of IIS compression - Part 2: Configuring IIS 6 compression

by Matt Perdeck 20. August 2011 09:46

Introduction

In part 1 of this series, we saw how to use compression in IIS 7.

As of July 2011, IIS 7's predecessor, IIS 6, is still used by 72.4% of all the websites that use Microsoft-IIS (source). Additionally, using compression in IIS 6 is a lot harder than in IIS 7. Hence this article on IIS 6 compression.

Getting started

Unfortunately, configuring compression on IIS 6 is far from straightforward. It involves four steps:

  1. Switch on compression in the IIS Manager;
  2. Set permissions on the folder where compressed static files are cached;
  3. Update the metabase;
  4. Reset the IIS server.

Let's go through each step in turn.

Switch on compression in the IIS Manager

This consists of the following steps:

  1. Start IIS manager: Click on Start | Administrative Tools | Internet Information Services (IIS) Manager.
  2. Back up the metabase: Right-click on your server and then click on All Tasks | Backup/Restore Configuration. Click on the Create Backup button, enter a name for your backup, such as today's date, and click on OK. Finally, click on Close to get back to the IIS manager.
  3. Expand your server. Right-click on the Web Sites node and click on Properties | Service.

  4. If your server has enough spare CPU capacity, select Compress application files. Because IIS 6 does not cache compressed dynamic content, this will cause IIS to compress dynamic files on the fly for every request. As a result, dynamic file compression takes more CPU than compressing static files.
  5. Select Compress static files.
  6. The temporary directory is where compressed static files are cached. Leave the default for Temporary directory. Or enter a different directory, for example if your system drive is low on space. Make sure it sits on an uncompressed and unshared local NTFS volume.
  7. Set a maximum size for the temporary directory.
  8. Click on OK.

Set permissions on the folder where compressed static files are cached

For static compression to work, the IIS_WPG group or the identity of the application pool must have Full Control access to the folder where the compressed files are stored.

Unless you changed the folder in the previous step (in the Temporary directory field), it will be at C:\WINDOWS\IIS Temporary Compressed Files.

  1. Right-click on the folder and click on Properties | Security.
  2. Click on the Add button and add the IIS_WPG group, or the identity of the application pool.

  3. Allow Full Control to the identity you just added and click on OK.

IIS_WPG or IUSR_{machinename}?

There is conflicting advice on various websites as to whether you should give the IIS_WPG group or the IUSR_{machinename} account access to the folder where the compressed files are stored. However, my testing with a clean install of Windows Server 2003 and IIS 6 has shown that IIS 6 will only compress static files if the IIS_WPG group has Full Control access to that folder, irrespective of the level of access by IUSR_{machinename}.

Update the metabase

Next modify the metabase:

  1. Get IIS to allow you to edit the metabase. In IIS manager, right-click on your IIS server near the top of the tree on the left-hand side. Click on Properties, check Enable Direct Metabase Edit, and click on OK.
  2. You'll normally find the metabase in directory C:\Windows\system32\inetsrv, in file metabase.xml. Open that file with a text editor.
  3. Find the IIsCompressionScheme elements. There should be two of these: one for the deflate compression algorithm and one for gzip.
  4. In both the elements, extend the HcFileExtensions property with the extensions you need for static files used in your pages, such as .css and .js. You will wind up with something like the following:

    HcFileExtensions="htm
      html
      css
      js
      xml
      txt"

    Keep in mind that there is no point in including image files here such as .gif, .jpg, and .png. These files are already compressed because of their native format.

  5. Also in both elements, extend the HcScriptFileExtensions property with the extensions you need for dynamic files, such as .aspx. You will wind up with something like the following:

    HcScriptFileExtensions="asp
      dll
      exe
      aspx
      asmx
      ashx"

  6. The compression level for dynamic files is set by the HcDynamicCompressionLevel property in both IIsCompressionScheme elements. By default, this is set to zero, which is too low. The higher you set this, the better the compression but the greater the CPU usage. You might want to test different compression levels to see which one gives you the best tradeoff between CPU usage and file size. Start testing at a low compression level and then increase this until CPU usage gets too high. The compression level can be between 0 and 10:

    HcDynamicCompressionLevel="1"

  7. The compression level for static files is set by the HcOnDemandCompLevel property in both IIsCompressionScheme elements. By default, this is set to 10, meaning maximum compression. Because compressed static files are cached (so that static files are not compressed for each request), this causes little CPU usage. As a result, you will want to stick with the default.
  8. Save the file.
  9. Disallow editing of the metabase. Right-click on your IIS server, click on Properties, uncheck Enable Direct Metabase Edit, and click on OK.

You'll find a full list of the available metabase properties here.

Instead of editing the metabase directly, you can run the adsutil.vbs utility from the command line to change the metabase. This allows you to write a script so you can quickly update a number of servers. For example, setting HcDoStaticCompression to true will enable static compression. This is done as follows:

  1. Open command prompt and change the directory to C:\Inetpub\AdminScripts
  2. Run the following command:

    cscript adsutil.vbs set w3svc/filters/compression/parameters/HcDoStaticCompression true

More information about adsutil.vbs is here
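If you need to make the same changes on several servers, you could generate the adsutil.vbs command lines with a small script. The Python sketch below only builds the command strings, it doesn't run them. The gzip and deflate paths are an assumption based on where the two IIsCompressionScheme elements sit in the metabase, and the function names are made up for this illustration, so check the generated commands against your own metabase before running them from C:\Inetpub\AdminScripts.

```python
BASE = "w3svc/filters/compression"


def adsutil_set(path, *values):
    """Build one 'cscript adsutil.vbs set' command line (it is not executed)."""
    joined = " ".join(str(value) for value in values)
    return f"cscript adsutil.vbs set {BASE}/{path} {joined}"


def build_commands(static_exts, dynamic_exts, dynamic_level):
    """Return the commands that mirror the manual metabase edits."""
    commands = [adsutil_set("parameters/HcDoStaticCompression", "true")]
    # Assumed layout: one node per compression scheme under the base path.
    for scheme in ("gzip", "deflate"):
        commands.append(adsutil_set(f"{scheme}/HcFileExtensions", *static_exts))
        commands.append(adsutil_set(f"{scheme}/HcScriptFileExtensions", *dynamic_exts))
        commands.append(adsutil_set(f"{scheme}/HcDynamicCompressionLevel", dynamic_level))
    return commands


if __name__ == "__main__":
    for command in build_commands(
        ["htm", "html", "css", "js", "xml", "txt"],
        ["asp", "dll", "exe", "aspx", "asmx", "ashx"],
        1,
    ):
        print(command)
```

You could paste the output into a batch file and run that on each server, followed by iisreset.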

Reset the IIS server

Finally, reset the server, so that it picks up your changes. Right-click on your IIS server and then click on All Tasks | Restart IIS.

Alternatively, open a command prompt and run:

iisreset

Static files are not always served compressed

If a static file is requested and there is no compressed version of the file in the temporary directory where compressed static files are cached, IIS 6 will send the uncompressed version of the file. Only once that's done does it compress the file and store the compressed version in the temporary directory, ready to be served in response to a subsequent request for the file.
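You can observe this yourself by requesting the same static file twice and comparing the Content-Encoding response header of both responses. Here is a minimal Python sketch using only the standard library. The function names are made up for this illustration, and any URL you pass in is your own. Note that urllib doesn't decompress the response for you, which is exactly what we want here.

```python
import urllib.request


def content_encoding(headers):
    """Return the response's content encoding, or 'identity' if uncompressed."""
    value = headers.get("Content-Encoding") or ""
    return value.strip().lower() or "identity"


def fetch_encoding(url):
    """Request url while advertising gzip/deflate support; report what came back."""
    request = urllib.request.Request(url, headers={"Accept-Encoding": "gzip,deflate"})
    with urllib.request.urlopen(request) as response:
        response.read()  # drain the body; urllib leaves it compressed
        return content_encoding(response.headers)


def first_and_second(url):
    """Fetch the same URL twice and return both encodings."""
    return fetch_encoding(url), fetch_encoding(url)
```

Call first_and_second with the address of a .css or .js file on your server right after you have emptied the temporary directory. If the two values differ, you have caught IIS 6 serving the first request uncompressed.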

Summary

In this article we saw how enabling compression in IIS 6 involves four steps - switching on compression, setting permissions so compressed static files can be cached, updating the metabase and finally resetting the IIS 6 server.

If you enjoyed this series and want to know the full story on how to improve ASP.NET site performance, from database server to web server to browser, consider my book ASP.NET Site Performance Secrets.

Tags:

Web Server

Making the most out of IIS compression - Part 1: configuring IIS 7 compression

by Matt Perdeck 20. August 2011 09:44

Introduction

Using compression is the single most effective way to reduce page load times. The .aspx files sent by the server to the browser consist of HTML. HTML is highly compressible by algorithms such as gzip. Because of this, modern web servers including IIS 5 and later have the ability to compress outgoing files, and modern browsers have the ability to decompress incoming files.

Both IIS 6 and IIS 7 offer advanced compression related options that help you get better performance improvements for your web site and make better use of your servers and bandwidth. Unfortunately, these options are not always easy to access. This article series shows step by step how to unlock these options.

In the first article in this two-part series, we'll focus on configuring IIS 7 compression. If you are used to IIS 6, you'll find that IIS 7 offers many new features, including the ability to cache not only compressed static files, but also compressed dynamic files. If you still use IIS 6, the next article in the series will show how to configure IIS 6 compression.

Request and response headers involved in compression

How does the server know that the browser can accept compressed content? And how does the browser know that the content it received is compressed?

When a browser that supports compression sends a request to the server, it includes the request header Accept-Encoding telling the server which compression algorithms it supports. For example:

Accept-Encoding: gzip,deflate

If the server then uses compression for its response, it includes the Content-Encoding response header, which is itself sent uncompressed, to say how the file has been compressed, as shown:

Content-Encoding: gzip

This keeps the browser and server compression-wise in sync. However, it isn't only browsers and servers that send and receive requests and responses, but proxies as well. And proxies can cache responses and serve subsequent requests from their cache. When a proxy caches a compressed file, how do we make sure that the proxy doesn't send that compressed file to a browser that can't process compressed files?

The solution adopted by IIS 6 and IIS 7 is to tell the proxy that the Accept-Encoding request headers must match: if it receives a request without the Accept-Encoding request header, it must not serve a file that was sent in response to a request that did have that header, and vice versa. IIS 6 and 7 make this happen by sending a Vary header in the response when compression is enabled, as shown:

Vary: Accept-Encoding
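To make the interplay of these three headers a bit more concrete, here is a minimal Python sketch of what a server does when it negotiates compression. This is a toy illustration, not how IIS implements it: the function names are made up, only gzip is handled, and q-values other than zero are ignored.

```python
import gzip


def accepts_gzip(accept_encoding):
    """True if the Accept-Encoding request header value allows gzip."""
    if not accept_encoding:
        return False
    for item in accept_encoding.split(","):
        name, _, params = item.strip().partition(";")
        if name.strip().lower() != "gzip":
            continue
        # "gzip;q=0" means the client explicitly refuses gzip.
        q = params.strip().lower().replace(" ", "")
        return q not in ("q=0", "q=0.0", "q=0.00", "q=0.000")
    return False


def build_response(body, request_headers):
    """Return (response headers, body), compressing only if the client allows it."""
    # Always tell proxies that the response depends on Accept-Encoding.
    headers = {"Vary": "Accept-Encoding"}
    if accepts_gzip(request_headers.get("Accept-Encoding")):
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    headers["Content-Length"] = str(len(body))
    return headers, body
```

Note that the Vary header goes out whether or not this particular response was compressed, so a proxy never hands an uncompressed copy to a client that asked for gzip either.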

IIS 6 also lets you override the Cache-Control and Expires headers for compressed files via properties in its metabase. This allows you to suppress proxy caching for compressed files. The IIS 6 metabase will be described in part 2, about configuring IIS 6 compression. The metabase properties that override the Cache-Control and Expires headers can be found here.

Starting configuration of IIS 7 compression

Before you start configuration of IIS 7 compression, find out at the following sites whether your pages already use compression, and if so, how much bandwidth you're saving:

Enabling compression in IIS 7 essentially consists of these steps:

  1. Installing the dynamic content compression module;
  2. Enabling Compression;
  3. Configuring advanced features.

Let's go through these steps one by one.

Installing the dynamic content compression module

If you want to use compression for dynamic files, first install the dynamic content compression module. The steps to do this are different depending on whether you use Vista/Windows 7 or Windows Server 2008.

On Windows Server 2008:

  1. Click Start | Administrative Tools | Server Manager.
  2. On the left-hand side, expand Roles and then click on Web Server (IIS).

  3. Scroll down to the Role Services section and then click on Add Role Services. The Add Role Services wizard opens:

  4. On the Select Role Services page, scroll down to the Performance section and select Dynamic Content Compression. Click on Next.

  5. Read the message and click Install.
  6. Once the installation is done, close the wizard.

On Vista or Windows 7:

  1. Click on Start | Control Panel | Programs | Turn Windows features on or off. The Windows Features dialog opens.
  2. Expand Internet Information Services, expand World Wide Web Services, and expand Performance Features. Select Http Compression Dynamic.
  3. Click on OK. Wait for the feature to be configured.

Enabling compression

Now enable compression in the IIS manager:

  • Open IIS manager. Click on Start | Control Panel. Type admin in the search box. Click on Administrative Tools. Double-click on Internet Information Services (IIS) Manager.
  • Click on your machine. Then double-click on the Compression icon on the right-hand side.
  • The compression window opens. Here you can enable compression for dynamic content and static content. The window shows the following items:
    • Enable dynamic content compression: Unless your server already uses a lot of CPU, you will want to enable dynamic content compression.
    • Enable static content compression: You can safely enable static content compression because compressed static content gets cached. So, only the initial compression takes CPU cycles.
    • Only compress files larger than (in bytes): It makes sense to not compress small files. Because compression produces some overhead in the file, compressing a small file may actually make it bigger rather than smaller.
    • Cache directory: This is where compressed static files are stored. If you are short on disk space on the system drive, consider putting this on another drive. Make sure that the drive is a local drive or NTFS partition, and that it isn't compressed or shared.
    • Per application pool disk space limit (in MB): If you have lots of application pools and limited disk space, you may want to adjust this. If you have 100 application pools and you leave this at 100MB, 100 x 100MB = 10GB may be used to cache static compressed files.
  • On the right-hand side of the window, click on Apply. Compression is now enabled.
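If you're wondering why compressing a small file can make it bigger, you can see the overhead for yourself with Python's gzip module. This is just an illustration with made-up sample content. IIS uses its own compression code, so the exact numbers on your server will differ.

```python
import gzip


def gzip_sizes(payload):
    """Return (original size, gzip-compressed size) in bytes."""
    return len(payload), len(gzip.compress(payload))


# A tiny fragment and a larger, repetitive page (both made up for this demo).
small = b"<p>Hi</p>"
large = b"<p>Hello, world</p>\n" * 500

if __name__ == "__main__":
    for label, payload in (("small", small), ("large", large)):
        original, compressed = gzip_sizes(payload)
        print(f"{label}: {original} bytes -> {compressed} bytes")
```

The gzip format adds a fixed header and trailer to every file, which is why the tiny fragment grows while the larger page shrinks.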

Setting compression by site, folder, or file

In addition to enabling or disabling compression for all sites on the server, you can enable or disable compression at a site level, or even a folder or file level.

To make this work:

  1. Open the IIS Manager and in the left-hand side click on the site, folder, or file whose compression status you want to change.
  2. Make sure that the middle pane is switched to Features View, and double-click on the Compression icon.
  3. This will open a window where you can enable or disable compression for dynamic or static files:

Compression level

You can tweak the tradeoff between compression and CPU usage by setting the compression level. The higher the compression level, the greater the compression and CPU usage.

The compression level can be set separately for static and dynamic files. For static files, use 9, the highest level. For dynamic files, compression level 4 seems to be the sweet spot, as shown in this study.

However, the optimal compression level for your website may be different, depending on how much spare CPU capacity you have, the compressibility of your pages, and your bandwidth costs. Experiment with different levels to see which one works best for you.
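To get a feel for this tradeoff before you touch the server, you could run a quick experiment with Python's zlib module, which also uses levels up to 9. This is a rough sketch with a made-up sample page. zlib is not the compression code that IIS uses, so treat the outcome as an indication of the trend only, and substitute one of your own pages for the sample.

```python
import time
import zlib


def compare_levels(payload, levels=range(1, 10)):
    """Return a list of (level, compressed size in bytes, seconds taken)."""
    results = []
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(payload, level)
        elapsed = time.perf_counter() - start
        results.append((level, len(compressed), elapsed))
    return results


# Made-up sample: a long HTML table, roughly what a listing page looks like.
sample = "".join(
    f"<tr><td>{i}</td><td>Book title {i}</td><td>{i * 3 % 17}.99</td></tr>\n"
    for i in range(2000)
).encode("utf-8")

if __name__ == "__main__":
    print(f"original: {len(sample)} bytes")
    for level, size, seconds in compare_levels(sample):
        print(f"level {level}: {size} bytes in {seconds * 1000:.2f} ms")
```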

To set the compression level:

  1. Execute this from the command prompt:

    C:\Windows\System32\Inetsrv\Appcmd.exe
      set config -section:httpCompression 
      -[name='gzip'].staticCompressionLevel:9 
      -[name='gzip'].dynamicCompressionLevel:4

    (Enter this as a single line. It sets compression level 9 for static files and compression level 4 for dynamic files.)

  2. Reset the IIS server to make the new compression level take effect. In IIS Manager, click on the server at the top of the tree and then click on Restart on the right-hand side.

Disabling compression based on CPU usage

To make sure that compression doesn't overload the CPU, IIS 7 calculates average CPU usage every 30 seconds. It automatically switches off compression when CPU usage exceeds a given limit. Then when CPU usage drops below a second limit, it switches on compression again.

The default values for these limits are:

                 Switch compression off at    Switch back on at
                 (CPU usage)                  (CPU usage)
Dynamic files    90 percent                   50 percent
Static files     100 percent                  50 percent

Note that this means that if CPU usage on your server is consistently over 50 percent and then spikes over 90 percent, compression for dynamic files will be switched off and will not be switched back on for as long as CPU usage stays above 50 percent.
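The effect of having two separate limits is easier to see in a small simulation. The Python sketch below is a toy model of the behavior described above, not IIS code. The function name is made up, and each entry in the list stands for one average CPU usage reading.

```python
def simulate(cpu_samples, disable_at=90, enable_at=50):
    """Return, for each CPU usage sample, whether compression is on afterwards."""
    enabled = True
    states = []
    for cpu in cpu_samples:
        if enabled and cpu > disable_at:
            enabled = False  # usage exceeded the upper limit: switch off
        elif not enabled and cpu < enable_at:
            enabled = True  # usage dropped below the lower limit: switch back on
        states.append(enabled)
    return states


if __name__ == "__main__":
    # A busy server that never drops below 50 percent after the spike.
    print(simulate([60, 95, 70, 60, 55]))
    # The same spike, but usage later falls to 40 percent.
    print(simulate([60, 95, 70, 40, 60]))
```

In the first run compression stays off after the spike, because usage never falls below the lower limit. Raising that limit, as described next, makes it easier for compression to come back on.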

You can change these limits by modifying the applicationHost.config file, which is normally in folder C:\Windows\System32\inetsrv\config:

  1. Make a backup copy of applicationHost.config.
  2. Open applicationHost.config with a text editor.
  3. Find the <httpCompression> section.
  4. To change the CPU usage at which compression for dynamic files is switched back on to 70 percent, add the dynamicCompressionEnableCpuUsage attribute to the httpCompression element, as shown:

    <httpCompression dynamicCompressionEnableCpuUsage="70" .... >

    Note that you provide a number to the attribute, not a percentage, so don't write a percentage sign when setting the attribute. The value 70 shown here is simply an example, not a recommendation. You need to determine the optimal value for your own site.

  5. Save the applicationHost.config file.
  6. Reset the IIS server to make the new limits take effect. Start IIS Manager, click on the server at the top of the tree, and then click on Restart on the right-hand side.
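If you'd rather script this edit than make it by hand, the following Python sketch shows the idea on a cut-down, made-up configuration fragment. It uses the standard xml.etree.ElementTree module and a hypothetical function name. Be aware that ElementTree throws away XML comments and may change the formatting when it writes the document back, so this is an assumption-laden starting point rather than something to point at your live applicationHost.config without a backup.

```python
import xml.etree.ElementTree as ET

# The four attributes that hold the CPU usage limits.
CPU_LIMIT_ATTRIBUTES = {
    "dynamicCompressionDisableCpuUsage",
    "dynamicCompressionEnableCpuUsage",
    "staticCompressionDisableCpuUsage",
    "staticCompressionEnableCpuUsage",
}


def set_cpu_limits(config_xml, **limits):
    """Return config_xml with the given limits set on <httpCompression>."""
    root = ET.fromstring(config_xml)
    section = next(root.iter("httpCompression"), None)
    if section is None:
        raise ValueError("no <httpCompression> section found")
    for name, value in limits.items():
        if name not in CPU_LIMIT_ATTRIBUTES:
            raise ValueError(f"unknown attribute: {name}")
        if not 0 <= int(value) <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
        # A plain number, without a percentage sign.
        section.set(name, str(int(value)))
    return ET.tostring(root, encoding="unicode")


# Cut-down fragment for demonstration only; the real file is much larger.
sample = """<configuration>
  <system.webServer>
    <httpCompression directory="cache">
      <scheme name="gzip" dll="gzip.dll" />
    </httpCompression>
  </system.webServer>
</configuration>"""

if __name__ == "__main__":
    print(set_cpu_limits(sample, dynamicCompressionEnableCpuUsage=70))
```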

In case you want to change any of the other limits, here are the matching attributes:

                 Switch compression off at            Switch back on at
                 (CPU usage)                          (CPU usage)
Dynamic files    dynamicCompressionDisableCpuUsage    dynamicCompressionEnableCpuUsage
Static files     staticCompressionDisableCpuUsage     staticCompressionEnableCpuUsage

If you want to stop IIS from ever switching off compression based on CPU usage, set all these attributes to 100.

You will find all the elements and attributes that can be used with httpCompression here.

Setting the request frequency threshold for static compression

As you saw earlier, IIS 7 caches the compressed versions of static files. So, if a request arrives for a static file whose compressed version is already in the cache, it doesn’t need to be compressed again.

But what if there is no compressed version in the cache? Will IIS 7 then compress the file right away and put it in the cache? The answer is yes, but only if the file is being requested frequently. By not compressing files that are only requested infrequently, IIS 7 saves CPU usage and cache space.

By default, a file is considered to be requested frequently if it is requested two or more times per 10 seconds. This is determined by two attributes in the serverRuntime element in web.config:

serverRuntime attribute    Description

frequentHitThreshold       Number of times a URL must be requested within the time span specified in the frequentHitTimePeriod attribute to be considered frequently hit. Must be between 1 and 2147483647. Default is 2.

frequentHitTimePeriod      Time interval in which a URL must be requested the number of times specified in the frequentHitThreshold attribute before it is considered to be frequently hit. Default is 10 seconds.

This means that when a static file is requested for the very first time, it won’t be compressed.

For example, to specify that static files need to be hit seven times per 15 seconds before they will be compressed, use:

<configuration>
...
  <system.webServer>
    <serverRuntime frequentHitThreshold="7" frequentHitTimePeriod="00:00:15" />
  </system.webServer>
...
</configuration>
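To picture how these two attributes work together, here is a small Python model of a frequent hit check. IIS's actual bookkeeping isn't documented in this article, so the sliding window used here is an assumption, and the class and method names are made up for this illustration.

```python
from collections import defaultdict, deque


class FrequentHitTracker:
    """Toy model of frequentHitThreshold and frequentHitTimePeriod."""

    def __init__(self, threshold=2, period_seconds=10.0):
        self.threshold = threshold
        self.period = period_seconds
        self._hits = defaultdict(deque)  # url -> times of recent requests

    def record_hit(self, url, now):
        """Record a request at time now (in seconds).

        Returns True if the URL now counts as frequently hit, meaning its
        compressed version would be worth creating and caching.
        """
        hits = self._hits[url]
        hits.append(now)
        # Forget requests that fall outside the time period.
        while hits and now - hits[0] > self.period:
            hits.popleft()
        return len(hits) >= self.threshold
```

With the default settings, the first request for a file never qualifies, a second request five seconds later does, and a request half a minute after that starts from scratch again.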

Caching compressed dynamic files

You've seen that IIS 7 caches only the compressed version of static files, and that dynamic files are compressed for each request (provided that dynamic file compression is enabled). This means that compressing dynamic files takes much more CPU than static files.

That makes sense if the dynamic files are different for each visitor, for example if each page contains personal information. However, if the dynamic pages are fairly static and the same for all visitors, it makes sense to cache their compressed versions too.

You may already use the ASP.NET OutputCache directive to cache your .aspx pages. The issue is that by default, IIS stores the uncompressed version of the file in the output cache, rather than the compressed version. For each request, IIS then has to compress the contents of the cache before sending it to the browser. This is not very efficient.

Storing compressed files in the output cache

Here is how to get IIS to cache the compressed version of the file, rather than the uncompressed version. That way, it doesn't have to compress the file for each request, reducing CPU usage.

Because this uses ASP.NET output caching, you need to use the OutputCache directive in your pages, as shown:

<%@ OutputCache Duration="300" VaryByParam="none" %>

This caches the page for 300 seconds.

Now to get IIS to cache the compressed version rather than the uncompressed version, modify the applicationHost.config file. You'll normally find this file in folder C:\Windows\System32\inetsrv\config:

  1. Make a backup copy of applicationHost.config.
  2. Open applicationHost.config with a text editor.
  3. Find the <urlCompression> section.
  4. Add the dynamicCompressionBeforeCache="true" attribute to the urlCompression element, as shown:

    <urlCompression dynamicCompressionBeforeCache="true" ...  />

  5. Save the applicationHost.config file.
  6. Reset the IIS server to make the new attribute take effect. Start IIS Manager, click the server at the top of the tree, and then click Restart on the right-hand side.

What if a client doesn't accept compressed content?

Now that we're caching compressed content, what happens if someone visits your site with a browser that doesn't accept compressed content?

To simulate this eventuality, let's send a request to the server that doesn't have the Accept-Encoding request header. This should force the server to send uncompressed content.

To do this, we'll use Fiddler, a free proxy which allows you to "fiddle" with requests and responses while they are travelling between browser and server. It's easiest to do this with Firefox:

  1. Open Firefox and download Fiddler at http://www.fiddler2.com/fiddler2/
  2. Install Fiddler and start it.
  3. On the Firefox status bar, switch on forcing traffic to Fiddler.

  4. At this stage, when you visit a page with Firefox, the server should still use compression (assuming the request has an Accept-Encoding request header allowing gzip compression).

    You can measure the actual size of your compressed pages as they will travel over the Internet to the browser, using the Web Developer add-on for Firefox:

    1. Using Firefox, visit http://chrispederick.com/work/web-developer to download and install the Web Developer add-on.
    2. After you have installed Web Developer, load the page again.
    3. Right-click anywhere in the page. A popup menu will appear. Click Web Developer | Information | View Document Size. A new window appears showing the groups of files making up the page.
    4. Expand the Documents group to see the size of the page. If it was compressed while travelling over the Internet, you will also see its compressed size.
  5. Now get Fiddler to strip off the Accept-Encoding request header from the request going from Firefox to the web server. In the Fiddler window on the right-hand side, click on the Filters tab, select Use Filters, select Delete request header, and type in Accept-Encoding.

  6. Refresh the page in Firefox. Check the file size again with Web Developer. You should find that no compression was used with this request. That will make browsers that do not support compression happy. So far, so good.
  7. In Fiddler, uncheck the Delete request header checkbox. As a result, the Accept-Encoding request header now makes it to the web server again.
  8. Refresh the page. The server should now compress the file again. But if you check with Web Developer, you'll find that it is still sending files uncompressed!

    This is because when IIS received the request for uncompressed content, it threw away the compressed contents in the cache, regenerated the content and stored it uncompressed in the cache. It then keeps serving this uncompressed content until the cache expires, even to clients that accept compressed content.

  9. You can prevent this from happening by caching both compressed and uncompressed content. You do that by including VaryByContentEncoding in the OutputCache directive, as shown in the following code:

    <%@ OutputCache Duration="300" VaryByParam="none" VaryByContentEncoding="gzip;deflate" %>

  10. If you now delete the Accept-Encoding header and then let it go through again, you'll see that the server always sends compressed content to clients that accept it, even if another client didn't accept it.
  11. Before you end this experiment and close Fiddler, go back to the Firefox status bar and stop sending traffic to Fiddler. Otherwise Firefox will complain about the missing proxy when you close Fiddler.
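The behavior you just observed can be captured in a toy model. The Python sketch below is not how the IIS output cache is implemented. It simply mimics what the experiment showed: with a single cache slot per page, one client that can't handle compression pushes the compressed copy out of the cache, whereas varying the cache by content encoding keeps both copies. The class name and its parameter are made up for this illustration.

```python
import gzip


class OutputCache:
    """Toy output cache; entries are (is_compressed, body) tuples."""

    def __init__(self, vary_by_content_encoding=False):
        self.vary = vary_by_content_encoding
        self._entries = {}

    def get(self, url, wants_gzip, render):
        """Return the cached entry for url, generating it with render() if needed."""
        key = (url, wants_gzip) if self.vary else url
        entry = self._entries.get(key)
        if entry is None or (entry[0] and not wants_gzip):
            # Cache miss, or the cached copy is compressed but this client
            # can't accept that: regenerate and overwrite the entry.
            body = render()
            entry = (wants_gzip, gzip.compress(body) if wants_gzip else body)
            self._entries[key] = entry
        return entry


def render_page():
    """Stand-in for generating an .aspx page."""
    return b"<html><body>" + b"Hello " * 200 + b"</body></html>"
```

If you create one cache with vary_by_content_encoding switched off and one with it switched on, and send each a request that wants gzip, then one that doesn't, and then another that does, only the second cache serves that last request compressed.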

A drawback of using VaryByContentEncoding in the OutputCache directive is that it disables kernel caching for this file. Kernel caching is a highly efficient form of server side caching, and is discussed in Chapter 5 of my book ASP.NET Site Performance Secrets.

So should you use VaryByContentEncoding in the OutputCache directive? Seeing that you are reading this article, the gain in compression by using VaryByContentEncoding may well outweigh the loss of kernel caching, especially seeing that you already use output caching. Your best bet would be to try both scenarios in production for a while, and compare CPU usage, response times, and bandwidth used per request for each scenario.

Improving the compressibility of your pages

If your server uses compression, then it makes sense to optimize compressibility of your .aspx, JavaScript and CSS files. Compression algorithms like repeating content, which puts a premium on consistency:

  • Always specify HTML attributes in the same order. One way to achieve this is to have all your HTML generated by high-level web controls and custom server controls, instead of low-level HTML server controls. This will slightly increase CPU usage, but will give you assured consistency. For example, write the following:

    <asp:Hyperlink runat="server"......>

    Instead of:

    <a runat="server"......>

  • Likewise, within CSS rules, write your properties in alphabetical order.
  • Use consistent casing. Use all lowercase for HTML tags and attributes and you'll be XHTML compliant as well.
  • Use consistent quoting: Don't mix "...." and '....'.
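To see how much consistency matters, you could compare the compressed size of two versions of the same markup: one written consistently, and one that mixes attribute order, quoting and casing. The Python sketch below does this with made-up links and the standard gzip module. Both versions are exactly the same size before compression. The function names are hypothetical, and the gain on your own pages will depend on their content.

```python
import gzip
from itertools import permutations


def consistent_markup(count):
    """Links that always use the same attribute order, quoting and casing."""
    return "".join(
        f'<a class="nav" href="/page/{i}" title="Page {i}">Page {i}</a>\n'
        for i in range(count)
    )


def inconsistent_markup(count):
    """The same links, cycling through 24 styles of writing them."""
    orders = list(permutations(("class", "href", "title")))  # 6 attribute orders
    lines = []
    for i in range(count):
        variant = i % 24
        values = {"class": "nav", "href": f"/page/{i}", "title": f"Page {i}"}
        quote = '"' if (variant // 6) % 2 == 0 else "'"
        tag = "a" if variant // 12 == 0 else "A"
        attrs = " ".join(
            f"{name}={quote}{values[name]}{quote}" for name in orders[variant % 6]
        )
        lines.append(f"<{tag} {attrs}>Page {i}</{tag}>\n")
    return "".join(lines)


def gzip_size(text):
    """Size in bytes of text after gzip compression."""
    return len(gzip.compress(text.encode("utf-8")))


if __name__ == "__main__":
    print("consistent:  ", gzip_size(consistent_markup(300)), "bytes")
    print("inconsistent:", gzip_size(inconsistent_markup(300)), "bytes")
```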

Summary

In this article, we saw how to configure compression on IIS 7, including its more advanced, lesser known features. In the next article, I'll show how to get the most out of the compression features built into IIS 6.

If you enjoy this series and want to know the full story on how to improve ASP.NET site performance, from database server to web server to browser, consider my book ASP.NET Site Performance Secrets. Or visit my web site ASP.NET Performance.

Tags:

Web Server
