Optimizing the size of public websites based on SharePoint

By Míša Hájková

SharePoint is generally a very good platform for intranet applications, but you meet more and more public web applications based on it too. In my humble opinion SharePoint really isn’t the best choice for this purpose, yet some of our customers’ websites are built on it.

The biggest problem I faced is the many, many (really many) parts I don’t need. The default master page of a Publishing site contains a lot of controls and referenced files that are useful on an intranet, and some of them are also useful for editing content. But visitors to your site really don’t need them: they don’t edit pages or approve them, they only browse.

Here is an image taken from Firebug showing the total amount of data a visitor’s browser downloaded from our site before it was optimized:

[Image: Firebug data - before]

As you can see, the user downloads almost 700 KB of data, and most of it is SharePoint data that is useless on a read-only site:

  • core.js – good for editing and administering the site, but visitors don’t need it;
  • core.css – our website uses its own stylesheet, so we don’t need another 80 KB that isn’t used anywhere;
  • init.js – our read-only site doesn’t need it, so why should visitors have to download it?

I checked some of the sites presented on http://www.topsharepoint.com. A few of them are optimized and relatively small, but for most of them the total size of the landing page and all referenced files is really terrible. The average is about 1 MB, but some reach 3 MB or 4 MB! Yes, most of these files are cached after the first load, but that first load is exactly the problem. Many users have only limited bandwidth (mobile connections are more and more common). They probably won’t wait five minutes for a landing page to load; they will close the browser tab and find another (probably faster) page.

If your website is based on SharePoint, don’t be sad; here is a solution to this pain.

Important: this solution works and was tested on MOSS 2007, but similar techniques can be used in other versions.

First of all, I recommend using a minimal master page (http://msdn.microsoft.com/en-us/library/aa660698(v=office.12).aspx); a similar minimal master page also exists for SharePoint 2010. It’s the first step in wiping out the SharePoint things you don’t need on a Publishing site.

In the next step, you must override the Render method of your master page. This can be done in a <script runat="server"> block or in the code-behind file referenced in the <%@ Master %> directive.

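// Namespaces needed: System.IO (StringWriter) and System.Web.UI (HtmlTextWriter)
protected override void Render(HtmlTextWriter writer)
{
	string html = string.Empty;
	// Render the page into an in-memory buffer instead of the response
	using (StringWriter sw = new StringWriter())
	{
		using (HtmlTextWriter tw = new HtmlTextWriter(sw))
		{
			base.Render(tw);
			html = sw.ToString();
		}
	}

	// Do something with html

	// Send the (possibly modified) markup to the client
	writer.Write(html);
}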

As you can see, you can capture the HTML right before it is sent to the client. Even better, you can modify it, and that is exactly what we need. We use regular expressions to find and remove references to scripts and style sheets we don’t need, and also to remove SharePoint hidden fields and script blocks:

// Additionally needs System.Text.RegularExpressions (Regex)
protected override void Render(HtmlTextWriter writer)
{
	string html = string.Empty;
	using (StringWriter sw = new StringWriter())
	{
		using (HtmlTextWriter tw = new HtmlTextWriter(sw))
		{
			base.Render(tw);
			html = sw.ToString();
		}
	}
	if (_isPublicWeb)
	{
		// remove all hidden MSO... and __SPSCEditMenu fields used for page editing
		html = Regex.Replace(html, @"<input type=""hidden"" name=""(MSO|__SPSCEditMenu).*?/>\s*", String.Empty);
		// remove MOSS client scripts used for page editing
		// (1029 is the Czech language folder; use your site's LCID, e.g. 1033 for English)
		html = Regex.Replace(html, @"<script> var MSOWebPartPageFormName.*?</script>\s*", "");
		html = Regex.Replace(html, @"<script[^>]*src=""/_layouts/1029/.*?></script>\s*", "");
		html = Regex.Replace(html, @"<script type=""text/javascript"">\s*//<\!\[CDATA\[\s*var __wpmExportWarning=.*\s*</script>\s*", "");
		// remove MOSS body and form scripts
		html = html.Replace(@"<body onload=""javascript:_spBodyOnLoadWrapper();"">", "<body>");
		html = Regex.Replace(html, @"(<form.*?)onsubmit="".*?""(.*?>)", "$1$2");
		// remove references to MOSS styles
		html = Regex.Replace(html, @"<link rel=""stylesheet"".*?href=""/_layouts/1029.*?/>", "");
		// remove the (not valid HTML) ms-action table; [\s\S] matches any character including newlines
		html = Regex.Replace(html, @"<!-- Begin Action Menu Markup -->[\s\S]*?<!-- End Action Menu Markup -->", "");

		// if we don't use WebResource.axd, we can remove it
		html = Regex.Replace(html, @"<script[^>]*src=""/WebResource\.axd.*?></script>\s*", "");
	}
	writer.Write(html);
}

The _isPublicWeb variable indicates whether this site is public (read-only) or not. We set it from a value stored in web.config, but you can choose a different approach (for example, checking the user’s permissions).
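For illustration, here is a minimal sketch of how such a flag could be read. It assumes an appSettings key named IsPubWeb that holds "1" on the public site and "0" on the editing site. The PublicWebSetting class and its IsPublicWeb method are hypothetical names invented for this example; they are not part of SharePoint or ASP.NET.

```csharp
using System;
using System.Collections.Specialized;

// Hypothetical helper: decides whether the site runs in public (read-only) mode.
public static class PublicWebSetting
{
    // Assumed web.config entry inside <appSettings>: <add key="IsPubWeb" value="1" />
    public const string Key = "IsPubWeb";

    // Returns true only when the setting is exactly "1".
    // A missing collection or a missing key counts as "not public".
    public static bool IsPublicWeb(NameValueCollection appSettings)
    {
        if (appSettings == null)
        {
            return false;
        }
        string raw = appSettings[Key];
        return string.Equals(raw, "1", StringComparison.Ordinal);
    }
}
```

In the master page the field could then be assigned with something like _isPublicWeb = PublicWebSetting.IsPublicWeb(ConfigurationManager.AppSettings); (ConfigurationManager lives in System.Configuration). Treating a missing key as "not public" is a design choice of this sketch: a misconfigured server then keeps the editing markup instead of silently stripping it.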

After these steps our site looks much better:

[Image: Firebug data - after (net-after)]

Now the total amount of downloaded data is only 113 KB, and after optimizing our stylesheet it could be around 100 KB, which is acceptable for site visitors and also for site creators :) The difference between the site before and after optimization is about half a megabyte.

Finally, you can extend your Render method to reduce the size of the HTML your aspx pages produce:

html = Regex.Replace(html, @"[ \t]{2,}", " "); // runs of two or more spaces/tabs are collapsed into a single space
html = Regex.Replace(html, @"(\r\n(\s*)){2,}", "\r\n"); // empty rows are removed
if (_isPublicWeb) ...
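To see what these two replacements do, here is a small self-contained sketch that wraps them in a helper so they can be tried on a sample string outside SharePoint. The HtmlWhitespace class and its Collapse method are hypothetical names used only for this illustration; the two patterns are the ones shown above.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper wrapping the two whitespace patterns from the article.
public static class HtmlWhitespace
{
    public static string Collapse(string html)
    {
        // Runs of two or more spaces/tabs become a single space.
        html = Regex.Replace(html, @"[ \t]{2,}", " ");
        // Two or more consecutive line breaks (with any whitespace between them)
        // become a single line break, which removes empty rows.
        html = Regex.Replace(html, @"(\r\n(\s*)){2,}", "\r\n");
        return html;
    }
}
```

For example, HtmlWhitespace.Collapse("<div>\r\n\t\t<p>Hello</p>\r\n\r\n\r\n</div>") returns "<div>\r\n <p>Hello</p>\r\n</div>". Keep in mind that these patterns know nothing about HTML structure, so they also change whitespace inside <pre> or <textarea> elements if your pages contain any.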

This method is also useful for validating HTML (if you need that) or for any other task where you need to modify SharePoint pages before sending them to the client. If you are not sure whether you can remove some fields or references, don’t be afraid. You can try it row by row, or create your own regular expressions.
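One way to try a pattern "row by row" is to run it against a saved copy of the rendered HTML before putting it into Render. The sketch below is only an illustration of that idea: PatternProbe, CountMatches and Remove are hypothetical names, and the sample markup in the usage note is made up.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for checking one removal pattern at a time.
public static class PatternProbe
{
    // Returns how many fragments the pattern would remove from the given HTML.
    public static int CountMatches(string html, string pattern)
    {
        return Regex.Matches(html, pattern).Count;
    }

    // Applies a single removal pattern and returns the resulting HTML.
    public static string Remove(string html, string pattern)
    {
        return Regex.Replace(html, pattern, string.Empty);
    }
}
```

Take the stylesheet pattern from the article, @"<link rel=""stylesheet"".*?href=""/_layouts/1029.*?/>", and the sample <head><link rel="stylesheet" href="/_layouts/1029/styles/core.css"/><link rel="stylesheet" href="/css/site.css"/></head>. Remove drops only the core.css link. If the two links are swapped on the same line, the same pattern removes both of them, because .*? is allowed to run across tag boundaries. This is the kind of surprise such a check can reveal; using [^>]*? instead of .*? would be one possible way to tighten the pattern.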

And that’s all. I hope this article helps someone :)

Tags: Sharepoint, MOSS 2007, Publishing site

9 Comments

  • Alexander Stendevad Skott said

    You are the man - serious - beautiful without too much advanced configs needed!

  • Alex Skott said

    Btw - do you think the __VIEWSTATE part can be removed as well?

  • Conspirator said

    There are a few grammatical mistakes in this blog. Why not use good Czech instead of bad English?

  • sulca said

    I think that despite the few grammatical mistakes, this article still can be understood.
    If it were in Czech, only a minority of the targeted audience would be able to read it.

  • Míša Hájková said

    Conspirator: Please read this article (in good Czech): http://zdrojak.root.cz/clanky/anglicky-radsi-spatne-nez-vubec/. I absolutely agree with it, and I hope my article can be understood by most developers, if they want to understand ;)

  • Alex Skott said

    Misa, can't you please include the code for _isPublicWeb?

  • Míša Hájková said

    Alex: We store this value in the appSettings block of web.config and read it with ConfigurationSettings.AppSettings["IsPubWeb"]. We can simply change it on our development server and test the behavior for editors and visitors. In production, we have a permanent "1" in the web.config of our public (read-only) server and "0" in the web.config of the editing web.
    You also asked whether you can remove __VIEWSTATE. The answer is: don't do it. __VIEWSTATE is an essential part of Web Forms applications. You can reduce the size of this hidden field by disabling view state on particular elements.

  • Alex Skott said

    When is it not suitable to remove / disable VIEWSTATE?

  • Míša Hájková said

    Alex: http://msdn.microsoft.com/en-us/library/ms972976.aspx
