For the past few years I have had the opportunity to work hand in hand with many organizations that have chosen Cascade Server to manage their web content. In all my interactions with those clients, one thing I have tried to impress upon them, time and time again, is the value of efficient indexing strategies for their implementations. Practical and efficient indexing decisions made at the outset of any development ensure that what you build will be manageable and user-friendly not only for the immediate future, but also for years to come as your content grows.

A wrong decision early in the design process can mean the difference between publishing a site in 30 minutes and publishing a site in 3 days, and since all clients want instantaneous results, 3 days is never an option. A strong site infrastructure will also allow greater flexibility in the future for adding the ever-evolving array of ‘bells and whistles’ that the web revolution presents us.

Here are a few tips we like to hand out to our clients early in our talks and others we enforce in-house with our development team:

Do not be afraid to create more than one index block
What’s the most efficient number? One, right? If I just create one index block that indexes my whole site I can use it over and over again, wherever I want without having to worry about creating multiple index blocks, right? Wrong.

An index block of your entire site has one purpose: a site map. If you’re using a block of that type anywhere else, chances are your Cascade Server experience is unnecessarily slow and cumbersome. Create multiple index blocks tailored to each use. Create one for a dynamic left navigation menu, name it ‘left-nav.’ Create one for your breadcrumbs, name it ‘breadcrumbs.’ Create one for your news index page, name it ‘news-index.’ I could go on and on. Using multiple index blocks in your system ensures that you’re not indexing your entire site two times over every time you want to view or publish a page that contains a left navigation menu and breadcrumbs. A 200-page site that uses a whole-site index to render just those two regions will have to compile the XML for the entire site 400 times (200 pages times two regions) for every full-site publish. On top of that, you will lose any cache almost instantaneously, as the index block will need to be rebuilt with every system edit.
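To make the arithmetic above concrete, here is a rough back-of-the-envelope sketch in Python. It is not Cascade Server code; the function names and the per-index asset counts (15 for a left nav, 5 for a breadcrumb trail) are assumptions chosen only for illustration.

```python
# Hypothetical cost model, not Cascade Server code: it counts how much
# index XML gets compiled during one full-site publish.

def index_renders(pages: int, regions_per_page: int) -> int:
    """Each indexed region on each page triggers one index render."""
    return pages * regions_per_page

def assets_rendered(pages: int, assets_per_index: list[int]) -> int:
    """Total asset nodes compiled to XML across a full-site publish."""
    return pages * sum(assets_per_index)

PAGES = 200

# One whole-site index reused for both the left nav and the breadcrumbs:
# every render walks all 200 pages.
whole_site = assets_rendered(PAGES, [PAGES, PAGES])

# Two scoped indexes. The sizes below are assumed: about 15 assets for a
# left nav and about 5 for a breadcrumb trail.
scoped = assets_rendered(PAGES, [15, 5])

print(index_renders(PAGES, 2))  # 400
print(whole_site)               # 80000
print(scoped)                   # 4000
```

Under these assumed numbers the scoped blocks compile a twentieth of the XML, and that gap widens as the site grows, because the whole-site cost scales with the square of the page count.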

Limit index blocks to metadata only, wherever possible
When you begin modeling your data, it’s best to put fields that you will likely want to reuse elsewhere on your site within a metadata set, adding dynamic fields to a custom Metadata Set if needed.  It’s a good idea to place your metadata ‘Title’ and ‘Display Name’ fields inline to force your content contributors to input relevant data in those fields for easy indexing and reuse elsewhere in your site.

Adopting this standard means that the majority of your index blocks can be set to return only the XML from the page’s metadata, eliminating the vast majority of the page content (most of which will be unneeded for your purposes and would cause the index block to grow rapidly over time).

In-house we use the ‘Display Name’ field to create any navigation menu or breadcrumbs and use the ‘Title’ field when constructing things like press release index pages. If you’re creating a staff directory you may want to add custom metadata fields for a staff member’s first and last name and job title so that you can pull those values into a centralized directory more easily.
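As an illustration of why metadata-only indexing is convenient, the sketch below builds a staff directory from a small sample of index XML. The XML is a simplified, assumed shape of a metadata-only index block, and the field names first-name, last-name, and job-title are hypothetical. In a real implementation this transformation would live in your XSLT or Velocity format; Python is used here only to show the idea.

```python
import xml.etree.ElementTree as ET

# Simplified, assumed shape of a metadata-only index block. The dynamic
# metadata field names are hypothetical examples.
SAMPLE_INDEX = """
<system-index-block>
  <system-page>
    <path>/profiles/jane-doe</path>
    <dynamic-metadata><name>first-name</name><value>Jane</value></dynamic-metadata>
    <dynamic-metadata><name>last-name</name><value>Doe</value></dynamic-metadata>
    <dynamic-metadata><name>job-title</name><value>Registrar</value></dynamic-metadata>
  </system-page>
  <system-page>
    <path>/profiles/al-brown</path>
    <dynamic-metadata><name>first-name</name><value>Al</value></dynamic-metadata>
    <dynamic-metadata><name>last-name</name><value>Brown</value></dynamic-metadata>
    <dynamic-metadata><name>job-title</name><value>Webmaster</value></dynamic-metadata>
  </system-page>
</system-index-block>
"""

def staff_directory(index_xml: str) -> list[tuple[str, str, str, str]]:
    """Return (last, first, job title, path) rows sorted by last name."""
    root = ET.fromstring(index_xml.strip())
    rows = []
    for page in root.iter("system-page"):
        # Collapse the name/value pairs into a plain dictionary.
        fields = {
            field.findtext("name"): field.findtext("value")
            for field in page.findall("dynamic-metadata")
        }
        rows.append((
            fields.get("last-name", ""),
            fields.get("first-name", ""),
            fields.get("job-title", ""),
            page.findtext("path"),
        ))
    return sorted(rows)

for last, first, title, path in staff_directory(SAMPLE_INDEX):
    print(f"{last}, {first} ({title}) -> {path}")
```

Everything the directory needs comes from metadata, so the index never has to carry the body content of each profile page.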

It’s also a good idea to keep in mind certain fields that can’t be added to metadata, like file choosers for images and custom date fields, so when you’re modeling what data to reuse elsewhere in your site you can weigh the indexing implications for including that type of data.  Typically, if I can’t model the data as metadata, I try to avoid situations where I will need to index it.

Use Content Type indexing
If you just want to create a listing of all your press releases, why would you want to include anything that’s not a press release in your index block to begin with?   If you created a press release content type for all those press releases then you have an option available that will make it easy to do just that. Content Type indexing allows you to specify an index block that returns data from only a certain Content Type within Cascade Server.

Unlike folder-based indexing, a Content Type index block lets a single database query return all your press releases (or assets of any other specified Content Type) much more efficiently. It also excludes ancillary data, like images and other supporting pages, from the returned XML.
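The difference between the two selection strategies can be modeled in a few lines of Python. This is a toy model, not how Cascade Server queries its database; the asset paths and Content Type names are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Asset:
    path: str
    content_type: Optional[str]  # None for files such as images

# Hypothetical site contents.
ASSETS = [
    Asset("/news/2011/new-library", "Press Release"),
    Asset("/news/2011/images/library.jpg", None),
    Asset("/news/index", "Standard Page"),
    Asset("/about/press/award", "Press Release"),
]

def folder_index(assets: list[Asset], folder: str) -> list[Asset]:
    """Folder-based selection: everything under the folder, wanted or not."""
    prefix = folder.rstrip("/") + "/"
    return [a for a in assets if a.path.startswith(prefix)]

def content_type_index(assets: list[Asset], content_type: str) -> list[Asset]:
    """Content Type selection: only matching pages, wherever they live."""
    return [a for a in assets if a.content_type == content_type]

print([a.path for a in folder_index(ASSETS, "/news")])
print([a.path for a in content_type_index(ASSETS, "Press Release")])
```

The folder index drags along the image and the index page, while the Content Type index returns only the two press releases, including the one stored outside /news.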

Keep an eye on your folder structure
Within Cascade Server, a well-designed and logical folder structure is key to operating an optimized website.  Not only do folder paths affect site URLs (important in SEO) but logical organization of content can also keep your indexing load times down and your system’s performance up.  While Content Type indexing is a great new tool introduced by Hannon Hill, it is not applicable everywhere.  For dynamically generated navigation regions, it will still be necessary to use standard folder-based index blocks.

Making sure similar content is grouped together means that you can create an index of one folder instead of twelve, an obvious savings. Staff profiles should be grouped in one folder, e.g., /profiles. News articles can be grouped into a directory named /news. Feel free to subdivide those folders into smaller divisions, such as /profiles/doctors and /profiles/nurses; this works better than having doctor profiles in /doctor/profiles and nurse profiles in /nurse/profiles, which would require you to set your folder index further up the hierarchical tree.
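A quick way to see the effect of folder layout is to compute the deepest folder that contains everything you need to index. The Python sketch below does that with the standard library; the helper name index_root is an assumption for this example, and the folder names mirror the ones discussed above.

```python
import posixpath

def index_root(folders: list[str]) -> str:
    """Deepest single folder that contains every folder to be indexed."""
    return posixpath.commonpath(folders)

# Profiles split by department: the only shared ancestor is the site root,
# so one index block would have to start at the top of the site.
print(index_root(["/doctor/profiles", "/nurse/profiles"]))    # /

# Profiles grouped first, then subdivided: the index can start at /profiles.
print(index_root(["/profiles/doctors", "/profiles/nurses"]))  # /profiles
```

The further down the tree the shared ancestor sits, the less unrelated content a folder-based index block has to walk.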

Make the folder tree your friend
Cascade Server index blocks don’t always have to cascade down your folder tree from top to bottom. In certain situations you’ll want to start indexing further down the tree, at the current page, and work your way back up, especially for breadcrumb-type functionality.

Working your way back up the tree, by selecting a rendering option that starts at the current page, eliminates indexing of any content below your current location in the folder structure and also keeps the XML of many unnecessary sibling assets from bogging down the index.
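The bottom-up idea can be sketched in Python as follows. This only models which folders a breadcrumb trail needs; it is not Cascade Server's rendering logic, and the page path is a made-up example.

```python
from pathlib import PurePosixPath

def breadcrumb_folders(current_page: str) -> list[str]:
    """Folders from the site root down to the current page's own folder."""
    # .parents lists ancestors nearest-first, so reverse it for display order.
    ancestors = list(PurePosixPath(current_page).parents)
    return [str(folder) for folder in ancestors[::-1]]

trail = breadcrumb_folders("/academics/biology/faculty/index")
print(trail)
# ['/', '/academics', '/academics/biology', '/academics/biology/faculty']
```

A breadcrumb for a page four levels deep only ever needs those four ancestors, no matter how many other pages the site contains.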

Don’t forget about block choosers
It’s good to have a very flexible application like Cascade Server as it allows you the freedom to accomplish almost anything you could want to do in the web environment, but it may also lead you into situations where you realize you’ll need to perform some voodoo to keep your users whizzing along in the system.

Maybe you’ve worked yourself into a position where you want to aggregate content from a number of places within your site’s folder structure.  You’re thinking, ‘well I don’t want to do it, but it looks like a full site index block is the only way to get the information I need.’ Wrong, again.  Luckily.

If you’re faced with a page that needs to pull content from disparate corners of your website, create a data definition with a few block choosers. Create individual index blocks that index each of the folders you need, then link them to your data definition and use a ‘current page’ index block to return all the XML you need and none that you don’t. You can even restrict those block chooser fields to certain groups in Cascade Server so that an overzealous content contributor doesn’t undo all your genius design.
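Here is a small Python model of that aggregation pattern. The block names, paths, and the render_current_page helper are all hypothetical; the point is only that the page pulls in the handful of small index blocks attached through its block choosers and nothing else.

```python
# Hypothetical index blocks, each scoped to a single folder.
INDEX_BLOCKS = {
    "news-index": ["/news/new-library", "/news/award"],
    "events-index": ["/events/open-house"],
    "profiles-index": ["/profiles/jane-doe", "/profiles/al-brown"],
}

# Blocks attached to one aggregate page through its block chooser fields.
HOME_PAGE_CHOOSERS = ["news-index", "events-index"]

def render_current_page(choosers: list[str],
                        blocks: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return only the indexed content the page explicitly asked for."""
    return {name: blocks[name] for name in choosers}

rendered = render_current_page(HOME_PAGE_CHOOSERS, INDEX_BLOCKS)
print(rendered)
print(sum(len(paths) for paths in rendered.values()))  # 3
```

The profiles index exists in the system but never enters this page's XML, because no block chooser on the page points at it.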

Remember as you start out on your site construction that it is never too early to think about future scalability. Always think about how a decision you make now will affect your site and your users two to three years in the future. It’s the same principle as compounding interest in your 401(k), though not as sexy. Decisions you make today will have lasting ripple effects years from now. Making sure you’ve thought through your indexing needs and picked the correct strategies will mean fewer headaches, and more kudos, as long as your website is out there informing the world. Who knows, your server administrator may even buy you lunch in appreciation.