It is becoming more and more apparent that I might actually have to *do* something about this new website I have been talking about for about 18 months. The project to procure a CMS has some real momentum and, probably for the first time, I think it is going to happen this year. [No idea what that CMS will be though – but whatever it is will be a massive step forward!]
The technical side of the new site will clearly be driven by the CMS and the visual design is likely to be driven by outside forces to some extent as well so I am not worrying *too* much about that yet.
The two things that are starting to preoccupy me though are the information architecture (which continues to vex me, as it has for months, but I think we are closing in on a schema which will be tested this summer) and the content.
In particular, I am wondering about the existing content. We don’t have a huge site – Google only registers about 9k URLs including all the ‘vanity URLs’, redirects and documents – and I reckon we only have 1,500 HTML pages and a similar number of PDFs etc. if truth be told. A large amount of our content is also very time sensitive (funding calls have a time window, as do news stories) and traffic drops off a cliff pretty sharpish for certain pages.
I was re-reading this piece by Gerry McGovern recently – Web content migration: disastrous strategy – and it got me to thinking about whether our planned ‘lift and shift’ strategy was really the best course of action.
What I would like to do is start anew on the new site – select the most sought-after content, freshen it up (a lot in some cases), rethink the format of some content from the ground up, fill in some of the gaps we know we have and basically launch the new site with content that really has a user focus. This all sounds a little like a ‘content strategy’ I guess – just as well I am going to Bathcamp on Wednesday 🙂
For now, though, I am more interested in what to do with all the pages that wouldn’t be migrated. I am writing a proposal based on the following idea but am hoping someone who reads this will point out if it is stupid before I send it to anyone important 🙂
The idea is to use software like Heritrix (if I can ever get my head around it) or perhaps a company like Hanzo (if funds permit) to create a web archive snapshot of the website prior to the launch of the new site.
This ‘archive’ would be hosted at a domain like http://www.archive.mrc.ac.uk (for instance) and all existing pages that are not reproduced on the new site would be 301 redirected to there. A pop-up (or modal dialogue!) in the style of Gov.UK warnings would warn users that it was an archived page and no longer updated, but the Google juice would not be lost and we would maintain persistence around our URLs.
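To make the redirect rule concrete for the proposal, here is a rough Python sketch of the decision I have in mind. It is only an illustration: the function names, the `www.mrc.ac.uk` source host in the usage example and the set of migrated paths are all assumptions of mine, and the real rule would live wherever the new CMS or web server handles redirects.

```python
from urllib.parse import urlsplit, urlunsplit

# Example archive host from the idea above; the final domain is not decided.
ARCHIVE_HOST = "www.archive.mrc.ac.uk"


def archive_url(old_url: str) -> str:
    """Return the same path and query string, but on the archive host."""
    parts = urlsplit(old_url)
    scheme = parts.scheme or "http"
    # Drop any fragment: it is never sent to the server anyway.
    return urlunsplit((scheme, ARCHIVE_HOST, parts.path, parts.query, ""))


def redirect_for(old_url: str, migrated_paths: set):
    """Decide what to do with a request for an old URL.

    Returns None when the page has been reproduced on the new site
    (no redirect needed), otherwise a (301, archive URL) pair.
    `migrated_paths` is a hypothetical set of paths carried over.
    """
    path = urlsplit(old_url).path
    if path in migrated_paths:
        return None
    return (301, archive_url(old_url))
```

So with a made-up set such as `{"/about/"}`, a request for `http://www.mrc.ac.uk/about/` is left alone, while an old news story that was not migrated gets a permanent redirect to the same path on the archive host. Keeping the path identical on both hosts is what would make the archive a straight snapshot and keep the old URLs persistent.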
Does this sound sensible? If content needed to be un-archived as it were then I can imagine it could get a bit tangled and that needs more thought.