This is the second post in my series talking you through how to carry out a content audit on your website. The next step in our content audit was to determine what to do with our older content. In the first post of this series we talked about creating 3 “piles”: Keep, Update or Bin. With a site the size of Web Analytics World we needed to work through the content in smaller chunks, so we decided to sort by year and then look at the Analytics from our Content Audit year by year, starting with /2006/.
Filtering our Data
Using our gorgeous Main Content workbook from part 1 we can now do a text filter by year in our URL column. Simply click on the filter button and select “Text Filters” and then “contains”:
When the window below opens, we simply add “/2006/” into the text field, because that’s the first part of the URLs we’re looking for:
Even though we selected “contains” from our list you can always change your mind once you’re in this window. If you click on the “contains” dropdown list it will show you all your options and you can do a more advanced filter by using the and/or advanced filtering options as well.
Now that we’ve got our URLs for 2006 filtered we can take a look at the pageviews, backlinks and content to decide which posts we want to:
- Keep – posts that have decent traffic or backlinks, or whose content is still up to date and relevant
- Update – posts we feel we can update and bring something new to the table, so we’ll leave them live for now and then rel canonical them once the updated post is published
- Bin – posts we don’t think add anything to WAW. They’ve had little to no traffic over the 2.5 year period we selected and the content is no longer current or relevant, so we’ve decided to delete them
The Deletion Question Answered
Great! We’ve decided to delete a bunch of posts, 126 in fact from 2006 alone. Now what? Well, we could just delete the posts, remove them from our sitemap and leave it at that, because the posts we were deleting had had no backlinks or traffic for over 2 years. But we wanted to redirect these “to be deleted” posts to more appropriate content, and WordPress does have a few easy-to-set-up post/page redirection, rel canonical and noindex/nofollow options:
However, we like to keep things tidy, so we needed a way to remove the posts without causing 404 errors that also wouldn’t take ages to do. No-one wanted to take one for the team and manually redirect all 126 posts for 2006 (not to mention all the other posts we wanted to delete), so we were forced to get creative.
The .htaccess of it All
In the end we settled on using .htaccess redirections for 2 reasons:
- It was a great way to redirect all the posts except for the few posts we wanted to keep/update
- It was a great opportunity for me to increase and hone my RegEx skills
The question then became – how DO we redirect the 126 posts from 2006 but not the 6 posts we want to keep live? Our URL structure is set up as /year/month/post-name.html, but we couldn’t redirect by month either because the posts we wanted to keep were (of course) all from different months.
So we set up exclusions.
This nifty piece of .htaccess code says: if the URL is /2006/05/postname1.html or /2006/09/postname2.html, ignore it. Everything else that starts with /2006 gets redirected to /category/analytics with a permanent 301 redirection.
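As a rough illustration of that logic (a minimal sketch, not necessarily the exact rules we ran), the exclusions could be written with mod_rewrite along these lines. It assumes mod_rewrite is enabled and that the .htaccess file lives in the site root; the two post names are just the placeholders used above.

```apache
# Sketch only: assumes mod_rewrite is available and this file is in the site root.
RewriteEngine On

# Exclusions: leave the posts we want to keep live untouched.
# postname1 / postname2 are placeholder names, not real posts.
RewriteCond %{REQUEST_URI} !^/2006/05/postname1\.html$
RewriteCond %{REQUEST_URI} !^/2006/09/postname2\.html$

# Everything else under /2006/ gets a permanent (301) redirect.
# In .htaccess context the pattern is matched without the leading slash.
RewriteRule ^2006/ /category/analytics [R=301,L]
```

Each post to keep gets its own RewriteCond line, so the list is easy to extend. On a WordPress site, rules like these would normally need to sit above the standard `# BEGIN WordPress` block, otherwise WordPress's own catch-all rule may handle the request first.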
Once we implemented the redirections we checked the 2006 post URLs that we wanted to keep to make sure that they didn’t redirect. Then we checked a few random posts from the 126 we wanted to delete to make sure they did redirect. Lastly we randomly checked posts from the rest of the site to make sure that we didn’t inadvertently redirect something else.
Over the next few weeks we kept an eye on Search Console (formerly Webmaster Tools) to ensure that none of our redirections from 2006 caused any new 404s, and then we happily moved on to the next few years to repeat the process. For each new year we copied the posts into a new sheet in our main Content List workbook so we could keep track of each specific year.
The next and final post of this series will be looking at how we’re streamlining and tidying up the front of WAW with the help of a great platform called Ptengine. The team at Ptengine have given us a generous free trial of their analytics and heatmap tool so we can share our experience with you guys and so that I can use their heatmaps to find out what people are interacting with on our site!