At the beginning of 2014 we started our Web Analytics World improvement plan and looked to cultivate new and interesting bloggers, focused more on quality versus quantity and in the background began building our content cleanup plan.
Our inspiration springboard for the cleanup was Koozai with their bold move of deleting 900 blog posts and so earlier this summer we started removing old content from our site.
We agreed with Koozai’s approach that while older posts were valuable back in their day, they’re no longer relevant in 2015. This is especially true for our posts from 2006 to 2010 as Google’s (and other search engines’) best practices have changed quite drastically.
To start with we wanted to collect data about the posts on our site and how they’re performing. We already knew that there would be a lot of low performing blog posts (mainly those from that time period) but we wanted to create a list of all the posts/pages on the site, their total pageviews over the last few years, backlinks, categories etc. to create a Master Content list we could then work from.
In the Beginning there were Data Exports
What we needed was a way to export all of the data from WAW in a usable fashion (sorry WordPress but your xml export wasn’t easy to work with) and some of the data we needed was; post id, post title, post url, author name and category to name a few.
1. Getting our Post Content Data
To get everything we needed for the Content Master list we needed two different plugins the first of which was a CSV export plugin “WP CSV” by CPK solutions. This plugin exports more information than we actually needed, including our post author, categories, tags, post id and post title/meta. It does let you filter for specific content, but I’d rather have too much content to edit down rather than not enough and when first using it I had no idea what to exclude so I took everything!
We took this and began by creating user friendly headings for our data and deleting columns that we don’t need like “wp_post_mime_type”. Then, after tidying up columns and headers the next step was to remove all the drafts/pages etc. so I sorted my content by post type and cut/pasted the drafts and pages onto another tab … just in case. Now you should have something that looks like this:
As you can see at the bottom I’ve got my tabs setup and this is important particularly when I start bringing in my Analytics and backlinks data. I’ve also copied the category information into a new tab called “Categories” because in a later point on this cleanup campaign we’re going to be reviewing our categories and deleting empty ones. Your workbook might look very different from the above screenshot depending on which columns you kept from the export and how you like to arrange your data.
2. Now for our Permalinks
The first plugin is brilliant but the one thing it doesn’t export (or at least I never figured out how to get it to export) was the pretty permalinks for our posts. Luckily I found a handy WordPress plugin called “WP Permalinks Export” by Jordy Meow (This plugin is no longer available). This is a very simple, but effective, plugin and will export the post type and title and more importantly our pretty permalinks:
It’s a little messy as a straight import and from here you can save this as a .txt file and then open it in Excel and use your “text to columns” function to tidy it up. I’m lazy though, so I simply cut and paste all of this into my handy Notepad++ (I love this program) and then cut and paste from Notepad++ to Excel and ta da! All the hard work is done for me and our data now looks like this:
Now comes a slightly more tedious section of our work where you take your CSV Export and our Permalinks export and combine them by matching Post Titles.
The fastest way I found to do it (and let me know if you have a better way) is to sort both sets of exports by Post Title, then cut and paste the Post Permalink/URL into your larder CSV Export. Last but not least remember to save your final work as an Excel Workbook which makes it easier to follow the nex steps!
Adding more Meaning to our Data Export
At this point we have the start of our Content Master List and the next step is to bring in Analytics and backlink information. Add two more tabs to your Content List; “Analytics” and “Backlinks” and we’ll export this information from Google Analytics and Search Console (GWT).
Google Analytics Export
Select your date range in GA carefully if you’re working with a long time frame as we are (9 years) and have a lot of content; after some testing/playing around we managed to get 2.5 years of data, January 1st, 2013 to July 31st 2015, without sampled data. We felt that 2.5 years would give us a good idea of what people were reading of the older content, for example something with 30 pageviews over that time period is probably safe to get rid of.
- Once you’ve got your date range go to Behaviour => Site Content => All Pages
- Scroll to the bottom of your table where you choose to view 25 rows
- Make a note of the total number of rows, in our instance it’s 7056
- Then go back up to our URL and at the very end of it we look for this: D25/
- Then we change the 25 to 7056 (or however many rows you have) so the end of the URL looks like this: D7056/ and press enter.
- To export simply go back to the top of the page and click on Export to xls.
We do this in order to export all our rows of data and if we left it as showing only 10 rows it would only export those 10 rows, be patient and try not to click on anything as it loads!
- Open up the file once it downloads and click on the centre tab called “Dataset 2”
- Copy all of this data and paste it into your Analytics tab in your Content Master List.
This data will include pageviews for pages but don’t worry we’ve got a formula to help us pull out only the information we want.
Search Console (was Google Webmaster Tools)
Now we’re off in search of some backlinks – I like to get these from Google’s Search Console but you can also get them from platforms like Ahrefs, Majestic SEO, SEOMoz and Raven Tools.
Log in to the Search Console and select your site:
- Go to Search Traffic => Links to your Site in your left-hand navigation menu and you’ll see the following sections:
- Who links the most
- Your most linked content
- How your data is linked
- At the bottom of the “Your most linked content” section click on the blue “more” link. This will take you into an overview of all your linked pages:
From here all you do is click on the “Download this table” and it will download all of the results for your site. Once you have that exported you’re going to copy and paste all the data from this report and paste it into your “Backlinks” tab in your Content Master list.
Filtering your Analytics and Backlink Data
Now we need to pull our analytics and backlink data onto our main Content page in our workbook. As we’ve copied these two exports into their own tabs within our workbook we just need to add two (2) new columns called, “Total Pageviews” and “Total Backlinks” respectively.
In the new column (where you want your total pageviews to appear) enter the following VLOOKUP formula:
=VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup])
- lookup_value => This is what you want the formula to look for in this case your post URL. You only need to select the URL in the same row as the analytics cell where you’re adding this formula
- table_array => This is where you want the formula to look, make sure you select all the data on your tab otherwise the formula won’t work (either your analytics or backlinks tab depending on which one you’re working on)
- col_index_num => This is where the formula will find the information you want i.e. Total Pageviews and it’s just the column number of where that data is found (In our analytics export or table array pageviews were in column 2
- [range_lookup] => This will be either False (to make an exact match) or True (an approximate match) for this formula (both for analytics and backlinks) use “False”
How this formula works:
- First it will look at the post URL that you’ve selected
- Then search for that exact string in your table of data (either your backlinks or analytics tab)
- When it finds the exact string it returns the information in the column you entered (i.e. 2 = Pageviews)
If you see “#N/A” in one cell don’t panic, that just means there was no value to return and if you see “#REF” in any of the cells it means there’s a problem with your VLOOKUP formula. Whenever I use this formula it usually shows up because we’d forgotten to select all of the data in the range_lookup field.
Repeat the same steps for your Backlinks data and once you’ve got them done your workbook will look something like this. If you have a lot of #N/A results then you can do a “find and replace” and change them to “0” which means you can still see them.
Data Export to Data Report – Ready for Review
Now you’ve collected and collated all your data exports, you can filter for Pageviews and/or Backlinks to as you review and decide which posts you want to keep and which ones you want to get rid of. The next post in this series will discuss our “Keep, Update, Bin” list and what we did with the posts we wanted to update and the ones we wanted to bin.
Have you ever done a spring cleaning on your site? If so what was the most challenging part, did you encounter any unforeseen hiccoughs or did it all run smoothly?
We’d love to hear from you so feel free to leave your thoughts and comments below!
We’ve created a video walkthrough of how we created our vlookup formulas for Analytics and Backlinks in this report.
Video length: 1:29, no audio