Locating Unnatural Links

As I mentioned in my previous blog post, 2012 was a turbulent year for those in the SEO industry. Google became more intelligent in the way it detected unnatural link portfolios, and its punishments were stricter than ever before.

May began with Twitter alight as a flurry of SEO’ers and webmasters noticed changes to their rankings. Mozcast was displaying high temperatures, indicating that an update might be under way, and the whole SEO community feared the worst! Had Google rolled out the Penguin update teased by Matt Cutts at SXSW?

After a few days of speculation, constant tweets and people reaching out to Google for answers, Matt Cutts came forward to answer the question on everyone’s mind: no, a new Penguin update had not been released, though he added that one was on the horizon!

Matt Cutts Confirms Penguin 4

Fast forward two weeks and Penguin was again the talk of Twitter. After a further period of speculation, Matt Cutts announced that the new version of Penguin had been rolled out on Thursday night.

Though I’ve not yet got enough data to say what I believe the new version of Penguin is specifically targeting, below I have detailed the areas that I normally look at and try to tidy up when beginning work on a new site:

Collecting Information & Pruning

Personally, before I begin work on any campaign I first collate the client’s full link portfolio, pulling in as much information as I can get from Webmaster Tools, Ahrefs and Open Site Explorer.

Previously, I would notify the client of my findings and then begin removing low-quality links. Now, with their permission, I just add these to Google’s disavow tool.
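For anyone who has not used the disavow tool before, it takes a plain text file with one entry per line: either a full URL, or a whole domain prefixed with `domain:`, with `#` marking comment lines. Below is a minimal sketch of how a list of low-quality links could be turned into such a file. The function name `build_disavow_file`, the example URLs and the choice to strip a leading `www.` are my own assumptions for illustration, not part of any tool.

```python
from urllib.parse import urlparse


def build_disavow_file(urls, whole_domains=True):
    """Return disavow-file text for a list of low-quality link URLs.

    whole_domains=True writes one "domain:" entry per referring site;
    whole_domains=False lists each URL individually instead.
    """
    entries = set()
    for url in urls:
        host = urlparse(url).hostname
        if host is None:
            continue  # skip anything that is not an absolute URL
        if whole_domains:
            # Assumption: treat www.example.com and example.com as one site.
            if host.startswith("www."):
                host = host[4:]
            entries.add("domain:" + host)
        else:
            entries.add(url)
    header = "# Low-quality links identified during the link audit"
    return "\n".join([header] + sorted(entries)) + "\n"


# Hypothetical example data
bad_links = [
    "http://www.spammy-directory.example/links.html",
    "http://spammy-directory.example/page2.html",
    "http://paid-blogroll.example/",
]
print(build_disavow_file(bad_links))
```

Disavowing at domain level is the blunter option, so it is worth looking through the list by hand before uploading anything.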

Types Of Links

Post Penguin, webmasters & SEO’ers who’d been hit by the update collated their thoughts in an attempt to ascertain which factors Google was looking at when rolling the update out.

Sitewide Links

In my opinion, sitewide links such as blog roll or footer links are rarely natural. They are also a tactic that I’ve always avoided:

A./ Because frankly they look suspicious

B./ You have little control over what types of links you are going to end up amongst. It is entirely plausible that the link below yours on a blog roll could be to a spam site. Not only does this have serious SEO implications, but it also looks unprofessional.

Needless to say, many have built links in this way, and still do. However, it is widely reported to be the tactic most associated with sites damaged by Penguin.

To locate sitewide links I use Ahrefs or Link Detective. Both are simple to use, though Ahrefs is easier: it is just a case of typing in your URL and scrolling down to see which of your links are sitewide.

Ahrefs Showing Sitewide & Non Sitewide links

One thing to note with Ahrefs is that in some cases it reports links as being sitewide even when they aren’t. As with everything, it is always a good idea to look through the results before taking action. I have found that even when a blog has its archives/tags set to nofollow, Ahrefs still reports these as sitewide if your link appears on a few pages.

Link Detective is more thorough and is one of my favoured tools for link analysis in general. It can analyse exactly where links are coming from, but does require you to upload your links first via a .csv file.

Link Detective
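If you only have a raw export of linking pages to hand, a rough first pass at spotting sitewide links can be sketched in a few lines. This is not how Ahrefs or Link Detective work internally. It simply counts how many distinct pages on each referring domain link to you and flags the domains above a cut-off. The name `likely_sitewide` and the `min_pages` value of 20 are arbitrary assumptions.

```python
from collections import defaultdict
from urllib.parse import urlparse


def likely_sitewide(linking_pages, min_pages=20):
    """Map each referring domain to its count of distinct linking pages,
    keeping only domains with at least min_pages of them."""
    pages_by_domain = defaultdict(set)
    for page in linking_pages:
        host = urlparse(page).hostname
        if host:
            pages_by_domain[host].add(page)
    return {
        domain: len(pages)
        for domain, pages in pages_by_domain.items()
        if len(pages) >= min_pages
    }


# Hypothetical export: 30 pages on one blog link to us, plus one news story
pages = [f"http://blog.example/post-{i}" for i in range(30)]
pages.append("http://news.example/story")
print(likely_sitewide(pages))  # {'blog.example': 30}
```

A high count only tells you where to look. Archive and tag pages can inflate it in the same way they do in Ahrefs, so the flagged domains still need checking by eye.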

Unindexed/ Deindexed Links

If a site has disappeared from Google’s index, it may have been deindexed for not conforming to their quality guidelines. Consequently, any link from this site is likely to be viewed as low quality. When building links I always do a quick check to make sure the site is in Google’s index before contacting the site owner. This only takes a few seconds, so it is not a time-consuming task.

Checking links that have already been built can be a lengthy process, depending on how big the site’s link portfolio is. Those with a few links can check manually, but for larger link portfolios a tool is going to be necessary. Ethan Lyon’s spam checker, which can be found at http://t.co/6oz7tgGeu8, is perfect for a reasonably small link portfolio. As well as identifying whether a site is indexed, it can also tell you whether the site has been flagged for malware and whether it is using Adsense. All in all, not bad for smallish link portfolios, especially considering it is free!

@Ethan Lyon‘s Spam Checker


For larger link portfolios you’re going to need a tool to scale this process. My favourite is one that isn’t conventionally associated with quality SEO: Scrapebox. Though the tool is favoured by the blackhat community and is more commonly used for comment spam, it does have some good uses, including checking whether the home page of a site is indexed. To do this, simply load up Scrapebox, import your links and click to check if they are indexed. Those that aren’t should instantly be added to your list of disavowed pages/domains.
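Once the index check has run, the results still need turning into a list of domains to disavow. The sketch below assumes the results have been saved as a CSV with a `url` column and an `indexed` column containing `yes` or `no`. That layout is hypothetical, so adjust the column names to whatever your tool actually exports.

```python
import csv
import io
from urllib.parse import urlparse


def unindexed_domains(csv_text):
    """Return the sorted, de-duplicated domains of rows marked as not indexed."""
    domains = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["indexed"].strip().lower() == "no":
            host = urlparse(row["url"].strip()).hostname
            if host:
                domains.add(host)
    return sorted(domains)


# Hypothetical index-check results
sample = """url,indexed
http://good-blog.example/,yes
http://dropped-site.example/,no
http://dropped-site.example/page,no
"""
print(unindexed_domains(sample))  # ['dropped-site.example']
```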

Scrapebox Checking For Indexed Links


Links From Blog Networks

Google’s battle against blog networks has been well documented. In 2012 they shut down two of the biggest networks, resulting in thousands of websites being removed from their index. Recently, reports of other blog networks feeling the wrath of Cutts & co have made the news: http://searchengineland.com/google-zaps-another-link-network-several-thousand-link-sellers-hit-159547

Although most would never intentionally build on a link network these days, there are instances where you take on work for a site whose links have previously been built on networks. I can think of 2 instances over the past year where I have taken on work and found these sorts of links.

Locating blog networks is not as simple as it used to be: the owners have got smarter and technology has improved. I always look for the following:

  • Similar I.P addresses – Tools such as Ahrefs can identify if your links are coming from similar I.P addresses, as shown below. If you are seeing a large difference between the number of linking domains and the number of referring subnets, it may be because some of your links sit in a network of sites that share similar I.P addresses.
Ahrefs Referring I.Ps

  • Similar URLs/Sitenames – This one can easily be done in Excel: just sort your links alphabetically and you may see sites with very similar names sat next to each other. Nine times out of ten these links will be part of a network.
  • Multiple links where the contact information is for the same person – This does not necessarily indicate a link network, as it is perfectly normal for savvy webmasters to have more than one site. In a lot of cases, however, the sites are linked as part of a network. In the past I have used a number of different tools to identify these types of links, but my personal favourite is now Buzzstream. Though it is a paid tool, it is perfect for doing the majority of the link network detective work: it pulls in email addresses (where visible on the site, or from whois information), I.P addresses and URLs.
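The three checks above can also be roughed out in code if you already have the I.P address and contact email for each linking domain. This is only a sketch under my own assumptions: the function names are made up, "similar I.P" is taken to mean the same /24 subnet, and the 0.8 similarity cut-off for names is arbitrary. It is not how Ahrefs or Buzzstream do it.

```python
import ipaddress
from collections import defaultdict
from difflib import SequenceMatcher


def group_by_subnet(domain_ips, prefix=24):
    """Return subnets that host more than one linking domain."""
    groups = defaultdict(list)
    for domain, ip in domain_ips.items():
        # strict=False lets us pass a host address and get its network back
        net = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        groups[str(net)].append(domain)
    return {net: sorted(ds) for net, ds in groups.items() if len(ds) > 1}


def similar_neighbours(domains, threshold=0.8):
    """Sort names alphabetically and return adjacent pairs that look alike."""
    ordered = sorted(domains)
    return [
        (a, b)
        for a, b in zip(ordered, ordered[1:])
        if SequenceMatcher(None, a, b).ratio() >= threshold
    ]


def group_by_contact(domain_emails):
    """Return contact emails that appear on more than one linking domain."""
    groups = defaultdict(list)
    for domain, email in domain_emails.items():
        groups[email.strip().lower()].append(domain)
    return {e: sorted(ds) for e, ds in groups.items() if len(ds) > 1}


# Hypothetical data using documentation-only I.P ranges
ips = {
    "a.example": "192.0.2.10",
    "b.example": "192.0.2.77",
    "c.example": "198.51.100.5",
}
print(group_by_subnet(ips))  # {'192.0.2.0/24': ['a.example', 'b.example']}

names = ["zebra.example", "best-widgets-blog2.example", "best-widgets-blog1.example"]
print(similar_neighbours(names))
```

None of these signals is proof on its own. Shared hosting puts plenty of unrelated sites on one subnet, so treat each group as a prompt for a manual look.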

Paid Links

This is a really obvious one so I am going to avoid going into a lot of detail, but both Penguin 1 and 2.0 have targeted paid links. If you know that you’ve been purchasing sponsored links or advertorials, it may be time to add these links to your disavow list! A previous client of mine invested heavily in advertorials on large newspaper sites, but when Penguin 2.0 rolled out these links were seriously devalued and the client saw large drops in rankings for the terms they had been targeting.

Over Optimisation

As with paid links, this section does not need much explanation. In a nutshell, overuse of commercial anchor text looks suspicious, and if you’re building links like this then it’s only a matter of time before your site ends up being a victim of Penguin or slapped with a manual action.

Over optimisation can be seen by going to Ahrefs, Open Site Explorer or a similar alternative, entering your site details and looking at anchor text. Even when dealing with competitive niches I always try to build as many links with branded anchor text as possible.
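If you would rather work from an exported list of anchor text than eyeball it in a tool, the balance of branded to commercial anchors can be estimated with a quick count. The sketch below is an illustration only. The function name `anchor_profile` and the example brand "acme" are made up, and an anchor counts as branded simply if it contains one of the brand terms you supply.

```python
from collections import Counter


def anchor_profile(anchors, brand_terms):
    """Return (share of branded anchors, Counter of normalised anchor text)."""
    normalised = [a.strip().lower() for a in anchors if a.strip()]
    counts = Counter(normalised)
    branded = sum(
        n
        for text, n in counts.items()
        if any(term.lower() in text for term in brand_terms)
    )
    share = branded / len(normalised) if normalised else 0.0
    return share, counts


# Hypothetical anchor text export
anchors = [
    "Acme Ltd",
    "acme ltd",
    "cheap widgets",
    "cheap widgets",
    "cheap widgets",
    "www.acme.example",
]
share, counts = anchor_profile(anchors, ["acme"])
print(f"Branded share: {share:.0%}")  # Branded share: 50%
print(counts.most_common(1))          # [('cheap widgets', 3)]
```

There is no official safe ratio, so the number is only useful for comparing your own profile over time or against competitors in the same niche.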

So there you have it! These are some of the things I look at in a client’s link portfolio. If you’ve enjoyed this post, please share it on your social networks via the links to the side, and as always feel free to leave a comment below if you have any questions or would like to add any further points.

