There’s no question that a website’s backlink profile is still a major determinant of how well it ranks for its target keywords. From its inception, Google has regarded links as digital votes of confidence that vouch for a website’s authority, trustworthiness and originality.

Of course, we all know by now that not all links are created equal. Many link types can help you gain greater search visibility, while an increasing number of links are simply ignored by Google. In some cases, links can be counter-productive and get your website in trouble with search engines. Unnatural and spammy links have been known to trigger both algorithmic and manual penalties, which can abruptly tank your website’s organic search traffic.

If you suspect that your site might have been hit with an unnatural link penalty or that it might be at risk of one in the future, it may be time to perform a backlink audit. This guide provides step-by-step instructions on how to perform one using popular tools and a simplified process. Let’s get started.

What is a Backlink Audit?

A backlink audit is a comprehensive review of a website’s backlink profile with the goal of identifying potentially harmful referring links. It can be done manually or with the help of link audit software. Once the bad links are identified, a webmaster may ask the referring domains to remove them or opt to disavow them using Google Search Console.

Why Should You Do a Backlink Audit?

The most common reason people do backlink audits is to try to get a penalty revoked, whether it was levied through a manual action or via algorithmic filters. Among the algorithmic sanctions, a Penguin hit may be the most feared.

Some SEOs encourage periodic backlink audits and proactively disavowing low-quality links. For these folks, this is a preventive measure to guard against penalties. Some even claim to see ranking benefits after getting rid of suspect links pointing to their sites.

Do You Need to Audit Your Links?

Short answer: Depends on who you ask. Some people make a living out of cleaning up other folks’ backlink profiles and they’ll probably say you should. For my part, the answer is simply “maybe, but I doubt it.” Links are an incredibly vague and contentious topic in SEO circles and Google doesn’t help the situation by being perpetually cryptic about the matter. However, there are a few things that they’ve said in the past which I feel are pretty close to the truth:

  • First, I believe that Google indeed has the ability to ignore most of the spammy links out there on the web.

  • Second, I still believe that Google’s algorithms can’t catch every type of unnatural link scheme and they need to manually go after offending webmasters.

Read: Latest Google Manual Action Penalty Wave is Buggy

  • Third, I view link disavows and removals the same way that a doctor views chemotherapy. It can do a lot more harm than good if you don’t know what you’re doing. Unless you’re absolutely sure that your website has “cancerous” links, I suggest you hold off on this.

That said, if you’ve experienced any or a combination of the following, you should probably do a link audit:

  • You received a notification from Google that your site has been docked by a manual action for unnatural links (duh)
  • You experienced a sudden and drastic drop in traffic and rankings which can’t be attributed to on-page SEO and technical SEO reasons.
  • You’re seeing a rapid and sustained increase in inbound links to your webpages from shady-looking websites.

Nine times out of ten, you probably don’t need a link audit followed by a link removal campaign. However, if you feel that you’re in a situation that calls for drastic action, keep on reading to learn what exactly needs to be done.

Backlink Data Sources

You can get started with your backlink audit by collecting data on where your website is getting links from. There are four reliable sources, though it’s unnecessary to get data from each one in most cases. Personally, I find that data from Search Console and one paid tool is usually enough. However, some very large sites with tens of thousands of inbound links may require an additional source or two to make sure that the list of links is as comprehensive as can be.

  • Google Search Console. Search Console is a free service from Google that provides you lots of useful data that’s relevant to SEO. These include technical, on-page and backlink reports which you can access and download at no cost. Its link-related reports include Top Linked Pages, Top Linking Sites, Top Linking Text, and more.

Search Console is the most complete in terms of reporting your linking domains. However, it doesn’t provide a whole lot of data beyond that. It doesn’t give you any hints on a linking site’s authoritativeness, popularity, spamminess, etc.

GSC-Top Linked Pages

  • Ahrefs. Among the paid tools, Ahrefs stands out as the one which tends to find the most inbound links to your site. Their crawler’s tenacity is second to none, and their platform is probably the best suited for competitive link acquisition.

The only drawback to using Ahrefs is probably the fact that their primary link metric, Domain Rating, isn’t as popular or as well understood among SEO professionals and business owners. Far too often, people still use Moz’s Domain Authority and Majestic’s Trust Flow as references.

Ahrefs Site Explorer

  • Majestic. Similar to Ahrefs, Majestic is a link data source with a nice breadth of websites and links in its index. Its ability to find and report links isn’t quite as robust as Ahrefs’ but it’s still better than what Moz gathers for the most part.

The main selling point of Majestic as far as most SEOs are concerned is its Trust Flow metric. Rather than simply assigning domains scores based on the quality and quantity of their backlink profiles, Majestic identified thousands of trustworthy seed sites on the web. The more closely linked a website is to those seed websites, the higher its Trust Flow score will be. Conversely, the more links you get from questionable websites, the lower your Trust Flow score drops.

Majestic Link Data

  • Moz. Moz is undoubtedly the most popular brand among the paid link data sources and is a pioneer of the service. Its Domain Authority metric has been an industry standard for years and is widely recognized by SEO pros and clients alike.

The criticism of Moz in recent years has been the comparatively smaller index of websites and link data that it has. It’s also a fact that Domain Authority is the easiest link metric to game out of the three. Fortunately, the introduction of Link Explorer seems to have addressed its link index limitations somewhat. Recently, the company also announced that its Domain Authority metric has been revamped to better reflect the true trustworthiness of the websites in its index.

Moz Link Data

For the purpose of this post, I’ll generally be referring to data from Search Console and Majestic, though you can apply virtually the same steps if you prefer Moz or Ahrefs.

The Link Audit Process

Now that you have what you need, let’s get started with the audit process:

1. Extract The Link Data

Go to Majestic Site Explorer and type the preferred root domain of your website. Hit Enter and you’ll be shown a variety of reports. Navigate to the Backlinks tab and you’ll see a report like this:

Majestic Backlinks

The good thing about Majestic is its simple yet effective filters. Make sure to set the report to display all the backlinks while hiding deleted ones. This will make the number of rows in your spreadsheet more manageable.

Moz and Ahrefs have similar filters which you can also apply to weed out links that once existed but are no longer online.

In Google Search Console, navigate to the Top Linking Sites report where you can see the domains that link to you the most. Scroll to the bottom of the list and select the maximum number of rows displayed.

GSC Top Linking Sites Rows

Export the data to an Excel file.

GSC Top Linking Sites Export

You’ll notice that the data from Search Console is nowhere near as rich as the kind you’ll get from paid tools. That’s okay since Google has the most complete set of linking domains anyway. Whatever domains your paid link data source misses, Search Console will likely have them.

2. Clean Up Raw Data from Paid Tools

After opening the files you’ve exported, you’re probably overwhelmed by the sheer volume of links that you didn’t know you had, plus the supporting data from the tools you’re using. For instance, my blog only has a Domain Authority of 30 and a Trust Flow of 20, yet it has about 15,000 backlinks, which would take weeks to review if I had to go through them one by one.

Fortunately, I don’t have to.

We’ll expedite the backlink audit process by cleaning up the data sheet. This can be done by following these steps:

a. Removing Dead Pages. Even if you followed the instruction on filtering out deleted pages, this step is still necessary. Keep in mind that Majestic, Moz and Ahrefs only periodically crawl websites and the HTTP status of the sites in your sheet may be out of date.

For best results, crawl the URLs in your list using the Screaming Frog SEO Spider tool and check the status codes in real time. If you see any of the URLs show up with 4xx or 3xx responses, highlight them in your sheet and delete their rows. No need to audit links that are no longer accessible.
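If you’d rather not highlight and delete rows by hand, the same filter is easy to script once your crawler has given you a status code for each URL. Here’s a minimal Python sketch of the idea. The URLs and the `crawl` dictionary are made-up sample data, and the function names are my own, not part of any tool’s export.

```python
def is_dead(status_code: int) -> bool:
    """Treat 3xx and 4xx responses as dead, following the rule in the text."""
    return 300 <= status_code < 500


def drop_dead_links(backlink_urls, status_by_url):
    """Keep only URLs whose latest crawl status is not 3xx/4xx.

    URLs missing from the crawl results are kept so they can be checked by hand.
    """
    return [
        url for url in backlink_urls
        if url not in status_by_url or not is_dead(status_by_url[url])
    ]


# Hypothetical crawl results: URL -> HTTP status code.
crawl = {
    "https://blog.example/post": 200,
    "https://old.example/gone": 404,
    "https://moved.example/page": 301,
}
urls = list(crawl) + ["https://uncrawled.example/"]
live = drop_dead_links(urls, crawl)
print(live)
```

Keeping uncrawled URLs in the list is a deliberate choice here: it’s safer to review a link you have no status for than to silently drop it.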

b. Removing Multiple Links from Single Domains. You may have noticed that a lot of websites in your sheet link back to your site on more than one page. For the most part, you don’t need to manually inspect each link from a single site. Often, you can judge whether a site sends you spammy links based on just one of its links. It’s very rare for a website to send you a quality link one time and a spammy link another time.

If a site is giving you a spammy link, you’ll likely want to disavow all of its links to your site anyway. Therefore, it’s safe to just review one link and base your decision on that. You can clear your spreadsheet of multiple links from the same domain by going to the Domain column in the Majestic sheet (or its equivalent in Moz or Ahrefs) and selecting the entire data range.

Majestic Domains in Excel

Go to the Data tab and click on the Remove Duplicates function. Choose only the Domain column and click OK. You should be left with just one sample link for every domain that links to your site which Majestic has found.

Link Data Remove Duplicates
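For anyone working outside of Excel, the Remove Duplicates step boils down to keeping the first row seen for each domain. The sketch below shows that logic in plain Python. The `Domain` key mirrors the column mentioned above, while `SourceURL` and the sample rows are assumptions made up for illustration.

```python
def one_link_per_domain(rows, domain_key="Domain"):
    """Return the first row seen for each domain, preserving order."""
    seen = set()
    kept = []
    for row in rows:
        domain = row[domain_key].strip().lower()
        if domain not in seen:
            seen.add(domain)
            kept.append(row)
    return kept


# Hypothetical rows from a backlink export.
rows = [
    {"SourceURL": "https://a.example/post-1", "Domain": "a.example"},
    {"SourceURL": "https://a.example/post-2", "Domain": "A.example"},
    {"SourceURL": "https://b.example/links", "Domain": "b.example"},
]
sample = one_link_per_domain(rows)
print([r["SourceURL"] for r in sample])
```

Like Excel, this keeps whichever link appears first, so sort the sheet beforehand if you’d prefer a particular link to be the sample.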

Of course, this list isn’t quite complete as every tool misses some domains and links here and there. To get a more complete list of websites linking to yours, you’ll also have to extract domain data from Google Search Console.

c. Removing Nofollow Links. As you may know, the nofollow attribute nullifies the flow of link equity from one page to another. If you get a nofollow link from a high-authority site, you stand to gain no direct ranking gains from it. Conversely, a spammy link from a bad website which has a nofollow attribute poses virtually no harm to your search visibility either.

All of the paid link data sources cited here will tell you whether an inbound link to your site has a nofollow attribute. You’ll want to remove every link with a nofollow attribute from your list since they neither help nor hurt your site anyway.

In Majestic, you can see this under the FlagNofollow column. The zero value means it doesn’t have a nofollow attribute while a value of 1 means that it does have it. Sort the sheet using the data from this column (largest to smallest) and delete all the rows with 1’s in them.

Link nofollow sorting
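The sort-and-delete routine can also be done in a few lines of Python if your export is a CSV file. This is only a sketch: the `FlagNofollow` column comes from the Majestic sheet described above, but the other column names and the sample rows are invented for the example.

```python
import csv
import io

# Hypothetical export; only FlagNofollow is taken from the text above.
SAMPLE = """SourceURL,Domain,FlagNofollow
https://a.example/p1,a.example,0
https://b.example/p2,b.example,1
https://c.example/p3,c.example,0
"""


def followed_links(rows):
    """Drop every row whose FlagNofollow value is 1."""
    return [r for r in rows if str(r["FlagNofollow"]).strip() != "1"]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
followed = followed_links(rows)
print([r["Domain"] for r in followed])
```

To run it on a real file, swap `io.StringIO(SAMPLE)` for a file opened with `open(path, newline="")`.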

After removing the nofollow links, I was down to just 70 domains to examine on my sheet.

d. Removing Disavowed Domains. If the website has previously had a link audit and some domains were disavowed, check the disavow file and see if the domains are on your current spreadsheet. Remove those from the sheet because the disavow has rendered them useless at this point.
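Checking a long disavow file against a long spreadsheet by eye gets tedious, so here’s a rough Python sketch of how the comparison could be automated. It assumes the usual disavow file layout of `#` comment lines, `domain:` entries and individual URLs. The sample file and domain names are made up, and only `domain:` entries are treated as whole-domain disavows.

```python
def disavowed_domains(disavow_text):
    """Collect the domains listed as 'domain:example.com' entries."""
    domains = set()
    for line in disavow_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if line.lower().startswith("domain:"):
            domains.add(line[len("domain:"):].strip().lower())
    return domains


def drop_disavowed(domains, disavow_text):
    """Remove domains that an earlier audit already disavowed."""
    blocked = disavowed_domains(disavow_text)
    return [d for d in domains if d.strip().lower() not in blocked]


# Hypothetical disavow file from a previous audit.
DISAVOW = """# Spam network found in last audit
domain:spam-directory.example
domain:Link-Farm.example
https://ok-site.example/one-bad-page.html
"""
sheet = ["spam-directory.example", "ok-site.example",
         "link-farm.example", "fresh.example"]
remaining = drop_disavowed(sheet, DISAVOW)
print(remaining)
```

A URL-level entry only covers that one page, which is why `ok-site.example` stays on the sheet in this example.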

3. Add and Enrich Search Console Data

As previously mentioned, Search Console has the most complete list of linking domains to your website, helping you see linking domains that your paid tools miss. However, Search Console data has two major shortcomings of its own from a link audit perspective. Specifically:

  • It doesn’t give you any metrics such as authority and trust scores, the IPs of linking domains, whether the links are follow or nofollow, the countries where the linking domains are hosted, etc.
  • It doesn’t show you the specific pages in those domains where you’re getting links from.

Good thing we can make the Search Console data a little more suitable for our backlink audit purposes by following these steps:

  1. Paste the list of domains from the Search Console Top Linking Sites report to the Domain column in your spreadsheet. Don’t worry if there’s no other data to go with those right now.
  2. Select the entire range of data in the Domain column.
  3. Use the Remove Duplicates function.
  4. The domains remaining from the Search Console list are the ones that your paid tool isn’t currently seeing. There will probably be quite a lot of them.
  5. Cut the list of remaining domains from Search Console onto your clipboard.
  6. Open Majestic Site Explorer or its Moz and Ahrefs counterparts.
  7. Open the Bulk Backlink Checker tool under the Tools dropdown menu.

Bulk backlink checker in Majestic

  8. Paste the domains that you want to get link metrics from in the Bulk Backlink Checker box and let the tool go to work.
  9. You should receive a report that looks like this.

Bulk Backlink Check report in Majestic

  10. Export the data and download the Excel sheet. You’ll be given tons of data for every domain in the list.
  11. In this use case, we only really want the Trust Flow and Citation Flow columns.
  12. Save the spreadsheet as a separate file from the sheet that you created using the paid link data sources.
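The paste-and-deduplicate trick in the first few steps is really just a set difference: take the Search Console domains and subtract the ones your paid tool already knows about. If you prefer scripting it, here’s a small Python sketch. The domain lists are made up, and stripping a leading `www.` before comparing is my own assumption about how you’d want to match domains.

```python
def normalize(domain):
    """Lower-case a domain and strip a leading 'www.' (an assumption)."""
    d = domain.strip().lower()
    return d[4:] if d.startswith("www.") else d


def missing_from_paid_tool(gsc_domains, paid_domains):
    """Return Search Console domains the paid tool has not reported."""
    known = {normalize(d) for d in paid_domains}
    missing = []
    for d in gsc_domains:
        n = normalize(d)
        if n not in known:
            known.add(n)  # also avoids repeating a domain in the output
            missing.append(n)
    return missing


# Hypothetical domain lists.
gsc = ["www.a.example", "b.example", "C.example", "c.example"]
paid = ["a.example", "d.example"]
to_check = missing_from_paid_tool(gsc, paid)
print(to_check)
```

The resulting list is what you’d paste into the bulk checker.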

4. Weeding Out the Bad Sites

Now that you have your two datasheets, it’s time to review each one manually. Here are the things that you should be looking at:

a. Trust and Authority Metrics. These include Trust Flow, Citation Flow, Domain Rating and Domain Authority. Just keep in mind that these are not 100% reliable as these are stats that were developed by smart folks who don’t work with Google. Therefore, their link quality metrics are correlated with a website’s ranking power but they’re not always accurate and they can be manipulated.

That said, you’ll want to start your review with websites that have very low scores. A low score doesn’t necessarily mean that a site is spammy or that its links can hurt yours. Many low-scoring sites just happen to be new, small or unpopular. Of course, many spammy sites have low trust and authority scores, too. For that reason, you’ll want to prioritize the review of sites with link metrics in this range:

  • Domain Authority – 15 or lower.
  • Trust Flow – 10 or lower.
  • Domain Rating – 15 or lower.
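If your sheet is large, you could flag these sites automatically before reviewing them by hand. The sketch below applies the three cutoffs listed above. The short keys (`da`, `tf`, `dr`) and the sample sites are hypothetical, and a site is flagged if any metric it has falls at or below its cutoff.

```python
# Cutoffs from the list above; the short keys are made up for this sketch.
THRESHOLDS = {"da": 15, "tf": 10, "dr": 15}


def needs_priority_review(metrics):
    """True if any available metric is at or below its cutoff."""
    return any(
        key in metrics and metrics[key] <= limit
        for key, limit in THRESHOLDS.items()
    )


# Hypothetical linking sites with whatever metrics the tools returned.
sites = [
    {"domain": "new-blog.example", "da": 12, "tf": 18},
    {"domain": "big.example", "da": 60, "tf": 45, "dr": 70},
    {"domain": "thin.example", "tf": 10},
]
flagged = [s["domain"] for s in sites if needs_priority_review(s)]
print(flagged)
```

A flag only means “look at this one first.” It says nothing about whether the site deserves a disavow.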

If a website’s content looks original, non-malicious and contextually related to your own site’s topics, that usually means you should leave it alone. In time, its score might get better and the links you have from it will pass along better link equity. However, if the site seems like a content scraper, has malicious content (porn, gambling, weapons) or has content themes totally irrelevant to your own site’s, you may want to disavow it.

b. Content

The content library of a website is probably the most telling factor in assessing whether it’s a good link source for yours. When reviewing a site, skim over several of its pages. Ask yourself whether the content is:

  • Well-written
  • Well-researched
  • Not fluff
  • Original
  • Not offensive

If the answer is yes to all of these, the site is probably legit. If the answer is no to one or more of these, consider adding the site to your disavow list.

c. Anchor Text

Anchor text is basically the text in a webpage’s body where a URL is latched on to create a hyperlink. If it’s highlighted, clickable and leads to another page, it’s anchor text.

SEOs figured out years ago that using keywords as anchor text significantly boosts the ranking power that links pass to their target pages. If you sold “men’s underpants” and you have lots of links with “men’s underpants” pointing to your landing page, you’re likely to get a great ranking boost from all that relevant link action going your way.

When reviewing your backlink profile, check out the anchor text column of your datasheet from Majestic, Moz or Ahrefs. Sort it alphabetically to see if you’re getting tons of links with anchor text that exactly matches your site’s target keywords.

Google knows this tactic has been abused by SEOs in the past and they’ve cracked down on its excessive use ever since. It’s probably the easiest way for Google’s algorithms and manual reviewers to tell whether you’re engaged in unnatural link activities.

Having a few links with exact match anchor text is highly unlikely to get you in Google’s doghouse. However, if the number of links with over-optimized anchor text rises to the tune of dozens or hundreds, that can be a concern. I personally think that if no more than 5% of your links have keyword-heavy anchor text, you’ll be okay. The rest of your link anchors should be your brand name, neutral terms, partial match phrases and naked links.
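To see where you stand against that 5% rule of thumb, you can compute the share of exact-match anchors straight from the anchor text column. Here’s a rough Python sketch. The anchors and keyword list are invented sample data, and “exact match” here simply means the anchor equals a target keyword after trimming and lower-casing.

```python
def exact_match_share(anchors, target_keywords):
    """Fraction of anchors that exactly equal one of the target keywords."""
    keywords = {k.strip().lower() for k in target_keywords}
    if not anchors:
        return 0.0
    hits = sum(1 for a in anchors if a.strip().lower() in keywords)
    return hits / len(anchors)


# Hypothetical anchor text column: 2 exact-match anchors out of 20 links.
anchors = (
    ["men's underpants"] * 2
    + ["Acme Apparel"] * 10
    + ["click here"] * 8
)
share = exact_match_share(anchors, ["men's underpants"])
print(f"{share:.0%} exact match")
if share > 0.05:
    print("Above the 5% comfort zone, worth a closer look")
```

Treat the result as a prompt for a closer look and nothing more. The 5% figure is my personal comfort zone, not a number published by Google.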

If you’re doing a link audit because you’ve been recently penalized, you’ll likely have to disavow all the exact match anchor text links in your profile. If you’re doing a link audit to prevent a penalty, get rid of links with exact match anchor text from low-authority websites first. You can let a few remain – especially the ones from high-authority sites if you have them.

d. Signs of Penalties. Naturally, you don’t want penalized websites to be associated with yours. When doing a review of sites linking to yours, watch out for the following signs:

  • The site does not appear in Google’s index when its root URL is searched
  • The site does not appear as the number 1 result for its own brand or domain name
  • The site has good Domain Authority, Trust Flow or Domain Rating scores but low organic traffic in SEMRush or Alexa traffic rank
  • The site has a low number of organic ranking keywords (viewable in the paid link data sources and SEMRush)

If a website that’s linking to you has one or more of these signs, consider disavowing its links to your pages.

5. Prepare a Disavow List

Gather all your disavow candidates and do a second review to make sure that your choices are based on logic rather than fear or paranoia. Keep in mind that Google has stated in the past that most of the spammy links in its index are being ignored anyway. That means you may be disavowing sites that Google isn’t even counting to begin with. Of course, if you really feel strongly that a website’s links are a threat to your site’s search visibility, go ahead and disavow them.

In a future post, we’ll discuss how to create a proper, working disavow file along with concepts that you need to understand before going through with it.