5 ways bots hurt your web business

Recently I had to deal with significant invalid traffic caused by automated scripts and programs (web bots). So I want to mention just some of the ways automated bots will hurt your site.

Make sure you also read the part at the very end that explains how bots can be blocked and how NOT to block valid users.

1) Bots clicking on ads

Many automated bots click on the ads on your website. That results in lower-quality traffic being sent to advertisers, so you earn less and less each day. Some valid clicks might not even be counted, because you have many invalid clicks that you are not even aware of.

These invalid clicks can be made by your website's competitors, by competitors of the advertisers that have ads on your site, or even by people who are upset with ad companies, such as people who were banned from the Google AdSense program.

2) Fake bounce rates to decrease your search engine rankings

I will tell you from the start: many say this is a myth, but I disagree.

On some of my sites I noticed many hits from the U.S. that seemed valid at first look. After closer inspection, while trying to identify a behavior pattern, I noticed that these were bots.

What they did was this: they searched for a term on Google (one I was ranking for), clicked my site, then returned to that same Google search page as soon as possible. Why? Because that makes Google think that the user didn't find what they were looking for on my site and went back to Google for that reason, so in this case Google will rank that site lower for that search term.

3) Bots spam your site, also to decrease your rankings

When bots post junk words and links in your comments section, it is not only because they are trying to get hits to those links; it also happens because they want to lower the quality of your site's content. I have seen bots that filled description boxes with bad words without trying to post any links. That is done just to put bad content on your site, and it will make you rank lower, or even not at all, if your site starts containing words that are not family-safe.

4) Bots steal content and become your competitors

Many bots are there just to scrape (steal) content from your site. This content then gets posted on other sites and might end up ranking higher than yours, because even if your content was first, originality is not the only ranking factor in Google's eyes.

5) Bots slow down servers, also affecting rankings

Bots are often not optimized and hit the site too many times a second, which slows down your site, database server, etc. It also means that other users will perceive your site as slow during these times. For Google, site speed is a ranking signal, so this can cause you to rank lower in search results.
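
As a rough illustration of how such bursts can be spotted, the sketch below counts how many requests one client made within a short sliding window. The function name is_hitting_too_fast, the limit of 10 requests per second and the plain array of timestamps are all assumptions made for this example; a real site would keep the counters in a shared store and tune the limits to its own traffic.

```php
<?php
// Hypothetical helper: decide whether a client is requesting pages too fast.
// $timestamps: earlier request times (seconds, as floats) for one client.
// $now: time of the current request.
// Returns array(bool $too_fast, array $kept_timestamps).
function is_hitting_too_fast(array $timestamps, $now, $max_hits = 10, $window = 1.0) {
	// keep only the requests that fall inside the sliding window
	$recent = array();
	foreach ($timestamps as $t) {
		if ($now - $t < $window) {
			$recent[] = $t;
		}
	}
	// record the current request
	$recent[] = $now;
	return array(count($recent) > $max_hits, $recent);
}

// usage example: 12 requests spaced 50 ms apart from the same client
$history = array();
$flagged = false;
for ($i = 0; $i < 12; $i++) {
	list($too_fast, $history) = is_hitting_too_fast($history, 100.0 + $i * 0.05);
	if ($too_fast) {
		$flagged = true; // the 11th request inside one second trips the limit
	}
}
```

Being flagged here should lead to slowing the client down or a closer look, not to an automatic ban, for the reasons given in the section on blocking bots.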

Additional considerations

Besides all that, having many hits from bots might make you think that your site has traffic when it actually doesn't. You might wonder why your traffic doesn't convert into customers when most of it actually comes from bots.
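
One rough way to estimate how much of your traffic is automated is to look at the User-Agent strings in your logs. The sketch below is only a first filter: the function names and the keyword list are assumptions for illustration, and it catches only bots that identify themselves, since malicious bots usually send a browser-like User-Agent.

```php
<?php
// Hypothetical helper: true if the User-Agent looks like a self-declared bot.
function looks_like_declared_bot($user_agent) {
	// an empty User-Agent is unusual for a real browser
	if (trim($user_agent) === '') {
		return true;
	}
	// keyword list is an assumption; extend it from your own logs
	return preg_match('/bot|crawl|spider|curl|wget|python-requests/i', $user_agent) === 1;
}

// Share (0..1) of hits whose User-Agent looks automated.
function declared_bot_share(array $user_agents) {
	if (count($user_agents) === 0) {
		return 0.0;
	}
	$bots = count(array_filter($user_agents, 'looks_like_declared_bot'));
	return $bots / count($user_agents);
}

// usage example with four sample log entries
$sample = array(
	'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36',
	'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
	'curl/8.4.0',
	'',
);
$share = declared_bot_share($sample); // 3 of the 4 entries match
```

Treat the result as a lower bound: the real share of bot traffic is likely higher than what this kind of check reports.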

Bots can also fill in registration forms on your site. If you then send an activation email to the address they entered, you might be flagged as a spammer because you tried to reach nonexistent mailboxes.
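
One possible precaution, sketched here under some assumptions, is to check that the address is well-formed and that its domain publishes mail (MX) records before sending the activation email. filter_var() and checkdnsrr() are standard PHP functions; the wrapper name can_receive_mail and the replaceable $dns_check parameter (added so the logic can be tried without network access) are made up for this example.

```php
<?php
// Hypothetical helper: is it reasonable to send an activation email here?
// $dns_check lets you swap the DNS lookup (defaults to PHP's checkdnsrr).
function can_receive_mail($email, $dns_check = 'checkdnsrr') {
	// reject syntactically invalid addresses first
	if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
		return false;
	}
	// the domain is everything after the last "@"
	$domain = substr(strrchr($email, '@'), 1);
	// ask DNS whether the domain has MX records
	return (bool) call_user_func($dns_check, $domain, 'MX');
}

// usage example with a fake DNS lookup that only knows example.com
$fake_dns = function ($domain, $type) {
	return $domain === 'example.com' && $type === 'MX';
};
$ok   = can_receive_mail('user@example.com', $fake_dns);
$bad  = can_receive_mail('user@no-such-domain.test', $fake_dns);
$junk = can_receive_mail('not an email', $fake_dns);
```

This cannot prove that the specific mailbox exists, and some domains accept mail without publishing MX records, so a failed check is better treated as a warning sign than as proof.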

Blocking bots

Identifying and blocking bots is NOT easy.

For example, on one website I noticed that the same few IP addresses generated a lot of traffic and downloaded files multiple times. After I blocked these IPs, I found out that they were actually proxy / optimizer IP addresses used by mobile Google Chrome and Opera. Valid users use these same IP addresses without knowing it. This happens when mobile browsers are set to optimize the content delivered to users: Google, Opera and most mobile browsers fetch the websites on the user's behalf in order to deliver them faster, so many of your users will appear to be using the same IP.

In the above situation, using https on your site helps, since fewer proxies will fetch and alter website content before delivering it to the user.

Here is a PHP script that can detect the real user IP if the user is behind a proxy / cache / optimizer:

<?php
// Get the real user IP if the user is using a proxy/cacher/optimizer.
// Note: these headers are sent by the client or proxy and can be spoofed,
// so REMOTE_ADDR stays as the fallback and should still be logged.
$user_ip = $_SERVER['REMOTE_ADDR'];
$forwarded = '';
if (!empty($_SERVER['HTTP_X_FORWARDED_FOR'])) {
	$forwarded = $_SERVER['HTTP_X_FORWARDED_FOR'];
	$download_comment = "HTTP_X_FORWARDED_FOR from google proxy: ".$_SERVER['REMOTE_ADDR'];
} elseif (!empty($_SERVER['HTTP_FORWARDED'])) {
	// some admins say they don't receive HTTP_X_FORWARDED_FOR but only HTTP_FORWARDED
	$forwarded = $_SERVER['HTTP_FORWARDED'];
}
if ($forwarded != '') {
	// the header may list several hops ("client, proxy1, proxy2") or extra
	// parameters ("for=...;proto=..."); only the first entry is used
	$parts = preg_split('/[,;]/', $forwarded);
	// the "for=" should only be present in HTTP_FORWARDED but strip it in both cases
	$candidate = trim(str_ireplace("for=", "", $parts[0]), " \"[]");
	// accept the value only if it is a valid IP address
	if (filter_var($candidate, FILTER_VALIDATE_IP)) {
		$user_ip = $candidate;
	}
}
?>

Bots can also be blocked by using a service like Cloudflare. These services stand between you and all your users and filter unwanted automated traffic. They can better identify bad behavior patterns since they handle many websites at once and have more data to work with.