Recently I have seen a huge increase in referral traffic in Google Analytics (GA) coming from spammy domains like bidvertiser . com, easyhits4u . com or trafficswirl . com. This traffic skews the data in GA: it triggers a sudden drop in conversion rate and renders the reports unusable.
You can easily spot the bad referrals because they share a few characteristics:
- high bounce rate
- low time spent on pages (and very few pageviews per user)
- 0 conversions (if you measure such a thing)
Looking in the logs, I found lines like these:
52.33.56.250 - - [10/May/2017:08:39:05 +0000] "GET / HTTP/1.0" 200 18631 "http://ptp4all.com/ptp/promote.aspx?id=628" "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; WOW64; Trident/6.0; .NET4.0E; .NET4.0C; .NET CLR 3.5.30729; .NET CLR 2.0.50727; .NET CLR 3.0.30729; MALCJS)"
74.73.253.77 - - [10/May/2017:08:39:05 +0000] "GET / HTTP/1.0" 200 18631 "http://secure.bidvertiser.com/performance/bdv_rd.dbm?enparms2=7523,1871496,2463272,7474,7474,8973,7684,0,0,7478,0,1870757,475406,91376,112463629579,78645910,nlx.lwlgwre&ioa=0&ncm=1&bd_ref_v=www.bidvertiser.com&TREF=1&WIN_NAME=&Category=1000&ownid=627368&u_agnt=&skter=vgzouvw%2B462c%2B40v10h%2Bghru%2Bmlir%2Bhoveizn%2Bsxgzd&skwdb=ooz_wvvu" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
How to handle this?
There are 2 main things that you need to do:
1. Server level - You must block spammy requests from the beginning.
I thought it would be best to prepare dynamic filters that block requests from the specific IPs that generate the spammy traffic.
I use fail2ban for this purpose, but there is no rule that will help you do this out of the box. First you need to create a new Jail Filter (I am using Plesk, so here is how to do that in Plesk: https://docs.plesk.com/en-US/onyx/administrator-guide/server-administration/protection-against-brute-force-attacks-fail2ban/fail2ban-jails-management.73382/). If you do not use Plesk and work over SSH instead, have a look here: https://www.fail2ban.org/wiki/index.php/MANUAL_0_8
The definition of a jail is like this:
Be sure to include ignoreregex as well, otherwise it will not save.
After that, search your access log for the domains you found in Google Analytics. You will find a lot of requests like the ones above.
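To see which referrers dominate the log before writing any rules, a short script can tally requests per referrer host. This is a minimal sketch, assuming the log uses the combined format shown in the lines above; the function name and the log path in the usage comment are hypothetical.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Combined log format:
# IP identity user [time] "request" status size "referrer" "user agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ '
    r'"(?P<referrer>[^"]*)" "[^"]*"'
)


def referrer_hosts(lines):
    """Count requests per referrer host, skipping lines without a referrer."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # not a combined-format line
        host = urlsplit(match.group("referrer")).hostname
        if host:  # "-" (no referrer) has no hostname
            counts[host] += 1
    return counts


# Usage (hypothetical path):
# with open("access.log") as log:
#     for host, hits in referrer_hosts(log).most_common(20):
#         print(hits, host)
```

Hosts that show up both here and in the GA referral report are the candidates for a filter rule.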
Once you identify the domain you need to add rules like this:
My rule looks like this:
You can see bdv_rd\.dbm in the rule. That is not a domain but the script they use to produce the spam. It would be easy for them to change the domain and keep using the same script, so matching the script name adds an extra layer of filtering. I added it because fail2ban will match any log line that contains the pattern.
Note 1: be sure the pattern does not match your own website's URLs, because that would block legitimate users and you do not want that.
Note 2: you can test your regex over SSH like this:
This should produce the following output:
This means your filter found 925 requests matching that domain (a lot, if you ask me), which translates into 925 hits from the referrer bidvertiser.com in your Google Analytics.
You can verify this by downloading the log and running the same search in a tool like Notepad++.
Now that your definition is ready, you should add a jail and a rule.
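The same count can also be cross-checked with a few lines of Python. The sketch below is only an approximation of what fail2ban does: the failregex is a hypothetical example (not the rule from this article), and the <HOST> tag is replaced with a plain non-space token, whereas fail2ban uses its own expansion and handles the timestamp itself.

```python
import re

# Hypothetical failregex, NOT the author's actual rule. It only looks inside
# the referrer field, so URLs of your own site in the request are not matched.
FAILREGEX = (
    r'^<HOST> -.*"(?:GET|POST) [^"]*" \d{3} \S+ '
    r'"[^"]*(?:bidvertiser\.com|bdv_rd\.dbm)'
)


def compile_failregex(failregex):
    """Rough stand-in for fail2ban's <HOST> expansion (assumption)."""
    return re.compile(failregex.replace("<HOST>", r"(?P<host>\S+)"))


def count_matches(lines, failregex=FAILREGEX):
    """Return (number of matching lines, set of client addresses matched)."""
    pattern = compile_failregex(failregex)
    hosts = [m.group("host") for m in map(pattern.search, lines) if m]
    return len(hosts), set(hosts)


# Usage (hypothetical path):
# with open("access.log") as log:
#     hits, ips = count_matches(log)
#     print(hits, "matching requests from", len(ips), "addresses")
```

If the number of matching lines is far from what the fail2ban test reports, the pattern probably needs another look before the jail goes live.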
I use the definition above with the action to block all ports for that IP for 24 hours.
Within just a few hours of installing this, I had close to 850 blocked IPs. Some of them are in the Amazon AWS network, so I filed an abuse complaint here: https://aws.amazon.com/forms/report-abuse .
You can use this service https://ipinfo.io/ to find the owner of an IP.
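If you have many addresses to check, the lookup can be scripted. This is a minimal sketch, assuming the https://ipinfo.io/<ip>/json endpoint and its "org" field; the function names are hypothetical, and requests without a token are rate-limited, so keep the volume low.

```python
import json
from urllib.request import urlopen


def org_from_payload(payload):
    """Pull the network owner out of an ipinfo.io JSON response body."""
    return json.loads(payload).get("org", "unknown")


def lookup_owner(ip, timeout=5):
    """Fetch https://ipinfo.io/<ip>/json and return its "org" field."""
    with urlopen(f"https://ipinfo.io/{ip}/json", timeout=timeout) as response:
        return org_from_payload(response.read().decode("utf-8"))


# Usage (performs a network request):
# print(lookup_owner("52.33.56.250"))
```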
2. Google Analytics level
Here you have a few options. I will not describe them, because this is not the place and there are well-written resources on the topic:
https://moz.com/blog/how-to-stop-spam-bots-from-ruining-your-analytics-referral-data
https://www.optimizesmart.com/geek-guide-removing-referrer-spam-google-analytics/
A few notes:
Some of these guides use .htaccess blocking. That is an option as well, but I did not use it here because my filters match script names too, not only domains.
Fail2Ban uses iptables to block every further request from these IPs on all ports, not only HTTP/HTTPS.
The first request will always pass and create 1 hit in Analytics. If the script is still accessing your website when the ban expires, it will create another hit before it is banned again.
You can use the recidive filter to permanently ban those IPs: https://wiki.meurisse.org/wiki/Fail2Ban#Recidive
The Analytics filters will not filter out historic data.