Archive for the 'Spam' Category

Changing spamming tactics

I submitted this blog to a couple of free web directories. The end result is that I don’t get much traffic, but I do get plenty of spam comments. Recently it reached a new milestone of 100 spam comments per day.

I think the study of spam would be a fascinating field. Most of the spam is auto-generated, but I do get occasional manual spam. If I have a post on “ruby”, the comment will be something like “your ruby knowledge is amazing!”. Flattering indeed!

Then there is the kind of spam message no one is ever going to approve. These are long comments, and most of them contain over 10 links! These guys seem to survive on blogs that are not moderated.

I have even had some comments pleading with me not to delete them.

You may be wondering why I am writing a post on blog spam. Well, today I got a reason - I found a gem among the comments awaiting moderation. Check out the following screenshot.

Best spam comment I have received

This guy is telling me not to delete the message, and he says “the money from spam will go to help hungry children in uganda”! Funny that he calls his own message spam :-)

I don’t know whether this tactic will work. OK, it might. After all, even in India, Nigerian spammers are able to cheat some innocent (read: greedy) guys.

Google to address paid link menace

I think text-link-ads will have to find a new business model soon. It appears that Google is taking the issue of paid links very seriously and, according to this Matt Cutts article, you can now report paid links appearing on a web page!

In another post, Matt Cutts talks about the disclosures required if you are going to place a paid link. Basically, you should provide both a machine-readable disclosure (rel=nofollow) and a human-readable disclosure.
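To make the machine-readable half concrete, here is a minimal sketch that lists the links on a page that do not carry rel="nofollow". It uses only Python's standard html.parser module; the class name NofollowAudit, the function links_missing_nofollow and the example URLs are made up for illustration. The script cannot know which links are paid, so it only flags candidates to review by hand.

```python
from html.parser import HTMLParser


class NofollowAudit(HTMLParser):
    """Collect the href of every <a> tag that lacks rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.followed_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        # rel may hold several space-separated tokens, e.g. "external nofollow"
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if href and "nofollow" not in rel_tokens:
            self.followed_links.append(href)


def links_missing_nofollow(html: str) -> list[str]:
    """Return the hrefs of links that search engines would follow."""
    audit = NofollowAudit()
    audit.feed(html)
    audit.close()
    return audit.followed_links


# Hypothetical page fragment: one disclosed link, one undisclosed link.
page = (
    '<p><a href="http://sponsor.example" rel="nofollow">Sponsor</a> '
    '<a href="http://other.example">Other</a></p>'
)
print(links_missing_nofollow(page))
```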

As you can guess from my previous articles, I am a strong advocate of rel=nofollow. As the leader among search engines, Google has a responsibility to clean up the paid link menace while it is still at an early stage. This is the only way to ensure the integrity of the search index.

The explosion in online advertising and the huge interest generated by blogs have begun to pollute the search results of Google and other search engines. In fact, “how to make money online” seems to be the most searched term these days. I am not against “making money online”; in fact, I am trying to at least cover my hosting expenses via AdSense! But I think search engines should strive to live up to this simple statement: “Content is King”.

It is a shame that other search engines don’t support rel=nofollow.

We need a rel=negative tag!

There are many who oppose the “rel=nofollow” tag used by major search engines such as Google. But I am in favor of this tag; in fact, I wish we had something called “rel=negative”. If I linked to a site with “rel=negative”, Google would reduce the PageRank of the site being pointed to.

This would ensure that spam-based sites such as MFA (Made for AdSense) sites quickly get buried (borrowing the Digg terminology!). It opens up a lot of possibilities. For example, Wikipedia could use this tag to create a spam directory!

There are a couple of issues with this approach. A determined campaign against any site could potentially bring it down! Also, a domain that collects a lot of “negative rank” would eventually disappear from the search results, so bringing it back would not be easy!

Note that Google already does some kind of negative ranking for sites that link to spam websites and for sites that have duplicate content.

But I think there is no substitute for manual filtering. What Google needs is a dedicated team of 10 guys who do a daily scan for the top 10 spam sites that get the most traffic. Once a site is identified as spam, these guys simply remove it from the search index. I am not sure whether such a team exists at Google!

PS: After writing this post, I was going through my RSS feeds and came across two interesting news items. It appears that Google is actively hunting spam sites on Blogger, and my friend binny got tagged as spam! They are also going after spammers on Gmail (which seems to have backfired, since they also deleted some non-spam accounts!).

iREEADD THIS spam hits Orkut

These days, spammers are everywhere. The latest spam to hit Orkut is the “iREEADD THIS” spam. I got this message from 5 of my Orkut friends. This clearly shows how easy it is to mislead people! Here is what the message contains:

HEY ITS DIANNA, FROM THE DIRECTOR OF ORKUT,EVERYBODY SORRY FOR THE INTERRUPTION BUT ORKUT IS CLOSING THE SYSTEM DOWN BECAUSE TOO MANY BOTTERS ARE TAKING UP ALL THE NAMES, WE ONLY HAVE 57 NAMES LEFT, IF YOU WOULD LIKE TO CLOSE YOUR ACCOUNT, DONT SEND THIS MESSAGE, IF YOU WANT TO KEEP YOUR ACCOUNT ,SEND THIS MESSAGE TO EVERYONE ON YOUR LIST. THIS IS NOT A JOKE, YOU’LL BE SORRY IF YOU DONT SEND IT. THANKS DIRECTOR OF ORKUT, TIM BUISKI. WHOEVER DOESNT SEND THIS MESSAGE, YOUR ACCOUNT WILL BE DEACTIVATED AND IT WILL COST YOU $ 10.00 A MONTH TO USE IT.

As you can see, it is incredibly obvious that this is spam (upper case, full of grammar mistakes, and they have only 57 names left!). But what surprised me was that 5 of my friends forwarded it to me! Surely they don’t want their accounts to be deactivated!

Another puzzling thing is the motive of this spammer. There is no link in the message, so he/she must be doing this just for kicks :)

Fighting comment spam in a WordPress blog

We are all used to email spam. I get around 100 spam mails daily in my Gmail account. Thankfully, most of these are identified as spam by Gmail and get moved to the spam folder automatically.

When I started this blog, I never thought that spam would be a major issue. Initially I had not enabled any comment moderation. Within a week, I started seeing spam comments, mostly related to pharmacy and drugs. I started deleting spam manually and soon realized that this was not going to work.

In WordPress, under options->discussion, there are a couple of spam-fighting measures available. I enabled comment moderation, which automatically puts a comment in the moderation queue if it contains 2 or more links. I also enabled common spam word protection, which means that any comment containing a word from this list is automatically put into the moderation queue.
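The two rules are simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not WordPress's actual implementation: the function name needs_moderation, the MAX_LINKS constant and the sample spam words are assumptions of mine.

```python
import re

# Hypothetical settings mirroring the two options described above.
MAX_LINKS = 2  # hold a comment once it contains this many links
SPAM_WORDS = {"pharmacy", "casino"}  # illustrative entries only

LINK_PATTERN = re.compile(r"https?://", re.IGNORECASE)


def needs_moderation(comment: str) -> bool:
    """Return True if the comment should go to the moderation queue."""
    # Rule 1: too many links.
    if len(LINK_PATTERN.findall(comment)) >= MAX_LINKS:
        return True
    # Rule 2: any whole word from the spam word list.
    words = set(re.findall(r"[a-z]+", comment.lower()))
    return bool(words & SPAM_WORDS)


print(needs_moderation("see http://a.example and http://b.example"))
print(needs_moderation("Nice post, thanks!"))
```

Matching whole words keeps the sketch from flagging innocent words that merely contain a spam word, at the cost of missing spam words glued to other text.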

This solved the problem for a few more days. Then I noticed that I had over 100 comments to moderate. Sorting through 100 comments to find a genuine one is not something you would cherish!

WordPress provides something called a comment blacklist. If any of the words in this list appears in a comment, the comment is nuked and never shows up in the moderation queue. So I analyzed a few spam comments and added the common words to the comment blacklist.

I had hoped that these measures would solve the spam problem. Soon I realized that I was too optimistic. I started getting a lot of comments that contained blacklisted words with spelling mistakes! For example, the word viagra would appear as viegra or something similar.
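One way to catch such misspellings is approximate matching instead of exact matching. The sketch below uses difflib from Python's standard library; it is not something WordPress offers, and the function name find_blacklisted and the 0.8 similarity cutoff are my own assumptions that would need tuning against real comments.

```python
import difflib
import re

BLACKLIST = ["viagra"]  # illustrative entry taken from the example above
SIMILARITY_CUTOFF = 0.8  # assumed threshold; lower values catch more, and misfire more


def find_blacklisted(comment: str) -> list[str]:
    """Return the words in the comment that closely resemble a blacklisted word."""
    hits = []
    for word in re.findall(r"[a-z]+", comment.lower()):
        # get_close_matches returns blacklist entries whose similarity
        # ratio to `word` is at least the cutoff.
        if difflib.get_close_matches(word, BLACKLIST, n=1, cutoff=SIMILARITY_CUTOFF):
            hits.append(word)
    return hits


print(find_blacklisted("buy viegra now"))
```

A fuzzy blacklist will occasionally hit legitimate words, so it is probably safer to send such comments to the moderation queue than to delete them outright.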

Looking at the spam comments, I noticed that all of them were coming from a specific set of IP addresses. So what I needed was a way to blacklist IP addresses.
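Spotting the repeat offenders by eye gets tedious, so here is a minimal sketch of how the counting could be done. The (IP address, comment text) pairs are invented sample data using addresses from the ranges reserved for documentation, and top_spam_sources is a hypothetical helper, not part of WordPress.

```python
from collections import Counter

# Hypothetical (ip_address, comment_text) pairs, e.g. copied out of the
# moderation queue by hand.
spam_comments = [
    ("192.0.2.10", "cheap pills"),
    ("198.51.100.7", "online casino"),
    ("192.0.2.10", "more cheap pills"),
    ("203.0.113.5", "your ruby knowledge is amazing!"),
    ("198.51.100.7", "casino bonus"),
    ("192.0.2.10", "pills again"),
]


def top_spam_sources(comments, min_count=2):
    """Return IPs that sent at least min_count spam comments, busiest first."""
    counts = Counter(ip for ip, _text in comments)
    return [ip for ip, n in counts.most_common() if n >= min_count]


print(top_spam_sources(spam_comments))
```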

In WordPress, under manage->files, you can see the .htaccess file. This can be used to block a specific set of IP addresses. So I added the following entries to this file (substitute the actual IP address for 127.0.0.1):

# Block the listed address and allow everyone else
order allow,deny
deny from 127.0.0.1
allow from all
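With more than a handful of addresses, typing the deny lines by hand is error-prone. A small sketch like the following could generate the same three kinds of entries from a list of IPs. The helper name build_htaccess_block and the sample addresses are assumptions for illustration; the ipaddress module from Python's standard library rejects malformed input before it reaches the file.

```python
import ipaddress


def build_htaccess_block(ip_addresses):
    """Build the .htaccess entries shown above, with one deny line per IP."""
    lines = ["order allow,deny"]
    for raw in ip_addresses:
        # ip_address() raises ValueError for anything that is not a valid IP.
        ip = ipaddress.ip_address(raw.strip())
        lines.append(f"deny from {ip}")
    lines.append("allow from all")
    return "\n".join(lines)


# Hypothetical spam sources (documentation-range addresses).
print(build_htaccess_block(["192.0.2.10", "198.51.100.7"]))
```

The output can be pasted into the .htaccess file in place of the single-address version.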

So today, I have no comments to moderate. Thank god! :)

Notes

1. There are sophisticated spam-fighting tools such as Akismet, which is a distributed service. I have yet to use it.
2. It is better to disable trackbacks. Tools such as trackback submitter are widely used by spammers.