It’s a good thing spammers aren’t smarter
By Simon Bisson & Mary Branscombe in Editorial
Posted in Identity, Security, Google, Internet on
I find it easy to spot most of the phishing messages that hit my inbox, because there’s nearly always an egregious grammatical mistake in there somewhere. Real messages from banks may be full of logical errors (like a regular savings account with a headline rate of 7% that never tells you that actually it averages out nearer 4% because not all of the money gets to earn the high rate for the whole year), but the spelling is spot on.
And spammers are in such a hurry to put up the Web pages they want to earn ad money on, or use for drive-by downloads to increase the size of the botnet they use to spend most of the spam from zombie machines, that they often make stupid mistakes. If you’re checking 100 messages a day in your junk mail filter for anything real that got in there by mistake, I’m not sure if it’s any comfort to remember that spammers are only human. But Google finds it useful.
According to Matt Cutts of Google at Web 2.0, Web spammers often use templates and tools to build their pages. And fairly often they follow the commented-out instruction to ‘type your hidden text in here’ - but never delete that instruction. The tools they use to fill in forms are simplistic too; the captcha you have to complete to leave a comment here is enough to defeat most of them - but so is a box labelled email address with the instruction not to fill it in. When the bot adds whatever email address it’s abusing, you know you can just delete the comment. Simple maths or the instruction to type in a specific word are beyond bots - at least until Jeff Hawkins perfects Hierarchical Temporal Memory.
If you have a site, you need to think of things that raise the blood pressure of the spammers without doing the same to your users. It’s like being chased by any dumb but dangerous pack animal, says Cutts; you only have to run faster than the slowest person you’re willing to sacrifice. If your system is a little different from the default installation of whatever you use, the default attacks are less likely to work and the spammers may move on to slower prey.
Apart from the obvious advice to patch, patch and patch again, Cutts didn’t say much more - because every time you tell spammers how you’re spotting them, they get a chance to stop doing that. A lot of what Google knows about spam comes from the analysis it does of real Web pages, which lets it work out what things go together. If you know that timepiece and chronometer are synonyms for watch, those strangely-worded Rolex spams are easier to stop. You can see this classification in Google Sets and it’s used in Google Spreadsheets. The equivalent of Excel AutoFill does more than days of the week and months of the year, without you having to add the lists by hand; start with red, yellow and blue and Google Sets will add other colours. Start with lion, tiger, bear and you get other animals.
But you might also get wood, tin and cotton. That’s because Google Sets can’t always tell the difference between the list of animal names and the list of animal toys on the Web sites it looks at. It will learn; like spammers it will learn more quickly if someone tells it what it’s got wrong. But at this point, we get into a race between whether the anti-spam tools can learn faster than the spammers…
Make a comment
Tag cloud
Archives
- November 2008
- October 2008
- September 2008
- August 2008
- July 2008
- June 2008
- May 2008
- April 2008
- March 2008
- February 2008
- January 2008
- December 2007
- November 2007
- October 2007
- September 2007
- August 2007
- July 2007
- June 2007
- May 2007
- April 2007
- March 2007
- February 2007
- January 2007
- December 2006
- November 2006
- October 2006
- September 2006
Most commented posts
- Java’s SSVAGENT.EXE: training the monkey
5 comments
- Wubi Tuesday
- Not very open, not very social
- The best mobile game ever
- A Big Day In The Enterprise IT World
- Employees are our most valuable asset (snigger)
- Biometrics - it's not the technology that's broken
- More battery life, fewer explosions
- Spam Fighting in Exchange
- IDF: Will SSD mean the end of 5GB free?
Highest Rated Blog Posts
- Nobody knows what Web 2.0 really is (100%)
- Songs of distant satellites (100%)
- Log in and lock in (100%)
- Mommy, why is there a home server in the office? (100%)
- Employees are our most valuable asset (snigger) (100%)
- Locking down IT or blocking creativity (100%)
- Video opera? What would you do with huge bandwidth and millions of pixels? (100%)
- Consumer BlackBerrys are good for business (100%)
- HD Trek (100%)
- Top tips for speeding up Vista (100%)


