LGF

The Robot Pillory

Mon, Mar 10, 2003 at 9:38:25 am PST

It’s very gratifying to see so many ill-behaved web robots falling into our trap; so gratifying that I’ve put together a dynamically updated page showing the evil robots’ information: the robot pillory. We caught another one this morning. Feel free to throw rotten fruit.

UPDATE: when a spambot is caught in our trap, here is the page they get served when they try to crawl LGF.


48 comments

  • Comments are open and unmoderated, and do not necessarily reflect the views of Little Green Footballs.
  • Obscene, abusive, silly, or annoying remarks may be deleted, but the fact that particular comments remain on the site in no way constitutes an endorsement of their views by Little Green Footballs.
  • Posts that contain phone numbers, street addresses, email addresses or other personal information will also be deleted, as will posts that consist only of a variation on the word, "First!"
  • Comments that advocate violence will be cause for immediate banning with no appeal.
  • Disagreement and debate are welcome, but insults and abuse are not, and may cause your account to be blocked.
  • REMEMBER: posting comments at LGF is a privilege, not a right. Abuse that privilege, and your account will be blocked.


1 Dar ul Harb  Mon, Mar 10, 2003 7:59:10am

Haaahhhk pthooie!

2 Michael_in_TN  Mon, Mar 10, 2003 8:01:30am

LGF: You're in trouble, program. Why don't you make it easy on yourself. Who's your user?
BOT: Forget it, mister high-and-mighty Master Control! You're not going to make me talk!
LGF: Suit yourself.

3 Jacob LaRow  Mon, Mar 10, 2003 8:20:48am

I am confused as to what exactly is going on here.

4 BigDogDaddy  Mon, Mar 10, 2003 8:22:44am

Super! Great! Wonderful! Bravo!

Now, please tell me what the h-e-double hockey sticks this means and why I'm supposed to care? Color me clueless on this one.

5 Crill  Mon, Mar 10, 2003 8:25:26am

Did you send a complaint to the University of Iowa about #2?

6 evil robot  Mon, Mar 10, 2003 8:47:31am

HAVE BEEN DETECTED
10010100
IF DETECTED THEN FLEE
IF NOT FLEE THEN RETRY
IF RETRY USELESS THEN SHRUG SHOULDERS
111001
AND GO
IF GO THEN WEEP
BOOHOO

7 Elizabeth  Mon, Mar 10, 2003 8:48:28am

Something tells me that the "Rogers.Webtv" inquiry was triggered through my presence here and they've seen the LGF website so many times on my log that someone scanning the Rogers users has decided to check you out. There may be other people here from Canada who use Rogers as a server too for their computers. It's definitely a Canadian-only server.

HaaaccckKK! PHats!hooey!!!

8 SecHumanist  Mon, Mar 10, 2003 8:48:32am

That's awesome, hehehehe. Thanks for publishing it Charles, now I can add them to all my .htaccess files :)

9 selpaw  Mon, Mar 10, 2003 8:54:34am

GOTCHA!
Good for you!

#5 Crill

Did you send a complaint to the University of Iowa about #2?

My guess is that Weeg will (weegle) out and play ignorant on this.

10 Refinance Your Mortgage  Mon, Mar 10, 2003 9:06:23am

I have no idea what LGF is doing, but if it means even one less piece of crap in anyone's email offering to shrink their mortgage, enlarge their penis, or expand the amount of ink their printer uses, I am all for it.

11 Frank IMC  Mon, Mar 10, 2003 9:06:45am

Re Bot#8 -

They mean to win at Wimbledon.

12 dennisw  Mon, Mar 10, 2003 9:08:27am

Grave awaits you, you monkey's mustache!! Hahahhaha!

13 Insufferably Expensive  Mon, Mar 10, 2003 9:24:13am

It doesn't hurt to transform all their ancestors into sloths for ten generations or so, either.

14 Mark Ryan  Mon, Mar 10, 2003 9:34:19am

If you're running a mail server with procmail on the same machine, it might also make sense to add each offending IP to a rule in your procmailrc, flagging messages containing ^Received:.*some.offending.bot.IP .

Nice work!

15 Robbie  Mon, Mar 10, 2003 9:35:41am

But I was only doing what Doctor Morbius ordered me to do!

16 random  Mon, Mar 10, 2003 9:49:53am

Would be cool if that bot warning page was actually full of bogus, randomly generated email addresses.

If they want addresses, give 'em addresses!

17 Ian S.  Mon, Mar 10, 2003 10:15:11am

Err, clicking that link because of this story won't get me banned, right? :)

18 tech reader  Mon, Mar 10, 2003 10:26:04am

Duh, but what if the bot reads your robots.txt and then searches everywhere else in your site except the hole?

Also I suspect that a little bit of eval(unescape(##..##))
will provide the bot with plain-text email messages.

Just thinking aloud here, but why not place the email address of each of those here who provide their address into a database accessible by a separate PHP processing page?

That is, if I click on an href then I am processed by something like get_email.php?mailto=12345 where 12345 is the index to the email address of the guy who posted the blog.

Then your get_email.php page reads the correct email off your database and gives me the mailto:joe@msn.com or whatever.

That assumes you have db capabilities and php programming which I believe you do...
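The lookup described in this comment can be sketched roughly as follows. This is only an illustration, written in JavaScript rather than PHP: the `emailTable` map stands in for the database, and `resolveMailto`, the sample index, and the sample address are all hypothetical names chosen for the example.

```javascript
// Hypothetical in-memory stand-in for the database table of commenter addresses.
const emailTable = new Map([
  [12345, "joe@example.com"],
]);

// Resolve a query string such as "mailto=12345" to a mailto: URL.
// Returns null when the index is missing, malformed, or unknown,
// so the page never echoes back anything the visitor supplied.
function resolveMailto(queryString) {
  const params = new URLSearchParams(queryString);
  const raw = params.get("mailto");
  if (raw === null || !/^\d+$/.test(raw)) return null;
  const address = emailTable.get(Number(raw));
  return address ? `mailto:${address}` : null;
}
```

The point of the indirection is that the page source only ever contains the numeric index, so a harvester scanning the HTML for address patterns finds nothing to collect.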

19 Big John  Mon, Mar 10, 2003 10:37:15am

Perhaps we need a bold-face warning next to the email field in your comment form warning of the dire consequences of public display of information.

Better still, drop the email text box entirely from the Add-A-Comment form.

After all, do we really need to email each other this way?

JohnAshcroft@usdoj.gov

20 lizzy  Mon, Mar 10, 2003 10:37:23am

i think the new pentiums should come with a little spout that spews out moldy pigshit on these types of folks... now thats a goood idea!

21 Charles  Mon, Mar 10, 2003 10:44:18am

tech reader wrote:

Duh, but what if the bot reads your robots.txt and then searches everywhere else in your site except the hole?

That's not the point; the trap is to catch the bad bots that either ignore robots.txt or worse, use it as a guide to find hidden pages. Although I guess I'm emphasizing the spambot angle, there are other types of bots that violate robots.txt as well, like site downloaders, and invasive spyware such as nameprotect.com and turnitin.com that drain our bandwidth for their own purposes.

I've considered a DB scheme for hiding email addresses, but it would require implementing full user registration here to do it right.
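The kind of trap described in this comment is often built along the following lines. This is a minimal sketch and not necessarily how LGF does it: the trap path, the `handleRequest` function, and its return values are assumptions made for illustration.

```javascript
// Path listed under Disallow in robots.txt. The name is hypothetical.
const TRAP_PREFIX = "/bot-trap/";
const ROBOTS_TXT = `User-agent: *\nDisallow: ${TRAP_PREFIX}\n`;

// Addresses that have requested the forbidden path at least once.
const bannedAddresses = new Set();

// Decide how to answer one request. Returns "robots", "banned", or "ok".
function handleRequest(ip, path) {
  if (bannedAddresses.has(ip)) return "banned";
  if (path === "/robots.txt") return "robots";
  if (path.startsWith(TRAP_PREFIX)) {
    // Well-behaved crawlers never get here, because robots.txt told them not to.
    bannedAddresses.add(ip);
    return "banned";
  }
  return "ok";
}
```

A crawler that honors robots.txt never requests anything under the trap path, so only clients that ignore the file, or mine it for hidden URLs, end up in the banned set.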

22 Caton  Mon, Mar 10, 2003 11:04:27am

Charles,

Have a look at caspam.org. They propose a way to encrypt email addresses to defeat spambots.

Having my email address replaced by

<SCRIPT LANGUAGE='JavaScript'>
function Decode() { d("<7ufEHGg@i]zssA9p");}var DECRYPT = false;var ClearMessage="";function d(msg){ClearMessage += codeIt(msg);}
var key = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz1029384756<>#].";
function codeIt (_message) {var wTG;var mcH = key.length / 2;
var _newString = "";var dv;for (var x = 0; x < _message.length; x++) {wTG = key.indexOf(_message.charAt(x));
if (wTG > mcH) {dv = wTG - mcH;_newString += key.charAt(33 - dv);} else {if (key.indexOf(_message.charAt(x)) < 0)
{_newString += _message.charAt(x);} else {dv = mcH - wTG;
_newString += key.charAt(33 + dv);}}}return (_newString);}Decode();document.write(ClearMessage );</SCRIPT>

would defeat any bot.
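As far as I can tell, the script above is a mirror substitution: each character found in the key is swapped for the character at the opposite end of the key, and anything else passes through unchanged. A readable sketch of that idea, assuming a 67-character key with no embedded space (the names `KEY` and `mirror` are mine, not caspam's):

```javascript
// Substitution alphabet; assumed to be 67 characters with no spaces.
const KEY = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz1029384756<>#].";

// Replace each character found in KEY with the one at the mirrored position.
// Characters outside KEY (such as "@") are copied unchanged.
// Applying the function twice returns the original text.
function mirror(text) {
  let out = "";
  for (const ch of text) {
    const i = KEY.indexOf(ch);
    out += i < 0 ? ch : KEY[KEY.length - 1 - i];
  }
  return out;
}
```

Because the mapping is its own inverse, the same function both encodes an address for the page and decodes it in the browser. Note that this only stops harvesters that scan the raw HTML; a bot that actually executes JavaScript would still see the decoded address.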

23 justdanny  Mon, Mar 10, 2003 11:08:06am

i'm all lost in this thread, not a clue what its all about. but very pleased to see Charles kicking some butt.

anyone else doing some butt kicking i can cheer for?

24 Raj Against The Machine  Mon, Mar 10, 2003 11:45:17am

Charles - I haven't looked for the spambot (mainly because of your threats to ban people, so I don't want to touch that one, plus I've been busy w/ tax season & all that), but can this feature be duplicated for Blogger? Believe me, I'm interested in a solution.

25 JonTheImp  Mon, Mar 10, 2003 11:51:48am

Actually, I can vouch for the first computer on that list. It is definitely not a spambot; since it's, err, me.

I used a different computer to access the bots page, to avoid being automatically banned. So that's why I can still read (I generally don't post).

26 Joe Jalbert  Mon, Mar 10, 2003 12:17:47pm

What, no farting on the evil robot's beard? How about a little ground shaking? At least give them a bad day! Have we learned nothing?

27 Just John  Mon, Mar 10, 2003 1:15:21pm

Dateline: Tokyo - Evil robots, unable to enter the website Little Green Footballs, have decided instead to stalk slowly across Japan, destroying buildings as if they were large cardboard models.

"Curse LGF and the unintended consequences of its robot trap! Now they no longer surf the web in the comfort of their own giant homes, but instead walk slowly and menacingly through our streets!" cried thousands of citizens fleeing the mechanical terrors.

The robots have released a statement, "Your pitiful weapons are useless against us. Send us the one you call 'Charles'. Also, you must supply us with uranium-enriched Spam meat product. And please tell France to quit calling us to surrender; we do not desire their land, cheeses, or virgins, and they are tying up our communication lines."

;)

28 Ernie G  Mon, Mar 10, 2003 1:49:10pm
23 justdanny 3/10/2003 01:08PM PST

i'm all lost in this thread, not a clue what its all about. but very pleased to see Charles kicking some butt.

anyone else doing some butt kicking i can cheer for?

Scrappleface has opened up a comment page dedicated to messages of encouragement for our troops.

29 A Jackson  Mon, Mar 10, 2003 2:16:19pm

If you're feeling ambitious you might want to tar-pit the evil bots. Include a bunch of links off your trap page - that go to dynamic pages that respond v e r y v e r y s l o w l y. Maybe even sprinkle in a few E-mail addresses for the bot to harvest - I'm sure Mr. Arafish would be real interested in making a fortune helping out a relative of the former leader of Nigeria.

30 Raj Against The Machine  Mon, Mar 10, 2003 2:21:09pm

Caton - Thanks for the code snippet, I'll give it a whirl on Blogger.

31 Caton  Mon, Mar 10, 2003 2:32:16pm

#30 Raj Against The Machine

I got it from the caspam site. They have a form to encode your email address.

32 Charles  Mon, Mar 10, 2003 2:44:14pm

A Jackson wrote:

If you're feeling ambitious you might want to tar-pit the evil bots. Include a bunch of links off your trap page - that go to dynamic pages that respond v e r y v e r y s l o w l y. Maybe even sprinkle in a few E-mail addresses for the bot to harvest - I'm sure Mr. Arafish would be real interested in making a fortune helping out a relative of the former leader of Nigeria.

The dead-end page where bad bots end up includes a nice little 12-second delay. The page I linked above is not the actual dead end; it's just an example that doesn't include the delay.

I don't know if I want to start serving fake email addresses though (the wpoison approach); that will just keep them coming back and we're already consuming mucho bandwidth.
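A delayed dead-end response like the one mentioned above might look something like this. It is a sketch under assumptions: `serveDeadEnd`, the response text, and the injectable `schedule` parameter are hypothetical, and only the 12-second figure comes from the comment.

```javascript
// The 12 seconds mentioned in the comment, in milliseconds.
const DEAD_END_DELAY_MS = 12000;

// Call `send` with the dead-end page only after the delay has elapsed.
// `schedule` defaults to setTimeout; it is a parameter so the timing
// can be swapped out when experimenting. The response text is made up.
function serveDeadEnd(send, delayMs = DEAD_END_DELAY_MS, schedule = setTimeout) {
  schedule(() => send("Access denied: this client ignored robots.txt."), delayMs);
}
```

Holding each request open costs the server very little, but a bot that crawls links one after another spends 12 seconds on every page it tries.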

33 Charles  Mon, Mar 10, 2003 2:46:17pm

Does anyone know where I can find some PHP source code that will generate Javascript-encrypted email addresses like Caton's example above?

34 Caton  Mon, Mar 10, 2003 2:56:29pm

#33 Charles

safemailto.zip

35 Caton  Mon, Mar 10, 2003 3:02:37pm

#33 Charles

trolltrapper1.zip is the one that encodes them, but it's a commercial product. The PHP source in the previous comment just changes

mailto:you@yoursite.com

to

/mailto.php?you:yoursite.com

Today, this is enough to screw most bots.
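The rewrite described here can be sketched in a few lines. This is an illustration only, not the safemailto source: `hideMailtoLinks` and `restoreMailto` are hypothetical names, and the second function stands in for whatever the `/mailto.php` script does on the server.

```javascript
// Rewrite mailto:user@host links to the indirect form /mailto.php?user:host,
// so the page source no longer contains a "user@host" pattern.
function hideMailtoLinks(html) {
  return html.replace(/mailto:([^\s"'@<>]+)@([^\s"'<>]+)/g, "/mailto.php?$1:$2");
}

// Inverse step the redirect script would perform: turn "user:host"
// back into a mailto: URL, or return null if the query is malformed.
function restoreMailto(query) {
  const sep = query.indexOf(":");
  if (sep < 1 || sep === query.length - 1) return null;
  return `mailto:${query.slice(0, sep)}@${query.slice(sep + 1)}`;
}
```

For example, `hideMailtoLinks('<a href="mailto:you@yoursite.com">mail</a>')` yields a link to `/mailto.php?you:yoursite.com`, and `restoreMailto("you:yoursite.com")` rebuilds the original `mailto:` URL for the redirect.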

36 Charles  Mon, Mar 10, 2003 3:07:50pm

Hmm. That's a nice low-tech, low-overhead idea. I like it.

37 Caton  Mon, Mar 10, 2003 3:11:19pm

#36 Charles

I like everything that screws the spammers...

38 Charles  Mon, Mar 10, 2003 3:42:49pm

Thanks for the tip. I've written a similar program for LGF to protect our email addresses, and it appears to be working very nicely. Take that, spambots.

39 Drumwaster  Mon, Mar 10, 2003 5:39:18pm

Which of these are effective for those of us stuck on Blogger?

40 Caton  Mon, Mar 10, 2003 5:44:57pm

#39 Drumwaster

If you want to protect your own email, I suggest the encoding: go to the caspam site, encode your email, and replace each and every 'mailto:' with the script.

If you want to protect the commenters' email, I think on Blogger you're stuck with the safemailto method.

41 Eniac  Mon, Mar 10, 2003 5:56:49pm

The second-to-last IP is from Nigeria. Ooh! I wonder what they do with those! I hear they have a lot of rich ambassadors looking for help cashing out large sums of money! The last one is from Hong Kong (no big surprise there).

To get the location, I use an awesome IP locator tool. I ran across this site last week and have found it extremely helpful:

[Link: www.networldmap.com...]

If you type in one of those non-DNS-named IPs it will tell you (about 90% of the time) the exact geographical location of that IP.

I use it to trace-back origination of [Link: www...] FTP access when I'm concerned about who's behind a file transfer

enjoy

eni

42 justdanny  Mon, Mar 10, 2003 6:16:00pm

#28 Ernie G

thank you sir.

once again, i dont really understand what all this stuff means, but i'm way digging the 'war room' feeling here of the good guys fighting off the bad guys.

i've got one sweet .223 and one fast 45, if the s#$t gets deep, give me a yell . . .

43 Mary  Mon, Mar 10, 2003 6:36:30pm

I don't understand the tech stuff either, but I learned about tracing IP addresses back from the "Stop the Hate" tutorial at [Link: www.factsofisrael.com...] It's under "free stuff" in the left hand column. I'd give you the direct link, but you really should see his Macromedia Flash banner on "peace" in Iraq, at the top of the home page.

44 chris hester  Tue, Mar 11, 2003 12:01:13am

There's a superb thread here about blocking robots from your site using a simple .htaccess file:

[Link: www.webmasterworld.com...]

Or try this PHP-based approach instead:

[Link: www.webmasterworld.com...]

Some people's sites are being hit literally thousands of times by unwanted robots downloading everything in site. Why pay for extra bandwidth to allow for them?

45 Bender  Tue, Mar 11, 2003 3:34:48am

#41 - be careful with that - Asparagirl uses it, and they thought I was in florida - until I manually put my class C's in as in nyc, florida, and india ;)

It gets who owns them, not who controls them, at least until someone manually updates an entry for a particular network.

46 Smoof  Tue, Mar 11, 2003 5:58:00am

Is it not possible that search engine spiders will also be caught in this scheme? I assume you want to be indexed by Google, Altavista, etc.

47 Charles  Tue, Mar 11, 2003 6:25:57am

Legitimate search engine spiders that honor the robots.txt protocol are not caught in the trap. Only bots that violate robots.txt are trapped. The technique I'm using is similar to the one described here:

[Link: www.kloth.net...]

By the way, on our robot pillory page I added a link to Geobytes to do a location lookup. It isn't 100% accurate, but I'm sure the ID of the Nigerian bot was correct. (And I love seeing one of those Nigerian creatures in the trap.)

Last night, a bot was caught in the trap and thrashed around for about an hour, trying to download every link off the main page. Heh heh.

48 Vitamin Tom  Tue, Mar 11, 2003 12:55:35pm

How about:

"Freeze, robot. If you make one false move, I swear, I'll...inspect you!!! That's right. Go cower in fear of my inspection regiment! And if you ever come back, I'LL INSPECT YOU SOME MORE!!!

(This warning brought to you by the United Nations)"


This entry has been archived.
Comments are closed.
