Blocking spam referrers

Looking through my website statistics program (I use Mint, thankyouverymuch), I’ve noticed a couple of referrers that are suddenly sending a ton of hits to my website. I was curious what I did to attract these guys, so I peeked in my Apache logs a bit, and headed over to their website to find out what they’re all about. Turns out, they’re basically opening a little frame in their site, and redirecting browsers to hit Movable Type trackback posts. Mind you, I haven’t run Movable Type in … 6 months?

Now, my Movable Type install doesn’t exist anymore, but I do have a rewrite rule in my Apache configuration to capture 404 (resource not found) errors and have WordPress deal with them. This is so you get pretty website addresses, rather than something ugly with variable names in them.
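For the curious, a catch-all rule of this kind usually looks something like the block below. This is a minimal sketch modelled on the stock rules WordPress suggests for pretty permalinks, not a copy of my actual file. It assumes WordPress lives at the document root, so the `/` base and the `/index.php` target are assumptions.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
# Assumption: WordPress is installed at the document root.
RewriteBase /
# Leave requests for real files and directories alone...
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# ...and hand everything else (what would otherwise be a 404) to WordPress.
RewriteRule . /index.php [L]
</IfModule>
```

Because anything that doesn't exist on disk falls through to WordPress, requests for a long-gone trackback script still end up being handled by the blog instead of dying quietly.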

I decided that I wanted to block these guys.  Not because they’re bothering me in any way — the trackback script they’re trying to hit on my server doesn’t exist anymore — but they are skewing my website statistics.

Warning: technology speak ahead.

I knew that I could solve this with mod_rewrite in Apache, but I never remember the conditionals and syntax of anything more than a very basic rewrite rule. A bit of searching online led me to ilovejackdaniels.com, who have already fixed this problem (and in a bigger way than I need to). Basically, they use Apache’s mod_rewrite to forbid any request with a certain referrer field, using a RewriteCond that tests the HTTP_REFERER variable. I dropped the following two lines into my .htaccess, where I do the rest of my mod_rewrite work:

RewriteCond %{HTTP_REFERER} tbsp2\.php [NC]
RewriteRule .* - [F]

What does this do? The first line checks the HTTP_REFERER variable, and if it contains tbsp2.php (in any letter case, thanks to [NC]), the RewriteRule that follows applies. In my case, two different websites (which I will not link to) have a page called /klx/tbsp2.php with the aforementioned frame sketchiness going on. The RewriteRule matches any request, and the - tells Apache not to rewrite the URL or redirect the client. The magic is in the [F]: this tells Apache to return a 403 Forbidden error. These don’t get tracked in my Mint statistics, so they’ll no longer skew my stats. And now, anytime I notice additional spamming referrers, I’ll just add another RewriteCond to my configuration, joined with the [OR] flag, since consecutive conditions are otherwise ANDed together.
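Here is a sketch of what blocking a second referrer might look like. The first pattern is the one from this post; `another-spam-page\.php` is a made-up placeholder, not a real offender. The dots are escaped so they match a literal period, and every condition except the last carries [OR], because mod_rewrite otherwise requires all conditions to match at once.

```apache
RewriteEngine On
# Forbid the request if the Referer contains either pattern (case-insensitive).
RewriteCond %{HTTP_REFERER} tbsp2\.php [NC,OR]
# Hypothetical second spam page; replace with whatever shows up in the logs.
RewriteCond %{HTTP_REFERER} another-spam-page\.php [NC]
# "-" means no substitution; [F] returns 403 Forbidden and stops processing.
RewriteRule .* - [F]
```

These lines need to sit above any catch-all rewrite rules in the same .htaccess, so the 403 is issued before the request gets handed off anywhere else.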

And no, I haven’t been misspelling "referrer" in HTTP_REFERER. For some reason, RFC 2616 spells the header "Referer", which is wrong.

I hate web spam…

I really hate web spam. Since I used to run [Movable Type](http://www.sixapart.com/movabletype) here, lots of spammers used to hit up /mt/mt-tb.cgi and /mt/mt-comments.cgi to try to auto-spam my blog. I’ve been running [WordPress](http://www.wordpress.org/) for (checks archives) nearly 3 months now, and I’m still seeing it.

Today, I was checking out my [Mint](http://www.haveamint.com/) statistics, and I noticed I had some really spammy-looking [outclicks](http://code.jalenack.com/archives/outclicks-pepper/) from my [photo gallery](http://www.marius.org/gallery/). Turns out spambots on “t3h internets” are spamming my comments in there too, so I’ve disabled Gallery comments now.

Sigh. I wish someone would come up with a [TypeKey](http://www.sixapart.com/typekey/)/[LiveID](http://get.live.com/getlive/overview)/[OpenID](http://www.livejournal.com/openid/about.bml) system that actually worked and was well used, so comments could be better managed.

(ed. note: now that I’ve written all that, there’s a lot of links up there. And it looks spammy. How ironic and lovely.)

Tagging for WordPress

So, as I mentioned, I’ve been working on upgrading the blog to WordPress this past week. Everything went off without a hitch, or so I thought.

See, I installed [Jerome’s Keywords 1.9](http://vapourtrails.ca/wp-keywords) to manage my tagging. Well, all was working well until I started receiving comments on the blog. As comments came in, tags from that entry disappeared. That’s, uhh, sub-optimal. I did some reading on Jerome’s site, and it seems he’s working on version 2.0, but it’s been in the works for months. People posted in comments that they are seeing the same thing when running WordPress 2.1, but no response from Jerome that I saw.

I migrated tonight to [Simple Tagging Plugin](http://sw-guide.de/wordpress/wordpress-plugins/simple-tagging-plugin/), which is based on Jerome’s Keywords. It has a few other handy features (type ahead, better tag management), and the best feature of all: it doesn’t exhibit this bug. It looks like it’s in a lot more active development too, and can import Jerome’s Keywords tags. Imported my entries, and away I went.

It also supports listing off entries that aren’t tagged at all: this is handy for me as I re-tag my entire site after moving from Movable Type.

From MT to WP

Well, I’ve been fiddling with WordPress for a while, and tonight I bit the bullet and made the switch from Movable Type to WordPress. Since I had a heck of a time finding one site that had all the right steps, I’ve decided to document it here for posterity. My biggest challenge was that I had originally started using Brad Choate’s Textile plugin, but had recently switched to Markdown. I decided to go the extra step and convert my MT export entries that were Textile coded into raw HTML. It wasn’t simple. Read on for the detailed instructions.


Movable Type Tags – Simplifying Search URLs

So now that Movable Type supports Tags, I'm going to have to re-architect the way I've done my categories.

I wanted to make this work a little more like Technorati with their tag search, where you can just go type in www.technorati.com/tags/[tag of interest].

With a little searching on the web, I found this page in Movable Type's beta documentation. Paraphrasing here for continuity:

This assumes you're using Apache, and have mod_rewrite installed and working. You simply need to put the following into your document root's .htaccess file:

RewriteEngine on
RewriteRule tag/(.+) /mt/mt-search.cgi?tag=$1&blog_id=1

Jay goes on to note that if you're using Movable Type templates to create the .htaccess file, you can use Movable Type's <$MTBlogID$> variable to automate your blog_id variable.
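As a sketch of what that templated version could look like: the `<$MTBlogID$>` tag is filled in by Movable Type when it publishes the template, so the output .htaccess ends up with a plain number. The `^...$` anchors and the [L] flag are my own additions, not part of the documented rule. They keep the pattern from matching "tag/" in the middle of some other path and stop further rules from running once it matches.

```apache
RewriteEngine on
# <$MTBlogID$> is replaced with this blog's numeric ID at publish time.
# Anchors and [L] are additions to the documented rule (see note above).
RewriteRule ^tag/(.+)$ /mt/mt-search.cgi?tag=$1&blog_id=<$MTBlogID$> [L]
```

With this in place, a request for /tag/apache on the site would be served by the search script with tag=apache.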

New Stuff on the Site

Added a Book Queue over there on the right (powered by MTBookQueueToo and MTAmazon) and finally have a dynamically created Blogroll (powered by a beta of MT Blogroll, which doesn't have a homepage yet).

Oh, and Jeff pointed this out today online. Pretty damned sweet — a genetic sequence browser for various species' DNA. Wonder if I can get some of this plotted on glossy-poster paper and hung on a wall.

MT Upgrade Woes

Well, after another MT upgrade this last week, I broke the install again, thanks to the Storable module breaking over 56-bit vs. 64-bit storage (I think; I plan to research this more later today). Every time I upgrade, I forget to patch lib/MT/PluginData.pm, so I figured I'd document the required steps here, as it seems I'm not the only one having this problem.

Edit lib/MT/PluginData.pm and, after line 9, add the following:

BEGIN {
 $Storable::interwork_56_64bit = 1;
}

MTOtherBlog Failure

Well, since upgrading to MT 3.0D, MTOtherBlog has failed, as some of you may have noticed by my Static Categories list vanishing off the right. Guess I need to drop a note to the MTOtherBlog author; hopefully s/he's not one of the people bailing on MT since the 3.0 licensing brouhaha.

UPDATE: Yep, it's official: David Raynes says that OtherBlog is going the way of the dodo and he's going to release MultiBlog to take over for the 3.0 days.