$Id: INSTALL 45 2003-02-01 23:02:24Z aqua $

Installing Sugarplum
--------------------

The basic steps are these:

  1) Program installation
  2) Webserver configuration
  3) CGI configuration
  4) Teergrube baiting (optional)
  5) Testing
  6) HTML trapping

--

1. Program Installation

The automated way:

Edit install.sh, and adjust the various settings to suit your system.
Run ./install.sh.  This should be done as a user with write permissions
to the ETCDIR (/usr/local/etc/ by default) and CGIDIR
(/home/httpd/cgi-bin/ by default).  It need not be done by root, and
indeed should not.

install.sh will create /usr/local/etc/sugarplum/ and
/home/httpd/cgi-bin/sugarplum/; in the former it will create blank
agentlog.gdbm and poison.log files with permissions suitable to allow
the CGI user to edit them (write-only for the logfile).

The manual way:

  a. Copy the 'poison' CGI into a directory where your webserver can
     execute it.

  b. Copy sample.conf to /etc/sugarplum/config.  Go through and change
     the settings to suit your system and desires.

  c. Ensure that your local language dictionary is at
     /usr/share/dict/words, or configure /etc/sugarplum/config to point
     to the real location.

  d. If you enable logging (it's off by default) in the config file,
     create a file /var/log/sugarplum.log (or wherever you configured
     the log to be), writable by the webserver.

2. Webserver Configuration

These instructions apply to Apache 1.3, the only one I have readily
available that supports URL rewriting.  If anyone has configuration
instructions for other webservers, I'd be delighted to see them.
Jasper Jonhmans has contributed webserver configuration instructions
for the WN webserver -- see INSTALL.WN.

Add the following mod_rewrite rules to your httpd.conf file (or
wherever else is appropriate to your configuration).
Remember that rewrite rules are not inherited by virtualhosts by
default, so if you wish them to apply to all virtualhosts, you'll need
to turn on rule inheritance; see the Apache documentation at
http://www.apache.org/docs/ for the correct syntax; indeed, a
preparatory reading of the mod_rewrite documentation would be a good
idea anyway.  You'll also need a 'RewriteEngine on' directive in the
same scope as the rewrite rules.

  RewriteCond %{HTTP_USER_AGENT} email.?(magnet|reaper|siphon|harvest|collect|wolf) [NC,OR]
  RewriteCond %{HTTP_USER_AGENT} floodgate [NC,OR]
  RewriteCond %{HTTP_USER_AGENT} web(bandit|snake|collector|mole|miner|weasel) [NC,OR]
  RewriteCond %{HTTP_USER_AGENT} cherry.?picker [NC,OR]
  RewriteCond %{HTTP_USER_AGENT} extractor.?pro [NC,OR]
  RewriteCond %{HTTP_USER_AGENT} ^crescent [NC,OR]
  RewriteCond %{HTTP_USER_AGENT} nicerspro [NC]
  RewriteCond %{REQUEST_URI} !^/sugarplum/
  RewriteRule .* /sugarplum/visions [PT]

  ScriptAlias /sugarplum/ /home/httpd/cgi-bin/sugarplum/poison/

The ScriptAlias at the end, and the RewriteRule that points to it,
should have the /sugarplum/ portion adjusted to some randomly selected
innocuous path.  I suggest picking a random dictionary word that has no
relation to the content on the rest of your site.  You might also
consider changing the word(s) on a fairly regular basis.  Remember to
assume that the spammers are as smart as you are; as of the first
release of sugarplum, the ScriptAlias /sugarplum/ wouldn't be one that
a spambot would be instructed to avoid, but that will almost certainly
change as soon as the first spambot author/operator reads this.

For webservers other than Apache 1.3, at least put in the ScriptAlias
line, changing the /sugarplum/ alias as per above.  This will work on
at least NCSA HTTPd and Apache 1.2 that I know of.

It's important also to prevent legitimate spiders from wandering into
the poison -- it wastes search engine space.
A legitimate spider will generally respect the Robot Exclusion Standard
(RES); for this, create or add to the file served as
http://www.yoursite.tld/robots.txt:

  User-agent: *
  Disallow: /sugarplum/

3. CGI Configuration

As of v0.9.8, configuration should be done by editing the configuration
file, stored as /etc/sugarplum/config.  All runtime configuration
options are documented there.  The default settings should be generally
usable and safe for normal installations.

4. Teergrube Baiting (optional)

As of v0.9.8, teergrube baiting is the only address generation method
used by default.  The default teergrube address is unresolvable.

If you'd like to use sugarplum to lay bait for a teergrube, first
enable teergrube address generation in Step 3.  By default, the
usernames in the "bait" addresses will have the IP address of the
requesting host encoded in them -- use decode_teergrube.pl (included)
to extract the IP address back out.  Note that at present, address
encoding can only handle IPv4 addresses.

To point sugarplum at your teergrube, edit the configuration setting
'teergrube_address_fqdn' to the teergrube's FQDN.

5. Testing

With any suitable web browser, open http://www.yoursite.tld/sugarplum/;
you should receive the poison code's output, suitably random but
structurally resembling normal HTML by/for humans.  Also try
http://www.yoursite.tld/sugarplum/1/2/3/4/5/6/7/8/9/0/, adding enough
extra stuff to the end of the URL to exceed the max_depth configuration
in the CGI -- here, you should see no recursive links, only mailto:
links.

Restart the webserver if you haven't already (apachectl graceful, et
al).  Telnet to your machine on port 80, and enter the following,
ending with a blank line:

  GET /foo HTTP/1.0
  User-Agent: EmailSiphon

No matter what URL you selected for /foo, you should receive the output
from the poison code.

6. HTML Trapping

Spambots are spiders, and like any spider, to find a page they must be
given a link to it.
Somewhere on your site, ideally on several pages near or on the root
page, include a link to the /sugarplum/ scriptalias (adjust as
necessary).

Here, again, you need to be smarter than the spambots.  Assume they can
identify any trap that you can -- hence, making your links the same
color as the background in a zero-sized table cell, while it's
discreet, may not be effective.  Use a variety of tactics -- my
suggestion is to use a small (not ridiculously or invisibly small)
font, and to employ the specific formatting characteristics of the
pages to hide it somewhere, rather than applying formatting to the link
itself.

Remember that the poisoner won't hurt a normal browser -- it may
confuse people, but no worse than that.  Even if they wander through it
endlessly, the worst thing that will happen is that they'll fill up
their local cache with it.  Remember, the only way to trigger the DoS
attack is to enter it with a known-spammer User-Agent or to make too
many requests from the same IP with multiple agents in the configured
time period, so it should be nearly impossible for normal users to
trigger an attack accidentally.
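
If you want to check which User-Agent strings count as "known-spammer"
before touching a live server, the conditions from step 2 can be tried
offline.  What follows is only a sketch in Python, not part of
sugarplum: it copies the RewriteCond patterns into Python's re module,
which should behave the same as mod_rewrite for patterns this simple,
and it assumes the /sugarplum/ alias has not been renamed.  The names
is_known_spambot and would_rewrite are made up for illustration.

```python
import re

# Patterns copied from the RewriteCond %{HTTP_USER_AGENT} lines in step 2.
# The [NC] flag (case-insensitive) is approximated with re.IGNORECASE.
SPAMBOT_PATTERNS = [
    r"email.?(magnet|reaper|siphon|harvest|collect|wolf)",
    r"floodgate",
    r"web(bandit|snake|collector|mole|miner|weasel)",
    r"cherry.?picker",
    r"extractor.?pro",
    r"^crescent",
    r"nicerspro",
]
_COMPILED = [re.compile(p, re.IGNORECASE) for p in SPAMBOT_PATTERNS]


def is_known_spambot(user_agent):
    """True if any pattern matches; mirrors the [OR]-chained conditions."""
    # search() rather than match(): the rules are unanchored unless they
    # carry an explicit ^, as the "crescent" one does.
    return any(p.search(user_agent) for p in _COMPILED)


def would_rewrite(user_agent, request_uri, prefix="/sugarplum/"):
    """True if the RewriteRule would send this request to the poison CGI.

    'prefix' is an assumption: set it to whatever alias you chose.
    The second test mirrors: RewriteCond %{REQUEST_URI} !^/sugarplum/
    """
    return is_known_spambot(user_agent) and not request_uri.startswith(prefix)


if __name__ == "__main__":
    for agent in ("EmailSiphon", "Mozilla/4.0 (compatible; MSIE 5.5)"):
        print(agent, "->", would_rewrite(agent, "/foo"))
```

Here would_rewrite("EmailSiphon", "/foo") is true, as in the telnet
test in step 5, while a request already under /sugarplum/ is left
alone, matching the REQUEST_URI condition.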