This release classifies URLs as "interesting" or "boring" by simple string matching. Interesting URLs are downloaded in preference to boring ones. The spider has been separated from the UI, and is now in ui/TextSpider. Checkpointing and resume functionality have been added, so the spider can be killed and restarted without redoing a lot of processing. URL retrieval has been fixed so that URLs differing only in their fragment (the part after a #) are not treated as new URLs.
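
To illustrate the classification and fragment handling described above, here is a minimal sketch in Python. It is not the spider's actual code: the substring list, the `classify` function and the `Frontier` class are all hypothetical names invented for this example, and the real matching rules may differ.

```python
import heapq
from urllib.parse import urldefrag

# Hypothetical substrings; the release notes do not say which strings are matched.
INTERESTING = ("docs", "article")


def classify(url):
    """Return 0 for an "interesting" URL and 1 for a "boring" one."""
    lowered = url.lower()
    return 0 if any(s in lowered for s in INTERESTING) else 1


class Frontier:
    """Queue of URLs to fetch, interesting ones first (assumed design)."""

    def __init__(self):
        self._heap = []
        self._seen = set()
        self._counter = 0  # tie-breaker so equal ranks come out in insertion order

    def add(self, url):
        # Drop the fragment so page#top and page#bottom count as one URL.
        base, _fragment = urldefrag(url)
        if base in self._seen:
            return False
        self._seen.add(base)
        heapq.heappush(self._heap, (classify(base), self._counter, base))
        self._counter += 1
        return True

    def pop(self):
        # Lowest rank first, so interesting URLs are returned before boring ones.
        return heapq.heappop(self._heap)[2]
```

Using a rank plus an insertion counter as the heap key keeps the ordering stable within each class, which is one simple way to get "interesting first" behaviour.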
Basic multi-threading code is now in place, so the spider can retrieve many
URLs at once. There's no limit to the number of threads that can be started.
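
As a rough illustration of retrieving many URLs at once with no cap on the thread count, the sketch below starts one thread per URL. The `fetch_all` function and its `fetch` callback are hypothetical names used only for this example; the spider's own threading code is not shown in these notes and may be organised differently.

```python
import threading


def fetch_all(urls, fetch):
    """Call fetch(url) for every URL, one thread per URL, and collect the results."""
    results = {}
    lock = threading.Lock()

    def worker(url):
        body = fetch(url)
        # Guard the shared dictionary while storing the result.
        with lock:
            results[url] = body

    # No limit on the number of threads: one is started for each URL.
    threads = [threading.Thread(target=worker, args=(u,)) for u in urls]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```
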
Mailto links on pages are now handled without the spider crashing, and are
saved into a text file for later processing.
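
The mailto handling could look something like the following Python sketch. The function names and the one-address-per-line file format are assumptions made for illustration; the notes only say that the addresses are saved into a text file.

```python
from urllib.parse import unquote, urlsplit


def split_links(links):
    """Separate mailto: addresses from links the spider can actually fetch."""
    urls, addresses = [], []
    for link in links:
        parts = urlsplit(link)
        if parts.scheme == "mailto":
            # The address is the path part; any ?subject=... query is ignored.
            addresses.append(unquote(parts.path))
        else:
            urls.append(link)
    return urls, addresses


def save_addresses(addresses, path):
    """Append the addresses to a text file, one per line (assumed format)."""
    with open(path, "a", encoding="utf-8") as f:
        for address in addresses:
            f.write(address + "\n")
```

Filtering on the URL scheme before fetching means a mailto link never reaches the download code, which is one plausible way to avoid the crash mentioned above.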