Larbin is an HTTP Web crawler with an easy interface that runs under Linux. It can fetch more than 5 million pages a day on a standard PC (with a good network).
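To put the throughput figure in perspective, the short sketch below converts pages per day into pages per second and a rough bandwidth estimate. The function names and the 10,000-byte average page size are assumptions made for illustration only; they do not come from Larbin or its documentation.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def pages_per_second(pages_per_day: float) -> float:
    """Convert a daily page count into a sustained per-second rate."""
    return pages_per_day / SECONDS_PER_DAY


def required_mbit_per_second(pages_per_day: float, avg_page_bytes: float) -> float:
    """Estimate sustained download bandwidth in megabits per second.

    avg_page_bytes is an assumed average page size, not a measured value.
    """
    bytes_per_second = pages_per_second(pages_per_day) * avg_page_bytes
    return bytes_per_second * 8 / 1_000_000


if __name__ == "__main__":
    daily_pages = 5_000_000      # figure quoted in the description
    assumed_page_size = 10_000   # hypothetical average page size in bytes
    print(f"{pages_per_second(daily_pages):.1f} pages/s")
    print(f"{required_mbit_per_second(daily_pages, assumed_page_size):.2f} Mbit/s")
```

At 5 million pages a day the crawler must sustain roughly 58 pages per second around the clock, which is why the description stresses a good network.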
Information regarding Project Releases and Project Resources. Note that the information here is quoted from the Freecode.com page, and the downloads themselves may not be hosted on OSDN.
This release fixes some compilation problems with recent gcc versions, improves the configuration file parser, and adds new options for following links selectively.
This release compiles on Solaris, adds cookie management, can fetch images along with pages, and includes many rewrites for efficiency and portability.
With this release, it is once again possible to crawl through a proxy, all configurations should compile (on Linux and BSD), images can now be downloaded along with pages, and the robots.txt parser has been enhanced.