Topic: SiteCrawler is a website downloading application for Mac OS X 10.4 and Later

Software Santa

SiteCrawler is a website downloading application for Mac OS X 10.4 and Later - now FREEWARE!

It WAS $20 Shareware ... now FREE - and Totally Unsupported!

Quote
The Web, on your hard disk

SiteCrawler is a website downloading application that lets you capture entire sites or selected portions, like image galleries. It features powerful settings that no other application offers.


Simplicity
You don't have to be an expert to use SiteCrawler. While the advanced features are easily accessible, they don't bog down the basic settings, so you can stay focused on the task at hand. To start crawling a site, enter a web address and choose a destination folder on your disk. Further options help you fine-tune the behavior, such as which parts of the site to access.


Take a break
While SiteCrawler crawls a site, you can pause it to change the settings. When you resume, the new settings are picked up immediately. So if you see files being downloaded that you don't really want, there's no need to stop the session.
You can save a session even in the midst of downloading, then re-open it later and pick up right where you left off!


Multiple Languages
SiteCrawler is currently available in English and Swedish.


URL Patterns
Normally, you choose a single web address to start from. But using URL patterns, you can choose several. To start from a numeric range, put the range inside a pair of square brackets. If you instead have a list of a few options, put those inside a pair of curly brackets. You can combine these pattern types to create even more advanced URLs. SiteCrawler 1.1 has even more powerful URL patterns that let you use square brackets inside curly ones! An address using URL patterns might look like this:

Code:
http://example.com/{stuff,images}/img[100-150].jpg
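
To make the pattern syntax concrete, here is a minimal Python sketch of how such a pattern could be expanded into a list of start addresses. It is not SiteCrawler's actual code: the function name `expand` and the exact semantics (comma-separated options, inclusive ranges, zero padding preserved from the first number) are assumptions for illustration only.

```python
import re

# Assumed syntax: {a,b,c} is a list of options, [M-N] an inclusive numeric range.
BRACES = re.compile(r"\{([^{}]*)\}")
RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand(pattern):
    """Return every URL described by a pattern (hypothetical semantics)."""
    match = BRACES.search(pattern)
    if match:
        urls = []
        # Substitute each option, then keep expanding whatever is left.
        for option in match.group(1).split(","):
            urls.extend(expand(pattern[:match.start()] + option + pattern[match.end():]))
        return urls

    match = RANGE.search(pattern)
    if match:
        first, last = int(match.group(1)), int(match.group(2))
        width = len(match.group(1))  # assumption: keep padding such as [001-010]
        urls = []
        for number in range(first, last + 1):
            text = str(number).zfill(width)
            urls.extend(expand(pattern[:match.start()] + text + pattern[match.end():]))
        return urls

    return [pattern]  # nothing left to expand


urls = expand("http://example.com/{stuff,images}/img[100-150].jpg")
print(len(urls))   # 102
print(urls[0])     # http://example.com/stuff/img100.jpg
print(urls[-1])    # http://example.com/images/img150.jpg
```

Because curly brackets are expanded first and each option is expanded again, a range placed inside a list, such as {a[1-2],b}, also works in this sketch.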

Rewrite that
By default, SiteCrawler changes references in downloaded HTML pages to point locally. This means you can browse downloaded sites without running into broken links or links that point back to the originating web site. Also, the file extensions of downloaded files are changed to match their actual content.
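
The reference rewriting described above can be illustrated with a small Python sketch. This is only a guess at the general technique, not SiteCrawler's implementation: the helper names `local_path` and `rewrite_links`, the on-disk layout (host name followed by the URL path, with `index.html` for directory URLs), and the choice to leave links to files that were not downloaded untouched are all assumptions. It also only handles double-quoted `href` and `src` attributes.

```python
import posixpath
import re
from urllib.parse import urljoin, urlparse

# Simplification: only double-quoted href/src attributes are considered.
ATTR = re.compile(r'\b(href|src)="([^"]*)"')


def local_path(url):
    """Map a URL to a relative path on disk (hypothetical layout)."""
    parts = urlparse(url)
    path = parts.path
    if path == "" or path.endswith("/"):
        path += "index.html"  # assumption: directory URLs are saved as index.html
    return parts.hostname + path


def rewrite_links(html, page_url, downloaded):
    """Point references to downloaded files at their local copies."""
    page_dir = posixpath.dirname(local_path(page_url))

    def replace(match):
        attr, target = match.groups()
        absolute = urljoin(page_url, target)
        if absolute not in downloaded:
            return match.group(0)  # not downloaded: leave the reference alone
        relative = posixpath.relpath(local_path(absolute), start=page_dir)
        return f'{attr}="{relative}"'

    return ATTR.sub(replace, html)


page = '<a href="/images/img100.jpg">x</a> <a href="http://other.org/">y</a>'
print(rewrite_links(page,
                    "http://example.com/gallery/index.html",
                    {"http://example.com/images/img100.jpg"}))
# <a href="../images/img100.jpg">x</a> <a href="http://other.org/">y</a>
```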


Performance
The SiteCrawler engine is optimized for high-speed Internet connections and takes advantage of modern web server technologies to speed up downloads. In addition to regular HTTP, SiteCrawler also supports secure connections (HTTPS).


AppleScriptability
Version 1.1 adds full AppleScript support, so you can more easily integrate SiteCrawler with many other applications. You can control all aspects of a session, including rules.


Universally Cool
SiteCrawler is built as a universal binary, which means it runs natively on both PowerPC and Intel Macs.


Authentication
SiteCrawler handles authentication with ease, no matter whether the web site requires you to log in using HTTP authentication or with a login form. If it's the latter, you can use SiteCrawler's Safari integration - simply log in using Safari, and SiteCrawler will automatically inherit the session.

When a web site requires you to log in using HTTP authentication (when a login sheet drops down in Safari), SiteCrawler asks you for the login information. If the password is already in your keychain, you'll get the option to use that.



Rules rule!
You can set up rules that SiteCrawler obeys when crawling a website. For example, you may want to exclude all files whose file extension is 'jpg'. Or, you may want to explicitly allow SiteCrawler to follow links to certain domains that are not normally included. You can arrange rules like these in the Rules list. The rules are evaluated in order, so later rules can override previous ones.

SiteCrawler 1.1 improves rules even further by adding a nice new user interface and the ability to assign several conditions to a rule. It's now possible to test rules out beforehand.
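
The "evaluated in order, later rules override previous ones" behavior can be sketched in Python as follows. The `Rule` class, the helper names, and the default of allowing a URL when no rule matches are assumptions made for illustration; SiteCrawler's real rule engine is not documented here.

```python
from dataclasses import dataclass
from typing import Callable
from urllib.parse import urlparse


@dataclass
class Rule:
    matches: Callable[[str], bool]  # condition tested against a URL
    allow: bool                     # True = download, False = skip


def has_extension(ext):
    """Condition: the URL path ends with the given file extension."""
    return lambda url: urlparse(url).path.lower().endswith("." + ext)


def on_domain(domain):
    """Condition: the URL is hosted on the given domain."""
    return lambda url: urlparse(url).hostname == domain


def is_allowed(url, rules, default=True):
    """Walk the rules in order; the last matching rule wins."""
    decision = default  # assumption: unmatched URLs are allowed
    for rule in rules:
        if rule.matches(url):
            decision = rule.allow
    return decision


rules = [
    Rule(has_extension("jpg"), allow=False),           # exclude all jpg files...
    Rule(on_domain("cdn.example.com"), allow=True),    # ...except on this domain
]

print(is_allowed("http://example.com/photo.jpg", rules))      # False
print(is_allowed("http://cdn.example.com/photo.jpg", rules))  # True
print(is_allowed("http://example.com/index.html", rules))     # True
```

Swapping the two rules would make the jpg exclusion win on every domain, which is why the order of the Rules list matters.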


Safari Integration
Unlike any other site downloader, SiteCrawler can automatically use session cookies from Safari. Often, web sites protect resources with passwords, and it's normally very difficult to crawl protected areas of sites which use cookies for authentication. All you need to do to crawl the password-protected areas with SiteCrawler is to log in to them using Safari - SiteCrawler will figure out the rest.

SiteCrawler offers the option to install shortcuts in Safari. An item is inserted in the File menu to pre-fill a SiteCrawler session with the address of the current page. There is also a new entry in context menus, so you can start crawling a link.



As I no longer have the time to maintain these apps, and because my payment processing company recently shut down, all Lighthead apps are now free of charge. Use these licenses for full functionality. Thanks!

SiteCrawler

Name: SiteCrawler User
Key: LSC11-28FF-E0D6-5D5F

http://lightheadsw.com/sitecrawler

http://lightheadsw.com/store.php

 

Software Santa first opened on January 1st, 2007
Now celebrating 16 Years of being a Digital Santa Claus!
Software Santa's Speedy Site is Proudly Hosted by A2 Hosting.
