What are some web based approaches for text mining? How can web pages be crawled?

I used to be a huge fan of open.dapper.net, but Yahoo! seems to have run it into the ground. I’ve tried kimonolabs.com (BETA), but found it’s not stable enough yet to rely on. import.io seems to be more mature and reliable; however, I find SaaS products that require a download component really annoying.
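
If none of the hosted tools fit, a small hand-rolled crawler is another option. Below is a minimal sketch using only the Python standard library; the names `PageParser`, `parse_page`, and `crawl` are made up for illustration, and it assumes plain static HTML (no JavaScript rendering). It fetches a page, pulls out the visible text for later mining, and follows the links it finds breadth-first.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class PageParser(HTMLParser):
    """Collects outgoing links and visible text from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not script or style content.
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def parse_page(html, base_url):
    """Return (visible_text, absolute_links) for one HTML document."""
    parser = PageParser(base_url)
    parser.feed(html)
    parser.close()
    return " ".join(parser.text_parts), parser.links


def crawl(start_url, max_pages=5):
    """Breadth-first crawl from start_url; returns {url: visible_text}."""
    seen, queue, pages = {start_url}, [start_url], {}
    while queue and len(pages) < max_pages:
        url = queue.pop(0)
        try:
            with urlopen(url, timeout=10) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except OSError:
            continue  # skip pages that fail to load
        text, links = parse_page(html, url)
        pages[url] = text
        for link in links:
            if link.startswith("http") and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

Calling `crawl("http://example.com/", max_pages=3)` would return a dictionary of page text keyed by URL, which can then be fed into whatever text-mining step comes next. The sketch does not check `robots.txt` or rate-limit its requests, so both would need to be added before pointing it at a real site.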