Updated 2011-05-12 10:29:13 by RLE

Purpose: to discuss aspects of developing scripts that interact with web sites.

Background: I have a short term need, but wondered if perhaps it is masking a larger need.

The short term need: I have a mailbox at one of the various free web-based email sites. To get into the mailbox, I need to go to a form-based web page, type in a password, and select one of the email folders holding new mail. Then, for each of 1 to n pages, I select the unread messages and read each one, deciding whether to download, forward, delete, etc. the message.

I want to be able to run scripts that automatically deal with all of the messages in a folder: forwarding them, downloading them, or something similar. Fetching the pages (which requires cookie handling), parsing the HTML, and invoking the appropriate function (whose link or button of course moves around the page depending on what else is on it) seems non-trivial.
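To make the three steps above concrete (cookie-aware fetching, HTML parsing, and locating the right link), here is a minimal sketch in Python using only its standard library. It is an illustration under assumptions, not a description of any real mail site: the form field names "user" and "passwd", the class="unread" row markup, and every function and class name below are hypothetical and would have to be adapted to the actual pages.

```python
from html.parser import HTMLParser
from http.cookiejar import CookieJar
from urllib.parse import urlencode, urljoin
from urllib.request import HTTPCookieProcessor, Request, build_opener


class UnreadLinkParser(HTMLParser):
    """Collect links found inside table rows marked class="unread".

    The "unread" class is an assumed convention; a real site will mark
    unread messages in its own way.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.in_unread_row = False
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "tr":
            classes = (attrs.get("class") or "").split()
            self.in_unread_row = "unread" in classes
        elif tag == "a" and self.in_unread_row and attrs.get("href"):
            # Resolve relative links against the page they came from.
            self.links.append(urljoin(self.base_url, attrs["href"]))

    def handle_endtag(self, tag):
        if tag == "tr":
            self.in_unread_row = False


def make_session():
    """Return an opener that stores cookies and resends them on later requests."""
    return build_opener(HTTPCookieProcessor(CookieJar()))


def login_request(login_url, user, password):
    """Build (but do not send) a form POST. The field names are assumptions."""
    body = urlencode({"user": user, "passwd": password}).encode("ascii")
    return Request(login_url, data=body)


def unread_links(html, base_url):
    """Return absolute URLs of the messages in rows marked as unread."""
    parser = UnreadLinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

A script would call make_session() once, send the login request through session.open(...) so the site's cookies are captured, then fetch each folder page with the same session and hand the decoded HTML to unread_links(). Finding links by their surrounding markup rather than by their position is one way to cope with controls that move around the page.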

Longer term need: developing toolsets for obtaining info from and interacting with web sites (search engines, news sites like slash, freshmeat, news.com, etc., sites for downloading the latest versions of software, and so on) seems like a task that would benefit others.

Does anyone have thoughts, ideas, suggestions, examples?

Sure do! A wrapper for the curl lib. There seems to be some unfounded inertia in the Tcl community when it comes to wrapping third-party libs. It would take about an hour to wrap the curl lib using SWIG.

http://curl.haxx.se

You get everything you want and then some.

It would be a shame to do this any other way, since by using the curl interface you get a big jumpstart on features.

I did a 15-minute hack to wrap the curl_easy interface and it was quite useful.

-PSE