Updated 2012-06-09 01:35:39 by RFox

Feel free to add whatever occurs to you by finishing the sentence 'TclHttpd needs ...'

(note: actual bugs can be recorded here [1])

A Certificate Authority

- CMcC: ... to match its TLS support.

AK: Support for the management of certificates, authorities, etc. is something I would place into TLS itself, so that everything using TLS, not only Tclhttpd, can make use of that support.

CMcC: there seem to be two halves to the user-facing part of a CA: an interface to fill in a certificate request, and something to upload a certificate into the user's browser. It would be good to provide a general interface, but it seems to be wedded to the httpd server, so hard to generalise. Most open CAs also seem to be more-or-less shallow wrappers around the awful OpenSSL interface. Perhaps if TLS provided key generation support, or if there were a non-OpenSSL CA facility, that would be worthwhile (because OpenSSL key management is a disgusting thing.)

CMcC update 20040709: Checked into HEAD under sampleapp/ca a simplistic CA, sufficient to create CA and server certificates and deliver them to the client.

DKF: Certificate Authorities are hard. The awkward bit is that they really need to be run very carefully, and the people running them need to be very careful about how they decide whether a person really sent them a certificate. By comparison, OpenSSL's key mangling is just annoying...

CMcC: yes, hard and finicky - X.509v3 and its interaction with browsers is a can of wriggly worms. All this one attempts to do is generate a CA certificate and a server certificate. To be a completely functional CA it needs to be able to handle SPKACs from Mozilla, and it needs to send js to IE to generate requests ... the OpenCA project [2] handles all that - if you need it. In the meantime, this simplistic thing has just enough to generate a server certificate without much admin intervention, which will encourage people to use SSL/TLS, which increases the net security in the world, so I'm happy.
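For reference, the core of such a simplistic CA is little more than a wrapper around the openssl command line. A hedged sketch of the idea follows; the proc name, file names, and subject are placeholders, not what sampleapp/ca actually does:

```tcl
# Hypothetical sketch: generate a self-signed server certificate by
# shelling out to the openssl CLI.  Paths and the subject field are
# placeholders; this is not the sampleapp/ca code.
proc makeServerCert {dir cn} {
    file mkdir $dir
    set key  [file join $dir server.key]
    set cert [file join $dir server.pem]
    # 2048-bit RSA key plus a self-signed X.509 cert, valid one year;
    # 2>@1 keeps openssl's stderr chatter from aborting [exec]
    exec openssl req -x509 -newkey rsa:2048 -nodes \
        -keyout $key -out $cert -days 365 -subj "/CN=$cn" 2>@1
    list $key $cert
}
```

A real CA would of course sign requests with a CA key rather than self-sign, but for bootstrapping SSL/TLS on a single server this is essentially all that is needed.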

Web Admin Interface

CMCc: help newbies configure the system with a web interface. This would benefit from the recent Digest changes, the addition of the /htaccess domain, and (more generally) the recent security enhancements to TclHttpd.

Generalised Templating

CMCc: allow any mime type to be templated, as .tml->.html currently is. See TclHttpd Templates for the proposal.

AK: Note the textutil::expander in tcllib. This is the core of William Duquette's Expand macro processor.

CMcC: Tclhttpd already has a well developed template facility, based around subst. Other types of template could be added by using the Doc_$mimetype facility, as/if desired.
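To illustrate the Doc_$mimetype route, here is a hedged sketch of a handler for a hypothetical CSS template type. The {path suffix sock} signature follows doc.tcl's existing handlers, and mapping a new suffix to the mime type would be done in the server's mime-type configuration; the suffix and type here are invented:

```tcl
# Hedged sketch of templating another mime type via the Doc_$mimetype
# hook.  Assumes a ".cml" suffix has been mapped to the (invented)
# type application/x-css-template in the mime-type configuration.
proc Doc_application/x-css-template {path suffix sock} {
    set f [open $path]
    set template [read $f]
    close $f
    # subst the template at global level and serve the result as CSS,
    # by analogy with the .tml -> text/html pipeline
    Httpd_ReturnData $sock text/css [uplevel #0 [list subst $template]]
}
```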

Generalised Error Handling

CMCc: allow per-directory error and not-found page handling.

Some of this is handled in Apache by means of the .htaccess file, which seems to me to be a really bad idea, because it necessitates parsing, because it's a strange place to put that kind of thing, and because it's redundant to create a command language when we've already got an excellent one.

I've put the notfound handler I use at Tclhttpd Error Handling for reference. It wraps the standard handler, and can be deployed in /usr/lib/tclhttpd/custom/ as a per-site customisation.
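The general shape of such a wrapper is sketched below. This is illustrative, not the handler referenced above: the stock proc name Doc_NotFound and the per-connection Httpd$sock state array follow tclhttpd convention, but data(path) holding the translated filename and the optional status-code argument to Httpd_ReturnData are assumptions to verify against your version:

```tcl
# Illustrative per-directory not-found handler: walk up from the
# requested file looking for a notfound.html, falling back to the
# stock handler if none is found.
proc Custom_NotFound {sock} {
    upvar #0 Httpd$sock data
    set dir [file dirname $data(path)]
    while {1} {
        set page [file join $dir notfound.html]
        if {[file exists $page]} {
            set f [open $page]
            set html [read $f]
            close $f
            Httpd_ReturnData $sock text/html $html 404
            return
        }
        set parent [file dirname $dir]
        if {$parent eq $dir} break      ;# reached the filesystem root
        set dir $parent
    }
    Doc_NotFound.orig $sock             ;# fall back to the stock page
}
# installed at startup with something like:
#   rename Doc_NotFound Doc_NotFound.orig
#   rename Custom_NotFound Doc_NotFound
```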

.htaccess -> .tclaccess compilation

CMCc: Recently, David Gravereaux reported (in the tclhttpd mailing list) that "[Url_Dispatch] appears to have a bottleneck processing the access hooks."

I wonder if this is because of the heavy parsing and processing needed for the .htaccess format, and suggest (if it is, and I think it probably is) that we could convert .htaccess to an equivalent .tclaccess on the fly and save the performance hit.

It might be, though, that the .htaccess bottleneck is due to the necessity of communicating back to the client, in which case compilation wouldn't help.
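For reference, the .tclaccess target format is just a Tcl script. An example follows; the two-variable convention (realm and callback) matches tclhttpd's documentation, but the callback signature should be checked against your version, and the account here is obviously illustrative:

```tcl
# Example .tclaccess file.  When sourced it sets two variables,
# realm and callback; the callback's {sock realm user pass} signature
# is assumed from tclhttpd's docs and should be verified.
set realm "Members Area"
set callback checkPassword

proc checkPassword {sock realm user pass} {
    # a single hard-coded account, for illustration only; real code
    # would consult a password file or database
    expr {$user eq "alice" && $pass eq "sesame"}
}
```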

AK: A question from someone not having any idea about .htaccess: conversion of .htaccess to something Tcl can read more easily is a good thing. Would it also make sense to keep the information in memory after reading any of the files once? Or is it necessary to read the files whenever an access happens? Because if we (can) keep the info in memory, even a more costly parse (of .htaccess) should not hit that hard, being amortized over the whole lifetime of the server process.

CMcC: good point. I think the .htaccess is in fact parsed into a data structure which persists, which suggests that David Gravereaux's observation of a performance problem is about network latency, and so unfixable.

DG: Off hand, each time a page is requested, all .htaccess files up the tree are sourced in a slave interp, each and every time. Persist? Not really. Cached? No. This isn't network latency at all; this is page generation speed and the bottleneck [Url_Dispatch] has.

CMcC: On the contrary, having RTFS, I see that the file hierarchy is searched bottom-up for an applicable .htaccess or .tclaccess file. A matching .htaccess file is then submitted to [AuthParseHtaccess], which consults an array called auth$file, comparing the modification time recorded there with the modification time of the file. Only if the file is more recent than the contents of auth$file is the file re-parsed. Thus, .htaccess parsing is definitely designed to be cached (modulo bugs, of course.) .tclaccess files must, of course, be re-evaluated on every attempted access.

One could speed this up by caching the directory search process, mapping each file to its applicable .htaccess file, but that would preclude adding a .htaccess file on the fly and having it recognised, which I presume is a user requirement (although it might be something people would want to option out of.) I suspect, too, that the directories are already cached by any decent o/s, and so the win from this isn't as great as one might imagine.

Even when the .htaccess file has been parsed or fetched from cache, yielding a realm, the browser must be challenged by sending the appropriate HTTP protocol challenge, and its result must be calculated ... this is a necessity imposed by the requirements of the HTTP Basic authentication protocol. The network latency is unavoidable, as the challenge is sent to the browser and a response is formulated by it and returned. This may or may not result in a popup challenge, depending on the browser's ability to cache realm->user,password.

Given this analysis, and subject to the requirement that .htaccess files are mutable, I think it is likely that network latency is the cause of DG's observation.
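The mtime-keyed caching described above reduces to a few lines. A generic sketch (the auth$file scheme in auth.tcl works along these lines, though all names here are illustrative, and the "parse" is a stand-in):

```tcl
# Generic mtime-keyed parse cache: re-read and re-parse a file only
# when its modification time has changed.  The parse step here is a
# trivial stand-in (split into lines); auth.tcl's real .htaccess
# parser is more involved.
proc cachedParse {file} {
    global parseCache
    set mtime [file mtime $file]
    if {[info exists parseCache($file,mtime)] &&
            $parseCache($file,mtime) == $mtime} {
        return $parseCache($file,data)       ;# cache hit: no I/O, no parse
    }
    set f [open $file]
    set raw [read $f]
    close $f
    set parseCache($file,mtime) $mtime
    set parseCache($file,data)  [split $raw \n]
    return $parseCache($file,data)
}
```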

DG -- Observations were done by commenting out 'foreach hook $Url(accessHooks) {...}'. If latency was the cause, why did commenting out the access hook processing nearly double hits/sec? Network latency is non-existent when the tclsh process is wedged at 100% CPU usage. Accept it Colin, the access hook processing is slow. Maybe the ugly eval right at the beginning there is suspect, hmm? I see why the eval is there; too bad it needs to be.

CMcC This is all good stuff, David. One of the default accessHooks is [DocAccessHook] which calls [Auth_Verify] ... which (assuming there are .htaccess files in the path to Doc_Root) will cause network access. Did your tests have .htaccess files anywhere in the path to root?

DG There was one in the docroot, but I don't recall the contents of it. It called into routines from the root .tml file, but security wasn't asserted for those URLs. I'm almost positive it was a .tclaccess, not an .htaccess. All the performance sensitive URLs were domain scripts. No access files were in the literal directories of those domain paths. No literal directories existed for them.

Status: Gathering Data

Improved [Url_Dispatch] speed

Someone (DG?) reports a [Url_Dispatch] bottleneck ... could they expand on this concern here?

DG --
   1) Use your favorite http performance abuser against a URL for a static/cachable
      page or maybe a domain script that returns a static string.  This tester must
      be able to place the tclsh process into 100% CPU usage.  Apache's ab doesn't
      seem to have enough guts for me; I wrote my own.
   2) Get a baseline max hits/sec in stock form (was 70 for me in my test setup).
   3) Comment out the block 'foreach hook $Url(accessHooks) {...}' in Url_Dispatch.
   4) Observe the large speed-up (143).
   5) Scratch head as to why...

The production setup serves at about 500 hits/sec, which is over 1.5M per hour and was the design target. I wouldn't have been able to reach the design target if I had not modified the access hook processing.

CMcC Can I have your performance abuser, to repeat your tests?

DG It was iocpsock using the http package, setting a ?protocol handler?. I forget exactly what I did, but I used the -command option to enable async mode and within [time] counted the connects up, then down for the completes, to zero for done. As a client, iocpsock works best on WinXP. If you want, I'll rewrite it? I don't think I saved it... at least not here. I'll need to look.
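A reconstruction of that kind of async hammer using only the stock http package (iocpsock not required) might look as follows. Treat it as a sketch of the approach DG describes, not his actual tool:

```tcl
# Sketch of an async load generator: fire n requests with -command
# callbacks, run the event loop until all complete, report hits/sec.
package require http

proc hammer {url n} {
    set ::pending $n
    set start [clock milliseconds]
    for {set i 0} {$i < $n} {incr i} {
        # -command makes geturl asynchronous; the token is appended
        # to the callback when the transaction finishes
        http::geturl $url -command [list apply {{tok} {
            http::cleanup $tok
            if {[incr ::pending -1] == 0} { set ::done 1 }
        }}]
    }
    vwait ::done                        ;# pump the event loop to completion
    set secs [expr {([clock milliseconds] - $start) / 1000.0}]
    expr {$n / $secs}                   ;# hits per second
}

# usage: puts "[hammer http://localhost:8015/ 200] hits/sec"
```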

Status: gathering data.

Better XML support

When using a direct URL to return XML, there is a problem with returning the type text/xml.

When getting XML from Microsoft.XMLHTTP, the server does not know how to handle content-type text/xml. Resolution: in the ncgi package, change proc ::ncgi::nvlist, adding text/xml to the switch command.

Status: AFAIK, this was fixed in tcllib, see [3]
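For the response side, returning XML from a Direct domain handler looks like this. The convention of taking the content type from a global variable named after the handler proc is tclhttpd's (verify against your version); the /status URL and proc names are illustrative:

```tcl
# Hedged sketch of a Direct-domain handler returning text/xml.
# Register at startup (guarded here so the snippet loads standalone):
if {[llength [info commands Direct_Url]]} {
    Direct_Url /status Status
}
# the Direct domain reads the content type from a global variable
# with the same name as the handler proc
set Status/report text/xml

proc Status/report {args} {
    return {<?xml version="1.0"?><status><up>1</up></status>}
}
```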

Clean the tree

[AL] : I use the current released tclhttpd (3.4.2), and the basic tree is messy. Why htdocs and htdoc2? Where is the real documentation: doc, htdocs, htdocs_2? Please clean it all up. There is also a crying need for documentation; I had to look in the code to really know how to use upload in my programs. A clear, simple set of man pages would be really useful.

WJR - You should download the latest version (3.5.1), a very nice set of man pages is now available. In addition, the latest version has several sample applications that may help your development.

[AL] : It would be also very nice to have some kind of "trivial" tclhttpd with very few functions and a lightweight code and size, without snmp, without ttml, without all but the basic functions (post and get, cookies, file upload).

WJR - I believe there's an existing discussion about this (see minimal TclHttpd).

CMcC - The current released version of Tclhttpd is 3.5.1 - there's a lot more documentation.

Direct Wikit support

CMcC - Wikit vfs will be adapted to allow a Wikit to be directly mounted under /htdocs, with all expected Wikit functionality.

Status: several working versions under construction (20040613)

CGI support in the Starkit Version

phk 2004-06-25 - Is there any chance to use cgi within the Starkit? (you may ask why... well, a Starkit/-pack could be used for a nice small demo without the need of an installation)

AMG - I don't think this is feasible or worthwhile. CGI scripts are separate executables, and (so far) only Tcl programs can look inside Metakits using ordinary file calls (a la VFS). In Linux at least, it's possible to add VFS-like functionality to existing programs by (1) overriding the libc file calls for dynamically-linked executables, (2) creating a kernel driver for mounting Metakits, (3) mounting a userland filesystem such as [LUFS], or (4) (insert juju here). But this is a terrible amount of trouble to go through when "sdx unwrap" works just as well.

phk 2004-08-30 - I didn't know about this limitation, but I still think it would be cool to combine a webserver based application and the webserver within one Starkit. Not really for productive usage, but for demo versions etc... Thanks for pointing out the current possibilities.

Expanded fallback in Doc domain


CMcC At the moment the prefix of a URL is used to select a domain handler, and (within the Doc domain) the mime type of the result can be used to select a post-processing filter. It would be good if, by analogy to the use of .tml files to generate any required type, provision could be made for converting any format into any other format, thus:

A request for a file of the form x.y, if not found, and unsatisfied by a template x.y.tml could be satisfied by applying a proc .z.y to an existing file x.z, where .z.y is expected to transform .z format to .y format.

This would be a very simple addition to doc.tcl, and would allow a similar kind of flexibility in handling URL suffixes as is available to handle URL prefixes.
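The proposal above can be sketched in a few lines. Nothing like this exists in stock tclhttpd; all names follow the proposal, and the .txt -> .html transformer is an invented example:

```tcl
# Sketch of the proposed suffix fallback: given a miss on x.y, look
# for an existing sibling x.z for which a transformer proc named
# ".z.y" exists, and apply it.
proc fallbackTransform {path} {
    set ext  [file extension $path]          ;# ".y", the wanted format
    set stem [file rootname $path]           ;# "x"
    foreach cand [glob -nocomplain $stem.*] {
        if {$cand eq $path} continue
        set xform [file extension $cand]$ext ;# ".z.y"
        if {[llength [info procs $xform]]} {
            set f [open $cand]
            set data [read $f]
            close $f
            return [$xform $data]
        }
    }
    error "no transform found for $path"
}

# an illustrative transformer: .txt -> .html
proc .txt.html {text} {
    return "<pre>[string map {< &lt; > &gt;} $text]</pre>"
}
```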

AJAX capabilities?


RLH Since everything is driven by Tcl does this really apply? I am asking because I do not know.

CMcC Yes, AJ stands for asynchronous javascript. AJT[XH] would make sense.

RLH 2006-07-18: Well, the javascript package in Tcllib needs updating. So it would be nice to add Ajax support to that using either Dojo or Prototype.

Look and Feel

RLH 2006-04-14: I think TclHttpd is such a cool piece of software that its look and feel should be updated. I am talking about the initial server page and example pages. I am looking at some designs (HTML+CSS with minimal images) to see if I can update the look.

RLH I would like to see something like this as I think it presents a better face to TclHttpd: http://www.frbc-va.org/mockup/

Jeff Smith 2006-04-15: Nice work! It does look more up to date and appealing. If you suggest it on the TclHttpd mailing list it may inspire the author to release an official 3.5.2 version with this new look.

RLH Thanks Jeff, I sent an email to the mailing list about it.

MHo 2006/05/23: Has someone adapted the cool new homepage? I'm working on my Tclhttpd Winservice and want to integrate the new design...

RLH 2006-05-23: Not to my knowledge...I posted to the newsgroup as suggested but not a lot came back. I have no idea if it is being integrated or not.

Screencasts

RLH 2006-08-14: They seem to be all the rage. It would be nice for a screencast to show how to create different types of "applications" using TclHttpd.

Quickstart Script(s)

RLH 2006-08-14: There is an emphasis "today" on getting started quickly. It would be nice if you could do something like "site_create" and it would create a basic scaffolding and a generic start page. It would create a stub of a custom.tcl page and update the TclHttpd instance to automatically include that page. This could be extended to other things as well. Yes, I am looking at the Rails playbook and no I wouldn't do everything they do. Just something that would make starting with TclHttpd easier.
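Such a scaffolder could be very small. A hypothetical sketch in the spirit of the request (the layout, file names, and the idea that startup sources a custom/ stub are assumptions, not existing TclHttpd behaviour):

```tcl
# Hypothetical "site_create": lay down a docroot with a start page
# plus a custom/ stub for site-specific code.  All names are invented
# for illustration.
proc site_create {dir} {
    file mkdir [file join $dir htdocs] [file join $dir custom]
    set f [open [file join $dir htdocs index.html] w]
    puts $f "<html><body><h1>New TclHttpd site</h1></body></html>"
    close $f
    set f [open [file join $dir custom custom.tcl] w]
    puts $f "# site-specific Tcl sourced at server startup goes here"
    close $f
    return $dir
}
```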

RFox Documentation on the options/environment so that writing a custom startup script is simpler, specifically for embedding web servers into other applications that may not have initially been web apps (e.g. to provide a web interface to existing applications so that the UI can be decoupled physically from the app).

Done

Done in the current version

Security - all known holes plugged in 3.5.1 (new release)

Documentation and Tutorials - vastly improved in 3.5.1 [Scott Nichols and CL and others remain unsatisfied.]

Stacked domain handlers - NEM: moved to TclHttpd Stacked Domains. Status: Done in HEAD (20040613 CMcC)

Features done elsewhere

(e.g., in tcllib, which can be used with Tclhttpd of course)

Better logging infrastructure - schlenk: allow finer control over logging. Provide a hierarchical logging infrastructure (probably with a web interface) to control logging on a per-subsystem basis. Provide logging backends for different targets like syslog/eventlog.

AK: See the log and logger packages in Tcllib. The hierarchical stuff seems to be something logger is supporting.

Log Rotate and Reaping - Jeff Smith Purge logs after x number of days and rotate logXXXX_error daily as well.

AK: This is also something not specific to tclhttpd, but common to many daemons and long-running applications. A candidate for Tcllib.
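The purge-and-rotate Jeff Smith asks for is a few lines of Tcl. A sketch, with illustrative file naming (error.log rotated aside under a dated name, *.log files purged after keepDays):

```tcl
# Sketch of log purging and rotation: delete *.log files older than
# keepDays, then move the current error log aside under a dated name.
# File names are illustrative, not tclhttpd's actual log layout.
proc rotateLogs {dir {keepDays 7}} {
    set cutoff [expr {[clock seconds] - $keepDays * 86400}]
    foreach log [glob -nocomplain -directory $dir *.log] {
        if {[file mtime $log] < $cutoff} {
            file delete $log
        }
    }
    set err [file join $dir error.log]
    if {[file exists $err]} {
        set stamp [clock format [clock seconds] -format %Y%m%d]
        file rename -force $err [file join $dir error_$stamp.log]
    }
}
# run daily from cron or an [after]-based timer in the server
```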