Updated 2012-09-10 23:13:46 by RLE

http://www.saxproject.org/

Review of Book of SAX [1]

RS: In another aspect, SAX (Simple API for XML) is a model of processing XML on the fly. A SAX parser such as expat (available in tDOM) walks through the XML input without keeping much state and issues configurable callbacks, for instance at the start or end of an XML element, or when it encounters a chunk of character data.

Following is a little example of instrumenting a SAX parser. On every start tag, el is called; on every end tag, ee is called, and for all character data in an element, ch is called. These are the callbacks provided by the user.

To keep track of where we are in the tag hierarchy, a global stack ::S is maintained: el pushes the current tag name, ee pops it. ch collects the content of "grill" and "baz" elements in a global array g. When a "bar" element ends, the collected contents are formatted and output; g is reset whenever a new "bar" element starts (so that earlier content cannot be misused).
 package require tdom
 proc parse xml {
    set ::S {}
    set p [expat -elementstartcommand el \
               -characterdatacommand  ch \
               -elementendcommand  ee ]
     if {[catch {$p parse $xml} res]} {
         puts "Error: $res"
     }
     $p free ;# release the parser object when done
 }
 #---- Callbacks for start, end, character data
 proc el {name atts} {
    lappend ::S $name ;# push
    if {$name eq "bar"} {array unset ::g}
 }
 proc ee name {
    global g
    set ::S [lrange $::S 0 end-1] ;# pop
    if {$name eq "bar"} {
         puts "$g(grill)=$g(baz)"
    }
 }
 proc ch str {
    global g
    set type [lindex $::S end]
     # expat may deliver an element's text in several chunks, so append
     switch -- $type {
         grill - baz {append g($type) $str}
    }
 }
 #-- Now to test the whole thing:
 parse "<foo><bar><grill>hello</grill><baz>42</baz></bar>
           <bar><grill>world</grill><baz>24</baz></bar></foo>"

Running this script displays
 hello=42
 world=24
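
To watch the order in which the callbacks fire, the same three hooks can simply record each event together with the current nesting depth. The following is only a minimal sketch along the lines of the example above: the names trace_events, ev_start, ev_end and ev_text are made up for illustration, and it assumes that tDOM's expat passes attributes to the start callback as a flat name/value list.

```tcl
package require tdom

# Hypothetical helper: parse xml and return the list of SAX events seen
proc trace_events xml {
    set ::events {}
    set ::depth 0
    set p [expat -elementstartcommand  ev_start \
                 -characterdatacommand ev_text \
                 -elementendcommand    ev_end]
    set failed [catch {$p parse $xml} res]
    $p free ;# release the parser whether or not parsing succeeded
    if {$failed} {error "parse error: $res"}
    return $::events
}
#---- Callbacks: each one records an event with its nesting depth
proc ev_start {name atts} {
    lappend ::events [list start $::depth $name $atts]
    incr ::depth
}
proc ev_end name {
    incr ::depth -1
    lappend ::events [list end $::depth $name]
}
proc ev_text str {
    # ignore chunks that consist of whitespace only
    if {[string trim $str] ne ""} {
        lappend ::events [list text $::depth $str]
    }
}
#-- Print one event per line
foreach ev [trace_events {<foo><bar id="1">hi</bar></foo>}] {
    puts $ev
}
```

Each line of output is one event: its kind, the depth at which it occurred, and the tag name (plus attribute list) or the text. Nothing is kept but the depth counter and the event log, which is the point of the SAX model: the parser itself never builds a tree.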