Keep your struct away from me

Marco Maggi In this page I present some "creative" usage of variable traces that allow us to visit elements of complex data structures, hiding the details of the structure itself.

To make the concept clear we start with a simple list iterator. We use a global array to store the state of the iteration and hand the array name to the procedures that need to access the elements.

It's easier to understand the code if we first look at a usage example:

 proc print_elements { itervar } {
    upvar       $itervar iterator

    set elms {}
    while { $iterator(MORE) } {
        lappend elms $iterator(VALUE)
    }
    puts [format "elements: %s" $elms]
 }

 set lst { 0 1 2 3 4 5 6 7 8 9 }
 literator $lst ::li
 print_elements ::li
 unset ::li

We see that the [print_elements] procedure accesses only two elements of the iterator array: MORE and VALUE. The first is expected to be a boolean value: true if there are more elements, false if the iteration is over. The second is the current value in the iteration.

When is the next element from the list stored in the VALUE element of the array? The answer is that a variable trace is associated to the "read" operation on the "::li(MORE)" global variable, and via the [upvar] command to the "iterator(MORE)" variable, local to the [print_elements] procedure.

We have to note that the [print_elements] procedure doesn't know where the elements are coming from: it only sees the two elements of the global array. In [print_elements] there's no reference to the code that controls the list and to the [literator] procedure.

The code that controls the list doesn't know where the list elements are going, it only knows that a reference to the list is used as argument to the to the [literator] procedure.

Being a global variable "::li" can be used as argument to procedure calls and be buried deep in some module. Obviously the right thing to do is write a command that builds a unique global variable name in a private namespace, so that we can do:

 set iterator [unique_name]
 literator $lst $iterator
 print_elements $iterator
 unset $iterator

Easy, you can find a [unique_name] procedure at Generating a unique name.

Now we can look at the list iterator code.

 proc literator { lst itervar } {
    upvar       $itervar iterator

    set iterator(LEN) [llength $lst]
    if { $iterator(LEN) } {
        array set iterator [list MORE 1 CURRENT -1 LST $lst VALUE {}]
        trace add variable [set itervar](MORE) read \
                [namespace code [list literator_tracer $itervar]]
    } else {
        set iterator(MORE) 0
    }
 }

 proc literator_tracer { itervar args } {
    upvar       $itervar iterator

    if { [incr iterator(CURRENT)] < $iterator(LEN) } {
        set iterator(VALUE)        [lindex $iterator(LST) $iterator(CURRENT)]
    } else {
        trace remove variable [set itername](MORE) read \
                [namespace code [list iterator_list_tracer $itername]]
        set iterator(MORE)        0
    }
 }

The array iterator is a little bit more complicated: first, we can only iterate through a global array; second, we can't just unset the iterator: we have to finalise the array search.

The global array constraint is there because arrays local to procedures can't be used as procedure arguments. It's not possible to use the [upvar] command to reach the original array in the iterator tracer procedure: it's too difficult to keep track of the stack level; the [upvar] command will not link a local variable to a global one:

 proc myproc { } {
     upvar ::global local ;# correct
     upvar local ::global ;# error

 }

so we cannot mirror a local array with a global one.

The second problem comes from the mode of operation of the iteration performed with the commands: [array startsearch], [array donesearch], etc.

So, here is the code for an array iterator; the first argument is the type of elements we want to access: key, will iterate over keys; val, will iterate over values; pair, will iterate over key/value pairs.

 proc arrayiterator { type arrayvar itervar } {
    upvar       $arrayvar array $itervar iterator

    set iterator(ARRAY) $arrayvar
    set iterator(SEARCHID) [array startsearch $iterator(ARRAY)]

    trace add variable [set itervar](MORE) { read unset } \
            [namespace code [list arrayiterator_tracer $type $itervar]]
 }

 proc arrayiterator_tracer { type itervar name1 name2 op } {
    upvar       $itervar iterator


    if { [string equal $op read] } {
        if { [array anymore $iterator(ARRAY) $iterator(SEARCHID)] } {
            set iterator(MORE)        1
            set key [array nextelement $iterator(ARRAY) $iterator(SEARCHID)]
            set val [set [set iterator(ARRAY)]($key)]

            switch -exact -- $type {
                key        {
                    set iterator(VALUE) $key
                }
                val        {
                    set iterator(VALUE) $val
                }
                pair        {
                    set iterator(VALUE) [list $key $val]
                }
            }
        } else {
            iterator_array_tracer $type $itername {} {} unset
            set iterator(MORE) 0
        }
    } else {
        trace remove variable [set itername](MORE) { read unset } \
                [namespace code [list iterator_array_tracer $type $itername]]
        array donesearch $iterator(ARRAY) $iterator(SEARCHID)
    }
 }

We can test this iterator with the following code.

 proc print_pairs { itervar } {
    upvar       $itervar iterator

    set elms {}
    while { $iterator(MORE) } {
        lappend elms $iterator(VALUE)
    }
    puts [format "pairs: %s" $elms]
 }

 array set ::arry { a 0 b 1 c 2 d 3 e 4 f 5 g 6 h 7 i 8 l 9 }
 arrayiterator pair ::arry ::ai
 print_pairs ::ai
 unset ::ai
 unset ::arry

Now let's look at something more interesting: set operations. Let's say we are using lists to represent sets of elements.

We keep the lists sorted to make it easy to find an element. We cannot use the full comparison power of [lsort] because it's not matched by [string compare], and we need to compare two elements. Obviously we could [lsort] a list of two elements and then recognise that the first element is the lesser (or greater), but we don't care about this here.

The intersection operation can be realised with an iterator that returns the next element in the intersection set; this iterator accepts as arguments two iterators visiting the two input lists.

Here we reuse the [literator] and [literator_tracer] procedures.

 proc iterator_intersection { iter1name iter2name itername } {
    upvar        $iter1name iter1 $iter2name iter2 $itername iterator

    array set iterator \
            [list MORE 1 ITER1 $iter1name ITER2 $iter2name VALUE {}]
    trace add variable [set itername](MORE) { read unset } \
            [namespace code [list iterator_intersection_tracer $itername]]
 }
 proc iterator_intersection_tracer { itername name1 name2 op } {
    upvar        $itername iterator


    if { [string equal $op read] } {
        upvar        $iterator(ITER1) iter1 $iterator(ITER2) iter2

        set one $iter1(MORE)
        set two $iter2(MORE)

        while { $one && $two } {
            set e [string compare $iter1(VALUE) $iter2(VALUE)]
            if { $e == 0 } {
                set iterator(VALUE) $iter1(VALUE)
                return
            } elseif { $e < 0 } {
                set one $iter1(MORE)
            } else {
                set two $iter2(MORE)
            }
        }

        set iterator(MORE) 0
    } else {
        trace remove variable [set itername](MORE) { read unset } \
                [namespace code [list iterator_intersection_tracer $itername]]
    }
 }

We can test the intersection iterator with the following code.

 literator { 0 1 2 3 4 5 6 7 8 9 } ::iter1
 literator { 5 6 7 8 9 10 11 } ::iter2
 iterator_intersection ::iter1 ::iter2 ::li
 print_elements ::li
 unset ::iter1 ::iter2 ::li

 literator { 1 3 5 7 9 } ::iter1
 literator { 0 2 4 6 8 } ::iter2
 iterator_intersection ::iter1 ::iter2 ::li
 print_elements ::li
 unset ::iter1 ::iter2 ::li

With the same mechanism we can implement the union operation. We see that the requirement to only read the MORE element once can make the code ugly.

 proc iterator_union { iter1name iter2name itername } {
    upvar        $iter1name iter1 $iter2name iter2 $itername iterator

    array set iterator \
            [list MORE 1 ITER1 $iter1name ITER2 $iter2name VALUE {} STATE {}]
    trace add variable [set itername](MORE) { read unset } \
            [namespace code [list iterator_union_tracer $itername]]
 }
 proc iterator_union_tracer { itername name1 name2 op } {
    upvar        $itername iterator


    if { [string equal $op read] } {
        upvar        $iterator(ITER1) iter1 $iterator(ITER2) iter2

        switch $iterator(STATE) {
            1                {
                set one $iter1(MORE)
                set two 1
            }
            2                {
                set one 1
                set two $iter2(MORE)
            }   
            3                {
                set one $iter1(MORE)
                set two 0
            }
            4                {
                set one 0
                set two $iter2(MORE)
            }   
            default        {
                   set one $iter1(MORE)
                set two $iter2(MORE)
            }
        }

        if { $one && $two } {
            set e [string compare $iter1(VALUE) $iter2(VALUE)]
            if { $e < 0 } {
                set iterator(STATE)  1
                set iterator(VALUE) $iter1(VALUE)
            } elseif { $e == 0 } {
                set iterator(STATE)  1
                set iterator(VALUE) $iter1(VALUE)
            } else {
                set iterator(STATE)  2
                set iterator(VALUE) $iter2(VALUE)
            }
        } elseif { $one } {
            set iterator(STATE) 3
            set iterator(VALUE) $iter1(VALUE)
        } elseif { $two } {
            set iterator(STATE) 4
            set iterator(VALUE) $iter2(VALUE)
        } else {
            set iterator(MORE) 0
        }
    } else {
        trace remove variable [set itername](MORE) { read unset } \
                [namespace code [list lunion_tracer $itername]]
    }
 }

Once you get the idea in your head, it's just a matter of testing the operations.

What about trees? The common iterations (inorder, preorder, etc.) can be seen as sequences of elements coming from somewhere, so it's easy to implement a sequence iterator. There can be the need keep track of the "path" from the root node to the current node, but this is really not a problem: since we have a global array, we can store the list of node identifiers in an element.

The structure of these iterators is similar to the structure of the array iterator, so we don't deal with them here (there's no standard tree in TCL, if someone wants to add code to be used with TCLLIB: just place it here).

Another usage of interfaces implemented with variables, is the adapter concept: to connect two modules with an interface that hides the details.

We inspect this idea with the following problem: browse a tree structure with a "listbox" widget. We limit ourselves to display a single tree, not a forest or a graph: that means that our data structure will have a single root node, and each node in the tree is connected to it by a single "path".

The "listbox" widget is a list viewer, and knows nothing about trees. So we have to define an interface: we need the list of elements to be displayed in the listbox and a way to request the visualisation of an item. We use a global array (again) and declare two elements: CURRENT, the identifier of the selected node; CHILDREN, the list of children of the selected node.

Here we have some degrees of freedom in selecting a way to keep track of "were we are": we choose to store in the adapter the path from the root node to the current one. We define the PATH array element. The path is just the list of node identifiers from the root to the current one; an empty path means that the selected node is the root one.

We assume that the initial node is selected "somewhere" in the code and when we have to initialise the adapter, we have the node identifier stored in a variable. The tree structure must provide a way to build the path from the root to the initial node; then the adapter will take care of updating the PATH element.

The usage of the widget is obvious: at the beginning the list of children of the initial node is visible; when we click on an item, the view will change displaying the children of the item we clicked on; if we select the special item "..", the view will change displaying the node in the upper level.

We use the tree structure in An experimental tree structure. The tree adapter is specialised to operate with this tree implementation.

 proc make_tree_adapter { treevar adaptervar first } {
    upvar       $adaptervar adapter $treevar tree

    set s {
        CURRENT         {$first}
        CHILDREN        {[tree_get_children $treevar $first]}
        PATH            {[tree_path $treevar $first]}
        TREEVAR         {$treevar}
    }
    array set adapter [subst $s]

    if { ! [tree_isroot tree $first] } {
        set adapter(CHILDREN) [concat { .. } $adapter(CHILDREN)]
    }

    trace add variable [set adaptervar](CURRENT) write \
            [namespace code [list tree_adapter_tracer $adaptervar]]
 }

 proc tree_adapter_tracer { adaptervar args } {
    upvar       $adaptervar adapter
    upvar       $adapter(TREEVAR) tree

    if { [string equal $adapter(CURRENT) ..] } {
        switch [llength $adapter(PATH)] {
            0           { return }
            1           {
                set adapter(PATH) {}
                set adapter(CURRENT) {}
                set adapter(CHILDREN) [tree_get_root tree]
            }
            default     {
                set adapter(PATH) [lreplace $adapter(PATH) end end]
                set adapter(CURRENT) [lindex $adapter(PATH) end]
                set lst [tree_get_children tree $adapter(CURRENT)]
                set adapter(CHILDREN) [concat { .. } $lst]
            }
        }
    } else {
        set lst [tree_get_children tree $adapter(CURRENT)]
        if { [llength $lst] } {
            set adapter(CHILDREN) [concat { .. } $lst]
            lappend adapter(PATH) $adapter(CURRENT)
        }
    }
 }

Now the "listbox" widget code.

 proc link_listbox_to_adapter { listbox adaptervar } {
    upvar       $adaptervar adapter

    $listbox configure -listvariable [set adaptervar](CHILDREN)
    set s {
        set [set adaptervar](CURRENT) \[%W get \[%W curselection\]\]
    }
    bind .l <ButtonRelease-1> [subst $s]
 }

Get all the code and save it in a script, add the following chunk and run it.

 array set ::tree {}
 fill_tree ::tree

 grid [button .quit -text Quit -command { destroy . }]
 grid [listbox .l -height 10 -background \#ffffff]

 make_tree_adapter ::tree ::adapter 2
 link_listbox_to_adapter .l ::adapter

 tkwait window . 
 exit 0

As an example of different tree implementation, we can use the directory structure of our hard disk. The following code implements an adapter for the directory tree structure.

 proc make_dir_adapter { dir adaptervar } {
    upvar       $adaptervar adapter

    set dir [file normalize $dir]
    set adapter(PATH) [file split $dir]
    set adapter(CURRENT) [file tail $dir]
    set adapter(CHILDREN) \
            [concat { .. } [dir_adapter_elements $adapter(PATH)]]

    trace add variable [set adaptervar](CURRENT) write \
            [namespace code [list dir_adapter_tracer $adaptervar]]
 }

 proc dir_adapter_tracer { adaptervar args } {
    upvar       $adaptervar adapter

    if { [string equal $adapter(CURRENT) ..] } {
        switch [llength $adapter(PATH)] {
            0           { return }
            1           {
                set adapter(PATH) {}
                set adapter(CURRENT) {}
                set adapter(CHILDREN) [file separator]
            }
            default     {
                set adapter(PATH) [lreplace $adapter(PATH) end end]
                set adapter(CURRENT) [lindex $adapter(PATH) end]
                set lst [dir_adapter_elements $adapter(PATH)]
                set adapter(CHILDREN) [concat { .. } $lst]
            }
        }
    } else {
        set lst [concat $adapter(PATH) [list $adapter(CURRENT)]]
        set lst [dir_adapter_elements $lst]

        if { [llength $lst] } {
            set adapter(CHILDREN) [concat { .. } $lst]
            lappend adapter(PATH) $adapter(CURRENT)
        }
    }
 }

 proc dir_adapter_elements { path } {
    set directory [eval {file join} $path]
    set dirs {}
    set files {}
    foreach item [glob -nocomplain -directory $directory -type d -- *] {
        lappend dirs [format "%s%s" [file tail $item] [file separator]]
    }
    foreach item [glob -nocomplain -directory $directory -type f -- *] {
        lappend files [file tail $item]
    }
    return [concat $dirs $files]
 }

We can use it with the same code for the "listbox" widget used before. Make a new script: put in it the "listbox" stuff, add the directory adapter and the following chunk for initialisation.

 grid [button .quit -text Quit -command { destroy . }]
 grid [listbox .l -height 10 -background \#ffffff]

 make_dir_adapter [pwd] ::adapter
 link_listbox_to_adapter .l ::adapter

 tkwait window . 
 exit 0