dictutils

NEM 2007-02-12: Here's a collection of some basic utility procedures that I find useful when working with dictionaries. Feel free to add some more.

NEM 2009-03-18: Added a witharray command. Requires Tcl 8.6 (try).

aricb 2009-03-18: Added an nlappend command similar to [dict lappend] but capable of modifying values in nested dicts. Feel free to rename/modify/etc.

CMcC 2010-07-04: Wub is a heavy user of dict, and carries around an evolving distillation of the ideas in this page. You can fetch the latest version

at any time. I will be incorporating anything that looks useful into it, and suggestions and contributions are welcome. I hope, eventually, this can become a standard part of Tcl, because it's just so fantastically useful. In the meantime it's a good way to try out possible extensions to dict.

The current line-up is:

: dictutils witharray dictVar arrayVar script

A version of dict with that uses a temporary array, rather than creating individual variables. Implements transaction semantics: updates to the array are only copied back to the dictionary if the script completes without error.

: dictutils equal equalp d1 d2

Compares two dictionaries for equality. Two dictionaries are considered equal if they have exactly the same keys and if for each key the corresponding elements of each dictionary are equal according to the passed in equality predicate, equalp. The equality predicate is expected to be a command prefix and will be called with the key and the corresponding values from each dictionary. A possible enhancement would be to allow more than two dictionaries to be compared, or to restrict the keys that matter to some set (i.e., are these two dictionaries equal for the following set of keys?).

KBK notes, perhaps redundantly, that {string equal} and {tcl::mathops::==} are both reasonable choices for equalp.

NEM Actually, not quite, as equalp takes the key as first argument, under the assumption that dictionaries are often used (by me at least) as records/structs and so you might want a different notion of equality for different keys. A simple wrapper proc can make them suitable:

 proc ignore1st {cmd arg args} { uplevel 1 $cmd $args }
 dictutils equal {ignore1st {string equal}} $d1 $d2

: dictutils apply dictVar lambdaExpr ?arg1 arg2 ...?

This function is like a hybrid of dict with and apply. It creates a new procedure scope and populates it with the variable/value mappings present in the dictionary variable. It then invokes the lambda expression (anonymous procedure) in this new scope with the provided arguments. If the procedure completes normally (i.e., without throwing an exception) then any updates to the dictionary are reflected back into the dictVar, otherwise they are ignored and the exception is propagated. This allows for atomic updates to a dictionary in a simple transaction style. A future enhancement might be to allow a series of keys to be specified to apply an update to a nested dictionary.

: dictutils capture ?level? ?exclude? ?include?

This function captures a snapshot of the variable bindings visible at level into a dictionary value and returns it. The level can be any of the forms acceptable to uplevel, and defaults to 1. The exclude argument contains a list of variable names to ignore when performing the capture (defaults to empty list), and the include argument contains a list of glob patterns of variables that should be captured (defaults to a list containing * -- i.e., match everything). Together with dictutils apply this can be used to model simple mutable closures, where a scope can be (partially) captured and later restored and updated. For instance, imagine we have a custom control construct for looping over the lines in a file:

 proc foreachLine {varName file body} {
     upvar 1 $varName line
     set chan [open $file]
     while {[gets $chan line] >= 0} { uplevel 1 $body }
     close $chan
 }
 set count 0
 foreachLine l myfile.tcl { puts [format "%-4d | %s" [incr count] $l] }

This displays a nicely formatted listing with line numbers. Now, let's say that for some reason this processing takes a long time and we want to do it in the background using the event loop. It would be nice to be able to write essentially the same bit of code and let the foreachLine procedure take care of the details. With our simple closures we can do exactly this:

 proc foreachLine {varName file body} {
     set chan [open $file]
     set env  [dictutils capture 1 $varName]
     set func [list $varName $body ::] ;# create a lambda expression
     chan event $chan readable [list foreachLineCb $chan $env $func]
 }
 proc foreachLineCb {chan env func} {
     if {[gets $chan line] < 0} { close $chan; return }
     dictutils apply env $func $line
     # rewrite callback with updated environment
     chan event $chan readable [list foreachLineCb $chan $env $func]
 }

We can now write exactly the same code that we had before, but it will operate in the background using the event loop:

 set count 0
 foreachLine l myfile.tcl { puts [format "%-4d | %s" [incr count] $l] }

(Use vwait to enter the event loop if needed).

: dictutils nlappend dictVar keyList ?value ...?

This function is similar to [dict lappend] but allows modification of list values in a nested dictionary. $keyList contains the path of keys to the list to be modified. If the path specified in $keyList does not exist in the given dictionary, it will be created and treated as if it contained an empty list.

 # dictutils.tcl --
 #
 #       Various dictionary utilities.
 #
 # Copyright (c) 2007 Neil Madden ([email protected]).
 #
 # License: http://www.cs.nott.ac.uk/~nem/license.terms (Tcl-style).
 #
 
 package require Tcl         8.6
 package provide dictutils   0.2
 
 namespace eval dictutils {
     namespace export equal apply capture witharray nlappend
     namespace ensemble create

     # dictutils witharray dictVar arrayVar script --
     #
     #       Unpacks the elements of the dictionary in dictVar into the array
     #       variable arrayVar and then evaluates the script. If the script
     #       completes with an ok, return or continue status, then the result is copied
     #       back into the dictionary variable, otherwise it is discarded. A
     #       [break] can be used to explicitly abort the transaction.
     #
     proc witharray {dictVar arrayVar script} {
         upvar 1 $dictVar dict $arrayVar array
         array set array $dict
         try { uplevel 1 $script
         } on break    {}     { # Discard the result
         } on continue result - on ok result {
             set dict [array get array] ;# commit changes
             return $result
         } on return   {result opts} {
             set dict [array get array] ;# commit changes
             dict incr opts -level ;# remove this proc from level
             return -options $opts $result
         }
         # All other cases will discard the changes and propagage
     }
 
     # dictutils equal equalp d1 d2 --
     #
     #       Compare two dictionaries for equality. Two dictionaries are equal
     #       if they (a) have the same keys, (b) the corresponding values for
     #       each key in the two dictionaries are equal when compared using the
     #       equality predicate, equalp (passed as an argument). The equality
     #       predicate is invoked with the key and the two values from each
     #       dictionary as arguments.
     #
     proc equal {equalp d1 d2} {
         if {[dict size $d1] != [dict size $d2]} { return 0 }
         dict for {k v} $d1 {
             if {![dict exists $d2 $k]} { return 0 }
             if {![invoke $equalp $k $v [dict get $d2 $k]]} { return 0 }
         }
         return 1
     }
 
     # apply dictVar lambdaExpr ?arg1 arg2 ...? --
     #
     #       A combination of *dict with* and *apply*, this procedure creates a
     #       new procedure scope populated with the values in the dictionary
     #       variable. It then applies the lambdaTerm (anonymous procedure) in
     #       this new scope. If the procedure completes normally, then any
     #       changes made to variables in the dictionary are reflected back to
     #       the dictionary variable, otherwise they are ignored. This provides
     #       a transaction-style semantics whereby atomic updates to a
     #       dictionary can be performed. This procedure can also be useful for
     #       implementing a variety of control constructs, such as mutable
     #       closures.
     #
     proc apply {dictVar lambdaExpr args} {
         upvar 1 $dictVar dict
         set env $dict ;# copy
         lassign $lambdaExpr params body ns
         if {$ns eq ""} { set ns "::" }
         set body [format {
             upvar 1 env __env__
             dict with __env__ %s
         } [list $body]]
         set lambdaExpr [list $params $body $ns]
         set rc [catch { ::apply $lambdaExpr {*}$args } ret opts]
         if {$rc == 0} {
             # Copy back any updates
             set dict $env
         }
         return -options $opts $ret
     }
 
     # capture ?level? ?exclude? ?include? --
     #
     #       Captures a snapshot of the current (scalar) variable bindings at
     #       $level on the stack into a dictionary environment. This dictionary
     #       can later be used with *dictutils apply* to partially restore the
     #       scope, creating a first approximation of closures. The *level*
     #       argument should be of the forms accepted by *uplevel* and
     #       designates which level to capture. It defaults to 1 as in uplevel.
     #       The *exclude* argument specifies an optional list of literal
     #       variable names to avoid when performing the capture. No variables
     #       matching any item in this list will be captured. The *include*
     #       argument can be used to specify a list of glob patterns of
     #       variables to capture. Only variables matching one of these
     #       patterns are captured. The default is a single pattern "*", for
     #       capturing all visible variables (as determined by *info vars*).
     #
     proc capture {{level 1} {exclude {}} {include {*}}} {
         if {[string is integer $level]} { incr level }
         set env [dict create]
         foreach pattern $include {
             foreach name [uplevel $level [list info vars $pattern]] {
                 if {[lsearch -exact -index 0 $exclude $name] >= 0} { continue }
                 upvar $level $name value
                 catch { dict set env $name $value } ;# no arrays
             }
         }
         return $env
     }
 
     # nlappend dictVar keyList ?value ...?
     #
     #       Append zero or more elements to the list value stored in the given
     #       dictionary at the path of keys specified in $keyList.  If $keyList
     #       specifies a non-existent path of keys, nlappend will behave as if
     #       the path mapped to an empty list.
     #
     proc nlappend {dictvar keylist args} {
         upvar 1 $dictvar dict
         if {[info exists dict] && [dict exists $dict {*}$keylist]} {
             set list [dict get $dict {*}$keylist]
         }
         lappend list {*}$args
         dict set dict {*}$keylist $list
     }

     # invoke cmd args... --
     #
     #       Helper procedure to invoke a callback command with arguments at
     #       the global scope. The helper ensures that proper quotation is
     #       used. The command is expected to be a list, e.g. {string equal}.
     #
     proc invoke {cmd args} { uplevel #0 $cmd $args }

 }

Courtesy [patthoyts] (with some mods by CMcC): here's a conditional dict get, called dict get?

Here's the command. See where it installs itself?

    proc ::tcl::dict::get? {dict args} {
        if {[dict exists $dict {*}$args]} {
            return [dict get $dict {*}$args]
        } else {
            return {}
        }
    }

And here's where we extend the dict ensemble to make get? look like a first class dict subcommand.

namespace ensemble configure dict -map \
    [linsert [namespace ensemble configure dict -map] end get? ::tcl::dict::get?]

dict switch edit

CMcC - 2010-06-24 03:43:40

    # dict switch dict args... --
    #
    # Apply matching functions from the second dict (or $args)
    # replacing existing values with the function application's return
    #
    # dict switch $record {
    #        name {string tolower $name}
    #        dob {...}
    # }

    proc switch {d args} {
        upvar 1 $d dict
        if {[llength $args] == 1} {
            set args [lindex $args 0]
        }
        dict for {n v} $dict {
            if {[dict exists $args $n]} {
                dict set dict $n [uplevel 1 [list ::apply [list $n [dict get $args $n]] $v]]
            }
        }
        return $dict
    }

    # side effect free variant
    proc transmute {dict args} {
        if {[llength $args] == 1} {
            set args [lindex $args 0]
        }
        dict for {n v} $dict {
            if {[dict exists $args $n]} {
                dict set dict $n [uplevel 1 [list ::apply [list $n [dict get $args $n]] $v]]
            }
        }
        return $dict
    }

Dotpath edit

CMcC - 2010-07-07 00:11:06

Another dict extension... this one makes [dict a.b.c] a synonym for [dict get $a b c] and [dict a.b.c x] a synonym for [dict set a b c x]

namespace ensemble configure dict -unknown {::apply {{dict cmd args} {
    if {[string first . $cmd] > -1} {
        ::set cmd [::split $cmd .]
        if {[llength $args]} {
            return [::list dict set {*}$cmd]
        } else {
            ::set var [::lindex $cmd 0]
            ::upvar 1 $var v
            return [::list dict get $v {*}[lrange $cmd 1 end]]
        }
    }
} ::tcl::dict}}

dicthash edit

slebetman: See dicthash: Yet another lightweight object system for yet another layer of sugaring for dicts.It makes [%a.b.c] a synonym for [dict get $a b c],[%a.b.c = x] a synonym for [dict set a b c x],[%a.b.c x $y] a synonym for [apply [dict get $a b c x] $y] and many more.

LV Is this package something that would be worthwhile to incorporate at least into tcllib, if not the core itself?

dictTreeUnset: remove empty parents on unset in nested dict edit

HaO 2018-07-02: Given a tree-like storage within a dict:

% set d {}
% dict set d a b c 1
a {b {c 1}}
% dict set d a2 b c 1
a {b {c 1}} a2 {b {c 1}}
% dict set d a2 b2 c 1
a {b {c 1}} a2 {b {c 1} b2 {c 1}}

Now unsetting an item may leave an empty parent list as artefact:

% dict unset d a b c
a {b {}} a2 {b {c 1} b2 {c 1}}
% dict unset d a2 b c
a {b {}} a2 {b {} b2 {c 1}}

IMHO this is often unwanted, as it shows levels without contents. The following proc deletes an item and its parent if they get empty by the deletion:

## dictTreeUnset dict ?key ?subkey ...??
## Unset a key in a nested dict. Also unset the parents, if they got empty by the item unset.
## @param dictName Name of the dict to do the operation
## @param key Toplevel key to unset
## @param subkey .. a set of subkeys within the dictionary
proc dictTreeUnset {dictName args} {
    upvar 1 $dictName myDict
    dict unset myDict {*}$args
    while {1 < [llength $args]} {
        set args [lrange $args 0 end-1]
        if {0 < [dict size [dict get $myDict {*}$args]]} {
            return $myDict
        }
        dict unset myDict {*}$args
    }
    return $myDict
}

with the following result using the upper examples:

% set d {a {b {c 1}} a2 {b {c 1} b2 {c 1}}}
% dictTreeUnset d a b c
a2 {b {c 1} b2 {c 1}}
% dictTreeUnset d a2 b c
a2 {b2 {c 1}}
% dictTreeUnset d a3 b c
key "a3" not known in dictionary
% dict unset d a3 b c
key "a3" not known in dictionary

IMHO, this command is a condidate for tcllib or even the core. What do you think?

Limit recursive deletion to given nesting level

Here is an extended version with on optional -maxlevel parameter to limit deletion to a certail nesting level:

proc dictTreeUnset {args} {
    set maxLevel 0
    while 1 {
        switch -exact -- [lindex $args 0] {
            -maxlevel {
                set maxLevel [lindex $args 1]
                if {![string is entier $maxLevel] || $maxLevel < 0} {
                    return -code "-maxlevel not numeric or below 0"
                }
            }
            -- { set args [lrange $args 1 end];break }
            default {break}
        }
        set args [lrange $args 2 end]
    }
    set args [lassign $args dictName]
    upvar 1 $dictName myDict
    dict unset myDict {*}$args
    while {$maxLevel + 1 < [llength $args]} {
        set args [lrange $args 0 end-1]
        if {0 < [dict size [dict get $myDict {*}$args]]} {
            return $myDict
        }
        dict unset myDict {*}$args
    }
    return $myDict
}

Example:

% set d {a {b {c 1}} a2 {b {c 1} b2 {c 1}}}
% dictTreeUnset -maxlevel 1 -- d a b c
a {} a2 {b {c 1} b2 {c 1}}

Category Package

dict