My temporary conclusions and where they come from.

A. OBJECTIVE CONCLUSIONS

on Tcl 8.3, 32-bit machines only
each tcl object =
any new string, number, date, value --> 24 bytes + content size
any pre-existing string, data, value --> 4 bytes (pointer only)
content size =
depends on encoding and data type :
one or two bytes per char for string values
may be 0 for a number (integer/double), if it is never used as a
string (it is then stored inside the core tcl object)

Jeffrey Hobbs comments: UTF-8 can go up to 3 bytes per char for the 2-byte unicode that Tcl uses internally. Also, content size can be greater for UnicodeString objects, List objects, ... that all malloc some extra space for their internal reps.
each variable =
48 bytes + "content size" of the name +
"tcl object size" of the content
each hash key entry =
48 bytes + "content size" of the key +
"tcl object size" of the value
each list =
32 bytes + size of each list entry
each list entry =
4 bytes + "tcl object size" of the content

B. SUBJECTIVE CONCLUSIONS

- When using Tcl, don't emulate pointer mechanisms. Copy the complete data when needed. Tcl shares a copied value by reference, so the copy costs only a pointer.
- Each different "thing" in a tcl program will cost 24 bytes
- Variables and hash-tables are costly:
52 bytes overhead for each variable,
52 bytes overhead for each hash table key
- Lists are not costly: 4 bytes overhead for each element. (Yes, far more if each element is itself a list...)
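
The formulas of section A can be transcribed into a few procs to get rough estimates. This is only a sketch: the proc names (objBytes, entryBytes, listBytes) are made up for illustration, and the constants are the Tcl 8.3 / 32-bit figures given above, not values read from the interpreter.

```tcl
# Bytes for a value: 4 if it already exists (pointer only),
# otherwise 24 bytes of Tcl_Obj plus the content size.
proc objBytes {contentSize {preexisting 0}} {
    if {$preexisting} {
        return 4
    }
    return [expr {24 + $contentSize}]
}

# Bytes for a variable or an array (hash) entry:
# 48 + size of the name or key + size of the value object.
proc entryBytes {nameSize valueBytes} {
    return [expr {48 + $nameSize + $valueBytes}]
}

# Bytes for a list: 32 + (4 + value object size) per element.
proc listBytes {elementBytes} {
    set total 32
    foreach b $elementBytes {
        incr total [expr {4 + $b}]
    }
    return $total
}

# An array entry with a 4-byte key and a pre-existing value: 48 + 4 + 4
puts [entryBytes 4 [objBytes 0 1]]
# A list of two new 6-byte strings: 32 + 2 * (4 + 24 + 6)
puts [listBytes [list [objBytes 6] [objBytes 6]]]
```

With these figures the two calls print 56 and 100.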
> On a 32 bit machine where alignment is 4 byte boundary
> and the types have the following sizes,
>   long    4 bytes
>   int     4 bytes
>   char *  4 bytes
>   double  8 bytes
>   void *  4 bytes
> sizeof (Tcl_Obj) = 4 + 4 + 4 + 4 + MAX (4, 8, 4, 4 + 4)
>                  = 24 bytes

2. excerpt from [2]
>> [experiment shows that...] approximately 54 bytes for each key. [...]

Well, it takes a certain amount of space to store the hash entry (four words plus the size of the key; median about 20 bytes in your case on a 32-bit machine) and more to store the variable (each entry in an array is an independent variable that can support its own traces, etc.) which adds another 8 words or 32 bytes. This gives about 52 bytes per array member; pretty close to what you report...

D. MY EXPERIMENTS

tclsh 8.3.2 built with TCL_MEM_DEBUG on Windows Millennium - 32 bit machine
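
Every experiment in this section follows the same pattern: take the "current bytes allocated" figure from memory info, create 10000 items, take the figure again, and divide the difference by the item count. A minimal sketch of that arithmetic follows. The proc names currentBytes and bytesPerItem are hypothetical, and the snapshot text is passed in as a string, because how the memory info report is captured depends on the Tcl version and build.

```tcl
# Extract the "current bytes allocated" figure from a memory info report.
proc currentBytes {infoText} {
    if {![regexp {current bytes allocated\s+(\d+)} $infoText -> bytes]} {
        error "no 'current bytes allocated' line found"
    }
    return $bytes
}

# Average growth per item between two memory info snapshots.
proc bytesPerItem {before after count} {
    set delta [expr {[currentBytes $after] - [currentBytes $before]}]
    return [expr {$delta / double($count)}]
}

# Figures from experiment 1 (10000 array keys with empty values).
set before "current bytes allocated 152681"
set after  "current bytes allocated 698453"
puts [format %.1f [bytesPerItem $before $after 10000]]
```

For the figures of experiment 1 this prints 54.6.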
1. hashtable with empty values
% memory info
current bytes allocated 152681
...
% for {set i 0} {$i < 10000 } { incr i } {
set t($i) ""
}
% memory info
current bytes allocated 698453
...
698453 - 152681 = 545772
approx 54 bytes per key.
2. hashtable with constant value
% memory info
current bytes allocated 152550
...
% for {set i 0} {$i < 10000 } { incr i } {
set t($i) "abcd"
}
% memory info
current bytes allocated 698363
...
698363 - 152550 = 545813
approx 54 bytes per key.
3. hashtable with variable value
% memory info
current bytes allocated 152550
...
% for {set i 0} {$i < 10000 } { incr i } {
set t($i) "abcd_$i"
}
% memory info
current bytes allocated 1037220
...
1037220 - 152550 = 884670
approx 88 bytes per key.
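
Experiments 2 and 3 differ only in whether all keys share one value or each key gets its own. The extra cost per key can be checked against the model of section A. The sketch below assumes that each distinct value costs one 24-byte Tcl_Obj plus its string bytes plus one terminating NUL byte; the NUL byte is an assumption added here and is not stated in section A.

```tcl
# Measured growth for 10000 keys, taken from experiments 2 and 3.
set constantValue [expr {698363 - 152550}]
set distinctValue [expr {1037220 - 152550}]
set measured [expr {($distinctValue - $constantValue) / 10000.0}]

# Model: a new 24-byte Tcl_Obj plus the string bytes and a NUL (assumed).
set stringBytes 0
for {set i 0} {$i < 10000} {incr i} {
    incr stringBytes [expr {[string length "abcd_$i"] + 1}]
}
set predicted [expr {24 + $stringBytes / 10000.0}]

puts [format "measured %.2f, predicted %.2f bytes per key" $measured $predicted]
```

Under that assumption the measured and predicted extra costs both come out at about 33.9 bytes per key.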
4. empty global variables
% memory info
current bytes allocated 152550
...
% for { set i 1 } { $i <= 10000 } { incr i } {
set ::a[set i] ""
}
% memory info
current bytes allocated 729761
...
729761 - 152550 = 577211
approx 57 bytes per variable
5. global variables with the same value
% memory info
...
current bytes allocated 152550
% for { set i 1 } { $i <= 10000 } { incr i } {
set ::a[set i] "abcd"
}
% memory info
...
current bytes allocated 708202
708202 - 152550 = 555652
approx. 55 bytes per variable.
6. global variables with different values
% memory info
...
current bytes allocated 152550
% for { set i 1 } { $i <= 10000 } { incr i } {
set ::a[set i] "abcd_$i"
}
% memory info
...
current bytes allocated 1047070
1047070 - 152550 = 894520
approx 89 bytes per variable.
7. empty list entries
% memory info
...
current bytes allocated 152550
% for {set i 1 } { $i <= 10000 } { incr i } {
lappend l ""
}
% memory info
...
current bytes allocated 202179
202179 - 152550 = 49629
approx 5 bytes per list entry.
8. identical list entries
% memory info
...
current bytes allocated 152550
% for {set i 1 } { $i <= 10000 } { incr i } {
lappend ::l "abcd"
}
% memory info
...
current bytes allocated 202215
202215 - 152550 = 49665
approx 5 bytes per list entry.
9. different list entries
% memory info
...
current bytes allocated 152550
% for {set i 1 } { $i <= 10000 } { incr i } {
lappend ::l "abcd_$i"
}
% memory info
...
current bytes allocated 541083
541083 - 152550 = 388533
approx 39 bytes per list entry.

interp costs? interp alias costs?

DKF - note that dict (as proposed in TIP #111 [3]) will give hash access for memory costs much closer to that of a list than to that of an array.
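
For comparison, here is a small sketch of the same keyed data held in an array and in a dict. It assumes Tcl 8.5 or later, where the dict command is built in; it does not run on the Tcl 8.3 used for the experiments above, and it makes no memory measurement of its own.

```tcl
# Array: every element is a full variable of its own.
array set a {}
# Dict: a single value holding the key/value pairs.
set d [dict create]

for {set i 0} {$i < 5} {incr i} {
    set a($i) "abcd_$i"
    dict set d $i "abcd_$i"
}

puts $a(3)
puts [dict get $d 3]
puts [dict size $d]
```

Both lookups print abcd_3, and the dict reports 5 entries.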

