Updated 2011-12-19 03:45:13 by RLE

From Wikipedia[1], the free encyclopedia.

In computer science, the term integer is used to refer to any data type that can represent some subset of the mathematical integers. These are also known as integral data types.

In the context of Tcl, where everything is a string, dealing with integers can be an odd business indeed.

I have recently been working on a system that must interface between Tcl scripts and programs implemented in other languages across a network.

Some of the values being communicated must satisfy the data type restrictions of the languages at the other end.

This means that values that are integers inside of Tcl need to be checked against those restrictions to ensure that they are acceptable. (This is part of checking user inputs against validation criteria.)

Some of these restrictions were interesting to enforce, and I'm wondering if there are any tricks to this that I missed.

The external data types of interest:

  1. unsigned byte
  2. unsigned short (16 bits)
  3. signed word (32 bits)
  4. unsigned word (32 bits)
  5. unsigned long (64 bits)

So, how then to check for particular data types?

One of the interesting mechanisms is [string is integer]. This turned out to be a useful part of the checking, but not sufficient.

  1. unsigned byte: string is integer and a check for a range between 0 and 255.
  2. unsigned short: string is integer and a check for a range between 0 and 65535.
  3. signed word: string is integer and a check for a value under 2147483648.
  4. unsigned word: used a regexp to ensure an unsigned string of digits and then ::math::bignum to enforce a 32-bit limit.
  5. unsigned long: used a regexp to ensure an unsigned string of digits and then ::math::bignum to enforce a 64-bit limit.
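
For comparison, here is a minimal sketch of the same five checks done with a single regexp plus ordinary expr comparisons. It assumes Tcl 8.5 or later, where expr handles integers of arbitrary size, so ::math::bignum is not needed. The names typeRange and fitsType, and the type labels, are hypothetical and chosen only for illustration; this is not the code the checks above were originally written with.

```tcl
# Hypothetical table: type label -> {min max}, as decimal strings.
array set ::typeRange {
    uint8  {0 255}
    uint16 {0 65535}
    int32  {-2147483648 2147483647}
    uint32 {0 4294967295}
    uint64 {0 18446744073709551615}
}

# Return 1 if value is a canonical decimal integer within the range
# of the named type, 0 otherwise. Assumes Tcl 8.5+ (bignum-aware expr).
proc fitsType {type value} {
    # Accept only plain decimal digits with an optional leading minus:
    # no whitespace, no hex, no leading zeros (Tcl 8.x reads a leading
    # zero as octal), no leading plus sign.
    if {![regexp {^(0|-?[1-9][0-9]*)$} $value]} {
        return 0
    }
    lassign $::typeRange($type) min max
    return [expr {$value >= $min && $value <= $max}]
}
```

For example, fitsType uint8 255 returns 1, while fitsType uint8 256 and fitsType uint64 18446744073709551616 both return 0. Rejecting leading zeros is a deliberate choice here, so that a string such as 010 is never silently treated as octal.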

It certainly doesn't make sense to clutter up string with subcommands such as string is uint8 or string is uint32, but it was surprising that the range of values accepted by string is integer was about one and a half times the range of a signed 32-bit integer. That's because string is integer accepts signed values and also the upper half of the unsigned 32-bit range.
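
The "one and a half times" figure can be checked with a little arithmetic. The sketch below assumes the accepted range was -2**31 through 2**32 - 1, which is what the description above implies; the exact range accepted by string is integer depends on the Tcl version, so treat the bounds as an assumption. It needs Tcl 8.5 or later for the ** operator.

```tcl
# Assumed accepted range, per the description above: -2**31 .. 2**32 - 1.
set acceptedMin   [expr {-(2**31)}]
set acceptedMax   [expr {2**32 - 1}]
set acceptedCount [expr {$acceptedMax - $acceptedMin + 1}]

# Number of distinct values in a signed 32-bit integer.
set int32Count    [expr {2**32}]

# Ratio of the two counts.
set ratio [expr {double($acceptedCount) / $int32Count}]
puts $ratio
```

Under that assumption the accepted set holds 2**32 + 2**31 values, so the ratio comes out to 1.5.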

escargo 16 Aug 2005

I also note that ::math::bignum (in tcllib) makes it harder than I would like to compare bignum values with 0 and 1. A couple of built-in constants would make comparing arbitrary values to 0 and 1 much easier.

Readers of this page will want to follow TIPs 237 [2] and 249 [3].

NEM: Probably the binary command will form part or all of a solution to checking/formatting integers to particular binary representations.
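
One way the binary suggestion might be sketched is a round trip: format the value into a fixed-width field, scan it back as unsigned, and see whether it survived. This assumes Tcl 8.5 or later for the u (unsigned) flag of binary scan. The proc name roundTrips is hypothetical, and the sketch is only exercised here for the 8-bit (c) and 16-bit (s) field types.

```tcl
# Return 1 if value survives a binary format / binary scan round trip
# through the given field type, read back as unsigned; 0 otherwise.
# fmt is a binary field type such as c (8 bits) or s (16 bits).
proc roundTrips {fmt value} {
    # binary format raises an error for non-integers (and, depending on
    # the Tcl version and platform, for very large values); treat any
    # error as "does not fit".
    if {[catch {binary format $fmt $value} bytes]} {
        return 0
    }
    # The u flag (Tcl 8.5+) makes binary scan return an unsigned value.
    binary scan $bytes ${fmt}u out
    return [expr {$out == $value}]
}
```

With this, roundTrips c 255 returns 1, while roundTrips c 256 and roundTrips c -1 return 0 because the truncated field no longer equals the input. Note that it accepts any integer syntax Tcl itself accepts, such as hex, so it would need a regexp in front of it if only decimal digits are allowed.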

escargo: Ironically, I don't care about the binary representations on the system that is reading the input. I just have to verify that the values conform to the specified types before sending them to a remote server. I never need the binary representations. I almost need some kind of introspection that says how many bits it would take to represent a particular value (if the value has been stored as a long, wide, or bignum internally).
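
The "how many bits would it take" question can be approximated arithmetically rather than by introspection. The following is a minimal sketch, assuming Tcl 8.5 or later so that the shift operator works on integers of any size. The proc name bitsNeeded is hypothetical, and it works from the value itself; it says nothing about how Tcl happens to store the value internally.

```tcl
# Return the number of bits needed to represent a non-negative
# decimal integer (0 needs 0 bits by this definition).
proc bitsNeeded {n} {
    # Only plain decimal digits without leading zeros are accepted.
    if {![regexp {^(0|[1-9][0-9]*)$} $n]} {
        error "expected a non-negative decimal integer but got \"$n\""
    }
    set bits 0
    # Shift right one bit at a time until nothing is left.
    while {$n > 0} {
        set n [expr {$n >> 1}]
        incr bits
    }
    return $bits
}
```

A value then fits an unsigned N-bit type when bitsNeeded returns N or less: bitsNeeded 255 returns 8, bitsNeeded 256 returns 9, and bitsNeeded 18446744073709551616 returns 65, one bit too many for an unsigned long.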