Updated 2014-04-13 23:54:04 by RLE

if 0 {Richard Suchenwirth 2004-03-08 - In the "software museum", today I play with the Semantic Information Retriever SIR (Raphael, 1964), a software that takes natural-language sentences (a very restricted subset of English) to build a relational knowledge base, or answer questions about it. The following Tcl implementation is even weaker than the original from 40 years ago, but then again it is only a little weekend fun project... and re-plays the sample dialog with SIR as seen in the literature:

  • populate the knowledge base with some facts
  • ask questions
  • SIR will ask back if facts are missing in its chain of reasoning}
 proc Sir, args {
    set sentence [join $args]
    if [regexp {^what is( an?)? (\w+)} $sentence -> . item] {
       return [isa $item ?]
    }
    if [regexp {^is( an?)? (\w+) a (\w+)} $sentence -> . item cat] {
       return [isa? $item $cat]
    }
    if [regexp {^(\w+) is a (\w+)} $sentence -> item cat] {
       return [isa $item $cat]
    }
    if [regexp {^every (\w+) is a (\w+)} $sentence -> item cat] {
       return [isa $item $cat]
    }
    if [regexp {^(any|every) (\w+) has (\d+) (\w+)s} $sentence -> . cat n item] {
       return [has $item $n $cat]
    }
    if [regexp {^an? (\w+) is part of an? (\w+)} $sentence -> part cat] {
       return [has $part * $cat]
    }
    if [regexp {^how many (\w+)s are on (\w+)} $sentence -> part cat] {
       return [has $part ? $cat]
    }
    error "don't understand '$sentence'"
 }
#-- Routines for adding to, or querying, the knowledge base (::K)
 proc isa {item cat} {
    if {$cat eq "?"} {
       whats $item
    } else {
       ladd ::K($item,isa) $cat
       ladd ::K($cat,eg) $item
    }
 }
 proc isa? {item cat} {
    if [info exists ::K($item,isa)] {
        foreach subcat $::K($item,isa) {
            if {$subcat eq $cat} {return yes}
            if {[isa? $subcat $cat] eq "yes"} {return yes}
        }
    }
    return no
 }
 proc has {item n cat} {
    if {$n eq "?"} {
       howmany $item $cat
    } else {
       set ::K($cat,has,$item) $n
       ladd ::K($item,ispartof) $cat
    }
 }
 proc howmany {item cat} {
    if [info exists ::K($cat,has,$item)] {
       set n $::K($cat,has,$item)
       if {$n eq "*"} {
          ask-n $item $cat
          return [howmany $item $cat]
       } else {return $n}
    } else {
       if [info  exists ::K($cat,isa)] {
          foreach subcat $::K($cat,isa) {
             set n [howmany $item $subcat]
             if [numeric $n] {return $n}
          }
       }
       foreach fact [array names ::K $cat,has,*] {
          regexp $cat,has,(.+) $fact -> part
          set n $::K($fact)
          set n2 [howmany $item $part]
          if [numeric $n2] {return [expr $n*$n2]}
       }
    }
    return "can't tell"
 }
 proc ask-n {item cat} {
    puts "How many ${item}s per $cat?"
    eval Sir, [gets stdin]
 }
 proc whats what {
    if [info exists ::K($what,isa)] {
       set cats $::K($what,isa)
       foreach i $cats {
          if [info exists ::K($i,isa)] {append cats " " $::K($i,isa)}
       }
       return "$what is a [join $cats {, a }]"
    } elseif [info exists ::K($what,eg)] {
       return "A $what is a [join $::K($what,eg) {, or a }]"
    } else {return "don't know"}
 }
#---- General utilities:
 proc ladd {listvar element} {
    upvar 1 $listvar list
    if ![info exists list] {set list {}}
    if {[lsearch $list $element]<0} {lappend list $element}
 }
 proc numeric x {string is integer -strict $x}
#---- Testing (note the respectful way in which we talk to this silly software)
 Sir, John is a boy
 Sir, every boy is a person
 Sir, any person has 2 hands
 Sir, a finger is part of a hand
 puts [Sir, what is John?]
 puts [Sir, what is a boy?]

if 0 {If we source this file in a tclsh (so that gets works), we see
 % source sir.tcl
 John is a boy, a person
 boy is a person
 % Sir, how many fingers are on John
 How many fingers per hand?
 every hand has 5 fingers
 10

which comes close to the 40-years old original as reported in http://staff.science.uva.nl/~mdr/Teaching/LTP/literature/monz_chap2.ps - the chain of reasoning went John - boy - person, then through person's "parts", filling the unspecified number of fingers per hand, and finally computing 2 (hands) * 5 (fingers) for John. More tests:
 % Sir, Mary is a girl
 % Sir, every girl is a person
 % Sir, what is Mary?
 Mary is a girl, a person
 % Sir, what is a person?
 A person is a boy, or a girl
 % Sir, is John a person?
 yes
 % Sir, is John a girl?
 no
 % Sir, is a boy a person?
 yes

Playing with this "Sir", it's often helpful to inspect the knowledge base with
 parray K

References

RAPHAEL, Bertram: SIR: A computer program for semantic information retrieval. MAC-TR2 Project MAC MIT June 1964

}