Updated 2013-12-11 05:28:08 by pooryorick

fileutil, a Tcllib module, provides utilities for working with files and directories.

See Also

NR-grep: A Fast and Flexible Pattern Matching Tool
Unixy minitools

Documentation

fileutil official reference
fileutil::magic::cfront official reference
fileutil::magic::cgen official reference
fileutil::magic::filetype official reference
fileutil::magic::mimetype official reference
fileutil::magic::rt official reference
fileutil::multi official reference
fileutil::multi::op official reference

Modules

fileutil::traverse

See also the modules in the Documentation section, which don't yet have wiki pages.

Commands

  • ::fileutil::fullnormalize path
  • ::fileutil::test path codes ? msgvar ? ? label ?
  • ::fileutil::cat ( ? options ? file)...
  • ::fileutil::writeFile ? options ? file data
  • ::fileutil::appendToFile ? options ? file data
  • ::fileutil::insertIntoFile ? options ? file at data
  • ::fileutil::removeFromFile ? options ? file at n
  • ::fileutil::replaceInFile ? options ? file at n data
  • ::fileutil::updateInPlace ? options ? file cmd
  • ::fileutil::fileType filename
  • ::fileutil::find ? basedir ? filtercmd ? ?
  • ::fileutil::findByPattern basedir ? -regexp|-glob ? ? -- ? patterns
  • ::fileutil::foreachLine var filename cmd
  • ::fileutil::grep pattern ? files ?
  • ::fileutil::install ? -m mode ? source destination
  • ::fileutil::stripN path n
  • ::fileutil::stripPwd path
  • ::fileutil::stripPath prefix path
  • ::fileutil::jail jail path
  • ::fileutil::touch ? -a ? ? -c ? ? -m ? ? -r ref_file ? ? -t time ? filename ? ... ?
  • ::fileutil::tempdir
  • ::fileutil::tempdir path
  • ::fileutil::tempdirReset
  • ::fileutil::tempfile ? prefix ?
  • ::fileutil::relative base dst
  • ::fileutil::relativeUrl base dst

What other file-related procs would be useful?

Other procs that would be useful to add include wc, tee, head, tail, and perhaps some awk-like functions à la TclX.
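As a rough idea of how small such a proc can be, here is a minimal sketch of head. The proc name and its default of 10 lines are assumptions made for illustration; nothing like this is claimed to exist in fileutil.

```tcl
# Hypothetical helper, not part of fileutil: return the first n lines
# of a file as a Tcl list (n defaults to 10, as with the Unix command).
proc head {filename {n 10}} {
    set chan [open $filename r]
    set lines {}
    # Stop as soon as n lines are collected or the file is exhausted.
    while {[llength $lines] < $n && [gets $chan line] >= 0} {
        lappend lines $line
    }
    close $chan
    return $lines
}
```

Reading with gets stops after the requested number of lines, so a large file is never loaded in full.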

LV: Does anyone have a Tcl version of the dircmp command [1]? I don't see it in the Cygwin package list, nor did a casual Google search turn it up.

VI 2003-11-28: Nice of you to ask. There's a list above; other than that: tail -f, split, join. I use tkcon as my main shell on a wimpy laptop. Fewer DLLs loaded is good.

LV: I think some procs emulating the functionality (not necessarily the flags, etc.) of Unix commands such as:

  • cut - extract one or more columns of text from the input file
  • join - create the union of one or more files containing columns of data, using a common column as an index
  • sort - sort a file based on the contents of one or more columns
  • comm - extract rows of data common, or uncommon, between 2 or more files
  • uniq - extract unique rows (or count the occurrences of unique rows) in a file

would be useful. Several of these commands have, at their core, the idea of files being a series of columns, separated by some character or position, and allow a person to select one or more specific columns upon which to perform functions. They represent, in a sense, shortcuts for various awk scripts.
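The column idea can be sketched in a few lines of Tcl. The proc below is a hypothetical cut-like helper, not an existing fileutil command; its name, its 0-based column indices and its single-character separator are all assumptions made for the example.

```tcl
# Hypothetical helper, not part of fileutil: keep only the given columns
# (0-based indices) of every line in $text. $sep should be a single
# character, because split treats each character of it as a separator.
proc cutColumns {text columns {sep "\t"}} {
    set out {}
    foreach line [split $text \n] {
        set fields [split $line $sep]
        set picked {}
        foreach c $columns {
            # lindex yields an empty string when the column is missing.
            lappend picked [lindex $fields $c]
        }
        lappend out [join $picked $sep]
    }
    return [join $out \n]
}

# Prints "a:c" and "d:f" on separate lines.
puts [cutColumns "a:b:c\nd:e:f" {0 2} :]
```

The same split-into-fields step would be the starting point for join, sort, comm or uniq on a chosen column.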

Perhaps even some code like Glenn Jackman's:
proc touch {filename {time ""}} {
    if {[string length $time] == 0} {set time [clock seconds]}
    file mtime $filename $time
    file atime $filename $time
}

glennj: This proc has been accepted into tcllib 1.2: http://tcllib.sourceforge.net/doc/fileutil.html

US: Unix-like touch:
proc touch {filename {time ""}} {
    if {![file exists $filename]} {
        close [open $filename a]
    }
    if {[string length $time] == 0} {set time [clock seconds]}
    file mtime $filename $time
    file atime $filename $time
}

SS 2003-12-16: Trying to improve on the Tcl implementation of wc in the Great Language Shootout, I wrote this, which seems to take about half the execution time on big files:
set text [read stdin]
set c [string length $text]
set l [expr {[llength [split $text "\n\r"]]-1}]
set T [split $text "\n\r\t "]
set w [expr {[llength $T]-[llength [lsearch -all -exact $T {}]]-1}]
puts "\t$l\t$w\t$c"

Output seems to be identical to GNU's wc command.

SEH 2006-07-23 -- The proc fileutil::find is useful, but it has several deficiencies:

  • On Windows, hidden files are mishandled.
  • On Windows, checks to avoid infinite loops due to nested symbolic links are not done.
  • On Unix, nested loop checking requires a "file stat" of each file/dir encountered, a significant performance hit.
  • The basedir from which the search starts is not included in the results, as it is with GNU find.
  • If the basedir is a file, it is returned in the result not as a list element (like glob) but as a string.
  • The proc calls itself recursively, and thus risks running into interp recursion limits for very large systems.
  • fileutil.tcl contains three separate instantiations of proc find for varying OSes/versions. Maintenance nightmare.

The following code eliminates all the above deficiencies. It checks for nested symbolic links in a platform-independent way, and scans directory hierarchies without recursion.

For speed and simplicity, it takes advantage of glob's ability to use multiple patterns to scan deeply into a directory structure in a single command, hence the name. Its calling syntax is the same as fileutil::find, so with a name change it could be used as a drop-in replacement.

SEH 2008-01-20: globfind has been rewritten to achieve greater speed, simplicity and function, and moved to its own page.
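The globfind code itself lives on its own page. Purely as an illustration of two of the ideas described above (scanning without recursion, and detecting link loops in a platform-independent way), a minimal sketch might look like the following. The procs realDir and walkTree are hypothetical names; this is not the globfind implementation, takes no filter command, and does not use the multi-pattern glob trick.

```tcl
# Hypothetical sketch, not the globfind implementation.

# Resolve a directory to its real location. Appending a dummy component
# makes file normalize resolve a symbolic link in the last real
# component too (the same idiom fileutil::fullnormalize relies on).
proc realDir {path} {
    return [file dirname [file normalize [file join $path __dummy__]]]
}

# List everything below basedir using a work queue instead of recursion.
# Note: the pattern * skips hidden (dot) files on Unix.
proc walkTree {basedir} {
    set result {}
    set seen [dict create]
    set queue [list $basedir]
    while {[llength $queue] > 0} {
        # Pop the first directory off the queue.
        set queue [lassign $queue dir]
        # A directory whose real location was already visited is skipped,
        # which breaks loops created by nested symbolic links.
        set real [realDir $dir]
        if {[dict exists $seen $real]} continue
        dict set seen $real 1
        foreach path [glob -nocomplain -directory $dir -- *] {
            lappend result $path
            if {[file isdirectory $path]} {
                lappend queue $path
            }
        }
    }
    return $result
}
```

Because the queue lives in an ordinary list, the depth of the directory tree has no bearing on the interp recursion limit.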

gavino posted a question on comp.lang.tcl:

"I can not figure out the [globfind] syntax to limit it to finding say .pdf files. ... please someone post and [sic] example."

and Gerald Lester replied:
 proc PdfOnly {fileName} {
     return [string equal [string tolower [file extension $fileName]] .pdf]
 }

 set fileList [globfind $dir PdfOnly]

SEH 2007-03-17 -- A simpler alternative:
 set fileList [globfind $dir {string match -nocase *.pdf}]

gavino 2011-03-21:

I could not get globfind to work with 8.6

I wrote this because on solaris 10 at work find sucks and is sometimes broken outright.
#! /home/g/tcl/bin/tclsh8.6.exe

#needs tcllib, I used 1.13 and cygwin at home, but use unix tcl+tcllib at work
package require fileutil
foreach file [fileutil::find /home/g {string match -nocase *.log}] {
    set filesize [file size $file]
    if {$filesize >= 1073741824} {
        set gigs [expr {$filesize / 1073741824}]
        puts "$gigs G $file"
    } elseif {$filesize >= 1048576} {
        set megs [expr {$filesize / 1048576}]
        puts "$megs M $file"
    } elseif {$filesize >= 1024} {
        set kilos [expr {$filesize / 1024}]
        puts "$kilos K $file"
    } else {
        puts "$filesize B $file"
    }
}

AMG: How is it misbehaving?

gavino: I was in a directory and ran find, and it didn't find the httpd.conf file I was looking at, let alone others. Permissions, no doubt, but you'd think find run as root would find the files anyhow.

Laif: Those who are not familiar with Unix should note that even on Windows XP, if fileutil::find encounters a folder or file named with a single tilde (~), it will append the contents of the user's home directory to the search results. Furthermore, there is a risk of infinite recursion if somewhere within your home folder there is also a folder named with a single tilde.
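The usual workaround in Tcl 8.x is to prefix such a name with ./ so that it is no longer subject to tilde substitution. The sketch below shows the idea with a hypothetical helper named safeJoin; it is not part of fileutil, and whether a given fileutil::find version applies a guard like this is not claimed here. As far as I know, Tcl 9 dropped tilde substitution altogether, so the guard mainly matters on older interpreters.

```tcl
# Hypothetical helper, not part of fileutil: join a directory and a file
# name without letting a leading ~ in the name be treated as a
# reference to a home directory (Tcl 8.x behaviour).
proc safeJoin {dir name} {
    if {[string index $name 0] eq "~"} {
        # ./~ names the literal file "~" relative to the directory.
        set name ./$name
    }
    return [file join $dir $name]
}

# Stays below /tmp instead of collapsing to "~".
puts [safeJoin /tmp ~]
```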

gavino 2011-03-24:

A faster, shorter, cooler version; especially fun if you pipe it to sort -n: ./gavinfind.tcl | sort -n
#!/usr/local/bin/tclsh
#needs tcllib, I used 1.13
package require Tcl 8.5.9
package require fileutil
foreach file [fileutil::find /export/home/g] {
    set filesize [file size $file]
    if {$filesize > 1073741824} {
        puts "[expr {$filesize / 1073741824}] G $file"
    } elseif {$filesize > 1048576} {
        puts "[expr {$filesize / 1048576}] M $file"
    }
}

I guess shell works too, but maybe tcl finds files that shell misses? hmm
find /export/home/gschuette -size +1000000c -type f -exec ls -lh {} \; | awk '{print $5 " " $9}' | sort -n | grep -v '[0-9]K'