HPFS and FAT filename characters


0. Contents of filechar.zip:

   FILECHAR.ABS    this text
   FILECHAR.CMD    OS/2 REXX script to create FILECHAR.nnn
   FILECHAR.437    file name characters for codepage 437
   FILECOLD.850    ditto codepage  850 (old 850 without euro)
   FILECHAR.850    ditto codepage  858 (new 850 with    euro)
   FILECHAR.004    ditto codepage 1004

1. Introduction

   You don't need this file (FILECHAR.ABS) to use FILECHAR.CMD.

   FILECHAR.CMD is a trivial OS/2 REXX script used to determine
   all legal filename characters and their AKAs on HPFS and FAT.

   If you're only interested in my results see the four files
        FILECHAR.437            (new result after CHCP 437)
        FILECHAR.850            (new result after CHCP 850)
        FILECOLD.850            (old result after CHCP 850)
        FILECHAR.004            (new result after CHCP 1004)

   These files have been created on my system by commands like
        CHCP 437 & FILECHAR > FILECHAR.437
        CHCP 850 & FILECHAR > FILECHAR.850

   The old FILECOLD.850 reflects the results before installing
   the new "Euro codepage" (codepage 850 with the Euro symbol at
   hex. D5).  On my WARP 3 system "old" is fixpack 17, and "new"
   is e.g. fixpack 40.  I never intended to publish FILECHAR.CMD,
   but the differing results with vs. without the Euro symbol
   are IMHO quite alarming.

2. Configuration

   If all legal filename characters depending on file system
   (FAT vs. HPFS etc.), codepage (437 vs. 850 etc.), and even
   installed fixpack are documented somewhere, then please
   tell me where...  Until then FILECHAR.CMD works by trial
   and error.  You have to "configure" FILECHAR.CMD for your
   system by editing two lines, replace...

        HPFS.. = 'D:\TMP\'   /* HPFS directory */
        OFAT.. = 'F:\TMP\'   /*  FAT directory */

   ... with existing HPFS and FAT directories on your system.
   You may use root directories, e.g. OFAT.. = 'C:', or even
   other file systems, as long as you have write access and
   know how to interpret the results.

   Hint:  FILECHAR.CMD deletes all created temporary files
   ---?---$, and this works faster on drives without "DELDIR".

3. Operation

   FILECHAR.CMD simply tries to create 255 files ---?---$ in
   both directories, where ? is hex. 01 .. hex. FF (255), and
   appends the character ? to the contents of ---?---$.  For
   some characters like # the file ---#---$ ends up containing
   only # on both FAT and HPFS in codepage 437 or 850.

   The file ---Z---$ probably contains Z and z for HPFS and
   FAT:  OS/2 would treat files zzz, ZZZ, zZz, etc. as the same
   file, although HPFS supports mixed case filenames.  If you
   have write access on a *NIX-filesystem then zzz, ZZZ, and
   zZz would be three different files.  The most (in)famous
   examples are makefile, Makefile, MAKEFILE, etc. ;-)

   The file ---E---$ may contain 6 characters in codepage 437:
   E, e, é, ê, ë, and è are treated as identical in file names.
   Of course FILECHAR.CMD does not only create the ---?---$
   files, it also evaluates and finally deletes them.
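
   The trial-and-error idea can be sketched in a few lines.  The
   following is not FILECHAR.CMD but a hypothetical Python
   re-creation of the procedure described in this section; the
   function name probe, the cp437 decoding, and the returned
   structure are assumptions made for illustration only.

```python
import os
import tempfile


def probe(directory, codepage="cp437"):
    """Probe which characters work in ---?---$ names (illustrative sketch).

    Returns (contents, illegal): contents maps each accepted character to
    everything that ended up in "its" file, illegal lists rejected ones.
    """
    accepted = []  # (character, path) pairs that could be opened
    illegal = []   # characters for which creating the file failed
    for code in range(0x01, 0x100):
        ch = bytes([code]).decode(codepage)
        path = os.path.join(directory, "---" + ch + "---$")
        try:
            # Append mode: if the file system treats ch like an earlier
            # character, the byte lands in the already existing file.
            with open(path, "ab") as f:
                f.write(bytes([code]))
            accepted.append((ch, path))
        except (OSError, ValueError):
            illegal.append(ch)

    # Evaluate: a file holding more than one byte reveals aliases.
    contents = {}
    for ch, path in accepted:
        with open(path, "rb") as f:
            contents[ch] = f.read().decode(codepage)

    # Clean up; aliased names point to the same file, so it may be gone.
    for _, path in accepted:
        try:
            os.remove(path)
        except FileNotFoundError:
            pass
    return contents, illegal


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        contents, illegal = probe(tmp)
        print("E shares its file with:", contents.get("E"))
        print("rejected:", "".join(c for c in illegal if c >= " "))
```

   On a case-sensitive file system the entry for E should hold
   only E itself; on a system that folds case or accents it would
   list every character treated as the same name.  The results
   depend entirely on the file system and platform the sketch is
   run on, so they are not comparable to the FILECHAR.nnn files.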

4. Usage

        FILECHAR -h        usage info (ditto -u, -?, etc.)
        FILECHAR --        long result lines (up to 255 columns)
        FILECHAR           short result lines (up to 79 columns)

   The short format skips 0 .. 9 (known unique legal characters)
   to get fewer result lines.  The long format contains all valid
   filename characters, about 164 columns in "new" codepage 850.

   The output format should be obvious.  Characters in a line
   marked by HPFS (resp. FAT) are valid in HPFS (FAT) filenames.

   Characters in a line marked by "aka" are treated as identical
   to the character(s) in the same column, namely the character
   in the nearest HPFS or FAT line above it.  Short format only:
   If there is no aka-line below an HPFS or FAT line, then
   these characters are legal and unique.

   In long format you get exactly one long HPFS-line and one
   long FAT-line with as many aka-lines as needed in the worst
   case.  In the "new" codepage 850 there is only one aka-line,
   i.e. at most lower and upper case are treated as identical.

   Characters in a line marked by "not" are not supported on
   HPFS (resp. FAT).  HPFS does not support "/:<>\| in addition
   to anything below hex. 20 (32, space).  FAT does not support
   "+,./:;<=>[\]|.  Programs often have difficulties with the
   characters +,.;=[], which work on HPFS but not on FAT.
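
   The two "not" lists quoted in this section can be compared
   mechanically.  A minimal Python sketch, using only those two
   character lists (the names HPFS_NOT, FAT_NOT, and offending
   are made up for illustration, and the control characters
   below hex. 20 are left out):

```python
# Characters reported as "not" supported, copied from the text.
HPFS_NOT = set('"/:<>\\|')
FAT_NOT = set('"+,./:;<=>[\\]|')


def offending(name, not_supported):
    """Return the characters of name that fall in the given "not" set."""
    return "".join(ch for ch in name if ch in not_supported)


# Characters that work on HPFS but are rejected on FAT.
hpfs_only = "".join(sorted(FAT_NOT - HPFS_NOT))
print(hpfs_only)                          # +,.;=[]
print(offending("a+b[1];c", FAT_NOT))     # +[];
print(offending("a+b[1];c", HPFS_NOT))    # prints an empty line
```

   The set difference reproduces exactly the seven characters
   named as troublemakers.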

5. Caveats

   FILECHAR.CMD only tests ---?---$.  So if characters depend
   on the position within a filename, then FILECHAR.CMD cannot
   detect it.  Examples:  leading or trailing spaces generally
   don't work, but spaces within a name are okay (even in a FAT,
   compare "WP ROOT. SF" etc.).  Trailing dots don't work on
   HPFS, a leading dot may have a special meaning (*NIX), many
   programs treat the last dot as THE DOT, and in a FAT dots are
   not supported (except for the implicit 8+3 dot).
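
   Position-dependent rules could be probed by trial and error
   as well.  The following Python sketch is not part of
   FILECHAR.CMD; the candidate names and the function
   probe_positions are hypothetical, and the target directory
   is assumed to be empty and writable.

```python
import os
import tempfile

# Hypothetical test names: leading/trailing space, inner space,
# trailing dot, leading dot.
CANDIDATES = [" lead", "trail ", "in ner", "trail.", ".lead"]


def probe_positions(directory, names=CANDIDATES):
    """Report for each name whether it was kept, altered, or rejected."""
    results = {}
    for name in names:
        path = os.path.join(directory, name)
        try:
            with open(path, "x"):  # fail if the name already exists
                pass
        except OSError:
            results[name] = "rejected"
            continue
        # A file system may accept the call but store another name,
        # e.g. with a trailing space or dot stripped.
        kept = name in os.listdir(directory)
        results[name] = "kept" if kept else "altered"
        try:
            os.remove(path)
        except OSError:
            pass  # leave cleanup of oddly stored names to the user
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name, verdict in probe_positions(tmp).items():
            print(repr(name), verdict)
```

   As with the single-character test, the verdicts only describe
   the file system the sketch happens to run on.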

   For "FAT" read "good old DOS FAT"; all I know about FAT32
   is that it exists.  The same holds for HPFS vs. HPFS386.

Last update: 25 Nov 2002 9:30 by F.Ellermann