tetex-bin  3.0
vaxvms.c
/***********************************************************************
This file provides alternative functions for several VMS C library
routines which have either unacceptable, or incorrect, implementations.
They have been developed and tested under VMS Version 4.4, but
indications are that they apply to earlier versions, back to 3.2 at
least.  They should be retested with each new release of VMS C.

Contents:
       EXIT
       FSEEK
       FTELL
       GETCHAR
       GETENV
       READ
       UNGETC
       getlogin
       qsort
       system
       tell
       unlink

The VAX VMS file system record structure has unfortunate consequences
for random access files.

By default, text files written by most system utilities, and languages
other than C, have a variable length record format, in which a 16-bit
character count is aligned on an even-byte boundary in the disk block
(always 512 bytes in VMS, independent of record and file formats),
followed by <count> bytes of data.  Binary files, such as .EXE, .OBJ,
and TeX .DVI and font files, all use a 512-byte fixed record format
which has no explicit length field.  No file byte count is stored;
instead, the block count, and the offset of the last data byte in the
last block are recorded in the file header (do ``DUMP/HEADER filespec''
to see it).  For binary files with fixed-length records, the last block
is normally assumed to be full, and consequently, file transfer of
binary data from other machines via Kermit, FTP, or DCL COPY from ANSI
tapes, generally fails because the input file length is not a multiple
of 512.

This record organization may be contrasted with the STREAM, STREAM_LF,
and STREAM_CR organizations supported from Version 4.0; in these, disk
blocks contain a continuous byte stream in which nothing, or LF, or CR,
is recognized as a record terminator.  These formats are similar to the
Unix and TOPS-20 file system formats which also use continuous byte
streams.

For C, this means that a program operating on a file in record format
cannot count input characters and expect that count to be the same value
as the offset parameter passed to fseek(), which numerous C programs
assume to be the case.  The draft ANSI C standard, and Harbison and
Steele's ``C Reference Manual'', emphasize that only values returned by
ftell() should be used as arguments to fseek(), allowing the program to
return to a position previously read or written.  UNFORTUNATELY, VMS C
ftell() DOES NOT RETURN A CORRECT OFFSET VALUE FOR RECORD FILES.
Instead, for record files, it returns the byte offset of the start of
the current record, no matter where in that record the current position
may be.  This misbehavior is completely unnecessary, since the
replacements below perform correctly, and are written entirely in C.

Another problem is that ungetc(char c, FILE* fp) is unreliable.  VMS C
implements characters as signed 8-bit integers (so do many other C
implementations).  fgetc(FILE* fp) returns an int, not a char, whose
value is EOF (-1) in the event of end-of-file; however, this value will
also be returned for a character 0xFF, so it is essential to use
feof(FILE *fp) to test for a true end-of-file condition when EOF is
returned.  ungetc() checks the sign of its argument c, and if it is
negative (which it will be for 128 of the 256 signed bytes), REFUSES TO
PUT IT BACK IN THE INPUT STREAM, on the assumption that c is really EOF.
This too can be fixed; ungetc() should only do nothing if feof()
indicates a true end-of-file condition.  The overhead of this is
trivial, since feof() is actually implemented as a macro which does
nothing more than a logical AND and compare-with-zero.

getchar() waits for a <CR> to be typed when stdin is a terminal; the
replacement vms_getchar() remedies this.

Undoubtedly other deficiencies in VMS C will reveal themselves.

VMS read() returns only a single disk block on each call.  Its
replacement, vms_read(), will return the requested number of bytes, if
possible.

There are also a few Unix standard functions which are unimplemented.
qsort() is not provided.  getlogin() and unlink() have VMS equivalents
provided below.  tell() is considered obsolete, since its functionality
is available from lseek(), but it is still seen in a few programs, so is
provided below.  getenv() fails if the name contains a colon; its
replacement allows the colon.

In the interest of minimal source perturbation, replacements for VMS
functions are given the same names, but prefixed "vms_".  For
readability, the original names are preserved, but are converted to
upper-case:

       #define FTELL vms_ftell
       #define FSEEK vms_fseek
       #define GETCHAR vms_getchar
       #define GETENV vms_getenv
       #define UNGETC vms_ungetc

These are only defined to work correctly for fixed length 512-byte
records, and no check is made that the file has that organization (it is
possible, but not without expensive calls to fstat(), or access to
internal library structures).

[02-Apr-87]  --  Nelson H.F. Beebe, University of Utah Center for
                 Scientific Computing
***********************************************************************/
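
The collision between EOF and the data byte 0xFF described in the header comment can be reproduced on any compiler by narrowing a byte to `signed char`. The following is a minimal sketch and not part of vaxvms.c: the names `collides_with_eof` and `read_byte` are hypothetical, and it assumes a two's-complement machine on which EOF is -1. ISO C's fgetc() returns the byte as an unsigned char converted to int, so on such libraries the int result alone already separates 0xFF from EOF; the feof() test follows the advice in the comment for libraries that behave as described there for VMS C.

```c
#include <stdio.h>

/* Returns 1 if a data byte, once narrowed to signed char, compares
   equal to EOF (assumption: two's complement, EOF == -1). */
int collides_with_eof(int byte)
{
    signed char c = (signed char)byte;  /* what a signed-char compiler stores */
    return c == EOF;
}

/* Read one byte, reporting a true end-of-file separately from the value.
   The result of fgetc() is kept as an int and never narrowed to char. */
int read_byte(FILE *fp, int *at_eof)
{
    int ch = fgetc(fp);
    *at_eof = (ch == EOF) && feof(fp);  /* EOF value alone is not trusted */
    return ch;
}
```

With these definitions, `collides_with_eof(0xFF)` is true while `collides_with_eof('A')` is false, which is the ambiguity that UNGETC below has to work around.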

#define EXIT    vms_exit
#define FTELL   vms_ftell
#define FSEEK   vms_fseek
#define GETENV  vms_getenv
#define GETCHAR vms_getchar
#define READ    vms_read
#define UNGETC  vms_ungetc

#include <stdio.h>
#include <types.h>
#include <ctype.h>
#include <stat.h>
#include <descrip.h>
#include <iodef.h>              /* need for vms_getchar() */
#include <ssdef.h>

#ifdef __GNUC__
#include <stdlib.h>
#endif

void  EXIT(int code);
long  FTELL(FILE *fp);
long  FSEEK(FILE *fp, long n, long dir);
long  UNGETC(char c, FILE *fp);
int   GETCHAR(void);
int   READ(int file_desc, char *buffer, int nbytes);
char *GETENV(char *name);
char *getlogin(void);
long  tell(int handle);
int   unlink(char *filename);

/**********************************************************************/
/*-->EXIT*/

void
vms_exit(int code)
{
    switch (code)
    {
    case 0:
        exit(1);                /* success (VMS status with low bit set) */
        break;

    default:
        exit(2);                /* error (VMS status with low bit clear) */
        break;
    }
}
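
VMS treats a status value whose low bit is set as success, which is why vms_exit() turns the Unix convention (0 for success) into 1 and everything else into 2. A minimal sketch of that mapping follows; the names `unix_to_vms_status` and `vms_status_ok` are hypothetical and the functions only restate the arithmetic, they do not call any VMS service.

```c
/* Map a Unix-style exit code to the status value vms_exit() passes on. */
int unix_to_vms_status(int code)
{
    return (code == 0) ? 1 : 2;     /* 1: low bit set, 2: low bit clear */
}

/* A VMS status indicates success when its least significant bit is 1;
   this is the complement of the FAILED() test used by vms_getchar(). */
int vms_status_ok(int status)
{
    return (status & 1) != 0;
}
```

The same low-bit test appears later in this file as the FAILED() macro.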


/**********************************************************************/
/*-->FSEEK*/

/* VMS fseek() and ftell() on fixed-length record files work correctly
only at block boundaries.  This replacement code patches in the offset
within the block.  Directions from current position and from
end-of-file are converted to absolute positions, and then the code for
that case is invoked. */

long
FSEEK(FILE *fp, long n, long dir)
{
    long k,m,pos,val,oldpos;
    struct stat buffer;

    for (;;)                        /* loops only once or twice */
    {
      switch (dir)
      {
      case 0:                       /* from BOF */
          oldpos = FTELL(fp);       /* get current byte offset in file */
          k = n & 511;              /* offset in 512-byte block */
          m = n >> 9;               /* relative block number in file */
          if (((*fp)->_cnt) && ((oldpos >> 9) == m)) /* still in same block */
          {
            val = 0;                /* success */
            (*fp)->_ptr = ((*fp)->_base) + k; /* reset pointers to requested byte */
            (*fp)->_cnt = 512 - k;
          }
          else
          {
            val = fseek(fp,m << 9,0); /* move to start of requested 512-byte block */
            if (val == 0)           /* success */
            {
              (*fp)->_cnt = 0;      /* indicate empty buffer */
              (void)fgetc(fp);      /* force refill of buffer */
              (*fp)->_ptr = ((*fp)->_base) + k; /* reset pointers to requested byte */
              (*fp)->_cnt = 512 - k;
            }
          }
          return(val);

      case 1:                       /* from current pos */
          pos = FTELL(fp);
          if (pos == EOF)           /* then error */
            return (EOF);
          n += pos;
          dir = 0;
          break;                    /* go do case 0 */

      case 2:                       /* from EOF */
          val = fstat(fileno(fp),&buffer);
          if (val == EOF)           /* then error */
            return (EOF);
          n += buffer.st_size - 1;  /* convert filesize to offset and */
                                    /* add to requested offset */
          dir = 0;
          break;                    /* go do case 0 */

      default:                      /* illegal direction parameter */
          return (EOF);
      }
    }
}
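
The core of FSEEK is the split of an absolute byte offset into a 512-byte block number and an offset within that block (`n >> 9` and `n & 511`). A minimal, portable sketch of that arithmetic is shown here; `block_pos`, `split_offset` and `join_offset` are hypothetical names, and the sketch assumes a non-negative offset.

```c
#define BLOCK_SIZE 512L             /* VMS disk block size used by FSEEK */

typedef struct {
    long block;                     /* relative block number in file */
    long offset;                    /* byte offset inside that block */
} block_pos;

/* Split an absolute, non-negative byte offset as FSEEK does. */
block_pos split_offset(long n)
{
    block_pos p;
    p.block  = n >> 9;              /* same as n / 512 for n >= 0 */
    p.offset = n & (BLOCK_SIZE - 1);/* same as n % 512 for n >= 0 */
    return p;
}

/* Rebuild the absolute offset, as FTELL does when it adds the
   in-block offset to the block start reported by ftell(). */
long join_offset(block_pos p)
{
    return (p.block << 9) + p.offset;
}
```

For example, offset 1000 splits into block 1 and in-block offset 488, and joining the two gives 1000 again.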

/**********************************************************************/
/*-->FTELL*/

/* With fixed-length record files, ftell() returns the offset of the
start of block.  To get the true position, this must be biased by
the offset within the block. */

long
FTELL(FILE *fp)
{
    char c;
    long pos;
    long val;
    if ((*fp)->_cnt == 0)           /* buffer empty--force refill */
    {
        c = fgetc(fp);
        val = UNGETC(c,fp);
        if (val != c)
            return (EOF);           /* should never happen */
    }
    pos = ftell(fp);                /* this returns multiple of 512 (start of block) */
    if (pos >= 0)                   /* then success--patch in offset in block */
        pos += ((*fp)->_ptr) - ((*fp)->_base);
    return (pos);
}

/**********************************************************************/
/*-->GETCHAR*/

static int tt_channel = -1;         /* terminal channel for image QIO's */

#define FAILED(status) (~(status) & 1) /* failure if LSB is 0 */

int
GETCHAR(void)
{
    int ret_char;                   /* character returned */
    int status;                     /* system service status */
    static $DESCRIPTOR(sys_in,"TT:");

    if (tt_channel == -1)           /* then first call--assign channel */
    {
        status = sys$assign(&sys_in,&tt_channel,0,0);
        if (FAILED(status))
            lib$stop(status);
    }
    ret_char = 0;
    status = sys$qiow(0,tt_channel,IO$_TTYREADALL | IO$M_NOECHO,0,0,0,
        &ret_char,1,0,0,0,0);
    if (FAILED(status))
        lib$stop(status);

    return (ret_char);
}

/**********************************************************************/
/*-->READ*/
int
READ(register int file_desc,register char *buffer,register int nbytes)
{
    register int ngot;
    register int left;

    for ((left = nbytes, ngot = 0); left > 0; /* NOOP */)
    {
        ngot = read(file_desc,buffer,left);
        if (ngot < 0)
            return (-1);            /* error occurred */
        if (ngot == 0)
            break;                  /* end of file--avoid looping forever */
        buffer += ngot;
        left -= ngot;
    }
    return(nbytes-left);
}
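
The loop in READ keeps calling a reader that may return fewer bytes than requested until the request is satisfied, an error occurs, or the input is exhausted. The sketch below shows the same pattern against an in-memory source so that it can be run anywhere; `chunk_source`, `chunk_read` and `read_fully` are hypothetical names standing in for the file descriptor, VMS read() and vms_read().

```c
#include <string.h>

/* Hypothetical source that hands out at most `chunk` bytes per call,
   imitating a read() that returns one disk block at a time. */
typedef struct {
    const char *data;
    int size;
    int pos;
    int chunk;
} chunk_source;

int chunk_read(chunk_source *src, char *buffer, int nbytes)
{
    int n = src->size - src->pos;   /* bytes still available */
    if (n > src->chunk) n = src->chunk;
    if (n > nbytes) n = nbytes;
    memcpy(buffer, src->data + src->pos, (size_t)n);
    src->pos += n;
    return n;                       /* 0 means end of data */
}

/* Keep reading until nbytes have arrived, an error occurs, or the
   source is exhausted; returns the number of bytes actually read. */
int read_fully(chunk_source *src, char *buffer, int nbytes)
{
    int left = nbytes;
    while (left > 0) {
        int ngot = chunk_read(src, buffer, left);
        if (ngot < 0)
            return -1;              /* error occurred */
        if (ngot == 0)
            break;                  /* end of data */
        buffer += ngot;
        left -= ngot;
    }
    return nbytes - left;
}
```

With ten bytes of data delivered four at a time, a request for seven bytes returns 7, the next request for seven returns the remaining 3, and a further request returns 0.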

/**********************************************************************/
/*-->UNGETC*/
long
UNGETC(char c, FILE *fp)            /* VMS ungetc() is a no-op if c < 0 */
{                                   /* (which is half the time!)        */

    if ((c == EOF) && feof(fp))
        return (EOF);               /* do nothing at true end-of-file */
    else if ((*fp)->_cnt >= 512)    /* buffer full--no fgetc() done in this block! */
        return (EOF);               /* must be user error if this happens */
    else                            /* put the character back in the buffer */
    {
        (*fp)->_cnt++;              /* increase count of characters left */
        (*fp)->_ptr--;              /* backup pointer to next available char */
        *((*fp)->_ptr) = c;         /* save the character */
        return (c);                 /* and return it */
    }
}
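
The pushback performed by UNGETC amounts to stepping the buffer pointer back by one and restoring the count. The sketch below models that on a self-contained buffer rather than on the VMS FILE structure; `block_buf`, `buf_get` and `buf_unget` are hypothetical names. Bytes are handled as unsigned values carried in an int, so 0xFF is never confused with the -1 failure value.

```c
/* Hypothetical stand-in for one buffered 512-byte block. */
typedef struct {
    unsigned char base[512];        /* start of buffer */
    unsigned char *ptr;             /* next byte to be read */
    int cnt;                        /* bytes left to read */
} block_buf;

/* Return the next byte as 0..255, or -1 when the buffer is empty. */
int buf_get(block_buf *b)
{
    if (b->cnt == 0)
        return -1;
    b->cnt--;
    return *b->ptr++;
}

/* Put a byte back; fails with -1 only if nothing has been read yet. */
int buf_unget(block_buf *b, int c)
{
    if (b->ptr == b->base)
        return -1;                  /* no room in front of the pointer */
    b->cnt++;                       /* increase count of bytes left */
    b->ptr--;                       /* back up to the previous byte */
    *b->ptr = (unsigned char)c;     /* save the byte */
    return (unsigned char)c;
}
```

A byte with value 0xFF read through buf_get() can be pushed back with buf_unget() and read again, which is the case the original VMS ungetc() is reported to refuse.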

/**********************************************************************/
/*-->GETENV*/
char*
GETENV(char *name)
{
    char* p;
    char* result;
    char ucname[256];

    p = ucname;                     /* VMS logical names must be upper-case */
    while (*name && (p < (ucname + sizeof(ucname) - 1))) /* never overrun ucname */
    {
        *p++ = islower((unsigned char)*name) ?
            toupper((unsigned char)*name) : *name;
        ++name;
    }
    *p = '\0';

    p = strchr(ucname,':');         /* colon in name? */
    if (p == (char *)NULL)          /* no colon in name */
        result = getenv(ucname);
    else                            /* try with and without colon */
    {
        result = getenv(ucname);
        if (result == (char *)NULL)
        {
            *p = '\0';              /* cut the name at the colon and retry */
            result = getenv(ucname);
            *p = ':';
        }
    }
    return (result);
}
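
The lookup strategy in GETENV (fold the name to upper case, try it as given, then retry with everything from the colon onward removed) can be exercised without VMS by passing the lookup routine as a parameter. This is a minimal sketch under that assumption; `lookup_fn`, `lookup_logical` and `demo_lookup` are hypothetical names, and `demo_lookup` knows a single made-up entry.

```c
#include <ctype.h>
#include <string.h>

typedef const char *(*lookup_fn)(const char *name);

/* Hypothetical table lookup that stands in for getenv(). */
const char *demo_lookup(const char *name)
{
    return (strcmp(name, "TEXINPUTS") == 0) ? "demo-value" : NULL;
}

/* Upper-case the name, try it as is, then retry cut at the colon. */
const char *lookup_logical(const char *name, lookup_fn lookup)
{
    char ucname[256];
    size_t i;
    size_t len = strlen(name);
    char *colon;
    const char *result;

    if (len >= sizeof ucname)
        return NULL;                /* too long to copy safely */
    for (i = 0; i < len; i++)
        ucname[i] = (char)toupper((unsigned char)name[i]);
    ucname[len] = '\0';

    result = lookup(ucname);
    colon = strchr(ucname, ':');
    if (result == NULL && colon != NULL) {
        *colon = '\0';              /* drop the colon and what follows */
        result = lookup(ucname);
    }
    return result;
}
```

Here `lookup_logical("texinputs:", demo_lookup)` finds the entry through the second attempt, while an unknown name yields NULL.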

/**********************************************************************/
/*-->getlogin*/
char*
getlogin()
{
    return ((char *)getenv("USER")); /* use equivalent VMS routine */
}

/**********************************************************************/
/*-->qsort*/

/***********************************************************************
TeXindex uses the standard Unix library function qsort() for
record sorting.  Unfortunately, qsort() is not a stable sorting
algorithm, so input order is not necessarily preserved for equal
sort keys.  This is important, because the sorting is
case-independent, while the actual entries may not be.  For
example, the input

\entry{i}{22}{{\CODE{i}}}
\entry{i}{42}{{\CODE{i}}}
\entry{I}{41}{{\CODE{I}}}
\entry{I}{42}{{\CODE{I}}}

produces

\initial {I}
\entry {{\CODE{i}}}{22}
\entry {{\CODE{I}}}{41--42}
\entry {{\CODE{i}}}{42}

instead of the correct

\initial {I}
\entry {{\CODE{i}}}{22, 42}
\entry {{\CODE{I}}}{41--42}

We therefore provide this stable shellsort replacement for
qsort() based on the code given on p. 116 of Kernighan and
Ritchie, ``The C Programming Language'', Prentice-Hall (1978).
This has order N**1.5 average performance, which is usually
slower than qsort().  In the interests of simplicity, we make no
attempt to handle short sequences by alternative methods.

[07-Nov-86]
***********************************************************************/

#if VMS_QSORT
#define BASE(i) &base[(i)*width]

void
qsort(base, nel, width, compar)
    char base[];                    /* start of data in memory */
    int nel;                        /* number of elements to be sorted */
    int width;                      /* size (in bytes) of each element */
    int (*compar)();                /* comparison function */
{
    int gap;
    int i;
    int j;

    register int k;                 /* inner exchange loop parameters */
    register char* p;
    register char* q;
    register char  c;

    for (gap = nel/2; gap > 0; gap /= 2)
    {
        for (i = gap; i < nel; i++)
        {
            for (j = i-gap; j >= 0; j -= gap)
            {
                p = BASE(j);
                q = BASE(j+gap);
                if ((*compar)(p,q) <= 0)
                    break;          /* exit j loop */
                else
                {
                    for (k = 0; k < width; (++p, ++q, ++k))
                    {
                        c = *q;
                        *q = *p;
                        *p = c;
                    }
                }
            }
        }
    }
}
#endif
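
A Shell sort pass with a gap larger than one can carry an element past another element with an equal key, so the order of equal keys is not guaranteed in general. Where that order must be kept, a plain insertion sort (the gap = 1 pass on its own) preserves it, at order N**2 cost. The following is a minimal sketch of that alternative using the same byte-wise exchange as the routine above; `stable_sort`, `struct entry` and `cmp_key_nocase` are hypothetical names chosen to mirror the TeXindex example.

```c
#include <ctype.h>
#include <stddef.h>

struct entry {
    char key;                       /* sort key, compared without case */
    int page;                       /* payload that shows input order */
};

/* Case-independent comparison of the key field only. */
int cmp_key_nocase(const void *x, const void *y)
{
    const struct entry *a = x;
    const struct entry *b = y;
    return tolower((unsigned char)a->key) - tolower((unsigned char)b->key);
}

/* Insertion sort: adjacent elements are exchanged only when strictly
   out of order, so elements with equal keys never pass each other. */
void stable_sort(void *base, size_t nel, size_t width,
                 int (*compar)(const void *, const void *))
{
    char *a = base;
    size_t i, j, k;

    for (i = 1; i < nel; i++) {
        for (j = i; j > 0; j--) {
            char *p = a + (j - 1) * width;
            char *q = a + j * width;
            if (compar(p, q) <= 0)
                break;              /* in order (or equal): stop */
            for (k = 0; k < width; k++) {   /* byte-wise exchange */
                char c = p[k];
                p[k] = q[k];
                q[k] = c;
            }
        }
    }
}
```

Sorting the entries (i,22), (i,42), (I,41), (I,42), (a,7) with this routine moves (a,7) to the front and leaves the four case-equivalent entries in their input order.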
/**********************************************************************
*-->system*
int
system(char *s)
{
       struct dsc$descriptor t;

       t.dsc$w_length = strlen(s);
       t.dsc$a_pointer = s;
       t.dsc$b_class = DSC$K_CLASS_S;
       t.dsc$b_dtype = DSC$K_DTYPE_T;
       return (LIB$SPAWN(&t) == SS$_NORMAL) ? 0 : 127;
}


**********************************************************************/
/*-->tell*/
long
tell(int handle)
{
    return (lseek(handle,0L,1));
}

/**********************************************************************/
/*-->unlink*/
int
unlink(char *filename)
{
    return (delete(filename));      /* use equivalent VMS routine */
}