Back to index

lightning-sunbird  0.9+nobinonly
e_sqrt.c
Go to the documentation of this file.
00001 /* -*- Mode: C; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 4 -*-
00002  *
00003  * ***** BEGIN LICENSE BLOCK *****
00004  * Version: MPL 1.1/GPL 2.0/LGPL 2.1
00005  *
00006  * The contents of this file are subject to the Mozilla Public License Version
00007  * 1.1 (the "License"); you may not use this file except in compliance with
00008  * the License. You may obtain a copy of the License at
00009  * http://www.mozilla.org/MPL/
00010  *
00011  * Software distributed under the License is distributed on an "AS IS" basis,
00012  * WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
00013  * for the specific language governing rights and limitations under the
00014  * License.
00015  *
00016  * The Original Code is Mozilla Communicator client code, released
00017  * March 31, 1998.
00018  *
00019  * The Initial Developer of the Original Code is
00020  * Sun Microsystems, Inc.
00021  * Portions created by the Initial Developer are Copyright (C) 1998
00022  * the Initial Developer. All Rights Reserved.
00023  *
00024  * Contributor(s):
00025  *
00026  * Alternatively, the contents of this file may be used under the terms of
00027  * either of the GNU General Public License Version 2 or later (the "GPL"),
00028  * or the GNU Lesser General Public License Version 2.1 or later (the "LGPL"),
00029  * in which case the provisions of the GPL or the LGPL are applicable instead
00030  * of those above. If you wish to allow use of your version of this file only
00031  * under the terms of either the GPL or the LGPL, and not to allow others to
00032  * use your version of this file under the terms of the MPL, indicate your
00033  * decision by deleting the provisions above and replace them with the notice
00034  * and other provisions required by the GPL or the LGPL. If you do not delete
00035  * the provisions above, a recipient may use your version of this file under
00036  * the terms of any one of the MPL, the GPL or the LGPL.
00037  *
00038  * ***** END LICENSE BLOCK ***** */
00039 /* @(#)e_sqrt.c 1.3 95/01/18 */
00040 /*
00041  * ====================================================
00042  * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved.
00043  *
00044  * Developed at SunSoft, a Sun Microsystems, Inc. business.
00045  * Permission to use, copy, modify, and distribute this
00046  * software is freely granted, provided that this notice 
00047  * is preserved.
00048  * ====================================================
00049  */
00050 
00051 /* __ieee754_sqrt(x)
00052  * Return correctly rounded sqrt.
00053  *           ------------------------------------------
00054  *          |  Use the hardware sqrt if you have one |
00055  *           ------------------------------------------
00056  * Method: 
00057  *   Bit by bit method using integer arithmetic. (Slow, but portable) 
00058  *   1. Normalization
00059  *     Scale x to y in [1,4) with even powers of 2: 
00060  *     find an integer k such that  1 <= (y=x*2^(2k)) < 4, then
00061  *            sqrt(y) = 2^k * sqrt(x)
00062  *   2. Bit by bit computation
00063  *     Let q  = sqrt(y) truncated to i bit after binary point (q = 1),
00064  *          i                                            0
00065  *                                     i+1         2
00066  *         s  = 2*q , and   y  =  2   * ( y - q  ).            (1)
00067  *          i      i            i                 i
00068  *                                                        
00069  *     To compute q    from q , one checks whether 
00070  *                i+1       i                       
00071  *
00072  *                         -(i+1) 2
00073  *                   (q + 2      ) <= y.                (2)
00074  *                            i
00075  *                                                     -(i+1)
00076  *     If (2) is false, then q   = q ; otherwise q   = q  + 2      .
00077  *                          i+1   i             i+1   i
00078  *
00079  *     With some algebric manipulation, it is not difficult to see
00080  *     that (2) is equivalent to 
00081  *                             -(i+1)
00082  *                   s  +  2       <= y                 (3)
00083  *                    i                i
00084  *
00085  *     The advantage of (3) is that s  and y  can be computed by 
00086  *                                i      i
00087  *     the following recurrence formula:
00088  *         if (3) is false
00089  *
00090  *         s     =  s  ,    y    = y   ;                (4)
00091  *          i+1      i              i+1    i
00092  *
00093  *         otherwise,
00094  *                         -i                     -(i+1)
00095  *         s    =  s  + 2  ,  y    = y  -  s  - 2              (5)
00096  *           i+1      i          i+1    i     i
00097  *                          
00098  *     One may easily use induction to prove (4) and (5). 
00099  *     Note. Since the left hand side of (3) contain only i+2 bits,
00100  *           it does not necessary to do a full (53-bit) comparison 
00101  *           in (3).
00102  *   3. Final rounding
00103  *     After generating the 53 bits result, we compute one more bit.
00104  *     Together with the remainder, we can decide whether the
00105  *     result is exact, bigger than 1/2ulp, or less than 1/2ulp
00106  *     (it will never equal to 1/2ulp).
00107  *     The rounding mode can be detected by checking whether
00108  *     huge + tiny is equal to huge, and whether huge - tiny is
00109  *     equal to huge for some floating point number "huge" and "tiny".
00110  *            
00111  * Special cases:
00112  *     sqrt(+-0) = +-0      ... exact
00113  *     sqrt(inf) = inf
00114  *     sqrt(-ve) = NaN             ... with invalid signal
00115  *     sqrt(NaN) = NaN             ... with invalid signal for signaling NaN
00116  *
00117  * Other methods : see the appended file at the end of the program below.
00118  *---------------
00119  */
00120 
00121 #include "fdlibm.h"
00122 
00123 #if defined(_MSC_VER)
00124 /* Microsoft Compiler */
00125 #pragma warning( disable : 4723 ) /* disables potential divide by 0 warning */
00126 #endif
00127 
00128 #ifdef __STDC__
00129 static const double  one    = 1.0, tiny=1.0e-300;
00130 #else
00131 static double one    = 1.0, tiny=1.0e-300;
00132 #endif
00133 
00134 #ifdef __STDC__
00135        double __ieee754_sqrt(double x)
00136 #else
00137        double __ieee754_sqrt(x)
00138        double x;
00139 #endif
00140 {
00141         fd_twoints u;
00142        double z;
00143        int    sign = (int)0x80000000; 
00144        unsigned r,t1,s1,ix1,q1;
00145        int ix0,s0,q,m,t,i;
00146 
00147         u.d = x;
00148        ix0 = __HI(u);                     /* high word of x */
00149        ix1 = __LO(u);              /* low word of x */
00150 
00151     /* take care of Inf and NaN */
00152        if((ix0&0x7ff00000)==0x7ff00000) {               
00153            return x*x+x;           /* sqrt(NaN)=NaN, sqrt(+inf)=+inf
00154                                       sqrt(-inf)=sNaN */
00155        } 
00156     /* take care of zero */
00157        if(ix0<=0) {
00158            if(((ix0&(~sign))|ix1)==0) return x;/* sqrt(+-0) = +-0 */
00159            else if(ix0<0)
00160               return (x-x)/(x-x);         /* sqrt(-ve) = sNaN */
00161        }
00162     /* normalize x */
00163        m = (ix0>>20);
00164        if(m==0) {                         /* subnormal x */
00165            while(ix0==0) {
00166               m -= 21;
00167               ix0 |= (ix1>>11); ix1 <<= 21;
00168            }
00169            for(i=0;(ix0&0x00100000)==0;i++) ix0<<=1;
00170            m -= i-1;
00171            ix0 |= (ix1>>(32-i));
00172            ix1 <<= i;
00173        }
00174        m -= 1023;    /* unbias exponent */
00175        ix0 = (ix0&0x000fffff)|0x00100000;
00176        if(m&1){      /* odd m, double x to make it even */
00177            ix0 += ix0 + ((ix1&sign)>>31);
00178            ix1 += ix1;
00179        }
00180        m >>= 1;      /* m = [m/2] */
00181 
00182     /* generate sqrt(x) bit by bit */
00183        ix0 += ix0 + ((ix1&sign)>>31);
00184        ix1 += ix1;
00185        q = q1 = s0 = s1 = 0;       /* [q,q1] = sqrt(x) */
00186        r = 0x00200000;             /* r = moving bit from right to left */
00187 
00188        while(r!=0) {
00189            t = s0+r; 
00190            if(t<=ix0) { 
00191               s0   = t+r; 
00192               ix0 -= t; 
00193               q   += r; 
00194            } 
00195            ix0 += ix0 + ((ix1&sign)>>31);
00196            ix1 += ix1;
00197            r>>=1;
00198        }
00199 
00200        r = sign;
00201        while(r!=0) {
00202            t1 = s1+r; 
00203            t  = s0;
00204            if((t<ix0)||((t==ix0)&&(t1<=ix1))) { 
00205               s1  = t1+r;
00206               if(((int)(t1&sign)==sign)&&(s1&sign)==0) s0 += 1;
00207               ix0 -= t;
00208               if (ix1 < t1) ix0 -= 1;
00209               ix1 -= t1;
00210               q1  += r;
00211            }
00212            ix0 += ix0 + ((ix1&sign)>>31);
00213            ix1 += ix1;
00214            r>>=1;
00215        }
00216 
00217     /* use floating add to find out rounding direction */
00218        if((ix0|ix1)!=0) {
00219            z = one-tiny; /* trigger inexact flag */
00220            if (z>=one) {
00221                z = one+tiny;
00222                if (q1==(unsigned)0xffffffff) { q1=0; q += 1;}
00223               else if (z>one) {
00224                   if (q1==(unsigned)0xfffffffe) q+=1;
00225                   q1+=2; 
00226               } else
00227                    q1 += (q1&1);
00228            }
00229        }
00230        ix0 = (q>>1)+0x3fe00000;
00231        ix1 =  q1>>1;
00232        if ((q&1)==1) ix1 |= sign;
00233        ix0 += (m <<20);
00234         u.d = z;
00235        __HI(u) = ix0;
00236        __LO(u) = ix1;
00237         z = u.d;
00238        return z;
00239 }
00240 
00241 /*
00242 Other methods  (use floating-point arithmetic)
00243 -------------
00244 (This is a copy of a drafted paper by Prof W. Kahan 
00245 and K.C. Ng, written in May, 1986)
00246 
00247        Two algorithms are given here to implement sqrt(x) 
00248        (IEEE double precision arithmetic) in software.
00249        Both supply sqrt(x) correctly rounded. The first algorithm (in
00250        Section A) uses newton iterations and involves four divisions.
00251        The second one uses reciproot iterations to avoid division, but
00252        requires more multiplications. Both algorithms need the ability
00253        to chop results of arithmetic operations instead of round them, 
00254        and the INEXACT flag to indicate when an arithmetic operation
00255        is executed exactly with no roundoff error, all part of the 
00256        standard (IEEE 754-1985). The ability to perform shift, add,
00257        subtract and logical AND operations upon 32-bit words is needed
00258        too, though not part of the standard.
00259 
00260 A.  sqrt(x) by Newton Iteration
00261 
00262    (1) Initial approximation
00263 
00264        Let x0 and x1 be the leading and the trailing 32-bit words of
00265        a floating point number x (in IEEE double format) respectively 
00266 
00267            1    11               52                              ...widths
00268           ------------------------------------------------------
00269        x: |s|   e     |           f                            |
00270           ------------------------------------------------------
00271              msb    lsb  msb                                  lsb ...order
00272 
00273  
00274             ------------------------           ------------------------
00275        x0:  |s|   e    |    f1     |       x1: |          f2           |
00276             ------------------------           ------------------------
00277 
00278        By performing shifts and subtracts on x0 and x1 (both regarded
00279        as integers), we obtain an 8-bit approximation of sqrt(x) as
00280        follows.
00281 
00282               k  := (x0>>1) + 0x1ff80000;
00283               y0 := k - T1[31&(k>>15)].   ... y ~ sqrt(x) to 8 bits
00284        Here k is a 32-bit integer and T1[] is an integer array containing
00285        correction terms. Now magically the floating value of y (y's
00286        leading 32-bit word is y0, the value of its trailing word is 0)
00287        approximates sqrt(x) to almost 8-bit.
00288 
00289        Value of T1:
00290        static int T1[32]= {
00291        0,     1024,  3062,  5746,  9193,  13348, 18162, 23592,
00292        29598, 36145, 43202, 50740, 58733, 67158, 75992, 85215,
00293        83599, 71378, 60428, 50647, 41945, 34246, 27478, 21581,
00294        16499, 12183, 8588,  5674,  3403,  1742,  661,   130,};
00295 
00296     (2)       Iterative refinement
00297 
00298        Apply Heron's rule three times to y, we have y approximates 
00299        sqrt(x) to within 1 ulp (Unit in the Last Place):
00300 
00301               y := (y+x/y)/2              ... almost 17 sig. bits
00302               y := (y+x/y)/2              ... almost 35 sig. bits
00303               y := y-(y-x/y)/2     ... within 1 ulp
00304 
00305 
00306        Remark 1.
00307            Another way to improve y to within 1 ulp is:
00308 
00309               y := (y+x/y)         ... almost 17 sig. bits to 2*sqrt(x)
00310               y := y - 0x00100006  ... almost 18 sig. bits to sqrt(x)
00311 
00312                             2
00313                          (x-y )*y
00314               y := y + 2* ----------      ...within 1 ulp
00315                             2
00316                           3y  + x
00317 
00318 
00319        This formula has one division fewer than the one above; however,
00320        it requires more multiplications and additions. Also x must be
00321        scaled in advance to avoid spurious overflow in evaluating the
00322        expression 3y*y+x. Hence it is not recommended uless division
00323        is slow. If division is very slow, then one should use the 
00324        reciproot algorithm given in section B.
00325 
00326     (3) Final adjustment
00327 
00328        By twiddling y's last bit it is possible to force y to be 
00329        correctly rounded according to the prevailing rounding mode
00330        as follows. Let r and i be copies of the rounding mode and
00331        inexact flag before entering the square root program. Also we
00332        use the expression y+-ulp for the next representable floating
00333        numbers (up and down) of y. Note that y+-ulp = either fixed
00334        point y+-1, or multiply y by nextafter(1,+-inf) in chopped
00335        mode.
00336 
00337               I := FALSE;   ... reset INEXACT flag I
00338               R := RZ;      ... set rounding mode to round-toward-zero
00339               z := x/y;     ... chopped quotient, possibly inexact
00340               If(not I) then {     ... if the quotient is exact
00341                   if(z=y) {
00342                       I := i;       ... restore inexact flag
00343                       R := r;  ... restore rounded mode
00344                       return sqrt(x):=y.
00345                   } else {
00346                      z := z - ulp; ... special rounding
00347                   }
00348               }
00349               i := TRUE;           ... sqrt(x) is inexact
00350               If (r=RN) then z=z+ulp      ... rounded-to-nearest
00351               If (r=RP) then {     ... round-toward-+inf
00352                   y = y+ulp; z=z+ulp;
00353               }
00354               y := y+z;            ... chopped sum
00355               y0:=y0-0x00100000;   ... y := y/2 is correctly rounded.
00356                I := i;                    ... restore inexact flag
00357                R := r;             ... restore rounded mode
00358                return sqrt(x):=y.
00359                   
00360     (4)       Special cases
00361 
00362        Square root of +inf, +-0, or NaN is itself;
00363        Square root of a negative number is NaN with invalid signal.
00364 
00365 
00366 B.  sqrt(x) by Reciproot Iteration
00367 
00368    (1) Initial approximation
00369 
00370        Let x0 and x1 be the leading and the trailing 32-bit words of
00371        a floating point number x (in IEEE double format) respectively
00372        (see section A). By performing shifs and subtracts on x0 and y0,
00373        we obtain a 7.8-bit approximation of 1/sqrt(x) as follows.
00374 
00375            k := 0x5fe80000 - (x0>>1);
00376            y0:= k - T2[63&(k>>14)].       ... y ~ 1/sqrt(x) to 7.8 bits
00377 
00378        Here k is a 32-bit integer and T2[] is an integer array 
00379        containing correction terms. Now magically the floating
00380        value of y (y's leading 32-bit word is y0, the value of
00381        its trailing word y1 is set to zero) approximates 1/sqrt(x)
00382        to almost 7.8-bit.
00383 
00384        Value of T2:
00385        static int T2[64]= {
00386        0x1500,       0x2ef8,       0x4d67,       0x6b02,       0x87be,       0xa395,       0xbe7a,       0xd866,
00387        0xf14a,       0x1091b,0x11fcd,0x13552,0x14999,0x15c98,0x16e34,0x17e5f,
00388        0x18d03,0x19a01,0x1a545,0x1ae8a,0x1b5c4,0x1bb01,0x1bfde,0x1c28d,
00389        0x1c2de,0x1c0db,0x1ba73,0x1b11c,0x1a4b5,0x1953d,0x18266,0x16be0,
00390        0x1683e,0x179d8,0x18a4d,0x19992,0x1a789,0x1b445,0x1bf61,0x1c989,
00391        0x1d16d,0x1d77b,0x1dddf,0x1e2ad,0x1e5bf,0x1e6e8,0x1e654,0x1e3cd,
00392        0x1df2a,0x1d635,0x1cb16,0x1be2c,0x1ae4e,0x19bde,0x1868e,0x16e2e,
00393        0x1527f,0x1334a,0x11051,0xe951,    0xbe01,       0x8e0d,       0x5924,       0x1edd,};
00394 
00395     (2)       Iterative refinement
00396 
00397        Apply Reciproot iteration three times to y and multiply the
00398        result by x to get an approximation z that matches sqrt(x)
00399        to about 1 ulp. To be exact, we will have 
00400               -1ulp < sqrt(x)-z<1.0625ulp.
00401        
00402        ... set rounding mode to Round-to-nearest
00403           y := y*(1.5-0.5*x*y*y)   ... almost 15 sig. bits to 1/sqrt(x)
00404           y := y*((1.5-2^-30)+0.5*x*y*y)... about 29 sig. bits to 1/sqrt(x)
00405        ... special arrangement for better accuracy
00406           z := x*y                 ... 29 bits to sqrt(x), with z*y<1
00407           z := z + 0.5*z*(1-z*y)   ... about 1 ulp to sqrt(x)
00408 
00409        Remark 2. The constant 1.5-2^-30 is chosen to bias the error so that
00410        (a) the term z*y in the final iteration is always less than 1; 
00411        (b) the error in the final result is biased upward so that
00412               -1 ulp < sqrt(x) - z < 1.0625 ulp
00413            instead of |sqrt(x)-z|<1.03125ulp.
00414 
00415     (3)       Final adjustment
00416 
00417        By twiddling y's last bit it is possible to force y to be 
00418        correctly rounded according to the prevailing rounding mode
00419        as follows. Let r and i be copies of the rounding mode and
00420        inexact flag before entering the square root program. Also we
00421        use the expression y+-ulp for the next representable floating
00422        numbers (up and down) of y. Note that y+-ulp = either fixed
00423        point y+-1, or multiply y by nextafter(1,+-inf) in chopped
00424        mode.
00425 
00426        R := RZ;             ... set rounding mode to round-toward-zero
00427        switch(r) {
00428            case RN:         ... round-to-nearest
00429               if(x<= z*(z-ulp)...chopped) z = z - ulp; else
00430               if(x<= z*(z+ulp)...chopped) z = z; else z = z+ulp;
00431               break;
00432            case RZ:case RM: ... round-to-zero or round-to--inf
00433               R:=RP;        ... reset rounding mod to round-to-+inf
00434               if(x<z*z ... rounded up) z = z - ulp; else
00435               if(x>=(z+ulp)*(z+ulp) ...rounded up) z = z+ulp;
00436               break;
00437            case RP:         ... round-to-+inf
00438               if(x>(z+ulp)*(z+ulp)...chopped) z = z+2*ulp; else
00439               if(x>z*z ...chopped) z = z+ulp;
00440               break;
00441        }
00442 
00443        Remark 3. The above comparisons can be done in fixed point. For
00444        example, to compare x and w=z*z chopped, it suffices to compare
00445        x1 and w1 (the trailing parts of x and w), regarding them as
00446        two's complement integers.
00447 
00448        ...Is z an exact square root?
00449        To determine whether z is an exact square root of x, let z1 be the
00450        trailing part of z, and also let x0 and x1 be the leading and
00451        trailing parts of x.
00452 
00453        If ((z1&0x03ffffff)!=0)     ... not exact if trailing 26 bits of z!=0
00454            I := 1;          ... Raise Inexact flag: z is not exact
00455        else {
00456            j := 1 - [(x0>>20)&1]   ... j = logb(x) mod 2
00457            k := z1 >> 26;          ... get z's 25-th and 26-th 
00458                                        fraction bits
00459            I := i or (k&j) or ((k&(j+j+1))!=(x1&3));
00460        }
00461        R:= r         ... restore rounded mode
00462        return sqrt(x):=z.
00463 
00464        If multiplication is cheaper then the foregoing red tape, the 
00465        Inexact flag can be evaluated by
00466 
00467            I := i;
00468            I := (z*z!=x) or I.
00469 
00470        Note that z*z can overwrite I; this value must be sensed if it is 
00471        True.
00472 
00473        Remark 4. If z*z = x exactly, then bit 25 to bit 0 of z1 must be
00474        zero.
00475 
00476                   --------------------
00477               z1: |        f2        | 
00478                   --------------------
00479               bit 31           bit 0
00480 
00481        Further more, bit 27 and 26 of z1, bit 0 and 1 of x1, and the odd
00482        or even of logb(x) have the following relations:
00483 
00484        -------------------------------------------------
00485        bit 27,26 of z1             bit 1,0 of x1 logb(x)
00486        -------------------------------------------------
00487        00                   00            odd and even
00488        01                   01            even
00489        10                   10            odd
00490        10                   00            even
00491        11                   01            even
00492        -------------------------------------------------
00493 
00494     (4)       Special cases (see (4) of Section A).     
00495  
00496  */
00497