Real - Java floating point library for MIDP devices

Current version: 1.13
Copyright: 2003-2009 Roar Lauritzsen, roarl at users.sourceforge.net
Availability: The library is available under the GPL license, or for a fee for commercial use.
Develop: See http://sourceforge.net/projects/real-java
Feedback: Comments, reports of use, bug reports, bug fixes, improvements, performance tests and comparisons to similar libraries are all welcome.
Javadoc: Read the full documentation here.

"Real.java" is a self-contained, single-file Java floating point emulation library for MIDP devices, such as a Java-enabled cell phone or PDA. The MIDP/CLDC 1.0 standard does not support the basic floating point types "float" or "double", so when I set out to program a scientific calculator for my cell phone, I had to reinvent floating point arithmetic from scratch, using only integer basic types. I found it a very interesting and educational challenge.

Download

For a pure Java version, use the following link: http://real-java.sourceforge.net/Real.java
This file is produced from the file http://real-java.sourceforge.net/Real.jpp using the C preprocessor like this:

   cpp -C -P -DDO_INLINE -DPACKAGE=ral -o Real.java Real.jpp

If you want more human-readable, but about 30% slower, Java code *), don't define the DO_INLINE macro when preprocessing. The size of the executable code won't change noticeably.

*) Most MIDP devices don't do just-in-time inlining, so this has to be done by the use of preprocessor macros.

Features

Precise: The internal mantissa is 63 bits wide, which corresponds to approximately 19 decimal digits of accuracy. All operations are performed with maximum precision: basic operations are all precise to 63 bits and correctly rounded, and more advanced functions have error bounds of a few ULPs (Units in the Last Place), except the erfc function, which is difficult to get accurate beyond 44 bits.
Wide range: The internal exponent is 31 bits wide, which allows for numbers up to 4.197·10^323228496.
Fast: Attention has been given especially to execution speed, while at the same time keeping the code as simple as possible to reduce footprint. All advanced operations are implemented using established and often improved algorithms with rapid convergence. Performance is nevertheless several orders of magnitude slower than normal integer performance, but adequate for quite complex calculator and spreadsheet functions on a typical MIDP device.
Small objects: One "Real" object uses 13 bytes of memory, plus normal "Object" overhead. Full-precision Reals can be packed in byte arrays with 12 bytes per number. Lower-precision formats allow saving "double" representations in longs and "float" representations in ints.
Small footprint: Obfuscated, the library compresses to about 15KB in a jar file. One important fact: A good obfuscator removes all unused methods and constants, which leaves you with just the amount of code that you actually need.
Clean: The library is contained in a single file and a single class (plus one inner class).
Easy-to-use: To encourage object reuse and minimize garbage production, the interface mimics x86 assembly instructions with destination and source operands, e.g. the assembly instruction for adding b to a, "add a,b", becomes "a.add(b);". In Java, this corresponds to writing expressions using only the operators "+=", "-=", "*=", etc, in this case "a+=b;".
No garbage: No temporary values are allocated from the heap by any of the library functions, and in programs using the library most garbage production can be avoided by reuse of objects.
Abnormal numbers: Implements infinities and NaN following the IEEE754 logic.
Functions: Exhaustive set of mathematical functions, e.g. sqrt, sin, acosh, atan2, pow, gamma and erfc.
Time arithmetic: Conversion to/from DH.MS format using Gregorian calendar accurate since 1582.
Random generator: Advanced, arguably cryptographically strong pseudo-random generator.
String output: Comprehensive string output formatting including SCI/FIX/ENG formats and left/right justification, with configurable separators, precision and maximum width. HEX/OCT/BIN output works seamlessly with other formatting options and produces unambiguous two's-complement output of fractional numbers.
Tested: Most features have been thoroughly used and tested by a number of users in my Calc MIDP application. Furthermore, all arithmetic functions have been compared and calibrated against William Rossi's rossi.dfp.dfp at 40 decimal digits accuracy.
Fully documented: The complete javadoc covers all public fields and methods, outfitted whenever possible with equivalent double code, calculation error bounds and execution time relative to one addition.
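
The destination/source calling style described under "Easy-to-use" and "No garbage" can be sketched with a stand-in class. "Acc" below is hypothetical: it wraps a plain long so that the example runs without the library, and it only mirrors the calling convention (assign, add and mul mutate the receiver). It is not the Real class.

```java
// Hypothetical stand-in for illustration only; NOT the Real class.
final class Acc {
    long value;

    void assign(long v)    { value = v; }            // x = v
    void assign(Acc other) { value = other.value; }  // x = y
    void add(Acc other)    { value += other.value; } // x += y
    void mul(Acc other)    { value *= other.value; } // x *= y

    // Computes (a + b) * c into 'result' without allocating anything:
    //   result = a;  result += b;  result *= c;
    // 'result' must not be the same object as b or c.
    static void sumTimes(Acc result, Acc a, Acc b, Acc c) {
        result.assign(a);
        result.add(b);
        result.mul(c);
    }

    public static void main(String[] args) {
        Acc a = new Acc(), b = new Acc(), c = new Acc(), r = new Acc();
        a.assign(2);
        b.assign(3);
        c.assign(4);
        for (int i = 0; i < 3; i++) {
            sumTimes(r, a, b, c);    // 'r' is reused on every pass
        }
        System.out.println(r.value); // (2 + 3) * 4 = 20
    }
}
```

All four objects are allocated once, before the loop. The loop body itself creates no garbage.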

Speed comparison

The speed comparisons were performed on a Sony Ericsson T610. Other, possibly contradictory, results may be found on other hardware.

38 times faster than William Rossi's rossi.dfp.dfp (dfp accuracy set to a comparable 20 decimal digits).
13 times faster than Nick Henson's henson.midp.Float (additionally, Real is about twice as accurate in mul/div).

Runtime comparisons using henson.midp.FloatTest on Sony Ericsson T610 (R3C)
Library	sin, ms 100 times	cos, ms 100 times	tan, ms 100 times	add, ms 10000 times	mul, ms 10000 times	div, ms 10000 times	sqrt, ms 1000 times	avg. relative runtime
Real	425	420	855	3110	2920	4820	1660	« 1.0 »
dfp	23170	25220	50320	19460	26765	119280	83510	37.7
Float	8780	5055	14515	11385	30440	42120	34595	13.3
Real 1.08	720	645	1400	3145	2940	4750	2735	1.4
Real 1.06	1475	1260	2915	3000	3475	24960	3300	2.7

Sample program

public class Sample
{
  // Print out 2*PI in hexadecimal form

  public static void main(String [] args)
  {
    Real a = new Real("0.5");
    Real b = new Real();
    a.gamma();     // We all know that gamma(0.5) == sqrt(pi)
    a.sqr();
    b.assign(2);
    a.mul(b);

    Real.NumberFormat format = new Real.NumberFormat();
    format.base = 16;
    System.out.println(a.toString(format));
  }
}

Optimization tips

To keep numbers in a normalized state and to preserve the integrity of abnormal numbers, modifying the inner representation of a Real directly is discouraged. However, a number of tricks can be performed with the functions that are already present.

For maximum performance, keep the following tips in mind while programming:

Use integers: Whenever integers are involved, do as much calculation with the integers as possible (promote the integer to long, if necessary), before introducing the integer in the calculation. Additionally, using mul and div with integer arguments is faster than doing "assign(int)" first and then mul/div.
Use scalbn: Binary scaling can be used instead of multiplication and division when the factor is a power of 2. "scalbn" works on Reals the way left and right shifts work on integers, and is much faster than "mul" and "div".
Use mul instead of div: It is better to pre-calculate a reciprocal using "recip" outside a loop, and then use multiplication instead of division inside the loop. A division is about as slow as two multiplications.
Use sqr: "sqr" is slightly faster than using "mul" to multiply a number by itself.
Use pre-calculated constants: The pre-calculated constants are ready for use and may be more accurate than what you can calculate yourself. See warning.
Use functions with integer arguments: The most common operators have been overloaded to take integer arguments. This can reduce the number of temporary variables needed. For example, instead of 'tmp.assign(20); a.add(tmp);', you could do 'a.add(20);'.
Don't convert to String: Using Strings to store Reals is inefficient. Converting the internal binary representation into a decimal representation is time-consuming and may introduce inaccuracies. String conversion should only be used to present results to the human eye. Once a string representation of a number is generated, it should be cached for quick access, to avoid generating the string over and over again.
Don't convert from String: Assigning from Strings should be avoided in time-critical code. Converting from a decimal string representation to the internal binary representation is time-consuming. It may be quicker to calculate the number from pre-calculated constants and integers. For example, instead of 'a.assign("0.25");', you could do 'a.assign(Real.ONE); a.scalbn(-2);'.
Avoid "new": Allocating temporary Real objects inside loops should be avoided. For every "abandoned" object, a little garbage is left in memory. When the garbage adds up, the garbage collector is run to clean it up, and this causes the application to stop for a short period of time.
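
Two of the tips above (multiply by a pre-calculated reciprocal, and build powers of two by binary scaling) can be sketched with plain "double" arithmetic on a desktop JVM. This is only an analogy: "double" is not available on CLDC 1.0, and the class and method names are made up for the illustration. With Real, the corresponding calls would be "recip", "mul" and "scalbn".

```java
final class LoopTips {
    // Divides every element by 'divisor' using one division
    // and values.length multiplications.
    static void scaleAll(double[] values, double divisor) {
        double reciprocal = 1.0 / divisor;   // hoisted out of the loop
        for (int i = 0; i < values.length; i++) {
            values[i] *= reciprocal;
        }
    }

    // 0.25 built from 1 by a binary scale of -2,
    // without parsing a decimal string.
    static double quarter() {
        return Math.scalb(1.0, -2);
    }

    public static void main(String[] args) {
        double[] v = {8.0, 4.0, 2.0};
        scaleAll(v, 4.0);
        System.out.println(v[0] + " " + v[1] + " " + v[2]); // 2.0 1.0 0.5
        System.out.println(quarter());                      // 0.25
    }
}
```

Multiplying by a rounded reciprocal rounds twice, so the last bit may differ from a direct division when the divisor is not a power of two.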

Class interface summary

All functions are declared void unless the return type is specified.

Public fields:
  long mantissa
  int exponent
  byte sign

Constructors/assignment:
  Real()                                 <==  0
  Real(Real)                             <==  Real
  Real(int)                              <==  int
  Real(long)                             <==  long
  Real(String)                           <==  "-1.234e56"
  Real(String, int base)                 <==  "-1.234e56" / "/FFF3.2e-10"
  Real(int s, int e, long m)             <==  (-1)**s * 2**(e-62) * m
  Real(byte[] data, int offset)          <==  data[offset]..data[offset+11]
  assign(Real)
  assign(int)
  assign(long)
  assign(String)
  assign(String, int base)
  assign(int s, int e, long m)
  assign(byte[] data, int offset)
  assignFloatBits(int)                   <==  IEEE754 32-bits float format
  assignDoubleBits(long)                 <==  IEEE754 64-bits double format
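
The byte-array forms above use 12 bytes per number. One way such a packing could fit is sketched below, under an assumption that is NOT taken from the library: the sign bit shares a 32-bit word with the 31-bit exponent, followed by the 64-bit mantissa, all big-endian. The class and method names are hypothetical, and the real layout used by Real may differ.

```java
// Hypothetical 12-byte layout for illustration; not the library's format.
final class Pack12 {
    static final int SIZE = 12;

    // Bytes 0..3: sign in the top bit, 31-bit exponent below it.
    // Bytes 4..11: mantissa, most significant byte first.
    static void pack(int sign, int exponent, long mantissa,
                     byte[] data, int offset) {
        int head = (sign << 31) | (exponent & 0x7FFFFFFF);
        for (int i = 0; i < 4; i++) {
            data[offset + i] = (byte) (head >>> (24 - 8 * i));
        }
        for (int i = 0; i < 8; i++) {
            data[offset + 4 + i] = (byte) (mantissa >>> (56 - 8 * i));
        }
    }

    // Reads back the combined sign/exponent word.
    static int unpackHead(byte[] data, int offset) {
        int head = 0;
        for (int i = 0; i < 4; i++) {
            head = (head << 8) | (data[offset + i] & 0xFF);
        }
        return head;
    }

    // Reads back the 64-bit mantissa.
    static long unpackMantissa(byte[] data, int offset) {
        long m = 0;
        for (int i = 0; i < 8; i++) {
            m = (m << 8) | (data[offset + 4 + i] & 0xFFL);
        }
        return m;
    }

    public static void main(String[] args) {
        byte[] buf = new byte[2 * SIZE];          // room for two numbers
        pack(1, 5, 0x6000000000000000L, buf, SIZE); // stored in the second slot
        int head = unpackHead(buf, SIZE);
        System.out.println((head >>> 31) + " " + (head & 0x7FFFFFFF) + " "
                + Long.toHexString(unpackMantissa(buf, SIZE)));
        // prints: 1 5 6000000000000000
    }
}
```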

Output:
  String toString()                      ==>  "-1.234e56"
  String toString(int base)              ==>  "-1.234e56" / "03.FEe56"
  String toString(NumberFormat)          ==>  e.g. "-1'234'567,8900"
  int  toInteger()                       ==>  int
  long toLong()                          ==>  long
  void toBytes(byte[] data, int offset)  ==>  data[offset]..data[offset+11]
  int  toFloatBits()                     ==>  IEEE754 32-bits float format
  long toDoubleBits()                    ==>  IEEE754 64-bits double format
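
The IEEE754 bit formats mentioned above can be taken apart with integer operations only. The sketch below splits the bit pattern of a normal 64-bit double into a sign, exponent and mantissa that satisfy the (-1)**s * 2**(e-62) * m convention listed for Real(int s, int e, long m). It is a desktop-Java illustration with hypothetical names, not the library's own conversion code. It handles normal numbers only, and whether the library stores its exponent with a bias is not covered here.

```java
final class DoubleBits {
    // Returns { sign, exponent, mantissa } such that
    //   value = (-1)**sign * 2**(exponent-62) * mantissa
    // with the mantissa's highest set bit at bit 62.
    static long[] decode(long bits) {
        long sign = bits >>> 63;
        int field = (int) ((bits >>> 52) & 0x7FF);
        if (field == 0 || field == 0x7FF) {
            throw new IllegalArgumentException(
                    "zero, subnormal, infinity or NaN");
        }
        long fraction = bits & 0xFFFFFFFFFFFFFL;       // low 52 bits
        long mantissa = ((1L << 52) | fraction) << 10; // implicit 1 restored
        long exponent = field - 1023;                  // IEEE bias removed
        return new long[] { sign, exponent, mantissa };
    }

    public static void main(String[] args) {
        long[] p = decode(0x400921FB54442D18L); // bit pattern of Math.PI
        System.out.println(p[0] + " " + p[1] + " " + Long.toHexString(p[2]));
        // prints: 0 1 6487ed5110b46000
    }
}
```

The 53 significant bits of a double fit inside a 63-bit mantissa with 10 bits to spare, so this direction of the conversion loses no information.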


(Error bounds are calculated using William Rossi's rossi.dfp.dfp at
40 decimal digits accuracy. Error bounds may increase when results
approach zero or infinity. ULP = Unit in the Last Place. An error
bound of ½ ULP means that the result is correctly rounded. Relative
execution time is the average from running on SE T610 (R3C), K700i,
and Nokia 6230i.)
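
To make the ULP measure concrete, here is a minimal sketch of how an error in ULPs can be computed against a 40-digit reference. It uses "double" and java.math.BigDecimal on a desktop JVM as stand-ins for Real and rossi.dfp.dfp; the class and method names are made up for the illustration.

```java
import java.math.BigDecimal;
import java.math.MathContext;

final class UlpError {
    // Error of 'computed' against a high-precision 'reference',
    // expressed in ULPs of 'computed'.
    static double inUlps(double computed, BigDecimal reference) {
        BigDecimal exactValue = new BigDecimal(computed);    // exact binary value
        BigDecimal ulp = new BigDecimal(Math.ulp(computed)); // spacing of doubles here
        return exactValue.subtract(reference).abs()
                .divide(ulp, MathContext.DECIMAL64).doubleValue();
    }

    public static void main(String[] args) {
        MathContext forty = new MathContext(40);
        BigDecimal third = BigDecimal.ONE.divide(BigDecimal.valueOf(3), forty);
        // IEEE division is correctly rounded, so the error is at most 1/2 ULP;
        // for 1/3 it comes out at about 0.33 ULP.
        System.out.println(inUlps(1.0 / 3.0, third));
    }
}
```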

                                              Approx error  Execution time
                       "Explanation"          bound (ULPs)  (rel. to add)
Binary operators:
  x.add(Real y)       : x+=y                       ½         ««« 1.0 »»»
  sub(Real)           : x-=y                       ½             2.0
  mul(Real)           : x*=y                       ½             1.3
  div(Real)           : x/=y                       ½             2.6
  rdiv(Real)          : x=y/x                      ½             3.1
  mod(Real)           : x=x-y*floor(x/y)           0              27
  divf(Real)          : x=floor(x/y)               0              22
  and(Real)           : x&=y                       0             1.5
  or(Real)            : x|=y                       0             1.6
  xor(Real)           : x^=y                       0             1.5
  bic(Real)           : x&=~y                      0             1.5
  swap(Real)          : tmp=x, x=y, y=tmp          0             0.5
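
The formulas given for mod, divf and bic differ from Java's built-in integer operators in ways that are easy to miss. The sketch below follows the formulas literally, using long values as stand-ins for Reals. The class is hypothetical, and how the library treats special values is not shown here.

```java
final class FloorOps {
    // x - y*floor(x/y): floored modulo, the result takes the sign of y.
    static long mod(long x, long y) {
        return x - y * Math.floorDiv(x, y);
    }

    // floor(x/y): rounds toward negative infinity.
    static long divf(long x, long y) {
        return Math.floorDiv(x, y);
    }

    // x & ~y: clears in x every bit that is set in y.
    static long bic(long x, long y) {
        return x & ~y;
    }

    public static void main(String[] args) {
        System.out.println(mod(-7, 3));      // 2
        System.out.println(-7 % 3);          // -1 (Java's % truncates toward zero)
        System.out.println(divf(-7, 3));     // -3
        System.out.println(bic(0xFF, 0x0F)); // 240
    }
}
```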

Functions:
  x.abs()             : x=|x|                      0             0.2
  neg()               : x=-x                       0             0.2
  sqr()               : x=x*x                      ½             1.1
  recip()             : x=1/x                      ½             2.3
  sqrt()              : (square root)              1              19
  rsqrt()             : x=1/sqrt(x)                1              21
  cbrt()              : (cube root)                2              32
  exp()               : x=e**x                     1              31
  exp2()              : x=2**x                     1              27
  exp10()             : x=10**x                    1              31
  ln()                                             2              51
  log2()                                           1              51
  log10()                                          2              53
  sin()                                            1              28
  cos()                                            1              37
  tan()                                            2              70
  asin()                                           3              68
  acos()                                           2              67
  atan()                                           2              37
  sinh()                                           2              67
  cosh()                                           2              66
  tanh()                                           2              70
  asinh()                                          2              77
  acosh()                                          2              75
  atanh()                                          2              57
  fact()              : x=x!                      15           8-190
  gamma()             : x=(x-1)!                 100+            190
  erfc()              : (compl error func)     2**19         80-4900
  inverfc()           : (inverse erfc)         2**19        240-5100
  toDHMS()            : H -> yyyymmddHH.MMSS       ?              19
  fromDHMS()          : yyyymmddHH.MMSS -> H       ?              19
  time()              : x=HH.MMSS                  0             8.9
  date()              : x=yyyymmdd00               0              30
  random()            : random number [0,1)        -              81

Binary functions:
  hypot(Real)         : x=sqrt(x*x+y*y)            1              24
  y.atan2(Real x)     : y=atan(y/x)   (-pi,pi]     2              48
  pow(Real)           : x**=y                      2             110
  pow(int)            : x**=y                      ½              84
  nroot(Real)         : x**=1/y                    2             110

Integral values:
  floor()                                          0             0.5
  ceil()                                           0             1.8
  round()                                          0             1.3
  trunc()                                          0             1.2
  frac()                                           0             1.2

Utility functions:
  copysign(Real)      : x=|x|*y/|y|                0             0.2
  nextafter(Real)     : x+=(y-x)*epsilon           0             0.8
  scalbn(int)         : x<<=y                      0             0.3
  normalize()         : readjust mantissa          ½             0.7
  lowPow10()                                       0             3.6

Make special values:
  makeZero()                                       0             0.2
  makeZero(int sign)                               0             0.2
  makeInfinity(int sign)                           0             0.3
  makeNan()                                        0             0.3

Comparisons:
  boolean equalTo(Real)      : x==y                              1.0
  boolean notEqualTo(Real)   : x!=y                              1.0
  boolean lessThan(Real)     : x<y                               1.0
  boolean lessEqual(Real)    : x<=y                              1.0
  boolean greaterThan(Real)  : x>y                               1.0
  boolean greaterEqual(Real) : x>=y                              1.0
  boolean absLessThan(Real)  : |x|<|y|                           0.5

Query state:
  boolean isZero()                                               0.3
  boolean isInfinity()                                           0.3
  boolean isNan()                                                0.3
  boolean isFinite()                                             0.3
  boolean isFiniteNonZero()                                      0.3
  boolean isNegative()                                           0.3
  boolean isIntegral()                                           0.6
  boolean isOdd()                                                0.6

Overloaded methods, integer arguments:
  add(int)                                         ½             1.8
  sub(int)                                         ½             2.4
  mul(int)                                         ½             1.3
  div(int)                                         ½             3.4
  rdiv(int)                                        ½             3.9
  boolean equalTo(int)                                           1.7
  boolean notEqualTo(int)                                        1.7
  boolean lessThan(int)                                          1.7
  boolean lessEqual(int)                                         1.7
  boolean greaterThan(int)                                       1.7
  boolean greaterEqual(int)                                      1.7

Extended precision methods with 128-bit mantissa:
  add128                                      2**-62             2.0
  mul128                                      2**-60             3.1
  recip128                                    2**-60              17
  normalize128                                2**-64             0.7
  roundFrom128                                     ½             1.0

Other methods:
  assign(Real)                                     0             0.3
  assign(int)                                      0             0.6
  assign(long)                                     0             1.0
  assign(String, 2)                                0              54
  assign(String, 8)                                0              60
  assign(String, 10)                             ½-1             100
  assign(String, 16)                               0              60
  assign(int s, int e, long m)                     0             0.3
  assign(byte[] data, int offset)                  0             1.2
  assignFloatBits(int)                             0             0.6
  assignDoubleBits(long)                           0             0.6
  int  toInteger()                                               0.6
  long toLong()                                                  0.5
  String toString(2)                                             100
  String toString(8)                                             110
  String toString(10)                                            150
  String toString(16)                                            120
  void toBytes(byte[] data, int offset)                          1.2
  int  toFloatBits()                                             0.7
  long toDoubleBits()                                            0.7

Constants:
  ZERO     = 0
  ONE      = 1
  TWO      = 2
  THREE    = 3
  FIVE     = 5
  TEN      = 10
  HUNDRED  = 100
  HALF     = 1/2
  THIRD    = 1/3
  TENTH    = 1/10
  PERCENT  = 1/100
  SQRT2    = sqrt(2)
  SQRT1_2  = sqrt(1/2)
  PI2      = pi*2
  PI       = pi
  PI_2     = pi/2
  PI_4     = pi/4
  PI_8     = pi/8
  E        = e
  LN2      = ln(2)
  LN10     = ln(10)
  LOG2E    = log2(e)  = 1/ln(2)
  LOG10E   = log10(e) = 1/ln(10)
  MAX      = max non-infinite positive number = 4.197e323228496
  MIN      = min non-zero positive number     = 2.383e-323228497
  NAN      = not a number
  INF      = infinity
  INF_N    = -infinity
  ZERO_N   = -0
  ONE_N    = -1
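
The decimal figure quoted for MAX is consistent with a largest magnitude of roughly 2**(2**30). That reading is an inference from the numbers, not something stated above. The arithmetic can be checked on a desktop JVM with the following sketch (hypothetical class name):

```java
final class RangeCheck {
    // Writes 2**(2**30) as mantissa * 10**exponent
    // and returns { mantissa, exponent }.
    static double[] maxAsDecimal() {
        double log10 = 1073741824.0 * Math.log10(2.0); // 2**30 * log10(2)
        double exponent = Math.floor(log10);
        double mantissa = Math.pow(10.0, log10 - exponent);
        return new double[] { mantissa, exponent };
    }

    public static void main(String[] args) {
        double[] max = maxAsDecimal();
        // Mantissa close to 4.197, exponent 323228496.
        System.out.println(max[0] + " * 10^" + (long) max[1]);
    }
}
```

Taking the reciprocal gives about 2.383·10^-323228497, which agrees with the figure listed for MIN.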