Commit 65e32d1f authored by Benjamin Peterson's avatar Benjamin Peterson

merge heads

parents 1b1a8e7c 851a07e5
This diff is collapsed.
......@@ -8,9 +8,9 @@ Numeric and Mathematical Modules
The modules described in this chapter provide numeric and math-related functions
and data types. The :mod:`numbers` module defines an abstract hierarchy of
numeric types. The :mod:`math` and :mod:`cmath` modules contain various
mathematical functions for floating-point and complex numbers. For users more
interested in decimal accuracy than in speed, the :mod:`decimal` module supports
exact representations of decimal numbers.
mathematical functions for floating-point and complex numbers. The :mod:`decimal`
module supports exact representations of decimal numbers, using arbitrary precision
arithmetic.
The following modules are documented in this chapter:
......
......@@ -596,6 +596,93 @@ curses
(Contributed by Iñigo Serna in :issue:`6755`)
decimal
-------
:issue:`7652` - integrate fast native decimal arithmetic.
C-module and libmpdec written by Stefan Krah.
The new C version of the decimal module integrates the high speed libmpdec
library for arbitrary precision correctly-rounded decimal arithmetic.
libmpdec conforms to IBM's General Decimal Arithmetic Specification.
Performance gains range from 12x for database applications to 80x for
numerically intensive applications:
+---------+-------------+--------------+-------------+
| | decimal.py | _decimal | speedup |
+=========+=============+==============+=============+
| pi | 42.75s | 0.58s | 74x |
+---------+-------------+--------------+-------------+
| telco | 172.19s | 5.68s | 30x |
+---------+-------------+--------------+-------------+
| psycopg | 3.57s | 0.29s | 12x |
+---------+-------------+--------------+-------------+
Features
~~~~~~~~
* The :exc:`~decimal.FloatOperation` signal optionally enables stricter
semantics for mixing floats and Decimals.
* If Python is compiled without threads, the C version automatically
disables the expensive thread local context machinery. In this case,
the variable :data:`~decimal.HAVE_THREADS` is set to False.
API changes
~~~~~~~~~~~
* The C module has the following context limits, depending on the machine
architecture:
+-------------------+---------------------+------------------------------+
| | 32-bit | 64-bit |
+===================+=====================+==============================+
| :const:`MAX_PREC` | :const:`425000000` | :const:`999999999999999999` |
+-------------------+---------------------+------------------------------+
| :const:`MAX_EMAX` | :const:`425000000` | :const:`999999999999999999` |
+-------------------+---------------------+------------------------------+
| :const:`MIN_EMIN` | :const:`-425000000` | :const:`-999999999999999999` |
+-------------------+---------------------+------------------------------+
* In the context templates (:class:`~decimal.DefaultContext`,
:class:`~decimal.BasicContext` and :class:`~decimal.ExtendedContext`)
the magnitude of :attr:`~decimal.Context.Emax` and
:attr:`~decimal.Context.Emin` has changed to :const:`999999`.
* The :class:`~decimal.Decimal` constructor in decimal.py does not observe
the context limits and converts values with arbitrary exponents or precision
exactly. Since the C version has internal limits, the following scheme is
used: If possible, values are converted exactly, otherwise
:exc:`~decimal.InvalidOperation` is raised and the result is NaN. In the
latter case it is always possible to use :meth:`~decimal.Context.create_decimal`
in order to obtain a rounded or inexact value.
* The power function in decimal.py is always correctly-rounded. In the
C version, it is defined in terms of the correctly-rounded
:meth:`~decimal.Decimal.exp` and :meth:`~decimal.Decimal.ln` functions,
but the final result is only "almost always correctly rounded".
* In the C version, the context dictionary containing the signals is a
:class:`~collections.abc.MutableMapping`. For speed reasons,
:attr:`~decimal.Context.flags` and :attr:`~decimal.Context.traps` always
refer to the same :class:`~collections.abc.MutableMapping` that the context
was initialized with. If a new signal dictionary is assigned,
:attr:`~decimal.Context.flags` and :attr:`~decimal.Context.traps`
are updated with the new values, but they do not reference the RHS
dictionary.
* Pickling a :class:`~decimal.Context` produces a different output in order
to have a common interchange format for the Python and C versions.
* The order of arguments in the :class:`~decimal.Context` constructor has been
changed to match the order displayed by :func:`repr`.
faulthandler
------------
......
......@@ -6,7 +6,7 @@ extern "C" {
#endif
/* This is published for the benefit of "friend" marshal.c only. */
/* This is published for the benefit of "friends" marshal.c and _decimal.c. */
/* Parameters of the long integer representation. There are two different
sets of parameters: one set for 30-bit digits, stored in an unsigned 32-bit
......
This diff is collapsed.
What's New in IDLE 3.2.3?
=========================
- Issue #3573: IDLE hangs when passing invalid command line args
(directory(ies) instead of file(s)).
What's New in IDLE 3.2.1?
=========================
......
......@@ -1403,8 +1403,10 @@ def main():
if enable_edit:
if not (cmd or script):
for filename in args:
flist.open(filename)
for filename in args[:]:
if flist.open(filename) is None:
# filename is a directory actually, disconsider it
args.remove(filename)
if not args:
flist.new()
if enable_shell:
......
......@@ -374,6 +374,10 @@ class SMTPChannel(asynchat.async_chat):
return address
def smtp_MAIL(self, arg):
if not self.seen_greeting:
self.push('503 Error: send HELO first');
return
print('===> MAIL', arg, file=DEBUGSTREAM)
address = self.__getaddr('FROM:', arg) if arg else None
if not address:
......@@ -387,6 +391,10 @@ class SMTPChannel(asynchat.async_chat):
self.push('250 Ok')
def smtp_RCPT(self, arg):
if not self.seen_greeting:
self.push('503 Error: send HELO first');
return
print('===> RCPT', arg, file=DEBUGSTREAM)
if not self.mailfrom:
self.push('503 Error: need MAIL command')
......@@ -411,6 +419,10 @@ class SMTPChannel(asynchat.async_chat):
self.push('250 Ok')
def smtp_DATA(self, arg):
if not self.seen_greeting:
self.push('503 Error: send HELO first');
return
if not self.rcpttos:
self.push('503 Error: need RCPT command')
return
......
......@@ -1416,7 +1416,7 @@ def run_unittest(*classes):
#=======================================================================
# doctest driver.
def run_doctest(module, verbosity=None):
def run_doctest(module, verbosity=None, optionflags=0):
"""Run doctest on the given module. Return (#failures, #tests).
If optional argument verbosity is not specified (or is None), pass
......@@ -1431,7 +1431,7 @@ def run_doctest(module, verbosity=None):
else:
verbosity = None
f, t = doctest.testmod(module, verbose=verbosity)
f, t = doctest.testmod(module, verbose=verbosity, optionflags=optionflags)
if f:
raise TestFailed("%d of %d doctests failed" % (f, t))
if verbose:
......
This diff is collapsed.
......@@ -401,14 +401,10 @@ class FractionTest(unittest.TestCase):
def testMixingWithDecimal(self):
# Decimal refuses mixed arithmetic (but not mixed comparisons)
self.assertRaisesMessage(
TypeError,
"unsupported operand type(s) for +: 'Fraction' and 'Decimal'",
operator.add, F(3,11), Decimal('3.1415926'))
self.assertRaisesMessage(
TypeError,
"unsupported operand type(s) for +: 'Decimal' and 'Fraction'",
operator.add, Decimal('3.1415926'), F(3,11))
self.assertRaises(TypeError, operator.add,
F(3,11), Decimal('3.1415926'))
self.assertRaises(TypeError, operator.add,
Decimal('3.1415926'), F(3,11))
def testComparisons(self):
self.assertTrue(F(1, 2) < F(2, 3))
......
......@@ -150,7 +150,7 @@ class ComparisonTest(unittest.TestCase):
# int, float, Fraction, Decimal
test_values = [
float('-inf'),
D('-1e999999999'),
D('-1e425000000'),
-1e308,
F(-22, 7),
-3.14,
......
......@@ -39,6 +39,7 @@ class SMTPDServerTest(TestCase):
channel.socket.queue_recv(line)
channel.handle_read()
write_line(b'HELO test.example')
write_line(b'MAIL From:eggs@example')
write_line(b'RCPT To:spam@example')
write_line(b'DATA')
......@@ -104,6 +105,11 @@ class SMTPDChannelTest(TestCase):
self.write_line(b'NOOP')
self.assertEqual(self.channel.socket.last, b'250 Ok\r\n')
def test_HELO_NOOP(self):
self.write_line(b'HELO example')
self.write_line(b'NOOP')
self.assertEqual(self.channel.socket.last, b'250 Ok\r\n')
def test_NOOP_bad_syntax(self):
self.write_line(b'NOOP hi')
self.assertEqual(self.channel.socket.last,
......@@ -113,17 +119,23 @@ class SMTPDChannelTest(TestCase):
self.write_line(b'QUIT')
self.assertEqual(self.channel.socket.last, b'221 Bye\r\n')
def test_HELO_QUIT(self):
self.write_line(b'HELO example')
self.write_line(b'QUIT')
self.assertEqual(self.channel.socket.last, b'221 Bye\r\n')
def test_QUIT_arg_ignored(self):
self.write_line(b'QUIT bye bye')
self.assertEqual(self.channel.socket.last, b'221 Bye\r\n')
def test_bad_state(self):
self.channel.smtp_state = 'BAD STATE'
self.write_line(b'HELO')
self.write_line(b'HELO example')
self.assertEqual(self.channel.socket.last,
b'451 Internal confusion\r\n')
def test_command_too_long(self):
self.write_line(b'HELO example')
self.write_line(b'MAIL from ' +
b'a' * self.channel.command_size_limit +
b'@example')
......@@ -133,6 +145,7 @@ class SMTPDChannelTest(TestCase):
def test_data_too_long(self):
# Small hack. Setting limit to 2K octets here will save us some time.
self.channel.data_size_limit = 2048
self.write_line(b'HELO example')
self.write_line(b'MAIL From:eggs@example')
self.write_line(b'RCPT To:spam@example')
self.write_line(b'DATA')
......@@ -142,43 +155,61 @@ class SMTPDChannelTest(TestCase):
b'552 Error: Too much mail data\r\n')
def test_need_MAIL(self):
self.write_line(b'HELO example')
self.write_line(b'RCPT to:spam@example')
self.assertEqual(self.channel.socket.last,
b'503 Error: need MAIL command\r\n')
def test_MAIL_syntax(self):
self.write_line(b'HELO example')
self.write_line(b'MAIL from eggs@example')
self.assertEqual(self.channel.socket.last,
b'501 Syntax: MAIL FROM:<address>\r\n')
def test_MAIL_missing_from(self):
self.write_line(b'HELO example')
self.write_line(b'MAIL from:')
self.assertEqual(self.channel.socket.last,
b'501 Syntax: MAIL FROM:<address>\r\n')
def test_MAIL_chevrons(self):
self.write_line(b'HELO example')
self.write_line(b'MAIL from:<eggs@example>')
self.assertEqual(self.channel.socket.last, b'250 Ok\r\n')
def test_nested_MAIL(self):
self.write_line(b'HELO example')
self.write_line(b'MAIL from:eggs@example')
self.write_line(b'MAIL from:spam@example')
self.assertEqual(self.channel.socket.last,
b'503 Error: nested MAIL command\r\n')
def test_no_HELO_MAIL(self):
self.write_line(b'MAIL from:<foo@example.com>')
self.assertEqual(self.channel.socket.last,
b'503 Error: send HELO first\r\n')
def test_need_RCPT(self):
self.write_line(b'HELO example')
self.write_line(b'MAIL From:eggs@example')
self.write_line(b'DATA')
self.assertEqual(self.channel.socket.last,
b'503 Error: need RCPT command\r\n')
def test_RCPT_syntax(self):
self.write_line(b'HELO example')
self.write_line(b'MAIL From:eggs@example')
self.write_line(b'RCPT to eggs@example')
self.assertEqual(self.channel.socket.last,
b'501 Syntax: RCPT TO: <address>\r\n')
def test_no_HELO_RCPT(self):
self.write_line(b'RCPT to eggs@example')
self.assertEqual(self.channel.socket.last,
b'503 Error: send HELO first\r\n')
def test_data_dialog(self):
self.write_line(b'HELO example')
self.write_line(b'MAIL From:eggs@example')
self.assertEqual(self.channel.socket.last, b'250 Ok\r\n')
self.write_line(b'RCPT To:spam@example')
......@@ -193,12 +224,19 @@ class SMTPDChannelTest(TestCase):
[('peer', 'eggs@example', ['spam@example'], 'data\nmore')])
def test_DATA_syntax(self):
self.write_line(b'HELO example')
self.write_line(b'MAIL From:eggs@example')
self.write_line(b'RCPT To:spam@example')
self.write_line(b'DATA spam')
self.assertEqual(self.channel.socket.last, b'501 Syntax: DATA\r\n')
def test_no_HELO_DATA(self):
self.write_line(b'DATA spam')
self.assertEqual(self.channel.socket.last,
b'503 Error: send HELO first\r\n')
def test_data_transparency_section_4_5_2(self):
self.write_line(b'HELO example')
self.write_line(b'MAIL From:eggs@example')
self.write_line(b'RCPT To:spam@example')
self.write_line(b'DATA')
......@@ -206,6 +244,7 @@ class SMTPDChannelTest(TestCase):
self.assertEqual(self.channel.received_data, '.')
def test_multiple_RCPT(self):
self.write_line(b'HELO example')
self.write_line(b'MAIL From:eggs@example')
self.write_line(b'RCPT To:spam@example')
self.write_line(b'RCPT To:ham@example')
......@@ -216,6 +255,7 @@ class SMTPDChannelTest(TestCase):
def test_manual_status(self):
# checks that the Channel is able to return a custom status message
self.write_line(b'HELO example')
self.write_line(b'MAIL From:eggs@example')
self.write_line(b'RCPT To:spam@example')
self.write_line(b'DATA')
......@@ -223,6 +263,7 @@ class SMTPDChannelTest(TestCase):
self.assertEqual(self.channel.socket.last, b'250 Okish\r\n')
def test_RSET(self):
self.write_line(b'HELO example')
self.write_line(b'MAIL From:eggs@example')
self.write_line(b'RCPT To:spam@example')
self.write_line(b'RSET')
......@@ -234,6 +275,11 @@ class SMTPDChannelTest(TestCase):
self.assertEqual(self.server.messages,
[('peer', 'foo@example', ['eggs@example'], 'data')])
def test_HELO_RSET(self):
self.write_line(b'HELO example')
self.write_line(b'RSET')
self.assertEqual(self.channel.socket.last, b'250 Ok\r\n')
def test_RSET_syntax(self):
self.write_line(b'RSET hi')
self.assertEqual(self.channel.socket.last, b'501 Syntax: RSET\r\n')
......
......@@ -531,6 +531,7 @@ Magnus Kessler
Lawrence Kesteloot
Vivek Khera
Mads Kiilerich
Jason Killen
Taek Joo Kim
W. Trevor King
Paul Kippes
......
......@@ -30,6 +30,16 @@ Core and Builtins
Library
-------
- Issue #7652: Integrate the decimal floating point libmpdec library to speed
up the decimal module. Performance gains of the new C implementation are
between 12x and 80x, depending on the application.
- Issue #3573: IDLE hangs when passing invalid command line args
(directory(ies) instead of file(s)) (Patch by Guilherme Polo)
- Issue #14269: SMTPD now conforms to the RFC and requires a HELO command
before MAIL, RCPT, or DATA.
- Issue #13694: asynchronous connect in asyncore.dispatcher does not set addr
attribute.
......
......@@ -456,3 +456,16 @@
fun:PyUnicode_FSConverter
}
# Additional suppressions for the unified decimal tests:
{
test_decimal
Memcheck:Addr4
fun:PyUnicodeUCS2_FSConverter
}
{
test_decimal2
Memcheck:Addr4
fun:PyUnicode_FSConverter
}
Normal priority:
----------------
1) Add C-API importable as capsule.
2) Add --with-system-libmpdec to ./configure.
3) Use same default emin/emax on 32-bit (MAX_EMAX=425000000) and 64-bit
(MAX_EMAX=10**18-1).
4) Order of arguments in Context().
5) Documentation.
6) quantize()
- exp argument is misleading:
Decimal('0.321000e+2').quantize(exp=9) -> user might expect
that the result will have exp=9.
- watchexp
7) Match the exception hierarchy of decimal.py:
exceptions.ArithmeticError(exceptions.Exception)
DecimalException
Clamped
DivisionByZero(DecimalException, exceptions.ZeroDivisionError)
Inexact
Overflow(Inexact, Rounded)
Underflow(Inexact, Rounded, Subnormal)
InvalidOperation
Rounded
Subnormal
FloatOperation
Low priority:
-------------
1) Convert tabs (wait until commit).
2) Pre-ANSI compilers require '#' in the first column (should be done
for the whole Python source tree if we support such compilers). (?)
3) FETCH_CURRENT_CONTEXT vs. CURRENT_CONTEXT?
4) Justify remaining uses of exit on overflow in bignum arith. Short
answer: with correct context values the coefficients never get big
enough for that to happen.
5) Justify remaining uses of abort() in mpdecimal: These uses are
for debug purposes and can't be reached when the library is used
correctly.
About
=====
_decimal.c is a wrapper for the libmpdec library. libmpdec is a fast C
library for correctly-rounded arbitrary precision decimal floating point
arithmetic. It is a complete implementation of Mike Cowlishaw/IBM's
General Decimal Arithmetic Specification.
Build process for the module
============================
As usual, the build process for _decimal.so is driven by setup.py in the top
level directory. setup.py autodetects the following build configurations:
1) x64 - 64-bit Python, x86_64 processor (AMD, Intel)
2) uint128 - 64-bit Python, compiler provides __uint128_t (gcc)
3) ansi64 - 64-bit Python, ANSI C
4) ppro - 32-bit Python, x86 CPU, PentiumPro or later
5) ansi32 - 32-bit Python, ANSI C
6) ansi-legacy - 32-bit Python, compiler without uint64_t
7) universal - Mac OS only (multi-arch)
It is possible to override autodetection by exporting:
PYTHON_DECIMAL_WITH_MACHINE=value, where value is one of the above options.
NOTE
====
decimal.so is not built from a static libmpdec.a since doing so led to
failures on AIX (user report) and Windows (mixing static and dynamic CRTs
causes locale problems and more).
This diff is collapsed.
This diff is collapsed.
libmpdec
========
libmpdec is a fast C/C++ library for correctly-rounded arbitrary precision
decimal floating point arithmetic. It is a complete implementation of
Mike Cowlishaw/IBM's General Decimal Arithmetic Specification.
Files required for the Python _decimal module
=============================================
Core files for small and medium precision arithmetic
----------------------------------------------------
basearith.{c,h} -> Core arithmetic in base 10**9 or 10**19.
bits.h -> Portable detection of least/most significant one-bit.
constants.{c,h} -> Constants that are used in multiple files.
context.c -> Context functions.
io.{c,h} -> Conversions between mpd_t and ASCII strings,
mpd_t formatting (allows UTF-8 fill character).
memory.{c,h} -> Allocation handlers with overflow detection
and functions for switching between static
and dynamic mpd_t.
mpdecimal.{c,h} -> All (quiet) functions of the specification.
typearith.h -> Fast primitives for double word multiplication,
division etc.
Visual Studio only:
~~~~~~~~~~~~~~~~~~~
vccompat.h -> snprintf <==> sprintf_s and similar things.
vcstdint.h -> stdint.h (included in VS 2010 but not in VS 2008).
vcdiv64.asm -> Double word division used in typearith.h. VS 2008 does
not allow inline asm for x64. Also, it does not provide
an intrinsic for double word division.
Files for bignum arithmetic:
----------------------------
The following files implement the Fast Number Theoretic Transform
used for multiplying coefficients with more than 1024 words (see
mpdecimal.c: _mpd_fntmul()).
umodarith.h -> Fast low level routines for unsigned modular arithmetic.
numbertheory.{c,h} -> Routines for setting up the Number Theoretic Transform.
difradix2.{c,h} -> Decimation in frequency transform, used as the
"base case" by the following three files:
fnt.{c,h} -> Transform arrays up to 4096 words.
sixstep.{c,h} -> Transform larger arrays of length 2**n.
fourstep.{c,h} -> Transform larger arrays of length 3 * 2**n.
convolute.{c,h} -> Fast convolution using one of the three transform
functions.
transpose.{c,h} -> Transpositions needed for the sixstep algorithm.
crt.{c,h} -> Chinese Remainder Theorem: use information from three
transforms modulo three different primes to get the
final result.
Pointers to literature, proofs and more
=======================================
literature/
-----------
REFERENCES.txt -> List of relevant papers.
bignum.txt -> Explanation of the Fast Number Theoretic Transform (FNT).
fnt.py -> Verify constants used in the FNT; Python demo for the
O(N**2) discrete transform.
matrix-transform.txt -> Proof for the Matrix Fourier Transform used in
fourstep.c.
six-step.txt -> Show that the algorithm used in sixstep.c is
a variant of the Matrix Fourier Transform.
mulmod-64.txt -> Proof for the mulmod64 algorithm from
umodarith.h.
mulmod-ppro.txt -> Proof for the x87 FPU modular multiplication
from umodarith.h.
umodarith.lisp -> ACL2 proofs for many functions from umodarith.h.
Library Author
==============
Stefan Krah <skrah@bytereef.org>
This diff is collapsed.
/*
* Copyright (c) 2008-2012 Stefan Krah. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#ifndef BASEARITH_H
#define BASEARITH_H
#include "mpdecimal.h"
#include <stdio.h>
#include "typearith.h"
mpd_uint_t _mpd_baseadd(mpd_uint_t *w, const mpd_uint_t *u, const mpd_uint_t *v,
mpd_size_t m, mpd_size_t n);
void _mpd_baseaddto(mpd_uint_t *w, const mpd_uint_t *u, mpd_size_t n);
mpd_uint_t _mpd_shortadd(mpd_uint_t *w, mpd_size_t m, mpd_uint_t v);
mpd_uint_t _mpd_shortadd_b(mpd_uint_t *w, mpd_size_t m, mpd_uint_t v,
mpd_uint_t b);
mpd_uint_t _mpd_baseincr(mpd_uint_t *u, mpd_size_t n);
void _mpd_basesub(mpd_uint_t *w, const mpd_uint_t *u, const mpd_uint_t *v,
mpd_size_t m, mpd_size_t n);
void _mpd_basesubfrom(mpd_uint_t *w, const mpd_uint_t *u, mpd_size_t n);
void _mpd_basemul(mpd_uint_t *w, const mpd_uint_t *u, const mpd_uint_t *v,
mpd_size_t m, mpd_size_t n);
void _mpd_shortmul(mpd_uint_t *w, const mpd_uint_t *u, mpd_size_t n,
mpd_uint_t v);
void _mpd_shortmul_b(mpd_uint_t *w, const mpd_uint_t *u, mpd_size_t n,
mpd_uint_t v, mpd_uint_t b);
mpd_uint_t _mpd_shortdiv(mpd_uint_t *w, const mpd_uint_t *u, mpd_size_t n,
mpd_uint_t v);
mpd_uint_t _mpd_shortdiv_b(mpd_uint_t *w, const mpd_uint_t *u, mpd_size_t n,
mpd_uint_t v, mpd_uint_t b);
int _mpd_basedivmod(mpd_uint_t *q, mpd_uint_t *r, const mpd_uint_t *uconst,
const mpd_uint_t *vconst, mpd_size_t nplusm, mpd_size_t n);
void _mpd_baseshiftl(mpd_uint_t *dest, mpd_uint_t *src, mpd_size_t n,
mpd_size_t m, mpd_size_t shift);
mpd_uint_t _mpd_baseshiftr(mpd_uint_t *dest, mpd_uint_t *src, mpd_size_t slen,
mpd_size_t shift);
#ifdef CONFIG_64
extern const mpd_uint_t mprime_rdx;
/*
* Algorithm from: Division by Invariant Integers using Multiplication,
* T. Granlund and P. L. Montgomery, Proceedings of the SIGPLAN '94
* Conference on Programming Language Design and Implementation.
*
* http://gmplib.org/~tege/divcnst-pldi94.pdf
*
* Variables from the paper and their translations (See section 8):
*
* N := 64
* d := MPD_RADIX
* l := 64
* m' := floor((2**(64+64) - 1)/MPD_RADIX) - 2**64
*
* Since N-l == 0:
*
* dnorm := d
* n2 := hi
* n10 := lo
*
* ACL2 proof: mpd-div-words-r-correct
*/
static inline void
_mpd_div_words_r(mpd_uint_t *q, mpd_uint_t *r, mpd_uint_t hi, mpd_uint_t lo)
{
mpd_uint_t n_adj, h, l, t;
mpd_uint_t n1_neg;
/* n1_neg = if lo >= 2**63 then MPD_UINT_MAX else 0 */
n1_neg = (lo & (1ULL<<63)) ? MPD_UINT_MAX : 0;
/* n_adj = if lo >= 2**63 then lo+MPD_RADIX else lo */
n_adj = lo + (n1_neg & MPD_RADIX);
/* (h, l) = if lo >= 2**63 then m'*(hi+1) else m'*hi */
_mpd_mul_words(&h, &l, mprime_rdx, hi-n1_neg);
l = l + n_adj;
if (l < n_adj) h++;
t = h + hi;
/* At this point t == qest, with q == qest or q == qest+1:
* 1) 0 <= 2**64*hi + lo - qest*MPD_RADIX < 2*MPD_RADIX
*/
/* t = 2**64-1 - qest = 2**64 - (qest+1) */
t = MPD_UINT_MAX - t;
/* (h, l) = 2**64*MPD_RADIX - (qest+1)*MPD_RADIX */
_mpd_mul_words(&h, &l, t, MPD_RADIX);
l = l + lo;
if (l < lo) h++;
h += hi;
h -= MPD_RADIX;
/* (h, l) = 2**64*hi + lo - (qest+1)*MPD_RADIX (mod 2**128)
* Case q == qest+1:
* a) h == 0, l == r
* b) q := h - t == qest+1
* c) r := l
* Case q == qest:
* a) h == MPD_UINT_MAX, l == 2**64-(MPD_RADIX-r)
* b) q := h - t == qest
* c) r := l + MPD_RADIX = r
*/
*q = (h - t);
*r = l + (MPD_RADIX & h);
}
#else
static inline void
_mpd_div_words_r(mpd_uint_t *q, mpd_uint_t *r, mpd_uint_t hi, mpd_uint_t lo)
{
_mpd_div_words(q, r, hi, lo, MPD_RADIX);
}
#endif
/* Multiply two single base MPD_RADIX words, store result in array w[2]. */
static inline void
_mpd_singlemul(mpd_uint_t w[2], mpd_uint_t u, mpd_uint_t v)
{
mpd_uint_t hi, lo;
_mpd_mul_words(&hi, &lo, u, v);
_mpd_div_words_r(&w[1], &w[0], hi, lo);
}
/* Multiply u (len 2) and v (len m, 1 <= m <= 2). */
static inline void
_mpd_mul_2_le2(mpd_uint_t w[4], mpd_uint_t u[2], mpd_uint_t v[2], mpd_ssize_t m)
{
mpd_uint_t hi, lo;
_mpd_mul_words(&hi, &lo, u[0], v[0]);
_mpd_div_words_r(&w[1], &w[0], hi, lo);
_mpd_mul_words(&hi, &lo, u[1], v[0]);
lo = w[1] + lo;
if (lo < w[1]) hi++;
_mpd_div_words_r(&w[2], &w[1], hi, lo);
if (m == 1) return;
_mpd_mul_words(&hi, &lo, u[0], v[1]);
lo = w[1] + lo;
if (lo < w[1]) hi++;
_mpd_div_words_r(&w[3], &w[1], hi, lo);
_mpd_mul_words(&hi, &lo, u[1], v[1]);
lo = w[2] + lo;
if (lo < w[2]) hi++;
lo = w[3] + lo;
if (lo < w[3]) hi++;
_mpd_div_words_r(&w[3], &w[2], hi, lo);
}
/*
* Test if all words from data[len-1] to data[0] are zero. If len is 0, nothing
* is tested and the coefficient is regarded as "all zero".
*/
static inline int
_mpd_isallzero(const mpd_uint_t *data, mpd_ssize_t len)
{
while (--len >= 0) {
if (data[len] != 0) return 0;
}
return 1;
}
/*
* Test if all full words from data[len-1] to data[0] are MPD_RADIX-1
* (all nines). Return true if len == 0.
*/
static inline int
_mpd_isallnine(const mpd_uint_t *data, mpd_ssize_t len)
{
while (--len >= 0) {
if (data[len] != MPD_RADIX-1) return 0;
}
return 1;
}
#endif /* BASEARITH_H */
/*
* Copyright (c) 2008-2012 Stefan Krah. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#ifndef BITS_H
#define BITS_H
#include "mpdecimal.h"
#include <stdio.h>
/* Check if n is a power of 2. */
static inline int
ispower2(mpd_size_t n)
{
return n != 0 && (n & (n-1)) == 0;
}
#if defined(ANSI)
/*
* Return the most significant bit position of n from 0 to 31 (63).
* Assumptions: n != 0.
*/
static inline int
mpd_bsr(mpd_size_t n)
{
int pos = 0;
mpd_size_t tmp;
#ifdef CONFIG_64
tmp = n >> 32;
if (tmp != 0) { n = tmp; pos += 32; }
#endif
tmp = n >> 16;
if (tmp != 0) { n = tmp; pos += 16; }
tmp = n >> 8;
if (tmp != 0) { n = tmp; pos += 8; }
tmp = n >> 4;
if (tmp != 0) { n = tmp; pos += 4; }
tmp = n >> 2;
if (tmp != 0) { n = tmp; pos += 2; }
tmp = n >> 1;
if (tmp != 0) { n = tmp; pos += 1; }
return pos + (int)n - 1;
}
/*
* Return the least significant bit position of n from 0 to 31 (63).
* Assumptions: n != 0.
*/
static inline int
mpd_bsf(mpd_size_t n)
{
int pos;
#ifdef CONFIG_64
pos = 63;
if (n & 0x00000000FFFFFFFFULL) { pos -= 32; } else { n >>= 32; }
if (n & 0x000000000000FFFFULL) { pos -= 16; } else { n >>= 16; }
if (n & 0x00000000000000FFULL) { pos -= 8; } else { n >>= 8; }
if (n & 0x000000000000000FULL) { pos -= 4; } else { n >>= 4; }
if (n & 0x0000000000000003ULL) { pos -= 2; } else { n >>= 2; }
if (n & 0x0000000000000001ULL) { pos -= 1; }
#else
pos = 31;
if (n & 0x000000000000FFFFUL) { pos -= 16; } else { n >>= 16; }
if (n & 0x00000000000000FFUL) { pos -= 8; } else { n >>= 8; }
if (n & 0x000000000000000FUL) { pos -= 4; } else { n >>= 4; }
if (n & 0x0000000000000003UL) { pos -= 2; } else { n >>= 2; }
if (n & 0x0000000000000001UL) { pos -= 1; }
#endif
return pos;
}
/* END ANSI */
#elif defined(ASM)
/*
* Bit scan reverse. Assumptions: a != 0.
*/
static inline int
mpd_bsr(mpd_size_t a)
{
mpd_size_t retval;
__asm__ (
#ifdef CONFIG_64
"bsrq %1, %0\n\t"
#else
"bsr %1, %0\n\t"
#endif
:"=r" (retval)
:"r" (a)
:"cc"
);
return (int)retval;
}
/*
* Bit scan forward. Assumptions: a != 0.
*/
static inline int
mpd_bsf(mpd_size_t a)
{
mpd_size_t retval;
__asm__ (
#ifdef CONFIG_64
"bsfq %1, %0\n\t"
#else
"bsf %1, %0\n\t"
#endif
:"=r" (retval)
:"r" (a)
:"cc"
);
return (int)retval;
}
/* END ASM */
#elif defined(MASM)
#include <intrin.h>
/*
* Bit scan reverse. Assumptions: a != 0.
*/
static inline int __cdecl
mpd_bsr(mpd_size_t a)
{
unsigned long retval;
#ifdef CONFIG_64
_BitScanReverse64(&retval, a);
#else
_BitScanReverse(&retval, a);
#endif
return (int)retval;
}
/*
* Bit scan forward. Assumptions: a != 0.
*/
static inline int __cdecl
mpd_bsf(mpd_size_t a)
{
unsigned long retval;
#ifdef CONFIG_64
_BitScanForward64(&retval, a);
#else
_BitScanForward(&retval, a);
#endif
return (int)retval;
}
/* END MASM (_MSC_VER) */
#else
#error "missing preprocessor definitions"
#endif /* BSR/BSF */
#endif /* BITS_H */
/*
* Copyright (c) 2008-2012 Stefan Krah. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#include "mpdecimal.h"
#include <stdio.h>
#include "constants.h"
#if defined(CONFIG_64)
/* number-theory.c */
const mpd_uint_t mpd_moduli[3] = {
18446744069414584321ULL, 18446744056529682433ULL, 18446742974197923841ULL
};
const mpd_uint_t mpd_roots[3] = {7ULL, 10ULL, 19ULL};
/* crt.c */
const mpd_uint_t INV_P1_MOD_P2 = 18446744055098026669ULL;
const mpd_uint_t INV_P1P2_MOD_P3 = 287064143708160ULL;
const mpd_uint_t LH_P1P2 = 18446744052234715137ULL; /* (P1*P2) % 2^64 */
const mpd_uint_t UH_P1P2 = 18446744052234715141ULL; /* (P1*P2) / 2^64 */
/* transpose.c */
const mpd_size_t mpd_bits[64] = {
1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384,
32768, 65536, 131072, 262144, 524288, 1048576, 2097152, 4194304, 8388608,
16777216, 33554432, 67108864, 134217728, 268435456, 536870912, 1073741824,
2147483648ULL, 4294967296ULL, 8589934592ULL, 17179869184ULL, 34359738368ULL,
68719476736ULL, 137438953472ULL, 274877906944ULL, 549755813888ULL,
1099511627776ULL, 2199023255552ULL, 4398046511104, 8796093022208ULL,
17592186044416ULL, 35184372088832ULL, 70368744177664ULL, 140737488355328ULL,
281474976710656ULL, 562949953421312ULL, 1125899906842624ULL,
2251799813685248ULL, 4503599627370496ULL, 9007199254740992ULL,
18014398509481984ULL, 36028797018963968ULL, 72057594037927936ULL,
144115188075855872ULL, 288230376151711744ULL, 576460752303423488ULL,
1152921504606846976ULL, 2305843009213693952ULL, 4611686018427387904ULL,
9223372036854775808ULL
};
/* mpdecimal.c */
const mpd_uint_t mpd_pow10[MPD_RDIGITS+1] = {
1,10,100,1000,10000,100000,1000000,10000000,100000000,1000000000,
10000000000ULL,100000000000ULL,1000000000000ULL,10000000000000ULL,
100000000000000ULL,1000000000000000ULL,10000000000000000ULL,
100000000000000000ULL,1000000000000000000ULL,10000000000000000000ULL
};
/* magic number for constant division by MPD_RADIX */
const mpd_uint_t mprime_rdx = 15581492618384294730ULL;
#elif defined(CONFIG_32)
/* number-theory.c */
const mpd_uint_t mpd_moduli[3] = {2113929217UL, 2013265921UL, 1811939329UL};
const mpd_uint_t mpd_roots[3] = {5UL, 31UL, 13UL};
/* PentiumPro modular multiplication: These constants have to be loaded as
* 80 bit long doubles, which are not supported by certain compilers. */
const uint32_t mpd_invmoduli[3][3] = {
{4293885170U, 2181570688U, 16352U}, /* ((long double) 1 / 2113929217UL) */
{1698898177U, 2290649223U, 16352U}, /* ((long double) 1 / 2013265921UL) */
{2716021846U, 2545165803U, 16352U} /* ((long double) 1 / 1811939329UL) */
};
const float MPD_TWO63 = 9223372036854775808.0; /* 2^63 */
/* crt.c */
const mpd_uint_t INV_P1_MOD_P2 = 2013265901UL;
const mpd_uint_t INV_P1P2_MOD_P3 = 54UL;
const mpd_uint_t LH_P1P2 = 4127195137UL; /* (P1*P2) % 2^32 */
const mpd_uint_t UH_P1P2 = 990904320UL; /* (P1*P2) / 2^32 */
/* transpose.c */
const mpd_size_t mpd_bits[32] = {
1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384,
32768, 65536, 131072, 262144, 524288, 1048576, 2097152, 4194304, 8388608,
16777216, 33554432, 67108864, 134217728, 268435456, 536870912, 1073741824,
2147483648UL
};
/* mpdecimal.c */
const mpd_uint_t mpd_pow10[MPD_RDIGITS+1] = {
1,10,100,1000,10000,100000,1000000,10000000,100000000,1000000000
};
#else
#error "CONFIG_64 or CONFIG_32 must be defined."
#endif
const char *mpd_round_string[MPD_ROUND_GUARD] = {
"ROUND_UP", /* round away from 0 */
"ROUND_DOWN", /* round toward 0 (truncate) */
"ROUND_CEILING", /* round toward +infinity */
"ROUND_FLOOR", /* round toward -infinity */
"ROUND_HALF_UP", /* 0.5 is rounded up */
"ROUND_HALF_DOWN", /* 0.5 is rounded down */
"ROUND_HALF_EVEN", /* 0.5 is rounded to even */
"ROUND_05UP", /* round zero or five away from 0 */
"ROUND_TRUNC", /* truncate, but set infinity */
};
const char *mpd_clamp_string[MPD_CLAMP_GUARD] = {
"CLAMP_DEFAULT",
"CLAMP_IEEE_754"
};
/*
* Copyright (c) 2008-2012 Stefan Krah. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#ifndef CONSTANTS_H
#define CONSTANTS_H
#include "mpdecimal.h"
/* choice of optimized functions */
#if defined(CONFIG_64)
/* x64 */
#define MULMOD(a, b) x64_mulmod(a, b, umod)
#define MULMOD2C(a0, a1, w) x64_mulmod2c(a0, a1, w, umod)
#define MULMOD2(a0, b0, a1, b1) x64_mulmod2(a0, b0, a1, b1, umod)
#define POWMOD(base, exp) x64_powmod(base, exp, umod)
#define SETMODULUS(modnum) std_setmodulus(modnum, &umod)
#define SIZE3_NTT(x0, x1, x2, w3table) std_size3_ntt(x0, x1, x2, w3table, umod)
#elif defined(PPRO)
/* PentiumPro (or later) gcc inline asm */
#define MULMOD(a, b) ppro_mulmod(a, b, &dmod, dinvmod)
#define MULMOD2C(a0, a1, w) ppro_mulmod2c(a0, a1, w, &dmod, dinvmod)
#define MULMOD2(a0, b0, a1, b1) ppro_mulmod2(a0, b0, a1, b1, &dmod, dinvmod)
#define POWMOD(base, exp) ppro_powmod(base, exp, &dmod, dinvmod)
#define SETMODULUS(modnum) ppro_setmodulus(modnum, &umod, &dmod, dinvmod)
#define SIZE3_NTT(x0, x1, x2, w3table) ppro_size3_ntt(x0, x1, x2, w3table, umod, &dmod, dinvmod)
#else
/* ANSI C99 */
#define MULMOD(a, b) std_mulmod(a, b, umod)
#define MULMOD2C(a0, a1, w) std_mulmod2c(a0, a1, w, umod)
#define MULMOD2(a0, b0, a1, b1) std_mulmod2(a0, b0, a1, b1, umod)
#define POWMOD(base, exp) std_powmod(base, exp, umod)
#define SETMODULUS(modnum) std_setmodulus(modnum, &umod)
#define SIZE3_NTT(x0, x1, x2, w3table) std_size3_ntt(x0, x1, x2, w3table, umod)
#endif
/* PentiumPro (or later) gcc inline asm */
extern const float MPD_TWO63;
extern const uint32_t mpd_invmoduli[3][3];
enum {P1, P2, P3};
extern const mpd_uint_t mpd_moduli[];
extern const mpd_uint_t mpd_roots[];
extern const mpd_size_t mpd_bits[];
extern const mpd_uint_t mpd_pow10[];
extern const mpd_uint_t INV_P1_MOD_P2;
extern const mpd_uint_t INV_P1P2_MOD_P3;
extern const mpd_uint_t LH_P1P2;
extern const mpd_uint_t UH_P1P2;
#endif /* CONSTANTS_H */
/*
* Copyright (c) 2008-2012 Stefan Krah. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#include "mpdecimal.h"
#include <stdio.h>
#include <string.h>
#include <signal.h>
void
mpd_dflt_traphandler(mpd_context_t *ctx UNUSED)
{
raise(SIGFPE);
}
void (* mpd_traphandler)(mpd_context_t *) = mpd_dflt_traphandler;
/* Set guaranteed minimum number of coefficient words. The function may
be used once at program start. Setting MPD_MINALLOC to out-of-bounds
values is a catastrophic error, so in that case the function exits rather
than relying on the user to check a return value. */
void
mpd_setminalloc(mpd_ssize_t n)
{
static int minalloc_is_set = 0;
if (minalloc_is_set) {
mpd_err_warn("mpd_setminalloc: ignoring request to set "
"MPD_MINALLOC a second time\n");
return;
}
if (n < MPD_MINALLOC_MIN || n > MPD_MINALLOC_MAX) {
mpd_err_fatal("illegal value for MPD_MINALLOC"); /* GCOV_NOT_REACHED */
}
MPD_MINALLOC = n;
minalloc_is_set = 1;
}
void
mpd_init(mpd_context_t *ctx, mpd_ssize_t prec)
{
mpd_ssize_t ideal_minalloc;
mpd_defaultcontext(ctx);
if (!mpd_qsetprec(ctx, prec)) {
mpd_addstatus_raise(ctx, MPD_Invalid_context);
return;
}
ideal_minalloc = 2 * ((prec+MPD_RDIGITS-1) / MPD_RDIGITS);
if (ideal_minalloc < MPD_MINALLOC_MIN) ideal_minalloc = MPD_MINALLOC_MIN;
if (ideal_minalloc > MPD_MINALLOC_MAX) ideal_minalloc = MPD_MINALLOC_MAX;
mpd_setminalloc(ideal_minalloc);
}
void
mpd_maxcontext(mpd_context_t *ctx)
{
ctx->prec=MPD_MAX_PREC;
ctx->emax=MPD_MAX_EMAX;
ctx->emin=MPD_MIN_EMIN;
ctx->round=MPD_ROUND_HALF_EVEN;
ctx->traps=MPD_Traps;
ctx->status=0;
ctx->newtrap=0;
ctx->clamp=0;
ctx->allcr=1;
}
void
mpd_defaultcontext(mpd_context_t *ctx)
{
ctx->prec=2*MPD_RDIGITS;
ctx->emax=MPD_MAX_EMAX;
ctx->emin=MPD_MIN_EMIN;
ctx->round=MPD_ROUND_HALF_UP;
ctx->traps=MPD_Traps;
ctx->status=0;
ctx->newtrap=0;
ctx->clamp=0;
ctx->allcr=1;
}
void
mpd_basiccontext(mpd_context_t *ctx)
{
ctx->prec=9;
ctx->emax=MPD_MAX_EMAX;
ctx->emin=MPD_MIN_EMIN;
ctx->round=MPD_ROUND_HALF_UP;
ctx->traps=MPD_Traps|MPD_Clamped;
ctx->status=0;
ctx->newtrap=0;
ctx->clamp=0;
ctx->allcr=1;
}
int
mpd_ieee_context(mpd_context_t *ctx, int bits)
{
if (bits <= 0 || bits > MPD_IEEE_CONTEXT_MAX_BITS || bits % 32) {
return -1;
}
ctx->prec = 9 * (bits/32) - 2;
ctx->emax = 3 * ((mpd_ssize_t)1<<(bits/16+3));
ctx->emin = 1 - ctx->emax;
ctx->round=MPD_ROUND_HALF_EVEN;
ctx->traps=0;
ctx->status=0;
ctx->newtrap=0;
ctx->clamp=1;
ctx->allcr=1;
return 0;
}
mpd_ssize_t
mpd_getprec(const mpd_context_t *ctx)
{
return ctx->prec;
}
mpd_ssize_t
mpd_getemax(const mpd_context_t *ctx)
{
return ctx->emax;
}
mpd_ssize_t
mpd_getemin(const mpd_context_t *ctx)
{
return ctx->emin;
}
int
mpd_getround(const mpd_context_t *ctx)
{
return ctx->round;
}
uint32_t
mpd_gettraps(const mpd_context_t *ctx)
{
return ctx->traps;
}
uint32_t
mpd_getstatus(const mpd_context_t *ctx)
{
return ctx->status;
}
int
mpd_getclamp(const mpd_context_t *ctx)
{
return ctx->clamp;
}
int
mpd_getcr(const mpd_context_t *ctx)
{
return ctx->allcr;
}
int
mpd_qsetprec(mpd_context_t *ctx, mpd_ssize_t prec)
{
if (prec <= 0 || prec > MPD_MAX_PREC) {
return 0;
}
ctx->prec = prec;
return 1;
}
int
mpd_qsetemax(mpd_context_t *ctx, mpd_ssize_t emax)
{
if (emax < 0 || emax > MPD_MAX_EMAX) {
return 0;
}
ctx->emax = emax;
return 1;
}
int
mpd_qsetemin(mpd_context_t *ctx, mpd_ssize_t emin)
{
if (emin > 0 || emin < MPD_MIN_EMIN) {
return 0;
}
ctx->emin = emin;
return 1;
}
int
mpd_qsetround(mpd_context_t *ctx, int round)
{
if (!(0 <= round && round < MPD_ROUND_GUARD)) {
return 0;
}
ctx->round = round;
return 1;
}
int
mpd_qsettraps(mpd_context_t *ctx, uint32_t traps)
{
if (traps > MPD_Max_status) {
return 0;
}
ctx->traps = traps;
return 1;
}
int
mpd_qsetstatus(mpd_context_t *ctx, uint32_t flags)
{
if (flags > MPD_Max_status) {
return 0;
}
ctx->status = flags;
return 1;
}
int
mpd_qsetclamp(mpd_context_t *ctx, int c)
{
if (c != 0 && c != 1) {
return 0;
}
ctx->clamp = c;
return 1;
}
int
mpd_qsetcr(mpd_context_t *ctx, int c)
{
if (c != 0 && c != 1) {
return 0;
}
ctx->allcr = c;
return 1;
}
void
mpd_addstatus_raise(mpd_context_t *ctx, uint32_t flags)
{
ctx->status |= flags;
if (flags&ctx->traps) {
ctx->newtrap = (flags&ctx->traps);
mpd_traphandler(ctx);
}
}
/*
* Copyright (c) 2008-2012 Stefan Krah. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#include "mpdecimal.h"
#include <stdio.h>
#include "bits.h"
#include "constants.h"
#include "fnt.h"
#include "fourstep.h"
#include "numbertheory.h"
#include "sixstep.h"
#include "umodarith.h"
#include "convolute.h"
/* Bignum: Fast convolution using the Number Theoretic Transform. Used for
the multiplication of very large coefficients. */
/* Convolute the data in c1 and c2. Result is in c1. */
int
fnt_convolute(mpd_uint_t *c1, mpd_uint_t *c2, mpd_size_t n, int modnum)
{
int (*fnt)(mpd_uint_t *, mpd_size_t, int);
int (*inv_fnt)(mpd_uint_t *, mpd_size_t, int);
#ifdef PPRO
double dmod;
uint32_t dinvmod[3];
#endif
mpd_uint_t n_inv, umod;
mpd_size_t i;
SETMODULUS(modnum);
n_inv = POWMOD(n, (umod-2));
if (ispower2(n)) {
if (n > SIX_STEP_THRESHOLD) {
fnt = six_step_fnt;
inv_fnt = inv_six_step_fnt;
}
else {
fnt = std_fnt;
inv_fnt = std_inv_fnt;
}
}
else {
fnt = four_step_fnt;
inv_fnt = inv_four_step_fnt;
}
if (!fnt(c1, n, modnum)) {
return 0;
}
if (!fnt(c2, n, modnum)) {
return 0;
}
for (i = 0; i < n-1; i += 2) {
mpd_uint_t x0 = c1[i];
mpd_uint_t y0 = c2[i];
mpd_uint_t x1 = c1[i+1];
mpd_uint_t y1 = c2[i+1];
MULMOD2(&x0, y0, &x1, y1);
c1[i] = x0;
c1[i+1] = x1;
}
if (!inv_fnt(c1, n, modnum)) {
return 0;
}
for (i = 0; i < n-3; i += 4) {
mpd_uint_t x0 = c1[i];
mpd_uint_t x1 = c1[i+1];
mpd_uint_t x2 = c1[i+2];
mpd_uint_t x3 = c1[i+3];
MULMOD2C(&x0, &x1, n_inv);
MULMOD2C(&x2, &x3, n_inv);
c1[i] = x0;
c1[i+1] = x1;
c1[i+2] = x2;
c1[i+3] = x3;
}
return 1;
}
/* Autoconvolute the data in c1. Result is in c1. */
int
fnt_autoconvolute(mpd_uint_t *c1, mpd_size_t n, int modnum)
{
int (*fnt)(mpd_uint_t *, mpd_size_t, int);
int (*inv_fnt)(mpd_uint_t *, mpd_size_t, int);
#ifdef PPRO
double dmod;
uint32_t dinvmod[3];
#endif
mpd_uint_t n_inv, umod;
mpd_size_t i;
SETMODULUS(modnum);
n_inv = POWMOD(n, (umod-2));
if (ispower2(n)) {
if (n > SIX_STEP_THRESHOLD) {
fnt = six_step_fnt;
inv_fnt = inv_six_step_fnt;
}
else {
fnt = std_fnt;
inv_fnt = std_inv_fnt;
}
}
else {
fnt = four_step_fnt;
inv_fnt = inv_four_step_fnt;
}
if (!fnt(c1, n, modnum)) {
return 0;
}
for (i = 0; i < n-1; i += 2) {
mpd_uint_t x0 = c1[i];
mpd_uint_t x1 = c1[i+1];
MULMOD2(&x0, x0, &x1, x1);
c1[i] = x0;
c1[i+1] = x1;
}
if (!inv_fnt(c1, n, modnum)) {
return 0;
}
for (i = 0; i < n-3; i += 4) {
mpd_uint_t x0 = c1[i];
mpd_uint_t x1 = c1[i+1];
mpd_uint_t x2 = c1[i+2];
mpd_uint_t x3 = c1[i+3];
MULMOD2C(&x0, &x1, n_inv);
MULMOD2C(&x2, &x3, n_inv);
c1[i] = x0;
c1[i+1] = x1;
c1[i+2] = x2;
c1[i+3] = x3;
}
return 1;
}
/*
* Copyright (c) 2008-2012 Stefan Krah. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#ifndef CONVOLUTE_H
#define CONVOLUTE_H
#include "mpdecimal.h"
#include <stdio.h>
#define SIX_STEP_THRESHOLD 4096
int fnt_convolute(mpd_uint_t *c1, mpd_uint_t *c2, mpd_size_t n, int modnum);
int fnt_autoconvolute(mpd_uint_t *c1, mpd_size_t n, int modnum);
#endif
/*
* Copyright (c) 2008-2012 Stefan Krah. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#include "mpdecimal.h"
#include <stdio.h>
#include <assert.h>
#include "numbertheory.h"
#include "umodarith.h"
#include "crt.h"
/* Bignum: Chinese Remainder Theorem, extends the maximum transform length. */
/* Multiply P1P2 by v, store result in w. */
static inline void
_crt_mulP1P2_3(mpd_uint_t w[3], mpd_uint_t v)
{
mpd_uint_t hi1, hi2, lo;
_mpd_mul_words(&hi1, &lo, LH_P1P2, v);
w[0] = lo;
_mpd_mul_words(&hi2, &lo, UH_P1P2, v);
lo = hi1 + lo;
if (lo < hi1) hi2++;
w[1] = lo;
w[2] = hi2;
}
/* Add 3 words from v to w. The result is known to fit in w. */
static inline void
_crt_add3(mpd_uint_t w[3], mpd_uint_t v[3])
{
mpd_uint_t carry;
mpd_uint_t s;
s = w[0] + v[0];
carry = (s < w[0]);
w[0] = s;
s = w[1] + (v[1] + carry);
carry = (s < w[1]);
w[1] = s;
w[2] = w[2] + (v[2] + carry);
}
/* Divide 3 words in u by v, store result in w, return remainder. */
static inline mpd_uint_t
_crt_div3(mpd_uint_t *w, const mpd_uint_t *u, mpd_uint_t v)
{
mpd_uint_t r1 = u[2];
mpd_uint_t r2;
if (r1 < v) {
w[2] = 0;
}
else {
_mpd_div_word(&w[2], &r1, u[2], v); /* GCOV_NOT_REACHED */
}
_mpd_div_words(&w[1], &r2, r1, u[1], v);
_mpd_div_words(&w[0], &r1, r2, u[0], v);
return r1;
}
/*
* Chinese Remainder Theorem:
* Algorithm from Joerg Arndt, "Matters Computational",
* Chapter 37.4.1 [http://www.jjj.de/fxt/]
*
* See also Knuth, TAOCP, Volume 2, 4.3.2, exercise 7.
*/
/*
* CRT with carry: x1, x2, x3 contain numbers modulo p1, p2, p3. For each
* triple of members of the arrays, find the unique z modulo p1*p2*p3, with
* zmax = p1*p2*p3 - 1.
*
* In each iteration of the loop, split z into result[i] = z % MPD_RADIX
* and carry = z / MPD_RADIX. Let N be the size of carry[] and cmax the
* maximum carry.
*
* Limits for the 32-bit build:
*
* N = 2**96
* cmax = 7711435591312380274
*
* Limits for the 64 bit build:
*
* N = 2**192
* cmax = 627710135393475385904124401220046371710
*
* The following statements hold for both versions:
*
* 1) cmax + zmax < N, so the addition does not overflow.
*
* 2) (cmax + zmax) / MPD_RADIX == cmax.
*
* 3) If c <= cmax, then c_next = (c + zmax) / MPD_RADIX <= cmax.
*/
void
crt3(mpd_uint_t *x1, mpd_uint_t *x2, mpd_uint_t *x3, mpd_size_t rsize)
{
mpd_uint_t p1 = mpd_moduli[P1];
mpd_uint_t umod;
#ifdef PPRO
double dmod;
uint32_t dinvmod[3];
#endif
mpd_uint_t a1, a2, a3;
mpd_uint_t s;
mpd_uint_t z[3], t[3];
mpd_uint_t carry[3] = {0,0,0};
mpd_uint_t hi, lo;
mpd_size_t i;
for (i = 0; i < rsize; i++) {
a1 = x1[i];
a2 = x2[i];
a3 = x3[i];
SETMODULUS(P2);
s = ext_submod(a2, a1, umod);
s = MULMOD(s, INV_P1_MOD_P2);
_mpd_mul_words(&hi, &lo, s, p1);
lo = lo + a1;
if (lo < a1) hi++;
SETMODULUS(P3);
s = dw_submod(a3, hi, lo, umod);
s = MULMOD(s, INV_P1P2_MOD_P3);
z[0] = lo;
z[1] = hi;
z[2] = 0;
_crt_mulP1P2_3(t, s);
_crt_add3(z, t);
_crt_add3(carry, z);
x1[i] = _crt_div3(carry, carry, MPD_RADIX);
}
assert(carry[0] == 0 && carry[1] == 0 && carry[2] == 0);
}
/*
* Copyright (c) 2008-2012 Stefan Krah. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#ifndef CRT_H
#define CRT_H
#include "mpdecimal.h"
#include <stdio.h>
void crt3(mpd_uint_t *x1, mpd_uint_t *x2, mpd_uint_t *x3, mpd_size_t nmemb);
#endif
/*
* Copyright (c) 2008-2012 Stefan Krah. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#include "mpdecimal.h"
#include <stdio.h>
#include <assert.h>
#include "bits.h"
#include "numbertheory.h"
#include "umodarith.h"
#include "difradix2.h"
/* Bignum: The actual transform routine (decimation in frequency). */
/*
* Generate index pairs (x, bitreverse(x)) and carry out the permutation.
* n must be a power of two.
* Algorithm due to Brent/Lehmann, see Joerg Arndt, "Matters Computational",
* Chapter 1.14.4. [http://www.jjj.de/fxt/]
*/
static inline void
bitreverse_permute(mpd_uint_t a[], mpd_size_t n)
{
mpd_size_t x = 0;
mpd_size_t r = 0;
mpd_uint_t t;
do { /* Invariant: r = bitreverse(x) */
if (r > x) {
t = a[x];
a[x] = a[r];
a[r] = t;
}
/* Flip trailing consecutive 1 bits and the first zero bit
* that absorbs a possible carry. */
x += 1;
/* Mirror the operation on r: Flip n_trailing_zeros(x)+1
high bits of r. */
r ^= (n - (n >> (mpd_bsf(x)+1)));
/* The loop invariant is preserved. */
} while (x < n);
}
/* Fast Number Theoretic Transform, decimation in frequency. */
void
fnt_dif2(mpd_uint_t a[], mpd_size_t n, struct fnt_params *tparams)
{
mpd_uint_t *wtable = tparams->wtable;
mpd_uint_t umod;
#ifdef PPRO
double dmod;
uint32_t dinvmod[3];
#endif
mpd_uint_t u0, u1, v0, v1;
mpd_uint_t w, w0, w1, wstep;
mpd_size_t m, mhalf;
mpd_size_t j, r;
assert(ispower2(n));
assert(n >= 4);
SETMODULUS(tparams->modnum);
/* m == n */
mhalf = n / 2;
for (j = 0; j < mhalf; j += 2) {
w0 = wtable[j];
w1 = wtable[j+1];
u0 = a[j];
v0 = a[j+mhalf];
u1 = a[j+1];
v1 = a[j+1+mhalf];
a[j] = addmod(u0, v0, umod);
v0 = submod(u0, v0, umod);
a[j+1] = addmod(u1, v1, umod);
v1 = submod(u1, v1, umod);
MULMOD2(&v0, w0, &v1, w1);
a[j+mhalf] = v0;
a[j+1+mhalf] = v1;
}
wstep = 2;
for (m = n/2; m >= 2; m>>=1, wstep<<=1) {
mhalf = m / 2;
/* j == 0 */
for (r = 0; r < n; r += 2*m) {
u0 = a[r];
v0 = a[r+mhalf];
u1 = a[m+r];
v1 = a[m+r+mhalf];
a[r] = addmod(u0, v0, umod);
v0 = submod(u0, v0, umod);
a[m+r] = addmod(u1, v1, umod);
v1 = submod(u1, v1, umod);
a[r+mhalf] = v0;
a[m+r+mhalf] = v1;
}
for (j = 1; j < mhalf; j++) {
w = wtable[j*wstep];
for (r = 0; r < n; r += 2*m) {
u0 = a[r+j];
v0 = a[r+j+mhalf];
u1 = a[m+r+j];
v1 = a[m+r+j+mhalf];
a[r+j] = addmod(u0, v0, umod);
v0 = submod(u0, v0, umod);
a[m+r+j] = addmod(u1, v1, umod);
v1 = submod(u1, v1, umod);
MULMOD2C(&v0, &v1, w);
a[r+j+mhalf] = v0;
a[m+r+j+mhalf] = v1;
}
}
}
bitreverse_permute(a, n);
}
/*
* Copyright (c) 2008-2012 Stefan Krah. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#ifndef DIF_RADIX2_H
#define DIF_RADIX2_H
#include "mpdecimal.h"
#include <stdio.h>
#include "numbertheory.h"
void fnt_dif2(mpd_uint_t a[], mpd_size_t n, struct fnt_params *tparams);
#endif
/*
* Copyright (c) 2008-2012 Stefan Krah. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#include "mpdecimal.h"
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include "bits.h"
#include "difradix2.h"
#include "numbertheory.h"
#include "fnt.h"
/* Bignum: Fast transform for medium-sized coefficients. */
/* forward transform, sign = -1 */
int
std_fnt(mpd_uint_t *a, mpd_size_t n, int modnum)
{
struct fnt_params *tparams;
assert(ispower2(n));
assert(n >= 4);
assert(n <= 3*MPD_MAXTRANSFORM_2N);
if ((tparams = _mpd_init_fnt_params(n, -1, modnum)) == NULL) {
return 0;
}
fnt_dif2(a, n, tparams);
mpd_free(tparams);
return 1;
}
/* reverse transform, sign = 1 */
int
std_inv_fnt(mpd_uint_t *a, mpd_size_t n, int modnum)
{
struct fnt_params *tparams;
assert(ispower2(n));
assert(n >= 4);
assert(n <= 3*MPD_MAXTRANSFORM_2N);
if ((tparams = _mpd_init_fnt_params(n, 1, modnum)) == NULL) {
return 0;
}
fnt_dif2(a, n, tparams);
mpd_free(tparams);
return 1;
}
/*
* Copyright (c) 2008-2012 Stefan Krah. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#ifndef FNT_H
#define FNT_H
#include "mpdecimal.h"
#include <stdio.h>
int std_fnt(mpd_uint_t a[], mpd_size_t n, int modnum);
int std_inv_fnt(mpd_uint_t a[], mpd_size_t n, int modnum);
#endif
/*
* Copyright (c) 2008-2012 Stefan Krah. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#include "mpdecimal.h"
#include <assert.h>
#include "numbertheory.h"
#include "sixstep.h"
#include "transpose.h"
#include "umodarith.h"
#include "fourstep.h"
/* Bignum: Cache efficient Matrix Fourier Transform for arrays of the
form 3 * 2**n (See literature/matrix-transform.txt). */
#ifndef PPRO
static inline void
std_size3_ntt(mpd_uint_t *x1, mpd_uint_t *x2, mpd_uint_t *x3,
mpd_uint_t w3table[3], mpd_uint_t umod)
{
mpd_uint_t r1, r2;
mpd_uint_t w;
mpd_uint_t s, tmp;
/* k = 0 -> w = 1 */
s = *x1;
s = addmod(s, *x2, umod);
s = addmod(s, *x3, umod);
r1 = s;
/* k = 1 */
s = *x1;
w = w3table[1];
tmp = MULMOD(*x2, w);
s = addmod(s, tmp, umod);
w = w3table[2];
tmp = MULMOD(*x3, w);
s = addmod(s, tmp, umod);
r2 = s;
/* k = 2 */
s = *x1;
w = w3table[2];
tmp = MULMOD(*x2, w);
s = addmod(s, tmp, umod);
w = w3table[1];
tmp = MULMOD(*x3, w);
s = addmod(s, tmp, umod);
*x3 = s;
*x2 = r2;
*x1 = r1;
}
#else /* PPRO */
static inline void
ppro_size3_ntt(mpd_uint_t *x1, mpd_uint_t *x2, mpd_uint_t *x3, mpd_uint_t w3table[3],
mpd_uint_t umod, double *dmod, uint32_t dinvmod[3])
{
mpd_uint_t r1, r2;
mpd_uint_t w;
mpd_uint_t s, tmp;
/* k = 0 -> w = 1 */
s = *x1;
s = addmod(s, *x2, umod);
s = addmod(s, *x3, umod);
r1 = s;
/* k = 1 */
s = *x1;
w = w3table[1];
tmp = ppro_mulmod(*x2, w, dmod, dinvmod);
s = addmod(s, tmp, umod);
w = w3table[2];
tmp = ppro_mulmod(*x3, w, dmod, dinvmod);
s = addmod(s, tmp, umod);
r2 = s;
/* k = 2 */
s = *x1;
w = w3table[2];
tmp = ppro_mulmod(*x2, w, dmod, dinvmod);
s = addmod(s, tmp, umod);
w = w3table[1];
tmp = ppro_mulmod(*x3, w, dmod, dinvmod);
s = addmod(s, tmp, umod);
*x3 = s;
*x2 = r2;
*x1 = r1;
}
#endif
/* forward transform, sign = -1; transform length = 3 * 2**n */
int
four_step_fnt(mpd_uint_t *a, mpd_size_t n, int modnum)
{
mpd_size_t R = 3; /* number of rows */
mpd_size_t C = n / 3; /* number of columns */
mpd_uint_t w3table[3];
mpd_uint_t kernel, w0, w1, wstep;
mpd_uint_t *s, *p0, *p1, *p2;
mpd_uint_t umod;
#ifdef PPRO
double dmod;
uint32_t dinvmod[3];
#endif
mpd_size_t i, k;
assert(n >= 48);
assert(n <= 3*MPD_MAXTRANSFORM_2N);
/* Length R transform on the columns. */
SETMODULUS(modnum);
_mpd_init_w3table(w3table, -1, modnum);
for (p0=a, p1=p0+C, p2=p0+2*C; p0<a+C; p0++,p1++,p2++) {
SIZE3_NTT(p0, p1, p2, w3table);
}
/* Multiply each matrix element (addressed by i*C+k) by r**(i*k). */
kernel = _mpd_getkernel(n, -1, modnum);
for (i = 1; i < R; i++) {
w0 = 1; /* r**(i*0): initial value for k=0 */
w1 = POWMOD(kernel, i); /* r**(i*1): initial value for k=1 */
wstep = MULMOD(w1, w1); /* r**(2*i) */
for (k = 0; k < C-1; k += 2) {
mpd_uint_t x0 = a[i*C+k];
mpd_uint_t x1 = a[i*C+k+1];
MULMOD2(&x0, w0, &x1, w1);
MULMOD2C(&w0, &w1, wstep); /* r**(i*(k+2)) = r**(i*k) * r**(2*i) */
a[i*C+k] = x0;
a[i*C+k+1] = x1;
}
}
/* Length C transform on the rows. */
for (s = a; s < a+n; s += C) {
if (!six_step_fnt(s, C, modnum)) {
return 0;
}
}
#if 0 /* An unordered transform is sufficient for convolution. */
/* Transpose the matrix. */
transpose_3xpow2(a, R, C);
#endif
return 1;
}
/* backward transform, sign = 1; transform length = 3 * 2**n */
int
inv_four_step_fnt(mpd_uint_t *a, mpd_size_t n, int modnum)
{
mpd_size_t R = 3; /* number of rows */
mpd_size_t C = n / 3; /* number of columns */
mpd_uint_t w3table[3];
mpd_uint_t kernel, w0, w1, wstep;
mpd_uint_t *s, *p0, *p1, *p2;
mpd_uint_t umod;
#ifdef PPRO
double dmod;
uint32_t dinvmod[3];
#endif
mpd_size_t i, k;
assert(n >= 48);
assert(n <= 3*MPD_MAXTRANSFORM_2N);
#if 0 /* An unordered transform is sufficient for convolution. */
/* Transpose the matrix, producing an R*C matrix. */
transpose_3xpow2(a, C, R);
#endif
/* Length C transform on the rows. */
for (s = a; s < a+n; s += C) {
if (!inv_six_step_fnt(s, C, modnum)) {
return 0;
}
}
/* Multiply each matrix element (addressed by i*C+k) by r**(i*k). */
SETMODULUS(modnum);
kernel = _mpd_getkernel(n, 1, modnum);
for (i = 1; i < R; i++) {
w0 = 1;
w1 = POWMOD(kernel, i);
wstep = MULMOD(w1, w1);
for (k = 0; k < C; k += 2) {
mpd_uint_t x0 = a[i*C+k];
mpd_uint_t x1 = a[i*C+k+1];
MULMOD2(&x0, w0, &x1, w1);
MULMOD2C(&w0, &w1, wstep);
a[i*C+k] = x0;
a[i*C+k+1] = x1;
}
}
/* Length R transform on the columns. */
_mpd_init_w3table(w3table, 1, modnum);
for (p0=a, p1=p0+C, p2=p0+2*C; p0<a+C; p0++,p1++,p2++) {
SIZE3_NTT(p0, p1, p2, w3table);
}
return 1;
}
/*
* Copyright (c) 2008-2012 Stefan Krah. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#ifndef FOUR_STEP_H
#define FOUR_STEP_H
#include "mpdecimal.h"
#include <stdio.h>
int four_step_fnt(mpd_uint_t *a, mpd_size_t n, int modnum);
int inv_four_step_fnt(mpd_uint_t *a, mpd_size_t n, int modnum);
#endif
This diff is collapsed.
/*
* Copyright (c) 2008-2012 Stefan Krah. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#ifndef IO_H
#define IO_H
#include <errno.h>
#include "mpdecimal.h"
#if SIZE_MAX == MPD_SIZE_MAX
#define mpd_strtossize _mpd_strtossize
#else
static inline mpd_ssize_t
mpd_strtossize(const char *s, char **end, int base)
{
int64_t retval;
errno = 0;
retval = _mpd_strtossize(s, end, base);
if (errno == 0 && (retval > MPD_SSIZE_MAX || retval < MPD_SSIZE_MIN)) {
errno = ERANGE;
}
if (errno == ERANGE) {
return (retval < 0) ? MPD_SSIZE_MIN : MPD_SSIZE_MAX;
}
return (mpd_ssize_t)retval;
}
#endif
#endif
This document contains links to the literature used in the process of
creating the library. The list is probably not complete.
Mike Cowlishaw: General Decimal Arithmetic Specification
http://speleotrove.com/decimal/decarith.html
Jean-Michel Muller: On the definition of ulp (x)
lara.inist.fr/bitstream/2332/518/1/LIP-RR2005-09.pdf
T. E. Hull, A. Abrham: Properly rounded variable precision square root
http://portal.acm.org/citation.cfm?id=214413
T. E. Hull, A. Abrham: Variable precision exponential function
http://portal.acm.org/citation.cfm?id=6498
Roman E. Maeder: Storage allocation for the Karatsuba integer multiplication
algorithm. http://www.springerlink.com/content/w15058mj6v59t565/
J. M. Pollard: The fast Fourier transform in a finite field
http://www.ams.org/journals/mcom/1971-25-114/S0025-5718-1971-0301966-0/home.html
David H. Bailey: FFTs in External or Hierarchical Memory
http://crd.lbl.gov/~dhbailey/dhbpapers/
W. Morven Gentleman: Matrix Multiplication and Fast Fourier Transforms
http://www.alcatel-lucent.com/bstj/vol47-1968/articles/bstj47-6-1099.pdf
Mikko Tommila: Apfloat documentation
http://www.apfloat.org/apfloat/2.41/apfloat.pdf
Joerg Arndt: "Matters Computational"
http://www.jjj.de/fxt/
Karl Hasselstrom: Fast Division of Large Integers
www.treskal.com/kalle/exjobb/original-report.pdf
Bignum support (Fast Number Theoretic Transform or FNT):
========================================================
Bignum arithmetic in libmpdec uses the scheme for fast convolution
of integer sequences from:
J. M. Pollard: The fast Fourier transform in a finite field
http://www.ams.org/journals/mcom/1971-25-114/S0025-5718-1971-0301966-0/home.html
The transform in a finite field can be used for convolution in the same
way as the Fourier Transform. The main advantages of the Number Theoretic
Transform are that it is both exact and very memory efficient.
Convolution in pseudo-code:
~~~~~~~~~~~~~~~~~~~~~~~~~~~
fnt_convolute(a, b):
x = fnt(a) # forward transform of a
y = fnt(b) # forward transform of b
z = pairwise multiply x[i] and y[i]
result = inv_fnt(z) # backward transform of z.
Extending the maximum transform length (Chinese Remainder Theorem):
-------------------------------------------------------------------
The maximum transform length is quite limited when using a single
prime field. However, it is possible to use multiple primes and
recover the result using the Chinese Remainder Theorem.
Multiplication in pseudo-code:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
_mpd_fntmul(u, v):
c1 = fnt_convolute(u, v, P1) # convolute modulo prime1
c2 = fnt_convolute(u, v, P2) # convolute modulo prime2
c3 = fnt_convolute(u, v, P3) # convolute modulo prime3
result = crt3(c1, c2, c3) # Chinese Remainder Theorem
Optimized transform functions:
------------------------------
There are three different fnt() functions:
std_fnt: "standard" decimation in frequency transform for array lengths
of 2**n. Performs well up to 1024 words.
sixstep: Cache-friendly algorithm for array lengths of 2**n. Outperforms
std_fnt for large arrays.
fourstep: Algorithm for array lengths of 3 * 2**n. Also cache friendly
in large parts.
List of bignum-only files:
--------------------------
Functions from these files are only used in _mpd_fntmul().
umodarith.h -> fast low level routines for unsigned modular arithmetic
numbertheory.c -> routines for setting up the FNT
difradix2.c -> decimation in frequency transform, used as the
"base case" by the following three files:
fnt.c -> standard transform for smaller arrays
sixstep.c -> transform large arrays of length 2**n
fourstep.c -> transform arrays of length 3 * 2**n
convolute.c -> do the actual fast convolution, using one of
the three transform functions.
transpose.c -> transpositions needed for the sixstep algorithm.
crt.c -> Chinese Remainder Theorem: use information from three
transforms modulo three different primes to get the
final result.
This diff is collapsed.
This diff is collapsed.
(* Copyright (c) 2011 Stefan Krah. All rights reserved. *)
==========================================================================
Calculate (a * b) % p using special primes
==========================================================================
A description of the algorithm can be found in the apfloat manual by
Tommila [1].
Definitions:
------------
In the whole document, "==" stands for "is congruent with".
Result of a * b in terms of high/low words:
(1) hi * 2**64 + lo = a * b
Special primes:
(2) p = 2**64 - z + 1, where z = 2**n
Single step modular reduction:
(3) R(hi, lo) = hi * z - hi + lo
Strategy:
---------
a) Set (hi, lo) to the result of a * b.
b) Set (hi', lo') to the result of R(hi, lo).
c) Repeat step b) until 0 <= hi' * 2**64 + lo' < 2*p.
d) If the result is less than p, return lo'. Otherwise return lo' - p.
The reduction step b) preserves congruence:
-------------------------------------------
hi * 2**64 + lo == hi * z - hi + lo (mod p)
Proof:
~~~~~~
hi * 2**64 + lo = (2**64 - z + 1) * hi + z * hi - hi + lo
= p * hi + z * hi - hi + lo
== z * hi - hi + lo (mod p)
Maximum numbers of step b):
---------------------------
# To avoid unneccessary formalism, define:
def R(hi, lo, z):
return divmod(hi * z - hi + lo, 2**64)
# For simplicity, assume hi=2**64-1, lo=2**64-1 after the
# initial multiplication a * b. This is of course impossible
# but certainly covers all cases.
# Then, for p1:
hi=2**64-1; lo=2**64-1; z=2**32
p1 = 2**64 - z + 1
hi, lo = R(hi, lo, z) # First reduction
hi, lo = R(hi, lo, z) # Second reduction
hi * 2**64 + lo < 2 * p1 # True
# For p2:
hi=2**64-1; lo=2**64-1; z=2**34
p2 = 2**64 - z + 1
hi, lo = R(hi, lo, z) # First reduction
hi, lo = R(hi, lo, z) # Second reduction
hi, lo = R(hi, lo, z) # Third reduction
hi * 2**64 + lo < 2 * p2 # True
# For p3:
hi=2**64-1; lo=2**64-1; z=2**40
p3 = 2**64 - z + 1
hi, lo = R(hi, lo, z) # First reduction
hi, lo = R(hi, lo, z) # Second reduction
hi, lo = R(hi, lo, z) # Third reduction
hi * 2**64 + lo < 2 * p3 # True
Step d) preserves congruence and yields a result < p:
-----------------------------------------------------
Case hi = 0:
Case lo < p: trivial.
Case lo >= p:
lo == lo - p (mod p) # result is congruent
p <= lo < 2*p -> 0 <= lo - p < p # result is in the correct range
Case hi = 1:
p < 2**64 /\ 2**64 + lo < 2*p -> lo < p # lo is always less than p
2**64 + lo == 2**64 + (lo - p) (mod p) # result is congruent
= lo - p # exactly the same value as the previous RHS
# in uint64_t arithmetic.
p < 2**64 + lo < 2*p -> 0 < 2**64 + (lo - p) < p # correct range
[1] http://www.apfloat.org/apfloat/2.40/apfloat.pdf
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
/*
* Copyright (c) 2008-2012 Stefan Krah. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#ifndef MEMORY_H
#define MEMORY_H
#include "mpdecimal.h"
int mpd_switch_to_dyn(mpd_t *result, mpd_ssize_t size, uint32_t *status);
int mpd_switch_to_dyn_zero(mpd_t *result, mpd_ssize_t size, uint32_t *status);
int mpd_realloc_dyn(mpd_t *result, mpd_ssize_t size, uint32_t *status);
#endif
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
......@@ -734,10 +734,6 @@
RelativePath="..\Include\floatobject.h"
>
</File>
<File
RelativePath="..\Include\formatter_unicode.h"
>
</File>
<File
RelativePath="..\Include\frameobject.h"
>
......
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment