High-Speed Software Implementation of the Optimal Ate Pairing over Barreto-Naehrig Curves
Abstract
We release the source code of the library described in our ePrint paper (High-Speed Software Implementation of the Optimal Ate Pairing over Barreto-Naehrig Curves).
The library computes the optimal ate pairing over a 254-bit prime field Fp in
just 2.33 million clock cycles on a single core of an Intel Core i7 2.8GHz processor,
which implies that the pairing computation takes 0.836msec.
NEW VERSION[2012/Jan/30]
The new version computes the optimal ate pairing in just 1.38M clock cycles on a single core of an Intel Core i7 2620 3.4GHz processor,
which implies that the pairing computation takes 0.409msec.
(2012/Feb/2) 1.55M clock cycles on Core i7 2600K 3.4GHz without turbo boost.
To the best of our knowledge, this is the first time that a software implementation or a hardware accelerator has reported a high-security-level pairing computation, either symmetric or asymmetric,
on either a single core or a multi-core platform, in less than one millisecond.
Remark :
Faster Explicit Formulas for Computing Pairings over Ordinary Curves [Diego F. Aranha, Koray Karabina, Patrick Longa, Catherine H. Gebotys, and Julio López]
They released a faster implementation on 2010/Oct/13.
NEW VERSION[2012/Jan/30]
Our new version probably runs at about the same speed as their implementation.
Keywords
Tate pairing, optimal pairing, Barreto-Naehrig curve, ordinary curve, finite field arithmetic,
bilinear pairing software implementation.
Download
The latest version :
ate.20120130.zip
old version :
ate.20100908.zip
Requirements
- CPU
- x86_64(amd64) Intel/AMD processor
- OS
- 64-bit Windows/64-bit Linux/Mac OS X
- C++ compiler
- Visual Studio 2008 or gcc 4.4.1 or later
Build on Windows
Open ate/ate.sln and build test_bn in Release mode;
the binary is generated at ate/x64/Release/test_bn.exe
Build on Linux/Mac OS
Type
>cd ate && make
then you can get the test binary in ate/test/bn
Benchmark
NEW VERSION[2012/Jan/30]
Optimal ate pairing for ate.20120130.zip
| CPU | OS | M clock cycles |
| Core i7 2620 3.4GHz(with turbo boost) | Win7 | 1.385 |
| Core i7 2600K 3.4GHz(without turbo boost) | Linux | 1.552 |
| Opteron 2376 2.3GHz | Linux | 1.669 |
| Xeon x5650 2.67GHz | Linux | 1.630 |
| Core i5 M 520 2.5GHz | Win7 | 1.603 |
| Core2Duo T7100 1.8GHz | Win7 | 1.998 |
| Core2Duo T9400 2.53GHz | Mac | 2.155 |
| Core i7 2720QM 2.2GHz | Mac | 1.078 |
The following results are for the older version.
Cycle counts of multiplication over Fp2, squaring over Fp2, and optimal ate pairing
| | Our results | | | | dclxvi [1st version] | | |
| | Core i7 (a) | Opteron (b) | Core 2 Duo (c) | Athlon 64 (d) | Opteron (b) | Core 2 Quad (e) | Athlon 64 (d) |
| Multiplication over Fp2 | 435 | 443 | 558 | 473 | 695 | 693 | 1714 |
| Squaring over Fp2 | 342 | 355 | 445 | 376 | 614 | 558 | 1207 |
| Miller loop | 1.33M | 1.36M | 1.68M | 1.48M | 2.48M | 2.26M | 5.76M |
| Final exponentiation | 1.00M | 1.04M | 1.27M | 1.08M | 2.52M | 2.21M | 5.51M |
| Optimal ate pairing | 2.33M | 2.40M | 2.95M | 2.56M | 5.0M | 4.47M | 11.27M |
- (a) Intel Core i7 860(2.8GHz), Windows 7, Visual Studio 2008 Professional
- (b) Quad-Core AMD Opteron 2376(2.3GHz), Linux 2.6.18, gcc 4.4.1
- (c) Intel Core 2 Duo T7100(1.8GHz), Windows 7, Visual Studio 2008 Professional
- (d) AMD Athlon(tm) 64 X2 Dual Core Processor 6000+, Linux 2.6.23, gcc 4.1.2
- (e) Intel Core 2 Quad Q6600(2394MHz), Linux 2.6.28, gcc 4.3.3
License
The BSD 3-Clause License
If you have any questions or problems, please let me know; I would be happy to hear from you.
Xbyak
Xbyak is an x86/x64 JIT assembler for C++.
I made this library to develop the pairing functions efficiently.
Link
History
2012/Jan/30 new version
2010/Sep/8 change xi from u + 12 to u
2010/Jul/16 new version of pairing(use cyclotomic squarings)
2010/Jun/24 add the result of Athlon 64
2010/Jun/23 [trivial] rename r_atePairing as opt_atePairing
2010/Jun/20 first release
1st:2010/Jun/20, last update:2012/01/26
mailto:MITSUNARI Shigeo<herumi@nifty.com>