ABSTRACT
This paper presents an extension of the Spiral system to automatically generate and optimize FFT algorithms for the discrete Fourier transform over finite fields. The generated code is intended to support modular algorithms for multivariate polynomial computations in the modpn library used by Maple. The resulting code provides an order of magnitude speedup over the original implementations in the modpn library, and the Spiral system provides the ability to automatically tune the FFT code to different computing platforms.
- {FLMS06} A. Filatei, X. Li, M. Moreno Maza, and É. Schost. Implementation techniques for fast polynomial arithmetic in a high-level programming environment. In Proc. ISSAC'06, pp 93--100, New York, NY, USA, 2006. ACM Press. Google ScholarDigital Library
- {FVP06} Franz Franchetti, Yevgen Voronenko and Markus Püschel, "FFT Program Generation for Shared Memory: SMP and Multicore," Proc. Supercomputing (SC), 2006. Google ScholarDigital Library
- {FVP08} Franz Franchetti, Yevgen Voronenko and Markus Püschel, "A Rewriting System for the Vectorization of Signal Transforms," Proc. High Performance Computing for Computational Science (VECPAR), Lecture Notes in Computer Science, Springer, Vol. 4395, pp. 363--377, 2006. Google ScholarDigital Library
- {LM06} X. Li and M. Moreno Maza. Efficient implementation of polynomial arithmetic in a multiple-level programming environment. In A. Iglesias and N. Takayama, editors, Proc. International Congress of Mathematical Software - ICMS 2006, pp 12--23. Springer, 2006. Google ScholarDigital Library
- {LMP09} Xin Li, Marc Moreno Maza, and Wei Pan. Computations modulo regular chains. In ISSAC '09: Proceedings of the 2009 international symposium on Symbolic and algebraic computation, pp 239--246, New York, NY, USA, 2009. ACM. Google ScholarDigital Library
- {Mont85} P. L. Montgomery. Modular Multiplication Without Trial Division. Mathematics of Computation, vol. 44, no. 170, pp. 519--521, 1985.Google ScholarCross Ref
- {PMJ05} Markus Püschel, José M. F. Moura, Jeremy Johnson, David Padua, Manuela Veloso, Bryan Singer, Jianxin Xiong, Franz Franchetti, Aca Gacic, Yevgen Voronenko, Kang Chen, Robert W. Johnson, and Nick Rizzolo SPIRAL: Code Generation for DSP Transforms Proceedings of the IEEE special issue on "Program Generation, Optimization, and Adaptation," Vol. 93, No. 2, 2005, pp. 232--275.Google ScholarCross Ref
- {www.spiral.net} Spiral project website. http://www.spiral.net, 2010.Google Scholar
Index Terms
- Spiral-generated modular FFT algorithms
Recommendations
FFT blitz: the tensor cores strike back
PPoPP '21: Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel ProgrammingThe fast Fourier Transform (FFT), a reduced-complexity formulation of the Discrete Fourier Transform (DFT), is an important tool in many areas of science and engineering. FFTW is a well-known package that follows this approach and is currently one of ...
Automatic Parallel Library Generation for General-Size Modular FFT Algorithms
CASC 2013: Proceedings of the 15th International Workshop on Computer Algebra in Scientific Computing - Volume 8136This paper presents the automatic library generation for modular FFT algorithms with arbitrary input sizes. We show how to represent the transform and its algorithms at a high abstraction level. Symbolic manipulations and code optimizations that use ...
The Discrete Fourier Transform Over Finite Rings with Application to Fast Convolution
Necessary and sufficient conditions for a direct sum of local rings to support a generalized discrete Fourier transform are derived. In particular, these conditions can be applied to any finite ring. The function O(N) defined by Agarwal and Burrus for ...
Comments