Architectural techniques for accelerating subword permutations with repetitions
- 4 August 2003
- journal article
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Very Large Scale Integration (VLSI) Systems
- Vol. 11 (3), 325-335
- https://doi.org/10.1109/tvlsi.2003.812318
Abstract
We propose two new instructions, swperm and sieve, that can be used to efficiently complete an arbitrary bit-level permutation of an n-bit word with or without repetitions. Permutations with repetitions are rearrangements of an ordered set in which elements may replace other elements in the set; such permutations are useful in cryptographic algorithms. On a four-way superscalar processor, we can complete an arbitrary 64-bit permutation with repetitions of 1-bit subwords in 11 instructions and only four cycles using the two proposed instructions. For subwords of size 4 bits or greater, we can perform an arbitrary permutation with repetitions of a 64-bit register in a single cycle using a single swperm instruction. This improves upon previous results by requiring fewer instructions to permute 4-bit or larger subwords packed in a 64-bit register and fewer execution cycles for 1-bit subwords on wide superscalar processors. We also demonstrate that we can accelerate the performance of the popular DES block cipher using the proposed instructions. We obtain a DES performance improvement of at least 55% in constrained embedded environments and an improvement of 71% on a four-way superscalar processor when applying DES as a cryptographic hash function.
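To make the notion of a permutation with repetitions concrete, the following is a minimal software sketch of the operation itself, not of the proposed hardware. The function name `permute_with_repetitions`, the index-list argument, and the convention that position 0 is the least significant subword are assumptions made for illustration; the abstract does not specify the encoding or operand format of swperm or sieve.

```python
MASK64 = (1 << 64) - 1  # keep results within a 64-bit register


def permute_with_repetitions(word, indices, subword_bits=4):
    """Rearrange the subwords of a 64-bit word, allowing repeats.

    indices[i] names the source subword copied into destination
    position i. Position 0 is the least significant subword
    (an assumed convention, not taken from the paper).
    """
    if subword_bits <= 0 or 64 % subword_bits != 0:
        raise ValueError("subword size must divide 64")
    count = 64 // subword_bits
    if len(indices) != count:
        raise ValueError(f"expected {count} indices")

    mask = (1 << subword_bits) - 1
    result = 0
    for dest, src in enumerate(indices):
        if not 0 <= src < count:
            raise ValueError("source index out of range")
        # Extract the source subword and place it at the destination slot.
        subword = (word >> (src * subword_bits)) & mask
        result |= subword << (dest * subword_bits)
    return result & MASK64
```

With 4-bit subwords, passing the indices 15 down to 0 reverses the sixteen nibbles of a word, which is an ordinary permutation. Passing sixteen copies of index 0 replicates the lowest nibble into every position, which is only possible when repetitions are allowed. Setting `subword_bits=1` with 64 indices models the bit-level case that the abstract reports as taking 11 instructions with the proposed hardware.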