Scalar operand networks: on-chip interconnect for ILP in partitioned architectures
- 27 August 2003
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
The bypass paths and multiported register files in microprocessors serve as an implicit interconnect to communicate operand values among pipeline stages and multiple ALUs. Previous superscalar designs implemented this interconnect using centralized structures that do not scale with increasing ILP demands. In search of scalability, recent microprocessor designs in industry and academia exhibit a trend towards distributed resources such as partitioned register files, banked caches, multiple independent compute pipelines, and even multiple program counters. Some of these partitioned microprocessor designs have begun to implement bypassing and operand transport using point-to-point interconnects rather than centralized networks. We call interconnects optimized for scalar data transport, whether centralized or distributed, scalar operand networks. Although these networks share many of the challenges of multiprocessor networks such as scalability and deadlock avoidance, they have many unique requirements, including ultra-low latencies (a few cycles versus tens of cycles) and ultra-fast operation-operand matching. This paper discusses the unique properties of scalar operand networks, examines alternative ways of implementing them, and describes in detail the implementation of one such network in the Raw microprocessor. The paper analyzes the performance of these networks for ILP workloads and the sensitivity of overall ILP performance to network properties.
This publication has 9 references indexed in Scilit:
- The implementation of the next-generation 64b Itanium microprocessor. Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
- Anatomy of a Message in the Alewife Multiprocessor. Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
- Increasing and detecting memory address congruence. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- The RAW benchmark suite: computation structures for general purpose computing. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- The Raw microprocessor: a computational fabric for software circuits and general-purpose programs. IEEE Micro, 2002
- Space-time scheduling of instruction-level parallelism on a raw machine. Published by Association for Computing Machinery (ACM), 1998
- Partitioned register file for TTAs. Published by Institute of Electrical and Electronics Engineers (IEEE), 1995
- THE EVOLUTION OF DATAFLOW ARCHITECTURES: FROM STATIC DATAFLOW TO P-RISC. International Journal of High Speed Computing, 1993
- A VLSI Architecture for Concurrent Data Structures. Published by Springer Nature, 1987