This technical report describes lessons learned at Sandia from implementing the Message Passing Interface (MPI) standard and some proposed extensions to MPI. The implementations were developed using Sandia-developed lightweight kernels running on the Intel Paragon and Intel TeraFLOPS platforms. The motivations for this research are discussed, and a detailed analysis of several implementation issues is presented.