Transparent, Incremental Checkpointing at Kernel Level: a Foundation for Fault Tolerance for Parallel Computers
- 22 December 2005
- proceedings article
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
No abstract available
This publication has 15 references indexed in Scilit:
- System-level fault-tolerance in large-scale parallel machines with buffered coscheduling. Published by Institute of Electrical and Electronics Engineers (IEEE), 2004
- Designing Parallel Operating Systems via Parallel Programming. Published by Springer Nature, 2004
- Architectural support for system software on large-scale clusters. Published by Institute of Electrical and Electronics Engineers (IEEE), 2004
- BCS-MPI. Published by Association for Computing Machinery (ACM), 2003
- Exploiting operating system services to efficiently checkpoint parallel applications in GENESIS. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- User-level communication in a system with gang scheduling. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- BProc. Published by Association for Computing Machinery (ACM), 2002
- The design and implementation of Zap. Published by Association for Computing Machinery (ACM), 2002
- Improved Resource Utilization with Buffered Coscheduling. Parallel Algorithms and Applications, 2001
- Process migration in DEMOS/MP. ACM SIGOPS Operating Systems Review, 1983