CS 670 - Distributed Operating Systems Theory

Bulletin Description

This course covers advanced distributed operating system algorithms and theory. Topics include distributed mutual exclusion, distributed event ordering, distributed deadlock detection and avoidance, agreement protocols, consistent global snapshot collection, stable predicate detection, failure recovery, fault-tolerant consensus, leader election, process groups, and group communication. Case studies of distributed operating systems such as LOCUS, Grapevine, V System, ISIS, Amoeba, Sprite, and Mach illustrate these algorithms.

Prerequisites

CS 570 or consent of instructor.

Expected Preparation

CS 670 is intended to be an advanced graduate-level course in distributed systems. Students should have taken a graduate course in operating systems (equivalent to CS 570) covering distributed operating systems, multiprocessor operating systems, database operating systems, and security issues in distributed systems.

Student Learning Outcomes

Students will learn about the design and implementation of distributed systems, including fault-tolerant distributed systems. They will be exposed to various areas of research in distributed systems and mobile computing systems. A student who completes this course successfully will be able to pursue independent research in distributed systems.

Syllabus Information

Week by Week Course Outline:

This is a sample outline. The exact outline will be determined by the instructor offering the course.

Weeks Topics
1 Introduction
2-3 Synchronization, distributed mutual exclusion
4 Deadlock detection/avoidance
5 Consistent global snapshot collection
6 Predicate detection
7 Failure recovery in distributed systems
8-9 Fault-tolerant consensus
10-11 Leader election algorithms, agreement protocols
12-13 Process groups and group communication
14-15 Experimental distributed operating systems

Examinations

The instructor offering the course will determine the exact details of its examinations. Typically there will be one in-class midterm examination during the semester and a two-hour final examination. Specific details will be made available in the syllabus at the start of each semester in which the course is offered.

Grading:

A student's grade will be determined by a weighted average of homework assignments, programming exercises, projects, hour examinations, and the final examination. The faculty offering the course will make the details available at the start of the course. A typical weighting is:

Homework: 40%
Midterm: 25%
Final Examination: 35%

Possible Textbooks:

Nancy A. Lynch
Distributed Algorithms
Morgan Kaufmann

Hagit Attiya and Jennifer Welch
Distributed Computing: Fundamentals, Simulations and Advanced Topics
McGraw Hill

Vijay K. Garg
Principles of Distributed Systems
Kluwer Academic Publishers

Mukesh Singhal and Niranjan G. Shivaratri
Advanced Concepts in Operating Systems
McGraw Hill, 1994

Papers from the literature