Distributed Systems

 
Course code: X_400130
Period: Period 2
Credits: 6.0
Language of tuition: English
Faculty: Faculteit der Exacte Wetenschappen
Coordinator: prof. dr. ir. A. Iosup
Examiner: prof. dr. ir. A. Iosup
Lecturers: prof. dr. ir. A. Iosup
Teaching method(s): Lecture, Seminar
Level: 400

Course objective

After taking this course, students will be able to:
1. Explain the basic concepts, objectives, and functions of distributed
computing systems, e.g., communication, resource management and
scheduling, data consistency, fault-tolerance, performance.
2. Compare distributed computing with other computing paradigms (i.e.,
centralized, parallel).
3. Identify the different flavors of modern distributed systems (i.e.,
peer-to-peer systems, cluster computing, grid computing, cloud
computing, datacenters, distributed HPC, SDN, Big Data systems, IoT
systems).
4. Analyze the trade-offs inherent in the design of modern distributed
systems.
5. Design your portfolio distributed system, covering many basic and
some complex operations of modern distributed systems.
6. Implement your portfolio distributed system.
7. Analyze your portfolio distributed system.

Course content

This course focuses on distributed computing systems. In general,
debugging and tuning existing systems, and designing, implementing, and
analyzing new distributed computing systems, remain vital and
challenging.

Since the mid-1990s, computing has been undergoing a revolution, in
which collections of independent computers appear to users as a single,
albeit distributed, computing system. Motivated by the advent of the
Internet, by the increase in the computation capacity of consumer
computers, by the commoditization of server-grade machines, by energy
constraints, etc., the distributed computing paradigm has permeated all
fields using computers. Current distributed computing applications range
from social networks to banking, from peer-to-peer file-sharing to
high-performance computing used in research, from massively multiplayer
online games to business-critical workloads, etc. Important advances
have helped to fuse heterogeneous resources into truly global
distributed systems, for example in scientific computing, where
distributed computation uses Big Data and distributed sensors to
produce meaningful progress for humankind. In this course, we will
focus on a number of these modern examples of distributed computing
systems.

Although so many distributed systems already exist, the list of
conceptual and technical challenges they pose is long. Depending on
requirements, even trivial communication between nodes of the
distributed system can be challenging. The failure of a single node, or
sometimes even a performance hiccup, can bring an entire system down;
with it, other nodes or entire other systems may also crash,
experiencing correlated and catastrophic failures. Data consistency and
node coordination remain important challenges, made worse by the
large scale of real-world deployments. Poor resource management and
naive scheduling can lead to orders-of-magnitude higher operational
costs and consumption of energy that we simply cannot spare. It is not
uncommon for a modern distributed system to quickly rise and then fall
in popularity, as exemplified by Pokémon Go in 2016. In this course, we
will present real-world situations where modern distributed systems
have behaved poorly.

Addressing these challenges requires unique approaches and concepts.
Separating concerns and breaking down problems into smaller cases often
lead to limited success, because many properties of distributed systems
can only be achieved end-to-end. Can anyone imagine a perfectly reliable
production pipeline, if even one of its key stages can suffer failures?
Building capability by adding resources is often offset by the
distributed nature of the system. Can anyone ignore the physical
limitations of communication around the globe? In this course, we will
focus on the unique approaches and principles of distributed systems,
from specific architectures and communication protocols, to specific
concepts in resource management and scheduling, data consistency,
fault-tolerance, and performance.
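
The two questions above can be made concrete with a little arithmetic. The
following is a minimal sketch, not part of the course material: it assumes
that pipeline stages fail independently, and that signals in optical fiber
travel at roughly two thirds of the speed of light in vacuum. The function
names, the 99% stage reliability, and the 20,000 km distance are
illustrative assumptions.

```python
import math

SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum
FIBER_FACTOR = 2 / 3  # assumption: propagation in fiber is about 2/3 of c


def pipeline_reliability(stage_reliabilities):
    """Probability that all stages succeed, assuming independent failures."""
    return math.prod(stage_reliabilities)


def min_round_trip_ms(distance_km, speed_factor=FIBER_FACTOR):
    """Lower bound on round-trip time, ignoring routing and processing."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * speed_factor)
    return 2 * one_way_s * 1000


if __name__ == "__main__":
    # Ten stages that each succeed 99% of the time (hypothetical numbers).
    print(f"pipeline reliability: {pipeline_reliability([0.99] * 10):.3f}")
    # Roughly the distance between opposite points on the globe.
    print(f"minimum round trip: {min_round_trip_ms(20_000):.0f} ms")
```

Under these assumptions, ten stages that are each 99% reliable yield a
pipeline that succeeds only about 90% of the time, and a round trip
between opposite sides of the globe cannot take less than about 200 ms,
no matter how many resources are added.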

Form of tuition

The course is taught as a series of lectures, in combination with
self-study and with a large practical assignment.

Type of assessment

Written exam. Depending on enrollment, an oral exam may also be
available.
Report on the practical assignment.

Course reading

The course textbook is:
Maarten van Steen and Andrew S. Tanenbaum, Distributed Systems, 3rd
ed., online edition, 2017 (free for all). [Online] Available:
https://www.distributed-systems.net/index.php/books/distributed-systems-

The lecture slides also recommend additional literature. The study guide
available on Canvas also indicates other worthwhile sources of
information.

Entry requirements

Ability to work in teams for the practical assignment.
Ability to develop code using modern software engineering practices,
e.g., setting up your own GitHub repository or co-editing using tools
such as Overleaf or ShareLaTeX, is a big plus.

Recommended background knowledge

Students should have taken standard courses on:
- Computer networks.
- Programming paradigms, in particular OOP and/or actor-based
approaches.
- Software engineering.
Prior experience with Internet, web, distributed, or parallel
programming is helpful.
Prior experience with operating systems development and analysis, and in
general experience with computer systems courses, is a big plus.

Target audience

mCS, mPDCS, mSNE (UvA)

Remarks

This course uses gamification.

© Copyright VU University Amsterdam