Computer Science 441

Principles of Distributed Systems

Gregory M. Kapfhammer


flickr photo shared by Patrick Brosset under a Creative Commons ( BY-NC ) license

Color Scheme

Key Concept

Corresponding Diagram

In-Class Discussion

In-Class Activity

Details in the Textbook

Key Advances in Technology

Disks, CPU, RAM

High speed networks

Centralized versus Distributed

Definition of a Distributed System

What role does software play in this system?

Software components should be autonomous

Users think that they see a single system

No Assumptions

Computer type (e.g., Mica mote to supercomputer)

Connection type (e.g., weak wireless to ATM)

What are the implications of this decision?


flickr photo shared by Express Monorail under a Creative Commons ( BY-NC-ND ) license

Key Characteristics

Differences should be hidden from users

Users can interact in a consistent fashion

Should be continuously available

Should be easy to scale the system

What is the meaning of scalability?

Form teams to draw scalability graphs

Horizontal and vertical axis

Linear scalability

Sub-linear scalability

Super-linear scalability

Which one is the most realistic? Why?
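One way to label the three curves, as a minimal sketch: take speedup on n nodes to be T(1) / T(n) and compare it with n. The function names, the 5% tolerance, and the timings are assumptions made up for illustration, not measurements from any system.

```python
def speedup(t_one: float, t_n: float) -> float:
    """Speedup of an n-node run relative to the single-node run."""
    return t_one / t_n


def classify(t_one: float, t_n: float, n: int, tol: float = 0.05) -> str:
    """Label one measurement as linear, sub-linear, or super-linear."""
    s = speedup(t_one, t_n)
    if abs(s - n) <= tol * n:  # within tolerance of the ideal line
        return "linear"
    return "super-linear" if s > n else "sub-linear"


# Hypothetical timings (seconds) for the same workload on 1 node and 4 nodes.
for t_4 in (25.0, 40.0, 20.0):
    print(t_4, classify(100.0, t_4, 4))
```

On a graph with nodes on the horizontal axis and speedup on the vertical axis, the ideal line is y = x; points below it are sub-linear and points above it are super-linear.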

Middleware

Refer to Figure 1-1 for more details!

What are examples of middleware?

Single-system view

Organize the system into layers

Hide the differences in computers!

Goals

Make resources available

Achieve distribution transparency

Allow for system openness

Make the system scalable

Let's discuss these in greater detail!

Also, be aware of the pitfalls and trade-offs!


flickr photo shared by p!o under a Creative Commons ( BY-NC-ND ) license

Transparency

The goal is to be able to hide ...

Access: data representation and access method

Location: where a resource resides

Migration: when a resource moves

Relocation: that a resource is moving while in use

Replication: that the resource is copied

Concurrency: that the resource is shared

Failure: that the resource crashed and recovered
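Location and migration transparency can be sketched with a small directory that maps a logical name to a current address. The `NameService` class, the `fetch` helper, and the addresses are hypothetical and exist only for this illustration.

```python
from typing import Dict


class NameService:
    """Hypothetical directory mapping logical names to current addresses."""

    def __init__(self) -> None:
        self._where: Dict[str, str] = {}

    def bind(self, name: str, address: str) -> None:
        """Record (or update) where the named resource currently lives."""
        self._where[name] = address

    def resolve(self, name: str) -> str:
        """Look up the current address for a logical name."""
        return self._where[name]


def fetch(names: NameService, name: str) -> str:
    """Client code: it knows the logical name only, never the address."""
    return f"GET {name} via {names.resolve(name)}"


names = NameService()
names.bind("course-notes", "10.0.0.5:8080")
before = fetch(names, "course-notes")
names.bind("course-notes", "10.0.7.9:8080")  # the resource migrates
after = fetch(names, "course-notes")
print(before)
print(after)
```

The client call is identical before and after the move; only the binding changes. Note that the directory itself is a single point of lookup, which is one of the trade-offs worth discussing.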

What are some of the challenges?

Let's examine transparency issues in greater detail!

Using Caches

What is a cache?

How does a cache improve scalability?

What are the challenges with using a cache?

Remember, caching is a special form of replication
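A minimal sketch of both sides of caching, assuming a time-to-live (TTL) expiry policy: hits avoid a trip to the origin, which helps scalability, but a hit may return a stale copy. The `TTLCache` class, the dictionary standing in for the origin server, and the fake clock are all assumptions for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Read-through cache whose entries expire after ttl seconds."""

    def __init__(self, fetch: Callable[[str], str], ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}
        self.misses = 0

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # hit: served locally, possibly stale
        self.misses += 1
        value = self._fetch(key)  # miss: one trip to the origin
        self._entries[key] = (now, value)
        return value


origin = {"page": "v1"}  # stands in for the remote resource
now = [0.0]              # fake clock so the example is deterministic
cache = TTLCache(origin.__getitem__, ttl=10.0, clock=lambda: now[0])

first = cache.get("page")   # miss: fetched from the origin
origin["page"] = "v2"       # the origin changes behind the cache
stale = cache.get("page")   # hit: the cached copy is now out of date
now[0] = 11.0               # the entry expires
fresh = cache.get("page")   # miss: refetched from the origin
print(first, stale, fresh, cache.misses)
```

The second read shows the consistency challenge: the cached replica and the origin disagree until the entry expires.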

Failure Transparency

How do we distinguish between a resource that has crashed and one that is slow?
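A sketch of why this question is hard, assuming the only tool available is a timeout: the observer sees "no reply yet" in both cases. The `probe` function and the delays are hypothetical; a delay of `None` models a crashed resource that never replies.

```python
from typing import Optional


def probe(reply_delay: Optional[float], timeout: float) -> str:
    """What a timeout-based detector concludes about one request."""
    if reply_delay is not None and reply_delay <= timeout:
        return "alive"
    return "suspected"  # crashed, or merely slower than the timeout


healthy = probe(0.2, timeout=1.0)
slow = probe(5.0, timeout=1.0)      # still running, just slow
crashed = probe(None, timeout=1.0)  # will never reply
print(healthy, slow, crashed)
```

The slow and the crashed resource receive the same verdict. A longer timeout reduces false suspicions but delays the detection of real crashes.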

Now, let's focus on scalability concerns

Scalability

Size: Add more users and resources

Geography: Allow users to exist "far away"

Administrative: Span many organizations

What are the challenges associated with scalability?

Scalability Bottlenecks

Centralized services

Centralized data

Centralized algorithms

See page 11 for the definition of a distributed algorithm

Synchronous vs. Asynchronous

See Figure 1-4 for details

Which one would be faster? Why?
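A small experiment for the "which is faster" question, sketched with Python's `asyncio`. The `request` coroutine is a stand-in for a remote call and the 0.05 second latency is an assumption; this is not a reproduction of the textbook figure.

```python
import asyncio
import time


async def request(name: str, delay: float) -> str:
    """Stand-in for a remote call; the sleep models network latency."""
    await asyncio.sleep(delay)
    return f"{name}: done"


async def one_at_a_time(names, delay):
    """Wait for each reply before issuing the next request."""
    return [await request(n, delay) for n in names]


async def overlapped(names, delay):
    """Issue every request first, then collect the replies."""
    return await asyncio.gather(*(request(n, delay) for n in names))


def timed(coro):
    """Run a coroutine to completion and report its wall-clock time."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


names = ["a", "b", "c"]
sync_result, sync_secs = timed(one_at_a_time(names, 0.05))
async_result, async_secs = timed(overlapped(names, 0.05))
print(f"one at a time: {sync_secs:.2f}s, overlapped: {async_secs:.2f}s")
```

Both versions return the same replies. The overlapped version hides latency by doing the waiting in parallel, which only helps when the caller has other work or other requests to issue in the meantime.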

Types of Distributed Systems

Computing

Information

Embedded (or pervasive)

Distributed Computing Systems

High-performance computing tasks

Cluster and grid computing systems

The focus is on the computation!

Distributed Information Systems

Transaction processing

Enterprise application integration

The focus is on the data!

ACID Properties

Atomic, consistent, isolated, durable
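Atomicity and consistency can be tried out on a single node with Python's built-in `sqlite3` module, as a minimal sketch. The `account` table, the names, and the `transfer` helper are assumptions for illustration; a distributed transaction would additionally need a commit protocol across nodes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, "
             "balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(src: str, dst: str, amount: int) -> bool:
    """Move amount from src to dst as a single transaction."""
    try:
        with conn:  # commits on success, rolls back on an exception
            # Credit first, so a failed debit leaves a partial update to undo.
            conn.execute("UPDATE account SET balance = balance + ? "
                         "WHERE name = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? "
                         "WHERE name = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:  # the CHECK constraint rejected the debit
        return False


def balances() -> dict:
    """Current balance of every account, keyed by name."""
    return dict(conn.execute("SELECT name, balance FROM account"))


ok = transfer("alice", "bob", 30)
after_ok = balances()
failed = transfer("alice", "bob", 500)  # alice cannot cover this
after_failed = balances()
print(ok, after_ok, failed, after_failed)
```

The failed transfer leaves no trace: the credit to bob is rolled back together with the rejected debit, so the total across accounts never changes.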

Distributed Pervasive Systems

Embrace contextual changes

Encourage ad-hoc composition

Recognize sharing as the default

Sensor Networks

TinyOS and TinyDB

Compare and contrast the systems in Figure 1-13