Learning Objectives

Course-Level Outcomes

This assignment directly supports the following course learning objectives for CS 4459:

1. Understand Distributed Systems Communication

  • Explain how RPC enables remote function calls across network boundaries
  • Compare RPC with alternative communication patterns (REST, message queues)
  • Analyze trade-offs between different communication approaches

2. Handle Failures in Distributed Systems

  • Identify failure modes in distributed systems (network partitions, server crashes, timeouts)
  • Implement retry strategies and timeout handling
  • Apply at-least-once and at-most-once semantics appropriately

3. Design for Reliability

  • Implement idempotent operations to enable safe retries
  • Apply the circuit breaker pattern to prevent cascading failures
  • Design fault-tolerant distributed services

4. Apply Industry-Standard Tools

  • Use gRPC and Protocol Buffers for efficient RPC communication
  • Write service definitions using protobuf IDL
  • Generate client and server code from service definitions

Technical Skills

By completing this assignment, you will gain hands-on experience with:

gRPC Framework

  • Define service contracts using .proto files
  • Generate Python code from protobuf definitions
  • Implement both synchronous and asynchronous RPC calls
  • Configure client and server options (timeouts, interceptors)

Distributed Systems Patterns

  • Retry Logic: Exponential backoff, jitter, max attempts
  • Idempotency: Request IDs, deduplication, state management
  • Circuit Breaker: Failure tracking, state transitions, recovery
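
The retry bullet above can be sketched with the standard library alone. This is a minimal illustration, not the assignment's required design: `call_with_retry`, `backoff_delays`, and their parameters are hypothetical names, and the "full jitter" variant (a random delay inside an exponentially growing window) is one common choice among several.

```python
import random
import time


def backoff_delays(max_attempts=5, base=0.1, cap=2.0, rng=random.random):
    """Yield the sleep before each retry: a random point in a doubling window."""
    for attempt in range(max_attempts - 1):
        window = min(cap, base * (2 ** attempt))  # 0.1, 0.2, 0.4, ... capped
        yield rng() * window                      # full jitter: uniform in [0, window)


def call_with_retry(fn, max_attempts=5, base=0.1, cap=2.0,
                    retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call fn() until it succeeds or max_attempts calls have failed."""
    delays = backoff_delays(max_attempts, base, cap)
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(next(delays))
```

Passing `sleep` as a parameter keeps tests fast, since a test can record the delays instead of actually waiting. Only retry operations that are safe to repeat.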

Testing Distributed Systems

  • Simulate network failures and timeouts
  • Test retry behavior and idempotency guarantees
  • Measure latency and throughput under different conditions
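
One simple way to take the measurements mentioned above is to time repeated calls and summarize them. The helper below is a hypothetical sketch (`measure` is not part of gRPC or of the assignment starter code), and it measures sequential calls only; concurrent load would need a different harness.

```python
import statistics
import time


def measure(fn, n=100):
    """Call fn() n times in sequence and report latency and throughput."""
    samples = []
    start = time.perf_counter()
    for _ in range(n):
        t0 = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    samples.sort()
    return {
        "p50": statistics.median(samples),           # median latency in seconds
        "p95": samples[min(n - 1, int(0.95 * n))],   # rough 95th percentile
        "throughput": n / elapsed,                   # calls per second
    }
```

Running `measure` once against a healthy server and once with injected delays gives comparable numbers for the report.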

Conceptual Understanding

What Makes Operations Idempotent?

You'll understand why these operations are idempotent:

set_value(key="x", value=10)  # Always sets x to 10
get_value(key="x")            # Read-only, no side effects
delete(key="x")               # Deleting twice leaves the same end state (x is absent)

And why these are NOT idempotent:

increment(key="x")            # Calling twice increments twice
append(key="x", value=5)      # Calling twice appends twice
withdraw(amount=100)          # Calling twice withdraws $200
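
A non-idempotent operation such as `withdraw` can be made safe to retry by attaching a client-chosen request ID and remembering the result of each ID. The sketch below is illustrative only: the `Account` class and its in-memory dictionary are assumptions, and a real service would also need to persist and expire the stored results.

```python
class Account:
    """Toy account whose withdraw() is deduplicated by request ID."""

    def __init__(self, balance):
        self.balance = balance
        self._results = {}  # request_id -> balance returned for that request

    def withdraw(self, request_id, amount):
        # A retry of an already-processed request returns the cached result
        # instead of withdrawing a second time.
        if request_id in self._results:
            return self._results[request_id]
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount
        self._results[request_id] = self.balance
        return self.balance
```

With this in place, sending request `"r1"` twice withdraws once, while a new ID such as `"r2"` is treated as a separate withdrawal.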

When to Use Circuit Breakers

You'll learn to identify scenarios where circuit breakers prevent cascading failures:

Scenario: Payment Service Down

  • Payment service becomes unavailable
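  • Without circuit breaker: every request waits out the full timeout (5 s × 1,000 requests = 5,000 s of cumulative waiting), tying up threads and connections in the caller
  • With circuit breaker: after a run of failures is detected, calls fail fast without contacting the service (saves resources, better UX)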
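
The fail-fast behavior in this scenario can be sketched as a small state machine with closed, open, and half-open states. Everything here is a hypothetical illustration (class name, thresholds, and the injectable `clock`), it is single-threaded, and it lets any number of trial calls through while half-open; a production breaker would usually limit those.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying it."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half_open"  # enough time has passed to allow a trial call
        return "open"

    def call(self, fn):
        if self.state == "open":
            raise CircuitOpenError("failing fast")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures while closed, opens the circuit.
            if self.state == "half_open" or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0      # any success closes the circuit again
        self.opened_at = None
        return result
```

Wrapping each payment call in `breaker.call(...)` means that, once the threshold is reached, callers get `CircuitOpenError` immediately instead of waiting for the 5 s timeout.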

Assessment Criteria

Your understanding will be assessed through:

1. Implementation Quality (60%)

  • Correct gRPC service implementation
  • Proper error handling and retry logic
  • Working idempotency mechanism
  • Functional circuit breaker

2. Testing & Validation (20%)

  • Comprehensive test scenarios
  • Clear demonstration of failure handling
  • Performance measurements

3. Analysis & Reflection (20%)

  • Written report answering key questions
  • Trade-off analysis (RPC vs REST)
  • Design decisions and justifications

Success Indicators

You've mastered the material when you can:

✅ Explain why retrying a withdrawal operation is dangerous
✅ Design an API that's safe to retry automatically
✅ Identify when circuit breakers improve system resilience
✅ Choose between RPC and REST for a given use case
✅ Implement timeout handling without blocking other requests
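
As an illustration of the last indicator, the sketch below uses `asyncio.wait_for` so that one slow call times out while the others finish normally. `fake_rpc` is a stand-in for a real RPC and `call_with_timeout` is a hypothetical helper; returning `None` on timeout is a simplification, and real code would more likely raise or return an error status.

```python
import asyncio


async def call_with_timeout(coro, timeout):
    """Return the coroutine's result, or None if it exceeds the deadline."""
    try:
        return await asyncio.wait_for(coro, timeout)
    except asyncio.TimeoutError:
        return None


async def fake_rpc(name, delay):
    """Stand-in for a remote call that takes `delay` seconds."""
    await asyncio.sleep(delay)
    return name


async def main():
    # The three calls run concurrently; the slow one does not hold up the others.
    return await asyncio.gather(
        call_with_timeout(fake_rpc("fast", 0.01), timeout=0.2),
        call_with_timeout(fake_rpc("slow", 1.0), timeout=0.05),
        call_with_timeout(fake_rpc("also-fast", 0.02), timeout=0.2),
    )
```

Calling `asyncio.run(main())` returns the two fast results with `None` in the slot of the timed-out call.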


Self-Assessment Questions

Before starting, reflect on these questions:

  1. What happens if a client sends the same request twice?
  2. How long should a client wait before timing out?
  3. When should a system stop retrying and fail fast?
  4. What information do you need to make operations idempotent?

You'll answer these through hands-on implementation!