Search for question
Question

2 Part II: Advanced

Problem 4 (9 marks, 1.5 pages). Problem Statement: You have two servers, X and Y,

each containing a database of n records, where each record is quite large (m bits).

Occasionally, a record may be updated on one server and not the other. To ensure both

servers have identical databases, they need to synchronize their data. During

synchronization, they exchange data in multiple rounds of communication.

Existing Algorithm: The simplest way to synchronize the data is for X to send its

entire database x to Y via the internet. Y then checks every record in x against its own

database y. If a record in x is the same as the corresponding record in y, Y does nothing.

If a record in x is different from the one in y, Y updates its database with the newer

record or sends the newer record to X. X, upon receiving a newer record from Y, updates

its own database.

Challenges with Existing Algorithm: While this method works, it consumes a lot

of bandwidth because it sends all n records over the network.

Goal: Your task is to design a more efficient synchronization algorithm that

minimizes communication complexity (sending data) and computation complexity

(calculating hash values). The key is to use cryptographic hash functions, which take

data as input and produce a fixed-size string (hash) as output. These hash functions are

collision-resistant, meaning it's hard to find two different inputs that produce the same

hash.

Requirements:

a) Overview (2 marks): Explain how your algorithm works, why it's a better

solution, and what the communication and computation complexities are.

b) Detailed Description (3 marks): Provide a detailed description of the main

steps of your algorithm, specifying what X and Y do to synchronize the two databases.

You can use pseudocode for this. The pseudocode format is not essential; clarity is key.

c) Demonstration (2 marks): Illustrate how your algorithm works with a small

example, like when n = 8.

d) Analysis (2 marks): State the communication and computation complexities of

your algorithm and explain why it has these complexities. The input size is (n, m).

The given marks are maximum possible for the parts. Your actual marks depend on

how efficient your algorithm is (regarding the communication complexity (major

criteria) AND the computation complexity) and also how well it is presented. A correct

and efficient but poorly written solution could still attract a low mark. An incorrect

solution will get a zero mark.

Fig: 1