Description:
You are asked to develop a distributed mutual exclusion primitive for a number of processes running on the computers on a single switched LAN (our Linux lab).
System Architecture:
Process1      Process2      Process3      Process4    ...
   |             |             |             |
   |             |             |             |
---+-------------+-------------+-------------+------  LAN
The clients and servers run the Network File System (NFS), so user files are visible under the $HOME directory. You may want to set up the following environment:
$HOME/[login to view URL]: a list of computers (hostnames) which are running processes that need the distributed mutual exclusion. There is no reason why your implementation cannot support up to 5 servers.
Your implementation must follow the distributed mutual exclusion algorithm described below.
A process sends a REQUEST message to all other processes to request their permission to enter the critical section. A process sends a REPLY message to a process to give its permission to that process.
Processes use Lamport-style logical clocks to assign a timestamp to critical section requests and timestamps are used to decide the priority of requests.
Each process p_i maintains the Request-Deferred array, RD_i, the size of which is the same as the number of processes in the system.
Initially, ∀i ∀j: RD_i[j] = 0. Whenever p_i defers the request sent by p_j, it sets RD_i[j] = 1, and after it has sent a REPLY message to p_j, it sets RD_i[j] = 0.
Requesting the critical section:
(a) When a site S_i wants to enter the CS, it broadcasts a timestamped REQUEST message to all other sites.
(b) When site S_j receives a REQUEST message from site S_i, it sends a REPLY message to site S_i if site S_j is neither requesting nor executing the CS, or if site S_j is requesting and the timestamp of S_i's request is smaller than the timestamp of S_j's own request. Otherwise, the reply is deferred and S_j sets RD_j[i] = 1.
Executing the critical section:
(c) Site S_i enters the CS after it has received a REPLY message from every site it sent a REQUEST message to.
Releasing the critical section:
(d) When site S_i exits the CS, it sends all the deferred REPLY messages: ∀j, if RD_i[j] = 1, then send a REPLY message to S_j and set RD_i[j] = 0.
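Steps (b) and (d) can be sketched as two small functions. This is only an illustration, not the required design: the names ra_site, ra_on_request and ra_on_release are hypothetical, and breaking timestamp ties by the smaller host ID is an assumption (the usual convention for this algorithm) that the steps above do not spell out.

```c
#include <stdbool.h>

#define MAX_HOSTS 5

enum state { IDLE, REQUESTING, IN_CS };

struct ra_site {
    int id;              /* index i of this site */
    enum state state;    /* what this site is currently doing */
    unsigned my_ts;      /* timestamp of our own request (valid while REQUESTING) */
    int rd[MAX_HOSTS];   /* rd[j] == 1 means the REPLY to site j is deferred */
};

/* Step (b): returns true if a REPLY should be sent to site j right away,
 * false if the reply was deferred (rd[j] has been set to 1). */
bool ra_on_request(struct ra_site *s, int j, unsigned ts)
{
    /* Assumed tie-break: equal timestamps are ordered by host ID. */
    bool theirs_first = ts < s->my_ts || (ts == s->my_ts && j < s->id);

    if (s->state == IDLE || (s->state == REQUESTING && theirs_first))
        return true;

    s->rd[j] = 1;
    return false;
}

/* Step (d): clears every deferred entry, stores the IDs that now need a
 * REPLY in out[], and returns how many there are. */
int ra_on_release(struct ra_site *s, int out[MAX_HOSTS])
{
    int n = 0;

    s->state = IDLE;
    for (int j = 0; j < MAX_HOSTS; j++) {
        if (s->rd[j]) {
            out[n++] = j;
            s->rd[j] = 0;
        }
    }
    return n;
}
```

The caller is expected to send the actual REPLY packets; keeping the decision separate from the socket code makes it easy to check on a single machine.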
Notes:
When a site receives a message, it updates its clock using the timestamp in the message.
When a site takes up a request for the CS for processing, it updates its local clock and assigns a timestamp to the request.
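The two clock rules in these notes could look like the following minimal sketch. It assumes a single scalar Lamport clock per process; the names lamport_clock, clock_on_receive and clock_on_request are hypothetical.

```c
/* Scalar Lamport clock of this process (assumed representation). */
static unsigned lamport_clock = 0;

/* Receive rule: move the clock past the timestamp carried by the message. */
unsigned clock_on_receive(unsigned msg_ts)
{
    if (msg_ts > lamport_clock)
        lamport_clock = msg_ts;
    return ++lamport_clock;
}

/* Request rule: tick the clock and use the new value as the
 * timestamp of the critical section request. */
unsigned clock_on_request(void)
{
    return ++lamport_clock;
}
```

In a multithreaded implementation the clock is shared between the listener and the caller of the lock, so it would need to be protected by a mutex; that is left out here to keep the rules visible.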
Test:
To demonstrate that your distributed mutual exclusion primitive is working, you must provide a test program that runs on all computers listed in your [login to view URL] file. In the test, you may want to simulate bank account balance inquiry, deposit, and withdrawal operations from all computers. The application program must invoke the distributed mutual exclusion primitive before and after the actual bank account read/write operations.
Implementation Guidance:
1. Message Structure:
struct msg_packet {
    unsigned short cmd;            // command, e.g., HELLO | HELLO_ACK | REQUEST | REPLY (HELLO & HELLO_ACK are additional commands for debugging purposes)
    unsigned short seq;            // sequence number to avoid duplicates
    unsigned int hostid;           // optional, because the host (sender) information can be obtained from the IP header; this field is added for convenience
    unsigned short vector_time[5]; // supports up to 5 hosts
};
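If you want the packet to be independent of host byte order, one option is to convert every field before sending and after receiving. The sketch below is an assumption rather than part of the provided platform: the numeric command codes and the helper names msg_to_network and msg_from_network are hypothetical.

```c
#include <arpa/inet.h>

/* Hypothetical numeric values; the handout only names the commands. */
enum { CMD_HELLO = 1, CMD_HELLO_ACK = 2, CMD_REQUEST = 3, CMD_REPLY = 4 };

struct msg_packet {
    unsigned short cmd;
    unsigned short seq;
    unsigned int hostid;
    unsigned short vector_time[5];
};

/* Converts all fields to network byte order, in place, before sendto(). */
void msg_to_network(struct msg_packet *m)
{
    m->cmd = htons(m->cmd);
    m->seq = htons(m->seq);
    m->hostid = htonl(m->hostid);
    for (int i = 0; i < 5; i++)
        m->vector_time[i] = htons(m->vector_time[i]);
}

/* Converts all fields back to host byte order, in place, after recvfrom(). */
void msg_from_network(struct msg_packet *m)
{
    m->cmd = ntohs(m->cmd);
    m->seq = ntohs(m->seq);
    m->hostid = ntohl(m->hostid);
    for (int i = 0; i < 5; i++)
        m->vector_time[i] = ntohs(m->vector_time[i]);
}
```

On a lab where every workstation has the same architecture the conversion changes nothing observable, so it is optional there.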
2. [login to view URL]:
You may artificially give an ID to each host. For example:
[login to view URL] 0
[login to view URL] 1
[login to view URL] 2
[login to view URL] 3
Therefore, the timestamps will be in the same order for each host. The host ID (as the index) is also useful for managing the deferred array.
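Reading one "hostname ID" line of that file might be done as in the sketch below. The struct host_entry, the function parse_host_line and the 63-character hostname limit are assumptions made for illustration, as is the example hostname in the usage note.

```c
#include <stdio.h>

#define MAX_HOSTS 5
#define HOSTNAME_LEN 64

struct host_entry {
    char name[HOSTNAME_LEN];
    int id;
};

/* Parses one "hostname id" line.
 * Returns 0 on success, -1 if the line is malformed or the ID is out of range. */
int parse_host_line(const char *line, struct host_entry *out)
{
    char name[HOSTNAME_LEN];
    int id;

    /* %63s leaves room for the terminating NUL in name[]. */
    if (sscanf(line, "%63s %d", name, &id) != 2)
        return -1;
    if (id < 0 || id >= MAX_HOSTS)
        return -1;

    snprintf(out->name, sizeof out->name, "%s", name);
    out->id = id;
    return 0;
}
```

For example, parse_host_line("lab01.example.edu 3", &h) would fill h.name with the hostname and h.id with 3; calling it once per line read with fgets() builds the whole table.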
3. Blocking:
When a process has sent a request but has not yet received all replies, it is blocked until it receives all of them.
When you use the UDP platform, this blocking is done for you automatically when you invoke the recvfrom function. Note that you need to maintain a counter of how many REPLY messages you have received from the other hosts. If not enough REPLY messages have been received, you need to call recvfrom again (and therefore the process is blocked).
Requirements:
1. You can only use the C programming language to complete this project. Your implementation should not rely on any extra library (to compile your code). Your code must compile using the gcc installed on the Linux workstations in FH 133.
2. Your distributed mutual exclusion primitive must be based on UDP-based message passing.
3. (30 points) The UDP networking platform (code) is provided.
[login to view URL]
Your implementation must be built on the provided UDP platform and perform a message communication test as described in the following:
Suppose you have 5 servers with the IDs 0, 1, 2, 3, 4 (contained in $HOME/[login to view URL]).
Servers with IDs 1, 2, 3, and 4 should be running first and doing nothing (listening);
Server 0 runs next. Server 0 sends a "Hello" message (can be emulated by 4 unicasts) to all other servers;
Once a server receives a "Hello" message, it immediately replies to the sender with a unicast "Reply" message.
When servers 1, 2, 3, and 4 receive a "Hello" from server 0, they wait for a certain amount of time and then each of them sends a "Hello" to everyone else. The waiting times are: server 1: 2s; server 2: 3s; server 3: 5s; server 4: 7s.
The "Hello" messages from servers 1, 2, 3, and 4 should be answered in the same way as the "Hello" from server 0.
Each server prints out the messages it receives with the following format: the sender ID: the message content
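The waiting times and the print format of this test can be captured in two small helpers. This is a sketch only: the names hello_delay_seconds and format_received are hypothetical, the delay of 0 for server 0 is an assumption (it starts the test and does not wait), and "0: Hello" is one reading of the required output format.

```c
#include <stdio.h>

/* Seconds a server waits after hearing server 0's "Hello" before it
 * sends its own. Returns -1 for an ID outside 0..4. */
int hello_delay_seconds(int server_id)
{
    static const int delay[] = { 0, 2, 3, 5, 7 };

    if (server_id < 0 || server_id > 4)
        return -1;
    return delay[server_id];
}

/* Writes "<sender ID>: <message content>" into buf.
 * Returns what snprintf() returns. */
int format_received(char *buf, size_t len, unsigned sender_id,
                    const char *content)
{
    return snprintf(buf, len, "%u: %s", sender_id, content);
}
```

A server would typically call sleep(hello_delay_seconds(my_id)) before sending its own "Hello", and print the formatted buffer for every message it receives.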
4. (60 points) Your distributed mutual exclusion primitive must provide a clean interface (API) such that application programmers only need to invoke the lock and unlock calls before and after the program enters the critical section. While the lock is not available, the calling process must be suspended and be waiting for the availability of the lock. Busy waiting is not allowed.
Distributed Mutual Exclusion API:
1. int distributed_mutex_init();
for initialization, should be called by the user who wants to use the distributed_mutex primitive;
in addition to variable initialization, init() should create a pthread that listens on port 0x3333 (for incoming messages)
return an int: 0 indicates normal and ready; -1 indicates an error.
2. int distributed_mutex_lock();
before the calling user enters the critical section, the user must obtain this lock; otherwise the user is blocked (waiting)
return an int: 0 indicates normal; -1 indicates an error.
3. int distributed_mutex_unlock();
it's the user's responsibility to call unlock before leaving the critical section. This API should send the queued (deferred) REPLY messages to the requesting hosts according to the protocol described above.
return an int: 0 indicates normal; -1 indicates an error.
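From the application's point of view, the three calls might be used as in the sketch below. The three distributed_mutex_* functions are replaced here by trivial stand-in stubs so that the example compiles on its own; the in-memory balance and the names account_deposit and account_withdraw are likewise hypothetical.

```c
/* Stand-in stubs: they only count calls. The real functions are the
 * primitive you are asked to implement. */
static int lock_calls, unlock_calls;
int distributed_mutex_init(void)   { return 0; }
int distributed_mutex_lock(void)   { lock_calls++; return 0; }
int distributed_mutex_unlock(void) { unlock_calls++; return 0; }

/* Hypothetical stand-in for the shared account. In a real test the
 * balance would live somewhere all hosts can see, e.g. a file under $HOME. */
static long balance = 100;

/* Deposits `amount`. Returns 0 on success, -1 on a lock/unlock error. */
int account_deposit(long amount)
{
    if (distributed_mutex_lock() != 0)
        return -1;
    balance += amount;                 /* critical section */
    return distributed_mutex_unlock();
}

/* Withdraws `amount`. Returns 0 on success, -1 on insufficient funds
 * or on a lock/unlock error. The lock is always released. */
int account_withdraw(long amount)
{
    int rc = 0;

    if (distributed_mutex_lock() != 0)
        return -1;
    if (amount > balance)              /* critical section */
        rc = -1;
    else
        balance -= amount;
    if (distributed_mutex_unlock() != 0)
        return -1;
    return rc;
}
```

The point of the sketch is the shape of the calls: every path through the critical section releases the lock exactly once, including the error path in account_withdraw.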
5. Makefile: you need to provide a Makefile that allows the instructor to compile your code by simply typing "make"
6. (10 points) README: you are required to write a README document (in txt format) that describes your project design in detail and the execution sequence (with the commands). In particular, please explicitly state which parts, if any, do not work and the possible reasons why. For the working modules, please give a brief sample output.
******** Further Hint: Server Process Implementation Details. *************
Note that two or more processes may send requests simultaneously. To respond to the requests in a timely manner, each server process should process the requests concurrently. That is, for each received request, a new thread (using the Linux pthread library) should be spawned to service this request.
The duty of the spawned thread is to process the incoming message. Its duties include, but are not limited to, the following:
1. If the host receives a REQUEST message and the host is not requesting, it sends a REPLY immediately;
2. If the host receives a REQUEST message and the host is also requesting:
a) if the host's vector time is earlier than that of the REQUEST, it defers the REPLY and puts the REQUEST in the queue;
b) if the host's vector time is later than that of the REQUEST, it sends a REPLY immediately.
3. If the host receives another kind of message, e.g., HELLO, it replies with an ACK immediately.
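Spawning one thread per received message could be sketched as follows. The names spawn_handler, handler_job and msg_handler_fn are hypothetical; record_cmd together with the handled semaphore is only a demonstration handler, standing in for the real REQUEST/REPLY/HELLO logic.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdlib.h>

struct msg_packet {
    unsigned short cmd;
    unsigned short seq;
    unsigned int hostid;
    unsigned short vector_time[5];
};

typedef void (*msg_handler_fn)(const struct msg_packet *msg);

struct handler_job {
    struct msg_packet msg;   /* private copy owned by the new thread */
    msg_handler_fn fn;
};

static void *handler_main(void *arg)
{
    struct handler_job *job = arg;

    job->fn(&job->msg);
    free(job);
    return NULL;
}

/* Copies the packet to the heap (so the listener can reuse its receive
 * buffer) and services it in a detached thread. Returns 0 or -1. */
int spawn_handler(const struct msg_packet *msg, msg_handler_fn fn)
{
    pthread_attr_t attr;
    pthread_t tid;
    struct handler_job *job = malloc(sizeof *job);

    if (job == NULL)
        return -1;
    job->msg = *msg;
    job->fn = fn;

    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    int rc = pthread_create(&tid, &attr, handler_main, job);
    pthread_attr_destroy(&attr);
    if (rc != 0) {
        free(job);
        return -1;
    }
    return 0;
}

/* Demonstration handler: remembers the command and signals completion.
 * The caller must sem_init(&handled, 0, 0) before using it. */
static sem_t handled;
static unsigned short last_cmd;

void record_cmd(const struct msg_packet *msg)
{
    last_cmd = msg->cmd;
    sem_post(&handled);
}
```

Detaching the thread means nobody has to join it later; the price is that the handler must free its own job, as handler_main does.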
The communication between the spawned thread and the main thread (distributed_mutex_lock()) can be done through a POSIX semaphore:
int distributed_mutex_lock(void) { // parameter setting is up to you
    sem_init(&sem, 0, 0); // initialize the semaphore to 0 (the n-1 REPLYs have not been received yet), so sem_wait() blocks automatically
    // send REQUESTs to the other hosts for entering the CS
    sem_wait(&sem); // while the value is zero, distributed_mutex_lock() waits until someone calls sem_post(&sem)
    // reached here because all REPLYs have been collected
    return 0; // allow the user to enter the CS
}
In your spawned thread that processes the incoming messages, handle a REPLY message as follows:
1. Increase the REPLY counter by 1. You do need to maintain this counter, and it should be protected by a (pthread) mutex.
2. Invoke sem_post(&sem), which increases the semaphore value by 1, only once the REPLY counter has reached n-1.
3. Before doing sem_post(&sem), reset the REPLY counter back to zero.
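The REPLY handling described above can be sketched as one function, under the reading that sem_post(&sem) is called exactly once, when the last of the n-1 REPLYs arrives. The names reply_lock, reply_count and on_reply_received are hypothetical, and NUM_HOSTS is fixed at 5 for the example.

```c
#include <pthread.h>
#include <semaphore.h>

#define NUM_HOSTS 5   /* n; the lock needs n-1 REPLYs */

static sem_t sem;     /* the caller of the lock blocks on this */
static pthread_mutex_t reply_lock = PTHREAD_MUTEX_INITIALIZER;
static int reply_count = 0;

/* Called by a message-handling thread for every REPLY received. */
void on_reply_received(void)
{
    int all_in = 0;

    pthread_mutex_lock(&reply_lock);
    if (++reply_count == NUM_HOSTS - 1) {
        reply_count = 0;      /* reset before waking the waiter */
        all_in = 1;
    }
    pthread_mutex_unlock(&reply_lock);

    if (all_in)
        sem_post(&sem);       /* wakes the sem_wait() in distributed_mutex_lock() */
}
```

Because the waiter sleeps inside sem_wait(), no busy waiting is involved, and the counter is never touched outside the mutex.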