Homework # 9 - Key

Distributed Databases & Client-Server

Due: 4/24/2001

70-455

Information Resources Management

4/17/2001

50 points

Use the information about the Carnegie Library System provided for homework #8.

Additionally, the focus will only be on two major areas:

1. Patron searches for book. A variety of search options are available. However, there are two types of searches. The default will be to search using specified criteria for books only at the current branch. A second option is to search other selected branches or all branches for books meeting the same criteria.

2. Librarians check out books to patrons. A patron can check out a book from any branch provided no fines are due. The librarians also enter when books have been returned. It's possible for someone to return a book that belongs in a different branch. In that case, the librarian still enters that the book has been returned and then puts the book aside for delivery to the proper branch.

The central location and site of the current centralized database is the branch in Oakland. The remaining 18 branches are scattered throughout the Pittsburgh area.

1. Describe how you would implement a client-server system for the Library System. Show what functionality would remain on the server(s) and what functionality would be executed on each of the two types of clients (patron search PC, checkout desk PC). A multi-tiered approach may be recommended. Document any assumptions that you may make about the distribution of the data. (10 points)

Keeping the data centralized, a three-tiered client-server system could be established.

Tier 1 - Patron & Checkout Desk PCs

Presentation Services

I/O Processing and Validation

Tier 2 - Branch Server

Some application programming

Tier 3 - Central Database Server

Remainder of application programming - batch database processing

Data management services

2. Describe how you would distribute the database using the following options (20 points)

A. Fragmentation only

Separate the data by branch and put the information for each branch on a database server located in each branch.

B. Replication only

Replicate the entire database in each branch.

C. Both fragmentation and replication

For all branches, except the central location in Oakland, put only the data for that branch. I would most likely not fragment the Patron table, but only keep the centralized copy of that table. In the Oakland branch, keep a copy of the entire database.

3. Using the "both fragmentation and replication distribution" option, describe the processing strategy to be used for each of the following transactions. (10 points)

A. Patron searches for a book in the current branch

Search the locally available database only.

B. Patron searches for a book at one or two other branches

Send the request to the branches

Each branch does the search

Combine the results

Present the results to the patron

C. Patron searches for a book at all other branches

Send the request to the Oakland branch

Receive the results

Present the results to the patron

D. Librarian checks out a book to a patron

Update the Book table at the local branch

Update the Book table at the Oakland branch

Only after both are complete, indicate a successful checkout

E. Librarian enters the return of a book for the current branch

Update the Book table at the local branch

Update the Book table at the Oakland branch

Only after both are complete, indicate success

F. Librarian enters the return of a book for a different branch

Query the Oakland branch to determine which branch "owns" the book

Send the update to that branch

Update the Book table at the Oakland branch

Only after both are complete, indicate success

4. Recommend one of the distribution options. Provide support for your recommendation, and explain the benefits and drawbacks of the option you selected.

The obvious choice is option C for question 2.

Benefits:

Availability

Faster response times for searches

Each branch is responsible for its own data

Drawbacks:

Development costs will be higher, both people & equipment

Testing is more complex

Problem identification & resolution is more difficult

Greater technical expertise is needed to implement and support

Network dependence is increased

Increased processing overhead

Updates are more complex and require more careful programming

(This is minimized some because there are at most only two copies of any row.)