04/17/2001

> I am having some difficulties with question 3 on the

> homework. Particularly, there are no graphical examples of

> partitioning in the book or the notes and I am having a hard time

> visualizing examples of this. Partitioning seems to be about breaking

> up the tables, but with the given schema, I don't really see how this

> could create more efficiency. Any help or hints would be greatly

> appreciated.

There are two kinds of partitioning - vertical and horizontal. Each splits the table in a different way. Let me give you an example. Assume you have a student table that has ID as the key and contains name, address, and units completed, QPA, etc.

I could horizontally partition the data by creating tables that have the same columns. So I could create StudentUnderGrad, StudentMasters, StudentPhD, StudentOther. I could also have split up the student file by first letter of last name, StudentAtoL and StudentMtoZ. The benefit of doing this is that I get smaller tables (fewer rows) in each table so queries that only have to search one of the new tables will be faster. The idea is to find a way to partition the table so that a query can use only one (or a few) of the newly created tables rather than having to search one big table.

If I wanted to vertically partition the same Student table, I could create StudentInfo and StudentAddress, where StudentInfo has everything that isn't part of the address. Notice that the ID (key) has to be repeated in the StudentAddress table. In this case both tables have exactly the same number of rows as the original table, but the number of columns in each table is reduced. (The total number of columns increases because the key has to be repeated.) The benefit here is that, again, a smaller table is faster to query against so you want to find a way to partition the table so that only one of the newly created tables has to be accessed to complete a query.

Finally, you can do *both* horizontal and vertical partitioning. The result would be:

StudentUnderGradInfo & StudentUnderGradAddress

StudentMastersInfo & StudentMastersAddress

etc.

The drawback of doing the partitioning is that the complexity of the queries is increased. You now have to know which of the student tables to search to complete a query. Further, partitioning the tables may dramatically increase performance of a single transaction but decrease performance of others since more joins will have to be performed.