Carnegie Mellon University
95-702 Organizational Communications and Distributed Object Technologies

Lab 1 is due at the start of class on May 30, 2007.

StAX The Streaming API for XML processing
=========================================

Most of the projects in this course will involve clients interacting with remote objects in various ways. In this project, however, we will write a client that simply reads data marked up in XML from a remote source. We will use the StAX application programming interface to read the XML.

StAX is a relatively new approach to processing XML. Older approaches include the Document Object Model (DOM) and the Simple API for XML (SAX).

Later in this course, we will use the Axis2 web service framework from Apache. This modern framework uses StAX to read and write XML messages, which is another reason for our interest in StAX in this lab.

Software Prerequisites
======================

In this project you will need a recent version of Java (see http://java.sun.com/javase/). You will also need Eclipse as well as the two jar files implementing the streaming API for XML (StAX) (see https://sjsxp.dev.java.net/). These jars are named sjsxp.jar and jsr173_1.0_api.jar and should be included in your Java build path when using Eclipse.

Project Description
===================

There are four schedules found under the directory www.andrew.cmu.edu/~mm6/95-702/McCarthysSchedule. These schedules are named schedule1.xml, schedule2.xml and so on. There is also an XSDL document called schedule.xsd that contains the grammar for the schedule language. Use a browser to examine one of the four schedules and study the schedule grammar carefully.

Write a program in Java called Scheduler that attempts to find a meeting time when n > 1 people are free to meet. Scheduler examines a set of schedules and tries to find a meeting time for each day of the week. If it is able to find a common meeting time then it displays the day and time of the meeting.
If it is unable to find such a time, it announces that fact for that particular day. It does this for each of the seven days. That is, for each day, a common meeting time is either announced or declared as not possible.

The input to the scheduling process will include a minimum meeting time in seconds. If a meeting time of 60 * 60 seconds is required then the scheduler will not generate meeting times for anything less than one hour.

Scheduler will read a list of URLs from a local urlList.xml file. It will then fetch an XML document from each of these URLs. It will compute any common meeting times and display a report to the user.

Looking for two-hour meeting times, when applied to all four schedules, my scheduler produces the following output:

    java Scheduler
    Loading 4 schedules.
    **This group can't meet on Monday**
    **This group can't meet on Tuesday**
    **This group can't meet on Wednesday**
    **This group can't meet on Thursday**
    **This group can't meet on Friday**
    **This group can't meet on Saturday**
    Meeting scheduled for a minimum of 7200seconds at 13:0:0:16:0:0 on Sunday.

You may hard code the minimum meeting time or provide it as a command line argument.

Suggested Approach
==================

It is suggested that you write an object oriented solution to this problem. Clearly, one type of object that we want to represent is a schedule object.

Define a Schedule class with a single constructor and a single accessor method. The constructor for a Schedule object will take a URL object as an input parameter. It will then read the entire document using StAX and extract and retain data. These data will be made available to the user via the second method of this class. This second method will be called getAvailable and will have a signature similar to the following:

    public LinkedList getAvailable(String day) throws Exception

There will be a second class called URLList that is used to represent objects holding URLList documents.
Its constructor takes a single URL as an input parameter and uses StAX to read the contents of a URLList document. It retains data and makes these data available via two accessor methods. These methods have the following signatures:

    public int getNumURLs() throws Exception
    public String getURL(int i) throws Exception

Another class that maintains a start and stop time and provides utility methods for time calculations is also called for.

The scheduling activity might best be written as a static method of the Scheduler class.

Finally, as a general rule, don't try to solve the entire problem at once. Instead, focus on building classes and objects that will act as useful tools. Test these on smaller instances of the problem. Solve smaller problems first.

Developing a simple grammar
===========================

It is required that you design an XSDL grammar for the urlList.xml file. See the grammar associated with schedule documents for a guide. Also, you are required to complete the W3Schools tutorial on XML Schema (see http://www.w3schools.com/schema/default.asp). Knowledge of XSDL will be of help later in the course when we study web services and WSDL.

Here is a copy of my urlList.xml file. It is currently configured to provide four schedules to the scheduler.

    http://www.andrew.cmu.edu/~mm6/95-702/McCarthysSchedule/schedule1.xml
    http://www.andrew.cmu.edu/~mm6/95-702/McCarthysSchedule/schedule2.xml
    http://www.andrew.cmu.edu/~mm6/95-702/McCarthysSchedule/schedule3.xml
    http://www.andrew.cmu.edu/~mm6/95-702/McCarthysSchedule/schedule4.xml

Use the following program to validate urlList.xml files against the grammar that you write. The grammar must require n > 1 schedule URLs. The program below makes use of Xerces. Information on Xerces and the required jars can be found at the following URL: http://xerces.apache.org/xerces-j/. Currently, as far as I am aware, there is no XSDL validation available with StAX parsers.
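
As an aside before the validation program, here is a minimal sketch of how the URLList class might pull the schedule URLs out of a urlList document with StAX. It is only an illustration under stated assumptions: the class name UrlListSketch, the method readUrls, and the element names urlList and url are all hypothetical, since the real element names are whatever your own grammar defines.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of the assignment's required classes.
class UrlListSketch {

    // Pulls events from the parser and keeps the text of every <url>
    // element. The element name "url" is an assumption for illustration.
    static List<String> readUrls(Reader in) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(in);
        List<String> urls = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "url".equals(reader.getLocalName())) {
                    // getElementText() reads up to the matching end tag.
                    urls.add(reader.getElementText().trim());
                }
            }
        } finally {
            reader.close();
        }
        return urls;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Inline sample document using the assumed element names.
        String sample =
            "<urlList>"
          + "<url>http://www.andrew.cmu.edu/~mm6/95-702/McCarthysSchedule/schedule1.xml</url>"
          + "<url>http://www.andrew.cmu.edu/~mm6/95-702/McCarthysSchedule/schedule2.xml</url>"
          + "</urlList>";
        for (String url : readUrls(new StringReader(sample))) {
            System.out.println(url);
        }
    }
}
```

To read a remote document instead of a string, the same loop should work with a reader created from an InputStream, for example factory.createXMLStreamReader(url.openStream()) for a java.net.URL named url.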

Validate.java is a Java program that validates an XML instance against its schema. The schema document (.xsd) must be in the same directory as the document being validated. The document, however, may be pointed to by a URL.

    // Validate.java using Xerces
    import org.xml.sax.InputSource;
    import org.xml.sax.SAXParseException;
    import org.xml.sax.XMLReader;
    import org.xml.sax.helpers.DefaultHandler;
    import org.xml.sax.helpers.XMLReaderFactory;

    public class Validate extends DefaultHandler {

        public static boolean valid = true;

        public void error(SAXParseException exception) {
            System.out.println("Received notification of a recoverable error." + exception);
            valid = false;
        }

        public void fatalError(SAXParseException exception) {
            System.out.println("Received notification of a non-recoverable error." + exception);
            valid = false;
        }

        public void warning(SAXParseException exception) {
            System.out.println("Received notification of a warning." + exception);
        }

        public static void main(String argv[]) {
            if (argv.length != 1) {
                System.err.println("Usage: java Validate [filename.xml | URLToFile]");
                System.exit(1);
            }
            try {
                // get a parser
                XMLReader reader = XMLReaderFactory.createXMLReader("org.apache.xerces.parsers.SAXParser");
                // request validation
                reader.setFeature("http://xml.org/sax/features/validation", true);
                reader.setFeature("http://apache.org/xml/features/validation/schema", true);
                reader.setErrorHandler(new Validate());
                // associate an InputSource object with the file name or URL
                InputSource inputSource = new InputSource(argv[0]);
                // go ahead and parse
                reader.parse(inputSource);
            }
            catch (org.xml.sax.SAXException e) {
                System.out.println("Error in parsing " + e);
                valid = false;
            }
            catch (java.io.IOException e) {
                System.out.println("Error in I/O " + e);
                System.exit(0);
            }
            System.out.println("Valid Document is " + valid);
        }
    }

Project One Submission requirements
===================================

Submissions will not be accepted via email or the digital drop box.
Please turn in your work during class. If you must miss class, please make arrangements with another student to submit your work during class.

Submit one large envelope with:

    Your name, course and project number on the front.
    All Java source code with documentation.
    One XSD listing showing the grammar developed for urlList.xml documents.
    Screenshots showing:
        A search for one hour meeting times on the first two schedules (schedule1.xml and schedule2.xml).
        A search for one hour meeting times on the third and fourth schedules (schedule3.xml and schedule4.xml).
        A search for 30 minute meeting times on all four schedules.

Be able to demonstrate and defend your solution if required.
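
As a closing hint for the Suggested Approach, the class that maintains a start and stop time could begin along the lines of the sketch below. This is one possible starting point, not a required design. The class name TimeRangeSketch, its method names, and the choice to store times as seconds since midnight are all assumptions made for illustration, and the sample times in main are made up rather than taken from the schedule files.

```java
// Hypothetical time-range helper; times are stored as seconds since
// midnight, which is an assumption and not dictated by the schedule grammar.
class TimeRangeSketch {
    final int start;
    final int stop;

    TimeRangeSketch(int start, int stop) {
        if (start > stop) {
            throw new IllegalArgumentException("start must not be after stop");
        }
        this.start = start;
        this.stop = stop;
    }

    // Converts hours, minutes and seconds to seconds since midnight.
    static int toSeconds(int hours, int minutes, int seconds) {
        return hours * 3600 + minutes * 60 + seconds;
    }

    int lengthInSeconds() {
        return stop - start;
    }

    // Returns the overlap of this range and the other range,
    // or null when the two ranges share no time at all.
    TimeRangeSketch intersect(TimeRangeSketch other) {
        int s = Math.max(start, other.start);
        int e = Math.min(stop, other.stop);
        return s < e ? new TimeRangeSketch(s, e) : null;
    }

    public static void main(String[] args) {
        // Two made-up free periods on the same day.
        TimeRangeSketch a = new TimeRangeSketch(toSeconds(13, 0, 0), toSeconds(17, 0, 0));
        TimeRangeSketch b = new TimeRangeSketch(toSeconds(12, 0, 0), toSeconds(16, 0, 0));
        TimeRangeSketch common = a.intersect(b);

        int minimum = 2 * 60 * 60; // minimum meeting time in seconds
        System.out.println(common.lengthInSeconds());            // 10800
        System.out.println(common.lengthInSeconds() >= minimum); // true
    }
}
```

Intersecting the free periods of one person with those of the next, and discarding any overlap shorter than the minimum meeting time, is one way to narrow down a common time for a given day.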