95-733 Internet Technologies
September 2010
Homework 2
Due: Thursday, 11:59 PM, February 10, 2011

Lab Topics: XML, the Extensible Stylesheet Language Transformations (XSLT), Atom, and RSS

The actual homework begins at Part 6. Parts 1 - 5 are for instruction only. It is strongly recommended that you work through Parts 1 - 5.

In this lab we will be programming in a transformation language called XSLT. XSLT is used to transform one XML document into another XML document (with a different structure). In order to write programs in XSLT, we need an XML parser (XSLT programs are XML documents) and an XSLT interpreter. The parser is called "Xerces". The interpreter is called "Xalan" (Xalan uses Xerces).

The required jar files for XSLT processing using Xalan are: xalan.jar, xercesImpl.jar, xml-apis.jar and xsltc.jar. These may be downloaded from the Apache Foundation.

Part 1 Command Line XSLT
========================

For DOS based machines, create a directory called "bats" and place a batch file called "xalan.bat" in that directory. Place the path to your bats directory in the system path variable. I recommend that you actually type the xalan.bat file into a text editor. The copy and paste approach has been troublesome. The file xalan.bat will hold the following:

java org.apache.xalan.xslt.Process -IN %1 -XSL %2 -OUT %3

You will need to have the jar files mentioned above on your classpath before running xalan.bat.

For Unix based machines, you will use a script file called xalan with execute permissions. My xalan jar files are saved in /Users/mm6/Applications/xalan. My xalan script is shown below.

#!/bin/sh
export XALAN_HOME=/Users/mm6/Applications/xalan
export CP=$XALAN_HOME/xalan.jar:$XALAN_HOME/xercesImpl.jar:$XALAN_HOME/xml-apis.jar:$XALAN_HOME/xsltc.jar
java -classpath "$CP" org.apache.xalan.xslt.Process -IN "$1" -XSL "$2" -OUT "$3"

Testing. The following is an xml file called books.xml that contains data on books.
It's a copy of the file found on page 70 of the XSLT Programmer's Reference by Michael Kay.

Nigel Rees          Sayings of the Century    8.95
Evelyn Waugh        Sword of Honour           12.99
Herman Melville     Moby Dick                 8.99
J. R. R. Tolkien    The Lord of the Rings     22.99

We would like to transform this file into an HTML document as shown here (result.html):

A list of books

1 Nigel Rees Sayings of the Century 8.95
2 Evelyn Waugh Sword of Honour 12.99
3 Herman Melville Moby Dick 8.99
4 J. R. R. Tolkien The Lord of the Rings 22.99
In order to carry out this transformation, we will use the XSLT programming language. XSLT is Turing complete, so in principle it can express any computation, but it is especially good at performing XML transformations. Our first XSLT program looks like this (booklist.xsl):

A list of books
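
The transformation described above can be tried without any files by embedding both documents in a small Java program and using the XSLT processor bundled with the JDK. This is only a sketch under stated assumptions: the element names books, book, author, title and price are guesses at the markup of books.xml, the stylesheet is one plausible way to produce the table (not necessarily the booklist.xsl used in class), and the class name BookListSketch is made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class BookListSketch {

    // Assumed markup: these element names are a guess, not taken from the handout.
    static final String BOOKS_XML =
        "<books>"
      + "<book><author>Nigel Rees</author><title>Sayings of the Century</title><price>8.95</price></book>"
      + "<book><author>Evelyn Waugh</author><title>Sword of Honour</title><price>12.99</price></book>"
      + "</books>";

    // One possible stylesheet: a table row per book, a cell per child element.
    static final String BOOKLIST_XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html'/>"
      + "<xsl:template match='books'>"
      + "<html><body><h1>A list of books</h1><table>"
      + "<xsl:apply-templates select='book'/>"
      + "</table></body></html>"
      + "</xsl:template>"
      + "<xsl:template match='book'>"
      + "<tr><td><xsl:number/></td><xsl:apply-templates select='author|title|price'/></tr>"
      + "</xsl:template>"
      + "<xsl:template match='author|title|price'>"
      + "<td><xsl:value-of select='.'/></td>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the serialized result.
    static String transform(String xml, String xsl) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(BOOKS_XML, BOOKLIST_XSL));
    }
}
```

The xsl:number instruction supplies the row number in the first column, and the three child elements share one template because they are all rendered the same way. Saving the two strings as books.xml and booklist.xsl should let the same pair be run through the xalan script.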

Place the two files (books.xml and booklist.xsl) into a directory and make sure that xalan is working properly by running the following command. The output file should look like result.html.

xalan books.xml booklist.xsl result.html

When debugging XSLT programs, it is often much more helpful to view your output in an editor like Notepad than in a browser like Netscape, IE or Safari. Look at the HTML document in a browser only after you are satisfied with the way it looks in Notepad. The browser view is often quite deceiving and makes a poor debugging tool.

Part 2 Handling Namespaces
==========================

Many documents make use of XML namespaces to remove ambiguity. The following is our books example with a namespace assigned to the namespace prefix p.

Nigel Rees          Sayings of the Century    8.95
Evelyn Waugh        Sword of Honour           12.99
Herman Melville     Moby Dick                 8.99
J. R. R. Tolkien    The Lord of the Rings     22.99

The same XSLT program that we wrote above needs to be adapted to handle these namespace qualified elements. Be sure to test this new program against the books file with namespaces.

A list of books
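
The reason the stylesheet has to change is that a pattern such as match='book' only matches elements in no namespace. A minimal sketch of the difference follows, again using the JDK's built-in XSLT processor. The namespace URI http://example.org/books, the element names and the class name NamespaceSketch are all assumptions made for illustration. Only the prefix p comes from the text above.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class NamespaceSketch {

    // Hypothetical namespace URI, used only for this illustration.
    static final String NS = "http://example.org/books";

    // Assumed markup: every element is qualified with the prefix p.
    static final String BOOKS_XML =
        "<p:books xmlns:p='" + NS + "'>"
      + "<p:book><p:author>Nigel Rees</p:author>"
      + "<p:title>Sayings of the Century</p:title><p:price>8.95</p:price></p:book>"
      + "</p:books>";

    // Builds the stylesheet. q is "p:" for namespace-aware patterns,
    // or "" for the unqualified patterns of Part 1.
    static String stylesheet(String q) {
        return "<xsl:stylesheet version='1.0'"
             + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
             + " xmlns:p='" + NS + "' exclude-result-prefixes='p'>"
             + "<xsl:output method='html'/>"
             + "<xsl:template match='" + q + "books'>"
             + "<html><body><h1>A list of books</h1><table>"
             + "<xsl:apply-templates/></table></body></html>"
             + "</xsl:template>"
             + "<xsl:template match='" + q + "book'>"
             + "<tr><td><xsl:number/></td><xsl:apply-templates/></tr>"
             + "</xsl:template>"
             + "<xsl:template match='" + q + "author|" + q + "title|" + q + "price'>"
             + "<td><xsl:value-of select='.'/></td>"
             + "</xsl:template>"
             + "</xsl:stylesheet>";
    }

    // Applies the stylesheet text to the XML text and returns the serialized result.
    static String transform(String xml, String xsl) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Qualified patterns:");
        System.out.println(transform(BOOKS_XML, stylesheet("p:")));
        System.out.println("Unqualified patterns:");
        System.out.println(transform(BOOKS_XML, stylesheet("")));
    }
}
```

With the unqualified patterns no template matches, so only XSLT's built-in rules run and the text of the document is copied out with no table markup at all. The prefix in the stylesheet does not have to be p. What matters is that it is bound to the same namespace URI as in the input document.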

Part 3 Running Xalan from within Java
=====================================

While command line xalan makes a very nice tool, it is often necessary to make calls for XSLT processing from within other programs. Here is a Java program that performs the same transformation as above, but this time under application program control. This program would be executed with the command:

java ProduceHTML books.xml booklist.xsl result.html

// ProduceHTML.java is a simple program that demonstrates how XSLT programs
// can be executed from within Java.

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ProduceHTML {

    public static void main(String[] a) {

        // try-with-resources closes all three streams, even if the transformation fails
        try (FileInputStream xml = new FileInputStream(a[0]);
             FileInputStream xsl = new FileInputStream(a[1]);
             FileOutputStream out = new FileOutputStream(a[2])) {

            // two sources (the input document and the stylesheet) and one result
            Source xmlDoc = new StreamSource(xml);
            Source xslDoc = new StreamSource(xsl);
            Result result = new StreamResult(out);

            // compile the stylesheet and apply it to the input document
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer trans = factory.newTransformer(xslDoc);
            trans.transform(xmlDoc, result);
        }
        catch (TransformerException e) {
            System.out.println("Transformer Problem " + e);
        }
        catch (IOException e) {
            System.out.println("An I/O problem " + e);
        }
    }
}

Part 4. Running XSLT from within a Java servlet.
================================================

Suppose we want to use a local stylesheet called XSLTransformerCode.xsl to process a remote XML file at some URL. Using Netbeans and Glassfish, add the xsl stylesheet file to the project's Web Pages folder.
A doGet method might have the following code:

PrintWriter out = response.getWriter();

// get the xsl stored in this project
ServletContext context = getServletContext();
InputStream xsl = context.getResourceAsStream("/XSLTransformerCode.xsl");

// We need two source objects and one result.
// Get an external xml document using a url in string format.
Source xmlDoc = new StreamSource(urlAsString);
Source xslDoc = new StreamSource(xsl);
Result result = new StreamResult(out);

// Prepare to transform
TransformerFactory factory = TransformerFactory.newInstance();
Transformer trans = factory.newTransformer(xslDoc);
trans.transform(xmlDoc, result);
// The transformed document is returned to the browser.

Part 5. An Atom document from the W3C
=====================================

The following document was accessed from the W3C's main web page by clicking on the syndicate link. It is meant to be read by a news reader. We will use it as our input file for the homework problems below. The current W3C feed may be accessed here:

http://www.w3.org/News/atom.xml

W3C News tag:www.w3.org,2008-09-29://4 2010-09-07T18:29:54Z Movable Type 4.261

W3C Invites Implementations of Geolocation API Specification tag:www.w3.org,2010://4.8889 2010-09-07T18:29:03Z 2010-09-07T18:29:03Z The Geolocation Working Group invites implementation of the Candidate Recommendation of Geolocation API Specification. The Geolocation API defines a high-level interface to location information associated only with the device hosting the implementation, such as latitude and longitude. The API itself... W3C Staff The Geolocation Working Group invites implementation of the Candidate Recommendation of Geolocation API Specification. The Geolocation API defines a high-level interface to location information associated only with the device hosting the implementation, such as latitude and longitude. The API itself is agnostic of the underlying location information sources.
Common sources of location information include Global Positioning System (GPS) and location inferred from network signals such as IP address, RFID, WiFi and Bluetooth MAC addresses, and GSM/CDMA cell IDs, as well as user input. Learn more about the Ubiquitous Web Applications Activity.

XMLHttpRequest Level 2 Draft Published tag:www.w3.org,2010://4.8888 2010-09-07T18:26:43Z 2010-09-07T18:26:43Z The Web Applications Working Group has published a Working Draft of XMLHttpRequest Level 2. The XMLHttpRequest Level 2 specification enhances the XMLHttpRequest object with new features, such as cross-origin requests, progress events, and the handling of byte streams for both... W3C Staff The Web Applications Working Group has published a Working Draft of XMLHttpRequest Level 2. The XMLHttpRequest Level 2 specification enhances the XMLHttpRequest object with new features, such as cross-origin requests, progress events, and the handling of byte streams for both sending and receiving. Learn more about the Rich Web Client Activity.

Last Call: The Widget Interface tag:www.w3.org,2010://4.8887 2010-09-07T18:24:46Z 2010-09-07T18:24:46Z The Web Applications Working Group has published a Last Call Working Draft of The Widget Interface. This specification defines an application programming interface (API) for widgets that provides, amongst other things, functionality for accessing a widget's metadata and persistently storing... W3C Staff The Web Applications Working Group has published a Last Call Working Draft of The Widget Interface. This specification defines an application programming interface (API) for widgets that provides, amongst other things, functionality for accessing a widget's metadata and persistently storing data. Comments are welcome through 28 September. Learn more about the Rich Web Client Activity.

Updated Note: Device API Access Control Use Cases and Requirements tag:www.w3.org,2010://4.8886 2010-09-07T18:22:31Z 2010-09-07T18:22:31Z The Device APIs and Policy Working Group has updated a Group Note of Device API Access Control Use Cases and Requirements. This document examines the question of managing sensitive information that can become available through device APIs (e.g., position information).... W3C Staff The Device APIs and Policy Working Group has updated a Group Note of Device API Access Control Use Cases and Requirements. This document examines the question of managing sensitive information that can become available through device APIs (e.g., position information). The approach taken in this document is to simplify the possible interactions by considering three related use cases: (1) browser web pages and untrusted widgets (2) trusted widgets and applications, and (3) delegated authority. Learn more about the Ubiquitous Web Applications Activity.

W3C Extends Speech Framework to Asian Languages tag:www.w3.org,2010://4.8885 2010-09-07T16:41:07Z 2010-09-07T16:41:07Z W3C today extended speech on the Web to an enormous new market by improving support for Asian languages and multi-lingual voice applications. The Speech Synthesis Markup Language (SSML 1.1) Recommendation provides control over voice selection as well as speech characteristics... W3C Staff W3C today extended speech on the Web to an enormous new market by improving support for Asian languages and multi-lingual voice applications. The Speech Synthesis Markup Language (SSML 1.1) Recommendation provides control over voice selection as well as speech characteristics such as pronunciation, volume, and pitch. SSML is part of W3C's Speech Interface Framework for building voice applications, which also includes the widely deployed VoiceXML. "With SSML 1.1 there is an intentional focus on Asian language support," said Dan Burnett, Co-Chair of the Voice Browser Working Group and Director of Speech Technologies and Standards at Voxeo, "including Chinese languages, Japanese, Thai, Urdu, and others, to provide a wide deployment potential." Read more in the press release and W3C Member Testimonials. Learn more about voice browsing.

Five XML Security Drafts Published tag:www.w3.org,2010://4.8883 2010-09-01T20:52:23Z 2010-09-01T20:52:23Z The XML Security Working Group has published five working drafts today. XML Signature 2.0, Canonical XML 2.0 and the XML Signature Streamable Profile of XPath 1.0 are part of an ongoing effort to rework XML Signature and Canonical XML in... W3C Staff The XML Security Working Group has published five working drafts today. XML Signature 2.0, Canonical XML 2.0 and the XML Signature Streamable Profile of XPath 1.0 are part of an ongoing effort to rework XML Signature and Canonical XML in order to address issues around performance, streaming, robustness, and attack surface. The Working Group has also published updated Working Drafts for its XML Signature Best Practices and XML Security Relax NG Schemas Working Group Notes. Learn more about XML Security.

Voice Extensible Markup Language (VoiceXML) 3.0 Draft Published tag:www.w3.org,2010://4.8882 2010-08-31T19:33:56Z 2010-08-31T19:33:56Z The Voice Browser Working Group has published a Working Draft of Voice Extensible Markup Language (VoiceXML) 3.0. Voice XML is used to create interactive media dialogs that feature synthesized speech, recognition of spoken and DTMF key input, telephony, mixed initiative... W3C Staff The Voice Browser Working Group has published a Working Draft of Voice Extensible Markup Language (VoiceXML) 3.0. Voice XML is used to create interactive media dialogs that feature synthesized speech, recognition of spoken and DTMF key input, telephony, mixed initiative conversations, and recording and presentation of a variety of media formats including digitized audio, and digitized video. Learn more about the Voice Browser Activity.

W3C Launches HTML Speech Incubator Group tag:www.w3.org,2010://4.8881 2010-08-30T15:40:46Z 2010-08-30T15:40:46Z W3C is pleased to announce the creation of the HTML Speech Incubator Group, whose mission is to determine the feasibility of integrating speech technology in HTML5 in a way that leverages the capabilities of both speech and HTML (e.g., DOM)... W3C Staff W3C is pleased to announce the creation of the HTML Speech Incubator Group, whose mission is to determine the feasibility of integrating speech technology in HTML5 in a way that leverages the capabilities of both speech and HTML (e.g., DOM) to provide a high-quality, browser-independent speech/multimodal experience while avoiding unnecessary standards fragmentation or overlap. The following W3C Members have sponsored the charter for this group: Voxeo, Microsoft, Openstream, Google, AT&T, Mozilla. Read more about the Incubator Activity, an initiative to foster development of emerging Web-related technologies. Incubator Activity work is not on the W3C standards track but in many cases serves as a starting point for a future Working Group.

W3C Launches Web Performance Working Group tag:www.w3.org,2010://4.8879 2010-08-19T14:15:11Z 2010-08-19T14:15:11Z W3C has launched a new Web Performance Working Group, whose mission is to provide methods to measure aspects of application performance of user agent features and APIs. As Web browsers and their underlying engines include richer capabilities and become more... W3C Staff W3C has launched a new Web Performance Working Group, whose mission is to provide methods to measure aspects of application performance of user agent features and APIs. As Web browsers and their underlying engines include richer capabilities and become more powerful, Web developers are building more sophisticated applications where application performance is increasingly important. Developers need the ability to assess and understand the performance characteristics of their applications using well-defined interoperable methods. This new Working Group will look at user agent features and APIs to measure aspects of application performance. Group deliverables will apply to desktop and mobile browsers and other non-browser environments where appropriate and will be consistent with Web technologies designed in other working groups including HTML, CSS, WebApps, DAP and SVG. Learn more in the Working Group charter and how this work fits into the W3C's Rich Web Client Activity.

Contacts API Draft Published tag:www.w3.org,2010://4.8878 2010-08-17T17:33:10Z 2010-08-17T17:33:10Z The Device APIs and Policy Working Group has published a Working Draft of Contacts API. This specification defines the concept of a user's unified address book - where address book data may be sourced from a plurality of sources -... W3C Staff The Device APIs and Policy Working Group has published a Working Draft of Contacts API. This specification defines the concept of a user's unified address book - where address book data may be sourced from a plurality of sources - both online and locally. This specification then defines the interfaces on which 3rd party applications can access a user's unified address book; with explicit user permission and filtering. Learn more about the Ubiquitous Web Applications Activity.

W3C Leads Discussion at TypeCon 2010 on New Open Web Font Format (WOFF) tag:www.w3.org,2010://4.8877 2010-08-17T11:22:02Z 2010-08-17T11:22:02Z W3C attends TypeCon 2010 this week for community discussion about Web Open File Format (WOFF), the new open format for enabling high-quality typography for the Web. WOFF expands the typographic palette available to Web designers, improving readability, accessibility, internationalization, branding,... W3C Staff W3C attends TypeCon 2010 this week for community discussion about Web Open File Format (WOFF), the new open format for enabling high-quality typography for the Web. WOFF expands the typographic palette available to Web designers, improving readability, accessibility, internationalization, branding, and search optimization. Though still in the early phases of standardization, WOFF represents a pivotal agreement among browser vendors, foundries and font service providers who have convened at W3C to address the long-standing goal of advancing Web typography. "As a key Web font standard developed by W3C, WOFF 1.0 represents a universal solution for enabling advanced typography on the Web," said Vladimir Levantovsky, W3C WebFonts Working Group chair and senior technology strategist at Monotype Imaging, Inc. "With the backing of browser companies and font vendors, who are making their fonts available for licensing in WOFF, this new W3C Recommendation-track document will bring rich typographic choice for content creators, Web authors and brand managers." Learn more in the press release and WOFF FAQ, as well as more about fonts on the Web.

Privacy Workshop Participants Share Implementation Experience; User Behaviors tag:www.w3.org,2010://4.8876 2010-08-15T20:49:51Z 2010-08-15T20:49:51Z In July, W3C brought together participants across the industry for a privacy workshop (organized jointly with the PrimeLife EU project in London). Discussion topics included privacy-related implementation experience with the W3C geolocation API, and privacy icon and ruleset proposals for... W3C Staff In July, W3C brought together participants across the industry for a privacy workshop (organized jointly with the PrimeLife EU project in London). Discussion topics included privacy-related implementation experience with the W3C geolocation API, and privacy icon and ruleset proposals for Web sites and APIs, respectively. Read the Workshop Report and learn more about the W3C Privacy Activity.

Web Security Context: User Interface Guidelines is a W3C Recommendation tag:www.w3.org,2010://4.8875 2010-08-12T14:44:25Z 2010-08-12T14:44:25Z The Web Security Context Working Group has published a W3C Recommendation of Web Security Context: User Interface Guidelines. This specification deals with the trust decisions that users must make online, and with ways to support them in making safe and... W3C Staff The Web Security Context Working Group has published a W3C Recommendation of Web Security Context: User Interface Guidelines. This specification deals with the trust decisions that users must make online, and with ways to support them in making safe and informed decisions where possible. It describes user interactions and user interface guidelines with a goal toward making security usable, based on known best practice in this area. Learn more about the Security Activity.

W3C Invites Review of First Draft of The Messaging API tag:www.w3.org,2010://4.8874 2010-08-10T18:11:50Z 2010-08-10T18:11:50Z The Device APIs and Policy Working Group has published a First Public Working Draft of The Messaging API. The Messaging API defines a high-level interface to Messaging functionality, including SMS, MMS and Email. It includes APIs to create, send and... W3C Staff The Device APIs and Policy Working Group has published a First Public Working Draft of The Messaging API. The Messaging API defines a high-level interface to Messaging functionality, including SMS, MMS and Email. It includes APIs to create, send and receive messages. Learn more about the Ubiquitous Web Applications Activity.

Call for Review: MathML 3.0; MathML for CSS Profile are Proposed Recommendations tag:www.w3.org,2010://4.8873 2010-08-10T14:55:33Z 2010-08-10T14:55:33Z The Math Working Group published two Proposed Recommendations today: Mathematical Markup Language (MathML) Version 3.0 and A MathML for CSS Profile. This first defines the Mathematical Markup Language, or MathML, which enables people to express mathematics in Web documents. The... W3C Staff The Math Working Group published two Proposed Recommendations today: Mathematical Markup Language (MathML) Version 3.0 and A MathML for CSS Profile. This first defines the Mathematical Markup Language, or MathML, which enables people to express mathematics in Web documents. The second describes a profile of MathML 3.0 that is suitable for styling with Cascading Style Sheets (CSS). Comments are welcome through 10 September. Learn more about the Math Activity.

Part 6 Introductory XSLT Programming
====================================

In solving the Atom puzzles below, I used the following in each of my XSLT programs.

(1) 10 Points. Using command line XSLT, write an XSLT program that displays the contents of each title that is a direct child of feed/entry. This list of titles will appear as an HTML unordered list. It will appear something like this:

W3C Atom Document
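
As in Part 2, path expressions over an Atom document only match when the stylesheet binds a prefix to the Atom namespace, which is http://www.w3.org/2005/Atom. The sketch below illustrates just that one point by selecting the feed's own title from a tiny feed. The prefix a, the sample feed and the class name AtomNamespaceSketch are assumptions made for illustration, and the example is deliberately not a solution to any of the numbered problems.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class AtomNamespaceSketch {

    // A made-up feed; real Atom feeds declare this default namespace on the feed element.
    static final String FEED =
        "<feed xmlns='http://www.w3.org/2005/Atom'>"
      + "<title>Example Feed</title>"
      + "<entry><title>First entry</title></entry>"
      + "</feed>";

    // Builds a stylesheet that prints the string value of the given XPath expression.
    // The prefix a is bound to the Atom namespace; any other prefix would work equally well.
    static String stylesheet(String path) {
        return "<xsl:stylesheet version='1.0'"
             + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
             + " xmlns:a='http://www.w3.org/2005/Atom'>"
             + "<xsl:output method='text'/>"
             + "<xsl:template match='/'>"
             + "<xsl:value-of select='" + path + "'/>"
             + "</xsl:template>"
             + "</xsl:stylesheet>";
    }

    // Applies the stylesheet text to the XML text and returns the serialized result.
    static String transform(String xml, String xsl) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Qualified path: selects the feed-level title.
        System.out.println("[" + transform(FEED, stylesheet("a:feed/a:title")) + "]");
        // Unqualified path: selects nothing, because feed and title are in the Atom namespace.
        System.out.println("[" + transform(FEED, stylesheet("feed/title")) + "]");
    }
}
```

Even though the feed uses a default namespace and no prefixes of its own, the stylesheet still needs a prefix, because an unprefixed name in an XPath 1.0 expression always means "no namespace".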

(2) 10 Points. Using command line XSLT, write an XSLT program that displays the number of Atom entry elements that appear in the document. You must use the XSLT count function in your solution. Your output will be marked up as HTML and will appear in a browser as follows:

Counting Atom entry items
15

(3) 10 Points. Using command line XSLT, write an XSLT program that displays the contents of each title that is a direct child of feed/entry. Your output will be marked up as HTML and will appear in a browser with the titles underlined as hypertext links. If the user clicks on a link, the browser will fetch the associated document that is pointed to by the link element. The output will appear as follows (in a browser, these show up as clickable links):

Titles (with links)
* W3C Invites Implementations of Geolocation API Specification
* XMLHttpRequest Level 2 Draft Published
* Last Call: The Widget Interface
:

(4) 10 Points. Using command line XSLT, write an XSLT program that displays the contents of each title and the value of the term attribute of each category associated with that title. The output will be marked up nicely in HTML. A browser will display something like the following:

W3C Invites Implementations of Geolocation API Specification
* Web of Devices
* Home Page Stories
* Publication

XMLHttpRequest Level 2 Draft Published
* Browsers and Authoring Tools
* Home Page Stories
* Publication
* Web Design and Applications

SERVER SIDE MASHUP
==================

(5) 40 Points. Write a JSP page that asks the user to select a topic from a drop down list. The three topics will be Business, Technology and World News. Once a selection is made, the browser will call a Java servlet, passing along the topic. The topic is simply a string passed to the servlet from the browser. The servlet will fetch the appropriate RSS 2.0 feed from the NY Times web site. It will apply a style sheet that generates HTML, and that HTML is returned to the browser.
The HTML display will show the title of each news item. Each title will be displayed as a link. The user will be able to click a link to visit the associated page. Note that there are no namespaces defined on the main elements in RSS 2.0. New York Times feeds may be found at

http://www.nytimes.com/services/xml/rss/index.html

(6) 10 Points. Add a drop down box for the source of feeds to the application that you built in question 5. Thus, the user will be able to select a topic and a source. At a minimum, you will need to provide for three sources. In my solution, I used the BBC, the New York Times and the Sydney Morning Herald.

The BBC feeds are discussed here:
http://news.bbc.co.uk/1/hi/help/3223484.stm

The Sydney Morning Herald feeds are discussed here:
http://www.smh.com.au/rsschannels/

(7) 10 Points. Add Ajax to your solution in question 6. Be creative and redesign the site so that there is no need for a full page refresh.