95-733 Internet Technologies September 2010
Homework 2 Due: Thursday, 11:59 PM, February 10, 2011
Lab Topics: XML, the Extensible Style Sheet Language for Transformations XSLT,
Atom, and RSS
The actual homework begins at Part 6. Parts 1 - 5 are for instruction only. It
is strongly recommended that you work through parts 1 - 5.
In this lab we will be programming in a transformation language called
XSLT. XSLT is used to transform one XML document into another XML document
(with a different structure). In order to write programs in XSLT, we
need an XML parser (XSLT programs are XML documents) and an XSLT
interpreter. The parser is called "Xerces". The interpreter is called
"Xalan" (Xalan uses Xerces).
The required jar files for XSLT processing using Xalan are: xalan.jar,
xercesImpl.jar, xml-apis.jar and xsltc.jar. These may be downloaded
from the Apache Foundation.
Part 1 Command Line XSLT
========================
For DOS-based machines, create a directory called "bats" and place
a batch file called "xalan.bat" in that directory. Place the path
to your bats directory in the system path variable.
I recommend that you actually type the xalan.bat file into
a text editor. The copy and paste approach has been troublesome.
The file xalan.bat will hold the following:
java org.apache.xalan.xslt.Process -IN %1 -XSL %2 -OUT %3
You will need to have the jar files mentioned above on your classpath
before running xalan.bat.
For Unix-based machines, you will use a script file called xalan with
execute permissions. My xalan jar files are saved in
/Users/mm6/Applications/xalan.
My xalan script is shown below.
#!/bin/sh
export XALAN_HOME=/Users/mm6/Applications/xalan
export CP=$XALAN_HOME/xalan.jar:$XALAN_HOME/xercesImpl.jar:$XALAN_HOME/xml-apis.jar:$XALAN_HOME/xsltc.jar
java -classpath "$CP" org.apache.xalan.xslt.Process -IN "$1" -XSL "$2" -OUT "$3"
Testing. The following is an xml file called books.xml that contains data
on books. It's a copy of the file found on Page 70 of the XSLT Programmer's
Reference by Michael Kay.
Nigel Rees          Sayings of the Century    8.95
Evelyn Waugh        Sword of Honour           12.99
Herman Melville     Moby Dick                 8.99
J. R. R. Tolkien    The Lord of the Rings     22.99
We would like to transform this file into an HTML document as shown here (result.html):
A list of books
1    Nigel Rees          Sayings of the Century    8.95
2    Evelyn Waugh        Sword of Honour           12.99
3    Herman Melville     Moby Dick                 8.99
4    J. R. R. Tolkien    The Lord of the Rings     22.99
In order to carry out this transformation, we will use the XSLT
programming language. XSLT is Turing complete, so in principle it can
express any computation, but it is especially good at performing XML
transformations.
Our first XSLT program looks like this
(booklist.xsl):
A list of books
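As a rough sketch of what a stylesheet of this kind could look like, the
following Java program embeds a small books document and a stylesheet as
strings and runs the transformation with the JDK's built-in XSLT processor.
The element names books, book, author, title and price are assumptions (they
are not confirmed by this handout), and the class name BookListSketch is
hypothetical. To use the stylesheet with the xalan command, save the
stylesheet text as booklist.xsl.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class BookListSketch {

    // Assumed shape of books.xml (element names are guesses); two books only.
    static final String BOOKS_XML =
        "<books>"
      + "<book><author>Nigel Rees</author>"
      + "<title>Sayings of the Century</title><price>8.95</price></book>"
      + "<book><author>Evelyn Waugh</author>"
      + "<title>Sword of Honour</title><price>12.99</price></book>"
      + "</books>";

    // One table row per book; position() supplies the running number.
    static final String BOOKLIST_XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='/'>"
      + "<html><head><title>A list of books</title></head><body>"
      + "<h1>A list of books</h1>"
      + "<table><xsl:apply-templates select='books/book'/></table>"
      + "</body></html>"
      + "</xsl:template>"
      + "<xsl:template match='book'>"
      + "<tr><td><xsl:value-of select='position()'/></td>"
      + "<td><xsl:value-of select='author'/></td>"
      + "<td><xsl:value-of select='title'/></td>"
      + "<td><xsl:value-of select='price'/></td></tr>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the result.
    static String transform(String xml, String xsl) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(BOOKS_XML, BOOKLIST_XSL));
    }
}
```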
Place the two files (books.xml and booklist.xsl) into a directory
and make sure that xalan is working properly by running the
following command. The output file should look like result.html.
xalan books.xml booklist.xsl result.html
When debugging XSLT programs, it is often much more helpful to
view your output in an editor like Notepad rather than to view
your output in a browser like Netscape or IE or Safari. Look at
the HTML document in a browser only after you are satisfied
with the way it looks in Notepad. The browser view is often
quite deceiving and makes a poor debugging tool.
Part 2 Handling Namespaces
==========================
Many documents make use of XML namespaces to remove ambiguity.
The following is our books example with a namespace assigned to
the namespace prefix p.
Nigel Rees          Sayings of the Century    8.95
Evelyn Waugh        Sword of Honour           12.99
Herman Melville     Moby Dick                 8.99
J. R. R. Tolkien    The Lord of the Rings     22.99
The same XSLT program that we wrote above needs to be adapted
to handle these namespace qualified elements. Be sure to test
this new program against the books file with namespaces.
A list of books
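Adapting a stylesheet to namespace qualified input generally means declaring
the same namespace URI in the stylesheet and prefixing every element name in
the XPath expressions. The sketch below illustrates this under stated
assumptions: the URI http://example.org/books is hypothetical, as are the
element names and the class name NamespaceSketch. The stylesheet deliberately
uses the prefix b rather than p to show that only the URI has to match.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class NamespaceSketch {

    // Hypothetical namespace URI; the real one comes from the books file.
    static final String NS = "http://example.org/books";

    // Assumed element names, qualified with the prefix p.
    static final String BOOKS_XML =
        "<p:books xmlns:p='" + NS + "'>"
      + "<p:book><p:author>Herman Melville</p:author>"
      + "<p:title>Moby Dick</p:title></p:book>"
      + "</p:books>";

    // The stylesheet binds its own prefix (b) to the same URI.
    // exclude-result-prefixes keeps the declaration out of the HTML.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:b='" + NS + "' exclude-result-prefixes='b'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='/'>"
      + "<ul><xsl:for-each select='b:books/b:book'>"
      + "<li><xsl:value-of select='b:title'/></li>"
      + "</xsl:for-each></ul>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the transformation and returns the generated HTML.
    static String demo() {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(BOOKS_XML)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```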
Part 3 Running Xalan from within Java
=====================================
While command line xalan makes a very nice tool, it is
often necessary to make calls for XSLT processing from
within other programs. Here is a Java program that
performs the same transformation as above. But this
time the transformation is performed under application
program control.
This program would be executed with the command:
java ProduceHTML books.xml booklist.xsl result.html
// ProduceHTML.java is a simple program that demonstrates how XSLT programs
// can be executed from within Java.
import java.io.IOException;
import java.io.OutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.Result;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
public class ProduceHTML {

    public static void main(String[] a) {
        if (a.length != 3) {
            System.err.println("Usage: java ProduceHTML in.xml style.xsl out.html");
            return;
        }
        // try-with-resources closes all three streams, even on failure
        try (FileInputStream xml = new FileInputStream(a[0]);
             FileInputStream xsl = new FileInputStream(a[1]);
             FileOutputStream out = new FileOutputStream(a[2])) {

            Source xmlDoc = new StreamSource(xml);
            Source xslDoc = new StreamSource(xsl);
            Result result = new StreamResult(out);

            // Compile the stylesheet, then apply it to the input document
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer trans = factory.newTransformer(xslDoc);
            trans.transform(xmlDoc, result);
        }
        catch (TransformerException e) {
            System.out.println("Transformer problem: " + e);
        }
        catch (IOException e) {
            System.out.println("An I/O problem: " + e);
        }
    }
}
Part 4. Running XSLT from within a Java servlet.
================================================
Suppose we want to use a local stylesheet called
XSLTransformerCode.xsl to process a remote XML file
at some URL.
Using Netbeans and Glassfish, add the xsl
stylesheet file to the project's Web Pages folder.
A doGet method might have the following code:
PrintWriter out = response.getWriter();

// Get the xsl stored in this project's Web Pages folder
ServletContext context = getServletContext();
InputStream xsl = context.getResourceAsStream("/XSLTransformerCode.xsl");

// We need two Source objects and one Result.
// StreamSource accepts a URL in string format, so the external
// xml document is fetched directly from urlAsString.
Source xmlDoc = new StreamSource(urlAsString);
Source xslDoc = new StreamSource(xsl);
Result result = new StreamResult(out);

// Prepare to transform
TransformerFactory factory = TransformerFactory.newInstance();
Transformer trans = factory.newTransformer(xslDoc);
trans.transform(xmlDoc, result);
// The transformed document is returned to the browser.
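The same steps can be tried outside a servlet container. The sketch below
keeps the structure of the doGet fragment but replaces the servlet objects
with plain Java ones: the stylesheet arrives as an InputStream, the XML is
named by a URL string, and the result goes to a Writer. The class and method
names are hypothetical, and a temporary file addressed by its file: URL
stands in for the remote document.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class UrlSourceSketch {

    // Same steps as the doGet fragment, minus the servlet API.
    static void transformFromUrl(String urlAsString, InputStream xsl, Writer out)
            throws TransformerException {
        Transformer trans = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(xsl));
        // A String argument is treated as a system ID (URL), not as XML text.
        trans.transform(new StreamSource(urlAsString), new StreamResult(out));
    }

    // Stand-in for a remote document: a temp file addressed by its file: URL.
    static String demo() {
        String xsl =
            "<xsl:stylesheet version='1.0'"
          + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='text'/>"
          + "<xsl:template match='/'>"
          + "<xsl:value-of select='greeting'/>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";
        try {
            Path xml = Files.createTempFile("remote", ".xml");
            xml.toFile().deleteOnExit();
            Files.write(xml,
                "<greeting>hello</greeting>".getBytes(StandardCharsets.UTF_8));
            StringWriter out = new StringWriter();
            transformFromUrl(xml.toUri().toString(),
                new ByteArrayInputStream(xsl.getBytes(StandardCharsets.UTF_8)),
                out);
            return out.toString();
        } catch (IOException | TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In a servlet, urlAsString would be an http URL and out would be the
response's PrintWriter.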
Part 5. An Atom document from the W3C
=====================================
The following document was accessed from the W3C's main
web page by clicking on the syndicate link. It is meant to
be read by a news reader. We will use it as our input file
for the homework problems below.
The current W3C feed may be accessed here:
http://www.w3.org/News/atom.xml
W3C Newstag:www.w3.org,2008-09-29://42010-09-07T18:29:54ZMovable Type 4.261W3C Invites Implementations of Geolocation API Specificationtag:www.w3.org,2010://4.88892010-09-07T18:29:03Z2010-09-07T18:29:03ZThe Geolocation Working Group invites implementation of the Candidate Recommendation of Geolocation API Specification. The Geolocation API defines a high-level interface to location information associated only with the device hosting the implementation, such as latitude and longitude. The API itself...W3C Staff
The Geolocation Working Group invites implementation of the Candidate Recommendation of Geolocation API Specification. The Geolocation API defines a high-level interface to location information associated only with the device hosting the implementation, such as latitude and longitude. The API itself is agnostic of the underlying location information sources. Common sources of location information include Global Positioning System (GPS) and location inferred from network signals such as IP address, RFID, WiFi and Bluetooth MAC addresses, and GSM/CDMA cell IDs, as well as user input. Learn more about the Ubiquitous Web Applications Activity.]]>
XMLHttpRequest Level 2 Draft Publishedtag:www.w3.org,2010://4.88882010-09-07T18:26:43Z2010-09-07T18:26:43ZThe Web Applications Working Group has published a Working Draft of XMLHttpRequest Level 2. The XMLHttpRequest Level 2 specification enhances the XMLHttpRequest object with new features, such as cross-origin requests, progress events, and the handling of byte streams for both...W3C Staff
The Web Applications Working Group has published a Working Draft of XMLHttpRequest Level 2. The XMLHttpRequest Level 2 specification enhances the XMLHttpRequest object with new features, such as cross-origin requests, progress events, and the handling of byte streams for both sending and receiving. Learn more about the Rich Web Client Activity.]]>
Last Call: The Widget Interfacetag:www.w3.org,2010://4.88872010-09-07T18:24:46Z2010-09-07T18:24:46ZThe Web Applications Working Group has published a Last Call Working Draft of The Widget Interface. This specification defines an application programming interface (API) for widgets that provides, amongst other things, functionality for accessing a widget's metadata and persistently storing...W3C Staff
The Web Applications Working Group has published a Last Call Working Draft of The Widget Interface. This specification defines an application programming interface (API) for widgets that provides, amongst other things, functionality for accessing a widget's metadata and persistently storing data. Comments are welcome through 28 September. Learn more about the Rich Web Client Activity.]]>
Updated Note: Device API Access Control Use Cases and Requirementstag:www.w3.org,2010://4.88862010-09-07T18:22:31Z2010-09-07T18:22:31ZThe Device APIs and Policy Working Group has updated a Group Note of Device API Access Control Use Cases and Requirements. This document examines the question of managing sensitive information that can become available through device APIs (e.g., position information)....W3C Staff
The Device APIs and Policy Working Group has updated a Group Note of Device API Access Control Use Cases and Requirements. This document examines the question of managing sensitive information that can become available through device APIs (e.g., position information). The approach taken in this document is to simplify the possible interactions by considering three related use cases: (1) browser web pages and untrusted widgets (2) trusted widgets and applications, and (3) delegated authority. Learn more about the Ubiquitous Web Applications Activity.]]>
W3C Extends Speech Framework to Asian Languagestag:www.w3.org,2010://4.88852010-09-07T16:41:07Z2010-09-07T16:41:07ZW3C today extended speech on the Web to an enormous new market by improving support for Asian languages and multi-lingual voice applications. The Speech Synthesis Markup Language (SSML 1.1) Recommendation provides control over voice selection as well as speech characteristics...W3C Staff
W3C today extended speech on the Web to an enormous new market by improving support for Asian languages and multi-lingual voice applications. The Speech Synthesis Markup Language (SSML 1.1) Recommendation provides control over voice selection as well as speech characteristics such as pronunciation, volume, and pitch. SSML is part of W3C's Speech Interface Framework for building voice applications, which also includes the widely deployed VoiceXML. "With SSML 1.1 there is an intentional focus on Asian language support," said Dan Burnett, Co-Chair of the Voice Browser Working Group and Director of Speech Technologies and Standards at Voxeo, "including Chinese languages, Japanese, Thai, Urdu, and others, to provide a wide deployment potential." Read more in the press release and W3C Member Testimonials. Learn more about voice browsing.]]>
Five XML Security Drafts Publishedtag:www.w3.org,2010://4.88832010-09-01T20:52:23Z2010-09-01T20:52:23ZThe XML Security Working Group has published five working drafts today. XML Signature 2.0, Canonical XML 2.0 and the XML Signature Streamable Profile of XPath 1.0 are part of an ongoing effort to rework XML Signature and Canonical XML in...W3C Staff
The XML Security Working Group has published five working drafts today. XML Signature 2.0, Canonical XML 2.0 and the XML Signature Streamable Profile of XPath 1.0 are part of an ongoing effort to rework XML Signature and Canonical XML in order to address issues around performance, streaming, robustness, and attack surface. The Working Group has also published updated Working Drafts for its XML Signature Best Practices and XML Security Relax NG Schemas Working Group Notes. Learn more about XML Security.]]>
Voice Extensible Markup Language (VoiceXML) 3.0 Draft Publishedtag:www.w3.org,2010://4.88822010-08-31T19:33:56Z2010-08-31T19:33:56ZThe Voice Browser Working Group has published a Working Draft of Voice Extensible Markup Language (VoiceXML) 3.0. Voice XML is used to create interactive media dialogs that feature synthesized speech, recognition of spoken and DTMF key input, telephony, mixed initiative...W3C Staff
The Voice Browser Working Group has published a Working Draft of Voice Extensible Markup Language (VoiceXML) 3.0. Voice XML is used to create interactive media dialogs that feature synthesized speech, recognition of spoken and DTMF key input, telephony, mixed initiative conversations, and recording and presentation of a variety of media formats including digitized audio, and digitized video. Learn more about the Voice Browser Activity.]]>
W3C Launches HTML Speech Incubator Grouptag:www.w3.org,2010://4.88812010-08-30T15:40:46Z2010-08-30T15:40:46ZW3C is pleased to announce the creation of the HTML Speech Incubator Group, whose mission is to determine the feasibility of integrating speech technology in HTML5 in a way that leverages the capabilities of both speech and HTML (e.g., DOM)...W3C Staff
W3C is pleased to announce the creation of the HTML Speech Incubator Group, whose mission is to determine the feasibility of integrating speech technology in HTML5 in a way that leverages the capabilities of both speech and HTML (e.g., DOM) to provide a high-quality, browser-independent speech/multimodal experience while avoiding unnecessary standards fragmentation or overlap. The following W3C Members have sponsored the charter for this group: Voxeo, Microsoft, Openstream, Google, AT&T, Mozilla. Read more about the Incubator Activity, an initiative to foster development of emerging Web-related technologies. Incubator Activity work is not on the W3C standards track but in many cases serves as a starting point for a future Working Group.]]>
W3C Launches Web Performance Working Grouptag:www.w3.org,2010://4.88792010-08-19T14:15:11Z2010-08-19T14:15:11ZW3C has launched a new Web Performance Working Group, whose mission is to provide methods to measure aspects of application performance of user agent features and APIs. As Web browsers and their underlying engines include richer capabilities and become more...W3C Staff
W3C has launched a new Web Performance Working Group, whose mission is to provide methods to measure aspects of application performance of user agent features and APIs. As Web browsers and their underlying engines include richer capabilities and become more powerful, Web developers are building more sophisticated applications where application performance is increasingly important. Developers need the ability to assess and understand the performance characteristics of their applications using well-defined interoperable methods. This new Working Group will look at user agent features and APIs to measure aspects of application performance. Group deliverables will apply to desktop and mobile browsers and other non-browser environments where appropriate and will be consistent with Web technologies designed in other working groups including HTML, CSS, WebApps, DAP and SVG. Learn more in the Working Group charter and how this work fits into the W3C's Rich Web Client Activity.]]>
Contacts API Draft Publishedtag:www.w3.org,2010://4.88782010-08-17T17:33:10Z2010-08-17T17:33:10ZThe Device APIs and Policy Working Group has published a Working Draft of Contacts API. This specification defines the concept of a user's unified address book - where address book data may be sourced from a plurality of sources -...W3C Staff
The Device APIs and Policy Working Group has published a Working Draft of Contacts API. This specification defines the concept of a user's unified address book - where address book data may be sourced from a plurality of sources - both online and locally. This specification then defines the interfaces on which 3rd party applications can access a user's unified address book; with explicit user permission and filtering. Learn more about the Ubiquitous Web Applications Activity.]]>
W3C Leads Discussion at TypeCon 2010 on New Open Web Font Format (WOFF)tag:www.w3.org,2010://4.88772010-08-17T11:22:02Z2010-08-17T11:22:02ZW3C attends TypeCon 2010 this week for community discussion about Web Open File Format (WOFF), the new open format for enabling high-quality typography for the Web. WOFF expands the typographic palette available to Web designers, improving readability, accessibility, internationalization, branding,...W3C Staff
W3C attends TypeCon 2010 this
week for community discussion about Web Open File
Format (WOFF), the new open format for enabling high-quality
typography for the Web. WOFF expands the typographic palette available
to Web designers, improving readability, accessibility,
internationalization, branding, and search optimization. Though still
in the early phases of standardization, WOFF represents a pivotal
agreement among browser vendors, foundries and font service providers
who have convened at W3C to address the long-standing goal of
advancing Web typography. "As a key Web font standard developed by
W3C, WOFF 1.0 represents a universal solution for enabling advanced
typography on the Web," said Vladimir Levantovsky, W3C WebFonts Working Group chair and senior technology
strategist at Monotype Imaging, Inc. "With the backing of browser
companies and font vendors, who are making their fonts available for
licensing in WOFF, this new W3C Recommendation-track document will
bring rich typographic choice for content creators, Web authors and
brand managers." Learn more in the press release and WOFF FAQ,
as well as more about fonts on the Web.
]]>
Privacy Workshop Participants Share Implementation Experience; User Behaviorstag:www.w3.org,2010://4.88762010-08-15T20:49:51Z2010-08-15T20:49:51ZIn July, W3C brought together participants across the industry for a privacy workshop (organized jointly with the PrimeLife EU project in London). Discussion topics included privacy-related implementation experience with the W3C geolocation API, and privacy icon and ruleset proposals for...W3C Staff
In July, W3C brought together participants across the industry for a privacy workshop (organized jointly with the PrimeLife EU project in London). Discussion topics included privacy-related implementation experience with the W3C geolocation API, and privacy icon and ruleset proposals for Web sites and APIs, respectively. Read the Workshop Report and learn more about the W3C Privacy Activity.
]]>
Web Security Context: User Interface Guidelines is a W3C Recommendationtag:www.w3.org,2010://4.88752010-08-12T14:44:25Z2010-08-12T14:44:25ZThe Web Security Context Working Group has published a W3C Recommendation of Web Security Context: User Interface Guidelines. This specification deals with the trust decisions that users must make online, and with ways to support them in making safe and...W3C Staff
The Web Security Context Working Group has published a W3C Recommendation of Web Security Context: User Interface Guidelines. This specification deals with the trust decisions that users must make online, and with ways to support them in making safe and informed decisions where possible. It describes user interactions and user interface guidelines with a goal toward making security usable, based on known best practice in this area.
Learn more about the Security Activity.]]>
W3C Invites Review of First Draft of The Messaging APItag:www.w3.org,2010://4.88742010-08-10T18:11:50Z2010-08-10T18:11:50ZThe Device APIs and Policy Working Group has published a First Public Working Draft of The Messaging API. The Messaging API defines a high-level interface to Messaging functionality, including SMS, MMS and Email. It includes APIs to create, send and...W3C Staff
The Device APIs and Policy Working Group has published a First Public Working Draft of The Messaging API. The Messaging API defines a high-level interface to Messaging functionality, including SMS, MMS and Email. It includes APIs to create, send and receive messages. Learn more about the Ubiquitous Web Applications Activity.]]>
Call for Review: MathML 3.0; MathML for CSS Profile are Proposed Recommendationstag:www.w3.org,2010://4.88732010-08-10T14:55:33Z2010-08-10T14:55:33ZThe Math Working Group published two Proposed Recommendations today: Mathematical Markup Language (MathML) Version 3.0 and A MathML for CSS Profile. This first defines the Mathematical Markup Language, or MathML, which enables people to express mathematics in Web documents. The...W3C Staff
The Math Working Group published two Proposed Recommendations today: Mathematical Markup Language (MathML) Version 3.0 and A MathML for CSS Profile. This first defines the Mathematical Markup Language, or MathML, which enables people to express mathematics in Web documents. The second describes a profile of MathML 3.0 that is suitable for styling with Cascading Style Sheets (CSS). Comments are welcome through 10 September. Learn more about the Math Activity.]]>
Part 6 Introductory XSLT Programming
====================================
In solving the Atom puzzles below, I used the following in each of my
XSLT programs.
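Atom elements belong to the namespace http://www.w3.org/2005/Atom, so a
stylesheet that reads an Atom feed has to bind a prefix to that URI and use
the prefix in its XPath expressions. The following is a minimal sketch of
that idea, not a solution to any of the problems below. The prefix atom, the
one-element feed and the class name AtomNamespaceSketch are assumptions. It
prints what an unprefixed path and a prefixed path each select.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class AtomNamespaceSketch {

    // A tiny stand-in feed that puts its elements in the Atom namespace.
    static final String FEED =
        "<feed xmlns='http://www.w3.org/2005/Atom'>"
      + "<title>W3C News</title>"
      + "</feed>";

    // The stylesheet binds the (assumed) prefix atom to the Atom URI.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:atom='http://www.w3.org/2005/Atom'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "unprefixed=[<xsl:value-of select='feed/title'/>] "
      + "prefixed=[<xsl:value-of select='atom:feed/atom:title'/>]"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the transformation and returns the text output.
    static String demo() {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(FEED)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The unprefixed path selects nothing; the prefixed one finds the title.
        System.out.println(demo());
    }
}
```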
(1) 10 Points. Using command line XSLT, write an XSLT program that displays
the contents of each title that is a direct child of feed/entry.
This list of titles will appear as an HTML unordered list.
It will appear something like this:
W3C Atom Document
W3C Invites Implementations of Geolocation API Specification
XMLHttpRequest Level 2 Draft Published
Last Call: The Widget Interface
:
:
(2) 10 Points. Using command line XSLT, write an XSLT program that displays
the number of Atom entry elements that appear in the document. You
must use the XSLT count function in your solution. Your output will
be marked up as HTML and will appear in a browser as follows:
Counting Atom entry items
15
(3) 10 Points. Using command line XSLT, write an XSLT program that displays
the contents of each title that is a direct child of feed/entry.
Your output will be marked up as HTML and will appear in a
browser with the titles underlined as hypertext links. If the user
clicks on a link the browser will fetch the associated document that
is pointed to by the link element. The output will appear as follows
(in a browser, these show up as clickable links):
Titles (with links)
* W3C Invites Implementations of Geolocation API Specification
* XMLHttpRequest Level 2 Draft Published
* Last Call: The Widget Interface
:
(4) 10 Points. Using command line XSLT, write an XSLT program that displays
the contents of each title and the value of the term attribute of each
category associated with that title. The output will be marked up nicely
in HTML. A browser will display something like the following:
W3C Invites Implementations of Geolocation API Specification
* Web of Devices
* Home Page Stories
* Publication
XMLHttpRequest Level 2 Draft Published
* Browsers and Authoring Tools
* Home Page Stories
* Publication
* Web Design and Applications
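The four problems above rely on a handful of XSLT features: xsl:for-each
with xsl:value-of, the count function, attribute selection with @, and
attribute value templates (curly braces inside a literal attribute). The
sketch below exercises each of them on unrelated data so that it does not
give away a solution. The shelf and item elements, their kind and url
attributes, the example.org addresses and the class name XsltFeaturesSketch
are all hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltFeaturesSketch {

    // Hypothetical input: two items, each with two attributes.
    static final String SHELF_XML =
        "<shelf>"
      + "<item kind='novel' url='http://example.org/moby'>Moby Dick</item>"
      + "<item kind='essay' url='http://example.org/walden'>Walden</item>"
      + "</shelf>";

    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='/'>"
      + "<div>"
      // count() returns the number of nodes selected by its argument
      + "<p><xsl:value-of select='count(shelf/item)'/></p>"
      + "<ul><xsl:for-each select='shelf/item'>"
      // {@url} copies the url attribute's value into href;
      // @kind selects the kind attribute of the current item
      + "<li><a href='{@url}'><xsl:value-of select='.'/></a>"
      + " (<xsl:value-of select='@kind'/>)</li>"
      + "</xsl:for-each></ul>"
      + "</div>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the transformation and returns the generated HTML.
    static String demo() {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(SHELF_XML)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```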
SERVER SIDE MASHUP
==================
(5) 40 Points. Write a JSP page that asks the user to select a topic from
a drop down list. The three topics will be Business, Technology and World
News. Once a selection is made, the browser will call a Java servlet,
passing along the topic. The topic is simply a string passed to the
servlet from the browser.
The servlet will fetch the appropriate RSS 2.0 feed from the NY Times
web site. It will apply a style sheet that generates HTML for
the browser. The HTML display will show the title of each news item.
Each news title will be displayed as a link. The user will be able to
click links to visit the associated page. Note that there are no
namespaces defined on the main elements in RSS 2.0.
New York Times feeds may be found at http://www.nytimes.com/services/xml/rss/index.html
(6) 10 Points. Add a drop down box for the source of feeds to the application that
you built in question 5. Thus, the user will be able to select a topic and
a source. At a minimum, you will need to provide for three sources. In
my solution, I used the BBC, the New York Times and the Sydney Morning
Herald.
The BBC feeds are discussed here:
http://news.bbc.co.uk/1/hi/help/3223484.stm
The Sydney Morning Herald feeds are discussed
here:
http://www.smh.com.au/rsschannels/
(7) 10 Points. Add Ajax to your solution in question 6. Be creative
and redesign the site so that there is no need for a full page
refresh.
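For the topic selection in question 5, one possible way to turn the topic
string into a feed address is a fixed lookup table, so that the servlet only
ever fetches URLs it already knows about. This is a sketch of that one design
choice, not a required approach. The class name FeedCatalogSketch is
hypothetical and the example.org addresses are placeholders; the real
addresses come from the New York Times RSS index page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FeedCatalogSketch {

    // Placeholder URLs: substitute the real feed addresses.
    static final Map<String, String> FEEDS = new LinkedHashMap<>();
    static {
        FEEDS.put("Business", "http://example.org/rss/business.xml");
        FEEDS.put("Technology", "http://example.org/rss/technology.xml");
        FEEDS.put("World News", "http://example.org/rss/world.xml");
    }

    // Returns the feed URL for a topic, or null when the topic is not offered.
    static String feedUrlFor(String topic) {
        return topic == null ? null : FEEDS.get(topic.trim());
    }

    public static void main(String[] args) {
        System.out.println(feedUrlFor("Technology"));
        System.out.println(feedUrlFor("Sports"));
    }
}
```

A servlet could pass the returned string to StreamSource and answer with an
error page when the lookup returns null.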