15-100 Lecture 2 (Wednesday, Jan 14, 2004)

The Three Essential Skills for Software Development

I believe that the three essential skills for software development are abstraction, symbolic representation and manipulation, and interpretation.
What do I mean by each of these?

abstraction: One critical skill required of a software developer is the ability to take a look at a complex problem and reduce it to its essential properties. As software developers we need to decide which properties of the system should be part of our software model, and which properties we can neglect as unimportant.

symbolic representation and manipulation: As software developers, we need to go farther than simply understanding the system. After we have developed an abstract understanding of the system, we must be able to define it symbolically. We need to take our understanding of the system and represent it in a way that is processable by the computer. For example, this semester, we'll need to represent our mental models of various systems in the Java programming language.
We also need the ability to manipulate systems that have already been represented symbolically. We need to be able to look at source code and change it to correct misrepresentations or to reflect changes in our understanding of the system it represents.

interpretation: As we've discussed, as software developers, we need to develop abstract models of systems, represent these systems symbolically, and manipulate them in the symbolic domain. But, this isn't enough. We must also be able to interpret symbolic representations of systems and realize their abstract meaning. This is necessary in order to truly understand what a piece of software does. And, it is also necessary while maintaining software to ensure that the changes being made actually affect the abstract system as intended.
Those with the ability to interpret symbolic systems don't view the system as a collection of symbols. Instead they see through the symbols and view the system that they represent. For example, I view music printed on a sheet of paper and try to "clap it out". A musician can read the printed page and appreciate the music. I am told that the best composers enjoy hearing their work performed, but don't need to hear it to know how it sounds.

Consider this. In the early days of the IS boom in the mid-to-late 1970s and early 1980s, when computer programmers were scarce, IBM developed a solution. Like most, they hired those with degrees in "information systems" and "computer science" (very few CS programs existed at this time).
And, like most, they turned to those who had similar skills in abstraction, symbolic representation and manipulation, and interpretation, and trained them for the job. These folks typically included mathematicians, engineers, physicists, and certain types of business backgrounds.
But, they developed a not-so-secret strategy in the battle for talent. They began hiring musicians -- in large numbers. What IBM understood was that learning to program is easy -- if you have mastered the three essential skills -- and that talented musicians had demonstrated an aptitude for these in a different domain.
I won't say that every punk rocker hired by IBM turned out to be an outstanding programmer. But, their yield was, in fact, high. Certainly programmers need to know a programming language well enough to express ideas freely -- but IBM's experience demonstrates that syntax and semantics are far from the hard part of the problem.
As a footnote, let me add that the first "burst of the bubble" didn't occur recently, with the "Dot Com Bust". The salaries of computer science and IS graduates were depressed through much of the mid-to-late 1980s, and only really recovered in the early 1990s. This was largely due to the rapid growth of CS departments in response to the demand of the late 1970s and early 1980s. Believe it or not, there were more people enrolled in computer science and related degree programs in the early 1980s than today! Talk about a glut of people hitting the market!

The Early Days of Programming Education

A couple/three decades ago, when the teaching of computer programming became big business, the landscape was quite different. We were, quite literally, teaching computer programming. Our students were electrical engineers, physicists, mathematicians, business professionals, &c.
Each of these groups of people had mastered the three essential skills of abstraction, symbolic representation and manipulation, and interpretation in their own domain. Physicists were already accustomed to considering physical systems, large and small, forming hypotheses, representing the essential properties of these systems mathematically, manipulating the equations, and interpreting the results. Similarly, for electrical engineers, viewing complex electrical and electronic circuits as systems of differential equations was child's play. Mathematicians were already accustomed to generating a symbolic language for the representation of abstract systems with properties of their own creation, manipulating these symbolic representations, and interpreting the symbols -- in ways that appear "Greek to the rest of us". And, business people were the inventors of "flow charts" and "entity relationship diagrams". These were tools used to describe business processes and resources.
They came to us, the teachers of computer programming, and asked us to teach them how to program. And, quite literally, that is what they meant, "Show me how to map between my symbolic language and yours." And so, that's what we did. We taught them the syntax and semantics of languages such as FORmula TRANslation (FORTRAN) and the COmmon Business Oriented Language (COBOL). Beyond teaching them our symbolic language, we didn't need to teach them the three essential skills -- they had already mastered them in their own domains. We simply showed them how to convert their symbolic representations into machine-readable representations.

Computer Science Education Today

Today, computers are everywhere. They are no longer tools for specialized scientific, engineering, and business problems. Instead, they are used for all sorts of different things. And so, in effect, a new discipline has emerged. Today "computer scientists" come in all shapes and sizes and address all sorts of problems.
The students, like you, who arrive and say, "I want to learn to program" are asking for something entirely different than your predecessors of 20-30 years ago. You aren't asking to simply be shown the syntax and semantics of Java, or some other programming language, for this is quite insufficient for any purpose.
Unlike your predecessors, you aren't physicists, mathematicians, business people (or musicians). You haven't already mastered the three essential skills in your own domain -- and you want to be able to solve problems across many domains.
So, when you say, "I want to learn to program", you really mean, "Help me to learn everything I need to learn to solve problems using a computer." You want us to teach you not only the syntax and semantics of a language -- but also how to use that language to solve problems. You want us to help you to master the three essential skills as well as teach you the language.
And so, this semester, I will do my best to do just that. And, we'll start exactly there.

Software Design and the Object-Oriented Approach

There are many different approaches to the design of software systems. Over the years, we have developed and taught many different design methodologies. In the past, we began programming courses discussing flow charts and/or top-down diagrams. And sometimes we even discussed the merits of bottom-up approaches. But, these approaches proved too limited for complex problems and a new methodology came to light, object-oriented programming (OOP).
At the heart of the object-oriented approach are objects. Don't expect a complex mathematical description or a long technical definition. You won't find one, at least here. Instead let me just suggest that objects are all of the individual identifiable components of a system. If you want to point at it, name it, use it, or talk about it, it is an object.
One way of discovering the objects in a system is to describe it to someone else. Then consider all of the nouns that you used in your description. These are probably objects.
A very important piece of object-oriented design is, as you might imagine, describing the objects within the system. Again, a conversational model might be useful here. Oftentimes when we describe things to each other, we will talk about the objects first in terms of their behavior. For example, if a little child asks you, "What is a car?", you would first tell the child that it can move forward and backward and turn. And then, you would tell the child that it can carry people. Eventually you would tell the child that cars come in all different colors, and that some have two doors and others four doors, &c.
This is the same process that we'll use to describe the objects in software systems. After we identify each object, we'll ask ourselves, "What does this object do?" We call the things that an object can do its behaviors. We also ask ourselves, "What do we need to tell it to get it to do these things? What does it need to know?" For example, if we want a car to move forward, we need to tell it, "how fast". If we want it to turn left, we have to tell it, "how much". These are the parameters of its behaviors.
After considering the age-old question, "What can the object do for us?", we'll ask ourselves, "What are the other properties of the object?" What color is it? How big is it? How many doors does it have? These aspects of an object we call its attributes.
Many of these attributes are visible attributes: we can observe them from outside of the object. But, some of the attributes we can only discover by inference. We'll call these the hidden attributes.
For example, if we consider a typical calculator, the only number we can see on the display is the most recent input. But, by observing how a calculator adds and subtracts, we can infer that it maintains an accumulator containing the most recent result. We can't see the accumulator, but we know that it must exist -- without it, we cannot explain the behavior of the calculator.
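To make behaviors, parameters, and visible and hidden attributes a little more concrete, here is a minimal Java sketch of the calculator. The class name Calculator, its method names, and the way the display and accumulator interact are all assumptions made for illustration; a real calculator may well work differently.

```java
// Hypothetical sketch: the names and the exact behavior are illustrative assumptions.
class Calculator {
    // Visible attribute: the number currently shown on the display.
    private int display = 0;

    // Hidden attribute: the running result. No method hands it out directly;
    // an outside observer can only infer it from how the calculator behaves.
    private int accumulator = 0;

    // Behavior with a parameter: "which number?"
    void enter(int number) {
        display = number;
    }

    // Behavior: add the displayed number to the running result.
    void add() {
        accumulator += display;
    }

    // Behavior: subtract the displayed number from the running result.
    void subtract() {
        accumulator -= display;
    }

    // Behavior: bring the running result onto the display.
    void showResult() {
        display = accumulator;
    }

    // Lets the outside world read the visible attribute.
    int getDisplay() {
        return display;
    }

    public static void main(String[] args) {
        Calculator calc = new Calculator();
        calc.enter(5);
        calc.add();
        calc.enter(3);
        calc.add();
        calc.enter(2);
        calc.subtract();
        System.out.println(calc.getDisplay()); // the most recent input: 2
        calc.showResult();
        System.out.println(calc.getDisplay()); // 5 + 3 - 2 = 6
    }
}
```

Notice that nothing outside the class can read the accumulator, yet the final display value cannot be explained without it.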

Classes: Different Types of Objects

If I asked you to describe the contents of a room with only one person inside, you would probably describe that one person, by name. But, if I asked you to describe a room with 100 people inside, you would probably approach the problem slightly differently.
You would describe a generic person first. You would tell me what all people have in common, and how they differ. Then, you would tell me about each object in the room by first identifying it as a person, and then telling me about how it is special -- by describing its collection of attributes.
In object-oriented languages, we can do exactly the same thing. We can describe types of objects, a.k.a. classes of objects. We do this by writing a class specification that describes the behaviors and attributes of a class of objects. Then, when we create new objects, we do it by building them according to some class specification and "filling in the blanks" for each attribute. So, in object-oriented languages, new objects are "instances of a class", meaning that they are built according to some class specification.
We can define a class for cars that observes that cars can have differing numbers of doors, differing sizes of engines, and differing colors, but are all capable of moving forward, backward, and turning left and right. Then, we can create a new car, specifying that it should be red and have two doors. Sometimes in using the car we might be concerned with the number of doors -- and on other occasions we just might want to drive it forward without worrying about it.
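As a rough sketch of what such a class specification might look like in Java, consider the following. The class name Car, its fields, and its methods are assumptions chosen for illustration, and movement is simplified to a distance along a straight line plus a heading in degrees.

```java
// Hypothetical sketch: the Car class and all of its members are illustrative assumptions.
class Car {
    // Attributes: the "blanks" that get filled in for each individual car.
    private final String color;
    private final int doors;

    // State that the behaviors below change over time.
    private int position = 0; // distance travelled along a straight line
    private int heading = 0;  // direction in degrees, kept in the range 0..359

    // Creating an instance means filling in the blanks.
    Car(String color, int doors) {
        this.color = color;
        this.doors = doors;
    }

    // Behaviors, each with a parameter that answers "how much?"
    void moveForward(int distance)  { position += distance; }
    void moveBackward(int distance) { position -= distance; }
    void turnLeft(int degrees)      { heading = Math.floorMod(heading - degrees, 360); }
    void turnRight(int degrees)     { heading = Math.floorMod(heading + degrees, 360); }

    String getColor() { return color; }
    int getDoors()    { return doors; }
    int getPosition() { return position; }
    int getHeading()  { return heading; }

    public static void main(String[] args) {
        // Two instances of the same class, with different attribute values.
        Car red = new Car("red", 2);
        Car blue = new Car("blue", 4);

        // We can drive a car without caring how many doors it has.
        red.moveForward(10);
        red.moveBackward(3);
        System.out.println(red.getColor() + " car with " + red.getDoors()
                + " doors at position " + red.getPosition());
        System.out.println(blue.getColor() + " car with " + blue.getDoors()
                + " doors at position " + blue.getPosition());
    }
}
```

Both cars are built from the same specification; only the values filled in for the attributes differ.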

Objects Around The Room

To help you guys with these questions, a small example follows. Let's model something in the real world, a class in one of the 5419 clusters:

What do we have in the class?

People - What kind of people?
    Females
    Males

Computers - What kind of computers?
    Desktops - What kinds of desktops?
        Macs
    Laptops - What kind of laptops?
        Intel PC
        iBook
    PDAs - What kind of PDAs?
        Palm PC
        Linux

Walls - What kind of walls?
    Fixed
    Moveable (partition)

The Object-Oriented Approach

In the example above, everything we listed was a "type of thing." In Java, a type is a classification based on attributes including behaviors. There are few, if any, programming languages that have built-in types for People, Computers and Walls, however. In the case of the example above, we would have to create classes for our things.
A class is simply a type that we define that specifies the behaviors and attributes of a collection of things or "objects". In Java, an object is something that is created based on the class "blueprint" and is considered to be an instance of a class. Classes are a great way to decompose a problem and allow you to break a program into smaller, more manageable parts through the use of inheritance and composition.

Subclasses and Inheritance

Many of you have already heard of inheritance, a mechanism common to many object-oriented languages. What is inheritance? Inheritance is a mechanism within Java and other object-oriented languages for defining subtypes, a.k.a. subclasses.
In other words, inheritance is a method for defining "is a" relationships between classes of objects. For example, an iBook is a Laptop, and a Laptop is a Computer.
When analyzing a system, we like to observe the "is a" relationship among classes, because it enables us to describe the system in a more concise, more meaningful way. By linking the definition of a specific type (subclass or derived class) of object to a more general class of objects (base class, a.k.a. the parent class), we not only make our code smaller and easier to read, but we also make the properties of the classes of objects more apparent.
If no mechanism for defining subclasses were available, or if we chose not to use it, we would have to define each class of object from scratch. This would mean that we would need to write more code -- repeating common aspects of the specification among derived types. This not only makes the original program longer, but requires that, as changes are made over time, they be repeated in each subclass of objects -- a wasteful and error-prone process.
And, even worse, if each object is defined from scratch, without observing the subclass relationships, the relationships are obscured by the code. Someone reading the code could be forced to digest and understand large sections of code -- simply to determine that they are, in fact, describing the same thing!
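To make the "is a" relationship concrete, here is a minimal Java sketch built on the iBook, Laptop, and Computer example. The class names follow the example, but the methods (run, closeLid, brand) and what they return are made up for illustration.

```java
// Hypothetical sketch: the methods and their return values are illustrative assumptions.
class InheritanceDemo {
    public static void main(String[] args) {
        IBook book = new IBook();
        System.out.println(book.run("editor")); // written once, in Computer
        System.out.println(book.closeLid());    // written once, in Laptop
        System.out.println(book.brand());       // specific to IBook

        // An IBook "is a" Laptop, and a Laptop "is a" Computer, so an IBook
        // can be used anywhere that a Computer is expected.
        Computer computer = book;
        System.out.println(computer.run("browser"));
    }
}

// The general class: behavior common to every computer.
class Computer {
    String run(String program) {
        return "running " + program;
    }
}

// A Laptop is a Computer; it only describes what it adds.
class Laptop extends Computer {
    String closeLid() {
        return "lid closed";
    }
}

// An IBook is a Laptop; again, it only describes what it adds.
class IBook extends Laptop {
    String brand() {
        return "Apple";
    }
}
```

If the way computers run programs ever changes, only the Computer class needs to be edited; Laptop and IBook pick up the change without being touched.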

One Good Example of Inheritance

We can, for example, represent the types of computers present within the room as below. This hierarchy captures, in a meaningful way, the types and subtypes of computers:
The Computer base class might capture the ability to load, store, and execute software, and the ability to display and acquire information from the user, as well as the ability to load and store user data.
The Desktop class might capture such properties as a moveable keyboard and display, the ability to add peripheral devices, &c.
The Portable class might capture such properties as built-in user input and output devices, the ability to "open and close" the case, and the ability to "suspend and wake up".
The Notebook class might capture the ability to run a certain class of software, receive input via a keyboard and output via an LCD display, and the ability to plug and unplug external peripherals.
The PDA class might capture the ability to receive input via a touchscreen display, the ability to run a different class of software, and the ability to dim the display, &c.
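The class descriptions above might be sketched in Java roughly as follows. This is only one possible reading: each class gets one or two representative behaviors, and every method name and return value is an assumption made for illustration.

```java
// Hypothetical sketch of the hierarchy; every member name is an illustrative assumption.
class ComputerRoomDemo {
    public static void main(String[] args) {
        // Each element "is a" Computer, so one loop can handle them all.
        Computer[] room = { new Desktop(), new Notebook(), new PDA() };
        for (Computer computer : room) {
            System.out.println(computer.execute("browser"));
        }

        // Behaviors defined in Portable are shared by Notebook and PDA.
        PDA pda = new PDA();
        pda.suspend();
        System.out.println(pda.isSuspended()); // true
        pda.wakeUp();
        System.out.println(pda.isSuspended()); // false
    }
}

// Base class: what every computer can do.
class Computer {
    String execute(String software) {
        return getClass().getSimpleName() + " executing " + software;
    }
}

// Desktops can have peripheral devices added.
class Desktop extends Computer {
    String addPeripheral(String device) {
        return "attached " + device;
    }
}

// Portables can "suspend and wake up".
class Portable extends Computer {
    private boolean suspended = false;

    void suspend()        { suspended = true; }
    void wakeUp()         { suspended = false; }
    boolean isSuspended() { return suspended; }
}

// Notebooks receive input via a keyboard.
class Notebook extends Portable {
    String inputDevice() {
        return "keyboard";
    }
}

// PDAs receive input via a touchscreen and can dim the display.
class PDA extends Portable {
    String inputDevice() {
        return "touchscreen";
    }

    String dimDisplay() {
        return "display dimmed";
    }
}
```

The loop in main never needs to know which kind of computer it is holding, because execute is defined once in the base class.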

A Bad Example of Inheritance

In class, we also discussed a bad example of inheritance: The FourLeggedObject base class. This hypothetical class could be used to derive such subclasses as tables, chairs, dogs, cats, cribs, and plant stands.
But, what do these things really have in common? Nothing more than the fact that they have four legs -- and legs with different capabilities, at that.
Constructing the classes of objects mentioned above in this way does very little to describe them. And, in fact, it could lead people to misunderstandings about their properties and degree of similarity.
And, in the end, it is likely to lead to coincidental similarities being defined in the base class -- only to be refactored into each of the derived classes during subsequent maintenance. It obscures the underlying model and makes the software more complex to maintain. Bad!

Multiple Inheritance, and Java's Lack of Support Thereof

Some sections heard a diatribe from me about Java's lack of support for multiple inheritance, the ability to have one class of objects as a subclass of two or more base classes, and the resulting weakness of Java's ability to concisely describe some types and relationships. Other sections were spared the diatribe and entangled history of OO languages and their abuse by programmers.
For 15-100 purposes, it is sufficient to know that, in Java, programmers can only specify a single parent class.

Composition

It is often easier to describe complex things by breaking them down into simpler constituent parts and describing these. In fact, this is at the heart of object-oriented design. When a class of objects is described by describing the parts that objects of this type contain, the class is said to be described by composition.
Much as subtyping can be described as the "is (necessarily) a" relationship, composition can be described as the "contains a" relationship.
For example, if we wanted to describe a Human, it would be much easier to do this by describing the properties and interactions of each organ, than by describing the much more complicated aggregate behaviors and properties of a whole Human. We might create classes for each major organ, such as the heart, the lungs, the brain, etc. We can then include instances of these organs within the Human class, and simply describe their interaction within the Human class.
In Java, the use of composition is very straightforward. We simply include references to instances of the composing classes within the class specification for the aggregate class. We will talk more about how to do this, using what are called static (a.k.a. class) variables and instance variables, very soon.
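Here is a minimal Java sketch of composition using the Human example. Only two organs are modeled, and the class names, method names, and the idea of counting beats and breaths are assumptions made purely for illustration.

```java
// Hypothetical sketch: class and method names are illustrative assumptions.
class CompositionDemo {
    public static void main(String[] args) {
        Human person = new Human();
        person.liveOneMoment();
        person.liveOneMoment();
        System.out.println(person.heartBeats()); // 2
        System.out.println(person.breaths());    // 2
    }
}

// A simple part, described on its own.
class Heart {
    private int beats = 0;

    void beat()    { beats++; }
    int getBeats() { return beats; }
}

// Another simple part.
class Lungs {
    private int breaths = 0;

    void breathe()   { breaths++; }
    int getBreaths() { return breaths; }
}

// The aggregate: a Human "contains a" Heart and "contains a" Lungs.
class Human {
    // References to instances of the composing classes.
    private final Heart heart = new Heart();
    private final Lungs lungs = new Lungs();

    // The Human class only describes how its parts interact.
    void liveOneMoment() {
        heart.beat();
        lungs.breathe();
    }

    int heartBeats() { return heart.getBeats(); }
    int breaths()    { return lungs.getBreaths(); }
}
```

Human does not extend Heart or Lungs. A Human is not a kind of heart; it contains one.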

A Word of Caution

Please do not confuse the use of inheritance with the use of composition. Sometimes it is difficult to figure out which to use at the code level, but try not to get frustrated. Instead, look beyond the code back to the actual problem and the big-picture model. Which relationship is being described? "Is (necessarily) a"? Or, "has a"?

The Complete Process

Okay -- let's connect the dots and describe the whole process for solving a problem:

Identify Objects that are part of the system including their behaviors, state, and other attributes.

Think of these objects as instances of classes. Try to determine what type of "thing" each object is.

Identify the relationships among the classes. Any two classes will either have no relationship, have a subclass/parent class relationship, or be related by composition.

Specify the behaviors and attributes of each class of objects. If you do this in plain English, this is what we call writing a spec. If you do this in code, it is programming. In many cases it is helpful to write a spec before programming -- the process helps flesh out the relationships with less overhead from detail.

Refactor your design and reimplement it, as necessary. Design is often an iterative process. With each pass, by clarifying additional relationships, the design often becomes less tangled. As a result, other relationships become more clear and subject to revision. This process ends when the specification or class design is an excellent description of the real-world or abstract system.

Finally, make use of the program, and repair or augment it, as necessary.

Thinking About Programming

A program models a system and allows for observation of output and performance. That's great, but what do you need to figure out before you start constructing your program? Here's a helpful list of questions to cover before you begin coding:

What problem do you need to solve?

What are the components/properties/behaviors involved in the problem?

How do you model those components/properties/behaviors?

How do you set the system in motion and measure results?