15-112 Lecture 18 (June 17, 2014)

Local Area Networks (LANs)

If we take a bunch of computers, or similar devices, and connect them together in a way that shares the same connecting media, such as a single switch, wireless access point, or wire, this forms a traditional Local Area Network (LAN).
These days, we can make LANs bigger by connecting them using additional network switches. These switches, for example, may connect multiple wireless access points, shared network wires (these are rare in the modern world), or other "lower level" switches, such as those that actually have computers and other devices attached to them.

Communication within a LAN

Okay, we take a bunch of stations and connect them together to form a LAN. Maybe they are all connected to the same wire. Or, maybe they are all connected to the same network switch. Or, maybe they are all within earshot of each other over the air. How do they talk?
Well, we've basically discussed that model. They broadcast. They yell. They shout. And, when any station does -- they all hear the broadcast. In the degenerate case of a point-to-point network, there are only two stations, so the only recipient is the intended recipient. But, in the more general case, every station hears the broadcast messages.
Each station has a station id, or LAN address. When a message is sent, it includes both the source and destination addresses. Although all stations might well hear all messages -- they ignore all but those for which they are the intended recipient. Only those messages intended for a particular station get passed up by the network software to the application level. It is certainly possible to cheat and listen in on other stations' messages; this is called promiscuous mode. But, this is usually only done for diagnostic (or malicious) purposes.
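The filtering described above can be sketched in a few lines of Python. This is only a toy model, not real network software: the Frame and Station classes, the broadcast function, and the single-letter addresses are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    source: str       # LAN address of the sender (hypothetical format)
    destination: str  # LAN address of the intended recipient
    payload: str


class Station:
    def __init__(self, address, promiscuous=False):
        self.address = address
        self.promiscuous = promiscuous
        self.delivered = []  # payloads passed up to the application level

    def hear(self, frame):
        # Every station hears every frame; most are silently ignored.
        if self.promiscuous or frame.destination == self.address:
            self.delivered.append(frame.payload)


def broadcast(stations, frame):
    # A shared medium: every station other than the sender hears the frame.
    for station in stations:
        if station.address != frame.source:
            station.hear(frame)


stations = [Station("A"), Station("B"), Station("C", promiscuous=True)]
broadcast(stations, Frame("A", "B", "hello B"))
```

After the broadcast, station B has the payload because it is the intended recipient, and station C has it only because it is eavesdropping in promiscuous mode.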

The Size of a LAN is Self-Limiting

The size of a LAN is self-limiting, both in terms of physical size and also in terms of the number of stations. The longer a wire, the more attenuation -- the signal is weakened as it travels farther and farther. The greater the distance through the air, the weaker the signal. In the end, measured in physical distance, there is only so far that a signal can travel.
And, beyond that, we've got other problems. The more stations we have sharing a broadcast channel, the less network time exists per station. Even with not-so-many stations, each making just modest use, the network could become clogged with collisions. Remember, broadcast protocols only work with low contention and bursty loads -- they rely on relatively large periods of quiet time in which to resolve collisions resulting from the relatively short bursts. Broadcast networks can collapse with utilization as low as 30%.

Stretching LANs with Switches

It is possible to stretch the size of a LAN by using a switch. The basic idea is that we can take a bunch of separate physical LANs and connect them together to form a larger logical LAN. The switches receive and retransmit signals from one network to another, correcting the signal strength, noise, and timing as they do.
And, modern, active switches go farther than that. As stations transmit, the switches make note of the originating LAN. Then, if they later hear a message destined for that station, they send it only to that one LAN, not to all of the connected LANs.
Basically, they maintain a hash table of (station id, LAN) pairs. When they hear a message, they update the hash table. Entries in the hash table age out or succumb to cache pressure to make room for new entries. It is only when the switch does not have an entry for a particular destination station that it needs to broadcast the message onto all connected LANs. In short, the switches listen carefully and, in so doing, they are able to cache the location of stations and send messages only to the physical segments on which they live.
By carving up a big LAN into multiple segments, or building one up from multiple segments, contention is reduced. A message can only collide on either the sender's segment or the receiver's segment, or, depending on the switch's design, within its own fabric. As long as the switch knows where the receiver lives, the other segments of the network are unaffected and can support additional transmissions.
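Here is a minimal sketch of that learning behavior, assuming a plain Python dict stands in for the switch's hash table. The LearningSwitch class and the segment names are hypothetical, and aging out of old entries is left out to keep it short.

```python
class LearningSwitch:
    def __init__(self, segments):
        self.segments = set(segments)
        self.table = {}  # station id -> segment on which it was last heard

    def handle(self, source, destination, arrival_segment):
        """Return the set of segments the message is forwarded onto."""
        # Learn: the source lives on the segment the message arrived from.
        self.table[source] = arrival_segment
        known = self.table.get(destination)
        if known is None:
            # Unknown destination: broadcast onto every other segment.
            return self.segments - {arrival_segment}
        if known == arrival_segment:
            # The receiver is on the sender's own segment; nothing to do.
            return set()
        return {known}


switch = LearningSwitch(["seg1", "seg2", "seg3"])
first = switch.handle("A", "B", "seg1")   # B is unknown: seg2 and seg3
second = switch.handle("B", "A", "seg2")  # A was learned: only seg1
third = switch.handle("A", "B", "seg1")   # B was learned: only seg2
```

Only the first message touches every segment. Once the switch has heard from both stations, seg3 is left alone and is free to carry other traffic.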

Inconveniences of Switching

It is worth noting that when using switches, moving a station from one segment to another can be a minor inconvenience. Until the old location ages out of the switch's hash table, it will send messages destined for the moved host onto its old segment. It is also possible to break a LAN by faking a station on the wrong segment -- and thereby stealing its traffic. To get around these problems, most modern switches are managed. The system administrator is able to lock stations onto certain segments, disable discovery mode, and delete stale entries.
It is also the case that complex geometries can be challenging for switched networks. For example, it is often desirable to create multiple paths from segment to segment. This allows paths to exist, even if a switch fails. But, such paths can create cycles. Switches will resend messages until they appear on both sides of the same switch. Depending on the dynamic behavior, this can cause messages to be transmitted to a wrong or redundant segment -- or not at all. To solve this problem, most modern switches include configuration protocols. When enabled by the system administrator, they go into a configuration mode, learn each other's location, elect a root node, and form a spanning tree. This tree breaks the cycles, enabling good communication. In the event that a switch fails, they can subsequently agree to form a different tree to get around the failure.
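To see how a spanning tree breaks cycles, here is a small sketch. It is not the protocol real switches run, which is distributed; this version cheats by seeing the whole network at once. The election rule (lowest id wins), the spanning_tree function, and the switch names are assumptions made for illustration.

```python
from collections import deque


def spanning_tree(links):
    """Return (root, tree_links) for an undirected network of switches."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    root = min(neighbors)  # assumed election rule: the lowest id wins
    visited = {root}
    tree = set()
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for nxt in sorted(neighbors[current]):
            if nxt not in visited:
                # First time we reach nxt: keep this link in the tree.
                visited.add(nxt)
                tree.add(frozenset((current, nxt)))
                queue.append(nxt)
    return root, tree


# s1, s2, and s3 form a cycle; s4 hangs off s3.
links = [("s1", "s2"), ("s2", "s3"), ("s3", "s1"), ("s3", "s4")]
root, tree = spanning_tree(links)
blocked = {frozenset(link) for link in links} - tree
```

The links in blocked are the ones left unused so that no cycle remains. If a switch or link fails, running the same function on the surviving links gives the replacement tree.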

Limitations of Switches

Switches extend the size of LANs by a bit -- but they surely aren't the global answer. Like anything else, they've got limits. In the case of switches, memory and failure are the limiting factors. There are just far too many stations on the planet for any one switch to remember them all. It simply ain't possible. No way, no how.
And, even if magic were to happen to make this possible, it would be challenging to build a spanning tree the size of the globe. There would always be failure. And, they'd always be trying to learn a new tree.

Building up the Protocol Stack

We often talk about the architecture of the network protocol stack in terms of layers:

Physical layer -- The hardware specs: voltage levels, light colors, the shape of connectors, frequencies and power levels, &c

Link layer -- The layer that manages the communication within a LAN. Frame spec, collision management, flow control, &c

Network layer -- Manages the movement of messages across an internetwork, from LAN to LAN.

We're now about to enter the domain of the network layer. We're going to talk about how, instead of scaling up LANs, we can recognize them as separate networks and efficiently communicate messages from one to the next, until we get from the source network to the destination network.

Rethinking the Problem and Hierarchical addresses

So, it is pretty clear that we can't hope to keep track of every host on the Internet. We must, somehow, structure the problem and play with bigger blocks. The way people usually deal with large problems is to impose a hierarchy so that no single level is too large. Consider the organization of corporations, schools, books in a library, files on a computer, &c.
We're going to do the same thing with the Internet. Instead of viewing the entire Internet as one large, flat network, we are going to view it for what it is -- a collection of individual networks. Step one is going to be routing packets from one network to another network. Once there, we'll worry about getting them to the right machine.
To achieve this, we are going to create a new network address -- one that is structured so that it contains both a network number and the host number, rather than the flat address, a.k.a. station id, that we've discussed so far. The station id will still be used within a LAN -- but we'll use this new IP Address to get from one network to another. We'll leave the details of the form of network addresses for 15-441.

How Internet Routing Works

At this point, we are viewing our internetwork as what it is -- a collection of networks tied together. Tying these networks together are routers. Ultimately, when a message is sent from one host to another, one of two things is true:

It is destined for a host on the same network
It is destined for a host on another network.

If it is destined for a host on the same network, there is no routing. The host, itself, looks at the destination IP address, notices that it is on the same network, and simply sends the message to the destination using the lower-level protocol. But, if it is destined for a different network, it sends it to the router instead.
The router is a device that ties together several networks. It gets a message because the lower-level LAN address indicates it as the destination. But, it knows that it isn't the real destination, because the higher-level IP address indicates another recipient.
Based on this, it looks up the network number from within the IP address in a table and forwards the message to one of the connected networks. If the destination lives on that connected network, it gets sent directly there, as if it had originated on that LAN. Otherwise, the LAN is still used -- but to send it to another router, as described above.
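The two decisions above, one made by the host and one made by the router, can be sketched with Python's standard ipaddress module. The IPv4 addresses, the /24 network sizes, the table contents, and the function names host_send and router_forward are all assumptions chosen for the example; real routers are considerably more involved.

```python
import ipaddress


def host_send(own_interface, destination, default_router):
    """What a host does: deliver locally or hand off to its router."""
    network = ipaddress.ip_interface(own_interface).network
    if ipaddress.ip_address(destination) in network:
        return "LAN delivery to " + destination
    return "LAN delivery to router " + default_router


def router_forward(table, destination):
    """What a router does: look up the destination's network in its table."""
    address = ipaddress.ip_address(destination)
    for network, next_hop in table.items():
        if address in ipaddress.ip_network(network):
            return next_hop
    return None  # no entry for this network


# Hypothetical table for a router attached to two networks.
table = {
    "10.0.1.0/24": "directly connected, port 1",
    "10.0.2.0/24": "directly connected, port 2",
    "10.0.3.0/24": "router 10.0.2.254",
}

local = host_send("10.0.1.5/24", "10.0.1.9", "10.0.1.1")
remote = host_send("10.0.1.5/24", "10.0.3.7", "10.0.1.1")
hop = router_forward(table, "10.0.3.7")
```

Notice that the table is keyed by network, not by host. That is the whole point of the hierarchical address: the router never needs an entry for an individual machine.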

Routing Protocols

It is important to note that these routers might be connected to many networks. It is even more important to note that these networks might form a graph, with multiple paths between destinations. And, yet more important to realize that there might be many, many hops from source to destination.
Given this, how do the routers know which way to send a packet so that it doesn't get lost or go around in circles? The answer is that the routers talk, and, based on that conversation, they build up two tables: one that describes the network, as a whole, known as the routing table and one that describes exactly what the router should do, known as the forwarding table.
We're going to leave the details of how these tables get built to 15-441. Especially since there are different protocols that get the job done and different strategies -- and there are tons of interesting and subtle things about them.

Network Layers, A Reference Model

As we work our way up from network hardware to the application programmer, we are beginning to see the overall organization of a network. This architecture is sometimes described using the following model:

Application Layer: The details of the messages and structures used by a particular application

Transport Layer: Establishment of endpoints and other services commonly used by programmers

Network Layer: Movement of packets from network to network across an internetwork

Link Layer: Management of stations sharing the same channel

Physical layer: Voltages, connector shapes, power levels, light colors, &c

Thus far, we've worked our way up, talking a little about the physical layer, which is really the domain of various engineering disciplines, and a lot about the link and network layers. Now we are going to begin our discussion of the transport layer.

The Transport Layer

The transport layer establishes an end-to-end abstraction that is useful to the programmer. Part of that is that it needs to hide the hop-by-hop nature of the network layer's routing process. And, part of that is that it needs to establish program-to-program communication, since multiple programs might be running on the same host -- and the network layer only gets a message as far as the right host.
In addition to these basic requirements, it must somehow answer the question, "What is a message, and how do we know when we have one?". For example, we often classify transport layers as being either:

Message-oriented: Messages are sent like the mail. When you get a message, it comes in a discrete chunk. Whatever is placed into the envelope when sent is exactly what is in the envelope when it is read. Envelopes, even those sent in series from the same sender to the same receiver are never merged.

Stream-oriented: There really isn't the concept of a discrete message -- there is the flow of data. Consider a phone conversation or radio broadcast. These don't come in envelopes. There can be periods of quiet, but there is no packaging or dividing line.
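The difference is easy to see with a pair of connected sockets of each kind. This is a sketch that assumes a Unix-like system, where socket.socketpair can create local datagram and stream sockets; nothing here crosses a real network.

```python
import socket

# Message-oriented: each send arrives as its own discrete message.
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
a.send(b"first")
a.send(b"second")
messages = [b.recv(1024), b.recv(1024)]
a.close()
b.close()

# Stream-oriented: the receiver sees a flow of bytes with no boundaries.
c, d = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
c.send(b"first")
c.send(b"second")
c.close()  # closing lets the reader detect the end of the stream
stream = b""
while True:
    chunk = d.recv(1024)
    if not chunk:
        break
    stream += chunk
d.close()
```

The datagram receiver gets two separate envelopes. The stream receiver just gets bytes, and nothing in them says where the first send ended and the second began. If the program cares, it has to mark the boundaries itself.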

Protocols are often also classified in terms of their quality of service:

Best-effort: Also like the post office, the protocol does its best, but makes no guarantees. Messages may be lost or delivered out of order.

Reliable: The protocol will try diligently to resend anything that is not confirmed to be delivered.

As we'll talk about soon, unreliable protocols may, or may not, be session-oriented. A session-oriented protocol establishes a relationship between the sender and receiver before any data is exchanged. This session remains in place until it is closed. So, in some sense, the recipient knows to be waiting for communication. Unreliable protocols need not be session-oriented. But, reliable protocols need to be session-oriented so that the sender and receiver can coordinate what has, and what has not, been successfully received.
In the context of Internet protocols, the TCP/IP protocol suite, there are two general-purpose transport protocols:

User Datagram Protocol (UDP): Unreliable, message-oriented
Transmission Control Protocol (TCP): Reliable, stream-oriented

UDP adds very little value to IP, itself -- basically, it adds port numbers. It allows applications to be identified with ports so that messages, upon arriving at the destination, can be sent to the right program.
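As a rough picture of what port numbers buy us, here is a toy model of the destination host sorting arriving messages by port. It is a simulation, not real UDP: the deliver function, the dictionary layout, and the port numbers 5300 and 6000 are made up for the example.

```python
def deliver(datagram, listeners):
    """Hand a datagram's payload to whichever program owns the port."""
    handler = listeners.get(datagram["dest_port"])
    if handler is None:
        return False  # nobody is listening on that port; drop it
    handler(datagram["payload"])
    return True


# Two programs on the same host, each with its own inbox and port.
chat_inbox, game_inbox = [], []
listeners = {5300: chat_inbox.append, 6000: game_inbox.append}

results = [
    deliver({"dest_port": 5300, "payload": "hi there"}, listeners),
    deliver({"dest_port": 6000, "payload": "move left"}, listeners),
    deliver({"dest_port": 7000, "payload": "anyone?"}, listeners),
]
```

The IP address got all three messages to the right host. The port number is what gets each one to the right program, or nowhere, if no program has claimed that port.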
We'll talk about TCP in a few minutes. It adds a lot of value. It adds streams and reliability. But, for today, I'd like to examine what it means --and does not mean-- to be a reliable protocol.

Simple Reliability

It is easy to see how we could create a reliable protocol above UDP. We add a sequence number to each message. We send a message and wait for an ACKnowledgement. We know the maximum round-trip time, and wait at least that long. If we don't get an ACK within that time, we assume that the message got lost en route to the recipient -- even though maybe only the ACK got lost. We resend. When the receiver gets it, it'll ACK, possibly again. There won't be any confusion, even if it is received twice, because the sequence number will enable the duplicate to be detected and discarded. The same is true of a duplicate ACK. If we send more than one copy, and more than one ACK eventually makes its way to the sender, the sender just ignores the duplicates -- it ignores any ACK that is not associated with the present message number.
To this end, it is important to note that only one message is in flight at a time. The time between the sending of a message and when its ACK is received is dead air. For this reason, this type of reliable protocol is often known as a stop-and-wait protocol.
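Here is a small simulation of stop-and-wait, to make the bookkeeping concrete. It is a sketch under simplifying assumptions: the timeout is modeled as simply noticing that a transmission was lost, sender and receiver live in one function, and the stop_and_wait function and its lost parameter are invented for this example.

```python
import itertools


def stop_and_wait(messages, lost):
    """Deliver messages in order over a channel that loses transmissions.

    `lost` yields True when the next transmission (data or ACK) is lost.
    Returns (delivered, transmissions): what the receiver accepted and
    how many times the sender transmitted data.
    """
    lost = iter(lost)
    delivered = []
    expected = 0  # receiver: the next sequence number it has not yet seen
    transmissions = 0
    for seq, payload in enumerate(messages):
        acked = False
        while not acked:
            transmissions += 1
            if next(lost):  # data lost: sender times out and resends
                continue
            if seq == expected:  # new message: accept it
                delivered.append(payload)
                expected += 1
            # A duplicate (seq < expected) is discarded but still ACKed.
            if next(lost):  # ACK lost: sender times out and resends
                continue
            acked = True
    return delivered, transmissions


# First message: data arrives, ACK lost, resend lost, resend arrives, ACK ok.
pattern = itertools.chain([False, True, True, False, False],
                          itertools.repeat(False))
delivered, sent = stop_and_wait(["a", "b"], pattern)
```

Even though "a" was transmitted three times and reached the receiver twice, it is delivered exactly once, because the second copy carries a sequence number the receiver has already seen.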

Reliable vs Unreliable

A reliable medium certainly beats one that is not. But, as we now know, a reliable protocol is really just a diligent protocol. It tries, and tries, and tries some more.
But, this is not always desirable. In some cases, a late packet is worthless -- and resending it just wastes network time. This is the case for many types of real-time communication, such as live video or audio, e.g. telephone calls or web cams.
What does one do with a 10-minute-old syllable? If we delay the subsequent syllables by 10 minutes, the call is worthless. And, if we charge forward, we can't exactly introduce a stray word later. It is best to just let it go and hear a brief pause or pop. The same is true of video. We'd rather see a brief freeze and a jump in one part of the frame than have the whole thing delayed.

About the Transmission Control Protocol (TCP)

Please see these slides for the support materials for today's TCP discussion.