Oh what a tangled web we weave.....
http://www.rediff.com/news/column/how-the-internet-was-born-25-years-ago/20151113.htm
November 12 marks 25 years of the beginning of the World Wide
Web. Shivanand Kanavi gives us the story of how it all began
"Great Cloud. Please help me.
I am away from my beloved and miss her very much. Please go to the city called
Alaka where my beloved lives in our moonlit house”
—From Meghadoot (messenger
cloud) of Kalidasa, Sanskrit poet, playwright, fourth century AD
Twenty-five years ago, on Nov 12, 1990, particle
physicist Tim Berners-Lee, working at the European Organisation for Nuclear
Research (CERN) at Geneva, submitted a note to his bosses on Hyper Text
and thus started a chain of events that led to the information revolution of
the World Wide Web (see http://www.w3.org/Proposal.html for a copy of Tim
Berners-Lee's note).
Tim Berners-Lee
Today we already have over 150 million Internet users in India, and the number
is growing by leaps as smartphones sell by the millions every month. The Internet
has become a massive labyrinthine library, where one can search for and obtain
information in seconds. It has also evolved into an instant, inexpensive
communication medium where one can send email text and even images, sounds and
videos, to a receiver, girdling the globe.
There are billions of documents on the Internet,
on millions of computers known as Internet servers, all interconnected by a
tangled web of cables, optic fibres and wireless links. We can be part of the
Net through our own PC, laptop or smartphone, using a wired or a wireless
connection to an Internet Service Provider.
Like Jack’s beanstalk, the Net is growing at a
tremendous speed.
However, one thing we learn from ‘Jack and the
Beanstalk’ is that every giant magical tree has humble origins. The beans, in
the case of Internet, were sown as far back as the sixties. To understand the
significance of Berners-Lee's contribution, one should briefly look at the history of
the Internet.
It all started with the Advanced Research
Projects Agency (ARPA) of the US Department of Defence. ARPA was funding
advanced computer science research from the early ’60s. J.C.R. Licklider, who
was then working in ARPA, took the initiative in encouraging several academic
groups in the US to work on interactive computing and time-sharing.
Bob Taylor (Photos: Palashranjan Bhaumick)
One glitch, however, was that these different
groups could not easily share their programmes, data or even ideas with each
other. The situation was so bad that Bob Taylor had three different
terminals in his office in the Pentagon connected to three different computers
that were being used for time-sharing experiments at MIT, UCLA and Stanford
Research Institute. Thus started an experiment in enabling computers to
exchange files among themselves. Taylor played a crucial role at the
Information Processing Techniques Office of ARPA in creating this network,
which was later named Arpanet. “We wanted to create a network to support the
formation of a community of shared interests among computer scientists and that
was the origin of the Arpanet,” says Taylor.
It is a fact, however, that the first computer
network to be proposed theoretically was for military purposes. It was to
decentralize nuclear missile command and control. The idea was not to have
centralized, computer-based command facilities, which could be destroyed in a
missile attack. In order to survive a missile attack and retain what was known,
during the US-Soviet Cold War, as ‘Second Strike Capability’, Paul Baran of
Rand Corporation had proposed the idea of a distributed network. In those mad
days of Mutually Assured Destruction (MAD), it seemed logical.
Baran elaborated his ideas to the military in an
eleven-volume report ‘Distributed Communications System’ during 1962-64. This
report was available to civilian research groups as well. However, no civilian
network was built based on it. Baran even worked out the details of a packet
switched network, though he used a clumsy name, ‘Distributed Adaptive Message
Block Switching’. Donald Davies, in the UK, independently arrived at the same
idea a little later and called it packet switching.
Networking pioneers like Paul Baran, Bob Taylor,
Larry Roberts, Frank Heart, Vinton Cerf, Steve Crocker, Bob Metcalfe, Len
Kleinrock, Bob Kahn and others have recalled, in several interviews, the
struggle they had to go through to convince AT&T, the US telephone monopoly
of those days, that packet switching could work.
AT&T did not believe packet switching would
work, and feared that, if it ever did, it would become a competing network and kill
their business! This battle between data communication and incumbent telephone
companies is still not over. As voice communication adopts packet technology,
as in Voice Over Internet, the old phone companies all over the world are
conceding to packet switching only grudgingly, kicking and screaming.
Using ARPA funds, the first computer network
based on packet switching was built in the US between 1966 and 1972. A whole
community of users came into being at over a dozen sites, and started
exchanging files. Soon they also developed a system to exchange notes and they
called it ‘e-mail’ (an abbreviation for electronic mail). Abhay Bhushan, who
worked in the Arpanet project from 1967 to 1974, was then at MIT and wrote the
note on FTP or File Transfer Protocol, the basis of early email. In those days,
several theoretical and practical problems were sorted out through RFCs, which
stood for Request For Comments, a message sent to all Arpanet users. Any
researcher in a dozen ARPA sites could pose a problem or post a solution
through such RFCs. Thus, an informal, non-hierarchical culture developed among
these original Netizens. “Those were heady days when so many things were done
for the first time without much ado,” recalls Abhay Bhushan.
Abhay Bhushan
An email program that immediately became popular due to its simplicity was SNDMSG,
written by Ray Tomlinson, a young engineer at Bolt Beranek and Newman (BBN), a
Boston-based company, which was the prime contractor for building the Arpanet.
His email programs have obviously been superseded in the last thirty years by
others. But one thing that has survived is the @ sign to denote the computer
address of a sender. Tomlinson was looking for a symbol to separate the
receiver’s user name and the address of his host computer. When he looked at
his Teletype, he saw a few punctuation marks available and chose @ since it had
the connotation of ‘at’ among accountants, and did not occur in software
programs in some other connotation.
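The convention is simple enough to sketch in a few lines of Python. This is only an illustration of the user@host idea: the function name split_address and the sample address are made up for the example and have nothing to do with Tomlinson's original program.

```python
def split_address(address):
    """Split 'user@host' into (user, host) at the last @ sign."""
    user, sep, host = address.rpartition("@")
    # An address needs an @ with something on both sides of it.
    if not sep or not user or not host:
        raise ValueError(f"not a user@host address: {address!r}")
    return user, host


# Hypothetical address, used only for illustration.
user, host = split_address("alice@example.org")
```

Here user holds the receiver's name and host holds the computer the mail should be delivered to.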
A ‘communication protocol’ is a favourite term
of networking engineers just as ‘algorithm’ is a favourite of computer
scientists. Leaving the technical details aside, a protocol is a
step-by-step procedure that enables two computers to “talk to each other”, i.e.
exchange data. We use protocols all the time in human communication, so often
that we don’t notice them. But if two strangers met, how would they start to
converse? They would start by introducing themselves, finding a common
language and agreeing on a level of communication (formal, informal, professional,
personal, polite, polemical and so on) before exchanging information.
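The strangers' conversation can be sketched as a toy protocol in Python. This is not any real network protocol: the names negotiate and converse, and the two made-up parties, are assumptions chosen only to mirror the steps described above (introduce, find a common language, agree on a tone, then exchange information).

```python
def negotiate(offered, supported):
    """Return the first option the initiator offers that the responder supports."""
    for option in offered:
        if option in supported:
            return option
    return None


def converse(initiator, responder, message):
    """Run the toy handshake; return the delivered message or None on failure."""
    # Steps 1 and 2: after introductions, find a common language.
    language = negotiate(initiator["languages"], responder["languages"])
    # Step 3: agree on a level of communication.
    tone = negotiate(initiator["tones"], responder["tones"])
    if language is None or tone is None:
        return None  # no common ground, so nothing is exchanged
    # Step 4: only now is information exchanged.
    return {
        "from": initiator["name"],
        "to": responder["name"],
        "language": language,
        "tone": tone,
        "message": message,
    }


# Two hypothetical parties, invented for this example.
alice = {"name": "Alice", "languages": ["Hindi", "English"], "tones": ["formal"]}
bob = {"name": "Bob", "languages": ["English", "French"], "tones": ["informal", "formal"]}
result = converse(alice, bob, "Hello")
```

Real protocols do the same kind of thing with numbers and flags instead of languages and tones, but the order of events is the point: agree first, talk afterwards.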
As Arpanet rose in popularity in the 70s, a
clamour started from every university and research institution to be connected
to Arpanet. Everybody wanted to be part of this new community of shared
interests. However, not everyone in a Local Area Network could be given a
separate Arpanet connection, so one needed to connect entire LANs to Arpanet.
Here again there was a diversity of networks and protocols. So how would you
build a network of networks (also called the Internet)? Robert Kahn and Vinton
Cerf largely solved this problem by developing TCP (Transmission Control
Protocol) and hence they are justly called the inventors of the Internet.
Meanwhile, in 1971, an undergraduate student at
IIT Bombay, Yogen Dalal, was frustrated by the interminable wait to get his
programs executed by the old Russian computer. Thanks to encouragement from a
faculty member, J R Isaac, who was then head of the computer centre, Dalal
started a BTech project on building a remote terminal for the mainframe. “Like
all undergraduate projects, this also did not work,” laughs Dalal, recalling
those days. But when he went to Stanford for his MS and PhD and saw
cutting-edge work being done in networking by Cerf & Co., he naturally got
drawn into it.
Vinton Cerf with the author
As a result, Vinton Cerf, Yogen Dalal and
another graduate student, Carl Sunshine, wrote the first paper setting forth
the standards for an improved version of TCP/IP, in 1974, which became the
standard for the Internet. “Yogen did some fundamental work on TCP/IP. I
remember, during 1974, when we were trying to sort out various problems of the
protocol, we would come to some conclusions at the end of the day and Yogen
would go home and come back in the morning with counter examples. He was always
blowing up our ideas to make this work,” recalls Cerf.
“They were the most exciting years of my life,”
says Yogen Dalal, who after a successful career at Xerox PARC and Apple, is a
respected venture capitalist in Silicon Valley. Recently he was listed among
the top fifty venture capitalists in the world.
Yogen Dalal
Two things changed the Internet: one was the development of the World Wide
Web and the other was a small program called the browser that allowed you to
navigate this web and read the web pages.
The web is made up of host computers connected
to the Internet containing a program called a Web Server. The Web Server is a
piece of computer software that can respond to a browser’s request for a page
and deliver the page to the Web browser through the Internet. You can think of
a Web server as an apartment complex with each apartment housing someone’s Web
page. In order to store your page in the complex, you need to pay rent on the
space. Pages that live in this complex can be displayed to and viewed by anyone
all over the world. The host computer is your landlord and your rent is called
your hosting charge. Every day, there are millions of Web servers delivering
pages to the browsers of tens of millions of people through the network we call
the Internet.
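To make the apartment-complex picture concrete, here is a minimal sketch of a Web server using Python's standard http.server module. The PAGES dictionary, the lookup function, the PageHandler class and port 8000 are all assumptions made for this illustration. A real hosting service is far more elaborate.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical "apartments": each path houses one owner's page.
PAGES = {
    "/": "<h1>Home page</h1>",
    "/about": "<h1>About this site</h1>",
}


def lookup(path):
    """Return (status code, page body) for the requested path."""
    if path in PAGES:
        return 200, PAGES[path]
    return 404, "<h1>Not found</h1>"


class PageHandler(BaseHTTPRequestHandler):
    """Answer a browser's GET request with the page stored at that path."""

    def do_GET(self):
        status, body = lookup(self.path)
        data = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def run(port=8000):
    """Serve PAGES on this machine until interrupted (the port is arbitrary)."""
    HTTPServer(("127.0.0.1", port), PageHandler).serve_forever()
```

Calling run() and then opening http://127.0.0.1:8000/ in a browser should show the home page. The sketch deliberately does not start the server on its own.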
The host computers connected to the Net, called
Internet servers, are given a certain address. The partitions within the server
hosting separate documents belonging to different owners are called Websites.
Each website in turn is also given an address, its Uniform Resource Locator (URL).
These addresses are assigned by an independent agency. It acts in a manner
similar to that of the registrar of newspapers and periodicals or the registrar
of trademarks, who allow you to use a unique name for your publication or
product if others are not using it.
When you type in the address or URL of a website
in the space for the address in your browser, the program sends packets
requesting to see the website. The welcome page of the website is called the
home page. The home page carries an index of other pages, which are part of the
same website and residing in the same server. When you click with your mouse on
one of them, the browser recognises your desire to see the new document and
sends a request to the new address, based on the hyperlink. Thus, the browser
helps you navigate the Web or surf the information waves of the Web, which is
also called Cyberspace to differentiate it from real navigation in real space.
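A rough sketch of what the browser does with an address can be written with Python's standard urllib.parse module. The helper names describe and follow, and the example.org addresses, are assumptions made for this illustration.

```python
from urllib.parse import urljoin, urlsplit


def describe(url):
    """Break an address into the parts a browser needs to send its request."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,     # how to talk to the server, e.g. http
        "server": parts.hostname,   # which host computer to contact
        "page": parts.path or "/",  # which document to ask for
    }


def follow(current_url, href):
    """Work out the new address when a hyperlink on the current page is clicked."""
    return urljoin(current_url, href)


# Hypothetical addresses, used only for illustration.
home = "http://www.example.org/index.html"
info = describe(home)
next_page = follow(home, "history/web.html")
elsewhere = follow(home, "http://info.example.net/")
```

A link such as history/web.html stays on the same server, while a link that carries a full address of its own takes the reader to a different server altogether.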
The web pages carry composing or formatting
instructions in a computer language known as Hyper Text Markup Language (HTML).
The browser reads these instructions or tags when it displays the web page on
your screen. It is important to note that the page, on the Internet, does not
actually look the way it does on your screen. It is a text file with embedded
HTML tags giving instructions like ‘this line should be bold’, ‘that line
should be in italics’, ‘this heading should be in this colour and font,’ ‘here
you should place a particular picture’ and so on. When you ask for that page,
the browser brings it from the Internet web servers and displays it according
to the coded instructions. A web browser is a computer program in your computer
that has a communication function and a display function. When you ask it to go
to an Internet address and get a particular page, it will send a message
through the Internet to that server and get the file and then, interpreting the
coded HTML instructions in that page, compose the page and display it to you.
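The difference between the file and what appears on screen can be seen by reading a tiny page with Python's standard html.parser module. The sample page and the names TagLister and read_page are made up for this sketch. A real browser goes much further and actually lays the page out.

```python
from html.parser import HTMLParser

# A hypothetical page as it is stored on the server: plain text with tags.
PAGE = (
    "<h1>Welcome</h1>"
    "<p>This is <b>bold</b> and <i>italic</i>. "
    '<a href="more.html">More</a></p>'
)


class TagLister(HTMLParser):
    """Collect the formatting tags, the visible text and the hyperlinks."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag == "a":
            # attrs is a list of (name, value) pairs
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

    def handle_data(self, data):
        self.text.append(data)


def read_page(source):
    """Return (tags, visible text, link targets) found in an HTML string."""
    parser = TagLister()
    parser.feed(source)
    parser.close()
    return parser.tags, "".join(parser.text), parser.links


tags, text, links = read_page(PAGE)
```

The tags list holds the instructions, the text holds what the reader will see, and the links list holds the addresses the reader may jump to.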
An important feature of the web pages is that
they carry hyperlinks. Such text (with embedded hyperlinks) is called Hyper
Text, which is basically text within text. For example, in the above
paragraphs, there are words like ‘HTML’, ‘World Wide Web’ and ‘Browser’. Now if
these words are hyperlinked and you want to know more about them, then I need
not give the information right here, but provide a link to a separate document
to explain each of these words. So, only if you want to know more about them,
would you go that deep.
In case you do want to know more about the Web
and you click on it, then a new document that appears might explain what the
Web is and how it was invented by Tim Berners-Lee, a particle physicist, when
he was at CERN, the European Organisation for Nuclear Research at Geneva. Now if you
wanted to know more about Tim Berners-Lee or CERN then you could click on those
words with your mouse and the browser would follow the hyperlinks to other
documents containing details about Berners-Lee or CERN and so on.
Thus, starting with one page, you might ‘crawl’
to different documents in different servers over the Net depending on where the
hyperlinks are pointing. This crawling and connectedness of documents through
hyperlinks seems like a spider crawling over its web and there lies the origin
of the term ‘World Wide Web.’
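The crawling can be sketched in Python on a pretend web held in memory. The page names in LINKS and the function crawl are assumptions for this illustration. No real pages are fetched, and the visiting order shown is just one reasonable choice (nearest pages first).

```python
from collections import deque

# A hypothetical mini-web: each page lists the pages its hyperlinks point to.
LINKS = {
    "web.html": ["berners-lee.html", "cern.html"],
    "berners-lee.html": ["cern.html", "w3c.html"],
    "cern.html": ["web.html"],
    "w3c.html": [],
}


def crawl(start, links):
    """Follow hyperlinks outward from start, visiting every page only once."""
    visited = [start]   # pages in the order they were discovered
    seen = {start}      # fast membership check to avoid going in circles
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                visited.append(target)
                queue.append(target)
    return visited


order = crawl("web.html", LINKS)
```

Note the seen set: because pages link back to each other, a crawler that did not remember where it had been would go round and round forever.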
For a literary person, the hyperlinked text
looks similar to what writers call non-linear text. A linear text has a plot
and a beginning, a middle and an end. It has a certain chronology and
structure. But a nonlinear text need not have a beginning, middle and an end in
the normal sense. It need not be chronological. It can have flashbacks and
flash-forwards and so on.
If you were
familiar with Indian epics then you would understand hyperlinked text right
away. After all, Mahabharat, Ramayana, Kathasaritsagar, Panchatantra, Vikram and
Betal’s stories have nonlinearities built into them. Every story has a
sub-story. Sometimes there are storytellers as characters within stories, who
then tell other stories, and so on. At times you can lose the thread because,
unlike Hyper Text and hyperlinks—where the reader can exercise his choice to
follow a hyperlink or not—the sub-stories in our epics drag you there anyway!
Earlier, you could get only text documents on
the Net. With HTML pages, one could now get text with pictures or animations or
even some music clips or video clips and so on. The documents on the Net became
so much livelier, while the hyperlinks embedded within the page took you to
different servers—host computers on the Internet acting as repositories of
documents.
It is as if you open one book in a library and
it offers you the chance to browse through the whole library of books, CDs and
videos! By the way, the reference to the Web as a magical library is not
fortuitous. This idea of a hyperlinked electronic library was essentially
visualised in the 1940s by Vannevar Bush at MIT, who called it Memex.
Incidentally, Tim Berners-Lee was actually
trying to solve the problem of documentation and knowledge management in CERN.
He was grappling with the problem of how to create a database of knowledge so
that the experience of the past could be distilled in a complex organisation.
It would also allow different groups in a large organisation to share their
knowledge resources. That is why his proposal to his boss to create a hyperlinked
web of knowledge within CERN, written in 1989-90, was called: ‘Information
Management: A Proposal’. Luckily, his boss is supposed to have written two
famous words, “Why not?” on his proposal. Berners-Lee saw that the concept could be
generalised to the Internet. The Internet community quickly grasped it, and we
saw the birth of the Internet as we know it today. A new era had begun.
Berners-Lee himself developed a program that looked
like a word processor and had hyperlinks as underlined words. He called it a
browser. The browser had two functions: a communication function which used
Hyper Text Transfer Protocol (HTTP) to communicate with servers, and a
presentation function. As more and more servers capable of using HTTP were set
up, the Web grew.
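The communication function boils down to sending a short text message to the server. Here is a sketch in Python of the kind of request a present-day browser might send using HTTP/1.1. The function name build_request and the example.org address are assumptions for this illustration, and the request is only built here, not sent.

```python
from urllib.parse import urlsplit


def build_request(url):
    """Compose the text of a simple HTTP/1.1 GET request for the given address."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"GET {path} HTTP/1.1",   # request line: what we want
        f"Host: {parts.netloc}",  # which website on that server
        "Connection: close",      # ask the server to hang up afterwards
        "",                       # a blank line ends the headers
        "",
    ]
    return "\r\n".join(lines)


# Hypothetical address, used only for illustration.
request = build_request("http://www.example.org/index.html")
```

The server's reply is equally plain: a status line, a few headers and then the HTML file itself, which the presentation function turns into the page you see.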
Soon more browsers started appearing. The one
written by a graduate student at the University of Illinois, Marc Andreessen,
became very popular for its high quality and free downloading. It was called
Mosaic. Soon, Andreessen left the university, teamed up with Jim Clark, founder
of Silicon Graphics, and floated a new company called Netscape Communications.
Its Netscape Navigator created a storm and the company started the Internet
mania on the stock market when it went public, attracting billions of dollars
in valuation even though it was not making any profit!
Meanwhile, Tim Berners-Lee did not make a cent
from his path-breaking work since he refused to patent it. He continues to look
at the development of the next generation of the Internet as a non-profit
service to society and heads a research group, W3C, at MIT, which has become a
standards-setting consortium for the Web.