Web technologies are in a constant state of flux. It’s impossible to
predict which will fail, which will shine brightly then quickly fade
away, and which have real longevity. Rapid innovation is what makes web
app development so exciting, but shiny new things shouldn’t be pursued
without a solid understanding of the underlying web platform.
Web architecture primer
Let’s start with DNS (domain name system) and HTTP (hypertext transfer protocol). These are the underlying systems that web browsers use to send and fetch data to and from web apps. Familiarity with these protocols is essential for later discussions on application programming interfaces (APIs), performance and security.
DNS
When you type an address into a web browser or follow a link, the browser first has to identify which computer server in the world to ask for the content. Although web addresses use domain names like fivesimplesteps.com to make them easier for people to remember, computers use unique numbers to identify each other.¹
To convert names to numbers, the browser queries a series of DNS servers, which are distributed directories of names and numbers for all web servers. To speed up this process, the lookups are cached at a number of locations: your internet service provider (ISP) will hold a cache, your operating system may hold a cache and even your web browser software will hold a short lifetime cache.
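The layered caching described above can be sketched in a few lines. This is a minimal illustration, not how any particular browser or operating system implements its cache: the `CachingResolver` class, the `fake_upstream` function, the 60-second lifetime and the `192.0.2.10` address are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple


class CachingResolver:
    """Wraps a name lookup with a short-lifetime cache (illustrative only)."""

    def __init__(self, lookup: Callable[[str], str], ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lookup = lookup      # the next layer to ask on a cache miss
        self._ttl = ttl_seconds    # how long an answer stays valid
        self._clock = clock        # injectable so expiry can be tested
        self._cache: Dict[str, Tuple[str, float]] = {}  # name -> (address, expiry)

    def resolve(self, name: str) -> str:
        now = self._clock()
        cached = self._cache.get(name)
        if cached is not None and cached[1] > now:
            return cached[0]               # cache hit: no upstream query
        address = self._lookup(name)       # cache miss: ask the next layer
        self._cache[name] = (address, now + self._ttl)
        return address


# Hypothetical upstream directory standing in for real DNS servers.
upstream_queries = []


def fake_upstream(name: str) -> str:
    upstream_queries.append(name)
    return {"example.com": "192.0.2.10"}[name]


resolver = CachingResolver(fake_upstream, ttl_seconds=60.0)
first = resolver.resolve("example.com")   # goes upstream
second = resolver.resolve("example.com")  # answered from the cache
```

Chaining several such resolvers, each using the next one as its `lookup`, would mimic the browser, operating system and ISP caches sitting in front of the DNS servers.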
HTTP requests
Once your browser has identified the correct number associated with the domain name, it connects to the server with the equivalent of, “Hello, can I ask you something?” The connection is agreed and your browser sends a message to request the content. As a single web server can host thousands of websites, the message has to be specific about the content that it is looking for.
Your browser will add supplementary information to the request message, much of which is designed to improve the speed and format of the returned content. For example, it might include data about the browser’s compression capabilities and your preferred language.
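As a rough sketch of what such a request message might look like, the function below assembles a minimal HTTP/1.1 GET request. The `build_get_request` name, the host, the path and the header values are assumptions chosen for illustration; a real browser sends many more headers.

```python
def build_get_request(host: str, path: str = "/",
                      languages: str = "en-GB,en;q=0.8",
                      encodings: str = "gzip, deflate") -> bytes:
    """Assemble a minimal HTTP/1.1 GET request (illustrative only)."""
    lines = [
        f"GET {path} HTTP/1.1",            # request line: method, path, version
        f"Host: {host}",                   # which site on a shared server
        f"Accept-Encoding: {encodings}",   # compression the client can handle
        f"Accept-Language: {languages}",   # preferred language(s)
        "Connection: close",
    ]
    # Header lines end with CRLF; an empty line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_get_request("www.example.com", "/news")
print(request.decode("ascii"))
```

The `Host` header is what lets one server that hosts thousands of websites work out which site the request is for.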
HTTP responses
The response from the server to the browser also contains an HTTP message, prefixed to the requested content.
HTTP/1.1 200 OK
Date: Sun, 20 Feb 2011 03:49:19 GMT
Server: Apache
Set-Cookie: BBC-UID=d4fd96e01cf7083; expires=Mon, 20-Feb-12 07:49:32 GMT; path=/;domain=bbc.co.uk;
Cache-Control: max-age=0
Expires: Sun, 20 Feb 2011 03:49:19 GMT
Keep-Alive: timeout=10, max=796
Transfer-Encoding: chunked
Content-Type: text/html
Connection: keep-alive

125
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
The opening status line contains the HTTP version number, a numeric response code and a textual description of the response code. The web browser is designed to recognise the numeric response code and proceed accordingly.
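To make the three parts of the status line concrete, here is a small sketch that splits one apart. The `StatusLine` type and `parse_status_line` function are hypothetical names for this example, and the parsing is deliberately simpler than a production HTTP client would use.

```python
from typing import NamedTuple


class StatusLine(NamedTuple):
    version: str   # e.g. "HTTP/1.1"
    code: int      # numeric response code, e.g. 200
    reason: str    # textual description, e.g. "OK"


def parse_status_line(line: str) -> StatusLine:
    """Split a status line into version, numeric code and description."""
    # Split on the first two spaces only: the description may contain spaces.
    parts = line.strip().split(" ", 2)
    reason = parts[2] if len(parts) > 2 else ""
    return StatusLine(parts[0], int(parts[1]), reason)


ok = parse_status_line("HTTP/1.1 200 OK")
missing = parse_status_line("HTTP/1.1 404 Not Found")
```

Only the numeric code drives behaviour here; the textual description is there for people reading the message.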
Common response codes include:
- 200: OK
  Successful request
- 301: Moved Permanently
  The requested content has been moved permanently to a new given location. This and all future requests should be redirected to the new address.
- 302: Found
  The requested content has been moved temporarily to a given address. Future requests should use the original location.
- 404: Not Found
  The server cannot find the resource at the requested location.
- 500: Internal Server Error
  A generic error message, shown when an unexpected condition prevents the request from being fulfilled.
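The way a client might "proceed accordingly" for the codes listed above can be sketched as a simple lookup. The `next_step` function and the wording of each action are assumptions for illustration; `HTTPStatus` comes from Python's standard library and supplies the textual descriptions.

```python
from http import HTTPStatus

# Hypothetical client actions for the response codes listed above.
ACTIONS = {
    HTTPStatus.OK: "display the content",
    HTTPStatus.MOVED_PERMANENTLY: "redirect and use the new address from now on",
    HTTPStatus.FOUND: "redirect this once and keep using the original address",
    HTTPStatus.NOT_FOUND: "show a 'not found' page",
    HTTPStatus.INTERNAL_SERVER_ERROR: "show a server error page",
}


def next_step(code: int) -> str:
    """Return a short description of what a client might do with this code."""
    action = ACTIONS.get(code)
    if action is None:
        return f"{code}: not covered by this sketch"
    return f"{code} {HTTPStatus(code).phrase}: {action}"


for code in (200, 301, 302, 404, 500):
    print(next_step(code))
```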
Statelessness and cookies
HTTP is stateless. This means that multiple requests from the browser to the server are independent of one another, and the server has no memory of requests from one to the next. But most web apps need to track state to allow users to remain logged in across requests and to personalise pages across sessions.
HTTP cookies are the most common solution to this problem. A cookie is a small text file that the browser stores on your computer. It contains a name and a value associated with a specific website. Cookies can be temporary or can persist for years.
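As a sketch of how the name, value and lifetime travel in practice, the code below takes apart the Set-Cookie header from the response shown earlier and builds the Cookie header the browser would send back on later requests. The `parse_set_cookie` function is a hypothetical, simplified parser written for this example; it does not handle every case a real browser must.

```python
from typing import Dict, Tuple


def parse_set_cookie(header_value: str) -> Tuple[str, str, Dict[str, str]]:
    """Split a Set-Cookie value into name, value and attributes (simplified)."""
    parts = [p.strip() for p in header_value.split(";") if p.strip()]
    # The first pair is the cookie itself; the rest describe its scope and lifetime.
    name, _, value = parts[0].partition("=")
    attributes: Dict[str, str] = {}
    for part in parts[1:]:
        key, _, val = part.partition("=")
        attributes[key.strip().lower()] = val.strip()
    return name.strip(), value.strip(), attributes


header = ("BBC-UID=d4fd96e01cf7083; expires=Mon, 20-Feb-12 07:49:32 GMT; "
          "path=/;domain=bbc.co.uk;")
name, value, attrs = parse_set_cookie(header)

# On later requests to the same site, the browser returns only the name and value.
cookie_header = f"Cookie: {name}={value}"
```

Because the browser sends the same name and value back with each request, the server can recognise the visitor even though each HTTP request is independent.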
Content type
Your browser now has the content it requested, thanks to the HTTP response from the server. Before the content can be processed and displayed though, the browser needs to determine what type of content it is: an image, PDF file, webpage, or something else.
One way a browser can achieve this is through content sniffing. The browser examines the first few bytes (and sometimes more) of the content to see if it recognises a pattern in the data, such as a PDF or JPG header. Apart from the accuracy and performance issues that this may introduce, it can also have security implications.
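A much-simplified sketch of content sniffing follows. It checks the leading bytes against a handful of well-known file signatures; the `SIGNATURES` table and `sniff_content_type` function are assumptions for this example, and real browsers apply far more elaborate rules.

```python
from typing import Optional

# A few well-known leading byte patterns and the content type they suggest.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def sniff_content_type(data: bytes) -> Optional[str]:
    """Guess a content type from the first few bytes, or None if unrecognised."""
    for signature, mime_type in SIGNATURES:
        if data.startswith(signature):
            return mime_type
    return None


pdf_guess = sniff_content_type(b"%PDF-1.4\n...")
jpeg_guess = sniff_content_type(b"\xff\xd8\xff\xe0\x00\x10JFIF")
# A plain-text note that happens to start with "%PDF-" is misidentified,
# which hints at the accuracy problems mentioned above.
wrong_guess = sniff_content_type(b"%PDF- is how PDF files begin")
```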
Summary
Knowledge of the underlying web technologies enables you to develop workarounds for web browser restrictions and optimise performance and security.
- DNS converts domain names to computer-usable identification numbers.
- HTTP messages govern the requests and responses between web browsers and web servers.
- HTTP is stateless, but cookies can be used to remember a computer from one request to another.
- Content-type HTTP header fields tell the browser what type of content is being sent.
- Character encoding headers tell the browser how to understand text files.
- UTF-8 is the most practical character encoding for the web.
- Web browsers convert HTML into a document object model (DOM) tree in memory.
- A second render tree is created in browser memory from the DOM, to represent the visual page layout.
- Use a DOCTYPE to tell the browser which layout mode to use.
- JavaScript can modify the DOM, and Ajax techniques can request additional data from the server and make partial updates to the DOM.