Since the beginning of the web, HTTP has been its foundation. It’s an acronym for Hypertext Transfer Protocol. The keyword here is “protocol”. A protocol is defined as “the official procedure or system of rules governing affairs of state or diplomatic occasions.” In this regard, HTTP defines how data is requested and delivered back and forth between clients and servers.
In the context of the internet, everything is data, including videos, images, websites, and audio files. You can think of the internet as a government, web servers as government offices and internet users as the citizens. In this case, HTTP is the bureaucracy. When you think of it this way, it’s not hard to imagine the process involving too many steps and requests and taking too much time.
The Technical Definition of HTTP
An HTTP session is a sequence of network request-response transactions. An HTTP client initiates a request by establishing a Transmission Control Protocol (TCP) connection to a particular port on a server (typically port 80, occasionally port 8080; see List of TCP and UDP port numbers). An HTTP server listening on that port waits for a client’s request message.
Upon receiving the request, the server sends back a status line, such as “HTTP/1.1 200 OK”, and a message of its own. The body of this message is typically the requested resource, although an error message or other information may also be returned. (Source: Wikipedia)
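As a rough illustration of the exchange quoted above, the following Python sketch builds a minimal HTTP/1.1 request message and parses a status line such as “HTTP/1.1 200 OK”. The helper names build_request and parse_status_line are made up for this article and are not part of any library, and example.com is only a placeholder host.

```python
def build_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, target, version
        f"Host: {host}",          # the Host header is mandatory in HTTP/1.1
        "Connection: close",      # ask the server to close after responding
        "",                       # empty line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(line: str) -> tuple[str, int, str]:
    """Split a status line like 'HTTP/1.1 200 OK' into its three parts."""
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason


request = build_request("example.com")
version, code, reason = parse_status_line("HTTP/1.1 200 OK")
```

To try this against a real server, the bytes returned by build_request could be written to a TCP socket opened on port 80, which is the connection step the definition above describes.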
What Was the Problem with HTTP?
Data travels across the internet in packets that generally consist of a header, a payload and, in some protocols, a footer. As stated before, delivering a complete resource requires many requests and responses between the clients and the web servers. Since the internet was originally developed in a controlled environment in a university network, the initial structure of HTTP didn’t cause any problems back then, because only a limited number of clients requested data from a specific web server.
As the internet grew and the number of clients surged into the billions, this initial protocol started to become a burden. It extended page load times, and servers couldn’t keep up with the request and response chains created by multiple TCP connections. The internet was drowning under its own bureaucracy.
Even innovative workarounds like domain sharding were not enough to solve the problem of slow page loads and high latency. Domain sharding increased the number of resources a website could download simultaneously by spreading them across multiple domains. This allowed websites to be delivered faster, as clients didn’t need to wait for the previous set of resources to download before beginning the next set. Through this technique, web developers tried to work around the problem, since they couldn’t remove the limits imposed by a single TCP connection.
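A minimal sketch of the idea behind domain sharding, written in Python. The shard hostnames, the helper names and the limit of six connections per host are assumptions for illustration; actual per-host limits vary by browser.

```python
import zlib


def shard_host(resource_path: str, shards: list[str]) -> str:
    """Map a resource to one shard hostname, always the same one.

    A stable mapping matters: if a resource moved between hostnames,
    the browser cache would treat it as a different URL.
    """
    index = zlib.crc32(resource_path.encode("utf-8")) % len(shards)
    return shards[index]


def max_parallel_downloads(shard_count: int, per_host_limit: int = 6) -> int:
    """Upper bound on simultaneous downloads across all shards.

    per_host_limit is an assumed browser limit, not a protocol constant.
    """
    return shard_count * per_host_limit


# Hypothetical shard hostnames for a site's static assets.
shards = ["static1.example.com", "static2.example.com", "static3.example.com"]
host = shard_host("/img/logo.png", shards)
budget = max_parallel_downloads(len(shards))  # 3 shards * 6 connections
```

Each extra shard raises the ceiling on parallel downloads, but every shard also costs its own DNS lookup and TCP connection, which is part of why the workaround only went so far.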
What Happened Before HTTP2?
Before the emergence of HTTP2, there were attempts to improve page load times and reduce latency. Some of these attempts were on the browser side, with some browsers starting to cache data so that it wouldn’t have to be downloaded again on a repeat visit.
However, the ultimate problem was the process itself and attempts to get around the protocol didn’t make any significant improvements. There were also other attempts to improve the speed of requests and responses, but these weren’t enough to make a difference.
The Foundation of a New Transfer Protocol: SPDY
Like everyone else in the internet community, Google was aware of the situation and the issues of the HTTP protocol. In 2009, they announced that they were working on a whole new protocol to significantly reduce request and response times. This new protocol was called SPDY. Google’s purpose in creating SPDY was to:
- Avoid the need for changes to network infrastructure
- Avoid the need for any changes in website content
- Minimize the deployment complexity
Google made it clear that they intended to develop SPDY in collaboration with the open-source community, as per the company’s open approach at the time.
When SPDY launched, Google software engineers Roberto Peon and Mike Belshe reported that, in a controlled environment, they had improved page load speeds and lowered latency by up to 55%.
It was obvious that SPDY was far superior to traditional HTTP protocols. New versions of browsers, a vast number of websites and other social platforms started supporting SPDY. This made Google’s new SPDY protocol the de facto standard because the benefits were obvious and scalable, while deployment was relatively effortless.
2 HTTP Protocols Growing in Parallel
The success of Google’s new protocol and its widespread adoption compelled the HTTP Working Group to start working on a new standardised protocol, which would later become the standard HTTP2.
For a while, these two HTTP protocols lived side-by-side, with SPDY starting to fill the role of a guinea pig. Bold improvements were applied and tested first on the SPDY protocol, with successful ones adopted into the HTTP2 protocol. In just a few years, HTTP2 matured enough to become the new standard for the internet.
HTTP2 as a New Standard
In 2015, HTTP2 became the new standard for the internet. CDN providers also started supporting HTTP2. The ease and relatively negligible cost of deployment increased the speed of adoption.
What Makes HTTP2 So Fast?
HTTP2 is simply the evolution of a protocol developed in a controlled environment into one that can handle real-world issues. Almost all of its innovations emerged from experience with real-world internet usage. Here are some of the innovations that made HTTP2 much faster than HTTP:
Server Push
The process of data delivery between web servers and clients in older versions of HTTP can be likened to a chess match:
- Client requests the data and waits for the response.
- The web server receives the request, processes it, and sends the requested data.
- The client receives the sent data, sends back a confirmation of receipt, and requests more data.
- The web server receives the confirmation, finds more data, and sends it. Then it waits for confirmation that the client received the data.
This request-response cycle continues until all the data is delivered, or until the connection is broken and the client stops requesting data. And much like a chess match, no one can make a move before the other side has completed the previous one.
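The cost of this turn-taking can be estimated with simple arithmetic. The Python sketch below assumes every resource needs one full round trip plus some server processing time, and that nothing overlaps; the function name and the numbers are illustrative only.

```python
def sequential_load_time(resource_count: int,
                         round_trip_ms: float,
                         server_ms: float = 0.0) -> float:
    """Total time when each request waits for the previous response.

    Assumes one round trip per resource and no overlap between requests,
    which is the strict turn-taking described above.
    """
    return resource_count * (round_trip_ms + server_ms)


# 10 resources, 50 ms round trip, no server processing time
total_ms = sequential_load_time(10, 50)
```

With 10 resources and a 50 ms round trip, this model spends 500 ms waiting on the network, no matter how fast the connection is in terms of bandwidth.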
The innovation of the server push method was to send the necessary data even before it was requested. Since the structure of the data was clear and identified, the web server could simply predict what data was needed and send it beforehand. This also shortened the server-client engagement, leading to many other indirect improvements in data delivery speed. The data delivered by the server push method could be:
- Cached by the client
- Reused across different pages
- Multiplexed alongside other resources
- Prioritized by the server
- Declined by the client
Push Promise
The push promise method is an agreement or contract between the client and the web server, made before the pushed data is sent. Simply put, the server informs the client about the content of the data set and what will be delivered as a result of the client’s request. The client can decline the delivery of the data and proceed. This prevents the client and the connection from being overloaded with data that is useless to the client.
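The announce-then-decline exchange can be modelled with a small Python sketch. This is a toy model, not an HTTP2 implementation: the Client class, the push_for function and the manifest of related resources are all hypothetical. In the real protocol the announcement is carried in a PUSH_PROMISE frame, and the client declines by resetting the promised stream.

```python
from dataclasses import dataclass, field


@dataclass
class Client:
    """Toy client that remembers which resources it already has cached."""
    cache: set[str] = field(default_factory=set)

    def review_promises(self, promised: list[str]) -> tuple[list[str], list[str]]:
        """Split promised resources into (accepted, declined).

        Anything already in the cache is declined so that it is not
        sent again over the connection.
        """
        accepted = [path for path in promised if path not in self.cache]
        declined = [path for path in promised if path in self.cache]
        return accepted, declined


def push_for(page: str, manifest: dict[str, list[str]]) -> list[str]:
    """Server side: the resources the server predicts a page will need."""
    return manifest.get(page, [])


# Hypothetical manifest mapping a page to the resources it depends on.
manifest = {"/index.html": ["/style.css", "/app.js"]}
client = Client(cache={"/style.css"})

accepted, declined = client.review_promises(push_for("/index.html", manifest))
# accepted is ["/app.js"]; declined is ["/style.css"]
```

The point of the design is that the promise arrives before the data does, so the client gets a chance to refuse what it already holds.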
Flow Control
Since HTTP2 allows the server to “push” data to the client, measures were required to prevent the web server from overloading the client with unnecessary data.
The client’s and the web server’s stream priorities, processing capabilities and workloads may not match. Actually, in real-world conditions, they almost never match. A server may handle the traffic of thousands or even millions of clients and doesn’t have to prioritize any specific task above anything else.
With flow control, this kind of information is exchanged between the server and the client. Since HTTP2 doesn’t fix the exact parameters of this exchange, the server and the client can settle on a pace and priority according to real-time variables.
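One way to picture flow control is as a credit system: the receiver advertises how many bytes it is willing to accept, the sender spends that credit, and the receiver tops it up when it is ready for more. The Python sketch below models a single window under that assumption. The class and method names are made up for this article; 65,535 bytes is the default initial window size in HTTP2, and in the real protocol the top-up is sent in a WINDOW_UPDATE frame.

```python
class FlowControlWindow:
    """Toy model of a credit-based flow-control window."""

    def __init__(self, initial_size: int = 65_535) -> None:
        # Bytes the sender may still transmit before it has to wait.
        self.available = initial_size

    def consume(self, payload_size: int) -> int:
        """Sender side: return how many bytes may be sent right now."""
        allowed = min(payload_size, self.available)
        self.available -= allowed
        return allowed

    def update(self, increment: int) -> None:
        """Receiver side: grant the sender more credit."""
        if increment <= 0:
            raise ValueError("increment must be positive")
        self.available += increment


window = FlowControlWindow(initial_size=100)
first = window.consume(70)    # 70 bytes fit
second = window.consume(50)   # only 30 bytes of credit are left
window.update(40)             # the receiver has caught up and grants more
```

Because the receiver decides when to grant credit, a slow or busy client can throttle a fast server without either side needing fixed, agreed-upon rates.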
Multiplexing
As mentioned above, in older versions of HTTP, the request-response cycle had to be kept and data had to be sent in sequence to be processable (or meaningful) by each side. Multiplexing divides data into parts that can be requested or sent according to both parties’ requirements over a single connection and then reassembled on the other side after delivery. This also eliminated the need for workarounds like domain sharding.
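The divide, interleave and reassemble steps can be sketched in a few lines of Python. This is a simplified model: a frame here is just a (stream id, chunk) pair, the round-robin ordering is one possible scheduling choice, and all function names are hypothetical. The stream ids 1 and 3 follow the HTTP2 convention that client-initiated streams use odd numbers.

```python
from collections import defaultdict
from itertools import zip_longest

Frame = tuple[int, bytes]  # (stream id, chunk of payload)


def split_into_frames(stream_id: int, payload: bytes, frame_size: int) -> list[Frame]:
    """Cut one response into frames tagged with its stream id."""
    return [
        (stream_id, payload[i:i + frame_size])
        for i in range(0, len(payload), frame_size)
    ]


def interleave(*streams: list[Frame]) -> list[Frame]:
    """Round-robin frames from several streams onto one connection."""
    wire: list[Frame] = []
    for group in zip_longest(*streams):
        wire.extend(frame for frame in group if frame is not None)
    return wire


def reassemble(wire: list[Frame]) -> dict[int, bytes]:
    """Receiver side: rebuild each stream from the interleaved frames."""
    buffers: dict[int, bytearray] = defaultdict(bytearray)
    for stream_id, chunk in wire:
        buffers[stream_id] += chunk
    return {stream_id: bytes(buf) for stream_id, buf in buffers.items()}


css = split_into_frames(1, b"body { margin: 0 }", 8)
js = split_into_frames(3, b"console.log('hi')", 8)
wire = interleave(css, js)
streams = reassemble(wire)  # both payloads are recovered intact
```

Since every frame carries its stream id, the order in which frames from different streams arrive no longer matters, which is what lets one connection do the work that previously required several.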
There are also some other methods and innovations applied in HTTP2, which are very different from HTTP or even SPDY.
In general, HTTP2 revolutionized the way that web servers and clients exchange data. The speed gains vary with the site and the network, but they can be substantial, as the SPDY figures above suggest. Nowadays, the next version of the protocol, HTTP3, is being talked about. Let’s see if it will be even more revolutionary.