HTTP (HyperText Transfer Protocol) is the way most of the web communicates. It’s the system that governs how your computer talks to servers around the world to retrieve webpages, sync with your online calendar, and even send messages in point-of-sale systems. It gives computers a set of rules to follow when talking to one another. It runs on top of existing network connections (typically TCP), and web development frameworks often give you direct access to it.
HTTP is a request-response system. This means there is one computer (the client) that sends a request to another computer (the host), which then generates and sends a response. An easy analogy would be to think of sending mail. Someone sends a letter, it gets delivered, and then someone sends a response.
The first thing you need in order to send a request is somewhere to send it. URLs are used in the browser to indicate where you want to go on the web and which pages you want to access. For example, this page has a URL of https://schulichignite.com/definitions/http. This follows a format of protocol://domain_name/slug. So in this URL, https is the protocol, schulichignite.com is the domain name, and /definitions/http is the slug.
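As a minimal sketch, Python's standard urllib.parse module can pull those three parts out of a URL. The function name split_url and the dictionary keys are just illustrative choices that mirror the protocol://domain_name/slug format above.

```python
from urllib.parse import urlsplit

def split_url(url):
    """Break a URL into the protocol, domain name and slug."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,        # the part before ://
        "domain_name": parts.hostname,   # the server we want to reach
        "slug": parts.path,              # the resource on that server
    }

print(split_url("https://schulichignite.com/definitions/http"))
```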
HTTP is a protocol. This tells the browser how it should format its communications. HTTP (or its encrypted variant, HTTPS) is the one used on most of the web, but there are also FTP, SMTP, file and other protocols that can be used.
A domain is something you purchase from a domain name registrar. Computers find each other on a network using IP addresses, something like 192.168.1.1. The problem is that these are a pain to remember and use, so domains allow you (with the help of DNS) to register a domain name and map it to an IP address.
So if we were to buy schulichignite.com, and then register a DNS record of @ A 192.168.1.1, then @ means this domain, an A record means map the name to an IP address, and 192.168.1.1 is our hypothetical server. So when someone goes to http://schulichignite.com it would be the same as going to http://192.168.1.1, and they would get our server!
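To make the idea concrete, here is a toy model of that lookup. It is only a sketch: the ZONE dictionary and the resolve function are made up for illustration, and real DNS involves resolvers and name servers rather than a local dictionary.

```python
# Hypothetical zone data: domain -> record name -> (record type, value).
# "@" stands for the domain itself, as in the record described above.
ZONE = {
    "schulichignite.com": {"@": ("A", "192.168.1.1")},
}

def resolve(domain, zone=ZONE):
    """Return the IP address from the domain's "@" A record, or None."""
    record = zone.get(domain, {}).get("@")
    if record is None or record[0] != "A":
        return None
    return record[1]

print(resolve("schulichignite.com"))
```

In a real program you would normally let the operating system do this for you, for example with socket.getaddrinfo from the standard library.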
Slugs are used to identify different resources once you get to a server. Leaving a URL at the “home/root URL” (i.e. http://schulichignite.com) sets the slug to /. This means your server can serve a bunch of different webpages to a user based on the slug they provide. For example, https://schulichignite.com/scorch gives us a slug of /scorch. So first the https://schulichignite.com part tells our browser to look up the server for schulichignite.com (let’s say its IP is 192.168.1.1), then the browser communicates with it over HTTP and asks for the /scorch page!
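One detail worth seeing in code is the root case. A URL with nothing after the domain has an empty path when parsed, so a client has to fall back to / itself. This is a small sketch; slug_of is a hypothetical helper name.

```python
from urllib.parse import urlsplit

def slug_of(url):
    """Return the slug of a URL, falling back to "/" for the root URL."""
    # urlsplit gives an empty path for "http://schulichignite.com",
    # so we substitute "/" the way a browser would when asking the server.
    return urlsplit(url).path or "/"

print(slug_of("http://schulichignite.com"))
print(slug_of("https://schulichignite.com/scorch"))
```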
Headers are key-value pairs used to communicate information between the client and host. This is essentially extra data used to inform one another about settings, preferences, security standards etc. Each header is written as a key, a colon, and a value, so for example to indicate the date you might send:
Date: Wed, 10 May 2023 20:34:20 GMT
Headers can include whatever information you want, but many header names are already standardized and respected by most HTTP servers, so if you are making up your own, be careful that the name isn’t taken already!
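Here is a rough sketch of writing and reading a single header line. The helper names format_header and parse_header are invented for this example; the date formatting uses email.utils.format_datetime from the standard library, which produces the HTTP date style.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

def format_header(key, value):
    """Write one header as "Key: value"."""
    return f"{key}: {value}"

def parse_header(line):
    """Split "Key: value" on the first colon into (key, value)."""
    key, _, value = line.partition(":")
    return key.strip(), value.strip()

moment = datetime(2023, 5, 10, 20, 34, 20, tzinfo=timezone.utc)
date_header = format_header("Date", format_datetime(moment, usegmt=True))
print(date_header)
print(parse_header(date_header))
```

Splitting on only the first colon matters, because values such as times contain colons of their own.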
When we work on our computer we often use file extensions to tell what type a file is. This doesn’t work in networking. Instead we use MIME types to indicate what type of resource something is. This is what tells your browser how to process a response, so it knows when to treat a URL as a PDF vs an HTML page vs an image etc. A MIME type is sent in a header (typically Content-Type), and there is a list of standard MIME types that browsers recognize.
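A server often has to go from a file on disk to a MIME type before it can fill in that header. As a sketch, Python's standard mimetypes module can guess one from a filename. The function name content_type_for and the fallback value are choices made for this example.

```python
import mimetypes

def content_type_for(filename):
    """Guess a Content-Type value from a filename's extension."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    # Fall back to a generic "some bytes" type when the extension is unknown.
    return mime_type or "application/octet-stream"

for name in ("index.html", "report.pdf", "photo.png"):
    print(name, "->", content_type_for(name))
```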
Content is the data of the resource you are looking to access. If it’s a webpage then it’s the HTML/CSS/JS source code as text; if it’s an image it’s the raw binary data, and so on.
When looking at HTTP communication there is the host and the client. The host is the system that receives a request and has the resource that is being requested by the client. The steps for HTTP communication are:
1. REQUEST: the client asks the host to GET /web on schulichignite.com.
2. The host finds the requested resource associated with /web on the server.
3. RESPONSE: the host sends the client the HTML file for the page schulichignite.com/web.
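The three numbered steps can be simulated without any networking at all. This is a toy sketch: PAGES, host_handle and client_get are hypothetical names, and the page content is made up.

```python
# Hypothetical resources stored on the host, keyed by slug.
PAGES = {"/web": "<h1>Web</h1>"}

def host_handle(method, slug, pages=PAGES):
    """Steps 2 and 3: find the resource, then return (status, content)."""
    if method != "GET":
        return 405, ""      # this toy host only understands GET
    if slug not in pages:
        return 404, ""      # nothing stored under that slug
    return 200, pages[slug]

def client_get(slug):
    """Step 1: the client sends a GET request and gets the response back."""
    return host_handle("GET", slug)

print(client_get("/web"))
print(client_get("/missing"))
```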
Requests are used to ask a server for CRUD (create, read, update, delete) operations. Keep in mind that in order to send a request to a server you need its IP address, and the host computer needs to have a port open to receive it. A port lets you specify which service on the host machine you want to access. This means one computer can run multiple servers by running them on different ports. Most commonly you will use port 80 for HTTP and 443 for HTTPS.
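As a small sketch of how a client picks the port, the helper below uses the port written in the URL if there is one and otherwise falls back to the defaults mentioned above. DEFAULT_PORTS and port_for are names invented for this example.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def port_for(url):
    """Return the explicit port in the URL, or the protocol's default."""
    parts = urlsplit(url)
    return parts.port or DEFAULT_PORTS.get(parts.scheme)

print(port_for("http://schulichignite.com"))
print(port_for("https://schulichignite.com"))
print(port_for("http://192.168.1.1:8080/scorch"))
```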
There are various things you can request from a server. At the start of an HTTP request you need to specify which method you want to use; this determines how the server interprets what you want to do. Once the request is sent, the client waits for a response.
HTTP has a ton of request methods for telling the server what you want to do. A few common ones are GET (read a resource), POST (create a resource or submit data), PUT (replace or update a resource) and DELETE (remove a resource).
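To tie methods back to CRUD, here is a sketch that builds (but never sends) a request with Python's urllib.request.Request. The CRUD_TO_METHOD mapping is one common convention and an assumption here, not a rule every server follows, and request_for is a hypothetical helper.

```python
from urllib.request import Request

# One common (but not universal) mapping of CRUD operations to HTTP methods.
CRUD_TO_METHOD = {
    "create": "POST",
    "read": "GET",
    "update": "PUT",
    "delete": "DELETE",
}

def request_for(operation, url, data=None):
    """Create a request object for the given CRUD operation without sending it."""
    return Request(url, data=data, method=CRUD_TO_METHOD[operation])

req = request_for("delete", "http://weather.com/calgary")
print(req.get_method(), req.full_url)
```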
Parameters are one of the three ways to send data to a server (the headers and the body are the other two). There are two kinds of parameters, query parameters and path parameters, and both pass information to the server via the request URL.
With query parameters, values are included after the URL path with a ? followed by key-value pairs, for example:
weather.com/today?city=calgary
Multiple parameters are separated with &:
weather.com/today?city=calgary&measurements=celsius
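You rarely need to glue query strings together by hand. As a sketch, urllib.parse can both build and read them; with_query and read_query are helper names made up for this example, and the http:// prefix is added only so the URL parses cleanly.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def with_query(base_url, params):
    """Append key-value pairs to a URL as query parameters."""
    return f"{base_url}?{urlencode(params)}"

def read_query(url):
    """Read the query parameters back out, keeping the first value per key."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}

url = with_query("http://weather.com/today", {"city": "calgary", "measurements": "celsius"})
print(url)
print(read_query(url))
```

urlencode also escapes characters such as spaces and & inside values, which is easy to get wrong when building the string manually.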
With path parameters, values are included in the main URL path itself. For example, for weather.com/calgary the API would define a route of weather.com/{city}, so the /calgary part is the value of a variable called city.
Multiple parameters:
weather.com/{city}/{measurement}
weather.com/calgary/celsius
I used {} here to indicate a variable, but some systems use <>, like /<city>.
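Frameworks usually do the path matching for you, but a rough sketch shows the idea. The match_path function below is hypothetical and only handles the {} style; it turns each {name} into a regular expression group that matches one path segment.

```python
import re

def match_path(template, path):
    """Match a path against a template like "/{city}/{measurement}".

    Returns a dict of the captured variables, or None if the path doesn't fit.
    """
    pattern = ""
    pos = 0
    for var in re.finditer(r"\{(\w+)\}", template):
        # Copy the literal text before the variable, escaped for regex use.
        pattern += re.escape(template[pos:var.start()])
        # A variable matches one segment: anything except a slash.
        pattern += f"(?P<{var.group(1)}>[^/]+)"
        pos = var.end()
    pattern += re.escape(template[pos:])
    found = re.fullmatch(pattern, path)
    return found.groupdict() if found else None

print(match_path("/{city}/{measurement}", "/calgary/celsius"))
```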
When you put things in a URL they are visible, so don’t put private info there! For example, both of these URLs expose a password:
Path parameters:
weather.com/login/{username}/{password}
weather.com/login/john/P455W0RD
Query Parameters:
weather.com/login?password=P455W0RD&username=john
Now someone only has to look at the user’s browser history to steal the password for the user account!
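A common alternative is to send credentials in the request body instead of the URL. The sketch below builds (but does not send) such a request; login_request is a hypothetical helper and the weather.com/login endpoint is just the made-up example from above.

```python
from urllib.parse import urlencode
from urllib.request import Request

def login_request(username, password):
    """Build a login request that carries the credentials in the body."""
    body = urlencode({"username": username, "password": password}).encode("utf-8")
    # Passing data makes this a POST; the URL itself carries no secrets.
    return Request("https://weather.com/login", data=body)

req = login_request("john", "P455W0RD")
print(req.get_method(), req.full_url)
```

The body is still readable by anyone who can see the traffic unless the connection uses HTTPS, so this keeps the password out of browser history rather than making it secret on its own.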
Requests are just plain text. The format is as follows (a blank line separates the headers from the content):
METHOD /slug HTTP/version
header: value

content
So when making a GET request to http://schulichignite.com/scorch via HTTP 1.1 you would send (no content because GET requests don’t need content):
GET /scorch HTTP/1.1
Host: schulichignite.com
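Here is a sketch of producing that text in code. On the wire, HTTP/1.1 ends each line with a carriage return plus newline (\r\n) and puts an empty line after the headers. The build_request function and its parameters are invented for this example.

```python
def build_request(method, slug, headers, content=b"", version="1.1"):
    """Serialize an HTTP/1.x request to bytes.

    Lines end with CRLF and a blank line separates the headers from the content.
    """
    lines = [f"{method} {slug} HTTP/{version}"]
    lines += [f"{key}: {value}" for key, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + content

raw = build_request("GET", "/scorch", {"Host": "schulichignite.com"})
print(raw.decode("ascii"))
```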
When sending data via a POST request, you also have a body. Here’s an example of a form submission to example.com/form:
POST /form HTTP/1.1
Host: example.com
Content-Type: application/x-www-form-urlencoded
Content-Length: <needs to be calculated>

name=kieran&age=24
*The Content-Type header indicates a MIME type
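The Content-Length value is simply the number of bytes in the body. As a sketch, the hypothetical form_request helper below encodes the form fields first and then measures them.

```python
from urllib.parse import urlencode

def form_request(slug, host, fields):
    """Serialize a form submission as a raw HTTP/1.1 POST request."""
    body = urlencode(fields).encode("utf-8")
    head = (
        f"POST {slug} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"   # length in bytes, not characters
        "\r\n"
    )
    return head.encode("ascii") + body

raw = form_request("/form", "example.com", {"name": "kieran", "age": 24})
print(raw.decode("utf-8"))
```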
The body doesn’t have to be form data. Here’s an example of a POST request with a JSON body sent to weather.com/calgary:
POST /calgary HTTP/1.1
Host: weather.com
Content-Type: application/json
Content-Length: <needs to be calculated>

{"date":"11/29/2022"}
*The Content-Type header indicates a MIME type
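The same idea works for JSON. This sketch uses the standard json module to produce the body and the two headers that describe it; json_body is a name made up for this example, and the compact separators are only there to reproduce the body shown above byte for byte.

```python
import json

def json_body(payload):
    """Encode a payload as JSON and return (headers, body) describing it."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Content-Length": str(len(body)),
    }
    return headers, body

headers, body = json_body({"date": "11/29/2022"})
print(headers)
print(body.decode("utf-8"))
```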
Once a request is received the server will generate a response. This response looks similar to a request but with some different fields.
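Annoyingly, basically everything in an HTTP response is optional, except the status code.
Status codes are how a host tells a client how processing their request went. The status code is the only required part of a response (strictly speaking it sits on the first line of the response rather than in a header). There are a lot of them, but here’s the basics: 1xx codes are informational, 2xx mean success, 3xx are redirects, 4xx mean the client made an error, and 5xx mean the server ran into an error.
*Typically they will also have a message associated, e.g. 404 Not Found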
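Python ships the standard codes and their messages in http.HTTPStatus, so a quick sketch can look them up. The CATEGORIES table reflects the standard grouping of codes by their first digit, and describe is a hypothetical helper.

```python
from http import HTTPStatus

# Standard grouping of status codes by their first digit.
CATEGORIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def describe(code):
    """Return the code, its message and its category for a known status code."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase} ({CATEGORIES[code // 100]})"

print(describe(200))
print(describe(404))
```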
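Responses are also just plain text. The format is as follows (the first line starts with the HTTP version, and a blank line separates the headers from the content):
HTTP/version status_code message
header: value

content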
Here is an example of a simple HTML page response:
HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Server: cloudflare
Date: Tue, 29 Nov 2022 20:34:20 GMT

<!doctype html><html><head></head><body><h1>Hello World</h1></body></html>
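Going the other way, here is a sketch of taking a raw response apart. It assumes the first line starts with the HTTP version, as real servers send it (for example HTTP/1.1 200 OK), and that lines end with \r\n. parse_response is a hypothetical helper; in practice a library such as http.client does this for you.

```python
def parse_response(raw):
    """Split raw HTTP/1.x response bytes into its parts.

    Returns (version, status_code, message, headers, content).
    """
    head, _, content = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, status, message = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        key, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[key.strip().lower()] = value.strip()
    return version, int(status), message, headers, content

raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Server: cloudflare\r\n"
    b"\r\n"
    b"<h1>Hello World</h1>"
)
print(parse_response(raw))
```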
Error codes will often have no content. They may be as simple as:
HTTP/1.1 404 Not Found
Or if there’s an error page they might have a “full” response:
HTTP/1.1 404 Not Found
Date: Tue, 29 Nov 2022 20:34:20 GMT
Content-Type: text/html; charset=utf-8
Server: cloudflare

<!doctype html><html><head></head><body><h1>Oops... Couldn't find that page</h1></body></html>
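Both flavours of error response can come from one small function. This is a sketch from the host's point of view: build_response is a hypothetical helper, it always writes HTTP/1.1, and it adds a Content-Length header (not shown in the examples above) so the client knows where the content ends.

```python
from http import HTTPStatus

def build_response(code, content=b"", content_type="text/html; charset=utf-8"):
    """Serialize a response; the content (and its Content-Type) is optional."""
    status = HTTPStatus(code)
    lines = [f"HTTP/1.1 {status.value} {status.phrase}"]
    if content:
        lines.append(f"Content-Type: {content_type}")
    lines.append(f"Content-Length: {len(content)}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii") + content

print(build_response(404).decode("ascii"))
print(build_response(404, b"<h1>Oops</h1>").decode("ascii"))
```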