CM3035 Topic 01: The Web Stack
Main Info
Title: The Web Stack
Teachers: Daniel Buchan
Semester Taken: April 2022
Parent Module: cm3035: Advanced Web Development
Description
This topic covers the foundations of the web stack and getting set up with a first Django project.
Assigned Reading
Other Reading
Kumar, Akshi, Web Technology Theory and Practice ch. 7. Really basic stuff, not noted here.
Lecture Summaries
1.1 The Full stack Web Server
1.101 TCP and IP
The internet protocol suite is a model for a set of communication protocols used to transmit data over computer networks. It consists of a great number of protocols, but typically we refer to its two core protocols:
TCP or Transmission Control Protocol is a means of delivering error-checked streams of bytes between internet protocol connected computers.
IP or Internet Protocol governs how to relay and route packets of data across computer connected networks.
The origins lie in research from the 60s and 70s. By 1975 the first test between Stanford and UCL was completed. In 1982 it was selected by the US DoD as their standard for military communication. It was the de facto standard for networking by the early 90s.
There are four logical layers in TCP/IP that abstract functionality from the layer below.
The link layer, the lowest layer, concerns the protocols required to send data directly between connected computers, known as hosts, on a single network.
The internet layer, the next one up, is concerned with how to route the data between hosts on two or more connected networks. It is in this layer that hosts are uniquely identified by their IP address.
The transport layer is concerned with host-to-host, task-specific communication in a manner independent of the implementation of the lower layers.
The application layer, the top layer, is concerned with the processes that generate data which will be sent over the network. These are often user-focused applications like browsers or email clients, and the web servers they talk to. These typically use higher-level data packaging protocols or standards (eg HTTP, SMTP, FTP) to encapsulate the data, so that the sending and receiving applications understand each other.
Port Numbers
TCP/IP protocols additionally make use of port numbers. A port number is an additional numeric value alongside the IP address that specifies the destination application or service running on the computer. EG a web server may be set to receive/send on port 80. When an incoming request is received, only those labeled for port 80 will be forwarded to the web server. Hence the web server is said to be listening on port 80.
Port numbers are unsigned integers from 0 to 65,535. Values below 1024 are reserved for common applications, eg 80 for HTTP and web comms, 22 for secure shell, 53 for DNS. Typically during development we’ll use values like 4000, 8000, or 8080.
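To make "listening on a port" concrete, here is a minimal sketch using Python's standard socket module. It is an illustration only, not something from the lecture: the echo behaviour and the serve_once helper are made up, and port 0 is used so the OS picks any free port rather than a fixed dev port like 8000.

```python
import socket
import threading


def serve_once(server: socket.socket) -> None:
    """Accept one connection and echo back whatever bytes arrive."""
    conn, _addr = server.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)


# Bind to the loopback address. Port 0 asks the OS for any free port;
# a dev server would normally name one explicitly, eg 8000 or 8080.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen()
host, port = server.getsockname()

worker = threading.Thread(target=serve_once, args=(server,))
worker.start()

# The client addresses the service by IP address *and* port number.
with socket.create_connection((host, port)) as client:
    client.sendall(b"hello")
    reply = client.recv(1024)

worker.join()
server.close()
print(f"server listened on port {port} and replied with {reply!r}")
```

Only traffic addressed to that port reaches this server; anything sent to another port on the same IP address goes to whichever application (if any) is listening there.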
1.103 HTTP
In server development we typically focus on the application layer of the stack: clients that (generally) request and consume data, and servers that create and send data. To understand each other, clients and servers package requests and data using the HTTP protocol.
Tim Berners-Lee (TBL) and his team proposed the initial HTTP standard as part of the WWW set of standards in 1989.
HTTP is a stateless request response protocol where a client application controlled by a user makes a request to a remote web server for resources. These resources are usually files and may include HTML documents, video files, images, JSON data, and more.
Client applications that make requests are known as user agents. They include web browsers but also apps like curl, mobile apps etc.
HTTP is stateless in so far as the server stores no session info about the user agent between sequential requests, so each request needs to include all the information required to retrieve the requested resource.
Client requests use Uniform Resource Identifiers, or URIs, to identify resources. A URI is a formatted string:

The first portion, http://, defines the protocol.
Everything before the next forward slash is the authority, optionally including a user id (john.doe:password@), the name of the remote host or server domain (www.example.com), and an optional port (:123).
After that slash, the information between any following slashes is the path, the location of the resource on the server. Historically this would have been an actual file path on the server, but now servers can define the location as they wish.
After the ? we have the query information. These are key-value pairs defined as key=value and separated by the ampersand character (&).
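The breakdown above can be checked with Python's standard urllib.parse module. The URI below is a hypothetical one: the protocol, user id, host and port match the pieces named in the text, while the path and query values are made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical URI: the path and query string are invented for this example.
uri = "http://john.doe:password@www.example.com:123/forum/questions?tag=networking&order=newest"

parts = urlsplit(uri)
print("protocol:", parts.scheme)
print("user id: ", parts.username, parts.password)
print("host:    ", parts.hostname)
print("port:    ", parts.port)
print("path:    ", parts.path)

# parse_qs splits the query on "&" and each pair on "=".
# Values come back as lists because a key may be repeated.
query = parse_qs(parts.query)
print("query:   ", query)
```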
Requests and responses are packaged as HTTP messages. These have four regions of information:
Line | HTTP Request | HTTP Response |
---|---|---|
1 | request line | status line |
2 | request metadata (headers) | response metadata (headers) |
3 | blank line | blank line |
4 | request body | response body |
The first line of a request defines the method (eg GET), resource location, and HTTP protocol version.
There are nine request methods; the most common are GET, POST, PUT, and DELETE.
GET requests a resource. POST asks the server to accept the submitted data and use it to decide what resource to return. PUT indicates to the server that data is to be submitted and stored at the given URI, updating the resource there or creating a new one. DELETE requests that the server deletes the resource.
For responses, the same structure applies. The status line carries the HTTP version, a status code, and a human-readable message.
The response header metadata will often include what sort of file or resource is being returned. Then comes a blank line, and everything that remains is the (optional) response body.
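As a rough sketch of the four regions, the snippet below splits a raw HTTP/1.1 message by hand. The messages, the /users path and the parse_http_message helper are all made up for illustration; real servers and clients use a proper HTTP library rather than string splitting.

```python
def parse_http_message(raw: str) -> dict:
    """Split a raw HTTP/1.1 message into first line, headers and body."""
    # The blank line (two consecutive CRLFs) separates metadata from the body.
    head, _, body = raw.partition("\r\n\r\n")
    first_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"first_line": first_line, "headers": headers, "body": body}


# Hypothetical request: method, resource location, protocol version.
request = (
    "POST /users?active=true HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Content-Type: application/json\r\n"
    "\r\n"
    '{"name": "Ada"}'
)

# Hypothetical response: protocol version, status code, readable message.
response = (
    "HTTP/1.1 201 Created\r\n"
    "Content-Type: application/json\r\n"
    "\r\n"
    '{"id": 1, "name": "Ada"}'
)

parsed_request = parse_http_message(request)
method, target, version = parsed_request["first_line"].split(" ")
print(method, target, version, parsed_request["headers"])

parsed_response = parse_http_message(response)
_version, code, reason = parsed_response["first_line"].split(" ", 2)
print(code, reason, parsed_response["body"])
```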
HTTP Status codes
Response Class | Response Type | Meaning |
---|---|---|
1xx | Informational | request received correctly, processing is ongoing |
2xx | Success | request received correctly, responded correctly |
3xx | Redirection | request received, resource has moved location |
4xx | Client error | request is malformed or otherwise invalid |
5xx | Server error | server is unable to fulfil a valid request |
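The response class is simply the first digit of the status code. A small sketch using the standard library's http.HTTPStatus enum; the status_class helper and the CLASSES mapping are illustrative names, not part of any library.

```python
from http import HTTPStatus

# Map the leading digit of a status code to its response class.
CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def status_class(code: int) -> str:
    """Return the response class for a three-digit HTTP status code."""
    return CLASSES[code // 100]


for status in (
    HTTPStatus.OK,
    HTTPStatus.MOVED_PERMANENTLY,
    HTTPStatus.NOT_FOUND,
    HTTPStatus.INTERNAL_SERVER_ERROR,
):
    print(status.value, status.phrase, "->", status_class(status.value))
```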
1.106 Components of a Full Stack Web App
A web stack refers to the suite of software and technologies needed to deliver a website or web application. The minimal set is a computer running an OS, web server software to serve HTTP data, a method of data storage, and the tech for writing computer code or scripts. We would consider the client-side technologies too in the full stack environment: the web browser, HTML, JS, CSS etc.
Stacks are often referred to by acronyms reflecting sets of components in the stack, eg:
Element | LAMP | WISA | MEAN |
---|---|---|---|
OS | Linux | Windows | Linux |
Web Server | Apache | Windows Server IIS | Express |
Data Store | MySQL | Microsoft SQL Server | MongoDB |
Server side programming language | PHP | ASP.NET | JS (Node.js) |
Frontend Framework | unspecified | unspecified | Angular |
1.109 Apache and NGINX
We’ll use Apache on the module. Between them, Apache and NGINX dominate the web server market.
Apache is commonly known as httpd or just Apache. It’s been around since 1995, and runs on most modern OSs. It used to dominate, but more recently NGINX has grown.
NGINX runs primarily on Linux and can serve 10k concurrent connections (4x the Apache default). It’s not as flexible in module support as Apache, but most modern web frameworks will work fine, and NGINX is very popular for high-load servers.
1.112 Web App Frameworks and MVC
A web framework is a software library or toolkit for the easy development of web apps. It provides a structured way to organize application code, plus libraries for common functionality such as routing client requests to functions, database access, page generation via templating, and handling user sessions and auth.
We can think of heavyweight and lightweight frameworks. Lightweight frameworks tend to be less opinionated, or rigid, on how to structure code. Typically they provide less out of the box functionality.
Fully featured frameworks like Django or Rails tend to specify a structured way to lay out code, and provide libraries for all common tasks. This can make development rapid, at the expense of flexibility and sometimes performance.
MVC or Model View Controller is a design pattern for user facing software apps. It’s a common design pattern in web frameworks, used by Django, Rails, and lots more.
It separates out three logically distinct concerns:
The model is the portion of the app concerned with retrieving or generating data. In a web app the model is typically the underlying database and the software library used to interact with it. This layer allows the app to retrieve or modify data in the DB.
The view is the portion of the app that generates anything the user or client interacts with. In the context of web apps this is typically the library that generates web pages based on the data fetched.
The controller is the portion of the app that receives and responds to user requests. It typically handles fetching data via the model, and then dispatching results to the appropriate view, which is returned to the user.
Most of the app logic lies in the controller.
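A toy sketch of the three concerns in plain Python, with no framework involved. All class and function names here are made up for illustration, and a dict stands in for the database.

```python
class UserModel:
    """Model: owns the data and how it is looked up."""

    def __init__(self):
        # A dict stands in for a real database table.
        self._rows = {1: {"name": "Ada"}, 2: {"name": "Grace"}}

    def get(self, user_id):
        return self._rows.get(user_id)


def user_view(user):
    """View: turns data into what the client sees."""
    return f"<h1>{user['name']}</h1>"


def not_found_view():
    return "<h1>Not found</h1>"


class UserController:
    """Controller: receives the request, asks the model, picks a view."""

    def __init__(self, model):
        self.model = model

    def handle(self, user_id):
        user = self.model.get(user_id)
        if user is None:
            return 404, not_found_view()
        return 200, user_view(user)


controller = UserController(UserModel())
print(controller.handle(1))
print(controller.handle(99))
```

The decision about which view to return lives in the controller, which is why most of the app logic ends up there.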
1.115 Data Storage
One major component of web apps is a means of storing data so that it can be read, updated, or manipulated by the app.
For static sites, this can just be the file system of the server. But most apps use some kind of database.
Commonly we use relational databases; MySQL and PostgreSQL are the most common. Historically MySQL was known as fast at reads, at the cost of slow writes, and vice versa for Postgres, but recent releases have narrowed these gaps.
Postgres handles concurrent access more robustly than MySQL, so it is the safer choice where data integrity is a concern. Postgres also supports more sophisticated data types like JSON, geographic data, and key-value data.
Concurrent connections use more memory in Postgres, so for apps with large numbers of connections MySQL might be better, and there are use cases where MySQL is faster.
NoSQL data stores refer to systems that don’t rely on relational data models, eg KV stores, document stores, or graph databases.
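To contrast the two data models, here is a rough sketch using only the standard library: sqlite3 for the relational side, and a plain dict of JSON strings standing in for a key-value or document store. The table names, keys and data are invented for illustration and say nothing about how MySQL, Postgres or any real NoSQL product behaves.

```python
import json
import sqlite3

# Relational: data is split across tables and recombined with a join.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE book ("
    "id INTEGER PRIMARY KEY, title TEXT NOT NULL, "
    "author_id INTEGER REFERENCES author(id))"
)
conn.execute("INSERT INTO author (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO book (title, author_id) VALUES ('Notes', 1)")
rows = conn.execute(
    "SELECT book.title, author.name "
    "FROM book JOIN author ON author.id = book.author_id"
).fetchall()
print(rows)

# Key-value / document style: one key maps to a self-contained document,
# with the author nested inside the book rather than stored in its own table.
store = {}
store["book:1"] = json.dumps({"title": "Notes", "author": {"name": "Ada"}})
doc = json.loads(store["book:1"])
print(doc["title"], doc["author"]["name"])
conn.close()
```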
Django Setup
1.203 walks through installing Django in a virtual environment and creating a new project with the django-admin startproject <my-project> command.
Once created the project will have a manage.py script in the root. This is a thin wrapper around django-admin that points directly to our site’s settings, so we don’t need the admin tool itself any more; we’ll interface with manage.py.
The admin tool (and its wrapper in manage.py) includes a bunch of useful tools to inspect and interface with the database (like makemigrations), run a dev server, run tests etc. For more see the docs
1.204 Django App Layout
A Django project hosts one or more web apps, arranged such that discrete parts of the site or web project’s functionality are encapsulated in separate apps. EG user auth, or video streaming, might be separate functional units within one larger web project.
Once you have a project set up you can create a new app from the project root with python manage.py startapp <appname>
The lecture walks through the file structure of the app, and mentions that Django’s terminology is slightly different from normal: the controller logic from MVC is usually called ‘views’ in Django, and what MVC calls views are templates (Model View Template).
1.207 Serving the first webpage
A lengthy run through of creating a hello world Django app.
Helpful in that it walks through the way Django projects consist of multiple apps, and the relation between the apps and the project. Includes defining a data model, importing some seed data, and routing urls via the project urls.py through to the app’s urls.py.
1.3 The Main Django Components
1.302: Views
A Django view is a function that receives a web request, and returns a web response.
The views will contain the main logic of the app. By convention we find them in the views.py file in each app. You can do what you like of course.
A view function always receives as its first argument an object representing the HTTP request, which by convention we call request. Any values captured from the URL are passed as further arguments.
We can interrogate that object to find out info about the incoming request.
There are convenience classes, eg HttpResponse, which let you pass some data, a content type, status code etc.
For more on the Request and Response objects see the docs
1.303: Models
Django models are Python classes that describe data resources.
In the majority of cases the data will live in a relational database like Postgres. In dev we often use the lightweight SQLite.
SQLite is popular for lightweight apps, storing local data in mobile apps for example.
A Django model provides two interlinked pieces of functionality. First, models are a way of describing a database, its tables and the relationships between the tables, in pure Python. Second, the same classes let the web app read and write rows as Python objects, which is called object-relational mapping (ORM).
We’ll use the model classes a lot in views.
Given the model class describes a schema, the model can be used to generate a database matching the description. The django-admin / manage.py tooling uses the models to generate migrations: instructions which create or alter a database so that it matches the current description.
All model classes are subclassed from models.Model.
The class represents a table.
They can have class attributes that are instances of the field classes in the Django models package, for example models.IntegerField or models.CharField. These describe the fields, or columns, in the table.
Arguments to these field constructors describe the field properties. EG CharField takes a max_length.
Common arguments include null, which specifies whether the database column can store NULL, and blank, which specifies whether the field can be left empty during validation.
You can specify db_index=True to say that you want to create an index for the column.
Every table must have a primary key, but this doesn’t have to be declared by the user; Django adds an auto-incrementing id field if none is declared.
You can define methods on the class. You can link tables by specifying foreign keys with models.ForeignKey. When creating a foreign key you pass an argument for on_delete which describes what should happen when the referenced entity is deleted. For example models.SET_NULL will set the foreign key to NULL on the referencing row (which requires null=True), rather than deleting that row.
1.305: Templates
These would usually be called views in the MVC model, but Django calls them templates.
We can invoke a template in a Django view by calling the render function.
The first argument to render is the request object, the second is the location of the template used, the third is the context object, a Python dict that will be available to the template. By default render sets the media type to text/html but we can change this.
By convention we keep the templates in a templates directory within each app. Settings will expect this unless you tell it otherwise.
Within the HTML doc there are special delimiters that let us embed template logic in the HTML.
Double curly braces insert the value of a variable from the context object. For example {{ name }} will be replaced with the value associated with the name key in the context dict you pass to the template.
The other option is to use braces with a percent sign, which mark template tags such as {% if %} and {% for %}. Note that the Django template language is deliberately restricted and does not run arbitrary Python: a call like name.upper() is not valid, and you would use a filter instead, eg {{ name|upper }}.
Debugging logic in templates is a nightmare, so best to keep it short. The context object is still available inside tags.
Tag blocks aren’t delimited using Python indentation rules, so they are closed explicitly with tags like {% endif %}.
Templating is there for re-use. You can specify content regions within a template.
Say you want to have a base template with your head etc that will wrap the content. You can define a base template like this:
<html>
<head>
</head>
<body>
{% block content %}
{% endblock %}
</body>
</html>
Then you can define what goes in the content block for a specific page and use the following:
{% extends 'base.html' %}
{% block content %}
…my html content…
{% endblock %}
Django supports its own default Django template language, and also Jinja2, which is a popular Python templating language.
For more on Django templating see the docs
For more on the Django template language see those docs.
1.307 Django URLS
URL dispatching or routing is the process by which HTTP requests for resources are mapped to functions in Django views.
Typically this happens across one or more urls.py files. Conventionally there is one main urls file for the whole project, and then specific urls files in each application.
The location of the main file is set in the settings (ROOT_URLCONF); by convention it’s in the same folder as the settings and config for the project.
Usually this main file includes the various other files from the sub-apps. So the project urls file is responsible for deciding which app a resource request is sent to, and then the app takes care of mapping the request to the specific functions in its views.
The urls.py is a Python script and it’s mostly used to build the contents of a list called urlpatterns. Usually this is a static list, but you could build it dynamically if needed.
Most entries in the list are calls to the path or re_path functions. The first argument to path is the route to match. Pattern matching happens sequentially, and it’s possible to make later patterns unreachable if you’re not careful.
path matches simple strings, optionally with typed converters such as <int:id>. re_path allows use of regex patterns.
If we’re dispatching the request to a function we have to fully specify the path to the function. If we’re dispatching to another urls file, we use the include('path/to/urls') function instead.
We can capture path parameters like this:
path('user', myfunc)
path('user/<int:id>', myotherfunc)
path('user/<str:name>', mythirdfunc)
Note we don’t have to use re_path for this, but we can do that for more complex occasions.
re_path(r'^user$', myfunc)
re_path(r'^user/(?P<id>\d+)$', myotherfunc)
re_path(r'^user/(?P<name>\w+)$', mythirdfunc)
?P<name> marks a named capture group, with the variable name in angle brackets.
For more on routing see the docs
1.402 Lightweight Django
Shows that you don’t need all the infra that the default project setup provides.
from django.http import HttpResponse
from django.urls import path
from django.conf import settings
import sys

def index(request):
    # Minimal view: return a complete HTML page as the response body.
    return HttpResponse('<html><head></head><body><p>Hello world!</p></body></html>')

# Routing lives in this same file, so ROOT_URLCONF points back at this module.
urlpatterns = [path('', index)]

settings.configure(
    DEBUG=True,
    SECRET_KEY="ThisIsTheSecretKey",
    ROOT_URLCONF=__name__,
    # MIDDLEWARE replaces MIDDLEWARE_CLASSES, which was removed in Django 2.0.
    MIDDLEWARE=[
        'django.middleware.common.CommonMiddleware',
        'django.middleware.csrf.CsrfViewMiddleware',
        'django.middleware.clickjacking.XFrameOptionsMiddleware',
    ],
)

if __name__ == "__main__":
    # Hand the command line (eg "runserver") to Django's management commands.
    from django.core.management import execute_from_command_line
    execute_from_command_line(sys.argv)