Web infrastructure design.
What happens when you type https://www.holbertonschool.com and press ENTER.
When we type an address into the browser, we expect an answer. This article walks through the steps that produce that answer. First, we need to understand some concepts.
The client-server model is a distributed application structure that partitions tasks or workloads between the providers of a resource or service, called servers, and the requesters of that service, called clients. In a client-server architecture, the client computer sends a request for data to the server over the internet; the server accepts the request, processes it, and delivers the requested data packets back to the client. Clients do not share any of their resources. Email and the World Wide Web are examples of the client-server model. In everyday language, a client is a person or an organization that uses a particular service. Similarly, in the digital world a client is a computer (host) that is capable of receiving information or using a particular service from the service providers (servers).
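The request and response exchange described above can be sketched with Python's standard socket module. This is only an illustration under stated assumptions: both sides run on the same machine (127.0.0.1), the server answers a single request, and the function names start_echo_server and send_request are made up for this example.

```python
import socket
import threading


def start_echo_server() -> tuple[threading.Thread, int]:
    """Start a one-shot server on a free local port; return its thread and port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def serve_once() -> None:
        # Server side: accept one client, read its request, send a reply.
        conn, _addr = listener.accept()
        with conn:
            request = conn.recv(1024)
            conn.sendall(b"response to: " + request)
        listener.close()

    thread = threading.Thread(target=serve_once, daemon=True)
    thread.start()
    return thread, port


def send_request(port: int, payload: bytes) -> bytes:
    """Client side: connect, send one request, and return the server's reply."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(payload)
        return client.recv(1024)


if __name__ == "__main__":
    server_thread, server_port = start_echo_server()
    print(send_request(server_port, b"GET /index.html"))
    server_thread.join()
```

The client only asks and the server only answers, which is the division of roles the model describes.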
After the client types the request into the browser, the request needs the internet to reach the server.
The Internet is a global network of billions of computers and other electronic devices. With the Internet, it’s possible to access almost any information, communicate with anyone else in the world, and do much more.
You can do all of this by connecting a computer to the Internet, which is also called going online. When someone says a computer is online, it’s just another way of saying it’s connected to the Internet.
It’s important to realize that the Internet is a global network of physical cables, which can include copper telephone wires, TV cables, and fiber optic cables. Even wireless connections like Wi-Fi and 3G/4G rely on these physical cables to access the Internet.
When you visit a website, your computer sends a request over these wires to a server. A server is where websites are stored, and it works a lot like your computer’s hard drive. Once the request arrives, the server retrieves the website and sends the correct data back to your computer.
Before the request can be sent over the internet, the domain name has to be translated into an IP address.
Domain Name System (DNS)
The process of DNS resolution involves converting a host name (such as www.example.com) into a computer-friendly IP address (such as 192.168.1.1). An IP address is given to each device on the Internet, and that address is necessary to find the appropriate Internet device — like a street address is used to find a particular home. When a user wants to load a web page, a translation must occur between what a user types into their web browser (example.com) and the machine-friendly address necessary to locate the example.com web page.
In order to understand the process behind DNS resolution, it’s important to learn about the different hardware components a DNS query must pass between. For the web browser, the DNS lookup occurs “behind the scenes” and requires no interaction from the user’s computer apart from the initial request.
Each device connected to the Internet has a unique IP address which other machines use to find the device. DNS servers eliminate the need for humans to memorize IP addresses such as 192.168.1.1 (in IPv4), or more complex newer alphanumeric IP addresses such as 2400:cb00:2048:1::c629:d7a2 (in IPv6).
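As a small sketch of the ideas above, Python's standard library can tell an IPv4 address from an IPv6 one and can ask the operating system's resolver to translate a host name. The helper names ip_version and resolve are invented for this example, and the resolver call needs network access for real host names, so the demo resolves a literal address instead.

```python
import ipaddress
import socket


def ip_version(address: str) -> int:
    """Return 4 or 6 depending on the kind of IP address given."""
    return ipaddress.ip_address(address).version


def resolve(hostname: str) -> list[str]:
    """Ask the operating system's resolver for the addresses of a host name."""
    results = socket.getaddrinfo(hostname, None)
    # Each result is (family, type, proto, canonname, sockaddr); the address
    # is the first element of sockaddr. A set removes duplicates.
    return sorted({sockaddr[0] for *_rest, sockaddr in results})


if __name__ == "__main__":
    print(ip_version("192.168.1.1"))                  # IPv4 example from the text
    print(ip_version("2400:cb00:2048:1::c629:d7a2"))  # IPv6 example from the text
    # A literal IP address resolves to itself without a network lookup.
    print(resolve("127.0.0.1"))
```

Calling resolve("www.example.com") on a connected machine would return whatever addresses the configured DNS servers currently report for that name.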
TCP/IP (Transmission Control Protocol/Internet Protocol)
TCP/IP is a suite of communication protocols used to interconnect network devices on the internet. TCP/IP can also be used as a communications protocol in a private computer network (an intranet or an extranet).
The entire Internet Protocol suite — a set of rules and procedures — is commonly referred to as TCP/IP. TCP and IP are the two main protocols, though others are included in the suite. The TCP/IP protocol suite functions as an abstraction layer between internet applications and the routing/switching fabric.
TCP/IP specifies how data is exchanged over the internet by providing end-to-end communications that identify how it should be broken into packets, addressed, transmitted, routed and received at the destination. TCP/IP requires little central management, and it is designed to make networks reliable, with the ability to recover automatically from the failure of any device on the network.
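To make the idea of breaking data into addressed packets more concrete, here is a deliberately simplified sketch. It is not how TCP/IP is actually implemented: the Packet fields, the 8-byte packet size, and the sample addresses are all assumptions chosen for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Packet:
    source: str       # address of the sender
    destination: str  # address of the receiver
    sequence: int     # position of this chunk in the original message
    payload: bytes


def packetize(message: bytes, source: str, destination: str, size: int = 8) -> list[Packet]:
    """Break a message into addressed, numbered packets of at most `size` bytes."""
    return [
        Packet(source, destination, seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]


def reassemble(packets: list[Packet]) -> bytes:
    """Put packets back in order by sequence number and join their payloads."""
    ordered = sorted(packets, key=lambda packet: packet.sequence)
    return b"".join(packet.payload for packet in ordered)


if __name__ == "__main__":
    packets = packetize(b"GET / HTTP/1.1 from a browser", "10.0.0.2", "10.0.0.9")
    random.shuffle(packets)  # packets may arrive in a different order
    print(reassemble(packets))
```

The sequence number is what lets the receiver rebuild the message even when the packets arrive out of order.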
To connect the browser to the web page we are looking for, we have to use a protocol.
HTTP and HTTPS
Hyper Text Transfer Protocol Secure (HTTPS) is the secure version of HTTP, the protocol over which data is sent between your browser and the website that you are connected to. The ‘S’ at the end of HTTPS stands for ‘Secure’. It means all communications between your browser and the website are encrypted. HTTPS is often used to protect highly confidential online transactions like online banking and online shopping order forms.
Web browsers such as Internet Explorer, Firefox and Chrome also display a padlock icon in the address bar to visually indicate that an HTTPS connection is in effect.
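A short sketch of how a program can tell the two protocols apart from the URL alone. The describe_url helper is hypothetical. The default ports (80 for HTTP, 443 for HTTPS) are not mentioned in this article but are the standard values.

```python
import ssl
from urllib.parse import urlsplit

# Standard default ports for each scheme.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_url(url: str) -> dict:
    """Split a URL and report whether the connection would be encrypted."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "encrypted": parts.scheme == "https",
    }


if __name__ == "__main__":
    print(describe_url("https://www.holbertonschool.com"))
    print(describe_url("http://www.holbertonschool.com"))
    # Python's default TLS settings verify the server certificate and host name.
    context = ssl.create_default_context()
    print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```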
Some large sites do not run on just one server; they have many of them, and for that we can use a load balancer.
Ever wonder how Facebook, LinkedIn, Twitter and other web giants are handling such huge amounts of traffic? They don’t have just one server, but tens of thousands of them. In order to achieve this, web traffic needs to be distributed to these servers, and that is the role of a load balancer.
Load balancers come in software and hardware forms.
Software load balancer
Software load balancers generally implement a combination of one or more scheduling algorithms.
- Weighted Scheduling Algorithm: Work is assigned to each server according to its weight. Different types of servers in the group are given different weights, and the load is distributed in proportion to them.
- Round Robin Scheduling: Requests are served by the server sequentially one after another. After sending the request to the last server, it starts from the first server again.
The diagram below depicts this approach. Sequentially each request gets assigned to each server one by one and the round goes on. The change in the request assigned can be easily understood by looking into the diagram below.
This algorithm is used when the servers have equal specifications and there are not many persistent connections.
Round Robin passes each new connection request to the next server in line, eventually distributing connections evenly across the array of machines being load balanced. Round Robin works well in most configurations, but it is less effective when the equipment that you are load balancing is not roughly equal in processing speed, connection speed, and/or memory.
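The two scheduling algorithms described above can be sketched in a few lines. This is a minimal illustration, not how a production load balancer works: the server names and weights are invented, and the weighted variant simply repeats each server in the rotation as many times as its weight.

```python
import itertools
from collections import Counter


def round_robin(servers: list[str]):
    """Yield servers one after another, starting over after the last one."""
    return itertools.cycle(servers)


def weighted_round_robin(weights: dict[str, int]):
    """Yield servers in rotation, each appearing `weight` times per round."""
    expanded = [server for server, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


if __name__ == "__main__":
    rr = round_robin(["web-01", "web-02", "web-03"])
    print([next(rr) for _ in range(5)])

    # A server with weight 3 receives three times as many requests
    # as a server with weight 1.
    wrr = weighted_round_robin({"big-server": 3, "small-server": 1})
    print(Counter(next(wrr) for _ in range(8)))
```

With equal servers the plain rotation is enough. The weighted version is one way to handle machines that differ in processing speed or memory.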
Hardware Load Balancer
Hardware load balancers are often described as specialized routers or switches deployed between the servers and the clients. A hardware load balancer can also be a dedicated system placed between the client and the server to balance the load.
Hardware load balancers operate at Layer 4 (transport layer) and Layer 7 (application layer) of the OSI model, so the most prominent devices of this kind are L4-L7 routers.
A web server is software and hardware that uses HTTP (Hypertext Transfer Protocol) and other protocols to respond to client requests made over the World Wide Web. The main job of a web server is to display website content through storing, processing and delivering webpages to users. Besides HTTP, web servers also support SMTP (Simple Mail Transfer Protocol) and FTP (File Transfer Protocol), used for email, file transfer and storage.
Web server hardware is connected to the internet and allows data to be exchanged with other connected devices, while web server software controls how a user accesses hosted files. The web server process is an example of the client/server model. All computers that host websites must have web server software.
Dynamic and static web servers
A web server can be used to serve either static or dynamic content. Static refers to the content being shown as is, while dynamic content can be updated and changed. A static web server will consist of a computer and HTTP software. It is considered static because the server will send hosted files as is to a browser.
Dynamic web servers will consist of a web server and other software such as an application server and database. It is considered dynamic because the application server can be used to update any hosted files before they are sent to a browser. The web server can generate content when it is requested from the database. Though this process is more flexible, it is also more complicated.
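The difference between the two kinds of content can be sketched as two small functions. Everything here is hypothetical: STATIC_FILES stands in for files on disk, USER_DATA stands in for a database, and the template text is invented for the example.

```python
from string import Template

# Static content: stored once and sent back exactly as it is.
STATIC_FILES = {"/about.html": "<h1>About us</h1>"}

# Dynamic content: a template filled in with data at request time.
PAGE_TEMPLATE = Template("<h1>Hello, $name</h1><p>You have $count new messages.</p>")
USER_DATA = {"alice": 3}  # stand-in for a database lookup


def serve_static(path: str) -> str:
    """Return the hosted file unchanged, or a 404 message if it is missing."""
    return STATIC_FILES.get(path, "404 Not Found")


def serve_dynamic(user: str) -> str:
    """Build the page for this user from the template and the stored data."""
    count = USER_DATA.get(user, 0)
    return PAGE_TEMPLATE.substitute(name=user, count=count)


if __name__ == "__main__":
    print(serve_static("/about.html"))
    print(serve_dynamic("alice"))
```

The static response is identical for every visitor, while the dynamic response changes whenever the underlying data changes.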
On the software side, a web server includes several parts that control how web users access hosted files. At a minimum, this is an HTTP server. An HTTP server is software that understands URLs (web addresses) and HTTP (the protocol your browser uses to view web pages). An HTTP server can be accessed through the domain names of the websites it stores, and it delivers the content of these hosted websites to the end user’s device.
An application server is a server specifically designed to run applications. The “server” includes both the hardware and software that provide an environment for programs to run.
Application servers are used for many purposes. Several examples are listed below:
- running web applications
- hosting a hypervisor that manages virtual machines
- distributing and monitoring software updates
- processing data sent from another server
A web server is designed, and often optimized, to serve webpages. Therefore, it may not have the resources to run demanding web applications. An application server provides the processing power and memory to run these applications in real time. It also provides the environment to run specific applications. For example, a cloud service may need to process data on a Windows machine. A Linux-based server may provide the web interface for the cloud service, but it cannot run Windows applications. Therefore, it may send input data to a Windows-based application server. The application server can process the data, then return the result to the web server, which can output the result in a web browser.
Web server or application server
Web servers process HTTP requests by responding with HTML pages. A web server on its own does not use a database or generate HTML dynamically. An application server, in contrast, is dynamic and can interact with the client, and it is connected to the database.
A database is a collection of information that is organized so that it can be easily accessed, managed and updated. Computer databases typically contain aggregations of data records or files, containing information about sales transactions or interactions with specific customers.
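As a minimal sketch of data being accessed, managed and updated, the example below uses SQLite from Python's standard library with an in-memory database. The sales table, its columns, and the sample rows are invented for illustration.

```python
import sqlite3


def build_sales_db() -> sqlite3.Connection:
    """Create an in-memory database holding a few sample sales records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO sales (customer, amount) VALUES (?, ?)",
        [("alice", 20.0), ("bob", 35.5), ("alice", 12.5)],
    )
    conn.commit()
    return conn


def total_for(conn: sqlite3.Connection, customer: str) -> float:
    """Access: add up all sales recorded for one customer."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE customer = ?",
        (customer,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    db = build_sales_db()
    print(total_for(db, "alice"))
    # Update: correct the amount of the first sale, then read the total again.
    db.execute("UPDATE sales SET amount = ? WHERE id = ?", (25.0, 1))
    print(total_for(db, "alice"))
```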
Just as a heart monitor in a hospital makes sure that a patient’s heart is beating at the right rate, software monitoring watches computer metrics, records them, and emits an alert if something unusual happens or if something could stop the computer from working properly.
“You cannot fix or improve what you cannot measure” is a famous saying in the tech industry. In the age of data-ism, monitoring how our software systems are doing is essential.
Web stack monitoring can be broken down into 2 categories:
- Application monitoring: getting data about your running software and making sure it is behaving as expected
- Server monitoring: getting data about your virtual or physical servers and making sure they are not overloaded (could be CPU, memory, disk or network overload)
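Server monitoring can be sketched as two steps: collect some metrics, then compare them with limits. The thresholds below (90% disk usage, a load of 1.0 per CPU) are assumptions picked for the example, and the function names are hypothetical. Real monitoring tools track many more metrics and keep a history of them.

```python
import os
import shutil

# Assumed limits; real values depend on the server and its workload.
THRESHOLDS = {"disk_used_percent": 90.0, "load_per_cpu": 1.0}


def collect_metrics(path: str = "/") -> dict[str, float]:
    """Read disk usage and, where the platform supports it, the CPU load."""
    metrics: dict[str, float] = {}
    usage = shutil.disk_usage(path)
    if usage.total:
        metrics["disk_used_percent"] = 100.0 * usage.used / usage.total
    if hasattr(os, "getloadavg"):  # not available on every platform
        try:
            one_minute_load = os.getloadavg()[0]
            metrics["load_per_cpu"] = one_minute_load / (os.cpu_count() or 1)
        except OSError:
            pass  # the load average could not be obtained
    return metrics


def find_alerts(metrics: dict[str, float], thresholds: dict[str, float] = THRESHOLDS) -> list[str]:
    """Return one alert message for every metric above its limit."""
    return [
        f"{name} is {value:.1f}, above limit {thresholds[name]:.1f}"
        for name, value in metrics.items()
        if name in thresholds and value > thresholds[name]
    ]


if __name__ == "__main__":
    current = collect_metrics()
    print(current)
    print(find_alerts(current) or "everything looks normal")
```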
What happens when…?
The best way to understand what happens when you type “www.holbertonschool.com” is with a flowchart. The following flowchart shows all the infrastructure that connects a client to a server.
Now that we have the concepts and understand how the server is connected to the client, we can answer the question “What happens when we type https://www.holbertonschool.com and press ENTER?” When we open the browser on a device that works as a client and look for something on the internet, we make a request. The domain name in this request first has to be translated into an IP address using DNS.
After that, the request goes to a load balancer, which distributes it to one of the servers. Between the load balancer and the internet we need to place a firewall to secure the site, and we can also use monitoring to check that everything is working properly. As we know, there are many load-balancing algorithms; in this case we are using the round robin algorithm.
The last step happens on the server. Here the web server or the application server needs to respond to the client using HTTPS (Hypertext Transfer Protocol Secure) or, on some sites, HTTP. This is also where we find the database, in which all the information is stored. Now the client sees in the browser all the information that was requested.
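The whole path can be put together in a toy model. Nothing here reflects the real Holberton infrastructure: the Site class, the IP address (taken from a range reserved for documentation), the firewall rule, the server names and the page content are all assumptions made to show the order of the steps.

```python
import itertools


class Site:
    """A toy model of the path: DNS, firewall, load balancer, server, database."""

    def __init__(self) -> None:
        self.dns_records = {"www.holbertonschool.com": "203.0.113.10"}  # made-up IP
        self.allowed_ports = {80, 443}  # assumed firewall rule: HTTP and HTTPS only
        self.servers = itertools.cycle(["web-01", "web-02"])  # round robin rotation
        self.database = {"/": "Welcome to Holberton School"}  # stand-in for a database

    def handle(self, host: str, port: int, path: str) -> str:
        # 1. DNS: translate the domain name into an IP address.
        if host not in self.dns_records:
            return "DNS error: unknown host"
        # 2. Firewall: only let through traffic on the allowed ports.
        if port not in self.allowed_ports:
            return "blocked by firewall"
        # 3. Load balancer: pick the next server in the rotation.
        server = next(self.servers)
        # 4. Server and database: look up the content and answer the client.
        content = self.database.get(path)
        if content is None:
            return f"{server}: 404 Not Found"
        return f"{server}: 200 OK: {content}"


if __name__ == "__main__":
    site = Site()
    for _ in range(3):
        print(site.handle("www.holbertonschool.com", 443, "/"))
    print(site.handle("www.holbertonschool.com", 22, "/"))
```

Three requests in a row alternate between the two servers, and the request on port 22 never reaches them because the firewall rejects it first.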