PostgreSQL for Data Architects

Starting with the daemon process

The first process that is started when we start PostgreSQL is /usr/local/pgsql/bin/postgres. This process has quite a few responsibilities, such as performing recovery, initializing shared data structures/memory space, and kicking off the mandatory and optional processes. These are also referred to as utility processes and include bgwriter, checkpointer, autovacuum launcher, log writer, stats collector, and so on. The daemon process also listens for connection requests from clients and spawns a server process for each client. It's obvious that the daemon itself is a mandatory process that must be running for a user to connect to the database.
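If you are on PostgreSQL 10 or later, you can get a quick look at most of these utility processes from SQL itself via the pg_stat_activity view (on older releases, you would look at the process list with ps at the operating system level). This is just a sketch; the exact set of rows depends on the version and configuration:

-- List the daemon's child processes other than regular client backends.
-- The backend_type column is available from PostgreSQL 10 onwards.
SELECT pid, backend_type
FROM pg_stat_activity
WHERE backend_type <> 'client backend';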

Let's focus on the scenario of a user connecting and issuing commands; the other pieces should then fall into place. The following diagram walks you through how the daemon process receives a connection request and starts (forks) a backend process. On successful authentication, the backend process will start handling requests from that client:

[Figure: The daemon process receives a connection request and forks a backend process for the client]
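You can verify that your session is indeed served by its own backend process. The following queries are a small sketch (the pg_stat_activity column names shown assume PostgreSQL 9.2 or later):

-- The operating-system PID of the backend process serving this session
SELECT pg_backend_pid();

-- The same backend shows up as a row in pg_stat_activity
SELECT pid, usename, datname, client_addr
FROM pg_stat_activity
WHERE pid = pg_backend_pid();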

The process is repeated for all connection requests (unless we hit the max_connections setting, in which case we get an error).
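max_connections is a cluster-wide limit; once it is reached, new connection attempts fail with an error such as FATAL: sorry, too many clients already. As a rough check, you can compare the limit with the number of connections currently in use:

-- The configured connection limit
SHOW max_connections;

-- The number of connections currently in use
SELECT count(*) AS current_connections FROM pg_stat_activity;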

So, an active server, after a period of time, will have the processes that were there when the server started, plus quite a few processes to serve client connections, as shown in the following diagram:

[Figure: An active server with the processes started at server start plus one backend process per client connection]
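On PostgreSQL 10 or later, a simple breakdown of this process mix can be obtained from pg_stat_activity; again, this is only a sketch, and the exact backend types listed vary by version:

-- Utility processes plus one 'client backend' row per connection
SELECT backend_type, count(*) AS processes
FROM pg_stat_activity
GROUP BY backend_type
ORDER BY processes DESC;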

Once a user is connected to a database, the user typically wants to read (SELECT) data or write (UPDATE/DELETE/INSERT) data, not to mention make changes to table structures, add indexes, and so on. For example, a user logs in to Amazon and searches for the latest iPad, its price, and availability. This sounds simple enough. Assuming the simplest (unbelievably simple) table structure, this search becomes the following query:

SELECT price, available_count FROM product_tbl WHERE product = 'iPad';
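The table product_tbl and its columns are implied by the queries in this section rather than defined anywhere; a minimal, hypothetical definition that makes the examples runnable could look like this:

-- A hypothetical, minimal version of product_tbl; a real product catalog
-- would obviously have a much richer schema.
CREATE TABLE product_tbl (
    product         text PRIMARY KEY,
    price           numeric(10,2) NOT NULL,
    available_count integer NOT NULL CHECK (available_count >= 0)
);

-- Sample row (the values are made up)
INSERT INTO product_tbl (product, price, available_count)
VALUES ('iPad', 499.00, 1);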

However, when we consider that there might be thousands of users who want to do this, it gets a little more complex. The preceding query gets executed a few thousand times concurrently, and when thousands of users search for different products, 'iPad' is replaced by thousands of different product names. So far so good. But what happens when there is just one of the new iPads left at Amazon and a few hundred users are trying to add it to their shopping carts? It gets really tricky, right? That is, for the database, many users are trying to write to the same record:

UPDATE product_tbl SET available_count = 0 WHERE product = 'iPad';
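PostgreSQL copes with this by taking a row-level lock for each UPDATE, so concurrent sessions that target the same row queue up behind the first writer until it commits or rolls back. The following is only a sketch of a safer decrement pattern for this scenario: the WHERE clause ensures the stock never goes below zero, and the application can inspect the reported row count to see whether the item actually made it into the cart:

BEGIN;

-- Decrement the stock only if something is left; concurrent transactions
-- block on the row lock until this transaction commits or rolls back.
UPDATE product_tbl
SET available_count = available_count - 1
WHERE product = 'iPad'
  AND available_count > 0;

-- An affected-row count of 0 means someone else got the last one.
COMMIT;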

With these possibilities at the back of our minds, let's move on to understanding the rest of the PostgreSQL processes and memory management.