Extending PostgreSQL with C, Part I : Intro
Table of contents
Preface
Let us first explain the choice of technologies: PostgreSQL and C. It might appear very niche nowadays as most online development content is heavily focused on web development... However, web dev is far from being the only type of development, and who's not tired of "how to's" in React, Angular, Javascript ...
PostgreSQL has many strengths, besides being open, and one of them is providing the possibility to build custom extensions that run "inside" the database. This is the focus of this chapter and the next ones: giving insight into how to build such an extension.
C is the mother of all languages, it is also the language in which PostgreSQL is written. It is very fast and pretty intuitive, and it always is a good skill to know (in my opinion).
But beware, here is not the place to discuss how to access databases within C programs (even though this is also a very useful skill to know)! It is to build an extension, such as PostGIS, arguably the most famous one.
For those familiar with PostgreSQL, you would probably have already used this kind of query :
CREATE EXTENSION <extension_name>;
If not, a PostgreSQL extension is a package, that provides new and custom functionalities to PostgreSQL. Even though some are provided by the software, anyone may build and use one. It consists in a separate code that can be integrated into PostgreSQL and interact with it through the provided API.
I am not an expert in PostgreSQL extensions (but I like C and PostgreSQL), and I am willing to write this guide as a "follow my side project" and also because I came to realize that there are not many tutorials or guides on this matter online.
As much as possible, I will try to explain how to develop an extension globally, illustrated by the development of a side project of mine.
NB: C is not the only language allowed to build extensions, but this is the chosen one here.
Essential installation
The first thing is to make sure PostgreSQL is installed on your system (mine is Ubuntu 22.04, hence the commands will be valid for this OS) and that you have a C compiler (gcc for instance).
Make sure to update the package index :
sudo apt-get update
Then install PostgreSQL itself
sudo apt install postgresql postgresql-contrib
and the build-essential which contains gcc
sudo apt install build-essential
At the time those lines are being written, I run PostgreSQL 14 on my machine and will then use this version as a reference. It probably won't be that different in future versions.
Now is time to install the development package for PostgreSQL
sudo apt-get install postgresql-server-dev-14
Context of the project
As introduced above, this guide follows a side project (and a very specific one).
It's about the implementation of a public transit routing algorithm (which finds the fastest way within a public transport network) called RAPTOR, which paper may be found here.
It uses open data, whose format respect the GTFS, that describe all routes, trips and stops of a transportation network. These data can be found as CSV files, which are inserted into a PostgreSQL database (with the help of gtfs2pgrouting, another project I've worked on).
It is possible to implement this algorithm externally to PostgreSQL, but at the same time, it appears suitable for an extension, as pgRouting did (which implements routing algorithms on PostGIS maps). This library is also a heavy inspiration for my work and this guide!
Part II is available here.