Extending PostgreSQL with C, Part II : First steps

·

4 min read

Make sure you read Part I, for more context.

Basic files

First of all, let us set up the makefile, needed to compile the library that will serve as the extension of PostgreSQL - PG. Fortunately, PG provides a standard makefile to compile such an extension, we only need to include it in our makefile as follows :

MODULES = raptor
EXTENSION = raptor     # the extersion's name
DATA = raptor--0.0.1.sql    # script file to install

# for posgres build
PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)

We also need a control file, whose name is the same as the extension (with .control extension, in my case raptor.control),

comment = 'Raptor algorithm for routing in GTFS'
default_version = '0.0.1'

and a script file that contains the SQL command for the actual extension creation. It must be named as <extension-name>--version.sql, hence mine is raptor--0.0.1.sql.

Leave it blank for now.

That is it for the basic configuration of the project, and it is all we need to compile the C library, install it and create the extension in a PostgreSQL database.

C library: Interacting with the PostgreSQL API

Now begins the interesting part: writing C code.

In your favorite code editor, make a new file (raptor.c in my concern) and its header (raptor.h, not mandatory but in my opinion, setting up a structured code is a good habit to have)

As we will compile the code as a library, no int main(int argc, char \argv[])* is required.

We shall include the necessary headers into our header file, as such

#ifndef RAPTOR_H
#define RAPTOR_H

#include "postgres.h"
#include "fmgr.h"

#endif

containing both postgres.h and fmgr.h, headers provided by PostgreSQL to access to some functions of the API.

Let us name our extension's function raptor_run, defined in the following C file.

In this C file, we shall make this function "callable" thanks to the macro PG_FUNCTION_INFO_V1. The prototype for such a function (callable within the extension) is also fixed as :

Datum function_name(PG_FUNCTION_ARGS);

See here, the official documentation where it is referred to.

Hence our draft file becomes :

#include "raptor.h"

PG_MODULE_MAGIC;
PG_FUNCTION_INFO_V1(raptor_run);

Datum raptor_run(PG_FUNCTION_ARGS)
{

}

PG_FUNCTION_ARGS contains input arguments of the function, as would argv do in a regular C program. The actual number and type of arguments will be determined in the SQL script introduced later.

To retrieve the arguments, we shall use the macro PG_GETARG_xxx(), where xxx is the type of the argument (see the doc here).

We now can write functions that look like agv(), sum(), min(), max() on two arguments (but not yet on entier columns : these are called aggregates and would be introduced much later).

One more thing : Datum plays the role of the generic datatype for PostgreSQL, and the macros PG_RETURN_xxx() returns a specified type.

Now a very basic example to clear the mind :

#include "raptor.h"

PG_MODULE_MAGIC;
PG_FUNCTION_INFO_V1(raptor_run);

Datum raptor_run(PG_FUNCTION_ARGS)
{
    float a = PG_GETARG_FLOAT8(0);
    float b = PG_GETARG_FLOAT8(1);

    PG_RETURN_FLOAT8((a+b)/2);
}

which simply computes the mean of 2 floats passed as arguments.

We now need to declare a SQL function that uses this C library into the raptor--0.0.1*.sql* file, introduced above. The declaration is as such :

CREATE FUNCTION raptor_run(
    FLOAT,
    FLOAT  )
RETURNS FLOAT  
as '/usr/lib/postgresql/14/lib/raptor.so', 'raptor_run'
LANGUAGE C VOLATILE STRICT;

where

  • raptor_run is the name of the function callable within PostgreSQL,

  • (FLOAT, FLOAT) are both the number and the types of the arguments

  • RETURNS FLOAT described the returned type (obviously)

  • as '/usr/lib/postgresql/14/lib/raptor.so', 'raptor_run' , gives the path of the compiled C library (in this example, it matches the path where the installation is done by the makefile provided by PostgreSQL) and the name of the C function to use when this function is called (it might be different from the SQL function name)

  • LANGUAGE C VOLATILE STRICT; indicates that this uses a C library

Once all those files are set, we now can compile and install the library thanks to the makefile :

make 
sudo make install

Now open your favourite front-end PostgreSQL terminal, to create and test the extension as such :

CREATE EXTENSION raptor;
SELECT raptor_run(1.0, 9.0);

All went ok, the result should be 5!

As such an extension is now very useful, we shall see how to interact with actual data in the database, and how to integrate more C code to manipulate data furthermore. Stay posted for Part III.