Vijaykumar Jain cabecada
@cabecada
cabecada / gist:ea2ad57765eda06c74f188cee73865d7
Created January 12, 2026 09:24
zero block detection using mmap and posix_madvise
Zero-Copy Architecture (The mmap Advantage)
In a standard program, reading a file with read() involves context switching and an extra copy:
The kernel reads the data from disk into kernel space (the page cache).
The kernel then copies that data into your program's user-space buffer.
Your CPU finally looks at the data.
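With mmap() the file's pages are mapped straight into the process address space, so a scan works against the page cache without that second copy, and posix_madvise() merely hints the expected access pattern to the kernel. A minimal sketch of the idea (not the gist's code; the 8 kB block size and the output format are assumptions):

/*
 * Minimal sketch: map a file read-only and report which 8 kB blocks are
 * entirely zero.  posix_madvise() only advises the kernel of the access
 * pattern; the zero test is a plain scan of the mapped bytes, with no
 * read() copies into a user-space buffer.
 */
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

#define BLOCK_SIZE 8192            /* assumed block size (a PostgreSQL page) */

static int
block_is_zero(const unsigned char *p, size_t len)
{
    /* first byte is zero and every byte equals its neighbour => all zero */
    return p[0] == 0 && memcmp(p, p + 1, len - 1) == 0;
}

int
main(int argc, char **argv)
{
    int         fd;
    struct stat st;
    unsigned char *map;

    if (argc != 2)
    {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    fd = open(argv[1], O_RDONLY);
    if (fd < 0 || fstat(fd, &st) < 0)
    {
        perror(argv[1]);
        return 1;
    }
    if (st.st_size == 0)
        return 0;

    map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED)
    {
        perror("mmap");
        return 1;
    }

    /* hint that the mapping will be scanned front to back */
    posix_madvise(map, st.st_size, POSIX_MADV_SEQUENTIAL);

    /* any trailing partial block is ignored */
    for (off_t off = 0; off + BLOCK_SIZE <= st.st_size; off += BLOCK_SIZE)
    {
        if (block_is_zero(map + off, BLOCK_SIZE))
            printf("zero block at offset %lld\n", (long long) off);
    }

    munmap(map, st.st_size);
    close(fd);
    return 0;
}

Build it with any C compiler and point it at a file; each fully zeroed 8 kB block is reported by its byte offset.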
@cabecada
cabecada / gist:4dfb39be6ca8f820a0a44dfa76e58876
Created January 7, 2026 09:36
postgresql subtransactions error exception catch
Franck Pachot
https://lnkd.in/dMPzDBWK
Laurenz Albe from cybertec-postgresql
https://lnkd.in/drp932PZ
Nikolay Samokhvalov from Postgres.ai
https://lnkd.in/dHmtVdT7
@cabecada
cabecada / gist:4bf8a817eed8794fadac6a6ae402ff69
Created January 6, 2026 20:04
oracle bulk collect forall in postgresql C
I want to write a C extension that can mimic FORALL / BULK COLLECT processing for millions of transactions concurrently with error logging. How would it work?
To mimic Oracle's FORALL and BULK COLLECT in a PostgreSQL C extension for millions of rows, you have to move away from the row-at-a-time processing of standard PL/pgSQL and toward internal batch processing.
PostgreSQL has no direct BULK COLLECT syntax, so your extension must manage the data movement between the database engine and your logic itself, using array types and the Server Programming Interface (SPI).
1. The Architectural Strategy
To handle millions of transactions concurrently with error logging, your C extension should follow this pattern (a sketch follows below):
Input: Accept arrays of data (the "Bulk Collect" part).
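Here is a hedged sketch of that pattern, not the gist's actual code: the function name bulk_insert_ints, the target table demo_target(id int), and the WARNING-based "error logging" are invented for illustration. The caller does the BULK COLLECT (for example via array_agg) and passes an int4[]; the C function plays the FORALL side, inserting each element through SPI and trapping per-row failures in a subtransaction so one bad row does not abort the whole batch.

/*
 * Hedged sketch only: bulk_insert_ints, demo_target(id int) and the
 * WARNING-based error logging are invented for illustration.
 */
#include "postgres.h"
#include "fmgr.h"
#include "access/xact.h"
#include "catalog/pg_type.h"
#include "executor/spi.h"
#include "utils/array.h"
#include "utils/resowner.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(bulk_insert_ints);

Datum
bulk_insert_ints(PG_FUNCTION_ARGS)
{
    ArrayType  *arr = PG_GETARG_ARRAYTYPE_P(0);
    Datum      *elems;
    bool       *elemnulls;
    int         nelems;
    int         ok = 0;

    /* unpack the int4[] into C arrays of Datums and null flags */
    deconstruct_array(arr, INT4OID, sizeof(int32), true, 'i',
                      &elems, &elemnulls, &nelems);

    SPI_connect();

    for (int i = 0; i < nelems; i++)
    {
        MemoryContext oldcontext = CurrentMemoryContext;
        ResourceOwner oldowner = CurrentResourceOwner;
        Oid         argtypes[1] = {INT4OID};
        Datum       values[1] = {elems[i]};
        char        argnulls[1] = {elemnulls[i] ? 'n' : ' '};

        /*
         * One subtransaction per row: the C analogue of a PL/pgSQL
         * BEGIN ... EXCEPTION block.  Batching many rows per
         * subtransaction is the usual optimisation once this works.
         */
        BeginInternalSubTransaction(NULL);
        MemoryContextSwitchTo(oldcontext);

        PG_TRY();
        {
            SPI_execute_with_args("insert into demo_target(id) values ($1)",
                                  1, argtypes, values, argnulls,
                                  false, 0);
            ReleaseCurrentSubTransaction();
            MemoryContextSwitchTo(oldcontext);
            CurrentResourceOwner = oldowner;
            ok++;
        }
        PG_CATCH();
        {
            ErrorData  *edata;

            MemoryContextSwitchTo(oldcontext);
            edata = CopyErrorData();
            FlushErrorState();
            RollbackAndReleaseCurrentSubTransaction();
            MemoryContextSwitchTo(oldcontext);
            CurrentResourceOwner = oldowner;

            /* "error logging": a real extension might insert into an error table */
            ereport(WARNING,
                    (errmsg("row %d failed: %s", i, edata->message)));
            FreeErrorData(edata);
        }
        PG_END_TRY();
    }

    SPI_finish();
    PG_RETURN_INT32(ok);
}

It would be exposed to SQL with something along the lines of CREATE FUNCTION bulk_insert_ints(int4[]) RETURNS int AS 'MODULE_PATHNAME', 'bulk_insert_ints' LANGUAGE C STRICT. A single call is still serial; concurrency over millions of rows would come from running several such batches in parallel sessions, and one subtransaction per row gets expensive at that scale, so batching rows per subtransaction is the natural next step.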
@cabecada
cabecada / gist:d22224010648ca94459a317fbd0836b7
Last active December 21, 2025 20:04
test sharding using murmur3
postgres@ubuntu:/mnt/VHD/consistent$ cat pg_consistent.c
#include "postgres.h"
#include "fmgr.h"
#include "utils/builtins.h"
PG_MODULE_MAGIC;
// --- MurmurHash3 Engine ---
static uint32_t murmur3_32(const char *key, int len, uint32_t seed) {
uint32_t h1 = seed;
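A self-contained sketch of how a murmur3 hash can drive sharding (the standalone program, the sample keys, and the plain-modulo bucket choice are illustrative assumptions; the gist's pg_consistent.c is a PostgreSQL extension and may map hashes onto a consistent-hash ring instead):

/*
 * Standalone sketch: MurmurHash3 x86_32 plus a plain-modulo shard picker.
 * The shard count and sample keys are invented.
 */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

static uint32_t
rotl32(uint32_t x, int8_t r)
{
    return (x << r) | (x >> (32 - r));
}

/* MurmurHash3 x86_32, the same engine the gist starts with */
static uint32_t
murmur3_32(const char *key, int len, uint32_t seed)
{
    const uint8_t *data = (const uint8_t *) key;
    const int      nblocks = len / 4;
    uint32_t       h1 = seed;
    const uint32_t c1 = 0xcc9e2d51;
    const uint32_t c2 = 0x1b873593;

    for (int i = 0; i < nblocks; i++)
    {
        uint32_t k1;

        memcpy(&k1, data + i * 4, 4);   /* little-endian host assumed */
        k1 *= c1;
        k1 = rotl32(k1, 15);
        k1 *= c2;

        h1 ^= k1;
        h1 = rotl32(h1, 13);
        h1 = h1 * 5 + 0xe6546b64;
    }

    const uint8_t *tail = data + nblocks * 4;
    uint32_t       k1 = 0;

    switch (len & 3)
    {
        case 3: k1 ^= tail[2] << 16;    /* fall through */
        case 2: k1 ^= tail[1] << 8;     /* fall through */
        case 1: k1 ^= tail[0];
                k1 *= c1;
                k1 = rotl32(k1, 15);
                k1 *= c2;
                h1 ^= k1;
    }

    /* finalisation mix */
    h1 ^= (uint32_t) len;
    h1 ^= h1 >> 16;
    h1 *= 0x85ebca6b;
    h1 ^= h1 >> 13;
    h1 *= 0xc2b2ae35;
    h1 ^= h1 >> 16;
    return h1;
}

int
main(void)
{
    const char *keys[] = {"user:1", "user:2", "user:3", "order:42"};
    const int   nshards = 4;

    for (int i = 0; i < 4; i++)
    {
        uint32_t h = murmur3_32(keys[i], (int) strlen(keys[i]), 0);

        printf("%-9s -> hash %10u -> shard %u\n", keys[i], h, h % nshards);
    }
    return 0;
}

Plain modulo reshuffles most keys when nshards changes; hashing shards onto a ring (or using jump consistent hashing) over the same murmur3 output is what keeps the sharding "consistent".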
@cabecada
cabecada / gist:79a8d64c8c52164c510b66201304bfa0
Created December 16, 2025 06:32
triggering a wraparound by forcing some page corruption of table on disk, making autovacuum fail
This is just for fun.
We corrupt a page of a table that is heavily bloated and that autovacuum has not yet processed. The bloat is enough to breach the autovacuum threshold, so autovacuum runs on the bloated table, hits the corrupted page, and fails silently. After sleeping it finds the same table breaching the threshold again, runs again, and fails again. This goes on forever: monitoring shows no new bloat, and the table does not have enough bloat to trouble other queries (maybe it is a cold table). But the transaction ID age keeps growing, and we eventually see the txid exhaustion limit approaching. That triggers a forced (anti-wraparound) autovacuum, which also fails, so the age keeps growing.
postgres@ubuntu:/mnt/VHD$ pg_ctl -D db1 -l logfile start
waiting for server to start.... done
server started
@cabecada
cabecada / gist:efe116c63ebab2b57fe362ae1941e5ad
Created December 9, 2025 09:01
postgresql data recovery utility
testing the utility on https://github.com/wublabdubdub/PDU-PostgresqlDataUnloader
--setup archiving on disk for now
postgres@ubuntu:/tmp$ grep '^archive_command' db1/postgresql.conf
archive_command = 'test ! -f /tmp/archive/%f && cp %p /tmp/archive/%f' # command to use to archive a WAL file
postgres@ubuntu:/tmp$ pg_ctl -D db1 -l logfile start
waiting for server to start.... done
server started
--create a simple table with 100000 records
https://medium.com/@pranavt84/postgresql-page-structure-a-deep-dive-e82094a613de
seek= how far you jump ahead in the output file (in blocks)
skip= how far you jump ahead in the input file (in blocks)
count= how many blocks you copy (the block size itself is set via bs=)
say you have two 16-byte files like so:
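To make those three flags concrete, here is a minimal C re-implementation of just dd's bs/skip/seek/count behaviour (the argument order and file handling are invented for illustration); like dd conv=notrunc, it does not truncate the output, so bytes outside the written range are left alone:

/*
 * mini-dd sketch: skip moves the read offset in the input, seek moves the
 * write offset in the output, count limits how many bs-byte blocks move.
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
    if (argc != 7)
    {
        fprintf(stderr, "usage: %s <in> <out> <bs> <skip> <seek> <count>\n", argv[0]);
        return 1;
    }

    long    bs = atol(argv[3]);
    long    skip = atol(argv[4]);
    long    seek = atol(argv[5]);
    long    count = atol(argv[6]);
    char   *buf = malloc(bs);

    int in = open(argv[1], O_RDONLY);
    int out = open(argv[2], O_WRONLY | O_CREAT, 0644);  /* no O_TRUNC */

    if (in < 0 || out < 0 || buf == NULL)
    {
        perror("setup");
        return 1;
    }

    lseek(in, skip * bs, SEEK_SET);     /* skip= : jump ahead in the input  */
    lseek(out, seek * bs, SEEK_SET);    /* seek= : jump ahead in the output */

    for (long i = 0; i < count; i++)    /* count= : number of bs-byte blocks */
    {
        ssize_t n = read(in, buf, bs);

        if (n <= 0)
            break;
        write(out, buf, n);
    }

    free(buf);
    close(in);
    close(out);
    return 0;
}

For example, bs=16 skip=1 seek=0 count=1 copies the second 16-byte block of the input over the first 16 bytes of the output.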
@cabecada
cabecada / gist:c166d638808e725c1420cc93476091df
Created June 18, 2025 10:09
check which alter does a rewrite
postgres=# set client_min_messages TO debug1;
SET
postgres=# create table t(col1 int primary key);
DEBUG: CREATE TABLE / PRIMARY KEY will create implicit index "t_pkey" for table "t"
DEBUG: building index "t_pkey" on table "t" serially
sudo apt-get -y -q install libipc-run-perl lcov build-essential libreadline-dev zlib1g-dev flex bison libxml2-dev libxslt-dev libssl-dev libxml2-utils xsltproc ccache pkg-config libicu-dev
sudo mkdir /opt/postgresql/17
sudo chown -R postgres:postgres /opt/postgresql/17
cd postgres
./configure --prefix=/opt/postgresql/17 --enable-debug --enable-cassert --enable-tap-tests --enable-coverage CFLAGS="-ggdb3 -O0"
make -j4 install
https://jumpcloud.com/blog/how-to-upgrade-ubuntu-20-04-to-ubuntu-22-04
https://askubuntu.com/questions/1098480/attempting-to-upgrade-from-ubuntu-server-16-04-to-18
https://gitlab.com/gitlab-com/gl-infra/production-engineering/-/issues/13273
https://aws.amazon.com/blogs/database/manage-collation-changes-in-postgresql-on-amazon-aurora-and-amazon-rds/
https://github.com/ardentperf/glibc-unicode-sorting
upgrading glibc from 2.27 to 2.31 changes collation sort order, which can corrupt existing indexes