In this article, we describe a recent enhancement to
pg_dump that Swarm64 contributed that more fully supports the backup and restoration of foreign tables. The contribution has been committed to PostgreSQL 13.
The PostgreSQL foreign-data wrapper (FDW)
The PostgreSQL foreign-data-wrapper interface is an extensibility feature that allows PostgreSQL to integrate data from another data source, which might be another PostgreSQL instance, an Oracle server, a CSV file, or more. Queries against a foreign table access the external data source and return results as if they had been accessed from a native PostgreSQL table.
The Swarm64 DA PostgreSQL accelerator extension speeds up PostgreSQL query performance in a number of ways, including the use of a compressed, columnar storage format that we store in a foreign table.
Foreign table backup and restore prior to PostgreSQL 13
Foreign tables extend PostgreSQL in many useful ways. Unfortunately, prior to PostgreSQL 13,
pg_dump does not fully support foreign tables;
pg_dump correctly dumps the DDL commands needed to define the foreign table again, but no rows will ever be dumped. If you restore that backup, the foreign tables will have been defined again, but there won’t be any data.
Foreign table backup and restore in PostgreSQL 13
In order to make database administration easier for our customers, as of Swarm64 DA 3.1.0, and PostgreSQL 13, we developed a patched version of
pg_dump that allows you to dump the content of a foreign table using the
--include-foreign-data option. This option instructs
pg_dump to perform a
COPY (SELECT * FROM foreign_table) TO on the tables that use the foreign server provided by the option.
For example, the following database
test_db has a Swarm64 foreign table
CREATE EXTENSION swarm64da; CREATE FOREIGN TABLE foo(a int) server swarm64da_server; INSERT INTO foo SELECT * FROM generate_series(1,10);
You can create the backup
dump.txt file using the new option:
$ pg_dump -d test_db --include-foreign-data=swarm64da_server > dump.txt
The backup contains the necessary statements to recreate the table and insert the data that is in table
foo, as seen in the following excerpt of
-- -- Name: foo; Type: FOREIGN TABLE; Schema: public; Owner: postgres -- CREATE FOREIGN TABLE public.foo ( a integer ) SERVER swarm64da_server; ALTER FOREIGN TABLE public.foo OWNER TO postgres; -- -- Data for Name: foo; Type: TABLE DATA; Schema: public; Owner: postgres -- COPY public.foo (a) FROM stdin; 1 2 3 4 5 6 7 8 9 10 \. -- -- PostgreSQL database dump complete --
To restore the dump in a new database named
new_db, apply the dump using the
$ psql new_db < dump.txt
If your setup has a different foreign-data wrapper installed, you can issue the option multiple times to selectively dump the content of tables backed by different foreign servers:
$ pg_dump -d test_db --include-foreign-data=swarm64da_server --include-foreign-data=other_fdw_server
Some FDWs are read-only, such as the
file_fdw provided by PostgreSQL. Do not use
--include-foreign-data with a server that is read-only because it makes your backup unusable. To help avoid this situation, the
--include-foreign-data option forces you to explicitly define the foreign servers that should be used. Unfortunately, this is not automatically preventable because
pg_dump has no way of knowing if a foreign-data wrapper is read-only.
--include-foreign-data option for
pg_dump is available as of Swarm64 DA 3.1.0 and PostgreSQL 13. If you are using previous versions of PostgreSQL, you can apply the patch, which for PostgreSQL 12 and 11 can be easily adapted.
I’d like to thank the PostgreSQL reviewers and my colleague Ursula Kallio for their contributions.