Londiste Tutorial (Skytools 2)

From PostgreSQL wiki
Jump to navigationJump to search



Londiste is Skytools' asynchronous primary/secondary replication system, built atop PGQ.


This tutorial refers to skytools 2.x. The tools and commands have changed in Skytools 3.x.


If you just want to get a test installation running, section 1. Installation and setup is all you have to read. The following parts give you tips on how to solve common tasks, troubleshoot your installation, and give you some more insights on how it works. Maybe the later sections should get moved on to a dedicated wiki page.

Configure, Build and Install

Note: Skytools is a Python-based application, and requires that the Python-Postgres driver 'psycopg2' be installed. You can find more information and download at the main psycopg2 website: Psycopg2. Keep in mind that versions 2.4.2 and 2.4.3 of psycopg2 have known problems with data loss, and should not be used with Skytools or Londiste.

Once you have downloaded the release, and uncompressed it on the local system, you'll need to configure, build and install the required components:

 > tar zxf skytools-2.x.x.tar.gz
 > cd skytools-2.x.x

The two important options for the 'configure' script are --prefix and --with-pgconfig. These options are used to select the the installation directory with the skytools executables will be located ('prefix') and which installed version of postgres the tools should be configured to work with ('with-pgconfig'). In this example, we'll install into /opt/skytools-2.x.x against a postgres installation in /opt/postgres-9.0.1/.


 > ./configure --prefix=/opt/skytools-2.x.x --with-pgconfig=/opt/postgres-9.0.1/bin/pg_config

Once the configure steps completes, you can then build and install:

 > make
 > sudo make install

Next, you'll need to also build and install the python portions of the tools:

 > python setup.py build
 > sudo python setup.py install

Once you've completed these two steps, all of the skytools/londiste components (both python and sql) should be installed in the correct locations).

Setup

Let's consider the following simple situation:

  • one provider, source database is P
  • one subscriber, destination database is S
  • a couple of tables to replicate, T1 and T2, and a sequence S1, all in schema public

The ticker daemon

Londiste needs a ticker which has to target the P(rovider) database, and can be run from another machine. The common usage is to run the ticker directly on the Provider database host.

Any ticker can host as many queues as you want. Each queue has a unique name and can be used by as many subscribers as needed.

If you have several copies of the same database, you can subscribe to the same queue from several subscribers. If you want to have different subsets of the same source database on several subscribers, you either can have those use the same queue but only a part of the tables in it, or have a queue per set of tables.

ticker.ini

You have to configure the ticker in order for it to reference the Provider database, where you must install PGQ support code (C and PL/pgSQL, and if running a pre-8.3 server version, the txid contrib module)

 [pgqadm]
 job_name = myticker_name
 db = dbname=P
 
 # how often to run maintenance [seconds]
 maint_delay = 600
 
 # how often to check for activity [seconds]
 loop_delay = 0.1
 logfile = ~/log/%(job_name)s.log
 pidfile = ~/pid/%(job_name)s.pid

The job_name is only there for reference in the logfile and pidfile names. It bears no relation to queues hosted by the ticker.

Installing the ticker and producing ticks

Now that your configuration is ready, you need to install support code in your database and fire up the ticker daemon. The PGQ package has to be installed on the machine hosting the database in order for the PostgreSQL module (.so or .dll) to be present, and on the machine hosting the daemon, which will use local SQL files to install the addon.

 pgqadm.py ticker.ini install
 pgqadm.py ticker.ini ticker -d

Now you have a ticker running, you can check its logs, which will get seldom updated while there's no activity. PgQ defaults to issuing a tick only each minute when no events get produced, which is still the case.

The replication daemon

You need one replication daemon running on each subscriber. You could host the daemon itself on a separate machine, but it will have to connect to the subscriber database, and the londiste (skytools) package will have to be installed at both the daemon hosting machine and the database hosting machine. It's usually simplest to run the subscriber daemon on the subscriber database host.

p-to-s.ini

First, you need a configuration file to tell londiste how to find the provider and subscriber databases, and where to log files etc. A good practice is to create a dedicated system user (if running a Unix variant) and to put the daemon logfile in e.g. /var/log/londiste and the pidfile in /var/run/londiste. This is an easy way to solve the permissions problem.

 [londiste]
 job_name = test_to_subcriber
 
 provider_db = dbname=P port=6000 host=127.0.0.1
 subscriber_db = dbname=S port=6000 host=127.0.0.1
 
 # it will be used as sql ident so no dots/spaces
 pgq_queue_name = testing
 
 logfile = /var/log/londiste/%(job_name)s.log
 pidfile = /var/run/londiste/%(job_name)s.pid

The job_name is whatever you want, it's only used for naming the files and as the consumer name if using the SQL PgQ API. The pgq_queue_name is the name of the pgqadm.py queue that'll get created automatically for you on the install step.

Installing londiste

Now, using your new configuration file, installing is that easy:

 londiste.py p-to-s.ini provider install
 londiste.py p-to-s.ini subscriber install

As you see, you can install the londiste software on the provider directly from the subscriber. This requires that you install the whole skytools package on both provider and subscriber.

Launching replication

Now you're ready to replicate events from the provider to the subscriber!

 londiste.py p-to-s.ini replay -d

Repeat those subscriber steps on as many machines as you want to, each time pointing to a different subscriber database, and you have single primary multiple secondary databases.

It's important to launch the replay process before adding tables because if you produce events before you register a consumer, you won't ever get them.

Adding tables and sequences

Do this step only once, whatever the number of subscribers:

 londiste.py p-to-s.ini provider add public.T1 public.T2
 londiste.py p-to-s.ini provider add-seq public.S1

Then you add the same elements to the subscribers:

 londiste.py p-to-s.ini subscriber add public.T1 public.T2
 londiste.py p-to-s.ini subscriber add-seq public.S1

If you follow the londiste logs (tail -f), you'll see all the steps londiste go through in order to add your tables. You'll have to have created the tables on the subscriber, which allows you to use different indexes on the subscriber from the provider. If the subscriber schema has foreign key references, you will obviously need to create the referenced tables, and to add them to the replication set. Londiste will take care of removing the foreign key constraints while doing the initial COPY and re-adding them just after.

Common tasks

Adding all provider tables to subscriber

is as easy as this command:

 londiste.py <ini> provider tables | xargs londiste.py <ini> subscriber add

Checking subscribers state

This query runs on the provider node, and gives some information for each queue and consumer.

 SELECT queue_name, consumer_name, lag, last_seen
   FROM pgq.get_consumer_info();

The lag column is showing this same lag we were talking at the beginning of this document, the last_seen column shows the timestamptz of the latter consumer request. The value of this column should never be more than 60s (and some, it's quick but not instantaneous) in the default configuration.

Dropping all queue events from a former consumer

When you're playing with londiste you can easily be in a situation were you want to clear up all your installation in order to start again cleanly. Or it could be that a former PGQ consumer has been deprecated in your architecture, and you want PGQ to forget all about it.

For PGQ to stop accumulating events for a never to come back consumer, use the following API:

 SELECT pgq.unregister_consumer('queue_name', 'consumer_name');

Or use the pgqadm.py tool:

 $ pgqadm.py <ticker.ini> unregister queue_name consumer_name

Adding a column to a replicated table

This case is handled in a simple process:

  1. add the column on all the subscribers
  2. BEGIN; -- on the provider
  3. add the column on the provider
  4. SELECT londiste.provider_refresh_trigger('queue_name', 'tablename');
  5. COMMIT;

Removing a column to a replicated table

  1. drop column from provider and change trigger in same transaction
  2. look at the lag when londiste has passed the moment of the drop
  3. drop column on subscribers

The trick here is to drop the column on the subscribers only when there's no more event in the queue referencing it.

Adding custom triggers on subscriber side

By default, londiste will consider that the triggers which exist on the subscriber tables are there because you just restore the provider schema there, being as lazy as possible. If you intend to run custom triggers on the subscriber, you have to tell londiste about them as follows:

  1. create the custom trigger on the subscriber
  2. londiste.py p-to-s.ini subscriber add public.T1
  3. londiste.py p-to-s.ini subscriber restore-triggers public.T1

CAUTION: When londiste stops, it will remove the trigger again.

You can even give the restore trigger a specific trigger name, and only this one will get reinstalled.

Federated database

What if you want to have a central database containing a live copy of the data provided by more than one primary?

This is quite easy to do when you realize that PostgreSQL and londiste both refer to tables by their schema-qualified name internally. Ok, that's not exactly that in the case of PostgreSQL, but it's akin to that.

The solution is then to create a globaly unique local schema on each provider, and to create all those schemas on the central subscriber. You then independantly replicate provider1.table and provider2.table to tables of the same schema-qualified name on the central server.

You can create a federation.table table on the central server, and to either use inheritance or a subscriber trigger to move data to the main table. Table inheritance allows the data to stay on the table londiste puts them, while the subscriber trigger allows you to have a different partitioning scheme (by date range rather than server of origin) or to have a statistics (or costs) computing trigger on the central server.

Federating data by trigger

When using a subscriber trigger, mind the londiste.py handling of triggers and the following command:

 $ londiste.py <ini> subscriber restore-triggers <schema.table>

Globally unique serial amongst more than one provider?

In case your primary key is a serial, you will want to avoid collisions in the central database. There are two ways to do that, first is to segment a bigint range amongst servers by having the sequences start at the right value rather than all starting at 1.

A better way is to turn your primary key into a two-column key, adding a server reference (globally unique) in the PK. Then it'll be easy to know where each federated line of data comes from, using either the inheritance trick or the federation trigger trick (which moves data).

Replicating a partitioned table

As the partitioning is done with inheritance in PostgreSQL, that usually means the main table has a trigger to redirect modifications to a target child. Londiste being a trigger based replication system, how to mix and match partitioning and replication?

Either use a partitioning replication subscriber from Skytools (see e.g. cube dispatcher) or simply replicate the partitions themselves, with a cron entry to add next partition each month e.g.

Trouble Shooting

This section aims at proposing common solution to common problems, and could get to its own page if we add to much to it...

Londiste is eating all my CPU and lag is raising

This happens for example if you had a nightly glitch and the on-call sysadm forgot to restart the ticker. Or when you did a big UPDATE or DELETE in a single transaction, but now realize each event from the same transaction has to get processed in the same transaction at the subscriber site...

The following query allows you to count how many events that represents, with the magic tick numbers coming from pgq.subscription in columns sub_last_tick and sub_next_tick.

 SELECT count(*)
   FROM pgq.event_1,
     (SELECT tick_snapshot
        FROM pgq.tick
       WHERE tick_id BETWEEN 5715138 AND 5715139
     ) as t(snapshots)
 WHERE txid_visible_in_snapshot(ev_txid, snapshots);

In our case, this was more than 5 millions and 400 thousands of events. With this many events to care about, if you start londiste, it'll eat as many memory as needed to have them all around, which might be more that what your system is able to give it. So you want a way to tell londiste not to load all events at once. Here's how: add the following knob to your .ini configuration file before to restart the londiste daemon:

   pgq_lazy_fetch = 500

Now, londiste will lazyly fetch 500 events at once or less, even if a single batch (which contains all events between two ticks) contains a huge number of events. This number seems a good choice as it's the default PGQ setting of number of events in a single batch. This number is only outgrown when the ticker is not running or when you're producing more events than that in a single transaction.

Detecting problems keeping tables in sync

If you suspect that some of your tables are not being kept in sync, below are two commands you can run (even from cron) to keep an eye on the synchronization status of your tables:

To perform a compute- and space-intensive row-by-row comparison, which will detect ANY discrepancies between tables in the 'public' schema, you can run the following command (substituting the proper schema name, londiste.py and config.ini file pathnames):

rm -rf /home/tmp/repair && mkdir -p /home/tmp/repair && cd /home/tmp/repair && londiste.py config.ini subscriber tables |
perl -nle 'if ( m/^(public\S+)/ ) { print $1; }' | 
xargs --replace=XX  londiste.py  config.ini repair XX 2>&1

Once the command has executed, the outputs will be located in the /home/tmp/repair directory. Any tables with mismatched contents will also have a 'repair' script located in the /home/tmp/repair directory. You can run that script on the subscriber table to bring it into sync with the primary.

A lighter-weight version of this uses the 'compare' command instead of the 'repair' command to look for mismatching row counts in each table, and only generates output when a problems is found:

londiste.py config.ini subscriber tables | perl -nle 'if ( m/^(public\S+)/ ) { print $1; }' | 
xargs --replace=XX  londiste.py config.ini compare XX 2>&1 | 
/usr/bin/grep "do not match"

Concepts

In londiste terms, the primary is called a provider, and its secondaries are subscribers, and the data the subscribers are polling from the queue are events to replay.

For the provider to be able to group events in batches, it'll need to have a reference ticker, provided by pgqadm.py.

Glossary

Taken from the PGQ documentation

Event
atomic piece of data created by Producers. In PgQ event is one record in one of tables that services that queue. Event record contains some system fields for PgQ and several data fileds filled by Producers. PgQ is neither checking nor enforcing event type. Event type is someting that consumer and produser must agree on. PgQ guarantees that each event is seen at least once but it is up to consumer to make sure that event is processed no more than once if that is needed.
Batch
PgQ is designed for efficiency and high throughput so events are grouped into batches for bulk processing. Creating these batches is one of main tasks of PgQadm and there are several parameters for each queue that can be use to tune size and frequency of batches. Consumerss receive events in these batches and depending on business requirements process events separately or also in batches.
Queue
Event are stored in queue tables i.e queues. Several producers can write into same queeu and several consumers can read from the queue. Events are kept in queue until all the consumers have seen them. We use table rotation to decrease hard disk io. Queue can contain any number of event types it is up to Producer and Consumer to agree on what types of events are passed and how they are encoded For example Londiste producer side can produce events for more tables tan consumer side needs so consumer subscribes only to those tables it needs and events for other tables are ignores.
Producer
applicatione that pushes event into queue. Prodecer can be written in any langaage that is able to run stored procedures in Postgres.
Consumer
application that reads events from queue. Consumers can be written in any language that can interact with Postgres. SkyTools package contains several useful consumers written in Python that can be used as they are or as good starting points to write more complex consumers.

Primary Secondary replication

Londiste is a single primary to several secondary solutions. That means any given table has to see its data flow from its primary to its secondaries.

Now it's possible to have, in the same PostgreSQL database, some provider tables, from which the data are flowing, and some subscriber tables, to which some other data are flowing.

Single primary only means than any one given table on a given host can not be considered at the same time as providing data and as subscribing to remote data. That is multi-primary replication, and another topic entirely.

Asynchronous replication

That async thing means you don't have any guarantee at commit time on the provider (the primary) that the changeset you made is replayed already on the secondary.

  • The drawback is that cases exist where the provider has happily commited a change that the secondary finds itself unable to apply (no space left on device is the first example coming to my mind).
  • The advantage is that in case of network failure, disk shortage, or some other kinf of outage, the provider see very little impact: its queues are growing more than usual, that's it.

Depending on your needs, you'll be more interrested into the advantage or the drawback.

The ticker

The ticker is an independant daemon facility which is responsible for generating ticks. Those ticks are produced on-demand, each time a consumer is asking for a new batch, and will get produced such as any batch contains either ticker_max_lag seconds worth of events or ticker_max_count events, whichever comes first.

Now, londiste is the replicating daemon, and in fact is just another kind of PGQ subscriber. So you're running a londiste.py daemon for each subscriber. This londiste daemon is configured to get events from the pgq_queue_name queue, which can be used on as many subscribers as you want to. On the other hand, each subscriber must have a unique job_name.

The queue daemon, pgqadm.py can be hosted on any machine, it'll often be the provider host or a third machine, independant of the replication databases. A single ticker instance can host as many queues as you want it to, but in the case of replication with londiste.py (remember, you could be using your own subscriber code atop PGQ), the queue will need to be installed on the provider database.