TimelineFollowing97

From PostgreSQL wiki

This page maps out timeline following across the various timeline-aware code paths, in preparation for consolidating the logic into a single path, or at least into one path for recovery and one for everything else.

Map

The timeline following logic is a bit tangled right now:

[Figure: Timelines.png]

Key:

[Figure: Key.png]

Recovery, logical decoding from the SQL interface and logical decoding from the walsender interface all use the xlogreader. Physical walsender does not.

Logical decoding from the walsender and the physical walsender both use static variables in walsender.c to pass timeline info to their xlog reading functions. Redo uses static variables in xlog.c. Logical decoding from the SQL interface uses variables in the XLogReaderState struct.

Redo uses its own function to read a page from an xlog segment. The walsender has its own, which is used by both the physical and the logical walsender (logical via the xlogreader, physical directly). Logical decoding from SQL has its own separate xlog page reading function too.

Recovery

w StartupXLOG() - sets ThisTimeLineID
  -> ReadRecord(...)              - only checks timeline of result
   -> XLogReadRecord(...)         - doesn't care about timeline
    -> ReadPageInternal(...)      - doesn't care about timeline
r    -> XLogPageRead(...)         - Actual read. Callback, uses curFileTLI global in xlog.c to get TLI
w     -> WaitForWALToBecomeAvailable(...)
w     | -> XLogFileReadAnyTLI(...) - walks candidate TLIs
w     |  -> XLogFileRead(...)      - sets curFileTLI as used in XLogPageRead
w     -> XLogFileRead(...)         - (also called direct from WaitForWALToBecomeAvailable)

Recovery uses the xlogreader. Since the xlogreader is completely timeline-agnostic, recovery passes information about the timeline to read "around" the xlogreader using the curFileTLI global in xlog.c. Timeline following (and advance) logic is done in a number of places.

curFileTLI is set in XLogFileRead(...)
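A minimal C sketch of this "pass it around the reader" pattern, under stated assumptions: SketchReader, sketch_open_segment, sketch_page_read and sketchCurFileTLI are hypothetical stand-ins for the xlogreader, XLogFileRead, XLogPageRead and curFileTLI, and none of the signatures match the real PostgreSQL ones. It only shows that the reader's callback interface has no timeline argument, so the timeline has to travel through a file-level static.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TimeLineID;
typedef uint64_t XLogRecPtr;

/* Hypothetical timeline-agnostic reader: it knows positions, not timelines. */
typedef struct SketchReader SketchReader;
typedef int (*SketchPageReadCB) (SketchReader *reader, XLogRecPtr pageptr);

struct SketchReader
{
    SketchPageReadCB read_page;     /* note: no timeline parameter */
    TimeLineID  last_page_tli;      /* demo only: TLI the last page came from */
};

/* File-level static standing in for curFileTLI (assumption: simplified). */
static TimeLineID sketchCurFileTLI = 0;

/* Stand-in for XLogFileRead: opening a segment records which TLI it is on. */
void
sketch_open_segment(TimeLineID tli)
{
    sketchCurFileTLI = tli;
}

/* Stand-in for XLogPageRead: learns the TLI from the static, not the reader. */
int
sketch_page_read(SketchReader *reader, XLogRecPtr pageptr)
{
    (void) pageptr;                 /* a real callback would read this page */
    reader->last_page_tli = sketchCurFileTLI;
    return 0;
}

/* Stand-in for XLogReadRecord: calls the callback, never mentions a TLI. */
TimeLineID
sketch_read_record(SketchReader *reader, XLogRecPtr ptr)
{
    reader->read_page(reader, ptr);
    return reader->last_page_tli;
}
```

In this sketch the only way to change which timeline the callback reads from is to call sketch_open_segment first, which is the kind of hidden coupling the map is trying to untangle.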

ThisTimeLineID gets set in StartupXLOG(...) and also in CreateRestartPoint(...).

Timeline decisions are made in numerous places throughout StartupXLOG(...) depending on whether it's a master, cascading standby, normal standby, whether it's streaming, etc.

WaitForWALToBecomeAvailable(...) has handling for timeline switches due to an upstream cascading standby promotion or lagging replay of a prior promotion on the master.
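The "walks candidate TLIs" step in the call tree can be pictured roughly as follows. This is a simplified sketch, not the real XLogFileReadAnyTLI: the function name, the probe callback and the demo segment layout are all hypothetical, and the rule of not stepping back to a timeline older than the one currently in use is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TimeLineID;

/* Hypothetical probe: does a segment file exist for this TLI and segment? */
typedef bool (*SegmentExistsFn) (TimeLineID tli, uint64_t segno);

/*
 * Walk the candidate timelines (ordered newest first) and return the first
 * one that has the requested segment, or 0 if none does.
 */
TimeLineID
sketch_read_any_tli(const TimeLineID *candidates, size_t ncandidates,
                    uint64_t segno, TimeLineID cur_tli,
                    SegmentExistsFn exists)
{
    for (size_t i = 0; i < ncandidates; i++)
    {
        TimeLineID  tli = candidates[i];

        /* Assumption: never go back to a TLI older than the current one. */
        if (tli < cur_tli)
            break;

        if (exists(tli, segno))
            return tli;             /* caller would record this as the file TLI */
    }
    return 0;
}

/* Demo layout (made up): TLI 1 has segments 0..5, TLI 2 has segments 5..9. */
bool
sketch_demo_exists(TimeLineID tli, uint64_t segno)
{
    if (tli == 1)
        return segno <= 5;
    if (tli == 2)
        return segno >= 5 && segno <= 9;
    return false;
}
```

With the demo layout, segment 5 exists on both timelines and the newer one wins because the candidates are tried newest first.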

Physical streaming (walsender)

Physical streaming does not use the xlogreader, unlike recovery or logical streaming. It reads xlog segment files directly without extracting records.

w  StartReplication(...)           - determines timeline to read, sets walsender.c globals
rw  -> XLogSendPhysical(...)       - detects if current tli became historic and finds new
rw   -> walsender.c:XLogRead(...)  - reads walsender.c globals, follows switches and sets globals

StartReplication(...) sets the walsender.c static globals sendTimeLineIsHistoric, sendTimeLine, sendTimeLineValidUpto and sendTimeLineNextTLI.

XLogRead(...) finds segment files based on sendTimeLine, switches TLI based on sendTimeLineIsHistoric, sendTimeLineValidUpto and sendTimeLineNextTLI.
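One plausible reading of that switch logic, as a hedged sketch: the struct below bundles the four globals for convenience, SKETCH_SEG_SIZE assumes 16 MB segments, and sketch_file_tli_for_segment is a hypothetical helper, not a function in walsender.c. It assumes the segment that contains the switch point is read from the next timeline's file, and every earlier segment from the historic timeline's file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TimeLineID;
typedef uint64_t XLogRecPtr;

/* Assumption: 16 MB WAL segments. */
#define SKETCH_SEG_SIZE ((uint64_t) 16 * 1024 * 1024)

/* The four walsender.c globals, bundled into a struct for this sketch only. */
typedef struct SketchSendState
{
    TimeLineID  sendTimeLine;
    bool        sendTimeLineIsHistoric;
    XLogRecPtr  sendTimeLineValidUpto;
    TimeLineID  sendTimeLineNextTLI;
} SketchSendState;

/* Decide which timeline's segment file to open for segment number segno. */
TimeLineID
sketch_file_tli_for_segment(const SketchSendState *s, uint64_t segno)
{
    TimeLineID  file_tli = s->sendTimeLine;

    if (s->sendTimeLineIsHistoric)
    {
        /* Segment in which the historic timeline ends. */
        uint64_t    end_segno = s->sendTimeLineValidUpto / SKETCH_SEG_SIZE;

        /* Assumption: the switch segment is read from the next TLI's file. */
        if (segno == end_segno)
            file_tli = s->sendTimeLineNextTLI;
    }
    return file_tli;
}
```

For example, with sendTimeLine = 1, a switch point inside segment 5 and sendTimeLineNextTLI = 2, segment 4 would be read from timeline 1 and segment 5 from timeline 2. If the timeline is not historic, the sketch always returns sendTimeLine.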

(CreateReplicationSlot(...) also sets sendTimeLineIsHistoric and sendTimeLine).

Logical streaming (walsender)

Logical replication in the walsender.

StartLogicalReplication(...)
 -> WalSndLoop(...)
  -> XLogSendLogical(...)
   -> XLogReadRecord(...)
    -> ReadPageInternal(...)
rw   -> logical_read_xlog_page(...)
rw    -> XLogReadDetermineTimeline(...)
rw    -> walsender.c:XLogRead(...)

CreateReplicationSlot(...)
 -> DecodingContextFindStartpoint(...)
  -> XLogReadRecord(...)
   -> ReadPageInternal(...)
rw  -> logical_read_xlog_page(...)
rw   -> XLogReadDetermineTimeline(...)
rw   -> walsender.c:XLogRead(...)

Both paths now (in proposed patch) use XLogReadDetermineTimeline(...) in logical_read_xlog_page to handle timeline following, copying the timeline tracking state from where it's updated in XLogReaderState to the walsender.c globals used by XLogRead to determine the timeline to read from.

Both XLogRead and XLogReadDetermineTimeline do timeline following logic, since XLogRead has logic to switch timeline early and sets the relevant walsender globals. At the moment XLogReadDetermineTimeline just overrides that each time since it also thinks it is responsible for switching TLI at a segment boundary.
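The copy step described above might look roughly like this. It is a sketch under assumptions: the SketchReaderState field names (currTLI, currTLIValidUntil, nextTLI) and the rule for deciding whether the timeline is historic are illustrative guesses, and the sketch* statics merely stand in for the walsender.c globals.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TimeLineID;
typedef uint64_t XLogRecPtr;

/* Hypothetical timeline-tracking fields kept in the reader state. */
typedef struct SketchReaderState
{
    TimeLineID  currTLI;            /* timeline to read the current page from */
    XLogRecPtr  currTLIValidUntil;  /* where currTLI ends, 0 if still current */
    TimeLineID  nextTLI;            /* timeline to switch to after that point */
} SketchReaderState;

/* Stand-ins for the walsender.c static globals. */
static TimeLineID sketchSendTimeLine = 0;
static bool sketchSendTimeLineIsHistoric = false;
static XLogRecPtr sketchSendTimeLineValidUpto = 0;
static TimeLineID sketchSendTimeLineNextTLI = 0;

/*
 * Copy the reader's timeline state into the globals that the page-reading
 * function consults. this_tli is the server's current timeline (assumption:
 * any other timeline counts as historic).
 */
void
sketch_copy_timeline_state(const SketchReaderState *state, TimeLineID this_tli)
{
    sketchSendTimeLineIsHistoric = (state->currTLI != this_tli);
    sketchSendTimeLine = state->currTLI;
    sketchSendTimeLineValidUpto = state->currTLIValidUntil;
    sketchSendTimeLineNextTLI = state->nextTLI;
}
```

Because the copy runs on every page read, anything the page-reading function wrote into those globals on the previous call is overwritten, which matches the "just overrides that each time" behaviour noted above.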

Logical streaming (SQL interface)

[all _get_ funcs]
-> pg_logical_slot_get_changes_guts(...)
 -> XLogReadRecord(...)
  -> ReadPageInternal(...)
   -> logical_read_local_xlog_page(...)
    -> read_local_xlog_page(...)
     -> XLogReadDetermineTimeline(...)
     -> xlogutils.c:XLogRead(...)

pg_create_logical_replication_slot(...)
 -> DecodingContextFindStartpoint(...)
  -> XLogReadRecord(...)
   -> ReadPageInternal(...)
    -> logical_read_local_xlog_page(...)
     -> read_local_xlog_page(...)
      -> XLogReadDetermineTimeline(...)
      -> xlogutils.c:XLogRead(...)

All the _get_ functions call pg_logical_slot_get_changes_guts.

The SQL interface to logical decoding uses a different method to tell the actual xlog page read function which timeline to read, since it doesn't live in the walsender and can't use the walsender static globals. The timeline is fetched from the XLogReaderState, where it's set by XLogReadDetermineTimeline.
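For contrast with the walsender path, here is a small sketch of carrying the timeline in the reader state and handing it to the read function as an explicit argument. All names are hypothetical (SketchReaderState, sketch_xlog_read, sketch_read_local_page), and the "read" only stamps the timeline into the buffer so the data flow is visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TimeLineID;
typedef uint64_t XLogRecPtr;

/* Hypothetical reader state carrying the timeline to read from. */
typedef struct SketchReaderState
{
    TimeLineID  currTLI;    /* assumed to be set by a determine-timeline step */
} SketchReaderState;

/* Demo read function: takes the TLI as an argument instead of using globals. */
int
sketch_xlog_read(char *buf, TimeLineID tli, XLogRecPtr startptr, size_t count)
{
    (void) startptr;            /* a real reader would seek to this position */
    if (count > 0)
        buf[0] = (char) tli;    /* demo only: stamp the TLI into the page */
    return (int) count;
}

/* Page-read callback: pulls the TLI out of the state and passes it along. */
int
sketch_read_local_page(SketchReaderState *state, XLogRecPtr pageptr,
                       char *buf, size_t len)
{
    TimeLineID  tli = state->currTLI;

    return sketch_xlog_read(buf, tli, pageptr, len);
}
```

No file-level state is involved here, so two readers with different currTLI values could coexist in one backend in this sketch.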

Note that XLogRead(...) in xlogutils.c is not the same function as XLogRead(...) in walsender.c: they are two static functions with identical names. Both read a page from an xlog segment, but in different ways, with different methods of waiting for new data, different inputs for what to read from which TLI, etc.