Parallel Query Execution
From PostgreSQL wiki
Revision as of 00:38, 15 January 2013
Project Goal
- Implement parallel query
- The implementation will use one master process (the current backend) and multiple slave processes, forked from the postmaster in response to a signal the master sends to the postmaster.
Issues
- Shared memory
- a new shared memory context that uses the OSSP mm library
- limitation – so far we do not bother with attaching to shared memory in the EXEC_BACKEND case, so slaves can only be forked from the postmaster
- Slave process
- initialization is almost the same as for a standard backend, except that the username and database are taken from the master process
- limitation – additional PostgreSQL modules loaded in the backend are not reloaded in the slave
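The fork-only limitation above follows from how a forked child inherits its parent's shared mappings. A minimal sketch of that inheritance, assuming plain POSIX `mmap` instead of the OSSP mm library named above; `SharedArea` and `run_shared_fork_demo` are hypothetical names used only for illustration:

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical layout of the shared area; not a PostgreSQL structure. */
typedef struct SharedArea {
    int worker_result;
} SharedArea;

/*
 * Map an anonymous shared region, fork a child that writes into it,
 * and return the value the parent reads back (-1 on failure).
 */
int run_shared_fork_demo(int value)
{
    SharedArea *area = mmap(NULL, sizeof(SharedArea),
                            PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (area == MAP_FAILED)
        return -1;
    area->worker_result = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(area, sizeof(SharedArea));
        return -1;
    }
    if (pid == 0) {
        /* Child: the mapping is inherited, no explicit attach step. */
        area->worker_result = value;
        _exit(0);
    }

    /* Parent: wait for the child, then read what it wrote. */
    int result = -1;
    if (waitpid(pid, NULL, 0) == pid)
        result = area->worker_result;
    munmap(area, sizeof(SharedArea));
    return result;
}
```

A process started with `exec` would not inherit this mapping and would need a separate attach step, which is the part the limitation says is not handled yet.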
ToDo
- parallel sort using multiple processes
- in nodesort, distribute incoming tuples to slaves using a hash function
- implement a producer-consumer structure in shared memory to allow sending data between processes
- implement final merge phase of slave results
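The three steps above can be sketched in a single process. This is only an illustration under stated assumptions: plain `int` keys stand in for tuples, the "workers" run one after another rather than as slave processes, and `NUM_WORKERS`, `pick_worker` and `partitioned_sort` are hypothetical names, not existing PostgreSQL code.

```c
#include <stddef.h>
#include <stdlib.h>

#define NUM_WORKERS 4   /* hypothetical number of slaves */

/* Hypothetical hash that picks the worker receiving a key. */
static unsigned pick_worker(int key)
{
    return ((unsigned) key * 2654435761u) % NUM_WORKERS;
}

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a, y = *(const int *) b;
    return (x > y) - (x < y);
}

/* Sort in[0..n) into out[0..n). Returns 0 on success, -1 on failure. */
int partitioned_sort(const int *in, size_t n, int *out)
{
    int *runs[NUM_WORKERS] = {0};
    size_t len[NUM_WORKERS] = {0};
    size_t pos[NUM_WORKERS] = {0};
    int rc = -1;

    /* Worst case: one worker receives every key (sketch only). */
    for (int w = 0; w < NUM_WORKERS; w++) {
        runs[w] = malloc((n ? n : 1) * sizeof(int));
        if (runs[w] == NULL)
            goto done;
    }

    /* Phase 1: the master distributes keys by hash. */
    for (size_t i = 0; i < n; i++) {
        unsigned w = pick_worker(in[i]);
        runs[w][len[w]++] = in[i];
    }

    /* Phase 2: each worker sorts its own run. */
    for (int w = 0; w < NUM_WORKERS; w++)
        qsort(runs[w], len[w], sizeof(int), cmp_int);

    /* Phase 3: the master merges the sorted runs. */
    for (size_t i = 0; i < n; i++) {
        int best = -1;
        for (int w = 0; w < NUM_WORKERS; w++) {
            if (pos[w] < len[w] &&
                (best < 0 || runs[w][pos[w]] < runs[best][pos[best]]))
                best = w;
        }
        out[i] = runs[best][pos[best]++];
    }
    rc = 0;

done:
    for (int w = 0; w < NUM_WORKERS; w++)
        free(runs[w]);
    return rc;
}
```

Because a hash spreads keys without regard to their order, the runs overlap in value range, which is why a final merge is needed rather than simple concatenation.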
Progress
- 04/09 - PGTHREAD structure which will hold information about locks
- they have to be stored in shared memory as PGPROC
- proc array remains unchanged because thread will not be creating its own transactions
- 04/09 - lwlock.c - lwlocks are granted to threads
- 05/09 - signal handling
- what to use in protecting access to num_held_lwlocks and held_lwlocks[] - pg spinlocks or pthread_mutex_t? - it has to be initialized for each thread, fast, secure
- August–September 2009 - switching to an implementation with processes as a result of Zdenek’s discussion with Tom and Simon
- September 2009 - implementing a new shared memory context based on the multiplatform OSSP mm library
- October 2009 - figuring out the architecture of processes
- November 2009 - implementing a fully initialized slave backend process, created after the master process sends a signal to the postmaster
- December 2009 - distributing tuples to worker processes in nodesort.c to perform the sort in them, with the final merge in the master process
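Passing tuples from the master to a worker process needs some queue in shared memory. A minimal sketch of one possible shape, assuming a single producer, a single consumer, C11 atomics and a POSIX `mmap` region inherited across `fork`; `Ring`, `ring_put`, `ring_get` and `ring_demo` are hypothetical names, and nothing here is taken from the actual patch:

```c
#define _DEFAULT_SOURCE
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define RING_SLOTS 8

typedef struct Ring {
    atomic_size_t head;      /* next slot to write; producer only */
    atomic_size_t tail;      /* next slot to read; consumer only */
    int slots[RING_SLOTS];
    long consumer_sum;       /* result reported by the consumer */
} Ring;

static void ring_put(Ring *r, int v)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    /* Wait while the ring is full. */
    while (head - atomic_load_explicit(&r->tail, memory_order_acquire)
           == RING_SLOTS)
        sched_yield();
    r->slots[head % RING_SLOTS] = v;
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
}

static int ring_get(Ring *r)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    /* Wait while the ring is empty. */
    while (atomic_load_explicit(&r->head, memory_order_acquire) == tail)
        sched_yield();
    int v = r->slots[tail % RING_SLOTS];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return v;
}

/* Master sends 1..count; the worker sums them. Returns the sum or -1. */
long ring_demo(int count)
{
    Ring *r = mmap(NULL, sizeof(Ring), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (r == MAP_FAILED)
        return -1;
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
    r->consumer_sum = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(r, sizeof(Ring));
        return -1;
    }
    if (pid == 0) {
        /* Worker: consume exactly count values, then report the sum. */
        long sum = 0;
        for (int i = 0; i < count; i++)
            sum += ring_get(r);
        r->consumer_sum = sum;
        _exit(0);
    }

    for (int i = 1; i <= count; i++)
        ring_put(r, i);

    long result = -1;
    if (waitpid(pid, NULL, 0) == pid)
        result = r->consumer_sum;
    munmap(r, sizeof(Ring));
    return result;
}
```

The sketch spins with `sched_yield` when the ring is full or empty; a real implementation would more likely block on a semaphore or latch instead of busy-waiting.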
Process vs Thread
- Process +
- Existing code does not need to be rewritten to be thread safe
- Thread +
- No special effort to share data between threads
- Process -
- Slower context switching
- Thread -
- Existing code is not thread safe
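The data-sharing difference between the two models can be seen with an ordinary global variable. A small sketch, assuming POSIX threads and `fork`; `counter`, `counter_after_thread` and `counter_after_fork` are hypothetical names used only for illustration:

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 0;   /* ordinary global, not in shared memory */

static void *bump(void *arg)
{
    (void) arg;
    counter++;
    return NULL;
}

/* A thread increments the global; the caller sees the change. */
int counter_after_thread(void)
{
    pthread_t t;

    counter = 0;
    if (pthread_create(&t, NULL, bump, NULL) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return counter;
}

/* A child process increments its own copy; the parent sees no change. */
int counter_after_fork(void)
{
    counter = 0;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        counter++;        /* modifies only the child's private copy */
        _exit(0);
    }
    if (waitpid(pid, NULL, 0) != pid)
        return -1;
    return counter;
}
```

The thread's write is visible with no extra work, which is also why every such global in existing code becomes a thread-safety concern. The forked process leaves the parent's copy untouched, so any data it must share has to be placed in shared memory explicitly.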