User:Gurjeet

About

See http://Gurje.et

Patch Reviews

Patches that I have reviewed, or am currently reviewing: see Patch Reviews.

Projects

Completed

In Development

  • Named Operators
    The operators in Postgres can get very cryptic; can you guess what <@ means?
    The proposal is to allow any identifier to be used as a name for an operator.
    For example, the named operator :contained_in: could be used to represent the tsquery <@ tsquery operator.
    See this comment on HN for at least one person's desire to identify an operator by name.
    Patch submitted. See discussion here.
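    A hedged sketch of how the proposed spelling might read; the :contained_in: name is illustrative, and the exact grammar is whatever the submitted patch defines.
      -- Today, tsquery containment is written with the symbolic operator:
      SELECT 'cat'::tsquery <@ 'cat & rat'::tsquery;
      -- With named operators, the same test could be spelled more readably:
      SELECT 'cat'::tsquery :contained_in: 'cat & rat'::tsquery;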
  • PG Plan Guarantee
  • Websockets
    See also HTTP API
  • Add support for RISC-V memory-barriers
    Actively in the process of acquiring a SiFive HiFive Unmatched RISC-V development board
    Auction active as of 2022-08-10: see https://www.ebay.com/itm/385053303589
    Won auction, waiting for delivery (2022-08-16)
    Received, unpacked. (2022-08-23)
    Used this video for guidance on which components to buy.
    Ordered a PSU (2022-08-30); other items, such as the SSD and WiFi card, are optional, so will buy those after a boot-up test.
    Received the PSU (2022-09-01). Boot-up test completed. Logged in via serial console, and verified Linux is working. See pictures
    Ordered 1TB SSD. (2022-09-02)
    Ordered quieter CPU fans, and Mini-ITX enclosure. (2022-09-03)
    Received SSD, and CPU fans. (2022-09-05)
    2022-09-06:
    Received Mini-ITX enclosure.
    Assembled all the components into Mini-ITX enclosure; took about 3 hours.
    Booted up again from the flash card, and observed boot messages over the serial-to-USB connection.
    Created partition on the SSD.
    Ordered Intel WiFi card (2022-09-07). See https://a.co/a09xLOq
    Intel WiFi card arrived; installed it and connected to home WiFi (2022-09-09)
    Latest Linux distro to install: Ubuntu. See https://ubuntu.com/download/risc-v
    TODO: Install Ubuntu 22.04.1 on SSD. See https://wiki.ubuntu.com/RISC-V
    TODO: Install Nix package manager (to support various compilers simultaneously)
    TODO: Set up Postgres buildfarm
    TODO: Register as a new animal in the Buildfarm
    TODO: Add RISC-V to supported platforms

Future/Potential/Interested

EC (Eventually Consistent) Tables bring some of the advantages of NoSQL databases to Postgres users. (See the SQL sketch after this list.)
  • Users keep their main table simple, with just a primary key (PKey)
    Hence they get fast responses to SQL commands
  • User creates an EC Table based off a normal table
  • System creates a table identical in structure to the original table
  • System creates a trigger on the EC Table to reject any direct DML commands.
  •  ?? Does the system recreate the partitions, too, if the original table is partitioned?
    No, for now. The intended goal is to keep the main table fast; partitioned main tables are not fast.
  • A Background (BG) process hooks into the WAL stream
  • The BG process replays the changes of the main table on the EC Table.
  • It considers WAL records only from successful transactions
    Hence the 'eventual' in the feature's name.
  • It converts WAL records into DML commands, and applies them to the EC Table
  • User can create additional indexes on the EC Table to speed up commands on that table.
  • Potential problems
    • What to do when a DDL command changes the main table?
  • Advantages
    • The commands and queries that rely on PKey access are the fastest they can be.
    • The commands that can use slightly stale data can use the EC Table, and its indexes
  • Disadvantages
    • At least double the space consumption
      No different than how NoSQL databases deal with it
    • Long-running transactions can fill up WAL storage
  • Possible improvements that can help
    • Index-only Tables
    • EC Tables only on a replica.
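A minimal SQL sketch of the intended user-facing workflow, assuming hypothetical syntax; the CREATE EVENTUALLY CONSISTENT TABLE command, the table names, and the queries are purely illustrative.
    -- Main table stays lean: just the PKey, so PKey lookups stay fast.
    CREATE TABLE orders (
        order_id bigint PRIMARY KEY,
        payload  jsonb
    );
    -- Hypothetical command: the system would create an identically-structured
    -- EC Table, attach a trigger that rejects direct DML on it, and start a
    -- BG process that replays committed WAL changes onto it.
    CREATE EVENTUALLY CONSISTENT TABLE orders_ec FOR orders;
    -- Latency-sensitive, PKey-based access uses the main table:
    SELECT payload FROM orders WHERE order_id = 42;
    -- Queries that tolerate slightly stale data use the EC Table, where the
    -- user is free to add extra indexes:
    CREATE INDEX ON orders_ec USING gin (payload);
    SELECT order_id FROM orders_ec WHERE payload @> '{"status": "shipped"}';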
  • tz_name data-type
    Before: functions accept a text parameter and then try to convert it to a timezone name/abbreviation
    After: functions can simply declare that they accept a tz_name, and the type-I/O functions will do the validation.
    Example: the make_timestamptz_at_timezone function accepts text and then converts it into a TZ name
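    A minimal sketch of the contrast, assuming a tz_name type existed; only the text form works today, and the failing cast below is hypothetical.
      -- Before: any text is accepted, and the function has to validate the
      -- zone name at call time (make_timestamptz is an existing function).
      SELECT make_timestamptz(2024, 1, 1, 0, 0, 0, 'America/New_York');
      -- After (hypothetical type): the type-input function validates the name,
      -- so function bodies can trust their tz_name arguments.
      SELECT 'America/New_York'::tz_name;  -- ok
      SELECT 'Not/A_Zone'::tz_name;        -- would fail in the type-I/O function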
  • pl/wasm
    Postgres extension to run WASM code
  • plpgsql-to-WASM
    Convert plpgsql to WASM, for speed, while retaining safety.
  • pg_wasm
    Postgres compiled to WebAssembly; to potentially run in Web Browsers
  • emPostgres
    Postgres as a library, embedded in other applications
  • csvPostgres
    emPostgres embedded in an application, with file_fdw enabled, to process CSV files using SQL.
    It could be an alternative to Q.
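    A sketch of the kind of SQL such an embedded instance would run, using the existing file_fdw extension; the server name, file path, and table definition are illustrative.
      CREATE EXTENSION file_fdw;
      CREATE SERVER csv_files FOREIGN DATA WRAPPER file_fdw;
      CREATE FOREIGN TABLE sales (region text, amount numeric)
        SERVER csv_files
        OPTIONS (filename '/tmp/sales.csv', format 'csv', header 'true');
      SELECT region, sum(amount) FROM sales GROUP BY region;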
  • file.postgres
    Data directory in a single-file (think: SQLite replacement, when combined with emPostgres)
  • PGLite
    emPostgres and file.postgres, combined, will provide a SQLite alternative.
  • Muheam - Multi-headed Monster
  • Export/Import
    Enable copying relations as files between Postgres Instances
  • SQL - Engine Support
    List of SQL features and their support level in various databases, à la C++ compiler-support tables
  • Consistent read-after-write on replica
    Enable read-after-write consistency when using read-only replicas for read-scalability
    Notify-XID + libpq + backend-wait-for-XID
    - Application writes to primary
    - Application gets XID upon commit
    - Application sends that XID to start a new transaction on the replica
    - Replica backend waits until that XID is applied
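    A hedged sketch of the flow; pg_current_xact_id() exists today, while the wait-for-XID call on the replica is hypothetical.
      -- On the primary:
      BEGIN;
      INSERT INTO accounts (id, balance) VALUES (1, 100);
      SELECT pg_current_xact_id();   -- application remembers this XID
      COMMIT;
      -- On the replica, before reading:
      SELECT pg_wait_for_xid('743');  -- hypothetical: block until that XID is applied
      SELECT balance FROM accounts WHERE id = 1;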
  • Copy-on-Write system/object snapshots
    Similar to how Filesystem (e.g. ZFS) snapshots work
  • Contribute to PostgresML
  • Optimized Mode: Chained, One-shot Transactions
    In this mode of server operation, theoretically only one transaction is allowed to run at a time.
    This allows the transaction to run without taking any locks, hence improving latency.
    The transaction must be fully available to the server; interactivity not allowed; savepoints not allowed.
    In practice, for better throughput, multiple transactions run at the same time.
    Each transaction knows exactly which transaction is ahead of it.
    A transaction cannot be committed before its predecessor transaction.
    Optimistically, each transaction assumes the transaction ahead of it is going to COMMIT; and _not_ ROLLBACK.
    Based on that optimism:
    * the transaction ignores row-versions deleted by its predecessor, and
    * considers the new row-versions created by the predecessor as committed
    - Problem: the predecessor may create many row-versions of a row;
    - How to identify which is the last row-version, so we can ignore the non-last row-versions.
    - This is currently a blocker.
    If a transaction fails (on ERROR, etc.), then all following transactions that are in-progress also fail.
    - This is the cost of optimistic transactions.
    Any transaction that starts after such failure, starts a fresh list/chain of transactions.
    When done with a row, a transaction passes the row on to its follower transaction.
    - This may be our antidote to the problem of identifying the non-last row-version.
    - A transaction _must_ only pass a row's last row-version to the follower transaction.
    - ... it _must_not_ pass any non-last row-versions of a row to the follower transaction.
    If not interested in a relation, the transaction must pass the whole relation to the next transaction.
    - This would help increase throughput, since the transactions working on unrelated sets of relations may proceed in parallel.
    As an optimization, a predecessor transaction would pass on a negative-list of the relations it is still operating on
    - Any relations not in the list are freely available to the follower to operate on.
    - The follower may add its own relations to the negative-list, and pass on the list to its follower.
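    A purely illustrative scenario restating the chaining rule in concrete terms; the two transactions would run in separate sessions, and nothing here actually enables the proposed mode.
      -- T1 (predecessor), submitted whole, runs first in the chain:
      UPDATE accounts SET balance = balance - 10 WHERE id = 1;
      -- T2 (follower) starts while T1 is still open, but optimistically treats
      -- T1's new row-version for id = 1 as committed, and ignores the
      -- row-version T1 superseded:
      SELECT balance FROM accounts WHERE id = 1;  -- sees T1's uncommitted value
      -- T2 cannot COMMIT before T1 commits; if T1 fails, T2 and any other
      -- in-progress followers fail too, and the next transaction to arrive
      -- starts a fresh chain.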