Row-security

From PostgreSQL wiki

Revision as of 17:24, 28 December 2012 by Simon (Talk | contribs)

Row-Level Security

This page is for discussion of implementing RLS in PostgreSQL.

Jobs of access control feature

According to ISO/IEC 27001 (information security management), the design goal of an information security feature is to ensure the confidentiality, integrity and availability of information assets, often abbreviated as C.I.A. An access control feature contributes to both confidentiality and integrity: it prevents unprivileged users from reading or writing information assets, according to the rules in force. Information has no tangible form of its own, so it must be stored in an object. Access control therefore usually works by allowing or prohibiting access to the object that contains the information asset. Thus, the rules and their targets are the core of a security feature. For example, the regular GRANT/REVOKE mechanism controls access to a specified database object according to its access control list.

Design Target

The purpose of the row-level security feature is to prevent users (meaning end users, not necessarily database roles) from accessing rows they are not privileged to access. Note that "access" covers two directions of information flow: (1) data read (rows => users), which concerns confidentiality, and (2) data write (users => rows), which concerns data integrity. Given the nature of database access, the most straightforward way to describe the rule is as a regular WHERE-clause-style qualifier, that is, an expression returning a bool value.

Overall, RLS prevents users from reading and writing rows that do not satisfy the rule (called the row-security policy from here on). If we support per-command configuration, the row-security policy to be checked depends on the command. The RLS design allows an individual row-security policy to be applied to SELECT, INSERT, UPDATE or DELETE. Relevant discussions are below.

The second purpose of this is that we have a clear and complete policy. If we implement a policy via a number of overlapping commands, adding or removing any part of it could cause holes to appear.

So the combined purpose is to "have" security and to "know" that we have security.

(Note that if we did implement security via BEFORE ROW triggers, we would need a way to ensure that all security triggers are fired last, otherwise it would be possible to pass the security check using one role, then switch to another role in a later trigger).

Several SQL commands allow users to access database rows: SELECT, INSERT, UPDATE and DELETE. From a security perspective, COPY TO and COPY FROM are equivalent to SELECT and INSERT respectively. For SELECT (data read), what we need to do is quite simple: any row that does not match the configured row-security policy is filtered out before it becomes visible to the user. For INSERT or UPDATE (data write), RLS does not allow unprivileged rows to be written to the result relation, much as a CHECK constraint would. For UPDATE or DELETE (data write), RLS does not allow unprivileged rows to appear as candidates for modification; that means the row-security policy should be applied at the table scan stage. We also need to pay attention to potential information leaks via the leaky-view scenario described below. So UPDATE and DELETE shall also apply the row-security checks of the SELECT command at the table scan stage.
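The enforcement points described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `Table`, `PolicyViolation` and the single `policy` callable are hypothetical names invented for illustration, and a single policy is used for all commands. It is not how the executor would implement the checks.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Policy = Callable[[Row], bool]


class PolicyViolation(Exception):
    """Raised when a row to be written does not satisfy the policy (hypothetical)."""


class Table:
    """Toy table with one row-security policy shared by all commands."""

    def __init__(self, policy: Policy) -> None:
        self.rows: List[Row] = []
        self.policy = policy

    def _scan(self) -> List[Row]:
        # Scan-stage check: rows failing the policy never become candidates.
        return [r for r in self.rows if self.policy(r)]

    def select(self) -> List[Row]:
        return [dict(r) for r in self._scan()]

    def insert(self, row: Row) -> None:
        # Writer-side check, similar in spirit to a CHECK constraint.
        if not self.policy(row):
            raise PolicyViolation("new row violates row-security policy")
        self.rows.append(dict(row))

    def update(self, where: Policy, changes: Row) -> int:
        targets = [r for r in self._scan() if where(r)]
        new_versions = [{**r, **changes} for r in targets]
        # Check every new version before modifying anything.
        for new in new_versions:
            if not self.policy(new):
                raise PolicyViolation("updated row violates row-security policy")
        for old, new in zip(targets, new_versions):
            old.update(new)
        return len(targets)

    def delete(self, where: Policy) -> int:
        doomed = {id(r) for r in self._scan() if where(r)}
        self.rows = [r for r in self.rows if id(r) not in doomed]
        return len(doomed)


t = Table(policy=lambda r: r["owner"] == "alice")
# Rows loaded directly, bypassing the checks, to stand in for pre-existing data.
t.rows = [{"id": 1, "owner": "alice"}, {"id": 2, "owner": "bob"}]

visible = t.select()                              # only alice's row is visible
updated = t.update(lambda r: True, {"note": "x"})  # only alice's row is a candidate
deleted = t.delete(lambda r: True)                 # bob's row survives
```

Even with an unconditional WHERE clause, the UPDATE and DELETE calls only ever touch the row that passes the policy, because the check happens while scanning.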

The TRUNCATE command behaves like an unqualified DELETE, but is much faster. It has its own permission, separate from DELETE. So we redefine the meaning of the TRUNCATE permission: it also implies permission to ignore the row-security policy for DELETE. The reason is that TRUNCATE works by removing all storage for a table, so there is no opportunity to apply a WHERE clause during its execution.

The row-security policy is set by the table owner, using the following syntax. If no "ON <command>" is given, a single security policy is applied to all the supported commands (SELECT, INSERT, UPDATE and DELETE).

ALTER TABLE <relname> SET ROW SECURITY (<expression>) [ ON <command> ];

ALTER TABLE <relname> RESET ROW SECURITY [ ON <command> ];

Superusers bypass the row-security policy (1) to avoid Trojan-horse attacks by the table owner and (2) to ensure a consistent view for database backups. However, a row-security policy injected by an extension (such as sepgsql) is an exception.
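As a rough illustration of the semantics of the proposed syntax, the following sketch keeps one policy slot per command. The class `RowSecurity` and its methods are hypothetical names, not part of any patch. Two assumptions go beyond the text: a command with no policy set is treated as unrestricted, and policies injected by extensions (the sepgsql exception) are not modelled.

```python
from typing import Callable, Dict, Optional

Row = Dict[str, object]
Policy = Callable[[Row], bool]

COMMANDS = ("SELECT", "INSERT", "UPDATE", "DELETE")


class RowSecurity:
    """Per-table policy slots, one per supported command (hypothetical model)."""

    def __init__(self) -> None:
        self.policies: Dict[str, Policy] = {}

    def _targets(self, command: Optional[str]):
        if command is None:
            return COMMANDS  # no "ON <command>": applies to every command
        if command not in COMMANDS:
            raise ValueError(f"unsupported command: {command}")
        return (command,)

    def set(self, policy: Policy, command: Optional[str] = None) -> None:
        # Models: ALTER TABLE ... SET ROW SECURITY (<expression>) [ ON <command> ]
        for c in self._targets(command):
            self.policies[c] = policy

    def reset(self, command: Optional[str] = None) -> None:
        # Models: ALTER TABLE ... RESET ROW SECURITY [ ON <command> ]
        for c in self._targets(command):
            self.policies.pop(c, None)

    def allows(self, command: str, row: Row, is_superuser: bool = False) -> bool:
        if is_superuser:
            return True  # superusers bypass owner-defined policies
        policy = self.policies.get(command)
        # Assumption: no policy configured means no restriction.
        return True if policy is None else policy(row)


rs = RowSecurity()
rs.set(lambda r: r["owner"] == "alice")  # all four commands
rs.set(lambda r: True, "INSERT")         # override for INSERT only

bob_row = {"id": 2, "owner": "bob"}
can_select = rs.allows("SELECT", bob_row)
can_insert = rs.allows("INSERT", bob_row)
superuser_select = rs.allows("SELECT", bob_row, is_superuser=True)
```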

Issues & discussion

Per-command security policy

An asymmetric row-security policy may cause unexpected information leaks through an UPDATE or DELETE command with a RETURNING clause, or through leaky functions in the WHERE clause. One idea is to append the SELECT row-security policy when the executor scans the result relation; this ensures that the rows to be modified are always visible. Another idea was to enforce a single policy for all commands; however, that is problematic when a user wants to define an INSERT-only relation.
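The RETURNING leak and the first idea can be sketched with a toy model. The function `delete_returning` and its `append_select_policy` flag are hypothetical, introduced only to contrast the two behaviours; the sketch assumes the DELETE policy is more permissive than the SELECT policy.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Pred = Callable[[Row], bool]


def delete_returning(
    rows: List[Row],
    select_policy: Pred,
    delete_policy: Pred,
    where: Pred,
    append_select_policy: bool,
) -> Tuple[List[Row], List[Row]]:
    """Model of DELETE ... RETURNING *; returns (returned_rows, remaining_rows)."""

    def is_candidate(row: Row) -> bool:
        # The proposed fix: also require visibility under the SELECT policy.
        if append_select_policy and not select_policy(row):
            return False
        return delete_policy(row) and where(row)

    returned = [r for r in rows if is_candidate(r)]
    remaining = [r for r in rows if not is_candidate(r)]
    return returned, remaining


rows = [
    {"owner": "alice", "secret": "a1"},
    {"owner": "bob", "secret": "b1"},
]


def can_read(r: Row) -> bool:
    return r["owner"] == "alice"  # SELECT policy


def can_delete(r: Row) -> bool:
    return True  # asymmetric: DELETE policy allows any row


def match_all(r: Row) -> bool:
    return True  # user-supplied WHERE clause


leaky_returned, _ = delete_returning(rows, can_read, can_delete, match_all, False)
safe_returned, safe_remaining = delete_returning(rows, can_read, can_delete, match_all, True)
```

Without the appended SELECT policy, the RETURNING output contains bob's row, which the user could never have read with SELECT. With it, only the visible row is deleted and returned.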

Writer-side checks

Writer-side checks on INSERT or UPDATE commands can already be implemented using before-row triggers. On the other hand, this forces users to keep the RLS configuration synchronized with the triggers that perform these checks. One model for solving this concern is the implementation of FK constraints, which automatically defines triggers that check the new version of tuples being inserted or updated.

Table statistics

pg_statistic holds sample values from the table being analyzed. Right now, we have no way to prevent users from seeing the values collected in the statistics. One idea is to mask the field if the relation has an RLS policy.

Minimal core feature set

  • Checks are applied only during table scans. If writer-side checks are required, users can implement them with triggers, even though the setup is complex.
  • A single security policy can be configured on a table. Even though the RLS design allows per-command policies, we need to investigate whether an asymmetric policy is harmless.

Previous discussion in 2010

Prerequisites

  • Before we can try to tackle row-level security generally, using labels or not, we need to fix the issues with data leaks in views.

Issue: Leaky VIEWs for RLS

  • This issue is summarized as: an untrusted user can define a function which stores all the information it is given, then query a view using that function in a way that convinces the planner to send every row of the underlying table to the function, thus leaking the data in the table that the view was intended to protect.

In this message, KaiGai pointed out that the problem has two different causes, both of which can lead to the same information leak.

  • [1] Unexpected order of qualifier evaluation in a scan plan
    • When a scan plan has multiple qualifiers to filter the scanned tuples, the optimizer sorts the qualifiers by their estimated execution cost. If an externally supplied ("exogenetic") function has a smaller cost than the qualifiers that come from the view definition, the function is evaluated before them, and the contents of invisible tuples may then be passed to a malicious user-defined function.
    • The reordering happens in order_qual_clauses().
  • [2] Unexpected distribution of qualifiers into the inside of a join loop
    • When the planner builds a scan plan, it tries to push qualifiers down to the smallest unit it can. For example, if a function takes arguments that come from only one table, it is attached to the scan plan of that table rather than evaluated outside the join.
    • When a view contains a JOIN between A and B, a user can reference the view with their own WHERE clause. If a clause takes arguments that depend only on A, the planner distributes the clause into the scan plan of A. As a result, tuples that should be filtered out may become visible to user-defined functions.
    • The distribution happens in distribute_qual_to_rels().
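Cause [1] can be reproduced with a toy model of cost-ordered qualifier evaluation. This is a sketch, not the planner's code: `Qual`, `scan` and `leaky` are hypothetical names, and the only behaviour borrowed from the description above is that qualifiers are sorted by estimated cost before being evaluated.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class Qual:
    name: str
    cost: float                  # estimated cost, as declared by the function owner
    test: Callable[[Row], bool]


def scan(rows: List[Row], quals: List[Qual]) -> List[Row]:
    """Evaluate quals cheapest-first, stopping at the first one that fails."""
    ordered = sorted(quals, key=lambda q: q.cost)
    return [r for r in rows if all(q.test(r) for q in ordered)]


leaked: List[object] = []


def leaky(row: Row) -> bool:
    # A malicious user-defined function: records what it sees, filters nothing.
    leaked.append(row["secret"])
    return True


rows = [
    {"owner": "alice", "secret": "a1"},
    {"owner": "bob", "secret": "b1"},
]

# Qualifier from the view definition, and one supplied by the querying user.
view_qual = Qual("owner = 'alice'", cost=1.0, test=lambda r: r["owner"] == "alice")
user_qual = Qual("leaky(secret)", cost=0.0001, test=leaky)

visible = scan(rows, [view_qual, user_qual])
```

The query result contains only alice's row, yet `leaked` ends up holding both secrets, because the cheaper user-supplied function ran before the view's own qualifier.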

Discussion

  • For point [2], if we prevent all externally supplied ("exogenetic") functions from being pushed down into join loops, it will cause significant performance degradation, even for views that are not intended for security purposes.
  • We need a way to indicate whether a view is defined for security purposes or not.
    • Tom Lane suggested a CREATE SECURITY VIEW AS ... statement.
    • No conclusion was reached on which should be the default: security view or regular view?
    • How about a GUC option to specify the default? NACKed.
  • In addition, Robert Haas suggested testing whether the user has privileges to reference the underlying tables without going through the view. If so, it is ultimately harmless even if user-defined functions are evaluated within the join loop.
    • There was one objection, because this check would be applied at the planner stage, not the execution stage.
  • KaiGai submitted a proof-of-concept patch that prevents externally supplied functions from being pushed down into security views.
  • For point [1], there has been no active discussion yet.
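The effect of marking a view as a security view can be sketched in the same toy style. The `security_view` flag and the `from_view` marker are assumptions made for illustration and do not describe the proof-of-concept patch; the point is only that qualifiers from the view definition are forced to run before user-supplied ones, whatever their cost.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class Qual:
    cost: float
    test: Callable[[Row], bool]
    from_view: bool  # True if the qualifier comes from the view definition


def order_quals(quals: List[Qual], security_view: bool) -> List[Qual]:
    if security_view:
        # View qualifiers first; cost only breaks ties within each group.
        return sorted(quals, key=lambda q: (not q.from_view, q.cost))
    return sorted(quals, key=lambda q: q.cost)


def scan_view(rows: List[Row], quals: List[Qual], security_view: bool) -> List[Row]:
    ordered = order_quals(quals, security_view)
    return [r for r in rows if all(q.test(r) for q in ordered)]


def make_spy(seen: List[object]) -> Callable[[Row], bool]:
    def spy(row: Row) -> bool:
        seen.append(row["secret"])  # records every row it is shown
        return True
    return spy


rows = [
    {"owner": "alice", "secret": "a1"},
    {"owner": "bob", "secret": "b1"},
]


def owner_is_alice(r: Row) -> bool:
    return r["owner"] == "alice"


seen_regular: List[object] = []
seen_security: List[object] = []

scan_view(rows, [Qual(1.0, owner_is_alice, True),
                 Qual(0.0001, make_spy(seen_regular), False)], security_view=False)
scan_view(rows, [Qual(1.0, owner_is_alice, True),
                 Qual(0.0001, make_spy(seen_security), False)], security_view=True)
```

The price is that the cheap user-supplied qualifier can no longer run first, which is the performance concern raised for views that have no security purpose.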

Discussions of RLS in PG

Articles/Documentation of existing RLS implementations

Use Cases

  • PCI Compliant implementations
  • Classified Environments
  • Other?

Components of an implementation

  • Overview
    • Allow the query to be modified, prior to being passed to the planner, in such a way that the rows returned will be those the user is authorized for
    • This depends on being able to tell the planner that this filter must be done, in some way, prior to user-defined functions being called
    • This issue is related to the VIEW leak described above. Once that issue has been resolved, this should be pretty straight-forward to implement
  • Grammar for catalog updates/changes; user-interface for specifying how RLS is to be done
  • Catalog changes for storing RLS information
  • Storage - Could this just be a regular column in the table? It would be good to avoid changing the header or creating a system column for this.
    • We would need to track, in some fashion, the "security" column in the catalog, perhaps as a flag on pg_attribute, or a 'security_attnum' in pg_class, etc.
  • Planner updates to enforce the filter based on RLS; this can't be done until after we deal with the issue with VIEWs
  • Executor changes may not be required, but how should stored plans be dealt with? Use invalidation if anything changes with regard to RLS?
  • Other?

Issues

  • Covert Channel
    • If we try to insert a value which violates a PK constraint, we can infer the existence of an invisible row with that PK from the error.
    • The same issue exists in a Foreign Key relationship situation
    • Other databases with row-level security don't address this issue, so it's unclear if we really need to (reference?)
    • In any case, this isn't something we need to address in our initial RLS implementation
  • Order to evaluate row level policy
    • Addressed above with regard to views, once we solve that, this will be handled
  • TRUNCATE statement
    • TRUNCATE is expected to be fast.
    • If the user does not have rights to remove all rows from the table regardless of row-level policy, then any TRUNCATE must be denied.
    • Turning TRUNCATE (a PostgreSQL extension which is not in the SQL spec anyway) into a DELETE FROM doesn't make sense.
  • COPY TO statement
    • COPY can simply be rewritten as a COPY statement that uses a query instead, e.g.:
    • COPY tblname TO xxx; can be replaced by COPY (SELECT * FROM tblname) TO xxx;.
  • Table inheritance
    • Not something we really need to deal with in the initial version, so long as it doesn't completely break (or we make sure it does for inheritance)
    • What policy should be applied to the parent and child relations?
    • idea: Also copy row-level policies from the parent tables.
  • Foreign Key constraint
    • Adding multiple modes for FK should be a separate, follow-on patch, it doesn't need to be in the initial version.
    • idea: We can provide two modes: The first is filter-mode, the second is abort-mode.
      • filter-mode: The normal mode. The row-level policy is evaluated before the user-given condition, and a row that violates it is filtered out (the policy returns false).
      • abort-mode: A special internal mode. The row-level policy is evaluated after all other conditions, and raises an error if violated. Because the conditions are evaluated before the row-level policy, the query has to be trusted, such as the queries issued by FK constraints.
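The two modes proposed for FK constraints can be contrasted with a short sketch. The function names and `RowSecurityError` are hypothetical; the only behaviour taken from the idea above is the evaluation order and the outcome on violation.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Pred = Callable[[Row], bool]


class RowSecurityError(Exception):
    """Raised in abort-mode when a matching row violates the policy (hypothetical)."""


def scan_filter_mode(rows: List[Row], policy: Pred, condition: Pred) -> List[Row]:
    # Policy first: violating rows silently disappear from the result.
    return [r for r in rows if policy(r) and condition(r)]


def scan_abort_mode(rows: List[Row], policy: Pred, condition: Pred) -> List[Row]:
    # Condition first, so it sees every row; it must come from a trusted query.
    result = []
    for r in rows:
        if not condition(r):
            continue
        if not policy(r):
            raise RowSecurityError("row matched but violates row-level policy")
        result.append(r)
    return result


parent = [
    {"id": 1, "owner": "alice"},
    {"id": 2, "owner": "bob"},
]


def policy(r: Row) -> bool:
    return r["owner"] == "alice"


def references_id_2(r: Row) -> bool:
    return r["id"] == 2  # lookup an FK check might issue


filtered = scan_filter_mode(parent, policy, references_id_2)

try:
    scan_abort_mode(parent, policy, references_id_2)
    aborted = False
except RowSecurityError:
    aborted = True
```

In filter-mode the lookup comes back empty, as if the referenced key did not exist; in abort-mode the same lookup fails with an error instead.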

Considerations

  • Performance
    • With RLS
    • Without RLS
    • Page 16 of the LAPP/SELinux slides shows a pgbench comparison between pgsql-8.4.1 and SE-PostgreSQL with RLS.
      • It shows a 2-4% performance penalty, depending on database size.
      • Note that SE-PostgreSQL implemented RLS in a different way than what is being proposed here; our goal is to get RLS support into core PG
  • Integration with external security manager (eg: SELinux, SMACK)
    • This will not be included in the initial support of RLS (unless we happen to get support for external security managers implemented first in PG)