From PostgreSQL wiki
Revision as of 22:25, 19 May 2012 by Boshomi
This page is for discussion of implementing RLS in PostgreSQL.
- Before we can try to tackle row-level security generally, using labels or not, we need to fix the issues with data leaks in views.
- Related topics in -hackers
Issue: A leaky VIEWs for RLS
- This issue is summarized as: an untrusted user can define a function which stores all information it is presented, then query a view using that function in a way which will convince the planner to send every row of the underlying table to the function, thus leaking data in the table which the view was intended to prevent.
In this message, KaiGai pointed out we have two different causes of the problem, but both of them can cause same information leaks.
-  Unexpected order to evaluate qualifiers on a certain scan plan
- When a scan-plan has multiple qualifiers to filter scanned tuples, the optimizer sort the qualifiers based on their estimated cost to execute. If an exogenetic function has smaller cost than other qualifiers come from view definition, the exogenetic function shall be evaluated earlier than others, then contents of invisible tuples may be provided to malicious user-defined functions.
- It is reordered at order_qual_clauses().
-  Unexpected qualifier distributions into inside of join loop
- When planner makes a scan plan, it tries to distribute qualifiers of scans into smaller unit as possible as it can. For example, if a function takes arguments only come from a certain table, it will be distributed into scan plan of the table, not outside of the join.
- When a view contains JOIN clause between A and B, user can reference the view with his own WHERE clause. If a clause takes arguments depending on only A, the planner distribute the clause into the scan plan of A. In the reault, tuples to be filtered out may become visible to user defined functions.
- It is distributed at distribute_qual_to_rels().
- At the point , if we prevent all exogenetic functions to push down into join loops, it will make unignorable performance degradation, although the view might not be intended to security purpose.
- It needs a way to provide a hint whether the view is defined for security purpose, or not.
- Tom Lane suggested CREATE SECURITY VIEW AS ... statement.
- It was not concluded which is the default. Security view? Regular view?
- How about a GUC option to specify the default? NACKed.
- In addition, Robert Haas suggested to test privileges of users whether they have privileges to reference underlaying tables without vires, or not. If available, it is eventually harmless even if user defined functions are evaluated within join loop.
- Here was one opposition because this check will be applied on planner stage, not execution stage.
- KaiGai submitted a proof of concept patch that prevents to push down exogenetic functions into security views.
- At the point , we don't have any active discussions yet.
Discussions of RLS in PG
- Current thread on -hackers
- Josh Berkus on RLS in PG, Part 1
- Josh Berkus on RLS in PG, Part 2
- SEPostgreSQL_Specifications Specifications for SEPostgreSQL, includes RLS
Articles/Documentation of existing RLS implementations
- Oracle RLS Article, Part 1
- Oracle RLS Article, Part 2
- Oracle RLS and VPD Article
- SQL Server RLS with Classified Systems
- IBM/DB2 RLS Documentation
- PCI Compliant implementations
- Classified Environments
Components of an implementation
- Allow the query to be modified, prior to being passed to the planner, in such a way that the rows returned will be those the user is authorized for
- This depends on being able to tell the planner that this filter must be done, in some way, prior to user-defined functions being called
- This issue is related to the VIEW leak described above. Once that issue has been resolved, this should be pretty straight-forward to implement
- Grammar for catalog updates/changes; user-interface for specifying how RLS is to be done
- Catalog changes for storing RLS information
- Storage - Could this just be a regular column in the table? It would be good to avoid changing the header or creating a system column for this.
- We would need to track, in some fashion, the "security" column in the catalog, perhaps as a flag on pg_attribute, or a 'security_attnum' in pg_class, etc.
- Planner updates to enforce the filter based on RLS- this can't be done till after we deal with the issue with VIEWs
- Executor changes may not be required.. but how to deal with stored plans? Use invalidation if anything changes with regard to RLS?
- Covert Channel
- If we try to insert a value which violates a PK constraint, we can assume existence of invisible PK from the error.
- The same issue exists in a Foreign Key relationship situation
- Other databases with row-level security don't address this issue, so it's unclear if we really need to (reference?)
- In any case, this isn't something we need to address in our initial RLS implementation
- Order to evaluate row level policy
- Addressed above with regard to views, once we solve that, this will be handled
- TRUNCATE statement
- TRUNCATE is expected to be fast.
- If the user does not have rights to remove all rows from the table regardless of row-level policy, then any TRUNCATE must be denied.
- Turning TRUNCATE (a PostgreSQL extension which is not in the SQL spec anyway) into a DELETE FROM doesn't make sense.
- COPY TO statement
- COPY can just be reformed into a COPY statement with a query being used instead, eg:
- COPY tblname TO xxx; can be replaced by COPY (SELECT * FROM tblname) TO xxx;.
- Table inheritance
- Not something we really need to deal with in the initial version, so long as it doesn't completely break (or we make sure it does for inheritance)
- What policy should be applied on the parent and child relations.
- idea: Also copy row-level policies from the parent tables.
- Foreign Key constraint
- Adding multiple modes for FK should be a separate, follow-on patch, it doesn't need to be in the initial version.
- idea: We can provide two modes: The first is filter-mode, the second is abort-mode.
- filter-mode: A normal mode. Row-level policy is evaluated earlier than user given condition, and returns false, if violated.
- abort-mode: A special internal mode. Row-level policy is evaluated after all the condition, and raises an error, if violated. The condition shall be evaluated earlier than row-level policy, the query has to be trusted. Such as queries in FK constraint.
- With RLS
- Without RLS
- The page.16 of LAPP/SELinux slides shows a pgbench comparison between pgsql-8.4.1 vs SE-PostgreSQL with RLS.
- It has 2-4% of performance tradeoff depending on database size.
- Note that SE-PostgreSQL implemented RLS in a different way than what is being proposed here; our goal is to get RLS support into core PG
- Integration with external security manager (eg: SELinux, SMACK)
- This will not be included in the initial support of RLS (unless we happen to get support for external security managers implemented first in PG)