Security Model
Design Principle
graph stores topology, not data. Your security policies stay in Postgres.
The graph index contains:
- Node indices (u32)
- Edge connections (source → target, type_id)
- Bloom signatures (lossy, non-reconstructable)
- TokenIndex tokens (property key:value pairs, ≤ graph.max_token_length chars)
- Node metadata (source_table OID, source_pk)
It does NOT contain:
- Full row payloads
- Sensitive field values (beyond indexed short properties)
- Passwords, PII beyond what you explicitly register
Security Layers
graph implements security at three layers, each providing a different guarantee:
Layer 1: ACL Pre-Flight Check (BEFORE BFS)
→ "Can this user even START a traversal on this table?"
→ O(1), negligible cost, enforced by graph
Layer 2: Hydration-Time RLS (AFTER BFS, DURING JOIN)
→ "Can this user SEE the payload of each discovered node?"
→ Enforced by Postgres (standard RLS policies)
Layer 3: Tenant-Scoped Traversal (OPTIONAL, DURING BFS)
→ "Should the BFS even VISIT nodes from other tenants?"
→ Prevents topology leakage across tenant boundaries
Layer 1: ACL Pre-Flight Check
Before dropping into the Rust BFS hot loop, every graph query function verifies that the calling user has SELECT permission on the seed table. This uses Postgres’s internal C function pg_class_aclcheck() — the same function Postgres uses for its own permission checks.
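// Executed BEFORE any graph traversal begins.
// SAFETY: both calls are Postgres FFI functions and must run inside a backend.
let acl_result = unsafe {
    let user_oid = pgrx::pg_sys::GetUserId();
    pgrx::pg_sys::pg_class_aclcheck(
        seed_table_oid,
        user_oid,
        // ACL_SELECT is a plain integer constant; cast it to the AclMode type.
        pgrx::pg_sys::ACL_SELECT as pgrx::pg_sys::AclMode,
    )
};
if acl_result != pgrx::pg_sys::ACLCHECK_OK {
    return Err(GraphError::AclDenied {
        table: table_name.to_string(),
    });
}
Where ACL Checks Happen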
| Function | Tables Checked |
|---|---|
| graph.traverse() | Seed table |
| graph.shortest_path() | Both source and target tables |
| graph.search() | table_filter (if specified) |
| graph.build() | All registered tables (during build) |
Cost: O(1) per check, which is negligible next to the traversal itself. The check runs once per function call, not per node visited.
Layer 2: Hydration-Time RLS (Default Security)
The traversal function returns (node_table REGCLASS, node_id TEXT) pairs. When the user JOINs back to the source tables, Postgres RLS policies apply automatically:
-- RLS policy on users table
CREATE POLICY tenant_isolation ON users
USING (organisation_id = current_setting('app.tenant_id'));
-- Traversal returns ALL connected node IDs (fast)
-- The JOIN enforces RLS (secure)
SELECT u.name, u.email, g.depth
FROM graph.traverse('users', 'U-123', 4) g
JOIN users u ON u.id = g.node_id
WHERE g.node_table = 'users'::regclass;
-- If the user lacks SELECT on `users`, the query fails with a permission error
-- (and the Layer 1 pre-flight check rejects the traversal first).
-- If RLS hides certain users, they are filtered out automatically.
-- graph doesn't need to know anything about your security policies.
This is the recommended pattern and the most Postgres-native approach. The traversal is fast (no security checks in the BFS loop), and RLS enforcement happens during hydration, exactly how Postgres is designed to work.
Layer 3: Tenant-Scoped Traversal (Optional)
For multi-tenant applications where you want to prevent topology leakage (user A shouldn’t even know that a node in tenant B exists), graph supports tenant-level filtering during BFS.
When a tenant_column is registered with graph.add_table(), the engine builds per-tenant bitmaps in the TokenIndex during build(). At query time, graph reads the current tenant context from a Postgres session variable (e.g., current_setting('app.tenant_id')) and restricts BFS expansion to nodes matching that tenant. Within that registered tenant scope, nodes outside the tenant are not expanded or returned by traversal/search. This is a topology-leakage mitigation, not a blanket zero-leakage guarantee for every deployment or artifact.
-- 1. Register table with tenant column
SELECT graph.add_table('users', id_columns := ARRAY['id'],
columns := ARRAY['name', 'email'],
tenant_column := 'organisation_id'
);
SELECT graph.build();
-- 2. Set tenant context (typically done by your app/middleware)
SET app.tenant_id = 'org_123';
-- 3. Traverse — only nodes belonging to org_123 are visited
SELECT * FROM graph.traverse('users', 'U-123', 5);
Note: Tenant isolation at the BFS level requires tenant_column to be registered on every table in the graph. Tables without a tenant_column are treated as shared/global and are always traversable.
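To make the bitmap idea concrete, here is a minimal Rust sketch of a BFS that consults a per-tenant bitmap before enqueueing a neighbour. It is an illustration under stated assumptions, not the extension's actual code: `TenantBitmap`, `tenant_scoped_bfs`, and the adjacency-list representation are hypothetical names and structures chosen for the example.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical stand-in for a per-tenant bitmap: bit `i` is set when node `i`
/// belongs to the tenant (or to a shared table without a tenant column).
struct TenantBitmap {
    words: Vec<u64>,
}

impl TenantBitmap {
    fn with_nodes(node_count: usize, members: &[u32]) -> Self {
        let mut words = vec![0u64; (node_count + 63) / 64];
        for &n in members {
            words[n as usize / 64] |= 1u64 << (n % 64);
        }
        Self { words }
    }

    fn contains(&self, node: u32) -> bool {
        self.words
            .get(node as usize / 64)
            .map_or(false, |w| w & (1u64 << (node % 64)) != 0)
    }
}

/// BFS that never expands or returns nodes outside `allowed`.
/// Returns (node, depth) pairs in visit order.
fn tenant_scoped_bfs(
    adjacency: &[Vec<u32>],
    seed: u32,
    max_depth: u32,
    allowed: &TenantBitmap,
) -> Vec<(u32, u32)> {
    let mut out = Vec::new();
    if !allowed.contains(seed) {
        return out; // seed belongs to another tenant: nothing is visible
    }
    let mut seen = HashSet::from([seed]);
    let mut queue = VecDeque::from([(seed, 0u32)]);
    while let Some((node, depth)) = queue.pop_front() {
        out.push((node, depth));
        if depth == max_depth {
            continue;
        }
        for &next in &adjacency[node as usize] {
            // The tenant check happens before enqueueing, so foreign nodes
            // are never expanded and cannot act as bridges.
            if allowed.contains(next) && seen.insert(next) {
                queue.push_back((next, depth + 1));
            }
        }
    }
    out
}

fn main() {
    // 0 -> 1 -> 2, where node 1 belongs to a different tenant.
    let adjacency = vec![vec![1], vec![2], vec![]];
    let org_123 = TenantBitmap::with_nodes(3, &[0, 2]);
    let visited = tenant_scoped_bfs(&adjacency, 0, 5, &org_123);
    println!("{:?}", visited);
}
```

In this toy graph node 2 belongs to the tenant but is only reachable through node 1, so it is not returned. Because the filter runs during expansion rather than after it, a foreign node cannot serve as a bridge between two of the tenant's own nodes.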
The “Ghost Traversal” Problem
This is the one structural vulnerability you must understand.
The Problem
Consider this graph:
[Public Node A] ─────→ [Secret RLS Node B] ─────→ [Public Node C]
If a user starts at Node A and does a 2-hop traversal, the Rust BFS hot loop (which is blind to RLS) will traverse through Node B and return Node C.
When Postgres does the hydration JOIN:
- The user sees Node A and Node C (they have access)
- The user does NOT see the payload of Node B (RLS hides it)
- However, by seeing that Node C was returned at depth 2, the user now knows that a relationship exists between A and C through some intermediate node
In highly classified environments, knowing the topology exists can be a security breach even without seeing the payload.
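The leak can be reproduced in a few lines. The following Rust sketch models the A → B → C graph, a traversal that ignores visibility, and two post-processing steps: one that mimics hydration-time RLS and one that drops any result whose path crosses a hidden node. All names here (`Hit`, `blind_bfs`, `hydrate`, `drop_ghost_paths`) are hypothetical and exist only for this illustration.

```rust
use std::collections::{HashSet, VecDeque};

/// One traversal result: the node reached and the full path from the seed.
#[derive(Debug, Clone, PartialEq)]
struct Hit {
    node: u32,
    path: Vec<u32>,
}

/// Plain BFS with no visibility checks, mirroring a traversal that is blind to RLS.
fn blind_bfs(adjacency: &[Vec<u32>], seed: u32, max_depth: usize) -> Vec<Hit> {
    let mut hits = Vec::new();
    let mut seen = HashSet::from([seed]);
    let mut queue = VecDeque::from([vec![seed]]);
    while let Some(path) = queue.pop_front() {
        let node = *path.last().expect("paths are never empty");
        hits.push(Hit { node, path: path.clone() });
        if path.len() - 1 == max_depth {
            continue;
        }
        for &next in &adjacency[node as usize] {
            if seen.insert(next) {
                let mut longer = path.clone();
                longer.push(next);
                queue.push_back(longer);
            }
        }
    }
    hits
}

/// What hydration-time RLS does: drop hits whose own row is hidden.
fn hydrate(hits: &[Hit], visible: &HashSet<u32>) -> Vec<Hit> {
    hits.iter().filter(|h| visible.contains(&h.node)).cloned().collect()
}

/// Stricter application-level filter: drop hits whose path crosses any hidden node.
fn drop_ghost_paths(hits: &[Hit], visible: &HashSet<u32>) -> Vec<Hit> {
    hits.iter()
        .filter(|h| h.path.iter().all(|n| visible.contains(n)))
        .cloned()
        .collect()
}

fn main() {
    // A = 0 (public), B = 1 (hidden by RLS), C = 2 (public); A -> B -> C.
    let adjacency = vec![vec![1], vec![2], vec![]];
    let visible: HashSet<u32> = HashSet::from([0, 2]);

    let hits = blind_bfs(&adjacency, 0, 2);
    let after_rls = hydrate(&hits, &visible);
    let after_filter = drop_ghost_paths(&hits, &visible);

    // After RLS, C is still returned at depth 2, revealing a hidden bridge.
    println!("after RLS:    {:?}", after_rls);
    println!("after filter: {:?}", after_filter);
}
```

After the RLS-style step, C survives with depth 2 even though B is hidden; the path filter removes it. Note that this sketch records only the first path found to each node, so a real filter would also have to consider whether an alternative, fully visible path exists before discarding a result.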
Impact Assessment
| Environment | Risk Level | Recommended Approach |
|---|---|---|
| Standard OLTP | Low | Layer 2 (hydration RLS) is sufficient |
| Multi-tenant SaaS | Medium | Use Layer 3 (tenant-scoped traversal) |
| Classified / financial | High | Use Layer 3 + application-level path filtering |
Mitigation Options
- Tenant-scoped traversal (Layer 3): If tenants should never see each other’s nodes, register a tenant_column and always pass tenant_id. Nodes from other tenants are never expanded in BFS.
- Post-traversal path validation: After traversal, the application can filter paths where any intermediate node belongs to a table the user can’t access:
-- Application-level path filtering
WITH graph AS (
SELECT * FROM graph.traverse('users', 'U-123', 5)
)
SELECT g.*
FROM graph g
WHERE NOT EXISTS (
-- Remove results whose path passes through restricted tables
SELECT 1 FROM unnest(g.path) AS path_node
WHERE NOT has_table_privilege(current_user,
(SELECT node_table FROM graph WHERE node_id = path_node LIMIT 1),
'SELECT')
);
The Honest Position
Layer 2 (hydration RLS) is often sufficient for ordinary application retrieval because payload visibility remains governed by Postgres. The ghost traversal problem matters when:
- The existence of a relationship is itself classified
- AND the user has access to nodes on both sides of a hidden node
- AND the user can infer the hidden node’s identity from the path structure
Document this clearly to users and let them choose the appropriate security layer.
Data Safety
What the Extension Stores
| Data | Where | Reconstructable? | Risk |
|---|---|---|---|
| Node indices (u32) | RAM + .graph file | No (meaningless without mapping) | None |
| Edge connections | RAM + .graph file | Topology only | Low |
| Bloom signatures (u64) | RAM + .graph file | No (lossy hash) | None |
| TokenIndex tokens | RAM + .graph file | ⚠️ Yes (plaintext key:value) | Medium |
| source_table + source_pk | RAM + .graph file | ⚠️ Yes (identifies source row) | Medium |
TokenIndex Security
The TokenIndex stores plaintext key:value tokens for properties ≤ graph.max_token_length (default 128 characters). This means:
- name:alice is stored in the index
- status:active is stored in the index
- A 5000-character description field is NOT stored (exceeds threshold, falls back to Bloom)
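The threshold rule can be sketched as follows. This is a simplified Rust illustration, not the extension's implementation: it assumes the limit is measured on the property value in characters, and the `Indexed` enum, the `bloom_signature` helper, its hash function, and the choice of three bits per signature are all hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Assumed default for graph.max_token_length, taken from the text above.
const MAX_TOKEN_LENGTH: usize = 128;

/// How a property ends up in the index in this sketch.
#[derive(Debug, PartialEq)]
enum Indexed {
    /// Stored verbatim as "key:value" and therefore readable from the index.
    Token(String),
    /// Only a lossy 64-bit signature is kept; the value cannot be read back.
    BloomOnly(u64),
}

/// Set up to three bits in a 64-bit signature, derived from hashes of key and value.
/// The hash function and the bit count are illustrative choices only.
fn bloom_signature(key: &str, value: &str) -> u64 {
    let mut sig = 0u64;
    for seed in 0u8..3 {
        let mut hasher = DefaultHasher::new();
        (seed, key, value).hash(&mut hasher);
        sig |= 1u64 << (hasher.finish() % 64);
    }
    sig
}

fn index_property(key: &str, value: &str) -> Indexed {
    // Length is counted in characters, not bytes (an assumption of this sketch).
    if value.chars().count() <= MAX_TOKEN_LENGTH {
        Indexed::Token(format!("{key}:{value}"))
    } else {
        Indexed::BloomOnly(bloom_signature(key, value))
    }
}

fn main() {
    println!("{:?}", index_property("name", "alice"));
    let long_text = "x".repeat(5000);
    println!("{:?}", index_property("description", &long_text));
}
```

The short value comes back as a readable token, which is why short sensitive fields deserve attention; the long value collapses to a handful of bits from which the original text cannot be recovered.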
Mitigation options:
- Exclude sensitive columns from columns when registering tables:
  -- Don't index SSN, DOB, or salary
  SELECT graph.add_table('employees', id_columns := ARRAY['id'],
      columns := ARRAY['name', 'department', 'status']
      -- SSN, date_of_birth, salary are NOT indexed
  );
- Use Bloom-only mode (future): A configuration option to disable TokenIndex entirely and use only Bloom filters for property filtering. Bloom signatures are non-reconstructable.
.graph File Security
The .graph file is stored in $PGDATA/graph/. It has the same file permissions as other Postgres data files (owned by the postgres user, mode 0600).
If an attacker gains access to $PGDATA, they already have access to your raw table data — the .graph file is the least of your concerns.
Extension Permissions
Functions
All graph functions live in the graph schema. Access requires USAGE on the schema:
-- Grant access to a role
GRANT USAGE ON SCHEMA graph TO app_role;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA graph TO app_role;
-- Revoke build/admin functions from non-admin roles
REVOKE EXECUTE ON FUNCTION graph.build() FROM app_role;
REVOKE EXECUTE ON FUNCTION graph.reset() FROM app_role;
REVOKE EXECUTE ON FUNCTION graph.vacuum() FROM app_role;
Recommended Role Setup
-- Admin role: can build, vacuum, register tables
CREATE ROLE graph_admin;
GRANT ALL ON SCHEMA graph TO graph_admin;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA graph TO graph_admin;
-- Query role: explicit whitelist of allowed functions
-- (Using explicit grants rather than GRANT ALL + REVOKE, because the REVOKE
-- pattern is fragile — adding new admin functions in future versions would
-- accidentally grant them to readers.)
CREATE ROLE graph_reader;
GRANT USAGE ON SCHEMA graph TO graph_reader;
GRANT EXECUTE ON FUNCTION graph.traverse(regclass, text, int, text[], text, regclass[], jsonb, text, text, text, boolean, boolean, int, int, int, int) TO graph_reader;
GRANT EXECUTE ON FUNCTION graph.shortest_path(regclass, text, regclass, text, int, boolean) TO graph_reader;
GRANT EXECUTE ON FUNCTION graph.weighted_shortest_path(regclass, text, regclass, text) TO graph_reader;
GRANT EXECUTE ON FUNCTION graph.search(text, text, regclass, text, boolean, int, int, text, boolean, boolean) TO graph_reader;
GRANT EXECUTE ON FUNCTION graph.status() TO graph_reader;
Threat Model
| Threat | Mitigation |
|---|---|
| SQL injection via table parameters | REGCLASS type validates at call time, plus pg_catalog cross-check |
| SQL injection via column names, property keys, filter payloads, or tenant values | Validate identifiers against pg_catalog, quote identifiers with Postgres APIs, bind values through SPI parameters, reject unsupported JSONB filter payloads before SQL generation |
| User traverses table they can’t access | ACL pre-flight check (Layer 1) — O(1), before BFS |
| Denial of service via deep traversal | Circuit breakers: max_nodes, max_depth, max_frontier |
| Memory exhaustion via large graph | graph.memory_limit_mb hard cap + OOM guard (ERROR, not crash) |
| Cross-tenant topology leakage | Layer 3 tenant-scoped traversal (optional) |
| Ghost traversal (see above) | Documented, with mitigation options for each security level |
| Stale graph serving wrong results | is_active tombstoning, generation counters (future) |
| .graph file tampering | CRC32 checksum, magic bytes verification |
| Background worker crash | Auto-restart with 5-second backoff |
| Rust panic in extension | catch_unwind on all entry points → PG ERROR, not crash |
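As an illustration of the checksum and magic-bytes row, the Rust sketch below verifies a file image before trusting its contents. The layout (four magic bytes, payload, trailing little-endian CRC32), the `GRPH` magic value, and the `encode`/`verify` helpers are assumptions made for this example; the real .graph format is not described here.

```rust
/// Hypothetical file layout for this sketch only:
/// [4 magic bytes][payload ...][CRC32 of payload, little-endian u32]
const MAGIC: &[u8; 4] = b"GRPH";

/// Bitwise CRC-32 (IEEE 802.3, reflected polynomial 0xEDB88320).
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            let mask = (crc & 1).wrapping_neg(); // all ones if the low bit is set
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

#[derive(Debug, PartialEq)]
enum LoadError {
    TooShort,
    BadMagic,
    ChecksumMismatch { expected: u32, actual: u32 },
}

/// Wrap a payload in the hypothetical layout.
fn encode(payload: &[u8]) -> Vec<u8> {
    let mut file = Vec::with_capacity(payload.len() + 8);
    file.extend_from_slice(MAGIC);
    file.extend_from_slice(payload);
    file.extend_from_slice(&crc32(payload).to_le_bytes());
    file
}

/// Check magic bytes and checksum; return the payload only if both match.
fn verify(file: &[u8]) -> Result<&[u8], LoadError> {
    if file.len() < 8 {
        return Err(LoadError::TooShort);
    }
    let (magic, rest) = file.split_at(4);
    if magic != &MAGIC[..] {
        return Err(LoadError::BadMagic);
    }
    let (payload, tail) = rest.split_at(rest.len() - 4);
    let expected = u32::from_le_bytes([tail[0], tail[1], tail[2], tail[3]]);
    let actual = crc32(payload);
    if expected != actual {
        return Err(LoadError::ChecksumMismatch { expected, actual });
    }
    Ok(payload)
}

fn main() {
    let mut file = encode(b"topology");
    println!("{:?}", verify(&file).map(|p| p.len()));
    file[5] ^= 0x01; // flip one payload bit
    println!("{:?}", verify(&file));
}
```

A CRC catches accidental corruption and naive edits, but it is not a cryptographic integrity check: anyone able to rewrite the file can also recompute the checksum.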