Security Model

Design Principle

graph stores topology, not data. Your security policies stay in Postgres.

The graph index contains:

  • Node indices (u32)
  • Edge connections (source → target, type_id)
  • Bloom signatures (lossy, non-reconstructable)
  • TokenIndex tokens (property key:value pairs, ≤ graph.max_token_length chars)
  • Node metadata (source_table OID, source_pk)

It does NOT contain:

  • Full row payloads
  • Sensitive field values (beyond indexed short properties)
  • Passwords or other PII, unless you explicitly register those columns

Security Layers

graph implements security at three layers, each providing a different guarantee:

Layer 1: ACL Pre-Flight Check (BEFORE BFS)
  → "Can this user even START a traversal on this table?"
  → O(1), negligible cost, enforced by graph

Layer 2: Hydration-Time RLS (AFTER BFS, DURING JOIN)
  → "Can this user SEE the payload of each discovered node?"
  → Enforced by Postgres (standard RLS policies)

Layer 3: Tenant-Scoped Traversal (OPTIONAL, DURING BFS)
  → "Should the BFS even VISIT nodes from other tenants?"
  → Prevents topology leakage across tenant boundaries

Layer 1: ACL Pre-Flight Check

Before dropping into the Rust BFS hot loop, every graph query function verifies that the calling user has SELECT permission on the seed table. This uses Postgres’s internal C function pg_class_aclcheck() — the same function Postgres uses for its own permission checks.

// Executed BEFORE any graph traversal begins.
// pg_sys calls are raw FFI into Postgres, so they need an unsafe block.
let acl_result = unsafe {
    let user_oid = pgrx::pg_sys::GetUserId();
    pgrx::pg_sys::pg_class_aclcheck(
        seed_table_oid,
        user_oid,
        pgrx::pg_sys::ACL_SELECT as pgrx::pg_sys::AclMode,
    )
};
if acl_result != pgrx::pg_sys::AclResult::ACLCHECK_OK {
    return Err(GraphError::AclDenied {
        table: table_name.to_string(),
    });
}

Where ACL Checks Happen

| Function | Tables Checked |
| --- | --- |
| graph.traverse() | Seed table |
| graph.shortest_path() | Both source and target tables |
| graph.search() | table_filter (if specified) |
| graph.build() | All registered tables (during build) |

Cost: O(1) per check, negligible next to the traversal itself. The check runs once per function call, not once per node visited.


Layer 2: Hydration-Time RLS (Default Security)

The traversal function returns (node_table REGCLASS, node_id TEXT) pairs. When the user JOINs back to the source tables, Postgres RLS policies apply automatically:

-- RLS policy on users table (RLS must be enabled for the policy to apply)
ALTER TABLE users ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON users
    USING (organisation_id = current_setting('app.tenant_id'));

-- Traversal returns ALL connected node IDs (fast)
-- The JOIN enforces RLS (secure)
SELECT u.name, u.email, g.depth
FROM graph.traverse('users', 'U-123', 4) g
JOIN users u ON u.id = g.node_id
WHERE g.node_table = 'users'::regclass;

-- If the user lacks SELECT on `users`, Postgres rejects the query with a
-- permission error (and the Layer 1 pre-flight check fails first).
-- If RLS hides certain users, they are filtered out automatically.
-- graph doesn't need to know anything about your security policies.

This is the recommended pattern and the most Postgres-native approach. The traversal is fast (no security checks in the BFS loop), and RLS enforcement happens during hydration — exactly how Postgres is designed to work.


Layer 3: Tenant-Scoped Traversal (Optional)

For multi-tenant applications where you want to prevent topology leakage (user A shouldn’t even know that a node in tenant B exists), graph supports tenant-level filtering during BFS.

When a tenant_column is registered with graph.add_table(), the engine builds per-tenant bitmaps in the TokenIndex during build(). At query time, graph reads the current tenant context from a Postgres session variable (e.g., current_setting('app.tenant_id')) and restricts BFS expansion to nodes matching that tenant. Within that registered tenant scope, nodes outside the tenant are not expanded or returned by traversal/search. This is a topology-leakage mitigation, not a blanket zero-leakage guarantee for every deployment or artifact.

-- 1. Register table with tenant column
SELECT graph.add_table('users',
    id_columns := ARRAY['id'],
    columns := ARRAY['name', 'email'],
    tenant_column := 'organisation_id'
);
SELECT graph.build();

-- 2. Set tenant context (typically done by your app/middleware)
SET app.tenant_id = 'org_123';

-- 3. Traverse: only nodes belonging to org_123 are visited
SELECT * FROM graph.traverse('users', 'U-123', 5);

Note: Tenant isolation at the BFS level requires tenant_column to be registered on every table in the graph. Tables without a tenant_column are treated as shared/global and are always traversable.
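The filtering rule described above can be sketched in a few lines of Rust. This is a minimal illustration, not the engine's implementation: the `Node` struct, the `Option<u32>` tenant tag (standing in for the per-tenant bitmaps), and the function name `tenant_scoped_bfs` are all assumptions made for the example. It shows the two behaviours the note calls out: foreign-tenant nodes never enter the frontier, and nodes without a tenant are always traversable.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical in-memory node. `tenant` is None for tables registered
/// without a tenant_column (treated as shared/global).
pub struct Node {
    pub tenant: Option<u32>,
    pub neighbors: Vec<usize>,
}

/// Breadth-first traversal that only expands nodes visible to `tenant`.
/// Returns (node index, depth) pairs in visit order.
pub fn tenant_scoped_bfs(
    nodes: &[Node],
    seed: usize,
    tenant: u32,
    max_depth: usize,
) -> Vec<(usize, usize)> {
    // A node is visible if it is shared or belongs to the caller's tenant.
    let visible = |i: usize| nodes[i].tenant.map_or(true, |t| t == tenant);
    let mut out = Vec::new();
    if seed >= nodes.len() || !visible(seed) {
        return out;
    }
    let mut seen = HashSet::from([seed]);
    let mut queue = VecDeque::from([(seed, 0usize)]);
    while let Some((idx, depth)) = queue.pop_front() {
        out.push((idx, depth));
        if depth == max_depth {
            continue;
        }
        for &next in &nodes[idx].neighbors {
            // Foreign-tenant nodes are skipped before they enter the frontier,
            // so nothing reachable only through them is discovered either.
            if visible(next) && seen.insert(next) {
                queue.push_back((next, depth + 1));
            }
        }
    }
    out
}
```

In this sketch a node that is reachable only through another tenant's node is never returned, even if it belongs to the caller's tenant. That is the behaviour that separates Layer 3 from hydration-time filtering.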


The “Ghost Traversal” Problem

This is the one structural vulnerability you must understand.

The Problem

Consider this graph:

[Public Node A] ─────→ [Secret RLS Node B] ─────→ [Public Node C]

If a user starts at Node A and does a 2-hop traversal, the Rust BFS hot loop (which is blind to RLS) will traverse through Node B and return Node C.

When Postgres does the hydration JOIN:

  • The user sees Node A and Node C (they have access)
  • The user does NOT see the payload of Node B (RLS hides it)
  • However, by seeing that Node C was returned at depth 2, the user now knows that a relationship exists between A and C through some intermediate node

In highly classified environments, knowing the topology exists can be a security breach even without seeing the payload.
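To make the leak concrete, here is a small Rust sketch of the A → B → C example. The names `blind_bfs` and `hydrate` are hypothetical, and `hydrate` is only a stand-in for the RLS-filtered JOIN, under the assumption that visibility can be modelled as one boolean per node.

```rust
use std::collections::VecDeque;

/// BFS with no visibility checks, as in the default (Layer 2 only) setup.
/// Returns (node, depth) pairs in visit order.
pub fn blind_bfs(adj: &[Vec<usize>], seed: usize, max_depth: usize) -> Vec<(usize, usize)> {
    let mut seen = vec![false; adj.len()];
    let mut order = Vec::new();
    let mut queue = VecDeque::new();
    if seed < adj.len() {
        seen[seed] = true;
        queue.push_back((seed, 0usize));
    }
    while let Some((node, depth)) = queue.pop_front() {
        order.push((node, depth));
        if depth == max_depth {
            continue;
        }
        for &next in &adj[node] {
            if !seen[next] {
                seen[next] = true;
                queue.push_back((next, depth + 1));
            }
        }
    }
    order
}

/// Stand-in for the hydration JOIN: keeps only rows the caller may see.
pub fn hydrate(found: &[(usize, usize)], visible: &[bool]) -> Vec<(usize, usize)> {
    found.iter().copied().filter(|&(n, _)| visible[n]).collect()
}
```

With nodes 0, 1, 2 standing for A, B, C and B marked invisible, the hydrated result holds A at depth 0 and C at depth 2. Nothing appears at depth 1, and that gap is the signal that some hidden intermediate node connects them.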

Impact Assessment

| Environment | Risk Level | Recommended Approach |
| --- | --- | --- |
| Standard OLTP | Low | Layer 2 (hydration RLS) is sufficient |
| Multi-tenant SaaS | Medium | Use Layer 3 (tenant-scoped traversal) |
| Classified / financial | High | Use Layer 3 + application-level path filtering |

Mitigation Options

  1. Tenant-scoped traversal (Layer 3): If tenants should never see each other’s nodes, register a tenant_column and always set the tenant context (e.g., app.tenant_id) before querying. Nodes from other tenants are never expanded in BFS.

  2. Post-traversal path validation: After traversal, the application can filter paths where any intermediate node belongs to a table the user can’t access:

-- Application-level path filtering
-- (the CTE is named `hits` to avoid confusion with the `graph` schema)
WITH hits AS (
    SELECT * FROM graph.traverse('users', 'U-123', 5)
)
SELECT h.*
FROM hits h
WHERE NOT EXISTS (
    -- Remove results whose path passes through restricted tables
    SELECT 1
    FROM unnest(h.path) AS path_node
    WHERE NOT has_table_privilege(
        current_user,
        (SELECT node_table FROM hits WHERE node_id = path_node LIMIT 1),
        'SELECT'
    )
);
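The same idea can be applied in application code once the traversal rows have been fetched. The sketch below is an assumption-laden illustration: `Hit` is a hypothetical row type whose `path` lists node ids from the seed to the result, and `can_see` is whatever visibility check the application already has (table privilege, RLS probe, or tenant check).

```rust
/// Hypothetical traversal row: `path` lists node ids from the seed to
/// `node`, inclusive.
pub struct Hit {
    pub node: u32,
    pub path: Vec<u32>,
}

/// Drops every hit whose path touches a node the caller cannot see.
pub fn drop_ghost_paths<F: Fn(u32) -> bool>(hits: Vec<Hit>, can_see: F) -> Vec<Hit> {
    hits.into_iter()
        .filter(|h| h.path.iter().all(|&n| can_see(n)))
        .collect()
}
```

Unlike the SQL version, which checks table-level privileges, this variant checks each node on the path, so it can also cover rows hidden by RLS, provided `can_see` reflects those policies.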

The Honest Position

Layer 2 (hydration RLS) is often sufficient for ordinary application retrieval because payload visibility remains governed by Postgres. The ghost traversal problem matters when:

  1. The existence of a relationship is itself classified
  2. AND the user has access to nodes on both sides of a hidden node
  3. AND the user can infer the hidden node’s identity from the path structure

Weigh these conditions against your threat model and choose the security layer that fits.


Data Safety

What the Extension Stores

| Data | Where | Reconstructable? | Risk |
| --- | --- | --- | --- |
| Node indices (u32) | RAM + .graph file | No (meaningless without mapping) | None |
| Edge connections | RAM + .graph file | Topology only | Low |
| Bloom signatures (u64) | RAM + .graph file | No (lossy hash) | None |
| TokenIndex tokens | RAM + .graph file | ⚠️ Yes (plaintext key:value) | Medium |
| source_table + source_pk | RAM + .graph file | ⚠️ Yes (identifies source row) | Medium |
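To illustrate why a 64-bit Bloom signature is lossy, here is a minimal sketch. It is not the extension's hashing scheme: the use of `DefaultHasher`, the constant `BITS_PER_TOKEN`, and the function names are all assumptions chosen for the example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Bits set per token; an assumption for this sketch.
const BITS_PER_TOKEN: u64 = 3;

/// Maps a token to a handful of bit positions in a 64-bit word.
fn token_mask(token: &str) -> u64 {
    let mut mask = 0u64;
    for i in 0..BITS_PER_TOKEN {
        let mut h = DefaultHasher::new();
        (i, token).hash(&mut h);
        mask |= 1u64 << (h.finish() % 64);
    }
    mask
}

/// ORs the masks of all tokens into one signature.
pub fn signature<'a>(tokens: impl IntoIterator<Item = &'a str>) -> u64 {
    tokens.into_iter().fold(0, |sig, t| sig | token_mask(t))
}

/// True if the token MAY be present; false means definitely absent.
pub fn may_contain(sig: u64, token: &str) -> bool {
    let m = token_mask(token);
    sig & m == m
}
```

Many different token sets collapse to the same 64 bits, so the signature supports "maybe present / definitely absent" checks but cannot be inverted into the original values.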

TokenIndex Security

The TokenIndex stores plaintext key:value tokens for properties ≤ graph.max_token_length (default 128 characters). This means:

  • name:alice is stored in the index
  • status:active is stored in the index
  • A 5000-character description field is NOT stored (exceeds threshold, falls back to Bloom)
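The threshold rule can be sketched as follows. This assumes the limit is applied to the property value's character count, which the text does not spell out; the enum `Indexed` and the function `index_property` are hypothetical names for illustration.

```rust
/// How a property ends up represented in the index (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Indexed {
    /// Stored in plaintext as "key:value" in the TokenIndex.
    Token(String),
    /// Too long for the TokenIndex; only contributes to the Bloom signature.
    BloomOnly,
}

/// Decides the representation of one property.
/// Assumption: `max_token_length` counts characters of the value.
pub fn index_property(key: &str, value: &str, max_token_length: usize) -> Indexed {
    if value.chars().count() <= max_token_length {
        Indexed::Token(format!("{key}:{value}"))
    } else {
        Indexed::BloomOnly
    }
}
```

The practical takeaway is that any short value in a registered column is stored in readable form, so the choice of registered columns is what controls exposure.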

Mitigation options:

  1. Exclude sensitive columns from columns when registering tables:

    -- Don't index SSN, DOB, or salary
    SELECT graph.add_table('employees',
        id_columns := ARRAY['id'],
        columns := ARRAY['name', 'department', 'status']
        -- SSN, date_of_birth, salary are NOT indexed
    );
  2. Use Bloom-only mode (future): A configuration option to disable TokenIndex entirely and use only Bloom filters for property filtering. Bloom signatures are non-reconstructable.

.graph File Security

The .graph file is stored in $PGDATA/graph/. It has the same file permissions as other Postgres data files (owned by the postgres user, mode 0600).

If an attacker gains access to $PGDATA, they already have access to your raw table data — the .graph file is the least of your concerns.


Extension Permissions

Functions

All graph functions live in the graph schema. Access requires USAGE on the schema:

-- Grant access to a role
GRANT USAGE ON SCHEMA graph TO app_role;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA graph TO app_role;

-- Revoke build/admin functions from non-admin roles
-- Note: Postgres grants EXECUTE on functions to PUBLIC by default, so these
-- REVOKEs only restrict app_role if PUBLIC's EXECUTE has also been revoked
-- (unless the extension's install script already does that).
REVOKE EXECUTE ON FUNCTION graph.build() FROM app_role;
REVOKE EXECUTE ON FUNCTION graph.reset() FROM app_role;
REVOKE EXECUTE ON FUNCTION graph.vacuum() FROM app_role;

-- Admin role: can build, vacuum, register tables
CREATE ROLE graph_admin;
GRANT ALL ON SCHEMA graph TO graph_admin;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA graph TO graph_admin;

-- Query role: explicit whitelist of allowed functions
-- (Using explicit grants rather than GRANT ALL + REVOKE, because the REVOKE
-- pattern is fragile: adding new admin functions in future versions would
-- accidentally grant them to readers.)
CREATE ROLE graph_reader;
GRANT USAGE ON SCHEMA graph TO graph_reader;
GRANT EXECUTE ON FUNCTION graph.traverse(regclass, text, int, text[], text, regclass[], jsonb, text, text, text, boolean, boolean, int, int, int, int) TO graph_reader;
GRANT EXECUTE ON FUNCTION graph.shortest_path(regclass, text, regclass, text, int, boolean) TO graph_reader;
GRANT EXECUTE ON FUNCTION graph.weighted_shortest_path(regclass, text, regclass, text) TO graph_reader;
GRANT EXECUTE ON FUNCTION graph.search(text, text, regclass, text, boolean, int, int, text, boolean, boolean) TO graph_reader;
GRANT EXECUTE ON FUNCTION graph.status() TO graph_reader;

Threat Model

| Threat | Mitigation |
| --- | --- |
| SQL injection via table parameters | REGCLASS type validates at call time, plus pg_catalog cross-check |
| SQL injection via column names, property keys, filter payloads, or tenant values | Validate identifiers against pg_catalog, quote identifiers with Postgres APIs, bind values through SPI parameters, reject unsupported JSONB filter payloads before SQL generation |
| User traverses table they can’t access | ACL pre-flight check (Layer 1): O(1), before BFS |
| Denial of service via deep traversal | Circuit breakers: max_nodes, max_depth, max_frontier |
| Memory exhaustion via large graph | graph.memory_limit_mb hard cap + OOM guard (ERROR, not crash) |
| Cross-tenant topology leakage | Layer 3 tenant-scoped traversal (optional) |
| Ghost traversal (see above) | Documented, with mitigation options for each security level |
| Stale graph serving wrong results | is_active tombstoning, generation counters (future) |
| .graph file tampering | CRC32 checksum, magic bytes verification |
| Background worker crash | Auto-restart with 5-second backoff |
| Rust panic in extension | catch_unwind on all entry points → PG ERROR, not crash |
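
As an illustration of the circuit breakers named in the table, here is a minimal level-by-level BFS that stops when a limit is hit. The struct `Limits`, the enum `Tripped`, and the choice to return an error rather than a truncated result are assumptions for this sketch; the extension's actual behaviour on a tripped breaker may differ.

```rust
use std::collections::HashSet;

/// Which breaker stopped the traversal (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Tripped {
    MaxNodes,
    MaxFrontier,
}

/// Traversal limits, mirroring the names used in the threat model.
pub struct Limits {
    pub max_nodes: usize,
    pub max_depth: usize,
    pub max_frontier: usize,
}

/// BFS that expands one depth level at a time and aborts on a tripped limit.
pub fn guarded_bfs(adj: &[Vec<usize>], seed: usize, lim: &Limits) -> Result<Vec<usize>, Tripped> {
    if seed >= adj.len() {
        return Ok(Vec::new());
    }
    let mut seen = HashSet::from([seed]);
    let mut visited = vec![seed];
    let mut frontier = vec![seed];
    // max_depth bounds the number of expansion rounds.
    for _depth in 0..lim.max_depth {
        let mut next = Vec::new();
        for &n in &frontier {
            for &m in &adj[n] {
                if seen.insert(m) {
                    if visited.len() >= lim.max_nodes {
                        return Err(Tripped::MaxNodes);
                    }
                    visited.push(m);
                    next.push(m);
                    if next.len() > lim.max_frontier {
                        return Err(Tripped::MaxFrontier);
                    }
                }
            }
        }
        if next.is_empty() {
            break;
        }
        frontier = next;
    }
    Ok(visited)
}
```

Checking the limits inside the inner loop, rather than once per level, keeps a single high-degree node from allocating an oversized frontier before the breaker fires.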