PageRank Entry Points & Traversal Counters

Graph traversals have to start somewhere. sutraDB uses PageRank to pick structurally important starting nodes, and runtime traversal counters to identify hot areas that deserve materialized adjacency lists. The two optimizations are complementary: PageRank is structural, derived from the shape of the graph, while the counters are adaptive, driven by the queries that actually run.

The Problem

HNSW starts from an entry point that is effectively random with respect to graph structure, then descends greedily. For graph traversals, a random start tends to mean more hops to reach the target. Separately, repeating a B-tree index lookup for every neighbor of a hot node wastes cycles on cache misses.

PageRank Entry Points

PageRank identifies nodes that many paths flow through. Starting traversals at high-PageRank nodes tends to mean fewer hops to reach a given destination, because well-connected nodes sit close to much of the graph. It is the same measure Google's web search originally used to identify authoritative pages.
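As an illustration of the idea, here is a minimal sketch of PageRank by power iteration, followed by picking the top-ranked nodes as entry points. It is not sutraDB's implementation: the function names `pagerank` and `pick_entry_points`, the dict-of-lists graph representation, the damping factor of 0.85, and the fixed iteration count are all assumptions made for the example.

```python
from typing import Dict, Hashable, List


def pagerank(
    graph: Dict[Hashable, List[Hashable]],
    damping: float = 0.85,   # assumed value, the commonly quoted default
    iterations: int = 50,    # assumed fixed count instead of a convergence test
) -> Dict[Hashable, float]:
    """Power-iteration PageRank over {node: [out-neighbors]}.

    Every node must appear as a key, even if it has no outgoing edges.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}

    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not graph[v])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}

        # Each node passes an equal share of its rank to each out-neighbor.
        for v in nodes:
            out = graph[v]
            if out:
                share = damping * rank[v] / len(out)
                for w in out:
                    new_rank[w] += share
        rank = new_rank

    return rank


def pick_entry_points(graph: Dict[Hashable, List[Hashable]], k: int = 1) -> List[Hashable]:
    """Return the k highest-ranked nodes as hypothetical traversal entry points."""
    rank = pagerank(graph)
    return sorted(rank, key=rank.get, reverse=True)[:k]


# Small example: three nodes point at "c", which points back at "a".
example = {"a": ["c"], "b": ["c"], "d": ["c"], "c": ["a"]}
entry_points = pick_entry_points(example, k=1)
```

In this toy graph, `c` receives links from every other node, so it ranks highest and is chosen as the entry point.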

Traversal Counters

As queries run, sutraDB increments a counter each time a node's neighbors are looked up. When a node crosses a hotness threshold, its adjacency list is materialized — all neighbors pre-loaded into contiguous memory. This is adaptive: cold nodes stay in the B-tree, hot nodes get fast flat arrays.
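The counter mechanism described above can be sketched as follows. This is a hedged illustration rather than sutraDB's code: the class name `AdaptiveAdjacency`, the `hot_threshold` parameter and its default, integer node IDs, and the plain dict standing in for the B-tree index are all assumptions. A typed `array` plays the role of the flat, contiguous adjacency list.

```python
from array import array
from typing import Dict, Iterable, Sequence


class AdaptiveAdjacency:
    """Counts neighbor lookups per node and materializes hot nodes."""

    def __init__(self, index: Dict[int, Iterable[int]], hot_threshold: int = 3):
        self._index = index                  # stand-in for the B-tree index
        self._hot_threshold = hot_threshold  # hypothetical tuning parameter
        self._counts: Dict[int, int] = {}
        self._materialized: Dict[int, array] = {}

    def neighbors(self, node: int) -> Sequence[int]:
        # Every neighbor lookup increments the node's traversal counter.
        count = self._counts.get(node, 0) + 1
        self._counts[node] = count

        # Hot path: the node already has a flat adjacency array.
        hot = self._materialized.get(node)
        if hot is not None:
            return hot

        # Cold path: read the neighbors from the index.
        result = list(self._index.get(node, ()))

        # Once the counter reaches the threshold, copy the neighbors
        # into a contiguous array of 64-bit integers.
        if count >= self._hot_threshold:
            self._materialized[node] = array("q", result)
        return result

    def is_materialized(self, node: int) -> bool:
        return node in self._materialized


# Usage: node 1 becomes hot on its third lookup; node 2 stays cold.
adjacency = AdaptiveAdjacency({1: [2, 3], 2: [3], 3: []}, hot_threshold=3)
for _ in range(3):
    adjacency.neighbors(1)
adjacency.neighbors(2)
```

A real system would also need to invalidate or update a materialized list when the node's edges change, which this sketch leaves out.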

Steps