# Database correlation
The moment a replay shows a route getting hammered by a slow query, the next question is always the same: which query, and which request fired it? This page covers how clearvoiance answers that without patching your ORM or adding a span to every statement.
There are exactly two strategies, picked automatically by the adapter you import from the SDK:

- **Observer-based** — the SDK stamps an identifier onto every database connection; an out-of-band observer reads introspection tables (`pg_stat_activity`, `pg_stat_statements`) and emits observation events. Zero in-process overhead on your query path.
- **SDK-side emission** — the adapter hooks the driver's middleware, times each op, and emits a `DbObservationEvent` directly. Used where no suitable introspection view exists (MongoDB).
Both paths produce the same event shape, so the dashboard shows DB activity from all drivers on one timeline.
## The identifier format
Whichever strategy is in play, the SDK encodes the active event id into an identifier the observer can parse back out:
```
clv:<replayId?>:<eventId>
```
- `clv:` — fixed prefix so multiple clearvoiance installs on the same DB don't clobber non-clearvoiance connections.
- `<replayId>` — optional. Set when the SDK is running in a replay so the dashboard can filter to ops that happened during that specific replay (vs. the original capture).
- `<eventId>` — the id of the HTTP / cron / queue / socket event that opened the current `AsyncLocalStorage` scope. Empty when the query fires outside a capture scope.
For Postgres this goes into `application_name`, which the server truncates to 63 bytes — the SDK truncates before sending so the round-trip stays exact.
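The format and the 63-byte cap can be sketched as a pair of helpers. This is illustrative: `buildClvAppName` is a hypothetical name, and the parser here only models the shape of the exported `parseClvAppName`.

```ts
// Build the documented clv identifier, truncating to Postgres's 63-byte
// application_name limit before sending (buildClvAppName is hypothetical).
function buildClvAppName(eventId: string, replayId = ""): string {
  const name = `clv:${replayId}:${eventId}`;
  // Truncate by bytes, not chars, to match the server-side limit exactly.
  return Buffer.from(name, "utf8").subarray(0, 63).toString("utf8");
}

// Parse it back out, the way an observer would.
function parseClvAppName(
  name: string,
): { replayId?: string; eventId: string } | null {
  const m = /^clv:([^:]*):(.*)$/.exec(name);
  if (!m) return null;
  return { replayId: m[1] || undefined, eventId: m[2] };
}
```

Because the SDK truncates before sending, building then parsing an identifier always round-trips to the same string the observer reads out of `pg_stat_activity`.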
## Supported drivers
| Driver / ORM | Strategy | Import |
|---|---|---|
| `pg` (`Pool`) | Observer-based | `@clearvoiance/node/db/postgres` |
| Knex (pg) | Observer-based | `@clearvoiance/node/db/knex` |
| Prisma (pg) | Observer-based | `@clearvoiance/node/db/prisma` |
| Mongoose | SDK-side emit | `@clearvoiance/node/db/mongoose` |
Drivers not listed are safe to use alongside — the SDK just won't produce DB observations for them. Knex in particular is a silent no-op when `db.client.driverName !== "pg"`, so it's safe to wire up unconditionally in apps that mix drivers across environments.
## Observer-based (Postgres)
The flow for every query your app runs under a capture scope:
```
  your code       SDK (per-connection)         observer     engine
┌───────────┐   ┌────────────────────────┐   ┌────────────┐   ┌─────────┐
│ knex      │──▶│ SET application_name   │──▶│ polls      │──▶│ emits   │
│ .select() │   │ = 'clv:ev_a1b2'        │   │ pg_stat_*  │   │ DbObs   │
└───────────┘   └────────────────────────┘   └────────────┘   └─────────┘
      ↑                                            │
      │ once per connection                        │ correlates on
      │ on the first query of                      │ application_name →
      │ each AsyncLocalStorage                     │ sets caused_by_event_id
      │ scope                                      │
```
Why SET + observer, not client-side interceptors:
- Truthful timing. The observer reports what the database actually did, including lock waits and disk I/O. An interceptor only sees wall-clock time on the client side.
- No per-query cost. The SET happens once per connection per event; the rest of your queries run un-instrumented.
- Works for queries the app doesn't know it's running. Prisma's query engine, Strapi's populate queries, background connection keep-alives — everything that shares the pool gets tagged.
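The correlation step on the observer side can be modeled as a pure function over `pg_stat_activity` rows. This is a simplified sketch — the real observer also tracks durations and deduplicates across polls; the row shape below is just the relevant standard columns.

```ts
// Simplified model of one observer poll: given pg_stat_activity rows,
// attribute each clv-tagged query to the event that caused it.
type PgStatActivityRow = {
  pid: number;
  query: string;
  application_name: string;
};

type Attribution = { causedByEventId: string; replayId?: string; query: string };

function attribute(rows: PgStatActivityRow[]): Attribution[] {
  const out: Attribution[] = [];
  for (const row of rows) {
    // Only clearvoiance-stamped sessions participate; everything else
    // (psql, migrations, other tooling) is left alone.
    const m = /^clv:([^:]*):(.+)$/.exec(row.application_name);
    if (!m) continue;
    out.push({
      causedByEventId: m[2],
      replayId: m[1] || undefined,
      query: row.query,
    });
  }
  return out;
}
```

The real poll would run something like `SELECT pid, query, application_name FROM pg_stat_activity WHERE application_name LIKE 'clv:%'` and feed the rows through a step like this before emitting `DbObs` events.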
### pg.Pool (raw)
```ts
import { Pool } from "pg";
import { instrumentPg } from "@clearvoiance/node/db/postgres";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });
instrumentPg(pool, { replayId: process.env.CLEARVOIANCE_REPLAY_ID });
```
`instrumentPg` subscribes to `pool.on("connect")` so every new client gets wrapped on creation. Pre-existing clients are wrapped retroactively if `pool._clients` is readable.
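The stamping itself boils down to issuing a `SET` on each freshly connected client — roughly this (a sketch, not the actual implementation; `setAppNameSql` is a hypothetical helper):

```ts
// Illustrative: what stamping a connection amounts to. SET does not accept
// bind parameters, so the value is escaped and inlined. Truncation here is
// char-based, which matches the byte limit for ASCII ids.
function setAppNameSql(eventId: string, replayId = ""): string {
  const name = `clv:${replayId}:${eventId}`.slice(0, 63);
  return `SET application_name = '${name.replace(/'/g, "''")}'`;
}

// Wired up roughly as:
// pool.on("connect", (client) => {
//   client.query(setAppNameSql(currentEventId() ?? ""));
// });
```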
### Knex
Knex manages its own tarn.js pool, so the `pool.on("connect")` hook above doesn't fire on the clients Knex actually uses. The Knex adapter instead overrides `knex.client.acquireRawConnection` so every `pg.Client` Knex creates comes back already wrapped.
```ts
import knex from "knex";
import { instrumentKnex } from "@clearvoiance/node/db/knex";

const db = knex({ client: "pg", connection: process.env.DATABASE_URL });
instrumentKnex(db, { replayId: process.env.CLEARVOIANCE_REPLAY_ID });
```
The adapter also walks `db.client.pool._freeObjects` + `_usedObjects` to retroactively wrap any connections Knex opened before `instrumentKnex` was called. Useful in Strapi-style setups where the query builder is constructed well before middleware boots.
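The override pattern itself is generic enough to sketch. `acquireRawConnection` is the real Knex internal named above; the wrapper function is illustrative:

```ts
// Patch a Knex-style client so every raw connection it acquires is passed
// through `wrap` before Knex ever uses it.
function wrapAcquireRaw<T>(
  client: { acquireRawConnection: (...args: unknown[]) => Promise<T> },
  wrap: (conn: T) => T,
): void {
  const original = client.acquireRawConnection;
  client.acquireRawConnection = async function (...args: unknown[]) {
    const conn = await original.apply(this, args);
    return wrap(conn); // e.g. attach the clv application_name stamping
  };
}
```

Because the override happens on the client, it catches connections from every future pool checkout, regardless of when the pool itself was created.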
### Prisma
Prisma's query engine runs as a separate process the client talks to over a local socket, so the pg-`Pool` hook can't reach it. The Prisma adapter uses Prisma's own middleware API to set the `application_name` at the start of each transaction.
```ts
import { PrismaClient } from "@prisma/client";
import { instrumentPrisma } from "@clearvoiance/node/db/prisma";

const prisma = instrumentPrisma(new PrismaClient(), {
  replayId: process.env.CLEARVOIANCE_REPLAY_ID,
});
```
### Running the observer
Observer-based strategies require the observer to be running somewhere that can reach your Postgres. Typical setup: one observer container alongside the engine, pointed at the same DB your SUT uses. See the Deployment guide for the compose / stack config.
The observer's poll interval is configurable (default 1s), and it only reports queries whose `query_start` falls inside an active session, so idle DB traffic doesn't show up in the dashboard.
## SDK-side emission (Mongoose)
MongoDB doesn't have a `pg_stat_activity` equivalent that's stable across versions and deployment modes (Atlas, self-hosted, replica set). Rather than ship a MongoDB observer that breaks on every server rev, the Mongoose adapter takes the in-process path:
```
   your code            SDK plugin                     engine
┌───────────────┐   ┌──────────────────────────────┐   ┌──────────┐
│ User.find()   │──▶│ pre hook: record hrtime      │──▶│ DbObs    │
│               │   │ post hook: compute duration, │   │ event    │
│               │   │ emit DbObsEvent              │   │          │
└───────────────┘   └──────────────────────────────┘   └──────────┘
```
The plugin installs via `mongoose.plugin(...)` so every future schema picks it up. Schemas registered before the install don't get the hook — wire it up at the top of your bootstrap:
```ts
import mongoose from "mongoose";
import { createClient } from "@clearvoiance/node";
import { instrumentMongoose } from "@clearvoiance/node/db/mongoose";

const client = createClient({ /* ... */ });
await client.start();

// Install BEFORE any Model is defined.
instrumentMongoose(mongoose, client, {
  slowThresholdMs: 50, // 0 = emit every op (chatty but full fidelity)
  replayId: process.env.CLEARVOIANCE_REPLAY_ID,
});
```
What gets hooked:
- Query middleware: `find`, `findOne`, `findOneAndUpdate`, `findOneAndDelete`, `updateOne`, `updateMany`, `deleteOne`, `deleteMany`, `replaceOne`, `countDocuments`, `estimatedDocumentCount`, `distinct`, `aggregate`.
- Document middleware: `save`, `validate`, `remove`, `deleteOne`.
Each emitted event carries `query_fingerprint = "<Model>.<op>"` and `query_text = "<Model>.<op>(<short filter JSON or empty>)"`. Filter payloads are dropped when they exceed 200 chars to avoid silently persisting PII into ClickHouse.
Ops that fire outside any capture scope are dropped without being emitted — there's nothing to correlate them to, and they'd only add noise.
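A sketch of how those two fields could be derived — the field names and the 200-char drop are the documented behavior, the helper itself is hypothetical:

```ts
// Build the documented query_text shape, dropping the filter payload
// outright (not truncating) once it exceeds 200 chars, so partial PII
// never reaches ClickHouse.
function buildQueryText(model: string, op: string, filter: unknown): string {
  let rendered = "";
  try {
    const json = JSON.stringify(filter ?? {});
    if (json !== "{}" && json.length <= 200) rendered = json;
  } catch {
    // Non-serializable filters (e.g. circular refs) render as empty.
  }
  return `${model}.${op}(${rendered})`;
}

// query_fingerprint is just the stable "<Model>.<op>" pair.
const fingerprint = (model: string, op: string) => `${model}.${op}`;
```

Dropping rather than truncating means a too-long filter never leaks a partial value into storage — the op name alone still identifies the query shape.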
## Writing your own adapter
The SDK is deliberately small and the adapter surface is just two functions. If you need to support a driver that isn't in the table above, pick whichever strategy fits:
### Observer-style contract
Your adapter should set a per-connection identifier whose format is:
```
clv:<replayId?>:<eventId>
```
truncated to the driver's identifier limit. For Postgres, reuse `parseClvAppName` from `@clearvoiance/node/db/postgres` so observers see one shape regardless of source. The identifier must update on every event-scope change, ideally once per acquired connection.
### SDK-side contract
- Read the active event id via `currentEventId()` (exported from the top-level `@clearvoiance/node` package).
- Time the op with `process.hrtime.bigint()`.
- Build a `DbObservationEvent` with `caused_by_event_id` set to the captured id and `observation_type = SLOW_QUERY`.
- Wrap it in an `Event` with `adapter: "db.<driver>"`, `timestamp_ns`, and a unique `id`.
- Call `client.sendBatch([event])`.
Drop ops that fire outside any event scope. Respect a `slowThresholdMs` option so users can opt out of emitting fast queries in high-QPS apps.
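Put together, the gate-and-build step common to any SDK-side adapter looks roughly like this. `currentEventId`, `DbObservationEvent`, and `client.sendBatch` are the surface named above; the helper and the exact field set are illustrative:

```ts
// Shared gate + build step for an SDK-side adapter: drop out-of-scope and
// fast ops, otherwise produce a SLOW_QUERY observation tied to the event.
type DbObservationEvent = {
  caused_by_event_id: string;
  observation_type: "SLOW_QUERY";
  duration_ms: number;
  query_text: string;
};

function buildObservation(
  eventId: string | undefined, // from currentEventId()
  durationMs: number,          // from process.hrtime.bigint() deltas
  queryText: string,
  slowThresholdMs: number,
): DbObservationEvent | null {
  if (!eventId) return null;                     // outside any capture scope: drop
  if (durationMs < slowThresholdMs) return null; // fast query: user opted out
  return {
    caused_by_event_id: eventId,
    observation_type: "SLOW_QUERY",
    duration_ms: durationMs,
    query_text: queryText,
  };
}

// The adapter then wraps the result in an Event
// ({ adapter: "db.<driver>", timestamp_ns, id }) and calls
// client.sendBatch([event]).
```

Note that `slowThresholdMs: 0` emits every op, matching the Mongoose adapter's documented behavior.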
The Mongoose adapter's 5-test suite is a reasonable template to copy.
## Next
- Core concepts — how DB observations fit into the broader capture / replay picture.
- Monitors — trigger a capture from the dashboard so the DB correlation has events to attach itself to.
- Deployment — running the observer alongside the engine.