Connecting Your Data

Understand how 42Cells connects to your databases, how it executes queries, and what data is stored where.

How Queries Execute

When you run a SQL cell, here's what happens:

  1. Connection — 42Cells opens a connection to your database from our secure pool
  2. Execution — Your query runs directly on your database
  3. Results — Rows are streamed back and stored in 42Cells
  4. Display — Results appear in your notebook

Connection Pooling

We maintain a pool of connections to reduce latency:

| Setting | Value |
| --- | --- |
| Pool size | 5 connections per database |
| Idle timeout | 5 minutes |
| Max age | 30 minutes |

Connections are health-checked before reuse and automatically recycled once they reach the idle timeout or max age.
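If you want to see the pool from the database side, you can count the open sessions. This is a minimal sketch assuming PostgreSQL and assuming 42Cells connects as a dedicated user, such as the `datacortex_reader` role created under Best Practices. `pg_stat_activity` and `max_connections` are standard PostgreSQL features.

```sql
-- PostgreSQL; assumes 42Cells connects as the dedicated user datacortex_reader.
-- Count that user's sessions per database and state (active, idle, ...).
SELECT datname, state, count(*) AS connections
FROM pg_stat_activity
WHERE usename = 'datacortex_reader'
GROUP BY datname, state
ORDER BY datname, state;

-- Compare against the server-wide connection limit.
SHOW max_connections;
```

With a pool size of 5 per database, the count for any one database should stay small relative to `max_connections`.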

Query Limits

| Limit | Value | Why |
| --- | --- | --- |
| Timeout | 30 seconds | Prevents runaway queries |
| Max rows | 10,000 | Keeps results manageable |
| Query type | SELECT only | Protects your data |

INSERT, UPDATE, and DELETE are blocked — 42Cells is read-only.
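To check whether a query fits within these limits before running it in a notebook, you can approximate them in your own database session. This is a minimal sketch assuming PostgreSQL and a hypothetical `orders` table. `statement_timeout` is a standard PostgreSQL setting, and the values mirror the limits listed above.

```sql
-- Approximate the 42Cells limits in your own PostgreSQL session.
-- "orders" and its columns are hypothetical; substitute your own table.

-- Cancel any statement in this session that runs longer than 30 seconds.
SET statement_timeout = '30s';

-- Aggregate in the database so the result stays well under 10,000 rows,
-- and add an explicit LIMIT as a safety net.
SELECT customer_id,
       count(*)    AS order_count,
       sum(amount) AS total_amount
FROM orders
GROUP BY customer_id
ORDER BY total_amount DESC
LIMIT 10000;

-- Restore the session default afterwards.
RESET statement_timeout;
```

Aggregating or filtering in the query, rather than relying on the row cap, keeps the stored result small and predictable.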

What Gets Stored Where

In 42Cells (our database)

| Data | Purpose |
| --- | --- |
| Notebook metadata | Names, descriptions, settings |
| Cell content | Your SQL queries, markdown text |
| Cell results | Query output (JSON, max 10,000 rows) |
| Credentials | Encrypted database passwords |
| Edges | Cell dependency graph |

In Your Database (untouched)

| Data | Status |
| --- | --- |
| Raw tables | Never modified |
| Schemas | Read-only introspection |
| User data | Never copied in bulk |

We never write to your database. All queries are SELECT-only.

Security Model

Credential Storage

  • Passwords encrypted at rest using industry-standard encryption
  • Decrypted only at query time, never logged

Access Control

  • Connections are organization-scoped — team members can use shared connections
  • Permission levels:
    • Can use in notebooks — Run queries
    • Can manage — Edit or delete the connection

Network

  • Connections made over encrypted channels
  • Consider using SSH tunnels or VPN for sensitive databases

Best Practices

Use Read-Only Users

Create a dedicated database user with SELECT-only permissions:

-- PostgreSQL
CREATE USER datacortex_reader WITH PASSWORD 'secure_password';
GRANT CONNECT ON DATABASE mydb TO datacortex_reader;
GRANT USAGE ON SCHEMA public TO datacortex_reader;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO datacortex_reader;
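Note that `GRANT SELECT ON ALL TABLES` only covers tables that exist when the statement runs. If tables created later should also be readable, one option in PostgreSQL is to set default privileges. A sketch, reusing the `datacortex_reader` role from above:

```sql
-- PostgreSQL
-- Applies to tables created later in schema public by the role that runs
-- this statement; add FOR ROLE <owner> if a different role creates the tables.
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT SELECT ON TABLES TO datacortex_reader;

-- Optional extra safeguard: make this role's transactions read-only by default.
-- A session can still override this setting, so the SELECT-only grants above
-- remain the real protection.
ALTER ROLE datacortex_reader SET default_transaction_read_only = on;
```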

Limit Accessible Schemas

Only grant access to schemas you want to query.
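For example, to expose a single reporting schema instead of `public`. This is a sketch assuming PostgreSQL, a hypothetical schema named `reporting`, and the `datacortex_reader` role from the previous section:

```sql
-- PostgreSQL; "reporting" is a hypothetical schema name.
-- Remove the access to public that was granted directly to this role.
-- Privileges granted to PUBLIC (every role) are not affected by these
-- statements and must be reviewed separately.
REVOKE SELECT ON ALL TABLES IN SCHEMA public FROM datacortex_reader;
REVOKE USAGE ON SCHEMA public FROM datacortex_reader;

-- Grant read access only to the schema notebooks should see.
GRANT USAGE ON SCHEMA reporting TO datacortex_reader;
GRANT SELECT ON ALL TABLES IN SCHEMA reporting TO datacortex_reader;
```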

Monitor Query Logs

Check your database logs to see what 42Cells is querying.
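If statement logging is not already enabled, PostgreSQL lets you turn it on for a single role. This is a sketch assuming PostgreSQL and the `datacortex_reader` role from above. Both settings are standard PostgreSQL parameters that normally require superuser rights to change, and they take effect for new sessions.

```sql
-- PostgreSQL
-- Log every statement issued by the 42Cells user.
ALTER ROLE datacortex_reader SET log_statement = 'all';

-- Alternatively, log only statements that take longer than 1 second
-- (the value is in milliseconds).
ALTER ROLE datacortex_reader SET log_min_duration_statement = 1000;

-- Revert to the server defaults when you no longer need the extra logging.
ALTER ROLE datacortex_reader RESET log_statement;
ALTER ROLE datacortex_reader RESET log_min_duration_statement;
```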

Derived Cells

For advanced pipelines, cells can read from upstream cell results instead of your database:

  • DB mode — Cell queries your database directly
  • Derived mode — Cell reads Parquet output from upstream cells via DuckDB

This enables multi-step transformations without round-trips to your database.

Set a cell to derived mode by removing its connection — it will automatically use incoming edges as data sources.
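To give a feel for what a derived cell does, here is how DuckDB can query a Parquet file directly. This is only an illustration: the file name `upstream_cell.parquet` and its columns are hypothetical, and the way 42Cells names or exposes upstream results inside a derived cell may differ.

```sql
-- DuckDB; the file name and columns are hypothetical.
-- read_parquet scans the file in place, so no connection to your
-- database is involved.
SELECT region,
       sum(total_amount) AS revenue
FROM read_parquet('upstream_cell.parquet')
GROUP BY region
ORDER BY revenue DESC;
```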