DocumentationNeuronDB Documentation
Documentation Branch: You are viewing documentation for the main branch (3.0.0-devel). Select a branch to view its documentation:

Audit Logging for ML and RAG Operations

Overview

NeuronDB provides comprehensive audit logging for ML inference and RAG operations, enabling compliance monitoring, security analysis, and usage tracking.

Features

  • Audit logging for ML model inference calls
  • Audit logging for RAG retrieve/generate operations
  • Configurable retention periods
  • Query interface for audit logs
  • Input/output hashing for integrity verification

Configuration

Enable audit logging:

Enable audit logging

-- Enable ML inference audit logging
SET neurondb.audit_ml_enabled = true;

-- Enable RAG operation audit logging
SET neurondb.audit_rag_enabled = true;

-- Set retention period (days)
SET neurondb.audit_retention_days = 365;

ML Inference Audit Logging

Automatic Logging

When neurondb.audit_ml_enabled is enabled, ML inference operations are automatically logged.

Manual Logging

Log ML inference operation

-- Log ML inference operation
SELECT log_ml_inference(
    model_id := 1,
    operation_type := 'predict',
    input_hash := encode(digest(input_data::text, 'sha256'), 'hex'),
    output_hash := encode(digest(output_data::text, 'sha256'), 'hex'),
    metadata := '{"batch_size": 100, "latency_ms": 45}'::jsonb
);

Querying ML Audit Logs

Query ML inference audit logs

-- Query ML inference audit logs
SELECT * FROM query_audit_log(
    'ml_inference',
    start_time := '2024-01-01'::timestamptz,
    end_time := '2024-12-31'::timestamptz,
    user_id := 'admin',
    operation_type := 'predict'
);

RAG Operation Audit Logging

Automatic Logging

When neurondb.audit_rag_enabled is enabled, RAG operations are automatically logged.

Manual Logging

Log RAG operation

-- Log RAG operation
SELECT log_rag_operation(
    pipeline_name := 'documents_rag',
    operation_type := 'retrieve',
    query_hash := encode(digest('What is machine learning?'::text, 'sha256'), 'hex'),
    result_count := 5,
    metadata := '{"k": 5, "rerank": true}'::jsonb
);

Querying RAG Audit Logs

Query RAG operation audit logs

-- Query RAG operation audit logs
SELECT * FROM query_audit_log(
    'rag_operation',
    start_time := '2024-01-01'::timestamptz,
    end_time := '2024-12-31'::timestamptz,
    user_id := 'user123',
    operation_type := 'generate'
);

Audit Log Schema

ML Inference Audit Log

ML inference audit log table

CREATE TABLE neurondb.ml_inference_audit_log (
    audit_id BIGSERIAL PRIMARY KEY,
    model_id INTEGER,
    operation_type TEXT NOT NULL,
    user_id TEXT DEFAULT CURRENT_USER,
    input_hash TEXT,
    output_hash TEXT,
    metadata JSONB,
    timestamp TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);

RAG Operation Audit Log

RAG operation audit log table

CREATE TABLE neurondb.rag_operation_audit_log (
    audit_id BIGSERIAL PRIMARY KEY,
    pipeline_name TEXT NOT NULL,
    operation_type TEXT NOT NULL,
    user_id TEXT DEFAULT CURRENT_USER,
    query_hash TEXT,
    result_count INTEGER DEFAULT 0,
    metadata JSONB,
    timestamp TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);

Compliance Considerations

GDPR

  • User IDs are logged for data access tracking
  • Input/output hashes enable integrity verification without storing sensitive data
  • Retention periods can be configured per compliance requirements

HIPAA

  • Audit logs track all access to PHI-related ML models
  • Query hashes enable compliance reporting
  • Timestamps enable audit trail reconstruction

SOC 2

  • Comprehensive logging of all ML/RAG operations
  • User attribution for all operations
  • Configurable retention for audit requirements

Best Practices

  1. Retention Management: Regularly archive or delete old audit logs based on retention policies.
  2. Indexing: Audit tables are indexed for efficient querying. Monitor table sizes and partition if needed.
  3. Performance: Audit logging is asynchronous where possible to minimize impact on operations.
  4. Monitoring: Set up alerts for unusual patterns in audit logs (e.g., excessive access, failed operations).

Log Rotation

Audit logs should be periodically rotated or archived based on retention policies:

Delete logs older than retention period

-- Delete logs older than retention period
DELETE FROM neurondb.ml_inference_audit_log
WHERE timestamp < CURRENT_TIMESTAMP - (neurondb.audit_retention_days || ' days')::interval;

DELETE FROM neurondb.rag_operation_audit_log
WHERE timestamp < CURRENT_TIMESTAMP - (neurondb.audit_retention_days || ' days')::interval;

Related Topics