Optimizing mypy.ini for Large Codebases: Performance & Precision Tuning

Scaling static type checking across massive Python repositories requires deliberate tuning of the options covered in Mypy Configuration & Strictness. This guide details tuning strategies for incremental caching, per-module strictness scoping, and third-party import routing. The goal is to cut CI time while keeping type checking rigorous within broader Static Analysis Tools & CI Integration workflows.

Incremental Caching & SQLite Backend Tuning

Enable persistent caching to avoid cold-start overhead. Set cache_dir to a volume that survives CI job restarts. For large repositories (roughly 100k LOC and up), consider the SQLite backend. It stores cache entries in a single database file instead of many small files, which eases pressure on filesystems with inode limits or slow small-file I/O.

[mypy]
python_version = 3.11
incremental = true
sqlite_cache = true
cache_dir = .mypy_cache
show_error_codes = true
warn_return_any = true
warn_unused_ignores = true
follow_imports = silent

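This configuration enables the persistent cache and strict error reporting. With follow_imports = silent, mypy still analyzes imported modules but suppresses errors reported inside them, so output stays focused on the files you passed in. Very old mypy releases lack sqlite_cache, so confirm that your pinned version supports it. Python 3.10+ is recommended for PEP 604 union syntax at runtime. Mypy has no dedicated cache-clearing flag; delete the cache_dir directory instead, and reserve that for major interpreter or mypy upgrades.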
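One way to reset the cache is to delete the directory named by cache_dir. The helper below is a minimal sketch of that step. The function name reset_mypy_cache is hypothetical, and the sketch assumes a relative cache_dir should be resolved against the folder holding the config file, which matches running mypy from the repository root.

```python
import configparser
import shutil
from pathlib import Path


def reset_mypy_cache(config_path: Path) -> Path:
    """Delete the cache directory named in a mypy INI file and return its path."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(config_path)
    # Fall back to mypy's default directory name when cache_dir is not set.
    cache_dir = parser.get("mypy", "cache_dir", fallback=".mypy_cache")
    # Assumption: a relative cache_dir is resolved against the config file's folder.
    target = (config_path.parent / cache_dir).resolve()
    # ignore_errors=True makes the call a no-op when the directory is absent.
    shutil.rmtree(target, ignore_errors=True)
    return target
```

Call reset_mypy_cache(Path("mypy.ini")) from an upgrade job rather than from every PR run, since the next mypy invocation has to rebuild the cache from scratch.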

Per-Module Strictness Overrides for Gradual Adoption

Apply granular strictness via per-module overrides instead of global flags. Define base [mypy] defaults first. Then scope exceptions to legacy paths. This prevents strictness leakage and keeps options such as disallow_untyped_defs confined to active development paths.

[mypy-legacy.*]
disallow_untyped_defs = false
ignore_errors = true

[mypy-app.core.*]
disallow_untyped_defs = true
strict_equality = true

This pattern applies strict checks to active modules and suppresses errors in legacy code. Teams can tighten typing gradually without blocking work on older modules. Consider this Python 3.10+ module that is checked under the strict section:

# src/app/core/models.py
from typing import Protocol, TypeAlias


class DataProcessor(Protocol):
    def process(self, data: bytes) -> str: ...


HandlerType: TypeAlias = DataProcessor | None


def execute(handler: HandlerType) -> str:
    if handler is None:
        raise ValueError("Handler required")
    # After the None check, mypy narrows handler to DataProcessor.
    return handler.process(b"payload")

Pyright and Ruff approach strictness differently. Ruff is a linter rather than a type checker; it selects rules by code in pyproject.toml. Pyright sets its baseline through typeCheckingMode. Stick to mypy’s INI overrides for precise path scoping. Version constraints matter as well: pin the mypy version used in CI so that results for patterns like this stay reproducible between runs.

Third-Party Import Routing & Stub Path Optimization

Route untyped dependencies through custom stub directories instead of ignoring imports globally. Configure mypy_path to point to internal type definitions. Use follow_imports = silent for packages whose internals you do not want reported on. It suppresses errors from followed .py modules while mypy still reads any .pyi stubs it finds.

[mypy]
mypy_path = ./typeshed_custom

[mypy-pandas.*]
ignore_missing_imports = true

[mypy-numpy.*]
ignore_missing_imports = true

This points mypy at a dedicated folder of internal type definitions and scopes ignore_missing_imports to specific dependencies. When such an import cannot be resolved, mypy treats the module as Any instead of reporting an error, so scoping the option keeps that loss of precision contained. Avoid blanket ignore_missing_imports = true at the root level. It hides unresolved imports everywhere, including misspelled first-party module names. Pyright resolves stubs differently, through its own stub path setting. Mypy needs mypy_path set explicitly for in-house stubs.
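The root-level rule can be checked mechanically. The sketch below uses the standard configparser module to report whether ignore_missing_imports is set in the root [mypy] section and which per-module sections set it. The function name find_import_ignores is hypothetical, and the parsing is a simplification of how mypy itself reads the file.

```python
import configparser


def find_import_ignores(ini_text: str) -> tuple[bool, list[str]]:
    """Return (root_flag, scoped_sections) for ignore_missing_imports settings."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    # True only when the root [mypy] section enables the option.
    root_flag = parser.getboolean("mypy", "ignore_missing_imports", fallback=False)
    # Per-module sections are named "mypy-<pattern>".
    scoped = [
        name
        for name in parser.sections()
        if name.startswith("mypy-")
        and parser.getboolean(name, "ignore_missing_imports", fallback=False)
    ]
    return root_flag, scoped
```

A CI step could fail the build when root_flag is true and print the scoped list so reviewers can see how many dependencies are still untyped.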

CI Memory Constraints & Parallel Execution Limits

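Prevent OOM kills by limiting how much code each mypy process loads and by capping how many mypy processes run at once on a runner. The show_traceback option only controls whether a traceback is printed on a fatal error and is off by default, so it is not a memory control. If a wrapper script fans work out across processes, control its concurrency via environment variables rather than hardcoded CLI flags.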

# CI Pipeline Configuration
# MYPY_WORKERS is not a documented mypy setting; treat it as input to your own wrapper script.
export MYPY_WORKERS=4
export PYTHONHASHSEED=0
mypy --config-file mypy.ini src/

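Use --no-incremental strategically for nightly full scans. This catches cache drift without slowing daily PR checks. Ruff and Pyright parallelize differently, so their tuning advice does not transfer directly. A mypy run on a large codebase is often memory-bound, and every additional concurrent run adds its own footprint. Monitor your runner’s RSS and adjust MYPY_WORKERS, or whatever limit your wrapper uses, accordingly. Newer interpreters such as Python 3.11 speed up interpreted code in general, but measure before assuming a gain for mypy.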
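Peak memory can be recorded per run with the standard library. The sketch below wraps a command and reads the peak resident set size of finished child processes. The function name run_and_measure is hypothetical, and the resource module is only available on Unix-like systems.

```python
import resource
import subprocess


def run_and_measure(command: list[str]) -> tuple[int, int]:
    """Run a command and return (exit_code, peak_rss_of_children)."""
    completed = subprocess.run(command, check=False)
    # ru_maxrss is the largest resident set size among child processes that
    # have been waited for. Units differ: kilobytes on Linux, bytes on macOS.
    peak_rss = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return completed.returncode, peak_rss
```

Assuming mypy is on the PATH, run_and_measure(["mypy", "--config-file", "mypy.ini", "src/"]) returns the exit code alongside the peak figure, which a CI job can log and compare against the runner's memory limit.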

Common Mistakes

  • Disabling incremental mode globally: Forces full AST rebuilds on every run. This increases CI times from seconds to minutes. It negates mypy’s primary scaling mechanism.
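  • Using follow_imports=skip for untyped libraries: Replaces every skipped module with Any, so calls into it are never checked. This causes false negatives and breaks type propagation across module boundaries. Stub (.pyi) files are still followed unless follow_imports_for_stubs is enabled.
  • Applying ignore_missing_imports = true globally: Silences every unresolved import, including misspelled or misconfigured first-party modules, and types them as Any. This masks critical integration errors. It erodes effective type coverage.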

FAQ

How do I prevent mypy cache corruption in shared CI environments? Isolate cache directories per branch or job ID using environment variables such as MYPY_CACHE_DIR. Mypy has no cache-clearing flag, so delete the cache directory to reset it, and do so only during major version upgrades.
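A per-job cache path can be derived from the CI environment and handed to mypy through MYPY_CACHE_DIR, which overrides cache_dir from the config file. The sketch below is illustrative: CI_BRANCH and CI_JOB_ID are hypothetical variable names, so substitute whatever your CI system exports.

```python
from pathlib import Path


def env_with_job_cache(base: Path, env: dict[str, str]) -> dict[str, str]:
    """Return a copy of env with MYPY_CACHE_DIR set to a branch- and job-specific path."""
    # CI_BRANCH and CI_JOB_ID are hypothetical names; map them to your CI's variables.
    branch = env.get("CI_BRANCH", "local").replace("/", "_")
    job = env.get("CI_JOB_ID", "0")
    updated = dict(env)
    updated["MYPY_CACHE_DIR"] = str(base / f"{branch}-{job}")
    return updated
```

Pass dict(os.environ) as env and give the result to subprocess.run(..., env=...) when launching mypy. Keying on the job ID maximizes isolation but gives up cache reuse between jobs, so keying on the branch alone may be the better trade-off for PR checks.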

Should I use follow_imports=skip to speed up large monorepos? Generally no. Use follow_imports=silent instead. It keeps type information from imported modules and stubs while suppressing error output from dependencies you did not pass on the command line.

How can I enforce strict typing only on newly added files? Combine git diff --name-only --diff-filter=A with a pre-commit hook that passes the resulting file list to mypy. Maintain per-module overrides in mypy.ini for legacy paths to avoid false positives.
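The filtering step in such a hook can be sketched as a pure function. It takes the paths printed by git diff and keeps Python sources that sit outside legacy directories. The name select_mypy_targets and the legacy prefix used in the note are hypothetical.

```python
from pathlib import PurePosixPath


def select_mypy_targets(changed: list[str], legacy_prefixes: tuple[str, ...]) -> list[str]:
    """Keep changed Python files that are outside the given legacy directories."""
    targets = []
    for name in changed:
        path = PurePosixPath(name)
        # Skip anything that is not Python source or a stub.
        if path.suffix not in {".py", ".pyi"}:
            continue
        # is_relative_to compares whole path components, so "src/legacy"
        # does not match "src/legacy_tools".
        if any(path.is_relative_to(prefix) for prefix in legacy_prefixes):
            continue
        targets.append(name)
    return targets
```

A hook could feed it the split output of git diff, call select_mypy_targets(paths, ("src/legacy",)), and run mypy on the result only when the list is non-empty, since invoking mypy with no targets is an error.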

What is the optimal cache directory location for Dockerized CI runners? Mount a persistent volume at /tmp/mypy_cache and point cache_dir (or MYPY_CACHE_DIR) at it, or use GitHub Actions cache keys to preserve .mypy_cache across workflow runs. Ensure the volume is writable by the CI user.