File Exclusion Filtering

Bilanc automatically excludes generated and non-human-authored files from all metrics — both AI complexity scoring and rework analysis. Filtering happens at the data pipeline level, before any analysis runs, so excluded files never consume AI tokens or inflate statistics.

What Gets Excluded

Lock files — Machine-generated dependency resolution output:
  • package-lock.json, yarn.lock, pnpm-lock.yaml
  • Gemfile.lock, poetry.lock, Pipfile.lock
  • composer.lock, Cargo.lock, go.sum
  • Any file ending in .lock
Build output — Compiled, minified, or bundled artifacts:
  • *.min.js, *.min.css, *.bundle.js, *.chunk.js
  • Files under node_modules/, dist/, or build/ directories
Generated code — Auto-generated source files:
  • Files under generated/ directories
  • Protocol buffer output (*.pb.go, *.pb.py)
  • Other generated patterns (*_generated.go, *.g.dart)
Other noise — Files that don’t represent meaningful code changes:
  • Compiled Python (*.pyc, __pycache__/)
  • Environment files (.env, .env.*)
  • Source maps (*.map)
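As a rough illustration of how the patterns above could be matched, here is a minimal Python sketch. The function name is_excluded, the two constants, and the matching rules (glob match on the basename, directory names matched at any depth, lowercasing for case-insensitivity) are assumptions made for this example, not Bilanc's actual implementation.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Basename patterns transcribed from the list above (all lowercase).
# "*.lock" already covers yarn.lock, Gemfile.lock, poetry.lock, and so on.
EXCLUDED_NAME_PATTERNS = [
    "package-lock.json", "pnpm-lock.yaml", "go.sum", "*.lock",
    "*.min.js", "*.min.css", "*.bundle.js", "*.chunk.js",
    "*.pb.go", "*.pb.py", "*_generated.go", "*.g.dart",
    "*.pyc", ".env", ".env.*", "*.map",
]

# Directory names that exclude everything beneath them (assumed: at any depth).
EXCLUDED_DIRS = {"node_modules", "dist", "build", "generated", "__pycache__"}


def is_excluded(path: str) -> bool:
    """Return True if the path matches an exclusion pattern (hypothetical helper)."""
    # Lowercase once so the comparison against lowercase patterns ignores case.
    parts = PurePosixPath(path.lower()).parts
    if not parts:
        return False
    *dirs, name = parts
    # Only parent directories count here; a file that is merely named "build" is kept.
    if any(d in EXCLUDED_DIRS for d in dirs):
        return True
    return any(fnmatchcase(name, pattern) for pattern in EXCLUDED_NAME_PATTERNS)
```

Under these assumptions, is_excluded("Web/DIST/App.JS") and is_excluded("services/api/Cargo.lock") both return True, while is_excluded("src/app.py") returns False.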

Where Filtering Happens

Exclusion is applied in the pull_request_files model, which is the single data source feeding both:
  • AI complexity scoring via pull_requests_view
  • Rework analysis via pull_request_rework
File matching is case-insensitive and handles both root-level and nested paths.
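The value of filtering in one shared model is that both consumers see the same rows. The sketch below mirrors that shape in Python. The real pull_request_files model is not shown in this document, so the class, the field names, the two consumer functions, and the simplified stand-in predicate are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PullRequestFile:
    # Hypothetical row shape; the real model's columns are not documented here.
    pr_id: int
    filename: str
    additions: int


def is_excluded(filename: str) -> bool:
    # Stand-in predicate covering only two of the documented patterns, for brevity.
    name = filename.lower()
    return name.endswith(".lock") or name.endswith(".min.js")


def pull_request_files(raw: list[PullRequestFile]) -> list[PullRequestFile]:
    """The single filtered source: exclusion is applied here and nowhere else."""
    return [f for f in raw if not is_excluded(f.filename)]


def complexity_input(files: list[PullRequestFile]) -> list[str]:
    """Stand-in for the AI complexity scoring consumer."""
    return [f.filename for f in files]


def rework_line_count(files: list[PullRequestFile]) -> int:
    """Stand-in for the rework analysis consumer."""
    return sum(f.additions for f in files)
```

Because both consumers read the output of pull_request_files, neither needs its own exclusion logic, and a lock file with hundreds of added lines affects neither the scoring input nor the rework totals.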

AI Complexity Scoring

For each merged pull request, Bilanc sends the filtered diffs (with filename context) to an AI model that produces:
  • A complexity score for the overall PR
  • Categorization of the type of work (feature, bug fix, refactor, etc.)
  • A summary of the changes
The AI receives each file’s path alongside its diff, which helps it understand the context and purpose of changes.
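To make "path alongside its diff" concrete, the following sketch assembles a text payload from already-filtered files. The function name build_scoring_input and the "File:" header format are assumptions for illustration; the document does not specify the actual prompt layout or which model is called.

```python
def build_scoring_input(files: list[tuple[str, str]]) -> str:
    """Join (path, diff) pairs into one payload, labelling each diff with its path.

    Hypothetical helper: the real payload format is not documented.
    """
    sections = []
    for path, diff in files:
        # The path header gives the model context about what the diff touches.
        sections.append(f"File: {path}\n{diff.rstrip()}")
    return "\n\n".join(sections)
```

A call such as build_scoring_input([("src/app.py", "+print('hi')")]) produces a single section headed by "File: src/app.py", so a change to a test file, a migration, or a config file can be told apart even when the diffs look similar.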

Known Limitations

  • Bitbucket filenames: Bitbucket pull request files may not include filenames. These files pass through unfiltered because there is no path to match against the exclusion patterns.
  • Hardcoded list: The exclusion list is not currently configurable per organization. It covers common patterns but may not match every project’s generated files.
  • No weighting: Excluded files are fully removed from analysis, not downweighted. There is no partial credit for changes to semi-generated files.
File exclusion runs automatically on all code contributions with no configuration required. The exclusion list is maintained as part of the data pipeline and applies uniformly across all organizations.
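The Bitbucket limitation listed above can be pictured as a guard in the filter step: a file with no filename cannot be matched, so it is kept. This is a minimal sketch under that assumption; keep_file and its parameters are hypothetical names, and the exclusion predicate is passed in rather than defined here.

```python
from typing import Callable, Optional


def keep_file(filename: Optional[str], is_excluded: Callable[[str], bool]) -> bool:
    """Decide whether a file row stays in the analysis (hypothetical helper)."""
    if not filename:
        # No path to match against the patterns: the row passes through unfiltered.
        return True
    return not is_excluded(filename)
```

With a predicate that flags names ending in .lock, keep_file(None, ...) returns True while keep_file("Gemfile.lock", ...) returns False, which is why generated files from such pull requests can still reach the metrics.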