Suh0161/CodeScope


CodeScope AI

Search deeper. Know your codebase.

CodeScope AI v2.0 is a codebase search and analysis tool that indexes your entire workspace and adds structural context to every result. It goes beyond plain text search by tracking symbols, relationships, and dependencies across the file types and programming languages it indexes.

Features

Core Search Capabilities

  • Intelligent Search - Hybrid search combining lexical, structural, and semantic methods
  • Live Visual Feedback - Real-time search results as you type with professional UI
  • Advanced Filters - Filter by language (lang:typescript), folder (in:src), or scope
  • Search Suggestions - Live autocomplete based on indexed symbols
  • Causal Analysis - Trace "if A then B" relationships through code dependencies
  • Multi-Language Support - Index and search across all file types (TypeScript, JavaScript, Python, and more)
  • Auto-Indexing - Automatic workspace indexing with real-time file watching
  • Context-Aware Results - Results include relevance scores, file paths, and relationship explanations

AI Agent Integration

  • Agent API - Programmatic interface for AI agents to search codebases
  • Structured Responses - JSON-formatted results optimized for AI consumption
  • Global Access - Available via global.codescope for direct agent access
  • Command Discovery - Automatically discoverable by Cursor's AI agent

Standalone API

  • IDE-Agnostic Core - Can be integrated into any IDE or application
  • REST API Ready - Deployable as a standalone service
  • No VS Code Dependencies - Core functionality is independent of VS Code

Installation

For VS Code Extension

  1. Clone this repository
  2. Run npm install
  3. Press F5 in VS Code to launch the extension in a new window

For Standalone Integration

CodeScope's core is IDE-agnostic and can be integrated into any IDE or application. See the Standalone API section below for implementation details.

Usage

Commands

  • CodeScope: Search - Interactive search with live feedback (main command)

Search Syntax

Basic Search:

Type: "UserService"
→ Finds exact symbol matches

With Filters:

Type: "authentication lang:typescript"
→ Only searches TypeScript files

Type: "userLogin in:src"
→ Only searches in src/ folder

Type: "payment from:backend"
→ Only searches in backend/ folder
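
The filter syntax above can be illustrated with a small tokenizer. This is a minimal sketch, not CodeScope's actual parser: the `ParsedQuery` shape and the `parseQuery` function are hypothetical names, and it assumes that `in:` and `from:` both restrict the search to a folder, as the examples suggest.

```typescript
// Hypothetical shape for a parsed query; not taken from the CodeScope source.
interface ParsedQuery {
    text: string;     // free-text part of the query
    lang?: string;    // value of a lang: filter
    folder?: string;  // value of an in: or from: filter
}

// Split a query into free text and key:value filters.
function parseQuery(input: string): ParsedQuery {
    const parsed: ParsedQuery = { text: '' };
    const words: string[] = [];

    for (const token of input.trim().split(/\s+/)) {
        const match = /^(lang|in|from):(.+)$/.exec(token);
        if (!match) {
            if (token) words.push(token);
            continue;
        }
        if (match[1] === 'lang') {
            parsed.lang = match[2].toLowerCase();
        } else {
            // Assumption: in: and from: are both folder filters.
            parsed.folder = match[2];
        }
    }

    parsed.text = words.join(' ');
    return parsed;
}
```

For example, `parseQuery('authentication lang:typescript')` yields the text `authentication` with `lang` set to `typescript`.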

Natural Language:

Type: "if userLogin then what happens to session"
→ Traces causal relationships
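
One way to picture a causal query is as a path search over a dependency graph. The sketch below is an assumption about how such tracing could work, not the code in relationshipAnalyzer.ts: `DependencyGraph`, `parseCausalQuery`, and `findPath` are hypothetical names, and taking the last word of the "then" clause as the target symbol is a deliberately crude heuristic.

```typescript
// Hypothetical dependency graph: symbol -> symbols it directly affects.
type DependencyGraph = Map<string, string[]>;

// Extract the cause and effect symbols from an "if A then ... B" query.
function parseCausalQuery(query: string): { cause: string; effect: string } | null {
    const match = /^if\s+(\S+)\s+then\s+(.+)$/i.exec(query.trim());
    if (!match) return null;
    // Crude heuristic: treat the last word of the "then" clause as the target.
    const effectWords = match[2].trim().split(/\s+/);
    return { cause: match[1], effect: effectWords[effectWords.length - 1] };
}

// Breadth-first search for the shortest chain from one symbol to another.
function findPath(graph: DependencyGraph, from: string, to: string): string[] | null {
    const previous = new Map<string, string | null>([[from, null]]);
    const queue: string[] = [from];

    while (queue.length > 0) {
        const current = queue.shift() as string;
        if (current === to) {
            // Walk the predecessor links back to the start.
            const path: string[] = [];
            let node: string | null = current;
            while (node !== null) {
                path.unshift(node);
                node = previous.get(node) ?? null;
            }
            return path;
        }
        for (const next of graph.get(current) ?? []) {
            if (!previous.has(next)) {
                previous.set(next, current);
                queue.push(next);
            }
        }
    }
    return null; // no relationship found
}
```

With a graph in which `userLogin` affects `createSession` and `createSession` affects `session`, the query above resolves to the chain userLogin, createSession, session.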

Configuration

Configure CodeScope AI in VS Code settings:

  • codescope.excludePatterns - Glob patterns for files/folders to exclude from indexing (default: node_modules, .git, dist, build)
  • codescope.enableSemanticSearch - Enable semantic search (experimental, default: false)
  • codescope.indexOnStartup - Automatically index workspace on startup (default: true)
  • codescope.maxFileSize - Maximum file size to index in bytes (default: 10485760 = 10MB)
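
To make the defaults concrete, the sketch below mirrors the four settings as a TypeScript object and applies two of them when deciding whether to index a file. It is illustrative only: `CodeScopeSettings`, `defaultSettings`, and `shouldIndex` are hypothetical names, and the exclusion check treats each pattern as a plain directory name rather than a real glob.

```typescript
// Hypothetical mirror of the settings listed above.
interface CodeScopeSettings {
    excludePatterns: string[];
    enableSemanticSearch: boolean;
    indexOnStartup: boolean;
    maxFileSize: number; // bytes
}

const defaultSettings: CodeScopeSettings = {
    excludePatterns: ['node_modules', '.git', 'dist', 'build'],
    enableSemanticSearch: false,
    indexOnStartup: true,
    maxFileSize: 10 * 1024 * 1024, // 10485760 bytes = 10MB
};

// Simplified check: skip oversized files and files under an excluded folder.
// Assumption: patterns are matched as path segments, not as full globs.
function shouldIndex(
    filePath: string,
    sizeInBytes: number,
    settings: CodeScopeSettings = defaultSettings
): boolean {
    if (sizeInBytes > settings.maxFileSize) return false;
    const segments = filePath.split(/[\\/]/);
    return !settings.excludePatterns.some(
        (pattern) => segments.indexOf(pattern) !== -1
    );
}
```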

Architecture

Core Components

  • Indexer - Scans workspace files and extracts symbols, imports, exports, and relationships
  • Index Store - SQLite database (via sql.js) for fast queries and metadata storage
  • Retrieval Engine - Hybrid search engine combining multiple search methods with precision ranking
  • Advanced Search - Enhanced search with filters, scopes, and suggestions
  • Search UI - Professional interactive search interface with live feedback
  • File Watcher - Real-time file change detection for incremental indexing
  • Parser Registry - Pluggable parser system supporting multiple languages

Search Algorithm

CodeScope uses a precision-focused multi-phase search algorithm designed for accuracy:

Phase 1: Multi-Source Collection

  • Symbol search: Exact symbol matches (100 candidates)
  • Usage search: Where symbols are referenced
  • Related symbols: Imported/exported relationships
  • Full-text search: Pattern matching through file contents (200 candidates)
  • Pattern search: Code-like queries (operators, keywords)

Phase 2: Deduplication

  • Removes duplicates using file_path:line:symbol as unique key
  • Keeps highest-scoring duplicate
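
The deduplication step can be sketched as a single pass over the candidates with a map keyed on file_path:line:symbol. The `Candidate` type and `dedupe` function are hypothetical names used for illustration; the actual field names in the CodeScope source may differ.

```typescript
// Hypothetical candidate shape; field names are assumptions.
interface Candidate {
    filePath: string;
    line: number;
    symbol: string;
    score: number;
}

// Keep only the highest-scoring candidate for each file_path:line:symbol key.
function dedupe(candidates: Candidate[]): Candidate[] {
    const best = new Map<string, Candidate>();
    for (const candidate of candidates) {
        const key = `${candidate.filePath}:${candidate.line}:${candidate.symbol}`;
        const existing = best.get(key);
        if (!existing || candidate.score > existing.score) {
            best.set(key, candidate);
        }
    }
    return Array.from(best.values());
}
```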

Phase 3: Precision Ranking

Results are scored and boosted based on match quality:

  • Exact match: score × 2.0 (query "User" matches symbol "User")
  • Starts with: score × 1.8 (query "User" matches "UserService")
  • Contains: score × 1.5 (query "User" matches "getUserData")
  • Word matching: Multi-word queries match each word, boost by match ratio
  • Definition boost: score × 1.4 (definitions ranked higher than usages)
  • Usage boost: score × 1.1
  • Context boost: score × 1.2 if query appears in reason/context
  • Length penalty: score × 0.9 for overly long symbol names
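
The boosts above can be read as multipliers applied to a base score. The sketch below is one possible interpretation, not the ranking code in retrievalEngine.ts. It assumes that matching is case-insensitive, that the exact, starts-with, and contains boosts are mutually exclusive, and that all boosts compose by multiplication. The `RankInput` type, the `rankScore` function, and the 30-character threshold for "overly long" names are hypothetical, and the word-matching boost is left out for brevity.

```typescript
// Hypothetical input shape for the ranking step.
interface RankInput {
    symbol: string;
    kind: 'definition' | 'usage';
    baseScore: number;
    context?: string; // reason/context text attached to the result
}

// Assumed threshold for "overly long" symbol names.
const LONG_NAME_LENGTH = 30;

function rankScore(query: string, item: RankInput): number {
    const q = query.toLowerCase();
    const s = item.symbol.toLowerCase();
    let score = item.baseScore;

    // Match-quality boost: only the strongest applicable one is used.
    if (s === q) score *= 2.0;
    else if (s.startsWith(q)) score *= 1.8;
    else if (s.includes(q)) score *= 1.5;

    // Definitions rank above usages.
    score *= item.kind === 'definition' ? 1.4 : 1.1;

    // Context boost when the query also appears in the reason/context.
    if (item.context && item.context.toLowerCase().includes(q)) score *= 1.2;

    // Length penalty for very long names.
    if (item.symbol.length > LONG_NAME_LENGTH) score *= 0.9;

    return score;
}
```

Under these assumptions, a definition named `User` with a base score of 1 ends up at 2.0 × 1.4 = 2.8 for the query "User", while a usage of `UserService` ends up at 1.8 × 1.1 = 1.98.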

Phase 4: Multi-Level Sorting

  1. Primary: By relevance score (descending)
  2. Secondary: Definitions before usages
  3. Tertiary: Shorter symbol names first (more precise)
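
The three sort levels map naturally onto a comparator. This is a minimal sketch with hypothetical names (`RankedResult`, `compareResults`); it assumes scores are compared exactly before the tie-breakers apply.

```typescript
// Hypothetical result shape for the sorting step.
interface RankedResult {
    symbol: string;
    kind: 'definition' | 'usage';
    score: number;
}

// Comparator for Array.prototype.sort implementing the three levels above.
function compareResults(a: RankedResult, b: RankedResult): number {
    // Primary: higher relevance score first.
    if (a.score !== b.score) return b.score - a.score;
    // Secondary: definitions before usages.
    if (a.kind !== b.kind) return a.kind === 'definition' ? -1 : 1;
    // Tertiary: shorter symbol names first.
    return a.symbol.length - b.symbol.length;
}
```

Usage: `results.sort(compareResults)`.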

Phase 5: Precision Filtering

  • Filters results below minScore threshold (default: 0.1)
  • Returns all precise matches above threshold (no artificial limits)
  • Higher minScore = stricter matching, fewer but more accurate results
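
The threshold filter is a one-liner in practice. The sketch below uses a hypothetical `filterByMinScore` helper and assumes that a result scoring exactly `minScore` is kept.

```typescript
// Drop results that score below the threshold (default 0.1).
// Assumption: a score equal to minScore passes the filter.
function filterByMinScore<T extends { score: number }>(
    results: T[],
    minScore: number = 0.1
): T[] {
    return results.filter((result) => result.score >= minScore);
}
```

Raising `minScore` shrinks the result set without changing the order of the remaining items.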

v2.0 Foundation (Ready for Integration)

  • Vector Store - Semantic search infrastructure (in-memory, swappable with Qdrant/Milvus)
  • Embedding Service - Code embedding generation (hash-based, swappable with CodeBERT/UniXcoder)
  • Graph Store - Enhanced relationship analysis and dependency graph traversal
  • Chunking Service - Semantic code chunking for embedding generation
  • Hybrid Retrieval Engine V2 - Combines lexical, semantic, and graph-based search
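
To show what a hash-based embedding paired with an in-memory vector store might look like, here is a small self-contained sketch. It is not the project's SimpleEmbeddingModel or InMemoryVectorStore: `embed`, `cosineSimilarity`, `TinyVectorStore`, and the 64-dimension size are all assumptions chosen for illustration. Each token is hashed into a bucket, and documents are compared by cosine similarity.

```typescript
const DIMENSIONS = 64; // assumed vector size

// Simple deterministic string hash, kept within unsigned 32-bit range.
function hashToken(token: string): number {
    let hash = 0;
    for (let i = 0; i < token.length; i++) {
        hash = (hash * 31 + token.charCodeAt(i)) >>> 0;
    }
    return hash;
}

// Bag-of-tokens embedding: each token increments one hashed bucket.
function embed(text: string): number[] {
    const vector: number[] = new Array<number>(DIMENSIONS).fill(0);
    const tokens = text.toLowerCase().match(/[a-z0-9_]+/g) ?? [];
    for (const token of tokens) {
        vector[hashToken(token) % DIMENSIONS] += 1;
    }
    return vector;
}

function cosineSimilarity(a: number[], b: number[]): number {
    let dot = 0;
    let normA = 0;
    let normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    if (normA === 0 || normB === 0) return 0; // empty text has no direction
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Minimal in-memory store: linear scan, sorted by similarity.
class TinyVectorStore {
    private items: { id: string; vector: number[] }[] = [];

    add(id: string, text: string): void {
        this.items.push({ id, vector: embed(text) });
    }

    search(query: string, topK: number): { id: string; score: number }[] {
        const queryVector = embed(query);
        return this.items
            .map((item) => ({ id: item.id, score: cosineSimilarity(queryVector, item.vector) }))
            .sort((a, b) => b.score - a.score)
            .slice(0, topK);
    }
}
```

A hash-based embedding captures token overlap rather than meaning, which is why the list above describes it as swappable with a learned model such as CodeBERT or UniXcoder.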

Data Model

  • FileEntry - File metadata, symbols, imports, exports
  • Symbol - Class, function, method, variable definitions with location
  • ResultItem - Search results with score, reason, and context
  • ResultItemV2 - Enhanced results with embedding scores and graph distance (v2.0)
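
As a rough picture of how these types relate, the interfaces below sketch one plausible shape for each. Every field name here is an assumption inferred from the descriptions above, not copied from the source, and `SymbolEntry` stands in for the model's Symbol type to avoid clashing with JavaScript's built-in `Symbol`.

```typescript
// All field names below are hypothetical.
type SymbolKind = 'class' | 'function' | 'method' | 'variable';

interface SymbolEntry {
    symbolName: string;
    kind: SymbolKind;
    filePath: string;
    line: number; // location of the definition
}

interface FileEntry {
    filePath: string;
    language: string;
    symbols: SymbolEntry[];
    imports: string[];
    exports: string[];
}

interface ResultItem {
    filePath: string;
    line: number;
    symbol: string;
    score: number;
    reason: string;  // why this result matched
    context: string; // surrounding code or explanation
}

// v2.0 results add semantic and graph signals on top of the v1 fields.
interface ResultItemV2 extends ResultItem {
    embeddingScore?: number;
    graphDistance?: number;
}

const exampleResult: ResultItemV2 = {
    filePath: 'src/services/UserService.ts',
    line: 12,
    symbol: 'UserService',
    score: 2.8,
    reason: 'Exact symbol match (definition)',
    context: 'export class UserService {',
    graphDistance: 0,
};
```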

Development

Prerequisites

  • Node.js 18+
  • TypeScript 5.0+
  • VS Code 1.80+

Setup

npm install
npm run compile
npm run watch  # Watch mode for development

Project Structure

src/
├── extension.ts          # Main extension entry point
├── indexer.ts            # File scanning and parsing
├── indexStore.ts         # Database operations
├── retrievalEngine.ts    # Search engine (v1.0)
├── retrievalEngineV2.ts  # Enhanced search engine (v2.0)
├── searchUI.ts           # Interactive search interface
├── advancedSearch.ts     # Advanced search features
├── commands.ts           # VS Code command handlers
├── agentAPI.ts           # Agent API (v1.0)
├── agentAPIV2.ts         # Enhanced Agent API (v2.0)
├── agentIntegration.ts   # Cursor agent integration
├── vectorStore.ts        # Vector storage for semantic search
├── embedder.ts           # Embedding generation
├── graphStore.ts         # Graph operations
├── chunker.ts            # Code chunking
├── relationshipAnalyzer.ts # Causal analysis
├── parsers/              # Language-specific parsers
└── core/                 # Standalone API
    ├── api.ts            # Standalone API implementation
    └── index.ts          # Core exports

API Reference

For AI Agents

import * as vscode from 'vscode';

// Access via the global object (available once the extension is active)
global.codescope.search('userLogin')
global.codescope.answerCausal('if userLogin then session')
global.codescope.getRelationships('UserService')

// Or via VS Code commands (executeCommand returns a Thenable)
await vscode.commands.executeCommand('codescope.search', 'query')

Standalone API

CodeScope's core functionality is available as a standalone API that can be integrated into any IDE, web application, or CLI tool.

Basic Usage

import { createCodeScopeAPI } from './src/core/api';

// Initialize CodeScope
const codescope = await createCodeScopeAPI({
    workspacePath: '/path/to/workspace',
    dbPath: '/path/to/index.db',
    enableSemantic: true,  // Optional: enable semantic search
    enableGraph: true       // Optional: enable graph search
});

// Index workspace
await codescope.indexWorkspace();

// Search
const results = await codescope.search({
    query: 'authentication',
    limit: 20
});

// Get statistics
const stats = await codescope.getStats();

Integration Example for Custom IDE

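import * as path from 'path';
import { createCodeScopeAPI, CodeScopeAPI } from './src/core/api';

class MyCustomIDE {
    // Assigned in initialize(), hence the definite-assignment assertion
    private codescope!: CodeScopeAPI;

    constructor(private workspacePath: string) {}

    async initialize() {
        // Initialize CodeScope
        this.codescope = await createCodeScopeAPI({
            workspacePath: this.workspacePath,
            dbPath: this.getIndexPath()
        });

        // Index on startup
        await this.codescope.indexWorkspace();
    }

    async searchCodebase(query: string) {
        const response = await this.codescope.search({
            query,
            limit: 20
        });

        // Display results in your IDE's UI
        return this.formatResults(response.results);
    }

    // Placeholder: store the index wherever suits your IDE (this path is only an example)
    private getIndexPath(): string {
        return path.join(this.workspacePath, '.codescope', 'index.db');
    }

    // Placeholder: map results to whatever your UI renders
    private formatResults(results: unknown[]) {
        return results;
    }
}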

Available Methods

  • search(request: SearchRequest) - Search codebase with filters
  • indexWorkspace() - Index entire workspace
  • indexFile(filePath: string) - Index a single file
  • getStats() - Get indexing statistics

Customization

You can customize CodeScope by:

  • Replacing Embedding Model: Swap SimpleEmbeddingModel with CodeBERT/UniXcoder
  • Replacing Vector Store: Swap InMemoryVectorStore with Qdrant/Milvus
  • Adding Custom Parsers: Extend ParserRegistry with language-specific parsers
  • Custom Search Logic: Extend RetrievalEngineV2 with domain-specific search

See src/core/api.ts for the complete API implementation.

Performance

  • Indexing: Supports 100k+ files; the initial index typically completes in minutes
  • Search Latency: < 1 second for most queries
  • Storage: Efficient SQLite-based index with minimal disk usage
  • Memory: In-memory vector store (configurable)

Limitations

  • Tree-sitter parsers are optional (fallback to regex if unavailable)
  • Semantic search foundation ready but not yet integrated
  • Graph visualization UI not yet implemented
  • Multi-repo support planned for future release

Contributing

CodeScope AI is open source and welcomes contributions. IDE developers and companies can:

  • Integrate the Core: Use src/core/api.ts to integrate CodeScope into your IDE
  • Extend Functionality: Add custom parsers, search methods, or UI components
  • Swap Components: Replace embedding models, vector stores, or graph databases
  • Build on Top: Create plugins, extensions, or services using CodeScope's API

For IDE Developers

The core API (src/core/) has no VS Code dependencies and can be used in:

  • JetBrains IDEs (IntelliJ, WebStorm, etc.)
  • Custom IDEs and editors
  • Web-based code editors
  • CLI tools and scripts
  • CI/CD pipelines

Extension Points

  • Parsers: Add language support via ParserRegistry
  • Search Engines: Extend RetrievalEngineV2 with custom search logic
  • Vector Stores: Implement IVectorStore interface for custom storage
  • Embedding Models: Implement EmbeddingModel interface for custom embeddings
  • Graph Stores: Implement IGraphStore interface for custom graph databases

Roadmap

Future enhancements planned (when I'm not being lazy 😅):

  • Semantic search integration (foundation is ready, just needs wiring up)
  • Graph visualization UI
  • Feedback collection system
  • Multi-repo support
  • CI/CD integration

License

MIT

Version

Current Version: 2.0.0


CodeScope AI - Search deeper. Know your codebase.
