codebert

Here are 19 public repositories matching this topic...

RepoAnalysis / RepoSnipy

Neural search engine for discovering semantically similar Python repositories on GitHub

language-model github-repository-search streamlit-application codebert code-understanding neural-search-engine

Updated Feb 11, 2024
Python

dessertlab / EVIL

EVIL (Exploiting software VIa natural Language) is an approach to automatically generate software exploits in assembly/Python language from descriptions in natural language. The approach leverages Neural Machine Translation (NMT) techniques and a dataset that we developed for this work.

linux exploit encoder assembly decoder dataset seq2seq shellcode nmt software-exploitation codebert

Updated Mar 8, 2022
Python

jorge-martinez-gil / small-code-models

Repository about small code models

code-analysis code-similarity clone-detection t5-model codebert polycoder graphcodebert code-llms clone-detector codellms code-analysis-tools unixcoder plbart

Updated Jan 16, 2026
Python

jorge-martinez-gil / graphcodebert-interpretability

Augmenting the Interpretability of GraphCodeBERT for Code Similarity Tasks

code code-analysis pca-analysis semantic-similarity similarity-measures umap interpretability code-similarity clone-detection codebert graphcodebert

Updated Dec 3, 2025
Python

dessertlab / Targeted-Data-Poisoning-Attacks

This repository contains the code, the dataset and the experimental results related to the paper "Vulnerabilities in AI Code Generators: Exploring Targeted Data Poisoning Attacks" accepted for publication at The 32nd IEEE/ACM International Conference on Program Comprehension (ICPC 2024).

python dataset code-generation vulnerabilities nmt data-poisoning-attacks codebert software-security-assessment

Updated Aug 5, 2024
Python

jorge-martinez-gil / graphcodebert-feature-integration

Improving Source Code Similarity Detection with GraphCodeBERT and Additional Feature Integration

semantic-similarity similarity-measures source-code-analysis code-similarity clone-detection codebert graphcodebert

Updated Feb 10, 2026
Python

sssszh / Vulnerability-Detection

Fine-tuning CodeBERT for Vulnerability Detection

vulnerability-detection codebert

Updated Sep 19, 2024
Python

MarttiWu / codeopt

CodeOpt: A framework for optimizing code performance using Two-Stage Sampling, Few-Shot Learning, and Iterative Self-Reflection with support for Genetic Algorithm Inspired Chain-of-Thought (GA-COT).

refactoring python nlp genetic-algorithm semantic-similarity bm25 code-optimization refactoring-tools few-shot-learning self-reflection code-performance iterative-refinement codebert in-context-learning llm chain-of-thought

Updated Dec 12, 2024
Python

hishamp3 / codeDetection

Django implementation of CodeBERT for detecting vulnerable code.

django-framework html-css codebert large-language-models llm-fine-tuning

Updated Dec 29, 2023
Python

sarvagyakrcs / s0.dev

The modern web development landscape is plagued by a peculiar paradox: despite the abundance of UI components and design systems, developers still spend countless hours reimplementing similar interfaces. S0 addresses this challenge by introducing a novel approach that combines advanced vector search capabilities.

nextjs bert similarity-search bun fastapi codebert multimodal-embeddings retrival-augmented-generation pg-vector

Updated Feb 11, 2025
Python

Vaibhav06Jha28 / ChainSage

AI-powered vulnerability detection for Solidity smart contracts using Mistral + CodeBERT

security ai smart-contracts solidity mistral fastapi streamlit codebert

Updated Jul 20, 2025
Python

khushnood-rafique / Transformer-Based-Unit-Test-Generation

This study compares three transformer-based models: CodeT5, CodeBERT, and CodeGen.

nlp codegen codebert codet5

Updated Jun 28, 2025
Python

Nghia9912 / IntentTrace-xAI

A deterministic and neuro-symbolic framework for evaluating LLM-generated code using Abstract Syntax Trees, Semantic Embeddings, and Integrated Gradients. Think of it as a 'Digital Polygraph' for AI. It uses a three-step verification process to ensure the AI didn't 'misunderstand' your instructions.

code-analysis pytorch semantic-similarity explainable-ai xai integrated-gradients codebert llm-evaluation

Updated Feb 25, 2026
Python

bosszii2709 / ai-dataset-generator

🤖 Generate tailored AI training datasets quickly and easily, transforming your domain knowledge into essential training data for model fine-tuning.

python ai dataset openai code-generation llama vulnerabilities dataset-generation nmt gradio openai-api data-poisoning-attacks codebert llm software-security-assessment finetune-gpt gpt4o gpt4o-mini

Updated Mar 5, 2026
Python

Radowan98 / ZSVulD

Implementation and dataset for A Zero-Shot Framework for Cross-Project Vulnerability Detection in Source Code (Empirical Software Engineering, 2026).

machine-learning deep-learning transfer-learning vulnerability-detection software-security zero-shot-learning empirical-software-engineering codebert

Updated Oct 29, 2025
Python

aleksibovellan / ai-python-code-validator

AI/ML Trained Python Code Validator with Gradio Web Interface

microsoft python ai code validator ml transformers torch python3 webinterface web-interface datasets gradio gradio-interface codebert

Updated Oct 29, 2024
Python

amrgaberM / vulnai

Multi-model vulnerability detection for C code using CodeBERT, GraphCodeBERT, and CodeT5. Trained on Microsoft’s Devign dataset, VulnAI identifies both keyword-based and structural vulnerabilities with a Python API and CLI.

deep-learning artificial-intelligence ensemble-learning pretrained-models fine-tuning multimodal prediciton codebert llm

Updated Jan 12, 2026
Python

yegmor / CoCLR-ML_Reproducibility_Challenge_2021

Reproducibility report of "CoSQA: 20,000+ Web Queries for Code Search and Question Answering" for the ML Reproducibility Challenge 2021

code-search codesearchnet codebert code-question-answering cosqa codexglue coclr query-code-matching

Updated Feb 4, 2022
Python

kordy0-0 / ai-dataset-generator

🛠️ Generate AI training datasets easily, transforming complex information from documents into structured data for model fine-tuning.

python machine-learning ai dataset openai code-generation llama vulnerabilities dataset-generation nmt gradio openai-api data-poisoning-attacks codebert llm finetune-gpt gpt4o gpt4o-mini

Updated Mar 5, 2026
Python
