Scalable data preprocessing and curation toolkit for LLMs
Visual AI development framework for training and inference of ML models, scaling pipelines, and automating workflows with Python
convtools is a specialized Python library for dynamic, declarative data transformations with automatic code generation
A Python library that acts as a client to download, pre-process, and post-process weather data; friendly to users on VPN/proxy connections.
Making it easier to navigate and clean TAHMO weather station data for ML development
A simple, general-purpose pipeline framework.
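A general-purpose pipeline framework of this kind typically composes a sequence of stages, each a callable, and threads data through them in order. The sketch below is a hypothetical illustration using only the standard library; the `Pipeline`, `then`, and `run` names are assumptions for the example, not the API of any library listed here.

```python
# Hypothetical sketch of a minimal pipeline framework: each stage is a
# callable, and the pipeline feeds a value through the stages in order.
from functools import reduce
from typing import Any, Callable, Iterable


class Pipeline:
    """Compose callables into a single data-processing pipeline."""

    def __init__(self, stages: Iterable[Callable[[Any], Any]] = ()):
        self.stages = list(stages)

    def then(self, stage: Callable[[Any], Any]) -> "Pipeline":
        # Return a new pipeline with the extra stage appended,
        # leaving the original pipeline unchanged.
        return Pipeline(self.stages + [stage])

    def run(self, value: Any) -> Any:
        # Thread the value through each stage, left to right.
        return reduce(lambda acc, stage: stage(acc), self.stages, value)


# Usage: clean, filter, and aggregate a list of raw readings.
pipeline = (
    Pipeline()
    .then(lambda rows: [r.strip() for r in rows])          # strip whitespace
    .then(lambda rows: [float(r) for r in rows if r])      # drop blanks, parse
    .then(sum)                                             # aggregate
)
result = pipeline.run([" 1.5", "2.5 ", "", "4.0"])
```

The immutable `then` chaining is one common design choice; real frameworks often add branching, error handling, and parallel execution on top of this core idea.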
Streamlit app to export and bulk-update Plex music metadata and create smart playlists.
Artifician is an event-driven framework designed to simplify and accelerate the process of preparing datasets for Artificial Intelligence models.
A pipeline that consumes Twitter data to extract meaningful insights about a variety of topics, built with the Twitter API, Kafka, MongoDB, and Tableau.
The Resume Application Tracking System uses Google Gemini Pro Vision to automatically parse, analyze, and categorize resumes for efficient recruitment. It integrates AI-driven vision capabilities to enhance resume processing and candidate selection.