Markdown Converter
Agent skill for markdown-converter
Prompt: We need to update ReasonChain's input handling; currently we take in PDF files only. What if we want /home/sunny-bedi/practise/ReasonChain/reasonchain/rag/vector/add_to_vector_db.py to take in other data formats too? Also, /home/sunny-bedi/practise/ReasonChain/reasonchain/rag/embeddings/embedding_config.py needs to let us add our own models, including fine-tuned models saved in our own path or on HF; the current config makes it less open.
Changes Made:
Updated `add_to_vector_db.py` and `embedding_config.py` with new registration functions:

- `register_custom_model()` - General custom model registration
- `register_huggingface_model()` - For HuggingFace models, including fine-tuned ones
- `register_local_model()` - For local model files
- `register_fine_tuned_model()` - Specifically for fine-tuned models
- `list_available_models()` - List all registered models
- `get_model_dimension()` - Get embedding dimensions
- `remove_custom_model()` - Remove custom models

Added `enhanced_multi_format_rag_example.py` demonstrating the new capabilities.
Key Benefits:
Usage Examples:
```python
# Multi-format processing
file_paths = [
    'documents/report.pdf',
    'data/spreadsheet.xlsx',
    'content/article.md',
    'presentations/slides.pptx'
]
add_data_to_vector_db(file_paths, db_path='mixed_db.index')

# Custom model registration
register_huggingface_model(
    "your-username/your-fine-tuned-model",
    768,
    "Your custom fine-tuned model"
)

# Use the custom model
add_data_to_vector_db(
    file_paths,
    embedding_provider="hugging_face",
    embedding_model="your-username/your-fine-tuned-model"
)
```
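The registration functions above boil down to a model registry keyed by name. A minimal sketch of that idea (illustrative only, not ReasonChain's actual implementation; only the function names mirror the ones listed):

```python
# Illustrative sketch of a model registry behind register_custom_model(),
# get_model_dimension(), and list_available_models().
_CUSTOM_MODELS = {}

def register_custom_model(name, dimension, description=""):
    """Record a model name together with its embedding dimension."""
    _CUSTOM_MODELS[name] = {"dimension": dimension, "description": description}

def get_model_dimension(name):
    """Look up the embedding dimension for a registered model."""
    if name not in _CUSTOM_MODELS:
        raise KeyError(f"Model '{name}' is not registered")
    return _CUSTOM_MODELS[name]["dimension"]

def list_available_models():
    """Return the names of all registered models."""
    return sorted(_CUSTOM_MODELS)
```

Keeping the dimension next to the name is what lets the vector DB be initialized without loading the model first.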
Prompt: Now, what if I do not want to use files at all? The data is extracted elsewhere, and I need to add it to the vector DB directly; I don't see that option here.
Changes Made:
- `add_raw_data_to_vector_db()` - Add raw text data or pre-computed embeddings directly
- `add_structured_data_to_vector_db()` - Add structured (JSON-like) data from APIs/databases
- `add_external_source_to_vector_db()` - Add data from external sources (databases, APIs, etc.)
- `direct_data_input_example.py` demonstrating all direct input methods

Key Benefits:
Usage Examples:
```python
# Add raw text data directly
texts = ["Document 1 content", "Document 2 content"]
add_raw_data_to_vector_db(texts=texts, db_path='raw_db.index')

# Add pre-computed embeddings
import numpy as np
embeddings = np.random.rand(2, 768)  # 2 embeddings, 768 dimensions
texts = ["Text 1", "Text 2"]
add_raw_data_to_vector_db(texts=texts, embeddings=embeddings)

# Add structured data from an API
api_data = [
    {"text": "Article content", "title": "Article 1", "category": "tech"},
    {"text": "Blog content", "title": "Blog 1", "category": "science"}
]
add_structured_data_to_vector_db(
    api_data,
    metadata_fields=['title', 'category']
)

# Add from a database
db_config = {
    'connection_string': 'postgresql://user:pass@host/db',
    'query': 'SELECT content as text, title, category FROM articles',
    'metadata_fields': ['title', 'category']
}
add_external_source_to_vector_db('database', db_config)
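A direct-input path like `add_raw_data_to_vector_db()` has to reconcile texts with optional pre-computed embeddings before insertion. A minimal sketch of that validation step (an assumption about the shape of the check, not ReasonChain's actual code; `prepare_raw_input` is a hypothetical helper name):

```python
import numpy as np

def prepare_raw_input(texts, embeddings=None, expected_dim=768):
    """Validate raw text/embedding input before vector-DB insertion.

    Illustrative sketch: when pre-computed embeddings are supplied, they
    must line up one-to-one with the texts and match the index dimension.
    """
    if embeddings is None:
        return texts, None  # embeddings get computed downstream
    embeddings = np.asarray(embeddings, dtype=np.float32)
    if embeddings.ndim != 2 or embeddings.shape[0] != len(texts):
        raise ValueError("Need exactly one embedding row per text")
    if embeddings.shape[1] != expected_dim:
        raise ValueError(
            f"Expected dimension {expected_dim}, got {embeddings.shape[1]}"
        )
    return texts, embeddings
```

Failing fast here is cheaper than discovering a dimension mismatch at query time.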
Prompt: "Let's update the README too, to cover what we have added."
Changes Made:
- Updated the RAG Integration section to include the new capabilities
- Added a Custom Embedding Models section with examples
- Added a comprehensive Direct Data Input & External Sources section
- Updated the Examples section to include the new example files:
  - `direct_data_input_example.py`
  - `enhanced_multi_format_rag_example.py`
  - `fine_tuned_model_example.py`
- Updated code examples to use the new function names: `add_pdf_to_vector_db` is now `add_data_to_vector_db`
- Enhanced the Features list
- Updated Future Enhancements to reflect implemented features and add new goals
Files Modified:
- `Readme.md` - Comprehensive updates to document new functionality

New Functionality Documented:
Prompt: "Currently, the embeddings and LLM setup is very tight; there is no way to use anything other than the built-in options, like another LLM or embedding model, in ReasonChain. We should see how we can make it a little more open."
Changes Made:
- `base_provider.py` - Abstract base classes for LLM and embedding providers:
  - `BaseLLMProvider` - Interface for all LLM providers
  - `BaseEmbeddingProvider` - Interface for all embedding providers
  - `ProviderConfig` - Configuration management class
- `provider_registry.py` - Dynamic provider registration and management:
  - `LLMProviderRegistry` - Register, retrieve, and manage LLM providers
  - `EmbeddingProviderRegistry` - Register, retrieve, and manage embedding providers
- `providers/` - Package containing concrete provider implementations:
  - `OpenAIProvider` - GPT-4, GPT-3.5, etc.
  - `GroqProvider` - Fast inference with Llama, Mixtral
  - `OllamaProvider` - Local models (Llama, Mistral, etc.)
  - `CustomModelProvider` - Fine-tuned and local HuggingFace models
  - `AnthropicProvider` - Claude models (an example of easy extensibility)
- `register_providers.py` - Automatic registration of built-in providers
- Refactored `model_manager.py` to use the provider system
- Added `custom_provider_example.py` demonstrating the new workflow
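The registry mechanism described above can be sketched in a few lines. This is an illustrative simplification, not ReasonChain's actual code; only the class and method names mirror the ones listed, and `EchoProvider` is a made-up stand-in:

```python
# Illustrative sketch of a provider registry like LLMProviderRegistry.
from abc import ABC, abstractmethod

class BaseLLMProvider(ABC):
    """Interface every LLM provider must implement."""
    @abstractmethod
    def generate_response(self, prompt, **kwargs):
        ...

class LLMProviderRegistry:
    _providers = {}

    @classmethod
    def register(cls, name, provider_cls):
        # Only accept classes that honor the provider interface.
        if not issubclass(provider_cls, BaseLLMProvider):
            raise TypeError("Providers must subclass BaseLLMProvider")
        cls._providers[name] = provider_cls

    @classmethod
    def get(cls, name):
        try:
            return cls._providers[name]
        except KeyError:
            raise ValueError(f"Unknown provider '{name}'. "
                             f"Registered: {sorted(cls._providers)}")

class EchoProvider(BaseLLMProvider):
    """Trivial provider used only to exercise the registry."""
    def generate_response(self, prompt, **kwargs):
        return f"echo: {prompt}"

LLMProviderRegistry.register("echo", EchoProvider)
```

Because lookup happens by name at call time, new providers can be registered from user code without touching the core package.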
Key Benefits:
Supporting ANY Service Now:
Usage Example:
```python
# Built-in provider (no changes needed)
agent = Agent(name="GPT_Agent", model_name="gpt-4", api="openai")

# New provider added in 3 steps:
# 1. Implement BaseLLMProvider
class CohereProvider(BaseLLMProvider):
    def generate_response(self, prompt, **kwargs):
        return self.client.generate(prompt=prompt)

    def generate_chat_response(self, messages, **kwargs):
        ...  # implement chat

# 2. Register the provider
LLMProviderRegistry.register('cohere', CohereProvider)

# 3. Use it!
agent = Agent(name="Cohere_Agent", model_name="command", api="cohere")
```
Files Created:
- `reasonchain/llm_models/base_provider.py`
- `reasonchain/llm_models/provider_registry.py`
- `reasonchain/llm_models/register_providers.py`
- `reasonchain/llm_models/providers/__init__.py`
- `reasonchain/llm_models/providers/openai_provider.py`
- `reasonchain/llm_models/providers/groq_provider.py`
- `reasonchain/llm_models/providers/ollama_provider.py`
- `reasonchain/llm_models/providers/custom_provider.py`
- `reasonchain/llm_models/providers/anthropic_provider.py`
- `examples/custom_provider_example.py`

Files Modified:
- `reasonchain/llm_models/model_manager.py` - Enhanced with the provider system
- `Readme.md` - Added extensibility documentation
- `prompts.md` - This file

Impact: This change transforms ReasonChain from a closed system supporting 3-4 specific LLM services into an open, extensible platform that can support ANY LLM or embedding service, present or future. Users can now add support for new services in minutes without touching core code.
Prompt: "now what about embedding models"
Changes Made:
- `providers/sentence_transformers_provider.py` - Sentence Transformers support
- `providers/openai_embeddings_provider.py` - OpenAI Embeddings
- `providers/huggingface_provider.py` - HuggingFace Transformers
- `register_embedding_providers.py` - Automatic registration
- Enhanced `embedding_initializer.py` with a `use_provider_system` parameter for control
- Enhanced `VectorDB.py`
- Added `custom_embedding_provider_example.py` demonstrating the new workflow
Key Benefits:
Supporting ANY Embedding Service Now:
Usage Example:
```python
# Built-in provider (no changes needed)
add_data_to_vector_db(
    file_paths=["documents.pdf"],
    embedding_provider="sentence_transformers",
    embedding_model="all-mpnet-base-v2"
)

# Custom provider in 3 steps:
# 1. Implement BaseEmbeddingProvider
class CohereEmbeddingProvider(BaseEmbeddingProvider):
    def embed_text(self, text):
        return self.client.embed([text]).embeddings[0]

    def embed_batch(self, texts, batch_size=96):
        return self.client.embed(texts).embeddings

    def get_dimension(self):
        return 1024

# 2. Register the provider
EmbeddingProviderRegistry.register('cohere', CohereEmbeddingProvider)

# 3. Use it!
add_data_to_vector_db(
    file_paths=["documents.pdf"],
    embedding_provider="cohere",
    embedding_model="embed-english-v3.0"
)
```
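The automatic batch handling mentioned below can live in the base class itself: providers only have to implement per-text embedding, and a default `embed_batch` does the chunking. This is an illustrative sketch of that pattern, not ReasonChain's actual implementation; `ToyProvider` is a made-up stand-in:

```python
# Illustrative sketch: default batching layered on top of embed_text().
from abc import ABC, abstractmethod

class BaseEmbeddingProvider(ABC):
    @abstractmethod
    def embed_text(self, text):
        """Return the embedding vector for one text."""

    @abstractmethod
    def get_dimension(self):
        """Return the embedding dimension."""

    def embed_batch(self, texts, batch_size=32):
        # Default: walk the input in chunks, embedding text-by-text.
        # Concrete providers override this with a true batched API call.
        out = []
        for i in range(0, len(texts), batch_size):
            out.extend(self.embed_text(t) for t in texts[i:i + batch_size])
        return out

class ToyProvider(BaseEmbeddingProvider):
    """Deterministic dummy provider used only to exercise the interface."""
    def embed_text(self, text):
        return [float(len(text))] * self.get_dimension()

    def get_dimension(self):
        return 4
```

With this split, a minimal provider needs only `embed_text` and `get_dimension`; batching comes for free and can be specialized later.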
Files Created:
- `reasonchain/rag/embeddings/providers/__init__.py`
- `reasonchain/rag/embeddings/providers/sentence_transformers_provider.py`
- `reasonchain/rag/embeddings/providers/openai_embeddings_provider.py`
- `reasonchain/rag/embeddings/providers/huggingface_provider.py`
- `reasonchain/rag/embeddings/register_embedding_providers.py`
- `examples/custom_embedding_provider_example.py`

Files Modified:
- `reasonchain/rag/embeddings/embedding_initializer.py` - Integrated the provider system
- `reasonchain/rag/vector/VectorDB.py` - Added provider support
- `Readme.md` - Added embedding provider documentation
- `prompts.md` - This file

Impact: The embedding system is now as extensible as the LLM system. Users can add support for ANY embedding service (Cohere, Voyage AI, custom APIs, local models) in minutes by implementing 3 simple methods. The system automatically handles batch processing, dimension detection, and integration with all vector databases.