Advancing GPT Knowledge in Malaysia through Retrieval-Augmented Generation (RAG)
Harnessing Innovative Techniques for Enhanced Language Model Performance
Project Background
AI-Enhanced Economic Analysis: Bridging Local and Regional Insights
Major Barriers to Comprehensive Market Analysis
- Over-reliance on traditional market analysis methods, e.g. periodic analyst reports
- Complex interplay of Malaysia’s diverse cultural, linguistic and economic factors
- Large volume of market data spread across many repositories, with substantial noise
Current GPT Models Are Limited
- Reliance on pre-training data, which can quickly become outdated
- Not real-time; responses lag in a fast-paced market environment
- Limited local context and cultural understanding
Methodology
A Six-Step Approach
1. Data Collection and Preprocessing
Set up an automated daily process to download reports from Capital IQ and store them in Amazon S3
2. RAG System Development
Integrate the retrieval system with the SEA-LION GPT model
3. Model Fine-tuning and Prompt Design
Curate a dataset specific to the Malaysian economic context so the model better understands the Malaysian market
4. Query Processing and Response Generation
Develop a user interface for query input, accommodating different use cases
5. Testing and Validation
Conduct rigorous testing with a diverse set of real-world queries about the Malaysian market
6. Feedback Loop and Continuous Improvement
Establish a regular review process and analyse model performance metrics
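As a rough illustration of how steps 2 and 4 fit together, the sketch below retrieves the most relevant stored passages for a query and assembles them into a prompt. It is a toy under stated assumptions, not the project's implementation: bag-of-words cosine similarity stands in for whatever retrieval system is actually used, the `reports` list is hypothetical sample text rather than Capital IQ content, all function names are illustrative, and the call to the SEA-LION model is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, top_k=2):
    """Return up to top_k documents that share terms with the query, best first."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine_similarity(query_vec, Counter(tokenize(doc))), doc)
        for doc in documents
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(query, passages):
    """Assemble the retrieved passages and the user query into one prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
        "Answer:"
    )


# Hypothetical stand-ins for the daily reports collected in step 1.
reports = [
    "Palm oil exports from Malaysia rose on stronger demand.",
    "Bank Negara Malaysia kept the overnight policy rate unchanged.",
    "Semiconductor firms in Penang expanded production capacity.",
]

query = "What happened to Malaysia palm oil exports?"
passages = retrieve(query, reports, top_k=2)
prompt = build_prompt(query, passages)
# In the full system, this prompt would be passed to the language model.
print(prompt)
```

Because the retrieved context is rebuilt for every query, newly downloaded reports become available to the model without retraining, which is the main motivation for pairing retrieval with generation here.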
Why Malaysia?
3rd Largest Economy in ASEAN
Regional Leader in Digital Economy
Advanced digital infrastructure and commitment to open data initiatives
Diverse Economic Sectors from Commodities to Technology
Provides Insights into Broader ASEAN trends
Timeline
Detailed Project Plan