I fine-tuned a RoBERTa model on an open-source dataset to detect AI-written content. The initial results were promising, but the dataset lacked examples from newer models such as Gemini and more recent GPT versions. To address this, I created a smaller dataset incorporating outputs from these models; however, when I fine-tuned the model again on it, it began to overfit.
Despite applying standard techniques to mitigate overfitting, I have not been able to achieve the desired results. I am now looking for a stable and effective roadmap, so that the model performs well on outputs from both the older models in the original dataset and the newer ones.
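For concreteness, this is roughly the kind of regularized fine-tuning setup I have in mind (a minimal sketch using the Hugging Face `Trainer`; `combined_train` and `combined_eval` are placeholders for a mix of the original data and the new Gemini/GPT samples, and all hyperparameters are illustrative rather than my exact configuration):

```python
from transformers import (
    RobertaTokenizerFast,
    RobertaForSequenceClassification,
    TrainingArguments,
    Trainer,
    EarlyStoppingCallback,
)

# Tokenizer used to build the tokenized datasets (tokenization step not shown).
tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

# Optionally freeze the embeddings and the lower encoder layers so the small
# new dataset only updates the top of the network (reduces overfitting risk).
for param in model.roberta.embeddings.parameters():
    param.requires_grad = False
for layer in model.roberta.encoder.layer[:6]:
    for param in layer.parameters():
        param.requires_grad = False

args = TrainingArguments(
    output_dir="./detector-v2",
    num_train_epochs=3,
    learning_rate=1e-5,               # small LR for a second round of fine-tuning
    weight_decay=0.01,                # L2-style regularization
    per_device_train_batch_size=16,
    evaluation_strategy="epoch",      # renamed to eval_strategy in newer transformers releases
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
)

# combined_train / combined_eval are placeholders for a dataset that mixes the
# original open-source data with the new Gemini/GPT samples, so the model does
# not drift away from the older distribution while learning the new one.
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=combined_train,
    eval_dataset=combined_eval,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
trainer.train()
```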
Additionally, I need help setting up and optimizing a local GPU configuration to support model training and fine-tuning.
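To illustrate what I mean by GPU configuration, here is a rough sketch of the environment check and the memory-oriented training settings I am asking about (the values are illustrative and assume a single consumer-grade card with limited VRAM):

```python
import torch
from transformers import TrainingArguments

# Quick sanity check of the local GPU environment.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
    print("VRAM (GB):", torch.cuda.get_device_properties(0).total_memory / 1024**3)

# GPU-oriented knobs that typically need tuning for local fine-tuning;
# exact values depend on the card (these assume roughly 8-12 GB of VRAM).
gpu_args = TrainingArguments(
    output_dir="./detector-gpu",
    per_device_train_batch_size=8,    # lower this if you hit out-of-memory errors
    gradient_accumulation_steps=4,    # simulates a larger effective batch size
    fp16=True,                        # mixed precision; bf16=True on Ampere or newer cards
    gradient_checkpointing=True,      # trades extra compute for lower memory use
    dataloader_num_workers=2,         # parallel data loading on the CPU side
)
```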