How to Fine-Tune GPT-J on Alpaca GPT-4

Souvik Datta

Aug 31, 2023 • 7 min read

In this blog post, we're excited to share a recent experiment where we leveraged the capabilities of MonsterAPI's no-code LLM fine-tuner. Our focus was on fine-tuning the GPT-J model, which comes with an impressive 6 billion parameters.

The dataset for this task was the Alpaca GPT-4 dataset, containing a collection of instructions paired with the corresponding responses generated by the GPT-4 model.

The standout feature of this experiment is the accessibility that MonsterAPI's LLM fine-tuner brings to the table. With a user-friendly GUI, this tool streamlines what used to be a complex process and removes the need for a specialized machine learning team, all at a low cost.

But first, what is the vicgalle/alpaca-gpt4 Dataset?

The vicgalle/alpaca-gpt4 dataset is an English instruction-following dataset generated by GPT-4 from Alpaca prompts. It is designed for fine-tuning large language models (LLMs).

The dataset comprises 52,000 instruction-following instances, each generated by GPT-4 using Alpaca prompts. Its structure is straightforward:

instruction: A unique str description of the task.
input: An optional str context or input for the task.
output: The str answer to the instruction, generated by GPT-4.
text: A concatenated str field containing all the above components and the initial Alpaca prompt.
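
To make the relationship between the four fields concrete, here is a minimal sketch of how a `text` value could be assembled from the other three. The templates below are the widely used Alpaca prompt wording and are an assumption here; the exact strings stored in the dataset's `text` column should be checked against the dataset itself. The `build_text` helper and the sample record are hypothetical and exist only for illustration.

```python
# Alpaca-style prompt templates (assumed wording, not copied from the dataset).
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{output}"
)


def build_text(record: dict) -> str:
    """Join instruction, optional input, and output into one training string."""
    # An empty or missing `input` selects the shorter template.
    template = PROMPT_WITH_INPUT if record.get("input") else PROMPT_NO_INPUT
    return template.format(**record)


# Hypothetical record, written for this example only.
sample = {
    "instruction": "Translate the sentence to French.",
    "input": "Good morning.",
    "output": "Bonjour.",
}
print(build_text(sample))
```

Records with an empty `input` fall back to the shorter template, so the model never sees an empty "### Input:" section.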

What sets the "alpaca-gpt4" dataset apart is its approach to creation. Unlike the original Alpaca dataset that utilized text-davinci-003 for prompt completions, the "alpaca-gpt4" dataset leverages GPT-4, resulting in responses of higher quality and depth.

What is LLM Fine-Tuning and why is it so important?

Language models like GPT-J are initially trained on vast amounts of general language data to learn patterns, grammar, and context. However, applying them directly to specific tasks or domains may not yield optimal results.

Fine-tuning comes to the rescue by making the model:

1. More accurate

2. Context-aware

3. Aligned with the target application

Fine-tuning enables us to tailor the pre-trained models to specific tasks, effectively transferring their general language knowledge to the specialized task of our choice.

However, fine-tuning an LLM is not as easy as it looks on the surface. Developers may encounter several obstacles when attempting to fine-tune foundational language models like GPT-J or LLaMA.

These challenges include:

Complex Setups: Configuring GPUs and software dependencies for fine-tuning foundational models can be intricate and time-consuming, necessitating manual management and setup.
Memory Constraints: Fine-tuning large language models demands significant GPU memory, which can be limiting for developers with resource constraints.
GPU Costs: GPU usage for fine-tuning can be expensive, making it a luxury not all developers can afford.
Lack of Standardized Methodologies: The absence of standardized practices can make the fine-tuning process frustrating and time-consuming, as developers may need to navigate through various documentation and forums to find the best approach.
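
To put the memory constraint in numbers, the sketch below gives a rough back-of-the-envelope estimate for a 6-billion-parameter model such as GPT-J. It relies on a common rule of thumb, not on anything measured in this experiment: about 2 bytes per parameter to hold fp16 weights, and about 16 bytes per parameter for full fine-tuning with Adam in mixed precision (fp16 weights and gradients plus fp32 master weights and two optimizer moments). Activations are ignored, so real usage is higher.

```python
def memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Rough memory estimate in GB (1 GB = 1e9 bytes), activations excluded."""
    return n_params * bytes_per_param / 1e9


GPTJ_PARAMS = 6e9  # "6 billion parameters", as stated in the post

# Rule-of-thumb byte counts (assumptions, not measurements).
weights_only_fp16 = memory_gb(GPTJ_PARAMS, 2)
full_finetune_adam = memory_gb(GPTJ_PARAMS, 16)

print(f"fp16 weights only:       {weights_only_fp16:.0f} GB")
print(f"full fine-tune with Adam: {full_finetune_adam:.0f} GB")
```

Under these assumptions the weights alone take about 12 GB and full fine-tuning about 96 GB, which is more than a single 16 GB or 32 GB V100 can hold.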

As a result, it can be challenging for developers to tailor a model to meet their specific needs. Nevertheless, despite these challenges, fine-tuning remains a crucial step in harnessing the full potential of LLMs.

How can MonsterAPI be used to solve these challenges around LLM fine-tuning?

MonsterAPI makes the often intricate fine-tuning process straightforward and quick, reducing the complex setup to a simple, easy-to-follow, UI-based workflow.

With MonsterAPI's no-code LLM FineTuner, those challenges are effectively addressed. Here's how it benefits you:

Simplified Setup: MonsterAPI provides a user-friendly, intuitive interface that removes the effort of setting up a GPU environment by deploying your fine-tuning jobs automatically on pre-configured GPU instances. This eliminates the need for manual hardware specifications and low-level configuration.
Optimized Memory Utilization: MonsterAPI FineTuner optimizes memory usage during the process, making large language model Fine-tuning manageable even with limited GPU memory.
Low-cost GPU Access: Monster API offers access to its decentralized GPU network, providing on-demand access to affordable GPU instances, reducing the overall cost and complexity associated with Fine-tuning LLMs.
Standardized workflow: The platform provides predefined tasks and recipes, guiding developers through the Fine-tuning process without the need to search through extensive documentation and forums. It also allows for flexibility to create custom tasks.

How to get started with finetuning LLMs like GPT-J?

In just five simple steps, you can set up your fine-tuning task and experience remarkable results.

So, let's get started and explore the process together!

1. Select a Language Model for Finetuning: Choose from popular open-source models like Llama 2 7B, GPT-J 6B, or StableLM 7B.

2. Select or Create a Task: Next, choose from pre-defined tasks or create a custom one to suit your needs. If your task is unique, you can even choose the "Other" option to create a custom task.

3. Select a HuggingFace Dataset: Monster API seamlessly integrates with HuggingFace datasets, offering a wide selection. With just a few clicks, your selected dataset is automatically formatted and ready for use. In our case, we used the vicgalle/alpaca-gpt4 dataset.

4. Specify Hyper-parameters: Monster API simplifies the process by pre-filling most of the hyper-parameters based on your selected LLM. You have the freedom to customize parameters such as epochs, learning rate, cutoff length, gradient accumulation steps and more.

5. Review and Submit Finetuning Job: After setting up all the parameters, you can review everything on the summary page. We know the importance of accuracy, so we provide this step to ensure you have full control. Once you're confident with the details, simply submit the job. From there, we take care of the rest.

That’s it! In just five easy steps, your job is submitted for FineTuning an LLM of your choice. After successfully setting up your fine-tuning job using Monster API, you can monitor the performance through detailed logs on WandB.
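
The hyper-parameters from step 4 interact with each other, and a short sketch can show how. The `FinetuneConfig` class below is hypothetical and is not MonsterAPI's schema. Only the epoch count (10) and the dataset size (52,000) come from this post; every other value is a placeholder chosen for illustration, and the calculation assumes a single device with the whole dataset used for training.

```python
import math
from dataclasses import dataclass


@dataclass
class FinetuneConfig:
    """Hypothetical container for the hyper-parameters named in step 4."""
    epochs: int = 10                      # from the post
    learning_rate: float = 2e-4           # placeholder value
    cutoff_len: int = 512                 # placeholder: max tokens per example
    micro_batch_size: int = 4             # placeholder: examples per forward pass
    gradient_accumulation_steps: int = 8  # placeholder

    @property
    def effective_batch_size(self) -> int:
        # Gradients from several micro-batches are summed before each update.
        return self.micro_batch_size * self.gradient_accumulation_steps

    def optimizer_steps(self, n_examples: int) -> int:
        # One optimizer update per effective batch, repeated every epoch.
        steps_per_epoch = math.ceil(n_examples / self.effective_batch_size)
        return self.epochs * steps_per_epoch


cfg = FinetuneConfig()
print(cfg.effective_batch_size)     # 32
print(cfg.optimizer_steps(52_000))  # 16250
```

Raising the gradient accumulation steps increases the effective batch size without needing more GPU memory per step, at the cost of fewer optimizer updates per epoch.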

Outcome of using Monster API LLM Finetuner:

We were able to fine-tune GPT-J on Alpaca GPT-4 Dataset for 10 epochs for as low as $50.

The results of our fine-tuning job were impressive: the model learned and adapted to the chosen task of "Instruction-finetuning" on the Alpaca GPT-4 dataset. Over a span of 10 hours and 10 epochs, we achieved significant progress. The graphs below, taken from the WandB metrics of our fine-tuning job, show the training loss and evaluation loss.

Train Loss:

The training loss converged to 0.5815, with the moving average settling at 0.9179. Loss here measures the difference between the model's outputs and the ideal outputs: the smaller, the better.
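
The post does not say how the moving average was computed. One common choice for smoothing a loss curve is an exponential moving average, sketched below with made-up loss values. Because a moving average lags behind a falling curve, it can sit above the latest raw value, which is one possible reason the two numbers above differ.

```python
def exponential_moving_average(values, alpha=0.1):
    """Smooth a sequence; smaller alpha means heavier smoothing and more lag."""
    smoothed = []
    avg = values[0]  # start from the first observation
    for v in values:
        avg = alpha * v + (1 - alpha) * avg
        smoothed.append(avg)
    return smoothed


# Made-up, steadily decreasing loss values (not the run's actual data).
raw_losses = [2.0, 1.6, 1.3, 1.1, 0.9, 0.8, 0.7, 0.6]
smoothed_losses = exponential_moving_average(raw_losses)

print(f"last raw loss:      {raw_losses[-1]:.4f}")
print(f"last smoothed loss: {smoothed_losses[-1]:.4f}")
```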

Evaluation Loss:

These WandB Metrics graphs offer valuable insights into the fine-tuning process, allowing for a detailed analysis of various aspects such as loss, learning rate, GPU power usage, GPU memory access, GPU temperature, etc.

Putting the Model to the Test

After successfully fine-tuning the language model using MonsterAPI's LLM FineTuner, it was time to put the model to the test.

We conducted a comprehensive evaluation to assess its performance and suitability for real-world applications. To gain valuable insights, we compared its performance against the base model using the same prompts.

The evaluation included a variety of tasks, ensuring a thorough examination of the fine-tuned model's capabilities.

Performance:

We compared the performance of the base model and the fine-tuned model on three popular benchmarks:

arc_easy: (A benchmark intended to test the reasoning capabilities of the large language model)
hellaswag: (A benchmark that tests commonsense sentence-completion capabilities)
truthfulqa_mc: (A benchmark that tests the truthfulness of the answers returned by the model)
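
Multiple-choice benchmarks like these are commonly scored by asking the model for the log-likelihood of each answer option and counting how often the highest-scoring option is the correct one. The sketch below shows that scoring rule only. The post does not state which evaluation tool was used, the function names are hypothetical, and the log-likelihood numbers are invented for illustration.

```python
def pick_answer(option_logprobs):
    """Return the index of the option the model finds most likely."""
    return max(range(len(option_logprobs)), key=option_logprobs.__getitem__)


def accuracy(items):
    """items: list of (option_logprobs, gold_index) pairs."""
    correct = sum(pick_answer(logprobs) == gold for logprobs, gold in items)
    return correct / len(items)


# Invented log-likelihoods for three questions (not real model outputs).
toy_items = [
    ([-4.2, -1.3, -3.8, -5.0], 1),  # model prefers option 1, gold is 1: correct
    ([-2.0, -2.5, -0.9], 2),        # model prefers option 2, gold is 2: correct
    ([-1.1, -0.7], 0),              # model prefers option 1, gold is 0: wrong
]
print(f"accuracy: {accuracy(toy_items):.3f}")
```

Running the same scoring over the base model and the fine-tuned model with identical questions gives the kind of side-by-side comparison described here.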

Benchmark Results:

The fine-tuned model outperformed the base model on all three benchmarks, even on the extremely challenging truthfulqa_mc.

Input Prompt:

The fine-tuned model's ability to understand the nuances of tasks and thus provide accurate and contextually relevant outputs showcased the benefits of fine-tuning language models.

Download the Fine-Tuned Model weights from Hugging Face

Cost Analysis of Finetuning GPT-J on Monster API:

The fine-tuning journey with MonsterAPI's LLM FineTuner is characterized by its remarkable simplicity and affordability. With just a few clicks for setup and configuration, your task becomes operational in under 30 seconds. All of this comes at an economical price of only $50.

In stark contrast, attempting a similar experiment using 4xV100s on a conventional cloud platform could incur expenses nearing $90. This is accompanied by a considerable investment of time and manual labour from the developer for setup. Our methodology eradicates this cumbersome process, ensuring outstanding outcomes without unwarranted financial strain.

By embracing Monster API, the entire fine-tuning process gains a 1.8x boost in cost-effectiveness compared to traditional cloud alternatives. The savings generated from utilizing Monster API will progressively amplify as you expand, enabling you to attain exceptional outcomes without bearing excessive financial weight.
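
The 1.8x figure follows directly from the two prices quoted above. The short calculation below reproduces it and also expresses the difference as a percentage saving; it uses only the $50 and $90 figures from this post.

```python
monster_api_cost = 50.0  # USD, fine-tuning cost quoted in this post
cloud_cost = 90.0        # USD, approximate 4xV100 cloud cost quoted in this post

cost_ratio = cloud_cost / monster_api_cost
savings_pct = (cloud_cost - monster_api_cost) / cloud_cost * 100

print(f"cost-effectiveness ratio: {cost_ratio:.1f}x")
print(f"savings vs. cloud:        {savings_pct:.1f}%")
```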

The Benefits of using MonsterAPI LLM finetuner:

The true value of our no-code LLM FineTuner lies in its dedication to simplifying and democratizing the use of large language models (LLMs).

By addressing common barriers like technical complexities, memory constraints, high GPU costs, and lack of standardized practices, our platform makes AI model fine-tuning accessible and efficient for all.

In doing so, it empowers developers to fully leverage LLMs, fostering the development of more sophisticated AI applications.

Ready to finetune an LLM for your business needs?

Check out our documentation on Finetuning an LLM.