
Fantastic! Let’s dive right in.
Welcome to your next adventure in machine learning! In this guide, we’ll explore the PEFT Fine-tune Bloom-560m-tagger project, a cutting-edge endeavor that combines the power of language models with efficient fine-tuning techniques. Whether you’re a seasoned data scientist or a curious enthusiast, this tutorial will equip you with the knowledge and tools to implement and understand the intricacies of this project.
Setting Up the Environment
Before we dive into the code, let’s ensure our workspace is ready. Here’s how we kick things off:
Installing Necessary Libraries
!pip install -q bitsandbytes datasets accelerate loralib
!pip install -q git+https://github.com/huggingface/transformers.git@main git+https://github.com/huggingface/peft.git
Here, we install the essential libraries needed for our project:
- bitsandbytes: Optimizes deep learning models for faster training and reduced memory usage; it provides the 8-bit quantization we rely on later.
- datasets: Provides easy access to a vast range of datasets, simplifying data loading and preprocessing.
- accelerate: A library by Hugging Face to speed up and distribute computations on your machine.
- loralib: Microsoft's reference implementation of LoRA (Low-Rank Adaptation) layers, used for parameter-efficient fine-tuning.
- We also fetch the latest versions of transformers and peft from their respective GitHub repositories.
Initializing the Workspace
from huggingface_hub import notebook_login
notebook_login()
This line imports and invokes the notebook_login function from the huggingface_hub library, which prompts you to log in to Hugging Face's Hub. This step is crucial for accessing models and pushing your trained model to the Hub.
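If you are running this in a plain Python script or terminal rather than a notebook, an equivalent option (not used in the original notebook) is huggingface_hub's login function, or the huggingface-cli login command:
from huggingface_hub import login
login(token="hf_...")  # paste a token from https://huggingface.co/settings/tokens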
!nvidia-smi -L
We use nvidia-smi -L to list the NVIDIA GPUs available in our environment, ensuring we have the hardware acceleration needed for training our model.
Preparing the Model and Tokenizer
Now, let’s set up our model and tokenizer:
import os
os.environ["CUDA_VISIBLE_DEVICES"]="0"
import torch
import torch.nn as nn
import bitsandbytes as bnb
from transformers import AutoTokenizer, AutoConfig, AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained(
"bigscience/bloomz-560m",
load_in_8bit=True,
device_map='auto',
)
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-560m")
- We ensure that our script uses the first GPU by setting CUDA_VISIBLE_DEVICES="0".
- torch and torch.nn are imported for tensor and neural network operations; bitsandbytes is imported because it provides the 8-bit quantization backend used when loading the model.
- We then load a pre-trained model and tokenizer from the bigscience/bloomz-560m repository using AutoModelForCausalLM and AutoTokenizer. The load_in_8bit option reduces memory usage by loading the weights in 8-bit precision, and device_map='auto' automatically places the model's layers onto the available devices.
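As an optional sanity check (not in the original notebook), you can confirm that 8-bit loading worked and see where the layers landed; get_memory_footprint and hf_device_map are standard transformers/accelerate attributes:
print(f"Memory footprint: {model.get_memory_footprint() / 1e6:.0f} MB")
print(model.hf_device_map)  # shows which device each block was placed on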
Freezing the Model Parameters
for param in model.parameters():
    param.requires_grad = False  # freeze the model - train adapters later
    if param.ndim == 1:
        # cast the small parameters (e.g. layernorm) to fp32 for stability
        param.data = param.data.to(torch.float32)
Here, we freeze all parameters of the model to prevent them from being updated during training. This is essential for fine-tuning specific parts of the model, like adapters, without altering the entire network. Parameters with one dimension are cast to 32-bit floating point for numerical stability.
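As a quick check (not part of the original notebook), you can confirm the freeze took effect; at this point no parameter should require gradients:
n_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Trainable parameters after freezing: {n_trainable}")  # expected: 0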
Enabling Gradient Checkpointing and Input Gradients
model.gradient_checkpointing_enable() # reduce number of stored activations
model.enable_input_require_grads()
Gradient checkpointing is activated to reduce the number of stored activations and thus memory usage during training. enable_input_require_grads() makes the embedding outputs require gradients, which is needed so that gradients can flow back to the adapter layers when gradient checkpointing is combined with a frozen base model.
Adjusting the Model’s Output
class CastOutputToFloat(nn.Sequential):
    def forward(self, x): return super().forward(x).to(torch.float32)

model.lm_head = CastOutputToFloat(model.lm_head)
This small wrapper casts the output of the language-model head to 32-bit floating point, which keeps the loss computation numerically stable while the rest of the model runs in reduced precision.
Integrating PEFT with LoRA
Now, we integrate the Parameter Efficient Fine-tuning (PEFT) with LoRA (Low-Rank Adaptation) techniques:
from peft import LoraConfig, get_peft_model
config = LoraConfig(
    r=16,  # rank of the low-rank update matrices
    lora_alpha=32,  # alpha scaling
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)
model = get_peft_model(model, config)
LoraConfig defines the configuration for LoRA: r is the rank of the low-rank update matrices, lora_alpha is the scaling factor applied to the update, lora_dropout is the dropout rate on the LoRA layers, and task_type tells PEFT we are fine-tuning a causal language model. get_peft_model then wraps our model with these settings, injecting the trainable LoRA adapters while keeping the base weights frozen.
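To get a feel for what r=16 buys us, here is a rough back-of-the-envelope estimate of the parameters LoRA adds. It assumes PEFT's default behaviour of adapting BLOOM's query_key_value projections and bloom-560m's architecture (24 layers, hidden size 1024); treat the numbers as an illustration rather than exact output:
hidden, layers, r = 1024, 24, 16
d_in, d_out = hidden, 3 * hidden   # query_key_value projects 1024 -> 3072
per_layer = r * (d_in + d_out)     # LoRA adds A (r x d_in) and B (d_out x r)
print(per_layer * layers)          # roughly 1.6M trainable params vs ~560M total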
Print Trainable Parameters Function
print_trainable_parameters(model)
This helper function is not shown in the walkthrough; it prints how many of the model's parameters are trainable versus frozen, making the scale of the fine-tuning explicit. A minimal sketch follows.
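Here is a minimal sketch of the helper as it appears in the standard PEFT examples (define it before the call above); treat it as an assumption about the omitted code:
def print_trainable_parameters(model):
    """Print the number of trainable parameters versus all parameters."""
    trainable_params = 0
    all_params = 0
    for _, param in model.named_parameters():
        all_params += param.numel()
        if param.requires_grad:
            trainable_params += param.numel()
    print(
        f"trainable params: {trainable_params} || all params: {all_params} "
        f"|| trainable%: {100 * trainable_params / all_params:.2f}"
    )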
Loading the Dataset
from datasets import load_dataset
data = load_dataset("Abirate/english_quotes")
We load a dataset of English quotes, which will be used for training the model. The datasets library simplifies fetching and preparing data for training.
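If you want a quick look at what was downloaded (an optional check, not in the original), printing the dataset shows its splits and columns; expect a train split with quote and tags fields, which is what we use next:
print(data)  # DatasetDict: expect a 'train' split containing quote/tags (and related) columns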
Preprocessing the Data
def merge_columns(example):
    example["prediction"] = example["quote"] + " ->: " + str(example["tags"])
    return example

data['train'] = data['train'].map(merge_columns)
data['train'][0]
In this section, we preprocess the data by merging the quote and tags columns into a single prediction column, formatting it for the model's expected input structure.
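To make the target format concrete, here is what merge_columns produces on a made-up record (illustrative only, not an actual row from the dataset):
sample = {"quote": "Simplicity is the ultimate sophistication.", "tags": ["simplicity", "design"]}
print(merge_columns(sample)["prediction"])
# Simplicity is the ultimate sophistication. ->: ['simplicity', 'design']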
Tokenizing the Data
data = data.map(lambda samples: tokenizer(samples['prediction']), batched=True)
The dataset is tokenized using our previously loaded tokenizer, converting text data into a format suitable for model training.
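After this step each example carries the tokenizer's outputs alongside the original columns; a quick way to confirm this (optional, not in the original) is to inspect the column names and the first tokenized example:
print(data['train'].column_names)          # now includes input_ids and attention_mask
print(len(data['train'][0]['input_ids']))  # number of tokens in the first example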
Setting Up the Trainer
import transformers  # needed for Trainer, TrainingArguments and the data collator

trainer = transformers.Trainer(
    model=model,
    train_dataset=data['train'],
    args=transformers.TrainingArguments(
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        warmup_steps=100,
        max_steps=200,
        learning_rate=2e-4,
        fp16=True,
        logging_steps=25,
        output_dir='outputs'
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False)
)
model.config.use_cache = False  # silence the warnings. Please re-enable for inference!
Here, we set up the Trainer object from the transformers library, specifying training arguments such as batch size, learning rate, number of steps, and the output directory. With per_device_train_batch_size=4 and gradient_accumulation_steps=4, the effective batch size is 16. The data_collator batches the tokenized examples and, with mlm=False, prepares labels for causal language modeling.
Training the Model
trainer.train()
With everything set up, we kick off the training process by calling the train method of our Trainer object.
Sharing the Model on Hugging Face Hub
model.push_to_hub("<your_username>/bloomz-560m-tagger",
                  use_auth_token=True,
                  commit_message="basic training",
                  private=True)
After training, we push the model to the Hugging Face Hub, making it accessible for further use and sharing. Because the model is wrapped with PEFT, only the small LoRA adapter weights and their configuration are uploaded, not the full base model.
Loading and Using the Trained Model
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
peft_model_id = "<your_username>/bloomz-560m-tagger"
config = PeftConfig.from_pretrained(peft_model_id)
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, return_dict=True, load_in_8bit=True, device_map='auto')
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
model = PeftModel.from_pretrained(model, peft_model_id)
batch = tokenizer("“Training models with PEFT and LoRa is cool” ->: ", return_tensors='pt')
with torch.cuda.amp.autocast():
    output_tokens = model.generate(**batch, max_new_tokens=50)

print('\n\n', tokenizer.decode(output_tokens[0], skip_special_tokens=True))
In the final section, we demonstrate how to load and use the fine-tuned model. We tokenize a sample prompt and generate text using the model, showcasing the practical application of our fine-tuned language model.
Learn How To Build AI Projects
Now, if you are interested in upskilling in 2024 with AI development, check out these 6 advanced AI projects with Golang, where you'll learn about building with AI using current best practices. Here's the link.
Conclusion
Congratulations! You’ve just navigated through the comprehensive setup and execution of the PEFT Fine-tune Bloom-560m-tagger project. By breaking down each step and understanding the code, you’ve gained valuable insights into the world of machine learning and model fine-tuning. This hands-on project not only enhances your coding skills but also deepens your understanding of advanced AI methodologies. Now, it’s your turn to experiment, explore, and excel in your machine learning journey!