Training procedure

The following bitsandbytes quantization config was used during training:

load_in_8bit: True
load_in_4bit: False
llm_int8_threshold: 6.0
llm_int8_skip_modules: None
llm_int8_enable_fp32_cpu_offload: False
llm_int8_has_fp16_weight: False
bnb_4bit_quant_type: fp4
bnb_4bit_use_double_quant: False
bnb_4bit_compute_dtype: float32
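
The settings above map onto the keyword arguments of `BitsAndBytesConfig` in Transformers. The following is a minimal sketch of how such a config could be rebuilt, not the author's exact code. The helper names `build_quantization_config` and `load_quantized_base` are hypothetical, and the use of `device_map="auto"` is an assumption. The imports are deferred so the dictionary can be inspected without `transformers` or a GPU installed.

```python
# Values copied from the quantization config listed in this model card.
BNB_KWARGS = {
    "load_in_8bit": True,
    "load_in_4bit": False,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "fp4",
    "bnb_4bit_use_double_quant": False,
    "bnb_4bit_compute_dtype": "float32",
}


def build_quantization_config():
    """Hypothetical helper: build a BitsAndBytesConfig from the values above.

    Requires transformers, torch, and bitsandbytes to be installed.
    """
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(**BNB_KWARGS)


def load_quantized_base(checkpoint="facebook/opt-2.7b"):
    """Hypothetical helper: load the base checkpoint in 8-bit.

    device_map="auto" is an assumption, not a setting stated in this card.
    """
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        checkpoint,
        quantization_config=build_quantization_config(),
        device_map="auto",
    )
```

Because `load_in_8bit` is `True` and `load_in_4bit` is `False`, the `bnb_4bit_*` entries are recorded but are not expected to affect loading.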

Model Description

For more information on how this model was created, see the training notebook: https://github.com/DunnBC22/NLP_Projects/blob/main/OPT%20Models/Essays%20With%20Instructions%20-%20Fine-Tune%20-%20OPT.ipynb

Intended uses & limitations

This model is intended as a demonstration of what fine-tuning OPT on instruction-style essay data can do. Its output quality is limited mainly by the training data.

Training & Evaluation Dataset

Dataset Source: https://huggingface.co/datasets/ChristophSchuhmann/essays-with-instructions

Hyperparameters Used

Hyperparameter	Value
Model Checkpoint	facebook/opt-2.7b
per_device_train_batch_size	8
gradient_accumulation_steps	4
fp16	True
warmup_steps	75
learning_rate	2e-4
Training Steps	150
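
As a rough illustration of what these values imply, the sketch below collects them in a dictionary and derives the effective batch size. Mapping "Training Steps" to `max_steps` is an assumption, as are the single-device default and the hypothetical helper names `effective_batch_size` and `build_training_arguments`. It is not the author's training script.

```python
# Values copied from the hyperparameter table in this model card.
HYPERPARAMETERS = {
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 4,
    "fp16": True,
    "warmup_steps": 75,
    "learning_rate": 2e-4,
    "max_steps": 150,  # assumption: "Training Steps" corresponds to max_steps
}


def effective_batch_size(hp, num_devices=1):
    """Sequences contributing to each optimizer step.

    num_devices=1 is an assumption; the card does not state the device count.
    """
    return (
        hp["per_device_train_batch_size"]
        * hp["gradient_accumulation_steps"]
        * num_devices
    )


def build_training_arguments(output_dir="outputs"):
    """Hypothetical helper: pass the values to transformers.TrainingArguments.

    Requires transformers and torch; fp16=True also needs a CUDA device.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)


batch = effective_batch_size(HYPERPARAMETERS)          # 8 * 4 = 32
sequences_seen = batch * HYPERPARAMETERS["max_steps"]  # 32 * 150 = 4800
warmup_fraction = HYPERPARAMETERS["warmup_steps"] / HYPERPARAMETERS["max_steps"]
```

Under these assumptions, each optimizer step covers 32 sequences, the run processes 4,800 sequences in total, and warmup spans the first half of training.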

Framework versions

Library	Version
Python	3.10.1
Torch	2.0.1+cu118
Datasets	2.14.4
Transformers	4.31.0
PEFT	0.4.0

Metric

Perplexity = 9.46
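
The card does not state how perplexity was computed. A common convention, assumed here, is the exponential of the mean per-token cross-entropy loss (in nats) on the evaluation set. The sketch below shows that relationship; the function names are hypothetical.

```python
import math


def perplexity_from_loss(mean_nll):
    """Perplexity as exp of the mean negative log-likelihood (in nats)."""
    return math.exp(mean_nll)


def perplexity_from_token_losses(token_nlls):
    """Average per-token losses first, then exponentiate."""
    if not token_nlls:
        raise ValueError("token_nlls must not be empty")
    return math.exp(sum(token_nlls) / len(token_nlls))
```

Under this definition, a perplexity of 9.46 would correspond to a mean evaluation loss of about ln(9.46) ≈ 2.25 nats per token.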

License

This model is a fine-tuned version of Meta's OPT-2.7B model.

The original OPT model is released under a custom license that does not correspond to standard open-source licenses. The training dataset (essays-with-instructions) is licensed under Apache 2.0.

Users must comply with the original OPT license as well as the dataset license.

License Notice

This model is a fine-tuned derivative of a pretrained model. Users must comply with the original model license.

Dataset Notice

This model was fine-tuned on third-party datasets which may have separate licenses or usage restrictions.


DunnBC22/opt-2.7b-Fine_Tuned-Essays_with_Instructions