chore: trainer

2025-02-13 21:42:03 +06:00
parent df7883dde3
commit f013e8efe6
7 changed files with 431 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,63 @@
+# Unsloth LoRA scripts
+
+## Installation
+
+1. Clone the repository:
+
+```bash
+git clone https://git.hye.su/mira/unsloth-train-scripts.git
+cd unsloth-train-scripts
+```
+
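+2. Install PyTorch and Unsloth. With PyTorch already installed, the helper script below prints the matching `pip install` command for Unsloth; run the command it prints: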
+
+```bash
+wget -qO- https://raw.githubusercontent.com/unslothai/unsloth/main/unsloth/_auto_install.py | python -
+pip install gdown  # Optional: Only needed for Google Drive datasets
+```
+
+## Project Structure
+
+```
+unsloth-train-scripts/
+├── config.py          # Configuration settings
+├── data_loader.py     # Dataset loading and processing
+├── model_handler.py   # Model initialization and PEFT setup
+├── trainer.py         # Training loop and metrics
+├── main.py            # Main training script
+└── README.md          # This file
+```
+
+## Configuration
+
+All configuration settings live in `config.py`, in the `TrainingConfig` dataclass.
+
+To change the defaults, edit the fields of `TrainingConfig`:
+
+```python
+@dataclass
+class TrainingConfig:
+    base_model: str = "unsloth/Qwen2.5-7B"
+    max_seq_length: int = 16384
+    # ... modify other parameters as needed
+```
+
+## Usage
+
+```bash
+python main.py \
+    --base_model mistralai/Mistral-7B-v0.1 \
+    --dataset path/to/your/dataset.json \
+    --output_dir ./custom_output \
+    --hub_token "secret"
+```
+
+### Using a Google Drive Dataset
+
+Train using a dataset stored on Google Drive:
+
+```bash
+python main.py \
+    --dataset https://drive.google.com/file/d/your_file_id/view \
+    --output_dir ./drive_output
+```