Fine-Tuning Large Models with LLaMA-Factory

GitHub repository
https://github.com/hiyouga/llama-factory

Setting up the environment

git clone --depth 1 https://github.com/hiyouga/llama-factory.git
cd llama-factory

Create a virtual environment inside the llama-factory directory

conda create -p ./venv python=3.10

Activate the environment

conda activate ./venv

Install the dependencies in the virtual environment

python -m pip install -e .

Download the dataset

I use the demo dataset that ships with the repository:
llama-factory/data/glaive_toolcall_zh_demo.json
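
Before training, it can help to look at what the bundled file contains. The sketch below assumes the file is a JSON list of ShareGPT-style records whose `conversations` field holds turns tagged with a `from` role. Those field names are my assumption about the file layout, and `summarize` is a hypothetical helper, not part of LLaMA-Factory.

```python
import json
from collections import Counter
from pathlib import Path


def summarize(records):
    """Return (record count, Counter of speaker roles) for ShareGPT-style records."""
    roles = Counter()
    for record in records:
        # "conversations" and "from" are assumed field names.
        for turn in record.get("conversations", []):
            roles[turn.get("from", "?")] += 1
    return len(records), roles


# Assumed location: run this from the llama-factory checkout root.
path = Path("data/glaive_toolcall_zh_demo.json")
if path.exists():
    with path.open(encoding="utf-8") as f:
        count, roles = summarize(json.load(f))
    print(f"{count} records; turns per role: {dict(roles)}")
else:
    print(f"{path} not found; run this from the llama-factory directory")
```

If the role counts come back as `?`, the file probably uses a different layout than assumed here.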

Download the model

I use qwen-1_8b-chat.
Local path: /media/wmx/soft1/huggingface_cache/qwen-1_8b-chat
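
A quick sanity check of the local model directory can save a failed launch later. This is a minimal sketch that assumes a usual Hugging Face layout (a `config.json` plus `*.safetensors` or `*.bin` weight files). `check_model_dir` is a hypothetical helper, and the path is the one from my machine.

```python
from pathlib import Path


def check_model_dir(model_dir):
    """Return a list of problems found in a local Hugging Face model directory."""
    root = Path(model_dir)
    if not root.is_dir():
        return [f"{root} is not a directory"]
    problems = []
    if not (root / "config.json").is_file():
        problems.append("config.json is missing")
    # Weight files are assumed to be stored as *.safetensors or *.bin.
    has_weights = any(root.glob("*.safetensors")) or any(root.glob("*.bin"))
    if not has_weights:
        problems.append("no *.safetensors or *.bin weight files found")
    return problems


# An empty list means the basic files are in place.
print(check_model_dir("/media/wmx/soft1/huggingface_cache/qwen-1_8b-chat"))
```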

Launch the web UI

I am running on a local machine with a single RTX 4070 Ti SUPER (16 GB) GPU.

CUDA_VISIBLE_DEVICES=0 GRADIO_SHARE=1 llamafactory-cli webui

Configure the parameters

Because this is a first-generation Qwen model rather than Qwen1.5 or later, train.lora_target must be set to c_attn. Any other value causes an error!
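
The reason, as far as I understand it, is that first-generation Qwen fuses the query, key and value projections into a single layer named `c_attn`, while Qwen1.5 and later use separate `q_proj`, `k_proj` and `v_proj` layers. LoRA targets are matched against module names, so a target that does not exist in the model selects nothing. The sketch below imitates that suffix-style matching in plain Python. The module names are illustrative lists I wrote by hand, not dumped from a real checkpoint, and `matching_modules` is a hypothetical helper.

```python
def matching_modules(module_names, targets):
    """Return the module names that a LoRA target list would select.

    A module matches when its full name equals a target or ends with
    "." + target (suffix-style matching).
    """
    return [
        name
        for name in module_names
        if any(name == t or name.endswith("." + t) for t in targets)
    ]


# Illustrative names for one transformer block of each model family (assumed).
QWEN_V1_BLOCK = [
    "transformer.h.0.attn.c_attn",  # fused query/key/value projection
    "transformer.h.0.attn.c_proj",  # attention output projection
    "transformer.h.0.mlp.w1",
    "transformer.h.0.mlp.w2",
    "transformer.h.0.mlp.c_proj",
]
QWEN15_BLOCK = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.self_attn.v_proj",
    "model.layers.0.self_attn.o_proj",
]

print(matching_modules(QWEN_V1_BLOCK, ["q_proj", "v_proj"]))  # [] : nothing to adapt
print(matching_modules(QWEN_V1_BLOCK, ["c_attn"]))  # ['transformer.h.0.attn.c_attn']
print(matching_modules(QWEN15_BLOCK, ["q_proj", "v_proj"]))
```

To see the real names for a given checkpoint, iterating over `model.named_modules()` on the loaded model gives the list to match against.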

qwen-1.8b-chat.yaml:

top.adapter_path: []
top.booster: none
top.finetuning_type: lora
top.model_name: qwen1.5-1.8b-chat
top.quantization_bit: none
top.rope_scaling: none
top.template: qwen
top.visual_inputs: false
train.additional_target: ''
train.badam_mode: layer
train.badam_switch_interval: 50
train.badam_switch_mode: ascending
train.badam_update_ratio: 0.05
train.batch_size: 4
train.compute_type: fp16
train.create_new_adapter: false
train.cutoff_len: 1024
train.dataset:
- glaive_toolcall_zh_demo
train.dataset_dir: data
train.device_count: '1'
train.ds_offload: false
train.ds_stage: none
train.freeze_extra_modules: ''
train.freeze_trainable_layers: 2
train.freeze_trainable_modules: all
train.galore_rank: 16
train.galore_scale: 0.25
train.galore_target: all
train.galore_update_interval: 200
train.gradient_accumulation_steps: 8
train.learning_rate: 5e-5
train.logging_steps: 5
train.lora_alpha: 16
train.lora_dropout: 0
train.lora_rank: 8
train.lora_target: c_attn
train.loraplus_lr_ratio: 0
train.lr_scheduler_type: cosine
train.max_grad_norm: '1.0'
train.max_samples: '100000'
train.neftune_alpha: 0
train.num_train_epochs: '100'
train.optim: adamw_torch
train.packing: false
train.ppo_score_norm: false
train.ppo_whiten_rewards: false
train.pref_beta: 0.1
train.pref_ftx: 0
train.pref_loss: sigmoid
train.report_to: false
train.resize_vocab: false
train.reward_model: null
train.save_steps: 100
train.shift_attn: false
train.training_stage: supervised fine-tuning
train.upcast_layernorm: false
train.use_badam: false
train.use_dora: false
train.use_galore: false
train.use_llama_pro: false
train.use_rslora: false
train.val_size: 0
train.warmup_steps: 0
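
To get a feel for what these numbers mean together: the batch size, gradient accumulation steps and device count multiply into the number of samples consumed per optimizer update. The sketch below works that out for the values above. The step estimate is only approximate (the trainer's exact rounding may differ), the sample count of 1000 is a made-up figure for illustration, and both function names are hypothetical.

```python
import math


def effective_batch_size(per_device_batch, grad_accum_steps, device_count):
    """Samples consumed per optimizer update."""
    return per_device_batch * grad_accum_steps * device_count


def approx_optimizer_steps(num_samples, batch, epochs):
    """Rough total number of optimizer updates over the whole run."""
    return math.ceil(num_samples / batch) * epochs


# batch_size=4, gradient_accumulation_steps=8, device_count=1 from the config.
batch = effective_batch_size(4, 8, 1)
print(batch)  # 32

# 1000 is a hypothetical dataset size, not the size of the demo dataset.
print(approx_optimizer_steps(1000, batch, epochs=100))
```

With save_steps set to 100, dividing the estimated step count by 100 gives a rough idea of how many checkpoints the run will write.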

Then save the configuration and click Start to begin fine-tuning.
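
The same run can probably also be started from the command line with `llamafactory-cli train <config file>`. The sketch below writes a config file for that purpose. The mapping from the web UI keys to the command-line keys is my own assumption and only covers the main settings, and the output directory and file name are hypothetical, so compare it with the command preview in the web UI before relying on it.

```python
from pathlib import Path

# Values are taken from the web UI settings in this post; the key names are
# the assumed command-line equivalents. Everything is kept as text so the
# formatting in the file is exactly what is written here.
CLI_CONFIG = {
    "model_name_or_path": "/media/wmx/soft1/huggingface_cache/qwen-1_8b-chat",
    "stage": "sft",
    "do_train": "true",
    "finetuning_type": "lora",
    "lora_target": "c_attn",
    "lora_rank": "8",
    "lora_alpha": "16",
    "lora_dropout": "0.0",
    "dataset": "glaive_toolcall_zh_demo",
    "dataset_dir": "data",
    "template": "qwen",
    "cutoff_len": "1024",
    "per_device_train_batch_size": "4",
    "gradient_accumulation_steps": "8",
    # Written with a decimal point: some YAML parsers read "5e-5" as a string.
    "learning_rate": "5.0e-5",
    "num_train_epochs": "100.0",
    "lr_scheduler_type": "cosine",
    "max_grad_norm": "1.0",
    "logging_steps": "5",
    "save_steps": "100",
    "fp16": "true",
    "output_dir": "saves/qwen-1_8b-chat/lora/sft",  # hypothetical
}


def to_yaml(config):
    """Serialize a flat mapping of text values as simple 'key: value' lines."""
    return "".join(f"{key}: {value}\n" for key, value in config.items())


config_path = Path("qwen_1_8b_chat_lora_sft.yaml")  # hypothetical file name
config_path.write_text(to_yaml(CLI_CONFIG), encoding="utf-8")
print(f"wrote {config_path}; launch with: llamafactory-cli train {config_path}")
```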


August 2, 2024 • Data Analysis