LLaMA-2 Model Deployment

In the article NLP (59): Deploying the Baichuan Large Model with FastChat, I introduced the FastChat framework and how to use it to deploy the Baichuan model. This article deploys the LLaMA-2 70B model and exposes it in an OpenAI-compatible calling style. The Dockerfile used for deployment is as follows:
FROM nvidia/cuda:11.7.1-runtime-ubuntu20.04

RUN apt-get update -y && apt-get install -y python3.9 python3.9-distutils curl
RUN curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
RUN python3.9 get-pip.py
RUN pip3 install fschat

The docker-compose.yml file is as follows:
version: "3.9"

services:
  fastchat-controller:
    build:
      context: .
      dockerfile: Dockerfile
    image: fastchat:latest
    ports:
      - "21001:21001"
    entrypoint: ["python3.9", "-m", "fastchat.serve.controller", "--host", "0.0.0.0", "--port", "21001"]
  fastchat-model-worker:
    build:
      context: .
      dockerfile: Dockerfile
    volumes:
      - ./model:/root/model
    image: fastchat:latest
    ports:
      - "21002:21002"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0", "1"]
              capabilities: [gpu]
    entrypoint: ["python3.9", "-m", "fastchat.serve.model_worker", "--model-names", "llama2-70b-chat", "--model-path", "/root/model/llama2/Llama-2-70b-chat-hf", "--num-gpus", "2", "--gpus", "0,1", "--worker-address", "http://fastchat-model-worker:21002", "--controller-address", "http://fastchat-controller:21001", "--host", "0.0.0.0", "--port", "21002"]
  fastchat-api-server:
    build:
      context: .
      dockerfile: Dockerfile
    image: fastchat:latest
    ports:
      - "8000:8000"
    entrypoint: ["python3.9", "-m", "fastchat.serve.openai_api_server", "--controller-address", "http://fastchat-controller:21001", "--host", "0.0.0.0", "--port", "8000"]

Once deployed, the service occupies two A100 GPUs, each using about 66 GB of GPU memory.

Test whether the model has been deployed successfully:
curl http://localhost:8000/v1/models

The output is as follows:
{
    "object": "list",
    "data": [
        {
            "id": "llama2-70b-chat",
            "object": "model",
            "created": 1691504717,
            "owned_by": "fastchat",
            "root": "llama2-70b-chat",
            "parent": null,
            "permission": [
                {
                    "id": "modelperm-3XG6nzMAqfEkwfNqQ52fdv",
                    "object": "model_permission",
                    "created": 1691504717,
                    "allow_create_engine": false,
                    "allow_sampling": true,
                    "allow_logprobs": true,
                    "allow_search_indices": true,
                    "allow_view": true,
                    "allow_fine_tuning": false,
                    "organization": "*",
                    "group": null,
                    "is_blocking": false
                }
            ]
        }
    ]
}

The LLaMA-2 70B model has been deployed successfully.
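The same check can also be run from Python instead of curl. Below is a minimal sketch, assuming the requests package is installed and the API server is reachable on localhost:8000:

# -*- coding: utf-8 -*-
# Minimal sketch: list the models served by the FastChat OpenAI-compatible API server.
# Assumes the requests package is installed; endpoint matches the curl call above.
import requests

response = requests.get("http://localhost:8000/v1/models")
response.raise_for_status()
for model in response.json()["data"]:
    print(model["id"])  # expected: llama2-70b-chat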
Prompt Token Length Calculation

The FastChat GitHub project provides an API for computing the token length of a prompt; the file path is fastchat/serve/model_worker.py. It can be called as follows:
curl --location localhost:21002/count_token \
--header 'Content-Type: application/json' \
--data '{"prompt": "What is your name?"}'

The output is as follows:
{
    "count": 6,
    "error_code": 0
}

Conversation Token Length Calculation

In FastChat, computing the token length of a conversation is more involved. First, we need to obtain the conversation template of the LLaMA-2 70B model by calling the following API:
curl --location --request POST http://localhost:21002/worker_get_conv_template

The output is as follows:
{
    "conv": {
        "messages": [],
        "name": "llama-2",
        "offset": 0,
        "roles": ["[INST]", "[/INST]"],
        "sep": " ",
        "sep2": " </s><s>",
        "sep_style": 7,
        "stop_str": null,
        "stop_token_ids": [2],
        "system_message": "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.\n\nIf a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.",
        "system_template": "[INST] <<SYS>>\n{system_message}\n<</SYS>>\n\n"
    }
}

FastChat's conversation file, fastchat/conversation.py, provides the code that assembles a conversation into a prompt (not shown here). To use it, simply copy the entire file; it does not depend on any third-party modules. We need to convert an OpenAI-style conversation into the corresponding prompt. The input messages are as follows:

messages = [{"role": "system", "content": "You are Jack, you are 20 years old, answer questions with humor."},
            {"role": "user", "content": "What is your name?"},
            {"role": "assistant", "content": "Well, well, well! Look who's asking the questions now! My name is Jack, but you can call me the king of the castle, the lord of the rings, or the prince of the pizza party. Whatever floats your boat, my friend!"},
            {"role": "user", "content": "How old are you?"},
            {"role": "assistant", "content": "Oh, you want to know my age? Well, let's just say I'm older than a bottle of wine but younger than a bottle of whiskey. I'm like a fine cheese, getting better with age, but still young enough to party like it's 1999!"},
            {"role": "user", "content": "Where is your hometown?"}]

The Python code is as follows:
# -*- coding: utf-8 -*-
# place: Pudong, Shanghai
# file: prompt.py
# time: 2023/8/8 19:24
from conversation import Conversation, SeparatorStyle

messages = [{"role": "system", "content": "You are Jack, you are 20 years old, answer questions with humor."},
            {"role": "user", "content": "What is your name?"},
            {"role": "assistant", "content": "Well, well, well! Look who's asking the questions now! My name is Jack, but you can call me the king of the castle, the lord of the rings, or the prince of the pizza party. Whatever floats your boat, my friend!"},
            {"role": "user", "content": "How old are you?"},
            {"role": "assistant", "content": "Oh, you want to know my age? Well, let's just say I'm older than a bottle of wine but younger than a bottle of whiskey. I'm like a fine cheese, getting better with age, but still young enough to party like it's 1999!"},
            {"role": "user", "content": "Where is your hometown?"}]

llama2_conv = {"conv": {"name": "llama-2",
                        "system_template": "[INST] <<SYS>>\n{system_message}\n<</SYS>>\n\n",
                        "system_message": "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.\n\nIf a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.",
                        "roles": ["[INST]", "[/INST]"],
                        "messages": [],
                        "offset": 0,
                        "sep_style": 7,
                        "sep": " ",
                        "sep2": " </s><s>",
                        "stop_str": None,
                        "stop_token_ids": [2]}}
conv = llama2_conv["conv"]

conv = Conversation(
    name=conv["name"],
    system_template=conv["system_template"],
    system_message=conv["system_message"],
    roles=conv["roles"],
    messages=list(conv["messages"]),  # prevent in-place modification
    offset=conv["offset"],
    sep_style=SeparatorStyle(conv["sep_style"]),
    sep=conv["sep"],
    sep2=conv["sep2"],
    stop_str=conv["stop_str"],
    stop_token_ids=conv["stop_token_ids"],
)

if isinstance(messages, str):
    prompt = messages
else:
    for message in messages:
        msg_role = message["role"]
        if msg_role == "system":
            conv.set_system_message(message["content"])
        elif msg_role == "user":
            conv.append_message(conv.roles[0], message["content"])
        elif msg_role == "assistant":
            conv.append_message(conv.roles[1], message["content"])
        else:
            raise ValueError(f"Unknown role: {msg_role}")

    # Add a blank message for the assistant.
    conv.append_message(conv.roles[1], None)
    prompt = conv.get_prompt()

print(repr(prompt))

The processed prompt is as follows:
[INST] <<SYS>>\nYou are Jack, you are 20 years old, answer questions with humor.\n<</SYS>>\n\nWhat is your name? [/INST] Well, well, well! Look who's asking the questions now! My name is Jack, but you can call me the king of the castle, the lord of the rings, or the prince of the pizza party. Whatever floats your boat, my friend! </s><s>[INST] How old are you? [/INST] Oh, you want to know my age? Well, let's just say I'm older than a bottle of wine but younger than a bottle of whiskey. I'm like a fine cheese, getting better with age, but still young enough to party like it's 1999! </s><s>[INST] Where is your hometown? [/INST]

Finally, we call the prompt-token-length API (see the Prompt Token Length Calculation section above); the token length of this conversation is 199.
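This step can also be done in code by sending the constructed prompt to the count_token endpoint. Below is a minimal sketch, assuming the requests package is installed and that it is appended to the prompt.py script above, which defines the prompt variable:

# Minimal sketch: count the tokens of the constructed prompt via the model worker.
# Assumes `prompt` was built by the code above and the requests package is installed.
import requests

resp = requests.post("http://localhost:21002/count_token", json={"prompt": prompt})
resp.raise_for_status()
print(resp.json()["count"])  # 199 for the conversation above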
We then use FastChat's chat completion endpoint, v1/chat/completions, to verify the token length of the input conversation. The request is:

curl --location http://localhost:8000/v1/chat/completions \
--header 'Content-Type: application/json' \
--data '{"model": "llama2-70b-chat",
         "messages": [{"role": "system", "content": "You are Jack, you are 20 years old, answer questions with humor."},
                      {"role": "user", "content": "What is your name?"},
                      {"role": "assistant", "content": "Well, well, well! Look who'\''s asking the questions now! My name is Jack, but you can call me the king of the castle, the lord of the rings, or the prince of the pizza party. Whatever floats your boat, my friend!"},
                      {"role": "user", "content": "How old are you?"},
                      {"role": "assistant", "content": "Oh, you want to know my age? Well, let'\''s just say I'\''m older than a bottle of wine but younger than a bottle of whiskey. I'\''m like a fine cheese, getting better with age, but still young enough to party like it'\''s 1999!"},
                      {"role": "user", "content": "Where is your hometown?"}]}'

The output is:
{
    "id": "chatcmpl-mQxcaQcNSNMFahyHS7pamA",
    "object": "chat.completion",
    "created": 1691506768,
    "model": "llama2-70b-chat",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Ha! My hometown? Well, that's a tough one. I'm like a bird, I don't have a nest, I just fly around and land wherever the wind takes me. But if you really want to know, I'm from a place called \"The Internet\". It's a magical land where memes and cat videos roam free, and the Wi-Fi is always strong. It's a beautiful place, you should visit sometime!"
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 199,
        "total_tokens": 302,
        "completion_tokens": 103
    }
}

Note that the returned prompt_tokens is 199, which matches the conversation token length we just computed.
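Since the API server is OpenAI-compatible, the same request can be issued with the openai Python package instead of curl. Below is a minimal sketch, assuming a pre-1.0 version of the openai package and that the server was started without API keys, so a placeholder key works:

# -*- coding: utf-8 -*-
# Minimal sketch: call the FastChat OpenAI-compatible server via the openai package.
# Assumes openai < 1.0 (the openai.ChatCompletion interface); the key is a placeholder,
# since the server in this setup was started without API-key checking.
import openai

openai.api_key = "EMPTY"
openai.api_base = "http://localhost:8000/v1"

completion = openai.ChatCompletion.create(
    model="llama2-70b-chat",
    messages=[
        {"role": "system", "content": "You are Jack, you are 20 years old, answer questions with humor."},
        {"role": "user", "content": "Where is your hometown?"},
    ],
)
print(completion["choices"][0]["message"]["content"])
print(completion["usage"]["prompt_tokens"])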
Summary

This article described how to deploy the LLaMA-2 70B model with FastChat and walked through how to compute the token length of a prompt and of a conversation. I hope it is helpful to readers.

One personal takeaway: reading the source code really matters.

My personal blog is at https://percent4.github.io/; everyone is welcome to visit.
References
NLP (59): Deploying the Baichuan Large Model with FastChat: https://blog.csdn.net/jclian91/article/details/131650918
FastChat: https://github.com/lm-sys/FastChat