Compare commits: 3b4b582b59 ... 74898ea619 (10 commits)

| SHA1 |
|---|
| 74898ea619 |
| 1495adeedc |
| 50f784a8c9 |
| a356af9a0f |
| 2e734750b6 |
| fa97b9e194 |
| b05c1b7eaf |
| b51e548b92 |
| 8e037f8a3a |
| cfa989c7ad |
0001-Fix-build-error-for-loongarch64.patch (new file, 25 lines)
```
@@ -0,0 +1,25 @@
From 3d913d653d9bf75ae56b46deeffb28874fedb6d1 Mon Sep 17 00:00:00 2001
From: zhangzikang <zhangzikang@kylinos.cn>
Date: Fri, 17 May 2024 10:03:26 +0800
Subject: [PATCH] Fix build error for loongarch64

---
 third_party/ggml/src/ggml.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/third_party/ggml/src/ggml.c b/third_party/ggml/src/ggml.c
index beb7f46..2374287 100644
--- a/third_party/ggml/src/ggml.c
+++ b/third_party/ggml/src/ggml.c
@@ -299,7 +299,7 @@ typedef double ggml_float;
 #if defined(_MSC_VER) || defined(__MINGW32__)
 #include <intrin.h>
 #else
-#if !defined(__riscv)
+#if !defined(__riscv) && !defined(__loongarch64)
 #include <immintrin.h>
 #endif
 #endif
--
2.43.0
```
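The root cause is that `<immintrin.h>` exists only on x86 toolchains, so ggml's deny-list guard has to name every non-x86 architecture it meets; the patch adds `__loongarch64` next to `__riscv`. Below is a minimal standalone sketch (not the project's code) of the alternative allow-list style, which includes the header only when the compiler actually targets x86 and therefore needs no updating when a new architecture is ported:

```
#include <stdio.h>

/* Allow-list variant of the guard the patch extends: pull in the x86
 * SIMD header only on x86 targets, so riscv, loongarch64, aarch64,
 * and any future port fall through to the scalar path instead of
 * hitting a missing <immintrin.h>. */
#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#include <immintrin.h>
#define HAVE_X86_SIMD 1
#else
#define HAVE_X86_SIMD 0
#endif

int main(void) {
    printf("x86 SIMD intrinsics available: %s\n", HAVE_X86_SIMD ? "yes" : "no");
    return 0;
}
```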
Dockerfile-chatglm (new file, 24 lines)
```
@@ -0,0 +1,24 @@
#Usage:
#1.build image:
# docker build -f Dockerfile-chatglm -t chatglm_image .
#2.run image:
# docker run -it --security-opt seccomp=unconfined chatglm_image:latest

#base image
FROM openeuler/openeuler:22.03

#update openEuler2309 source and install chatglm
RUN echo '[everything]' > /etc/yum.repos.d/openEuler.repo && \
    echo 'name=everything' >> /etc/yum.repos.d/openEuler.repo && \
    echo 'baseurl=http://121.36.84.172/dailybuild/EBS-openEuler-23.09/rc4_openeuler-2023-09-13-21-46-47/everything/$basearch/' >> /etc/yum.repos.d/openEuler.repo && \
    echo 'enabled=1' >> /etc/yum.repos.d/openEuler.repo && \
    echo 'gpgcheck=0' >> /etc/yum.repos.d/openEuler.repo && \
    yum install -y sentencepiece chatglm-cpp wget

#download ggml model
WORKDIR /model_path
RUN wget -P /model_path https://huggingface.co/Xorbits/chatglm2-6B-GGML/resolve/main/chatglm2-ggml-q4_1.bin

# run ggml model
CMD /usr/bin/chatglm_cpp_main -m /model_path/chatglm2-ggml-q4_1.bin -i
```
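For reference, the `RUN` chain above assembles a yum repository definition line by line; the file it leaves at `/etc/yum.repos.d/openEuler.repo` is:

```
[everything]
name=everything
baseurl=http://121.36.84.172/dailybuild/EBS-openEuler-23.09/rc4_openeuler-2023-09-13-21-46-47/everything/$basearch/
enabled=1
gpgcheck=0
```

`$basearch` stays literal in the file and is expanded by yum at install time (for example to aarch64 or x86_64), and `gpgcheck=0` disables signature verification for this daily-build repository.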
README.md (102 lines changed)
````
@@ -1,37 +1,89 @@
-# chatglm.cpp
+# chatglm-cpp Usage Guide

-#### Introduction
-Chinese models ChatGLM-6B & ChatGLM2-6B based on C/C++
+## Introduction
+chatglm-cpp is a C/C++ implementation of the ChatGLM large-model interface that lets users deploy and use open-source large models on CPU-only machines.

-#### Software Architecture
-Software architecture description
+chatglm-cpp supports the deployment of several open-source Chinese large models, such as ChatGLM-6B, ChatGLM2-6B, and Baichuan-13B.

+## Software Architecture
+The core of chatglm-cpp is split into two layers:
+- model quantization layer: quantizes open-source models to reduce model size;
+- model startup layer: launches the quantized model.

-#### Installation
+Features:
+- C/C++ implementation based on ggml;
+- CPU inference accelerated by int4/int8 quantization, an optimized KV cache, parallel computation, and more;
+- interactive interface with streaming generation and a typewriter effect;
+- no GPU needed; runs on the CPU alone.

-1. xxxx
-2. xxxx
-3. xxxx
+## Installation
+### Hardware and Software Requirements
+Processor architecture: AArch64 and x86_64 are supported;

-#### Usage
+Operating system: openEuler 23.09;

-1. xxxx
-2. xxxx
-3. xxxx
+Memory: no less than 4 GB, depending on the size of the model.

-#### Contributing
+### Installing the Package
+Deploying a large model with chatglm-cpp requires the chatglm-cpp package. Before installing, make sure an openEuler yum repository is configured.
+1. Install:
+```
+yum install chatglm-cpp
+```
+2. Check that the installation succeeded:
+```
+chatglm_cpp_main -h
+```
+If the help message is displayed, the installation succeeded.

-1. Fork this repository
-2. Create a Feat_xxx branch
-3. Commit your code
-4. Open a Pull Request
+## Usage
+### Without a Container
+1. Install the chatglm-cpp package:
+```
+yum install chatglm-cpp
+```
+2. Download an open-source large model, such as ChatGLM-6B or ChatGLM2-6B, and quantize it with chatglm_convert.py:
+```
+python3 /usr/bin/chatglm_convert.py -i model_path/ -t q4_0 -o chatglm-ggml_1.bin
+```
+Here model_path is the directory where the downloaded model is stored, q4_0 is the quantization precision, and chatglm-ggml_1.bin is the name of the output quantized model.
+
+3. Start the model and chat with it:
+```
+chatglm_cpp_main -m model_path -i
+```
+Here model_path is the path of the quantized model.
````
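Step 2 above quantizes the model at q4_0 precision. As a back-of-envelope sketch of what that precision implies for file size (assuming the standard ggml Q4_0 block layout of 32 four-bit weights plus one fp16 scale per block, and taking ChatGLM-6B at roughly 6.2 billion parameters; both figures are assumptions, not taken from this repository):

$$
\frac{2 + 16\ \text{bytes}}{32\ \text{weights}} = 4.5\ \text{bits/weight},
\qquad
6.2\times10^{9}\ \text{weights} \times \frac{4.5}{8}\ \text{bytes} \approx 3.5\times10^{9}\ \text{bytes} \approx 3.2\ \text{GiB},
$$

which lines up with the 3.3G listed for Q4_0 in Table 1 at the end of this README.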
````
+The usage of the command-line options can be viewed with:
+```
+chatglm_cpp_main -h
+```

-#### Gitee Feature
+### With a Container
+1. Pull the container image:
+```
+docker pull hub.oepkgs.net/openeuler/chatglm_image
+```
+2. Run the image and start chatting:
+```
+docker run -it --security-opt seccomp=unconfined hub.oepkgs.net/openeuler/chatglm_image
+```
+### Normal Startup Screen
+After the model starts, the screen looks like Figure 1:
+
+**Figure 1** Model startup screen
+

+## Specifications
+This project supports deploying and running large models on CPU-only machines, but inference speed still puts some demands on the hardware; an underpowered configuration may make inference too slow to be practical.
+
+Table 1 can serve as a reference for inference speed under different machine configurations:
+
+In the table, Q4_0, Q4_1, Q5_0, and Q5_1 denote the model's quantization precision; ms/token denotes inference speed, namely the number of milliseconds spent per token, where a smaller value means faster inference.
+
+**Table 1** Measured model inference speed
+
+| ChatGLM-6B | Q4_0 | Q4_1 | Q5_0 | Q5_1 |
+|--------------------------------|------|------|------|------|
+| ms/token (CPU @ Platinum 8260) | 74   | 77   | 86   | 89   |
+| Model size                     | 3.3G | 3.7G | 4.0G | 4.4G |
+| Memory usage                   | 4.0G | 4.4G | 4.7G | 5.1G |

-1. Use Readme_XXX.md to support different languages, such as Readme_en.md, Readme_zh.md
-2. Gitee official blog [blog.gitee.com](https://blog.gitee.com)
-3. Explore outstanding open-source projects on Gitee: [https://gitee.com/explore](https://gitee.com/explore)
-4. [GVP](https://gitee.com/gvp), short for Gitee Most Valuable Project, is awarded to comprehensively evaluated outstanding open-source projects
-5. The official Gitee user manual: [https://gitee.com/help](https://gitee.com/help)
-6. Gitee Cover People is a column showcasing Gitee members: [https://gitee.com/gitee-stars/](https://gitee.com/gitee-stars/)
````
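As a worked example of reading the ms/token row in Table 1 (plain arithmetic on the published figures, not a new measurement): at the Q4_0 speed of 74 ms/token,

$$
\frac{1000\ \text{ms/s}}{74\ \text{ms/token}} \approx 13.5\ \text{tokens/s},
\qquad
200\ \text{tokens} \times 74\ \text{ms/token} \approx 14.8\ \text{s},
$$

so on the Platinum 8260 configuration a 200-token reply arrives in roughly 15 seconds.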
chatglm-cpp.spec

```
@@ -2,13 +2,17 @@

 Name: chatglm-cpp
 Version: 0.2.4
-Release: 2
+Release: 5
 License: MIT
 Summary: Port of Chinese large model ChatGLM-6B & ChatGLM2-6B implemented based on C/C++

 URL: https://github.com/li-plus/chatglm.cpp
 Source0: https://github.com/li-plus/chatglm.cpp/releases/download/0.2.4/chatglm-cpp-%{version}.tar.gz

+%ifarch loongarch64
+Patch0: 0001-Fix-build-error-for-loongarch64.patch
+%endif
+
 BuildRequires: gcc,gcc-c++,cmake
 Requires: sentencepiece

@@ -31,6 +35,7 @@ pushd chatglm_builddir
 %make_install
 install bin/main %{buildroot}%{_prefix}/local/bin/chatglm_cpp_main
 install lib/libchatglm.a %{buildroot}%{_prefix}/local/bin/libchatglm.a
+install ../chatglm_cpp/convert.py %{buildroot}%{_prefix}/local/bin/chatglm_convert.py
 mv %{buildroot}%{_prefix}/local/* %{buildroot}%{_prefix}

 #Remove files from package sentencepiece.
@@ -46,6 +51,17 @@ popd
 /usr/lib/static/libggml.a

 %changelog
+* Fri May 17 2024 zhangzikang <zhangzikang@kylinos.cn> - 0.2.4-5
+- Fix build error for loongarch64
+
+* Wed Sep 20 2023 zhoupengcheng <zhoupengcheng11@huawei.com> - 0.2.4-4
+- packing chatglm_convert.py file
+- install python modules for chatglm_convert.py in dockerfile
+- update long-term yum.repo in dockerfile
+
+* Tue Sep 19 2023 zhoupengcheng <zhoupengcheng11@huawei.com> - 0.2.4-3
+- add dockerfile
+
 * Wed Sep 6 2023 zhoupengcheng <zhoupengcheng11@huawei.com> - 0.2.4-2
 - Fix the conflict with the installation of package sentencepiece.

```
chatglm.png (new binary file, 33 KiB; binary file not shown)