Installing Oobabooga on ArchLinux

Overview

This blogpost is mainly for my personal documentation but figured it might help those that are having issues installing Oobabooga on Archlinux.

This guide will go through how to install Oobabooga manually without using the one-click installer as that causes problems on ArchLinux. This guide will also show how to install GPTQ-for-LLaMA so you can run 4 bits quantized models.

For this guide, we will be using commit 50c70e2 for Oobabooga and commit 81fe867 for GPTQ-for-LLaMa.

Step 1: Installing correct version of CUDA

Ensure that we have the correct version of CUDA installed. To check that, we can run nvcc --version. This is the output you might see.

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

If you are using 11.8 or 11.7, congrats! If you using the correect version! However if you are not, you will need to uninstall CUDA and install the appropriate version. To do that do this

pacman -R cuda cudnn 
curl https://archive.org/download/archlinux_pkg_cuda/cuda-11.8.0-1-x86_64.pkg.tar.zst -O
sudo pacman -U cuda-11.8.0-1-x86_64.pkg.tar.zst

Then restart your computer. Do note that running nvidia-smi may provide a different version of CUDA. Check using nvcc --version!

Step 2: Installing Conda

From now on this should be almost the same as the installation instructions on the official README with slight modifications. I will be copying most of the instructions verbatim just in case things changes.

curl -sL "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh" > "Miniconda3.sh" 
bash Miniconda3.sh

Step 2: Create a new conda environment

conda create -n textgen python=3.10.9
conda activate textgen
conda install -c conda-forge cudatoolkit-dev

Step 3: Install Pytorch (This is where things are different)

Ensure that you select the correct version of pytorch from this website. If you are on CUDA 11.7, choose CUDA 11.7 and CUDA 11.8 if you are using CUDA 11.8. The instruction below is for CUDA 11.8.

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

Step 4: Install the web UI

git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
git checkout 50c70e2
pip install -r requirements.txt

Step 5: Install GCC11

You will need GCC11 for this to work.

NOTE: DO NOT DOWNGRADE YOUR CURRENT GCC. JUST INSTALL ANOTHER ONE! THIS MIGHT CAUSE YOUR PC TO NOT BOOT!

This step might take awhile. Be patient 💀

yay -S gcc11

Step 6: Install GPTQ-for-LLaMA

We will need to temporarily setup CXX and CC for this to work.

cd text-generation-webui
mkdir repositories
git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa.git -b cuda
cd GPTQ-for-LLaMa
git checkout 81fe867
export CXX=/usr/bin/gcc-11
export CC=/usr/bin/gcc-11
python setup_cuda.py install

Step 7: Download a model

This tutorial will showcase TheBloke/Wizard-Vicuna-7B-Uncensored-GPTQ

cd ../../
python ./download-model.py TheBloke/Wizard-Vicuna-7B-Uncensored-GPTQ

Step 8: Running the model

python server.py --model TheBloke_vicuna-7B-1.1-GPTQ-4bit-128g --wbits 4 --groupsize 128 --api --verbose --listen

Done!

Here's a screenshot demonstrating the webui! Demo of oobabooga webui

Showcase