nanogpt kv cache first attempt
1. Run basic nano-gpt
git clone https://github.com/karpathy/nanoGPT.git
Install necessary packages
pip install -r requirements.txt
I have these packages in the requirements.txt
blobfile==2.0.1
certifi==2022.12.7
charset-normalizer==3.0.1
filelock==3.9.0
idna==3.4
lxml==4.9.2
numpy==1.24.2
pycryptodomex==3.17
pytz==2022.7.1
regex==2022.10.31
requests==2.28.2
tokenizers==0.13.2
torch==2.0.0
typing_extensions==4.4.0
urllib3==1.26.14
transformers==4.28.1
datasets==2.11.0
tiktoken==0.3.3
wandb==0.14.2
tqdm==4.65.0
Follow the quick start guide in the nanoGPT repo to make sure that training and inference both run successfully.
python data/shakespeare_char/prepare.py
python train.py --compile=False config/train_shakespeare_char.py
python sample.py --out_dir=out-shakespeare-char
My Python version is 3.11, which is too new for torch.compile in this setup (torch 2.0.0), so I added --compile=False to the train command.
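On my A800 GPU, the training loss reaches 0.0449 after 5000 iterations. The validation loss at step 5000 is 4.7262, far above the training loss, so the model is heavily overfitting.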
iter 4970: loss 0.0461, time 18.12ms, mfu 20.21%
iter 4980: loss 0.0441, time 18.14ms, mfu 20.24%
iter 4990: loss 0.0464, time 18.13ms, mfu 20.27%
step 5000: train loss 0.0383, val loss 4.7262
iter 5000: loss 0.0449, time 3352.84ms, mfu 18.26%
2. Load GPT-2 model checkpoints and test performance
Downloading the GPT-2 model from Hugging Face failed for me with an SSL/proxy error. These two threads describe the same problem:
https://stackoverflow.com/questions/75110981/sslerror-httpsconnectionpoolhost-huggingface-co-port-443-max-retries-exce
https://github.com/huggingface/transformers/issues/17611
First, downgrade requests to version 2.27.1:
pip install requests==2.27.1
Then adding these lines to train.py and sample.py fixed the proxy connection issue for me:
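import os  # harmless if the script already imports os
os.environ['CURL_CA_BUNDLE'] = ''  # empty CA bundle, which turns off SSL certificate verification in requests 2.27.1
os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'  # download from the mirror instead of huggingface.co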
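One detail that is easy to miss: as far as I know, huggingface_hub reads HF_ENDPOINT when it is first imported, so the variables have to be set before anything imports transformers. A minimal sketch of keeping the settings in one place is below. The helper name `configure_hf_download` and its parameters are hypothetical, not part of nanoGPT or transformers.

```python
import os

def configure_hf_download(endpoint="https://hf-mirror.com", skip_cert_check=True):
    """Hypothetical helper: set the download-related environment variables in one place."""
    # Mirror endpoint used instead of huggingface.co (value taken from this post).
    os.environ["HF_ENDPOINT"] = endpoint
    if skip_cert_check:
        # Empty CA bundle: with requests 2.27.1 this turns off certificate checks.
        # Only reasonable on a network you trust.
        os.environ["CURL_CA_BUNDLE"] = ""

# Call this at the very top of train.py / sample.py,
# before any code path imports transformers or huggingface_hub.
configure_hf_download()
print(os.environ["HF_ENDPOINT"])
```

Turning off certificate verification is a workaround rather than a fix, so it is worth passing `skip_cert_check=False` once the proxy issue is resolved.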
Run sample.py to test the GPT-2 model with parameters downloaded from Hugging Face:
python sample.py --init_from='gpt2'
I started with the prompt “please tell me a joke.” The output is nothing like a joke, but it is still very readable.
please tell me a joke
[…]
My name is Zarek, but I am extremely sad for you.
You can't even come to my house anymore
I'm sorry, I know
I have a dream
I don't know how long this thing will last
My name Is Zarek
I'm an adult who believes that
The problem with your friend is that he doesnt know
He doesn't know how to act
Running time for 10 inference runs:
---------------
Elapsed time: 25.4s
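For reference, a timing like the one above can be collected with a small wrapper around the sampling loop. This is a sketch under assumptions: `time_runs` and `fake_generate` are hypothetical names, and `fake_generate` only stands in for one real generation call in sample.py.

```python
import time

def time_runs(fn, n_runs=10):
    """Call fn() n_runs times and return the total wall-clock time in seconds."""
    start = time.perf_counter()
    for _ in range(n_runs):
        fn()
    return time.perf_counter() - start

def fake_generate():
    # Stand-in workload; replace with one real sampling call.
    sum(i * i for i in range(10_000))

elapsed = time_runs(fake_generate, n_runs=10)
print("---------------")
print(f"Elapsed time: {elapsed:.1f}s")
```

When timing on a GPU, CUDA kernels run asynchronously, so calling torch.cuda.synchronize() before reading the clock at the start and at the end gives a more trustworthy number.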
3. Implement KV cache for faster inference
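The core idea can be shown independently of nanoGPT's code. During generation, the keys and values of earlier tokens do not change under causal attention, so they can be stored and only the newest token needs to be projected at each step. The toy below is a single-head example in plain Python with hypothetical names (`KVCache`, `last_token_no_cache`). It is a sketch of the technique, not the nanoGPT change itself.

```python
import math
import random

def matvec(w, x):
    """Multiply matrix w (a list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def attend(q, keys, values):
    """Softmax attention of one query vector over the given keys and values."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(len(values[0]))]

def last_token_no_cache(xs, wq, wk, wv):
    """Baseline: re-project every token in the prefix, then attend from the last one."""
    keys = [matvec(wk, x) for x in xs]
    values = [matvec(wv, x) for x in xs]
    return attend(matvec(wq, xs[-1]), keys, values)

class KVCache:
    """Hypothetical cache: stores keys and values of all tokens seen so far."""
    def __init__(self, wq, wk, wv):
        self.wq, self.wk, self.wv = wq, wk, wv
        self.keys, self.values = [], []

    def step(self, x):
        # Project only the new token and append it to the cache.
        self.keys.append(matvec(self.wk, x))
        self.values.append(matvec(self.wv, x))
        return attend(matvec(self.wq, x), self.keys, self.values)

random.seed(0)
dim = 4

def rand_matrix():
    return [[random.gauss(0, 1) for _ in range(dim)] for _ in range(dim)]

wq, wk, wv = rand_matrix(), rand_matrix(), rand_matrix()
tokens = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(6)]

cache = KVCache(wq, wk, wv)
max_diff = 0.0
for t in range(1, len(tokens) + 1):
    cached = cache.step(tokens[t - 1])                   # one projection per step
    full = last_token_no_cache(tokens[:t], wq, wk, wv)   # t projections per step
    max_diff = max(max_diff, max(abs(a - b) for a, b in zip(cached, full)))
print(f"max difference between cached and recomputed outputs: {max_diff:.2e}")
```

Both paths produce the same output for the newest token, but the baseline projects t tokens at step t while the cached path projects one. In a real model the cache would be a pair of tensors per layer and head rather than Python lists.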
4. Test KV cache performance
References
youtube video llm kv cache explanation
requirements.txt to run nano-gpt
nano-gpt kv cache pr example
huggingface transformers kv cache source code on github
https://zhuanlan.zhihu.com/p/646577898
https://zhuanlan.zhihu.com/p/624740065