Run LLM directly on Hugging Face Hub

· 2 min read

Today we will learn how to connect to a model hosted on the Hugging Face Hub and ask it questions, just like you would with ChatGPT. It's really simple and will only take about 10 minutes.

The tools we will need are as follows:

  • Python. We need Python 3.
  • LangChain. LangChain provides Python and JavaScript libraries that make it easier to interact with LLMs.
  • Hugging Face Hub. The Hub hosts hundreds of thousands of models we can use for free.
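Assuming you use pip, the packages imported in the script below can be installed like this (package names only; versions are not pinned here):

```shell
# LangChain, the Hugging Face Hub client, and python-dotenv for loading .env files
pip install langchain huggingface_hub python-dotenv
```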

We need to get an API token from the Hugging Face Hub. Sign in, go to your profile, open the Access Tokens section, and generate a new token.

Get API Key

Save this to your environment variables as HUGGINGFACEHUB_API_TOKEN.
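One common way to do this (assuming you use python-dotenv, as the script below does) is a .env file in your project directory:

```
HUGGINGFACEHUB_API_TOKEN=<your token here>
```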

All we have to do is load HuggingFaceHub and LLMChain from LangChain and tell HuggingFaceHub which model we would like to use. For this example, we are going with the GPT-2 model. Then it's as simple as running the LLM and giving it a prompt. Setting verbose to True lets us see more logs.

huggingfacehub_demo.py
from langchain import HuggingFaceHub, LLMChain
from langchain.prompts import PromptTemplate
from dotenv import load_dotenv

# Load HUGGINGFACEHUB_API_TOKEN from the environment (e.g. a local .env file).
load_dotenv()

# Point LangChain at the gpt2 repo on the Hugging Face Hub.
hub_llm = HuggingFaceHub(
    repo_id="gpt2",
    model_kwargs={"max_length": 100},
)

# Prompt template with a single {question} placeholder.
prompt = PromptTemplate(
    input_variables=["question"],
    template="Give me the answer to the following sports question: {question}",
)

# verbose=True prints the fully formatted prompt before each call.
hub_chain = LLMChain(prompt=prompt, llm=hub_llm, verbose=True)
print(hub_chain.run("Who won the world cup in 2008?"))
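Conceptually, the PromptTemplate step is just string interpolation. A minimal sketch of the prompt the chain sends to the model, in plain Python with no LangChain required:

```python
# Plain-Python sketch of what PromptTemplate produces when formatted.
template = "Give me the answer to the following sports question: {question}"
filled = template.format(question="Who won the world cup in 2008?")
print(filled)
# The chain passes this filled-in string to the model as the prompt.
```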

If you run the file with python huggingfacehub_demo.py, you will see the result.

Answer

The results aren't perfect. GPT-2 is a small, older model; you can swap the repo_id for another model on the Hub and feed it more use-case-specific prompts to get better results.