I’m a big fan of Ollama, the easiest way to run LLMs on my Mac. Ollama integrates nicely with Hugging Face, the GitHub of LLMs, so running a model published on Hugging Face with Ollama can be as simple as a single command.
Click on the “use this model” dropdown and select Ollama.
That will show you a console command you can paste into your terminal. The model I want to try out is mxbai-embed-large-v1-Q4_K_M-GGUF, because it was the highest-scoring GGUF model I could find on the embeddings leaderboard, and I’m experimenting with embeddings.
ollama run hf.co/elliotsayes/mxbai-embed-large-v1-Q4_K_M-GGUF
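One caveat: this is an embedding model, not a chat model, so ollama run may download the weights and then refuse to open an interactive session. If that happens, no harm done; the model is still installed. You can also skip the chat attempt entirely with ollama pull, which accepts the same hf.co path:

ollama pull hf.co/elliotsayes/mxbai-embed-large-v1-Q4_K_M-GGUF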
We now have the model accessible to Ollama. Note that the model’s name is the full path, i.e. hf.co/.../.
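You can double-check the name Ollama registered by listing the installed models with the standard CLI:

ollama list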
So, to create a vector embedding with the model, you would say:
import ollama

embeddings = ollama.embed(
    model='hf.co/elliotsayes/mxbai-embed-large-v1-Q4_K_M-GGUF',
    input='The man with a tie ran for office.',
).embeddings

embeddings[0][:10]
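Since the whole point of pulling this model is to experiment with embeddings, a quick sanity check is to embed a few sentences and compare them. Here is a minimal sketch, assuming the same model as above; the cosine_similarity helper and the extra sentences are mine, not part of the Ollama API:

import ollama

MODEL = 'hf.co/elliotsayes/mxbai-embed-large-v1-Q4_K_M-GGUF'

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of the norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(x * x for x in b) ** 0.5
    return dot / (norm_a * norm_b)

# embed() also accepts a list of inputs and returns one vector per input.
vecs = ollama.embed(
    model=MODEL,
    input=[
        'The man with a tie ran for office.',      # the sentence from above
        'A candidate in formal wear campaigned.',  # hypothetical paraphrase
        'My cat sleeps all day.',                  # hypothetical unrelated sentence
    ],
).embeddings

print('paraphrase:', cosine_similarity(vecs[0], vecs[1]))
print('unrelated: ', cosine_similarity(vecs[0], vecs[2]))

If the model is doing its job, the paraphrase should score noticeably higher than the unrelated sentence.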