AI has taken the tech world by storm. You’re probably familiar with ChatGPT, DeepSeek R1, or Anthropic’s Claude, but there’s also a subset of open source AI models you can run on your own computer. They might not be as performant as the top-of-the-line options, but running open source models gives you total freedom, uncensored output, and a lot of fun in the process.
Open source LLMs are an important contribution to the AI industry. Meta, the company behind Facebook, has given back to the community with its Llama family of models; Llama 3.3 is the latest release. Today, we’ll download a Llama model and run it locally.
Hardware Requirements
To run these AI models, you need a dedicated graphics card with at least 4 GB of VRAM. I’m using an older Nvidia RTX 2060 with 6 GB of VRAM, which demonstrates that even older hardware can run new AI models.
Downloading and Installing Ollama
With all that said, let’s go ahead and download Ollama:
- Open a web browser and visit Ollama.com.
- Download the Ollama software.
- Follow the installation instructions provided on the website.
Once you’ve installed Ollama, open your terminal and type:
ollama
If you see a list of available commands, then the installation was successful.
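If you’d rather check from a script, you can confirm the binary landed on your PATH. A minimal sketch (assumes the installer registered the command under the name "ollama", which is the default):

```python
import shutil

def ollama_installed() -> bool:
    """Return True if the 'ollama' binary is found on the system PATH."""
    return shutil.which("ollama") is not None

print("ollama found:", ollama_installed())
```

If this prints False, restart your terminal first; some installers only update the PATH for new sessions.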
Exploring the Models
Next, revisit Ollama.com to browse the available models. There you’ll find Llama 3.3, the latest build from Meta. Each model lists a parameter count; the higher the number, the more capable the model, but the more VRAM it requires. A good rule of thumb is to treat the parameter count in billions as the number of GB of VRAM needed. For example, Llama 3.3, at 70B parameters, requires around 70 GB of VRAM; it’s designed for server infrastructure rather than a typical computer.
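That rule of thumb fits in a couple of lines of Python. A rough sketch, not a precise calculation: the one-byte-per-parameter figure roughly matches 8-bit quantized weights and ignores runtime overhead like the KV cache.

```python
def vram_estimate_gb(params_billions: float, bytes_per_param: float = 1.0) -> float:
    """Rough VRAM needed to hold a model's weights, in GB.

    Rule of thumb: ~1 GB per billion parameters, which corresponds
    to 8-bit quantized weights. Runtime overhead is not included.
    """
    return params_billions * bytes_per_param

print(vram_estimate_gb(70))  # Llama 3.3 (70B) -> 70.0
print(vram_estimate_gb(3))   # Llama 3.2 (3B)  -> 3.0
```

Passing bytes_per_param=2.0 approximates unquantized fp16 weights, which roughly doubles the requirement.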
For local use, you’ll need a model with around 6B parameters or fewer. Llama 3.2, for instance, comes in much smaller sizes: 3B and even 1B parameters. These smaller models are usually trained to do a few things well rather than everything. The description for Llama 3.2 notes that it excels at instruction following, summarization, prompt rewriting, and tool use.
A Note on Tools Integration
One great feature of Ollama is its integration with the Python programming language, allowing you to build AI agents. We’ll explore AI agents in a future post, so stay tuned!
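As a small taste of what that integration builds on: Ollama exposes a local REST API (by default at http://localhost:11434), and a request to its /api/generate endpoint is just a JSON body with the model name and prompt. The sketch below only constructs that body, so it runs even without a server; the endpoint and field names are taken from Ollama’s documented API.

```python
import json

def build_generate_request(model: str, prompt: str, stream: bool = False) -> str:
    """Build the JSON body for a POST to Ollama's local /api/generate endpoint."""
    return json.dumps({"model": model, "prompt": prompt, "stream": stream})

body = build_generate_request("llama3.2", "Summarize the Bitcoin whitepaper.")
print(body)
```

With Ollama running, you could POST this body to http://localhost:11434/api/generate with any HTTP client to get a completion back.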
Running Llama 3.2
To use the Llama 3.2 model (3B parameters):
- Copy the run command from the model’s page on the Ollama website (for the 3B build, it’s "ollama run llama3.2").
- Paste it into your terminal.
The model will automatically download and run. That’s it—now you can chat with Llama 3.2!
For example, you might try these prompts:
- Summarize the Bitcoin whitepaper in three sentences.
- Create a prompt for Midjourney that generates a beautiful painting in the style of Picasso.
As you can see, this 3B parameter model runs very quickly on local hardware. Just be aware that because these models are much smaller than those provided by major companies, your results may vary. Always double-check any sensitive work.
