Answered by the Webhosting Experts

How to Install Ollama on an Ubuntu 22.04 Server

Deploying Large Language Models (LLMs) locally offers significant advantages in data privacy, cost efficiency, and performance control. However, the setup process can often be complex and resource-intensive, particularly for those looking to avoid the high costs associated with GPU hardware. Ollama simplifies this by enabling efficient LLM deployment on CPU-based infrastructure, making self-hosted AI more accessible.

This guide provides a comprehensive walkthrough on how to install Ollama on an Ubuntu 22.04 dedicated server. We will cover everything from initial server setup to running your first model, ensuring you have a fully operational local LLM environment.

Prerequisites: Setting Up Your Ubuntu 22.04 Server

Before installing Ollama, you need a properly configured Ubuntu 22.04 dedicated server. At Hivelocity, our One-Click Application Deployments streamline this process significantly.

The One-Click Ollama App is included at no extra cost and comes pre-loaded with the Llama 3.1 8B model, allowing you to bypass manual installation entirely. This feature is available on our VPS, Virtual Dedicated Servers (VDS), and Instant Dedicated Server offerings.

For those who prefer a manual setup or are using a different provider, follow these steps to prepare your server:

1. Deploy Ubuntu 22.04: Log in to your hosting provider’s portal, such as the Hivelocity myVelocity portal. When provisioning a new server, select Ubuntu 22.04 as your operating system.

2. Connect via SSH: Once the server is deployed, use an SSH client like PuTTY (for Windows) or the built-in Terminal (for macOS and Linux) to connect to your server using its public IP address.

ssh username@your_server_ip

3. Update System Packages: It is critical to ensure all system packages are up to date. Run the following command to refresh your package lists and upgrade existing software:

sudo apt update && sudo apt upgrade -y

4. Install Required Dependencies: Ollama requires curl to download the installation script. While it’s typically pre-installed, you can verify and install it if needed:

sudo apt install curl -y

5. Configure Firewall (Recommended): Securing your server is crucial. Use Uncomplicated Firewall (ufw) to manage network traffic. Allow SSH connections before enabling the firewall so you don't lock yourself out of the server, then open any other ports your applications require.

sudo ufw allow ssh
sudo ufw enable

If you plan to access Ollama remotely, you will also need to open port 11434:

sudo ufw allow 11434
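Opening port 11434 to the world exposes the Ollama API to anyone who finds it. If only one machine needs remote access, a tighter rule limits the port to that address. The sketch below uses 203.0.113.10 as a hypothetical trusted client IP; substitute your own.

```shell
# Hypothetical trusted address; replace 203.0.113.10 with your client's IP.
TRUSTED_IP="203.0.113.10"

# Allow Ollama's API port only from that address instead of the whole internet.
sudo ufw allow from "$TRUSTED_IP" to any port 11434 proto tcp
```

You can verify the resulting rules at any time with sudo ufw status.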

Step-by-Step Guide to Installing Ollama

With your server prepared, you can now install Ollama. The official installation script makes this process straightforward.

1. Download and Run the Installation Script: Execute the following command in your terminal. It downloads the installation script from the official Ollama website and pipes it directly to sh. (If you prefer, you can download the script first and inspect it before running it.)

curl -fsSL https://ollama.com/install.sh | sh

The script will detect your operating system and architecture, download the appropriate Ollama binary, and install it as a systemd service. This ensures Ollama runs automatically on system startup.

2. Verify the Installation: After the script completes, you can confirm that Ollama is installed and running by checking the service status.

systemctl status ollama

If the service is active, the output will show ollama.service as loaded and active (running).

Configuring Ollama for Optimal Performance

By default, Ollama is configured to be accessible only from the local machine (localhost). To enable remote access from other machines, you need to modify the Ollama service configuration.

1. Create a systemd Override: Run the following command. Rather than editing the original service file directly, systemctl edit opens a drop-in override file for the Ollama unit in your default text editor (often nano).

sudo systemctl edit ollama.service

2. Modify the Environment Variable: Add the following lines to the file. This tells Ollama to listen on all network interfaces, not just localhost.

[Service]
Environment="OLLAMA_HOST=0.0.0.0"

Save the file and exit the editor (in nano, press Ctrl+X, then Y, then Enter).

3. Reload and Restart the Service: Apply the changes by reloading the systemd daemon and restarting the Ollama service.

sudo systemctl daemon-reload
sudo systemctl restart ollama

Your Ollama instance is now accessible over the network. Remember to ensure your firewall is configured to allow traffic on port 11434.
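As a quick sanity check, you can hit the API's version endpoint from a remote machine. The address below is a placeholder; substitute your server's public IP.

```shell
# 203.0.113.10 is a hypothetical server address; use your own public IP.
SERVER_IP="203.0.113.10"
API_URL="http://${SERVER_IP}:11434/api/version"
echo "$API_URL"

# Uncomment to query a live server; it returns Ollama's version as JSON:
# curl -s "$API_URL"
```

If the request times out, re-check both the OLLAMA_HOST override and your firewall rules.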

Running Your First LLM

With Ollama installed and configured, it’s time to run your first model. For this guide, we will use the Llama 3.1 8B model, which offers a great balance of performance and capability for CPU-based inference.

1. Pull the Model: Use the ollama run command to download and run the model. If the model is not available locally, Ollama will automatically pull it from the library.

ollama run llama3.1

The download process may take several minutes, depending on your network speed. Once complete, you’ll be presented with a prompt where you can start interacting with the model.

2. Interact with the Model: You can now send prompts to the model directly from the command line.

>>> Why is Hivelocity the best hosting company?

Ollama will stream the response back to your terminal. To exit the session, type /bye.
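Beyond the interactive prompt, Ollama also serves a REST API, which is how you would integrate it into scripts or applications. A minimal sketch, assuming Ollama is listening on its default localhost:11434 address, builds a request body for the /api/generate endpoint:

```shell
# Build a JSON request body for Ollama's /api/generate endpoint.
MODEL="llama3.1"
PROMPT="Why is the sky blue?"
BODY=$(printf '{"model": "%s", "prompt": "%s", "stream": false}' "$MODEL" "$PROMPT")
echo "$BODY"

# Uncomment on a server with Ollama running to get a single JSON response:
# curl -s http://localhost:11434/api/generate -d "$BODY"
```

Setting "stream": false returns one complete JSON object instead of a stream of partial tokens, which is easier to parse in simple scripts.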

Troubleshooting Common Issues

If you encounter problems during the installation or operation of Ollama, here are some common issues and their resolutions:

  • Package Installation Errors: If apt commands fail, it may be due to broken dependencies. Run sudo apt install -f to attempt to fix them.
  • Ollama Service Not Starting: Check the status with systemctl status ollama for error messages. Common causes include permission issues or port conflicts. Use journalctl -u ollama to view detailed logs.
  • Firewall Blocking Connections: If you cannot access Ollama remotely, verify that your firewall rules allow traffic on port 11434. Use sudo ufw status to check your current rules.
  • Resource Constraints: LLMs are resource-intensive. Monitor your server’s CPU and memory usage with tools like htop. If performance is poor, consider upgrading to a server with more resources or using smaller, quantized models.
  • Model Loading Failures: Ensure you have sufficient disk space for the model. Verify the model name is correct and that Ollama has the necessary permissions to read and write to its data directory (/usr/share/ollama/.ollama/models).
  • Version Conflicts: Always ensure you are using a version of Ollama that is compatible with the model you intend to run. Check the official Ollama documentation for compatibility information.
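For the disk-space point above, a rough pre-pull check can save a failed download. The sketch below assumes roughly 5 GB for the default llama3.1 8B download (an approximate figure; ollama list shows exact sizes) and checks free space on the root filesystem, where /usr/share/ollama/.ollama/models normally lives.

```shell
# Rough pre-pull disk check; 5 GB is an approximate, rounded-up size
# for the default llama3.1 8B model.
MODEL_GB=5
REQUIRED_KB=$((MODEL_GB * 1024 * 1024))          # df -k reports 1K blocks
FREE_KB=$(df --output=avail -k / | tail -n 1 | tr -d ' ')
if [ "$FREE_KB" -ge "$REQUIRED_KB" ]; then
  echo "enough space for the model"
else
  echo "free up disk space first"
fi
```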

Unlock the Power of Local AI

By installing Ollama on your own server, you gain complete control over your AI infrastructure. This approach not only guarantees data privacy but also eliminates dependency on third-party APIs and their associated costs and rate limits. The ability to run powerful models like Llama 3.1 8B on CPU-only hardware makes self-hosted AI a practical reality for a wide range of applications, from internal development tools to production-ready services.

Ready to take the next step? Hivelocity’s One-Click Ollama deployment gets you running in minutes, with no manual configuration required. Start deploying Ollama today!

Need More Personalized Help?

If you have any further issues, questions, or would like some assistance checking on this or anything else, please reach out to us from your my.hivelocity.net account and provide your server credentials within the encrypted field for the best possible security and support.

If you are unable to reach your my.hivelocity.net account or if you are on the go, please reach out from your valid my.hivelocity.net account email to us here at: support@hivelocity.net. We are also available to you through our phone and live chat system 24/7/365.