
Debugging and Monitoring Ollama Usage

Aug 12, 2024

Ollama is a nice backend for running LLMs on your own machine. You can get it up and running with a single command, and right after that you can pull and chat with different models in no time.

While getting it up and running requires pretty much no configuration, debugging and monitoring it is not as obvious. Here are some things I’ve been doing myself in that regard.

Keep an eye on htop and nvtop

The first suggestion is pretty basic: keep an eye on CPU and GPU usage while Ollama is generating text. That will tell you immediately whether your GPU is being used. Ideally it is, since inference on the GPU is much faster.

htop shows CPU usage, and nvtop shows GPU usage on NVIDIA cards (use radeontop for AMD GPUs).
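If you'd rather check this from a script, Ollama's REST API has an /api/ps endpoint that lists the loaded models along with their total size and how much of that sits in VRAM. Here is a minimal sketch, assuming the server listens on the default http://localhost:11434 and that each model entry carries size and size_vram fields. gpu_fraction and loaded_models are hypothetical helper names, not part of Ollama:

```python
import json
import urllib.request

PS_URL = "http://localhost:11434/api/ps"  # assumption: default local address


def gpu_fraction(model_entry):
    """Fraction of a loaded model that sits in GPU memory (0.0 to 1.0)."""
    size = model_entry.get("size", 0)
    if size <= 0:
        return 0.0
    return model_entry.get("size_vram", 0) / size


def loaded_models(url=PS_URL):
    """Return the list of currently loaded models reported by the server."""
    with urllib.request.urlopen(url) as reply:
        return json.load(reply).get("models", [])


# Example (requires a running Ollama server with a model loaded):
# for entry in loaded_models():
#     print(entry.get("name"), f"{gpu_fraction(entry):.0%} in VRAM")
```

A value close to 0% while a model is answering suggests inference is running on the CPU.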

Get familiar with eval times with --verbose

On top of checking top-like monitoring tools (no pun intended), you should get familiar with the inference times of your hardware. To do that, pass --verbose to ollama run (for example, ollama run gemma:2b --verbose). You will get timing statistics right after each response:

>>> what are the top 10 most famous literary authors
Sure, here are the top 10 most famous literary authors:

1. William Shakespeare
2. Homer
3. Dante Alighieri
4. Jane Austen
5. Victor Hugo
6. Mark Twain
7. J.K. Rowling
8. Ernest Hemingway
9. Charles Dickens
10. Emily Dickinson

total duration:       789.131073ms
load duration:        25.543401ms
prompt eval count:    38 token(s)
prompt eval duration: 18.123ms
prompt eval rate:     2096.78 tokens/s
eval count:           70 token(s)
eval duration:        703.692ms
eval rate:            99.48 tokens/s

I got around 100 tokens/s on an RTX 2060, 75 tokens/s on an RX 7600, and 13 tokens/s on a Ryzen 5 5500, all with tiny models such as gemma:2b.
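The same numbers are available outside the CLI. The final response of Ollama's /api/generate endpoint includes eval_count, eval_duration, prompt_eval_count and prompt_eval_duration, with durations in nanoseconds. Here is a minimal sketch, assuming the server listens on the default http://localhost:11434 and that gemma:2b has already been pulled. tokens_per_second, summarize and generate are hypothetical helper names:

```python
import json
import urllib.request

GENERATE_URL = "http://localhost:11434/api/generate"  # assumption: default local address


def tokens_per_second(count, duration_ns):
    """Convert a token count and a duration in nanoseconds to tokens/s."""
    if duration_ns <= 0:
        return 0.0
    return count / (duration_ns / 1e9)


def summarize(response):
    """Compute prompt and generation rates from an API response dict."""
    return {
        "prompt_eval_rate": tokens_per_second(
            response.get("prompt_eval_count", 0),
            response.get("prompt_eval_duration", 0),
        ),
        "eval_rate": tokens_per_second(
            response.get("eval_count", 0),
            response.get("eval_duration", 0),
        ),
    }


def generate(model, prompt, url=GENERATE_URL):
    """Send a non-streaming request and return the parsed JSON response."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


# Example (requires a running Ollama server):
# print(summarize(generate("gemma:2b", "Why is the sky blue?")))
```

Plugging in the figures from the output above (70 tokens in 703.692 ms) gives the same 99.48 tokens/s that --verbose reports.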

Check the Modelfile of your model

You can find crucial information about how a model behaves by checking its Modelfile. Settings such as its initial system prompt and a custom temperature can greatly affect how the model replies to prompts.

Sample command:

$ ollama show phi3 --modelfile

Sample output:

# Modelfile generated by "ollama show"
# To build a new Modelfile based on this, replace FROM with:
# FROM phi3:latest

FROM /usr/share/ollama/.ollama/models/blobs/sha256-633fc5be925f9a484b61d6f9b9a78021eeb462100bd557309f01ba84cac26adf
TEMPLATE "{{ if .System }}<|system|>
{{ .System }}<|end|>
{{ end }}{{ if .Prompt }}<|user|>
{{ .Prompt }}<|end|>
{{ end }}<|assistant|>
{{ .Response }}<|end|>"
PARAMETER stop <|end|>
PARAMETER stop <|user|>
PARAMETER stop <|assistant|>
LICENSE """Microsoft.
Copyright (c) Microsoft Corporation.

MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE."""
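Most of that output is the license, so when I only care about the generation settings I look at the PARAMETER lines. Here is a minimal sketch that runs the same command and collects those lines. parse_parameters and show_modelfile are hypothetical helper names, and the parser is deliberately naive: it would also pick up a line starting with PARAMETER inside a multi-line TEMPLATE or LICENSE string.

```python
import subprocess
from collections import defaultdict


def parse_parameters(modelfile_text):
    """Collect PARAMETER lines into {name: [values]}; a name such as stop may repeat."""
    params = defaultdict(list)
    for line in modelfile_text.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) == 3 and parts[0].upper() == "PARAMETER":
            params[parts[1]].append(parts[2])
    return dict(params)


def show_modelfile(model):
    """Return the text printed by `ollama show <model> --modelfile`."""
    result = subprocess.run(
        ["ollama", "show", model, "--modelfile"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Example (requires the ollama CLI and a pulled model):
# print(parse_parameters(show_modelfile("phi3")))
```

For the phi3 output above, this would return the three stop sequences under a single stop key.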

There’s some history in ~/.ollama/history

You can find the history of your prompts in ~/.ollama/history, but sadly it only contains the text you’ve sent, not the responses.
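To peek at the most recent prompts without opening the whole file, something like the following works. It is a minimal sketch that assumes the file stores one prompt per line; last_prompts is a hypothetical helper name:

```python
from pathlib import Path

HISTORY_PATH = Path.home() / ".ollama" / "history"


def last_prompts(path=HISTORY_PATH, count=10):
    """Return the last `count` non-empty lines of the history file."""
    if count <= 0 or not path.exists():
        return []
    lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
    return [line for line in lines if line.strip()][-count:]


# Example:
# for prompt in last_prompts(count=5):
#     print(prompt)
```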

Enable OLLAMA_DEBUG and check the service logs

It can be useful to have Ollama log all prompts. This matters less if you are typing every prompt yourself on the CLI, but when connecting Ollama to other tools (tools performing RAG, for instance), seeing what has been injected into the prompts can be invaluable.

To have it logged, set the environment variable OLLAMA_DEBUG to 1. If you want to do it directly on the systemd service, edit /etc/systemd/system/ollama.service and add the following to the [Service] section:

Environment="OLLAMA_DEBUG=1"

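After changing the unit file, reload systemd and restart the service (systemctl daemon-reload, then systemctl restart ollama) so the variable takes effect. To check the logs: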

journalctl -u ollama --no-pager
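Debug logs get long quickly, so it helps to narrow them down. Here is a minimal sketch that reads the last hour of the service's journal and keeps only the lines containing a marker you choose. The helper names are hypothetical, and the marker in the example comment is just the system tag from phi3's template, not a documented log format:

```python
import subprocess


def journal_command(unit="ollama", since="1 hour ago"):
    """Build the journalctl invocation; -o cat prints only the message text."""
    return ["journalctl", "-u", unit, "--no-pager", "--since", since, "-o", "cat"]


def matching_lines(log_text, needle):
    """Return the log lines that contain `needle` (case-insensitive)."""
    needle = needle.lower()
    return [line for line in log_text.splitlines() if needle in line.lower()]


def read_logs(unit="ollama", since="1 hour ago"):
    """Run journalctl and return its output as a single string."""
    result = subprocess.run(
        journal_command(unit, since), capture_output=True, text=True, check=True
    )
    return result.stdout


# Example (requires systemd and permission to read the journal):
# for line in matching_lines(read_logs(), "<|system|>"):
#     print(line)
```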

Here’s what I got when prompting this PDF article while using OpenWebUI:

<|system|>
Use the following context as your learned knowledge, inside <context></context> XML tags.
<context>
    the matter diligently. My own favorites include the Analects of Confucius and the early Dialogues of Plato, which are little else but meditations on stupidity. Acknowledging that he did not know what truth is, Socrates spent his time exposing the false beliefs of those who thought they did. I am also partial to Erasmus’ In Praise of Folly, Jonathan Swift’s Gulliver’s Travels, and, in a more modern vein, Jacques Ellul’s

and Ed.D. from Columbia University. He was the Paulette Goddard Chair of Media Ecology at New York University and chair of the Department of Culture and Communication. His scholarly interests included media, education, language, and technology.  For more information, visit www.neilpostman.org.

www.neilpostman.org 2 In modern times, the list of educationists continues to include formidable intellects—William James, for example, whose Talks to Teachers is among the best books on education ever written. Two of the greatest philosophers in this century, Ludwig Wittgenstein and Karl Popper, were elementary-school teachers who of necessity would have thought deeply about educational issues. Wittgenstein’s professor at Cambridge, Bertrand Russell, founded a school, and Russell’s colleague, Alfred North Whitehead, wrote the impeccable Aims of Education. And, of course, America’s greatest homegrown systematic philosopher, John Dewey, was an educa-tionist par excellence. In other words, the history of Western philosophy is so bound up with the subject of education that the two can hardly be separated. One might even say that just as it is natural for a physicist upon reaching his deepest understandings to be drawn toward religion, so it is natural for a mature philosopher to turn toward the problems of education. Why, then, this persistent prejudice against the subject and those who make a profession of its study? Definitive Answers await a rich and extensive research project to which sociologists, psychologists, historians, perhaps even anthropologists must contribute their perspectives. I mention anthropology because I suspect the intensity of the prejudice varies from culture to culture. There are places—China, for example—where the prejudice may not exist at all. But if

writing on education than Confucius and Plato, but he too was an educationist if we may take that word to mean a person who is seriously concerned to understand how learning takes place and what part schooling plays in facilitating or obstructing it. In this sense, Quintilian was an educationist, and so were Erasmus, John Locke, Rousseau, and Thomas Jefferson. The great English poet John Milton was so moved by the prospect of writing an essay on education that he called the reforming of education one of “the greatest and noblest designs to be thought on.”                                                    1 This article is copyright 1988 by the National Council of Teachers of English.  Reprinted with permission. “The Educationist as Painkiller” was originally published in English Education (1988), 7-17.  It was also published in Conscientious Objections (New York: Alfred A. Knopf, 1988), 82-96.    2 Postman, Neil. Conscientious Objections (New York: Alfred A. Knopf, 1988), 82.

The Educationist as Painkiller
</context>

When answer to user:
- If you don't know, just say that you don't know.
- If you don't know when you are not sure, ask for clarification.
Avoid mentioning that you obtained the information from the context.
And answer according to the language of the user's question.

Given the context information, answer the query.
Query: give me a bullet list of the philosophers cited in the article<|end|>
<|user|>
give me a bullet list of the philosophers cited in the article<|end|>
<|assistant|>
