Monitor LangChain Ollama APIs
For LangChain applications that call Ollama, the Python Agent reports the following metrics:
- Input Tokens
- Output Tokens
- Time to First Token (ms)
- Time per Output Token (ms)
- Average Response Time (ms)
- Prompt Count
- Embedding Queries Count
- Errors
To report token metrics, install the transformers Python library. See transformers.
pip install transformers
Example metric path: Agent|Langchain|LLM|llama3.2_latest|Average Response Time (ms)
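As a minimal sketch of how such a path could be composed, the example below builds the metric path string from an Ollama model tag. The assumption here (not stated in the original) is that characters such as ":" in the model tag (e.g. "llama3.2:latest") are normalized to "_" in the path segment; the `metric_path` helper itself is hypothetical, not part of the agent API.

```python
def metric_path(model_name: str, metric: str) -> str:
    """Hypothetical helper: build a metric path for an Ollama model.

    Assumes the model tag's ':' and '/' are normalized to '_' in the
    path segment, matching the 'llama3.2_latest' example above.
    """
    segment = model_name.replace(":", "_").replace("/", "_")
    return f"Agent|Langchain|LLM|{segment}|{metric}"

print(metric_path("llama3.2:latest", "Average Response Time (ms)"))
# Agent|Langchain|LLM|llama3.2_latest|Average Response Time (ms)
```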