Date: 2025-07-16

Over the last few months, we have explored how to leverage large language models (LLMs) with Llama Stack and Node.js. While TypeScript/JavaScript is often the second language supported by frameworks used to leverage LLMs, Python is generally the first. We thought it would be interesting to go through some of the same exploration by porting over our scripts to Python.

This is the first of a 4-part series in which we'll explore using Llama Stack with the Python API. We will start by looking at how tool calling and agents work when using Python with Llama Stack, following the same patterns and approaches that we used to examine other frameworks.

Setting up Llama Stack

Our first step was to get a running Llama Stack instance that we could experiment with. Llama Stack is a bit different from other frameworks in a few ways.

First, instead of providing a single implementation with a set of defined APIs, it aims to standardize a set of APIs and drive a number of distributions. In other words, the goal is to have many implementations of the same API, with each implementation shipped by a different organization as a distribution. As is common with this approach, a reference distribution is provided, but there are already several alternative distributions available. You can see the available distributions here.

If you have trouble finding an example or documentation that shows what you want to do, the docs endpoint is a great resource.

Our first Python Llama Stack application

Our next step was to create an example that we could run with Python. A Python client is available, so that is what we used: llama-stack-client-python. As stated in the documentation, it is an automatically generated client based on an OpenAPI definition of the reference API. There is also Markdown documentation for the implementation itself. As mentioned in the previous section, the docs endpoint was also a great resource for understanding the client functionality.
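As a rough illustration of what connecting with the Python client can look like, here is a minimal sketch. It assumes the package installs as llama_stack_client and exposes LlamaStackClient with a models.list() call. The server address, the port 8321, the LLAMA_STACK_URL environment variable, and the helper names are placeholders for illustration and do not come from our scripts.

```python
import os

# Assumption: placeholder address; adjust to wherever your Llama Stack
# instance is running. The port is an assumed default, not a requirement.
DEFAULT_BASE_URL = "http://localhost:8321"


def resolve_base_url(env=None):
    """Return the server URL, preferring the hypothetical LLAMA_STACK_URL variable."""
    env = os.environ if env is None else env
    # Strip any trailing slash so the URL is in a consistent form.
    return env.get("LLAMA_STACK_URL", DEFAULT_BASE_URL).rstrip("/")


def list_model_ids(base_url):
    """Connect to the server and return the identifiers of the models it serves."""
    # Imported lazily so this module still loads when the client is not installed.
    from llama_stack_client import LlamaStackClient

    client = LlamaStackClient(base_url=base_url)
    # Assumption: each returned model object carries an `identifier` attribute.
    return [model.identifier for model in client.models.list()]
```

With an instance running, calling list_model_ids(resolve_base_url()) should return the model identifiers registered with that server, which is a quick way to confirm that the client can reach it.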

To start, we wanted to implement the same question flow that we used in the past to explore tool calling. This consists of providing the LLM with 2 tools:

favorite_color_tool: Returns the favorite color for a person in the specified city and country.

favorite_hockey_tool: Returns the favorite hockey team for a person in the specified city and country.
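To make the two tools concrete, the sketch below shows one way they could be written as plain Python functions, with a small dispatcher that routes a tool call by name. The lookup data, the reply wording, and the call_tool helper are invented for illustration. How the functions are registered with Llama Stack is not shown here.

```python
# Placeholder data invented for this sketch; it is not real preference data.
_FAVORITE_COLORS = {("ottawa", "canada"): "black", ("montreal", "canada"): "red"}
_FAVORITE_TEAMS = {
    ("ottawa", "canada"): "The Ottawa Senators",
    ("montreal", "canada"): "The Montreal Canadiens",
}


def _lookup(table, city, country, label):
    """Shared lookup: normalize the inputs and build a short text reply."""
    value = table.get((city.strip().lower(), country.strip().lower()))
    if value is None:
        return "the city or country was not valid, please ask the user for them"
    return f"the {label} is {value}"


def favorite_color_tool(city: str, country: str) -> str:
    """Returns the favorite color for a person in the specified city and country."""
    return _lookup(_FAVORITE_COLORS, city, country, "favorite color")


def favorite_hockey_tool(city: str, country: str) -> str:
    """Returns the favorite hockey team for a person in the specified city and country."""
    return _lookup(_FAVORITE_TEAMS, city, country, "favorite hockey team")


# Map tool names to their implementations.
TOOLS = {
    "favorite_color_tool": favorite_color_tool,
    "favorite_hockey_tool": favorite_hockey_tool,
}


def call_tool(name: str, arguments: dict) -> str:
    """Hypothetical dispatcher: run the named tool with keyword arguments."""
    if name not in TOOLS:
        return f"unknown tool: {name}"
    return TOOLS[name](**arguments)
```

Returning a plain text message for invalid input, rather than raising an exception, gives the model something it can act on, such as asking the user for the missing city or country.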

Then, we ran through this sequence of questions to see how well they were answered: [...]