DSPy Breakdown - Part 1: Fundamentals for Prompt Optimization
Exploring the framework for programming the large language models, not prompting
With the rise of Language Models (LMs) , lots of data scientists are spending ton of their time on prompt engineering to test different LMs performances on their projects. Every week a new foundational model is released which is better and cheaper than all the previous releases, tempting all of the data scientists to test one more LLM for their projects. A satisfactory LLM testing requires ample time spent on prompt engineering to cover all the edge cases successfully, and to understand all the nuances a new LLM has. This takes weeks to be done.
In the age of AI (or automation), how about a tool that can automate the prompt engineering for data scientist while saving their time to focus on actual data science work i.e. isights generation from statistics, developing AI model architectures. That’s where DSPy comes in; it offers prompt tuning for different LMs, tuning LLM weights and AI agents.
Introduction
DSPy, pronounced as dee-s-pie, is developed by researcher Omar Khattab and others at Stanford NLP group. The DSPy research paper argues that existing LLM pipelines are typically implemented using hard-coded “prompt-temmplates” that are discovered via trial and error, whereas DSPy provides more systematic approach for developing and optimizing LLM pipelines.
DSPy is a framework for algorithmically optimizing LM prompts and weights. DSPy thinks on your behalf to optimize prompts saving time spent on thinking and trying different prompt engineering or prompt tuning strategies. It’ syntax is inspired from PyTorch.
Let’s go through the foundational knowledge of DSPy first, before diving into optimizing the prompts. The basic blocks of DSPy that lets you use LM in DSPy are:
- Configuring LMs
- Signatures
- Modules
Let’s go through each of them in depth.
Configuring LMs
The very first step in DSPy is initializing a language model of your choice for the future operations in the code. You can do this as follows:
1
2
3
4
import dspy
global_llm = dspy.LM('openai/gpt-4o-mini', api_key = <API_KEY>, temperature = 0.7, max_tokens = 3000, cache = False)
dspy.configure(global_llm)
dspy.LM
initializes the LLM of your choice and dspy.configure
makes the initialized LLM as default LLM for the whole program. DSPy supports all the LLM providers, it can be found here.
temperature
, cache
, max_tokens
are some of the attributes one can configure while initializing the LLM. More attributes can be read here.
By default LMs in DSPy are cached i.e. if you repeat the call, you will get the same outputs as before. Caching can be turned off by setting
cache=False
. When using small LMs like gpt-4o-mini, you may receive the same output even withcache=False
. In Jupyter Notebook, check if caching is active by observing the response time — instant responses indicate caching, while delays suggest actual LLM calls.
LMs configured above can be called directly as below:
1
2
3
4
5
global_llm("Classify this sentiment into either Positive, Neutral or Negative: I loved the product. The service is worst though.", temperature = 0.2)
global_llm(messages=[{
"role": "user",
"content": "Classify this sentiment into either Positive, Neutral or Negative: I loved the product. The service is worst though."
}], temperature = 0.5)
Signatures
Prompts comprises of three parts viz.(a) what would be the inputs & outputs (b) describing the task and (c) prompting techniques such as Chain of Thought, ReAct, Program of Thought, etc. Signatures in DSPy is a programmatic way of defining the first two parts of prompt for your task.
There are two ways for defining Signatures:
- Class-based Signatures
- Inline Signatures
Class-based Signatures
In class-based, Signatures has two attributes; InputField()
to define inputs and OutputField()
to define outputs in the prompt. The task description is declared as the docstrings in the Signatures
class. Let’s say we have task of sentiment classification of user reviews, the Signature
is defined as:
1
2
3
4
5
6
7
8
import dspy
class SentimentSignature(dspy.Signature):
"""
You would be given an input text; and you need to classify it into strictly these three sentiments: (a)Positive, (b)Neutral or (c)Negative.
"""
sentiment_text = dspy.InputField()
sentiment_classification = dspy.OutputField()
With the above defined signature and using standard prompting, the system prompt will be this:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Your input fields are:
1. `sentiment_text` (str):
Your output fields are:
1. `sentiment_classification` (str):
All interactions will be structured in the following way, with the appropriate values filled in.
[[ ## sentiment_text ## ]]
{sentiment_text}
[[ ## sentiment_classification ## ]]
{sentiment_classification}
[[ ## completed ## ]]
In adhering to this structure, your objective is:
You would be given an input text; and you need to classify it into strictly these three sentiments: (a)Positive, (b)Neutral or (c)Negative.
The user prompt will be this:
1
2
3
4
[[ ## sentiment_text ## ]]
<The input text whose sentiment need to be identified would be here>
Respond with the corresponding output fields, starting with the field `[[ ## sentiment_classification ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`.
In the system prompt, sentiment_text
and sentiment_classification
present as these are the variables (with the same text) declared in the SentimentSignature
signature. The docstring in the SentimentSignature
class is present at the end of the system prompt.
Inline Signatures
Contrary to class-based Signatures definition where there is class shenanigans, in inline Signatures are defined as a short string with argument names and optional types that define semantic roles for inputs/outputs.
For example, suppose our task is sentiment classification; the Signature would be: "sentence -> sentiment: bool"
where text before the arrow ->
is input and after is output. So sentence
is the input and sentiment
is output.
Signatures in inline based definition can also have multiple inputs and outputs with types. For multiple-choice question answering with reasoning the signature will look like this: "question: str, choices: list[str] -> reasoning: str, selection: int"
Similar to task description through docstrings in class-based Signatures definition, we can declare the task description with instructions
parameter as shown below:
1
2
3
4
5
6
toxicity = dspy.Predict(
dspy.Signature(
"comment -> toxic: bool",
instructions="Mark as 'toxic' if the comment includes insults, harassment, or sarcastic derogatory remarks",
)
)
dspy.Predict
is DSPy module which we will see in the following section.
Modules
DSPy Modules are abstractions of various prompting techniques. DSPy Modules
takes Signatures
as their input. Following are the built-in modules supporting various prompting techniques:
dspy.Predict
: It’s a very basic module and it doesn’t modify the Signaturedspy.ChainOfThought
: This module support Chain Of Thought(CoT) prompting technique where LM is asked to think step-by-step before returning the response.dspy.ProgramOfThought
: This module’s prompt asks LM to generate code whose execution results will dictate the response.dspy.ReAct
: This module supports ReACT prompts for AI agents.dspy.MultiChainComparison
: This module can compare multiple outputs fromChainOfThought
to produce a final prediction.
Let’s see an example of Signatures
and Modules
working together:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
# Signature
import dspy
class SentimentSignature(dspy.Signature):
"""
You would be given an input text; and you need to classify it into strictly these three sentiments: (a)Positive, (b)Neutral or (c)Negative.
"""
sentiment_text = dspy.InputField()
sentiment_classification = dspy.OutputField()
# Module
sentiment_classifier = dspy.Predict(SentimentSignature)
# Calling the 'sentiment_classifier()` with the input sentence
sentiment_classifier(sentiment_text = "I loved the product. The service is worst though.")
SentimentSignature
is already explained before with its corresponding prompt. When calling the sentiment_classifier()
, the user prompt will like this:
1
2
3
4
[[ ## sentiment_text ## ]]
I loved the product. The service is worst though.
Respond with the corresponding output fields, starting with the field `[[ ## sentiment_classification ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`.
And in case of inline Signatures are used the code will be this:
1
2
3
4
5
6
7
8
9
10
11
12
# 1st Exaple
sentiment_classifier_1 = dspy.Predict("sentence: str -> sentiment: str)
sentiment_classifier_1(sentence = "I loved the product. The service is worst though.")
# 2nd Example
sentiment_classifier_2 = dspy.Predict(
dspy.Signature(
"sentence: str -> sentiment: str",
instructions = "You would be given an input text; and you need to classify it into strictly these three sentiments: (a)Positive, (b)Neutral or (c)Negative."
)
)
sentiment_classifier_2(sentence = "I loved the product. The service is worst though.")
Using Multiple Modules in one Module
We can combine multiple built-in modules into one bigger custom module. Each built-in module has forward()
function similar to PyTorch
’s forward()
, and similarly we will forward()
function in our custom module too. Below is the example of using multiple modules:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
import dspy
class Hop(dspy.Module):
def __init__(self, num_docs=10, num_hops=4):
self.num_doc, self.num_hops = num_docs, num_hops
self.generate_query = dspy.ChainOfThought('claim, notes -> query')
self.append_notes = dspy.ChainOfThought('claim, notes, context -> new_notes: list[str], titles: list[str])
def forward(self, claim: str) -> list[str]:
notes = []
titles = []
for _ in range(self.num_hops):
query = self.generate_query(claim=claim, notes=notes).query
context = search(query, k=self.num_docs)
prediction = self.append_notes(claim=claim, notes=notesm context=context)
notes.extend(prediction.new_notes)
titles.extend(prediction.titles)
return dspy.Prediction(notes=notes, titles=list(set(titles)))
Final Code Example
Let’s see a final code example consisting of configuring LM and then using Signature and Modules to get response from LM.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
import dspy
import os
openai_api_key = os.getenv("OPENAI_API_KEY")
# Initializing the gpt-4o-mini model with themperature of 0.8 and 3000 tokens.
language_model = dspy.LM('openai/gpt-4o-mini', api_key = openai_api_key, temperature = 0.8, max_tokens = 3000, cache = False)
# Declaring the signature for sentiment classification task
class SentimentSignature(dspy.Signature):
"""
You would be given an input text; and you need to classify it into strictly these three sentiments: (a)Positive, (b)Neutral or (c)Negative.
"""
sentiment_text = dspy.InputField()
sentiment_classification = dspy.OutputField()
# Declaring the module using CoT prompting techniques.
class SentimentPrediction(dspy.Module):
def __init__(self):
self.predict = dspy.ChainOfThought(SentimentSignature)
def forward(self, sentiment_text):
prediction = self.predict(sentiment_text)
return prediction.sentiment_classification