
DSPy Breakdown - Part 1: Fundamentals for Prompt Optimization

Exploring a framework for programming large language models, not prompting them


With the rise of Language Models (LMs), data scientists spend a lot of their time on prompt engineering to test how different LMs perform on their projects. Every week a new foundation model is released that is better and cheaper than the previous ones, tempting data scientists to test one more LLM on their projects. Evaluating an LLM properly requires ample time spent on prompt engineering to cover all the edge cases and to understand the nuances of the new model, and this can take weeks.

In the age of AI (or automation), how about a tool that automates prompt engineering for data scientists, freeing their time for actual data science work, i.e. generating insights from statistics and developing AI model architectures? That’s where DSPy comes in: it offers prompt tuning for different LMs, tuning of LLM weights, and building AI agents.

Introduction

DSPy, pronounced “dee-ess-pie”, was developed by researcher Omar Khattab and others in the Stanford NLP group. The DSPy research paper argues that existing LLM pipelines are typically implemented using hard-coded “prompt templates” discovered via trial and error, whereas DSPy provides a more systematic approach for developing and optimizing LLM pipelines.

DSPy is a framework for algorithmically optimizing LM prompts and weights. DSPy does the thinking on your behalf to optimize prompts, saving the time otherwise spent trying different prompt engineering or prompt tuning strategies. Its syntax is inspired by PyTorch.

Let’s go through the foundational knowledge of DSPy first, before diving into optimizing the prompts. The basic building blocks that let you use an LM in DSPy are:

  • Configuring LMs
  • Signatures
  • Modules

Let’s go through each of them in depth.

Configuring LMs

The very first step in DSPy is initializing a language model of your choice for the rest of the program. You can do this as follows:

import dspy

global_llm = dspy.LM('openai/gpt-4o-mini', api_key = <API_KEY>, temperature = 0.7, max_tokens = 3000, cache = False)
dspy.configure(lm = global_llm)

dspy.LM initializes the LLM of your choice, and dspy.configure makes the initialized LLM the default for the whole program. DSPy supports all the major LLM providers; the full list can be found in the DSPy documentation.

temperature, cache, and max_tokens are some of the attributes you can configure while initializing the LLM; the remaining attributes are listed in the DSPy documentation.

By default, LMs in DSPy are cached, i.e. if you repeat a call you will get the same output as before. Caching can be turned off by setting cache=False. When using small LMs like gpt-4o-mini, you may still receive the same output even with cache=False. In a Jupyter Notebook, you can check whether caching is active by observing the response time: instant responses indicate caching, while delays suggest actual LLM calls.
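Conceptually, this caching behaves like memoization on the call arguments. A minimal plain-Python analogy (not DSPy’s actual implementation; fake_lm is a hypothetical stand-in for a real model call):

```python
import functools

calls = {"count": 0}  # tracks how many "real" model calls happen

@functools.lru_cache(maxsize=None)
def fake_lm(prompt: str, temperature: float = 0.7) -> str:
    # Stands in for a real LLM call; the body only runs on a cache miss.
    calls["count"] += 1
    return f"response to: {prompt}"

fake_lm("Classify this sentiment: I loved the product.")
fake_lm("Classify this sentiment: I loved the product.")  # served from cache
print(calls["count"])  # → 1: the second identical call never reached the "model"
```

Changing any argument (the prompt or temperature) is a cache miss and triggers a fresh call, which mirrors why repeated identical DSPy calls return instantly.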

LMs configured above can be called directly as below:

global_llm("Classify this sentiment into either Positive, Neutral or Negative: I loved the product. The service is worst though.", temperature = 0.2)
global_llm(messages=[{
    "role": "user",
    "content": "Classify this sentiment into either Positive, Neutral or Negative: I loved the product. The service is worst though."
}], temperature = 0.5)

Signatures

A prompt comprises three parts: (a) what the inputs and outputs are, (b) the task description, and (c) the prompting technique, such as Chain of Thought, ReAct, or Program of Thought. Signatures in DSPy are a programmatic way of defining the first two parts of the prompt for your task.

There are two ways for defining Signatures:

  • Class-based Signatures
  • Inline Signatures

Class-based Signatures

In the class-based definition, a Signature has two field types: InputField() to define inputs and OutputField() to define outputs in the prompt. The task description is declared as the docstring of the Signature class. Say we have the task of sentiment classification of user reviews; the Signature is defined as:

import dspy

class SentimentSignature(dspy.Signature):
    """
    You would be given an input text; and you need to classify it into strictly these three sentiments: (a)Positive, (b)Neutral or (c)Negative.
    """
    sentiment_text = dspy.InputField()
    sentiment_classification = dspy.OutputField()

With the signature defined above and standard prompting, the system prompt will be this:

Your input fields are:
1. `sentiment_text` (str):
Your output fields are:
1. `sentiment_classification` (str):
All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## sentiment_text ## ]]
{sentiment_text}

[[ ## sentiment_classification ## ]]
{sentiment_classification}

[[ ## completed ## ]]
In adhering to this structure, your objective is: 
        You would be given an input text; and you need to classify it into strictly these three sentiments: (a)Positive, (b)Neutral or (c)Negative.

The user prompt will be this:

[[ ## sentiment_text ## ]]
<The input text whose sentiment need to be identified would be here>

Respond with the corresponding output fields, starting with the field `[[ ## sentiment_classification ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`.

sentiment_text and sentiment_classification appear in the system prompt because they are the fields (with the same names) declared in the SentimentSignature class. The docstring of SentimentSignature is placed at the end of the system prompt.
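The field markers above follow a simple fill-in pattern. As an illustration, a hypothetical helper (not part of DSPy) that reproduces the user-prompt layout shown above might look like:

```python
def build_user_prompt(inputs: dict, output_fields: list) -> str:
    """Render inputs as [[ ## name ## ]] blocks and append the response instruction."""
    parts = [f"[[ ## {name} ## ]]\n{value}\n" for name, value in inputs.items()]
    first_out = output_fields[0]
    parts.append(
        "Respond with the corresponding output fields, starting with the field "
        f"`[[ ## {first_out} ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`."
    )
    return "\n".join(parts)

prompt = build_user_prompt(
    {"sentiment_text": "I loved the product. The service is worst though."},
    ["sentiment_classification"],
)
print(prompt)
```

This is only a sketch of the template logic; DSPy generates these prompts internally from the Signature via its adapters.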

Inline Signatures

In contrast to the class-based definition with its class boilerplate, inline Signatures are defined as a short string with argument names and optional types that define the semantic roles of inputs and outputs.

For example, suppose our task is sentiment classification; the Signature could be "sentence -> sentiment: bool", where text before the arrow -> is the input and text after is the output. So sentence is the input and sentiment is the output (here a boolean, e.g. True for positive).

Inline Signatures can also have multiple inputs and outputs with types. For multiple-choice question answering with reasoning, the signature would look like this: "question: str, choices: list[str] -> reasoning: str, selection: int"
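To make the string convention concrete, here is a small illustrative parser (not how DSPy implements it) that splits an inline signature into typed input and output fields, with untyped fields defaulting to str:

```python
def parse_inline_signature(signature: str):
    """Split 'name: type, ... -> name: type, ...' into (inputs, outputs) dicts."""
    def parse_fields(side: str) -> dict:
        fields = {}
        for field in side.split(","):
            name, _, type_ = field.partition(":")
            fields[name.strip()] = type_.strip() or "str"  # type defaults to str
        return fields

    inputs, outputs = signature.split("->")
    return parse_fields(inputs), parse_fields(outputs)

ins, outs = parse_inline_signature(
    "question: str, choices: list[str] -> reasoning: str, selection: int"
)
print(ins)   # → {'question': 'str', 'choices': 'list[str]'}
print(outs)  # → {'reasoning': 'str', 'selection': 'int'}
```

The sketch ignores edge cases (e.g. comma-containing types such as dict[str, int]), but it captures the convention: everything left of -> is an input field, everything right of it is an output field.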

Similar to the task description given via docstrings in class-based Signatures, we can declare the task description with the instructions parameter, as shown below:

toxicity = dspy.Predict(
    dspy.Signature(
        "comment -> toxic: bool",
        instructions="Mark as 'toxic' if the comment includes insults, harassment, or sarcastic derogatory remarks",
    )
)

dspy.Predict is a DSPy module, which we will see in the following section.

Modules

DSPy Modules are abstractions of various prompting techniques, and they take Signatures as their input. The following built-in modules support various prompting techniques:

  1. dspy.Predict: The most basic module; it doesn’t modify the Signature.
  2. dspy.ChainOfThought: Supports the Chain of Thought (CoT) prompting technique, where the LM is asked to think step by step before returning the response.
  3. dspy.ProgramOfThought: Its prompt asks the LM to generate code whose execution results dictate the response.
  4. dspy.ReAct: Supports ReAct prompts for AI agents.
  5. dspy.MultiChainComparison: Compares multiple outputs from ChainOfThought to produce a final prediction.

Let’s see an example of Signatures and Modules working together:

# Signature

import dspy

class SentimentSignature(dspy.Signature):
    """
    You would be given an input text; and you need to classify it into strictly these three sentiments: (a)Positive, (b)Neutral or (c)Negative.
    """
    sentiment_text = dspy.InputField()
    sentiment_classification = dspy.OutputField()

# Module

sentiment_classifier = dspy.Predict(SentimentSignature)

# Calling sentiment_classifier() with the input sentence
sentiment_classifier(sentiment_text = "I loved the product. The service is worst though.")

SentimentSignature and its corresponding prompt were already explained above. When calling sentiment_classifier(), the user prompt will look like this:

[[ ## sentiment_text ## ]]
I loved the product. The service is worst though.

Respond with the corresponding output fields, starting with the field `[[ ## sentiment_classification ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`.

And if inline Signatures are used, the code will be this:

# 1st Example
sentiment_classifier_1 = dspy.Predict("sentence: str -> sentiment: str")
sentiment_classifier_1(sentence = "I loved the product. The service is worst though.")

# 2nd Example
sentiment_classifier_2 = dspy.Predict(
    dspy.Signature(
        "sentence: str -> sentiment: str",
        instructions = "You would be given an input text; and you need to classify it into strictly these three sentiments: (a)Positive, (b)Neutral or (c)Negative."
    )
)
sentiment_classifier_2(sentence = "I loved the product. The service is worst though.")

Using Multiple Modules in one Module

We can combine multiple built-in modules into one bigger custom module. Each built-in module has a forward() function similar to PyTorch’s forward(), and likewise we define a forward() function in our custom module. Below is an example of using multiple modules:

import dspy

class Hop(dspy.Module):
    def __init__(self, num_docs=10, num_hops=4):
        super().__init__()
        self.num_docs, self.num_hops = num_docs, num_hops
        self.generate_query = dspy.ChainOfThought('claim, notes -> query')
        self.append_notes = dspy.ChainOfThought('claim, notes, context -> new_notes: list[str], titles: list[str]')

    def forward(self, claim: str) -> dspy.Prediction:
        notes = []
        titles = []

        for _ in range(self.num_hops):
            query = self.generate_query(claim=claim, notes=notes).query
            context = search(query, k=self.num_docs)  # `search` is an external retrieval function
            prediction = self.append_notes(claim=claim, notes=notes, context=context)
            notes.extend(prediction.new_notes)
            titles.extend(prediction.titles)

        return dspy.Prediction(notes=notes, titles=list(set(titles)))
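Note that Hop assumes a search(query, k) retrieval function is defined elsewhere (e.g. backed by a search API or vector index). For local experimentation, a hypothetical stub might look like:

```python
def search(query: str, k: int = 10) -> list:
    # Hypothetical stand-in for a real retrieval backend; returns up to k
    # documents matching the query, falling back to the first k documents.
    corpus = [
        "Title: DSPy | DSPy is a framework for programming language models.",
        "Title: PyTorch | PyTorch modules define a forward() method.",
        "Title: Prompting | Chain of Thought asks the model to reason step by step.",
    ]
    matches = [doc for doc in corpus if query.lower() in doc.lower()]
    return matches[:k] or corpus[:k]

print(search("dspy", k=2))  # → ['Title: DSPy | DSPy is a framework for programming language models.']
```

In a real pipeline you would replace this with an actual retriever; the only contract Hop relies on is that search(query, k) returns a list of document strings.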

Final Code Example

Let’s see a final code example that configures an LM and then uses a Signature and a Module to get a response from the LM.

import dspy
import os

openai_api_key = os.getenv("OPENAI_API_KEY")

# Initializing the gpt-4o-mini model with temperature of 0.8 and 3000 max tokens.
language_model = dspy.LM('openai/gpt-4o-mini', api_key = openai_api_key, temperature = 0.8, max_tokens = 3000, cache = False)
dspy.configure(lm = language_model)

# Declaring the signature for the sentiment classification task
class SentimentSignature(dspy.Signature):
    """
    You would be given an input text; and you need to classify it into strictly these three sentiments: (a)Positive, (b)Neutral or (c)Negative.
    """
    sentiment_text = dspy.InputField()
    sentiment_classification = dspy.OutputField()


# Declaring the module using the CoT prompting technique.
class SentimentPrediction(dspy.Module):
    def __init__(self):
        super().__init__()
        self.predict = dspy.ChainOfThought(SentimentSignature)

    def forward(self, sentiment_text):
        prediction = self.predict(sentiment_text=sentiment_text)
        return prediction.sentiment_classification


# Calling the module with an input sentence
sentiment_predictor = SentimentPrediction()
print(sentiment_predictor(sentiment_text = "I loved the product. The service is worst though."))

This post is licensed under CC BY 4.0 by the author.