DSPy Breakdown - Part 2: Optimizing Prompts

Learning how to use DSPy to optimize prompts for LMs

In the previous part, we covered the fundamentals of DSPy: how to configure language models globally and how to run inference through Signatures and Modules. In this part, we will learn how to optimize, or fine-tune, the prompt for different LMs.

DSPy's syntax is inspired by PyTorch, so I will reference PyTorch modules to explain DSPy.

DSPy Examples

A machine learning model is trained or optimized using labeled data: input data is fed to the model, and its weights are adjusted to produce outputs that closely match the corresponding target labels. Similarly, prompt optimization uses labeled data, and the prompt’s language (or text) is adjusted to produce the expected output.

A prompt can be viewed as a proxy for a machine learning model, where modifying the prompt text is analogous to adjusting the model’s weights, as it influences the behaviour and output of the underlying model.

While training a deep learning model in PyTorch, the data is typically represented using the Dataset class. Similarly, in DSPy, data used for prompt optimization is represented using the Example class.

An individual datapoint is represented with the Example class as:

import dspy

sentiment_example = dspy.Example(
    sentiment_one = "The product quality was excellent and exceeded my expectations.",
    sentiment_two = "The shipping was delayed and customer service was unhelpful.",
    overall_sentiment = "neutral")

print(sentiment_example)
print(sentiment_example.sentiment_one)
print(sentiment_example.sentiment_two)
print(sentiment_example.overall_sentiment)

# Outputs
# Example({'sentiment_one': 'The product quality was excellent and exceeded my expectations.', 'sentiment_two': 'The shipping was delayed and customer service was unhelpful.', 'overall_sentiment': 'neutral'})
# The product quality was excellent and exceeded my expectations.
# The shipping was delayed and customer service was unhelpful.
# neutral

In traditional machine learning, labeled data is separated into “inputs” and “outputs”. In DSPy, Example objects have a with_inputs() method that marks specific fields as inputs; the remaining fields are treated as labels.

# with_inputs() returns a new copy of the Example object; the original object is unchanged.
sentiment_example = sentiment_example.with_inputs("sentiment_one", "sentiment_two")

For more details, refer to the DSPy documentation for the Example class.

This post is licensed under CC BY 4.0 by the author.