How To Generate an OpenAPI Spec With Pydantic V2
Pydantic is considered by many API developers to be the best data validation library for Python, and with good reason. By defining an application's models in Pydantic, developers benefit from a vastly improved development experience, runtime data validation and serialization, and automatic OpenAPI schema generation.
However, many developers don't realize they can generate OpenAPI schemas from their Pydantic models, which they can then use to create SDKs, documentation, and server stubs.
In this guide, you'll learn how to create new Pydantic models, generate an OpenAPI schema from them, and use the generated schema to create an SDK for your API. We'll start with the simplest possible Pydantic model and gradually add more features to show how Pydantic models translate to OpenAPI schemas.
Prerequisites
Before we get started, make sure you have Python 3.8 or higher installed on your machine. Check your Python version by running python --version (or python3 --version on some systems).
We use Python 3.12.4 in this guide, but any version of Python 3.8 or higher should work.
You can clone our example repository from GitHub to follow along with the code snippets in this guide, or you can create a new Python project and install the required libraries as we go.
Create a New Python Project
First, create a new Python project and install the Pydantic library:
Install the Required Libraries
We'll install Pydantic and PyYAML to generate and pretty-print the OpenAPI schema, for example by running pip install pydantic pyyaml.
Pydantic to OpenAPI Schema Walkthrough
Let's follow a step-by-step process to generate an OpenAPI schema from a Pydantic model without any additional libraries.
Define a Simple Pydantic Model
Create a new Python file called models.py and define a simple Pydantic model.
In this example, we define a Pydantic model called Pet with three fields: id, name, and breed. The id field is an integer, and the name and breed fields are strings.
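A minimal sketch of the model described above might look like the following. The field names and types come from the description; the sample values at the bottom are made up for illustration.

```python
from pydantic import BaseModel


class Pet(BaseModel):
    # An integer identifier plus two string fields, as described above.
    id: int
    name: str
    breed: str


# Illustrative values only; Pydantic validates them on construction.
example_pet = Pet(id=1, name="Fido", breed="Labrador")
```

Constructing a Pet with a value that cannot be coerced to the declared type, such as a non-numeric id, raises a ValidationError.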
Generate JSON Schema for the Pydantic Model
Add a new function called print_json_schema to the models.py file that prints the JSON schema for the Pet model.
This function uses the model_json_schema method provided by Pydantic to generate the JSON schema, which Python then prints to the console as YAML. We use YAML for readability, but the output is still a valid JSON schema.
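As a rough sketch, assuming Pydantic 2.x and PyYAML are installed, models.py could look like this at this point. The function body is our reading of the description above rather than the exact code from the example repository.

```python
import yaml
from pydantic import BaseModel


class Pet(BaseModel):
    id: int
    name: str
    breed: str


def print_json_schema():
    # model_json_schema() returns the JSON schema as a plain dict.
    schema = Pet.model_json_schema()
    # yaml.dump() turns that dict into a YAML string, which we print.
    print(yaml.dump(schema))


if __name__ == "__main__":
    print_json_schema()
```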
Run python models.py to generate the JSON schema for the Pet model and print it as YAML:
Multiple Pydantic Models
Let's add another Pydantic model called Owner to the models.py file.
The Owner model has two fields: id and name. The id field is an integer, and the name field is a string. Additionally, the Owner model has a list of Pet objects.
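The two models could be sketched as follows. We assume here that the owner's name is a string, as it is for Pet, and that the list of pets lives in a field called pets; both are assumptions made for illustration.

```python
from typing import List

from pydantic import BaseModel


class Pet(BaseModel):
    id: int
    name: str
    breed: str


class Owner(BaseModel):
    id: int
    name: str  # assumed to be a string, as in Pet
    pets: List[Pet]  # "pets" is an assumed field name


# Nested dictionaries are validated and converted into Pet instances.
owner = Owner(
    id=1,
    name="Alice",
    pets=[{"id": 1, "name": "Fido", "breed": "Labrador"}],
)
```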
Generate JSON Schema for Multiple Pydantic Models
Update the print_json_schema function to print the JSON schema for both the Pet and Owner models.
Note that we're now calling the models_json_schema function from pydantic.json_schema instead of the model_json_schema method.
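A possible version of the updated function is sketched below, assuming Pydantic 2.x. The helper build_json_schema is a hypothetical name introduced here so that the schema can be inspected separately from printing it.

```python
from typing import List

import yaml
from pydantic import BaseModel
from pydantic.json_schema import models_json_schema


class Pet(BaseModel):
    id: int
    name: str
    breed: str


class Owner(BaseModel):
    id: int
    name: str
    pets: List[Pet]


def build_json_schema():
    # Each entry pairs a model with a schema mode ("validation" or "serialization").
    # The first return value maps each model to its own schema; the second
    # holds the shared definitions, which is the part we want here.
    _, schemas = models_json_schema([(Pet, "validation"), (Owner, "validation")])
    return schemas


def print_json_schema():
    print(yaml.dump(build_json_schema()))


if __name__ == "__main__":
    print_json_schema()
```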
Run python models.py to generate the JSON schema for both the Pet and Owner models and print it as YAML:
The generated schema includes definitions for both the Pet and Owner models. The Owner model has a reference to the Pet model, indicating that the Owner model contains a list of Pet objects.
Note that the root of the schema includes a $defs key that contains the definitions for both models, and the Owner model references the Pet model using the $ref keyword.
Customize Pydantic JSON Schema Generation
Let's customize the generated JSON schema to reference the Pet model using the #/components/schemas path instead of $defs.
We'll use the ref_template parameter of the models_json_schema function to specify the reference template.
Next, we'll update the print_json_schema function to print a JSON schema that resembles an OpenAPI schema's components section.
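The two changes could be combined roughly as follows, assuming Pydantic 2.x. The helper name build_components is hypothetical. Note that ref_template only changes the $ref strings; the definitions are still returned under the $defs key, so we move them under components.schemas ourselves.

```python
from typing import List

import yaml
from pydantic import BaseModel
from pydantic.json_schema import models_json_schema


class Pet(BaseModel):
    id: int
    name: str
    breed: str


class Owner(BaseModel):
    id: int
    name: str
    pets: List[Pet]


def build_components():
    _, schemas = models_json_schema(
        [(Pet, "validation"), (Owner, "validation")],
        # {model} is replaced with the name of the referenced model.
        ref_template="#/components/schemas/{model}",
    )
    # Re-home the definitions where OpenAPI expects them.
    return {"components": {"schemas": schemas.get("$defs")}}


def print_json_schema():
    print(yaml.dump(build_components()))


if __name__ == "__main__":
    print_json_schema()
```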
Run python models.py to generate the OpenAPI schema for both the Pet and Owner models.
The generated OpenAPI schema includes the components section, with definitions for both the Pet and Owner models.
The JSON Schema we generated resembles an OpenAPI schema's components section, but to generate a valid OpenAPI schema, we need to add the openapi and info sections.
Edit the print_json_schema function in models.py to include the openapi and info sections in the generated OpenAPI schema.
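One way to assemble the document is sketched below. The API title and version are placeholder values, and the helper name build_openapi_document is hypothetical. We use OpenAPI version 3.1.0 on the assumption that the JSON Schema dialect Pydantic 2.x emits is intended for OpenAPI 3.1.

```python
from typing import List

import yaml
from pydantic import BaseModel
from pydantic.json_schema import models_json_schema


class Pet(BaseModel):
    id: int
    name: str
    breed: str


class Owner(BaseModel):
    id: int
    name: str
    pets: List[Pet]


def build_openapi_document():
    _, schemas = models_json_schema(
        [(Pet, "validation"), (Owner, "validation")],
        ref_template="#/components/schemas/{model}",
    )
    return {
        "openapi": "3.1.0",
        # Placeholder metadata; replace with your API's real title and version.
        "info": {"title": "Pet Sitter API", "version": "0.0.1"},
        "components": {"schemas": schemas.get("$defs")},
    }


def print_json_schema():
    # sort_keys=False keeps openapi and info at the top of the output.
    print(yaml.dump(build_openapi_document(), sort_keys=False))


if __name__ == "__main__":
    print_json_schema()
```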
Run python models.py to generate the complete OpenAPI schema for both the Pet and Owner models.
The generated OpenAPI schema includes the openapi, info, and components sections with definitions for both the Pet and Owner models.
Now we have a complete OpenAPI document that we can use to generate SDK clients for our API. However, the generated OpenAPI schema does not contain descriptions or example values for the models. We can add these details to the Pydantic models to improve the generated OpenAPI schema.
Add Descriptions to Pydantic Models
Let's add docstrings to the Pet and Owner models to include additional information in the generated OpenAPI schema.
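For instance, the models might gain docstrings like the ones below. The docstring wording is invented for illustration; Pydantic copies a model's docstring into the description of its schema.

```python
from typing import List

from pydantic import BaseModel


class Pet(BaseModel):
    """A pet that needs looking after."""

    id: int
    name: str
    breed: str


class Owner(BaseModel):
    """A person who owns one or more pets."""

    id: int
    name: str
    pets: List[Pet]


pet_schema = Pet.model_json_schema()
owner_schema = Owner.model_json_schema()
```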
If we run python models.py, we see that our Owner schema now includes a description field, derived from the docstring we added to the Owner Pydantic model.
The Pet schema now also includes a description field, derived from the docstring we added to the Pet Pydantic model.
Add OpenAPI Titles and Descriptions to Pydantic Fields
Let's add titles and descriptions to the fields of the Pet and Owner models to include additional information in the generated OpenAPI schema.
We'll use the Field function from Pydantic to add titles and descriptions to the fields.
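A sketch of the Pet model with field metadata follows, assuming Pydantic 2.x. The titles and descriptions are placeholder text. Passing ... as the first argument to Field keeps the field required.

```python
from pydantic import BaseModel, Field


class Pet(BaseModel):
    """A pet that needs looking after."""

    id: int = Field(..., title="Pet ID", description="Unique identifier of the pet")
    name: str = Field(..., title="Pet Name", description="Name of the pet")
    breed: str = Field(..., title="Pet Breed", description="Breed of the pet")


pet_schema = Pet.model_json_schema()
```

Without an explicit title, Pydantic derives one from the field name, so setting it is only needed when the default is not descriptive enough.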
If we run python models.py, we see that our Pet schema now includes descriptions for each field.
Add OpenAPI Example Values to Pydantic Models
Examples help API users understand your API's data structures, and some SDK and documentation generators use OpenAPI example values to generate useful code snippets and documentation.
Let's add example values to the Pet and Owner Pydantic models. Once again, we'll use the Field function from Pydantic to add example values to the fields.
Note that the examples are added as a list per field, using the examples parameter.
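The examples parameter could be used along these lines, assuming Pydantic 2.x. The example values themselves are invented.

```python
from pydantic import BaseModel, Field


class Pet(BaseModel):
    """A pet that needs looking after."""

    # examples takes a list, even when there is only one example value.
    id: int = Field(..., description="Unique identifier of the pet", examples=[1])
    name: str = Field(..., description="Name of the pet", examples=["Fido"])
    breed: str = Field(..., description="Breed of the pet", examples=["Labrador", "Poodle"])


pet_schema = Pet.model_json_schema()
```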
If we run python models.py, we see that our Pet schema now includes example values for each field.
Marking Fields as Optional in Pydantic Models
By default, Pydantic treats every field without a default value as required. You can make a field optional by giving it a default value of None and annotating its type as Optional, so that null is also an accepted value.
Let's mark the breed field in the Pet model as optional by setting its default to None.
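In Pydantic 2.x, a default of None removes the field from the required list, while the Optional[str] annotation is what adds null as an accepted type. A sketch combining both, with placeholder descriptions:

```python
from typing import Optional

from pydantic import BaseModel, Field


class Pet(BaseModel):
    """A pet that needs looking after."""

    id: int = Field(..., description="Unique identifier of the pet")
    name: str = Field(..., description="Name of the pet")
    # Optional[str] allows null; the default of None makes the field non-required.
    breed: Optional[str] = Field(None, description="Breed of the pet")


pet_schema = Pet.model_json_schema()
pet_without_breed = Pet(id=1, name="Fido")
```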
If we run python models.py, we see that the breed field in the Pet schema now has two types: string and null, and it has been removed from the required list. Only id and name are required fields after marking breed as optional.
Adding Enums to OpenAPI using Pydantic Models
Enums in OpenAPI are useful for defining a set of possible values for a field.
Let's add an enum called PetType to the Pet model to represent different types of pets.
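A sketch of the enum and the new field follows. The enum members and the field name pet_type are assumptions chosen for illustration. Subclassing both str and Enum makes the generated schema a string enum.

```python
from enum import Enum

from pydantic import BaseModel


class PetType(str, Enum):
    # Illustrative members; use whatever pet types your API supports.
    cat = "cat"
    dog = "dog"
    rabbit = "rabbit"


class Pet(BaseModel):
    id: int
    name: str
    pet_type: PetType


pet_schema = Pet.model_json_schema()
pet = Pet(id=1, name="Fido", pet_type="dog")
```

Pydantic emits PetType as its own definition and references it from the pet_type property, rather than inlining the allowed values.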
In our generated OpenAPI schema, we have a new pet_type field in the Pet schema.
This enum is represented as a separate schema in the OpenAPI document.
Adding Paths and Operations to the OpenAPI Schema
Now that we have generated an OpenAPI schema from our Pydantic models, we can use the schema to generate SDK clients for our API.
However, the OpenAPI document we generated, while valid, does not include the paths section, which defines the API endpoints and operations.
When using Pydantic with FastAPI, you can define your API endpoints and operations directly in your FastAPI application. FastAPI automatically generates the OpenAPI schema for your API, including the paths section.
Let's see how we can define API endpoints and operations in a framework-agnostic way and add them to the OpenAPI schema.
Install openapi-pydantic
We'll use the openapi-pydantic library to define a complete OpenAPI schema with paths and operations.
The benefit of using openapi-pydantic is that it allows you to define the API endpoints and operations in a Python dictionary, while still getting the benefit of Pydantic's IDE support and type checking.
The library includes convenience methods to convert Pydantic models to OpenAPI schema components and add them to the OpenAPI schema.
Install the openapi-pydantic library, for example by running pip install openapi-pydantic.
Create a new Python file called api.py and define the API endpoints and operations using the openapi-pydantic library.
The api.py file saves the complete OpenAPI schema to a file named openapi.yaml.
Run python api.py to generate the complete OpenAPI schema with paths and operations and save it to a file named openapi.yaml.
Our api.py file imports Pet and Owner models from models.py.
We'll use the models.py file from the previous steps to define the Pydantic models for Pet and Owner.
In api.py, we then define two response schemas as Pydantic models: PetsResponse and OwnersResponse.
Defining response schemas as Pydantic models allows us to reuse them in multiple operations, and to use them for validation and serialization in our API request handlers.
We'll start by defining a function called construct_base_open_api that returns an OpenAPI object with the base configuration for our API.
The function defines the API title, version, and servers, and includes the paths for the /pets, /pets/{pet_id}, and /owners endpoints.
The /pets path includes two operations: GET to list all pets and POST to create a pet.
The GET operation returns a list of pets using the PetsResponse schema.
Note that we added operationId and description fields to the operations to provide additional information about the operation.
Clear operation IDs and descriptions help API users understand the purpose of each operation and allow SDK generators to create more informative client code.
The POST operation creates a pet using the Pet schema as the request body and returns the created pet using the Pet schema.
We use the PydanticSchema class from openapi-pydantic to reference the Pydantic model in the OpenAPI schema.
In a real-world application, you would likely not include the pet's ID in the request body as the server would generate the ID, but for simplicity, we include it here.
This translates to the following OpenAPI operation:
The /pets/{pet_id} path includes a GET operation to get a pet by ID.
The operation includes a path parameter pet_id to specify the ID of the pet to retrieve.
The GET operation's parameters field includes the path parameter pet_id with a description, required flag, and schema definition.
The responses field includes a 200 response with the Pet schema as the response body.
This translates to the following OpenAPI operation.
Note how the generated schema closely resembles the Pydantic model.
We'll leave out the rest of the openapi.yaml file, as it is similar to the components generated in the previous section.
Generating an SDK from the OpenAPI Schema
Now that we have a complete OpenAPI schema with paths and operations, we can use it to generate an SDK client for our API.
Prerequisites for SDK Generation
Install Speakeasy by following the Speakeasy installation instructions.
On macOS, you can install Speakeasy using Homebrew:
Authenticate with Speakeasy using the following command:
Generate an SDK Using Speakeasy
Run the following command to generate an SDK from the openapi.yaml file:
Follow the onscreen prompts to provide the necessary configuration details for your new SDK, such as the name, schema location, and output path. Enter openapi.yaml when prompted for the OpenAPI document location and select TypeScript when prompted for which language you would like to generate.
Adding Speakeasy Extensions to the OpenAPI Schema
Speakeasy uses OpenAPI extensions to provide additional information for generating SDKs.
We can add extensions using OpenAPI Overlays, which are YAML files that Speakeasy overlays on top of the OpenAPI schema.
Alternatively, you can add extensions directly to the OpenAPI schema using the x- prefix.
For example, you can add the x-speakeasy-retries extension to have Speakeasy generate retry logic in the SDK.
Import the Dict and Any types from the typing module in api.py, and ConfigDict from pydantic.
We'll use these types to define the x-speakeasy-retries extension in the OpenAPI schema.
In the OpenAPIwithRetries class, we define the x-speakeasy-retries extension.
Note that we need to use the alias parameter to define the extension with the x- prefix, then allow ourselves to use the xSpeakeasyRetries attribute in the class by setting populate_by_name=True in the model_config.
We then update the construct_base_open_api function to return an OpenAPIwithRetries object.
Add xSpeakeasyRetries to the OpenAPIwithRetries object in the construct_base_open_api function.
This translates to the following OpenAPI schema:
Add Tags to the OpenAPI Schema
To group operations in the OpenAPI schema, you can use tags. This also allows Speakeasy to structure the generated SDK code and documentation logically.
Add a tags field to the OpenAPIwithRetries object, then add a tags field to each operation in the construct_base_open_api function:
Run python api.py to update the openapi.yaml file with the tags field, then regenerate the SDK using Speakeasy.
Speakeasy will detect the changes to your OpenAPI schema, generate the SDK with the updated tags, and automatically increment the SDK's version number.
Take a look at the generated SDK to see how Speakeasy groups operations by tags.
We Can Help Get Your Pydantic Models Ready for SDK Generation
In this tutorial, we learned how to generate an OpenAPI schema from Pydantic models and use it to generate an SDK client using Speakeasy.
If you would like to discuss how to get your Pydantic models ready for SDK generation, give us feedback, or shoot the breeze about all things OpenAPI and SDKs, join our Slack.