JSONL responses in OpenAPI
JSON Lines (JSONL) is a convenient format for storing structured data that may be processed one record at a time. It’s a simple format where each line is a valid JSON value, typically a JSON object or array. JSONL is particularly useful for handling large datasets, streaming data, or log files where each line represents a separate record.
Understanding JSONL format
JSONL (also known as newline-delimited JSON) consists of a sequence of JSON values, usually objects, with each value on its own line. Each line must be a valid JSON value, and lines are separated by a newline character (\n). More details on the format can be found in the JSON Lines documentation.
Here’s an example of a JSONL file:
{"name": "Alice", "age": 30, "city": "New York"}
{"name": "Bob", "age": 25, "city": "San Francisco"}
{"name": "Charlie", "age": 35, "city": "Chicago"}
JSONL offers several advantages over traditional JSON:
- Streaming: JSONL can be processed one line at a time, making it ideal for streaming applications.
- Append-friendly: New records can be easily appended to the end of a JSONL file.
- Memory-efficient: Processing JSONL doesn’t require loading the entire dataset into memory.
- Parallelization: JSONL data can be easily split and processed in parallel.
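As a rough illustration of the streaming and append-friendly properties listed above, here is a minimal sketch that uses only the Python standard library. The file name users.jsonl and the helper names append_record and iter_records are assumptions made for this example, not part of any API.

```python
import json
import os
import tempfile


def append_record(path, record):
    """Append one record to the file as a single JSON line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def iter_records(path):
    """Yield one parsed record per non-empty line, without loading the whole file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


# Hypothetical file in a temporary directory, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "users.jsonl")
append_record(path, {"name": "Alice", "age": 30, "city": "New York"})
append_record(path, {"name": "Bob", "age": 25, "city": "San Francisco"})

names = [record["name"] for record in iter_records(path)]
print(names)
```

Appending never rewrites earlier lines, and the reader holds only one record in memory at a time.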
Defining JSONL responses in OpenAPI documents
OpenAPI v3.2.0 JSONL support
OpenAPI v3.2.0 recognizes application/jsonl as a sequential media type and introduces the itemSchema keyword for defining the structure of individual records in the stream.
JSONL responses can be defined in OpenAPI by using the application/jsonl or text/jsonl MIME type. In OpenAPI v3.0 and v3.1, JSONL isn’t natively supported, but these content types indicate that the response will be in JSONL format. OpenAPI v3.2.0 adds native support through the itemSchema keyword, which defines the schema for each line in the JSONL stream.
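The following example defines a JSONL response in the style that works with OpenAPI v3.0 and v3.1, attaching the record schema directly to the application/jsonl content type: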
paths:
  /users/export:
    get:
      tags:
        - Users
      summary: Export user data in JSONL format
      description: >
        This endpoint returns user data in JSONL format, with each line containing a complete user record.
        This format is ideal for large datasets that need to be processed one record at a time.
      responses:
        '200':
          description: User data in JSONL format
          content:
            application/jsonl:
              schema:
                $ref: '#/components/schemas/User'
        '400':
          description: Invalid request
        '500':
          description: Internal server error
components:
  schemas:
    User:
      type: object
      required: [id, name, email]
      properties:
        id:
          type: string
          format: uuid
          description: Unique identifier for the user
        name:
          type: string
          description: User's full name
        email:
          type: string
          format: email
          description: User's email address
        age:
          type: integer
          description: User's age
        city:
          type: string
          description: User's city of residence
In this example, the /users/export endpoint returns user data in JSONL format. Each line of the response will be a valid JSON object representing a user, as defined by the User schema.
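Tooling built for OpenAPI v3.0 and v3.1 may not apply the schema line by line, so a client might want to check each record itself. The sketch below is one possible approach, assuming the required fields of the User schema above. The names REQUIRED_FIELDS and parse_user_lines are hypothetical, and the check is far lighter than full JSON Schema validation (it ignores format: uuid and format: email, for instance).

```python
import json

# Mirrors the `required` list of the User schema; an assumption for this sketch.
REQUIRED_FIELDS = ("id", "name", "email")


def parse_user_lines(body):
    """Split a JSONL response body into valid records and (line number, reason) errors."""
    valid, errors = [], []
    # Split on "\n" only: JSONL lines are delimited by the newline character.
    for number, line in enumerate(body.split("\n"), start=1):
        if not line.strip():
            continue  # tolerate blank lines, such as the one after a trailing newline
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            errors.append((number, "record is not a JSON object"))
            continue
        missing = [field for field in REQUIRED_FIELDS if field not in record]
        if missing:
            errors.append((number, "missing fields: " + ", ".join(missing)))
        else:
            valid.append(record)
    return valid, errors


# Made-up sample body: one valid record, one missing `email`, one malformed line.
body = (
    '{"id": "1", "name": "Alice", "email": "alice@example.com"}\n'
    '{"id": "2", "name": "Bob"}\n'
    "not json\n"
)
valid, errors = parse_user_lines(body)
print(len(valid), [number for number, _ in errors])
```

Collecting errors per line, rather than failing on the first bad record, keeps one malformed line from discarding the rest of the export.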
Using itemSchema in OpenAPI 3.2
With OpenAPI v3.2.0, the itemSchema keyword explicitly defines the schema for each line in the JSONL stream:
paths:
  /users/export:
    get:
      summary: Export user data in JSONL format
      responses:
        "200":
          description: User data in JSONL format
          content:
            application/jsonl:
              schema:
                type: string
              itemSchema:
                $ref: "#/components/schemas/User"
The schema describes the overall stream, while itemSchema defines the structure of each individual JSON object per line. This allows tooling to generate typed deserialization for each record in the stream.
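To give a sense of what typed deserialization per record could look like, here is a hand-written sketch. The User dataclass mirrors the User schema from this document, but the class and the parse_users helper are assumptions for illustration and not the output of any particular generator.

```python
import json
from dataclasses import dataclass, fields
from typing import Iterable, Iterator, Optional


@dataclass
class User:
    # Required by the User schema.
    id: str
    name: str
    email: str
    # Optional in the User schema.
    age: Optional[int] = None
    city: Optional[str] = None


def parse_users(lines: Iterable[str]) -> Iterator[User]:
    """Turn each non-empty JSONL line into a User, ignoring unknown keys."""
    known = {f.name for f in fields(User)}
    for line in lines:
        if line.strip():
            data = json.loads(line)
            yield User(**{k: v for k, v in data.items() if k in known})


# Made-up sample lines for the demonstration.
lines = [
    '{"id": "1", "name": "Alice", "email": "alice@example.com"}',
    '{"id": "2", "name": "Bob", "email": "bob@example.com", "age": 25}',
]
users = list(parse_users(lines))
print(users[1])
```

A line that lacks a required field raises a TypeError when User is constructed, which surfaces schema violations at the record where they occur.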
Client-side handling of JSONL responses
When working with JSONL responses, clients need to process the data line by line. Here’s an example of how to handle JSONL responses using a Python SDK generated by Speakeasy:
from openapi import SDK

with SDK() as sdk:
    res = sdk.users.get_users_export()
    with res as jsonl_stream:
        for event in jsonl_stream:
            # handle event
            print(f"User: {event['name']}, Email: {event['email']}")
In this example, the SDK handles the streaming of JSONL data, allowing you to process each record as it arrives. The context manager (with res as jsonl_stream) ensures proper resource cleanup after processing.
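Without a generated SDK, a client has to reassemble lines itself, because network chunks rarely end on line boundaries. The following is a minimal sketch of that buffering logic using only the standard library. The iter_jsonl helper is a hypothetical name, and the list of byte chunks stands in for whatever an HTTP client would yield while streaming the response body.

```python
import json
from typing import Any, Iterable, Iterator


def iter_jsonl(chunks: Iterable[bytes]) -> Iterator[Any]:
    """Yield one parsed JSON value per line from a stream of byte chunks."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Emit every complete line currently in the buffer.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            if line.strip():
                yield json.loads(line)
    # The final record may arrive without a trailing newline.
    if buffer.strip():
        yield json.loads(buffer)


# Simulated chunks that split records at arbitrary byte offsets.
chunks = [
    b'{"name": "Al',
    b'ice"}\n{"name"',
    b': "Bob"}\n',
    b'{"name": "Charlie"}',
]
names = [record["name"] for record in iter_jsonl(chunks)]
print(names)
```

Splitting on the newline byte before decoding is safe for UTF-8, since no multi-byte UTF-8 sequence contains the byte 0x0A.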
Best practices for JSONL API design
When designing APIs that return JSONL responses, consider the following best practices:
Use appropriate content types
Use the application/jsonl or text/jsonl content type to clearly indicate that the response is in JSONL format. This helps clients understand how to process the response correctly.
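On the client side, the content type can serve as the switch between whole-document and line-by-line parsing. A small sketch of such a check follows. The is_jsonl helper and the JSONL_TYPES set are assumptions for this example, covering only the two content types named above.

```python
# The two content types this document recommends for JSONL responses.
JSONL_TYPES = {"application/jsonl", "text/jsonl"}


def is_jsonl(content_type: str) -> bool:
    """Return True if a Content-Type header value names a JSONL media type."""
    # Drop parameters such as "; charset=utf-8" and compare case-insensitively.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in JSONL_TYPES


print(is_jsonl("application/jsonl; charset=utf-8"))
print(is_jsonl("application/json"))
```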
Include clear documentation
Provide clear documentation about the JSONL format and how clients should process it. Include examples of the response format and client-side code for handling JSONL data.
Consider pagination for large datasets
Even though JSONL is efficient for streaming large datasets, consider implementing pagination to allow clients to request smaller chunks of data. This can be done using query parameters like limit and offset.
paths:
  /users/export:
    get:
      parameters:
        - name: limit
          in: query
          description: Maximum number of users to return
          schema:
            type: integer
            default: 100
        - name: offset
          in: query
          description: Number of users to skip
          schema:
            type: integer
            default: 0
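On the server side, the limit and offset parameters could be applied to any iterable of records before serializing each one as a line. The sketch below shows one way to do this. The export_page function and the in-memory users list are hypothetical, and a real implementation would more likely push limit and offset down into the database query.

```python
import json
from itertools import islice
from typing import Iterable, Iterator


def export_page(users: Iterable[dict], limit: int = 100, offset: int = 0) -> Iterator[str]:
    """Yield one JSONL line per user, skipping `offset` users and stopping after `limit`."""
    for user in islice(users, offset, offset + limit):
        yield json.dumps(user) + "\n"


# Made-up in-memory data standing in for a real data source.
users = [{"id": str(n), "name": f"user-{n}"} for n in range(5)]
page = list(export_page(users, limit=2, offset=1))
ids = [json.loads(line)["id"] for line in page]
print(ids)
```

Because export_page is a generator, a web framework that supports streaming responses can send each line as soon as it is produced.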