Structured output schemas

How to define JSON schemas for AI output.

MailSlurp AI transforms convert emails, attachments, and SMS messages into structured data using AI models. You define the shape of the data you want returned using a structured output schema, and MailSlurp extracts results in that shape. In transformer pipelines, those results are linked back to the source email, attachment, or SMS so they can be reviewed in MailSlurp, fetched by API, or sent to downstream systems.

Output schema

MailSlurp uses a subset of OpenAPI and JSON-schema style fields to define the response shape your AI extraction should use. Define the properties, types, and constraints you expect, and MailSlurp uses that schema as the contract for structured extraction.

Build schemas using the supported fields below rather than the full JSON Schema or OpenAPI feature set.

Supported schema fields

Field                           What it does
type                            Sets the value type. Supported types are string, number, integer, boolean, object, array, and null.
description                     Explains the business meaning of a field so the model knows what to extract.
properties                      Defines child fields for an object schema. Each property is another StructuredOutputSchema.
items                           Defines the schema for each element in an array.
required                        Lists the property names that must exist on the containing object.
nullable                        Allows a field to be returned as null when the source content does not contain a reliable value.
format                          Supported string formats are date-time and enum.
enumValues                      Supplies the allowed values when type is string and format is enum.
pattern                         Regex pattern for string fields when you need identifiers or dates in a strict format.
minimum / maximum               Numeric bounds for number or integer fields.
minLength / maxLength           Length bounds for strings.
minItems / maxItems             Size bounds for arrays.
minProperties / maxProperties   Size bounds for objects.
example / default               Shows the shape and expected values you want the model to follow.
propertyOrdering                Sets the preferred order of properties in generated JSON when stable ordering matters downstream.
title                           Optional schema name for readability and reuse in your own codebase.

For most extraction workflows the top-level schema should be an object. Use nested object and array fields for line items, parties, addresses, or repeated records.
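Because only a subset of JSON Schema is supported, it can help to check a draft schema for unsupported keywords before submitting it. The following is a minimal local sketch, not part of MailSlurp: the helper name find_unsupported_fields is hypothetical, and the allowed field names are copied from the table above.

```python
# Field names copied from the "Supported schema fields" table.
SUPPORTED_FIELDS = {
    "type", "description", "properties", "items", "required", "nullable",
    "format", "enumValues", "pattern", "minimum", "maximum",
    "minLength", "maxLength", "minItems", "maxItems",
    "minProperties", "maxProperties", "example", "default",
    "propertyOrdering", "title",
}


def find_unsupported_fields(schema: dict, path: str = "$") -> list:
    """Return the paths of schema keys that are not in SUPPORTED_FIELDS."""
    problems = [f"{path}.{key}" for key in schema if key not in SUPPORTED_FIELDS]
    # Property names are user-defined, so only their child schemas are checked.
    for name, child in schema.get("properties", {}).items():
        problems += find_unsupported_fields(child, f"{path}.properties.{name}")
    # Array element schemas are checked the same way.
    if isinstance(schema.get("items"), dict):
        problems += find_unsupported_fields(schema["items"], f"{path}.items")
    return problems


draft = {
    "type": "object",
    "properties": {
        "status": {"type": "string", "oneOf": [{"const": "paid"}]},
        "lineItems": {
            "type": "array",
            "items": {"type": "object", "additionalProperties": False},
        },
    },
}

unsupported = find_unsupported_fields(draft)
print(unsupported)
```

A check like this only flags keyword names. Whether MailSlurp accepts a schema is still decided by its own validation.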

Example response types

Suppose you want the extracted invoice data to have the following shape:

TypeScript

type DesiredOutput = {
  invoiceId: string;
  status: 'paid' | 'pending';
  lineItems: { name: string; amount: number }[];
}

C#

using System.Collections.Generic;

// Concrete (non-abstract) records so a JSON deserializer can instantiate them.
public record DesiredOutput(
    string InvoiceId,
    DesiredOutput.InvoiceStatus Status,
    List<DesiredOutput.LineItem> LineItems
)
{
    public enum InvoiceStatus
    {
        Paid,
        Pending
    }

    public record LineItem(string Name, decimal Amount);
}

Java

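import com.fasterxml.jackson.annotation.JsonProperty;
import java.math.BigDecimal;
import java.util.List;

public record DesiredOutput(
        @JsonProperty("invoiceId") String invoiceId,
        @JsonProperty("status") InvoiceStatus status,
        @JsonProperty("lineItems") List<LineItem> lineItems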
) {
  public enum InvoiceStatus {
    @JsonProperty("paid") PAID,
    @JsonProperty("pending") PENDING
  }

  public record LineItem(
          @JsonProperty("name") String name,
          @JsonProperty("amount") BigDecimal amount
  ) {}
}

Python

from enum import Enum
from decimal import Decimal
from typing import List
from pydantic import BaseModel, Field

class InvoiceStatus(str, Enum):
    PAID = "paid"
    PENDING = "pending"

class LineItem(BaseModel):
    name: str = Field(..., alias="name")
    amount: Decimal = Field(..., alias="amount")

class DesiredOutput(BaseModel):
    invoice_id: str            = Field(..., alias="invoiceId")
    status: InvoiceStatus      = Field(..., alias="status")
    line_items: List[LineItem] = Field(..., alias="lineItems")

    # Pydantic v1-style config. In Pydantic v2 the first option is named populate_by_name.
    class Config:
        allow_population_by_field_name = True
        use_enum_values = True

You can define a corresponding schema for this type like so:

TypeScript

const schema: StructuredOutputSchema = {
  type: StructuredOutputSchemaTypeEnum.object,
  description: 'Schema for an invoice email',
  properties: {
    invoiceId: {
      type: StructuredOutputSchemaTypeEnum.string,
      description: 'The invoice ID'
    },
    status: {
      type: StructuredOutputSchemaTypeEnum.string,
      format: 'enum',
      description: 'The status of the invoice',
      _enum: ['paid', 'pending']
    },
    lineItems: {
      type: StructuredOutputSchemaTypeEnum.array,
      description: 'The items on the invoice',
      items: {
        type: StructuredOutputSchemaTypeEnum.object,
        description: 'A line item',
        properties: {
          name: {
            type: StructuredOutputSchemaTypeEnum.string,
            description: 'Name of the line item'
          },
          amount: {
            type: StructuredOutputSchemaTypeEnum.number,
            description: 'Price in $'
          }
        }
      }
    }
  }
};

C#

var schema = new StructuredOutputSchema(
    type: StructuredOutputSchema.TypeEnum.Object,
    description: "Schema for an invoice email",
    properties: new Dictionary<string, StructuredOutputSchema>
    {
        {
            "invoiceId",
            new StructuredOutputSchema(
                type: StructuredOutputSchema.TypeEnum.String,
                description: "The invoice ID"
            )
        },
        {
            "status",
            new StructuredOutputSchema(
                type: StructuredOutputSchema.TypeEnum.String,
                format: "enum",
                description: "The status of the invoice",
                varEnum: ["paid", "pending"]
            )
        },
        {
            "lineItems", new StructuredOutputSchema(type: StructuredOutputSchema.TypeEnum.Array,
                description: "The items on the invoice",
                items: new StructuredOutputSchema(
                    type: StructuredOutputSchema.TypeEnum.Object, description: "A line item",
                    properties: new Dictionary<string, StructuredOutputSchema>
                    {
                        {
                            "name",
                            new StructuredOutputSchema(
                                type: StructuredOutputSchema.TypeEnum.String,
                                description: "Name of the line item"
                            )
                        },
                        {
                            "amount",
                            new StructuredOutputSchema(
                                type: StructuredOutputSchema.TypeEnum.Number,
                                description: "Price in $"
                            )
                        }
                    }
                )
            )
        }
    }
);

Java

var schema =
    new StructuredOutputSchema()
        .type(StructuredOutputSchema.TypeEnum.OBJECT)
        .description("Schema for an invoice email")
        .properties(Map.ofEntries(
            Map.entry("invoiceId",
                new StructuredOutputSchema()
                    .type(StructuredOutputSchema.TypeEnum.STRING)
                    .description("The invoice ID")),
            Map.entry("status",
                new StructuredOutputSchema()
                    .type(StructuredOutputSchema.TypeEnum.STRING)
                    .format("enum")
                    .description("The status of the invoice")
                    ._enum(List.of("paid", "pending"))),
            Map.entry("lineItems",
                new StructuredOutputSchema()
                    .type(StructuredOutputSchema.TypeEnum.ARRAY)
                    .description("The items on the invoice")
                    .items(new StructuredOutputSchema()
                        .type(StructuredOutputSchema.TypeEnum.OBJECT)
                        .description("A line item")
                        .properties(Map.ofEntries(
                            Map.entry("name",
                                new StructuredOutputSchema()
                                    .type(StructuredOutputSchema.TypeEnum.STRING)
                                    .description("Name of the line item")),
                            Map.entry("amount",
                                new StructuredOutputSchema()
                                    .type(StructuredOutputSchema.TypeEnum.NUMBER)
                                    .description("Price in $"))
                        ))
                    )
            )
        ));

Python

schema = mailslurp_client.StructuredOutputSchema(
    type='object',
    description="Schema for an invoice email",
    properties={
        "invoiceId": mailslurp_client.StructuredOutputSchema(
            type='string',
            description="The invoice ID",
        ),
        "status": mailslurp_client.StructuredOutputSchema(
            type='string',
            format="enum",
            description="The status of the invoice",
            enum=["paid", "pending"],
        ),
        "lineItems": mailslurp_client.StructuredOutputSchema(
            type='array',
            description="The items on the invoice",
            items=mailslurp_client.StructuredOutputSchema(
                type='object',
                description="A line item",
                properties={
                    "name": mailslurp_client.StructuredOutputSchema(
                        type='string',
                        description="Name of the line item",
                    ),
                    "amount": mailslurp_client.StructuredOutputSchema(
                        type='number',
                        description="Price in $",
                    ),
                }
            ),
        ),
    }
)
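
# `ai_controller` is assumed to be an AI controller API instance created elsewhere with your API key.
# Validate the schema, then read the example output MailSlurp generates for it.
validate = ai_controller.validate_structured_output_schema(schema)
assert validate.valid
json_str = validate.example_output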

Defining schemas with prompts

MailSlurp supports a prompt-first authoring workflow when you create AI transformers. Instead of hand-authoring JSON first, you can describe the record you want extracted and let MailSlurp derive a starting schema from that prompt before you save the transformer.

This works best when your prompt is explicit about:

  • what one result represents
  • whether the result should be a single object or an array
  • the exact field names you want returned
  • the expected formats for dates, IDs, currency codes, and statuses
  • how missing values should be handled

For example, a useful extraction prompt might look like:

Extract one invoice record from each attachment.

Return these fields:
- invoiceNumber: supplier invoice reference
- invoiceDate: YYYY-MM-DD string
- dueDate: YYYY-MM-DD string or null
- currency: one of USD, GBP, EUR
- supplier.name: company name
- totalAmount: numeric total
- lineItems: array of objects with description, quantity, unitPrice, and lineTotal

If a value is missing or ambiguous, return null rather than guessing.

Use prompt-derived schemas as a first draft, not a final contract. After MailSlurp suggests the schema, review it and tighten:

  • required fields so only truly mandatory values are enforced
  • enumValues for status-like fields
  • pattern for dates or identifiers
  • nullable for optional fields
  • nested items and properties for arrays and objects

Prompt-first design is a good fit when you know the business record you want but do not yet know the final JSON structure.
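The tightening pass described above can also be applied programmatically once you have a draft schema as plain data. This is a rough sketch under stated assumptions: the tighten helper is hypothetical and not a MailSlurp function, the draft stands in for whatever schema MailSlurp suggests, and the field names follow the supported fields table.

```python
import copy


def tighten(draft, required=(), enums=None, patterns=None, nullable=()):
    """Return a stricter copy of a draft object schema; the draft is left unchanged."""
    schema = copy.deepcopy(draft)
    props = schema["properties"]
    # Enforce only the fields that are truly mandatory.
    schema["required"] = list(required)
    # Status-like fields become string enums with fixed allowed values.
    for name, values in (enums or {}).items():
        props[name].update(type="string", format="enum", enumValues=list(values))
    # Dates and identifiers get a strict regex pattern.
    for name, regex in (patterns or {}).items():
        props[name]["pattern"] = regex
    # Optional fields may come back as null instead of a guessed value.
    for name in nullable:
        props[name]["nullable"] = True
    return schema


# Stand-in for a prompt-derived first draft: every field is a loose string.
draft = {
    "type": "object",
    "properties": {
        "invoiceNumber": {"type": "string"},
        "invoiceDate": {"type": "string"},
        "dueDate": {"type": "string"},
        "currency": {"type": "string"},
    },
}

DATE_ONLY = r"^\d{4}-\d{2}-\d{2}$"
tightened = tighten(
    draft,
    required=["invoiceNumber", "invoiceDate", "currency"],
    enums={"currency": ["USD", "GBP", "EUR"]},
    patterns={"invoiceDate": DATE_ONLY, "dueDate": DATE_ONLY},
    nullable=["dueDate"],
)
```

Keeping the draft untouched makes it easy to compare the suggested and tightened versions in code review.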

Defining schemas in visual editor

The visual editor in the MailSlurp app is the fastest way to build and refine schemas when you are iterating on live email, attachment, or SMS examples.

A practical workflow is:

  1. Start with a top-level object.
  2. Add the first-class business fields you care about, such as IDs, dates, amounts, statuses, and participants.
  3. Use nested object fields for grouped data and array fields for repeated records like line items.
  4. Add description text to fields that could be interpreted in multiple ways.
  5. Mark only the fields that must always exist as required.
  6. Test the schema against realistic samples before attaching it to inbox or phone mappings.

When refining fields in the editor, these settings usually matter most:

  • Use format: enum with enumValues for normalized labels such as PAID, DUE, or FAILED.
  • Use pattern for invoice numbers, tracking IDs, or custom date strings.
  • Use nullable: true when the source material often omits a value.
  • Use minItems or maxItems when arrays should have practical bounds.
  • Use propertyOrdering when a downstream system expects stable field order.

The visual editor is especially useful after prompt-based generation because you can turn a broad first-pass schema into a tighter extraction contract without hand-editing every nested field.

Defining schemas in code

Hand-authoring schemas in JSON is the best option when you want schema definitions in source control, code review, and automated deployment flows.

A typical StructuredOutputSchema for invoice extraction might look like this:

{
  "title": "InvoiceExtraction",
  "description": "Structured invoice data extracted from supplier emails or attachments",
  "type": "object",
  "propertyOrdering": [
    "invoiceNumber",
    "invoiceDate",
    "dueDate",
    "currency",
    "supplier",
    "lineItems",
    "totalAmount"
  ],
  "required": [
    "invoiceNumber",
    "invoiceDate",
    "currency",
    "lineItems",
    "totalAmount"
  ],
  "properties": {
    "invoiceNumber": {
      "type": "string",
      "description": "Supplier invoice reference",
      "pattern": "^[A-Z0-9-]+$",
      "example": "INV-10452"
    },
    "invoiceDate": {
      "type": "string",
      "description": "Invoice date in YYYY-MM-DD format",
      "pattern": "^\\d{4}-\\d{2}-\\d{2}$",
      "example": "2026-03-22"
    },
    "dueDate": {
      "type": "string",
      "description": "Payment due date in YYYY-MM-DD format",
      "pattern": "^\\d{4}-\\d{2}-\\d{2}$",
      "nullable": true
    },
    "currency": {
      "type": "string",
      "format": "enum",
      "enumValues": ["USD", "GBP", "EUR"]
    },
    "supplier": {
      "type": "object",
      "required": ["name"],
      "properties": {
        "name": {
          "type": "string"
        },
        "vatNumber": {
          "type": "string",
          "nullable": true
        }
      }
    },
    "lineItems": {
      "type": "array",
      "minItems": 1,
      "items": {
        "type": "object",
        "required": ["description", "quantity", "unitPrice", "lineTotal"],
        "properties": {
          "description": {
            "type": "string"
          },
          "quantity": {
            "type": "number",
            "minimum": 0
          },
          "unitPrice": {
            "type": "number",
            "minimum": 0
          },
          "lineTotal": {
            "type": "number",
            "minimum": 0
          }
        }
      }
    },
    "totalAmount": {
      "type": "number",
      "minimum": 0
    }
  }
}
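To see what a schema like this asks of a result, it can be useful to check sample payloads locally in tests. The sketch below is an illustration only: check_result is a hypothetical helper that covers a handful of the supported fields (required, nullable, pattern, enumValues, minimum, minItems) on a trimmed-down version of the schema, and it makes no claim about how MailSlurp itself enforces them.

```python
import re


def check_result(value, schema, path="$"):
    """Return a list of problems; an empty list means the value fits the schema."""
    if value is None:
        return [] if schema.get("nullable") else [f"{path}: null not allowed"]
    kind = schema.get("type")
    problems = []
    if kind == "object":
        if not isinstance(value, dict):
            return [f"{path}: expected object"]
        for name in schema.get("required", []):
            if name not in value:
                problems.append(f"{path}.{name}: missing required field")
        for name, child in schema.get("properties", {}).items():
            if name in value:
                problems += check_result(value[name], child, f"{path}.{name}")
    elif kind == "array":
        if not isinstance(value, list):
            return [f"{path}: expected array"]
        if len(value) < schema.get("minItems", 0):
            problems.append(f"{path}: fewer than {schema['minItems']} items")
        for index, item in enumerate(value):
            problems += check_result(item, schema.get("items", {}), f"{path}[{index}]")
    elif kind == "string":
        if not isinstance(value, str):
            return [f"{path}: expected string"]
        if "pattern" in schema and not re.search(schema["pattern"], value):
            problems.append(f"{path}: does not match {schema['pattern']}")
        allowed = schema.get("enumValues", [])
        if schema.get("format") == "enum" and value not in allowed:
            problems.append(f"{path}: not one of {allowed}")
    elif kind == "number":
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return [f"{path}: expected number"]
        if "minimum" in schema and value < schema["minimum"]:
            problems.append(f"{path}: below minimum {schema['minimum']}")
    return problems


DATE_ONLY = r"^\d{4}-\d{2}-\d{2}$"
schema = {
    "type": "object",
    "required": ["invoiceNumber", "invoiceDate", "currency", "lineItems"],
    "properties": {
        "invoiceNumber": {"type": "string", "pattern": "^[A-Z0-9-]+$"},
        "invoiceDate": {"type": "string", "pattern": DATE_ONLY},
        "dueDate": {"type": "string", "pattern": DATE_ONLY, "nullable": True},
        "currency": {"type": "string", "format": "enum", "enumValues": ["USD", "GBP", "EUR"]},
        "lineItems": {
            "type": "array",
            "minItems": 1,
            "items": {
                "type": "object",
                "required": ["description", "lineTotal"],
                "properties": {
                    "description": {"type": "string"},
                    "lineTotal": {"type": "number", "minimum": 0},
                },
            },
        },
    },
}

good = {
    "invoiceNumber": "INV-10452",
    "invoiceDate": "2026-03-22",
    "dueDate": None,
    "currency": "USD",
    "lineItems": [{"description": "Hosting", "lineTotal": 120.0}],
}
bad = {"invoiceNumber": "INV-10452", "invoiceDate": "22/03/2026", "currency": "AUD", "lineItems": []}

good_problems = check_result(good, schema)
bad_problems = check_result(bad, schema)
```

Here the first payload passes, while the second is flagged for its date format, its currency, and its empty lineItems array.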

Validate schemas before using them

MailSlurp exposes a validation endpoint at POST /ai/structured-content/validate. Use it before you attach a schema to production traffic.

curl --request POST \
  --url https://api.mailslurp.com/ai/structured-content/validate \
  --header 'content-type: application/json' \
  --header 'x-api-key: YOUR_API_KEY' \
  --data @schema.json

Validation responses include:

  • valid: whether the schema can be used for structured extraction
  • errors: validation messages if the schema is not acceptable
  • exampleOutput: an example JSON payload generated from the schema
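The same call can be made from Python using only the standard library. This is a sketch that mirrors the curl example: it assumes the request body is the schema JSON itself (the contents of schema.json are not shown here) and that errors is a list of messages. The function names are hypothetical.

```python
import json
import urllib.request

VALIDATE_URL = "https://api.mailslurp.com/ai/structured-content/validate"


def build_validation_request(schema: dict, api_key: str) -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    return urllib.request.Request(
        VALIDATE_URL,
        data=json.dumps(schema).encode("utf-8"),
        headers={"content-type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def validate_schema(schema: dict, api_key: str) -> dict:
    """Send the request and return the decoded JSON response (makes a network call)."""
    with urllib.request.urlopen(build_validation_request(schema, api_key)) as response:
        return json.load(response)


def summarize(validation: dict) -> str:
    """Turn a validation response into a one-line status using the documented fields."""
    if validation.get("valid"):
        return "schema accepted"
    # Assumption: `errors` is a list of messages; str() keeps this safe for other shapes.
    messages = "; ".join(str(error) for error in validation.get("errors") or [])
    return "schema rejected: " + messages


# Building the request does not contact the API; call validate_schema to send it.
request = build_validation_request({"type": "object"}, "YOUR_API_KEY")
```

Running a check like this in CI, and failing the build when valid is false, keeps a broken schema from reaching production traffic.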

Use schemas in direct extraction calls

You can pass the same outputSchema object directly to MailSlurp AI endpoints:

  • POST /ai/structured-content/email with emailId and optional contentSelector, instructions, or transformId
  • POST /ai/structured-content/attachment with attachmentId and optional emailId, instructions, or transformId
  • POST /ai/structured-content/sms with smsId and optional instructions or transformId

For email extraction, contentSelector supports:

  • RAW
  • BODY
  • BODY_ATTACHMENTS

Direct extraction responses return a result object matching your schema. Saved transformer pipelines persist results separately so they can be reviewed in AI results, paged through, and linked back to the originating email, attachment, or SMS.
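As an illustration of a direct email extraction call, the sketch below assembles a request body from the parameter names listed above. The flat JSON layout and the helper name build_email_extraction_body are assumptions, so confirm the exact request shape in the API reference before relying on it.

```python
# Content selectors documented for email extraction.
CONTENT_SELECTORS = ("RAW", "BODY", "BODY_ATTACHMENTS")


def build_email_extraction_body(email_id, output_schema, content_selector=None,
                                instructions=None, transform_id=None):
    """Assemble a JSON body for POST /ai/structured-content/email (layout is assumed)."""
    if content_selector is not None and content_selector not in CONTENT_SELECTORS:
        raise ValueError(f"contentSelector must be one of {CONTENT_SELECTORS}")
    body = {"emailId": email_id, "outputSchema": output_schema}
    optional = {
        "contentSelector": content_selector,
        "instructions": instructions,
        "transformId": transform_id,
    }
    # Leave out optional parameters that were not provided.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


body = build_email_extraction_body(
    "YOUR_EMAIL_ID",
    {"type": "object", "properties": {"invoiceId": {"type": "string"}}},
    content_selector="BODY_ATTACHMENTS",
    instructions="Extract one invoice record.",
)
```

Rejecting an unknown contentSelector locally gives a clearer error than waiting for the API to refuse the request.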

Design guidance

  • Start with the smallest schema that is useful to downstream systems.
  • Prefer explicit descriptions over vague field names like value or data.
  • Use enumValues wherever you need normalized statuses or categories.
  • Use nullable for optional fields instead of forcing the model to invent values.
  • If you need a date-only field, use type: string with a pattern, because the only supported format values are date-time and enum.
  • Avoid assuming full JSON Schema support for features like oneOf, allOf, or arbitrary custom keywords.