
Parse HTML Form to Structured JSON

An example request that uses Gaffa to analyze a web form and extract all input fields, their labels, types, and properties into structured JSON.

The following example is a request we've pre-built to show you Gaffa's capabilities against our demo site. You can run this request right here in the Gaffa API Playground.

This example demonstrates how to extract structured information from HTML forms on web pages. Gaffa uses AI to identify form elements and their properties, which is useful for form automation, testing, accessibility audits, and form-filling assistants.

API Request

The request below uses the POST endpoint to open the demo form page, wait for the modal to appear, and then parse the visible form to extract all field information, including labels, input names, placeholders, and dropdown options.

{
  "url": "https://demo.gaffa.dev/simulate/form?loadTime=3&showModal=true&modalDelay=5&formType=address",
  "proxy_location": null,
  "async": false,
  "max_cache_age": 0,
  "settings": {
    "record_request": false,
    "actions": [
      {
        "type": "parse_json",
        "data_schema": {
          "name": "AddressFormSchema",
          "description": "Extracts fields, labels, and placeholders from the demo address form",
          "fields": [
            {
              "type": "string",
              "name": "form_title",
              "description": "The heading or title of the form"
            },
            {
              "type": "object",
              "name": "full_name",
              "description": "Full name input field",
              "fields": [
                {
                  "type": "string",
                  "name": "label",
                  "description": "The visible label text"
                },
                {
                  "type": "string",
                  "name": "placeholder",
                  "description": "Placeholder text shown in the input"
                },
                {
                  "type": "string",
                  "name": "input_name",
                  "description": "The name attribute of the input element"
                }
              ]
            },
            {
              "type": "object",
              "name": "address_line_1",
              "description": "First address line input field",
              "fields": [
                {
                  "type": "string",
                  "name": "label",
                  "description": "The visible label text"
                },
                {
                  "type": "string",
                  "name": "placeholder",
                  "description": "Placeholder text shown in the input"
                },
                {
                  "type": "string",
                  "name": "input_name",
                  "description": "The name attribute of the input element"
                }
              ]
            },
            {
              "type": "object",
              "name": "address_line_2",
              "description": "Second address line input field",
              "fields": [
                {
                  "type": "string",
                  "name": "label",
                  "description": "The visible label text"
                },
                {
                  "type": "string",
                  "name": "placeholder",
                  "description": "Placeholder text shown in the input"
                },
                {
                  "type": "string",
                  "name": "input_name",
                  "description": "The name attribute of the input element"
                }
              ]
            },
            {
              "type": "object",
              "name": "city",
              "description": "City input field",
              "fields": [
                {
                  "type": "string",
                  "name": "label",
                  "description": "The visible label text"
                },
                {
                  "type": "string",
                  "name": "placeholder",
                  "description": "Placeholder text shown in the input"
                },
                {
                  "type": "string",
                  "name": "input_name",
                  "description": "The name attribute of the input element"
                }
              ]
            },
            {
              "type": "object",
              "name": "postcode",
              "description": "Postcode or ZIP code input field",
              "fields": [
                {
                  "type": "string",
                  "name": "label",
                  "description": "The visible label text"
                },
                {
                  "type": "string",
                  "name": "placeholder",
                  "description": "Placeholder text shown in the input"
                },
                {
                  "type": "string",
                  "name": "input_name",
                  "description": "The name attribute of the input element"
                }
              ]
            },
            {
              "type": "object",
              "name": "country",
              "description": "Country selection dropdown",
              "fields": [
                {
                  "type": "string",
                  "name": "label",
                  "description": "The visible label text"
                },
                {
                  "type": "string",
                  "name": "input_name",
                  "description": "The name attribute of the select element"
                },
                {
                  "type": "array",
                  "name": "options",
                  "description": "Available country options in the dropdown",
                  "fields": [
                    {
                      "type": "string",
                      "name": "value",
                      "description": "The option value or text"
                    }
                  ]
                }
              ]
            }
          ]
        },
        "instruction": "Extract all visible form fields from this address form, including their labels, input names, placeholders, and for dropdown fields, list all available options.",
        "model": "gpt-4o-mini",
        "output_type": "inline"
      }
    ]
  }
}
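The five text inputs in the schema above share the same three sub-fields, so the request body can also be generated rather than written by hand. The Python sketch below is one way to do that, not an official client. The endpoint URL, the `X-API-Key` header name, and the helper names `text_field`, `build_request`, and `send` are assumptions made for illustration; check them against the Gaffa API reference before use.

```python
import json
import urllib.request

# Assumed endpoint; confirm against the Gaffa API reference.
GAFFA_ENDPOINT = "https://api.gaffa.dev/v1/browser/requests"

DEMO_URL = (
    "https://demo.gaffa.dev/simulate/form"
    "?loadTime=3&showModal=true&modalDelay=5&formType=address"
)

# Text inputs from the request above: schema name -> description.
TEXT_INPUTS = {
    "full_name": "Full name input field",
    "address_line_1": "First address line input field",
    "address_line_2": "Second address line input field",
    "city": "City input field",
    "postcode": "Postcode or ZIP code input field",
}


def string_field(name, description):
    return {"type": "string", "name": name, "description": description}


def text_field(name, description):
    """Schema entry for one text input: label, placeholder, input_name."""
    return {
        "type": "object",
        "name": name,
        "description": description,
        "fields": [
            string_field("label", "The visible label text"),
            string_field("placeholder", "Placeholder text shown in the input"),
            string_field("input_name", "The name attribute of the input element"),
        ],
    }


def build_request():
    """Rebuild the request body shown above."""
    country = {
        "type": "object",
        "name": "country",
        "description": "Country selection dropdown",
        "fields": [
            string_field("label", "The visible label text"),
            string_field("input_name", "The name attribute of the select element"),
            {
                "type": "array",
                "name": "options",
                "description": "Available country options in the dropdown",
                "fields": [string_field("value", "The option value or text")],
            },
        ],
    }
    fields = [string_field("form_title", "The heading or title of the form")]
    fields += [text_field(name, desc) for name, desc in TEXT_INPUTS.items()]
    fields.append(country)
    action = {
        "type": "parse_json",
        "data_schema": {
            "name": "AddressFormSchema",
            "description": "Extracts fields, labels, and placeholders from the demo address form",
            "fields": fields,
        },
        "instruction": (
            "Extract all visible form fields from this address form, including "
            "their labels, input names, placeholders, and for dropdown fields, "
            "list all available options."
        ),
        "model": "gpt-4o-mini",
        "output_type": "inline",
    }
    return {
        "url": DEMO_URL,
        "proxy_location": None,
        "async": False,
        "max_cache_age": 0,
        "settings": {"record_request": False, "actions": [action]},
    }


def send(body, api_key):
    """POST the body as JSON. The X-API-Key header name is an assumption."""
    request = urllib.request.Request(
        GAFFA_ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `send(build_request(), api_key)` with a valid key would submit the request. Adding another text input then only requires a new entry in `TEXT_INPUTS`.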

Actions

Read the full documentation for these actions here.

Response

The parsed form data is returned as a structured JSON object that follows the data_schema defined in the request.
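Once the parsed object is available, it can be flattened into one row per form field for use in tests or audits. The sketch below assumes the parsed object uses the field names from `AddressFormSchema` in the request. Where that object sits inside the full API response is not shown here, so locating it is left to the caller. The function name `flatten_form` and the sample values are hypothetical and are not real output from the demo site.

```python
def flatten_form(parsed):
    """Return one row per field from an object shaped like AddressFormSchema."""
    rows = []
    for key, value in parsed.items():
        if not isinstance(value, dict):
            continue  # skip scalar entries such as form_title
        # Options may arrive as {"value": ...} objects or plain strings;
        # both shapes are handled because the exact form is an assumption.
        options = [
            opt.get("value") if isinstance(opt, dict) else opt
            for opt in (value.get("options") or [])
        ]
        rows.append(
            {
                "field": key,
                "label": value.get("label"),
                "input_name": value.get("input_name"),
                "placeholder": value.get("placeholder"),
                "options": options,
            }
        )
    return rows


# Hypothetical sample values for illustration only.
sample = {
    "form_title": "Example form",
    "city": {"label": "City", "placeholder": "Enter city", "input_name": "city"},
    "country": {
        "label": "Country",
        "input_name": "country",
        "options": [{"value": "A"}, {"value": "B"}],
    },
}

for row in flatten_form(sample):
    print(row["field"], row["input_name"], row["options"])
```

Fields without a placeholder or options, such as the country dropdown or the text inputs respectively, come back with `None` or an empty list rather than raising an error.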
