
Documentation

Get Started

An introduction to the Gaffa Browser API. Learn how you can get started building fast, powerful web automations!

Welcome to the Gaffa documentation site! You'll find everything you need here to get started using the API, including the features you can use to interact with our cloud browsers and examples you can run right away in our API Playground.

Gaffa is currently in its very early stages, so we'd love to hear how we can improve our docs and API to make life easier for our users. If you have any questions or comments, please contact us. To stay up to date with the latest developments, features and news on our mission to support the development of revolutionary AI agents, sign up for our sporadic updates.

Credits and Pricing

View our current pricing plans on the Gaffa

Browser Requests

Browser requests are charged in terms of credits based on the following factors:

Features

Browser Requests

Making web automation requests has never been so simple.

Browser Requests allow you to send the Gaffa API a URL and a list of actions you want carried out, including any outputs you want from the page. We'll carry out the request on our cloud browsers and return the response to you, with no need to worry about proxies, IP rotation, web automation frameworks or scaling.

There's absolutely zero configuration needed and you can interact with Gaffa from any program that can send web requests. We think it's by far the simplest way to automate simple web tasks and the good news is, we're just getting started and have much more planned.

Example request
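As a rough illustration of the idea described above (not the original example from this page), the sketch below assembles a browser request as a plain Python dictionary: one URL plus an ordered list of actions. The action types `wait` and `generate_markdown` and the demo URL come from this documentation; the field names `settings`, `actions`, `selector` and `timeout` are assumptions made for the example and should be checked against the API reference.

```python
import json


def build_browser_request(url, actions):
    """Assemble a request body: one URL plus an ordered list of actions.

    The "settings"/"actions" nesting is an assumption for illustration.
    """
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must be absolute")
    return {"url": url, "settings": {"actions": list(actions)}}


payload = build_browser_request(
    "https://demo.gaffa.dev/simulate/article",
    [
        # "selector" and "timeout" are assumed parameter names.
        {"type": "wait", "selector": "article", "timeout": 5000},
        {"type": "generate_markdown"},
    ],
)
print(json.dumps(payload, indent=2))
```

Keeping the body as ordinary data means any program that can send an HTTP request can produce it, which matches the "zero configuration" point above.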

Block DOM Removals

Beta Feature: This feature is currently in beta and restricted to approved users. If you're interested in trying it, please contact support and we can enable this feature for your account.

Type: block_dom_removals

This action prevents the page from removing elements from the DOM. This is useful if you are trying to scrape data from a JavaScript-based web application that removes items from the page when they are out of view, which can make grabbing data difficult.

Capture Cookies

Beta Feature: This feature is currently in beta and restricted to approved users. If you're interested in trying it, please contact support and we can enable this feature for your account.

Type: capture_cookies

This action will capture the browser cookies currently saved for the web page you are on and return them as a JSON object with key/values.

Parameters

See .

Usage

Capture the cookies of the current page
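Since the cookies come back as a JSON object of key/value pairs, one thing you might do with them is replay them in a later HTTP call. The sketch below turns such an object into a `Cookie` header value. The sample cookie names and values are hypothetical, and the exact shape of the real output should be confirmed against an actual response.

```python
import json


def cookie_header(cookies_json):
    """Turn a JSON object of cookie name/value pairs into a Cookie header value."""
    cookies = json.loads(cookies_json)
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


# Hypothetical output of a capture_cookies action.
sample = '{"session_id": "abc123", "theme": "dark"}'
print(cookie_header(sample))
```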

Capture DOM

Type: capture_dom

This action will capture and return the raw DOM of the page, which you can then extract data from on your end.

For common AI scenarios you may find this returns too much data so we have provided a generate_simplified_dom action which distills the DOM to only the important elements.
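To give a feel for what "extracting data on your end" can look like, here is a minimal sketch that pulls the page title and the top-level headings out of a raw DOM string using only Python's standard library. The HTML sample is made up for the example; a real `capture_dom` output will be much larger.

```python
from html.parser import HTMLParser


class TitleAndHeadings(HTMLParser):
    """Collect the <title> text and all <h1>/<h2> headings from raw HTML."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.headings = []
        self._current = None  # tracked tag we are currently inside, if any

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1", "h2"):
            self._current = tag
            if tag != "title":
                self.headings.append("")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current in ("h1", "h2"):
            self.headings[-1] += data


# Made-up stand-in for a captured DOM.
raw_dom = (
    "<html><head><title>Demo</title></head>"
    "<body><h1>Intro</h1><p>Text</p><h2>More <em>info</em></h2></body></html>"
)
parser = TitleAndHeadings()
parser.feed(raw_dom)
print(parser.title, parser.headings)
```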

Parameters

See .

Usage

Capture the raw DOM of the current page

Example Output

Capture Screenshot

Type: capture_screenshot

Takes a screenshot of the current page. You can choose to take a full screen screenshot showing the whole page or just the current view.

Parameters

Name

Type

Capture Element

Beta Feature: This feature is currently in beta and restricted to approved users. If you're interested in trying it, please contact support and we can enable this feature for your account.

Type: capture_element

Returns the , essentially the contents, of a particular element on the page. This can be used when you are only interested in the contents of a particular element.

Capture Snapshot

Type: capture_snapshot

This output type will return an HTML file which captures a static version of the page state. The page will load offline and can be saved to your local machine.

This will:

Load and embed all images on the page.
Embed all CSS files.

Currently, JavaScript will be disabled and interactivity might not work as expected, but this feature should be useful for preserving the page state as it was and allowing you to view it offline.

Parameters

See

Usage

The following captures the current section of the page currently visible in the browser.

Example Output

Here's an example that shows an offline snapshot of a site

Click

Type: click

Request that the browser clicks a particular element on the page.

Parameters

Name

Type

Required

Description

See .

Usage

Click an element on the page

The following code will wait 1 second and then continue with the next action, if provided.

Wait for a particular element to appear

The following code will wait a maximum of 5 seconds for the logo to appear and then continue with the list of actions.
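The two usage patterns above (a fixed wait, or waiting for an element before continuing) can be sketched as small helper functions that build action dictionaries. Only the `type` values `wait` and `click` are taken from this documentation; the parameter names `selector`, `timeout` and `time`, the millisecond unit and the CSS selectors are assumptions for illustration.

```python
def wait_action(selector=None, time_ms=None, timeout_ms=None):
    """Build a wait action: either a fixed delay or a wait for an element."""
    if (selector is None) == (time_ms is None):
        raise ValueError("pass exactly one of selector or time_ms")
    action = {"type": "wait"}
    if selector is not None:
        action["selector"] = selector  # assumed parameter name
        if timeout_ms is not None:
            action["timeout"] = timeout_ms  # assumed parameter name and unit
    else:
        action["time"] = time_ms  # assumed parameter name and unit
    return action


def click_action(selector):
    """Build a click action for the element matching a CSS selector."""
    return {"type": "click", "selector": selector}


# Wait up to 5 seconds for a (hypothetical) logo, then click a (hypothetical) button.
actions = [
    wait_action(selector=".logo", timeout_ms=5000),
    click_action("#submit"),
]
print(actions)
```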

Download File

Type: download_file

Request a copy of the most recent file viewed in the browser.

Parameters

Name

Type

Generate Markdown

Type: generate_markdown

The markdown output format can export the data of the page (an article, table etc.) in a human- and LLM-readable format which removes unnecessary styling data and other "junk" that is only relevant for the site to work properly.

Gaffa exports with comments removed and unknown tags ignored.

Parameters

Generate Simplified DOM

Type: generate_simplified_dom

When you're looking at the DOM of a web page, there's a lot of unnecessary data that can be discarded if you are only interested in the page's elements or looking to export the data into an LLM. The generate_simplified_dom output format processes the HTML in the following way:

Removes all links in the head

Print

Type: print

Request that the browser prints the page to a PDF.

Parameters

Name

Type

Required

Parse Table

Beta Feature: This feature is currently in beta and restricted to approved users. If you're interested in trying it, please contact support and we can enable this feature for your account.

Type: parse_table

Finds a table on the page with a given selector and then converts the table data into a JSON object.

This action first finds the table headers and converts them into property names by converting them to lower case and replacing non-alphanumeric characters with underscores. It then processes each table row and, for each cell, extracts the contents and saves them as a value. At the moment, all values will be string types.
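The conversion described above can be approximated locally with a few lines of Python. This is only a sketch of the stated rules (lower-case the header, replace each non-alphanumeric character with an underscore, keep every cell as a string); whether Gaffa trims whitespace or collapses runs of underscores is not stated here, so those details are assumptions.

```python
import re


def header_to_property(header):
    """Lower-case a header and replace each non-alphanumeric character with "_"."""
    # Stripping surrounding whitespace first is an assumption.
    return re.sub(r"[^a-z0-9]", "_", header.strip().lower())


def table_to_records(headers, rows):
    """Convert header names and row cells into a list of string-valued dicts."""
    keys = [header_to_property(h) for h in headers]
    return [dict(zip(keys, (str(cell) for cell in row))) for row in rows]


records = table_to_records(
    ["First Name", "Sign-up Date", "Unit Price"],
    [["Ada", "2024-11-22", 12.5]],
)
print(records)
```

Note that the numeric cell comes back as the string "12.5", mirroring the point that all values are currently string types.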

Parameters

Name

Type

Required

Description

See .

Usage

Extract a table on the page

The following code will wait 1 second for the .large_table element to appear and return a JSON file with the headers and rows converted.

Scroll

Type: scroll

Request that the browser scrolls to a certain point on the page or, in the case of pages with infinite scrolling, scrolls for a particular amount of time.

Parameters

Name

Type

Type

Type: type

Request that the browser types a particular bit of text into a field.

Parameters

Name

Type

Required

Description

See .

Sites that use more advanced bot detection often use keyboard events to detect unusual activity on their site. Rather than immediately dropping all characters of the text into a field, our platform types the text in a human-like manner.

Usage

Type into a text box

The following action will type into a particular text field.

Wait for an element to appear before typing

The following code will wait a maximum of 10 seconds for the email input to appear and then type the provided email into it.

Wait

Type: wait

Request that the browser waits a given amount of time or for a particular item to appear on the page.

Parameters

Name

Type

API Playground Examples

In the following pages you can view all the pre-built requests we've built to show what is possible with the Gaffa web automation API.

You can start using these in the API Playground once you've created an account.

Export Web Page to PDF

An example request that uses Gaffa to convert an HTML page to a PDF. There are lots of HTML-to-PDF APIs, but Gaffa handles it easily, as well as doing much more.

The following example is a request we've pre-built to show you Gaffa's capabilities against our demo site. You can run this request right now in the Gaffa API Playground.

Gaffa's print to PDF feature allows you to export web pages as PDF files easily. Unlike the standard "Print to PDF" in your local browser, Gaffa's feature waits for specific items to load, uses proxies, and scales with your product's growth. Enhance your customer experience and streamline your PDF export process.

API Request

The request below uses the to open the demo site on the table page, wait for the table to load and then print the webpage to a PDF in size A4 with a margin of 20 and using the portrait orientation.
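A request along the lines described above might be assembled as follows. The demo URL and the `wait` and `print` action types appear in this documentation, and the values A4, 20 and portrait come from the paragraph above; the parameter names `selector`, `size`, `margin` and `orientation`, the `settings`/`actions` nesting and the `table` selector are assumptions for this sketch.

```python
ORIENTATIONS = ("portrait", "landscape")


def print_to_pdf_request(url, wait_selector, size="A4", margin=20, orientation="portrait"):
    """Build a request that waits for an element and then prints the page to PDF."""
    if orientation not in ORIENTATIONS:
        raise ValueError(f"orientation must be one of {ORIENTATIONS}")
    return {
        "url": url,
        "settings": {
            "actions": [
                {"type": "wait", "selector": wait_selector},
                {
                    "type": "print",
                    "size": size,
                    "margin": margin,
                    "orientation": orientation,
                },
            ]
        },
    }


request_body = print_to_pdf_request(
    "https://demo.gaffa.dev/simulate/table?loadTime=3&rowCount=20",
    wait_selector="table",
)
print(request_body["settings"]["actions"][1])
```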

Actions

Read the full documentation for these actions here.

Response

Here's an example of the PDF returned by the request after waiting for the table to load.

Convert Web Page to Markdown

An example request that uses Gaffa to convert a web page to markdown. This could be used to export web page reports or to print the content of a page in a readable format.

The following example is a request we've pre-built to show you Gaffa's capabilities against our demo site. You can run this request right now in the Gaffa API Playground.

Gaffa converts web pages to clean markdown, stripping away styling, scripts, and images. This optimizes content for LLM applications by reducing token usage while preserving essential information.

API Request

Automated Form Filling

An example request that uses Gaffa to automate the completion of a form and waits for a success modal to appear.

The following example is a request we've pre-built to show you Gaffa's capabilities against our demo site. You can run this request right now in the Gaffa API Playground.

Filling forms is tedious. Gaffa can fill out a form in a human-like manner so you can spend your time doing much more interesting things.

API Request

The request below uses the to open the demo site on the form simulator page with some sections pre-filled (for speed). After typing in the required information and clicking submit, Gaffa waits for the success dialog to show before returning a video of the interaction.

Actions

Response

Here's a video showing Gaffa filling out the page and waiting for the success modal.

Read more about screen recording here (TODO).

Mapping Requests

Mapping requests allow you to extract all URLs from the sitemap of a website. Gaffa mapping requests have the following useful features:

Sitemap Discovery: No need to find the URL of a site's sitemap, we'll find it automatically.
Caching: If you or another Gaffa user has retrieved a sitemap within a defined timeframe we'll quickly return the cached data instead of having to fetch it all again.
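To make the caching behaviour above concrete, here is a purely conceptual sketch of a time-based cache: a sitemap fetched within a defined timeframe is reused, otherwise it is fetched again. This is not Gaffa's implementation; the class name, the `fetch` callback and the 60-second window are all hypothetical.

```python
import time


class SitemapCache:
    """Tiny in-memory cache that reuses a sitemap fetched within max_age seconds."""

    def __init__(self, fetch, max_age, clock=time.monotonic):
        self._fetch = fetch
        self._max_age = max_age
        self._clock = clock
        self._entries = {}  # site -> (fetched_at, urls)

    def get(self, site):
        """Return (urls, from_cache) for a site."""
        now = self._clock()
        entry = self._entries.get(site)
        if entry is not None and now - entry[0] <= self._max_age:
            return entry[1], True
        urls = self._fetch(site)
        self._entries[site] = (now, urls)
        return urls, False


# Fake fetcher and fake clock so the sketch runs offline and deterministically.
calls = []


def fake_fetch(site):
    calls.append(site)
    return [f"{site}/", f"{site}/about"]


now = [0.0]
cache = SitemapCache(fake_fetch, max_age=60, clock=lambda: now[0])
first = cache.get("https://example.com")
now[0] = 30.0
second = cache.get("https://example.com")  # within the window: cached
now[0] = 100.0
third = cache.get("https://example.com")  # window expired: fetched again
print(first[1], second[1], third[1], len(calls))
```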

API Reference

API Authentication

We use API Keys for authenticating requests to our API. In this document we'll explain how you can manage and use the keys for your account.

Creating Keys

Once your account is approved, you will need to create an API key to send your requests to our API. Go to your account and create a new key with a name. Once the key is created, copy the value and you will immediately be free to start using it to make requests.
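Once you have a key, attaching it to a request could look something like the sketch below, which builds (but does not send) a POST to the `v1/browser/requests` endpoint using the standard library. The base URL `https://api.gaffa.dev`, the `X-API-Key` header name and the `GAFFA_API_KEY` environment variable are assumptions for illustration; use the values given in your account and the authentication reference.

```python
import json
import os
import urllib.request

API_BASE = "https://api.gaffa.dev"  # assumed base URL


def authed_request(path, body, api_key):
    """Build a JSON POST request carrying the API key in a header."""
    return urllib.request.Request(
        f"{API_BASE}/{path.lstrip('/')}",
        data=json.dumps(body).encode("utf-8"),
        # "X-API-Key" is an assumed header name.
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# Reading the key from an environment variable keeps it out of source control.
api_key = os.environ.get("GAFFA_API_KEY", "replace-me")
req = authed_request(
    "v1/browser/requests",
    {"url": "https://demo.gaffa.dev/simulate/article"},
    api_key,
)
# urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
print(req.get_method(), req.full_url)
```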

POST v1/browser/requests

For more information on browser requests, see here.

The following endpoint creates a browser request and either runs it synchronously or returns immediately with an ID so you can check its status later using this endpoint.

GET v1/browser/requests/{id}

For more information on browser requests, see here.

The following endpoint allows you to query a browser request for your account by ID.
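When a request is created asynchronously, a client typically polls this endpoint until the request finishes. The sketch below shows one way to structure that loop. The `completed` and `failed` states appear in this documentation's sample response; the in-progress state name `running`, the polling interval and the `fetch_request` callback (which would wrap the HTTP GET) are assumptions.

```python
import time


def wait_for_request(request_id, fetch_request, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Poll fetch_request(request_id) until the request completes or fails."""
    for _ in range(max_attempts):
        request = fetch_request(request_id)
        if request["state"] in ("completed", "failed"):
            return request
        sleep(interval)
    raise TimeoutError(f"{request_id} still running after {max_attempts} polls")


# Fake fetcher standing in for GET v1/browser/requests/{id}.
states = iter(["running", "running", "completed"])


def fake_fetch(request_id):
    return {"id": request_id, "state": next(states)}


result = wait_for_request("brq_example", fake_fetch, sleep=lambda seconds: None)
print(result)
```

Passing the fetcher and the sleep function in as arguments keeps the loop easy to test without touching the network.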

GET v1/browser/requests

For more information on browser requests, see here.

The following endpoint allows you to query for multiple browser requests, either by status or by a list of particular IDs. Submitting a request with neither of these will return all requests for your account.

POST v1/schemas

Beta Feature: This feature is currently in beta and restricted to approved users. If you're interested in trying it, please contact support and we can enable this feature for your account.

The following endpoint allows you to describe a data schema for parsing an online PDF to JSON.

PUT v1/schemas

Beta Feature: This feature is currently in beta and restricted to approved users. If you're interested in trying it, please contact support and we can enable this feature for your account.

The following endpoint allows you to update a data schema by ID.

GET v1/schemas

Beta Feature: This feature is currently in beta and restricted to approved users. If you're interested in trying it, please contact support and we can enable this feature for your account.

The following endpoint allows you to list data schemas for your account in a paged list.

DELETE v1/schemas/{id}

Beta Feature: This feature is currently in beta and restricted to approved users. If you're interested in trying it, please contact support and we can enable this feature for your account.

The following endpoint allows you to delete a schema from your account.

POST v1/site/map

This endpoint creates a new site mapping request and returns the result.

GET v1/site/map

This endpoint retrieves information about previous site mapping requests, filterable by ID or status.

GET v1/site/map/{id}

This endpoint retrieves information about a site mapping request.

Tutorials

AI Tools

How to scrape all images from a website using Gaffa

This tutorial will show you how you can use Gaffa to retrieve all pages from a site and then download all images across those pages.

Automating the collection of images from a website can save hours of manual work. Whether you're a marketer building a competitor analysis, a developer creating a dataset, or an archiver preserving digital content, doing this manually is tedious and error-prone.

In this tutorial, you'll learn how to use Gaffa's powerful Mapping and Browser Requests endpoints to automatically find, extract, and download every image from a website in a short Python script. We'll leverage features like the capture_dom action, intelligent sitemap parsing, and the download_file action to handle this efficiently and responsibly.

By the end of this guide, you'll be able to:

Use Gaffa's Mapping endpoint to discover every page on a site.
Render each page with a headless browser to capture its full DOM.
Parse and download all images using Gaffa's download_file action with residential proxies.
Run the process at scale with built-in proxy rotation and caching.
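The "parse" step in the list above can be sketched with the standard library alone: given the captured DOM of a page and that page's URL, collect the absolute URL of every image once. This is a minimal illustration rather than the tutorial's full script; the sample HTML and URLs are made up, and it only looks at the `src` attribute of `img` tags.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collect absolute, de-duplicated image URLs from <img src=...> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        # Skip missing sources and inline data URIs, which need no download.
        if src and not src.startswith("data:"):
            absolute = urljoin(self.base_url, src)
            if absolute not in self.urls:
                self.urls.append(absolute)


def image_urls(dom, base_url):
    """Return image URLs found in a DOM string, resolved against base_url."""
    collector = ImageCollector(base_url)
    collector.feed(dom)
    return collector.urls


# Made-up stand-in for a captured DOM.
dom = (
    '<img src="/a.png"><img src="https://cdn.example.com/b.jpg">'
    '<img src="/a.png"><img src="data:image/gif;base64,AAAA">'
)
found = image_urls(dom, "https://example.com/gallery/")
print(found)
```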

Prerequisites

Python 3.10+ installed on your machine.
A Gaffa API key. Sign up and get your API key from the dashboard.
Basic familiarity with the command line.

Set Up Your Environment

First, create a new project directory and install the required Python libraries.

Next, set your Gaffa API key as an environment variable to keep it secure.

Why This Gaffa-Powered Approach is Superior

Handles JavaScript-Rendered Content: Unlike simple HTTP scrapers, Gaffa uses a real browser, so it captures anything that is lazy-loaded by JavaScript.
Stealth Downloading with Residential Proxies: The download_file action uses real browsers and proxies, making your requests appear as legitimate user traffic.
Intelligent Caching: With `max_cache_age` set to 24 hours, repeated requests for the same image are served from cache, reducing load on target servers and improving efficiency.
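As a small illustration of the caching point above, the sketch below builds a download request with a 24-hour cache window. The `max_cache_age` name and the `download_file` action type come from this documentation; expressing the value in seconds and the `settings`/`actions` nesting are assumptions, and the image URL is hypothetical.

```python
from datetime import timedelta


def download_request(image_url, max_cache_age=timedelta(hours=24)):
    """Build a request that downloads a file, allowing cached results."""
    return {
        "url": image_url,
        # Assumed unit: seconds. 24 hours is 86400 seconds.
        "max_cache_age": int(max_cache_age.total_seconds()),
        "settings": {"actions": [{"type": "download_file"}]},
    }


body = download_request("https://example.com/images/photo.jpg")
print(body["max_cache_age"])
```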

Use Cases and Ideas

This technique is useful for far more than just downloading pictures. Here are a few ideas:

Competitive Analysis: Analyze the product photography styles of competitors using real browsers.
AI/ML Datasets: Build large, curated image datasets for training computer vision models with ethically-sourced images.
Website Migration & Audits: Download all assets from an old site before a migration while minimizing server impact through caching.

Next Steps

The full script is available on our .

Ready to automate your image collection with enterprise-grade infrastructure? Sign up and start building today.

Parse JSON

Paid Action: This action will consume credits based on the amount of content being parsed, see more below.

Beta Feature: This feature is currently in beta and restricted to approved users. If you're interested in trying it, please contact support and we can enable this feature for your account.

Type: parse_json

The parse_json action extracts data from web pages and online PDFs. It uses AI to parse web content from text into a pre-defined data schema and return it as a JSON object.

The action allows you to convert unstructured content such as academic papers, forms, and webpages into JSON objects, which you can use in automations, analysis, or further processing.

This feature currently works for online PDFs and web page text.

Parameters

Name

Type

Required

Description

See .

Defining Data Schemas

A data schema tells the model exactly what JSON structure to produce.

You can define schemas in two ways:

Inline schemas (defined directly inside the action)
Reusable schemas (created via the Schema API and referenced by ID in your requests)

Schema Structure

A schema has:

Property

Type

Description

Each field in the fields array has:

Supported Field Types

Type

Description

Inline Schema Example

This example shows:

Simple fields (string, datetime) for basic data
Object fields for grouped related data with nested fields
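A schema with simple and nested object fields, as described above, might be represented and sanity-checked like this before sending it. The `fields` array and the `string`, `datetime` and `object` types are mentioned in this section; the property names `name` and `type`, the example schema itself and the validator are hypothetical, and the type list here covers only the types named above, not the full supported set.

```python
ALLOWED_TYPES = {"string", "datetime", "object"}  # only the types named in this section


def validate_fields(fields, path=""):
    """Return a list of problems found in a (possibly nested) fields array."""
    problems = []
    for field in fields:
        where = f"{path}{field.get('name', '<unnamed>')}"
        field_type = field.get("type")
        if field_type not in ALLOWED_TYPES:
            problems.append(f"{where}: unsupported type {field_type!r}")
        if field_type == "object":
            nested = field.get("fields")
            if not nested:
                problems.append(f"{where}: object field needs nested fields")
            else:
                problems.extend(validate_fields(nested, path=f"{where}."))
    return problems


# Hypothetical schema: two simple fields and one object field with nested fields.
schema = {
    "name": "invoice",
    "fields": [
        {"name": "title", "type": "string"},
        {"name": "issued_at", "type": "datetime"},
        {
            "name": "customer",
            "type": "object",
            "fields": [{"name": "email", "type": "string"}],
        },
    ],
}
print(validate_fields(schema["fields"]))
```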

Schema Operations

Instead of defining schemas inline every time, they can be saved to your Gaffa account and be reused across multiple requests. This makes your actions more readable, easier to maintain, and ensures consistency when parsing similar content.

Creating a Saved Schema

Use the endpoint to create a reusable schema:

Response:

Save the id returned in the response. You'll use it to reference the schema in your requests.

Managing Schemas

List all schemas:

Allows you to view all schemas saved to your account:

Endpoint:

Update a schema:

Allows you to modify an existing schema by its ID:

Endpoint:

Delete a schema:

Removes a schema from your account:

Endpoint:

Common Schema Patterns

Simple List Extraction

Nested Objects

Pricing

The number of credits this action uses depends on the model used. Here are the currently supported models and their pricing:

Model

Input Token Cost

Output Token Cost

{"data":{"total_pages":1,"total_records":2,"results":[{"id":"brq_V2PUfFA8AQPAQ5VEsewpxdGUSZkgKP","url":"https://demo.gaffa.dev/simulate/table?loadTime=3&rowCount=20","proxy_location":null,"state":"completed","credit_usage":4,"error":null,"error_reason":null,"actual_url":null,"http_status_code":200,"from_cache":false,"started_at":"2024-11-22T16:31:13.128103+00:00","completed_at":"2024-11-22T16:31:47.020851+00:00","running_time":null,"page_load_time":"00:00:03.4705813","actions":[{"id":"act_V2PUfETiTdXwzEgAW2NPURnATW7we9","type":"wait","custom_id":null,"timestamp":"2024-11-22T16:31:16.6080484+00:00","output":null,"reference":null,"iterations":null,"actions":null,"error":null},{"id":"act_V2PUfBreQxHR2SNqGXuPzzWoiyRsrm","type":"print","custom_id":null,"timestamp":"2024-11-22T16:31:40.5760333+00:00","output":"https://storage.gaffa.dev/brq/pdf/brq_V2PUfFA8AQPAQ5VEsewpxdGUSZkgKP/act_V2PUfBreQxHR2SNqGXuPzzWoiyRsrm.pdf","reference":null,"iterations":null,"actions":null,"error":null}],"video":"https://storage.gaffa.dev/brq/video/brq_V2PUfFA8AQPAQ5VEsewpxdGUSZkgKP.mp4"},{"id":"brq_V2NmHY9FsvPQEGbfVBSeV6UCp2SXjC","url":"https://demo.gaffa.dev/simulate/article?loadTime=3&paragraphs=10&images=3","proxy_location":null,"state":"completed","credit_usage":1,"error":null,"error_reason":null,"actual_url":null,"http_status_code":200,"from_cache":false,"started_at":"2024-11-22T12:52:48.708264+00:00","completed_at":"2024-11-22T12:52:54.25994+00:00","running_time":null,"page_load_time":"00:00:00.8094888","actions":[{"id":"act_V2NmHijnQa9iPDNcvhjS2GGFt5se8j","type":"wait","custom_id":null,"timestamp":"2024-11-22T12:52:49.5690537+00:00","output":null,"reference":null,"iterations":null,"actions":null,"error":null},{"id":"act_V2NmHgs27VJKB49YavtK4CcyErdfvD","type":"generate_markdown","custom_id":null,"timestamp":"2024-11-22T12:52:52.8353136+00:00","output":"https://storage.gaffa.dev/brq/md/brq_V2NmHY9FsvPQEGbfVBSeV6UCp2SXjC/act_V2NmHgs27VJKB49YavtK4CcyErdfvD.md","reference":null,"iterations":null,"actions":null,"error":null}],"video":null},{"id":"brq_V2HvS2cw4Z2wonqEAwbxoxjrkmRdEM","url":"https://demo.gaffa.dev/simulate/article","proxy_location":null,"state":"failed","credit_usage":0,"error":null,"error_reason":null,"actual_url":null,"http_status_code":null,"from_cache":false,"started_at":null,"completed_at":null,"running_time":null,"page_load_time":null,"actions":null,"video":null}],"page":1,"page_size":30},"error":null}
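A response shaped like the JSON above can be reduced to just the output files of completed requests with a short helper. The field names used here (`data`, `results`, `state`, `actions`, `output`) are taken from that response; the trimmed sample with short IDs and example.com URLs is made up so the sketch stays readable.

```python
import json


def completed_outputs(response_text):
    """Map each completed request ID to the output URLs of its actions."""
    body = json.loads(response_text)
    outputs = {}
    for request in body["data"]["results"]:
        if request["state"] != "completed":
            continue
        # "actions" can be null (for example on a failed request).
        actions = request["actions"] or []
        outputs[request["id"]] = [a["output"] for a in actions if a["output"]]
    return outputs


# Trimmed, made-up sample using the same field names as the response above.
sample_response = json.dumps(
    {
        "data": {
            "results": [
                {
                    "id": "brq_a",
                    "state": "completed",
                    "actions": [
                        {"type": "wait", "output": None},
                        {"type": "print", "output": "https://example.com/brq_a.pdf"},
                    ],
                },
                {"id": "brq_b", "state": "failed", "actions": None},
            ]
        },
        "error": None,
    }
)
print(completed_outputs(sample_response))
```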
