# Introduction

What is Gaffa?

Gaffa is a powerful API for browser automation which allows you to control real web browsers at scale through a simple interface with no configuration necessary. We'll handle the complexities of managing infrastructure like virtual machines, proxies and caching so you can focus on building powerful and reliable web automation and AI applications!


* [API Playground](https://gaffa.dev/dashboard/playground) - Start experimenting with the Gaffa API right now.
* [Get Started](get-started) - The simple steps to get you started using Gaffa in your apps.
* [API Reference](api-reference) - Explore the API and docs for the finer details.

## Key features

Gaffa is ready to power your web automations:

* **Simplicity** - there's no need to learn another new framework. Gaffa is accessible through a simple REST API: just tell it what site you want to visit and what actions you want to perform, and they will be carried out as soon as you send the request.
* **Real browsers** - headless browsers are popular, but we make it simple to control real cloud-hosted browsers at scale. They render JavaScript sites exactly as they would on a local machine, are harder to detect when scraping and allow full observability. We're also planning to allow you to go beyond just being able to control web browsers!
* **Proxies** - you can easily choose to route your traffic through a network of residential proxy IP addresses to help avoid bot detection on sites you are trying to automate.
* **Scalable** - whether you want to control a single cloud browser or hundreds in parallel, you can do that easily with Gaffa without a thought about infrastructure management.
* **Powerful data processing** - once you've accessed your desired site you can export your data in a constantly growing number of formats. If you want the [page content in markdown](/docs/gaffa/gaffa/features/browser-requests/actions/generate-markdown) to feed into a large language model or [an image](/docs/gaffa/gaffa/features/browser-requests/actions/capture-screenshot) to feed into a vision model, we can help.

## Ready to work with Gaffa?

{% content-ref url="get-started" %}
[get-started](/docs/gaffa/gaffa/get-started)
{% endcontent-ref %}

## Stay up to date

We'll be sporadically announcing updates and new features in our newsletter - [sign up here](https://gaffa.dev/#newsletter).

# Get Started

An introduction to the Gaffa Browser API. Learn how you can get started building fast, powerful web automations!

Welcome to the Gaffa documentation site!
You'll find everything you need here to get started using the API, including [interactive API definitions](/docs/gaffa/gaffa/api-reference), [a comprehensive list of actions](/docs/gaffa/gaffa/features/browser-requests/actions) you can use to interact with our cloud browsers and [breakdowns of our example requests](/docs/gaffa/gaffa/features/browser-requests/api-playground-examples) you can run right away in our API Playground.

{% hint style="info" %}
Gaffa is currently in its very early stages, so we'd love to hear how we can improve our docs and API to make life easier for our users. If you have any questions or comments please [email us](mailto:support@gaffa.dev) or use [the support tool on our site](https://go.crisp.chat/chat/embed/?website_id=87a5807c-14f5-4ed3-9fbe-3d161610357b).\
\
To stay up to date with the latest developments, features and news on our mission to support the development of revolutionary AI Agents, sign up for our sporadic [newsletter](https://gaffa.dev/#newsletter) updates.
{% endhint %}

{% stepper %}
{% step %}
## Create an account

You can sign up to create a Gaffa account [here](https://accounts.gaffa.dev/sign-up?redirect_url=https%3A%2F%2Fgaffa.dev%2F%2Fauth%2Fsign-in). After signing up you'll immediately be able to use our [API Playground](https://gaffa.dev/dashboard/playground), which has a number of pre-built automations for [our demo site](https://demo.gaffa.dev/) simulating a range of scenarios.

#### Accessing the open web

When you're ready to use Gaffa on the open web you'll need to choose and pay for a plan suitable for your needs, at which point the full internet will be available for you to automate.

{% hint style="danger" %}
In order to avoid scaling issues for our existing customers we are currently operating a queuing system for new accounts. Simply join the queue when prompted on your [account dashboard](https://gaffa.dev/dashboard) and we'll let you know when you have access.\
\
If you want to jump the queue, you can fill out a short survey to help us better understand our users and we'll approve your account sooner!
{% endhint %}
{% endstep %}

{% step %}
## Making your first browser request

The easiest way to make your first Gaffa [browser request](/docs/gaffa/gaffa/features/browser-requests) is to use our [API Playground](https://gaffa.dev/dashboard/playground), where you can see several pre-made, interactive browser request examples. We've built these automations against our test site, which simulates some common scraping and web automation scenarios.

You can run these examples without a paid account and edit them easily to experiment. Once you have a paid account you can also use the playground to build your automations for other sites.

### Gaffa API Playground examples

Here are all the sample requests we've created for use in the API Playground.


* [Print to PDF](export-web-page-to-pdf) - Export a web page to PDF and wait for elements to load with the Gaffa API.
* [Convert to Markdown](convert-web-page-to-markdown) - Export a web page to markdown format, useful for feeding into LLM apps.
* [Infinitely Scroll](infinitely-scroll-an-ecommerce-site) - Scroll to the bottom of a page that infinitely loads items and record the interaction.
* [Capture Screenshot](capture-a-full-height-screenshot) - Interact with a page and capture a screenshot of the whole page.
* [Form Completion](automated-form-filling) - Fill out a form in a human-like way and record the interaction.

{% endstep %}

{% step %}
## Building your own browser requests

Once you have a paid account and are ready to start building your own browser requests, you'll want to read about all the other [actions](/docs/gaffa/gaffa/features/browser-requests/actions) you can use for your solution, how you can easily use [proxy servers](/docs/gaffa/features/browser-requests#proxy-servers) and [our cache](/docs/gaffa/features/browser-requests#caching), and the [other endpoints that are part of the API](/docs/gaffa/gaffa/api-reference).
{% endstep %}
{% endstepper %}

# Credits and Pricing

{% hint style="info" %}
View our current pricing plans on the Gaffa [homepage](https://gaffa.dev/#pricing)
{% endhint %}

## Browser Requests

Browser requests are charged in terms of credits based on the following factors:

* **Request length:** Billed at 1 credit per 30 seconds the request takes to run on the browser.
  * If screen recording is enabled, this is doubled to 2 credits per 30 seconds.
* **Proxy bandwidth usage:** All requests that use a `proxy_location` parameter use our network of residential proxies and are billed at 1500 credits per 1GB of bandwidth used.
* **Paid Actions:** Some actions will incur additional costs for their usage in a browser request. These are:
  * [JSON Parsing](/docs/gaffa/gaffa/features/browser-requests/actions/parse-json)

Each successful request will deduct the corresponding number of credits from your monthly allowance. Unused credits don't roll over from month to month, so make the most of your allowance.

# Browser Requests

Making web automation requests has never been so simple.

Browser Requests are our first main product and allow you to send the Gaffa API a URL and a list of actions you want to be carried out, including any outputs you want from the page. We'll carry out the request on our cloud browsers and return the response to you, with no need to worry about proxies, IP rotation, web automation frameworks or scaling.
There's absolutely zero configuration needed and you can interact with Gaffa from any program that can send web requests. We think it's by far the simplest way to automate simple web tasks, and the good news is that we're just getting started and have much more planned.

***

## Example request

Running a new browser request is as simple as sending the following [POST body to our endpoint](/docs/gaffa/gaffa/api-reference/post-v1-browser-requests). Below, you can see the URL ([our demo site](https://demo.gaffa.dev)) and a list of actions which instruct Gaffa to wait for a table to load and print the page to PDF.

{% hint style="info" %}
You can read more about this particular example and how you can run it right now in our API Playground [here](/docs/gaffa/gaffa/features/browser-requests/api-playground-examples/export-web-page-to-pdf)
{% endhint %}

```json
{
    "url": "https://demo.gaffa.dev/simulate/table?loadTime=3&rowCount=20",
    "proxy_location": null,
    "async": false,
    "max_cache_age": 0,
    "max_media_bandwidth": null,
    "settings": {
        "record_request": false,
        "actions": [
            {
                "type": "wait",
                "selector": "table"
            },
            {
                "type": "print",
                "size": "A4",
                "margin": 20,
                "orientation": "portrait"
            }
        ]
    }
}
```

***

## Proxy servers

{% hint style="info" %}
In order to access public sites and use proxy servers you'll need to sign up for a [paid account](https://gaffa.dev/#pricing), but after that you'll be able to build automations for any site you wish.
{% endhint %}

Gaffa makes proxying your traffic through a global network of residential proxies super simple. Setting `proxy_location` in your request will allow you to utilize one of our partner third-party proxy services to gain local access to a site. If you don't set a `proxy_location`, the request will not use a proxy server and will use a generic datacenter IP.
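As a rough illustration of the request shape described above, the following Python sketch builds a request body with a `proxy_location` set and serialises it to JSON. The helper `build_browser_request` is hypothetical and not part of any Gaffa SDK. The field names are copied from the example request, and the accepted country codes are taken from the Available Locations table in these docs. Sending the body (endpoint URL, authentication) is deliberately left out, as those details live in the API reference.

```python
import json

# Country codes copied from the "Available Locations" table in these docs.
KNOWN_PROXY_LOCATIONS = {"us", "ie", "sg", "fr"}


def build_browser_request(url, actions, proxy_location=None, max_cache_age=0):
    """Return a dict shaped like the example request body.

    Hypothetical helper for illustration only, not an official client.
    """
    if proxy_location is not None and proxy_location not in KNOWN_PROXY_LOCATIONS:
        raise ValueError(f"unknown proxy location: {proxy_location!r}")
    return {
        "url": url,
        "proxy_location": proxy_location,  # None becomes JSON null: no proxy is used
        "async": False,
        "max_cache_age": max_cache_age,
        "max_media_bandwidth": None,
        "settings": {"record_request": False, "actions": actions},
    }


body = build_browser_request(
    "https://demo.gaffa.dev/simulate/table?loadTime=3&rowCount=20",
    [{"type": "wait", "selector": "table"}],
    proxy_location="ie",
)
print(json.dumps(body, indent=2))
```

Validating the country code locally is a design choice for this sketch only; the API itself remains the authority on which locations are accepted.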
### Available Locations

| Proxy Server Location | Country Code |
| --------------------- | ------------ |
| United States         | `us`         |
| Ireland               | `ie`         |
| Singapore             | `sg`         |
| France                | `fr`         |

{% hint style="info" %}
At the moment all our servers are in one location, but we aim to introduce local machines to our proxy locations for more realistic end-user load times. If this would interest you please contact support.
{% endhint %}

### IP Types

Currently all our IP addresses are residential IP addresses which are procured through reputable third parties.

### IP Rotation

IP rotation is an essential part of any web data, scraping or automation task. In Gaffa, each browser request is treated as unique. We regularly rotate the IP addresses used, so you should assume that each request will be carried out from a different IP address from the last.

{% hint style="info" %}
We are working to support a greater range of IP address scenarios in the future, like static IPs, as well as more trusted proxies for requests that require enhanced levels of security (logins etc.)
{% endhint %}

### Restrictions

Whilst we'll do our best to provide access to as wide a range of sites as possible, we may have to restrict access to certain sites to prevent abuse of our service or of other services. Our proxy partners may also enforce restrictions on certain sites and categories of sites, which we don't have any control over.

***

## Caching

When we were building Gaffa we noticed that a lot of pre-existing scraping tools don't allow users to easily share their scraped web data with each other, despite many users requesting the same web pages on the same sites. Not only is this a waste of a user's allowance, it also puts a burden on the site owners who are serving the same data to different users for the same purpose. Because of this, we have created a service-wide cache in Gaffa.
### How it works

When making a browser request you can provide a `max_cache_age` parameter, which is **a number in seconds equal to or greater than 0**. This value denotes the maximum age of data you would accept from the API.\
\
If another user of our service has requested the same URL with exactly the same parameters and actions as you within this timeframe, then the response will be returned to you immediately and the request will not be carried out on one of our browsers. If there are multiple identical requests in the given timeframe then the most recent will be returned.\
\
This will save you time waiting for the response, as well as credits, because requests returned from the cache don't use any bandwidth.

***

## Screen Recording

By specifying `record_request` you can ask Gaffa to screen record your automation and return a video in the response, allowing you to view the magic happening or to debug your automation. Recording requests comes at an [additional cost](/docs/gaffa/gaffa/credits-and-pricing).

***

## Max Media Bandwidth

If you are using Gaffa on a site with lots of images and videos and are more interested in the text data on the page, you can cap how much data a page loads in MB using the `max_media_bandwidth` setting. This makes your automation faster and prevents spending credits on data you aren't interested in.\
\
With the `max_media_bandwidth` value set, Gaffa monitors data being downloaded by the page and, when downloaded data exceeds the given number of MB, all further downloads of images or video will be cancelled.\
\
`max_media_bandwidth` defaults to `null`, meaning downloads are not capped.

{% hint style="info" %}
Setting a value of 0 will cause no images to load, which can work on some sites, but on others this could lead to the site thinking you are using an ad blocker.
{% endhint %}

***

## Time Limit

Using the setting `time_limit` caps the maximum running time of the request in milliseconds.
If this time expires, all incomplete actions will be cancelled and the request will return an error. This cap has to be less than the maximum request running time dictated by your plan and, if not set, will default to that value.

***

## Actions

We currently support a range of action types, which you can read more about [here](/docs/gaffa/gaffa/features/browser-requests/actions).

***

## Stealth

We believe your AI Agents should be able to use the internet exactly how humans would. Gaffa can help you get access to sites with some of the most challenging anti-bot restrictions using a combination of proxies, human-like behavior, captcha solving and a custom browser implementation. We handle and maintain all of that so you can focus on building your solution!

***

## Examples

We've created a number of sample browser requests you can read about [here](/docs/gaffa/gaffa/features/browser-requests/api-playground-examples), or you can jump straight into the [API Playground](https://gaffa.dev/dashboard/playground) to start running them right now.

***

## API Endpoints

Check out our API reference for more details about the endpoints available, particularly [those you can use to query for past requests by id or status](/docs/gaffa/gaffa/api-reference/get-v1-browser-requests).

# Actions

When [making a Browser Request](/docs/gaffa/gaffa/api-reference/post-v1-browser-requests) you can specify a list of actions you wish for us to carry out on the requested web page. These actions conform to the following format:

{% code overflow="wrap" fullWidth="false" %}
```json
{
    "type": "", //the type of the action
    //other params follow as key value pairs
    "key": value //string, number etc.
}
```
{% endcode %}

### Universal Parameters

All actions have the following parameters:

| Name | Type | Required | Description |
| ---- | ---- | -------- | ----------- |
| `type` | `string` | true | The type name of the action. |
| `continue_on_fail` | `boolean` | false | Should execution of further actions continue or throw an error if this action fails. Default: `false` |
| `customId` | `string` | false | A customId to help you find the action in the response. Default: `null` |

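To make the universal parameters concrete, here is a minimal Python sketch that assembles an `actions` list. The helper `make_action` is hypothetical, and the `customId` value is made up for illustration. The key names follow the table above, and the `wait` and `print` parameters are the ones used in the example request in these docs.

```python
import json


def make_action(action_type, continue_on_fail=False, custom_id=None, **params):
    """Build one action dict from the universal parameters plus action-specific ones.

    Hypothetical helper for illustration only.
    """
    action = {"type": action_type, "continue_on_fail": continue_on_fail}
    if custom_id is not None:
        action["customId"] = custom_id  # key spelled as in the parameter table
    action.update(params)  # action-specific parameters, e.g. selector or size
    return action


actions = [
    # Keep going even if the table never appears.
    make_action("wait", selector="table", continue_on_fail=True),
    # Tag the print action so its output is easy to find in the response.
    make_action("print", custom_id="invoice-pdf", size="A4", orientation="portrait"),
]
print(json.dumps(actions, indent=2))
```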
#### Action Execution

Actions are carried out in the order they are submitted. Every action type has a `continue_on_fail` parameter which defaults to `false`. This means that if any action fails, the execution of the browser request ends and an error will be returned. Setting `continue_on_fail` to `true` on an action means that the remaining actions are still carried out if that action fails, and an error will not be returned.

#### Custom Id

As shown above, you can submit a customId with each action you submit to the API. We'll include this Id in the outputs from the browser request so you can find a certain action's output and/or status easily in the response.

## Response Format

When a browser request has completed, information on each action's execution is returned in the following format:

{% code fullWidth="false" %}
```json
{
    "id": "", //a unique id given to the action by Gaffa
    "type": "capture_screenshot", //the type of the action
    "query": "", //a representation of the action in querystring format
    "timestamp": "", //the UTC timestamp the action was executed
    "output": "", //if the action has an output you will find a url for this here
    "error": "" //if the request fails the error message will be returned here
}
```
{% endcode %}

## Supported Actions

The Gaffa API supports the actions detailed below. Click the "read more" buttons to read more information about each type.

### Actions without outputs

| Type | Description | Read More |
| ---- | ----------- | --------- |
| `click` | Click on a given element | Click |
| `scroll` | Scroll to a particular point on the page or, in the case of pages with infinite scrolling, scroll until a given time has elapsed. | Scroll |
| `type` | Type the provided text into a given element | Type |
| `wait` | Wait for a given time to elapse or an element to appear on page before proceeding to the next action. | Wait |

### Actions with outputs

| Type | Description | Read More |
| ---- | ----------- | --------- |
| `capture_cookies` | Save a JSON object of cookies for the current page | Capture Cookies |
| `capture_dom` | Export the raw DOM page data | DOM |
| `capture_screenshot` | Capture a screenshot of the web page | Screenshot |
| `capture_snapshot` | Create a completely static version of the web page which can be accessed offline | Snapshot |
| `download_file` | Download an online file using Gaffa | Download File |
| `generate_markdown` | Convert the page into markdown | Markdown |
| `generate_simplified_dom` | Generate a simplified version of the DOM | Simplified DOM |
| `parse_json` | Parse online data to a defined JSON schema | JSON Parsing |
| `print` | Print the web page to a PDF | Print |

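As a sketch of how the response format described in the Response Format section might be consumed, the Python below separates successful action outputs from errors. It assumes the response exposes the executed actions as a list of objects with the `id`, `output` and `error` fields shown in that section, and that `error` is empty or absent on success. The function name, ids and URLs are made up for illustration.

```python
def summarise_actions(actions):
    """Split action results into output URLs and error messages, keyed by action id."""
    outputs, errors = {}, {}
    for action in actions:
        if action.get("error"):
            errors[action["id"]] = action["error"]
        elif action.get("output"):
            outputs[action["id"]] = action["output"]
    return outputs, errors


# Placeholder data in the documented shape; the values are not real API output.
sample_actions = [
    {"id": "act_1", "type": "wait", "output": "", "error": ""},
    {"id": "act_2", "type": "print", "output": "https://example.com/page.pdf", "error": ""},
    {"id": "act_3", "type": "click", "output": "", "error": "element not found"},
]

outputs, errors = summarise_actions(sample_actions)
print(outputs)
print(errors)
```

Actions without outputs, such as `wait`, simply appear in neither dictionary when they succeed.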
# Block DOM Removals

{% include "" %}

**Type:** `block_dom_removals`

This action will prevent the page from removing items from the DOM. This is useful if you are trying to scrape data from a JavaScript-based web application that removes items from the page when they are out of view, which can make grabbing data difficult. Using this action will block DOM removals for the rest of the browser request.

### Parameters

See [universal parameters](/docs/gaffa/gaffa/features/browser-requests/actions/..#universal-parameters).

### Usage

Block DOM removals for the rest of the browser request

```
"actions": [
    {
        "type": "block_dom_removals"
    }
]
```

# Capture Cookies

{% include "" %}

**Type:** `capture_cookies`

This action will capture the browser cookies currently saved for the web page you are on and return them as a JSON object of key/value pairs.

### Parameters

See [universal parameters](/docs/gaffa/gaffa/features/browser-requests/actions/..#universal-parameters).

### Usage

Capture the cookies of the current page

```
"actions": [
    {
        "type": "capture_cookies"
    }
]
```

# Capture DOM

**Type:** `capture_dom`

This action will capture and return the raw DOM of the site, which you can then extract data from on your end. For common AI scenarios you may find this returns too much data, so we have provided a [`generate_simplified_dom`](/docs/gaffa/gaffa/features/browser-requests/actions/generate-simplified-dom) action which distills the DOM to only the important elements.

### Parameters

See [universal parameters](/docs/gaffa/gaffa/features/browser-requests/actions/..#universal-parameters).

### Usage

Capture the raw DOM of the current page

```
"actions": [
    {
        "type": "capture_dom"
    }
]
```

### Example Output

{% file src="" %}

# Capture Screenshot

**Type:** `capture_screenshot`

Takes a screenshot of the current page. You can choose to take a full screen screenshot showing the whole page or just the current view.

### Parameters

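| Name | Type | Required | Description |
| ---- | ---- | -------- | ----------- |
| `size` | `string` | false | How much of the page the screenshot should capture: the current view or the whole page. Default: `view` Accepted: `["view", "fullscreen"]` |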

See [universal parameters](/docs/gaffa/gaffa/features/browser-requests/actions/..#universal-parameters).

### Usage

The following captures the section of the page currently visible in the browser.

```json
"actions": [
    {
        "type": "capture_screenshot",
        "size": "view"
    }
]
```

### Example Output

An example screenshot in `fullscreen` mode.

# Capture Element

{% include "" %}

**Type**: `capture_element`

Returns the [innerHTML](https://developer.mozilla.org/en-US/docs/Web/API/Element/innerHTML), essentially the contents, of a particular element on the page. This can be used when you are only interested in the contents of a particular element.

### Parameters

| Name | Type | Required | Description |
| ---- | ---- | -------- | ----------- |
| `selector` | `string` | true | The selector that defines the element whose contents you want to capture. |
| `timeout` | `integer` | false | The maximum amount of time the browser should wait for the element defined by the selector to appear. Default: 5000 (5s) |

See [universal parameters](/docs/gaffa/gaffa/features/browser-requests/actions/..#universal-parameters).

### Usage

#### Capture the contents of an element on the page

The following code will wait up to 1 second for the `.page_contents` element to appear and return an HTML file containing the element's innerHTML.

```json
"actions": [
    {
        "type": "capture_element",
        "selector": ".page_contents",
        "timeout": 1000
    }
]
```

# Capture Snapshot

**Type:** `capture_snapshot`

This output type will return an HTML file which captures a static version of the page state. The page will load offline and can be saved to your local machine. This will:

* Load and embed all images on the page.
* Embed all CSS files.

Currently, JavaScript will be disabled and interactivity might not work as expected, but this feature should be useful for preserving the page state as it was and allowing you to view it offline.

### Parameters

See [universal parameters](/docs/gaffa/gaffa/features/browser-requests/actions/..#universal-parameters).

### Usage

The following captures a static snapshot of the current page.

```json
"actions": [
    {
        "type": "capture_snapshot"
    }
]
```

### Example Output

Here's an example that shows an offline snapshot of a site

{% file src="" %}

# Click

**Type**: `click`

Request that the browser clicks a particular element on the page.

### Parameters

| Name | Type | Required | Description |
| ---- | ---- | -------- | ----------- |
| `selector` | `string` | true | The selector that defines the page element that the browser should click on. |
| `timeout` | `integer` | false | The maximum amount of time the browser should wait for the element defined by the selector to appear. Default: 5000 (5s) |

See [universal parameters](/docs/gaffa/gaffa/features/browser-requests/actions/..#universal-parameters).

### Usage

#### Click an element on the page

The following code will click the header logo, waiting up to the default 5 seconds for it to appear, and then continue with the next action, if provided.

```json
"actions": [
    {
        "type": "click",
        "selector": "a.header__logo"
    }
]
```

#### Wait for a particular element to appear

The following code will wait for the logo to appear for a maximum of 5 seconds and, because `continue_on_fail` is set, will continue with the list of actions even if it doesn't appear.

```json
"actions": [
    {
        "type": "wait",
        "selector": "a.header__logo",
        "timeout": 5000,
        "continue_on_fail": true
    }
]
```

# Download File

**Type**: `download_file`

Request a copy of the most recent file viewed in the browser.

### Parameters

| Name | Type | Required | Description |
| ---- | ---- | -------- | ----------- |
| `timeout` | `integer` | false | The maximum amount of time the browser should wait for a file to download. Default: 5000 (5s) |

See [universal parameters](/docs/gaffa/gaffa/features/browser-requests/actions/..#universal-parameters).

### Files Supported

Currently this only works with PDF files.

### Usage

#### Download a copy of a PDF open in the Browser

The following waits up to 20s for a file to download and then returns it.

```
"actions": [
    {
        "type": "download_file",
        "timeout": 20000
    }
]
```

The service responds with the file URL in the action output:

```
"actions": [
    {
        "id": "act_VHhrUbXjZSaYCPTqbBYD4acCzzeFGH",
        "type": "download_file",
        "query": "download_file?continue_on_fail=false&timeout=20000",
        "timestamp": "2025-05-30T15:02:06.6615306Z",
        "output": "https://storage.gaffa.dev/brq/downloads/5845df07-3749-424e-9c64-9602be19a857.pdf"
    }
]
```

# Generate Markdown

**Type**: `generate_markdown`

The markdown output format can export the data of the page (an article, table etc.) in a human and LLM readable format which removes unnecessary styling data and other "junk" that is only relevant for the site to work properly.

Gaffa exports [GitHub flavoured markdown](https://github.github.com/gfm/) with comments removed and unknown tags ignored.

### Parameters

See [universal parameters](/docs/gaffa/gaffa/features/browser-requests/actions/..#universal-parameters).

### Usage

The following converts the current page to markdown:

```json
"actions": [
    {
        "type": "generate_markdown"
    }
]
```

### Example Output

{% file src="" %}

# Generate Simplified DOM

**Type:** `generate_simplified_dom`

When you're looking at the DOM of a web page, there's a lot of unnecessary data that can be discarded if you are only interested in the page's elements or looking to export the data into an LLM.\
\
The `generate_simplified_dom` output format processes the HTML in the following way:

* Removes all links in the `head`
* Removes all `script` nodes and links to scripts
* Removes all `style` nodes
* Removes `style` attributes from all elements
* Removes all links to stylesheets
* Removes all `noscript` elements outside of the body
* Finds all `hrefs` with query strings and removes the query strings
* Keeps important `meta` tags and removes all others
* Removes all `alternate` links
* Removes all SVG paths
* Removes empty text nodes and excessive spacing

### Parameters

See [universal parameters](/docs/gaffa/gaffa/features/browser-requests/actions/..#universal-parameters).

### Usage

The following JSON captures the DOM of the page and simplifies it.

```json
"actions": [
    {
        "type": "generate_simplified_dom"
    }
]
```

{% hint style="info" %}
We are actively working to improve this and to make this process more configurable - let us know if there's something you think we can improve.
{% endhint %}

### Example Output

{% file src="" %}

# Print

**Type**: `print`

Request that the browser prints the page to a PDF.

### Parameters

| Name | Type | Required | Description |
| ---- | ---- | -------- | ----------- |
| `size` | `string` | false | The size of paper the page should be printed to. Default: `A4` Accepted: `["A4"]` |
| `margin` | `integer` | false | The margin of the page in pixels when the page is printed to PDF. Default: 20 |
| `orientation` | `string` | false | The orientation of the printed page. Default: `portrait` Accepted: `["portrait", "landscape"]` |
| `continue_on_fail` | `boolean` | false | Should execution of further actions continue or throw an error if this action fails. Default: true |

See [universal parameters](/docs/gaffa/gaffa/features/browser-requests/actions/..#universal-parameters).

### Usage

#### Print a page in landscape to PDF

The following JSON prints the page to an A4 PDF in landscape with margins of 20px.

```json
"actions": [
    {
        "type": "print",
        "size": "A4",
        "orientation": "landscape",
        "margin": 20
    }
]
```

### Example Output

{% file src="" %}

# Parse JSON

{% hint style="info" %}
**Paid Action:** This action will consume credits based on the amount of content being parsed, see more [below](#pricing).
{% endhint %}

{% include "" %}

**Type:** `parse_json`

Use AI to parse web content from text into a pre-defined data schema and return it as a JSON object.

*This feature currently works for online PDFs and web page text.*

### Parameters

| Name | Type | Required | Description |
| ---- | ---- | -------- | ----------- |
| `data_schema_id` | `string` | true | The id of the data schema you have defined that you want to transform the content into. You must provide a `data_schema` or `data_schema_id` with your request. |
| `data_schema` | `json` | true | A JSON object describing the data schema you want to transform the content into. You must provide a `data_schema` or `data_schema_id` with your request. |
| `instruction` | `string` | false | A custom instruction, in addition to any detail you have added to the data schema, that you want to include with this particular parse. |
| `model` | `string` | false | The AI model you wish to use to parse the content into JSON. Default: `gpt-4o-mini` Accepted: `["gpt-4o-mini"]` |
| `input_token_cap` | `integer` | false | The max number of source input tokens that will be passed to the AI model to parse. This can be used to prevent unnecessary credit usage. If your source input is longer than the token cap, it will be abbreviated. Default: 1,000,000 |
| `selector` | `string` | false | The selector that defines an element you want to parse the content of. This is useful if you are only interested in the contents of a certain element. |
| `output_type` | `string` | false | Whether the action output should be saved to a file, with a URL returned, or the parsed JSON object should be included directly in the response. Default: `file` Accepted: `["file", "inline"]` |

See [universal parameters](/docs/gaffa/gaffa/features/browser-requests/actions/..#universal-parameters).

### Pricing

The credits this action uses depend on the model used. Here are the currently supported models and their pricing:

| Model         | Input Token Cost                 | Output Token Cost                  |
| ------------- | -------------------------------- | ---------------------------------- |
| `gpt-4o-mini` | 1 credit per 10,000 input tokens | 4 credits per 10,000 output tokens |

# Parse Table

{% include "" %}

**Type**: `parse_table`

Finds a table on the page with a given selector and then converts the table data into a JSON object.

This action first finds the table headers and converts them into property names by converting them to lower case and replacing non-alphanumeric characters with underscores. It then processes each table row and, for each cell, extracts the contents and saves them as a value. At the moment, all values will be `string` types.

### Parameters

| Name | Type | Required | Description |
| ---- | ---- | -------- | ----------- |
| `selector` | `string` | true | The selector that defines the table whose contents you want to parse. |
| `timeout` | `integer` | false | The maximum amount of time the browser should wait for the table defined by the selector to appear. Default: 5000 (5s) |

See [universal parameters](/docs/gaffa/gaffa/features/browser-requests/actions/..#universal-parameters).

### Usage

#### Extract a table on the page

The following code will wait up to 1 second for the `.large_table` element to appear and return a JSON file with the headers and rows converted.

```json
"actions": [
    {
        "type": "parse_table",
        "selector": ".large_table",
        "timeout": 1000
    }
]
```

# Scroll

**Type**: `scroll`

Request that the browser scrolls to a certain point on the page or, in the case of pages with infinite scrolling, scrolls for a particular amount of time.

### Parameters

| Name | Type | Required | Description |
| ---- | ---- | -------- | ----------- |
| `percentage` | `integer` | true | The percentage the page should scroll up or down (+/-). Range: -100 to 100 Default: 100 (scroll to bottom) |
| `wait_time` | `integer` | false | After arriving at the desired scroll location, this is the time, in milliseconds, Gaffa should monitor for changes to the page height before marking the action as succeeded. Read more below. Default: 0 |
| `max_scroll_time` | `integer` | false | The maximum amount of time the page should be scrolled for, in milliseconds. After this time passes, the action will be cancelled. This doesn't cause the action to fail. Default: 20000 (20s) |
| `scroll_speed` | `string` | false | The speed at which the page should scroll to the desired point. You can read more about this below. Default: `medium` Accepted: `["slow", "medium", "instant"]` |
| `interval` | `integer` | false | The amount of time, in milliseconds, that scrolling should pause between scroll events. Read more about this below. Default: 0 |

See [universal parameters](/docs/gaffa/gaffa/features/browser-requests/actions/..#universal-parameters).

### Scroll Speed & Interval

Gaffa gives you flexibility over how fast you scroll down the page, which can be really useful for getting around restrictions enforced by some sites which detect and limit fast scrolling. By experimenting with `scroll_speed` and `interval` you will be able to create the perfect scrolling action for your scenario.

The speed settings are as follows:

* `instant` - the page will smoothly scroll to the desired position immediately, useful for sites with no rate limits or loading events caused by scroll actions.
* `medium` - human-like scrolling at a normal speed to the desired position. Gaffa will scroll in much the same way as you would using a mouse.
* `slow` - human-like scrolling at a very slow speed to the desired position. The speed is comparable to scrolling whilst reading a page.

`interval` allows you to adjust the scroll speed further by inserting pauses between scroll events.

{% hint style="info" %}
We've found some sites with infinite scrolling and strict rate limits respond better to `instant` speed scroll events to the bottom of the page with large `interval` values between these scrolls to keep within rate limits.
{% endhint %}

### Wait Time

If `wait_time` is set to 0 and Gaffa arrives at the desired location then Gaffa will immediately mark the action as succeeded. However, if another value is set then the page will be monitored for that amount of time to check for further expansions. If, during this period, the page expands again then Gaffa will continue scrolling to the desired location and the wait will reset.

{% hint style="info" %}
This can be really useful if you find that the site takes some time to load more items when you reach the bottom of the page, so that more would otherwise be loaded after the action has succeeded.
{% endhint %}

### Usage

#### Scroll a particular percentage down the page

The following code will scroll halfway down the page.

```json
"actions": [
    {
        "type": "scroll",
        "percentage": 50
    }
]
```

#### Scroll an infinitely scrolling webpage

The following code will scroll to the bottom of the page and then keep scrolling when new content loads for a maximum of 25 seconds, waiting 1 second for new content and scrolling at a slow pace with 1 second between scroll actions.

```json
"actions": [
    {
        "type": "scroll",
        "percentage": 100,
        "scroll_speed": "slow",
        "max_scroll_time": 25000,
        "interval": 1000,
        "wait_time": 1000
    }
]
```

# Type

**Type**: `type`

Request that the browser types a particular bit of text into a field.

### Parameters

Name	Type	Required	Description
`selector`	`string`	true	The selector that defines the page element that the browser should type into.
`text`	`string`	true	The text the browser should enter into the text field.
`timeout`	`integer`	false	The maximum amount of time the browser should wait for the element that needs to be typed in to appear. Default: 5000 (5s)
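
Selectors often contain double quotes (attribute selectors, for example), and these must be escaped inside a JSON string. As a minimal sketch, the action can be built as a Python dictionary and serialized with `json.dumps`, which handles the escaping. The `build_type_action` helper is hypothetical and only mirrors the parameters in the table above.

```python
import json


def build_type_action(selector, text, timeout=None):
    """Build a `type` action from the parameters in the table above.

    Hypothetical helper for illustration; it is not part of any Gaffa SDK.
    """
    action = {"name": "type", "selector": selector, "text": text}
    if timeout is not None:
        # Optional: how long to wait for the element to appear, in milliseconds.
        action["timeout"] = timeout
    return action


action = build_type_action('form input[name="email"]', "test@test.com", timeout=10000)

# json.dumps escapes the double quotes inside the selector.
print(json.dumps({"actions": [action]}, indent=2))
```

Building the payload this way also avoids hand-written JSON mistakes such as trailing or missing commas.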

See [universal parameters](/docs/gaffa/gaffa/features/browser-requests/actions/..#universal-parameters).

{% hint style="info" %}
Sites that use more advanced bot detection often use keyboard events to detect unusual activity. Rather than immediately dropping all characters of the text into a field, our platform types the text in a human-like manner.
{% endhint %}

### Usage

#### Type into a text box

The following action will type into a particular text field.

```json
"actions": [
  {
    "name": "type",
    "selector": "#postform-text",
    "text": "Hello world!"
  }
]
```

#### Wait for an element to appear before typing

The following code will wait a maximum of 10 seconds for the email input to appear on the page and then type in the provided email.

```json
"actions": [
  {
    "name": "type",
    "selector": "form input[name=\"email\"]",
    "text": "test@test.com",
    "timeout": 10000
  }
]
```

# Wait

**Type**: `wait`

Request that the browser waits a given amount of time or for a particular item to appear on the page.

### Parameters

Name	Type	Required	Description
`time`	`integer`	false	The time in milliseconds that the browser should wait.
`selector`	`string`	false	The selector that defines the page element that the browser should wait to appear.
`timeout`	`integer`	false	The maximum amount of time the browser should wait for the provided selector to appear. Default: 5000 (5s)

See [universal parameters](/docs/gaffa/gaffa/features/browser-requests/actions/..#universal-parameters).

### Usage

#### Wait for a particular amount of time

The following code will wait 1 second and then continue with the next action, if provided.

```json
"actions": [
  {
    "name": "wait",
    "time": 1000
  }
]
```

#### Wait for a particular element to appear

The following code will wait for a table to appear on the page for a maximum of 5 seconds. If the table has not appeared after 5 seconds the next action will be executed, if provided.

```json
"actions": [
  {
    "name": "wait",
    "selector": "table",
    "timeout": 5000,
    "continue_on_fail": true
  }
]
```

# API Playground Examples

In the following pages you can view all the pre-built requests we've built to show what is possible with the Gaffa web automation API.

**You can start using these in the** [**API Playground**](https://gaffa.dev/dashboard/playground) **once you've created an account.**

# Export Web Page to PDF

An example request that uses Gaffa to convert an HTML page to a PDF. There are lots of HTML-to-PDF APIs, but Gaffa handles it easily, as well as doing much more.

***The following example is a request we've pre-built to show you Gaffa's capabilities against our*** [***demo site.***](https://demo.gaffa.dev) ***You can run this request right now in the*** [***Gaffa API Playground***](https://gaffa.dev/dashboard/playground?templateId=html_to_pdf)***.***

Gaffa's print to PDF feature allows you to export web pages as PDF files easily. Unlike the standard "Print to PDF" in your local browser, Gaffa's feature waits for specific items to load, uses proxies, and scales with your product's growth. Enhance your customer experience and streamline your PDF export process.

## API Request

The request below uses the [POST endpoint](/docs/gaffa/gaffa/api-reference/post-v1-browser-requests) to open the demo site on the table page, wait for the table to load and then print the webpage to an A4 PDF with a margin of 20 in portrait orientation.

```json
{
  "url": "https://demo.gaffa.dev/simulate/table?loadTime=3&rowCount=20",
  "proxy_location": null,
  "async": false,
  "max_cache_age": 0,
  "settings": {
    "record_request": false,
    "actions": [
      { "type": "wait", "selector": "table" },
      { "type": "print", "size": "A4", "margin": 20, "orientation": "portrait" }
    ]
  }
}
```

## Actions

Read the full documentation for these actions here.

{% content-ref url="../actions/wait" %}
[wait](/docs/gaffa/gaffa/features/browser-requests/actions/wait)
{% endcontent-ref %}

{% content-ref url="../actions/print" %}
[print](/docs/gaffa/gaffa/features/browser-requests/actions/print)
{% endcontent-ref %}

## Response

Here's an example of the PDF returned by the request after waiting for the table to load.

{% file src="" %}

# Convert Web Page to Markdown

An example request that uses Gaffa to convert a web page to markdown. This could be used to export web page reports or to print the content of a page in a readable format.

*The following example is a request we've pre-built to show you Gaffa's capabilities against our* [*demo site.*](https://demo.gaffa.dev) ***You can run this request right now in the*** [***Gaffa API Playground***](https://gaffa.dev/dashboard/playground?templateId=article_to_markdown)***.***

Gaffa converts web pages to clean markdown, stripping away styling, scripts, and images. This optimizes content for LLM applications by reducing token usage while preserving essential information.
## API Request

The request below uses the [POST endpoint](/docs/gaffa/gaffa/api-reference/post-v1-browser-requests) to open the demo site on the article simulator, wait for the article to load and then generate markdown from the page's content, which you can download for use in your program.

```json
{
  "url": "https://demo.gaffa.dev/simulate/article?loadTime=3&paragraphs=10&images=3",
  "proxy_location": null,
  "async": false,
  "max_cache_age": 0,
  "settings": {
    "record_request": false,
    "actions": [
      { "type": "wait", "selector": "article" },
      { "type": "generate_markdown" }
    ]
  }
}
```

## Actions

{% content-ref url="../actions/wait" %}
[wait](/docs/gaffa/gaffa/features/browser-requests/actions/wait)
{% endcontent-ref %}

{% content-ref url="../actions/generate-markdown" %}
[generate-markdown](/docs/gaffa/gaffa/features/browser-requests/actions/generate-markdown)
{% endcontent-ref %}

## Response

Here's an example of the markdown file returned by the request after waiting for the article to load.

{% file src="" %}

# Infinitely Scroll an Ecommerce Site

An example request that uses Gaffa to infinitely scroll down a simulated ecommerce site whilst recording the interaction.

*The following example is a request we've pre-built to show you Gaffa's capabilities against our* [*demo site.*](https://demo.gaffa.dev) ***You can run this request right now in the*** [***Gaffa API Playground***](https://gaffa.dev/dashboard/playground?templateId=infinite_scroll)***.***

Gaffa automates infinite scrolling on dynamic pages like e-commerce storefronts. Set a duration, and Gaffa will capture all content as it scrolls. Each session can be recorded as a video for playback, letting you debug or review the interaction.

## API Request

The request below uses the [POST endpoint](/docs/gaffa/gaffa/api-reference/post-v1-browser-requests) to open the demo site on the ecommerce site simulator with an infinitely scrolling storefront. It will wait for and dismiss a dialog box, wait for a product to load and then scroll down the page for a maximum of 20 seconds. If new items load, it will keep scrolling.

```json
{
  "url": "https://demo.gaffa.dev/simulate/ecommerce?loadTime=3&showModal=true&modalDelay=0&itemCount=infinite",
  "proxy_location": null,
  "async": false,
  "max_cache_age": 0,
  "settings": {
    "record_request": true,
    "actions": [
      { "type": "wait", "selector": "div[role=\"dialog\"]", "timeout": 10000 },
      { "type": "click", "selector": "[data-testid=\"accept-all-button\"]" },
      { "type": "wait", "selector": "[data-testid^=\"product-1\"]", "timeout": 5000 },
      { "type": "scroll", "percentage": 100, "max_scroll_time": 20000 }
    ]
  }
}
```

## Actions

{% content-ref url="../actions/wait" %}
[wait](/docs/gaffa/gaffa/features/browser-requests/actions/wait)
{% endcontent-ref %}

{% content-ref url="../actions/click" %}
[click](/docs/gaffa/gaffa/features/browser-requests/actions/click)
{% endcontent-ref %}

{% content-ref url="../actions/scroll" %}
[scroll](/docs/gaffa/gaffa/features/browser-requests/actions/scroll)
{% endcontent-ref %}

## Response

Here's a video showing Gaffa scrolling the page for 20 seconds as more items load.

{% embed url="" %}
Gaffa scrolling to the bottom of a simulated ecommerce page!
{% endembed %}

## Read More

Read more about screen recording here. (TODO)

{% content-ref url="../../../get-started" %}
[get-started](/docs/gaffa/gaffa/get-started)
{% endcontent-ref %}

# Capture a Full Height Screenshot

An example request that uses Gaffa to dismiss a modal, scroll to the bottom of a page and then capture a full height screenshot.
*The following example is a request we've pre-built to show you Gaffa's capabilities against our* [*demo site.*](https://demo.gaffa.dev) ***You can run this request right now in the*** [***Gaffa API Playground***](https://gaffa.dev/dashboard/playground?templateId=screenshot_ecommerce)***.***

Gaffa can also capture screenshots at any point during your interaction, for use in your app or just to work out exactly what was being shown at a given point in time. You can capture just what is visible on screen or the full height of the page.

## API Request

The request below uses the [POST endpoint](/docs/gaffa/gaffa/api-reference/post-v1-browser-requests) to open the demo site on the ecommerce page with 20 items, wait for and dismiss the dialog, scroll to the bottom of the page and capture a full height screenshot.

```json
{
  "url": "https://demo.gaffa.dev/simulate/ecommerce?loadTime=3&showModal=true&modalDelay=0&itemCount=20",
  "proxy_location": null,
  "async": false,
  "max_cache_age": 0,
  "settings": {
    "record_request": false,
    "actions": [
      { "type": "wait", "selector": "div[role=\"dialog\"]", "timeout": 10000 },
      { "type": "click", "selector": "[data-testid=\"accept-all-button\"]" },
      { "type": "wait", "selector": "[data-testid^=\"product-1\"]", "timeout": 5000 },
      { "type": "scroll", "percentage": 100 },
      { "type": "capture_screenshot", "size": "fullscreen" }
    ]
  }
}
```

## Actions

{% content-ref url="../actions/wait" %}
[wait](/docs/gaffa/gaffa/features/browser-requests/actions/wait)
{% endcontent-ref %}

{% content-ref url="../actions/click" %}
[click](/docs/gaffa/gaffa/features/browser-requests/actions/click)
{% endcontent-ref %}

{% content-ref url="../actions/scroll" %}
[scroll](/docs/gaffa/gaffa/features/browser-requests/actions/scroll)
{% endcontent-ref %}

{% content-ref url="../actions/capture-screenshot" %}
[capture-screenshot](/docs/gaffa/gaffa/features/browser-requests/actions/capture-screenshot)
{% endcontent-ref %}

## Response

The exported full height screenshot of the page, showing all items.
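
The infinite scroll and full height screenshot examples share their first three actions (wait for the dialog, accept it, wait for the first product). When requests are assembled in code, that shared prefix can be reused. The following is a rough sketch in Python; the helper names `dismiss_modal_actions` and `build_request` are hypothetical, while the selectors and fields are copied from the requests above.

```python
import json


def dismiss_modal_actions():
    """Actions shared by both ecommerce examples: wait for the dialog, accept it, wait for a product."""
    return [
        {"type": "wait", "selector": 'div[role="dialog"]', "timeout": 10000},
        {"type": "click", "selector": '[data-testid="accept-all-button"]'},
        {"type": "wait", "selector": '[data-testid^="product-1"]', "timeout": 5000},
    ]


def build_request(url, extra_actions, record_request=False):
    """Wrap a list of actions in the request body shape used by the examples above."""
    return {
        "url": url,
        "proxy_location": None,
        "async": False,
        "max_cache_age": 0,
        "settings": {
            "record_request": record_request,
            "actions": dismiss_modal_actions() + extra_actions,
        },
    }


screenshot_request = build_request(
    "https://demo.gaffa.dev/simulate/ecommerce?loadTime=3&showModal=true&modalDelay=0&itemCount=20",
    [
        {"type": "scroll", "percentage": 100},
        {"type": "capture_screenshot", "size": "fullscreen"},
    ],
)

# Serializing with json.dumps produces the same body as the JSON request above.
print(json.dumps(screenshot_request, indent=2))
```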

# Automated Form Filling

An example request that uses Gaffa to automate the completion of a form and wait for a success modal to appear.

*The following example is a request we've pre-built to show you Gaffa's capabilities against our* [*demo site.*](https://demo.gaffa.dev) ***You can run this request right now in the*** [***Gaffa API Playground***](https://gaffa.dev/dashboard/playground?templateId=form_fill)***.***

Filling in forms is tedious. Gaffa can fill out a form in a human-like manner so you can spend your time on more interesting things.

## API Request

The request below uses the [POST endpoint](/docs/gaffa/gaffa/api-reference/post-v1-browser-requests) to open the demo site on the form simulator page with some sections pre-filled (for speed). After typing in the required information and clicking submit, Gaffa waits for the success dialog to show before returning a video of the interaction.

```json
{
  "url": "https://demo.gaffa.dev/simulate/form?loadTime=3&showModal=false&modalDelay=0&formType=address&firstName=John&lastName=Doe&address1=123%20Main%20Street&city=London&country=UK",
  "proxy_location": null,
  "async": false,
  "max_cache_age": 0,
  "settings": {
    "record_request": true,
    "actions": [
      { "type": "type", "selector": "#email", "text": "johndoe@example.com" },
      { "type": "type", "selector": "#state", "text": "CA" },
      { "type": "type", "selector": "#zipCode", "text": "12345" },
      { "type": "click", "selector": "button[type='submit']" },
      { "type": "wait", "selector": "[role=\"dialog\"] h2:has-text(\"Success!\")", "timeout": 10000 }
    ]
  }
}
```

## Actions

{% content-ref url="../actions/type" %}
[type](/docs/gaffa/gaffa/features/browser-requests/actions/type)
{% endcontent-ref %}

{% content-ref url="../actions/click" %}
[click](/docs/gaffa/gaffa/features/browser-requests/actions/click)
{% endcontent-ref %}

{% content-ref url="../actions/wait" %}
[wait](/docs/gaffa/gaffa/features/browser-requests/actions/wait)
{% endcontent-ref %}

## Response

Here's a video showing Gaffa filling out the page and waiting for the success modal.

{% embed url="" %}
Gaffa can help automatically fill out your forms!
{% endembed %}

## Read More

Read more about screen recording here (TODO).

# API Authentication

We use API keys for authenticating requests to our API. In this document we'll explain how you can manage and use the keys for your account.

## Creating Keys

Once your account is approved, you will need to create an API key to send your requests to our API.

Go to your account [**Dashboard > API Keys**](https://gaffa.dev/dashboard/api-tokens) and create a new key with a name. Once the key is created, copy its value and you can start using it to make requests immediately.

{% hint style="info" %}
You can create as many keys as you wish, but always remember to treat each key as a secret and do not reveal it in public blog posts or GitHub repositories. If someone makes requests with your leaked key, we won't be responsible!
{% endhint %}

## Deleting Keys

If you are worried you have exposed your Gaffa API key, or just want to rotate your keys periodically, you can create another key and then delete your old keys. Deleted keys will immediately stop working for new requests to the API, but past browser requests made using old keys will still be available.

## Authenticating Requests

Our API is secured with a custom header, `X-API-Key`, whose value should be any current API key in your account. That's all you need to add to your request!

# POST v1/browser/requests

{% hint style="info" %}
For more information on browser requests, [see here](/docs/gaffa/gaffa/features/browser-requests).
{% endhint %}

The following endpoint creates a browser request and either runs it synchronously or returns immediately with an ID so you can check its status later using the `GET v1/browser/requests/{id}` endpoint.
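
As a minimal sketch of an authenticated call, the following uses only Python's standard library. The base URL is the one used in the tutorials in these docs; the helper names and the placeholder key are assumptions for illustration. Nothing is sent over the network until a prepared request is passed to `send`.

```python
import json
import urllib.request

API_BASE = "https://api.gaffa.dev/v1/browser/requests"


def build_create_request(api_key, payload):
    """Prepare (but do not send) a POST that creates a browser request."""
    return urllib.request.Request(
        API_BASE,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def build_status_request(api_key, request_id):
    """Prepare (but do not send) a GET that looks up a browser request by ID."""
    return urllib.request.Request(
        f"{API_BASE}/{request_id}",
        headers={"X-API-Key": api_key},
        method="GET",
    )


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


create = build_create_request(
    "YOUR_API_KEY",  # placeholder: use a key from your dashboard
    {
        "url": "https://demo.gaffa.dev/simulate/article?loadTime=3&paragraphs=10&images=3",
        "async": False,
        "settings": {"actions": [{"type": "generate_markdown"}]},
    },
)
print(create.get_method(), create.full_url)
```

Calling `send(create)` would submit the request, and `send(build_status_request(key, request_id))` would fetch it again later by ID.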
{% openapi-operation spec="gaffa-api" path="/v1/browser/requests" method="post" %}
[Broken link](/docs/gaffa/gaffa/api-reference/broken-reference)
{% endopenapi-operation %}

# GET v1/browser/requests/{id}

{% hint style="info" %}
For more information on browser requests, [see here](/docs/gaffa/gaffa/features/browser-requests).
{% endhint %}

The following endpoint allows you to query a browser request for your account by ID.

{% openapi-operation spec="gaffa-api" path="/v1/browser/requests/{id}" method="get" %}
[Broken link](/docs/gaffa/gaffa/api-reference/broken-reference)
{% endopenapi-operation %}

# GET v1/browser/requests

{% hint style="info" %}
For more information on browser requests, [see here](/docs/gaffa/gaffa/features/browser-requests).
{% endhint %}

The following endpoint allows you to query for multiple browser requests, either by status or by a list of particular IDs. Submitting a request with neither of these will return all requests for your account.

{% openapi-operation spec="gaffa-api" path="/v1/browser/requests" method="get" %}
[Broken link](/docs/gaffa/gaffa/api-reference/broken-reference)
{% endopenapi-operation %}

# Beta Endpoints

The following features are currently in beta and only available to select users. If you are interested in trying out any of these features, please contact support and we can enable them for you.

## JSON Data Parsing

Allows you to describe a JSON data schema for your data and then convert an online PDF into this data format using AI.
### Endpoints: {% content-ref url="beta-endpoints/post-v1-schemas" %} [post-v1-schemas](/docs/gaffa/gaffa/api-reference/beta-endpoints/post-v1-schemas) {% endcontent-ref %} {% content-ref url="beta-endpoints/put-v1-schemas" %} [put-v1-schemas](/docs/gaffa/gaffa/api-reference/beta-endpoints/put-v1-schemas) {% endcontent-ref %} {% content-ref url="beta-endpoints/delete-v1-schemas-id" %} [delete-v1-schemas-id](/docs/gaffa/gaffa/api-reference/beta-endpoints/delete-v1-schemas-id) {% endcontent-ref %} {% content-ref url="beta-endpoints/get-v1-schemas" %} [get-v1-schemas](/docs/gaffa/gaffa/api-reference/beta-endpoints/get-v1-schemas) {% endcontent-ref %} # POST v1/schemas The following endpoint allows you to describe a data schema for parsing an online PDF to JSON. {% openapi-operation spec="gaffa-api" path="/v1/schemas" method="post" %} [Broken link](/docs/gaffa/gaffa/api-reference/beta-endpoints/broken-reference) {% endopenapi-operation %} # PUT v1/schemas The following endpoint allows you to update a data schema by ID. {% openapi-operation spec="gaffa-api" path="/v1/schemas/{id}" method="put" %} [Broken link](/docs/gaffa/gaffa/api-reference/beta-endpoints/broken-reference) {% endopenapi-operation %} # DELETE v1/schemas/{id} The following endpoint allows you to delete a schema from your account. {% openapi-operation spec="gaffa-api" path="/v1/schemas/{id}" method="delete" %} [Broken link](/docs/gaffa/gaffa/api-reference/beta-endpoints/broken-reference) {% endopenapi-operation %} # GET v1/schemas The following endpoint allows you to list data schemas for your account in a paged list. {% openapi-operation spec="gaffa-api" path="/v1/schemas" method="get" %} [Broken link](/docs/gaffa/gaffa/api-reference/beta-endpoints/broken-reference) {% endopenapi-operation %} # Convert any webpage into LLM-ready Markdown using Gaffa The ability to convert websites into LLM-friendly markdown is powerful when building applications for summarization, Q\&A, or knowledge extraction. 
In this guide, you'll learn how to use the [Gaffa API](https://gaffa.dev/) to extract the main content of any web page using browser rendering and convert it into structured markdown.

By the end of this guide, you’ll be able to:

* Render web pages using Gaffa’s API.
* Extract clean page content.
* Generate structured markdown suitable for LLM-based Q\&A or summarization.

### **Prerequisites**

1. Install Python 3.10 or newer.
2. Create a virtual environment:

   ```sh
   python -m venv venv && source venv/bin/activate
   ```
3. Install the required libraries:

   ```sh
   pip install requests openai
   ```
4. Get your [Gaffa API](https://gaffa.dev/dashboard/api-keys) key and [OpenAI API](https://platform.openai.com/signup) key, and export them as environment variables so the script can read them:

   ```sh
   export GAFFA_API_KEY=your_gaffa_api_key
   export OPENAI_API_KEY=your_openai_api_key
   ```

### Convert a webpage to Markdown

In the code below, we define a function that takes a URL as input and makes a POST request to the Gaffa API. The request invokes the [generate\_markdown](/docs/gaffa/gaffa/features/browser-requests/actions/generate-markdown) action, which uses the browser rendering engine to extract the main content of the page and convert it into markdown.
{% code overflow="wrap" lineNumbers="true" %}
```python
import os

import openai
import requests

GAFFA_API_KEY = os.getenv("GAFFA_API_KEY")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")


# Fetch the markdown content from Gaffa
def fetch_markdown_with_gaffa(url):
    payload = {
        "url": url,
        "proxy_location": None,
        "async": False,
        "max_cache_age": 0,
        "settings": {
            "record_request": False,
            "actions": [
                {"type": "wait", "selector": "article"},
                {"type": "generate_markdown"},
            ],
        },
    }

    # Set the headers for the request
    headers = {
        "x-api-key": GAFFA_API_KEY,
        "Content-Type": "application/json",
    }

    # Make the POST request to the Gaffa API
    print("Calling Gaffa API to generate markdown...")
    response = requests.post(
        "https://api.gaffa.dev/v1/browser/requests", json=payload, headers=headers
    )
    response.raise_for_status()

    # Extract the markdown URL from the second action (generate_markdown)
    markdown_url = response.json()["data"]["actions"][1]["output"]

    # Fetch the markdown content from the generated URL
    print(f"📥 Fetching markdown from: {markdown_url}")
    markdown_response = requests.get(markdown_url)
    markdown_response.raise_for_status()
    return markdown_response.text
```
{% endcode %}

### Ask questions using OpenAI

Now that we have the markdown content, we can ask questions about it using the OpenAI API. The function below takes the markdown content and a question as input and uses the OpenAI API to generate an answer based on the provided content. In this case, we are using the [gpt-3.5-turbo](https://platform.openai.com/docs/models) model, but you can choose any other model.

{% code overflow="wrap" lineNumbers="true" %}
```python
def ask_question(markdown, question):
    openai.api_key = OPENAI_API_KEY

    # Only the first 3000 characters of the markdown are sent as context
    prompt = (
        f"You are an assistant helping analyze different webpages.\n\n"
        f"Markdown content:\n{markdown[:3000]}\n\n"
        f"Question: {question}\nAnswer as clearly as possible."
    )

    response = openai.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```
{% endcode %}

The markdown (truncated to its first 3000 characters) becomes the model’s context, enabling accurate answers about the original web content.

### User Interaction and Execution

Having defined the functions, we can now create a simple command-line interface that allows users to input a URL and ask questions about the content.

{% code overflow="wrap" lineNumbers="true" %}
```python
def main():
    url = input("Enter the URL of the article: ")
    try:
        markdown = fetch_markdown_with_gaffa(url)
        print("\n✅ Markdown successfully retrieved from Gaffa.\n")

        # Keep answering questions until the user types 'exit'
        while True:
            question = input("Ask a question about the content (or type 'exit'): ")
            if question.lower() == "exit":
                break
            answer = ask_question(markdown, question)
            print(f"\n💬 Answer: {answer}\n")
    except Exception as e:
        print(f"⚠️ Error: {e}")


if __name__ == "__main__":
    main()
```
{% endcode %}

### Full Script

The full script is available to download from the [Gaffa Python Examples GitHub repo](https://github.com/GaffaAI/GaffaPythonExamples/blob/main/scripts/WebpageToMarkdown/markdown_generator.py).

### Running the Script

To run the script, simply execute it in your terminal:

```sh
python your_script_name.py
```

With your script running, you can enter the URL of any web page, and the script will fetch the markdown content and allow you to ask questions about it.
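
The tutorial's `ask_question` function sends only the first 3000 characters of the markdown, which can cut the text off mid-sentence. One possible refinement, sketched below, trims at the last paragraph break that fits inside the limit. The `truncate_markdown` helper is hypothetical and is not part of the linked script.

```python
def truncate_markdown(markdown, limit=3000):
    """Return at most `limit` characters, cutting at a paragraph break when one exists."""
    if len(markdown) <= limit:
        return markdown
    clipped = markdown[:limit]
    # Prefer the last blank line inside the window so a paragraph is not split.
    cut = clipped.rfind("\n\n")
    return clipped[:cut] if cut > 0 else clipped


sample = "# Title\n\nFirst paragraph.\n\nSecond paragraph that is rather long."
print(truncate_markdown(sample, limit=40))
```

Inside the prompt, `markdown[:3000]` could then be swapped for `truncate_markdown(markdown)`.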
# Capture a Full-Height Screenshot of a Webpage

In just a few lines of JSON inlined in a single cURL command, you can automate:

* Dismissing Wikipedia’s EU cookie consent banner (if present)
* Waiting for the main heading on the Artificial Intelligence article
* Scrolling through every section (lazy-loaded images and all)
* Capturing a full-page PNG for archiving, visual regression, or documentation

All of this works without installing Playwright or managing headless browsers: Gaffa handles it for you server-side via the [Browser Requests API](https://gaffa-1.gitbook.io/gaffa/features/browser-requests).

### Prerequisites

* A valid Gaffa API key.
* A simple HTTP client (cURL, Postman, axios, etc.).
* Familiarity with the [API Playground](https://gaffa.dev/dashboard/playground) for testing browser requests.
* A target URL. For this tutorial we'll use Wikipedia's Artificial Intelligence article.

{% stepper %}
{% step %}
### Execute the Request

Use cURL with the full JSON payload inlined to ensure Gaffa receives exactly what you intend:

```sh
curl https://api.gaffa.dev/v1/browser/requests \
  --request POST \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: YOUR_API_KEY' \
  --data '{
    "url": "https://en.wikipedia.org/wiki/Artificial_intelligence",
    "async": false,
    "max_cache_age": 0,
    "settings": {
      "actions": [
        { "type": "wait", "selector": "#cookie-policy-notice", "timeout": 10000, "continue_on_fail": true },
        { "type": "click", "selector": "#cookie-policy-notice", "continue_on_fail": true },
        { "type": "wait", "selector": "#firstHeading", "timeout": 10000 },
        { "type": "scroll", "percentage": 100 },
        { "type": "capture_screenshot", "size": "fullscreen" }
      ]
    }
  }'
```

Replace `YOUR_API_KEY` with your actual key from your [Dashboard](https://gaffa.dev/dashboard/api-keys).

This command has the following actions:

1. **Wait** and **Click** (optional): Detect and accept Wikipedia’s cookie banner if it appears. If these actions fail, that simply means no banner was present or it did not load in time. Since `continue_on_fail` is set to `true` on both of them, Gaffa will move on without halting the workflow, ensuring the rest of the steps still execute.
2. **Wait**: Ensure the main heading (`#firstHeading`) is loaded.
3. **Scroll**: Scroll through the entire page to trigger any lazy-loaded content.
4. **Capture Screenshot**: Produce a full-page PNG.
{% endstep %}

{% step %}
### Retrieve Your Screenshot

A successful response returns JSON like:

{% code lineNumbers="true" %}
```json
{
  "data": {
    "id": "brq_VJX3mbESLiyCFYvZQEUih9RdDYovog",
    "url": "https://en.wikipedia.org/wiki/Artificial_intelligence",
    "proxy_location": null,
    "state": "completed",
    "credit_usage": 2,
    "http_status_code": 200,
    "from_cache": false,
    "started_at": "2025-06-09T15:55:46.4235903Z",
    "completed_at": "2025-06-09T15:56:27.9381332Z",
    "running_time": "00:00:40.7348244",
    "page_load_time": "00:00:02.2087117",
    "actions": [
      {
        "id": "act_VJX3memaue6YUgFcn44uNscZbVUpYg",
        "type": "wait",
        "query": "wait?selector=%23cookie-policy-notice%2C%20.mw-cookie-consent-container&timeout=10000&continue_on_fail=true",
        "timestamp": "2025-06-09T15:55:48.6323091Z",
        "error": "action_timed_out"
      },
      {
        "id": "act_VJX3mkwfwNPdGiMUpqKr34Tm5xzyUU",
        "type": "click",
        "query": "click?selector=%23cookie-policy-notice%20button%2C%20.mw-cookie-consent-container%20button&continue_on_fail=true&timeout=5000",
        "timestamp": "2025-06-09T15:55:58.7949275Z",
        "error": "action_timed_out"
      },
      {
        "id": "act_VJX3mkSJ3sevWRXUCjFy6zwfD172fV",
        "type": "wait",
        "query": "wait?selector=%23firstHeading&timeout=10000&continue_on_fail=false",
        "timestamp": "2025-06-09T15:56:03.9581113Z"
      },
      {
        "id": "act_VJX3mbq9Jgj8EwADszW2AqdeJJXJiY",
        "type": "scroll",
        "query": "scroll?percentage=100&max_scroll_time=20000&scroll_speed=medium&continue_on_fail=false",
        "timestamp": "2025-06-09T15:56:03.9691994Z"
      },
      {
        "id": "act_VJX3mjBQYv8zTsXv1SkgUnBkzNFmJU",
        "type": "capture_screenshot",
        "query": "capture_screenshot?size=fullscreen&continue_on_fail=false",
        "timestamp": "2025-06-09T15:56:20.0727905Z",
        "output": "https://storage.gaffa.dev/brq/image/brq_VJX3mbESLiyCFYvZQEUih9RdDYovog/act_VJX3mjBQYv8zTsXv1SkgUnBkzNFmJU_full.png"
      }
    ]
  },
  "error": null
}
```
{% endcode %}

The response contains the following information:

* **data.id**: Unique request identifier.
* **data.state**: "completed" means the workflow finished (even if some steps timed out).
* **data.credit\_usage**: Credits consumed for this run.
* **data.started\_at** / **data.completed\_at**: Workflow timing.
* **data.running\_time** and **data.page\_load\_time**: Performance metrics.
* **data.actions**: Each action’s details, including successes, timeouts, and the final screenshot URL.

Within the list of actions you'll see the `capture_screenshot` action, whose **output** parameter contains the URL of the full-size screenshot that was captured.
{% endstep %}
{% endstepper %}

If you don't want to use cURL, you can also run this query in the [Gaffa API Playground](https://gaffa.dev/dashboard/playground), which is an easy way to get started.

### Use Cases

Gaffa's screenshot action could be used for a huge number of use cases, but here are a few ideas:

* **Visual Regression**: Integrate into your CI pipeline to compare changes over time.
* **Archival**: Schedule daily captures for audit or compliance purposes.
* **Monitoring**: Automate periodic checks to detect visual bugs or layout shifts.

#### All this is powered by Gaffa’s hosted headless browsers with no local setup required.

Experiment with more actions and build complex browser workflows easily. Refer to the full [Browser Requests API documentation](https://gaffa-1.gitbook.io/gaffa/features/browser-requests) for additional capabilities.
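
To use the response in a script, the screenshot URL can be picked out of `data.actions` by action type rather than by position, and any failed optional steps can be reported alongside it. The sketch below assumes the response shape shown in this tutorial; the function name `summarize_actions` is hypothetical.

```python
def summarize_actions(response_json):
    """Split a browser request response into the screenshot URL and any failed actions."""
    screenshot_url = None
    failed = []
    for action in response_json["data"]["actions"]:
        if "error" in action:
            # For example, the optional cookie-banner steps report "action_timed_out".
            failed.append((action["type"], action["error"]))
        elif action["type"] == "capture_screenshot":
            screenshot_url = action.get("output")
    return screenshot_url, failed


# Trimmed-down copy of the example response from this tutorial.
example = {
    "data": {
        "state": "completed",
        "actions": [
            {"type": "wait", "error": "action_timed_out"},
            {"type": "click", "error": "action_timed_out"},
            {"type": "wait"},
            {"type": "scroll"},
            {
                "type": "capture_screenshot",
                "output": "https://storage.gaffa.dev/brq/image/brq_VJX3mbESLiyCFYvZQEUih9RdDYovog/act_VJX3mjBQYv8zTsXv1SkgUnBkzNFmJU_full.png",
            },
        ],
    },
    "error": None,
}

url, failed = summarize_actions(example)
print(url)
print(failed)
```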
# July 2025

Here are the major changes we've released in July 2025:

## Features

### Beta Features

* We've refined the [`parse_json`](https://app.gitbook.com/s/yUba6osOT5MkKiV0wmgr/features/browser-requests/actions/parse-json) action with a few new features:
  * You can now provide a predefined `data_schema_id` to parse your webpage content into, or you can provide a `data_schema` JSON object directly to the action so you can parse content with a single request.
  * If you are only interested in a certain part of the page, you can now define a `selector` and only the contents of that element will be parsed using AI.
  * You can now provide `output_type`, which tells Gaffa to return the required data as a file or embedded in the request.
* We've added the [`block_dom_removals`](https://app.gitbook.com/s/yUba6osOT5MkKiV0wmgr/features/browser-requests/actions/block-dom-removals) action, which is really useful if you are trying to scrape a site that implements infinite scrolling where items are removed from the page when out of view (usually modern JavaScript web apps).

## Documentation

We've now added an [llms.txt](https://gaffa.dev/docs/llms-full.txt) file for the docs so you can import all the Gaffa docs into your favourite large language model and get it to write your automations for you!

# June 2025

Here are the major changes we've released in June 2025:

## Features

### **Beta Features**

* The [`parse_json`](https://app.gitbook.com/s/yUba6osOT5MkKiV0wmgr/features/browser-requests/actions/parse-json) action now supports all web pages (as well as PDFs) so you can define a schema and convert any web page into that format.

{% hint style="info" %}
If you are interested in trying any of these beta features, please [contact support](https://gaffa.dev/support).
{% endhint %}

## Tutorials

We've added a new tutorial which walks you through how to use Gaffa to [capture a full height screenshot of a web page](https://app.gitbook.com/s/yUba6osOT5MkKiV0wmgr/capture-a-full-height-screenshot-of-a-webpage).

# May 2025

We've released several new features, some of which will remain in beta whilst we fine-tune them. These features are:

## Features

### Available Now

[**Download File**](https://app.gitbook.com/s/yUba6osOT5MkKiV0wmgr/features/browser-requests/actions/download-file) **-** download an online file using Gaffa (only available for PDFs for now).

### Beta Features

The following are available only to select accounts but [message us](https://gaffa.dev/support) if you'd like to try them!

[**Capture Element**](https://app.gitbook.com/s/yUba6osOT5MkKiV0wmgr/features/browser-requests/actions/capture-element) - extract the `innerHTML` of a particular element.

[**JSON Parsing**](https://app.gitbook.com/s/yUba6osOT5MkKiV0wmgr/features/browser-requests/actions/parse-json) - define a schema and then parse online data into JSON using a large language model. Currently this only works with online PDFs.

[**Parse Table**](https://app.gitbook.com/s/yUba6osOT5MkKiV0wmgr/features/browser-requests/actions/parse-table) - parse a table to a JSON object.

[**Capture Cookies**](https://app.gitbook.com/s/yUba6osOT5MkKiV0wmgr/features/browser-requests/actions/capture-cookies) - save a JSON object with the cookies for a given web page.

## Tutorials

We've added a new tutorial which walks you through how to use Gaffa in a Python script to ask questions about the content of a web page.
[Convert any webpage into LLM-ready Markdown using Gaffa](https://app.gitbook.com/s/yUba6osOT5MkKiV0wmgr/tutorials/convert-any-webpage-into-llm-ready-markdown-using-gaffa)

# Pre-May 2025

## April 2025

#### 02.04.2025

* **Subscriptions and Credits**
  * You can now buy "pay as you go" credits to be used without a subscription, or to complement the credits in your subscription for larger one-off jobs.
  * We've adjusted the plans and credits slightly; take a look at the [updated subscriptions](https://gaffa.dev/#pricing).
* **Actions**
  * We've edited the [Click](https://github.com/GaffaAI/Docs/blob/main/features/browser-requests/actions/click.md), [Type](https://github.com/GaffaAI/Docs/blob/main/features/browser-requests/actions/type.md) and [Wait](https://github.com/GaffaAI/Docs/blob/main/features/browser-requests/actions/wait.md) actions to better support finding elements to interact with or wait for inside iframes, with no extra configuration necessary.

## March 2025

#### 17.03.2025

* **Proxies**
  * We have added support for another European location, [France](https://github.com/GaffaAI/Docs/blob/main/features/browser-requests/#proxy-servers).
* **Actions**
  * [Simplified DOM](https://github.com/GaffaAI/Docs/blob/main/features/browser-requests/actions/generate-simplified-dom.md) action no longer removes classes.
  * [Click](https://github.com/GaffaAI/Docs/blob/main/features/browser-requests/actions/click.md) default `timeout` is now 5 seconds.
  * [Scroll](https://github.com/GaffaAI/Docs/blob/main/features/browser-requests/actions/scroll.md) no longer has a `timeout`; new functionality is available through `wait_time`, `max_scroll_time`, `scroll_speed` and `interval`.
  * [Type](https://github.com/GaffaAI/Docs/blob/main/features/browser-requests/actions/type.md) default `timeout` is now 5 seconds.
  * [Wait](https://github.com/GaffaAI/Docs/blob/main/features/browser-requests/actions/wait.md) default `timeout` is now 5 seconds.
* **Settings**
  * [`max_media_bandwidth`](https://github.com/GaffaAI/Docs/blob/main/features/browser-requests/#max-media-bandwidth) caps media downloads to prevent excess data usage.
  * [`time_limit`](https://github.com/GaffaAI/Docs/blob/main/features/browser-requests/#time-limit) added to cap the duration of requests.
* **Stealth**
  * We've developed some new browser technology which makes traffic from Gaffa's browsers look even more like human-initiated website traffic.
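
As a rough illustration of how the 17.03.2025 action and settings changes might fit together, here is a minimal Python sketch that assembles a request body without sending it. Only the parameter names `timeout`, `wait_time`, `max_scroll_time`, `scroll_speed`, `interval`, `max_media_bandwidth` and `time_limit` come from the notes above. The overall body shape (`url`, `settings`, `actions`, `type`, `selector`), the helper `build_request_body` and every value and unit are assumptions made for illustration, so check the API reference before relying on them.

```python
from __future__ import annotations

from typing import Any

# Parameter names taken from the 17.03.2025 changelog entry for Scroll.
SCROLL_PARAMS = {"wait_time", "max_scroll_time", "scroll_speed", "interval"}


def build_request_body(
    url: str,
    selector: str,
    scroll_options: dict[str, Any],
    time_limit: int,
    max_media_bandwidth: int,
) -> dict[str, Any]:
    """Assemble a hypothetical request body; nothing is sent over the network."""
    unknown = set(scroll_options) - SCROLL_PARAMS
    if unknown:
        # `timeout` ends up here too, since Scroll no longer accepts it.
        raise ValueError(f"unsupported scroll parameters: {sorted(unknown)}")

    actions = [
        # `timeout` is omitted on purpose so the 5-second default applies.
        {"type": "wait", "selector": selector},
        {"type": "click", "selector": selector},
        {"type": "scroll", **scroll_options},
    ]
    # Assumed layout: settings and actions nested under a `settings` object.
    return {
        "url": url,
        "settings": {
            "time_limit": time_limit,
            "max_media_bandwidth": max_media_bandwidth,
            "actions": actions,
        },
    }


body = build_request_body(
    url="https://example.com",
    selector="#load-more",
    scroll_options={"max_scroll_time": 10000, "interval": 500},  # placeholder values
    time_limit=60000,  # placeholder value, unit assumed
    max_media_bandwidth=5,  # placeholder value, unit assumed
)
```

Passing `{"timeout": 5000}` as `scroll_options` raises a `ValueError` in this sketch, which mirrors the removal of `timeout` from the Scroll action.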