April

Samples, Blog & Tutorials

How to Scrape a Table with Python (The Easy Way)

We published a full walkthrough of two approaches to scraping tables: using capture_dom with BeautifulSoup, and using Gaffa's parse_table action. It covers when to use each and how to get clean, structured output either way. Read the blog.
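
As a rough illustration of the first approach: once the page HTML has been captured, turning a table into records is a small parsing step. The sketch below assumes the HTML is already available as a string and uses Python's standard-library html.parser as a stand-in for BeautifulSoup so that it runs without extra installs. SAMPLE_HTML, TableParser, and table_to_records are hypothetical names made up for this example, and no Gaffa API call is shown.

```python
from html.parser import HTMLParser

# Hypothetical captured HTML; in practice this would come from a DOM capture.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td>Widget</td><td>$4.00</td></tr>
  <tr><td>Gadget</td><td> $12.50 </td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect every table row in the document as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_records(html):
    """Treat the first row as the header and return one dict per later row."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


records = table_to_records(SAMPLE_HTML)
for record in records:
    print(record)
```

This sketch treats every row in the document as part of one table and does not handle nested tables, rowspan, or colspan. Real pages often need those cases covered, which is where a fuller parser or a dedicated table action is likely to be the better fit.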

Web Scraping with JavaScript Using Gaffa

We published a full walkthrough on scraping the web in JavaScript using Gaffa's REST API. It covers two core actions: generate_markdown for extracting clean, readable content, and capture_dom for pulling raw HTML when you need more control. It also touches on dynamic content handling, geo-routing, async mode, and actions like parse_table and parse_json. Read the blog.

Gaffa at Major League Hacking's Global Hack Week

We presented a live session at MLH's Global Hack Week Cloud, walking through web scraping fundamentals, an intro to Gaffa, and two hands-on demos. The first used generate_markdown to scrape a Wikipedia article and feed it into an OpenAI Q&A session; the second used parse_json to extract structured data from both web pages and hosted PDFs. The session also covered scraping legality, dynamic content, and how parse_json field descriptions can enforce specific output formats. We've also added both demo notebooks to our Python examples repository.

Read the recap.
