Introduction

This integration lets you extract data from webpages using Clay’s Scrape Website action. Given one or more URLs, the action can retrieve any content on the page, including body text, links, emails, phone numbers, and keywords. You can then incorporate that web-based data into your Clay workflows, which streamlines data collection for business and research purposes.

Input

Name	Is Optional	Description	Type
Website URL			url
Scrape Delay in Seconds	true	The number of seconds to wait before scraping the website. This gives the JavaScript on the page time to load. If results are missing, try adding a delay. Maximum of 10 seconds.	number
Keep Non-Text in Body	true	When scraping body text, we automatically remove any scripts, styles, or images that may be present in the returned text. However, in certain cases, you may want to keep this content. If so, set this to true.	boolean
Output Fields	true	Optionally, select the fields you want to receive in your output data.	text
Extract Custom Regex	true	Use this field to extract custom data from the website. For example, to extract all of the Wikipedia links from the website, you can use https?://([a-z]{2,3}\.)?wikipedia\.org/wiki/[a-zA-Z0-9_-]*	text
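
To see what a pattern like the Wikipedia example matches, the sketch below runs it with Python's `re` module against a made-up snippet of page text. The dots are escaped so they match literal periods, and the optional subdomain group is written as non-capturing so that `findall` returns whole URLs. The sample text and variable names are illustrative assumptions, and the regex engine Clay uses may differ in details.

```python
import re

# Wikipedia-link pattern from the Extract Custom Regex example, with literal dots escaped.
# (?:...) is a non-capturing group, so findall returns the full matched URL.
WIKI_LINK = re.compile(
    r"https?://(?:[a-z]{2,3}\.)?wikipedia\.org/wiki/[a-zA-Z0-9_-]*"
)

# Hypothetical page text, for illustration only.
sample_text = (
    "See https://en.wikipedia.org/wiki/Web_scraping and "
    "http://wikipedia.org/wiki/Regex, but not https://example.com/wiki/Other."
)

wiki_links = WIKI_LINK.findall(sample_text)
print(wiki_links)
```

Testing a pattern locally like this before pasting it into the field is a quick way to catch unescaped dots or an overly greedy character class.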

Output

Name	Type
Title	text
Keywords	text
Description	text
Favicon	url
Social Links	object
Extracted Keywords	array
Links	array
Emails	array
Phone Numbers	array
Images	array
Body Text	text
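
When handling results downstream, it can help to think of one scrape result as a typed record. The sketch below mirrors the output table as a Python dataclass. The snake_case field names, the element types inside the arrays, and the shape of `social_links` are assumptions made for illustration, not Clay's documented schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ScrapeResult:
    """Hypothetical record mirroring the output table; not Clay's actual schema."""

    title: Optional[str] = None                              # text
    keywords: Optional[str] = None                           # text
    description: Optional[str] = None                        # text
    favicon: Optional[str] = None                            # url
    social_links: dict = field(default_factory=dict)         # object
    extracted_keywords: list = field(default_factory=list)   # array
    links: list = field(default_factory=list)                # array
    emails: list = field(default_factory=list)               # array
    phone_numbers: list = field(default_factory=list)        # array
    images: list = field(default_factory=list)               # array
    body_text: Optional[str] = None                          # text


# Fields that were not requested, or not found on the page, keep their empty defaults.
result = ScrapeResult(title="Example Domain", links=["https://example.com/about"])
print(result.title, len(result.links), result.emails)
```

Giving every field an empty default reflects the optional Output Fields input: a result may carry only the fields that were selected.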
