Scrape data and extract email addresses, phone numbers, and social media information with the Phantombuster Web Crawler.
This web crawler has been designed for marketers, salespeople, growth-hackers and recruiters. Why? Because browsing the web for basic data such as emails, phone numbers, and Instagram, Twitter, Facebook or LinkedIn accounts is a big part of the lead generation process.
Extracting data with a web crawler consists of letting a bot browse the web for you. Specify what information you need and watch it scrape the data you're looking for.
This web crawler can also be used for deep crawling. Set the depth to 2 and above and the crawler will go deeper on your target website in order to find and extract the data you're looking for.
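Conceptually, depth-limited crawling works like a breadth-first search over a site's link graph. Here is a minimal sketch in Python, using a hypothetical in-memory site instead of real HTTP requests (the URLs and the `SITE` map are invented for illustration; this is not Phantombuster's actual code):

```python
from collections import deque

# Hypothetical in-memory site: page URL -> links found on that page.
# A real crawler would fetch each page over HTTP instead.
SITE = {
    "https://example.com/": ["https://example.com/about", "https://example.com/team"],
    "https://example.com/about": ["https://example.com/contact"],
    "https://example.com/team": [],
    "https://example.com/contact": [],
}

def crawl(start, max_depth):
    """Breadth-first crawl: depth 0 visits only the start page,
    depth 1 follows its links once, and so on."""
    visited = set()
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        if depth < max_depth:
            for link in SITE.get(url, []):
                queue.append((link, depth + 1))
    return visited
```

With `max_depth=0` the crawler stays on the start page; with `max_depth=2` it reaches every page in this small example.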
You'll now see the 3 configuration dots blinking. Click on them.
4. Specify which domains you want to crawl
In the target field, you'll enter the list of websites you want to scrape data from. This can be either a single domain or a Google Spreadsheet URL containing many domains.
Your spreadsheet should contain a list of URLs (one link per row). You can specify the name of the column that contains the links. Simply enter the column name in the field below.
Don't forget to make that spreadsheet publicly accessible!
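For example, if such a spreadsheet were exported to CSV with the links in a column named Website, you'd enter Website as the column name. A rough Python sketch of how the links might be picked out of that sheet (the column and company names here are invented for illustration):

```python
import csv
import io

# Hypothetical CSV export of the spreadsheet, one link per row,
# with the link column named "Website".
SHEET = """Company,Website
Acme,https://acme.example.com
Globex,https://globex.example.com
"""

def read_links(csv_text, column):
    """Return every value from the named column, in row order."""
    return [row[column] for row in csv.DictReader(io.StringIO(csv_text))]
```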
5. Specify the data you want to scrape with the web crawler
Tick the box for each type of data you want the web crawler to scrape. For the moment, the following are available:
Facebook Page URLs
Instagram Profile URLs
Twitter account URLs
LinkedIn company URLs
YouTube channel URLs
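Under the hood, extracting these profile URLs from a page boils down to pattern matching. A simplified sketch, using regex patterns of my own devising (Phantombuster's actual matching rules may differ):

```python
import re

# Illustrative patterns for each profile-URL type the crawler can
# extract. These are assumptions, not Phantombuster's internals.
PATTERNS = {
    "facebook": re.compile(r"https?://(?:www\.)?facebook\.com/[\w.\-]+"),
    "instagram": re.compile(r"https?://(?:www\.)?instagram\.com/[\w.\-]+"),
    "twitter": re.compile(r"https?://(?:www\.)?twitter\.com/\w+"),
    "linkedin": re.compile(r"https?://(?:www\.)?linkedin\.com/company/[\w\-]+"),
    "youtube": re.compile(r"https?://(?:www\.)?youtube\.com/channel/[\w\-]+"),
}

def extract_profiles(html):
    """Return every social profile URL found in a page's HTML,
    grouped by network."""
    return {name: pattern.findall(html) for name, pattern in PATTERNS.items()}
```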
6. Specify the condition for the web crawler to exit
In order to move on to the next website, your web crawler needs an exit condition. Without one, when you set a great depth (2 or above), the crawler might get stuck on a single website for a long time and use up all your execution time.
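In other words, the exit condition is a success check plus a budget: a site's crawl stops as soon as the data is found or a page limit is hit. A minimal illustration (the function name and the page-list shape are assumptions, not Phantombuster's API):

```python
def crawl_until(pages, wanted, max_pages=50):
    """Walk through (url, text) pairs and stop as soon as the wanted
    data shows up, or after max_pages pages, so a single website
    can't eat the whole execution time."""
    visited = 0
    for url, text in pages:
        visited += 1
        if wanted in text:
            return {"found_on": url, "pages_visited": visited}
        if visited >= max_pages:
            break
    return {"found_on": None, "pages_visited": visited}
```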
7. Other information
Scrape multiple results per website: Tick this box if you want to catch every available instance of the items you're after on each page you browse, not just the first one.
Visits only websites that start with URL: This option is useful if you want to go deep only on specific pages, for instance only those whose URLs contain the word search, or companies.
Depth: The crawler's depth is set to 0 by default, meaning it won't follow any links.
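Taken together, the depth and start-with-URL options act as a filter on which links the crawler follows. A toy sketch (the parameter names are mine, not the phantom's real settings):

```python
def follow_link(link, depth, *, prefix="", max_depth=0):
    """Decide whether to visit a link: respect the depth limit, and
    if a prefix is configured, only follow URLs that start with it.
    Illustrative only, not Phantombuster's actual logic."""
    if depth > max_depth:
        return False
    return link.startswith(prefix)
```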
Start your automation!
You're all set. Just click "launch" to get your automation started!
Set this automation on repeat
Once your automation's configuration is ready, you can schedule repeated launches. This will help you avoid rate limits, scrape more data, and spread your automated workflows over days, weeks, even months.
To do so, go to your dashboard and look for your automation's “Settings” button.
Then, select a frequency:
Then save those new settings at the bottom of the page.
This API will output CSV and/or JSON containing the following fields:
Share this API
Your friends & colleagues need to know about this!