Content Grabber is used for web scraping and web automation. It can extract content from almost any website and save it as structured data in a format of your choice, including Excel reports, XML, CSV and most databases.
Price comparison portals / mobile apps
Market Intelligence and Monitoring
B2B integration / process automation
Content Grabber is targeted at companies with a critical reliance on web scraping, and a focus on scalability and reliability.
The web contains a massive amount of data, and Content Grabber will extract it faster and more reliably than any other software, with the help of multi-threading, optimized web browsers, and many other performance tuning options.
The ease of use and visual approach of the Content Grabber agent editor make it suitable for building hundreds of web scraping agents, much faster than with any other software.
The agent editor automatically detects and configures the required commands. It will automatically create lists of content and links, handle pagination and web forms, download or upload files, and configure any other action you perform on a web page. At the same time, you always have the option to manually fine-tune the commands, so Content Grabber gives you both simplicity and control.
With hundreds of web scraping agents, you need the right tools to manage them all, and Content Grabber will not let you down. You can view status and logs of all your agents, or run and schedule your agents in one centralized location.
Content Grabber was designed from the very beginning with performance and scalability as the top priority. Multi-threading is used wherever appropriate to limit common web scraping bottlenecks such as web page retrieval.
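To illustrate why multi-threading helps with page retrieval, here is a minimal Python sketch. It is not Content Grabber's implementation: `fetch_page` is a hypothetical stand-in that sleeps to simulate network latency, and `fetch_all` is an assumed helper name. The point is only that pages waiting on the network can be fetched side by side instead of one after another.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_page(url: str) -> str:
    """Hypothetical stand-in for an HTTP request; sleeps to simulate latency."""
    time.sleep(0.05)
    return f"<html><title>{url}</title></html>"


def fetch_all(urls, max_workers=8):
    """Fetch pages concurrently; results come back in the same order as urls."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order even though the work overlaps in time.
        return list(pool.map(fetch_page, urls))
```

With eight workers, sixteen simulated pages take roughly two rounds of latency rather than sixteen, which is the bottleneck the paragraph above refers to.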
Optimized web browsers
Web browsers are used to load and parse web pages, and Content Grabber has a range of different browsers to achieve maximum performance in every scenario, from a fully dynamic web browser to the ultra-fast HTML5 parser-only browser. Different types of browsers can be used on the same website, and Content Grabber will normally use many browsers at the same time, all running multi-threaded.
All web scraping tools spend most of their time waiting for new web pages to load, so it's important to optimize this process. Content Grabber will automatically optimize page loads, but will also allow you to get under the hood to fine-tune every aspect of the process.
Web scraping is notoriously unreliable and will often fail because of problems you have no control over. We understand that reliability is extremely important in many situations, so we have tackled this difficult issue head on and added strong support for debugging, error handling and logging.
Content Grabber has one of the best debuggers of any web automation software, and this will help you build reliable agents where all issues that can be resolved at design time are resolved at design time.
Many web scraping errors are unavoidable even with the best designed agents, and this is where error handling comes into play. One example could be an unreliable website that suddenly starts returning only error pages, and requires a web browser restart to start functioning again.
Many dynamic websites have bugs causing errors that are impossible to handle gracefully. Dynamic websites are small applications running in your web browser, and they may crash, hang, leak memory or cause many other fatal issues.
Content Grabber uses a health monitor process that looks for problems in the running web browsers, and restarts browsers that have run into trouble. A restarted web browser will continue from the point where it failed, so in most situations, this will not cause any interruption to the web scraping process.
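The restart-and-resume idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Content Grabber works internally: `FakeBrowser`, `BrowserCrashed`, and `run_with_restarts` are hypothetical names, and the "browser" simply raises an exception on chosen URLs to simulate a crash. The loop keeps a checkpoint of the next URL to process, so a replacement browser picks up where the failed one stopped.

```python
class BrowserCrashed(Exception):
    """Raised by the simulated browser when it hits a fatal problem."""


class FakeBrowser:
    """Hypothetical browser session that crashes once on each listed URL."""

    def __init__(self, crash_on):
        self.crash_on = crash_on  # shared set; a URL is removed after it crashes

    def load(self, url):
        if url in self.crash_on:
            self.crash_on.discard(url)
            raise BrowserCrashed(url)
        return f"content of {url}"


def run_with_restarts(urls, crash_on, max_restarts=5):
    """Process urls in order, restarting the browser on a crash and resuming."""
    results = []
    position = 0  # checkpoint: index of the next URL to process
    restarts = 0
    browser = FakeBrowser(crash_on)
    while position < len(urls):
        try:
            results.append(browser.load(urls[position]))
            position += 1  # advance only after the page succeeded
        except BrowserCrashed:
            restarts += 1
            if restarts > max_restarts:
                raise  # give up instead of looping forever
            browser = FakeBrowser(crash_on)  # fresh browser, same checkpoint
    return results, restarts
```

Because the checkpoint only advances after a successful load, no page is skipped and none is processed twice.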
Logging & notifications
Some website errors occur very rarely and may be impossible to catch during debugging. An example could be CAPTCHA protection that appears after hours of web scraping, or simply a broken Internet connection. Content Grabber can log all activity and errors, including the full HTML of web pages that are causing problems. This makes it much easier to identify runtime errors and take appropriate action to resolve them.
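As a rough illustration of logging an error together with the full HTML of the offending page, the following Python sketch saves a snapshot file and records its path in the log entry. The function name `log_page_error`, the log message layout, and the snapshot file naming are all assumptions made for this example and do not describe Content Grabber's own log format.

```python
import hashlib
import logging
from pathlib import Path


def log_page_error(logger, url, html, error, snapshot_dir):
    """Save the page HTML to a snapshot file and log the error with its path."""
    snapshot_dir = Path(snapshot_dir)
    snapshot_dir.mkdir(parents=True, exist_ok=True)
    # Derive a stable, filesystem-safe file name from the URL.
    name = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16] + ".html"
    path = snapshot_dir / name
    path.write_text(html, encoding="utf-8")
    logger.error("scrape failed url=%s error=%s snapshot=%s", url, error, path)
    return path
```

Keeping the HTML next to the log entry means a rare failure, such as a CAPTCHA page that appears hours into a run, can be inspected later without reproducing it.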
Notifications can be used to notify an administrator about specific problems, such as missing web content or other errors.
Content Grabber can email status reports to an administrator when errors or notifications have occurred during web scraping.
The Content Grabber agent editor has a typical point-and-click user interface where you click on the content you want to extract, or on the buttons and links you want to follow.
The agent editor sets itself apart from the crowd with its built-in smarts that automatically detect and configure all commands. It will automatically create lists of content and links, handle pagination and web forms, download or upload files, and configure any other action you perform on a web page. At the same time, you always have the option to manually fine-tune the commands, so Content Grabber gives you both simplicity and control.
The Content Grabber agent editor is so simple to use that it can easily be used by beginners, and the built-in smarts enable users to quickly build large numbers of web scraping agents.
Data is everything when it comes to web scraping. Content Grabber allows you to load data from any source and use it in your agents for anything you need. You can also export extracted data to almost anywhere. This flexibility is key, enabling your technology to grow with your business.
Once data has been extracted and exported, it can be distributed by email, FTP or a custom defined destination.
Content Grabber is designed to manage hundreds of agents in a professional web scraping environment with development, testing and production servers.
Logs, schedules and status information for all agents can be managed in one centralized location, and all proxies, database connections and script libraries can be managed on a per-server basis.
No one wants to write scripts to get things done, and with Content Grabber you rarely have to. However, if you have some unusual requirements, or you need to fine-tune some process, it's nice to know the ability is there.
Content Grabber has a fully-fledged built-in script editor with IntelliSense that is more than capable when building smaller scripts.
Build royalty-free, self-contained web scraping agents that can run anywhere without the Content Grabber software. A self-contained agent is a single executable file that is easy to send or copy anywhere, and has a multitude of powerful configuration options.
You are free to sell or give away your self-contained agents and you can add promotional messages and advertisements to the agents' user interface. Content Grabber imagery / adverts are also included. Note: If you want to white-label your self-contained agent you will need to use the Premium Edition of Content Grabber.
You can run agents from the command-line by using the Content Grabber command-line program. With this you can specify command-line parameters that can easily be used as input data by your agents.
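The general pattern of passing command-line parameters to an agent as input data can be sketched as follows. This is a generic Python illustration and does not reflect the actual syntax or options of the Content Grabber command-line program: the `KEY=VALUE` convention, the `agent` argument, and the function `parse_input_parameters` are all hypothetical.

```python
import argparse


def parse_input_parameters(argv):
    """Parse a hypothetical 'agent KEY=VALUE ...' command line into inputs."""
    parser = argparse.ArgumentParser(description="Hypothetical agent runner")
    parser.add_argument("agent", help="name or path of the agent to run")
    parser.add_argument("params", nargs="*", metavar="KEY=VALUE",
                        help="input values made available to the agent")
    args = parser.parse_args(argv)

    inputs = {}
    for item in args.params:
        key, sep, value = item.partition("=")
        if not sep or not key:
            parser.error(f"expected KEY=VALUE, got {item!r}")
        inputs[key] = value  # values stay strings; the agent decides how to use them
    return args.agent, inputs
```

For example, the argument list `["prices", "category=laptops", "max_pages=3"]` yields the agent name `prices` and an input dictionary with the two keys `category` and `max_pages`.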