Katana Overview
A web crawling framework designed to navigate websites and uncover hidden details
What is Katana?
Katana is a command-line interface (CLI) web crawling tool written in Go, designed to be fast and efficient and to produce simple output. It crawls websites to gather information and endpoints. One of its defining features is headless browsing, which lets it crawl single-page applications (SPAs) built with technologies such as JavaScript, Angular, or React and gather information from them effectively.
- Check out this great ProjectDiscovery blog post from the initial launch for more information on Katana.
- We also wrote a great “deep dive” blog post on Katana’s field extraction.
Features and capabilities
- Fast and fully configurable web crawling
- Support for Standard and Headless modes
- JavaScript parsing and crawling support
- Customizable automatic form-filling
- Customizable output through preconfigured fields
- Customizable scope control through preconfigured fields and Regex
- Support for input through STDIN, URL, and LIST
- Support for output to STDOUT, FILE, and JSON
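To make the idea of regex-based scope control more concrete, here is a minimal Go sketch of how a crawler might decide whether a discovered URL is in scope. This is not Katana's implementation or API: the `scopeFilter` type, its `allows` method, and the example patterns and URLs are all hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// scopeFilter is a hypothetical helper (not part of Katana) that keeps
// compiled include and exclude patterns for deciding crawl scope.
type scopeFilter struct {
	include []*regexp.Regexp
	exclude []*regexp.Regexp
}

// compileAll compiles every pattern and fails on the first invalid one.
func compileAll(patterns []string) ([]*regexp.Regexp, error) {
	compiled := make([]*regexp.Regexp, 0, len(patterns))
	for _, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			return nil, fmt.Errorf("invalid pattern %q: %w", p, err)
		}
		compiled = append(compiled, re)
	}
	return compiled, nil
}

// newScopeFilter builds a filter from include and exclude regex strings.
func newScopeFilter(include, exclude []string) (*scopeFilter, error) {
	inc, err := compileAll(include)
	if err != nil {
		return nil, err
	}
	exc, err := compileAll(exclude)
	if err != nil {
		return nil, err
	}
	return &scopeFilter{include: inc, exclude: exc}, nil
}

// allows reports whether rawURL is in scope. Exclude patterns win over
// include patterns; an empty include list means "everything not excluded".
func (f *scopeFilter) allows(rawURL string) bool {
	for _, re := range f.exclude {
		if re.MatchString(rawURL) {
			return false
		}
	}
	if len(f.include) == 0 {
		return true
	}
	for _, re := range f.include {
		if re.MatchString(rawURL) {
			return true
		}
	}
	return false
}

func main() {
	// Assumed example scope: stay on example.com, skip static assets.
	f, err := newScopeFilter(
		[]string{`^https://example\.com/`},
		[]string{`\.(png|jpg|css)$`},
	)
	if err != nil {
		panic(err)
	}
	urls := []string{
		"https://example.com/login",
		"https://example.com/logo.png",
		"https://other.test/",
	}
	for _, u := range urls {
		fmt.Printf("%s in scope: %v\n", u, f.allows(u))
	}
}
```

Letting exclude patterns take precedence is a design choice made for this sketch; it keeps unwanted paths out even when a broad include pattern would otherwise match them.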
Additional Katana resources
Katana is an open source tool with a robust community, so there are many community-created resources available. We are happy to share them to offer even more information about our tools.
- https://medium.com/@sherlock297/katana-framework-how-to-use-it-to-scan-and-mass-collect-website-data-107f5ae326e0
- https://medium.com/@cuncis/katana-an-overview-of-the-powerful-web-application-security-scanner-cheat-sheet-6fc50236aff6
- https://www.geeksforgeeks.org/katana-crawling-and-spidering-framework/
Support
Questions about using Katana? Issues working through installation? Cool story or use case you want to share? Get in touch!
Check out the Help section of the docs or reach out to us on Discord.