Isoxya API Demo (Bash)

2019-05-16 · Isoxya

This post was originally published on the website of Pavouk OÜ (Estonia). On 2020-06-12, I announced that Pavouk OÜ was closing. The posts I wrote have been moved here.


We’re pleased to announce the publication of the Isoxya API Demo (Bash) code to GitHub, available under the BSD3 licence. We realise it’s not always easy to get started with a new system, and the more help you can get, the better, especially if you’re allowed to share and modify it as desired! We think that, used as a reference, this code could save programmers hours of working out how to use the Isoxya API, although we recommend that you treat it as inspiration only, rather than using it directly in production code.

Isoxya is a High-Performance Internet Data Processor and Web Crawler. It is designed as a next-generation web crawler, scalable for large sites (millions of pages), cost-effective for tiny sites (1+ pages), offering flexible data processing using multi-industry plugins, and delivering results via data streaming to multiple storage backends. It is controlled via a REST API using JSON, and is available now for private preview.

This repository is a collection of shell scripts (Bash), demoing basic usage of the Isoxya API from the command line. Although it is possible to launch a crawl and generate a new site-snapshot entirely using these scripts, they are not intended to be used directly in production; rather, they are intended to inform (and inspire?!) as to the basic functionality of Isoxya, and to provide an implementation reference, taken in combination with the Isoxya API Manual (available separately to interested parties).

Isoxya uses Tigrosa, a Secure Authentication & Authorisation Proxy, to handle common concepts such as organisations, users, and sessions. Tigrosa sits in front of an API (almost certainly a REST API using JSON, but not necessarily so) and handles the primary security layer, including organisation and user management. It proxies the remaining requests through to the backend, which in this case is Isoxya. Thus, much of the security-related logic, although used in Isoxya, is in fact Tigrosa’s domain. Tigrosa is not currently available separately.
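
As a rough illustration of the request flow described above, here is a minimal Bash sketch of a helper that sends a JSON request through an authenticating proxy to a backend API. It is not taken from the demo repository or the Isoxya API Manual: the `/site` path, the `Authorization` header scheme, the `api.example.com` host, and the `API_BASE`, `API_TOKEN`, and `DRY_RUN` variables are all hypothetical placeholders chosen for the example. By default it only prints the `curl` command it would run.

```bash
#!/usr/bin/env bash
# Sketch only: every endpoint, header, and variable name here is a
# hypothetical placeholder, not the real Isoxya or Tigrosa API.

# Build a JSON body for a hypothetical "register a site" request.
# Assumes the URL contains no characters that need JSON escaping.
site_body() {
  local url="$1"
  printf '{"url":"%s"}' "$url"
}

# POST a JSON body to a path behind the proxy.
# With DRY_RUN=1 (the default) the curl command is printed, not executed.
api_post() {
  local path="$1" body="$2"
  local base="${API_BASE:-https://api.example.com}"
  local cmd=(
    curl --silent --fail --request POST
    --header 'Content-Type: application/json'
    --header "Authorization: Bearer ${API_TOKEN:-}"
    --data "$body"
    "${base}${path}"
  )
  if [[ "${DRY_RUN:-1}" == 1 ]]; then
    printf '%s\n' "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}
```

After sourcing the file, a call such as `api_post /site "$(site_body 'http://example.com')"` shows the request that would be sent; setting `DRY_RUN=0` would execute it against whatever `API_BASE` points to. Keeping the credential in a single place mirrors the idea that the proxy, rather than the backend, owns the security layer.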