This post was originally published on the website of Pavouk OÜ (Estonia). On 2020-06-12, I announced that Pavouk OÜ was closing. The posts I wrote have been moved here.
What is Isoxya? That is a very good question, and one which I have pondered much during the time since I made the first commit to the first code repository. Blank files, containing absolutely nothing, stretching out before me like snow on mountains. Empty. Inspiring. And intimidating. Now, a little over two years later, I have put the final touches to a series of programs which open up so many possibilities. Interconnected, yet distinct, working together as part of a far larger whole. I have long been fascinated by systems, in the abstract sense of the word as well as the computing one, and I am frequently enchanted to observe just how many similarities they have to life as a human.
To give a more technical answer, Isoxya is a large-scale internet data-processing engine. By large-scale, I mean it can process websites with many millions—nay, tens of millions—of pages. By data-processing engine, I mean it can extract and transform that data in myriad ways, powering many different types of software, many different industries. The meaning of internet is, I hope, fairly clear—although on a broader level, considering what it ‘is’ and what role it has for our human civilisations is perhaps deserving of deeper consideration.
Isoxya is, in essence, a web crawler, or web ‘spider’. Crawlers are the things on top of which search engines are built. Indeed, the word pavouk means spider in Czech. It would be a mistake, however, to consider Isoxya as simply yet another SEO web crawler. Rather, it defines a plugin system, abstracting away the complexities of running a large-scale web crawler, solving many of the challenges in building such a system in a robust and scalable manner, whilst providing a straightforward, considered interface. Websites can be checked for SEO, e-commerce data can be extracted, content can be audited, and human language can be analysed, all using the same crawling system. If it’s possible to write a small script to process data from a single webpage, it’s likely possible to process data from millions of pages, with minimal or no code changes.
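To make the idea of 'one small script, many pages' a little more concrete, here is a minimal sketch in Python. It is not Isoxya's actual plugin interface: the names `process_page` and `run_crawl`, and the shape of the data passed between them, are assumptions made purely for illustration. The point is only that the per-page logic knows nothing about scale, whilst a separate piece of code decides how many pages to feed it.

```python
from __future__ import annotations

from html.parser import HTMLParser
from typing import Callable, Iterable, Iterator


class _TitleAndLinks(HTMLParser):
    """Collects the <title> text and every <a href> value on a page."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.links: list[str] = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def process_page(url: str, body: str) -> dict:
    """Hypothetical plugin: the 'small script' that handles one page."""
    parser = _TitleAndLinks()
    parser.feed(body)
    parser.close()
    return {
        "url": url,
        "title": parser.title.strip(),
        "link_count": len(parser.links),
    }


def run_crawl(
    pages: Iterable[tuple[str, str]],
    plugin: Callable[[str, str], dict],
) -> Iterator[dict]:
    """Hypothetical stand-in for the crawler: applies the plugin to each page.

    A real crawler would fetch, queue, and retry pages; here the pages are
    simply supplied as (url, body) pairs.
    """
    for url, body in pages:
        yield plugin(url, body)


if __name__ == "__main__":
    sample = "<html><head><title>Example</title></head><body><a href='/a'>A</a></body></html>"
    for result in run_crawl([("https://example.com/", sample)], process_page):
        print(result)
```

Because `run_crawl` accepts any iterable, the same `process_page` function could be handed a single page or a stream of millions without modification; everything concerned with fetching, scheduling, and fault tolerance would live on the crawler's side of that boundary.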
Thus, it is with pleasure and curiosity that I present to you the first version of Isoxya. It is stable, has already crawled many millions of webpages, and is a powerful foundation on which to build. It is now officially released, and available to technical SEO or data-processing companies with significant usage requirements, by individual negotiation. Stay tuned as Isoxya expands to become available to smaller companies and use cases as well, and as new products are built on top of it.
May you find peace, and help others to do likewise.