Open-Source Postgres-XL Docker, NGINX Test Upstream, Isoxya API Demo (Shell) updates

2019-06-21 · Isoxya

This post was originally published on the website of Pavouk OÜ (Estonia). On 2020-06-12, I announced that Pavouk OÜ was closing. The posts I wrote have been moved here.


We’re pleased to announce a round of updates to three open-source projects we maintain: Postgres-XL Docker, NGINX Test Upstream, and Isoxya API Demo (Shell). So much of the time in tech, we’re standing on the shoulders of giants, and it’s good to give something back. Isoxya, the high-performance internet data processor and web crawler, makes grateful use of many open-source components, and we’re gradually open-sourcing some of the small components and discoveries we’ve made along the way.

If you’re wondering why the image is of elephants rather than spiders, check out the Postgres-XL and PostgreSQL logos. ;)

Postgres-XL Docker

Postgres-XL Docker is a Docker image source for Postgres-XL, the scalable open-source PostgreSQL-based database cluster. The images are based on Debian.

The images support arbitrary database cluster topologies: GTM, GTM Proxy, Coordinator, and Datanode nodes can be created and added as desired. Each service runs in its own container, communicating over a backend network. Coordinator nodes also connect to a frontend network.

We’ve just migrated the project from a personal account to Pavouk’s GitHub and Docker Hub accounts. We took the opportunity to do a round of work on the packaging, fixing and improving various things:

  • [#26] packaged Postgres-XL 10R1.1 (XL_10_R1_1-6-g68c378f), releasing to Docker Hub
  • [#26] packaged Postgres-XL 9.5 R1.6 (XL9_5_STABLE-1017-gaed0774), releasing to Docker Hub
  • [#22] removed the PG_NET_CLUSTER_A environment variable, which had proven rather fragile, especially when creating and destroying lots of Postgres-XL stacks using Docker Swarm; pg_hba.conf is now updated using auto-detection instead
  • [#15] improved GTM healthcheck
  • [#24] fixed and improved Coordinator healthcheck
  • [#20] removed pgxc_ctl; this is no longer the right way to use Postgres-XL Docker, and removing it reduces both confusion and the size of the image
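
To illustrate the kind of pg_hba.conf auto-detection mentioned for #22 above, here is a minimal sketch; it is not the image's actual entrypoint code. It assumes PostgreSQL's built-in samenet keyword, which matches any subnet the server is directly connected to, and it assumes trust authentication is acceptable because the backend network is isolated. The allow_samenet function name and the example path are hypothetical.

```shell
#!/bin/sh
# Append a samenet rule to a pg_hba.conf file unless it is already present.
# ASSUMPTION: "trust" is only reasonable on an isolated backend network.
# usage: allow_samenet PATH_TO_PG_HBA_CONF
allow_samenet() {
    hba=$1
    rule='host all all samenet trust'
    # -x matches whole lines, -F treats the rule as a fixed string
    grep -qxF "$rule" "$hba" 2>/dev/null || printf '%s\n' "$rule" >> "$hba"
}

# Example (hypothetical path):
# allow_samenet /var/lib/postgresql/data/pg_hba.conf
```

Because the rule is only appended when missing, the function can safely run on every container start.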

Using the latest releases, it’s now possible to create and initialise a test cluster (1 GTM, 2 Coordinators, 2 Datanodes) in just 2 commands, including clustering the nodes together, bringing them online, and installing node-type-aware healthchecks. Configuring other cluster topologies is also easy, as is using Docker Swarm, Kubernetes, or another orchestrator. This vastly cuts down on time and complexity compared with the pgxc_ctl method, since no SSH services are required; the healthchecks are applied independently, and nodes are restarted as required until the cluster is stable. In our experience, this makes disaster recovery far easier and, in many cases, entirely automatic.
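
When scripting against such a cluster, it can help to wait until a container's healthcheck reports success before connecting. The following is a minimal sketch, not part of Postgres-XL Docker: the wait_healthy function and the container name coord_1 are hypothetical, and it assumes the container defines a Docker healthcheck, whose status docker inspect exposes as .State.Health.Status.

```shell
#!/bin/sh
# Poll a status command until it prints "healthy", or give up after TRIES attempts.
# usage: wait_healthy TRIES DELAY_SECONDS COMMAND [ARGS...]
wait_healthy() {
    tries=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$tries" ]; do
        # A failing status command is treated the same as "not healthy yet"
        status=$("$@" 2>/dev/null) || status=
        [ "$status" = healthy ] && return 0
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example (hypothetical container name):
# wait_healthy 30 2 docker inspect --format '{{.State.Health.Status}}' coord_1
```

Passing the status command as arguments keeps the loop independent of Docker, so the same function works with another orchestrator's status command.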

NGINX Test Upstream

NGINX Test Upstream is a simple NGINX configuration which logs request body payloads. This is helpful for testing purposes, especially when developing APIs or proxies. A simple healthcheck using curl is also included.

There’s not really much to say about it, except that it’s a simple wrapper around NGINX, applying a couple of configs to log request bodies. We find this very useful when testing Tigrosa and Isoxya.

curl -XPUT -d'{"abc":123}' localhost
127.0.0.1 -  [03/Jun/2019:16:21:25 +0000] "PUT / HTTP/1.1" 202 0 "" "curl/7.52.1" "" "{\"abc\":123}"
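
In automated tests, the logged body can be pulled back out of an access-log line and compared with what was sent. This is a minimal sketch, assuming the field layout of the example line above, with the body as the last quoted field and inner quotes escaped as \". The extract_body function is hypothetical and not part of the project, and it undoes only that one escape.

```shell
#!/bin/sh
# Read access-log lines on stdin and print the request body of each one.
# ASSUMPTION: the body is the last double-quoted field, with " escaped as \".
extract_body() {
    # First sed keeps the text inside the last ' "…"' pair; second undoes \" escapes
    sed -n 's/^.* "\(.*\)"$/\1/p' | sed -e 's/\\"/"/g'
}

# Example (hypothetical container name):
# docker logs upstream 2>&1 | tail -n 1 | extract_body
```

Applied to the example log line above, this prints {"abc":123}.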

Isoxya API Demo (Shell)

Isoxya API Demo (Shell) is a collection of shell scripts (Bash) demoing basic usage of the Isoxya API from the command line. Although it is possible to launch a crawl and generate a new site-snapshot entirely using these scripts, they are not intended to be used directly in production. Rather, they are intended to inform (and inspire?!) as to the basic functionality of Isoxya, and to provide an implementation reference when taken in combination with the Isoxya API Manual (available separately to interested parties).

We update this repository quite often, sometimes multiple times a week, both tweaking existing scripts and adding new ones that expose new functionality of Isoxya. Recent updates demonstrate the new maximum pages and maximum depth settings released in Isoxya 1.2, as well as the organisation user-agent whitelists used to identify Isoxya during crawling.