
Orchestrating notebooks with Camel K, an integration-based approach

Once your data analytics pipelines reach a certain level of maturity, you tend to start exploring flexible ways to orchestrate your ETL processes and to derive tables for the different access patterns required by the business downstream. However, much like the object-relational impedance mismatch between the object-oriented and relational database worlds, there is an impedance mismatch between data engineers and business folks in the expectations they have around speed of delivery, quality, correctness, and maintainability.

In the mindset of the business folk, the derivative work or data they require is just a simple SQL query run against the “raw data”, on infrastructure with infinite amounts of CPU power and infinite amounts of memory, probably running on GPUs anyway, so it only needs to be written and queried as such and it will return results in an instant.

Continue reading →

Ivideon on Kubernetes, the simplest form of HA video surveillance

The decision to move to Kubernetes (a managed deployment on Digital Ocean) was made so that I could free up my home server and eventually shut it down. That is where I hosted the Windows-based Ivideon server, in a Windows VM running on Proxmox. Romania's broadband internet helped a lot. I actively use Ivideon to record activity around our properties.

Though I needed one for a quick deployment, I couldn't find a Helm chart or anything ready-made. On the other hand, the server itself is pretty much stateless and just needs a configuration file and a PV to hold the archive, so writing a deployment file for it was not too big of a challenge.
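Such a deployment file can be sketched roughly as follows; the image name, ConfigMap and PVC names, and mount paths here are all illustrative assumptions, not values from an official chart:

```yaml
# Sketch of a Deployment for the Ivideon server.
# Assumptions: image name, config path, and archive path are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ivideon-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ivideon-server
  template:
    metadata:
      labels:
        app: ivideon-server
    spec:
      containers:
        - name: ivideon-server
          image: ivideon/server:latest        # hypothetical image
          volumeMounts:
            - name: config
              mountPath: /etc/ivideon         # assumed config location
            - name: archive
              mountPath: /var/lib/ivideon     # assumed archive location
      volumes:
        - name: config
          configMap:
            name: ivideon-config              # holds the configuration file
        - name: archive
          persistentVolumeClaim:
            claimName: ivideon-archive        # PV for the recording archive
```

With the configuration supplied through a ConfigMap and the archive on a PersistentVolumeClaim, the pod itself stays disposable, which is what makes a single-replica Deployment sufficient here.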

Continue reading →

Switching oceans, from Hetzner to Digital Ocean

I think I ignored the Kubernetes movement for many years. I used to maintain Docker-based infrastructures over bare metal a couple of years ago, mostly for work purposes. Back then it was an interesting learning experience in the foundations of containers.

Still, I remain prudent about infrastructure in general, and for a long time I have favored pure and simple bare-metal or bare-metal + VM solutions over containers for most of the critical data workloads (a.k.a. Big Data). Even in work deployments, we bypassed the usual performance penalties of containers by bind-mounting the disks or using IPVLAN for networking when pure performance was needed. My favoritism for bare metal is based on the fact that you can't just ignore 50+ years of evolution and documentation (if we consider 1969, the birth of the first operating system, UNIX, as the starting point). I don't want to go earlier than that …

Continue reading →