Data engineers and their potential to unlock business use cases

Nate Kupp is currently Director of Infrastructure and Data Science at Thumbtack, and this year he presented his talk and success story, “From humble beginnings: building the data stack at Thumbtack”. It is one of the presentations I enjoyed most, because it echoed one of the pains I have experienced in my own day-to-day work.

One difference between Nate’s situation and mine is executive sponsorship (plus a bit of luck: being in the right place at the right time, with the right management mentality). My own experience, on the other hand, was a failure from my perspective, though others see it as a small success against overwhelming odds.

On workflow engines and where Airflow fits in

At CrunchConf 2018 there was a presentation, “Operating data pipeline using Airflow @ Slack”, by Ananth Packkildurai. If you don’t know what Airflow is, it’s a workflow engine in the same vein as Oozie and Azkaban. It’s based on the concept of a DAG (a directed acyclic graph of tasks), which you write in Python and execute on a cluster.
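
To make the DAG idea a bit more concrete, here is a minimal sketch in plain Python. It deliberately does not use Airflow’s own API: the task names are made up for illustration, and the standard library’s `graphlib` (Python 3.9+) stands in for the dependency resolution a workflow engine performs before running anything.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline (names invented for this sketch):
# each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}


def run_order(deps):
    """Return one valid execution order for the tasks in `deps`.

    Raises graphlib.CycleError if the dependencies contain a cycle,
    i.e. if the graph is not actually a DAG.
    """
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)
```

A real workflow engine adds scheduling, retries and distributed execution on top of this, but the core contract is the same: a task only runs once everything it depends on has finished.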

As with the Kafka presentation by Tim Berglund, we asked the hard questions and they quickly became popular. For Airflow, and its place in the ecosystem of workflow engines, we had quite a heavy question.

On Kafka’s place in the MQ landscape

Just got back from CrunchConf 2018. A good panel of speakers and an interesting conference. Lots of food and drinks. Good atmosphere, helpful organizers. Fun times, good memories. The conference was a blast, and with a little help from the community most of my questions ended up among the top-voted ones.

During the conference I decided that I would share my thoughts on the presentations, at least on those that were intriguing and those where my questions got the top votes. All in all, I would like to praise good presentations, devoid of hype and commercialism. There seems to be some hype in today’s world around Big Data projects, with the naive jumping ship to the next cool project.

Going to the Crunch Data Engineering and Analytics Conference, 29-31 October 2018 in Budapest

I remember that in 2016 my current employer gave me the opportunity to attend Cassandra Summit 2016 in San Jose. A long and exhausting 30-hour flight, tons of preparation for the US visa a few weeks ahead, a booking mistake that I had to cover with my own card until it was fixed, and many more “troubles” later, I was finally there.

The thing about conferences is that not all of them put their presentations online. For Cassandra Summit 2016, the DataStax community published recordings of all the conference presentations, but that is not true of most conferences. It was a nice thing to do for the community, since such material can be referenced later.
