Lou Marvin Caraig - Head of Engineering @ Athenian

Summary

I'm currently leading a fully distributed Engineering team that spans many time zones as the Head of Engineering. I have 2+ years of experience leading distributed teams and 8+ years of experience as a Software Development Engineer.

I have designed and implemented scalable, highly distributed systems that run under heavy load and process large amounts of data in real time, using both Python and Go (mostly with Kafka, Zookeeper, and Druid). I have worked with different backend technologies and APIs, and I'm proficient with continuous integration and with unit, integration, and e2e testing. I'm an expert Docker user and I'm also competent in deploying and orchestrating resources in live environments using Terraform and Kubernetes, specifically on GKE.

I'm also skilled on the front end, specifically with React.js, and proficient in applying best practices, including Storybook for better component reusability and Cypress for integration, e2e, and snapshot testing.

I have occasionally worked in applied Machine Learning using Keras and Scikit-learn, mostly on images, and I have also done data visualization with Python, JavaScript, and R.

I'm an expert Git user, have some working experience with Scala, and have some minor experience with Rust and Elisp.

I love open-source and I contribute to open-source projects and write technical posts on my blog. I'm an avid learner and passionate about state-of-the-art technologies. I'm very proactive and a team player that can work independently, capable of building and deploying a system from the ground up. When leading a team I mostly care about fostering transparency, clear communication, and knowledge sharing to ensure a collaborative environment and keep a high standard.

Work experience

Head of Engineering

December 2020 - Current

Athenian

I'm responsible for leading the distributed Engineering Team (5 people including me). This includes the same responsibilities as the Engineering Team Lead role, plus:

provide stronger technical leadership and direction for the Engineering Team, and ensure the right balance between effectiveness and tech debt,
responsible for the career plan and the growth of the members of the Engineering Team,
responsible for the Engineering spend.

Successfully built a healthy and collaborative working environment by fostering clear communication and knowledge sharing, helped by initiatives such as monthly talks held by members of the Engineering Team.

Successfully contributed to the definition and the planning of the system architecture thanks to the expertise of the whole Engineering Team.

Led multiple experiments and initiatives to improve both parts of the stack and the processes:

built PoCs for improving the data model using event-sourcing and Druid,
improved the Python API performance by preloading the DB data in-memory,
introduced Storybook and Cypress as part of the front-end stack for better component reusability and testing,
etc.

I'm reporting to the CEO.

Engineering Team Lead

December 2019 - December 2020

Athenian

I was responsible for leading the distributed Engineering Team (5 people including me). This consisted of the following responsibilities:

coordinate with the Product Manager and CEO to prioritize work in order to be fast and goal-oriented,
analyze feasibility and estimation with the whole Engineering Team,
coordinate the Engineering Team's work to ensure we're heading in the right direction,
ensure and foster good communication, trust, and collaboration,
be an individual contributor as well,
have the 1:1s with everyone in the Engineering Team,
have the final say on Engineering decisions and be the main person responsible and accountable for them.

Successfully led the Engineering Team in delivering the first MVP of the product in ~5 months.

I was reporting to the CEO.

Senior Software Development Engineer - Applications Team

September 2019 - December 2019

source{d}

I was also given the responsibility of acting as the team lead (3 people including me). This consisted of the following responsibilities:

be the bridge between the team and the VP of Engineering and Product team
coordinate the design of new projects and features, when needed together with the other teams
be the maintainer of most of the repositories
plan the team’s work in the kanban board aligned with the priority given by the Product team
coordinate with other teams

Successfully added important features to metadata-retrieval for fetching GitHub metadata, such as support for multiple tokens, support for multiple organizations, parallel downloads, etc.

metadata-retrieval is written in Go and supports multiple providers (GitHub, Bitbucket) using both REST and GraphQL APIs.

I was reporting to the VP of Engineering.

Software Development Engineer - Applications Team

December 2018 - September 2019

source{d}

Successfully contributed to maintaining and improving the open-source product called engine, which provides a CLI to query git objects in local repositories by gluing together all the components of the source{d} stack.

Successfully contributed, since its inception, to a new product called source{d} CE (an evolution of engine composed of the open-source projects sourced-ce and sourced-ui).

Successfully prepared and delivered demos and proofs-of-concept for both investors and potential customers, showing the full potential of the product. This also required data visualization skills for the preparation of the dashboards.

The engine was written in Go and relied heavily on Docker. It consisted of a daemon running in a container that orchestrated and communicated with other components, also running in containers. The communication was done using gRPC, and the orchestration using the Docker API. source{d} CE was also written in Go and relied on docker-compose for coordination and on a fork of Apache Superset for dashboarding.

I was reporting to the Applications Team Lead.

Software Development Engineer

July 2018 - December 2018

Tech Engines

Given the early stage of the product, I designed, experimented with, and deployed different proofs-of-concept using different technologies.

The core of the system was written in Scala with some parts in Python and based on Spark and its ecosystem.

I was reporting to the CTO.

Software Development Engineer

August 2013 - July 2018

Viralize

Successfully managed to maintain and improve the AdServer serving hundreds of millions of ads a month by ensuring scalability, high availability and low latency.

Successfully re-implemented the analytics ETL pipeline from scratch, decreasing the time-to-query of an event from more than 20 minutes to dozens of seconds. The new pipeline was able to crunch billions of events a day in real time thanks to its capability to scale horizontally.

Successfully re-implemented the billing system and made it real-time by consuming the events from the ETL data pipeline, achieving exactly-once delivery semantics.

Successfully implemented a mechanism to run experiments by serving different assets to a selectable portion of the traffic. All the experiments were analyzable and comparable to each other. The analysis sometimes also relied on data visualization and statistical hypothesis testing.

Implemented basic Machine Learning based components that became part of the product, such as a Sentiment Analysis component applied to YouTube comments and a Recommendation System for content selection.

The whole stack was written in Python using different web frameworks and databases depending on the needs. The ETL pipeline used Kafka as the messaging system, Zookeeper for coordination, and Druid as the main data store. The front end was written in React.js and the scientific part was implemented using the Python ecosystem.

I was reporting to the CTO.

Machine Learning and Computer Vision Engineer Contractor

November 2017 - December 2017

Muse

Successfully built an MVP of a Face Recognition System running as a single-page web app. The web app, running in a Docker container, lets the user upload a video from which the faces are extracted and clustered together. The UI provides a mechanism to label these clusters and train a model. A trained model can then be used to recognize the faces in another video.

The web app was served using Flask (Python). All the image and video manipulation was done using OpenCV and FFmpeg, while OpenFace, dlib, and scipy were used for face clustering and recognition.

I was reporting to the CTO.

Education

Bachelor of Science (BS) in Computer Science

2009 - 2013

University of Florence

Final grade: 110/110 cum laude (elective courses: Artificial Intelligence, Neural Networks)

Projects

Pydockenv

2019

Github link

pydockenv is a CLI that aims to give the same experience of having a virtual environment, but backed by Docker! The idea is to make the usage of Docker completely hidden so that even non-expert Docker users can leverage the advantages provided by using it as the underlying engine.

Face Anonymizer

2018

Github link

Pixelates the faces detected in the provided image or video using the DNN Face Detector in OpenCV.

Self-Driving Car

2018

Github link

Python project that implements a Convolutional Neural Network for a self-driving car running in a simulated environment. The structure of the network is based on a paper by NVIDIA. The project covers data gathering, data cleansing, data augmentation, model selection, training and testing.

Image Quantizer

2016

Github link

Python project that implements and shows the difference between different methods for performing image color quantization.

Other Projects

Github link

All other projects are available on GitHub. There are various personal open-source projects related to web technologies, machine learning, and computer vision. I have also contributed to some repositories related to data engineering, such as druid, pydruid, and kafka-python.

Publications

Personal Blog

2017 - Present

lmcaraig.com

A personal blog with technical posts. List of the publications:

A New Training Algorithm for Kanerva's Sparse Distributed Memory

2012

arxiv

The SDM was thought to be a model of human long-term memory. Its architecture makes it possible to store binary patterns and to retrieve them using partially matching ones. This paper introduces a new training approach that can efficiently handle even non-random data, and adds the capability to recognize inverted patterns. This approach uses a signal model and suggests a different way of creating the hard locations in the memory.