Image by Free-Photos from Pixabay

Git has been the mainstay of collaborative software development for many years. As a developer, I interact with Git on a daily basis: committing, pushing to the remote, and the occasional hair-pulling when things go wrong (which explains my hair loss). Recently, while musing on “how does Git actually work?”, I realized how little I knew about this deeply rooted part of software development. It goes without saying that many of the blunders that kept me guessing for hours could have been avoided if I had understood the structure and functionality underlying Git.

Git is an essential part of a software development project…

Image by Mylene2401 from Pixabay

Transfer learning is an important topic. As a civilization, we have been passing on knowledge from one generation to the next, enabling the technological advancement that we enjoy today. In machine learning, transfer learning is the edifice that supports most of the state-of-the-art models that are gaining steam, powering many services that we take for granted.

Transfer learning is about having a good starting point for the downstream task we’re interested in solving.

In this article, we’re going to discuss how to use transfer learning to get a warm start on an image classification task. …

Image by Erik Stein from Pixabay

In the last article, we looked at models that deal with non-time-series data. Time to turn our heads towards some other models. Here we will be discussing deep sequential models. They are predominantly used to process/predict time series data.

Link to Part 1, in case you missed it.

Simple Recurrent Neural Networks (RNNs)/Elman Networks

Simple recurrent neural networks (also referred to as RNNs) are to time-series problems what CNNs are to computer vision. In a time-series problem, you feed a sequence of values to a model and ask it to predict the next n values of that sequence. RNNs go through each value of the sequence while…
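As a rough illustration of stepping through a sequence one value at a time, here is a minimal sketch of an Elman-style recurrence in plain Python. The function names (`elman_step`, `run_rnn`) and the scalar weights `w_x`, `w_h` and `b` are assumptions made for this example; the weights are arbitrary and untrained, so the code only shows how a hidden state is carried from one step to the next, not how a real model predicts.

```python
import math


def elman_step(x_t, h_prev, w_x, w_h, b):
    """One recurrent step: mix the current input with the previous hidden state."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)


def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0):
    """Walk through the sequence, updating a single scalar hidden state.

    The weights are hypothetical placeholders, not learned values.
    """
    h = 0.0  # initial hidden state
    states = []
    for x_t in sequence:
        h = elman_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states


if __name__ == "__main__":
    # Each printed state depends on the current value and on everything seen before it.
    print(run_rnn([1.0, 0.0, -1.0]))
```

In a real network, the input, hidden state and weights would be vectors and matrices, and the weights would be learned from data.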

Image by Thomas Breher from Pixabay

If you thought machine learning was the crush you wouldn’t have the guts to talk to, deep learning is your crush’s dad! Due to the unprecedented advances in hardware and researchers’ appetite for bigger and better models, deep learning is becoming more intimidating and elusive by the day. The more research bubbles up every day, the higher it pushes the level of basic knowledge you should have. So, for all those who are hesitant to dive straight into the murky, tacky goo of deep learning, I hope this article will boost your confidence. …

Photo by Kaley Dykstra on Unsplash

Seq2Seq, or sequence-to-sequence, models were introduced by Ilya Sutskever et al. in the paper “Sequence to Sequence Learning with Neural Networks”. They are essentially a particular arrangement of deep sequential models (a.k.a. RNN-based models, e.g. LSTMs/GRUs)[1] (discussed later). The main type of problem addressed by these models is,

mapping an arbitrary-length sequence to another arbitrary-length sequence

Where might we come across such problems? Pretty much anywhere. Applications of,

  • Machine translation
  • Text summarization
  • Question answering

are a few examples that can capitalize on such a model. These applications share a unique problem formulation that requires the ability to map…

Docker … whale … you get it.

As a data scientist, I grapple with Docker on a daily basis. Creating images and spinning up containers have become as common for me as writing Python scripts. This journey has had its achievements as well as its “I wish I had known that before” moments.

This article discusses some best practices for using Docker in your data science projects. It is by no means an exhaustive checklist, but it covers most of the things I’ve come across as a data scientist.

This article assumes basic-to-moderate knowledge of Docker. For example, you should know what Docker is used for and should be able…

Light on Math Machine Learning

Photo by Jelleke Vanooteghem on Unsplash

TLDP (too long, didn’t pay)? No worries, you can still get access to the code with the following link.

GloVe implementation with Keras: [here]

In this article, you will learn about GloVe, a very powerful word vector learning technique. This article will focus on explaining why GloVe is better and on the motivation behind GloVe’s cost function, which is the most crucial part of the algorithm. The code will be discussed in detail in a later article.

To visit my previous articles in this series use the following letters.

A B C D* E F G H I J K L* M N O P Q R S T U V W X Y Z

GloVe is…

Light on Math Machine Learning

Courtesy of Pixabay

This story introduces you to a GitHub repository that contains an atomic, up-to-date Attention layer implemented using Keras backend operations. It is available at attention_keras.

Why Keras?

With the unveiling of TensorFlow 2.0 it is hard to ignore the conspicuous attention (no pun intended!) given to Keras. There was greater focus on advocating Keras for implementing deep networks. Keras in TensorFlow 2.0 will come with three powerful APIs for implementing deep networks.

  • Sequential API: This is the simplest API, where you first call model = Sequential() and keep adding layers, e.g. model.add(Dense(...)).
  • Functional API: An advanced API where you can create…

Light on Math Machine Learning

Neural style transfer (NST) is a very neat idea. NST builds on the key idea that,

it is possible to separate the style representation and the content representation in a CNN, learnt during a computer vision task (e.g. an image recognition task).

Following this concept, NST employs a pretrained convolutional neural network (CNN) to transfer the style of one image to another. This is done by defining a loss function that tries to minimise the differences between a content image, a style image and a generated image, which will be discussed in detail later. …

Light on Math Machine Learning

This article aims to introduce decision trees, a popular building block of highly praised models such as XGBoost. A decision tree is simply a set of cascading questions. When you get a data point (i.e. a set of features and values), you use each attribute (i.e. the value of a given feature of the data point) to answer a question. The answer to each question decides the next question. At the end of this sequence of questions, you end up with the probability of the data point belonging to each class.
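The idea of cascading questions can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the `Question` and `Leaf` classes, the `predict_proba` helper, the features (`age`, `income`), the thresholds and the class probabilities are all made up for the example, and the tree is written by hand rather than learned from data.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    """End of the questions: a probability for each class."""
    class_probs: Dict[str, float]


@dataclass
class Question:
    """Asks: is point[feature] <= threshold? The answer picks the next node."""
    feature: str
    threshold: float
    if_yes: Union["Question", Leaf]
    if_no: Union["Question", Leaf]


def predict_proba(node, point):
    """Follow the questions from the root until a leaf is reached."""
    while isinstance(node, Question):
        answer_is_yes = point[node.feature] <= node.threshold
        node = node.if_yes if answer_is_yes else node.if_no
    return node.class_probs


# A hand-written (hypothetical) tree; real trees are learned from training data.
tree = Question(
    feature="age", threshold=35,
    if_yes=Question(
        feature="income", threshold=50,
        if_yes=Leaf({"buys": 0.2, "does_not_buy": 0.8}),
        if_no=Leaf({"buys": 0.7, "does_not_buy": 0.3}),
    ),
    if_no=Leaf({"buys": 0.9, "does_not_buy": 0.1}),
)

if __name__ == "__main__":
    print(predict_proba(tree, {"age": 30, "income": 60}))
```

Each data point follows exactly one path from the root to a leaf, and the leaf it lands in supplies the class probabilities.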

Note: This article is behind the Medium paywall. However…

Thushan Ganegedara

Author (Manning/Packt) | DataCamp instructor | Senior Data Scientist @ QBE | PhD | YouTube: @DeepLearningHero | Twitter: @thush89 | LinkedIn: thushan.ganegedara
