Assitan Koné
Aug 16

Why is Kaggle so scary?

Author: Assitan Koné, Founder @Codistwa

Introduction

In this article, we will talk about Kaggle. You know, that huge platform where you can practice machine learning, deep learning, and more.

When dealing with data science, Kaggle is the go-to platform.

What is Kaggle? Kaggle is a data science platform that Google acquired in 2017. It hosts competitions where you can win money. You may know the famous Netflix Prize for recommendation algorithms, with its $1 million reward. That contest was actually run by Netflix itself, before Kaggle existed, but it is the same kind of challenge you find on this platform, and a prize like that is pretty motivating, right?
But first and foremost, Kaggle is educational, so you can find tutorials, datasets, and free competitions. Also, there is a big community to help you.

What do we need to enter a competition?

There are very simple competitions tailored for beginners. You see, what you need here is mostly knowledge.
So you can absolutely start with the Titanic competition. You will find example code from other participants to spark your creativity, especially when it comes to exploratory data analysis and feature engineering. After that, you can keep joining competitions to build your data science skills.
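To make "feature engineering" a bit more concrete, here is a minimal sketch in plain Python. It assumes records that follow the Titanic column layout (Name, Age, SibSp, Parch), but the rows are hand-written for illustration and are not loaded from the competition files. The derived features (Title, FamilySize, IsAlone) and the helper names are illustrative choices, not a recipe from this article.

```python
import re
from statistics import median

# Hand-written illustrative rows in the Titanic column layout;
# they are NOT taken from the real competition data.
passengers = [
    {"Name": "Doe, Mr. John", "Age": 30.0, "SibSp": 1, "Parch": 0},
    {"Name": "Doe, Mrs. Jane", "Age": None, "SibSp": 1, "Parch": 2},
    {"Name": "Roe, Miss. Anna", "Age": 8.0, "SibSp": 0, "Parch": 0},
]

# Names look like "Surname, Title. First names".
TITLE_PATTERN = re.compile(r",\s*([^.]+)\.")


def extract_title(name):
    """Return the honorific between the comma and the first period."""
    match = TITLE_PATTERN.search(name)
    return match.group(1).strip() if match else "Unknown"


def engineer_features(row, default_age):
    """Derive a few simple features from one raw passenger record."""
    age = row["Age"] if row["Age"] is not None else default_age
    family_size = row["SibSp"] + row["Parch"] + 1  # +1 counts the passenger
    return {
        "Title": extract_title(row["Name"]),
        "Age": age,
        "FamilySize": family_size,
        "IsAlone": int(family_size == 1),
    }


# Fill missing ages with the median of the known ages.
median_age = median(r["Age"] for r in passengers if r["Age"] is not None)
features = [engineer_features(r, median_age) for r in passengers]

for feature_row in features:
    print(feature_row)
```

In a real notebook you would more likely do the same thing with a dataframe library, but the idea is identical: turn raw columns into signals a model can use.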

Overall, this is experimentation, right? This is data science. That is why, for example, you see so many submissions from each participant.
You know, people who reach the top often have around 100 submissions, because it is so hard to know in advance what matters for getting a “perfect” score. For example, don't try to use all the features at once: select a few, then add more and see how the score changes.
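The "select a few, then add more" idea can be sketched as a greedy forward selection. This is only one possible way to do it, and everything here is an assumption for illustration: the function name forward_selection is hypothetical, and toy_score is a made-up stand-in for a real validation score (for example, the cross-validated accuracy of your model on the chosen features).

```python
def forward_selection(candidates, score_fn, min_gain=0.0):
    """Greedily add the feature that improves the score the most.

    score_fn takes a list of feature names and returns a validation
    score where higher is better.
    """
    selected = []
    best_score = score_fn(selected)
    remaining = list(candidates)
    while remaining:
        # Score each remaining candidate when added to the current selection.
        trials = [(score_fn(selected + [name]), name) for name in remaining]
        trial_score, best_feature = max(trials)
        if trial_score - best_score <= min_gain:
            break  # no candidate helps enough, so stop adding features
        selected.append(best_feature)
        remaining.remove(best_feature)
        best_score = trial_score
    return selected, best_score


# Made-up usefulness values, for illustration only. In practice
# score_fn would train and validate a model on the given features.
TOY_GAINS = {"Sex": 0.20, "Pclass": 0.08, "Age": 0.03, "Ticket": 0.0}


def toy_score(features):
    """Toy score: base value plus gains, minus a small cost per feature."""
    return 0.60 + sum(TOY_GAINS[f] for f in features) - 0.01 * len(features)


selected, score = forward_selection(list(TOY_GAINS), toy_score)
print(selected, round(score, 2))
```

With these toy numbers, the feature that adds nothing (Ticket) is left out, because adding it would lower the score instead of raising it.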

How to win a competition?

So, the difficulty comes with the score.

Sooner or later, we reach a plateau. Does that mean we suck? That Kaggle isn't for us?

So what can you do to improve your model? First, yes, you need to make several attempts. It can be discouraging, but that is normal. The choice of algorithm also matters.

Ensemble methods are among the strongest choices for these types of competitions. For example, you can use Random Forest.

Why? Well, it is an all-purpose algorithm, and it is quick to train because its trees can be built in parallel. So definitely try this algorithm every time you enter a competition on Kaggle.
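Here is a minimal sketch of what trying Random Forest can look like, assuming scikit-learn is installed. The data is synthetic (generated with make_classification) and only stands in for a competition's training table, and the hyperparameters are arbitrary starting values, not tuned recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a competition's training table.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)

# n_jobs=-1 builds the trees in parallel on all available CPU cores.
model = RandomForestClassifier(n_estimators=200, n_jobs=-1, random_state=0)

# 5-fold cross-validation gives a local estimate of the score
# before you spend a submission on the leaderboard.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(f"mean accuracy: {scores.mean():.3f}")
```

Checking the score locally with cross-validation like this is also a cheap way to compare attempts when you feel stuck on a plateau.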
Let me give you a resource: https://farid.one/kaggle-solutions/
You know, it's very interesting to read the code of top Kagglers along with explanations of what they did, the kind of code that can help you finish first or second. On this website, you can click through and see exactly how those people worked. Finally, one thing I found a little frustrating is that the platform can be quite slow, but this is machine learning: sometimes you simply have a lot of data to train on.

How to break into data science with Kaggle?

Now, is Kaggle good for learning data science?

First of all, it's important to understand that doing data science on Kaggle is a special process.

In truth, it doesn't fully correspond to the real world. The datasets are often a little too clean. You can find datasets that need a lot of data cleaning, but you have to look for them. On the other hand, even if one of the secrets is to use ensemble methods, you still have to work hard on feature engineering to get a good score. Which is good news: it means we're focusing on one of the most important parts of a data science project.

In conclusion, I really encourage you to use this platform to improve your skills. As I've said, you also have a community. You can even earn recognition on the platform by participating in it. So don't hesitate to do that.