
    Organizing Code, Experiments, and Research for Kaggle Competitions

    By Editor Times Featured · November 14, 2025


    Tell me and I forget. Teach me and I remember. Involve me and I learn.

    This saying holds true, and learning by doing is one of the most instructive ways to acquire a new skill. In the field of data science and machine learning, participating in competitions is one of the most effective ways to gain hands-on experience and enhance your skills and abilities.

    Kaggle is the world's largest data science community, and its competitions are highly respected in the industry. Many of the world's leading ML conferences (e.g., NeurIPS), organizations (e.g., Google), and universities (e.g., Stanford) host competitions on Kaggle.

    Featured Kaggle competitions award medals to top performers on the private leaderboard. Recently, I participated in my very first medal-awarding Kaggle competition, the NeurIPS – Ariel Data Challenge 2025, and I was fortunate enough to earn a Silver Medal. I don't intend to share my solution in this article; if you're interested, you can check out my solution here.

    What I didn't realize before participating is how much Kaggle tests beyond just ML skills.

    Kaggle tests one's coding and software engineering skills. It stressed my ability to properly organize a codebase in order to quickly iterate and test new ideas. It also tested the ability to track experiments and results in a clear manner.

    Being part of the NeurIPS 2025 Competition Track, a research conference, also tested the ability to research and learn about a new domain quickly and effectively.

    All in all, this competition humbled me a lot and taught me many lessons besides ML.

    The goal of this article is to share some of these non-ML lessons with you. All of them revolve around one principle: organization, organization, organization.

    First, I'll convince you why clean code structuring and process organization isn't a waste of time or a nice-to-have, but rather essential for competing on Kaggle in particular and for any successful data science project in general. Then, I'll share some of the techniques I used and lessons I learned regarding code structuring and the experimentation process.

    I want to start with a note of humility. By no means am I an expert in this field; I'm still at the outset of my journey. All I hope is that some readers will find some of these lessons useful and learn from my pitfalls. If you have any other tips or suggestions, I urge you to share them so that we can all learn together.

    1 Science's Golden Tip: Organization

    It is no secret that natural scientists like to keep detailed records of their work and research process. Unclear steps may (and will) lead to incorrect conclusions and understanding. Irreproducible work is the bane of science. For us data scientists, why should it be any different?

    1.1 But Speed is Important!

    The common counterargument is that the nature of data science is fast-paced and iterative. Generally speaking, experimentation is cheap and quick; besides, who in the world prefers writing documentation over coding and building models?

    As much as I sympathize with this thought and I love quick results, I fear that this mindset is short-sighted. Remember that the final goal of any data science project is to either deliver accurate, data-supported, and reproducible insights or to build reliable and reproducible models. If fast work compromises the end goal, then it is not worth anything.

    My solution to this dilemma is to make the mundane parts of organization as simple, quick, and painless as possible. We shouldn’t seek total deletion of the organization process, but rather fix its faults to make it as efficient and productive as possible.

    1.2 Costs of Lack of Organization

    Imagine with me this scenario. For each of your experiments, you have a single notebook on Kaggle that does everything from loading and preprocessing the data to training the model, evaluating it, and finally submitting it. By now, you have run dozens of experiments. You discover a small bug in the data loading function that you used in all your experiments. Fixing it will be a nightmare because you will have to go through each of your notebooks, fix the bug, ensure no new bugs were introduced, and then re-run all your experiments to get the updated results. All of this would have been avoided if you had a clear code structure and your code were reusable and modular.

    Drivendata (2022) mentions a striking example of the costs of an unorganized data science project: the story of a failed project that took months to complete and cost millions of dollars. The failure came down to an incorrect conclusion drawn early in the project. A code bug in the data cleaning polluted the data and led to wrong insights. If the team had better tracked the data sources and transformations, they would have caught the bug earlier and saved the money.

    If there is one lesson to take away from this section, it's that organization is not a nice-to-have, but rather an essential part of any data science project. Without a clear code structure and process organization, we are bound to make mistakes, waste time, and produce irreproducible work.

    1.3 What to Track and Organize?

    There are three main aspects that I consider worth the effort to track:

    1. Codebase
    2. Experiment Results and Configurations
    3. Research and Learning

    2 The Codebase

    After all, code is the backbone of any data science project. So, there is a lesson or two to learn from software engineers here.

    2.1 Repo Structure

    As long as you give much thought to the structure of your codebase, you are doing great.

    There is no one universally agreed-upon structure (nor will there ever be). So, this section is highly subjective and opinionated. I will discuss the general structure I like and use.

    I like to initialize my work with the widely popular Cookiecutter Data Science (ccds) template. Once you initialize a project with ccds, it creates a folder with the following structure. 1

    ├── LICENSE            <- Open-source license if one is chosen
    ├── Makefile           <- Makefile with convenience commands like `make data` or `make train`
    ├── README.md          <- The top-level README for developers using this project.
    ├── data
    │   ├── external       <- Data from third party sources.
    │   ├── interim        <- Intermediate data that has been transformed.
    │   ├── processed      <- The final, canonical data sets for modeling.
    │   └── raw            <- The original, immutable data dump.
    │
    ├── docs               <- A default mkdocs project; see www.mkdocs.org for details
    │
    ├── models             <- Trained and serialized models, model predictions, or model summaries
    │
    ├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
    │                         the creator's initials, and a short `-` delimited description, e.g.
    │                         `1.0-jqp-initial-data-exploration`.
    │
    ├── pyproject.toml     <- Project configuration file with package metadata for 
    │                         {{ cookiecutter.module_name }} and configuration for tools like black
    │
    ├── references         <- Data dictionaries, manuals, and all other explanatory materials.
    │
    ├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
    │   └── figures        <- Generated graphics and figures to be used in reporting
    │
    ├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
    │                         generated with `pip freeze > requirements.txt`
    │
    ├── setup.cfg          <- Configuration file for flake8
    │
    └── {{ cookiecutter.module_name }}   <- Source code for use in this project.
        │
        ├── __init__.py             <- Makes {{ cookiecutter.module_name }} a Python module
        │
        ├── config.py               <- Store useful variables and configuration
        │
        ├── dataset.py              <- Scripts to download or generate data
        │
        ├── features.py             <- Code to create features for modeling
        │
        ├── modeling                
        │   ├── __init__.py 
        │   ├── predict.py          <- Code to run model inference with trained models          
        │   └── train.py            <- Code to train models
        │
        └── plots.py                <- Code to create visualizations 

    2.1.1 Environment Management

    When you use ccds, you are prompted to select an environment manager. I personally prefer uv by Astral. It records all the used packages in the pyproject.toml file and allows us to recreate the same environment by simply running uv sync.

    Under the hood, uv uses venv. I find using uv much simpler than directly managing virtual environments because managing and reading pyproject.toml is much easier than requirements.txt.

    Moreover, I find uv much simpler than conda. uv is built specifically for Python, while conda is much more generic.

    2.1.2 The Generated Module

    A great part of this template is the {{ cookiecutter.module_name }} directory. In this directory, you define a Python package that shall contain all the important parts of your code (e.g., preprocessing functions, model definitions, inference functions, etc.).

    I find the usage of this package quite helpful, and in Section 2.3, I'll discuss what to put here and what to put in Jupyter Notebooks.

    2.1.3 Staying Versatile

    Don’t regard this structure as perfect or complete. You don’t have to use everything ccds provides, and you may (and should) alter it if the project requires it. ccds provides you with a great starting point for you to tune to your exact project needs and demands.

    2.2 Version Control

    Git has become an absolute necessity for any project involving code. It allows us to track changes, revert to earlier versions, and, with GitHub, collaborate with team members.

    When you use Git, you basically access a time machine that can remedy any faults you introduce to your code. Today, the use of Git is non-negotiable.

    2.3 The Three Code Types

    Choosing when to use Python scripts and when to use Jupyter Notebooks is a long-debated topic in the data science community. Here I present my stance on the topic.

    I like to separate all of my code into one of three directories:

    1. The Module
    2. Scripts
    3. Notebooks

    2.3.1 The Module

    The module should contain all the important functions and classes you create.

    Its usage helps us minimize redundancy and create a single source of truth for all the important operations happening on the data.

    In data science projects, some operations will be repeated in all your training and inference workflows, such as reading the data from files, transforming data, and model definitions. Repeating all these functions in all your notebooks or scripts is difficult and extremely boring. Using a module allows us to write the code once and then import it everywhere.

    Moreover, this helps reduce errors and mistakes. When a bug in the module is discovered, you fix it once in the module, and it’s automatically fixed in all scripts and notebooks importing it.
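    To make the idea concrete, here is a minimal sketch of a shared helper that would live in the project module (the name load_data and the CSV format are illustrative, not from the article). Every notebook and script imports this one implementation, so fixing a bug here fixes it everywhere at once:

```python
import csv
import io

# Hypothetical helper from the project module, e.g. my_project/dataset.py.
def load_data(file_obj):
    """Parse a CSV file object into a list of row dicts.

    A bug fixed here is automatically fixed in every notebook and
    script that imports this function, instead of requiring edits
    to dozens of copied cells.
    """
    reader = csv.DictReader(file_obj)
    return list(reader)

# Any notebook or script reuses the same single source of truth:
rows = load_data(io.StringIO("a,b\n1,2\n3,4\n"))
print(rows[0]["a"])
```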

    2.3.2 Scripts

    The scripts directory contains .py files. These files are the only source of outputs in the project. They are the interface for interacting with our module and code.

    The two main usages for these files are training and inference. All the used models should be created by running one of the scripts, and all submissions on Kaggle should be made by such files.

    The usage of these scripts helps make our results reproducible. To reproduce an older result (train the same model, for example), one only has to clone the same version of the repo and run the script used to make the old results 2.

    Because the scripts are run from the CLI, using a library to handle CLI arguments simplifies the code. I like using typer for simple scripts that don't have many config options and hydra for complex ones (I'll discuss hydra in more depth later).
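    As a rough sketch of the pattern (the article prefers typer; stdlib argparse is shown here so the snippet needs no extra dependency, and the flag names are illustrative):

```python
import argparse

# Hypothetical CLI for a training script, e.g. scripts/train.py.
def build_parser():
    parser = argparse.ArgumentParser(description="Train a model.")
    parser.add_argument("--data-path", default="data/processed/train.csv")
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--lr", type=float, default=1e-3)
    return parser

# Parsing an explicit argv list here stands in for a real CLI call
# like: python train.py --epochs 3 --lr 0.01
args = build_parser().parse_args(["--epochs", "3", "--lr", "0.01"])
print(args.epochs, args.lr, args.data_path)
```

typer offers the same result with less boilerplate by deriving the CLI from a function signature; the point is that every run's configuration is explicit on the command line rather than buried in notebook cells.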

    2.3.3 Notebooks

    Jupyter Notebooks are wonderful for exploration and prototyping because of the short feedback loop they provide.

    On many occasions, I start writing code in a notebook to quickly test it and figure out all mistakes. Only then would I transfer it to the module.

    However, notebooks shouldn’t be used to create final results. They are hard to reproduce and track changes in. Therefore, always use the scripts to create final outputs.

    3 Running the Codebase on Kaggle

    Using the structure discussed in the previous section, we need to follow these steps to run our code on Kaggle:

    1. Clone The Repo
    2. Install Required Packages
    3. Run one of the Scripts

    Because Kaggle provides us with a Jupyter Notebook interface to run our code and most Kaggle competitions have restrictions on internet access, submissions aren’t as straightforward as running a script on our local machine. In what follows, I will discuss how to perform each of the above steps on Kaggle.

    3.1 Cloning The Repo

    First of all, we can’t directly clone our repo from GitHub in the submission notebook because of the internet restrictions. However, Kaggle allows us to import outputs of other Kaggle notebooks into our current notebook. Therefore, the solution is to create a separate Kaggle notebook that clones our repo and installs the required packages. This notebook’s output is then imported into the submission notebook.

    Most likely, you will be using a private repo. The simplest way to clone a private repo on Kaggle is to use a personal access token (PAT). You can create a PAT on GitHub by following this guide. A good practice is to create a PAT specifically for Kaggle with the minimal required permissions.

    In the cloning notebook, you can use the following code to clone your repo:

    from kaggle_secrets import UserSecretsClient

    user_secrets = UserSecretsClient()
    github_token = user_secrets.get_secret("GITHUB_TOKEN")
    user = "YOUR_GITHUB_USERNAME"
    CLONE_URL = f"https://oauth2:{github_token}@github.com/{user}/YOUR_REPO_NAME.git"
    get_ipython().system(f"git clone {CLONE_URL}")

    This code downloads your repo into the working directory of the current notebook. It assumes that you have saved your PAT in a Kaggle secret named GITHUB_TOKEN. Make sure that you enable the secret in the notebook settings before running it.

    3.2 Putting in Required Packages

    In the cloning notebook, you can also install the required packages. If you are using uv, you can build your custom module, install it, and install its dependencies by running the following commands: 3.

    cd ariel-2025 && uv build

    This creates a wheel file in the dist/ directory for your module. You can then install it and all its dependencies into a custom directory by running: 4.

    pip install /path/to/wheel/file --target /path/to/custom/dir

    Make sure to replace /path/to/wheel/file and /path/to/custom/dir with the actual paths. /path/to/wheel/file will be the path to the .whl file inside the REPO_NAME/dist/ directory. The /path/to/custom/dir can be any directory you like. Remember the custom directory path because subsequent notebooks will rely on it to import your module and your project dependencies.

    I like to both download the repo and install the packages in a single notebook. I give this notebook the same name as the repo to simplify importing it later.

    3.3 Running One of the Scripts

    The first thing to do in any subsequent notebook is to import the notebook containing the cloned repo and installed packages. When you do this, Kaggle stores the contents of /kaggle/working/ from the imported notebook into a directory named /kaggle/input/REPO_NAME/, where REPO_NAME is the name of the repo 5.

    Many times, your scripts will create outputs (e.g., submission files) relative to their locations. By default, your code will live in /kaggle/input/REPO_NAME/, which is read-only. Therefore, you need to copy the contents of the repo to /kaggle/working/, which is the current working directory and is read-write. While this may sometimes be unnecessary, it's a good practice that causes no harm and prevents silly issues.

    cp -r /kaggle/input/REPO_NAME/REPO_NAME/ /kaggle/working/

    If you directly run your scripts from /kaggle/working/scripts/, you will get import errors because Python can't find the installed packages and your module. This can easily be solved by updating the PYTHONPATH environment variable. I use the following command to update it and then run my scripts:

    ! export PYTHONPATH=/kaggle/input/REPO_NAME/custom_dir:$PYTHONPATH && cd /kaggle/working/REPO_NAME/scripts && python your_script.py --arg1 val1 --arg2 val2

    I usually give any notebook that runs a script the same name as the script for simplicity. Moreover, when I re-run the notebook on Kaggle, I name the version with the hash of the current Git commit to keep track of which version of the code was used to generate the results. 6.

    3.4 Putting Everything Together

    At the end, two notebooks are necessary:

    1. The Cloning Notebook: clones the repo and installs the required packages.
    2. The Script Notebook: runs one of the scripts.

    You may need more script notebooks in the pipeline. For example, you may have one notebook for training and another for inference. Each of these notebooks will follow the same structure as the script notebook discussed above.

    Separating each step in the pipeline (e.g. data preprocessing, training, inference) into its own notebook is useful when one step takes a long time to run and rarely changes. For example, in the Ariel Data Challenge, my preprocessing step took more than seven hours to run. If I had everything in one notebook, I would have to wait seven hours every time I tried a new idea. Moreover, time limits on Kaggle kernels would have made it impossible to run the entire pipeline in one notebook.

    Each notebook would then import the previous notebook's output, run its own step, and build from there. A good practice is to make the paths of any data files or models arguments to the scripts so that you can easily change them when running on Kaggle or in any other environment.
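    A minimal sketch of that practice, assuming hypothetical flag names and paths (the Kaggle dataset path below is illustrative): the same script accepts its input and output locations as arguments, so nothing is hard-coded for one environment.

```python
import argparse
from pathlib import Path

# Hypothetical script CLI: data and model locations are arguments,
# so the identical script runs locally and on Kaggle.
parser = argparse.ArgumentParser()
parser.add_argument("--data-dir", type=Path, default=Path("data/raw"))
parser.add_argument("--out-dir", type=Path, default=Path("models"))

# Local run: just use the defaults.
local = parser.parse_args([])
# Kaggle run: point at the competition input and the writable dir.
kaggle = parser.parse_args([
    "--data-dir", "/kaggle/input/my-competition",
    "--out-dir", "/kaggle/working/models",
])
print(local.data_dir)
print(kaggle.data_dir)
```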

    When you update your code, re-run the cloning notebook to update the code on Kaggle. Then, re-run only the necessary script notebooks to generate the new results.

    3.5 Is all this Effort Worth it?

    Absolutely yes!

    I know that the specified pipeline will add some overhead when starting your project. However, it will save you much more time and effort in the long run. You will be able to write all your code locally and run the same code on Kaggle.

    When you create a new model, all you have to do is copy one of the script notebooks and change the script. No conflicts will arise between your local and Kaggle code. You will be able to track all your changes using Git. You will be able to reproduce any old results by simply checking out the corresponding Git commit and re-running the necessary notebooks on Kaggle.

    Moreover, you will be able to develop on any machine you like. Everything is centralized on GitHub. You can work from your local machine. If you need more power, you can work from a cloud VM. If you want to train on Kaggle, you can do that too. All your code and environment are the same everywhere.

    This is such a small price to pay for such a great convenience. Once the pipeline is set up, you can forget about it and focus on what matters: researching and building models!

    4 Recording Learnings and Research

    When diving into a new domain, a huge part of your time will be spent researching, studying, and reading papers. It is easy to get lost in all the information you read, and you can forget where you encountered a certain idea or concept. To that end, it is important to manage and organize your learning.

    4.1 Readings Tracking

    Rajpurkar (2023) suggests keeping a list of all the papers and articles you read. This allows you to quickly review what you have read and refer back to it when needed.

    Professor Rajpurkar also suggests annotating each paper with one, two, or three stars. One-star papers are irrelevant papers, but you didn't know that before reading them. Two-star papers are relevant. Three-star papers are highly relevant. This allows you to quickly filter your readings later on.

    You should also take notes on every paper you read. These notes should focus on how the paper relates to your project. They should be short enough to be reviewed easily, but detailed enough to recall the main ideas. In the papers list, you should link reading notes to each paper for easy access.

    I also like keeping notes on the papers themselves, such as highlights. If you're using a PDF reader or an e-Ink device, you should store the annotated version of the paper for future reference and link it in your notes. If you prefer reading on paper, you can scan the annotated version and store it digitally.

    4.2 Instruments

    For most documents, I like using Google Docs because it allows me to access my notes from anywhere. Moreover, you can write on Google Docs in Markdown, which is my preferred writing format (I am using it to write this article).

    Zotero is a good tool for managing research papers. It's great at storing and organizing papers. You can create a collection for each project and store all the related papers there. Importing papers is very easy using the browser extension, and exporting citations in BibTeX format is simple.

    5 Experiment Tracking

    In data science projects, you will often run many experiments and try many ideas. Once again, it is easy to get lost in all this mess.

    We have already made a great step forward by structuring our codebase properly and using scripts to run our experiments. Nevertheless, I want to discuss two software tools that allow us to do even better.

    5.1 Wandb

    Weights and Biases (wandb), pronounced "w-and-b" (for weights and biases) or "wand-b" (for being magical like a wand) or "wan-db" (for being a database), is a great tool for tracking experiments. It allows us to run multiple experiments and save all their configurations and results in a central place.

    Figure 1: Wandb Dashboard. Image from Adrish Dey's Configuring W&B Projects with Hydra article.

    Wandb provides us with a dashboard to compare the results of different experiments, the hyperparameters used, and the training curves. It also tracks system metrics such as GPU and CPU utilization.

    Wandb also integrates with Hugging Face libraries, making it easy to track experiments when using transformers.

    Once you start running multiple experiments, wandb becomes an indispensable tool.

    5.2 Hydra

    Hydra is a tool built by Meta that simplifies configuration management. It lets you define all your configuration in YAML files and easily override it from the CLI.

    It's a very versatile tool and fits many use cases. This guide discusses how to use Hydra for experiment configuration.
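    As a rough illustration of the pattern (the file name and keys below are hypothetical, not from the article), a Hydra project keeps its configuration in YAML files:

```yaml
# conf/config.yaml (hypothetical example)
model:
  name: resnet18
  lr: 0.001
train:
  epochs: 10
  batch_size: 64
```

With Hydra's @hydra.main decorator loading this file, a run such as python train.py train.epochs=20 model.lr=0.01 overrides individual values from the command line without editing the file, and Hydra records the final composed config for each run.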

    6 The End-to-End Process

    Figure 2: End-to-End Organized Kaggle Competition Process, created by the author using Mermaid.js.

    Figure 2 summarizes the process discussed in this article. First, we research ideas and record our learnings. Then, we experiment with these ideas on our local machines in Jupyter Notebooks. Once we have a working idea, we refactor the code into our module and create scripts to run the experiments. We run the new experiment(s) on Kaggle. Finally, we track the results of the new experiments.

    Because everything is carefully tracked, we can spot our shortcomings and quickly head back to the research or development phases to fix them.

    7 Conclusion

    Disorder is the source of all evil in data science projects. If we are to produce reliable and reproducible work, we must strive for organization and clarity in our processes. Kaggle competitions are no exception.

    In this article, we discussed a technique to organize our codebase, tips to track research and learnings, and tools to track experiments. Figure 2 summarizes the proposed method.

    I hope this article was helpful to you. If you have any other tips or suggestions, please share them in the comments section below.

    Best of luck in your next competition!

    7.1 References

    Drivendata. (2022). The 10 Rules of Reliable Data Science.

    Rajpurkar, P. (2023). Harvard CS197: AI Research Experiences. https://www.cs197.seas.harvard.edu


