
    The Journey from Jupyter to Programmer: A Quick-Start Guide

    By Editor Times Featured | June 5, 2025 | 17 Mins Read


    Many data scientists, myself included, begin their coding journey using a Jupyter Notebook. These files have the extension .ipynb, which stands for Interactive Python Notebook. As the extension name suggests, it has an intuitive and interactive user interface. The notebook is broken down into 'cells', small blocks of separated code or markdown (text). Outputs are displayed beneath each cell once the code inside that cell has been executed. This provides a flexible and interactive environment for coders to build their coding skills and start working on data science projects.

    A typical example of a Jupyter Notebook is shown below:

    Example of a Jupyter Notebook with code cells, markdown cells and a sample visualisation.

    This all sounds great. And don't get me wrong, for use cases such as conducting solo research or exploratory data analysis (EDA), Jupyter Notebooks are great. The issues arise when you ask the following questions:

    • How do you turn a Jupyter Notebook into code that can be leveraged by a business?
    • Can you collaborate with other developers on the same project using a version control system?
    • How can you deploy code to a production environment?

    Pretty soon, the limitations of exclusively using Jupyter Notebooks within a commercial context will start to cause problems. They are simply not designed for these purposes. The general solution is to organise code in a modular fashion.

    By the end of this article, you should have a clear understanding of how to structure a small data science project as a Python program and appreciate the advantages of transitioning to a programming approach. An example template that supplements this article is available on my GitHub.


    Disclaimer

    The contents of this article are based on my experience of migrating away from solely using Jupyter Notebooks to write code. Do notebooks still have a purpose? Yes. Are there alternative ways to organise and execute code beyond the methods I discuss in this article? Yes.

    I wanted to share this information to help anyone wanting to make the move away from notebooks and towards writing scripts and programs. If I've missed any features of Jupyter Notebooks that mitigate the limitations I've mentioned, please drop a comment!

    Let's get back to it.


    Programming: what's the big deal?

    For the purpose of this article, I'll be focusing on the Python programming language as this is the language I use for data science projects. Structuring code as a Python program unlocks a range of functionality that is difficult to achieve when working exclusively within a Jupyter Notebook. These benefits include collaboration, versatility and portability: you're simply able to do more with your code. I'll explain these benefits further down, so stick with me a little longer!

    Python programs are typically organised into modules and packages. A module is a Python script (a file with a .py extension) containing Python code which can be imported into other files. A package is a directory that contains Python modules. I'll discuss the purpose of the file __init__.py later in the article.

    Schematic of package and module structure in a data science project

    Anytime you import a Python library into your code, such as built-in libraries like os or third-party libraries like pandas, you're interacting with a Python program that's been organised into packages and modules.

    For example, let's say you want to use the randint function from numpy. This function allows you to generate a random integer based on specified parameters. You might write:

    from numpy.random import randint

    Let's annotate that import statement to show what you're actually importing.

    In this instance, numpy is a package; random is a module inside it (strictly speaking a subpackage, since numpy.random is itself a directory of modules) and randint is a function.
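
    The same package / module / object pattern can be checked from Python itself. The sketch below uses the standard library's json package as a stand-in for numpy so that it runs without third-party dependencies; describe is a hypothetical helper, and the "has a __path__ attribute" test is the usual way to tell a package from a plain module.

```python
import importlib


def describe(dotted_name: str) -> str:
    """Return 'package' if the imported module has a __path__, else 'module'."""
    module = importlib.import_module(dotted_name)
    # Packages are modules that can contain other modules, which is what
    # the __path__ attribute signals.
    return "package" if hasattr(module, "__path__") else "module"


# json is a directory of modules (a package); json.decoder is a single .py file.
for name in ("json", "json.decoder"):
    print(name, "->", describe(name))

# `from json.decoder import JSONDecoder` therefore reads as
# `from <package>.<module> import <object>`, like the numpy import above.
```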

    So, it turns out you probably interact with Python programs regularly. This poses the question: what does the journey towards becoming a Python programmer look like?

    The great transition: where do you even start?

    The trick to building a functional Python program is all in the file structure and organisation. It sounds boring but it plays a super important part in setting yourself up for success!

    Let me use an analogy to explain: every house has a drawer that has just about everything in it; tools, elastic bands, medicine, your hopes and dreams, the lot. There's no rhyme or reason, it's a dumping ground of just about everything. Think of this as a Jupyter Notebook. This one file typically contains all stages of a project, from importing data, exploring what the data looks like, visualising trends, extracting features, training a model and so on. For a project that's destined to be deployed on a production system or co-developed with colleagues, it's going to cause chaos. What's needed is some organisation, to put all the tools in one compartment, the medicine in another and so on.

    A great way to do this with code is to use a project template. One that I use frequently is the Cookiecutter Data Science template. You can create a whole directory for your project with all the relevant files needed to do just about anything in a few simple operations in a terminal window; see the template's documentation for information on how to install and run it.

    Below are some of the key features of the project template:

    • package or src directory: directory for Python scripts/modules, equipped with examples to get you started
    • README.md: file to describe usage, setup and how to run the package
    • docs directory: containing files that enable seamless autodocumentation
    • Makefile: for writing OS-agnostic bespoke run commands
    • pyproject.toml/requirements.txt: for dependency management
    Project template created by the Cookiecutter Data Science package.
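
    To make the layout concrete, here is a minimal sketch that creates a stripped-down skeleton with the files listed above. It is not what the Cookiecutter Data Science tool generates; the LAYOUT dictionary, the scaffold function and every file name in it are assumptions chosen for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical minimal layout, loosely mirroring the features listed above.
LAYOUT = {
    "README.md": "# my_project\n",
    "Makefile": "",
    "requirements.txt": "",
    "docs/.gitkeep": "",
    "src/__init__.py": "",
    "src/data.py": "",
}


def scaffold(root: Path) -> list[str]:
    """Create every file in LAYOUT under root and return the relative paths."""
    created = []
    for relative, content in LAYOUT.items():
        target = root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content)
        created.append(relative)
    return sorted(created)


def demo():
    """Build the skeleton in a throwaway directory and report what exists."""
    with tempfile.TemporaryDirectory() as tmp:
        created = scaffold(Path(tmp))
        init_exists = (Path(tmp) / "src" / "__init__.py").is_file()
    return created, init_exists
```

    In practice a maintained template does far more than this (licences, docs configuration, example modules), which is the reason to prefer one over a hand-rolled script.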

    Top tip. Make sure to keep Cookiecutter up to date. With every release, new features are added in line with the ever-evolving data science universe. I've learnt quite a few things from exploring a new file or feature in the template!

    Alternatively, you can use other templates to build your project such as that provided by Poetry. Poetry is a package manager which you can use to generate a project template that's more lightweight than Cookiecutter.

    The best way to interact with your project is through an IDE (Integrated Development Environment). This software, such as Visual Studio Code (VS Code) or PyCharm, includes a variety of features and processes that enable you to code, test, debug and package your work efficiently. My personal preference is VS Code!


    From cells to scripts: let's get coding

    Now that we have a development environment and a nicely structured project template, how exactly do you write code in a Python script if you've only ever coded in a Jupyter Notebook? To answer that question, let's first consider a few industry-standard coding best practices.

    • Modular: follow the software engineering philosophy of the 'Single Responsibility Principle'. All code should be encapsulated in functions, with each function performing a single task. The Zen of Python states: 'Simple is better than complex'.
    • Readable: if code is readable, then there's a good chance it will be maintainable. Ensure the code is full of docstrings and comments!
    • Stylish: format code in a consistent and clear manner. The PEP 8 guidelines are designed for this purpose and advise how code should be presented. You can install autoformatters such as Black in an IDE so that code is automatically formatted in compliance with PEP 8 whenever the Python script is saved. For example, the right level of indentation and spacing will be applied so you don't even have to think about it!
    • Versatile: if code is encapsulated into functions or classes, these can be reused throughout a project.
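
    As a small illustration of these practices, the sketch below splits a typical notebook cell ("clean the column, then scale it") into single-purpose, documented functions. The function names and the choice of standardisation are assumptions made for the example, not something prescribed by the practices themselves.

```python
from statistics import mean, pstdev


def remove_missing(values):
    """Return the values with None entries removed."""
    return [v for v in values if v is not None]


def standardise(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    centre = mean(values)
    spread = pstdev(values)
    if spread == 0:
        raise ValueError("Cannot standardise values that are all identical.")
    return [(v - centre) / spread for v in values]


def clean_and_standardise(values):
    """Compose the two single-purpose steps into one reusable pipeline."""
    return standardise(remove_missing(values))
```

    Each function does one thing, so each can be tested, reused or replaced without touching the others.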

    For a deeper dive into coding best practice, this article is a fantastic overview of principles to adhere to as a Data Scientist, be sure to check it out!

    With these best practices in mind, let's return to the question: how do you write code in a Python script?


    Module structure

    First, separate the different stages of your notebook or project into different Python files, and make sure to name them according to the task. For example, you might have the following scripts in a typical machine learning package: data.py, preprocess.py, features.py, train.py, predict.py, evaluate.py and so on. Depending on your project structure, these would sit within the package or src directory.

    Within each script, code should be organised or 'encapsulated' into classes and/or functions. A function is a reusable block of code that performs a single, well-defined task. A class is a blueprint for creating an object, with its own set of attributes (variables) and methods (functions). Encapsulating code in this manner permits reusability and avoids duplication, thus keeping code concise.

    A script might only need one function if the task is simple. For example, a data loading module (e.g. data.py) may only contain a single function 'load_data' which loads data from a csv file into a pandas DataFrame. Other scripts, such as a data processing module (e.g. preprocess.py), will inherently involve more tasks and hence require more functions or a class to encapsulate these tasks.

    Example template of a typical module in a data science project.
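
    A module along these lines might look like the sketch below. To stay dependency-free it swaps pandas for the standard library's csv module, so load_data returns a list of dictionaries rather than a DataFrame; the Preprocessor class, its method names and the sample columns are all hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def load_data(path):
    """Load a CSV file into a list of row dictionaries."""
    with Path(path).open(newline="") as handle:
        return list(csv.DictReader(handle))


class Preprocessor:
    """Bundle several related cleaning steps behind one object."""

    def __init__(self, rows):
        self.rows = list(rows)

    def drop_incomplete(self):
        """Remove rows that contain an empty or missing value."""
        self.rows = [
            row for row in self.rows
            if all(value not in ("", None) for value in row.values())
        ]
        return self

    def to_float(self, column):
        """Convert one column from text to float."""
        for row in self.rows:
            row[column] = float(row[column])
        return self


def demo():
    """Write a tiny CSV to a temporary directory, then load and clean it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "people.csv"
        path.write_text("name,height\nann,1.5\nbob,\n")
        rows = load_data(path)
    return Preprocessor(rows).drop_incomplete().to_float("height").rows
```

    Returning self from each method allows the steps to be chained, which keeps the calling code short.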

    Top tip. Transitioning from Jupyter Notebooks to scripts may take some time and everyone's personal journey will look different. Some Data Scientists I know write code as Python scripts straight away and don't touch a notebook. Personally, I use a notebook for EDA, then encapsulate the code into functions or classes before porting to a script. Do whatever feels right for you.

    There are a few tools that can help with the transition. 1) In VS Code, you can select one or more lines, right click and select Run Python > Run Selection/Line in Python Terminal. This is similar to running a cell in a Jupyter Notebook. 2) You can convert a notebook to a Python script by clicking File > Download as > Python (.py). I wouldn't recommend that approach with large notebooks for fear of creating monster scripts, but the option is there!
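
    Because an .ipynb file is JSON under the hood, the conversion can also be sketched in a few lines of Python. This is an illustration only, assuming the common nbformat 4 layout (a top-level "cells" list whose entries have "cell_type" and "source"); notebook_to_script is a hypothetical helper, and for real use the jupyter nbconvert --to script command handles the edge cases.

```python
import json


def notebook_to_script(notebook_json: str) -> str:
    """Join the source of every code cell, separated by blank lines."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The source is stored either as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if source.strip():
            chunks.append(source.rstrip())
    return "\n\n".join(chunks) + "\n"
```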

    The '__main__' event

    At this point, we've established that code should be encapsulated into functions and stored within clearly named scripts. The next logical question is, how can you tie all these scripts together so code gets executed in the right order?

    The answer is to import these scripts into a single entry point and execute the code in one place. Within the context of developing a simple project, this entry point is typically a script named main.py (but it can be called anything). At the top of main.py, just as you would import necessary built-in packages or third-party packages from PyPI, you'll import your own modules or specific classes/functions from modules. Any classes or functions defined in these modules will be available to use by the script they've been imported into.

    To do this, the package directory within your project should contain an __init__.py file, which is typically left blank for simple projects. This file tells the Python interpreter to treat the directory as a regular package, meaning that any files with a .py extension get treated as modules and can therefore be imported into other files. (Since Python 3.3 a directory without __init__.py can still be imported as a namespace package, but including the file remains the usual convention.)
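
    The mechanics can be tried out in a throwaway directory. The sketch below builds a tiny package called demo_pkg (a made-up name) containing an empty __init__.py and a data.py module, then imports a function from it; adding the temporary directory to sys.path is only needed here because the package lives outside the current project.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create demo_pkg/ with an empty __init__.py and one module."""
    package_dir = root / "demo_pkg"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")
    (package_dir / "data.py").write_text(
        "def load_data():\n    return [1, 2, 3]\n"
    )


def import_from_demo_package() -> list:
    """Import demo_pkg.data from a temporary directory and call load_data."""
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(Path(tmp))
        sys.path.insert(0, tmp)  # make the temporary directory importable
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("demo_pkg.data")
            return module.load_data()
        finally:
            # Undo the changes so the demo leaves no trace behind.
            sys.path.remove(tmp)
            sys.modules.pop("demo_pkg.data", None)
            sys.modules.pop("demo_pkg", None)
```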

    The structure of main.py is project dependent, but it will generally be dictated by the necessary order of code execution. For a typical machine learning project, you would first need to use the load_data function from the module data.py. You then might instantiate the preprocessor class that's imported from the module preprocess.py and apply a variety of class methods to the preprocessor object. You would then move onto feature engineering and so on until you have the whole workflow written out. This workflow would typically be contained or referenced within a conditional statement at the bottom of main.py.

    Wait... who mentioned anything about a conditional statement? The conditional statement is as follows:

    if __name__ == '__main__':
        # add code here
        pass

    __name__ is a special Python variable that can have two different values depending on how the script is run:

    • If the script is run directly in the terminal, the interpreter assigns the __name__ variable the value '__main__'. Because the statement if __name__ == '__main__': is true, any code that sits within this statement is executed.
    • If the script is run as an imported module, the interpreter assigns the name of the module as a string to the __name__ variable. Because the statement if __name__ == '__main__': is false, the contents of this statement are not executed.
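
    Both cases can be observed from a single Python session. In this sketch a one-line script records its own __name__; runpy.run_path with run_name="__main__" stands in for running the file from the terminal, and importlib loads the same file as a module. The file name whoami.py is an arbitrary choice for the demonstration.

```python
import importlib.util
import runpy
import tempfile
from pathlib import Path

# The script simply records the value Python gave to __name__.
SCRIPT = "NAME = __name__\n"


def demo():
    """Return __name__ as seen when run as a script and when imported."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "whoami.py"
        script.write_text(SCRIPT)

        # Comparable to `python3 whoami.py`: executed as the top-level script.
        as_script = runpy.run_path(str(script), run_name="__main__")["NAME"]

        # Comparable to `import whoami` from another file.
        spec = importlib.util.spec_from_file_location("whoami", script)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        as_import = module.NAME

    return as_script, as_import
```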

    Some more information on this can be found here.

    Given this process, you'll need to reference the master function (the function that runs the whole workflow) within the if __name__ == '__main__': conditional statement so that it's executed when main.py is run. Alternatively, you can place the code directly beneath if __name__ == '__main__': to achieve the same outcome.

    Example template of main.py, which serves as the main entry point to the program
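
    A bare-bones version of such an entry point is sketched below. In a real project load_data and Preprocessor would be imported from data.py and preprocess.py; here they are defined inline as stand-ins so the example runs on its own, and the sample records and the main function name are assumptions.

```python
# main.py (sketch). In a real project these would be imports such as:
#   from src.data import load_data
#   from src.preprocess import Preprocessor


def load_data():
    """Stand-in for data.load_data: return a few raw records."""
    return [{"height": "1.5"}, {"height": ""}, {"height": "2.0"}]


class Preprocessor:
    """Stand-in for preprocess.Preprocessor."""

    def __init__(self, rows):
        self.rows = rows

    def drop_incomplete(self):
        """Remove records with an empty height."""
        self.rows = [row for row in self.rows if row["height"] != ""]
        return self

    def to_float(self):
        """Convert the height field from text to float."""
        self.rows = [{"height": float(row["height"])} for row in self.rows]
        return self


def main():
    """Master function: run each stage in the order the workflow requires."""
    raw = load_data()
    clean = Preprocessor(raw).drop_incomplete().to_float().rows
    return sum(row["height"] for row in clean) / len(clean)


if __name__ == "__main__":
    # Only runs when the file is executed directly, not when it is imported.
    print(f"Mean height: {main():.2f}")
```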

    main.py (or any Python script) can be executed in the terminal using the following syntax:

    python3 main.py

    Upon running main.py, code will be executed from all the imported modules in the specified order. This is the same as clicking the 'run all' button on a Jupyter Notebook where each cell is executed in sequential order. The difference now is that the code is organised into individual scripts in a logical manner and encapsulated within classes and functions.

    You can also add CLI (command-line interface) arguments to your code using tools such as argparse and typer, allowing you to toggle specific variables when running main.py in the terminal. This provides a lot of flexibility during code execution.
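
    With argparse from the standard library, a minimal version might look like this. The three options (--data-path, --test-size, --verbose) and their defaults are invented for the example; a real project would expose whichever parameters it needs.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options; argv=None means 'read from sys.argv'."""
    parser = argparse.ArgumentParser(description="Run the example pipeline.")
    parser.add_argument("--data-path", default="data/raw.csv",
                        help="Location of the input CSV file.")
    parser.add_argument("--test-size", type=float, default=0.2,
                        help="Fraction of rows held out for testing.")
    parser.add_argument("--verbose", action="store_true",
                        help="Print extra progress information.")
    return parser.parse_args(argv)


# Passing a list makes the behaviour easy to try without a terminal.
args = parse_args(["--test-size", "0.3", "--verbose"])
print(args.data_path, args.test_size, args.verbose)
```

    From the terminal, the equivalent call would be python3 main.py --test-size 0.3 --verbose, with main.py calling parse_args() without arguments.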

    So we've now reached the best part. The pièce de résistance. The real reasons why, beyond having beautifully organised and readable code, you should go to the effort of programming.


    The end game: what's the point of programming?

    Let's walk through some of the key benefits of moving beyond Jupyter Notebooks and transitioning to writing Python scripts instead.

    Visualisation of the key benefits of programming. Image generated by author.
    • Packaging & distribution: you can package and distribute your Python program so it can be shared, installed and run on another computer. Package managers such as pip, poetry or conda can be used to install the package, just as you would install packages from PyPI, such as pandas or numpy. The trick to successfully distributing your package is to ensure that the dependencies are managed correctly, which is where the files pyproject.toml or requirements.txt come in. Some useful resources can be found here and here.
    • Deployment: whilst there are multiple methods and platforms to deploy code, using a modular approach will put you in good stead to get your code production ready. Tools such as Docker enable the deployment of programs or applications in isolated environments called containers, which can be easily managed through CI/CD (continuous integration & deployment) pipelines. It's worth noting that while Jupyter Notebooks can be deployed using JupyterLab, this approach lacks the flexibility and scalability of adopting a modular, script-based workflow.
    • Version control: moving away from Jupyter Notebooks opens up the wonderful worlds of version control and collaboration. Version control systems such as Git are very much industry standard and offer a wealth of benefits, providing you use them correctly! Follow the motto 'incremental changes are key' and ensure that you make small, regular commits with logical commit messages in imperative language whenever you make functional changes whilst developing. This will make it far easier to keep track of changes and test code. Here is a super useful guide to using git as a data scientist.

    Fun fact. It's generally discouraged to commit Jupyter Notebooks to version control systems as it is difficult to track changes!

    • (Auto)Documentation: we all know that documenting code increases its readability, thus helping the reader understand what the code is doing. It's considered best practice to add docstrings to functions and classes within Python scripts. What's really cool is that we can use these docstrings to build an index of formatted documentation of your whole project in the form of html files. Tools such as Sphinx enable you to do this in a quick and easy way. You can read my previous article which takes you through this process step by step.
    • Reusability: adopting a modular approach promotes the reuse of code. There are many common tasks within data science projects, such as cleaning data or scaling features. There's little point in reinventing the wheel, so if you can reuse functions or classes with minor modification from previous projects, as long as there are no confidentiality restrictions, then save yourself that time! You might have a utils.py or classes.py module which contains generic, project-agnostic code that can be used across modules.
    • Configuration management: whilst this is possible with a Jupyter Notebook, it is common practice to use configuration management for a Python program. Configuration management refers to organising and managing a project's parameters and variables in a centralised way. Instead of defining variables throughout the code, they are stored in a file that sits within the project directory. This means that you don't need to interrogate the code to change a parameter. An overview of this can be found here.

    Note. If you use a YAML file (.yml) for configuration, your code will need to import the yaml module. Make sure to install the pyyaml package (not 'yaml') using pip install pyyaml. Forgetting this can lead to "package not found" errors. I've made this mistake, maybe more than once.
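
    The idea of centralised configuration can be sketched without any third-party package by swapping YAML for JSON, which the standard library reads directly. The keys test_size and random_seed and the load_config helper are hypothetical; with PyYAML installed, yaml.safe_load would play the role that json.loads plays here.

```python
import json

# Defaults live in one place instead of being scattered through the code.
DEFAULTS = {"test_size": 0.2, "random_seed": 42}


def load_config(text: str) -> dict:
    """Parse JSON config text and fill in any missing keys from DEFAULTS."""
    overrides = json.loads(text)
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        # Fail early on typos rather than silently ignoring a setting.
        raise KeyError(f"Unknown config keys: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}
```

    Changing a parameter then means editing the config file, not hunting through the modules for a hard-coded value.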

    • Logging: using loggers within a Python program enables you to easily track code execution, provide debugging information and monitor a program or application. Whilst this functionality is possible within a Jupyter Notebook, it is often considered overkill and is fulfilled with the print() function instead. By using Python's logging module, you can format a logging object to your liking. It has five standard messaging levels (debug, info, warning, error, critical), ordered by the severity of the events being logged. You can include logging messages throughout the code to provide insight into code execution, which can be printed to the terminal and/or written to a file. You can learn more about logging here.
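
    A minimal logging setup could look like the following sketch. The logger name "pipeline", the message format and the build_logger helper are assumptions; the stream is a parameter so that output can go to the terminal, a file or, as in the demo, an in-memory buffer.

```python
import io
import logging


def build_logger(name, stream, level=logging.INFO):
    """Return a logger that writes formatted messages to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep messages out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s | %(name)s | %(message)s")
    )
    logger.addHandler(handler)
    return logger


def demo():
    """Log at three levels and return the lines that were actually written."""
    stream = io.StringIO()
    logger = build_logger("pipeline", stream)
    logger.debug("This is below the INFO threshold, so it is dropped")
    logger.info("Loaded 3 rows")
    logger.warning("1 row had missing values")
    return stream.getvalue().splitlines()
```

    Lowering the level to logging.DEBUG while developing, and raising it again in production, changes how much is reported without editing any of the logging calls.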

    When are Jupyter Notebooks useful?

    As I alluded to at the start of this article, Jupyter Notebooks still have their place in data science projects. Their easy-to-use interface makes them great for exploratory and interactive tasks. Two key use cases are listed below:

    • Conducting exploratory data analysis on a dataset during the initial stages of a project.
    • Creating an interactive resource or report to demonstrate analytical findings. Note there are plenty of tools out there that you can use for this purpose, but a Jupyter Notebook can also do the trick.

    Final thoughts

    Thanks for sticking with me to the very end! I hope this discussion has been insightful and has shed some light on how and why to start programming. As with most things in Data Science, there isn't a single 'correct' way to solve a problem, but a considered multi-faceted approach depending on the task at hand.

    Shout out to my colleague and fellow data scientist Hannah Alexander for reviewing this article 🙂

    Thanks for reading!


