
    Beyond Code Generation: AI for the Full Data Science Workflow

    By Editor Times Featured | March 26, 2026 | 11 Mins Read


    I have been feeling a constant sense of AI FOMO. Every day, I see people sharing AI tips, new agents and skills they built, and vibe-coded apps. I’m increasingly realizing that adapting quickly to AI is becoming a requirement for staying competitive as a data scientist today.

    But I’m not only talking about brainstorming with ChatGPT, generating code with Cursor, or polishing a report with Claude. The bigger shift is that AI can now participate in a much more end-to-end data science workflow.

    To make the idea concrete, I tried it on a real project using my Apple Health data.


    A Simple Example: Apple Health Analysis

    Context

    I’ve been wearing an Apple Watch every day since 2019 to track my health data, such as heart rate, energy burned, sleep quality, and so on. This data contains years of behavioral signals about my daily life, but the Apple Health app mostly surfaces it with simple trend views.

    I tried to analyze a two-year Apple Health export six years ago, but it ended up becoming one of those side projects that never get finished… My goal this time is to extract more insights from the raw data quickly with the help of AI.

    What I had to work with

    Here are the relevant resources I have:

    1. Raw Apple Health export data: 1.85GB in XML, uploaded to my Google Drive.
    2. Sample code in my GitHub repo from six years ago that parses the raw export into structured datasets. The code could be outdated, though.
    Raw XML data screenshot by the author

    Workflow without AI

    A typical workflow without AI would look a lot like what I tried six years ago: inspect the XML structure, write Python to parse it into structured local datasets, conduct EDA with Pandas and NumPy, and summarize the insights.

    I’m sure every data scientist is familiar with this process. It’s not rocket science, but it takes time to build. Getting to a polished insights report would take at least a full day. That’s why that 6-year-old repo is still marked as WIP…
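To give a sense of what the manual parsing step involves, here is a minimal sketch using only the Python standard library. It is not the code from my old repo. It assumes the export stores each measurement as a `Record` element with `type`, `startDate`, and `value` attributes, and it uses a tiny inline sample in place of the real 1.85GB file. The function name `daily_totals` is hypothetical.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

# Tiny stand-in for the real export. The element and attribute names are an
# assumption about the export layout; check them against your own file.
SAMPLE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<HealthData>
  <Record type="HKQuantityTypeIdentifierActiveEnergyBurned" unit="kcal"
          startDate="2024-03-01 08:00:00 -0800" value="120.5"/>
  <Record type="HKQuantityTypeIdentifierActiveEnergyBurned" unit="kcal"
          startDate="2024-03-01 18:00:00 -0800" value="200"/>
  <Record type="HKQuantityTypeIdentifierHeartRate" unit="count/min"
          startDate="2024-03-01 08:05:00 -0800" value="92"/>
  <Record type="HKQuantityTypeIdentifierActiveEnergyBurned" unit="kcal"
          startDate="2024-03-02 07:00:00 -0800" value="80"/>
</HealthData>
"""


def daily_totals(source, record_type):
    """Stream <Record> elements and sum `value` per calendar day."""
    totals = defaultdict(float)
    # iterparse yields elements as they are read, so the file is never
    # loaded into memory in one piece.
    context = ET.iterparse(source, events=("start", "end"))
    _event, root = next(context)  # the first event is the root's start tag
    for event, elem in context:
        if event == "end" and elem.tag == "Record":
            if elem.get("type") == record_type:
                day = elem.get("startDate")[:10]  # "YYYY-MM-DD" prefix
                totals[day] += float(elem.get("value"))
            # Drop already-processed children so memory stays bounded.
            root.clear()
    return dict(totals)


energy = daily_totals(
    io.BytesIO(SAMPLE_XML), "HKQuantityTypeIdentifierActiveEnergyBurned"
)
print(energy)  # {'2024-03-01': 320.5, '2024-03-02': 80.0}
```

For the real file you would pass the path to the export instead of the `BytesIO` object. Note that some record types carry non-numeric values, so the `float` conversion only makes sense for quantity types like the one filtered here.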

    AI end-to-end workflow

    My updated workflow with AI is:

    1. AI locates the raw data in my Google Drive and downloads it.
    2. AI references my old GitHub code and writes a Python script to parse the raw data.
    3. AI uploads the parsed datasets to Google BigQuery. Of course, the analysis can be done locally without BigQuery, but I set it up this way to better resemble a real work environment.
    4. AI runs SQL queries against BigQuery to conduct the analysis and compile an analysis report.

    Essentially, AI handles nearly every step from data engineering to analysis, with me acting more as a reviewer and decision-maker.

    AI-generated report

    Now, let’s see what Codex was able to generate in 30 minutes with my guidance and some back-and-forth, excluding the time to set up the environment and tooling.

    I chose Codex because I mainly use Claude Code at work, so I wanted to explore a different tool. I used this chance to set up my Codex environment from scratch so I could better evaluate all the effort required.

    You can see that this report is well structured and visually polished. It summarized valuable insights into annual trends, exercise consistency, and the impact of travel on activity levels. It also provided recommendations and stated its limitations and assumptions. What impressed me most was not just the speed, but how quickly the output began to look like a stakeholder-facing analysis instead of a rough notebook.

    Please note that the report is sanitized for my data privacy.

    Codex-generated report (numbers adjusted for data privacy, screenshots by the author)

    How I Actually Did It

    Now that we have seen the impressive work AI can generate in 30 minutes, let me break it down and show you all the steps I took to make it happen. I used Codex for this experiment. Like Claude Code, it can run in the desktop app, an IDE, or the CLI.

    1. Set up MCP

    To enable Codex to access tools, including Google Drive, GitHub, and Google BigQuery, the next step was to set up Model Context Protocol (MCP) servers.

    The easiest way to set up MCP is to ask Codex to do it for you. For example, when I asked it to set up Google Drive MCP, it configured my local files quickly and gave clear next steps on how to create an OAuth client in the Google Cloud Console.

    It doesn’t always succeed on the first try, but persistence helps. When I asked it to set up BigQuery MCP, it failed at least 10 times before the connection succeeded. But each time, it provided me with clear instructions on how to test it and what information was helpful for troubleshooting.

    Codex MCP setup screenshots by the author

    2. Make a plan with Plan Mode

    After setting up the MCPs, I moved to the actual project. For a complicated project that involves multiple data sources, tools, or questions, I usually start with Plan Mode to decide on the implementation steps. In both Claude Code and Codex, you can enable Plan Mode with /plan. It works like this: you outline the task and your rough plan, then the model asks clarifying questions and proposes a more detailed implementation plan for you to review and refine. In the screenshots below, you can find my first iteration with it.

    Plan Mode screenshots by the author – Part 1
    Plan Mode screenshots by the author – Part 2
    Plan Mode screenshots by the author – Part 3

    3. Execution and iteration

    After I hit “Yes, implement this plan”, Codex started executing on its own, following the steps. It worked for 13 minutes and generated the first analysis below. It moved fast across different tools, but it did the analysis locally because it encountered more issues with the BigQuery MCP. After another round of troubleshooting, it was able to upload the datasets and run queries in BigQuery properly.

    First analysis output screenshot by the author

    However, the first-pass output was still shallow, so I guided it to go deeper with follow-up questions. For example, I have flight tickets and travel plans from past trips in my Google Drive. I asked it to find them and analyze my activity patterns during trips. It successfully located these files, extracted my travel days, and ran the analysis.
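The travel comparison boils down to tagging each day as "on a trip" or "at home" and comparing averages. The snippet below is a hedged sketch of that idea, not the code Codex produced. The daily totals, the trip range, and the helper names `expand_trip_days` and `compare_travel_activity` are all made up for illustration.

```python
from datetime import date, timedelta
from statistics import mean

# Hypothetical inputs: daily active-energy totals (kcal) and trip date ranges.
daily_kcal = {
    date(2024, 3, 1): 520.0,
    date(2024, 3, 2): 480.0,
    date(2024, 3, 3): 300.0,
    date(2024, 3, 4): 340.0,
    date(2024, 3, 5): 500.0,
}
trips = [(date(2024, 3, 3), date(2024, 3, 4))]  # inclusive start and end


def expand_trip_days(trip_ranges):
    """Turn inclusive (start, end) ranges into a set of individual dates."""
    days = set()
    for start, end in trip_ranges:
        for offset in range((end - start).days + 1):
            days.add(start + timedelta(days=offset))
    return days


def compare_travel_activity(daily, trip_ranges):
    """Return (mean on travel days, mean on all other days)."""
    travel_days = expand_trip_days(trip_ranges)
    on_trip = [v for d, v in daily.items() if d in travel_days]
    at_home = [v for d, v in daily.items() if d not in travel_days]
    return mean(on_trip), mean(at_home)


travel_avg, home_avg = compare_travel_activity(daily_kcal, trips)
print(f"travel: {travel_avg:.1f} kcal/day, home: {home_avg:.1f} kcal/day")
# travel: 320.0 kcal/day, home: 500.0 kcal/day
```

`statistics.mean` raises an error on an empty list, so a real version would need to handle periods with no trips or no home days. A plain difference in means also says nothing about why activity changes during travel.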

    After a few iterations, it was able to generate a much more comprehensive report, the one I shared at the beginning, within 30 minutes. You can find its code here. That was probably one of the most important lessons from the exercise: AI moved fast, but depth still came from iteration and better questions.

    Codex finding my past travel dates (screenshot by the author)

    Takeaways for Data Scientists

    What AI Changes

    Above is a small example of how I used Codex and MCPs to run an end-to-end analysis without manually writing a single line of code. What are the takeaways for data scientists at work?

    1. Think beyond coding assistance. Rather than using AI only for coding and writing, it’s worth expanding its role across the full data science lifecycle. Here, I used AI to locate raw data in Google Drive and upload parsed datasets to BigQuery. There are many more AI use cases related to data pipelining and model deployment.
    2. Context becomes a force multiplier. MCPs are what made this workflow much more powerful. Codex scanned my Google Drive to locate my travel dates and read my old GitHub code to find sample parsing code. Similarly, you can enable other company-approved MCPs to help your AI (and yourself) better understand the context. For example:
      – Connect to Slack MCP and Gmail MCP to search for past relevant conversations.
      – Use Atlassian MCP to access the table documentation on Confluence.
      – Set up Snowflake MCP to explore the data schema and run queries.
    3. Rules and reusable skills matter. Although I didn’t demonstrate it explicitly in this example, you should customize rules and create skills to guide your AI and extend its capabilities. These topics are worth their own article next time 🙂

    How the Role of Data Scientists Will Evolve

    But does this mean AI will replace data scientists? This example also sheds light on how data scientists’ roles will pivot in the future.

    1. Less manual execution, more problem-solving. In the example above, the initial analysis Codex generated was very basic. The quality of AI-generated analysis depends heavily on the quality of your problem framing. You need to define the question clearly, break it into actionable tasks, identify the right approach, and push the analysis deeper.
    2. Domain knowledge is critical. Domain knowledge is still very much required to interpret results correctly and provide recommendations. For example, AI noticed my activity level had declined significantly since 2020. It couldn’t find a convincing explanation, but said: “Possible causes include routine changes, work schedule, lifestyle shifts, injury, motivation, or less structured training, but these are inferences, not findings.” But the real reason behind it, as you might have guessed, is the pandemic. I started working from home in early 2020, so naturally, I burned fewer calories. This is a very simple example of why domain knowledge still matters. Even if AI can access all the past docs in your company, that doesn’t mean it will understand all the business nuances, and that’s your competitive advantage.
    3. This example was relatively straightforward, but there are still many classes of work where I would not trust AI to operate independently today, especially projects that require stronger technical and statistical judgment, such as causal inference.

    Important Caveats

    Last but not least, there are some things you need to keep in mind while using AI:

    1. Data security. I’m sure you’ve heard this many times already, but let me repeat it once more. The data security risk of using AI is real. For a personal side project, I can set things up however I want and take my own risk (honestly, granting AI full access to Google Drive feels like a risky move, so this is more for illustration purposes). But at work, always follow your company’s guidance on which tools are safe to use and how. And make sure to read through every single command before clicking “approve”.
    2. Double-check the code. For my simple project, AI can write accurate SQL without problems. But in more complicated business settings, I still see AI make mistakes in its code from time to time. Sometimes, it joins tables with different granularities, causing fan-out and double-counting. Other times, it misses critical filters and conditions.
    3. AI is convenient, but it might accomplish your ask with unexpected side effects… Let me tell you a funny story to end this article. This morning, I turned on my laptop and saw an alert that there was no disk storage left. I have a 512GB SSD MacBook Pro, and I was pretty sure I had only used around half of the storage. Since I was playing with Codex last night, it became my first suspect. So I actually asked it, “hey did you do something? My ‘system data’ had grown by 150GB overnight”. It responded, “No, Codex only takes xx MB”. Then I dug through my files and found a 142GB “bigquery-mcp-wrapper.log”… Most likely, Codex set up this log when it was troubleshooting the BigQuery MCP setup. Later, during the actual analysis task, it exploded into a huge file. So yes, this magical wishing machine comes at a cost.
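The fan-out problem from the second caveat is easy to reproduce on a toy dataset. The sketch below uses Python’s built-in SQLite module as a stand-in for BigQuery, and the `orders` and `order_items` tables are hypothetical. It only illustrates how joining tables at different granularities inflates a sum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE order_items (order_id INTEGER, sku TEXT);
    INSERT INTO orders VALUES (1, 100.0), (2, 50.0);
    -- order 1 has three line items, order 2 has one
    INSERT INTO order_items VALUES (1, 'A'), (1, 'B'), (1, 'C'), (2, 'D');
""")

# Naive join: each order row repeats once per item, so amount is over-counted.
fanned_out = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders o
    JOIN order_items i ON i.order_id = o.order_id
""").fetchone()[0]

# Safer: collapse items to one row per order before joining.
correct = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders o
    JOIN (SELECT order_id, COUNT(*) AS n_items
          FROM order_items
          GROUP BY order_id) i
      ON i.order_id = o.order_id
""").fetchone()[0]

print(fanned_out, correct)  # 350.0 150.0
conn.close()
```

A quick check when reviewing AI-written SQL is to compare row counts or totals before and after each join. If a total grows after joining a lookup or detail table, the join keys are probably not unique on one side.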

    This experience summed up the tradeoff well for me: AI can dramatically compress the distance between raw data and useful analysis, but getting the most out of it still requires judgment, oversight, and a willingness to debug the workflow itself.


