in 2022, issues have been wildly completely different.
Youngsters these days don’t know what it’s like.
I used to spend hours:
- Writing Python and SQL code from scratch, line by line
- Memorizing which libraries to import and what capabilities they contained (from sklearn.metrics import r2_score)
- Debugging code errors
- Writing documentation for my code
- Constructing dashboards to research giant datasets
Even in simply the final yr, as AI instruments have change into more and more extra superior, my job as an information scientist has modified. I’m much less of a coding machine and extra of a strategist. Somebody who understands the information in my group rather well and is aware of how you can greatest current it and derive insights from it.
Claude is altering issues even quicker
Claude is a type of instruments that I consider will remodel the trade and this profession quicker than anybody can think about. I received’t lie, it’s form of scary. On the identical time, there are methods by which information scientists can take possession of this software, grasp it, and proceed to remain forward of the sport.
Listed below are 3 CRUCIAL expertise each information scientist ought to be engaged on mastering proper now:
1. Claude Dashboards
I used to spend a complete day constructing a Tableau dashboard for a consumer simply to discover a couple of questions on a big dataset that may by no means be checked out once more in a couple of months.
Now, Claude can generate a completely working, interactive dashboard in a couple of minutes, full with:
- KPI metric playing cards
- Line charts
- Bar charts
- Drill-down buttons
- Tabs
- … and Extra
Let’s showcase a easy instance utilizing the AEP hourly energy dataset (CC0 license).
Claude Immediate:
I’ve a time sequence dataset of hourly power consumption (AEP_MW) with a datetime column. Construct me an interactive HTML dashboard that features:
1. 4 KPI playing cards exhibiting common load, peak load, minimal load,
and summer time vs winter comparability
2. A line chart exhibiting common load by hour of day break up by weekday vs weekend
3. A bar chart of common month-to-month load with increased months highlighted in a hotter shade
4. A bar chart of common load by day of week with weekends in a unique shade. Use a clear, minimal fashion.
The end result appears to be like like this:

Just a few insights instantly stand out from the dashboard that wouldn’t be doable to acquire from a uncooked CSV:
- Weekday consumption peaks sharply round 5-6 PM, whereas weekends peak earlier (round 2 PM) and at a decrease stage general
- July and August consumption is considerably increased than spring months, confirming robust summer time seasonality from air-con load
- Saturday and Sunday masses are constantly about 10% decrease than weekdays
All these dashboards are good for doing EDA in addition to for producing one-time reviews for stakeholders who simply wish to know what’s occurring at a single cut-off date. It’s also possible to generate a dashboard on a schedule so you will get a brand new report each week.
2. Claude Cowork for Prioritizing Jira Tickets & Duties

Right here’s what a typical Monday morning used to appear to be for me: open Jira, click on by 20 open tickets, attempt to bear in mind the context on each, determine what’s blocking what, and write a tough precedence record for the week.
Claude Cowork is completely different from Claude Chat in that it truly connects to your desktop and might learn/write recordsdata. It could hook up with Jira (Or one other Scrum/Agile platform), and summarize your priorities for the week. Right here’s an instance:
Pull all my open tickets from the present dash. For each, give me: the ticket ID, a one-sentence abstract of what must occur, the present standing, and any blockers. Rank them by precedence and inform me what I ought to sort out first as we speak.

Listed below are a couple of different prompts you should utilize with Cowork:
Writing tickets to Jira
Listed below are my notes from as we speak’s mannequin assessment assembly: [paste notes – or link to the notes if your Cowork is connected to Google Drive]. Create Jira tickets for every motion merchandise within the DS challenge.
For each, write a transparent title, a 2-sentence description of what
must occur and why, set the precedence based mostly on urgency,
and assign them to the present dash.
Getting ready for a stakeholder assembly
Learn the final 3 weeks of feedback on tickets tagged ‘model-deployment’ and write me a 5-bullet standing abstract I can share with the engineering workforce lead. Hold it non-technical.
Drafting documentation from scratch
Open the file preprocessing_pipeline.py in my challenge folder and write a README part explaining what the pipeline does, what inputs it expects, and what it outputs.
Finish-of-sprint reporting
Based mostly on the closed tickets from this dash, write a 3-paragraph dash abstract for my supervisor that covers what we shipped, what we realized, and what’s carrying over to subsequent dash.
This can be a large time saver and also will preserve you extra organized.
3. Debugging with Claude Code

Claude Code is a command-line software that runs in your terminal with full entry to your codebase. It could:
- Learn recordsdata throughout your challenge
- Run instructions
- Execute assessments
- Make modifications throughout a number of recordsdata
For information scientists, probably the most instantly helpful utility is debugging pipelines.
Right here’s an actual situation I bumped into at work just lately with dbt. The names of the fashions and recordsdata have been modified so I don’t share any confidential firm info.
I ran dbt run --select fct_energy_forecast and acquired this:Database Error in mannequin fct_energy_forecast column "meter_reading_mw" doesn't exist LINE 14: AVG(meter_reading_mw) AS avg_load_mw,
The issue with dbt fashions is {that a} column error in a downstream mart mannequin doesn’t let you know the place the column truly broke. It might have been renamed within the uncooked supply, within the staging mannequin, in an intermediate aggregation layer, or within the mart itself. To search out the basis trigger manually, you’d need to open every file within the dependency chain one after the other, hint the column identify by each transformation, and determine the place the previous identify was by no means up to date. On a challenge with 24 fashions and 6 sources, that may very well be over an hour of studying, re-running and re-building fashions.
I handed it to Claude Code as a substitute:
My dbt mannequin fct_energy_forecast is failing with ‘column meter_reading_mw doesn’t exist’.
Discover the place this column is outlined upstream, hint all dependent
fashions and supply recordsdata, determine what occurred, and repair it.
Claude learn each file within the dependency chain and got here again in about 40 seconds with a prognosis.
It then utilized the repair throughout all three traces, re-ran the mannequin, and confirmed it handed.
Conclusion
As instruments evolve, our roles will too. Claude is altering the kind of work that information scientists are going to finish up doing. As a substitute of spending 8 hours a day debugging numerous dbt and Python errors, these errors will probably be resolved in 2 minutes, permitting us extra time to dive deeper into our information and ask extra vital questions. As information scientists in 2026, it’s vital that we repeatedly develop our skillset and stay updated.
It’s additionally vital to notice that whereas Claude has plenty of capabilities, it’s nonetheless AI and might (and does) make errors. Information scientists who’ve mastery of Claude will nonetheless be wanted to validate information, enhance prompts and processes, and proper Claude when it’s improper.

