Usually, Python is fast enough, especially when you lean on NumPy, Polars, or other well-tuned libraries written in compiled languages like C. But sometimes you end up with a hot loop that won't vectorise: maybe you're walking a list of strings to clean them up, or you're parsing messy text where every character matters. You profile it, you confirm the culprit, and you stare at a for loop that eats half your runtime. This is the moment Rust shines.
Rust gives you predictable performance, tight control over memory, and fearless concurrency, without the hassle of manual memory management. If you're thinking "not another language to learn!", the good news is you don't have to abandon Python to use Rust. You can keep your orchestration, your notebooks, and your tests, and move only the tiny, boring inner loops to Rust. This keeps the Rust learning curve to an absolute minimum.
In this article, I'll show how to call Rust from Python and compare the performance of pure Python against a Python/Rust combination. This won't be a tutorial on Rust programming, as I'm assuming you know at least the basics.
Why bother?
Now, you might think: if I know Rust, why would I even bother integrating it with Python? Just program in Rust, right?
Well, first, knowing Rust doesn't automatically make it the best language for your entire application. For many systems, e.g. ML, AI, scripting, and web backends, Python is already the language of choice.
Secondly, most code isn't performance-critical. For the parts that are, you often need only a very small subset of Rust to make a real difference, so a little Rust knowledge can go a long way.
Finally, Python's ecosystem is hard to replace. Even if you know Rust well, Python gives you immediate access to tools like:
- pandas
- NumPy
- scikit-learn
- Jupyter
- Airflow
- FastAPI tooling
- a huge number of scripting and automation libraries
Rust may be faster, but Python often wins on ecosystem reach and development convenience.
Hopefully, I've done enough to convince you to give integrating Rust with Python a chance. With that said, let's get started.
Rust and Maturin
For our use cases, we need two things: Rust and a tool called maturin.
Most of you will know about Rust: a fast, compiled language that has come to the fore in recent years. You might not have heard of maturin, though.
Maturin is essentially a build and packaging tool for Python extensions written in Rust (using PyO3 or rust-cpython). It helps us do the following:
Builds your Rust code into a Python module
- Takes your Rust crate and compiles it into a shared library (.pyd on Windows, .so on Linux, .dylib on macOS) that Python can import.
- Automatically sets the correct compiler flags for release/debug builds and for the Python version you're targeting.
- Works with PyO3's extension-module feature, so Python can import the compiled library as a normal module.
Packages wheels for distribution
- Wheels are the .whl files (precompiled binaries) you upload to PyPI.
- Maturin supports building wheels for manylinux, macOS, and Windows that work across Python versions and platforms.
- It cross-compiles when needed, or runs inside a Docker image to satisfy PyPI's "manylinux" rules.
Publishes to PyPI
- With one command, maturin can build your Rust extension and upload it.
- It handles credentials, metadata, and platform tags automatically.
Integrates Rust into Python packaging
- Maturin generates a pyproject.toml that defines your project, so Python tools like pip know how to build it.
- It supports PEP 517, so pip install works even if the user doesn't have maturin installed.
- It works seamlessly with setuptools when you mix Python and Rust code in a single package.
OK, that's enough theory; let's get down to writing, running, and timing some code samples.
Setting up a development environment
As usual, we'll set up a separate development environment for our work. That way, it won't interfere with any other projects we might have on the go. I use the uv tool for this, and WSL2 Ubuntu for Windows as my operating system.
$ uv init pyrust
$ cd pyrust
$ uv venv pyrust
$ source pyrust/bin/activate
(pyrust) $
Installing Rust
Now we can install Rust with this simple command.
(pyrust) $ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
Eventually, three options will be displayed on your screen like this.
Welcome to Rust!
This will download and install the official compiler for the Rust
programming language, and its package manager, Cargo.
...
...
...
1) Proceed with standard installation (default - just press enter)
2) Customize installation
3) Cancel installation
Press 1, then press Enter when prompted, if you'd like to use the default installation options. To check that Rust is properly installed, run the following command.
(pyrust) $ rustc --version
rustc 1.89.0 (29483883e 2025-08-04)
Example 1: A Hello World equivalent
Let's start with a simple example of calling Rust from Python. Create a new sub-folder and add these three files.
Cargo.toml
[package]
name = "hello_rust"
version = "0.1.0"
edition = "2021"

[lib]
crate-type = ["cdylib"]

[dependencies]
pyo3 = { version = "0.25", features = ["extension-module"] }
pyproject.toml
[build-system]
requires = ["maturin>=1.5,<2"]
build-backend = "maturin"
[project]
name = "hello_rust"
version = "0.1.0"
requires-python = ">=3.9"
Finally, our Rust source file goes into the subfolder as src/lib.rs:
use pyo3::prelude::*;

/// A simple function we'll expose to Python
#[pyfunction]
fn greet(name: &str) -> PyResult<String> {
    Ok(format!("Hello, {} from Rust!", name))
}

/// The module definition
#[pymodule]
fn hello_rust(_py: Python, m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(greet, m)?)?;
    Ok(())
}
Build and install the module into the active environment (pip install . will invoke maturin as the build backend defined in pyproject.toml), then run it with:
(pyrust) $ python -c "import hello_rust as hr; print(hr.greet('world'))"
# Output
Hello, world from Rust!
We put our Rust code in src/lib.rs to follow the convention that Rust library code lives there, rather than in src/main.rs, which is reserved for stand-alone Rust executables.
Maturin + PyO3 looks inside src/lib.rs for the #[pymodule] function, which registers your Rust functions for Python to call.
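One handy convention when shipping code like this (my suggestion, not part of the article's setup) is to fall back to a pure-Python implementation when the compiled extension isn't available, so colleagues without a Rust toolchain can still run your code:

```python
try:
    from hello_rust import greet  # the compiled Rust extension, if installed
except ImportError:
    # Pure-Python fallback with the same signature
    def greet(name: str) -> str:
        return f"Hello, {name} from Python!"

print(greet("world"))
```

The call site stays identical either way; only the message tells you which implementation you got.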
Example 2: Python loops vs Rust loops
Consider something deliberately mundane but representative: you have a list of sentences and need to normalise them. By normalise, I mean converting them into a standard, consistent form before further processing.
Suppose we want to lowercase everything, drop punctuation, and split into tokens. This is hard to vectorise efficiently because the logic branches on every character.
In pure Python, you might write this:
# ------------------------
# Python baseline
# ------------------------
def process_one_py(text: str) -> list[str]:
    word = []
    out = []
    for c in text:
        if c.isalnum():
            word.append(c.lower())
        else:
            if word:
                out.append("".join(word))
                word = []
    if word:
        out.append("".join(word))
    return out

# Run the above for many inputs
def batch_process_py(texts: list[str]) -> list[list[str]]:
    return [process_one_py(t) for t in texts]
So, for example, calling
>>> batch_process_py(["Hello, World! 123", "This is a test"])
would return
[['hello', 'world', '123'], ['this', 'is', 'a', 'test']]
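A small subtlety worth flagging: the baseline above starts a new token at any non-alphanumeric character, whereas the Rust version below (and the benchmark variant later) splits only on whitespace and drops punctuation silently. The two agree on the sample input but diverge on, say, hyphenated words. A minimal sketch (helper names are mine, for illustration only):

```python
def split_any_nonalnum(text: str) -> list[str]:
    # Baseline behaviour: any non-alphanumeric char ends the current token
    word, out = [], []
    for c in text:
        if c.isalnum():
            word.append(c.lower())
        else:
            if word:
                out.append("".join(word))
                word = []
    if word:
        out.append("".join(word))
    return out

def split_whitespace_only(text: str) -> list[str]:
    # Rust-equivalent behaviour: only whitespace ends a token; punctuation vanishes
    word, out = [], []
    for c in text:
        if c.isalnum():
            word.append(c.lower())
        elif c.isspace():
            if word:
                out.append("".join(word))
                word = []
    if word:
        out.append("".join(word))
    return out

print(split_any_nonalnum("well-known fact"))     # ['well', 'known', 'fact']
print(split_whitespace_only("well-known fact"))  # ['wellknown', 'fact']
```

Neither behaviour is wrong; just make sure the Python and Rust sides implement the same one before comparing results.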
This is what the Rust equivalent might look like:
/// src/lib.rs
use pyo3::prelude::*;
use pyo3::wrap_pyfunction;

/// Process one string: lowercase + drop punctuation + split on whitespace
fn process_one(text: &str) -> Vec<String> {
    let mut out = Vec::new();
    let mut word = String::new();
    for c in text.chars() {
        if c.is_alphanumeric() {
            word.push(c.to_ascii_lowercase());
        } else if c.is_whitespace() {
            if !word.is_empty() {
                out.push(std::mem::take(&mut word));
            }
        }
        // ignore punctuation entirely
    }
    if !word.is_empty() {
        out.push(word);
    }
    out
}

#[pyfunction]
fn batch_process(texts: Vec<String>) -> PyResult<Vec<Vec<String>>> {
    Ok(texts.iter().map(|t| process_one(t)).collect())
}

#[pymodule]
fn rust_text(_py: Python<'_>, m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(batch_process, m)?)?;
    Ok(())
}
OK, let's run these two programs on a large input (500,000 texts) and see what the run-time differences are. For that, I've written the following benchmark Python script.
from time import perf_counter
from statistics import median
import random
import string

import rust_text  # the compiled extension

# ------------------------
# Python baseline
# ------------------------
def process_one_py(text: str) -> list[str]:
    word = []
    out = []
    for c in text:
        if c.isalnum():
            word.append(c.lower())
        elif c.isspace():
            if word:
                out.append("".join(word))
                word = []
        # ignore punctuation
    if word:
        out.append("".join(word))
    return out

def batch_process_py(texts: list[str]) -> list[list[str]]:
    return [process_one_py(t) for t in texts]

# ------------------------
# Synthetic data
# ------------------------
def make_texts(n=500_000, vocab=10_000, mean_len=40):
    words = ["".join(random.choices(string.ascii_lowercase, k=5)) for _ in range(vocab)]
    texts = []
    for _ in range(n):
        L = max(3, int(random.expovariate(1 / mean_len)))
        texts.append(" ".join(random.choice(words) for _ in range(L)))
    return texts

texts = make_texts()

# ------------------------
# Timing helper
# ------------------------
def timeit(fn, *args, repeat=5):
    runs = []
    for _ in range(repeat):
        t0 = perf_counter()
        fn(*args)
        t1 = perf_counter()
        runs.append(t1 - t0)
    return median(runs)

# ------------------------
# Run benchmarks
# ------------------------
py_time = timeit(batch_process_py, texts)
rust_time = timeit(rust_text.batch_process, texts)

n = len(texts)
print("\n--- Benchmark ---")
print(f"Python median: {py_time:.3f} s | throughput: {n/py_time:,.0f} texts/s")
print(f"Rust 1-thread median: {rust_time:.3f} s | throughput: {n/rust_time:,.0f} texts/s")
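The benchmark machinery itself is plain Python, so you can sanity-check it end to end without the Rust extension by shrinking the sizes (the numbers below are illustrative only, not the article's results):

```python
import random
import string
from statistics import median
from time import perf_counter

def process_one_py(text):
    # Same whitespace-splitting normaliser as the benchmark baseline
    word, out = [], []
    for c in text:
        if c.isalnum():
            word.append(c.lower())
        elif c.isspace():
            if word:
                out.append("".join(word))
                word = []
    if word:
        out.append("".join(word))
    return out

def timeit(fn, *args, repeat=5):
    # Median of several runs smooths out scheduler noise
    runs = []
    for _ in range(repeat):
        t0 = perf_counter()
        fn(*args)
        t1 = perf_counter()
        runs.append(t1 - t0)
    return median(runs)

random.seed(0)
texts = [
    " ".join("".join(random.choices(string.ascii_lowercase, k=5)) for _ in range(10))
    for _ in range(1_000)
]
t = timeit(lambda ts: [process_one_py(x) for x in ts], texts)
print(f"median over 5 runs: {t * 1000:.2f} ms for {len(texts):,} texts")
```

Median is a better summary than mean here because a single slow run (GC pause, OS hiccup) would skew an average.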
As before, we need to compile our Rust code so Python can import it. In the previous example, maturin was used indirectly, as the build backend via pyproject.toml. Here, we call it directly from the command line:
(pyrust) $ maturin develop --release
And now we can simply run our benchmark code like this.
(pyrust) $ python benchmark.py
--- Benchmark ---
Python median: 5.159 s | throughput: 96,919 texts/s
Rust 1-thread median: 3.024 s | throughput: 165,343 texts/s
That was a reasonable speed-up without too much effort. There's one more thing we can use to cut the runtime even further.
Rust has access to a parallelising library called Rayon, which makes it easy to spread work across multiple CPU cores. In a nutshell, Rayon:
- Lets you swap sequential iterators (iter()) for parallel iterators (par_iter()).
- Automatically splits your data into chunks, distributes the work across CPU threads, and then merges the results.
- Abstracts away the complexity of thread management and synchronisation.
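For contrast, the nearest pure-Python route to parallelism here is process-based (the GIL stops ordinary threads from speeding up a CPU-bound loop like this), and it carries pickling and process start-up overhead that Rayon's shared-memory threads avoid. A rough sketch using only the standard library (function names are mine, not from the article):

```python
from concurrent.futures import ProcessPoolExecutor

def process_one_py(text: str) -> list[str]:
    # Same whitespace-splitting normaliser as the article's baseline
    word, out = [], []
    for c in text:
        if c.isalnum():
            word.append(c.lower())
        elif c.isspace():
            if word:
                out.append("".join(word))
                word = []
    if word:
        out.append("".join(word))
    return out

def batch_process_py_parallel(texts: list[str], workers: int = 4) -> list[list[str]]:
    # Each worker process receives chunks of the input; results come back in order
    with ProcessPoolExecutor(max_workers=workers) as ex:
        return list(ex.map(process_one_py, texts, chunksize=256))

if __name__ == "__main__":
    print(batch_process_py_parallel(["Hello, World! 123", "This is a test"]))
```

For cheap per-item work like this, the inter-process overhead often eats most of the gain, which is exactly why Rayon's in-process threads are attractive.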
Example 3: Adding parallelism to our existing Rust code
This is straightforward. If we look at the Rust code from the previous example, we only need to make the following three minor changes (marked with comments below). We also need to add Rayon to the [dependencies] section of Cargo.toml, e.g. rayon = "1".
/// src/lib.rs
use pyo3::prelude::*;
use pyo3::wrap_pyfunction;
/// Add this line - change 1
use rayon::prelude::*;

/// Process one string: lowercase + drop punctuation + split on whitespace
fn process_one(text: &str) -> Vec<String> {
    let mut out = Vec::new();
    let mut word = String::new();
    for c in text.chars() {
        if c.is_alphanumeric() {
            word.push(c.to_ascii_lowercase());
        } else if c.is_whitespace() {
            if !word.is_empty() {
                out.push(std::mem::take(&mut word));
            }
        }
        // ignore punctuation entirely
    }
    if !word.is_empty() {
        out.push(word);
    }
    out
}

#[pyfunction]
fn batch_process(texts: Vec<String>) -> PyResult<Vec<Vec<String>>> {
    Ok(texts.iter().map(|t| process_one(t)).collect())
}

/// Add this function - change 2
#[pyfunction]
fn batch_process_parallel(texts: Vec<String>) -> PyResult<Vec<Vec<String>>> {
    Ok(texts.par_iter().map(|t| process_one(t)).collect())
}

#[pymodule]
fn rust_text(_py: Python<'_>, m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(batch_process, m)?)?;
    // Add this line - change 3
    m.add_function(wrap_pyfunction!(batch_process_parallel, m)?)?;
    Ok(())
}
In our benchmark Python code, we only need to add a call to the parallel Rust function and print the new results.
...
...
# ------------------------
# Run amended benchmarks
# ------------------------
py_time = timeit(batch_process_py, texts)
rust_time = timeit(rust_text.batch_process, texts)
rust_par_time = timeit(rust_text.batch_process_parallel, texts)
n = len(texts)
print("\n--- Benchmark ---")
print(f"Python median: {py_time:.3f} s | throughput: {n/py_time:,.0f} texts/s")
print(f"Rust 1-thread median: {rust_time:.3f} s | throughput: {n/rust_time:,.0f} texts/s")
print(f"Rust Rayon median: {rust_par_time:.3f} s | throughput: {n/rust_par_time:,.0f} texts/s")
Here are the results from running the amended benchmark.
--- Benchmark ---
Python median: 5.171 s | throughput: 96,694 texts/s
Rust 1-thread median: 3.091 s | throughput: 161,755 texts/s
Rust Rayon median: 2.223 s | throughput: 224,914 texts/s
The parallelised Rust code shaved roughly 28% off the single-threaded Rust time and was more than twice as fast as the bare Python code. Not too shabby.
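As a quick arithmetic check on those figures (the medians are copied from the benchmark output above):

```python
py, rust_1t, rust_rayon = 5.171, 3.091, 2.223  # medians from the run above, in seconds

# Rayon vs single-threaded Rust: fraction of the runtime shaved off
saved = (1 - rust_rayon / rust_1t) * 100
# Rayon vs pure Python: speed-up factor
speedup = py / rust_rayon

print(f"Rayon shaved {saved:.0f}% off the single-threaded Rust time")  # 28%
print(f"Rayon was {speedup:.1f}x faster than pure Python")             # 2.3x
```

which backs up the "roughly 28% off" and "more than twice as fast" claims.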
Summary
Python is usually fast enough for most tasks. But if profiling shows a slow spot that can't be vectorised and genuinely affects your runtime, you don't have to give up on Python or rewrite your entire project. Instead, you can move just the performance-critical parts to Rust and leave the rest of your code as it is.
With PyO3 and maturin, you can compile Rust code into a Python module that works smoothly with your existing libraries. This lets you keep most of your Python code, tests, packaging, and workflows, while getting the speed, memory safety, and concurrency benefits of Rust where you need them most.
The simple examples and benchmarks here show that rewriting only a small part of your code in Rust can make Python much faster. Adding Rayon for parallelism boosts performance even more, with only a few code changes and no complicated tooling. This is a practical, straightforward way to speed up Python workloads without switching your entire project to Rust.

