This tutorial shows you how to run the same expression on different execution engines. You’ll learn when to choose each backend, see how Xorq moves data between them using Apache Arrow, and compare backend performance to find the best fit for your workload.
After completing this tutorial, you’ll know how to pick the right backend and understand the performance trade-offs.
Important
This tutorial requires DuckDB support. Install with pip install "xorq[duckdb]" or pip install "xorq[examples]" for all tutorial dependencies.
Why switch backends?
Different backends excel at different tasks. DuckDB handles analytical queries efficiently, Pandas works great for small datasets and prototyping, and DataFusion gives you custom UDF capabilities.
Xorq lets you write your expression once and run it anywhere. Same code, different engines.
Tip
Xorq uses Apache Arrow to move data between backends without serialization overhead. This makes backend switching fast and memory-efficient.
To see this in practice, you’ll run the same expression on the iris dataset across three backends: embedded, DuckDB, and Pandas.
Run on the embedded backend
You’ll start with Xorq’s default embedded backend. This uses DataFusion, an in-memory query engine optimized for Arrow operations.
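A minimal sketch of this step, reusing the calls that appear in the complete example at the end of this tutorial (xo.connect(), xo.examples.iris.fetch(), and the filter/group_by/agg chain). The helper names avg_width_by_species and run_on_embedded are illustrative and not part of Xorq. The builder uses plain lambdas in place of xo._, on the assumption that Xorq's Ibis-style API accepts callables, so it can be defined without importing anything.

```python
def avg_width_by_species(table):
    """Build the tutorial expression against any backend's iris table."""
    return (
        table
        .filter(lambda t: t.sepal_length > 6)            # keep rows with long sepals
        .group_by("species")
        .agg(avg_width=lambda t: t.sepal_width.mean())   # average width per species
    )


def run_on_embedded():
    """Execute the expression on Xorq's default embedded backend."""
    import xorq.api as xo  # imported here so the builder above has no hard dependency

    con = xo.connect()                            # embedded (DataFusion) backend
    iris = xo.examples.iris.fetch(backend=con)    # load iris into that backend
    return avg_width_by_species(iris).execute()   # materialize the result
```

Calling run_on_embedded() executes the query and returns the aggregated result. Because the builder only needs a table, the same function can be handed a table from any other backend.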
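Switch to DuckDB
Next, run the same expression on DuckDB by connecting with xo.duckdb.connect(). The expression code is identical to the embedded version; only the backend connection changes.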
Note
This DuckDB connection is in-memory. To use a persistent database file, pass database="my_db.duckdb" to connect().
Switch to Pandas
Now you’ll run the same expression on Pandas. Pandas is great for small datasets and interactive analysis, making it perfect for prototyping and working with data that fits in memory.
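With all three backends introduced, here is one possible way to run a single expression across them. This is a sketch, not Xorq's own tooling: run_everywhere and demo are hypothetical helper names, and xo.pandas.connect() is assumed to mirror the xo.duckdb.connect() call used elsewhere in this tutorial. The xorq import sits inside demo() so the generic helper stays dependency-free.

```python
def run_everywhere(connections, load_table, build_expr):
    """Run one expression builder on every backend and collect the results.

    connections: dict mapping a label to a backend connection
    load_table:  callable(con) -> table living in that backend
    build_expr:  callable(table) -> expression with an .execute() method
    """
    results = {}
    for name, con in connections.items():
        table = load_table(con)                      # load data into this backend
        results[name] = build_expr(table).execute()  # same expression, different engine
    return results


def demo():
    """Wire the helper to Xorq; call this only where xorq is installed."""
    import xorq.api as xo

    connections = {
        "embedded": xo.connect(),
        "duckdb": xo.duckdb.connect(),
        "pandas": xo.pandas.connect(),  # assumption: Pandas backend exposed as xo.pandas
    }
    return run_everywhere(
        connections,
        load_table=lambda con: xo.examples.iris.fetch(backend=con),
        build_expr=lambda t: (
            t.filter(xo._.sepal_length > 6)
            .group_by("species")
            .agg(avg_width=xo._.sepal_width.mean())
        ),
    )
```

demo() returns a dictionary keyed by backend label, which makes it easy to check that every engine produces the same answer.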
So far, you’ve loaded data separately into each backend. But what if you start analysis in one backend and need to switch to another mid-workflow? That’s where data transfer comes in.
Move data between backends
Sometimes you need to move data from one backend to another. Xorq makes this easy with .into_backend().
First, see what you can do on the embedded backend:
What happened? The trade at time 12 gets the price from time 10 (the most recent price before 12), and the trade at time 28 gets the price from time 20 (the most recent price before 28). This is an as-of temporal join, a DuckDB feature that is not available in the embedded backend.
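The matching rule described above can be sketched in plain Python. This is not how DuckDB or Xorq implement as-of joins; it only illustrates the semantics, assuming a match on the latest price at or before each trade time. The function name asof_join and the price values are made up for illustration; only the times 10, 12, 20, and 28 come from the example above.

```python
from bisect import bisect_right


def asof_join(trade_times, prices):
    """Attach to each trade the latest price whose time is <= the trade time.

    trade_times: list of trade timestamps
    prices:      list of (time, price) tuples sorted by time
    """
    price_times = [t for t, _ in prices]
    joined = []
    for trade_time in trade_times:
        # Index of the last price row at or before this trade, or -1 if none.
        idx = bisect_right(price_times, trade_time) - 1
        matched_time, matched_price = prices[idx] if idx >= 0 else (None, None)
        joined.append((trade_time, matched_time, matched_price))
    return joined


# Hypothetical price values; the times mirror the example above.
prices = [(10, 100.0), (20, 101.5), (30, 99.0)]
print(asof_join([12, 28], prices))
# [(12, 10, 100.0), (28, 20, 101.5)]
```

A trade that happens before the first price (for example at time 5) has nothing to match, so this sketch returns None for both matched fields.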
.into_backend() transfers data between backends using Arrow’s zero-copy protocol.
Tip
Move data to a different backend when you need specific features (like DuckDB’s AsOf joins for temporal data) or better performance for your query type.
Compare backend performance
You’ll time the same query on different backends to see performance characteristics.
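One simple way to take these timings is a small harness built on the standard library's time.perf_counter. The names time_call and compare are hypothetical helpers, not Xorq functions, and the sketch assumes each backend's query is wrapped in a zero-argument callable such as an expression's execute method.

```python
import time


def time_call(fn, repeats=5):
    """Return the fastest wall-clock time in seconds over `repeats` calls of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def compare(named_callables, repeats=5):
    """Time each callable and return {name: seconds}, fastest first."""
    timings = {name: time_call(fn, repeats) for name, fn in named_callables.items()}
    return dict(sorted(timings.items(), key=lambda item: item[1]))
```

For example, with expr and expr_duck built as in the complete example at the end of this tutorial, compare({"embedded": expr.execute, "duckdb": expr_duck.execute}) would return the two timings in ascending order. Taking the minimum over several repeats reduces noise from one-off costs such as the first connection or import.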
For small datasets like iris, performance differences are minimal. With larger datasets, DuckDB and the embedded backend typically outperform Pandas.
Choose the right backend
Each backend has different strengths. The embedded backend works well for most tasks. For analytical queries on larger datasets or temporal operations like AsOf joins, DuckDB is a better fit. Pandas works best for small datasets and quick prototyping.
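The guidance above can be written down as a tiny decision helper. This is only a sketch of the rules of thumb in this section: suggest_backend and its flag names are hypothetical, and what counts as a large dataset is left to you.

```python
def suggest_backend(needs_asof_join=False, large_dataset=False, prototyping=False):
    """Map the rules of thumb from this section to a backend label."""
    if needs_asof_join or large_dataset:
        return "duckdb"    # temporal joins and larger analytical queries
    if prototyping:
        return "pandas"    # small data, quick interactive work
    return "embedded"      # sensible default for most tasks


print(suggest_backend())                      # embedded
print(suggest_backend(prototyping=True))      # pandas
print(suggest_backend(needs_asof_join=True))  # duckdb
```

The feature check comes first on purpose: if a query needs an operation only one backend supports, that requirement outweighs any preference based on dataset size.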
Warning
Not all backends support every operation. For example, some complex window functions might work in DuckDB but not in Pandas. Check the documentation if you hit an unsupported operation error.
Now that you understand when to use each backend, here’s a complete workflow that ties everything together.
Complete example
Here’s a full example showing backend switching:
import xorq.api as xo

# Connect to backends
embedded = xo.connect()
duckdb = xo.duckdb.connect()

# Load data in embedded backend
data = xo.examples.iris.fetch(backend=embedded)

# Build expression
expr = (
    data
    .filter(xo._.sepal_length > 6)
    .group_by("species")
    .agg(avg_width=xo._.sepal_width.mean())
)

# Execute on embedded backend
result1 = expr.execute()
print("Embedded result:", result1)

# Move to DuckDB and execute there
data_in_duck = data.into_backend(duckdb)
expr_duck = (
    data_in_duck
    .filter(xo._.sepal_length > 6)
    .group_by("species")
    .agg(avg_width=xo._.sepal_width.mean())
)
result2 = expr_duck.execute()
print("DuckDB result:", result2)
Next steps
Now you know how to switch backends. Continue learning: