Season 17 · 12 Episodes · 42 min · 2026

Apache Cassandra with Python

2026 Edition. A technical podcast series exploring the distributed architecture of Apache Cassandra and how to interact with it using the DataStax Python Driver. Covers data modeling, execution profiles, LWTs, async queries, and the cqlengine Object Mapper.

Databases · Distributed Computing
1
The Big Picture
An introduction to Apache Cassandra. Learn why global-scale applications choose this distributed NoSQL database and how it differs from traditional relational systems.
3m 12s
2
Consistent Hashing and The Ring
Dive into the architecture of Cassandra. We explore consistent hashing, the token ring, and how data is partitioned across multiple nodes without a master server.
3m 40s
3
Query Driven Data Modeling
Unlearn everything you know about relational databases. Learn how Cassandra's query-driven modeling requires denormalization, and the crucial difference between partition keys and clustering keys.
3m 01s
4
Connecting with Python
Get started with the DataStax Python Driver. Learn how to instantiate a Cluster, connect to a Session, and establish communication with your Cassandra nodes.
3m 44s
5
Execution Profiles
Manage complex workloads seamlessly by using Execution Profiles. Learn how to configure load balancing, timeouts, and consistency levels per-query without polluting your cluster setup.
3m 39s
6
Prepared Statements
Learn how to execute CQL commands from Python. We cover simple statements and the critical performance benefits of using Prepared Statements for frequent queries.
2m 53s
7
Paging Large Queries
Never crash your app by loading a massive dataset into memory. Discover how the Python driver automatically pages large query results and how to manage fetch sizes.
3m 26s
8
High Throughput Async Queries
Maximize your application's throughput. Learn how to use execute_async, ResponseFutures, and callbacks to run concurrent requests against Cassandra.
3m 57s
9
Lightweight Transactions
Implement compare-and-set operations safely. Learn how Lightweight Transactions (LWTs) work in Cassandra and how to inspect the specialized applied column in your Python results.
3m 38s
10
The Object Mapper Models
Avoid raw CQL strings and model your data using Python classes. Learn how to use cqlengine to define tables, specify primary keys, and synchronize your schema.
3m 36s
11
Making Queries with cqlengine
Retrieve and filter data fluently using QuerySet objects in the cqlengine Object Mapper. We cover filtering operators, immutability, and limitations on ordering.
3m 55s
12
Vector Search for AI
Future-proof your skill set with Cassandra 5.0's Vector Search. Discover how to store and query high-dimensional vectors to power modern AI and machine learning applications.
3m 25s

Episodes

1

The Big Picture

3m 12s

An introduction to Apache Cassandra. Learn why global-scale applications choose this distributed NoSQL database and how it differs from traditional relational systems.

Hi, this is Alex from DEV STORIES DOT EU. Apache Cassandra with Python, episode 1 of 12. Relational databases hit a hard wall when you try to deploy them across multiple continents. You end up fighting latency, downtime, or a brittle architecture where a single failing server takes down your write operations. Massive global scale requires a completely different database paradigm. That paradigm is Apache Cassandra. It is an open-source, distributed NoSQL database built for immense scale. When engineers first look at Cassandra, they often bring baggage from relational systems. They look for the primary node that handles writes and the read-only replicas that follow it. You need to drop that mental model immediately. Cassandra does not use a primary-replica design. It operates entirely on a masterless, multi-primary architecture. Every single node in the cluster is identical. Any node can accept a read request, and any node can accept a write request. To understand why it works this way, look at its history. Cassandra was originally developed by combining two major research breakthroughs. First, it took the fully distributed, masterless network design from Amazon Dynamo. This dictates how nodes communicate, discover each other, and replicate data across the network. Second, it adopted the log-structured storage engine from Google Bigtable, which handles how data is physically written to disk. The result is a system engineered specifically for continuous availability across multiple datacenters. Consider a global social network. You have active users in Tokyo, London, and New York simultaneously. If a user in London updates their profile, that write operation needs to happen instantly. Routing that request across the Atlantic to a single central database is too slow. With Cassandra, the user writes to a local node in the London datacenter. 
That node accepts the write locally and immediately takes responsibility for replicating it to Tokyo and New York in the background. Here is the key insight. Because every node acts as a primary, there is no single point of failure. Cassandra arranges its nodes in a logical ring. When data enters the system, a mathematical hash determines exactly which nodes in the ring own that specific piece of data. If a major outage takes the entire London datacenter offline, the social network does not go down. Tokyo and New York continue accepting reads and writes without interruption. When London comes back online, the other datacenters automatically sync the missing data to the recovered nodes. You achieve true global availability with zero downtime. This masterless design also means scaling is predictable and linear. If you need more storage or more write capacity, you simply plug another node into the ring. The cluster automatically detects the new hardware and redistributes the data to balance the load across the active machines. Cassandra forces you to trade the comfort of traditional database queries for something much harder to build at scale: the absolute guarantee that no matter what hardware fails, your database stays online and your operations succeed. If you want to help support the show, you can search for DevStoriesEU on Patreon. That is all for this one. Thanks for listening, and keep building!
2

Consistent Hashing and The Ring

3m 40s

Dive into the architecture of Cassandra. We explore consistent hashing, the token ring, and how data is partitioned across multiple nodes without a master server.

Hi, this is Alex from DEV STORIES DOT EU. Apache Cassandra with Python, episode 2 of 12. Naive data hashing completely breaks down the moment you add a new server to your database cluster. When the number of servers changes, almost every piece of data has to move to a new location. The solution to this scaling chaos is Consistent Hashing and The Ring. A standard way to distribute data across multiple servers is modulo hashing. If you have eight nodes, you take a partition key like a user ID, hash it, and divide the result by eight. The remainder tells you which node gets the user profile. This works perfectly until your storage fills up and you plug in a ninth node. Now you divide by nine. The remainders change. Almost every user profile in your system suddenly belongs on a different server, causing a massive, cluster-crushing storm of data movement. Cassandra avoids this entirely. It uses consistent hashing to predictably distribute data across the cluster without relying on any central coordinator. Instead of a simple modulo calculation, Cassandra maps both the data and the nodes to a fixed, continuous circular space called the token ring. By default, Cassandra uses a hash function that generates a massive range of possible numbers. The lowest possible hash value connects directly back to the highest, forming a closed circle. Every physical node in the cluster is assigned a specific number, or token, somewhere on this ring. When you insert a user profile, Cassandra hashes the user ID to produce a token. To find out which node owns this profile, the system locates the data token on the ring and moves clockwise. The first node it encounters is the owner. Think back to our eight-node cluster. If we add a ninth physical node, it gets assigned a single new token on the ring, landing between two existing nodes. Because data ownership is determined by walking clockwise, this new ninth node only takes over a specific slice of data from its immediate clockwise neighbor. 
The other seven nodes do nothing. Their data remains completely untouched. Here is the key insight. Assigning exactly one token to one physical node creates operational problems. It is difficult to balance the data perfectly, and when you add a new node, only one neighboring server is responsible for handing over the data. That single neighbor gets hammered under heavy load. To fix this, Cassandra uses virtual nodes, or vnodes. Instead of giving a physical server one massive contiguous slice of the ring, vnodes chop the ring into many smaller ranges. A single physical node is assigned hundreds of different tokens distributed randomly around the ring. When you add that ninth physical node using vnodes, it claims many small slices of the ring from all the existing servers at once. Now, instead of one neighbor doing all the heavy lifting, the entire cluster evenly shares the work of streaming data to the new machine. Consistent hashing and virtual nodes decouple data placement from the raw number of physical servers, allowing a cluster to scale up smoothly and operate predictably without any central master dictating where data should go. That is your lot for this one. Catch you next time!
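The clockwise walk and the vnode trick described above can be sketched in a few lines of plain Python. This is a toy model, not the driver's or Cassandra's actual implementation: MD5 stands in for Cassandra's real Murmur3 partitioner, and the node names and vnode count are arbitrary.

```python
import bisect
import hashlib

class Ring:
    """A toy token ring with virtual nodes (vnodes)."""

    def __init__(self, vnodes=8):
        self.vnodes = vnodes
        self.tokens = []   # sorted token positions on the ring
        self.owners = {}   # token -> physical node name

    def _token(self, key):
        # Map any string onto the fixed circular token space.
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, name):
        # Each physical node claims many small slices of the ring.
        for i in range(self.vnodes):
            t = self._token(f"{name}:{i}")
            bisect.insort(self.tokens, t)
            self.owners[t] = name

    def owner(self, key):
        # Walk clockwise from the key's token to the first node token.
        t = self._token(key)
        i = bisect.bisect_right(self.tokens, t)
        if i == len(self.tokens):
            i = 0  # wrap around: the ring is a closed circle
        return self.owners[self.tokens[i]]

ring = Ring()
for name in ("node1", "node2", "node3"):
    ring.add_node(name)
ring.add_node("node4")  # only keys in node4's new slices change owner
```

Adding `node4` reassigns only the keys that fall inside its new slices; every key that moves lands on `node4`, and everything else stays put, which is exactly the property that makes scaling incremental.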
3

Query Driven Data Modeling

3m 01s

Unlearn everything you know about relational databases. Learn how Cassandra's query-driven modeling requires denormalization, and the crucial difference between partition keys and clustering keys.

Hi, this is Alex from DEV STORIES DOT EU. Apache Cassandra with Python, episode 3 of 12. In a relational database, you build your tables first, define your relationships, and then write your queries. In Cassandra, if you do not know your exact queries before you start, your database will fail. This is the core principle of Query Driven Data Modeling. Many developers come to Cassandra with strong relational habits. You naturally look for ways to normalize your data, set up foreign keys, and avoid duplication. You need to drop that mindset entirely. Cassandra does not support joins. If you try to normalize your data across multiple tables, you will end up performing joins in your application code, which destroys the performance benefits you chose Cassandra for in the first place. Cassandra requires query-driven data modeling. You start by mapping the exact questions your application needs to ask the database. In this model, one query typically equals one table. If you need to access the same data in three different ways, you create three different tables holding the same data. This brings us to denormalization. Duplicating data is not a mistake here; it is the fundamental strategy. Disk space is cheap, but distributed reads across a network are expensive. By writing data together in the exact shape of your read query, Cassandra can retrieve it in a single operation without searching the entire cluster. To make this work, you have to understand the primary key. It is not just a unique identifier. It controls exactly where and how your data is stored on disk. The primary key has two distinct parts: the partition key and the clustering key. The partition key dictates which physical node in your cluster holds the data. All rows sharing the same partition key are stored together on the same node. The clustering key determines the sorting order of those rows on the disk within that specific partition. Take the magazine publication scenario from the documentation. 
Suppose your application needs to fetch all magazines released by a specific publisher. Your partition key must be the publisher name. When the query arrives, Cassandra hashes the publisher name, identifies the exact node holding that publisher's data, and goes straight to it. To organize the results naturally, you might use the publication date as your clustering key. Now, not only are all magazines for that publisher grouped on one node, but they are physically stored in chronological order. The database just streams the pre-sorted data back. Here is the key insight. You are trading write complexity for massive read speed. You write the same data to multiple tables to satisfy different application views, but when a user asks for that data, it returns in milliseconds because the database does zero computation to assemble it. Thanks for listening. Take care, everyone.
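The magazine example above maps to a schema like the following. The table and column names are illustrative, not taken from any particular documentation page; what matters is the shape of the `PRIMARY KEY` clause, where the first (parenthesized) component is the partition key and the rest are clustering keys.

```python
# Hypothetical CQL for the magazine scenario: one table per query.
# Partition key: publisher  -> which node stores the rows.
# Clustering key: published_date -> on-disk sort order inside the partition.
MAGAZINES_BY_PUBLISHER = """
CREATE TABLE IF NOT EXISTS magazines_by_publisher (
    publisher text,
    published_date date,
    magazine_title text,
    issue_number int,
    PRIMARY KEY ((publisher), published_date)
) WITH CLUSTERING ORDER BY (published_date DESC);
"""

# With a connected Session (see the connecting episode), you would run:
# session.execute(MAGAZINES_BY_PUBLISHER)
```

Reading all magazines for one publisher now touches a single partition on a single node, and the rows come back already sorted by date.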
4

Connecting with Python

3m 44s

Get started with the DataStax Python Driver. Learn how to instantiate a Cluster, connect to a Session, and establish communication with your Cassandra nodes.

Hi, this is Alex from DEV STORIES DOT EU. Apache Cassandra with Python, episode 4 of 12. Connect to a traditional database, and you point your application at a single host address. Connect to a distributed database, and you might assume you have to manually track dozens of server addresses in your configuration files. You do not, because your database driver actually acts as a smart router. Today, we look at Connecting with Python. Consider a Python microservice booting up. It needs to establish communication with a local three-node Cassandra cluster. To begin, you bring in the Cluster class from the cassandra dot cluster module. You initialize this class by passing it a list of IP addresses known as contact points. If your nodes use a non-standard port, you can also specify a port argument here; otherwise, it defaults to 9042. Developers often confuse this step with building a standard single-DSN connection string, where you must explicitly list exactly what you want to talk to. With Cassandra, you do not need to list every single node in your infrastructure. If you have a massive thirty-node cluster, passing just two or three IP addresses as contact points is perfectly fine. Here is the key insight. When the Python driver starts, it reaches out to one of those contact points to bootstrap itself. It queries system tables to download the current cluster topology. By doing this, it automatically discovers the IP addresses of the rest of the nodes. The driver dynamically maintains this network map in the background. If you scale out and add nodes later, the driver detects the change and adjusts its routing automatically without requiring an application restart. Once you instantiate your Cluster object with those initial contact points, you call its connect method. This action returns a Session object. The Session handles the actual connection pooling to the nodes it just discovered. When calling connect, you can optionally pass a keyspace name. 
A keyspace functions as a namespace for your data. If provided, the driver sets it as the default for all future operations on that Session. Because the Session manages complex connection pools under the hood, it is designed to be long-lived and thread-safe. You typically create one Session at application startup and reuse it. Now, the second piece of this is connecting to a managed cloud service like DataStax Astra. Astra operates differently and does not expose raw IP addresses for contact points. Instead, you download a secure connect bundle. This is a zip file containing the required certificates and mutual TLS connection details. In your Python code, you skip the list of IP addresses. Instead, you provide a cloud configuration dictionary to the Cluster object. This dictionary contains a key called secure connect bundle, which points to the local file path of your zip file. You combine this with a plaintext authentication provider object configured with your client ID and secret. Calling connect then yields a standard Session object, functioning exactly like the local cluster setup. The critical takeaway is that whether you pass a few local IP addresses or a cloud secure bundle, the Python driver takes your initial entry point, maps the distributed database network, and entirely abstracts the routing logic away from your application code. That is all for this one. Thanks for listening, and keep building!
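Both connection paths described above look like this in code. It is a sketch that requires a reachable cluster or Astra database to actually run; the IP addresses, keyspace name, bundle path, and credentials are all placeholders.

```python
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider

# Local cluster: a few contact points are enough. The driver queries the
# system tables on one of them and discovers the rest of the topology.
cluster = Cluster(["10.0.0.1", "10.0.0.2"], port=9042)
session = cluster.connect("my_keyspace")  # keyspace argument is optional

# DataStax Astra: no raw IPs. Point the driver at the secure connect
# bundle zip and authenticate with your client ID and secret.
astra_cluster = Cluster(
    cloud={"secure_connect_bundle": "/path/to/secure-connect-db.zip"},
    auth_provider=PlainTextAuthProvider("client_id", "client_secret"),
)
astra_session = astra_cluster.connect()
```

Either way you end up holding the same long-lived, thread-safe `Session` object, created once at startup and reused everywhere.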
5

Execution Profiles

3m 39s

Manage complex workloads seamlessly by using Execution Profiles. Learn how to configure load balancing, timeouts, and consistency levels per-query without polluting your cluster setup.

Hi, this is Alex from DEV STORIES DOT EU. Apache Cassandra with Python, episode 5 of 12. As your application grows, applying a one-size-fits-all timeout or consistency level to all your database queries becomes a massive operational bottleneck. You end up failing background jobs prematurely or stalling your user interface while waiting for a heavy read. Execution Profiles are the mechanism that resolves this tension. Many developers confuse execution configuration with legacy cluster-level setup. In older paradigms, you defined global parameters directly on the cluster object, meaning every query shared the exact same timeout. Execution profiles replace that rigid pattern. They allow you to maintain multiple discrete configurations simultaneously within a single active session. Consider a multi-tenant web application. Your front-end interface requires a strict one-second timeout to ensure the web pages stay snappy. Meanwhile, your background reporting tasks need thirty seconds to safely aggregate large partitions. An execution profile is a standalone, named bundle of request settings tailored precisely for these different workloads. To build this, you start by creating instances of an execution profile object. For the reporting task, you instantiate a profile and set the request timeout parameter to thirty seconds. You can go further and attach a specific load balancing policy to this object, perhaps routing those heavy analytical reads exclusively to a dedicated analytics data center. Then, you create a second distinct profile object for your user interface, assigning it a one-second timeout and a local load balancing policy. You can also bundle distinct retry policies or consistency levels into these profiles depending on what the query demands. Here is the key insight. You register these profiles once, during your initial cluster setup, rather than building them during the query execution itself. 
When instantiating the cluster, you pass a dictionary to the execution profiles argument. This dictionary maps simple string names to the profile objects you just created. The driver manages these configurations under the hood. There is always a built-in fallback profile, which you can override by mapping a configuration to a specific constant called execution profile default. If you execute a query without explicitly naming a profile, the driver automatically applies these default settings. When you actually need to run a query, you use the standard session execute or execute async methods. Alongside your query string or prepared statement, you pass an execution profile argument using the simple string name you registered earlier. The driver intercepts this name, retrieves the bundled settings tied to it, and applies them to that specific request. Your user interface query is strictly capped at one second, and the background job runs comfortably for thirty seconds. Both queries run concurrently, multiplexed over the exact same session and the exact same connection pool. Moving your configuration into execution profiles decouples your query behavior from your physical database connection. It allows a single application to shape its database traffic dynamically based on the specific needs of each function, without ever needing to establish separate sessions. Thanks for hanging out. Hope you picked up something new.
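The two-profile setup from the episode might be wired up like this. It is a sketch requiring a live cluster; the datacenter names, contact point, and query are placeholders.

```python
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
from cassandra.policies import DCAwareRoundRobinPolicy, TokenAwarePolicy

# One-second cap for user-facing reads; registered as the default profile.
ui_profile = ExecutionProfile(
    load_balancing_policy=TokenAwarePolicy(
        DCAwareRoundRobinPolicy(local_dc="dc1")
    ),
    request_timeout=1,
    consistency_level=ConsistencyLevel.LOCAL_QUORUM,
)

# Thirty seconds for background reporting, routed to an analytics DC.
reporting_profile = ExecutionProfile(
    load_balancing_policy=DCAwareRoundRobinPolicy(local_dc="analytics"),
    request_timeout=30,
)

cluster = Cluster(
    ["10.0.0.1"],
    execution_profiles={
        EXEC_PROFILE_DEFAULT: ui_profile,
        "reporting": reporting_profile,
    },
)
session = cluster.connect()

# Select a profile per query by its registered name:
session.execute(
    "SELECT * FROM reports.daily_rollups", execution_profile="reporting"
)
```

Queries without an explicit `execution_profile` argument pick up the settings mapped to `EXEC_PROFILE_DEFAULT`.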
6

Prepared Statements

2m 53s

Learn how to execute CQL commands from Python. We cover simple statements and the critical performance benefits of using Prepared Statements for frequent queries.

Hi, this is Alex from DEV STORIES DOT EU. Apache Cassandra with Python, episode 6 of 12. If you are using simple string formatting for your high-throughput queries, you are forcing your database to burn precious CPU cycles re-parsing the exact same structure thousands of times a second. The way to stop this waste, and vastly improve your application performance, is by using Prepared Statements. When you connect to Cassandra, the most direct way to interact with the database is by passing a string to the session dot execute method. You hand it a query, and it returns a result set. By default, every row in that result comes back as a Python namedtuple. This means you can access your column values using simple dot notation, like row dot name or row dot age. It is clean, and it works well for one-off operations. But sending a raw string is highly inefficient for queries you run constantly. You might think you are already doing things right if you use positional parameters. If you pass a query string containing percent-s markers along with a sequence of values, the Python driver formats that query securely. This prevents injection attacks, but architecturally, it changes nothing about the performance. You are still sending the complete query string over the network to Cassandra every single time. Cassandra still has to receive the text, parse the syntax, and calculate the query plan from scratch. Here is the key insight. You do not need to parse the same structure twice. This is where the session dot prepare method comes in. Instead of executing the query immediately, you pass your query string to the prepare method. In this string, you replace the dynamic values with question marks. The driver sends this template to Cassandra. Cassandra parses it, validates it, computes the most efficient execution plan, and then generates a unique identifier for this specific statement. It sends this ID back to your Python application. 
From that moment on, whenever you need to run that query, your application does not send a string. It binds your specific variables to the prepared statement object and sends only the unique ID along with the raw bytes of the values. Think about a high-traffic authentication service. You need to look up a user by their ID during login. The query is select star from users where user id equals question mark. If your service handles ten thousand logins a second, sending the full query string ten thousand times wastes network bandwidth and database CPU. By preparing the statement once when your application starts, you reduce the network payload significantly. Cassandra sees the ID, instantly pulls up the pre-calculated execution plan, and retrieves the data immediately. Prepared statements shift the heavy lifting of query parsing from a recurring per-request tax to a single one-time setup cost. That is your lot for this one. Catch you next time!
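The login lookup from the episode looks like this in code. It assumes a connected `session` from the earlier episodes; the `users` table and `user_id` column are illustrative.

```python
# Simple statement: the full CQL string travels to Cassandra and is
# re-parsed on every execution. %s markers keep it injection-safe.
row = session.execute(
    "SELECT * FROM users WHERE user_id = %s", (user_id,)
).one()

# Prepared statement: parse once at startup, then execute by ID.
lookup_user = session.prepare(
    "SELECT * FROM users WHERE user_id = ?"
)

# Each login now sends only the statement ID plus the bound value.
row = session.execute(lookup_user, [user_id]).one()
print(row.name, row.age)  # rows come back as namedtuples by default
```

Note the placeholder change: simple statements use `%s`, while prepared statements use `?`.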
7

Paging Large Queries

3m 26s

Never crash your app by loading a massive dataset into memory. Discover how the Python driver automatically pages large query results and how to manage fetch sizes.

Hi, this is Alex from DEV STORIES DOT EU. Apache Cassandra with Python, episode 7 of 12. You run a simple select query on a massive table, and suddenly your application crashes with an out-of-memory error. Or at least, it would, if your database driver tried to load everything at once. Instead, the driver handles this gracefully in the background. Today, we are talking about Paging Large Queries. When you execute a query using the Cassandra Python driver, it does not attempt to pull millions of rows into your application memory. By default, the driver automatically pages the results, fetching exactly 5000 rows at a time. The result set returned to you acts as a standard Python iterator. As you loop through the rows, the driver transparently reaches out to the database to fetch the next batch just before you run out. Your code looks exactly like a standard loop, but underneath, the driver guarantees memory safety by only holding a single page of data in memory at any given moment. You are not stuck with the default of 5000 rows. You can control this by setting the fetch size property on your statement object before passing it to the execute method. If you lower the fetch size to 100, the driver will hold less data in memory but will make more frequent network round trips to the database. Do not confuse this mechanism with traditional SQL pagination using limit and offset commands. SQL offset pagination forces the database to scan and discard rows before returning your data, which gets drastically slower the deeper you page. Cassandra uses a cursor-based approach. The driver uses an internal marker to track the exact physical location in the database where the last read finished. Automatic paging is perfect for background data processing, but it does not work for building stateless web applications. Consider a web endpoint providing a continuously scrolling list of thousands of audit logs to a user interface. 
You cannot keep a database connection and an active iterator open on your server while waiting for the user to scroll. You need a way to stop the query, close the request, and resume it later. Here is the key insight. The driver provides a property on the result set called the paging state. This is an opaque byte string representing the exact cursor position of your query. To build a stateless API, you execute a query, grab a single page of audit logs, and retrieve this paging state. You convert the bytes to a plain hex string and send it to your frontend alongside the data. When the user scrolls to the bottom of the interface, the frontend sends that exact hex string back to your server. Your backend decodes the string and passes it into the execute method as the paging state parameter. Cassandra instantly resumes the query right where it left off. Using the paging state allows you to disconnect the memory lifespan of your application from the massive scale of your database tables, enabling you to stream infinite data to clients using a strictly finite amount of server RAM. That is all for this one. Thanks for listening, and keep building!
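The stateless-API pattern above can be sketched as follows, assuming a connected `session`; the `audit_logs` table and its columns are hypothetical.

```python
from cassandra.query import SimpleStatement

# First request: fetch one page of 100 audit logs instead of the
# default 5000 rows per page.
stmt = SimpleStatement(
    "SELECT * FROM audit_logs WHERE app = %s", fetch_size=100
)
results = session.execute(stmt, ["billing"])
page = results.current_rows  # only this page is held in memory

# Ship the opaque cursor to the frontend as a plain hex string.
cursor = results.paging_state.hex() if results.paging_state else None

# Next request: decode the client's string and resume where we stopped.
if cursor:
    results = session.execute(
        stmt, ["billing"], paging_state=bytes.fromhex(cursor)
    )
```

Iterating over `results` directly instead would let the driver fetch subsequent pages transparently, which is the right tool for background processing rather than web endpoints.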
8

High Throughput Async Queries

3m 57s

Maximize your application's throughput. Learn how to use execute_async, ResponseFutures, and callbacks to run concurrent requests against Cassandra.

Hi, this is Alex from DEV STORIES DOT EU. Apache Cassandra with Python, episode 8 of 12. Waiting for a database response before sending your next request is the easiest way to throttle your own application. If your Python script sits idle during network round trips, you are leaving massive write performance on the table. To fix this, you need High Throughput Async Queries. When you use the standard execute method on a Cassandra session, your code blocks. It sends the query, waits for the database to process it, and waits for the network to carry the response back. For a data ingestion pipeline, this idle time is fatal to your overall throughput. To unlock real speed, the DataStax Python driver provides the execute async method. First, we need to clear up a common confusion. If you hear the word async in Python, you likely think of the standard library asyncio module and the async await keywords. This is not that. The Cassandra driver relies on its own lightweight event loop running in a background thread. When you call execute async, you are using the driver's custom mechanism, not Python's built in asynchronous features. When you pass a query to execute async, it does not wait for a database response. It returns control to your program instantly. What it hands back is an object called a ResponseFuture. This object is a promise. It represents a database operation that has been sent to the cluster but has not yet finished. Because your main thread is no longer waiting, you need a way to know when the query actually completes or if it fails. You manage this by attaching callback functions directly to the ResponseFuture. First, you define a success function that takes the result set as an argument. Then, you define an error function that takes an exception as an argument. Finally, you link both of these functions to the future using an add callbacks method. 
When the database replies, the background thread receives the network packet and automatically triggers the correct function. Here is the key insight. This mechanism allows you to pipeline hundreds of operations simultaneously. Consider an IoT pipeline receiving hundreds of temperature readings per second. Using the synchronous method, you process one reading, wait for the database, and then process the next. Using execute async, you loop through your incoming readings without stopping. For each reading, you fire off an insert query, grab the ResponseFuture, attach your success and error callbacks, and immediately move to the next reading. You push hundreds of requests onto the network in a fraction of a second. The driver multiplexes these queries over your existing database connections. The Cassandra cluster handles them in parallel and streams the results back. Your success and error callbacks fire as the responses arrive, completely independently of the order in which you sent them. This approach drastically increases your throughput because it overlaps the network wait time for all those queries. You are bound only by your network bandwidth and the database capacity, rather than being limited by thread blocking. You do need to keep track of how many requests you have in flight so you do not overwhelm the driver queue, but the underlying principle is to keep the network pipeline full. The single most important takeaway here is that database throughput is about minimizing idle time, and by heavily utilizing execute async with attached callbacks, you force your application to spend its time pushing data instead of waiting on the network. If you found this useful and want to support the show, you can search for DevStoriesEU on Patreon. That is all for this one. Thanks for listening, and keep building!
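The IoT ingestion loop described above might look like this. It is a sketch assuming a connected `session`; the `readings` table and the shape of `incoming_readings` are invented for illustration.

```python
import logging

log = logging.getLogger(__name__)

def on_success(rows):
    # Runs on the driver's background event-loop thread when a
    # response arrives; keep this fast and non-blocking.
    pass

def on_error(exc):
    log.error("Insert failed: %s", exc)

insert = session.prepare(
    "INSERT INTO readings (sensor_id, ts, temperature) VALUES (?, ?, ?)"
)

futures = []
for reading in incoming_readings:
    # execute_async returns a ResponseFuture immediately; the request
    # is multiplexed over the existing connection pool.
    future = session.execute_async(
        insert, [reading.sensor_id, reading.ts, reading.temperature]
    )
    future.add_callbacks(on_success, on_error)
    futures.append(future)  # track in-flight requests to bound the queue
```

In a real pipeline you would cap `len(futures)` (for example with a semaphore) so bursts cannot overwhelm the driver's request queue.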
9

Lightweight Transactions

3m 38s

Implement compare-and-set operations safely. Learn how Lightweight Transactions (LWTs) work in Cassandra and how to inspect the specialized applied column in your Python results.

Hi, this is Alex from DEV STORIES DOT EU. Apache Cassandra with Python, episode 9 of 12. In a distributed system with no global locks, two users try to register the exact same username at the exact same millisecond. How do you guarantee only one of them actually gets it? The mechanism that resolves this is called a Lightweight Transaction. Before explaining how to implement them, we need to clear up the name. A Lightweight Transaction in Cassandra is not a traditional multi-table ACID transaction. You cannot open a transaction block, write to three different tables, and roll everything back if one step fails. A Lightweight Transaction is strictly scoped to a single partition. It is a conditional operation, essentially a distributed compare and set. To claim a unique username without race conditions, you write a standard insert query, but you append the specific clause IF NOT EXISTS to the end of it. Cassandra will check the cluster to see if that partition key already exists. If it is absent, the write goes through. For an update operation, you use the clause IF followed by a column name and an expected value. You can instruct the database to update an account status only IF the current status matches a specific string. Executing these queries from Python requires handling the response differently than a normal write. A standard write in Cassandra either succeeds silently or throws a timeout error. But when you append an IF clause, the database must tell your application whether the condition was actually met. The Python driver handles this by returning a specialized result set. When you execute a Lightweight Transaction, the first row of the returned data contains a special boolean system column. The driver exposes this column under the name applied, surrounded by square brackets. You execute your insert query and fetch the first row from the result. You then evaluate that bracket applied bracket column. 
If you are returning dictionary rows from the driver, you access the key as a string. If the value evaluates to true, the condition was met, and your new user successfully claimed the username. The operation is complete. Here is the key insight. You need to understand exactly what happens when that bracket applied bracket column evaluates to false. If the insert fails because the username is already taken, Cassandra does not just return a simple rejection flag. The driver populates the result row with the actual data that caused your condition to fail. Because your condition was rejected, the database reads the existing row and hands it back to your Python application in the exact same response. You receive the false applied flag, and alongside it, you receive the current state of the conflicting record. If you were trying to update a balance based on an expected previous balance, you get the actual current balance back immediately. Your Python code knows exactly what data blocked the transaction, entirely eliminating the need to run a follow-up select query to find out why the write was rejected. Lightweight Transactions give you strict concurrency control at the partition level, and the Python driver makes it highly efficient by handing you the blocking state for free whenever a condition fails. Thanks for listening. Take care, everyone.
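The flow described in this episode can be sketched with a couple of small helpers. This is a hedged sketch: the users and accounts tables, their columns, and the keyspace are hypothetical, and dictionary-style rows (the driver's dict_factory) are assumed so the [applied] column is addressable as a string key.

```python
# Sketch of claiming a unique username with a Lightweight Transaction.
# The users and accounts tables are hypothetical; a live Session from
# the DataStax driver, configured for dict-style rows, is assumed.

INSERT_CQL = (
    "INSERT INTO users (username, user_id) "
    "VALUES (%s, %s) IF NOT EXISTS"
)

# Conditional update variant: only applied if the current status matches.
UPDATE_CQL = (
    "UPDATE accounts SET status = %s "
    "WHERE account_id = %s IF status = %s"
)

def interpret_lwt_row(row):
    """Interpret the first row of an LWT result (dict-style rows).

    Returns (applied, conflicting_row). When the condition fails,
    Cassandra hands back the existing record in the same response,
    so no follow-up SELECT is needed to see what blocked the write.
    """
    if row["[applied]"]:
        return True, None
    return False, {k: v for k, v in row.items() if k != "[applied]"}

def claim_username(session, username, user_id):
    # Executes the conditional insert and reports whether it won the race.
    result = session.execute(INSERT_CQL, (username, user_id))
    return interpret_lwt_row(result.one())
```

If the insert loses the race, the returned conflict dictionary is the existing row, so the caller can see who holds the username without issuing a second query.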
10

The Object Mapper Models

3m 36s

Avoid raw CQL strings and model your data using Python classes. Learn how to use cqlengine to define tables, specify primary keys, and synchronize your schema.

Hi, this is Alex from DEV STORIES DOT EU. Apache Cassandra with Python, episode 10 of 12. Writing raw CQL strings inside your Python application code can quickly become an unmaintainable mess. You end up with multi-line strings full of variable interpolation, making typos easy and schema refactoring painful. You need a way to represent your database tables as native Python objects. This is exactly what The Object Mapper Models provide. The DataStax Python driver includes an object mapper called cqlengine. It lets you define Cassandra tables using standard Python classes. If you have ever used the Django ORM or SQLAlchemy, the syntax will look very familiar. But there is one major difference to clear up right away. In those relational mappers, you declare foreign keys to link tables together. Cassandra is not a relational database, so relational bindings simply do not exist here. A model in cqlengine maps exactly to one standalone Cassandra table. Let us build a Comment model for a photo application. We want to group all comments for a specific photo, and we want them ordered chronologically. First, you create a Python class called Comment that inherits from the cqlengine Model base class. Inside this class, you define your columns as class attributes. We start with the attribute photo underscore id. You assign it to a UUID column type and pass the argument primary underscore key equals True. Because this is the very first primary key you declare in the class, cqlengine automatically assigns it as the partition key. Next, we need a way to uniquely identify and order each comment within that photo partition. You define a second attribute called comment underscore id. You assign this to a TimeUUID column type, and you also pass primary underscore key equals True. This is where it gets interesting. In cqlengine, any primary key defined after the partition key automatically becomes a clustering key. You do not need a special clustering configuration block. 
The literal top-to-bottom order in which you define the column attributes inside your Python class dictates the primary key structure in Cassandra. After the primary keys are set, you add the actual data columns. You define a body attribute and assign it a Text column type. It is just regular data, so no primary key arguments are needed. Now you have a declarative Python class that describes your table schema. But the table does not exist in your database yet. To push this schema to Cassandra, you use a function called sync underscore table. You pass your Comment model class directly into this function. The mapper reads your class structure, translates it into the correct raw CQL statement, and executes it. If the table does not exist, sync underscore table creates it. If the table already exists, it checks to see if you have added new columns to your Python class and alters the table to match. It is important to know that sync underscore table will never drop data or alter existing primary keys, keeping your core data structure safe during updates. The true power of defining models this way is not just hiding database syntax, but locking your schema definition directly into the application layer where the business logic lives. That is all for this one. Thanks for listening, and keep building!
11

Making Queries with cqlengine

3m 55s

Retrieve and filter data fluently using QuerySet objects in the cqlengine Object Mapper. We cover filtering operators, immutability, and limitations on ordering.

Hi, this is Alex from DEV STORIES DOT EU. Apache Cassandra with Python, episode 11 of 12. Filtering database records with an object mapper usually feels effortless, letting you search by any field you want. But in Cassandra, if you try to filter on a column that is not explicitly indexed or part of a primary key, the database will flat out refuse to run your query. The solution is understanding how to construct valid requests, which brings us to Making Queries with cqlengine. In cqlengine, when you want to retrieve data from a model, you interact with a QuerySet. If you have a model representing a database table, you can access records by calling the objects attribute followed by the all method. This returns a QuerySet representing every row in that table. Pulling every row is rarely what you want in a distributed database, so you need a way to narrow down the results. To restrict the data returned, you use the filter method. A common point of confusion here is how this filtering actually happens. Developers coming from other frameworks sometimes assume the object mapper pulls all the data into the Python client and filters the list in memory. That is entirely incorrect. The filter method maps directly to a strict CQL WHERE clause. This means the columns you pass to the filter method must comply with Cassandra query rules. You can only filter on partition keys, clustering columns, or columns with a secondary index. If you attempt to filter on a standard, unindexed text field, cqlengine will not hide the error or process it locally. It will pass the query straight to Cassandra, and Cassandra will reject it. Let us look at a concrete scenario. You have an Automobile model and you want to find all cars made by Tesla manufactured after the year 2012. You start by calling objects dot filter. You pass the manufacturer keyword argument set to the string Tesla. To handle the year condition, cqlengine provides special filtering operators. 
You apply these by appending a double underscore and an operator abbreviation directly to the column name. For strictly greater than, you use double underscore g t. So you add a second keyword argument to your filter call: year double underscore g t equals 2012. The mapper smoothly translates this into a valid CQL query. There are several other operators available for different conditions. If you wanted to check for specific car models instead of a year, you could use the double underscore in operator. You would pass model double underscore in, set to a Python list containing the names Model S and Model 3. The database will return records matching any value in that list. Here is the key insight. QuerySets are completely immutable. When you call the filter method on an existing QuerySet, it does not modify that object in place. Instead, it returns a brand new QuerySet with the additional filters applied. You can create a base QuerySet that only filters for the manufacturer Tesla and assign it to a variable. Then, you can use that single variable to generate multiple different filtered QuerySets for different years or models, just by calling filter on the base variable again. The original base query remains completely unchanged. Because QuerySets are immutable, you can programmatically construct highly specific, complex queries step by step, safely reusing base conditions across your application before any network call is ever made. Thanks for hanging out. Hope you picked up something new.
12

Vector Search for AI

3m 25s

Future-proof your skill set with Cassandra 5.0's Vector Search. Discover how to store and query high-dimensional vectors to power modern AI and machine learning applications.

Hi, this is Alex from DEV STORIES DOT EU. Apache Cassandra with Python, episode 12 of 12. Large Language Models are changing how we build applications, but they bring a major storage challenge. When an artificial intelligence model needs context, traditional database queries fail because they look for exact character matches, completely missing the actual intent behind a prompt. Modern artificial intelligence workloads require a fundamentally different way of querying data. That is where Vector Search comes in, introduced natively in Cassandra 5.0. It is easy to confuse vector search with simple full-text lexical search. A standard full-text index, like Lucene, is strictly keyword-based. If a user searches your database for the phrase database backup, a lexical search scans for those exact strings. Vector search operates differently. Instead of literal keywords, vectors capture the underlying semantic meaning of the data. A vector search understands that saving a data snapshot or archiving tables relates to the exact same concept as a database backup, even if the words share zero common letters. To make this work, Cassandra relies on vector embeddings. An embedding is an array of floating-point numbers generated by a machine learning model. These arrays act as mathematical coordinates representing the deeper meaning of your text. You store these arrays directly in Cassandra alongside your standard application data. When you need to find relevant content within massive document collections, you perform a vector search. The database compares the vector of the incoming query against the vectors stored in your tables. It calculates the mathematical distance between them. Shorter distances indicate stronger semantic similarity. Consider building an internal chatbot for a large engineering team. You have thousands of pages of technical documentation. An engineer types a question asking how to decommission a staging cluster. 
First, your application passes that question to an embedding model, which translates the sentence into an array of floats. Next, you send that numerical array to Cassandra as a vector search query. Cassandra instantly scans the multi-dimensional space and retrieves the internal documents whose embeddings sit closest to the question. It returns the correct manual for retiring testing environments because the semantic meaning aligns perfectly, regardless of the differing terminology. This capability is built explicitly for artificial intelligence applications. Most of these applications rely on retrieving highly relevant context to feed back to the language model before it generates a response. By bringing vector search directly into Cassandra 5.0, you eliminate the need to run a separate, standalone vector database. You keep your primary operational records and their semantic embeddings in the exact same distributed architecture, relying on the high availability and scaling Cassandra provides. Since this concludes our series on Apache Cassandra, I encourage you to explore the official documentation and try generating embeddings for your own data hands-on. If you have suggestions for what technologies we should cover next, visit devstories dot eu and let us know. Vector search bridges the gap between language and storage, turning complex human intent into a geometry problem that Cassandra can solve at scale. That is all for this one. Thanks for listening, and keep building!
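The chatbot flow might be sketched as follows. The docs table, the index name, the 384-dimension embedding size, and the embed callable are all illustrative assumptions, not details from the episode; the index and ANN syntax shown are the Cassandra 5.0 forms.

```python
# Sketch of vector search CQL for Cassandra 5.0. The docs table, index
# name, and 384-float embedding dimension are illustrative assumptions.

CREATE_TABLE_CQL = """
CREATE TABLE IF NOT EXISTS docs (
    doc_id uuid PRIMARY KEY,
    body text,
    embedding vector<float, 384>
)
"""

# A storage-attached index (SAI) on the vector column is required
# before ANN queries will run.
CREATE_INDEX_CQL = (
    "CREATE INDEX IF NOT EXISTS docs_embedding_idx "
    "ON docs (embedding) USING 'sai'"
)

# ORDER BY ... ANN OF sorts rows by vector distance to the query
# embedding; LIMIT caps the number of nearest neighbours returned.
SEARCH_CQL = "SELECT body FROM docs ORDER BY embedding ANN OF %s LIMIT 5"

def semantic_search(session, embed, question):
    """Embed a question and fetch the closest documents.

    embed is a hypothetical callable wrapping your embedding model;
    it turns a sentence into a list of 384 floats. session is assumed
    to be a live DataStax driver Session against a Cassandra 5.0 node.
    """
    query_vector = embed(question)
    rows = session.execute(SEARCH_CQL, (query_vector,))
    return [row.body for row in rows]
```

The application code stays small because the distance computation happens inside Cassandra; the client only supplies the query vector and reads back the nearest documents.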