Biopython Fundamentals
v1.87, 2026 Edition. A comprehensive 18-episode guide to using Biopython for sequence analysis, parsing biological data formats, running BLAST, handling 3D structures, working with phylogenetic trees, and more.
Episodes
Introduction to Biopython & The Seq Object
(3m 33s) Discover the foundation of Biopython: the Seq object. We explore how sequence objects differ from standard Python strings and learn how to perform biological operations like reverse complement and translation.
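As a rough illustration of the operations this episode names, the sketch below builds a Seq and calls reverse_complement and translate on it. The sequence is an arbitrary example chosen for illustration, not one taken from the course.

```python
from Bio.Seq import Seq

# Arbitrary example coding sequence (an assumption for illustration only).
coding_dna = Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG")

# Seq supports familiar string-like operations such as slicing.
first_codon = coding_dna[:3]

# Biological operations that a plain str does not offer.
template_dna = coding_dna.reverse_complement()
protein = coding_dna.translate(to_stop=True)  # stop at the first stop codon

print(first_codon, template_dna, protein)
```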
Rich Sequence Data: The SeqRecord Object
(4m 14s) Learn how to wrap sequences in rich metadata using the SeqRecord object. We cover how identifiers, names, descriptions, and dictionary annotations are stored alongside the raw sequence.
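A minimal sketch of a SeqRecord carrying the kinds of metadata mentioned here. The identifier, name, description, and annotation values are all made up for illustration.

```python
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# All metadata values below are hypothetical placeholders.
record = SeqRecord(
    Seq("MKQHKAMIVALIVICITAVVAAL"),
    id="demo_1",
    name="demo",
    description="hypothetical example protein",
)

# Annotations are stored in an ordinary dictionary on the record.
record.annotations["molecule_type"] = "protein"
record.annotations["source"] = "synthetic example"

print(record.id, record.description, record.annotations)
```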
Reading and Writing Files with SeqIO
(3m 21s) Master sequence file conversions and batch processing with Bio.SeqIO. This episode explains the difference between reading single-record files and parsing multi-record datasets.
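The contrast between reading one record and parsing many can be sketched as follows. To stay self-contained, the example uses an in-memory FASTA string (an assumption) rather than a real file, and converts it to the tab-separated format as one possible target.

```python
from io import StringIO

from Bio import SeqIO

# Hypothetical two-record FASTA data held in memory instead of a file.
fasta_text = ">seq1 first example\nACGTACGT\n>seq2 second example\nTTGGCCAA\n"

# parse() returns an iterator over every record in a multi-record source.
records = list(SeqIO.parse(StringIO(fasta_text), "fasta"))

# read() expects exactly one record and raises ValueError otherwise.
try:
    SeqIO.read(StringIO(fasta_text), "fasta")
    read_failed = False
except ValueError:
    read_failed = True

# Conversion amounts to writing the parsed records in another format;
# write() returns the number of records written.
out_handle = StringIO()
written = SeqIO.write(records, out_handle, "tab")
print(out_handle.getvalue())
```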
Extracting Genes with SeqFeature
(3m 39s) Dive into the complex world of sequence features. We explain how Biopython represents gene coordinates, strands, and fuzzy locations using the SeqFeature object.
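A small sketch of how coordinates, strand, and a fuzzy start might be expressed. The sequence and coordinates are invented for illustration, and SimpleLocation is the location class name in recent Biopython releases (older releases call it FeatureLocation).

```python
from Bio.Seq import Seq
from Bio.SeqFeature import BeforePosition, SeqFeature, SimpleLocation

# Invented toy sequence; coordinates are zero-based and end-exclusive.
genome = Seq("GGGGATGAAATAGCCCC")

# A forward-strand feature and a reverse-strand feature.
forward_gene = SeqFeature(SimpleLocation(4, 13, strand=+1), type="gene")
reverse_feature = SeqFeature(SimpleLocation(2, 7, strand=-1), type="misc_feature")

# extract() slices the parent sequence and reverse-complements
# the slice when the feature lies on the minus strand.
forward_seq = forward_gene.extract(genome)
reverse_seq = reverse_feature.extract(genome)

# A fuzzy location: the true start lies somewhere before position 4.
fuzzy_location = SimpleLocation(BeforePosition(4), 13, strand=+1)

print(forward_seq, reverse_seq, fuzzy_location)
```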
Pairwise Sequence Alignment
(3m 57s) Learn how to compare two sequences directly using the Bio.Align module. We discuss the PairwiseAligner, substitution scoring, and gap penalties for both global and local alignments.
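To make the global versus local distinction concrete, here is a hedged sketch using PairwiseAligner with simple match and mismatch scores. The sequences and score values are arbitrary choices for illustration, not the course's settings.

```python
from Bio import Align

aligner = Align.PairwiseAligner()

# Arbitrary scoring scheme chosen for illustration.
aligner.match_score = 1.0
aligner.mismatch_score = -1.0
aligner.open_gap_score = -2.0
aligner.extend_gap_score = -0.5

target = "CCGATTACACC"
query = "GATTACA"

# Global mode must account for every position, including the flanking CC.
aligner.mode = "global"
global_score = aligner.score(target, query)

# Local mode only scores the best-matching region.
aligner.mode = "local"
local_alignments = aligner.align(target, query)
local_score = local_alignments[0].score

print(global_score, local_score)
print(local_alignments[0])
```

Because the query occurs verbatim inside the target, the local score equals the query length under this scheme, while the global score is reduced by the gap penalties at both ends.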
Handling Multiple Sequence Alignments
(3m 43s) Transition from pairwise to multiple sequence alignments. This episode covers parsing alignment files with AlignIO and treating alignments like 2D arrays to slice out specific columns.
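The 2D-array style of slicing can be sketched as follows. The three aligned sequences are invented, and an in-memory aligned FASTA string stands in for an alignment file.

```python
from io import StringIO

from Bio import AlignIO

# Hypothetical aligned FASTA data (all rows have equal length).
aligned_fasta = ">a\nACG-TA\n>b\nACGGTA\n>c\nAC--TA\n"
alignment = AlignIO.read(StringIO(aligned_fasta), "fasta")

n_rows = len(alignment)
n_cols = alignment.get_alignment_length()

# A single column index returns that column as a string.
first_column = alignment[:, 0]

# A column range returns a new, narrower alignment.
middle_block = alignment[:, 2:4]

print(n_rows, n_cols, first_column)
print(middle_block)
```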
Querying NCBI Databases Programmatically
(4m 04s) Automate your literature and sequence searches. Discover how to query NCBI databases using Entrez.esearch and retrieve exact IDs without using a web browser.
Running BLAST over the Internet
(3m 28s) Trigger remote BLAST searches directly from Python. Learn how to use qblast to send sequences to the NCBI servers and safely save the raw XML results.
Parsing Natively: Unpacking BLAST XML
(3m 32s) Make sense of complex BLAST outputs. This episode walks through parsing BLAST XML files into native Python objects to extract alignments, High-scoring Segment Pairs (HSPs), and E-values.
Navigating 3D Structures with Bio.PDB
(3m 40s) Step into three dimensions. We explore the PDB module, parsing macromolecular structures, and understanding the Structure-Model-Chain-Residue-Atom (SMCRA) architecture.
Measuring Protein Geometry
(4m 00s) Calculate spatial relationships in proteins. This episode covers calculating interatomic distances and using NeighborSearch to find atoms within a specific radius.
Phylogenetic Trees in Python
(3m 44s) Parse, manipulate, and draw evolutionary trees with Bio.Phylo. We cover reading Newick files, tree traversal, and isolating specific clades.
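A minimal sketch of reading a Newick tree, walking its leaves, and isolating a clade. The four-taxon tree and its branch lengths are invented, and a string stands in for a Newick file.

```python
from io import StringIO

from Bio import Phylo

# Invented four-taxon tree in Newick format, with branch lengths.
newick_text = "((A:1,B:2):1,(C:1,D:1):2);"
tree = Phylo.read(StringIO(newick_text), "newick")

# Traversal: collect the names of all terminal (leaf) nodes.
leaf_names = [leaf.name for leaf in tree.get_terminals()]

# Isolate the clade rooted at the most recent common ancestor of A and B.
clade_ab = tree.common_ancestor("A", "B")
ab_names = [leaf.name for leaf in clade_ab.get_terminals()]

# Path length between two leaves, summed over branch lengths.
ab_distance = tree.distance("A", "B")

# Plain-text rendering that needs no plotting library.
Phylo.draw_ascii(tree)
```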
Sequence Motif Analysis
(4m 00s) Uncover hidden patterns in DNA. Discover how to create sequence motifs, build Position-Weight Matrices (PWMs), and scan target sequences for transcription factor binding sites.
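The motif workflow described here (instances, then a position weight matrix, then a scan) might look like the sketch below. The binding-site instances, pseudocount, threshold, and target sequence are all assumptions made for illustration.

```python
from Bio import motifs
from Bio.Seq import Seq

# Hypothetical aligned binding-site instances of equal length.
instances = [
    Seq(s)
    for s in ("TACAA", "TACGC", "TACAC", "TACCC", "AACCC", "AATGC", "AATGC")
]
motif = motifs.create(instances)
consensus = motif.consensus

# Counts -> position weight matrix (with pseudocounts) -> log-odds scores.
pwm = motif.counts.normalize(pseudocounts=0.5)
pssm = pwm.log_odds()

# Scan an invented target; the threshold of 3.0 is an arbitrary choice.
target = Seq("GGTACGCAA")
hits = [(position, score) for position, score in pssm.search(target, threshold=3.0)]

# Negative positions denote hits on the reverse strand.
forward_positions = [position for position, score in hits if position >= 0]

print(consensus, hits)
```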
Swiss-Prot and ExPASy Integration
(3m 14s) Access the gold standard of protein databases. We detail how to fetch records via Bio.ExPASy and parse dense Swiss-Prot flat files to extract curated protein metadata.
Visualizing Genomes with GenomeDiagram
(3m 29s) Turn raw GenBank files into publication-quality images. Learn how GenomeDiagram constructs circular and linear genome maps by layering tracks and feature arrows.
Population Genetics with Bio.PopGen
(3m 49s) Analyze genetic variation across populations. This episode introduces Bio.PopGen to parse Genepop files and easily extract allele frequencies and heterozygosity metrics.
Biochemical Pathways with KEGG
(3m 56s) Connect the metabolic dots. Learn how to parse KEGG enzyme and pathway records to trace biochemical reactions and chemical compound structures.
Cluster Analysis for Gene Expression
(3m 52s) Group genes by their behavior. In this final episode, we cover the Bio.Cluster module, applying K-means and hierarchical clustering to microarray expression data.
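As a rough sketch of the two clustering approaches named above, the example below runs k-means and average-linkage hierarchical clustering on a tiny invented expression matrix (rows are genes, columns are conditions). The data, the number of clusters, and the number of k-means restarts are assumptions.

```python
import numpy as np
from Bio.Cluster import kcluster, treecluster

# Invented expression matrix: two low-expression genes, two high-expression genes.
expression = np.array(
    [
        [1.0, 1.1, 0.9],
        [1.2, 1.0, 1.1],
        [8.0, 8.2, 7.9],
        [7.8, 8.1, 8.0],
    ]
)

# K-means with 10 random restarts; returns cluster assignments,
# the within-cluster error, and how often the best solution was found.
cluster_ids, error, n_found = kcluster(expression, nclusters=2, npass=10)

# Hierarchical clustering: average linkage ("a") on Euclidean distance ("e"),
# then cut the tree into two clusters.
tree = treecluster(expression, method="a", dist="e")
hier_ids = tree.cut(2)

print(cluster_ids, hier_ids)
```

Cluster labels are arbitrary integers, so only the grouping matters, not which group is numbered 0 or 1.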