High Performance MySQL, 3rd Edition

High Performance MySQL, 3rd Edition

Read it now on the O’Reilly learning platform with a 10-day free trial.

O’Reilly members get unlimited access to books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.

Book description

How can you bring out MySQL’s full power? With High Performance MySQL, you’ll learn advanced techniques for everything from designing schemas, indexes, and queries to tuning your MySQL server, operating system, and hardware to their fullest potential. This guide also teaches you safe and practical ways to scale applications through replication, load balancing, high availability, and failover.

Updated to reflect recent advances in MySQL and InnoDB performance, features, and tools, this third edition not only offers specific examples of how MySQL works, it also teaches you why this system works as it does, with illustrative stories and case studies that demonstrate MySQL’s principles in action. With this book, you’ll learn how to think in MySQL.

Show and hide more

Publisher resources

Table of contents Product information

Table of contents

  1. Foreword
  2. Preface
    1. How This Book Is Organized
      1. A Broad Overview
      2. Building a Solid Foundation
      3. Configuring Your Application
      4. MySQL as an Infrastructure Component
      5. Miscellaneous Useful Topics
      1. From Baron
      2. From Peter
      3. From Vadim
      4. From Arjen
      1. From Jeremy
      2. From Derek
      1. MySQL’s Logical Architecture
        1. Connection Management and Security
        2. Optimization and Execution
        1. Read/Write Locks
        2. Lock Granularity
          1. Table locks
          2. Row locks
          1. Isolation Levels
          2. Deadlocks
          3. Transaction Logging
          4. Transactions in MySQL
            1. AUTOCOMMIT
            2. Mixing storage engines in transactions
            3. Implicit and explicit locking
            1. The InnoDB Engine
              1. InnoDB’s history
              2. InnoDB overview
              1. Storage
              2. MyISAM features
              3. Compressed MyISAM tables
              4. MyISAM performance
              1. The Archive engine
              2. The Blackhole engine
              3. The CSV engine
              4. The Federated engine
              5. The Memory engine
              6. The Merge storage engine
              7. The NDB Cluster engine
              1. OLTP storage engines
              2. Column-oriented storage engines
              3. Community storage engines
              1. Logging
              2. Read-only or read-mostly tables
              3. Order processing
              4. Bulletin boards and threaded discussion forums
              5. CD-ROM applications
              6. Large data volumes
              1. ALTER TABLE
              2. Dump and import
              3. CREATE and SELECT
              1. Why Benchmark?
              2. Benchmarking Strategies
                1. What to Measure
                1. Designing and Planning a Benchmark
                2. How Long Should the Benchmark Last?
                3. Capturing System Performance and Status
                4. Getting Accurate Results
                5. Running the Benchmark and Analyzing Results
                6. The Importance of Plotting
                1. Full-Stack Tools
                2. Single-Component Tools
                1. http_load
                2. MySQL Benchmark Suite
                3. sysbench
                  1. The sysbench CPU benchmark
                  2. The sysbench file I/O benchmark
                  3. The sysbench OLTP benchmark
                  4. Other sysbench features
                  1. Introduction to Performance Optimization
                    1. Optimization Through Profiling
                    2. Interpreting the Profile
                    1. Instrumenting PHP Applications
                    1. Profiling a Server’s Workload
                      1. Capturing MySQL’s queries to a log
                      2. Analyzing the query log
                      1. Using SHOW PROFILE
                      2. Using SHOW STATUS
                      3. Using the slow query log
                      4. Using the Performance Schema
                      1. Single-Query Versus Server-Wide Problems
                        1. Using SHOW GLOBAL STATUS
                        2. Using SHOW PROCESSLIST
                        3. Using query logging
                        4. Making sense of the findings
                        1. The diagnostic trigger
                        2. What kinds of data should you collect?
                        3. Interpreting the data
                        1. Using the USER_STATISTICS Tables
                        2. Using strace
                        1. Choosing Optimal Data Types
                          1. Whole Numbers
                          2. Real Numbers
                          3. String Types
                            1. VARCHAR and CHAR types
                            2. BLOB and TEXT types
                            3. Using ENUM instead of a string type
                            1. Pros and Cons of a Normalized Schema
                            2. Pros and Cons of a Denormalized Schema
                            3. A Mixture of Normalized and Denormalized
                            1. Materialized Views
                            2. Counter Tables
                            1. Modifying Only the .frm File
                            2. Building MyISAM Indexes Quickly
                            1. Indexing Basics
                              1. Types of Indexes
                                1. B-Tree indexes
                                  1. Types of queries that can use a B-Tree index
                                  1. Building your own hash indexes
                                  2. Handling hash collisions
                                  1. Isolating the Column
                                  2. Prefix Indexes and Index Selectivity
                                  3. Multicolumn Indexes
                                  4. Choosing a Good Column Order
                                  5. Clustered Indexes
                                    1. Comparison of InnoDB and MyISAM data layout
                                      1. MyISAM’s data layout
                                      2. InnoDB’s data layout
                                      1. Supporting Many Kinds of Filtering
                                      2. Avoiding Multiple Range Conditions
                                      3. Optimizing Sorts
                                      1. Finding and Repairing Table Corruption
                                      2. Updating Index Statistics
                                      3. Reducing Index and Data Fragmentation
                                      1. Why Are Queries Slow?
                                      2. Slow Query Basics: Optimize Data Access
                                        1. Are You Asking the Database for Data You Don’t Need?
                                        2. Is MySQL Examining Too Much Data?
                                          1. Response time
                                          2. Rows examined and rows returned
                                          3. Rows examined and access types
                                          1. Complex Queries Versus Many Queries
                                          2. Chopping Up a Query
                                          3. Join Decomposition
                                          1. The MySQL Client/Server Protocol
                                            1. Query states
                                            1. The parser and the preprocessor
                                            2. The query optimizer
                                            3. Table and index statistics
                                            4. MySQL’s join execution strategy
                                            5. The execution plan
                                            6. The join optimizer
                                            7. Sort optimizations
                                            1. Correlated Subqueries
                                              1. When a correlated subquery is good
                                              1. Optimizing COUNT() Queries
                                                1. What COUNT() does
                                                2. Myths about MyISAM
                                                3. Simple optimizations
                                                4. Using an approximation
                                                5. More complex optimizations
                                                1. Optimizing GROUP BY WITH ROLLUP
                                                1. Optimizing ranking queries
                                                2. Avoiding retrieving the row just modified
                                                3. Counting UPDATEs and INSERTs
                                                4. Making evaluation order deterministic
                                                5. Writing a lazy UNION
                                                6. Other uses for variables
                                                1. Building a Queue Table in MySQL
                                                2. Computing the Distance Between Points
                                                3. Using User-Defined Functions
                                                1. Partitioned Tables
                                                  1. How Partitioning Works
                                                  2. Types of Partitioning
                                                  3. How to Use Partitioning
                                                  4. What Can Go Wrong
                                                  5. Optimizing Queries
                                                  6. Merge Tables
                                                  1. Updatable Views
                                                  2. Performance Implications of Views
                                                  3. Limitations of Views
                                                  1. Stored Procedures and Functions
                                                  2. Triggers
                                                  3. Events
                                                  4. Preserving Comments in Stored Code
                                                  1. Prepared Statement Optimization
                                                  2. The SQL Interface to Prepared Statements
                                                  3. Limitations of Prepared Statements
                                                  1. How MySQL Uses Character Sets
                                                    1. Defaults for creating objects
                                                    2. Settings for client/server communication
                                                    3. How MySQL compares values
                                                    4. Special-case behaviors
                                                    1. Natural-Language Full-Text Searches
                                                    2. Boolean Full-Text Searches
                                                    3. Full-Text Changes in MySQL 5.1
                                                    4. Full-Text Tradeoffs and Workarounds
                                                    5. Full-Text Configuration and Optimization
                                                    1. Internal XA Transactions
                                                    2. External XA Transactions
                                                    1. How MySQL Checks for a Cache Hit
                                                    2. How the Cache Uses Memory
                                                    3. When the Query Cache Is Helpful
                                                    4. How to Configure and Maintain the Query Cache
                                                      1. Reducing fragmentation
                                                      2. Improving query cache usage
                                                      1. How MySQL’s Configuration Works
                                                        1. Syntax, Scope, and Dynamism
                                                        2. Side Effects of Setting Variables
                                                        3. Getting Started
                                                        4. Iterative Optimization by Benchmarking
                                                        1. Inspecting MySQL Server Status Variables
                                                        1. How Much Memory Can MySQL Use?
                                                        2. Per-Connection Memory Needs
                                                        3. Reserving Memory for the Operating System
                                                        4. Allocating Memory for Caches
                                                        5. The InnoDB Buffer Pool
                                                        6. The MyISAM Key Caches
                                                          1. The MyISAM key block size
                                                          1. InnoDB I/O Configuration
                                                            1. The InnoDB transaction log
                                                              1. Log file size and the log buffer
                                                              2. How InnoDB flushes the log buffer
                                                              1. Configuring the tablespace
                                                              2. Old row versions and the tablespace
                                                              1. InnoDB Concurrency Configuration
                                                              2. MyISAM Concurrency Configuration
                                                              1. Optimizing for BLOB and TEXT Workloads
                                                              2. Optimizing for Filesorts
                                                              1. What Limits MySQL’s Performance?
                                                              2. How to Select CPUs for MySQL
                                                                1. Which Is Better: Fast CPUs or Many CPUs?
                                                                2. CPU Architecture
                                                                3. Scaling to Many CPUs and Cores
                                                                1. Random Versus Sequential I/O
                                                                2. Caching, Reads, and Writes
                                                                3. What’s Your Working Set?
                                                                4. Finding an Effective Memory-to-Disk Ratio
                                                                5. Choosing Hard Disks
                                                                1. An Overview of Flash Memory
                                                                2. Flash Technologies
                                                                3. Benchmarking Flash Storage
                                                                4. Solid-State Drives (SSDs)
                                                                  1. Using RAID with SSDs
                                                                  1. RAID Failure, Recovery, and Monitoring
                                                                  2. Balancing Hardware RAID and Software RAID
                                                                  3. RAID Configuration and Caching
                                                                    1. The RAID stripe chunk size
                                                                    2. The RAID cache
                                                                    1. SAN Benchmarks
                                                                    2. Using a SAN over NFS or SMB
                                                                    3. MySQL Performance on a SAN
                                                                    4. Should You Use a SAN?
                                                                    1. How to Read vmstat Output
                                                                    2. How to Read iostat Output
                                                                    3. Other Helpful Tools
                                                                    4. A CPU-Bound Machine
                                                                    5. An I/O-Bound Machine
                                                                    6. A Swapping Machine
                                                                    7. An Idle Machine
                                                                    1. Replication Overview
                                                                      1. Problems Solved by Replication
                                                                      2. How Replication Works
                                                                      1. Creating Replication Accounts
                                                                      2. Configuring the Master and Replica
                                                                      3. Starting the Replica
                                                                      4. Initializing a Replica from Another Server
                                                                      5. Recommended Replication Configuration
                                                                      1. Statement-Based Replication
                                                                      2. Row-Based Replication
                                                                      3. Statement-Based or Row-Based: Which Is Better?
                                                                      4. Replication Files
                                                                      5. Sending Replication Events to Other Replicas
                                                                      6. Replication Filters
                                                                      1. Master and Multiple Replicas
                                                                      2. Master-Master in Active-Active Mode
                                                                      3. Master-Master in Active-Passive Mode
                                                                      4. Master-Master with Replicas
                                                                      5. Ring Replication
                                                                      6. Master, Distribution Master, and Replicas
                                                                      7. Tree or Pyramid
                                                                      8. Custom Replication Solutions
                                                                        1. Selective replication
                                                                        2. Separating functions
                                                                        3. Data archiving
                                                                        4. Using replicas for full-text searches
                                                                        5. Read-only replicas
                                                                        6. Emulating multisource replication
                                                                        7. Creating a log server
                                                                        1. Why Replication Doesn’t Help Scale Writes
                                                                        2. When Will Replicas Begin to Lag?
                                                                        3. Plan to Underutilize
                                                                        1. Monitoring Replication
                                                                        2. Measuring Replication Lag
                                                                        3. Determining Whether Replicas Are Consistent with the Master
                                                                        4. Resyncing a Replica from the Master
                                                                        5. Changing Masters
                                                                          1. Planned promotions
                                                                          2. Unplanned promotions
                                                                          3. Locating the desired log positions
                                                                          1. Errors Caused by Data Corruption or Loss
                                                                          2. Using Nontransactional Tables
                                                                          3. Mixing Transactional and Nontransactional Tables
                                                                          4. Nondeterministic Statements
                                                                          5. Different Storage Engines on the Master and Replica
                                                                          6. Data Changes on the Replica
                                                                          7. Nonunique Server IDs
                                                                          8. Undefined Server IDs
                                                                          9. Dependencies on Nonreplicated Data
                                                                          10. Missing Temporary Tables
                                                                          11. Not Replicating All Updates
                                                                          12. Lock Contention Caused by InnoDB Locking Selects
                                                                          13. Writing to Both Masters in Master-Master Replication
                                                                          14. Excessive Replication Lag
                                                                            1. Don’t duplicate the expensive part of writes
                                                                            2. Do writes in parallel outside of replication
                                                                            3. Prime the cache for the replication thread
                                                                            1. What Is Scalability?
                                                                              1. A Formal Definition
                                                                              1. Planning for Scalability
                                                                              2. Buying Time Before Scaling
                                                                              3. Scaling Up
                                                                              4. Scaling Out
                                                                                1. Functional partitioning
                                                                                2. Data sharding
                                                                                3. Choosing a partitioning key
                                                                                4. Multiple partitioning keys
                                                                                5. Querying across shards
                                                                                6. Allocating data, shards, and nodes
                                                                                7. Arranging shards on nodes
                                                                                8. Fixed allocation
                                                                                9. Dynamic allocation
                                                                                10. Mixing dynamic and fixed allocation
                                                                                11. Explicit allocation
                                                                                12. Rebalancing shards
                                                                                13. Generating globally unique IDs
                                                                                14. Tools for sharding
                                                                                1. MySQL Cluster (NDB Cluster)
                                                                                2. Clustrix
                                                                                3. ScaleBase
                                                                                4. GenieDB
                                                                                5. Akiban
                                                                                1. Keeping active data separate
                                                                                1. Connecting Directly
                                                                                  1. Splitting reads and writes in replication
                                                                                  2. Changing the application configuration
                                                                                  3. Changing DNS names
                                                                                  4. Moving IP addresses
                                                                                  1. Load balancers
                                                                                  2. Load-balancing algorithms
                                                                                  3. Adding and removing servers in the pool
                                                                                  1. What Is High Availability?
                                                                                  2. What Causes Downtime?
                                                                                  3. Achieving High Availability
                                                                                    1. Improving Mean Time Between Failures
                                                                                    2. Improving Mean Time to Recovery
                                                                                    1. Shared Storage or Replicated Disk
                                                                                    2. Synchronous MySQL Replication
                                                                                      1. MySQL Cluster
                                                                                      2. Percona XtraDB Cluster
                                                                                      1. Promoting a Replica or Switching Roles
                                                                                      2. Virtual IP Addresses or IP Takeover
                                                                                      3. Middleman Solutions
                                                                                      4. Handling Failover in the Application
                                                                                      1. Benefits, Drawbacks, and Myths of the Cloud
                                                                                      2. The Economics of MySQL in the Cloud
                                                                                      3. MySQL Scaling and HA in the Cloud
                                                                                      4. The Four Fundamental Resources
                                                                                      5. MySQL Performance in Cloud Hosting
                                                                                        1. Benchmarks for MySQL in the Cloud
                                                                                        1. Amazon RDS
                                                                                        2. Other DBaaS Solutions
                                                                                        1. Common Problems
                                                                                        2. Web Server Issues
                                                                                          1. Finding the Optimal Concurrency
                                                                                          1. Caching Below the Application
                                                                                          2. Application-Level Caching
                                                                                          3. Cache Control Policies
                                                                                          4. Cache Object Hierarchies
                                                                                          5. Pregenerating Content
                                                                                          6. The Cache as an Infrastructure Component
                                                                                          7. Using HandlerSocket and memcached Access
                                                                                          1. Why Backups?
                                                                                          2. Defining Recovery Requirements
                                                                                          3. Designing a MySQL Backup Solution
                                                                                            1. Online or Offline Backups?
                                                                                            2. Logical or Raw Backups?
                                                                                              1. Logical backups
                                                                                              2. Raw backups
                                                                                              1. Incremental and differential backups
                                                                                              1. Data consistency
                                                                                              2. File consistency
                                                                                              1. The Binary Log Format
                                                                                              2. Purging Old Binary Logs Safely
                                                                                              1. Making a Logical Backup
                                                                                                1. SQL dumps
                                                                                                2. Delimited file backups
                                                                                                1. How LVM snapshots work
                                                                                                2. Prerequisites and configuration
                                                                                                3. Creating, mounting, and removing an LVM snapshot
                                                                                                4. LVM snapshots for online backups
                                                                                                5. Lock-free InnoDB backups with LVM snapshots
                                                                                                6. Planning for LVM backups
                                                                                                7. Other uses and alternatives
                                                                                                1. Restoring Raw Files
                                                                                                  1. Starting MySQL after restoring raw files
                                                                                                  1. Loading SQL files
                                                                                                  2. Loading delimited files
                                                                                                  1. Delayed replication for fast recovery
                                                                                                  2. Recovering with a log server
                                                                                                  1. Causes of InnoDB corruption
                                                                                                  2. How to recover corrupted InnoDB data
                                                                                                  1. MySQL Enterprise Backup
                                                                                                  2. Percona XtraBackup
                                                                                                  3. mylvmbackup
                                                                                                  4. Zmanda Recovery Manager
                                                                                                  5. mydumper
                                                                                                  6. mysqldump
                                                                                                  1. Interface Tools
                                                                                                  2. Command-Line Utilities
                                                                                                  3. SQL Utilities
                                                                                                  4. Monitoring Tools
                                                                                                    1. Open Source Monitoring Tools
                                                                                                    2. Commercial Monitoring Systems
                                                                                                    3. Command-Line Monitoring with Innotop
                                                                                                    1. Percona Server
                                                                                                    2. MariaDB
                                                                                                    3. Drizzle
                                                                                                    4. Other MySQL Variants
                                                                                                    5. Summary
                                                                                                    1. System Variables
                                                                                                    2. SHOW STATUS
                                                                                                      1. Thread and Connection Statistics
                                                                                                      2. Binary Logging Status
                                                                                                      3. Command Counters
                                                                                                      4. Temporary Files and Tables
                                                                                                      5. Handler Operations
                                                                                                      6. MyISAM Key Buffer
                                                                                                      7. File Descriptors
                                                                                                      8. Query Cache
                                                                                                      9. SELECT Types
                                                                                                      10. Sorts
                                                                                                      11. Table Locking
                                                                                                      12. InnoDB-Specific
                                                                                                      13. Plugin-Specific
                                                                                                      1. Header
                                                                                                      2. SEMAPHORES
                                                                                                      3. LATEST FOREIGN KEY ERROR
                                                                                                      4. LATEST DETECTED DEADLOCK
                                                                                                      5. TRANSACTIONS
                                                                                                      6. FILE I/O
                                                                                                      7. INSERT BUFFER AND ADAPTIVE HASH INDEX
                                                                                                      8. LOG
                                                                                                      9. BUFFER POOL AND MEMORY
                                                                                                      10. ROW OPERATIONS
                                                                                                      1. InnoDB Tables
                                                                                                      2. Tables in Percona Server
                                                                                                      1. Copying Files
                                                                                                        1. A Naïve Example
                                                                                                        2. A One-Step Method
                                                                                                        3. Avoiding Encryption Overhead
                                                                                                        4. Other Options
                                                                                                        1. Invoking EXPLAIN
                                                                                                          1. Rewriting Non-SELECT Queries
                                                                                                          1. The id Column
                                                                                                          2. The select_type Column
                                                                                                          3. The table Column
                                                                                                            1. Derived tables and unions
                                                                                                            2. An example of complex SELECT types
                                                                                                            1. Lock Waits at the Server Level
                                                                                                              1. Table Locks
                                                                                                                1. Finding out who holds a lock
                                                                                                                1. Using the INFORMATION_SCHEMA Tables
                                                                                                                1. A Typical Sphinx Search
                                                                                                                2. Why Use Sphinx?
                                                                                                                  1. Efficient and Scalable Full-Text Searching
                                                                                                                  2. Applying WHERE Clauses Efficiently
                                                                                                                  3. Finding the Top Results in Order
                                                                                                                  4. Optimizing GROUP BY Queries
                                                                                                                  5. Generating Parallel Result Sets
                                                                                                                  6. Scaling
                                                                                                                  7. Aggregating Sharded Data
                                                                                                                  1. Installation Overview
                                                                                                                  2. Typical Partition Use
                                                                                                                  1. Phrase Proximity Ranking
                                                                                                                  2. Support for Attributes
                                                                                                                  3. Filtering
                                                                                                                  4. The SphinxSE Pluggable Storage Engine
                                                                                                                  5. Advanced Performance Control
                                                                                                                  1. Full-Text Searching on Mininova.org
                                                                                                                  2. Full-Text Searching on BoardReader.com
                                                                                                                  3. Optimizing Selects on Sahibinden.com
                                                                                                                  4. Optimizing GROUP BY on BoardReader.com
                                                                                                                  5. Optimizing Sharded JOIN Queries on Grouply.com
                                                                                                                  Show and hide more

                                                                                                                  Product information

                                                                                                                  • Title: High Performance MySQL, 3rd Edition
                                                                                                                  • Author(s): Baron Schwartz, Peter Zaitsev, Vadim Tkachenko
                                                                                                                  • Release date: March 2012
                                                                                                                  • Publisher(s): O'Reilly Media, Inc.
                                                                                                                  • ISBN: 9781449314286