Indexer optimization
Indexer Batching
Magento can increase the memory for processing a large amount of data by using memory engines instead of InnoDB. The algorithm increases the memory value for the max_heap_table_size
and tmp_table_size
MySQL parameters.
When the allocated memory size for a temporary table will be greater than 20% of innodb_buffer_pool_size
, the following message is written to the Magento log.
Memory size allocated for the temporary table is more than 20% of innodb_buffer_pool_size.
To prevent this error message, update innodb_buffer_pool_size
or decrease the batch size value (which decreases memory usage for the temporary table).
The interface BatchSizeManagementInterface
provides the ability to set the MEMORY table size for indexer processes according to batch size and index row size.
The interface IndexTableRowSizeEstimatorInterface
calculates the memory size for all rows per entity in the index table. The entity can store several rows in an index table generated by different store dimensions, such as count of websites and customer groups.
Batching configuration
Batching is available for the following indexers:
Index name | Configuration file | Configured object | Parameter name | Default value |
---|---|---|---|---|
catalog_product_price (Product Price) | Magento/Catalog/etc/di.xml |
Magento\Catalog\Model\ResourceModel\Product\Indexer\Price\BatchSizeCalculator |
batchRowsCount[‘default’] | 5000 |
cataloginventory_stock (Stock) | Magento/CatalogInventory/etc/di.xml |
Magento\CatalogInventory\Model\Indexer\Stock\Action\Full |
batchRowsCount[‘default’] | 200 |
catalog_category_product (Category Products) | Magento/Catalog/etc/di.xml |
Magento\Catalog\Model\Indexer\Category\Product\Action\Full |
batchRowsCount | 100000 |
catalog_product_attribute (Product Attribute) | Magento/Catalog/etc/di.xml |
Magento\Catalog\Model\ResourceModel\Product\Indexer\Eav\BatchSizeCalculator |
batchSizes[‘decimal’], batchSizes[‘source’] | 1000, 1000 |
Changing the batch size can help you optimize indexer running time. For example, for a store with the following characteristcs:
- 10 websites
- 10 store groups
- 20 store views
- 300 tier prices
- About 40,000 products (of which 254 are configurable)
reducing the batch size for catalog_product_price
indexer from 5000 to 1000 decreases the execution time from about 4 hours to less than 2 hours. You can experiement to determine the ideal batch size. In general, halving the batch size can decrease the indexer execution time.
The following examples illustrate how to define a custom batch size for configurable products. Add these samples to your {Your_Module_Name}/etc/di.xml
.
Indexer Table Switching
Magento optimizes certain indexer processes to prevent deadlocks and wait locks caused by read/write collisions on the same table. In these cases, Magento uses separate tables for performing read operations and reindexing. As a result of this table switching process, customers are not impacted when you run a full reindex. For example, when catalog_product_price
is reindexing, customers won’t be slowed down as they navigate on Categories pages, search products, or user layer navigation filters with price filters.
Magento uses the following tables to support table switching.
Indexer name | Tables used |
---|---|
catalog_product_price |
catalog_product_index_price , catalog_product_index_price_replica |
cataloginventory_stock |
cataloginventory_stock_status , cataloginventory_stock_status_replica |
catalog_category_product |
catalog_category_product_index , catalog_category_product_index_replica |
catalog_product_attribute (select, multiselect attributes) |
catalog_product_index_eav , catalog_product_index_eav_replica |
catalog_product_attribute (decimal values) |
catalog_product_index_eav_decimal , catalog_product_index_eav_decimal_replica |
catalogrule_rule |
catalogrule_product , catalogrule_product_replica , catalogrule_product_price , catalogrule_product_price_replica , catalogrule_group_website , catalogrule_group_website_replica |
Make sure that these indexers are in “Update By Schedule” mode. If “Update On Save” mode is selected, some data can be lost if you make changes during full reindex.
The indexer table switching mechanism requires additional database storage.