@aheejin aheejin commented Jan 20, 2026

In ModuleSplitter::shareImportableItems, we now check whether each module element is used by a secondary module, and export and import it only when necessary. This removes the need to run the RemoveUnusedModuleElements pass on secondary modules at the end, and also removes exports from the primary module that no secondary module uses.
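
As a rough illustration of the "export and import only what is used" idea, here is a minimal sketch on a toy model. `ToyModule` and `shareUsedItems` are hypothetical names made up for this example; they are not Binaryen types or functions, and the real pass works on typed module items (globals, tables, memories, tags) rather than plain strings.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a module; not a Binaryen type.
struct ToyModule {
  std::set<std::string> defined;    // items defined in this module
  std::set<std::string> referenced; // items referenced by this module's code
};

// For each secondary module, compute the primary-defined items it has to
// import. `exports` collects the items the primary module has to export,
// i.e. only those that at least one secondary module actually uses.
std::vector<std::set<std::string>>
shareUsedItems(const ToyModule& primary,
               const std::vector<ToyModule>& secondaries,
               std::set<std::string>& exports) {
  std::vector<std::set<std::string>> imports(secondaries.size());
  for (std::size_t i = 0; i < secondaries.size(); ++i) {
    for (const auto& name : secondaries[i].referenced) {
      // Skip items the secondary defines itself and items the primary
      // does not define at all.
      if (secondaries[i].defined.count(name) ||
          !primary.defined.count(name)) {
        continue;
      }
      imports[i].insert(name);
      exports.insert(name);
    }
  }
  return imports;
}
```

In this sketch nothing unused is ever added to a secondary module, so there is nothing left for a later cleanup pass to remove, and the primary module's export list only grows when some secondary module needs the item.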

This reduces the running time on the 'acx_gallery' reproducer previously provided by @biggs0125 from 78.2s to 9.8s on average on my machine, a reduction of around 87.4%.

In that 'acx_gallery' program, this PR reduces the size of the primary module by 6.5% and its export section by 80.5%, and reduces the combined size of all modules (primary + secondaries) by 6%.


Detailed analysis for 'acx_gallery', a case where we split a module into 301 (1 primary + 300 secondary) modules:

  • Before this PR: Time: 78.2s
    Task breakdown:
Task                                          Total Time (ms)      Percentage
---------------------------------------------------------------------------
shareImportableItems                               42595.0000          62.45%
removeUnusedSecondaryElements                      20867.0000          30.59%
writeModule_secondary                               1932.4024           2.83%
writeModule_primary                                 1398.1200           2.05%
exportImportCalledPrimaryFunctions                   908.5040           1.33%
classifyFunctions                                    246.8890           0.36%
indirectReferencesToSecondaryFunctions               138.9600           0.20%
indirectCallsToSecondaryFunctions                     98.9370           0.15%
moveSecondaryFunctions                                21.1316           0.03%
initExportedPrimaryFuncs                               1.6690           0.00%
thunkExportedSecondaryFunctions                        0.3248           0.00%
setupTablePatching                                     0.0010           0.00%
---------------------------------------------------------------------------
Overall Total                                      68208.9388         100.00%
  • After this PR: Time: 9.8s
    Task breakdown:
Task                                          Total Time (ms)      Percentage
---------------------------------------------------------------------------
writeModule_secondary                               1242.9138          32.76%
shareImportableItems                                 816.0900          21.51%
writeModule_primary                                  788.7880          20.79%
exportImportCalledPrimaryFunctions                   590.2790          15.56%
classifyFunctions                                    172.8060           4.55%
indirectReferencesToSecondaryFunctions                91.8864           2.42%
indirectCallsToSecondaryFunctions                     79.0815           2.08%
moveSecondaryFunctions                                11.5988           0.31%
initExportedPrimaryFuncs                               0.8271           0.02%
thunkExportedSecondaryFunctions                        0.1599           0.00%
setupTablePatching                                     0.0008           0.00%
---------------------------------------------------------------------------
Overall Total                                       3794.4313         100.00%

We can see that shareImportableItems, which took up the largest share, dropped from about 42.6s to about 0.8s, and removeUnusedSecondaryElements, which was the second largest at about 20.9s, is no longer needed.

In `ModuleSplitter::shareImportableItems`, we now check whether each
module element is used by a secondary module, and export and import it
only when necessary. This removes the need to run the
RemoveUnusedModuleElements pass on secondary modules at the end, and
also removes exports from the primary module that no secondary module
uses.

This reduces the running time on the 'acx_gallery' reproducer previously
provided by @biggs0125 from 78.3s to 16.4s on average on my machine, a
reduction of around 79%.

In that 'acx_gallery' program, this PR reduces the size of the primary
module by 8% and reduces the combined size of all modules (primary +
secondaries) by 3.7%.

---

Detailed analysis for 'acx_gallery', a case where we split a module into
301 (1 primary + 300 secondary) modules:

- Before this PR:
Time: 78.3s
Task breakdown:
```
Task                                          Total Time (ms)      Percentage
---------------------------------------------------------------------------
shareImportableItems                               34472.0000          49.90%
removeUnusedSecondaryElements                      24720.2000          35.78%
moveSecondaryFunctions                              4938.0500           7.15%
writeModule_secondary                               2892.7603           4.19%
writeModule_primary                                  897.3630           1.30%
exportImportCalledPrimaryFunctions                   667.9780           0.97%
indirectReferencesToSecondaryFunctions               201.9830           0.29%
indirectCallsToSecondaryFunctions                    133.2190           0.19%
classifyFunctions                                    118.3200           0.17%
setupTablePatching                                    42.6402           0.06%
thunkExportedSecondaryFunctions                        0.9307           0.00%
initExportedPrimaryFuncs                               0.8008           0.00%
---------------------------------------------------------------------------
Overall Total                                      69086.2450         100.00%
```

- After this PR:
Time: 16.4s
Task breakdown:
```
Task                                          Total Time (ms)      Percentage
---------------------------------------------------------------------------
moveSecondaryFunctions                              5341.1700          43.94%
writeModule_secondary                               2960.8319          24.36%
shareImportableItems                                1568.6400          12.91%
writeModule_primary                                  765.7240           6.30%
exportImportCalledPrimaryFunctions                   762.9670           6.28%
indirectReferencesToSecondaryFunctions               362.3300           2.98%
classifyFunctions                                    204.4540           1.68%
indirectCallsToSecondaryFunctions                    139.2110           1.15%
setupTablePatching                                    45.2461           0.37%
thunkExportedSecondaryFunctions                        3.0373           0.02%
initExportedPrimaryFuncs                               0.9982           0.01%
---------------------------------------------------------------------------
Overall Total                                      12154.6095         100.00%
```

We can see that `shareImportableItems`, which took up the largest share,
dropped from about 34.5s to about 1.6s, and
`removeUnusedSecondaryElements`, which was the second largest at about
24.7s, is no longer needed.
@aheejin aheejin requested a review from tlively January 20, 2026 17:03

#define DELEGATE_FIELD_NAME_KIND(id, field, kind) \
  if (cast->field.is()) { \
    if (kind == ModuleItemKind::Global) { \

A switch here might be better than an if-else chain. If nothing else, it would give us a compiler error if we ever add a new ModuleItemKind.
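
To illustrate the suggestion, here is a minimal sketch of a `switch` over an enum with no `default:` label. `ItemKind` and `sectionFor` are hypothetical names for this example, not Binaryen's `ModuleItemKind` or any function in the PR. With GCC and Clang, `-Wswitch` (part of `-Wall`) reports any enumerator that the switch does not handle, which becomes a build error when warnings are treated as errors.

```cpp
#include <string>

// Hypothetical enum for illustration; not Binaryen's ModuleItemKind.
enum class ItemKind { Function, Table, Memory, Global, Tag };

// There is deliberately no `default:` label. If a new enumerator is added
// to ItemKind and not handled below, -Wswitch flags this switch.
std::string sectionFor(ItemKind kind) {
  switch (kind) {
    case ItemKind::Function:
      return "functions";
    case ItemKind::Table:
      return "tables";
    case ItemKind::Memory:
      return "memories";
    case ItemKind::Global:
      return "globals";
    case ItemKind::Tag:
      return "tags";
  }
  // Not reached for valid enumerators; avoids a missing-return warning.
  return "unknown";
}
```

An if-else chain with a trailing `else` gives no such diagnostic, since the compiler cannot tell that the chain was meant to be exhaustive.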

Comment on lines +975 to +997
NameCollector collector(used);
for (auto& global : secondary.globals) {
  if (!global->imported()) {
    collector.walk(global->init);
  }
}
for (auto& segment : secondary.dataSegments) {
  used.memories.insert(segment->memory);
  if (segment->offset) {
    collector.walk(segment->offset);
  }
}
for (auto& segment : secondary.elementSegments) {
  if (segment->table.is()) {
    used.tables.insert(segment->table);
  }
  if (segment->offset) {
    collector.walk(segment->offset);
  }
  for (auto* item : segment->data) {
    collector.walk(item);
  }
}

All of this except checking for tables and memories in active segments could be replaced with a call to collector.walkModuleCode().


for (auto& table : primary.tables) {
  auto secondaryTable = secondary.getTableOrNull(table->name);
  if (!secondaryTable && !used.tables.count(table->name)) {

Why is this condition different from the others? Maybe worth a comment.
