Skip to content

Conversation

@zhangwl9
Copy link
Contributor

@zhangwl9 zhangwl9 commented Jan 14, 2026

Why are the changes needed?

Close #4037.

Brief change log

  • The new feature involves two methods in "TableRuntimeRefreshExecutor".
    • #getNextExecutingTime()
      • When dynamic refresh interval is enabled,retrieving the latest refresh interval from "DefaultTableRuntime" to determine the next execution time for "TableRuntimeRefreshExecutor#execute"
    • #execute()
      • Depending on the evaluated results, determine whether the table requires optimization.
      • When the evaluation table requires no optimization, extend the interval for its next refresh (incrementally increasing the interval by a fixed step).
      • When evaluating table requiring optimization, shorten its next refresh interval (halving the interval each time).
  • dynamic refresh interval in [ refresh-tables.interval(1 min),max_interval(1 hour)]

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before making a pull request

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@github-actions github-actions bot added type:docs Improvements or additions to documentation module:ams-server Ams server module module:common labels Jan 14, 2026
@zhangwl9 zhangwl9 changed the title [AMORO-4037] Add support for dynamic refresh interval for table metadata refreshing [AMORO-4037] Add support for dynamic refresh interval for table metadata refreshing in TableRuntimeRefreshExecutor Jan 14, 2026
@zhangwl9 zhangwl9 force-pushed the AMORO-dynamic-refresh-interval-dev branch from 60e829e to 298152d Compare January 15, 2026 07:06
@zhangwl9
Copy link
Contributor Author

zhangwl9 commented Jan 16, 2026

@zhoujinsong @turboFei @xxubai Could you help me review it when you are free? Thanks!

@zhangwl9 zhangwl9 force-pushed the AMORO-dynamic-refresh-interval-dev branch from 298152d to 4fec76a Compare January 21, 2026 05:42
@turboFei turboFei requested a review from Copilot January 22, 2026 04:46
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements a dynamic refresh interval feature for table metadata refreshing in TableRuntimeRefreshExecutor. The feature uses an AIMD (Additive Increase Multiplicative Decrease) algorithm to adjust refresh intervals based on whether tables need optimization, helping to reduce resource consumption for healthy tables while maintaining responsiveness for tables requiring optimization.

Changes:

  • Added configuration properties for adaptive refresh intervals with configurable maximum interval and increase step
  • Implemented adaptive interval calculation logic that increases intervals for healthy tables and decreases them for tables needing optimization
  • Added comprehensive test coverage for the adaptive interval functionality

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
docs/user-guides/configurations.md Added documentation for three new configuration properties related to adaptive refresh intervals
amoro-format-iceberg/src/main/java/org/apache/amoro/table/TableProperties.java Defined constants for adaptive refresh table properties and their default values
amoro-common/src/main/java/org/apache/amoro/config/OptimizingConfig.java Added fields, getters, setters, and updated equals/hashCode/toString methods for adaptive refresh configuration
amoro-ams/src/main/java/org/apache/amoro/server/table/TableConfigurations.java Integrated parsing of adaptive refresh properties from table properties into OptimizingConfig
amoro-ams/src/main/java/org/apache/amoro/server/table/DefaultTableRuntime.java Added fields and methods to track latest refresh interval and optimization evaluation status
amoro-ams/src/main/java/org/apache/amoro/server/scheduler/inline/TableRuntimeRefreshExecutor.java Implemented core adaptive interval logic with AIMD algorithm and updated execute method to apply adaptive intervals
amoro-ams/src/test/java/org/apache/amoro/server/scheduler/inline/TestTableRuntimeRefreshExecutor.java Added comprehensive test cases covering various adaptive interval scenarios including boundary conditions

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@zhangwl9 zhangwl9 force-pushed the AMORO-dynamic-refresh-interval-dev branch from c83edf5 to 16b59c2 Compare January 22, 2026 08:39
张文领 and others added 3 commits January 27, 2026 14:15
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@zhangwl9 zhangwl9 force-pushed the AMORO-dynamic-refresh-interval-dev branch from 16b59c2 to 3298bcf Compare January 27, 2026 06:15
Copy link
Member

@klion26 klion26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution. I left some comments; please let me know what you think. thanks.

IMHO, this feature is valuable(especially when users use HMS as the catalog), as we can't use the event-based trigger as RestCatalog, but when there are many "static table" in the catalog, it will be a great burden for AMS to evaluate these tables, and changing the interval for every table is not a doable(as there are so many tables), and keep the default behavior as the same as before, would not add any inconvenience if we don't need it.

| self-optimizing.min-plan-interval | 60000 | The minimum time interval between two self-optimizing planning action |
| self-optimizing.filter | NULL | Filter conditions for self-optimizing, using SQL conditional expressions, without supporting any functions. For the timestamp column condition, the ISO date-time formatter must be used. For example: op_time > '2007-12-03T10:15:30'. |
| self-optimizing.refresh-table.adaptive.enabled | false | Whether to enable adaptive refresh interval for refreshing table metadata |
| self-optimizing.refresh-table.adaptive.max-interval | 3600000(1 hour) | The maximum time interval in milliseconds to refresh table metadata |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can set the max-interval to an unset value, so that we can keep the behavior same as before the current pr.

public static final int SNAPSHOT_MIN_COUNT_DEFAULT = 1;

public static final String SELF_OPTIMIZING_REFRESH_TABLE_ADAPTIVE_ENABLED =
"self-optimizing.refresh-table.adaptive.enabled";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we need to set self-optimizing.refresh-table.adaptive.enabled and self-optimizing.refresh-table.adaptive.max-interval, how about using only self-optimizing.refresh-table.adaptive.max-interval? Some value (like -1 or zero) means that disable it, and some other value means that to enable it.

public static final boolean SELF_OPTIMIZING_REFRESH_TABLE_ADAPTIVE_ENABLED_DEFAULT = false;
public static final String SELF_OPTIMIZING_REFRESH_TABLE_ADAPTIVE_MAX_INTERVAL =
"self-optimizing.refresh-table.adaptive.max-interval";
public static final long SELF_OPTIMIZING_REFRESH_TABLE_ADAPTIVE_MAX_INTERVAL_DEFAULT =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about add the time unit in the name, SELF_OPTIMIZING_REFRESH_TABLE_ADAPTIVE_MAX_INTERVAL_DEFAULT -> SELF_OPTIMIZING_REFRESH_TABLE_ADAPTIVE_MAX_INTERVAL_MS_DEFAULT

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the other places

if (optimizingConfig.isMetadataBasedTriggerEnabled()
&& !MetadataBasedEvaluationEvent.isEvaluatingNecessary(
optimizingConfig, table, tableRuntime.getLastPlanTime())) {
tableRuntime.setLatestEvaluatedNeedOptimizing(false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it better that we change the return value of tryEvaluatingPendingInput to boolean -- indicate whether we need to optimize after evaluating?
So that we can call tableRuntime.setLatestEvaluteNeedOptimizing in one place in execute below,

Please let me what do you think about it.

// Update adaptive interval according to evaluated result.
if (defaultTableRuntime.getOptimizingConfig().getRefreshTableAdaptiveEnabled()) {
long newInterval = getAdaptiveExecutingInterval(defaultTableRuntime);
defaultTableRuntime.setLatestRefreshInterval(newInterval);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can call tableRuntime.setLatestEvaluatedNeedOptimizing(true); and defaultTableRuntime.setLatestRefreshInterval(newInterval); both in this if block, so that we wouldn't miss either

executor.execute(mockTableRuntime);

// Verify that setLatestRefreshInterval was called with the expected value
verify(mockTableRuntime, Mockito.times(1)).setLatestRefreshInterval(expectedInterval);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead using verify here to validate, is there any way we can validate the value of the true interval after executor.execute in L147.

The true interval is the value keeped in the table runtime.


// Test unhealthy table (need optimizing) - interval should decrease
mockTableRuntime =
getMockTableRuntimeWithAdaptiveInterval(tableRuntime, true, true, INTERVAL * 2);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to reuse this tableRuntime so we can observe the change in interval before and after two consecutive execute calls? Reusing a single tableRuntime function would also be more suitable for production scenarios.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

module:ams-server Ams server module module:common type:docs Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature]: Support dynamic refresh interval for refreshing table metadata

2 participants