This section documents some of the improvements that we plan to make to Sleeper.
The following improvements are actively being worked on:
- #3446 Bulk export.
- #1330 Support deploying a published version of Sleeper.
- #1389 Upgrade to AWS SDK v2.

The following are likely to be worked on in the near future:
- #4393 Batch up partition splitting commits.
- #3693 Improvements to declarative deployment with infrastructure as code.
- #1391 Create a library of repeatable, sustained, large-scale performance tests.
- #1388 Rust implementations for operations on data files.

The following improvements will be worked on in future (these are in no particular order):
- #576 Use Arrow types in the table schema.
- #4396 Failure handling / backpressure for state store updates.
- #4398 Trigger compaction dispatch in transaction log follower.
- Scaling improvements.
  - #4525 Mitigate limitations on throughput of state store updates.
  - #4215 Service that maintains an up-to-date cache of the state store.
  - #4218 Batch up updates to job trackers from state store commits.
  - #4214 Mitigate memory limitations with multiple Sleeper tables.
  - #4395 Table state partitioning.
  - #4394 Parallelise garbage collection.
- Usability improvements.
  - #1392 Create a predicate language for specifying filters on queries.
  - #1390 Review and extend the integrations with Athena and Trino.
  - Metrics page. Review and extend the metrics produced.
  - Purge data from a table, i.e. delete any items matching a predicate.

We also have an article on potential deployment improvements, examining how the current deployment setup relates to the planned improvements linked above.