Roadmap

This section documents some of the improvements that we plan to make to Sleeper.

The following improvements are actively being worked on:

  • #3446 Bulk export.
  • #1330 Support deploying a published version of Sleeper.
  • #1389 Upgrade to AWS SDK v2.

The following are likely to be worked on in the near future:

  • #4393 Batch up partition splitting commits.
  • #3693 Improvements to declarative deployment with infrastructure as code.
  • #1391 Create a library of repeatable, sustained, large-scale performance tests.
  • #1388 Rust implementations for operations on data files.

The following improvements will be worked on in the future, in no particular order:

  • #576 Use Arrow types in the table schema.
  • #4396 Failure handling / backpressure for state store updates.
  • #4398 Trigger compaction dispatch in transaction log follower.
  • Scaling improvements.
    • #4525 Mitigate limitations on throughput of state store updates.
    • #4215 Service that maintains an up-to-date cache of the state store.
    • #4218 Batch up updates to job trackers from state store commits.
    • #4214 Mitigate memory limitations with multiple Sleeper tables.
    • #4395 Table state partitioning.
    • #4394 Parallelise garbage collection.
  • Usability improvements.
    • #1328 Unify admin client and related scripts.
    • #1786 REST API.
    • Python API improvements. The Python API is currently basic and needs further work.
  • #1392 Create a predicate language for specifying filters on queries.
  • #1390 Review and extend the integrations with Athena and Trino.
  • Metrics page. Review and extend the metrics produced.
  • Purge data from a table, i.e. delete any items matching a predicate.

We also have an article on potential deployment improvements, examining how the current deployment setup relates to the planned improvements linked above.