Create a partition to the iceberg tables created by S3-Data-Lake Connector #57501
Replies: 2 comments 1 reply
-
thanks for starting this discussion! some questions about how you want to partition:
(also, just to check - I think we're already sorting on PK, does that match what you're seeing?)
good catch on the compaction thing - lemme take a look at that, I have some theories about it
-
thanks for the response! really appreciate the good work :)
well, in the past i developed a destination using Athena (a clunky one that i'd be happy to replace, don't know what i was thinking!!), which is currently used in our production.
one of the limitations is that there are no optional parameters you can set at the schema level, so the decision has to be a generic, destination-level one. technically you could create a different destination for each partition type you want, which isn't ideal, but that's not something we'd solve now :)
to give more specific answers from the experience i've had so far:
regarding compactions, yeah, it would be nice to have it fixed.
-
https://docs.airbyte.com/integrations/destinations/s3-data-lake
I really like the new connector for Iceberg and am really keen on using it in production.
One thing that prevents me from doing so is that the tables are created with sorting but without partitions.
For tables with even millions of records, not having partitions is very problematic and makes performance very bad.
I'd suggest partitioning on the cursor field and sorting by the primary key.
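To illustrate why partitioning on the cursor field helps, here is a minimal Python sketch of partition pruning. It is a toy model, not Iceberg or Airbyte code: `DataFile`, `files_to_scan` and the file paths are hypothetical names made up for the example, and it ignores other file-skipping mechanisms such as column statistics.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class DataFile:
    """Toy stand-in for a data file; `day` is None when the table is unpartitioned."""
    path: str
    day: Optional[date]


def files_to_scan(files: List[DataFile], wanted_day: date) -> List[DataFile]:
    """Return the files a query filtering on `wanted_day` has to open.

    A file with a known partition value can be skipped when the value does
    not match; a file without one must always be read.
    """
    return [f for f in files if f.day is None or f.day == wanted_day]


days = [date(2025, 1, d) for d in range(1, 31)]  # 30 days of data

# Hypothetical layout: one file per day, partitioned by day of the cursor field
partitioned = [DataFile(f"data/day={d}/part-0.parquet", d) for d in days]
# Same number of files, but no partition value recorded on any of them
unpartitioned = [DataFile(f"data/part-{i}.parquet", None) for i in range(len(days))]

query_day = date(2025, 1, 15)
print(len(files_to_scan(partitioned, query_day)))    # 1
print(len(files_to_scan(unpartitioned, query_day)))  # 30
```

In this toy layout a query for a single day opens 1 file instead of 30; the gap grows with the amount of history in the table.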
Another issue I noticed: when I ran compactions on the table and then reran the Airbyte sync, I got this error:
java.lang.IllegalArgumentException: Cannot fast-forward: main is not an ancestor of airbyte_staging
I'm assuming that the airbyte_staging branch is supposed to be on the same snapshot as the main branch.
To fix that, I manually changed the snapshot on the airbyte_staging branch to align with the main branch.
Given that compaction is a required maintenance procedure for Iceberg tables, it would be great if this could be fixed.
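For anyone trying to picture the failure, below is a rough Python model of the ancestor check behind a fast-forward. It is only a sketch under assumptions: the snapshot ids, the `parents` map and the `fast_forward` helper are hypothetical, and the idea that the sync commits to airbyte_staging and then fast-forwards main is my guess at the flow, not something confirmed from the connector code.

```python
from typing import Dict, Optional

# snapshot id -> parent snapshot id (None for the first snapshot)
Parents = Dict[int, Optional[int]]


def is_ancestor(parents: Parents, ancestor: int, descendant: int) -> bool:
    """Walk parent pointers from `descendant`; True if `ancestor` is reached."""
    current: Optional[int] = descendant
    while current is not None:
        if current == ancestor:
            return True
        current = parents.get(current)
    return False


def fast_forward(parents: Parents, branches: Dict[str, int], target: str, source: str) -> None:
    """Move branch `target` to the head of `source`, only if no history is lost."""
    if not is_ancestor(parents, branches[target], branches[source]):
        raise ValueError(f"Cannot fast-forward: {target} is not an ancestor of {source}")
    branches[target] = branches[source]


# Both branches start on snapshot 2
parents: Parents = {1: None, 2: 1}
branches = {"main": 2, "airbyte_staging": 2}

# Compaction commits a new snapshot on main only
parents[3] = 2
branches["main"] = 3

# Assumed sync behaviour: commit on top of the (now stale) staging head
parents[4] = branches["airbyte_staging"]
branches["airbyte_staging"] = 4

try:
    fast_forward(parents, branches, "main", "airbyte_staging")
    error = None
except ValueError as exc:
    error = str(exc)
print(error)

# Manual workaround from the post: realign staging with main, then sync again
branches["airbyte_staging"] = branches["main"]
parents[5] = branches["airbyte_staging"]
branches["airbyte_staging"] = 5
fast_forward(parents, branches, "main", "airbyte_staging")
print(branches["main"])  # 5
```

In this model the compaction snapshot (3) is on main but not in the history of airbyte_staging, so the check fails until the staging branch is moved onto main's snapshot.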