-
Notifications
You must be signed in to change notification settings - Fork 99
Support insert_into_select
for Postgres table
#688
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
hi~ @JelteF The test cases aren't fully covered yet. would appreciate some feedback before finalizing the details. Thanks! |
src/pgduckdb_planner.cpp
Outdated
ListCell *l; | ||
RangeTblEntry *select_rte = NULL; | ||
foreach(l, copied_query->rtable) { | ||
RangeTblEntry *rte = (RangeTblEntry *)lfirst(l); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
foreach_node
is nicer to use and requires less boilerplate
ListCell *l; | |
RangeTblEntry *select_rte = NULL; | |
foreach(l, copied_query->rtable) { | |
RangeTblEntry *rte = (RangeTblEntry *)lfirst(l); | |
RangeTblEntry *select_rte = NULL; | |
foreach_node(RangeTblEntry, rte, copied_query->rtable) { |
src/pgduckdb_hooks.cpp
Outdated
if (query->onConflict || query->returningList) { | ||
elog(elevel, "DuckDB does not support INSERT ON CONFLICT or RETURNING on Postgres tables"); | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not allow these? Do they not work? I would expect that if we insert using postgres its functionality that thees things would automatically work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, not saying that this should be implemented in the first version of this feature. But I'd at least like a code comment explaining why we do not support it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, block them since I haven't tested them yet. I've done some initial testing, and RETURNING works out of box while onConflict doesn't. Will dig deeper into the WHY and add comments to clarify.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ON CONFLICT is also working
src/pgduckdb_hooks.cpp
Outdated
@@ -169,10 +182,16 @@ IsAllowedStatement(Query *query, bool throw_error) { | |||
/* We don't support modifying statements on Postgres tables yet */ | |||
if (query->commandType != CMD_SELECT) { | |||
RangeTblEntry *resultRte = list_nth_node(RangeTblEntry, query->rtable, query->resultRelation - 1); | |||
if (!::IsDuckdbTable(resultRte->relid)) { | |||
if (::IsHeapTable(resultRte->relid) && query->commandType == CMD_INSERT) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why only heap tables? I'd expect this should logic would work fine for most table access method, because they implement the table access method interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
YES, you are right! This is becoming much more exciting if we get the whole thing functional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CI is failing in many places, because duckdb.force_execution
is true
in our tests. Apparently something breaks when selecting from generate_series()
in duckdb for a PG INSERT.
My example code from #30:
Fails with this error:
I think that would be worked around by #583, but then the opposite would hold (i.e. inserting into a varchar column woudl fail). So I think we need some logic to make sure that the types match against eachother, and maybe add some code to convert the type that we get back from DuckDB to the type that PG wants for the INSERT. I think that would also fix the |
YES, the type check and conversion are substantial. I manually added an |
insert_into_select
for Postgres(heap) table insert_into_select
for Postgres table
dabffc2
to
e7f7aaf
Compare
The MR is ready for review~ @JelteF Please take a look when you have a chance. Currently, the regression is failing for
update: DuckDB cli has the exact same output. |
insert_into_select
for Postgres table insert_into_select
for Postgres table
@@ -14,3 +14,4 @@ extern bool duckdb_explain_ctas; | |||
PlannedStmt *DuckdbPlanNode(Query *parse, const char *query_string, int cursor_options, ParamListInfo bound_params, | |||
bool throw_error); | |||
duckdb::unique_ptr<duckdb::PreparedStatement> DuckdbPrepare(const Query *query, const char *explain_prefix = NULL); | |||
bool IsAllowedPostgresInsert(Query *query, bool throw_error = false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like you don't have https://editorconfig.org/ installed in your editor. Could you do that and re-save this file to add a trailing newline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
return false; | ||
} | ||
|
||
/* TODO: Checking supported target column types */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You wanna implement this TODO still?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, the TODO comment can be removed. The follow-up code handles the checking through blacklist
I think you can probably fix the test by changing the plain |
continue; | ||
} | ||
|
||
if (attr->atttypid == BYTEAOID || attr->atttypid == XMLOID || pgduckdb::pg::IsDomainType(attr->atttypid) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of hardcoding some unsupported types, I think we should instead determine the allowed types. Also, why are these types unsupported? Because we do support BYTEA as a type for regular duckdb execution, so that seems surprising that it doesn't work here.
INSERT INTO tbl select r['a']::int, r['b'] from duckdb.query($$ SELECT 21 a, 'ghi' b $$) r; | ||
ERROR: no partition of relation "tbl" found for row | ||
DETAIL: Partition key of the failing row contains (a) = (21). | ||
SELECT * FROM tbl ORDER BY a LIMIT 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to also check the contents of each partition, to double check that values go to the correct partition.
} | ||
|
||
if (!select_rte) { | ||
elog(elevel, "DuckDB does not support INSERT without a subquery"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are no tests for this error case afaict
IsAllowedPostgresInsert(Query *query, bool throw_error) { | ||
int elevel = throw_error ? ERROR : DEBUG4; | ||
if (query->commandType != CMD_INSERT) { | ||
elog(elevel, "DuckDB only supports INSERT/SELECT on Postgres tables"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are no tests for this error case afaict
|
||
DROP TABLE tbl; | ||
-- case: ON CONFLICT DO NOTHING | ||
CREATE TABLE tbl (a int PRIMARY KEY, b text); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens with a SERIAL
column or a column with GENERATED ALWAYS AS IDENTITY
.
To set expectations: I think this is a very cool feature and your implementation looks like the right direction. But it is relatively complex and impactful, so I would like to give it an appropriate round of testing and "baking" (aka testing by merging it and having it on the |
Totally agree, 1.1 makes sense. btw, any target date for 1.0 release? |
The current plan is to do it quickly after the DuckDB 1.3.0 release, so early May. But dates may obviously slip. |
Partially fixes #30