-
Notifications
You must be signed in to change notification settings - Fork 13.3k
Initial support for dynamically linked crates #134767
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
cc @m-ou-se |
This comment has been minimized.
This comment has been minimized.
I think that is indicative of a fundamental issue with your implementation. Adding new private impls should not change the ABI. The whole point of the RFC is to allow changing the implementation of the dylib without changing it's ABI. We already have support for dylibs where the ABI depends on the implementation. |
Also the interface file is missing either the Replacing all function bodies in an interface file with |
A stable ABI across rustc versions is a lot more involved than just mangling all symbols the same. You have to make sure multiple copies of the standard library seamlessly interoperate including using the same Edit: To put it another way, I did strongly prefer if stable ABI across rustc versions and stable ABI across crate versions with a single rustc version are treated as entirely separate problems implemented entirely separately. For stable ABI across crate versions you don't need to generate interface files. You can just reuse the existing rmeta file format, but only encode the subset of the items corresponding to the stable ABI. |
I can suggest the following solution for the above problem with impl's: Split indices (disambiguators) into 2 sets:
One of the main objectives of this proposal is to allow building the library and the application with different compiler versions. This requires either a metadata format compatible across rustc versions or some form of source code. Stable metadata format is not a part of the RFC.
"stable ABI" is a bad wording here. Neither I nor the RFC offers a stable ABI. And all these issues are outside the scope of the proposal. |
These two statements are conflicting with each other. Being able to build a library and application with a different compiler version requires both to share an ABI, in other words it requires having a stable ABI. |
Only |
This comment was marked as resolved.
This comment was marked as resolved.
This comment has been minimized.
This comment has been minimized.
This comment was marked as resolved.
This comment was marked as resolved.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
🔒 Merge conflict This pull request and the master branch diverged in a way that cannot be automatically merged. Please rebase on top of the latest master branch, and let the reviewer approve again. How do I rebase?Assuming
You may also read Git Rebasing to Resolve Conflicts by Drew Blessing for a short tutorial. Please avoid the "Resolve conflicts" button on GitHub. It uses Sometimes step 4 will complete without asking for resolution. This is usually due to difference between how Error message
|
☔ The latest upstream changes (presumably #139996) made this pull request unmergeable. Please resolve the merge conflicts. |
@bors r+ |
Initial support for dynamically linked crates This PR is an initial implementation of [rust-lang/rfcs#3435](rust-lang/rfcs#3435) proposal. ### component 1: interface generator Interface generator - a tool for generating a stripped version of crate source code. The interface is like a C header, where all function bodies are omitted. For example, initial crate: ```rust #[export] #[repr(C)] pub struct S { pub x: i32 } #[export] pub extern "C" fn foo(x: S) { m1::bar(x); } pub fn bar(x: crate::S) { // some computations } ``` generated interface: ```rust #[export] #[repr(C)] pub struct S { pub x: i32, } #[export] pub extern "C" fn foo(x: S); pub fn bar(x: crate::S); ``` The interface generator was implemented as part of the pretty-printer. Ideally interface should only contain exportable items, but here is the first problem: - pass for determining exportable items relies on privacy information, which is totally available only in HIR - HIR pretty-printer uses pseudo-code(at least for attributes) So, the interface generator was implemented in AST. This has led to the fact that non-exportable items cannot be filtered out, but I don't think this is a major issue at the moment. To emit an interface use a new `sdylib` crate type which is basically the same as `dylib`, but it doesn't contain metadata, and also produces the interface as a second artifact. The current interface name is `lib{crate_name}.rs`. #### Why was it decided to use a design with an auto-generated interface? One of the main objectives of this proposal is to allow building the library and the application with different compiler versions. This requires either a metadata format compatible across rustc versions or some form of a source code. The option with a stable metadata format has not been investigated in detail, but it is not part of RFC either. Here is the the related discussion: rust-lang/rfcs#3435 (comment) Original proposal suggests using the source code for the dynamic library and all its dependencies. Metadata is obtained from `cargo check`. I decided to use interface files since it is more or less compatible with the original proposal, but also allows users to hide the source code. ##### Regarding the design with interfaces in Rust, files generally do not have a special meaning, unlike C++. A translation unit i.e. a crate is not a single file, it consists of modules. Modules, in turn, can be declared either in one file or divided into several. That's why the "interface file" isn't a very coherent concept in Rust. I would like to avoid adding an additional level of complexity for users until it is proven necessary. Therefore, the initial plan was to make the interfaces completely invisible to users i. e. make them auto-generated. I also planned to put them in the dylib, but this has not been done yet. (since the PR is already big enough, I decided to postpone it) There is one concern, though, which has not yet been investigated(rust-lang#134767 (comment)): > Compiling the interface as pretty-printed source code doesn't use correct macro hygiene (mostly relevant to macros 2.0, stable macros do not affect item hygiene). I don't have much hope for encoding hygiene data in any stable way, we should rather support a way for the interface file to be provided manually, instead of being auto-generated, if there are any non-trivial requirements. ### component 2: crate loader When building dynamic dependencies, the crate loader searches for the interface in the file system, builds the interface without codegen and loads it's metadata. Routing rules for interface files are almost the same as for `rlibs` and `dylibs`. Firstly, the compiler checks `extern` options and then tries to deduce the path himself. Here are the code and commands that corresponds to the compilation process: ```rust // simple-lib.rs #![crate_type = "sdylib"] #[extern] pub extern "C" fn foo() -> i32 { 42 } ``` ```rust // app.rs extern crate simple_lib; fn main() { assert!(simple_lib::foo(), 42); } ``` ``` // Generate interface, build library. rustc +toolchain1 lib.rs // Build app. Perhaps with a different compiler version. rustc +toolchain2 app.rs -L. ``` P.S. The interface name/format and rules for file system routing can be changed further. ### component 3: exportable items collector Query for collecting exportable items. Which items are exportable is defined [here](https://github.com/m-ou-se/rfcs/blob/export/text/0000-export.md#the-export-attribute) . ### component 4: "stable" mangling scheme The mangling scheme proposed in the RFC consists of two parts: a mangled item path and a hash of the signature. #### mangled item path For the first part of the symbol it has been decided to reuse the `v0` mangling scheme as it much less dependent on compiler internals compared to the `legacy` scheme. The exception is disambiguators (https://doc.rust-lang.org/rustc/symbol-mangling/v0.html#disambiguator): For example, during symbol mangling rustc uses a special index to distinguish between two impls of the same type in the same module(See `DisambiguatedDefPathData`). The calculation of this index may depend on private items, but private items should not affect the ABI. Example: ```rust #[export] #[repr(C)] pub struct S<T>(pub T); struct S1; pub struct S2; impl S<S1> { extern "C" fn foo() -> i32 { 1 } } #[export] impl S<S2> { // Different symbol names can be generated for this item // when compiling the interface and source code. pub extern "C" fn foo() -> i32 { 2 } } ``` In order to make disambiguation independent of the compiler version we can assign an id to each impl according to their relative order in the source code. The second example is `StableCrateId` which is used to disambiguate different crates. `StableCrateId` consists of crate name, `-Cmetadata` arguments and compiler version. At the moment, I have decided to keep only the crate name, but a more consistent approach to crate disambiguation could be added in the future. Actually, there are more cases where such disambiguation can be used. For instance, when mangling internal rustc symbols, but it also hasn't been investigated in detail yet. #### hash of the signature Exportable functions from stable dylibs can be called from safe code. In order to provide type safety, 128 bit hash with relevant type information is appended to the symbol ([description from RFC](https://github.com/m-ou-se/rfcs/blob/export/text/0000-export.md#name-mangling-and-safety)). For now, it includes: - hash of the type name for primitive types - for ADT types with public fields the implementation follows [this](https://github.com/m-ou-se/rfcs/blob/export/text/0000-export.md#types-with-public-fields) rules `#[export(unsafe_stable_abi = "hash")]` syntax for ADT types with private fields is not yet implemented. Type safety is a subtle thing here. I used the approach from RFC, but there is the ongoing research project about it. [https://rust-lang.github.io/rust-project-goals/2025h1/safe-linking.html](https://rust-lang.github.io/rust-project-goals/2025h1/safe-linking.html) ### Unresolved questions Interfaces: 1. Move the interface generator to HIR and add an exportable items filter. 2. Compatibility of auto-generated interfaces and macro hygiene. 3. There is an open issue with interface files compilation: rust-lang#134767 (comment) 4. Put an interface into a dylib. Mangling scheme: 1. Which information is required to ensure type safety and how should it be encoded? ([https://rust-lang.github.io/rust-project-goals/2025h1/safe-linking.html](https://rust-lang.github.io/rust-project-goals/2025h1/safe-linking.html)) 2. Determine all other possible cases, where path disambiguation is used. Make it compiler independent. We also need a semi-stable API to represent types. For example, the order of fields in the `VariantDef` must be stable. Or a semi-stable representation for AST, which ensures that the order of the items in the code is preserved. There are some others, mentioned in the proposal.
This comment has been minimized.
This comment has been minimized.
💔 Test failed - checks-actions |
@bors r+ |
Initial support for dynamically linked crates This PR is an initial implementation of [rust-lang/rfcs#3435](rust-lang/rfcs#3435) proposal. ### component 1: interface generator Interface generator - a tool for generating a stripped version of crate source code. The interface is like a C header, where all function bodies are omitted. For example, initial crate: ```rust #[export] #[repr(C)] pub struct S { pub x: i32 } #[export] pub extern "C" fn foo(x: S) { m1::bar(x); } pub fn bar(x: crate::S) { // some computations } ``` generated interface: ```rust #[export] #[repr(C)] pub struct S { pub x: i32, } #[export] pub extern "C" fn foo(x: S); pub fn bar(x: crate::S); ``` The interface generator was implemented as part of the pretty-printer. Ideally interface should only contain exportable items, but here is the first problem: - pass for determining exportable items relies on privacy information, which is totally available only in HIR - HIR pretty-printer uses pseudo-code(at least for attributes) So, the interface generator was implemented in AST. This has led to the fact that non-exportable items cannot be filtered out, but I don't think this is a major issue at the moment. To emit an interface use a new `sdylib` crate type which is basically the same as `dylib`, but it doesn't contain metadata, and also produces the interface as a second artifact. The current interface name is `lib{crate_name}.rs`. #### Why was it decided to use a design with an auto-generated interface? One of the main objectives of this proposal is to allow building the library and the application with different compiler versions. This requires either a metadata format compatible across rustc versions or some form of a source code. The option with a stable metadata format has not been investigated in detail, but it is not part of RFC either. Here is the the related discussion: rust-lang/rfcs#3435 (comment) Original proposal suggests using the source code for the dynamic library and all its dependencies. Metadata is obtained from `cargo check`. I decided to use interface files since it is more or less compatible with the original proposal, but also allows users to hide the source code. ##### Regarding the design with interfaces in Rust, files generally do not have a special meaning, unlike C++. A translation unit i.e. a crate is not a single file, it consists of modules. Modules, in turn, can be declared either in one file or divided into several. That's why the "interface file" isn't a very coherent concept in Rust. I would like to avoid adding an additional level of complexity for users until it is proven necessary. Therefore, the initial plan was to make the interfaces completely invisible to users i. e. make them auto-generated. I also planned to put them in the dylib, but this has not been done yet. (since the PR is already big enough, I decided to postpone it) There is one concern, though, which has not yet been investigated(rust-lang#134767 (comment)): > Compiling the interface as pretty-printed source code doesn't use correct macro hygiene (mostly relevant to macros 2.0, stable macros do not affect item hygiene). I don't have much hope for encoding hygiene data in any stable way, we should rather support a way for the interface file to be provided manually, instead of being auto-generated, if there are any non-trivial requirements. ### component 2: crate loader When building dynamic dependencies, the crate loader searches for the interface in the file system, builds the interface without codegen and loads it's metadata. Routing rules for interface files are almost the same as for `rlibs` and `dylibs`. Firstly, the compiler checks `extern` options and then tries to deduce the path himself. Here are the code and commands that corresponds to the compilation process: ```rust // simple-lib.rs #![crate_type = "sdylib"] #[extern] pub extern "C" fn foo() -> i32 { 42 } ``` ```rust // app.rs extern crate simple_lib; fn main() { assert!(simple_lib::foo(), 42); } ``` ``` // Generate interface, build library. rustc +toolchain1 lib.rs // Build app. Perhaps with a different compiler version. rustc +toolchain2 app.rs -L. ``` P.S. The interface name/format and rules for file system routing can be changed further. ### component 3: exportable items collector Query for collecting exportable items. Which items are exportable is defined [here](https://github.com/m-ou-se/rfcs/blob/export/text/0000-export.md#the-export-attribute) . ### component 4: "stable" mangling scheme The mangling scheme proposed in the RFC consists of two parts: a mangled item path and a hash of the signature. #### mangled item path For the first part of the symbol it has been decided to reuse the `v0` mangling scheme as it much less dependent on compiler internals compared to the `legacy` scheme. The exception is disambiguators (https://doc.rust-lang.org/rustc/symbol-mangling/v0.html#disambiguator): For example, during symbol mangling rustc uses a special index to distinguish between two impls of the same type in the same module(See `DisambiguatedDefPathData`). The calculation of this index may depend on private items, but private items should not affect the ABI. Example: ```rust #[export] #[repr(C)] pub struct S<T>(pub T); struct S1; pub struct S2; impl S<S1> { extern "C" fn foo() -> i32 { 1 } } #[export] impl S<S2> { // Different symbol names can be generated for this item // when compiling the interface and source code. pub extern "C" fn foo() -> i32 { 2 } } ``` In order to make disambiguation independent of the compiler version we can assign an id to each impl according to their relative order in the source code. The second example is `StableCrateId` which is used to disambiguate different crates. `StableCrateId` consists of crate name, `-Cmetadata` arguments and compiler version. At the moment, I have decided to keep only the crate name, but a more consistent approach to crate disambiguation could be added in the future. Actually, there are more cases where such disambiguation can be used. For instance, when mangling internal rustc symbols, but it also hasn't been investigated in detail yet. #### hash of the signature Exportable functions from stable dylibs can be called from safe code. In order to provide type safety, 128 bit hash with relevant type information is appended to the symbol ([description from RFC](https://github.com/m-ou-se/rfcs/blob/export/text/0000-export.md#name-mangling-and-safety)). For now, it includes: - hash of the type name for primitive types - for ADT types with public fields the implementation follows [this](https://github.com/m-ou-se/rfcs/blob/export/text/0000-export.md#types-with-public-fields) rules `#[export(unsafe_stable_abi = "hash")]` syntax for ADT types with private fields is not yet implemented. Type safety is a subtle thing here. I used the approach from RFC, but there is the ongoing research project about it. [https://rust-lang.github.io/rust-project-goals/2025h1/safe-linking.html](https://rust-lang.github.io/rust-project-goals/2025h1/safe-linking.html) ### Unresolved questions Interfaces: 1. Move the interface generator to HIR and add an exportable items filter. 2. Compatibility of auto-generated interfaces and macro hygiene. 3. There is an open issue with interface files compilation: rust-lang#134767 (comment) 4. Put an interface into a dylib. Mangling scheme: 1. Which information is required to ensure type safety and how should it be encoded? ([https://rust-lang.github.io/rust-project-goals/2025h1/safe-linking.html](https://rust-lang.github.io/rust-project-goals/2025h1/safe-linking.html)) 2. Determine all other possible cases, where path disambiguation is used. Make it compiler independent. We also need a semi-stable API to represent types. For example, the order of fields in the `VariantDef` must be stable. Or a semi-stable representation for AST, which ensures that the order of the items in the code is preserved. There are some others, mentioned in the proposal.
The job Click to see the possible cause of the failure (guessed by this bot)
|
💔 Test failed - checks-actions |
☔ The latest upstream changes (presumably #140127) made this pull request unmergeable. Please resolve the merge conflicts. |
This PR is an initial implementation of rust-lang/rfcs#3435 proposal.
component 1: interface generator
Interface generator - a tool for generating a stripped version of crate source code. The interface is like a C header, where all function bodies are omitted. For example, initial crate:
generated interface:
The interface generator was implemented as part of the pretty-printer. Ideally interface should only contain exportable items, but here is the first problem:
So, the interface generator was implemented in AST. This has led to the fact that non-exportable items cannot be filtered out, but I don't think this is a major issue at the moment.
To emit an interface use a new
sdylib
crate type which is basically the same asdylib
, but it doesn't contain metadata, and also produces the interface as a second artifact. The current interface name islib{crate_name}.rs
.Why was it decided to use a design with an auto-generated interface?
One of the main objectives of this proposal is to allow building the library and the application with different compiler versions. This requires either a metadata format compatible across rustc versions or some form of a source code. The option with a stable metadata format has not been investigated in detail, but it is not part of RFC either. Here is the the related discussion: rust-lang/rfcs#3435 (comment)
Original proposal suggests using the source code for the dynamic library and all its dependencies. Metadata is obtained from
cargo check
. I decided to use interface files since it is more or less compatible with the original proposal, but also allows users to hide the source code.Regarding the design with interfaces
in Rust, files generally do not have a special meaning, unlike C++. A translation unit i.e. a crate is not a single file, it consists of modules. Modules, in turn, can be declared either in one file or divided into several. That's why the "interface file" isn't a very coherent concept in Rust. I would like to avoid adding an additional level of complexity for users until it is proven necessary. Therefore, the initial plan was to make the interfaces completely invisible to users i. e. make them auto-generated. I also planned to put them in the dylib, but this has not been done yet. (since the PR is already big enough, I decided to postpone it)
There is one concern, though, which has not yet been investigated(#134767 (comment)):
component 2: crate loader
When building dynamic dependencies, the crate loader searches for the interface in the file system, builds the interface without codegen and loads it's metadata. Routing rules for interface files are almost the same as for
rlibs
anddylibs
. Firstly, the compiler checksextern
options and then tries to deduce the path himself.Here are the code and commands that corresponds to the compilation process:
P.S. The interface name/format and rules for file system routing can be changed further.
component 3: exportable items collector
Query for collecting exportable items. Which items are exportable is defined here .
component 4: "stable" mangling scheme
The mangling scheme proposed in the RFC consists of two parts: a mangled item path and a hash of the signature.
mangled item path
For the first part of the symbol it has been decided to reuse the
v0
mangling scheme as it much less dependent on compiler internals compared to thelegacy
scheme.The exception is disambiguators (https://doc.rust-lang.org/rustc/symbol-mangling/v0.html#disambiguator):
For example, during symbol mangling rustc uses a special index to distinguish between two impls of the same type in the same module(See
DisambiguatedDefPathData
). The calculation of this index may depend on private items, but private items should not affect the ABI. Example:In order to make disambiguation independent of the compiler version we can assign an id to each impl according to their relative order in the source code.
The second example is
StableCrateId
which is used to disambiguate different crates.StableCrateId
consists of crate name,-Cmetadata
arguments and compiler version. At the moment, I have decided to keep only the crate name, but a more consistent approach to crate disambiguation could be added in the future.Actually, there are more cases where such disambiguation can be used. For instance, when mangling internal rustc symbols, but it also hasn't been investigated in detail yet.
hash of the signature
Exportable functions from stable dylibs can be called from safe code. In order to provide type safety, 128 bit hash with relevant type information is appended to the symbol (description from RFC). For now, it includes:
#[export(unsafe_stable_abi = "hash")]
syntax for ADT types with private fields is not yet implemented.Type safety is a subtle thing here. I used the approach from RFC, but there is the ongoing research project about it. https://rust-lang.github.io/rust-project-goals/2025h1/safe-linking.html
Unresolved questions
Interfaces:
Mangling scheme:
We also need a semi-stable API to represent types. For example, the order of fields in the
VariantDef
must be stable. Or a semi-stable representation for AST, which ensures that the order of the items in the code is preserved.There are some others, mentioned in the proposal.