Enable C++ only text generation #3260
base: main
Conversation
Pull Request Overview
This PR enables C++-only text generation endpoints by conditionally disabling Python-dependent code and replacing references to the old text processor with a new template processor. Key changes include:
- Adding preprocessor guards (#if (PYTHON_DISABLE == 0)) around Python interpreter initialization and related includes in tests and production code.
- Replacing references and include paths for the text processor with those for text utilities and the new PyJinjaTemplateProcessor.
- Introducing debug output (via std::cout) in the ContinuousBatchingServableInitializer code.
Reviewed Changes
Copilot reviewed 22 out of 24 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/test/llm/llmnode_test.cpp | Wrapped Python initialization and updated include for text processing utilities. |
| src/mediapipe_internal/mediapipegraphexecutor.hpp | Moved LLM session side packet setting outside the Python guard. |
| src/mediapipe_internal/mediapipegraphdefinition.cpp | Adjusted Python resource guard closing for node initialization. |
| src/llm/visual_language_model/legacy/servable_initializer.cpp | Replaced loadTextProcessor with loadTemplateProcessor behind a PYTHON_DISABLE guard. |
| src/llm/text_utils.hpp | Updated copyright year and removed Python embedding includes. |
| src/llm/servable_initializer.cpp | Renamed loadTextProcessor to loadTemplateProcessor and added a new parsing function for max model length. |
| src/llm/language_model/continuous_batching/servable_initializer.cpp | Added extensive debug logging using std::cout. |
| Other files | Updated include references and conditional Python blocks to support C++-only text generation. |
Files not reviewed (2)
- src/BUILD: Language not supported
- src/llm/BUILD: Language not supported
```cpp
std::cout << "Parsed models path: " << parsedModelsPath << std::endl;
if (!status.ok()) {
    return status;
}
auto properties = std::static_pointer_cast<ContinuousBatchingServableProperties>(servable->getProperties());
std::cout << "Properties casted successfully." << std::endl;

properties->modelsPath = parsedModelsPath;
std::cout << "Models path set to: " << properties->modelsPath << std::endl;

properties->schedulerConfig.max_num_batched_tokens = nodeOptions.max_num_batched_tokens();
std::cout << "Scheduler max_num_batched_tokens set to: " << properties->schedulerConfig.max_num_batched_tokens << std::endl;

properties->schedulerConfig.cache_size = nodeOptions.cache_size();
std::cout << "Scheduler cache_size set to: " << properties->schedulerConfig.cache_size << std::endl;

properties->schedulerConfig.dynamic_split_fuse = nodeOptions.dynamic_split_fuse();
std::cout << "Scheduler dynamic_split_fuse set to: " << properties->schedulerConfig.dynamic_split_fuse << std::endl;

properties->schedulerConfig.max_num_seqs = nodeOptions.max_num_seqs();
std::cout << "Scheduler max_num_seqs set to: " << properties->schedulerConfig.max_num_seqs << std::endl;
```
Debug output using std::cout appears in production code; consider replacing these statements with the project's logging framework (e.g., spdlog) for consistency and better control over log levels.
Suggested change (replace the `std::cout` statements with spdlog calls):

```cpp
// Before
std::cout << "Parsed models path: " << parsedModelsPath << std::endl;
if (!status.ok()) {
    return status;
}
auto properties = std::static_pointer_cast<ContinuousBatchingServableProperties>(servable->getProperties());
std::cout << "Properties casted successfully." << std::endl;
properties->modelsPath = parsedModelsPath;
std::cout << "Models path set to: " << properties->modelsPath << std::endl;
properties->schedulerConfig.max_num_batched_tokens = nodeOptions.max_num_batched_tokens();
std::cout << "Scheduler max_num_batched_tokens set to: " << properties->schedulerConfig.max_num_batched_tokens << std::endl;
properties->schedulerConfig.cache_size = nodeOptions.cache_size();
std::cout << "Scheduler cache_size set to: " << properties->schedulerConfig.cache_size << std::endl;
properties->schedulerConfig.dynamic_split_fuse = nodeOptions.dynamic_split_fuse();
std::cout << "Scheduler dynamic_split_fuse set to: " << properties->schedulerConfig.dynamic_split_fuse << std::endl;
properties->schedulerConfig.max_num_seqs = nodeOptions.max_num_seqs();
std::cout << "Scheduler max_num_seqs set to: " << properties->schedulerConfig.max_num_seqs << std::endl;
```

```cpp
// After
spdlog::info("Parsed models path: {}", parsedModelsPath);
if (!status.ok()) {
    return status;
}
auto properties = std::static_pointer_cast<ContinuousBatchingServableProperties>(servable->getProperties());
spdlog::info("Properties casted successfully.");
properties->modelsPath = parsedModelsPath;
spdlog::info("Models path set to: {}", properties->modelsPath);
properties->schedulerConfig.max_num_batched_tokens = nodeOptions.max_num_batched_tokens();
spdlog::info("Scheduler max_num_batched_tokens set to: {}", properties->schedulerConfig.max_num_batched_tokens);
properties->schedulerConfig.cache_size = nodeOptions.cache_size();
spdlog::info("Scheduler cache_size set to: {}", properties->schedulerConfig.cache_size);
properties->schedulerConfig.dynamic_split_fuse = nodeOptions.dynamic_split_fuse();
spdlog::info("Scheduler dynamic_split_fuse set to: {}", properties->schedulerConfig.dynamic_split_fuse);
properties->schedulerConfig.max_num_seqs = nodeOptions.max_num_seqs();
spdlog::info("Scheduler max_num_seqs set to: {}", properties->schedulerConfig.max_num_seqs);
```
🛠 Summary
Enabling text generation endpoints in OVMS with Python disabled.
🧪 Checklist