Enable C++ only text generation #3260


Draft — wants to merge 3 commits into main

Conversation

mzegla (Collaborator)

@mzegla mzegla commented Apr 25, 2025

🛠 Summary

Enables text generation endpoints in OVMS when Python support is disabled.

🧪 Checklist

  • Unit tests added.
  • Documentation updated.
  • Change follows security best practices.

@mzegla mzegla requested a review from Copilot April 25, 2025 12:23

@Copilot Copilot AI left a comment


Pull Request Overview

This PR enables C++-only text generation endpoints by conditionally disabling Python-dependent code and replacing references to the old text processor with a new template processor. Key changes include:

  • Adding preprocessor guards (#if (PYTHON_DISABLE == 0)) around Python interpreter initialization and related includes in tests and production code.
  • Replacing references and include paths for the text processor with those for text utilities and the new PyJinjaTemplateProcessor.
  • Introducing debug output (via std::cout) in the ContinuousBatchingServableInitializer code.

Reviewed Changes

Copilot reviewed 22 out of 24 changed files in this pull request and generated 1 comment.

Summary per file:
  • src/test/llm/llmnode_test.cpp — Wrapped Python initialization and updated include for text processing utilities.
  • src/mediapipe_internal/mediapipegraphexecutor.hpp — Moved LLM session side packet setting outside the Python guard.
  • src/mediapipe_internal/mediapipegraphdefinition.cpp — Adjusted Python resource guard closing for node initialization.
  • src/llm/visual_language_model/legacy/servable_initializer.cpp — Replaced loadTextProcessor with loadTemplateProcessor behind a PYTHON_DISABLE guard.
  • src/llm/text_utils.hpp — Updated copyright year and removed Python embedding includes.
  • src/llm/servable_initializer.cpp — Renamed loadTextProcessor to loadTemplateProcessor and added a new parsing function for the maximum model length.
  • src/llm/language_model/continuous_batching/servable_initializer.cpp — Added extensive debug logging using std::cout.
  • Other files — Updated include references and conditional Python blocks to support C++-only text generation.
Files not reviewed (2)
  • src/BUILD: Language not supported
  • src/llm/BUILD: Language not supported
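The new max model length parsing mentioned for src/llm/servable_initializer.cpp is not shown in this diff. A hypothetical sketch of what such a function might look like follows; the field name `max_position_embeddings`, the regex-based approach, and the `parseMaxModelLength` signature are all assumptions for illustration, not the PR's implementation (which would more likely use a proper JSON parser against the model's config.json):

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper: extract an integer field such as
// "max_position_embeddings" from raw config.json text.
std::optional<size_t> parseMaxModelLength(
    const std::string& configJson,
    const std::string& field = "max_position_embeddings") {
    // Match e.g.  "max_position_embeddings": 4096
    std::regex re("\"" + field + "\"\\s*:\\s*(\\d+)");
    std::smatch m;
    if (std::regex_search(configJson, m, re)) {
        return std::stoull(m[1].str());
    }
    return std::nullopt;  // field absent: caller falls back to a default
}
```

Returning `std::optional` lets the caller distinguish "field missing" from any valid length, so a default can be applied in exactly one place.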

Comment on lines 133 to 153
std::cout << "Parsed models path: " << parsedModelsPath << std::endl;
if (!status.ok()) {
    return status;
}
auto properties = std::static_pointer_cast<ContinuousBatchingServableProperties>(servable->getProperties());
std::cout << "Properties casted successfully." << std::endl;

properties->modelsPath = parsedModelsPath;
std::cout << "Models path set to: " << properties->modelsPath << std::endl;

properties->schedulerConfig.max_num_batched_tokens = nodeOptions.max_num_batched_tokens();
std::cout << "Scheduler max_num_batched_tokens set to: " << properties->schedulerConfig.max_num_batched_tokens << std::endl;

properties->schedulerConfig.cache_size = nodeOptions.cache_size();
std::cout << "Scheduler cache_size set to: " << properties->schedulerConfig.cache_size << std::endl;

properties->schedulerConfig.dynamic_split_fuse = nodeOptions.dynamic_split_fuse();
std::cout << "Scheduler dynamic_split_fuse set to: " << properties->schedulerConfig.dynamic_split_fuse << std::endl;

properties->schedulerConfig.max_num_seqs = nodeOptions.max_num_seqs();
std::cout << "Scheduler max_num_seqs set to: " << properties->schedulerConfig.max_num_seqs << std::endl;

Copilot AI Apr 25, 2025


Debug output using std::cout appears in production code; consider replacing these statements with the project's logging framework (e.g., SPDLOG) for consistency and better control.

Suggested change (the std::cout statements above replaced with SPDLOG):

spdlog::info("Parsed models path: {}", parsedModelsPath);
if (!status.ok()) {
    return status;
}
auto properties = std::static_pointer_cast<ContinuousBatchingServableProperties>(servable->getProperties());
spdlog::info("Properties casted successfully.");

properties->modelsPath = parsedModelsPath;
spdlog::info("Models path set to: {}", properties->modelsPath);

properties->schedulerConfig.max_num_batched_tokens = nodeOptions.max_num_batched_tokens();
spdlog::info("Scheduler max_num_batched_tokens set to: {}", properties->schedulerConfig.max_num_batched_tokens);

properties->schedulerConfig.cache_size = nodeOptions.cache_size();
spdlog::info("Scheduler cache_size set to: {}", properties->schedulerConfig.cache_size);

properties->schedulerConfig.dynamic_split_fuse = nodeOptions.dynamic_split_fuse();
spdlog::info("Scheduler dynamic_split_fuse set to: {}", properties->schedulerConfig.dynamic_split_fuse);

properties->schedulerConfig.max_num_seqs = nodeOptions.max_num_seqs();
spdlog::info("Scheduler max_num_seqs set to: {}", properties->schedulerConfig.max_num_seqs);

