
Bump PyTorch to 2.7.0 #3455


Draft · wants to merge 8 commits into base: develop
Conversation

@AlexanderDokuchaev (Collaborator) commented Apr 23, 2025

Changes

Update PyTorch to 2.7.0
Regenerate reference data for test_generate_text_data_functional; the output of hf-internal-testing/tiny-random-gpt2 has changed

Tests

manual/job/post_training_quantization/661/ - FX fails on saving compressed models
nightly/job/TriggerBetta/1029/ -
examples:

  • llm_compression_qat_with_lora - assert 0.034 == 0.027 ± 2.0e-03
  • llm_compression_synthetic - AssertionError: metric word_count: 81 != 83

wc - pass
Test_Install - pass

@ljaljushkin (Contributor) commented:

It appears that F.linear() is the source of the non-determinism.

When the model is on CUDA, the code below

model = AutoModelForCausalLM.from_pretrained(BASE_TEST_MODEL_ID, device_map="cuda")
torch.use_deterministic_algorithms(True)

leads to an error:

../../env/nncf-py/lib/python3.10/site-packages/torch/nn/modules/linear.py:125: in forward
return F.linear(input, self.weight, self.bias)

RuntimeError: Deterministic behavior was enabled with either torch.use_deterministic_algorithms(True) or at::Context::setDeterministicAlgorithms(true), but this operation is not deterministic because it uses CuBLAS and you have CUDA >= 10.2. To enable deterministic behavior in this case, you must set an environment variable before running your PyTorch application: CUBLAS_WORKSPACE_CONFIG=:4096:8 or CUBLAS_WORKSPACE_CONFIG=:16:8. For more information, go to https://docs.nvidia.com/cuda/cublas/index.html#results-reproducibility

Then, with export CUBLAS_WORKSPACE_CONFIG=:4096:8, the results are the same on torch 2.6.0 and 2.7.0.
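
For reference, a minimal sketch of the deterministic setup (assumptions: the environment variable has to be set before the first cuBLAS call, and hf-internal-testing/tiny-random-gpt2 is used here as a stand-in for BASE_TEST_MODEL_ID):

import os

# CUBLAS_WORKSPACE_CONFIG must be set before the first cuBLAS call,
# so set it before the model is created (or export it in the shell).
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

import torch
from transformers import AutoModelForCausalLM

# Make cuBLAS-backed ops such as F.linear() reproducible across runs.
torch.use_deterministic_algorithms(True)

# Stand-in for BASE_TEST_MODEL_ID from the test above.
model = AutoModelForCausalLM.from_pretrained(
    "hf-internal-testing/tiny-random-gpt2", device_map="cuda"
)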

The same thing may happen with the qat-lora sample on CPU.
When I switched to GPU, the wwb references were different:
[screenshot: differing wwb reference metrics]
use_deterministic_algorithms + CUBLAS_WORKSPACE_CONFIG aligns them.

I propose updating the references, since running with use_deterministic_algorithms would make the test slower, and the results were the same.
