Skip to content

Bluetooth: CSIP: Fix ntf issue to clients on reboot #88402

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

niym-ot
Copy link
Collaborator

@niym-ot niym-ot commented Apr 10, 2025

On reboot, client list to notify is not updated properly. Fix is to check and add the reconnected clients on security changed cb.

Subscription check is added before notify to clients. BT Enable check is added in the register function before adding bonded devices to client list.

Also typo is corrected in add_bonded_addr_to_client_list in the second loop.

Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 1 out of 1 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (1)

subsys/bluetooth/audio/csip_set_member.c:843

  • The loop increment originally used 'i' instead of 'j', which could lead to incorrect iteration behavior. The fix correctly updates the loop to use 'j++' starting at index 0.
for (size_t j = 1U; j < ARRAY_SIZE(svc_inst->clients); i++) {

@niym-ot niym-ot force-pushed the Fix_ntf_issue_in_csip_set_member branch from e31ddd6 to 598b0f9 Compare April 10, 2025 08:38
@Thalley Thalley requested a review from Copilot April 10, 2025 08:40
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 1 out of 1 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (1)

subsys/bluetooth/audio/csip_set_member.c:848

  • [nitpick] Consider rephrasing the comment to improve clarity, for example: '/* Check if device is registered; if not, add it */'.
/* Check if device is registered, it not, add it */

@niym-ot niym-ot force-pushed the Fix_ntf_issue_in_csip_set_member branch 2 times, most recently from 5510eab to 52194a5 Compare April 16, 2025 06:52
@niym-ot niym-ot requested a review from Thalley April 16, 2025 07:45
@Thalley
Copy link
Collaborator

Thalley commented Apr 21, 2025

@niym-ot for the future, please, as per https://docs.zephyrproject.org/latest/contribute/contributor_expectations.html#workflow-suggestions-that-help-reviewers, allow reviewers to resolve non-trivial comments :)

@niym-ot niym-ot force-pushed the Fix_ntf_issue_in_csip_set_member branch from 52194a5 to 5afc95e Compare April 23, 2025 12:09
@Thalley Thalley requested a review from Copilot April 23, 2025 12:40
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR addresses a bug in the CSIP client notification mechanism on reboot by ensuring that reconnected clients are properly added and notified. Key changes include:

  • Inclusion of settings-related functionality with a new commit handler to restore the client list from bonded devices.
  • Introduction of a global mutex (svc_inst_lock) to protect concurrent operations, particularly in notify and unregister functions.
  • Correction of a typo in the iteration variable in add_bonded_addr_to_client_list.

@niym-ot niym-ot force-pushed the Fix_ntf_issue_in_csip_set_member branch 2 times, most recently from d2a37a6 to 5a802a3 Compare April 24, 2025 12:43
@niym-ot niym-ot requested a review from Thalley April 24, 2025 13:19
@niym-ot niym-ot force-pushed the Fix_ntf_issue_in_csip_set_member branch from 5a802a3 to c3e862d Compare April 24, 2025 15:30
@niym-ot niym-ot requested a review from Thalley April 24, 2025 16:07
Thalley
Thalley previously approved these changes Apr 24, 2025
Comment on lines +90 to +92
#if defined(CONFIG_BT_SETTINGS)
static int csip_settings_commit(void)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be outside the scope of this PR, but it looks like that if CONFIG_BT_BONDABLE=n, then there's some parts of our code that shouldn't be run (should be guarded by CONFIG_BT_BONDABLE), but I think we can do that in a follow-up PR, but something like this probably should since it's only dealing with bonds at the moment

@niym-ot niym-ot force-pushed the Fix_ntf_issue_in_csip_set_member branch from c3e862d to f4e9ead Compare April 25, 2025 06:40
@niym-ot niym-ot requested a review from Thalley April 25, 2025 06:58
Copy link
Collaborator

@Thalley Thalley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we would add some sort of test for the mutex behavior. I'm not convinced that the current implement properly guards the service instance members, but on the other hand it's better than what it was, and not the main purpose of this PR.

I suggest to fix the comment about memset, and then follow up with another PR to ensure correct guarding of all svc_inst access, as I think there are more cases where we may memset memory while other functions in other threads assume that it is valid.

Comment on lines 1031 to 1032

memset(svc_inst, 0, sizeof(*svc_inst));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The memset needs to be done while we hold the mutex, otherwise we can risk memsetting something that has just been initialized in bt_csip_set_member_register.

We shall, however, never memset the mutex. After memsetting the mutex then it will be broken, as the mutex becomes invalid and is not initialized again.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may be many approaches

  1. store the mutex locally in function , memset the instance and save the mutex back to instance struct and unlock
  2. init the mutex everytime you register (no need to save it )
  3. keep the mutex out of the instance structure

which one you suggest ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Do not memset the entire struct. If we keep the mutex as the start or end of the struct, we can apply offsetof when memsetting to avoid memsetting that like we do in https://github.com/zephyrproject-rtos/zephyr/blob/main/subsys/bluetooth/audio/bap_iso.c#L36-L54
  2. Alternatively we can reset instances more manually like https://github.com/zephyrproject-rtos/zephyr/blob/main/subsys/bluetooth/audio/bap_broadcast_assistant.c#L961-L1015

I think it makes the most sense to keep the mutex in the struct. I think option 2 can work nicely.

So I'd go for option 2, 4 or 5, where 2 is arguably semantically the best (and also the simplest)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

went with option 4

On reboot, client list to notify is not updated properly.
Fix is to check and add the reconnected clients on
security changed cb.

Subscription check is added before notify to clients.
BT Enable check is added in the register function before adding
bonded devices to client list.

Also typo is corrected in add_bonded_addr_to_client_list in the
second loop.

Signed-off-by: Nithin Ramesh Myliattil <[email protected]>
@niym-ot niym-ot force-pushed the Fix_ntf_issue_in_csip_set_member branch from f4e9ead to a6596e5 Compare April 25, 2025 10:26
Copy link
Collaborator

@Thalley Thalley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are fixing the issue that the PR sets out to.

Arguably all the mutex stuff should not have been part of this, as it requires a much larger change to function properly, and is only partially implemented by this PR.

I suggest to try to get this merged to fix the primary issue, and then we can work towards fixing the mutex stuff in a follow-up PR.

Comment on lines +942 to +945
err = k_mutex_lock(&inst->mutex, K_NO_WAIT);
if (err != 0) {
LOG_DBG("Failed to lock mutex: %d", err);
return -EBUSY;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually need to take the mutex in the register function?

The caller doesn't have a reference to the instance at this point, and there cannot be an activities on the instance at this point, so I don't see any way for multiple threads to access the instance at this point.

Copy link
Collaborator Author

@niym-ot niym-ot Apr 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

two threads can call register function and instance_cnt gets updated . can lead to undefined behaviour

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow up PR created
#89087

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed a bug in the way we use instance_cnt, so I've also created #89089 to fix that

@kartben kartben merged commit a94cdaf into zephyrproject-rtos:main Apr 25, 2025
27 checks passed
@github-project-automation github-project-automation bot moved this from In Review to Done in Bluetooth LE Audio Apr 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants