Question: Can changes be presented multiple times or missed? #124
@bartelink We ran into an issue where Cosmos had some problems and kept redelivering the same documents, so the change feed processor does appear to redeliver documents if the checkpoint write fails or something similar happens. Our solution is to keep a running list of the document ids processed in the last 5 minutes and to ignore any document that is already in that list.
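A minimal sketch of the "running list of ids from the last 5 minutes" workaround described above, written in Python purely for illustration. `RecentIdFilter` and `is_duplicate` are hypothetical names, not part of the change feed processor library, and the injectable `clock` parameter is an assumption added so the window can be tested.

```python
import time
from collections import OrderedDict


class RecentIdFilter:
    """Remembers document ids seen within a sliding time window (hypothetical helper)."""

    def __init__(self, window_seconds=300.0, clock=time.monotonic):
        self._window = window_seconds
        self._clock = clock
        # doc id -> time it was first recorded; insertion order is oldest first
        self._seen = OrderedDict()

    def _evict_expired(self, now):
        # Drop entries from the front until the oldest one is inside the window.
        while self._seen:
            _, seen_at = next(iter(self._seen.items()))
            if now - seen_at < self._window:
                break
            self._seen.popitem(last=False)

    def is_duplicate(self, doc_id):
        """Return True if doc_id was recorded within the window; otherwise record it."""
        now = self._clock()
        self._evict_expired(now)
        if doc_id in self._seen:
            return True
        self._seen[doc_id] = now
        return False
```

Note that keying on the id alone also suppresses a genuine second update to the same document inside the window. If updates must be kept, one option is to key on the id together with a version marker such as the document's `_etag`.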
Thanks @jiffypopjr I can imagine implementing that as a workaround (and emitting a warning from my projector when that's tripped). Wrt my overall reason for asking this question though
Technically the idea is to provide an at-least-once delivery. @mkolt can probably give more details. There are multiple reasons for a change to be sent twice:
Scenarios I'm aware of that might lead to missing items:
Scenarios that seem like you are missing items
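To make the at-least-once point concrete, here is a toy simulation (not the library's code) of a processor that checkpoints only after a batch has been handled. If the checkpoint write is lost, the next run resumes from the old checkpoint and the batch is delivered again. `run_processor`, the `checkpoint` dict and `fail_checkpoint_after` are all hypothetical names invented for this sketch.

```python
def run_processor(feed, checkpoint, handle, fail_checkpoint_after=None):
    """Process batches starting at the stored checkpoint.

    feed: list of batches, each a list of document ids.
    checkpoint: dict holding "next_batch", the index of the first unprocessed batch.
    handle: callback invoked once per delivered document.
    fail_checkpoint_after: index of a batch whose checkpoint write is lost
        (simulates a crash between handling and checkpointing).
    Returns True if the whole feed was processed, False if the run was cut short.
    """
    i = checkpoint["next_batch"]
    while i < len(feed):
        for doc_id in feed[i]:
            handle(doc_id)
        if fail_checkpoint_after == i:
            # Batch was handled, but the checkpoint never got persisted.
            return False
        checkpoint["next_batch"] = i + 1
        i += 1
    return True


feed = [["a", "b"], ["c", "d"], ["e"]]
checkpoint = {"next_batch": 0}
delivered = []

# First run: the checkpoint write for batch 1 is lost.
run_processor(feed, checkpoint, delivered.append, fail_checkpoint_after=1)
# Second run resumes from the last persisted checkpoint.
run_processor(feed, checkpoint, delivered.append)

print(delivered)  # ['a', 'b', 'c', 'd', 'c', 'd', 'e']
```

Nothing is skipped, but `c` and `d` arrive twice, which is why the handler has to be idempotent or deduplicate on its own.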
Thanks, that's exactly the sort of response I was after.
this seems a reasonable default. I wasn't able to discern that this is the case from navigating IntelliSense and reading the docs. Did I miss it, or should it be added somewhere? Ideally something like what you said would make it into readme.md or somewhere else prominent.
My outstanding question (which reading the code will doubtless answer, but that should not be my only way to find out) is whether interesting, dangerous, or simple things happen when 2 partitions become 3 and vice versa. What's the algorithm, and does it intrinsically risk duplicating items, or is there an obvious simple answer as to why that's already catered for?
I'm happy to raise that as a separate ticket if you feel that makes sense, and let this issue focus on the reasons for >1 delivery and their interaction with how one checkpoints.
@ealsur Regarding the Medium post: I like it and feel it needs to be linked to from the repo (I'll do the PR if you won't!). There's plenty of prior art for doing such a thing in other repos, and your writing these is very helpful for people considering whether CFP is a fit for their needs. I guess I eventually might have bingled
Thanks @bartelink! I think what you mention would be a great addition to the current README. I'll see if I can make some time and write it down 😄 (BTW, thanks for the corrections in the post! My English is still a work in progress, hehe)
When we write to Cosmos via the Table API and listen using the ChangeFeedProcessorLibrary to increment a counter, we consistently see missed changes. We have two applications, one creating a record and another updating it in quick succession. The number of changes we miss goes down if less happens in ProcessChangesAsync and if we decrease the polling interval, but it does not go away. This is on the scale of 5000 changes.
@jkdey I'm asking a very different question; what you are describing is by design. The bottom line is that the changefeed always contains exactly one instance of any given document. You'll see a document iff it has been updated in a way that moves it "beyond your cursor": the update changes its position in the modification order, so you 'see it again' if you happen to have traversed it already. Remember, there is nothing anywhere storing a copy of every change; you're just querying all the documents in order of when they were last touched.
Hi @bartelink, to clarify the situation, we appear to be consistently missing the last update to a record. Is this consistent with expected behavior?
That would be surprising and concerning. We make considerable use of it and have not run into such issues. The bottom line is that it's just a query along the lines of WHERE LSN >= x, and each doc update bears an LSN value which moves up. A repro would definitely be required, but I'd tend to assume SELECT is not broken in this instance.
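A toy in-memory model of the behaviour described in the last few comments, assuming each write stamps the document with a new, increasing LSN and a read returns the current version of every document with LSN >= x. This is not the Cosmos DB API; `ToyContainer`, `upsert` and `read_changes` are hypothetical names used only for this sketch.

```python
class ToyContainer:
    """Keeps only the latest version of each document, stamped with an LSN."""

    def __init__(self):
        self._lsn = 0
        self._docs = {}  # doc id -> (lsn of last write, body)

    def upsert(self, doc_id, body):
        # Every write gets the next LSN and replaces any earlier version.
        self._lsn += 1
        self._docs[doc_id] = (self._lsn, body)
        return self._lsn

    def read_changes(self, min_lsn):
        """Return (lsn, id, body) for every document with lsn >= min_lsn, in LSN order."""
        changed = [
            (lsn, doc_id, body)
            for doc_id, (lsn, body) in self._docs.items()
            if lsn >= min_lsn
        ]
        return sorted(changed, key=lambda item: item[0])


container = ToyContainer()
container.upsert("order-1", "created")   # LSN 1
container.upsert("order-1", "updated")   # LSN 2, replaces the version at LSN 1
container.upsert("order-2", "created")   # LSN 3

changes = container.read_changes(1)
print(changes)  # [(2, 'order-1', 'updated'), (3, 'order-2', 'created')]
```

Three writes yield two items because the create and the update of `order-1` collapse into its latest version. In this model the intermediate state can be skipped, but the last update to a document is always visible to a reader whose cursor is behind it.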
@jkdey If you start a new subscription with a new leaseid, do you see everything you expect? Have you tried doing a basic validation to sanity check it outside the context of your larger system?
Thank you for the suggestions, @bartelink. I will scrutinize what happens when a lease is started and/or try to figure out how to leverage that validator code, and report back.
From the documentation, I was unable to discern the answer to the following question - I hope I'm wrong, but I feel I've spent reasonable time trying to answer the question myself. Asking here as it might form a good documentation request and/or a place to put a canonical answer...
As one scales up and down to multiple partitions, what are the guarantees provided by the ChangeFeedProcessor as a whole wrt the following:
(I'm thinking reading the integration tests and/or source will give me hints, and I'll do some testing, but it would really help me a lot if someone would be so kind as to give me an answer in advance of me doing the legwork!)