Replies: 4 comments 8 replies
-
I was wrong, 'fast line' defers to 'rectangular' if the line goes along an image axis. 'rectangular' and 'fast line' use an O(1) algorithm along each axis. This algorithm requires two buffers of the same size as the image line. Having a very long image line like your case increases the cost. The break point for switching from brute force to this algorithm is much lower for "normal" image sizes. I don't know how to pick a value that is always right. If you set The 'discrete line' algorithm is the same as used for 'elliptic' and custom shapes. This algorithm splits the SE into runs along the axis where you get fewest runs (a run is a contiguous set of pixels). For your use of this method you get one run. It then keeps a number of steps for the max value within each run, each time the SE is shifted right, the number of steps is reduced by one. When this value reaches zero, the max value has exited the SE, and the new max for the run is found by brute force. Also, for every step, the new value that comes in is examined to see if it will be the new max (which is the case also if it's equal in value to the current max). This is thus very similar to your method, but without keeping a buffer of the values. I don't know why your implementation is so much faster though. Maybe it's just that the DIPlib code works for any number of runs, not just one? Wouldn't mind comparing. |
Beta Was this translation helpful? Give feedback.
-
That would be nice addition. Following the path of least surprise. As a bonus The literature on structure element decomposition is quite rich so I picked something that I could implement quickly.
I guess that you've seen it. In summary: it performs a line decomposition and has some tricks to reduce the memory usage low. My implementation can only handle structure elements where the lines are centered and of odd size . Against MATLAB it is quite performant, especially given the few lines of code needed: In this case disk-0 is the "true" disk (`strel('disk', r, 0)') while the 4- and 8- are the standard approximations. It is a big mystery why the disk-4 becomes slower than disk-8 around r=35. If you implemented something from scratch for 2D + 3D, what would you pick? (I know that you always pick from the batch of ND methods). When I've added a parameter for stride to my 1d erosion/dilation filter I'll see how to make it fit with DIPlib, but don't count on it. The table is quite full until summer. |
Beta Was this translation helpful? Give feedback.
-
The memory requirement is sure a downside. And the overall situation with one method per class of structuring element is not ideal (and possibly x the number of hardware configurations as well). Too much code to maintain and test! If the strel is H x H and the image is N x N, then one dilation per unique line length in the strel will be stored for the H "hot" lines of the input image. Worst case extra storage will be H x H x N. In best case, like for a rectangle the additional memory usage is down to H x N. There is always the option to only store a logarithmic number of dilations, 2^1+1, 2^2+1, 2^4+1 ... and then calculate the intermediate dilations by a single comparison per pixel "on the fly" so there is a possible memory to performance tradeoff to be made. Two comparisons vs DIPlib. Input is a 2048 x 2048 float image. w is the side length of the structuring element. For squares my implementation requires H comparisons per pixel to assemble along the strided axis. So the method is no silver bullet. For spheres my implementation is a little faster which is nice since it can handle a larger class of structuring elements (any -- if one follow the paper closely). My implementation was not very faithful to the original so it is possible that I got a thing, or two, wrong -- I jumped directly to the implemention phase once I (though) that I got the idea. Thanks for all the advice! It is great fun to get some first hand experience with these things that I've been using for such a long time now :) |
Beta Was this translation helpful? Give feedback.
-
Hello!
I lazily tested the 1D dilation performance using convex binary structuring elements, like [1] (w=1), [1,1,1] (w=3), etc. and though that I should what I found. Included was
implementation details
In MATLAB I timed with:DIPlib was used like this:
Results
Observations
The imdilate behavior of increasing performance with larger structuring elements was a surprise :) Now that my code also has that characteristics I can see why.
It looks like w=3 is faster than w=1, most likely it is a benchmarking artefact (should have done it properly from the beginning).
For small sizes, especially w=3 I would guess that brute force is used. The results indicates that both MATLAB/imdilate and DIPlib/"discrete line" can benefit from delaying the switch from brute force to something more clever. My implementation found the sweet spot around 19 (most likely machine dependent).
My current implementation uses a circular buffer with a pointer to the max element which I update when a larger value is inserted or when the current maxima goes out of scope (triggers w comparisons). The average performance is good but the worst case (monotonically ascending input values) degrades the performance to O(n*w). I guess that I can find the theoretical performance estimates of the methods in DIPlib if I follow the references.
Cheers,
Erik
Beta Was this translation helpful? Give feedback.
All reactions