Were there any major changes after Milvus v2.4.15 that could have affected performance? #41169
-
I've tested several versions of Milvus with the same index options and approximately the same amount of data (although not exactly the same dataset). Every version after v2.4.15 consistently shows significantly higher search latency than v2.4.15 under the same indexing conditions. Could you help me identify the cause of this performance regression?
Replies: 4 comments 1 reply
-
Let me verify this locally; I will let you know later.
-
I tested with this script, and the performance does not seem to differ much between v2.4.15 and v2.4.17.
-
@yhmo configuration:

    dataCoord:
      segment:
        maxSize: 4096
    queryNode:
      mmap:
        mmapEnabled: false

Here is the segment info. (All versions had almost the same document ratio.)

As for my test code, I didn't use PyMilvus because I had some issues using gRPC with Locust.

    import logging

    import msgspec
    import numpy as np
    from locust import FastHttpUser, task

    # `dimension`, `table`, `consistency_level`, and the msgspec `Response`
    # type are defined elsewhere in the test script (not shown here).

    class SearchTest(FastHttpUser):
        @task
        def search_test(self):
            # One random query vector per request, with an explicit HNSW "ef".
            data = {
                "data": [np.random.uniform(low=-128, high=127, size=dimension).tolist()],
                "annsField": "vector",
                "limit": 1,
                "collectionName": table,
                "searchParams": {
                    "params": {"ef": 500}
                },
                "consistencyLevel": consistency_level,
                "outputFields": ["pkey"]
            }
            with self.client.post(url="/v2/vectordb/entities/search", json=data, catch_response=True) as response:
                # A non-zero code in the response body means the search failed.
                resp = msgspec.json.decode(response.content, type=Response)
                if resp.code != 0:
                    logging.error(resp.message)
                    response.failure(resp.code)
-
Two million vectors (dim=384), HNSW index (m=48, efConstruction=800, efSearch=500), and limit=1.
On v2.4.15, the average RPC search latency is 13 ms, and the average RESTful search latency is 4 ms.
On v2.4.17, the average RPC search latency is 13 ms, and the average RESTful search latency is 13 ms.
RPC search performance is the same on v2.4.15 and v2.4.17, which indicates that the RPC latency reflects the actual performance. RESTful search latency on v2.4.15 is much lower than on v2.4.17. The reason is that on v2.4.15 the RESTful search interface does not pass "searchParams" through correctly, so the search engine falls back to a default "ef" value. That default is much smaller than 500, which makes the search much faster than on v2.4.17.
So on v2.4.15, although you specified "ef=500", the parameter never reached the search engine, and the latency you saw is not the actual latency for ef=500. The latency on v2.4.17 is the actual latency for ef=500.
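If you want to check on your own deployment whether "ef" actually reaches the search engine, one rough approach is to send the same RESTful query with a small and a large "ef" and compare the average latencies. The sketch below is only an illustration: the helper names, the `min_ratio` threshold, and the base URL are hypothetical, and it assumes the endpoint needs no authentication (add an Authorization header if yours does).

```python
import json
import statistics
import time
import urllib.request


def build_search_payload(collection, vector, ef, limit=1):
    """Build a RESTful v2 search body with an explicit HNSW "ef" value."""
    return {
        "collectionName": collection,
        "data": [vector],
        "annsField": "vector",
        "limit": limit,
        "searchParams": {"params": {"ef": ef}},
    }


def mean_latency_ms(base_url, payload, runs=50):
    """Send the same search `runs` times and return the mean latency in ms."""
    body = json.dumps(payload).encode("utf-8")
    samples = []
    for _ in range(runs):
        req = urllib.request.Request(
            base_url + "/v2/vectordb/entities/search",
            data=body,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        start = time.perf_counter()
        with urllib.request.urlopen(req) as resp:
            resp.read()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples)


def ef_looks_ignored(latency_low_ef_ms, latency_high_ef_ms, min_ratio=1.5):
    """Heuristic (hypothetical threshold): a much larger "ef" should cost
    noticeably more time; if it does not, the parameter may be dropped."""
    return latency_high_ef_ms < latency_low_ef_ms * min_ratio
```

For example, you could call `mean_latency_ms` once with `build_search_payload(table, vec, ef=16)` and once with `ef=500`, then pass both results to `ef_looks_ignored`. If the explanation above holds, v2.4.15 should show nearly identical latencies for both values over RESTful search, while v2.4.17 should show a clear gap.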
Two million vectors (dim=384), HNSW index (m=48, efConstruction=800, efSearch=500), and limit=1.
The test script: