Add CLI tool for inspecting and creating a dataframe of the results on a given leaderboard #2174
Hi Kenneth, it looks like it is not supported yet. I got the following error message:
|
Hi @lifu-tu this is a suggestion for a feature, not an implemented feature - I have made the main comment a bit more clear, but it sounds like it is something that you would like as well :) |
@KennethEnevoldsen @Samoed, I have thought of the following solutions here: |
Yes. I also suggest supporting multiple output formats (xlsx, csv, md, etc.). Benchmarks should also be optional, and a local path should be usable for building the table. |
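A minimal sketch of how the multiple output formats suggested above could be handled, assuming the results are already in a pandas DataFrame. The helper name `save_table` and the choice to dispatch on the file suffix are illustrative assumptions, not part of any existing CLI.

```python
from pathlib import Path

import pandas as pd


def save_table(df: pd.DataFrame, path: str) -> Path:
    """Write df to path, picking the writer from the file suffix (hypothetical helper)."""
    out = Path(path)
    suffix = out.suffix.lower()
    if suffix == ".csv":
        df.to_csv(out)
    elif suffix == ".md":
        # DataFrame.to_markdown needs the optional `tabulate` package.
        out.write_text(df.to_markdown(), encoding="utf-8")
    elif suffix == ".xlsx":
        # DataFrame.to_excel needs an Excel engine such as `openpyxl`.
        df.to_excel(out)
    else:
        raise ValueError(f"Unsupported output format: {suffix!r}")
    return out
```

Dispatching on the suffix keeps the interface to a single output path; an explicit format option would work just as well.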
@Samoed, if benchmarks are optional, which results will we consider? If a benchmark is specified, aren't the tasks fixed, since we consider only the tasks under that benchmark and take the results of all models on those tasks only? |
If a benchmark is specified, then yes; otherwise all results can be taken from the results dir. |
So we will consider the results of all tasks for that particular model, right? |
Yes
I think like
But I think that we can also set an aggregation level for split and subset. |
@Samoed What exactly is a subset here? And can you clarify what exactly you mean by aggregation level for split and subset? Also, won't the split depend on the task and not the model, so it will be different for each task? |
Tasks have subsets
Yes
I think we can have parameter
|
So, the split field would be different for each task, so we should not have that split column, right? |
No, we should have the split column if selected. |
I am still not able to understand it clearly.
For the subset aggregation level, we will aggregate over the different splits (like dev, test), so we will have a table like:
For the task aggregation level, we will aggregate over the different splits (like dev, test), so we will have a table like:
|
Yes, like this! |
@Samoed, I have a few more doubts:
|
|
Can you show me what the table would look like for |
I see the problem now. We can create a multi-index (but that is a bit difficult to work with) or transpose it.
For subset aggregation level
For task aggregation level
Example of transpose table
@KennethEnevoldsen What do you think? |
If this is the hierarchy, then for the transposed tables, shouldn't the 1st table have the split column first and then the subset column, and the 2nd table have the split column only? |
It's not directly a transpose, just a change of axis. We have the hierarchy task->subset->split and the table should follow it:
import pandas as pd
import numpy as np
arrays = [
["task 1"] * 4,
["subset 1"] * 2 + ["subset 2"] * 2,
["test", "dev"] * 2,
]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=["task", "subset", "split"])
df = pd.DataFrame(np.random.randn(4), index=index, columns=["model 1"])
|
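As a rough illustration of the "changing axis" idea, the multi-index frame from the snippet above can either be flattened into ordinary columns or pivoted so that one level becomes columns. This is only a sketch with made-up scores; `reset_index` and `unstack` are standard pandas calls, but which layout the tool should emit is still open in this discussion.

```python
import pandas as pd

# Same task -> subset -> split layout as the snippet above, with made-up scores.
index = pd.MultiIndex.from_product(
    [["task 1"], ["subset 1", "subset 2"], ["test", "dev"]],
    names=["task", "subset", "split"],
)
df = pd.DataFrame({"model 1": [0.1, 0.2, 0.3, 0.4]}, index=index)

# Option 1: flatten the multi-index into plain columns (easy to export).
flat = df.reset_index()

# Option 2: move the "split" level to the columns, one row per (task, subset).
wide = df["model 1"].unstack("split")
```

The flat form is usually the easiest to write to csv or markdown, while the wide form is more compact when comparing splits side by side.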
You mentioned that it was task->split->subset when I asked this before. Also, in every results file from the results repo, I see only this hierarchy. |
Sorry for the confusion, but the correct order should be
Yes, but they have subsets inside |
So they are inside the split only. How, then, is the hierarchy: |
@KennethEnevoldsen, I was working on this one. It would be good to have your opinion on the above discussion. |
Just wanted to confirm whether we should go with this or not?
I am just confused about this hierarchy. In the JSON files of any results, I see the hierarchy as: |
Ah good point. Follow the JSON file:
|
If that's the case, then the column order should also follow it? In the 1st table, we should have the split column first and then the subset column, and in the 2nd table the split column only. So it will be as follows. I am renaming this as the subset aggregation level instead of split, as we have results for each subset under each split:
I am renaming this as the split aggregation level, as we will aggregate over the different subsets for each split. The table will look like:
Basically, I am defining the aggregation level as the level up to which we want to aggregate results. Let me know if this sounds correct. |
Sounds correct! |
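To make the agreed definition concrete, here is a hedged sketch of "aggregation level" on the task->split->subset hierarchy from the JSON files. The function name `aggregate`, the made-up scores, and the use of a plain mean are assumptions for illustration; the thread does not fix how scores are combined.

```python
import pandas as pd

# Hierarchy as found in the results JSON files: task -> split -> subset.
LEVELS = ["task", "split", "subset"]


def aggregate(df: pd.DataFrame, level: str) -> pd.DataFrame:
    """Keep index levels down to `level` and average everything below it.

    Using the mean is an assumption; another reduction could be swapped in.
    """
    keep = LEVELS[: LEVELS.index(level) + 1]
    return df.groupby(level=keep).mean()


index = pd.MultiIndex.from_product(
    [["task 1"], ["test", "dev"], ["subset 1", "subset 2"]],
    names=LEVELS,
)
# Made-up scores, one column per model.
df = pd.DataFrame({"model 1": [0.2, 0.4, 0.6, 0.8]}, index=index)

by_subset = aggregate(df, "subset")  # nothing collapsed: one row per subset
by_split = aggregate(df, "split")    # subsets averaged within each split
by_task = aggregate(df, "task")      # splits and subsets averaged per task
```

With this reading, "subset" is the most detailed table and "task" the most compact one, which matches the two tables described above.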
Currently it is quite hard to "just" get a data frame for inspecting the results.
We could add a CLI like:
This would also be useful when comparing models on the results repo.