The leaderboard shows the results of different models on four tasks: Code Retrieval, Code Summarization, Code Completion, and Type Inference.
If you would like to report your results here, please submit an issue to the CodeMind GitHub repository. All results will be updated once they pass our check.
MRR of our model and baseline methods for the task of code retrieval over the CodeSearchNet dataset. (Best scores are in boldface.)
Rank | Year | Model | Go | Java | JS | PHP | Python | Ruby |
---|---|---|---|---|---|---|---|---|
1 | 2022 | cpt-code M (baseline) | **97.5** | **94.4** | **86.5** | **97.2** | **99.9** | **85.5** |
2 | 2023 | CodeT5+ 770M (baseline) | 92.7 | 76.2 | 71.3 | 70.1 | 75.8 | 78.0 |
3 | 2023 | CodeT5+ 220M (baseline) | 92.4 | 76.1 | 70.8 | 69.8 | 75.6 | 77.7 |
4 | 2020 | GraphCodeBERT (baseline) | 84.1 | 75.7 | 71.1 | 72.5 | 87.9 | 73.2 |
5 | 2020 | CodeBERT (baseline) | 69.3 | 86.8 | 74.8 | 70.6 | 84.0 | 70.6 |
6 | 2021 | SelfAttn (baseline) | 78.45 | 66.55 | 50.38 | 65.78 | 79.09 | 47.96 |
7 | 2021 | Conv1D (baseline) | 70.87 | 60.49 | 38.81 | 61.92 | 67.29 | 36.53 |
8 | 2021 | NBOW (baseline) | 66.59 | 59.92 | 47.15 | 54.75 | 63.33 | 42.86 |
9 | 2021 | BiRNN (baseline) | 65.8 | 48.6 | 23.23 | 51.36 | 48.28 | 19.35 |
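As a reminder of the metric: MRR averages the reciprocal rank of the first correct code snippet retrieved for each query. A minimal sketch (the function name is ours, not from the CodeMind code):

```python
def mean_reciprocal_rank(first_correct_ranks):
    """Average of 1/rank, where rank is the 1-indexed position
    of the first correct code snippet for each query."""
    return sum(1.0 / r for r in first_correct_ranks) / len(first_correct_ranks)

# Three queries whose correct snippets ranked 1st, 2nd, and 4th:
print(round(mean_reciprocal_rank([1, 2, 4]), 4))  # 0.5833
```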
Performance of our model and baseline methods for the task of code summarization over the Python-Doc dataset. (Best scores are in boldface.)
Rank | Date | Model | BLEU-4 | METEOR | ROUGE-L |
---|---|---|---|---|---|
1 | 07/01/2021 | PLBART (baseline) | **32.71** | **18.13** | **46.05** |
2 | 07/01/2021 | Transformer + BPE (baseline) | 31.57 | 17.74 | 45.18 |
3 | 07/01/2021 | Transformer (baseline) | 30.64 | 17.65 | 44.59 |
4 | 07/01/2021 | Seq2Seq + Attn (baseline) | 25.57 | 14.40 | 39.41 |
5 | 07/01/2021 | Tree2Seq + Attn (baseline) | 23.35 | 12.59 | 36.49 |
Accuracy of our model and baseline methods for the task of code completion over the Py150 dataset. (Attr: attribute; Num: numeric constant; Name: variable, module; Param: function parameter name; Token: all tokens. Best scores are in boldface.)
Rank | Year | Model | Attr | Num | Name | Param | Token |
---|---|---|---|---|---|---|---|
1 | 2021 | TravTrans (baseline) | **72.08** | **68.55** | **76.33** | 71.08 | **83.17** |
2 | 2021 | GPT-2 (baseline) | 70.37 | 62.20 | 63.84 | **73.54** | 82.17 |
3 | 2022 | PyCoder (baseline) | – | – | – | – | 76.93 |
4 | 2021 | LSTM (baseline) | 51.67 | 47.45 | 46.52 | 66.06 | 73.73 |
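For completeness, token-level accuracy here is the fraction of held-out tokens that the model predicts exactly, reported as a percentage. A minimal sketch (the function and variable names are illustrative, not from the benchmark code):

```python
def token_accuracy(predictions, targets):
    """Percentage of positions where the predicted token equals the target."""
    assert len(predictions) == len(targets)
    correct = sum(p == t for p, t in zip(predictions, targets))
    return 100.0 * correct / len(targets)

# Two of the three predicted tokens match the targets:
print(token_accuracy(["self", "x", "return"], ["self", "y", "return"]))
```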