@@ -80,6 +80,9 @@ As all tasks are converted into text generation tasks, rule-based parsers will p
Since all of these metrics lie in the range [0, 1], we macro-average them: the overall score for a track is the mean of the metrics of all tasks in that track, and it is used to identify the track winners. The overall score of Track 5 will be calculated by averaging the overall scores of Tracks 1-4.
Please refer to [local_evaluation.py](local_evaluation.py) for more details on how we will evaluate your submissions.
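
As a rough illustration of the macro-averaging described above, the following is a minimal sketch and not the official logic in `local_evaluation.py`. The function names (`track_score`, `track5_score`), the task names, and the metric values are all hypothetical and chosen only for this example.

```python
from statistics import mean


def track_score(task_metrics):
    """Macro-average: unweighted mean of per-task metrics, each assumed to be in [0, 1].

    `task_metrics` maps a (hypothetical) task name to its metric value.
    """
    if not task_metrics:
        raise ValueError("a track needs at least one task metric")
    for task, value in task_metrics.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"metric for {task!r} is outside [0, 1]: {value}")
    return mean(task_metrics.values())


def track5_score(track_scores):
    """Overall Track 5 score: mean of the overall scores of Tracks 1-4.

    `track_scores` maps a track number (1-4) to that track's overall score.
    """
    return mean(track_scores[track] for track in (1, 2, 3, 4))


# Hypothetical per-task metrics for one track (made-up values).
example_track = {"task_a": 0.5, "task_b": 1.0}
example_overall = track_score(example_track)

# Hypothetical overall scores for Tracks 1-4 (made-up values).
example_track5 = track5_score({1: 0.2, 2: 0.4, 3: 0.6, 4: 0.8})
```

Every task contributes equally to its track score regardless of how many test samples it has, which is what distinguishes a macro-average from a sample-weighted (micro) average.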
# 🗃️ Submission
The challenge will be evaluated as a code competition. Participants must submit their code and essential resources, such as fine-tuned model weights and indices for Retrieval-Augmented Generation (RAG). The submission will be run on our servers to generate results, which will then be evaluated.
...
@@ -225,10 +228,6 @@ Please follow the instructions in [docs/submission.md](ocs/submission.md) to mak
**Note**: **Remember to accept the Challenge Rules** on both the challenge page and the task page before making your first submission.
## Evaluation Metrics & Local Evaluation
Please refer to [local_evaluation.py](local_evaluation.py) for more details on how we will evaluate your submissions.
## Hardware and System Configuration
We apply a limit on the hardware available to each participant to run their solutions. Specifically,