This also includes instructions on [specifying your software runtime](docs/submi
## 💻 What hardware does my code run on?
You can find more details about the hardware and system configuration in [docs/hardware-and-system-config.md](docs/hardware-and-system-config.md).
In summary, we provide you with `4` x [NVIDIA T4 GPUs](https://www.nvidia.com/en-us/data-center/tesla-t4/) in Phase 2.
Your solution will be given a certain amount of time for inference, after which it will be immediately killed and no results will be available. The time limit is specified below.
For reference, the baseline solution with zero-shot LLaMA3-8B-instruct consumes the following amount of time.
We limit the prediction time of each sample to at most **10 seconds**. This limit applies at the batch level: for a batch of 8 samples, for example, you should return the predictions within at most 80 seconds; otherwise, your submission will be killed.
Your maximum repository size is 200 GB.
## 🧩 How are my model responses parsed by the evaluators?
Please refer to [parsers.py](parsers.py) for more details on how we parse your model responses.