## Hardware and System Configuration

We apply a limit on the hardware available to each participant to run their solutions.

- All solutions will be run on [AWS g4dn.12xlarge](https://aws.amazon.com/ec2/instance-types/g4/) instances equipped with [NVIDIA T4 GPUs](https://www.nvidia.com/en-us/data-center/tesla-t4/).
- The hardware available is:
  - `4` x [NVIDIA T4 GPU](https://www.nvidia.com/en-us/data-center/tesla-t4/)
  - `40` x vCPU (`20` physical CPU cores)
  - `180GB` RAM

Please note that the NVIDIA T4 uses an older architecture and is therefore not compatible with certain acceleration toolkits (e.g. Flash Attention), so please check compatibility before relying on them.

In addition, the following restrictions will be imposed:

- Network connection will be disabled.
- Each submission will be assigned a certain amount of time to run. Submissions that exceed the time limit will be killed and will not be evaluated. The tentative time limit is set as follows: **[TO BE ADDED AND TESTED WITH AICROWD SUBMISSION SYSTEM]**.
- Each team will be able to make up to **1 submission per day**, with a maximum of **[TO BE ADDED AND TESTED WITH AICROWD SUBMISSION SYSTEM]**.

Based on the hardware and system configuration, we recommend that participants begin with 7B and 13B models. According to our experiments, models like Llama-2 13B can perform inference smoothly on 4 NVIDIA T4 GPUs, while larger models will result in out-of-memory (OOM) errors.
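
As a rough sanity check on the model-size recommendation above, the following sketch estimates whether a model's weights alone would fit in the combined memory of the 4 GPUs. It is back-of-envelope arithmetic, not a measurement. The 16 GB per T4 figure comes from NVIDIA's published specification rather than from this document, and the 20% headroom reserved for activations, the KV cache, and CUDA overhead is an assumption. The function names are hypothetical.

```python
# Rough weights-only memory estimate for the evaluation hardware.
# Assumptions (not from the organizers): 16 GB per T4, 20% headroom,
# and weights dominating memory use.

T4_MEMORY_GB = 16  # per-GPU memory of an NVIDIA T4 (vendor spec)
NUM_GPUS = 4       # from the hardware list above


def weight_memory_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes).

    bytes_per_param is 2 for fp16 and 4 for fp32.
    """
    return params_billion * bytes_per_param


def fits_on_cluster(
    params_billion: float,
    bytes_per_param: int = 2,
    headroom: float = 0.8,
) -> bool:
    """Return True if the weights fit within the usable share of GPU memory."""
    budget_gb = T4_MEMORY_GB * NUM_GPUS * headroom
    return weight_memory_gb(params_billion, bytes_per_param) <= budget_gb


if __name__ == "__main__":
    for size in (7, 13, 70):
        need = weight_memory_gb(size)
        verdict = "fits" if fits_on_cluster(size) else "does not fit"
        print(f"{size}B in fp16 needs about {need:.0f} GB of weights: {verdict}")
```

Under these assumptions, 7B and 13B models in fp16 fit within the budget, while a 70B model does not. Real memory use also depends on sequence length, batch size, and how the model is sharded across GPUs, so treat the result as a starting point and verify on the actual instance.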