diff --git a/aicrowd.json b/aicrowd.json
index c4721d2f3d444bf362d3758fd1893da57aedb5fa..b5037bad2ba320494f29d69dd39f3b4c9a39bec0 100644
--- a/aicrowd.json
+++ b/aicrowd.json
@@ -4,5 +4,6 @@
     "aicrowd-bot"
   ],
   "gpu": false,
+  "gpu_count": 0,
   "description": "(optional) description about your awesome agent"
 }
diff --git a/docs/submission.md b/docs/submission.md
index 4dc03ab4aaa44c962a8b646f90f672b7fc83a88f..7cb7a28f214b1a71f2df209979d93fc8947873ae 100644
--- a/docs/submission.md
+++ b/docs/submission.md
@@ -45,118 +45,17 @@ Your project should follow the structure outlined in the starter kit. Here’s a
 │   ├── base_model.py               # Base model class
 │   ├── dummy_model.py              # A simple or placeholder model for demonstration or testing
 │   └── user_config.py              # IMPORTANT: Configuration file to specify your model
-├── parsers.py                      # Model output parser
 ├── requirements.txt                # Python packages to be installed for model development
 └── Dockerfile                      # Example Dockerfile for specifying runtime via Docker
 ```
 
-Remember, **your submission metadata JSON (`aicrowd.json`)** is crucial for mapping your submission to the challenge. Ensure it contains the correct `challenge_id`, `authors`, and other necessary information. To utilize GPUs, set the `"gpu": true` flag in your `aicrowd.json`.
-
-## Submitting to Different Tracks
-
-Specify the track by setting the appropriate `challenge_id` in your [aicrowd.json](aicrowd.json). Here are the challenge IDs for various tracks:
-
-| Track Name | Challenge ID |
-|-----------------------------------|-----------------------------------------------------|
-| Retrieval Summarization | `meta-kdd-cup-24-crag-retrieval-summarization` |
-| Knowledge Graph and Web Retrieval | `meta-kdd-cup-24-crag-knowledge-graph-and-web-retrieval` |
-| End-to-end Retrieval Augmented Generation | `meta-kdd-cup-24-crag-end-to-end-retrieval-augmented-generation` |
-
-## Submission Entry Point
-
-The evaluation process will instantiate a model from `models/user_config.py` for evaluation. Ensure this configuration is set correctly.
-
-## Setting Up SSH Keys
-
-You will have to add your SSH Keys to your GitLab account by going to your profile settings [here](https://gitlab.aicrowd.com/profile/keys). If you do not have SSH Keys, you will first need to [generate one](https://docs.gitlab.com/ee/ssh/README.html#generating-a-new-ssh-key-pair).
-
-
-## Managing Large Model Files with Git LFS
-
-When preparing your submission, it's crucial to ensure all necessary models and files required by your inference code are properly saved and included. Due to the potentially large size of model weight files, we highly recommend using Git Large File Storage (Git LFS) to manage these files efficiently.
-
-### Why Use Git LFS?
-
-Git LFS is designed to handle large files more effectively than Git's default handling of large files. This ensures smoother operations and avoids common errors associated with large files, such as:
-
-- `fatal: the remote end hung up unexpectedly`
-- `remote: fatal: pack exceeds maximum allowed size`
-
-These errors typically occur when large files are directly checked into the Git repository without Git LFS, leading to challenges in handling and transferring those files.
-
-### Steps to Use Git LFS
-
-1. **Install Git LFS**: If you haven't already, install Git LFS on your machine. Detailed instructions can be found [here](https://git-lfs.github.com/).
-
-2. **Track Large Files**: Use Git LFS to track the large files within your project. You can do this by running `git lfs track "*.model"` (replace `*.model` with your file type).
-
-3. **Add and Commit**: After tracking the large files with Git LFS, add and commit them as you would with any other file. Git LFS will automatically handle these files differently to optimize their storage and transfer.
-
-4. **Push to Repository**: When you push your changes to the repository, Git LFS will manage the large files, ensuring a smooth push process.
-
-### Handling Previously Committed Large Files
-
-If you have already committed large files directly to your Git repository without using Git LFS, you may encounter issues. These files, even if not present in the current working directory, could still be in the Git history, leading to errors.
-
-To resolve this, ensure that the large files are removed from the Git history and then re-add and commit them using Git LFS. This process cleans up the repository's history and avoids the aforementioned errors.
-
-For more information on how to upload large files to your submission and detailed guidance on using Git LFS, please refer to [this detailed guide](https://discourse.aicrowd.com/t/how-to-upload-large-files-size-to-your-submission/2304).
-
-**Note**: Properly managing large files not only facilitates smoother operations for you but also ensures that the evaluation process can proceed without hindrances.
-
-# Guide to Making Your First Submission
-
-This document is designed to assist you in making your initial submission smoothly. Below, you'll find step-by-step instructions on specifying your software runtime and dependencies, structuring your code, and finally, submitting your project. Follow these guidelines to ensure a smooth submission process.
-
-# Table of Contents
-
-1. [Specifying Software Runtime and Dependencies](#specifying-software-runtime-and-dependencies)
-2. [Code Structure Guidelines](#code-structure-guidelines)
-3. [Submitting to Different Tracks](#submitting-to-different-tracks)
-4. [Submission Entry Point](#submission-entry-point)
-5. [Setting Up SSH Keys](#setting-up-ssh-keys)
-6. [Managing Large Model Files with Git LFS](#managing-large-model-files-with-git-lfs)
-   - [Why Use Git LFS?](#why-use-git-lfs)
-   - [Steps to Use Git LFS](#steps-to-use-git-lfs)
-   - [Handling Previously Committed Large Files](#handling-previously-committed-large-files)
-7. [How to Submit Your Code](#how-to-submit-your-code)
-
-
-## Specifying Software Runtime and Dependencies
-
-Our platform supports custom runtime environments. This means you have the flexibility to choose any libraries or frameworks necessary for your project. Here’s how you can specify your runtime and dependencies:
-
-- **`requirements.txt`**: List any PyPI packages your project needs.
-- **`apt.txt`**: Include any apt packages required.
-- **`Dockerfile`**: Optionally, you can provide your own Dockerfile. An example is located at `utilities/_Dockerfile`, which can serve as a helpful starting point.
-
-For detailed setup instructions regarding runtime dependencies, refer to the documentation in the `docs/runtime.md` file.
-
-## Code Structure Guidelines
-
-Your project should follow the structure outlined in the starter kit. Here’s a brief overview of what each component represents:
-
+Remember, **your submission metadata JSON (`aicrowd.json`)** is crucial for mapping your submission to the challenge. Ensure it contains the correct `challenge_id`, `authors`, and other necessary information. To utilize GPUs, set the `"gpu": true` flag in your [aicrowd.json](../aicrowd.json) and `"gpu_count"` to a number between `1` and `4`.
+For example, if you require 2 GPUS, you should set:
 ```
-.
-├── README.md                       # Project documentation and setup instructions
-├── aicrowd.json                    # Submission meta information - like your username, track name
-├── data
-│   └── development.json            # Development dataset local testing
-├── docs
-│   └── runtime.md                  # Documentation on the runtime environment setup, dependency confifgs
-├── local_evaluation.py             # Use this to check your model evaluation flow locally
-├── metrics.py                      # Scripts to calculate evaluation metrics for your model's performance
-├── models
-│   ├── README.md                   # Documentation specific to the implementation of model interfaces
-│   ├── base_model.py               # Base model class
-│   ├── dummy_model.py              # A simple or placeholder model for demonstration or testing
-│   └── user_config.py              # IMPORTANT: Configuration file to specify your model
-├── requirements.txt                # Python packages to be installed for model development
-└── Dockerfile                      # Example Dockerfile for specifying runtime via Docker
+    "gpu": true,
+    "gpu_count": 2
 ```
 
-Remember, **your submission metadata JSON (`aicrowd.json`)** is crucial for mapping your submission to the challenge. Ensure it contains the correct `challenge_id`, `authors`, and other necessary information. To utilize GPUs, set the `"gpu": true` flag in your `aicrowd.json`.
-
 ## Submitting to Different Tracks
 
 Specify the track by setting the appropriate `challenge_id` in your [aicrowd.json](aicrowd.json). Here are the challenge IDs for various tracks:
@@ -169,7 +68,7 @@ Specify the track by setting the appropriate `challenge_id` in your [aicrowd.jso
 
 ## Submission Entry Point
 
-The evaluation process will instantiate a model from `models/user_config.py` for evaluation. Ensure this configuration is set correctly.
+The evaluation process will instantiate a model from [models/user_config.py](../models/user_config.py) for evaluation. Ensure this configuration is set correctly.
 
 ## Setting Up SSH Keys
 
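
Review note: the change above documents a `"gpu_count"` field that should be between `1` and `4` when `"gpu"` is `true`. A participant might sanity-check those two fields locally before pushing. The snippet below is only a minimal sketch of such a check; `validate_gpu_config` and `MAX_GPUS` are hypothetical names that are not part of the starter kit, and the rule that `"gpu": false` should pair with `"gpu_count": 0` is an assumption inferred from the new default in `aicrowd.json`, not something the evaluator is known to enforce.

```python
import json

MAX_GPUS = 4  # upper bound stated in the updated docs/submission.md


def validate_gpu_config(config: dict) -> list:
    """Return a list of problems found in the GPU fields (empty if consistent).

    Hypothetical helper for local checks only; the real evaluator may
    apply different or additional rules.
    """
    problems = []
    gpu = config.get("gpu", False)
    gpu_count = config.get("gpu_count", 0)

    if not isinstance(gpu, bool):
        problems.append('"gpu" must be true or false')
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(gpu_count, bool) or not isinstance(gpu_count, int):
        problems.append('"gpu_count" must be an integer')
        return problems

    if gpu is True and not 1 <= gpu_count <= MAX_GPUS:
        problems.append('"gpu_count" must be between 1 and 4 when "gpu" is true')
    # Assumption: a CPU-only submission keeps gpu_count at 0.
    if gpu is False and gpu_count != 0:
        problems.append('"gpu_count" should be 0 when "gpu" is false')
    return problems


if __name__ == "__main__":
    # Inline example mirroring the snippet added to docs/submission.md.
    example = json.loads('{"gpu": true, "gpu_count": 2}')
    print(validate_gpu_config(example))
```

To check a real submission, the same function could be called on the result of `json.load` applied to the repository's `aicrowd.json`.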