Commit 954e30ec authored by Dipam Chakraborty

Add baseline description in readme

parent 50c9385f
1 merge request: !4 Round 2 Baseline Release
......@@ -21,6 +21,7 @@ This repository contains:
# Table of contents
- [🏆 About the Challenge](#-about-the-challenge)
- [**What's special about this challenge?** ⭐](#whats-special-about-this-challenge-)
- [🧘 About the Round 2 Baseline](#-about-the-round-2-baseline)
- [💪 Getting Started](#-getting-started)
- [Download Dataset](#download-dataset)
- [Dataset Distribution](#dataset-distribution)
......@@ -84,6 +85,25 @@ The `Private Leaderboard` (computed at the end of Round-2), will use a different
**CHANGELOG** | Round 2 | March 3rd, 2022 : Please refer to the [CHANGELOG.md](CHANGELOG.md) for more details on everything that changed between Round 1 & Round 2.
# 🧘 About the Round 2 Baseline
This repository contains the Round 2 baseline, which provides fast heuristic implementations of a few simple ideas:
- **Purchase images with more labels** - In multilabel datasets, images that carry more than one label often give deep learning models a boost.
- **Purchase uncertain images** - Purchase the images whose predictions are most uncertain. While many methods exist to measure uncertainty, a simple heuristic based on output probabilities is used here.
- **Purchase images to balance labels** - Well-balanced datasets can improve model performance in deep learning. We set a uniform target distribution and try to purchase labels that move the dataset closer to it. The provided code can aim for any target distribution.
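
The first two ideas above reduce to per-image scores computed from the model's output probabilities. The following is a minimal sketch of what such scores could look like, not the repository's actual code: the function names, the 0.5 threshold, and the "distance from 0.5" uncertainty measure are all assumptions made for illustration.

```python
import numpy as np

def label_count_score(probs, threshold=0.5):
    """Count predicted labels per image (hypothetical helper).

    probs: array of shape (n_images, n_classes) with per-class probabilities.
    A label counts as predicted when its probability exceeds the threshold.
    """
    return (probs > threshold).sum(axis=1)

def uncertainty_score(probs):
    """Sum of per-class uncertainty (hypothetical helper).

    Assumes probabilities near 0 or 1 are "certain" and probabilities
    near 0.5 are "uncertain", so each class contributes 0.5 - |p - 0.5|.
    """
    return (0.5 - np.abs(probs - 0.5)).sum(axis=1)

def top_k(scores, k):
    """Indices of the k highest-scoring images, best first."""
    return np.argsort(-scores, kind="stable")[:k]

# Toy predictions for three images and three classes.
probs = np.array([
    [0.90, 0.80, 0.10],  # two confident labels
    [0.50, 0.40, 0.60],  # everything close to 0.5
    [0.99, 0.01, 0.02],  # one very confident label
])
most_labels = top_k(label_count_score(probs), k=1)
most_uncertain = top_k(uncertainty_score(probs), k=1)
```

With these toy values, the first image has the most predicted labels and the second is the most uncertain, so the two strategies would pick different images.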
The baseline follows the steps below; see `run.py` for the code:
1. **Pre-train** - We train a simple ResNet model on the pre-training dataset during the pre-train phase. The model can be found at `baseline_utils\model.py` and the training loop at `baseline_utils\training.py`.
2. **Random purchase** - 20% of the purchase budget is spent on random purchases. While this may seem wasteful, it helps when the distributions of the training and unlabelled datasets differ too much. Conditional strategies could be applied here, depending on how much the distributions differ according to these random purchases.
3. **Train and Predict** - With the extra labels purchased, train the model further and predict labels for the remaining unlabelled images.
4. **Purchase with more labels** - Use the predicted labels to spend 30% of the budget on the images with the most predicted labels. Here a straightforward threshold and count is used, but other ideas, such as summing the softmax probabilities, can be used as well. Code can be found at `purchase_strategies\morelabels_purchase.py`.
5. **Purchase uncertain** - 30% of the budget is used to purchase uncertain labels. Here a simple and fast heuristic is used to estimate uncertainty, based on the assumption that probabilities close to 1 or 0 are "certain". Please note that this is not always the case with deep learning models, so feel free to try out other uncertainty methods. Code can be found at `purchase_strategies\purchase_uncertain.py`.
6. **Balance Labels** - The rest of the budget is used to balance the labels. Label balancing can be tricky in multilabel settings because one needs to work out which set of labels comes closest to producing the target distribution. Here we have set up a heuristic that takes the simple difference between the current and target distributions, normalizes it, and finds the predictions that match it most closely. Code can be found at `purchase_strategies\balance_labels.py`.
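
The budget split described in the steps above (20% random, 30% more labels, 30% uncertain, remainder for balancing) can be sketched as follows. The function name and the dictionary keys are hypothetical, and how `run.py` actually rounds the shares is not stated here, so integer floor division is an assumption.

```python
def split_budget(budget):
    """Split a purchase budget across the four purchase steps.

    Integer arithmetic avoids floating-point rounding surprises; whatever
    is left after the three fixed shares goes to label balancing, so the
    parts always sum to the full budget.
    """
    n_random = budget * 20 // 100
    n_more_labels = budget * 30 // 100
    n_uncertain = budget * 30 // 100
    n_balance = budget - n_random - n_more_labels - n_uncertain
    return {
        "random": n_random,
        "more_labels": n_more_labels,
        "uncertain": n_uncertain,
        "balance": n_balance,
    }

shares = split_budget(1000)
```

Giving the remainder to the last step means small budgets that do not divide evenly are still spent in full.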
It should be noted that all these strategies depend on good predictions by the pre-trained model, so improving that model should also help purchase better labels.
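
One possible reading of the label-balancing heuristic from step 6 is sketched below. It is an illustration under stated assumptions rather than the contents of `purchase_strategies\balance_labels.py`: the function names are hypothetical, over-represented classes are clipped to zero weight, and each unlabelled image is scored by the dot product of its predicted probabilities with the normalized deficit.

```python
import numpy as np

def deficit_weights(labels, target=None):
    """Normalized gap between the target and current label distributions.

    labels: binary array of shape (n_purchased, n_classes).
    target: desired class distribution; uniform when omitted.
    """
    n_classes = labels.shape[1]
    if target is None:
        target = np.full(n_classes, 1.0 / n_classes)
    counts = labels.sum(axis=0).astype(float)
    current = counts / max(counts.sum(), 1.0)
    # Assumption: only under-represented classes attract purchases.
    deficit = np.clip(target - current, 0.0, None)
    total = deficit.sum()
    return deficit / total if total > 0 else deficit

def balance_scores(labels, probs, target=None):
    """Score unlabelled images by how well predictions fill the deficit."""
    return probs @ deficit_weights(labels, target)

# Labels already owned: class 0 is over-represented, class 2 is missing.
labels = np.array([[1, 0, 0], [1, 0, 0], [1, 1, 0]])
# Predictions for three unlabelled images.
probs = np.array([
    [0.9, 0.1, 0.1],
    [0.1, 0.1, 0.9],
    [0.1, 0.9, 0.1],
])
best = int(np.argmax(balance_scores(labels, probs)))
```

In this toy case the second image, which is predicted to contain the missing class, scores highest. Passing a non-uniform `target` would steer purchases toward any other distribution.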
# 💪 Getting Started
## Download Dataset
......