diff --git a/README.md b/README.md index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..b7673b17c26b6d12de6c58932f6b9a967cfc898d 100644 --- a/README.md +++ b/README.md @@ -0,0 +1,173 @@
+# TODO: Add banner
+![Banner image]()
+
+# **[Music Demixing Challenge 2023 - Robust Music Separation](https://www.aicrowd.com/challenges/music-demixing-challenge-2023/problems/robust-music-separation)** - Starter Kit
+[](https://discord.gg/fNRrSvZkry)
+
+This repository is the Music Demixing Challenge 2023 - Robust Music Separation **Starter Kit**! It contains:
+* **Documentation** on how to submit your models to the leaderboard
+* **The procedure** for best practices and information on how we evaluate your model, etc.
+* **Starter code** for you to get started!
+
+Quick Links:
+
+* [Music Demixing Challenge 2023 - Robust Music Separation - Competition Page](https://www.aicrowd.com/challenges/music-demixing-challenge-2023/problems/robust-music-separation)
+* [Discussion Forum](https://www.aicrowd.com/challenges/music-demixing-challenge-2023/discussion)
+* [Music Demixing Challenge 2023 Overview](https://www.aicrowd.com/challenges/music-demixing-challenge-2023)
+
+
+# Table of Contents
+1. [About the Music Demixing Challenge 2023](#about-the-music-demixing-challenge-2023)
+2. [Evaluation](#evaluation)
+3. [Baselines](#baselines)
+4. [How to test and debug locally](#how-to-test-and-debug-locally)
+5. [How to submit](#how-to-submit)
+6. [Dataset](#dataset)
+7. [Setting up your codebase](#setting-up-your-codebase)
+8. [FAQs](#faqs)
+
+# About the Music Demixing Challenge 2023
+
+Have you ever sung using a karaoke machine or made a DJ music mix of your favourite song? Have you wondered how hearing aids help people listen more clearly or how video conference software reduces background noise?
+
+They all use the magic of audio separation.
+
+Music source separation (MSS) attracts professional music creators as it enables remixing and revising songs in a way traditional equalisers don't. Suppressed vocals in songs can improve your karaoke night and provide a richer audio experience than conventional applications.
+
+The Music Demixing Challenge 2023 (MDX23) is an opportunity for researchers and machine learning enthusiasts to test their skills by creating a system to **perform audio source separation**.
+
+Given an **audio signal as input** (referred to as a "mixture"), you must **decompose it into its different parts**.
+
+🎻 ROBUST MUSIC SEPARATION
+
+This task focuses on music source separation. Participants will submit systems that separate a song into four instruments: vocals, bass, drums, and other (the instrument "other" contains signals of all instruments other than the first three, e.g., guitar or piano).
+
+Karaoke systems can benefit from audio source separation technology, as users can sing over any original song where the vocals have been suppressed, instead of picking from a set of "cover" songs specifically produced for karaoke.
+
+Similar to the [Music Demixing Challenge 2021](https://www.aicrowd.com/challenges/music-demixing-challenge-ismir-2021), this task has two leaderboards.
+
+**MUSDB18 Leaderboard**
+
+Participants in this leaderboard are only allowed to train their system on the training part of the MUSDB18-HQ dataset. This dataset has become the standard in the literature, as it is free to use and allows anyone to start training source separation models.
+
+The label swaps are included in the dataset for this leaderboard.
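+If you are new to MUSDB18-HQ, the snippet below shows one way to iterate over the training stems with the [`musdb`](https://github.com/sigsep/sigsep-mus-db) package, which the bundled local evaluator also uses; the dataset path is a placeholder for wherever you unzip the data.
+
+```python
+import musdb
+
+# Hypothetical local path to the decoded (WAV) MUSDB18-HQ training split.
+mus = musdb.DB(root="./public_dataset/MUSDB18-HQ", subsets="train", is_wav=True)
+
+for track in mus:
+    mixture = track.audio                   # ndarray, shape (n_samples, 2)
+    vocals = track.targets["vocals"].audio  # ground-truth vocal stem
+    print(track.name, mixture.shape, track.rate)
+```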
+
+**No bars held Leaderboard**
+
+This leaderboard allows bleeding/mixtures in the training data: you can train on any data that you like.
+For both leaderboards, the winning teams will be required to publish their training code in order to receive a prize, as the challenge is about the training method.
+
+# Evaluation
+
+Submissions are scored with the global signal-to-distortion ratio (SDR), as implemented by `GlobalSDR` in [`local_evaluator/sisec21_evaluation/metrics.py`](/local_evaluator/sisec21_evaluation/metrics.py). The SDR is computed per instrument over the whole track, the song score is the mean SDR over the four instruments, and the leaderboard score averages the song scores over all test songs. Running `python evaluate_locally.py` reproduces this computation locally.
+
+# Baselines
+
+TODO: To be added
+
+# How to Test and Debug Locally
+
+The best way to test your models is to run your submission locally.
+
+You can do this by simply running `python evaluate_locally.py`. **Note that your local setup and the server evaluation runtime may vary.** Make sure you set up your runtime according to the section: [How do I specify my dependencies?](#how-do-i-specify-my-dependencies)
+
+# How to Submit
+
+You can use the submission script: `source submit.sh <submission_text>`
+
+More information on submissions can be found in [SUBMISSION.md](/docs/submission.md).
+
+#### A high-level description of the challenge procedure:
+1. **Sign up** to join the competition [on the AIcrowd website](https://www.aicrowd.com/challenges/music-demixing-challenge-2023/problems/robust-music-separation).
+2. **Clone** this repo and start developing your solution.
+3. **Train** your music separation models, and check that `evaluate_locally.py` runs your model end to end.
+4. **Submit** your trained models to [AIcrowd Gitlab](https://gitlab.aicrowd.com)
+for evaluation (full instructions below). The automated evaluation setup
+will evaluate the submissions on the hidden test set to compute and report the metrics on the leaderboard
+of the competition.
+
+
+# Dataset
+
+Download the public dataset for this task using the link below; you'll need to accept the rules of the competition to access the data. The data is the same as the well-known MUSDB18-HQ dataset, and a compressed version is also provided.
+
+https://www.aicrowd.com/challenges/music-demixing-challenge-2023/problems/robust-music-separation/dataset_files
+
+
+# Setting Up Your Codebase
+
+AIcrowd provides great flexibility in the details of your submission!
+Find the answers to FAQs about submission structure below, followed by
+the guide for setting up this starter kit and linking it to the AIcrowd
+GitLab.
+
+## FAQs
+
+* How do I submit a model?
+  * More information on submissions can be found at our [submission.md](/docs/submission.md). In short, you should push your code to AIcrowd's GitLab with a specific git tag and the evaluation will be triggered automatically.
+
+### How do I specify my dependencies?
+
+We accept submissions with custom runtimes, so you can choose your
+favorite! The configuration files typically include `requirements.txt`
+(pypi packages), `apt.txt` (apt packages) or even your own `Dockerfile`.
+
+You can check detailed information about this in [runtime.md](/docs/runtime.md).
+
+### What should my code structure look like?
+
+Please follow the example structure in the starter kit. The different files and directories have the following meaning:
+
+
+```
+.
+├── aicrowd.json           # Add any descriptions about your model
+├── apt.txt                # Linux packages to be installed inside docker image
+├── requirements.txt       # Python packages to be installed
+├── evaluate_locally.py    # Use this to check your model evaluation flow locally
+└── my_submission          # Place your models and related code here
+    ├── <Your model files> # Add any models here for easy organization
+    ├── aicrowd_wrapper.py # Keep this file unchanged
+    └── user_config.py     # IMPORTANT: Add your model name here
+```
+
+### How can I get going with a completely new model?
+
+Train your model as you like, and when you're ready to submit, implement the inference class and import it in [`my_submission/user_config.py`](my_submission/user_config.py). Refer to [`my_submission/README.md`](my_submission/README.md) for a detailed explanation.
+
+Once you are ready, test your implementation with `python evaluate_locally.py`. A minimal sketch of such a class follows below.
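+As an illustration (this sketch is not part of the starter kit), a new model with the same interface as the bundled `IdentityMusicSeparationModel` could look like the following; `MyNewSeparationModel` is a placeholder name, and the pass-through "separation" stands in for your real inference:
+
+```python
+import numpy as np
+
+
+class MyNewSeparationModel:
+    """Hypothetical skeleton: replace the pass-through with real inference."""
+
+    def __init__(self):
+        # Load weights / set up your model here; this runs once.
+        self.instruments = ['bass', 'drums', 'other', 'vocals']
+
+    def separate_music_file(self, mixed_sound_array, sample_rate):
+        # mixed_sound_array: float array of shape (n_samples, 2)
+        separated_music_arrays = {}
+        output_sample_rates = {}
+        for instrument in self.instruments:
+            # Placeholder prediction: return the mixture for every stem.
+            separated_music_arrays[instrument] = mixed_sound_array.copy()
+            output_sample_rates[instrument] = sample_rate
+        return separated_music_arrays, output_sample_rates
+```
+
+In the local evaluator the wrapper is created once and `separate_music_file` is called once per song, so any expensive setup belongs in `__init__`.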
+
+### How do I actually make a submission?
+
+The submission is made by adding everything, including the model, to git,
+tagging the submission with a git tag that starts with `submission-`, and
+pushing to AIcrowd's GitLab. The rest is done for you!
+
+For large model weight files, you'll need to use `git-lfs`.
+
+More details are available at [docs/submission.md](/docs/submission.md).
+
+### Are there any hardware or time constraints?
+
+Your submission will need to complete predictions on all the **sound tracks** within **120 minutes**. Make sure you take advantage
+of all the cores by parallelizing your code if needed (a sketch follows the machine specifications below). Incomplete submissions will fail.
+
+The machine where the submission will run will have the following specifications:
+* 4 vCPUs
+* 16GB RAM
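+For instance, assuming your per-song inference is CPU-bound and can run in separate worker processes (`demix_song` and the song list below are placeholders; the bundled `eval.py` parallelizes over tracks with the same `Pool` pattern):
+
+```python
+from multiprocessing import Pool
+
+def demix_song(song_folder):
+    """Placeholder: load the mixture, run your separator, write the four stems."""
+    ...
+
+if __name__ == "__main__":
+    song_folders = []  # placeholder: fill with the songs to process
+    with Pool(processes=4) as pool:  # matches the evaluator's 4 vCPUs
+        pool.map(demix_song, song_folders)
+```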
+
+## Contributors
+
+- [Dipam Chakraborty](https://www.aicrowd.com/participants/dipam)
+
+# 📎 Important links
+- 💪 [Challenge Page](https://www.aicrowd.com/challenges/music-demixing-challenge-2023/problems/robust-music-separation)
+- 🗣️ [Discussion Forum](https://www.aicrowd.com/challenges/music-demixing-challenge-2023/discussion)
+- 🏆 [Leaderboard](https://www.aicrowd.com/challenges/music-demixing-challenge-2023/problems/robust-music-separation/leaderboards)
+- 🎵 [Music Demixing Challenge 2021](https://www.aicrowd.com/challenges/music-demixing-challenge-ismir-2021)
+
+You may also like the new **Cinematic Sound Separation track**
+
+**Best of Luck** 🎉 🎉
diff --git a/aicrowd.json b/aicrowd.json index 108d2b999b7703cb50ef7a763912e61ccb5951da..b5b06e83a61762dffeabacdc29259e8d00084fbd 100644 --- a/aicrowd.json +++ b/aicrowd.json @@ -3,5 +3,6 @@ "authors": [ "aicrowd-bot" ], + "external_dataset_used": false, "description": "(optional) description about your awesome model" }
diff --git a/docs/runtime.md b/docs/runtime.md new file mode 100644 index 0000000000000000000000000000000000000000..569257988dcacbd8ea98464b0825330f7e6d5a01 --- /dev/null +++ b/docs/runtime.md @@ -0,0 +1,30 @@
+## Adding your runtime
+
+This repository is a valid submission (and submission structure).
+You can simply add your dependencies on top of this repository.
+
+A few of the most common ways are as follows:
+
+* `requirements.txt` -- The `pip3` packages used by your inference code. As you add new pip3 packages to your inference procedure, either add them to `requirements.txt` manually or, if your software runtime is simple, run:
+    ```
+    # Put ALL of the current pip3 packages on your system in the submission
+    >> pip3 freeze > requirements.txt
+    >> cat requirements.txt
+    aicrowd_api
+    coloredlogs
+    matplotlib
+    pandas
+    [...]
+    ```
+
+* `apt.txt` -- The Debian packages (installed via apt) used by your inference code!
+
+These files are used to construct your **AIcrowd submission docker containers** in which your code will run.
+
+* `Dockerfile` -- **For advanced users only**. `Dockerfile` gives you more flexibility on defining the software runtime used during evaluations.
+
+----
+
+📚 A more detailed summary of the same is available here: [How to specify runtime environment for your submission](https://discourse.aicrowd.com/t/how-to-specify-runtime-environment-for-your-submission/2274)
+
+👋 In case you have any doubts or need help, you can reach out to us via the challenge [Discussions](https://www.aicrowd.com/challenges/music-demixing-challenge-2023/discussion) or [Discord](https://discord.gg/fNRrSvZkry).
diff --git a/docs/submission.md b/docs/submission.md new file mode 100644 index 0000000000000000000000000000000000000000..6321584b4ffd47890ba607c25f694fe41a4f4f26 --- /dev/null +++ b/docs/submission.md @@ -0,0 +1,71 @@
+# Making a submission
+
+This guide will help you make your first submission.
+
+
+## Submission Entrypoint
+
+The evaluator will create an instance of the separation model specified as `MySeparationModel` in `my_submission/user_config.py` to run the evaluation.
+
+## Setting up SSH keys
+
+You will have to add your SSH keys to your GitLab account by going to your profile settings [here](https://gitlab.aicrowd.com/profile/keys). If you do not have SSH keys, you will first need to [generate one](https://docs.gitlab.com/ee/ssh/README.html#generating-a-new-ssh-key-pair).
+
+
+## Submit using the utility script
+
+You can run the following to make a submission.
+
+```bash
+./submit.sh <description-phrase> # example: ./submit.sh my-first-model
+```
+
+`./submit.sh` contains a few git commands that will push your code to AIcrowd GitLab.
+
+**Note:** In case you see an error message like `git: 'lfs' is not a git command. See 'git --help'.`, please make sure you have Git LFS installed: see the [Git LFS website](https://git-lfs.github.com/) for installation instructions, then run `git lfs install` once.
+
+## Pushing the code manually
+
+### IMPORTANT: Saving models before submission!
+
+Before you submit, make sure that you have saved the models that are needed by your inference code.
+If your files are large, you can use `git-lfs` to upload them. More details [here](https://discourse.aicrowd.com/t/how-to-upload-large-files-size-to-your-submission/2304).
+
+## How to submit your code?
+
+You can create a submission by making a _tag push_ to your repository on [https://gitlab.aicrowd.com/](https://gitlab.aicrowd.com/).
+**Any tag push (where the tag name begins with "submission-") to your private repository is considered a submission.**
+
+```bash
+cd mdx-2023-robust-music-separation-starter-kit
+
+# Add AIcrowd git remote endpoint
+git remote add aicrowd git@gitlab.aicrowd.com:<YOUR_AICROWD_USER_NAME>/mdx-2023-robust-music-separation-starter-kit.git
+git push aicrowd master
+```
+
+```bash
+
+# Commit all your changes
+git commit -am "My commit message"
+
+# Create a tag for your submission and push
+git tag -am "submission-v0.1" submission-v0.1
+git push aicrowd master
+git push aicrowd submission-v0.1
+
+# Note: if the contents of your repository (the latest commit hash) do not change,
+# then pushing a new tag will **not** trigger a new evaluation.
+```
+
+You should now be able to see the details of your submission at:
+`https://gitlab.aicrowd.com/<YOUR_AICROWD_USER_NAME>/mdx-2023-robust-music-separation-starter-kit/issues`
+
+**NOTE**: Remember to replace `<YOUR_AICROWD_USER_NAME>` in the above link with your username :wink:
+
+After completing the above steps, you should see something like the below take shape on your Repository -> Issues page:
+
+
+### Other helpful files
+
+👉 [runtime.md](/docs/runtime.md)
diff --git a/evaluate_locally.py b/evaluate_locally.py new file mode 100644 index 0000000000000000000000000000000000000000..662330e01ac0794ec8a68d3155de8e7cdc941a54 --- /dev/null +++ b/evaluate_locally.py @@ -0,0 +1,98 @@
+import os
+from tqdm.auto import tqdm
+import numpy as np
+import soundfile
+
+from my_submission.aicrowd_wrapper import AIcrowdWrapper
+from local_evaluator.sisec21_evaluation.metrics import GlobalSDR
+
+
+def check_data(datafolder):
+    """
+    Checks if the data is downloaded and placed correctly
+    """
+    inputsfolder = datafolder
+    groundtruthfolder = datafolder
+    dl_text = ("Please download the public data from"
+               "\n https://www.aicrowd.com/challenges/music-demixing-challenge-2023/problems/robust-music-separation/dataset_files"
+               "\n And unzip it with ==> unzip <zip_name> -d public_dataset")
+    if not os.path.exists(datafolder):
+        raise NameError(f'No folder named {datafolder} \n {dl_text}')
+    if not os.path.exists(groundtruthfolder):
+        raise NameError(f'No folder named {groundtruthfolder} \n {dl_text}')
+
+def calculate_metrics(ground_truth_path, prediction_path):
+    """
+    Calculate metrics for one prediction and ground truth pair for all instruments
+    """
+    metric = GlobalSDR()
+    # read in all WAV files for the four instruments
+    gt = []
+    se = []
+    instruments = ['bass', 'drums', 'other', 'vocals']
+    for instr in instruments:
+        _gt, _ = soundfile.read(os.path.join(ground_truth_path, instr + '.wav'))
+        _se, _ = soundfile.read(os.path.join(prediction_path, instr + '.wav'))
+        gt.append(_gt)
+        se.append(_se)
+    gt = np.stack(gt)  # shape: n_sources x n_samples x n_channels
+    se = np.stack(se)  # shape: n_sources x n_samples x n_channels
+    # compute scores
+    music_scores = metric(gt, se)
+
+    metrics = {f"sdr_{instr}" : float(score) for instr, score in zip(instruments, music_scores)}
+    metrics['mean_sdr'] = float(np.mean(music_scores))
+    return metrics
+
+def evaluate(LocalEvalConfig):
+    """
+    Runs local evaluation for the model.
+    The final metric computation is the same as on the evaluation server.
+    """
+    datafolder = LocalEvalConfig.DATA_FOLDER
+
+    check_data(datafolder)
+    inputsfolder = datafolder
+    groundtruthfolder = datafolder
+
+    preds_folder = LocalEvalConfig.OUTPUTS_FOLDER
+
+    model = AIcrowdWrapper(predictions_dir=preds_folder, dataset_dir=datafolder)
+    folder_names = os.listdir(datafolder)
+
+    for fname in tqdm(folder_names, desc="Demixing"):
+        model.separate_music_file(fname)
+
+    # Evaluate metrics
+    all_metrics = {}
+
+    folder_names = os.listdir(datafolder)
+    for fname in tqdm(folder_names, desc="Calculating scores"):
+        ground_truth_path = os.path.join(groundtruthfolder, fname)
+        prediction_path = os.path.join(preds_folder, fname)
+        all_metrics[fname] = calculate_metrics(ground_truth_path, prediction_path)
+
+
+    metric_keys = list(all_metrics.values())[0].keys()
+    metrics_lists = {mk: [] for mk in metric_keys}
+    for metrics in all_metrics.values():
+        for mk in metrics:
+            metrics_lists[mk].append(metrics[mk])
+
+    print("Evaluation Results")
+    results = {key: np.mean(metric_list) for key, metric_list in metrics_lists.items()}
+    for k, v in results.items():
+        print(k, v)
+
+
+if __name__ == "__main__":
+    # change the local config as needed
+    class LocalEvalConfig:
+        DATA_FOLDER = './public_dataset/MUSDB18-7-WAV/test/'
+        OUTPUTS_FOLDER = './evaluator_outputs'
+
+    outfolder = LocalEvalConfig.OUTPUTS_FOLDER
+    if not os.path.exists(outfolder):
+        os.mkdir(outfolder)
+
+    evaluate(LocalEvalConfig)
diff --git a/local_evaluation.py b/local_evaluation.py deleted file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..0000000000000000000000000000000000000000
diff --git a/local_evaluator/README.md b/local_evaluator/README.md new file mode 100644 index 0000000000000000000000000000000000000000..a5bc46762854d4faf1a740bfb1491ff69f63fee9 --- /dev/null +++ b/local_evaluator/README.md @@ -0,0 +1,5 @@
+# Helper code for evaluation
+
+This folder contains helper code for local evaluation.
+
+**Note**: Changing anything here will not change the actual evaluation on AIcrowd. \ No newline at end of file
diff --git a/local_evaluator/sisec21_evaluation/README.md b/local_evaluator/sisec21_evaluation/README.md new file mode 100644 index 0000000000000000000000000000000000000000..fe316638be0ff4de53a3ed6723baafd94f5045c3 --- /dev/null +++ b/local_evaluator/sisec21_evaluation/README.md @@ -0,0 +1,25 @@
+# SiSEC 2021 Evaluation
+
+This folder contains the code we used to decide which measure to use for the contest.
+
+```bash
+$ python eval.py --references-folder REF_DIR \
+                 --separations-folder SEP_DIR \
+                 --output RESULT.mat \
+                 --num-processes NUM_PROC
+```
+
+From the different metrics, we chose `GlobalSDR` as the metric for SiSEC2021.
+Here is an example of how to compute it:
+
+```
+$ python eval_sisec21.py --reference-folder sample/groundtruth/AM\ Contra\ -\ Heart\ Peripheral \
+                         --separation-folder sample/separation/AM\ Contra\ -\ Heart\ Peripheral
+Evaluated separation in sample/separation/AM Contra - Heart Peripheral/:
+	Bass: SDR_instr = 2.0729923248291016
+	Drums: SDR_instr = 5.259542465209961
+	Other: SDR_instr = 0.4605589509010315
+	Vocals: SDR_instr = 10.389951705932617
+SDR_song = 4.5457611083984375
+```
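+For reference, with ground-truth stem $s$, estimate $\hat{s}$, sums running over all samples $n$ and channels $c$, and $\delta = 10^{-7}$ guarding against division by zero, `GlobalSDR` in `metrics.py` computes
+
+$$
+\mathrm{SDR}_{\mathrm{instr}} = 10\log_{10}\frac{\sum_{n,c} s(n,c)^{2} + \delta}{\sum_{n,c}\bigl(s(n,c)-\hat{s}(n,c)\bigr)^{2} + \delta},
+\qquad
+\mathrm{SDR}_{\mathrm{song}} = \frac{1}{4}\sum_{\mathrm{instr}}\mathrm{SDR}_{\mathrm{instr}}.
+$$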
diff --git a/local_evaluator/sisec21_evaluation/eval.py b/local_evaluator/sisec21_evaluation/eval.py new file mode 100644 index 0000000000000000000000000000000000000000..e7a5933e867b6211b11a7e7ff0f4e51e9667710b --- /dev/null +++ b/local_evaluator/sisec21_evaluation/eval.py @@ -0,0 +1,100 @@
+"""
+eval.py
+
+Evaluate all metrics for a given separation.
+Usage:
+    $ python eval.py --references-folder REF_DIR \
+                     --separations-folder SEP_DIR \
+                     --output RESULT.mat \
+                     --num-processes NUM_PROC
+
+Stefan Uhlich, Sony RDC
+"""
+import argparse
+from multiprocessing import Pool
+import musdb
+import numpy as np
+import os
+from scipy.io import savemat
+import stempeg
+
+from metrics import MeanSDR_BSSEval4, MedianSDR_BSSEval4
+from metrics import SDR_BSSEval3, MeanFramewiseSDR_BSSEval3, MedianFramewiseSDR_BSSEval3
+from metrics import GlobalMSE, MeanFramewiseMSE, MedianFramewiseMSE
+from metrics import GlobalMAE, MeanFramewiseMAE, MedianFramewiseMAE
+from metrics import GlobalSDR, MeanFramewiseSDR, MedianFramewiseSDR
+from metrics import GlobalSISDR, MeanFramewiseSISDR, MedianFramewiseSISDR
+
+parser = argparse.ArgumentParser(description='Eval parser')
+
+parser.add_argument('--references-folder', type=str, required=True, help='path to musdb (groundtruth)')
+parser.add_argument('--separations-folder', type=str, required=True, help='path to separations')
+parser.add_argument('--output', type=str, required=True, help='path to MAT file where metrics are stored')
+parser.add_argument('--num-processes', type=int, required=True, help='number of processes for multiprocessing')
+args = parser.parse_args()
+
+# load mus data
+mus = musdb.DB(root=args.references_folder, subsets='test', is_wav=True)
+
+# define instruments
+instruments = ['drums', 'bass', 'other', 'vocals']
+
+# define metrics
+global_metrics = [MeanSDR_BSSEval4, MedianSDR_BSSEval4,
+                  SDR_BSSEval3, MeanFramewiseSDR_BSSEval3, MedianFramewiseSDR_BSSEval3,
+                  GlobalMSE, GlobalMAE, GlobalSDR, GlobalSISDR]
+framewise_metrics = [MeanFramewiseMSE, MedianFramewiseMSE,
+                     MeanFramewiseMAE, MedianFramewiseMAE,
+                     MeanFramewiseSDR, MedianFramewiseSDR,
+                     MeanFramewiseSISDR, MedianFramewiseSISDR]
+
+metrics = []
+metrics.extend([metric() for metric in global_metrics])
+metrics.extend([metric(win=44100, hop=44100) for metric in framewise_metrics])
+
+# define output dictionary
+n_metrics = len(global_metrics) + len(framewise_metrics)
+results = {instr: np.zeros((n_metrics, 50)) for instr in instruments}
+results['metrics'] = [str(metric.__class__) for metric in metrics]
+
+
+# define evaluation function
+def _eval(track):
+    print(f'Evaluating track {track.path}')
+    # create return variable
+    res = {instr: np.zeros((n_metrics, )) for instr in instruments}
+
+    # get reference (order: 'drums', 'bass', 'other', 'vocals')
+    references = track.stems[1:]
+
+    # get separation
+    sep_path = os.path.dirname(track.path).replace(args.references_folder, args.separations_folder)
+    separations = [stempeg.read_stems(os.path.join(sep_path, instr + '.wav'), ffmpeg_format="s16le")[0]
+                   for instr in instruments]
+    separations = np.array(separations)
+
+    # apply metrics
+    for idx_metric, metric in enumerate(metrics):
+        m = metric(references, separations)
+        for idx_instr, instr in enumerate(instruments):
+            res[instr][idx_metric] = m[idx_instr]
+
+    return res
+
+
+# use multi-processing to evaluate tracks in parallel
+with Pool(args.num_processes) as tp:
+    _results = tp.map(_eval, mus)
+
+# store results
+for idx_track in range(len(mus)):
+    for idx_instr, instr in enumerate(instruments):
+        results[instr][:, idx_track] = _results[idx_track][instr]
+
+# store matlab file
+savemat(args.output, results)
diff --git a/local_evaluator/sisec21_evaluation/eval_sisec21.py b/local_evaluator/sisec21_evaluation/eval_sisec21.py new file mode 100644 index 0000000000000000000000000000000000000000..c4aa2d75ba6fecf631f0839bc660a71b43c52f1e --- /dev/null +++ b/local_evaluator/sisec21_evaluation/eval_sisec21.py
@@ -0,0 +1,44 @@
+"""
+Evaluate separation using signal-to-distortion ratio (SDR)
+
+Stefan Uhlich, Sony RDC
+"""
+import argparse
+import numpy as np
+import os
+import soundfile as sf
+
+from metrics import GlobalSDR
+
+parser = argparse.ArgumentParser(description='Eval parser')
+parser.add_argument('--reference-folder', type=str, required=True, help='path to single song (groundtruth)')
+parser.add_argument('--separation-folder', type=str, required=True, help='path to single song (separations)')
+args = parser.parse_args()
+
+# Create metric object
+metric = GlobalSDR()
+
+# read in all WAV files for the four instruments
+gt = []
+se = []
+for instr in ['bass', 'drums', 'other', 'vocals']:
+    _gt, _fs = sf.read(os.path.join(args.reference_folder, instr + '.wav'))
+    _se, _fs = sf.read(os.path.join(args.separation_folder, instr + '.wav'))
+    gt.append(_gt)
+    se.append(_se)
+
+gt = np.stack(gt)  # shape: n_sources x n_samples x n_channels
+se = np.stack(se)  # shape: n_sources x n_samples x n_channels
+
+# compute scores
+scores = metric(gt, se)
+
+print(f'Evaluated separation in {args.separation_folder}:')
+print(f'\tBass: SDR_instr = {scores[0]}')
+print(f'\tDrums: SDR_instr = {scores[1]}')
+print(f'\tOther: SDR_instr = {scores[2]}')
+print(f'\tVocals: SDR_instr = {scores[3]}')
+
+print(f'SDR_song = {np.mean(scores)}')
diff --git a/local_evaluator/sisec21_evaluation/metrics.py b/local_evaluator/sisec21_evaluation/metrics.py new file mode 100644 index 0000000000000000000000000000000000000000..5db5203a3efe2c2c5fb9e7e41ab16e11c3d4907c --- /dev/null +++ b/local_evaluator/sisec21_evaluation/metrics.py @@ -0,0 +1,255 @@
+"""
+metrics.py
+
+Contains metrics to evaluate source separation results.
+Stefan Uhlich, Sony RDC
+"""
+import numpy as np
+from museval import evaluate
+from mir_eval.separation import bss_eval_images, bss_eval_images_framewise
+
+
+class BaseMetric(object):
+    """ Base class that implements a metric for audio source separation.
+
+    ```
+    metric = BaseMetric()
+    scores = metric(references, separations)  # shape: [n_sources]
+    ```
+
+    where `references` and `separations` are numpy arrays of shape
+    [n_sources x n_samples x n_channels].
+ """ + def __call__(self, references, separations): + # make sure that we have `np.float32` data type + references = references.astype(np.float32) + separations = separations.astype(np.float32) + + # make sure that `separations` and `references` have the same shape + assert references.shape == separations.shape,\ + (f"Shape mismatch between references ({references.shape}) and " + f"estimates ({separations.shape}).") + + # return metric + return self._metric(references, separations) + + def _metric(self, references, separations): + """ implements our metric that we use to compare `references` with + `separations` """ + raise NotImplementedError + + +class MedianSDR_BSSEval4(BaseMetric): + """ SDR from BSS Eval v4: https://github.com/sigsep/sigsep-mus-eval + This is the main metric for SiSEC 2018 """ + def _metric(self, references, separations): + sdr, _isr, _sir, _sar = evaluate(references, separations) + return np.nanmedian(sdr, axis=1) + + +class MeanSDR_BSSEval4(BaseMetric): + """ SDR from BSS Eval v4: https://github.com/sigsep/sigsep-mus-eval + This is the main metric for SiSEC 2018 """ + def _metric(self, references, separations): + sdr, _isr, _sir, _sar = evaluate(references, separations) + return np.nanmean(sdr, axis=1) + + +class MedianFramewiseSDR_BSSEval3(BaseMetric): + """ SDR from BSS Eval v3 implemented by mir-eval: + https://github.com/craffel/mir_eval/blob/master/mir_eval/separation.py + This is the main metric for SiSEC 2016 """ + def _metric(self, references, separations): + sdr, _isr, _sir, _sar, _perm = bss_eval_images_framewise(references, separations, + compute_permutation=False) + # remove all -inf/inf values + sdr = [_ for _ in sdr if not np.any(np.isinf(sdr))] + return np.nanmedian(sdr, axis=1) + + +class MeanFramewiseSDR_BSSEval3(BaseMetric): + """ SDR from BSS Eval v3 implemented by mir-eval: + https://github.com/craffel/mir_eval/blob/master/mir_eval/separation.py + This is the main metric for SiSEC 2016 """ + def _metric(self, references, separations): + sdr, _isr, _sir, _sar, _perm = bss_eval_images_framewise(references, separations, + compute_permutation=False) + # remove all -inf/inf values + sdr = [_ for _ in sdr if not np.any(np.isinf(sdr))] + return np.nanmean(sdr, axis=1) + + +class SDR_BSSEval3(BaseMetric): + """ SDR from BSS Eval v3 implemented by mir-eval: + https://github.com/craffel/mir_eval/blob/master/mir_eval/separation.py + This is the original BSSEvalv3 (Matlab) """ + def _metric(self, references, separations): + sdr, _isr, _sir, _sar, _popt = bss_eval_images(references, separations, + compute_permutation=False) + return sdr + + +class GlobalSISDR(BaseMetric): + """ SI-SDR - see, e.g., https://arxiv.org/pdf/1811.02508.pdf """ + def _metric(self, references, separations): + delta = 1e-7 # avoid numerical errors + + alpha = np.sum(separations * references, axis=(1, 2)) / \ + (delta + np.sum(references * references, axis=(1, 2))) + alpha = alpha[:, np.newaxis, np.newaxis] + + num = np.sum(np.square(alpha * references), axis=(1, 2)) + den = np.sum(np.square(alpha * references - separations), axis=(1, 2)) + num += delta + den += delta + return 10 * np.log10(num / den) + + +class MedianFramewiseSISDR(GlobalSISDR): + """ Framewise SI-SDR + Median averaging """ + def __init__(self, win, hop): + super().__init__() + self.win = win + self.hop = hop + + def _metric(self, references, separations): + vals = np.zeros((references.shape[0], + 1 + (references.shape[1] - self.win) // self.hop)) + for i, idx in enumerate(range(0, references.shape[1] - self.win, 
+            vals[:, i] = super()._metric(references[:, idx:idx+self.win, :],
+                                         separations[:, idx:idx+self.win, :])
+        return np.median(vals, axis=1)
+
+
+class MeanFramewiseSISDR(GlobalSISDR):
+    """ Framewise SI-SDR + Mean averaging """
+    def __init__(self, win, hop):
+        super().__init__()
+        self.win = win
+        self.hop = hop
+
+    def _metric(self, references, separations):
+        vals = np.zeros((references.shape[0],
+                         1 + (references.shape[1] - self.win) // self.hop))
+        for i, idx in enumerate(range(0, references.shape[1] - self.win, self.hop)):
+            vals[:, i] = super()._metric(references[:, idx:idx+self.win, :],
+                                         separations[:, idx:idx+self.win, :])
+        return np.mean(vals, axis=1)
+
+
+class GlobalMSE(BaseMetric):
+    """ Global MSE """
+    def _metric(self, references, separations):
+        return np.mean(np.square(references - separations), axis=(1, 2))
+
+
+class MedianFramewiseMSE(GlobalMSE):
+    """ Framewise MSE + Median averaging """
+    def __init__(self, win, hop):
+        super().__init__()
+        self.win = win
+        self.hop = hop
+
+    def _metric(self, references, separations):
+        vals = np.zeros((references.shape[0],
+                         1 + (references.shape[1] - self.win) // self.hop))
+        for i, idx in enumerate(range(0, references.shape[1] - self.win, self.hop)):
+            vals[:, i] = super()._metric(references[:, idx:idx+self.win, :],
+                                         separations[:, idx:idx+self.win, :])
+        return np.median(vals, axis=1)
+
+
+class MeanFramewiseMSE(GlobalMSE):
+    """ Framewise MSE + Mean averaging """
+    def __init__(self, win, hop):
+        super().__init__()
+        self.win = win
+        self.hop = hop
+
+    def _metric(self, references, separations):
+        vals = np.zeros((references.shape[0],
+                         1 + (references.shape[1] - self.win) // self.hop))
+        for i, idx in enumerate(range(0, references.shape[1] - self.win, self.hop)):
+            vals[:, i] = super()._metric(references[:, idx:idx+self.win, :],
+                                         separations[:, idx:idx+self.win, :])
+        return np.mean(vals, axis=1)
+
+
+class GlobalMAE(BaseMetric):
+    """ Global MAE """
+    def _metric(self, references, separations):
+        return np.mean(np.abs(references - separations), axis=(1, 2))
+
+
+class MedianFramewiseMAE(GlobalMAE):
+    """ Framewise MAE + Median averaging """
+    def __init__(self, win, hop):
+        super().__init__()
+        self.win = win
+        self.hop = hop
+
+    def _metric(self, references, separations):
+        vals = np.zeros((references.shape[0],
+                         1 + (references.shape[1] - self.win) // self.hop))
+        for i, idx in enumerate(range(0, references.shape[1] - self.win, self.hop)):
+            vals[:, i] = super()._metric(references[:, idx:idx+self.win, :],
+                                         separations[:, idx:idx+self.win, :])
+        return np.median(vals, axis=1)
+
+
+class MeanFramewiseMAE(GlobalMAE):
+    """ Framewise MAE + Mean averaging """
+    def __init__(self, win, hop):
+        super().__init__()
+        self.win = win
+        self.hop = hop
+
+    def _metric(self, references, separations):
+        vals = np.zeros((references.shape[0],
+                         1 + (references.shape[1] - self.win) // self.hop))
+        for i, idx in enumerate(range(0, references.shape[1] - self.win, self.hop)):
+            vals[:, i] = super()._metric(references[:, idx:idx+self.win, :],
+                                         separations[:, idx:idx+self.win, :])
+        return np.mean(vals, axis=1)
+
+
+class GlobalSDR(BaseMetric):
+    """ Global SDR """
+    def _metric(self, references, separations):
+        delta = 1e-7  # avoid numerical errors
+        num = np.sum(np.square(references), axis=(1, 2))
+        den = np.sum(np.square(references - separations), axis=(1, 2))
+        num += delta
+        den += delta
+        return 10 * np.log10(num / den)
+
+
+class MedianFramewiseSDR(GlobalSDR):
+    """ Framewise SDR + Median averaging """
+    def __init__(self,
win, hop): + super().__init__() + self.win = win + self.hop = hop + + def _metric(self, references, separations): + vals = np.zeros((references.shape[0], + 1 + (references.shape[1] - self.win) // self.hop)) + for i, idx in enumerate(range(0, references.shape[1] - self.win, self.hop)): + vals[:, i] = super()._metric(references[:, idx:idx+self.win, :], + separations[:, idx:idx+self.win, :]) + return np.median(vals, axis=1) + + +class MeanFramewiseSDR(GlobalSDR): + """ Framewise SDR + Mean averaging """ + def __init__(self, win, hop): + super().__init__() + self.win = win + self.hop = hop + + def _metric(self, references, separations): + vals = np.zeros((references.shape[0], + 1 + (references.shape[1] - self.win) // self.hop)) + for i, idx in enumerate(range(0, references.shape[1] - self.win, self.hop)): + vals[:, i] = super()._metric(references[:, idx:idx+self.win, :], + separations[:, idx:idx+self.win, :]) + return np.mean(vals, axis=1) diff --git a/local_evaluator/sisec21_evaluation/results/Evaluation_TAU1_FULL.xlsx b/local_evaluator/sisec21_evaluation/results/Evaluation_TAU1_FULL.xlsx new file mode 100755 index 0000000000000000000000000000000000000000..b204bb11cd9507cc57637e3c16375c6ee0049f4d Binary files /dev/null and b/local_evaluator/sisec21_evaluation/results/Evaluation_TAU1_FULL.xlsx differ diff --git a/local_evaluator/sisec21_evaluation/results/corr_coef_pearson.m b/local_evaluator/sisec21_evaluation/results/corr_coef_pearson.m new file mode 100644 index 0000000000000000000000000000000000000000..93761507fbd39b0f686e7334b6f276e2830352b7 --- /dev/null +++ b/local_evaluator/sisec21_evaluation/results/corr_coef_pearson.m @@ -0,0 +1,4 @@ +function rho = corr_coef_pearson(x, y) +%% Compute Pearson correlation coefficient + +rho = (x-mean(x))'*(y-mean(y)) / (eps + sqrt((x-mean(x))'*(x-mean(x)))) / (eps + sqrt((y-mean(y))'*(y-mean(y)))); diff --git a/local_evaluator/sisec21_evaluation/results/corr_coef_spearman.m b/local_evaluator/sisec21_evaluation/results/corr_coef_spearman.m new file mode 100644 index 0000000000000000000000000000000000000000..a05c8f5b93c9a4c11a2fbf79e27710e59d1125f1 --- /dev/null +++ b/local_evaluator/sisec21_evaluation/results/corr_coef_spearman.m @@ -0,0 +1,7 @@ +function rho = corr_coef_spearman(x, y) +%% Compute Spearman correlation coefficient + +[~, ~, rk1] = unique(x); +[~, ~, rk2] = unique(y); + +rho = corr_coef_pearson(rk1, rk2); diff --git a/local_evaluator/sisec21_evaluation/results/fig1.png b/local_evaluator/sisec21_evaluation/results/fig1.png new file mode 100644 index 0000000000000000000000000000000000000000..57f893a3c80dcbb2e5df2fcff6c2358ed6fd6b49 Binary files /dev/null and b/local_evaluator/sisec21_evaluation/results/fig1.png differ diff --git a/local_evaluator/sisec21_evaluation/results/fig10.png b/local_evaluator/sisec21_evaluation/results/fig10.png new file mode 100644 index 0000000000000000000000000000000000000000..8df3cd6a1ad7f85b549ee26d086490eb18d70562 Binary files /dev/null and b/local_evaluator/sisec21_evaluation/results/fig10.png differ diff --git a/local_evaluator/sisec21_evaluation/results/fig11.png b/local_evaluator/sisec21_evaluation/results/fig11.png new file mode 100644 index 0000000000000000000000000000000000000000..3a12608092df0f410eee2b705815405c88248d71 Binary files /dev/null and b/local_evaluator/sisec21_evaluation/results/fig11.png differ diff --git a/local_evaluator/sisec21_evaluation/results/fig12.png b/local_evaluator/sisec21_evaluation/results/fig12.png new file mode 100644 index 
0000000000000000000000000000000000000000..98a5f5ea5dc62651e4622c1592d14d5a97f15c25 Binary files /dev/null and b/local_evaluator/sisec21_evaluation/results/fig12.png differ diff --git a/local_evaluator/sisec21_evaluation/results/fig13.png b/local_evaluator/sisec21_evaluation/results/fig13.png new file mode 100644 index 0000000000000000000000000000000000000000..b8d4f864a2781b96c2893897a1c39c0424020450 Binary files /dev/null and b/local_evaluator/sisec21_evaluation/results/fig13.png differ diff --git a/local_evaluator/sisec21_evaluation/results/fig14.png b/local_evaluator/sisec21_evaluation/results/fig14.png new file mode 100644 index 0000000000000000000000000000000000000000..ec3387c96c08ec5284d52ec5935a157edff9bb12 Binary files /dev/null and b/local_evaluator/sisec21_evaluation/results/fig14.png differ diff --git a/local_evaluator/sisec21_evaluation/results/fig15.png b/local_evaluator/sisec21_evaluation/results/fig15.png new file mode 100644 index 0000000000000000000000000000000000000000..b6609e22cfd57a17cb03943e7a35f5a9bdff4955 Binary files /dev/null and b/local_evaluator/sisec21_evaluation/results/fig15.png differ diff --git a/local_evaluator/sisec21_evaluation/results/fig16.png b/local_evaluator/sisec21_evaluation/results/fig16.png new file mode 100644 index 0000000000000000000000000000000000000000..79c69cf7e1e9ff2fc4b3ffe347fcd9e2fba84a35 Binary files /dev/null and b/local_evaluator/sisec21_evaluation/results/fig16.png differ diff --git a/local_evaluator/sisec21_evaluation/results/fig17.png b/local_evaluator/sisec21_evaluation/results/fig17.png new file mode 100644 index 0000000000000000000000000000000000000000..e195b1f89b374c4f4fdf00dbc42a9edd5c8c9ffc Binary files /dev/null and b/local_evaluator/sisec21_evaluation/results/fig17.png differ diff --git a/local_evaluator/sisec21_evaluation/results/fig2.png b/local_evaluator/sisec21_evaluation/results/fig2.png new file mode 100644 index 0000000000000000000000000000000000000000..43c075206798499c9263f62e8cab0c5151ca15e1 Binary files /dev/null and b/local_evaluator/sisec21_evaluation/results/fig2.png differ diff --git a/local_evaluator/sisec21_evaluation/results/fig3.png b/local_evaluator/sisec21_evaluation/results/fig3.png new file mode 100644 index 0000000000000000000000000000000000000000..d55f83d0d070a5ac448aac6a2c42ea66aea9f803 Binary files /dev/null and b/local_evaluator/sisec21_evaluation/results/fig3.png differ diff --git a/local_evaluator/sisec21_evaluation/results/fig4.png b/local_evaluator/sisec21_evaluation/results/fig4.png new file mode 100644 index 0000000000000000000000000000000000000000..33529d629d730371566e05a8fb33c63fc7f9ea0b Binary files /dev/null and b/local_evaluator/sisec21_evaluation/results/fig4.png differ diff --git a/local_evaluator/sisec21_evaluation/results/fig5.png b/local_evaluator/sisec21_evaluation/results/fig5.png new file mode 100644 index 0000000000000000000000000000000000000000..ee9814eb59278543bf8718b884d112f3f32dce4f Binary files /dev/null and b/local_evaluator/sisec21_evaluation/results/fig5.png differ diff --git a/local_evaluator/sisec21_evaluation/results/fig6.png b/local_evaluator/sisec21_evaluation/results/fig6.png new file mode 100644 index 0000000000000000000000000000000000000000..c4428896c8eb39b30f782cce4d0b6964180f508b Binary files /dev/null and b/local_evaluator/sisec21_evaluation/results/fig6.png differ diff --git a/local_evaluator/sisec21_evaluation/results/fig7.png b/local_evaluator/sisec21_evaluation/results/fig7.png new file mode 100644 index 
0000000000000000000000000000000000000000..a4fb037926b3a7d1d65ca273b6a09097755d9820 Binary files /dev/null and b/local_evaluator/sisec21_evaluation/results/fig7.png differ diff --git a/local_evaluator/sisec21_evaluation/results/fig8.png b/local_evaluator/sisec21_evaluation/results/fig8.png new file mode 100644 index 0000000000000000000000000000000000000000..f91989f27661f477bc288e507d5f3f0f655616c7 Binary files /dev/null and b/local_evaluator/sisec21_evaluation/results/fig8.png differ diff --git a/local_evaluator/sisec21_evaluation/results/fig9.png b/local_evaluator/sisec21_evaluation/results/fig9.png new file mode 100644 index 0000000000000000000000000000000000000000..d2f98380b19f3d3d50bd27c8a016d740a20785f5 Binary files /dev/null and b/local_evaluator/sisec21_evaluation/results/fig9.png differ diff --git a/local_evaluator/sisec21_evaluation/results/output_TAU1_30.mat b/local_evaluator/sisec21_evaluation/results/output_TAU1_30.mat new file mode 100644 index 0000000000000000000000000000000000000000..dadd30dfa3f1aae92968b6058a40c2a22c8033e7 Binary files /dev/null and b/local_evaluator/sisec21_evaluation/results/output_TAU1_30.mat differ diff --git a/local_evaluator/sisec21_evaluation/results/output_TAU1_FULL.mat b/local_evaluator/sisec21_evaluation/results/output_TAU1_FULL.mat new file mode 100644 index 0000000000000000000000000000000000000000..911f02ef290af909a4632204a2b6fb69822f1160 Binary files /dev/null and b/local_evaluator/sisec21_evaluation/results/output_TAU1_FULL.mat differ diff --git a/local_evaluator/sisec21_evaluation/results/plot_metrics.m b/local_evaluator/sisec21_evaluation/results/plot_metrics.m new file mode 100644 index 0000000000000000000000000000000000000000..57eeda02c1a38b87ceafdfdf0a53832575dbdbdf --- /dev/null +++ b/local_evaluator/sisec21_evaluation/results/plot_metrics.m @@ -0,0 +1,104 @@ +clc +clear + +% load evaluation data +load output_TAU1_FULL.mat + +% define reference measure (to which we compare) +reference_measure_idx = 2; + +% convert `metrics` into cell array +metrics_cell = cell(size(metrics, 1), 1); +for k = 1:numel(metrics_cell), metrics_cell{k} = strtrim(metrics(k, :));end +for k = 1:numel(metrics_cell), metrics_cell{k} = strrep(metrics_cell{k}, '<class ''metrics.', '');end +for k = 1:numel(metrics_cell), metrics_cell{k} = metrics_cell{k}(1:end-2);end +for k = 1:numel(metrics_cell), metrics_cell{k} = strrep(metrics_cell{k}, '_', '\_');end + +for idx = 1:numel(metrics_cell) + figure(idx) + + min_rho_pearson = inf; + max_rho_pearson = -inf; + mean_rho_pearson = 0; + min_rho_spearman = inf; + max_rho_spearman = -inf; + mean_rho_spearman = 0; + + subplot(2,2,1) + scatter(bass(idx,:), bass(reference_measure_idx,:), 'filled') + xlabel(metrics_cell{idx}) + ylabel(metrics_cell{reference_measure_idx}) + rho_pearson = corr_coef_pearson(bass(idx,:)', bass(reference_measure_idx,:)'); + rho_spearman = corr_coef_spearman(bass(idx,:)', bass(reference_measure_idx,:)'); + title(sprintf('bass pear: %.2f spear: %.2f', rho_pearson, rho_spearman)) + grid on + box on + min_rho_pearson = min(min_rho_pearson, rho_pearson); + max_rho_pearson = max(max_rho_pearson, rho_pearson); + mean_rho_pearson = mean_rho_pearson + rho_pearson; + min_rho_spearman = min(min_rho_spearman, rho_spearman); + max_rho_spearman = max(max_rho_spearman, rho_spearman); + mean_rho_spearman = mean_rho_spearman + rho_spearman; + + subplot(2,2,2) + scatter(drums(idx,:), drums(reference_measure_idx,:), 'filled') + xlabel(metrics_cell{idx}) + ylabel(metrics_cell{reference_measure_idx}) + rho_pearson = 
corr_coef_pearson(drums(idx,:)', drums(reference_measure_idx,:)'); + rho_spearman = corr_coef_spearman(drums(idx,:)', drums(reference_measure_idx,:)'); + title(sprintf('drums pear: %.2f spear: %.2f', rho_pearson, rho_spearman)) + grid on + box on + min_rho_pearson = min(min_rho_pearson, rho_pearson); + max_rho_pearson = max(max_rho_pearson, rho_pearson); + mean_rho_pearson = mean_rho_pearson + rho_pearson; + min_rho_spearman = min(min_rho_spearman, rho_spearman); + max_rho_spearman = max(max_rho_spearman, rho_spearman); + mean_rho_spearman = mean_rho_spearman + rho_spearman; + + subplot(2,2,3) + scatter(other(idx,:), other(reference_measure_idx,:), 'filled') + xlabel(metrics_cell{idx}) + ylabel(metrics_cell{reference_measure_idx}) + rho_pearson = corr_coef_pearson(other(idx,:)', other(reference_measure_idx,:)'); + rho_spearman = corr_coef_spearman(other(idx,:)', other(reference_measure_idx,:)'); + title(sprintf('other pear: %.2f spear: %.2f', rho_pearson, rho_spearman)) + grid on + box on + min_rho_pearson = min(min_rho_pearson, rho_pearson); + max_rho_pearson = max(max_rho_pearson, rho_pearson); + mean_rho_pearson = mean_rho_pearson + rho_pearson; + min_rho_spearman = min(min_rho_spearman, rho_spearman); + max_rho_spearman = max(max_rho_spearman, rho_spearman); + mean_rho_spearman = mean_rho_spearman + rho_spearman; + + subplot(2,2,4) + scatter(vocals(idx,:), vocals(reference_measure_idx,:), 'filled') + xlabel(metrics_cell{idx}) + ylabel(metrics_cell{reference_measure_idx}) + rho_pearson = corr_coef_pearson(vocals(idx,:)', vocals(reference_measure_idx,:)'); + rho_spearman = corr_coef_spearman(vocals(idx,:)', vocals(reference_measure_idx,:)'); + title(sprintf('vocals pear: %.2f spear: %.2f', rho_pearson, rho_spearman)) + grid on + box on + min_rho_pearson = min(min_rho_pearson, rho_pearson); + max_rho_pearson = max(max_rho_pearson, rho_pearson); + mean_rho_pearson = mean_rho_pearson + rho_pearson; + min_rho_spearman = min(min_rho_spearman, rho_spearman); + max_rho_spearman = max(max_rho_spearman, rho_spearman); + mean_rho_spearman = mean_rho_spearman + rho_spearman; + + mean_rho_pearson = mean_rho_pearson / 4; + mean_rho_spearman = mean_rho_spearman / 4; + fprintf('%s\t%f\t%f\t%f\t%f\t%f\t%f\n', metrics_cell{idx}, ... + min_rho_pearson, mean_rho_pearson, max_rho_pearson, ... 
+ min_rho_spearman, mean_rho_spearman, max_rho_spearman); + + print('-dpng', '-r300', sprintf('fig%d.png', idx)); +end + +fprintf('Mean scores: \n') +disp([mean(bass, 2) mean(drums, 2) mean(other, 2) mean(vocals, 2)]) + +fprintf('Median scores: \n') +disp([median(bass, 2) median(drums, 2) median(other, 2) median(vocals, 2)]) \ No newline at end of file diff --git a/local_evaluator/sisec21_evaluation/sample/groundtruth/AM Contra - Heart Peripheral/bass.wav b/local_evaluator/sisec21_evaluation/sample/groundtruth/AM Contra - Heart Peripheral/bass.wav new file mode 100644 index 0000000000000000000000000000000000000000..61c1ae143ad26737c3c43bf25e41a48ba0352866 Binary files /dev/null and b/local_evaluator/sisec21_evaluation/sample/groundtruth/AM Contra - Heart Peripheral/bass.wav differ diff --git a/local_evaluator/sisec21_evaluation/sample/groundtruth/AM Contra - Heart Peripheral/drums.wav b/local_evaluator/sisec21_evaluation/sample/groundtruth/AM Contra - Heart Peripheral/drums.wav new file mode 100644 index 0000000000000000000000000000000000000000..10ac4f04f156553a4828cdb53606842a47c51cea Binary files /dev/null and b/local_evaluator/sisec21_evaluation/sample/groundtruth/AM Contra - Heart Peripheral/drums.wav differ diff --git a/local_evaluator/sisec21_evaluation/sample/groundtruth/AM Contra - Heart Peripheral/other.wav b/local_evaluator/sisec21_evaluation/sample/groundtruth/AM Contra - Heart Peripheral/other.wav new file mode 100644 index 0000000000000000000000000000000000000000..71fdb3d800abb772a9808e729b60acf544cfaf61 Binary files /dev/null and b/local_evaluator/sisec21_evaluation/sample/groundtruth/AM Contra - Heart Peripheral/other.wav differ diff --git a/local_evaluator/sisec21_evaluation/sample/groundtruth/AM Contra - Heart Peripheral/vocals.wav b/local_evaluator/sisec21_evaluation/sample/groundtruth/AM Contra - Heart Peripheral/vocals.wav new file mode 100644 index 0000000000000000000000000000000000000000..76e8369a78c9f8732e4028818fac12e99d372ea2 Binary files /dev/null and b/local_evaluator/sisec21_evaluation/sample/groundtruth/AM Contra - Heart Peripheral/vocals.wav differ diff --git a/local_evaluator/sisec21_evaluation/sample/separation/AM Contra - Heart Peripheral/bass.wav b/local_evaluator/sisec21_evaluation/sample/separation/AM Contra - Heart Peripheral/bass.wav new file mode 100644 index 0000000000000000000000000000000000000000..d374ffb2a8d9b0e1addd02028ef92c8727503c25 Binary files /dev/null and b/local_evaluator/sisec21_evaluation/sample/separation/AM Contra - Heart Peripheral/bass.wav differ diff --git a/local_evaluator/sisec21_evaluation/sample/separation/AM Contra - Heart Peripheral/drums.wav b/local_evaluator/sisec21_evaluation/sample/separation/AM Contra - Heart Peripheral/drums.wav new file mode 100644 index 0000000000000000000000000000000000000000..bfd64189a868381fea755262242e29b2bf481853 Binary files /dev/null and b/local_evaluator/sisec21_evaluation/sample/separation/AM Contra - Heart Peripheral/drums.wav differ diff --git a/local_evaluator/sisec21_evaluation/sample/separation/AM Contra - Heart Peripheral/other.wav b/local_evaluator/sisec21_evaluation/sample/separation/AM Contra - Heart Peripheral/other.wav new file mode 100644 index 0000000000000000000000000000000000000000..71d07975e7602185457ee8fdd1586c1a9bbfd74f Binary files /dev/null and b/local_evaluator/sisec21_evaluation/sample/separation/AM Contra - Heart Peripheral/other.wav differ diff --git a/local_evaluator/sisec21_evaluation/sample/separation/AM Contra - Heart Peripheral/vocals.wav 
b/local_evaluator/sisec21_evaluation/sample/separation/AM Contra - Heart Peripheral/vocals.wav new file mode 100644 index 0000000000000000000000000000000000000000..3f601c59180c5308b204e16398c88d9b5ed13ebb Binary files /dev/null and b/local_evaluator/sisec21_evaluation/sample/separation/AM Contra - Heart Peripheral/vocals.wav differ
diff --git a/my_submission/README.md b/my_submission/README.md index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..d09e03fc5fd8bb0b79f1bd928346466823401edf 100644 --- a/my_submission/README.md +++ b/my_submission/README.md @@ -0,0 +1,18 @@
+## How to write your own models?
+
+We recommend that you place the code for all your models in the `my_submission` directory (though it is not mandatory). We have added sample code that just returns the original soundfile in `identity_music_separation_model.py`.
+
+**Add your model name in** [`user_config.py`](user_config.py)
+
+## What instruments need to be separated?
+
+Your model needs to output 4 separate sound arrays corresponding to 'bass', 'drums', 'other', and 'vocals'.
+
+## Music separation model format
+You will have to implement a class containing the function `separate_music_file`. It will receive a sound array containing the mixed music. You need to return a dictionary with the instruments as keys and the corresponding demixed arrays as values.
+
+The evaluator uses `MySeparationModel` from `user_config.py` as its entrypoint. Specify the class name for your model there; an example follows below.
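+For example, if your model class lived in a hypothetical `my_submission/my_new_separation_model.py`, `user_config.py` would become:
+
+```python
+from my_submission.my_new_separation_model import MyNewSeparationModel
+
+MySeparationModel = MyNewSeparationModel
+```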
+
+## What's the AIcrowd Wrapper?
+
+Don't change this file. It reads the input sound files and saves the outputs you predict; we run it on the client side for efficiency. `AIcrowdWrapper` is the class that is actually called by the evaluator, and it also contains some checks that your predictions are formatted correctly. \ No newline at end of file
diff --git a/my_submission/aicrowd_wrapper.py b/my_submission/aicrowd_wrapper.py index 2d2f2218b0c7520cc3114f95940696bfb994fa99..1f03d522bb2d86fe4ebd1a29d9123f14a26ec3b4 100644 --- a/my_submission/aicrowd_wrapper.py +++ b/my_submission/aicrowd_wrapper.py @@ -1,5 +1,4 @@ ## DO NOT CHANGE THIS FILE
-## Your changes will be discarded at the server evaluation
 import os import numpy as np @@ -35,7 +34,7 @@ class AIcrowdWrapper: raise NameError(msg) def check_output(self, separated_music_arrays, output_sample_rates):
-        pass
+        assert set(self.instruments) == set(separated_music_arrays.keys()), "Not all instruments are present"
 def save_prediction(self, prediction_path, separated_music_arrays, output_sample_rates): if not os.path.exists(prediction_path):
diff --git a/my_submission/random_music_separation_model.py b/my_submission/identity_music_separation_model.py similarity index 78% rename from my_submission/random_music_separation_model.py rename to my_submission/identity_music_separation_model.py index 23b3c3c5d4c82e1888d51ba6b7cd3c275144042d..6d153d78f80456f0c9d753cbdf4228e73e7c1ee5 100644 --- a/my_submission/random_music_separation_model.py +++ b/my_submission/identity_music_separation_model.py @@ -1,6 +1,9 @@ import numpy as np
-class RandomMusicSeparationModel:
+class IdentityMusicSeparationModel:
+    """
+    Doesn't do any separation; just passes the input back as the output
+    """
 def __init__(self): """ Initialize your model here """ @@ -32,9 +35,9 @@ separated_music_arrays = {} output_sample_rates = {} for instrument in self.instruments:
-            separated_music_arrays[instrument] = np.random.rand(input_length, 2) * 2 - 1
+            separated_music_arrays[instrument] = mixed_sound_array.copy()
+            # separated_music_arrays[instrument] = np.random.rand(input_length, 2) * 2 - 1  # Random predictions
 output_sample_rates[instrument] = sample_rate
-        assert set(self.instruments) == set(separated_music_arrays.keys()), "All instrument not present"
 return separated_music_arrays, output_sample_rates \ No newline at end of file
diff --git a/my_submission/user_config.py b/my_submission/user_config.py index a8123c1ab91561d4128a614e7f3b4591365e725e..518f1f6e0b54cbbe95fbfc3a2edb4beb95dbd8ab 100644 --- a/my_submission/user_config.py +++ b/my_submission/user_config.py @@ -1,3 +1,3 @@
-from my_submission.random_music_separation_model import RandomMusicSeparationModel
+from my_submission.identity_music_separation_model import IdentityMusicSeparationModel
-MySeparationModel = RandomMusicSeparationModel \ No newline at end of file
+MySeparationModel = IdentityMusicSeparationModel \ No newline at end of file