Commits on Source (41)
Showing with 644 additions and 213 deletions
models/**
\ No newline at end of file
.git/
models/**
data/
\ No newline at end of file
FROM nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu20.04
FROM nvidia/cuda:12.1.1-cudnn8-devel-ubuntu20.04
ENV DEBIAN_FRONTEND=noninteractive \
LANG=en_US.UTF-8 \
......@@ -12,7 +12,7 @@ ENV DEBIAN_FRONTEND=noninteractive \
# Install system dependencies and clean up in one layer
COPY apt.txt /tmp/apt.txt
RUN apt -qq update && apt -qq install -y --no-install-recommends `cat /tmp/apt.txt` locales wget \
RUN apt -qq update && apt -qq install -y --no-install-recommends `cat /tmp/apt.txt | tr -d '\r'` locales wget build-essential \
&& locale-gen en_US.UTF-8 \
&& rm -rf /var/cache/apt/* /var/lib/apt/lists/* \
&& apt clean
......@@ -24,7 +24,7 @@ RUN groupadd -g 1001 aicrowd && \
USER ${USER_NAME}
WORKDIR ${HOME_DIR}
# Install Miniconda and Python packages
# Install Miniconda and Python packages. To change the Python version, use a different Miniconda installer.
RUN wget -nv -O miniconda.sh https://repo.anaconda.com/miniconda/Miniconda3-py38_22.11.1-1-Linux-x86_64.sh \
&& bash miniconda.sh -b -p ${CONDA_DIR} \
&& . ${CONDA_DIR}/etc/profile.d/conda.sh \
......
![AMAZON KDD CUP 2024: MULTI-TASK ONLINE SHOPPING CHALLENGE FOR LLMS](https://images.aicrowd.com/raw_images/challenges/social_media_image_file/1139/566667103918dae81381.jpg)
![AMAZON KDD CUP 2024: MULTI-TASK ONLINE SHOPPING CHALLENGE FOR LLMS](https://aicrowd-production.s3.eu-central-1.amazonaws.com/challenge_images/amazon-kdd-cup-2024/amazon-kdd-cup-24-banner.jpg)
[![Discord](https://img.shields.io/discord/565639094860775436.svg)](https://discord.gg/yWurtB2huX)
# 🛒 [Amazon KDD CUP 2024: Multi-Task Online Shopping Challenge for LLMs](https://www.aicrowd.com/challenges/amazon-kdd-cup-2024-multi-task-online-shopping-challenge-for-llms) Starter Kit
......@@ -55,7 +55,9 @@ The development datasets will be given in json format with the following fields.
- `input_field`: This field contains the instructions and the question that should be answered by the model.
- `output_field`: This field contains the ground truth answer to the question.
- `task_type`: This field contains the type of the task (Details in the next Section, "Tasks")
- `task_name`: This field contains the name of the task. However, the exact task names are redacted, and we only provide participants with hashed task names (e.g. `task1`, `task2`).
- `metric`: This field contains the metric used to evaluate the question (Details in Section "Evaluation Metrics").
- `track`: This field specifies the track the question comes from.
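As a rough sketch of working with these fields, the snippet below parses a few records and groups them by `task_type`. It assumes the development data is stored as one JSON object per line, as in the sample records in this repository. The helper names `load_dev_records` and `group_by_task_type` are hypothetical, and the inline records are abridged for illustration.

```python
import json
from collections import defaultdict


def load_dev_records(lines):
    """Parse an iterable of JSON-lines strings into dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def group_by_task_type(records):
    """Bucket records by their `task_type` field."""
    groups = defaultdict(list)
    for record in records:
        groups[record["task_type"]].append(record)
    return dict(groups)


# Abridged sample records (assumption: one JSON object per line).
SAMPLE_LINES = [
    '{"input_field": "Query: tablette xiaomi\\nOutput: ", "output_field": ["tablette"], '
    '"task_name": "task1", "task_type": "named_entity_recognition", "metric": "micro f1"}',
    '{"input_field": "What type of fabric is used in it?\\nAnswer: ", "output_field": 1, '
    '"task_name": "task2", "task_type": "multiple-choice", "metric": "accuracy"}',
    "",
    '{"input_field": "How many usb ports are there?\\nAnswer: ", "output_field": 3, '
    '"task_name": "task4", "task_type": "multiple-choice", "metric": "accuracy"}',
]

records = load_dev_records(SAMPLE_LINES)
groups = group_by_task_type(records)
for task_type, items in groups.items():
    print(task_type, len(items))
```

To read a real file, pass an open file handle instead of `SAMPLE_LINES`, since iterating over a text file yields one line at a time.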
However, the test dataset (which will be hidden from participants) will have a different format with only two fields:
- `input_field`, which is the same as above.
......@@ -116,7 +118,7 @@ Please follow the instructions in [models/README.md](models/README.md) for instr
1. **Add your SSH key** to AIcrowd GitLab
You can add your SSH Keys to your GitLab account by going to your profile settings [here](https://gitlab.aicrowd.com/profile/keys). If you do not have SSH Keys, you will first need to [generate one](https://docs.gitlab.com/ee/user/ssh.html).
You can add your SSH Keys to your GitLab account by going to your profile settings [here](https://gitlab.aicrowd.com/-/profile/keys). If you do not have SSH Keys, you will first need to [generate one](https://docs.gitlab.com/ee/user/ssh.html).
2. **Fork the repository**. You can use [this link](https://gitlab.aicrowd.com/aicrowd/challenges/amazon-kdd-cup-2024/amazon-kdd-cup-2024-starter-kit/-/forks/new) to create a fork.
......@@ -153,8 +155,22 @@ This also includes instructions on [specifying your software runtime](docs/submi
## 💻 What hardware does my code run on ?
You can find more details about the hardware and system configuration in [docs/hardware-and-system-config.md](docs/hardware-and-system-config.md).
In summary, we provide you `2` x [[NVIDIA T4 GPUs](https://www.nvidia.com/en-us/data-center/tesla-t4/)] in Phase 1; and `4` x [[NVIDIA T4 GPUs](https://www.nvidia.com/en-us/data-center/tesla-t4/)] in Phase 2.
In summary, we provide `4` x [NVIDIA T4 GPUs](https://www.nvidia.com/en-us/data-center/tesla-t4/) in Phase 2.
Your solution is given a fixed amount of time for inference. If it exceeds that time, it is killed immediately and no results are available. The time limits are:
| Phase | Track 1 | Track 2 | Track 3 | Track 4 | Track 5 |
| ------ | ------- | ------- | ------- | ------- | ------- |
| **Phase 2**| 70 minutes | 20 minutes | 30 minutes | 20 minutes | 140 minutes |
For reference, the baseline solution (zero-shot LLaMA3-8B-instruct) takes the following amount of time on each track.
| Phase | Track 1 | Track 2 | Track 3 | Track 4 |
| ------ | ------- | ------- | ------- | ------- |
| **Phase 2**| 1490s | 397s | 576s | 359s |
We limit the prediction time to at most **10 seconds** per sample. This limit applies at the batch level: for a batch of 8 samples, for example, you must return the predictions within 80 seconds. Otherwise, your submission will be killed.
Your maximum repo size is 200GB.
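The batch-level limit above can be checked locally with a small timing wrapper. This is a minimal sketch, not part of the starter kit: `batch_budget_seconds`, `timed_predict`, and the `safety_margin` parameter are hypothetical names, and the only value taken from this README is the 10-second per-sample limit.

```python
import time

PER_SAMPLE_LIMIT_S = 10  # from the README: at most 10 seconds per sample


def batch_budget_seconds(batch_size, per_sample_limit=PER_SAMPLE_LIMIT_S):
    """Time budget for one batch: the per-sample limit times the batch size."""
    return batch_size * per_sample_limit


def timed_predict(predict_fn, batch, safety_margin=0.9):
    """Run predict_fn on a batch and report whether it stayed within
    safety_margin * budget (the margin is an assumption, for local headroom)."""
    budget = batch_budget_seconds(len(batch))
    start = time.monotonic()
    outputs = predict_fn(batch)
    elapsed = time.monotonic() - start
    return outputs, elapsed, elapsed <= safety_margin * budget


# Usage with a stand-in model that simply echoes its inputs.
outputs, elapsed, within_budget = timed_predict(lambda b: list(b), ["q"] * 8)
print(batch_budget_seconds(8), within_budget)
```

A wrapper like this only measures time after the fact. It does not interrupt a slow prediction, so it is useful for profiling before submitting rather than for enforcing the limit.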
## 🧩 How are my model responses parsed by the evaluators ?
Please refer to [parsers.py](parsers.py) for more details on how we parse your model responses.
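For local experiments, the sketch below shows one way raw model responses could be turned into answers for the multiple-choice and ranking prompts in the sample data. It is an illustrative assumption only and does not reproduce the logic in `parsers.py`. The function names `parse_multiple_choice` and `parse_ranking` are hypothetical.

```python
import re


def parse_multiple_choice(response):
    """Return the first digit found in the response as an int, or None if there is none."""
    match = re.search(r"\d", response)
    return int(match.group()) if match else None


def parse_ranking(response, n_items=5):
    """Return the numbers in the response if they form a permutation of 1..n_items, else None."""
    numbers = [int(token) for token in re.findall(r"\d+", response)]
    if sorted(numbers) == list(range(1, n_items + 1)):
        return numbers
    return None


print(parse_multiple_choice(" Answer: 3."))
print(parse_ranking("4, 1, 5, 2, 3"))
```

Returning `None` for malformed output keeps invalid responses distinguishable from valid ones, so they can be counted separately when debugging prompts.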
......
{"input_field":"You are a helpful online shop assistant and a linguist. A customer on an online shopping platform has made the following query. Please extract phrases from the query that correspond to the entity type 'product type'. \nPlease directly output the entity without repeating the entity type. If there are multiple such entities, separate them with comma. Do not give explanations. \nQuery: tablette xiaomi\nOutput: ","output_field":["tablette"],"task_name":"task1","task_type":"named_entity_recognition","metric":"micro f1","is_multiple_choice":false}
{"input_field":"You are a helpful online shop assistant and a linguist. A customer on an online shopping platform has made the following query. Please extract phrases from the query that correspond to the entity type 'brand'. \nPlease directly output the entity without repeating the entity type. If there are multiple such entities, separate them with comma. Do not give explanations. \nQuery: chocolate buttons nestle easter\nOutput: ","output_field":["nestle"],"task_name":"task1","task_type":"named_entity_recognition","metric":"micro f1","is_multiple_choice":false}
{"input_field":"You are a helpful online shop assistant and a linguist. A customer on an online shopping platform has made the following query. Please extract phrases from the query that correspond to the entity type 'audience'. \nPlease directly output the entity without repeating the entity type. If there are multiple such entities, separate them with comma. Do not give explanations. \nQuery: women long sleeved casual shirts\nOutput: ","output_field":["women"],"task_name":"task1","task_type":"named_entity_recognition","metric":"micro f1","is_multiple_choice":false}
{"input_field":"You are a helpful online shop assistant and a linguist. A customer on an online shopping platform has made the following query. Please extract phrases from the query that correspond to the entity type 'composition'. \nPlease directly output the entity without repeating the entity type. If there are multiple such entities, separate them with comma. Do not give explanations. \nQuery: 3m plastic repair tape\nOutput: ","output_field":["plastic"],"task_name":"task1","task_type":"named_entity_recognition","metric":"micro f1","is_multiple_choice":false}
{"input_field":"The product 'Hanes Men's Beefy-T T-Shirt, Heavyweight Cotton Tee, 1 Or 2 Pack, Big & Tall' appears on an e-commerce website. What type of fabric is used in it?\n0. spandex, polyester\n1. cotton\n2. microfiber\n3. It cannot be inferred.\nAnswer: ","output_field":1,"task_name":"task2","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"The product 'Axeman Floating Shelves 12 Inch Deep, Rustic Wood Wall Mounted Shelves Set of 2, Large Floating Shelves for Wall Decor, Hanging Shelves for Farmhouse Kitchen Bathroom Bedroom, Rustic Brown' appears on an e-commerce website. What type of mounting is used in it?\n0. freestanding\n1. wall\n2. handlebar\n3. It cannot be inferred.\nAnswer: ","output_field":1,"task_name":"task2","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"The product 'Amazon Brand, Happy Belly 2% Reduced Fat Milk, Half Gallon, 64 Fl Oz' appears on an e-commerce website. Is the drink vegetarian? \n0. non-vegetarian\n1. kosher\n2. It cannot be inferred.\n3. vegetarian\nAnswer: ","output_field":3,"task_name":"task2","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"The product 'Nexxus Clean and Pure Conditioner, With ProteinFusion, Nourished Hair Care Silicone, Dye And Paraben Free 33.8 oz' appears on an e-commerce website. What type of hair can be used for it?\n0. It cannot be inferred.\n1. all\n2. dry\n3. all hair\nAnswer: ","output_field":0,"task_name":"task2","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"You are given a user review to a(n) pants product and an aspect covered in the review. \nPlease extract the keyphrase from the review that mentions the given aspect. \nYou should only extract ONLY ONE keyphrase from the review. \nYou should not generate new keyphrases that do not exist in the review.\nDo not give explanations or irrelevant text. Generate short keyphrases, not whole sentences. \nReview: Love these pants. Super comfortable. They run longer than what I am use to, but I think that is part of the style. Bought in two other colors. As for sizing, I normally wear a 34\/30 to 36\/30 depending on the manufacturer. I am 5'11 215 with an athletically build.\nAspect: comfort\nOutput: \n","output_field":"super comfortable","task_name":"task3","task_type":"generation","metric":"rougel","is_multiple_choice":false}
{"input_field":"You are given a user review to a(n) shorts product and an aspect covered in the review. \nPlease extract the keyphrase from the review that mentions the given aspect. \nYou should only extract ONLY ONE keyphrase from the review. \nYou should not generate new keyphrases that do not exist in the review.\nDo not give explanations or irrelevant text. Generate short keyphrases, not whole sentences. \nReview: These shoes are soft and comfortable but slightly snug. I can wear them now if not doing a lot of walking, but they are lather and will stretch a little, so I'm happy with them. They are a great color, sort of stone. Not white and not beige. Very versatile.\nAspect: comfort\nOutput: \n","output_field":"comfortable but slightly snug","task_name":"task3","task_type":"generation","metric":"rougel","is_multiple_choice":false}
{"input_field":"The following is a user review to a(n) water cup product and an aspect mentioned in the review.\nPlease extract the keyphrase from the review that mentions the given aspect. \nYou should only extract ONLY ONE keyphrase from the review. \nYou should not generate new keyphrases that do not exist in the review.\nDo not give explanations or irrelevant text. Generate short keyphrases, not whole sentences. \nReview: Perfect size for a couple of large water bottles and some ice packs. I see that a few reviewers commented on how hard the lid was to open, I can't speak to their experiences, but mine works just fine.. I am going to LOVE this thing for strapping onto my atv for work. I don't drink much soda, so a few water bottles and a small lunch in mine and I am all set.\nAspect: ease of use\nOutput: \n","output_field":"works just fine","task_name":"task3","task_type":"generation","metric":"rougel","is_multiple_choice":false}
{"input_field":"You are given a user review to a(n) shirt product and an aspect covered in the review. \nPlease extract the keyphrase from the review that mentions the given aspect. \nYou should only extract ONLY ONE keyphrase from the review. \nYou should not generate new keyphrases that do not exist in the review.\nDo not give explanations or irrelevant text. Generate short keyphrases, not whole sentences. \nReview: I bought this for my 8 year old daughter and I got a Medium. She is small for her age but I would normally buy her a size 7. But since the only options were small medium and large, and after reading the reviews, I went with a medium. It is way way too big. I should have gone with a small or XS. So I am disappointed because she is going to wear it for pictures. Instead of sending it back, I'll just have to cut and sew it myself. The colors are really bright and it's a pretty nice shirt. It could be a little thicker though in my opinion.\nAspect: color\nOutput: \n","output_field":"colors are really bright","task_name":"task3","task_type":"generation","metric":"rougel","is_multiple_choice":false}
{"input_field":"The product 'Nissin RAOH Ramen Noodle Soup, Tonkotsu, 3.53 Ounce (Pack of 6)' appears on e-commerce website. What is the total weight of the noodles?\n0. 8 ounce\n1. 21.18 ounce\n2. 14.19 ounce\n3. 60 ounce\nAnswer: ","output_field":1,"task_name":"task4","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"The product 'Vanity Fair Entertain Paper Napkins, 3-Ply Disposable Napkins, Dinner Size (24 packs of 40 Napkins)' appears on e-commerce website. What is the total count of disposable napkins in this package?\n0. 120 count\n1. 660 count\n2. 1040 count\n3. 960 count\nAnswer: ","output_field":3,"task_name":"task4","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"The product 'AQQEF Desktop Monitor Stand Riser with Drawer,Laptop Stand Riser and Computer for Desk with USB 3.0 Data Port and Type-C Charging' appears on an e-commerce website. It is a monitor. How many usb ports are there?\n0. It cannot be inferred.\n1. 3.0\n2. 2\n3. 1\nAnswer: ","output_field":0,"task_name":"task4","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"The product 'Anker 555 USB-C Hub (8-in-1), with 100W Power Delivery, 4K 60Hz HDMI Port, 10Gbps USB C and 2 A Data Ports, Ethernet microSD SD Card Reader, for MacBook Pro More' appears on an e-commerce website. It is a multiport hub. How many usb ports are there?\n0. It cannot be inferred.\n1. 1\n2. 2\n3. 4\nAnswer: ","output_field":3,"task_name":"task4","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"Which of the following events can obstruct the event \"PersonX finds a tampon\"\n0. PersonX has a sore finger\n1. PersonX doesn't have any interest in watching the movie\n2. PersonX didn't have enough money to buy the ticket\n3. PersonX can't find a box\nAnswer: ","output_field":3,"task_name":"task5","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"Given the event \"PersonX wears a blazer\", as a result, PersonX feels\n0. fearful\n1. dignified\n2. more confident\n3. compassion\nAnswer: ","output_field":1,"task_name":"task5","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"Given the event \"PersonX reads novels in a hammock\", as a result, PersonX\n0. is relaxed\n1. is caught by store owner\n2. buys a flashlight\n3. buys a wrench\nAnswer: ","output_field":0,"task_name":"task5","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"\"PersonX cleans the cat box\" because PersonX wanted\n0. to have some\n1. to be able to hear PersonY\n2. to get away from the pressure\n3. to be clean\nAnswer: ","output_field":3,"task_name":"task5","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"Which of the following statements best describes the relation from query \"oral b dental floss\" to query \"short sleeve polo for men\"?\n0. irrelevant\n1. substitute\n2. complement\n3. narrowing\nAnswer: ","output_field":0,"task_name":"task6","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"Which of the following statements best describes the relation from query \"lancome\" to query \"lancome face moisturizer\"?\n0. irrelevant\n1. substitute\n2. complement\n3. narrowing\nAnswer: ","output_field":3,"task_name":"task6","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"Which of the following statements best describes the relation from query \"oreos\" to query \"fruit loops\"?\n0. narrowing\n1. substitute\n2. irrelevant\n3. complement\nAnswer: ","output_field":1,"task_name":"task6","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"Which of the following statements best describes the relation from query \"man city jersey\" to query \"man city hat\"?\n0. irrelevant\n1. complement\n2. substitute\n3. narrowing\nAnswer: ","output_field":1,"task_name":"task6","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"A user has made a query with keyword 'jeep liberty lift'. Given the following numbered list of 5 products, please rank the products according their relevance with the query. \nProduct List: \n1. Supreme Suspensions - Front Leveling Kit for 2002-2007 Jeep Liberty KJ and 2008-2012 Jeep Liberty KK 2.5\" Front Lift High-Strength Carbon Steel Strut Spacers 2WD 4WD\n2. Rough Country 2.5\" Lift Kit for 2007-2018 Jeep Wrangler JK 4DR - 67930\n3. Rough Country 2.5\" Lift Kit (fits) 1997-2006 Jeep Wrangler TJ LJ | 6 CYL | N3 Shocks | Suspension System | 653.20\n4. Supreme Suspensions - Full Lift Kit for 2008-2012 Jeep Liberty KK 2.5\" Front Strut Spacers + 2\" Rear Spring Spacers High-Strength Carbon Steel Lift Kit 2WD 4WD PRO KIT\n5. TeraFlex 1251000 2.5\" Lift Kit (JK 4 Door with All (4) 2.5\" Shock)\nYou should output a permutation of 1 to 5. There should be a comma separating two numbers. Each product and its number should appear only once in the output. Only respond with the ranking results. Do not say any word or explanations.\nOutput: ","output_field":[1,0.01,0.1,1,0],"task_name":"task7","task_type":"ranking","metric":"ndcg","is_multiple_choice":false}
{"input_field":"You are an intelligent shopping assistant that can rank products based on their relevance to the query. The following numbered list contains 5 products. Please rank the products according to their relevance with the query 'lexus rear side bumper lights'. \nProduct List: \n1. Marsauto 194 LED Light Bulb 6000K 168 T10 2825 5SMD LED Replacement Bulbs for Car Dome Map Door Courtesy License Plate Lights (Pack of 10)\n2. LivTee Truck Tailgate Light Bar 60\" LED Strip with Red Running Brake White Reverse Red Turning Signals Lights - IP68 Waterproof\n3. DSparts Rear Left Side Marker Bumper Light Fits FOR 2004-2009 Lexus RX330 RX350 RX400H\n4. Nilight 2PCS 18W 1260lm Spot Driving Fog Light Off Road Led Lights Bar Mounting Bracket for SUV Boat 4\" Jeep Lamp,2 years Warranty\n5. Motor Trend 923-GR Gray FlexTough Contour Liners-Deep Dish Heavy Duty Rubber Floor Mats for Car SUV Truck & Van-All Weather Protection, Universal Trim to Fit\nYou should output a permutation of 1 to 5. There should be a comma separating two numbers. Each product and its number should appear only once in the output. Only respond with the ranking results. Do not say any word or explanations.\nOutput: ","output_field":[0.01,0.1,1,0,0],"task_name":"task7","task_type":"ranking","metric":"ndcg","is_multiple_choice":false}
{"input_field":"You are an intelligent shopping assistant that can rank products based on their relevance to the query. The following numbered list contains 5 products. Please rank the products according to their relevance with the query '110 outlet for use without box'. \nProduct List: \n1. Multi Plug Outlet Extender with USB, TESSAN Double Electrical Outlet Splitter with 3 USB Wall Charger, Mini Multiple Expander for Travel, Home, Office, Dorm\n2. ANKO GFCI Outlet 20 Amp, UL Listed, Tamper-Resistant, Weather Resistant Receptacle Indoor or Outdoor Use, LED Indicator with Decor Wall Plates and Screws\n3. Echo Dot (3rd Gen) - Smart speaker with Alexa - Charcoal\n4. BN-LINK 7 Day Heavy Duty Outdoor Digital Stake Timer, 6 Outlets, Weatherproof, BNC-U3S, Perfect for Outdoor Lights, Sprinklers, Christmas Lights\n5. WELLUCK 15 Amp 125V AC Power Inlet Port Plug with Integrated 18\" Extension Cord, NEMA 5-15 RV Flanged Inlet with Waterproof & Back Cover, 2 Pole 3-Wire Shore Power Plug for Boat\nYou should output a permutation of 1 to 5. There should be a comma separating two numbers. Each product and its number should appear only once in the output. Only respond with the ranking results. Do not say any word or explanations.\nOutput: ","output_field":[0.01,0.1,0,1,0],"task_name":"task7","task_type":"ranking","metric":"ndcg","is_multiple_choice":false}
{"input_field":"A user has made a query with keyword 'blue shampoo aveda'. Given the following numbered list of 5 products, please rank the products according their relevance with the query. \nProduct List: \n1. Organic Blue Mallow Flowers - Color-Changing Blue Herbal Tea | 100% Dried Blue Mallow Flowers - Malva sylvestris | Net Weight: 0.5oz \/ 15g\n2. Aveda Clove Shampoo, 33.8 Oz, 33.8 Fl Oz () (0018084813553)\n3. 2 New Aveda Bottle Pumps fits 1 Liter products Shampoo, Conditioner, Lotion, Etc.\n4. Joico Color Balance Blue Shampoo 10.1 fl oz\n5. AVEDA by Aveda: Blue Malva Color Shampoo 33.8 OZ\nYou should output a permutation of 1 to 5. There should be a comma separating two numbers. Each product and its number should appear only once in the output. Only respond with the ranking results. Do not say any word or explanations.\nOutput: ","output_field":[0,0.1,0.01,0.1,1],"task_name":"task7","task_type":"ranking","metric":"ndcg","is_multiple_choice":false}
{"input_field":"A user on an online shopping website has just purchased a product 'Steven Harris Mathematics Math Equations Necktie - Red - One Size Neck Tie'. The following numbered list contains 15 products. Please select 3 products from the list that the user may also purchase.\nProduct List: \n1. Under Armour Men`s ColdGear Lite Cushion Boot Socks, 1 Pair\n2. Little Angel Tasha-685E Patent Bow Mary Jane Pump (Toddler\/Little Girl\/Big Girl) - Fuchsia\n3. Men's Solar System Planets Necktie-Black-One Size Neck Tie by\n4. Crocs Women's Malindi Flat\n5. Wrangler Men's Big & Tall Rugged Wear Unlined Denim Jacket\n6. NIKE Sunray Protect 2 (TD) Womens Fashion-Sneakers 943829\n7. Calvin Klein Women's Seductive Comfort Customized Lift Bra with Lace\n8. Steven Harris Mens Smiley Face Necktie - Yellow - One Size Neck Tie\n9. ComputerGear Math Formula Tie Engineer Silk Equations Geek Nerd Teacher Gift\n10. Harley-Davidson Boys Baby Twin Pack Creeper My Daddy Rides a Harley Orange\n11. Liverpool Football Club Official Soccer Gift Mens Crest T-Shirt\n12. SITKA Traverse Beanie Waterfowl One Size Fits All (90002-WL-OSFA)\n13. The Magic Zoo Sterling Silver Snake Chain with Lobster Clasp\n14. Napier\"Classics\" Silver-Tone Round Button Earrings\n15. Tru-Spec Men's Base Layers Series Gen-iii ECWCS Level-2 Bottom\nYou should output 3 numbers that correspond to the selected products. There should be a comma separating every two numbers. Only respond with the results. Do not say any word or explanations.\nOutput: ","output_field":[3,8,9],"task_name":"task8","task_type":"retrieval","metric":"hit rate@3","is_multiple_choice":false}
{"input_field":"You are a helpful shop assistant. A user would like to buy the product 'Sempio Soy Sauce for Soup 31.4 Fl Oz.'. Please select the products that the user may also buy from the following numbered list.\nProduct List: \n1. assi Dried Baekdudaegan Fernbraken, 8 Ounce\n2. Pete & Gerry's, Organic Free-Range Grade A Extra Large Brown Eggs, 12 ct, 1 dozen\n3. Punjana Fair Trade (80 Tea Bags)\n4. Ottogi 100% Korean Rice Syrup, 700 Grams\/24 Ounces (Jocheong, Yetnal Ssalyeot)\n5. 12 ct - Spongebob Squarepants and Patrick Birthday Party Cupcake Rings\n6. Sorghum (popping) 8 oz by OliveNation\n7. Medium Japanese Dried Scallops Dried Seafood Conpoy Yuanbei Worldwide Free AIR Mail (0.5LB)\n8. Pancake Mix, Korean Style (2.2 Lb) By Beksul\n9. ARCTIC ZERO Fit Frozen Desserts - 6 Pack - Cappuccino and Purely Chocolate Creamy Pints\n10. After Eight Thin Mints 7.05 ounce (3 packs)\n11. Walkers Fine Oatcake Crackers-10.6 oz\n12. Dean Jacobs Grinder Rosemary N Garlic, 1.5-Ounce\n13. wilton 703-222 pearl dust blue gum paste fondant M4530\n14. Necta Sweet NECTASWEET SUGAR SUB TB .25 GR 1000\n15. 3 Acme Nova-Lox Sliced Salmon packages 3lb Avg\nYou should output 3 numbers that correspond to the selected products. There should be a comma separating every two numbers. Only respond with the results. Do not say any word or explanations.\nOutput: ","output_field":[1,4,8],"task_name":"task8","task_type":"retrieval","metric":"hit rate@3","is_multiple_choice":false}
{"input_field":"You are a helpful shop assistant. A user would like to buy the product 'Empire Paintball Prophecy Z2 Gun Loader'. Please select the products that the user may also buy from the following numbered list.\nProduct List: \n1. St. Louis Blues Magnus Cap (One-Size \/)\n2. Emarth Lightweight Envelope Sleeping Bag with Ultra Compact Design for Outdoor Camping 6-19 Degree Weather Orange\n3. Invert Helix Thermal Paintball Goggles Mask - Olive\n4. Lookbook Store Womens Lace Crochet Sweetheart-Neck Swimsuit Bathing Suit US 2-16\n5. Nike Boys Elite Stripe Pants (Little Big Kids)\n6. ALPS Mountaineering Chip Table\n7. Fripp&Folly - Bourbon Barrel - Comfort Colors - T-Shirt - XL\n8. Real Madrid Soccer Structured Flex Fit Cap, Black\n9. ZUMWax Ski\/Snowboard RACING WAX - Universal - 100 gram - INCREDIBLY FAST in ALL Temperatures !!!\n10. West Biking Cycling Mudguard for Bicycle Mountain Bike Fender Front\/Rear Fenders MTB Road Bike Accessories Suit 20" 24" 26"\n11. GXG Lightning Empire Prophecy Z2 Electronic Loader Hopper Speed Feedgate Collar Feed Gate Lid Crown\n12. Walls Men's Big & Tall Cape Back Long Sleeve Hunting Button Shirt 100% Cotton Twill\n13. Planet Eclipse Paintball Gun Grease 20ml Tube of Lubricant Tech Gear\n14. Greenkeepers 4 Hybrid Golf Tee\n15. Cleto Reyes Traditional Lace Up Training Boxing Gloves - 14 oz - Red\nYou should output 3 numbers that correspond to the selected products. There should be a comma separating every two numbers. Only respond with the results. Do not say any word or explanations.\nOutput: ","output_field":[3,11,13],"task_name":"task8","task_type":"retrieval","metric":"hit rate@3","is_multiple_choice":false}
{"input_field":"A user on an online shopping website has just purchased a product 'Flanged End Cap for L-Track'. The following numbered list contains 15 products. Please select 3 products from the list that the user may also purchase.\nProduct List: \n1. Blacktop & Roof Patch,10.1 Oz (Pack of 6)\n2. DEWALT D25553K 1-9\/16-Inch Spline Combination Hammer Kit\n3. XtremepowerUS 10pc 1" Dr.Deep Impact Cr-V Socket Set - MM (Black)\n4. Forney 60224 Mini Rotary File Cutter Set, 1\/8-Inch Shaft, 3-Piece\n5. Voltec 08-00616 1400-Watt Halogen Pro Worklight, 7-Foot, Blue & Yellow\n6. 3.5" Acrylic Lens Rimless 2x Magnifying Glass w\/2 LEDs - Great for Basic Inspections, Perfect for Crafts & Hobbies!\n7. Icicle Solar Christmas String Lights, 15.7ft 8 Light Modes 20 LED Water Drop Fairy String Lighting for Outdoor & Indoor, Home, Patio, Lawn, Garden, Party, and Holiday Decorations (Warm White)\n8. Blue Sea Systems 187 Series, 285 Series & Klixon Circuit Breakers\n9. Ridgid 59832 Die Head Post\n10. Johnson Level & Tool 175-L Post Level\n11. Wire Loom Black 20' Feet 1" Split Tubing Hose Cover Auto Home Marine by Nippon America\n12. Brinks 7462-619 Hampton 3-Light Camille Bath Vanity Light, Satin Nickel\n13. Platform Stepladder, 7 ft. 9in, 330 lb.\n14. Self-Adhesive Stress Crack Tape Textured Roll\n15. SHURFLO (255-313) 1\/2" Twist-On Pipe Strainer\nYou should output 3 numbers that correspond to the selected products. There should be a comma separating every two numbers. Only respond with the results. Do not say any word or explanations.\nOutput: ","output_field":[8,11,15],"task_name":"task8","task_type":"retrieval","metric":"hit rate@3","is_multiple_choice":false}
{"input_field":"Instructions: Evaluate the following product review on a scale of 1 to 5, with 1 being very negative and 5 being very positive. \nInput: a review text for a(n) Necklace product\nI bought this chain to go along with a alexander the great necklace and im gonna get right to the point if you want a 17 dollar necklace that looks like it came out of a gumball machine buy this one right here, its to short to shiny and to cheap to even consider wearing, go to mall try something nice on and buy it, stay away from this one.\nOutput:\nAnswer: ","output_field":1,"task_name":"task9","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"Instructions: Evaluate the following product review on a scale of 1 to 5, with 1 being very negative and 5 being very positive. \nInput: a review text for a(n) Boxing Glove product\nPerfect for kids. I purchased them thinking they were toy boxing gloves but they seem pretty realy. I have a small hand and it doesn't fit but they are perfect size for my 6 and 7 yr olds.\nOutput:\nAnswer: ","output_field":4,"task_name":"task9","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"Instructions: Evaluate the following product review on a scale of 1 to 5, with 1 being very negative and 5 being very positive. \nInput: a review text for a(n) Thermometer product\nThis unit works well, but the temp is approximately 4 degrees lower than actual temp. which is annoying. The other two thermometers in my house were put next to this unit and they read consistent with each other but ~4 degrees hotter. Not sure why the company can't make it so it reads a true temp. However, the unit does work and fits my needs.\nOutput:\nAnswer: ","output_field":3,"task_name":"task9","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"Instructions: Evaluate the following product review on a scale of 1 to 5, with 1 being very negative and 5 being very positive. \nInput: a review text for a(n) cereal product\nI love this Meusli. Its very simple. Im sure I could mix up my own, but its easier to just scoop it out for use. I haven't tried cooking it. I usually put it in a bowl with a little almond milk and let it sit for about 10 minutes, which I think is what the package directions say. Id like to try cooking it sometime. Its very good. Its currently my favorite cereal and, I think, much better for you than other packaged breakfast cereals.\nOutput:\nAnswer: ","output_field":5,"task_name":"task9","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"Given the following product, which of the following keyword sets is most suitable for it?\nProduct Title: Alchemy Power Inc. Pi-EzConnect \u2013 Raspberry Pi GPIO Verbinder. Eine Kappe um GPIOs und Sensoren zu einem Raspberry Pi zu verbinden. Inkl. MwSt.\nProduct Description: Vor dem Pi-EzConnect werden ein Flachbandkabel, ein Steckbrett und andere Dr\u00e4hte verwendet, um den Temperatursensor mit dem Raspberry-Pi zu verbinden. Mit dem Pi-Ezconnect, wird die Verbindung wie gezeigt vereinfacht. Sehen Sie das Video auf You Tube \u2013 https:\/\/youtu.be\/oChXSE0etQw an. Sowohl der 3,3 V als auch der 5 V Strom ist auf Pi-EzConnect verf\u00fcgbar. Es gibt auch eine Stromerdung auf Ppi-EzConnect. Elektronische Ger\u00e4te, wie z.B. "Pull-up oder Pull-Down - Resistors, LEDs oder andere Komponenten k\u00f6nnen leicht in die Pi-EzConnect L\u00f6tstellen gelegt werden. Sensoren k\u00f6nnen auch auf L\u00f6tpunkte verl\u00f6tet werden. Verwenden Sie abwechselnd die mitgelieferten L\u00f6tverbindungen. Erweiterte Header Pins erlauben Verbindung anderen Kappen mit dem Pi-EzConnect ohne Verlust an Funktionalit\u00e4t. Zur Montage der Pi-EzConnect-Platine auf dem Raspberry-Pi verwenden Sie den 2,5 mm x 15 mm Messing-Abstandshalter f\u00fcr Raspberry Pi HATs. Funktioniert z.B. mit 40 Pin- Header Raspberry Pi Pi-2, Pi-3 sowie Orange-Pi, DIGI ConnectCore und anderen Computern, die das 40 Pin HAT- Format folgen.\n\n0. erweiterung, gpio, breakout board, raspberry pi\n1. arzt, raspberry pi, gpio, klemmleiste\n2. besteckkasten, raspberry pi, gpio, leicht\n3. t cobbler, bett, klemmleiste, learning resource\nAnswer: ","output_field":0,"task_name":"task10","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"Which of the following sets of phrases best summarizes the following product?\nProduct Title: Learning Resources- Juego para Practicar Palabras de Uso frecuente Pop for Sight Words, Color (LER8430)\nProduct Description: Juego para practicar palabras de uso frecuente pop for sight words de la gama pop games de learning resources s\u00edrvete de esta caja de palomitas para ayudar a los ni\u00f1os a que mejoren su uso de las palabras de uso frecuente; este juego de la gama pop games, que es uno de los m\u00e1s vendidos de learning resources, incluye tarjetas en forma de palomitas con 92 palabras de uso frecuente; los ni\u00f1os escogen una tarjeta de la caja; a continuaci\u00f3n dir\u00e1n en voz alta la palabra y la utilizar\u00e1n correctamente; si utilizan la palabra correctamente, podr\u00e1n quedarse con la tarjeta las tarjetas \u00abpop\u00bb adicionales mantendr\u00e1n a los ni\u00f1os alerta; este juego r\u00e1pido, ganador de un premio, fomenta la fluidez en el habla; juego de alfabetizaci\u00f3n con dos niveles de juego para ampliar el aprendizaje; id\u00f3neo para ni\u00f1os de 5+ a\u00f1os de edad; p\u00f3nselo un poco m\u00e1s dif\u00edcil con pop for sight words 2 de pop games (disponible por separado). not applicable Contribuye a que los ni\u00f1os reconozcan las palabras de uso frecuente\n\n0. mujer, bearing, english, juegos\n1. juegos, duke, pop, recursos educativos\n2. primaria, para, recursos educativos, pop\n3. ingles, neewer, juegos, rejillas\nAnswer: ","output_field":2,"task_name":"task10","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"Which of the following sets of phrases best summarizes the following product?\nProduct Title: C\u00e2bles s\u00e9parateur NANYI XLR \u00e0 3.5mm, c\u00e2ble de microphone audio d'interconnexion m\u00e2le TRS st\u00e9r\u00e9o vers deux XLR m\u00e2les, c\u00e2ble adaptateur de s\u00e9parateur Y \u00a0 (0.5 m\u00e8tres \/ 1.6 pieds)\nProduct Description: C\u00e2ble XLR de qualit\u00e9 professionnelle Diffusez de la musique de votre smartphone ou de votre ordinateur sur des haut-parleurs pour une f\u00eate ou pour une console de mixage num\u00e9rique pour une performance. Ce c\u00e2ble complet a un PVC souple veste pour une utilisation facile et le stockage. Les connecteurs m\u00e9talliques robustes sont plaqu\u00e9s or contacts afin de r\u00e9duire le moulage par oxydation et d\u00e9charge de traction sur le connecteur XLR maintenir l'int\u00e9grit\u00e9 du c\u00e2ble. Les conducteurs en cuivre sans oxyg\u00e8ne sont enferm\u00e9s \u00e0 nu blindage en tresse de cuivre pour fournir un son pur et sans bruit. Caract\u00e9ristiques: 1. Le c\u00e2ble XLR NANYI associe des canaux st\u00e9r\u00e9o TRS de 3,5 mm \u00e0 une alimentation monoRo XLR. 2. Adaptabilit\u00e9 \u00e9lev\u00e9e, fiches XLR sophistiqu\u00e9es de 3,5 mm et 3 broches pour plus d'\u00e9quipements et d'applications. 3. Le bo\u00eetier de prise XLR pour la peinture en a\u00e9rosol noire polie de moulage sous pression en alliage de zinc de haute r\u00e9sistance, attrayant et durable. 4. Fiche 3,5 mm plaqu\u00e9e or pour une meilleure conductivit\u00e9 et une meilleure clart\u00e9 du signal. 5. Cuivre sans oxyg\u00e8ne de qualit\u00e9 sup\u00e9rieure (OFC) pour un rejet et une flexibilit\u00e9 efficaces des EMI et des RFI. Paquet: C\u00e2ble micro m\u00e2le de 3,5 mm \u00e0 2 xlr de 0,5 m\u00e8tre * 1 Garantie: 12 mois 1. Le c\u00e2ble XLR NANYI associe des canaux st\u00e9r\u00e9o TRS de 3,5 mm \u00e0 une alimentation monoRo XLR.\n\n0. adaptateur, stereo, reparation, jeu\n1. xor, l&apos, xlr male, abreuvoir oiseau\n2. xlr male, mini, peinture, deux\n3. stereo, xor, xlr male, jack to xlr\nAnswer: ","output_field":3,"task_name":"task10","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"Given the following product, which of the following keyword sets is most suitable for it?\nProduct Title: bonmedico Rialzo per Sedia Portatile, Cuscino a Cuneo Ergonomico Ideale Come Cuscino da Divano e Cuscino per Sedia da Ufficio \u2013 Alzasedia Perfetto Come Cuscino per Sedia a Rotelle\nProduct Description: Alzasedia bonmedico \u2013 ergonomico, innovativo e incredibilmente versatile\u00a0 Il cuscino da seduta bonmedico, realizzato in una schiuma innovativa e con sporgenze antiscivolo. Sia a casa che a lavoro, potrai usare questo cuscino ergonomico per la tua sedia da ufficio o per la poltrone in salotto. La forma a cuneo e l\u2019altezza lo rendono perfetto per alzarsi pi\u00f9 facilmente.\u00a0 La forma ergonomica del nostro cuscino a cuneo lo rende perfetto per sedersi per lunghi periodi. La schiuma innovativa non si adatter\u00e0 solamente al peso del tuo corpo, ma ti garantir\u00e0 un appoggio per rialzarti comodamente. La federa \u00e8 realizzata in velluto di alta qualit\u00e0 e ha una piacevole sensazione al tatto. Con questo supporto universale potrai rendere pi\u00f9 comoda qualsiasi sedia e vivere meglio. \u00c8 ideale per sedie da tv, poltrone e divani bassi e pu\u00f2 essere persino utilizzato come comodissimo cuscino per sedia a rotelle. Sarai a tuo agio ovunque e in qualsiasi situazione.\u00a0 La forma ergonomica a cuneo e le sporgenze antiscivolo garantiscono una tenuta sicura anche sulle superfici lisce. Il manico integrato e le pratiche dimensioni lo rendono comodissimo da portare in giro. \u00a0 Potrai rimuovere facilmente la federa grazie alla cerniera nella parte posteriore. Il materiale in velluto pu\u00f2 essere lavato in lavatrice a 60\u00b0C e tutti i materiali utilizzati sono atossici. \u00a0 \u00a0 \u2714 DESIGN INTELLIGENTE: I nostri cuscini per sedie hanno una forma ergonomica e sporgenze antiscivolo che ne garantiranno la stabilit\u00e0 anche sulle superfici lisce. Ogni cuscino per sedia \u00e8 dotato di un comodo manico e le sue pratiche dimensioni di 40 cm x 40 cm x 13\/8 cm (larghezza x spessore x altezza) lo rendono facilissimo da trasportare. Avrai finalmente il cuscino da ufficio che sognavi.\n\n0. samsung, scheda, ergonomico, sedie\n1. camo, poltrona, sedia, custodia\n2. ergonomico, sedie, antidecubito, memory\n3. vasca, foam, sedia, seduta\nAnswer: ","output_field":2,"task_name":"task10","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"A product entitled 'Steadtler Fimo Soft Starter Pack 12 x 57 g Multicolour Blocks' exists on an online shopping website. Generate an adequate title for the product when it appears on a(n) German online shopping website.\nOutput: ","output_field":"Fimo Soft Starter Pack 12 x 56g Multicolour Blocks by Steadtler","task_name":"task11","task_type":"generation","metric":"bleu","is_multiple_choice":false}
{"input_field":"A product entitled 'Electric Toothbrush, Sonic Toothbrushes with 8 Brush Heads 40000 VPM 5 Modes, Sonic Toothbrushes Fast Charge 4 Hours Last 30 Days, Rechargeable Electric Toothbrush for Adults, Black' exists on an online shopping website. Generate an adequate title for the product when it appears on a(n) Spanish online shopping website.\nOutput: ","output_field":"Cepillo de dientes el\u00e9ctrico, cepillo de dientes el\u00e9ctrico Sonic con 8 cabezales de cepillo 40000 VPM 5 modos, cepillo de dientes el\u00e9ctrico recargable para adultos, negro","task_name":"task11","task_type":"generation","metric":"bleu","is_multiple_choice":false}
{"input_field":"A user found a product with title 'Corsair HS60 PRO Surround Gaming Headset (7.1 Surround Sound, Adjustable Memory Foam Ear Cups, Noise-Cancelling Detachable Microphone with PC, PS4, Xbox One, Switch and Mobile Compatibility) - Yellow' on an online shopping website. Please translate the product title into French.\nOutput: ","output_field":"Corsair HS60 PRO Surround Casque de Gaming Son surround 7.1, M\u00e9moire ajustables Oreillettes, Unidirectionnel Antibruit Microphone avec PC, PS4, Xbox One, Switch et mobiles Compatibilit\u00e9 - Jaune","task_name":"task11","task_type":"generation","metric":"bleu","is_multiple_choice":false}
{"input_field":"Translate the product title 'Actesso Breathable Wrist Support Brace Splint - Ideal for Carpal Tunnel, Sprains, and Tendonitis (Black, Large Left)' into Italian. \nOutput: ","output_field":"Actesso Tutore Polso Traspirante - Ideale Polsiera per Sindrome del Tunnel Carpale, Slogature, RSI e Tendinite (Nero Sinistra, L)","task_name":"task11","task_type":"generation","metric":"bleu","is_multiple_choice":false}
{"input_field":"A product entitled 'JETech Case for iPad (9.7-Inch, 2018\/2017 Model, 6th\/5th Generation), Smart Cover Auto Wake\/Sleep (Light Purple)' exists on an online shopping website. Generate an adequate title for the product when it appears on a(n) Japanese online shopping website.\nOutput: ","output_field":"JEDirect iPad 9.7\u30a4\u30f3\u30c1 (2018\/2017 \u7b2c6\/5\u4e16\u4ee3\u7528) \u30b1\u30fc\u30b9 PU\u30ec\u30b6\u30fc \u4e09\u3064\u6298\u30b9\u30bf\u30f3\u30c9 \u30aa\u30fc\u30c8\u30b9\u30ea\u30fc\u30d7\u6a5f\u80fd (\u30e9\u30a4\u30c8\u30d1\u30fc\u30d7\u30eb)","task_name":"task11","task_type":"generation","metric":"jp-bleu","is_multiple_choice":false}
{"input_field":"A product with description 'Flexible and Durable: Withstand over 10,000 bend lifespans. No more worrying about the connector bending, coming out of the housing, or even being left in your TV's HDMI port causing any damage.' exists on an online shopping website. Which of the following descriptions may describe the same product in a different language?\n0. Compatibilit\u00e0: Per iPad Pro da 12,9 pollici (5a, 6a gen - 2021, 2022) - A2378, A2461, A2379, A2462\n1. Durchmesser (DIA): 14,00 mm Radius (BC): 8.60 Wassergehalt 45 %\n2. \u2b50\u2b50\u301021Pcs Scrapbook Album Set\u3011 1pc album fotos, con 12 bol\u00edgrafos met\u00e1licos, 2 pegatinas de \u00e1lbum, 2 pegatinas de esquina, 2 pegatinas doradas, 2 plantillas de dibujo. Totalmente 20 piezas de accesorios para \u00e1lbumes de recortes, para hacer tus \u00e1lbumes de fotos personalizados. Conjunto de herramientas perfectas para ideas hechas a mano para guardar sus recuerdos. M\u00e1s: si no ha recibido tantos accesorios, por favor cont\u00e1ctenos.\n3. HDMI\u30b1\u30fc\u30d6\u30eb: Twozoh\u9ad8\u901fHDMI\u30b1\u30fc\u30d6\u30eb\u306f\u30014K\u30d3\u30c7\u30aa@60Hz\u30011080P\u3001True HD 7.1\u3001\u30aa\u30fc\u30c7\u30a3\u30aa\u30ea\u30bf\u30fc\u30f3\u30c1\u30e3\u30f3\u30cd\u30eb(ARC)\u3068\u30a4\u30fc\u30b5\u30cd\u30c3\u30c8\u3092\u30b5\u30dd\u30fc\u30c8\u3057\u307e\u3059\u3002\nAnswer: ","output_field":3,"task_name":"task12","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"A product with description 'Charging time: 3 hours full charge' exists on an online shopping website. Which of the following descriptions may describe the same product in a different language?\n0. \u30dc\u30fc\u30ba\u72ec\u81ea\u306e\u6280\u8853\u304b\u3089\u518d\u751f\u3055\u308c\u308b\u3001\u6df1\u304f\u8c4a\u304b\u306a\u9ad8\u97f3\u8cea\u30b5\u30a6\u30f3\u30c9\u3002 \u98a8\u306e\u5f37\u3044\u5834\u6240\u3084\u9a12\u97f3\u4e0b\u3067\u3082\u30af\u30ea\u30a2\u306a\u901a\u8a71\u54c1\u8cea\u3092\u5b9f\u73fe\u3002 \u30b9\u30de\u30fc\u30c8\u30d5\u30a9\u30f3\u3068\u30bf\u30d6\u30ec\u30c3\u30c8\u306a\u3069\u30012\u53f0\u306eBluetooth\u6a5f\u5668\u3092\u540c\u6642\u306b\u63a5\u7d9a\u3067\u304d\u308b\u30de\u30eb\u30c1\u30dd\u30a4\u30f3\u30c8\u6a5f\u80fd\u3092\u642d\u8f09\u3002 \u5145\u96fb\u5f0f\u30ea\u30c1\u30a6\u30e0\u30a4\u30aa\u30f3\u30d0\u30c3\u30c6\u30ea\u30fc\u306b\u3088\u308a\u7d0415\u6642\u9593\u306e\u9023\u7d9a\u4f7f\u7528\u304c\u53ef\u80fd\u3002 SCMS-T\u306b\u5bfe\u5fdc\u3002\u30ef\u30f3\u30bb\u30b0\u653e\u9001\u3082\u30ef\u30a4\u30e4\u30ec\u30b9\u3067\u697d\u3057\u3081\u308b\u3002 \u8efd\u91cf\u5316\u3092\u8ffd\u6c42\u3057\u305f\u72ec\u81ea\u30c7\u30b6\u30a4\u30f3\u3002\n1. [ Active Noise Cancelling Earbuds] Equipped with strong noise cancelling technology, SoundPEATS Life wireless earbuds can effectively eliminate external noise up to 25dB, you can jam in your music and enjoy your time no matter you are in the subway or on the street. The ergonomic design creates a tight seal with your ear canal for better noise cancellation and more imersive beats.\n2. 80 PLUS BRONZE EFFIZIENZ - Extern zertifiziert (80 PLUS Bronze 230V EU), um einen typischen Wirkungsgrad von 88% unter Standardlastbedingungen zu gew\u00e4hrleisten\n3. \u2b50\u3010Tips\u3011When using, please tighten the drawstring and tie the extra rope into the laundry bag for easy access to clothes\nAnswer: ","output_field":0,"task_name":"task12","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"A product with description 'Available in both 3 and 6 packs' exists on an online shopping website. Which of the following descriptions may describe the same product in a different language?\n0. \u3010Auto Schlaf\/Aufwach\u3011- Man kann die Abdeckung zum Aufwecken \u00f6ffnen und zum Ruhezustand schlie\u00dfen. An der Innenseite der vorderen Abdeckung befindet sich eine praktische Handschlaufe, um das Lesen beim Halten des Tablets zu erleichtern. Mit einem Gummiband, damit sich der Deckel nicht leicht \u00f6ffnet.\n1. COMPATIBILITY \u2013 The TORRO Magnetic Leather Cardholder is compatible with any MagSafe device (iPhone 14 \/ 13 \/ 12 Series). The built-in magnets ensure it connects to your device with precision for a seamless and secure attach\/detach.\n2. Adoucit l'eau du robinet\n3. Des yeux plus charmants: Les cils magnetique naturel vous aideront \u00e0 cr\u00e9er le maquillage des yeux le plus glamour, faisant de vous la femme la plus attirante de la foule. Apr\u00e8s avoir s\u00e9lectionn\u00e9 parmi plus de 60 types diff\u00e9rents de faux cils magn\u00e9tiques, nous avons choisi ces 5 paires de cils les plus confortables et les plus naturels.\nAnswer: ","output_field":2,"task_name":"task12","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"A user found a product with description 'Medium acrylic pen for easy, spontaneous drawing on light or dark surfaces; ideal for materials such as canvases, acrylic paper, painting boards and terracotta' on an online shopping website. If the product appears on another shopping website in a different language, which of the following may be its descriptions?\n0. Das hautkl\u00e4rende Gesichtsserum f\u00fcr unreine Haut \u2013 Das Anti-Pickel+ T\u00e4gliches Serum von Neutrogena mit kl\u00e4render Salicyls\u00e4ure hilft, Pickel zu beseitigen, farbige Pickelmale verblassen zu lassen und die Strahlkraft der Haut und das Hautbild zu verbessern\n1. Ben\u00f6tigt Batterien: Nein\n2. Contiene 1 marcatore acrilico (nero)\n3. Caf\u00e9 en dosettes compatibles avec les machines Tassimo\nAnswer: ","output_field":2,"task_name":"task12","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true}
{"input_field":"Instructions: Tell me what this product category is about\nInput: Toggle Switch\nOutput:","output_field":"A toggle switch is an electric switch operated by means of a projecting lever that is moved up and down.","task_name":"task1","task_type":"generation","metric":"sent-transformer","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"Instructions: Explain the product category name\nInput: Watch Band\nOutput:","output_field":"A watch band is a bracelet that straps a wristwatch onto the wrist.","task_name":"task1","task_type":"generation","metric":"sent-transformer","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"Input: Explain the product type Handbag\nOutput:","output_field":"Handbags are bags carried close to the body that are intended to hold multiple personal items for easy retrieval, and they are often fashionably designed. Handbags may be handheld, or may have a strap or handles.","task_name":"task1","task_type":"generation","metric":"sent-transformer","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"Input: Explain the product type Water Purification Unit\nOutput:","output_field":"A water purification unit removes impurities by lowering contamination of water using a fine physical barrier, a chemical process, or a biological process.","task_name":"task1","task_type":"generation","metric":"sent-transformer","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"Which of the following product categories may have the attribute eu spare part availability duration?\n0. mouse pad\n1. leash\n2. surveillance camera\n3. garbage bin\nAnswer: ","output_field":0,"task_name":"task2","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"Which of the following product categories may have the attribute power source?\n0. table\n1. writing tools\n2. car seat cover\n3. comb\nAnswer: ","output_field":3,"task_name":"task2","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"Select the category of products that can have the attribute closure type.\n0. sacco chair\n1. rangefinder\n2. cord management cover\n3. watch band\nAnswer: ","output_field":3,"task_name":"task2","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"Select the category of products that can have the attribute eu spare part availability duration.\n0. tarpaulin\n1. bed sheet\n2. saw\n3. suitcase\nAnswer: ","output_field":0,"task_name":"task2","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"A customer has bought a(n) handbag product and wants to write a review to express a positive sentiment on the size aspect. \nYou are given a numbered list of 15 potential review snippets. Please select 3 snippets from the list that the customer is most likely to write in his review. \nYou should output three numbers, separated with comma. Generate only indices. Do not include review snippets in your answer. Do not give explanations. \nReview Snippet List: \n1. Good quality\n2. High quality\n3. Good amount for the money\n4. beautiful box\n5. accurate sizing\n6. roomy\n7. clear instructions\n8. Very beautiful\n9. compact\n10. better price\n11. does the job just fine\n12. very comfortable\n13. super comfy\n14. good replacement\n15. kind of perfect\nOutput: ","output_field":[6,9,15],"task_name":"task3","task_type":"retrieval","metric":"hit rate@3","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"A customer has bought a(n) timer product and wants to write a review to express a positive sentiment on the performance aspect. \nYou are given a numbered list of 15 potential review snippets. Please select 3 snippets from the list that the customer is most likely to write in his review. \nYou should output three numbers, separated with comma. Generate only indices. Do not include review snippets in your answer. Do not give explanations. \nReview Snippet List: \n1. this matches perfectly\n2. Lightweight\n3. paper quality is nice\n4. affordable\n5. looks great\n6. great bang for your buck\n7. wrapped to shreds\n8. work wonderfully\n9. coating seems very durable\n10. gorgeous\n11. well made\n12. good corner toaster with quality features\n13. timer works great\n14. it does the job\n15. soft to the ear sound\nOutput: ","output_field":[14,13,8],"task_name":"task3","task_type":"retrieval","metric":"hit rate@3","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"A customer has bought a(n) ladder product and wants to write a review to express a positive sentiment on the weight aspect. \nYou are given a numbered list of 15 potential review snippets. Please select 3 snippets from the list that the customer is most likely to write in his review. \nYou should output three numbers, separated with comma. Generate only indices. Do not include review snippets in your answer. Do not give explanations. \nReview Snippet List: \n1. It worked great\n2. High quality\n3. well made\n4. ability to take photos with the iPhone\n5. beautiful addition to my home\n6. Colors are vibrant\n7. colors are beautiful\n8. Ligthtweight\n9. goes really well with my new house\n10. held up quite well\n11. not super solid\n12. Great light weight ladder\n13. item doesn't work with my 2012 Macbook Pro\n14. high quality\n15. Light\nOutput: ","output_field":[8,15,12],"task_name":"task3","task_type":"retrieval","metric":"hit rate@3","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"A customer has bought a(n) robe product and wants to write a review to express a positive sentiment on the quality aspect. \nYou are given a numbered list of 15 potential review snippets. Please select 3 snippets from the list that the customer is most likely to write in his review. \nYou should output three numbers, separated with comma. Generate only indices. Do not include review snippets in your answer. Do not give explanations. \nReview Snippet List: \n1. easy to carry\n2. fit as expected\n3. no issues\n4. fits in one carrying case\n5. feel good\n6. Lights work well\n7. can contain a lot of toys\n8. ribbed fabric\n9. These inks are pretty nice\n10. sound is really good\n11. Beautifully made\n12. awesome pillows for the price\n13. Perfect quality\n14. happy with the quality\n15. works good\nOutput: ","output_field":[8,14,13],"task_name":"task3","task_type":"retrieval","metric":"hit rate@3","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"You are a helpful online shop assistant and a linguist. A customer on an online shopping platform has made the following query. Please extract phrases from the query that correspond to the entity type 'product type'. \nPlease directly output the entity without repeating the entity type. If there are multiple such entities, separate them with comma. Do not give explanations. \nQuery: tablette asus\nOutput: ","output_field":["tablette"],"task_name":"task4","task_type":"named_entity_recognition","metric":"micro f1","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"You are a helpful online shop assistant and a linguist. A customer on an online shopping platform has made the following query. Please extract phrases from the query that correspond to the entity type 'brand'. \nPlease directly output the entity without repeating the entity type. If there are multiple such entities, separate them with comma. Do not give explanations. \nQuery: chocolate buttons cadbury xmas\nOutput: ","output_field":["cadbury"],"task_name":"task4","task_type":"named_entity_recognition","metric":"micro f1","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"You are a helpful online shop assistant and a linguist. A customer on an online shopping platform has made the following query. Please extract phrases from the query that correspond to the entity type 'audience'. \nPlease directly output the entity without repeating the entity type. If there are multiple such entities, separate them with comma. Do not give explanations. \nQuery: mens short sleeved business shirts\nOutput: ","output_field":["mens"],"task_name":"task4","task_type":"named_entity_recognition","metric":"micro f1","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"You are a helpful online shop assistant and a linguist. A customer on an online shopping platform has made the following query. Please extract phrases from the query that correspond to the entity type 'composition'. \nPlease directly output the entity without repeating the entity type. If there are multiple such entities, separate them with comma. Do not give explanations. \nQuery: 3m 1080 vinyl wrap\nOutput: ","output_field":["vinyl"],"task_name":"task4","task_type":"named_entity_recognition","metric":"micro f1","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"You are a helpful online shop assistant and a linguist. A customer on an online shopping platform has made the following query. Please extract phrases from the query that correspond to the entity type 'product type'. \nPlease directly output the entity without repeating the entity type. If there are multiple such entities, separate them with comma. Do not give explanations. \nQuery: tablette xiaomi\nOutput: ","output_field":["tablette"],"task_name":"task4","task_type":"named_entity_recognition","metric":"micro f1","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"You are a helpful online shop assistant and a linguist. A customer on an online shopping platform has made the following query. Please extract phrases from the query that correspond to the entity type 'brand'. \nPlease directly output the entity without repeating the entity type. If there are multiple such entities, separate them with comma. Do not give explanations. \nQuery: chocolate buttons nestle easter\nOutput: ","output_field":["nestle"],"task_name":"task4","task_type":"named_entity_recognition","metric":"micro f1","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"You are a helpful online shop assistant and a linguist. A customer on an online shopping platform has made the following query. Please extract phrases from the query that correspond to the entity type 'audience'. \nPlease directly output the entity without repeating the entity type. If there are multiple such entities, separate them with comma. Do not give explanations. \nQuery: women long sleeved casual shirts\nOutput: ","output_field":["women"],"task_name":"task4","task_type":"named_entity_recognition","metric":"micro f1","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"You are a helpful online shop assistant and a linguist. A customer on an online shopping platform has made the following query. Please extract phrases from the query that correspond to the entity type 'composition'. \nPlease directly output the entity without repeating the entity type. If there are multiple such entities, separate them with comma. Do not give explanations. \nQuery: 3m plastic repair tape\nOutput: ","output_field":["plastic"],"task_name":"task4","task_type":"named_entity_recognition","metric":"micro f1","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"The product 'APRATIM Women's Cotton Bandhani Dupatta With Mirror Work Free Size Blue' appears on an e-commerce website. What type of fabric is used in it?\n0. spandex, polyester\n1. cotton\n2. microfiber\n3. It cannot be inferred.\nAnswer: ","output_field":1,"task_name":"task5","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"The product 'Emfogo Wall Shelves with Ledge 16.9 inch Wood Picture Shelf Rustic Floating Shelves Set of 3 for Storage and Display Carbonized Black' appears on an e-commerce website. What type of mounting is used in it?\n0. freestanding\n1. wall\n2. handlebar\n3. It cannot be inferred.\nAnswer: ","output_field":1,"task_name":"task5","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"The product 'Amazon Brand - Happy Belly Fat Free Milk, Lactose Free, Ultra-Pasteurized, Kosher, Half Gallon, 64 Fl. Oz' appears on an e-commerce website. Is the drink vegetarian? \n0. non-vegetarian\n1. kosher\n2. It cannot be inferred.\n3. vegetarian\nAnswer: ","output_field":3,"task_name":"task5","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"The product 'M & H Melanin Multi-Use Softening Leave In Conditioner,16 Oz. Formulated with Nourishing Baobab Oil, Turnip Root,ProVitamin B5,Hydrate, Soften and Condition, 16 Fl Oz (Pack of 1)' appears on an e-commerce website. What type of hair can be used for it?\n0. It cannot be inferred.\n1. all\n2. dry\n3. all hair\nAnswer: ","output_field":0,"task_name":"task5","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"The product 'Hanes Men's Beefy-T T-Shirt, Heavyweight Cotton Tee, 1 Or 2 Pack, Big & Tall' appears on an e-commerce website. What type of fabric is used in it?\n0. spandex, polyester\n1. cotton\n2. microfiber\n3. It cannot be inferred.\nAnswer: ","output_field":1,"task_name":"task5","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"The product 'Axeman Floating Shelves 12 Inch Deep, Rustic Wood Wall Mounted Shelves Set of 2, Large Floating Shelves for Wall Decor, Hanging Shelves for Farmhouse Kitchen Bathroom Bedroom, Rustic Brown' appears on an e-commerce website. What type of mounting is used in it?\n0. freestanding\n1. wall\n2. handlebar\n3. It cannot be inferred.\nAnswer: ","output_field":1,"task_name":"task5","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"The product 'Amazon Brand, Happy Belly 2% Reduced Fat Milk, Half Gallon, 64 Fl Oz' appears on an e-commerce website. Is the drink vegetarian? \n0. non-vegetarian\n1. kosher\n2. It cannot be inferred.\n3. vegetarian\nAnswer: ","output_field":3,"task_name":"task5","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"The product 'Nexxus Clean and Pure Conditioner, With ProteinFusion, Nourished Hair Care Silicone, Dye And Paraben Free 33.8 oz' appears on an e-commerce website. What type of hair can be used for it?\n0. It cannot be inferred.\n1. all\n2. dry\n3. all hair\nAnswer: ","output_field":0,"task_name":"task5","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"You are given a user review to a(n) shirt product and an aspect covered in the review. \nPlease extract the keyphrase from the review that mentions the given aspect. \nYou should only extract ONLY ONE keyphrase from the review. \nYou should not generate new keyphrases that do not exist in the review.\nDo not give explanations or irrelevant text. Generate short keyphrases, not whole sentences. \nReview: I had 4 pairs of healing hands scrubs that I wore for 3 years amd they still look just as good as when I bought them. I lost some weight and needed a smaller size purchased on Amazon for convenience. I do not know the the quality of the material used has decreased or if these just have a healing hand label. These scrubs are pretty terrible. I only kept the 3 pairs because I need to wear something. All 3 are the same brand, style and size yet the purple pants are very tight compared to the grey and navy. The material on all of them is just bad. They catch every single piece of lint and are already getting those balls you see on an old sweater and I have worn these twice. Never had that issue with my older scrubs. They are comfortable and seem well sown but with the quality of the material not worth the cost.\nAspect: comfort\nOutput: \n","output_field":"comfortable","task_name":"task6","task_type":"generation","metric":"rougel","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"You are given a user review to a(n) shorts product and an aspect covered in the review. \nPlease extract the keyphrase from the review that mentions the given aspect. \nYou should only extract ONLY ONE keyphrase from the review. \nYou should not generate new keyphrases that do not exist in the review.\nDo not give explanations or irrelevant text. Generate short keyphrases, not whole sentences. \nReview: Great quality and cut. Perfect length! I love that they\u2019re not snug on thighs. I am 5\u20199.5\u201d and 128lbs and ordered a 4, which was too big. They seem to run a little big, so I exchanged them for a 2. I read some reviews that said they run small. That surprises me! Maybe they have changed the sizing since then. I suppose it depends where you carry your weight. Excited to wear them!\nAspect: comfort\nOutput: \n","output_field":"not snug on thighs","task_name":"task6","task_type":"generation","metric":"rougel","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"The following is a user review to a(n) water cup product and an aspect mentioned in the review.\nPlease extract the keyphrase from the review that mentions the given aspect. \nYou should only extract ONLY ONE keyphrase from the review. \nYou should not generate new keyphrases that do not exist in the review.\nDo not give explanations or irrelevant text. Generate short keyphrases, not whole sentences. \nReview: If you're looking for a quality big water bottle without paying a ton buy this! I got the all black and its so nice to have around during the work day. Very light weight, doesn't leak and super convenient! Easy to take into a car since it fits in cup holders. Buying one for my wife cause she is jealous!\nAspect: ease of use\nOutput: \n","output_field":"super convenient","task_name":"task6","task_type":"generation","metric":"rougel","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"You are given a user review to a(n) shoes product and an aspect covered in the review. \nPlease extract the keyphrase from the review that mentions the given aspect. \nYou should only extract ONLY ONE keyphrase from the review. \nYou should not generate new keyphrases that do not exist in the review.\nDo not give explanations or irrelevant text. Generate short keyphrases, not whole sentences. \nReview: The stitches started ripping the 2nd day my 2 and 3 year old wore them... other than that they were comfy and cute for the price.\nAspect: color\nOutput: \n","output_field":"cute for the price","task_name":"task6","task_type":"generation","metric":"rougel","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"You are given a user review to a(n) pants product and an aspect covered in the review. \nPlease extract the keyphrase from the review that mentions the given aspect. \nYou should only extract ONLY ONE keyphrase from the review. \nYou should not generate new keyphrases that do not exist in the review.\nDo not give explanations or irrelevant text. Generate short keyphrases, not whole sentences. \nReview: Love these pants. Super comfortable. They run longer than what I am use to, but I think that is part of the style. Bought in two other colors. As for sizing, I normally wear a 34\/30 to 36\/30 depending on the manufacturer. I am 5'11 215 with an athletically build.\nAspect: comfort\nOutput: \n","output_field":"super comfortable","task_name":"task6","task_type":"generation","metric":"rougel","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"You are given a user review to a(n) shorts product and an aspect covered in the review. \nPlease extract the keyphrase from the review that mentions the given aspect. \nYou should only extract ONLY ONE keyphrase from the review. \nYou should not generate new keyphrases that do not exist in the review.\nDo not give explanations or irrelevant text. Generate short keyphrases, not whole sentences. \nReview: These shoes are soft and comfortable but slightly snug. I can wear them now if not doing a lot of walking, but they are lather and will stretch a little, so I'm happy with them. They are a great color, sort of stone. Not white and not beige. Very versatile.\nAspect: comfort\nOutput: \n","output_field":"comfortable but slightly snug","task_name":"task6","task_type":"generation","metric":"rougel","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"The following is a user review to a(n) water cup product and an aspect mentioned in the review.\nPlease extract the keyphrase from the review that mentions the given aspect. \nYou should only extract ONLY ONE keyphrase from the review. \nYou should not generate new keyphrases that do not exist in the review.\nDo not give explanations or irrelevant text. Generate short keyphrases, not whole sentences. \nReview: Perfect size for a couple of large water bottles and some ice packs. I see that a few reviewers commented on how hard the lid was to open, I can't speak to their experiences, but mine works just fine.. I am going to LOVE this thing for strapping onto my atv for work. I don't drink much soda, so a few water bottles and a small lunch in mine and I am all set.\nAspect: ease of use\nOutput: \n","output_field":"works just fine","task_name":"task6","task_type":"generation","metric":"rougel","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"You are given a user review to a(n) shirt product and an aspect covered in the review. \nPlease extract the keyphrase from the review that mentions the given aspect. \nYou should only extract ONLY ONE keyphrase from the review. \nYou should not generate new keyphrases that do not exist in the review.\nDo not give explanations or irrelevant text. Generate short keyphrases, not whole sentences. \nReview: I bought this for my 8 year old daughter and I got a Medium. She is small for her age but I would normally buy her a size 7. But since the only options were small medium and large, and after reading the reviews, I went with a medium. It is way way too big. I should have gone with a small or XS. So I am disappointed because she is going to wear it for pictures. Instead of sending it back, I'll just have to cut and sew it myself. The colors are really bright and it's a pretty nice shirt. It could be a little thicker though in my opinion.\nAspect: color\nOutput: \n","output_field":"colors are really bright","task_name":"task6","task_type":"generation","metric":"rougel","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"\nYou are given a user review given to a(n) bra product. You are also given a numbered list of ten aspects. \nPlease choose three aspects from the list that are covered by the review. \nYou should ONLY output three numbers, separated by comma. Do not generate explanations or other texts. \nReview: \nVery comfortable and supportive, as a 38D it\u2019s hard to find a good bra. True to size\nAspect List: \n1.stability\n2.magnet strength\n3.straps\n4.lid\n5.hook\n6.comfort\n7.value\n8.support\n9.quality\n10.fit\nOutput: ","output_field":[6,8,10],"task_name":"task7","task_type":"retrieval","metric":"hit rate@3","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"\nYou are given a user review given to a(n) knife product. You are also given a numbered list of ten aspects. \nPlease choose three aspects from the list that are covered by the review. \nYou should ONLY output three numbers, separated by comma. Do not generate explanations or other texts. \nReview: \nHands down my favorite knife. The sheath that it comes with is ok. Nothing special but its functional. The handle for it sucks. Its very comfortable but slips very easily. You also need to put loctite on the screw or it comes loose during use. And you have to remember to clean it! Because of the type of steel it is, it will rust if you let it. I sunk mine in a water soaked log over night and when I woke up the next day the exposed blade had rust on it. Stupid me. Now the good stuff... its an awesome chopper. This has replaced my hatchet. The type of steel also allows it to maintain a good edge. Its obviously a large size so small camp tasks are not suited to it. But it is possible with a steady careful hand. Over all I really love the knife and it goes every where my backpack goes.\nAspect List: \n1.safety\n2.quality\n3.appearance\n4.ease of use\n5.ease of use\n6.length\n7.elasticity\n8.rust\n9.blade play\n10.grip\nOutput: ","output_field":[8,2,10],"task_name":"task7","task_type":"retrieval","metric":"hit rate@3","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"\nYou are given a user review given to a(n) bra product. You are also given a numbered list of ten aspects. \nPlease choose three aspects from the list that are covered by the review. \nYou should ONLY output three numbers, separated by comma. Do not generate explanations or other texts. \nReview: \nComfortable bra but not enough padding to keep the headlights hidden. Fit was good and accurate.\nAspect List: \n1.size\n2.quality\n3.fit\n4.padding\n5.delivery\n6.rolling up\n7.comfort\n8.lines\n9.quality\n10.smoothness\nOutput: ","output_field":[4,3,7],"task_name":"task7","task_type":"retrieval","metric":"hit rate@3","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"\nYou are given a user review given to a(n) shoes product. You are also given a numbered list of ten aspects. \nPlease choose three aspects from the list that are covered by the review. \nYou should ONLY output three numbers, separated by comma. Do not generate explanations or other texts. \nReview: \nLOVE the arch support!! These are the most comfortable flip flops I\u2019ve ever had. I wore these all day at Disney World and will do it again! The arch support is way superior than any other shoe I\u2019ve tried. I have a really high arch and most shoes don\u2019t provide proper support for me. This soft flexible material pushes upwards into your arch with every step! You can feel the support! LOVE them and the color choices! I found these 3 years ago and have been telling everyone I know about them. I wear these daily in doors and out! Well worth every Penny!\nAspect List: \n1.entertainment value\n2.arch support\n3.comfort\n4.lid\n5.noise\n6.fit\n7.value\n8.condition\n9.ease of installation\n10.hole\nOutput: ","output_field":[7,3,2],"task_name":"task7","task_type":"retrieval","metric":"hit rate@3","is_multiple_choice":false,"track":"amazon-kdd-cup-24-understanding-shopping-concepts"}
{"input_field":"The product 'Simply Asia Garlic Basil Singapore Street Noodles, 9.24 oz (Pack of 6)' appears on e-commerce website. What is the total weight of the noodles?\n0. 8 ounce\n1. 55.44 ounce\n2. 14.19 ounce\n3. 60 ounce\nAnswer: ","output_field":1,"task_name":"task8","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-shopping-knowledge-reasoning"}
{"input_field":"The product 'Bounty Paper Napkins, White or Printed, 200 Count, Pack of 2' appears on e-commerce website. What is the total count of disposable napkins in this package?\n0. 120 count\n1. 660 count\n2. 1040 count\n3. 400 count\nAnswer: ","output_field":3,"task_name":"task8","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-shopping-knowledge-reasoning"}
{"input_field":"The product 'TopMate Monitor Stand for Desk RGB Gaming Lights with USB 3.0 Hub, Foldable Computer Screen Riser with Storage Drawer and Phone Holder, Desk Organizer Laptop Shelf, for PC\/Laptop\/iMac - Black' appears on an e-commerce website. It is a monitor. How many usb ports are there?\n0. It cannot be inferred.\n1. 3.0\n2. 2\n3. 1\nAnswer: ","output_field":0,"task_name":"task8","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-shopping-knowledge-reasoning"}
{"input_field":"The product 'Anker USB C Hub, 655 USB-C Hub (8-in-1), with 2 USB-A 10 Gbps Data Ports, 100W Power Delivery, 4K HDMI, 1 Gbps Ethernet, microSD and SD Card Slots, 3.5 mm AUX, for MacBook, and More (Earthy White)' appears on an e-commerce website. It is a multiport hub. How many usb ports are there?\n0. It cannot be inferred.\n1. 1\n2. 2\n3. 3\nAnswer: ","output_field":3,"task_name":"task8","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-shopping-knowledge-reasoning"}
{"input_field":"The product 'Nissin RAOH Ramen Noodle Soup, Tonkotsu, 3.53 Ounce (Pack of 6)' appears on e-commerce website. What is the total weight of the noodles?\n0. 8 ounce\n1. 21.18 ounce\n2. 14.19 ounce\n3. 60 ounce\nAnswer: ","output_field":1,"task_name":"task8","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-shopping-knowledge-reasoning"}
{"input_field":"The product 'Vanity Fair Entertain Paper Napkins, 3-Ply Disposable Napkins, Dinner Size (24 packs of 40 Napkins)' appears on e-commerce website. What is the total count of disposable napkins in this package?\n0. 120 count\n1. 660 count\n2. 1040 count\n3. 960 count\nAnswer: ","output_field":3,"task_name":"task8","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-shopping-knowledge-reasoning"}
{"input_field":"The product 'AQQEF Desktop Monitor Stand Riser with Drawer,Laptop Stand Riser and Computer for Desk with USB 3.0 Data Port and Type-C Charging' appears on an e-commerce website. It is a monitor. How many usb ports are there?\n0. It cannot be inferred.\n1. 3.0\n2. 2\n3. 1\nAnswer: ","output_field":0,"task_name":"task8","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-shopping-knowledge-reasoning"}
{"input_field":"The product 'Anker 555 USB-C Hub (8-in-1), with 100W Power Delivery, 4K 60Hz HDMI Port, 10Gbps USB C and 2 A Data Ports, Ethernet microSD SD Card Reader, for MacBook Pro More' appears on an e-commerce website. It is a multiport hub. How many usb ports are there?\n0. It cannot be inferred.\n1. 1\n2. 2\n3. 4\nAnswer: ","output_field":3,"task_name":"task8","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-shopping-knowledge-reasoning"}
{"input_field":"Which of the following events can obstruct the event \"PersonX finds a tampon\"\n0. PersonX has a sore finger\n1. PersonX doesn't have any interest in watching the movie\n2. PersonX didn't have enough money to buy the ticket\n3. PersonX can't find a box\nAnswer: ","output_field":3,"task_name":"task9","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-shopping-knowledge-reasoning"}
{"input_field":"Given the event \"PersonX wears a blazer\", as a result, PersonX feels\n0. fearful\n1. dignified\n2. more confident\n3. compassion\nAnswer: ","output_field":1,"task_name":"task9","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-shopping-knowledge-reasoning"}
{"input_field":"Given the event \"PersonX reads novels in a hammock\", as a result, PersonX\n0. is relaxed\n1. is caught by store owner\n2. buys a flashlight\n3. buys a wrench\nAnswer: ","output_field":0,"task_name":"task9","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-shopping-knowledge-reasoning"}
{"input_field":"\"PersonX cleans the cat box\" because PersonX wanted\n0. to have some\n1. to be able to hear PersonY\n2. to get away from the pressure\n3. to be clean\nAnswer: ","output_field":3,"task_name":"task9","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-shopping-knowledge-reasoning"}
{"input_field":"Which of the following product categories best complement the product type tabletop game?\n0. toy figure\n1. DV recorder\n2. tablet computer\n3. hair iron\nAnswer: ","output_field":0,"task_name":"task10","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-shopping-knowledge-reasoning"}
{"input_field":"Which of the following product categories best complement the product type personal fragrance?\n0. chair\n1. overalls\n2. body deodorant\n3. hair clip\nAnswer: ","output_field":2,"task_name":"task10","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-shopping-knowledge-reasoning"}
{"input_field":"Given product type knife, which of the following product categories best complement the given product type?\n0. watch\n1. umbrella\n2. blade sharpener\n3. wheel\nAnswer: ","output_field":2,"task_name":"task10","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-shopping-knowledge-reasoning"}
{"input_field":"Given product type golf club, which of the following product categories best complement the given product type?\n0. sonar fathometer\n1. golf bag\n2. fitness band\n3. buoyancy device\nAnswer: ","output_field":1,"task_name":"task10","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-shopping-knowledge-reasoning"}
{"input_field":"Which of the following statements best describes the relation from query \"waterpik\" to query \"long sleeve crop tops for women\"?\n0. irrelevant\n1. substitute\n2. complement\n3. narrowing\nAnswer: ","output_field":0,"task_name":"task11","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"Which of the following statements best describes the relation from query \"bioderma\" to query \"bioderma eye cream\"?\n0. irrelevant\n1. substitute\n2. complement\n3. narrowing\nAnswer: ","output_field":3,"task_name":"task11","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"Which of the following statements best describes the relation from query \"kitkat\" to query \"cookies\"?\n0. narrowing\n1. substitute\n2. irrelevant\n3. complement\nAnswer: ","output_field":3,"task_name":"task11","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"Which of the following statements best describes the relation from query \"dinosaur slippers\" to query \"dinosaur crocs\"?\n0. irrelevant\n1. complement\n2. substitute\n3. narrowing\nAnswer: ","output_field":1,"task_name":"task11","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"Which of the following statements best describes the relation from query \"oral b dental floss\" to query \"short sleeve polo for men\"?\n0. irrelevant\n1. substitute\n2. complement\n3. narrowing\nAnswer: ","output_field":0,"task_name":"task11","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"Which of the following statements best describes the relation from query \"lancome\" to query \"lancome face moisturizer\"?\n0. irrelevant\n1. substitute\n2. complement\n3. narrowing\nAnswer: ","output_field":3,"task_name":"task11","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"Which of the following statements best describes the relation from query \"oreos\" to query \"fruit loops\"?\n0. narrowing\n1. substitute\n2. irrelevant\n3. complement\nAnswer: ","output_field":1,"task_name":"task11","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"Which of the following statements best describes the relation from query \"man city jersey\" to query \"man city hat\"?\n0. irrelevant\n1. complement\n2. substitute\n3. narrowing\nAnswer: ","output_field":1,"task_name":"task11","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"A user has made a query with keyword 'jeep liberty lift'. Given the following numbered list of 5 products, please rank the products according their relevance with the query. \nProduct List: \n1. Supreme Suspensions - Front Leveling Kit for 2002-2007 Jeep Liberty KJ and 2008-2012 Jeep Liberty KK 2.5\" Front Lift High-Strength Carbon Steel Strut Spacers 2WD 4WD\n2. Rough Country 2.5\" Lift Kit for 2007-2018 Jeep Wrangler JK 4DR - 67930\n3. Rough Country 2.5\" Lift Kit (fits) 1997-2006 Jeep Wrangler TJ LJ | 6 CYL | N3 Shocks | Suspension System | 653.20\n4. Supreme Suspensions - Full Lift Kit for 2008-2012 Jeep Liberty KK 2.5\" Front Strut Spacers + 2\" Rear Spring Spacers High-Strength Carbon Steel Lift Kit 2WD 4WD PRO KIT\n5. TeraFlex 1251000 2.5\" Lift Kit (JK 4 Door with All (4) 2.5\" Shock)\nYou should output a permutation of 1 to 5. There should be a comma separating two numbers. Each product and its number should appear only once in the output. Only respond with the ranking results. Do not say any word or explanations.\nOutput: ","output_field":[1,0.01,0.1,1,0],"task_name":"task12","task_type":"ranking","metric":"ndcg","is_multiple_choice":false,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"You are an intelligent shopping assistant that can rank products based on their relevance to the query. The following numbered list contains 5 products. Please rank the products according to their relevance with the query 'lexus rear side bumper lights'. \nProduct List: \n1. Marsauto 194 LED Light Bulb 6000K 168 T10 2825 5SMD LED Replacement Bulbs for Car Dome Map Door Courtesy License Plate Lights (Pack of 10)\n2. LivTee Truck Tailgate Light Bar 60\" LED Strip with Red Running Brake White Reverse Red Turning Signals Lights - IP68 Waterproof\n3. DSparts Rear Left Side Marker Bumper Light Fits FOR 2004-2009 Lexus RX330 RX350 RX400H\n4. Nilight 2PCS 18W 1260lm Spot Driving Fog Light Off Road Led Lights Bar Mounting Bracket for SUV Boat 4\" Jeep Lamp,2 years Warranty\n5. Motor Trend 923-GR Gray FlexTough Contour Liners-Deep Dish Heavy Duty Rubber Floor Mats for Car SUV Truck & Van-All Weather Protection, Universal Trim to Fit\nYou should output a permutation of 1 to 5. There should be a comma separating two numbers. Each product and its number should appear only once in the output. Only respond with the ranking results. Do not say any word or explanations.\nOutput: ","output_field":[0.01,0.1,1,0,0],"task_name":"task12","task_type":"ranking","metric":"ndcg","is_multiple_choice":false,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"You are an intelligent shopping assistant that can rank products based on their relevance to the query. The following numbered list contains 5 products. Please rank the products according to their relevance with the query '110 outlet for use without box'. \nProduct List: \n1. Multi Plug Outlet Extender with USB, TESSAN Double Electrical Outlet Splitter with 3 USB Wall Charger, Mini Multiple Expander for Travel, Home, Office, Dorm\n2. ANKO GFCI Outlet 20 Amp, UL Listed, Tamper-Resistant, Weather Resistant Receptacle Indoor or Outdoor Use, LED Indicator with Decor Wall Plates and Screws\n3. Echo Dot (3rd Gen) - Smart speaker with Alexa - Charcoal\n4. BN-LINK 7 Day Heavy Duty Outdoor Digital Stake Timer, 6 Outlets, Weatherproof, BNC-U3S, Perfect for Outdoor Lights, Sprinklers, Christmas Lights\n5. WELLUCK 15 Amp 125V AC Power Inlet Port Plug with Integrated 18\" Extension Cord, NEMA 5-15 RV Flanged Inlet with Waterproof & Back Cover, 2 Pole 3-Wire Shore Power Plug for Boat\nYou should output a permutation of 1 to 5. There should be a comma separating two numbers. Each product and its number should appear only once in the output. Only respond with the ranking results. Do not say any word or explanations.\nOutput: ","output_field":[0.01,0.1,0,1,0],"task_name":"task12","task_type":"ranking","metric":"ndcg","is_multiple_choice":false,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"A user has made a query with keyword 'blue shampoo aveda'. Given the following numbered list of 5 products, please rank the products according their relevance with the query. \nProduct List: \n1. Organic Blue Mallow Flowers - Color-Changing Blue Herbal Tea | 100% Dried Blue Mallow Flowers - Malva sylvestris | Net Weight: 0.5oz \/ 15g\n2. Aveda Clove Shampoo, 33.8 Oz, 33.8 Fl Oz () (0018084813553)\n3. 2 New Aveda Bottle Pumps fits 1 Liter products Shampoo, Conditioner, Lotion, Etc.\n4. Joico Color Balance Blue Shampoo 10.1 fl oz\n5. AVEDA by Aveda: Blue Malva Color Shampoo 33.8 OZ\nYou should output a permutation of 1 to 5. There should be a comma separating two numbers. Each product and its number should appear only once in the output. Only respond with the ranking results. Do not say any word or explanations.\nOutput: ","output_field":[0,0.1,0.01,0.1,1],"task_name":"task12","task_type":"ranking","metric":"ndcg","is_multiple_choice":false,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"You are a user on an online shopping platform. You make queries and click on products to eventually find the product you want and make your purchase. \nSuppose you have just performed the following sequence of actions of queries, clicks, and purchases. What is most likely to be the keyword of your next query? \nYou are given a numbered list of ten candidate queries. Select three from the list that you think most likely. Output ONLY three numbers separated with comma. Do not give explanations. \n\nAction Sequence:\nQuery keyword 'boat cover bungee straps'\nClick on product '21 Inch Tarp Straps - Rubber Bungee Cords with Crimped S Hooks - Natural Rubber Heavy-Duty Bungee Straps - Weatherproof (Pack of 10)'\nClick on product '21 Inch Tarp Straps - Rubber Bungee Cords with Crimped S Hooks - Natural Rubber Heavy-Duty Bungee Straps - Weatherproof (Pack of 10)'\nClick on product 'Seachoice 78941 Boat Cover Tie-Down Strap Kit \u2013 Contains 12 Straps \u2013 8 Feet Long \u2013 Black'\nClick on product 'Seachoice 78941 Boat Cover Tie-Down Strap Kit \u2013 Contains 12 Straps \u2013 8 Feet Long \u2013 Black'\nClick on product 'Wake Cover Tie Down Straps - Pack of 12'\nClick on product 'Wake Cover Tie Down Straps - Pack of 12'\nQuery keyword 'pontoon cover bungee straps'\nClick on product 'Vortex New Grey 20 FT Ultra 5 Year Canvas Pontoon\/Deck Boat Cover, Elastic, Strap System, FITS 18'1" FT to 20' Long Deck Area, UP to 102" Beam (Fast - 1 to 4 Business Day DELIVERY)'\nClick on product 'Vortex New Beige 20 FT Ultra 5 Year Canvas Pontoon\/Deck Boat Cover, Elastic, Strap System, FITS 18'1" FT to 20' Long Deck Area, UP to 102" Beam (Fast - 1 to 4 Business Day DELIVERY)'\nQuery keyword 'camp quitcherbitchin rug'\nClick on product 'SHANGMAO Funny Camp Door Mat Entrance Floor Mat | Standard Non-Slip Back Rubber Welcome Front Doormat Outdoor Decor 18 Inch x 30 Inch | Welcome to Camp Quitcherbitchin A Certified Happy Camper Area'\nQuery keyword 'camp quitcherbitchin mat'\nClick on product 'SHANGMAO Funny Camp Door Mat Entrance Floor Mat | Standard Non-Slip Back Rubber Welcome Front Doormat Outdoor Decor 18 Inch x 30 Inch | Welcome to Camp Quitcherbitchin A Certified Happy Camper Area'\nClick on product 'SHANGMAO Funny Camp Door Mat Entrance Floor Mat | Standard Non-Slip Back Rubber Welcome Front Doormat Outdoor Decor 18 Inch x 30 Inch | Welcome to Camp Quitcherbitchin A Certified Happy Camper Area'\nQuery keyword 'camp quitcherbitchin rug'\nClick on product 'SHANGMAO Funny Camp Door Mat Entrance Floor Mat | Non-Slip Back Rubber Welcome Front Doormat Outdoor Decor 23.6 inch by 15.7 inch | Welcome to Camp Quitcherbitchin A Certified Happy Camper Area'\nQuery keyword 'camp quitcherbitchin mat'\nClick on product 'SHANGMAO Funny Camp Door Mat Entrance Floor Mat | Non-Slip Back Rubber Welcome Front Doormat Outdoor Decor 23.6 inch by 15.7 inch | Welcome to Camp Quitcherbitchin A Certified Happy Camper Area'\nClick on product 'SHANGMAO Funny Camp Door Mat Entrance Floor Mat | Non-Slip Back Rubber Welcome Front Doormat Outdoor Decor 23.6 inch by 15.7 inch | Welcome to Camp Quitcherbitchin A Certified Happy Camper Area'\nQuery keyword 'camp quitcherbitchin rug'\nFollow up click on product 'SHANGMAO Funny Camp Door Mat Entrance Floor Mat | Standard Non-Slip Back Rubber Welcome Front Doormat Outdoor Decor 18 Inch x 30 Inch | Welcome to Camp Quitcherbitchin A Certified Happy Camper Area'\n\nCandidate Query List:\n1. over the sink storage\n2. over the sink dish drying rack small\n3. smoke alarm\n4. camp quitcherbitchin mat\n5. pontoon cover bungee straps\n6. boat cover bungee straps\n7. fireplace pads for toddlers\n8. dont turn off sign\n9. Sunex impact socket set\n10. camp quitcherbitchin rug\nOutput (answer in three comma-separated numbers): ","output_field":[4],"task_name":"task13","task_type":"retrieval","metric":"hit rate@3","is_multiple_choice":false,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"You are a user on an online shopping platform. You make queries and click on products to eventually find the product you want and make your purchase. \nSuppose you have just performed the following sequence of actions of queries, clicks, and purchases. What is most likely to be the keyword of your next query? \nYou are given a numbered list of ten candidate queries. Select three from the list that you think most likely. Output ONLY three numbers separated with comma. Do not give explanations. \n\nAction Sequence:\nQuery keyword 'fire extinguisher'\nClick on product 'Kidde FA110 Multi Purpose Fire Extinguisher 1A10BC, 1 Pack'\nAdd product 'Kidde FA110 Multi Purpose Fire Extinguisher 1A10BC, 1 Pack' to cart\nClick on product 'Kidde FA110 Multi Purpose Fire Extinguisher 1A10BC, 1 Pack'\nPurchase product 'Kidde FA110 Multi Purpose Fire Extinguisher 1A10BC, 1 Pack'\nClick on product 'Kidde FA110 Multi Purpose Fire Extinguisher 1A10BC, 1 Pack'\nQuery keyword 'smoke alarm'\nClick on product 'First Alert Battery Powered Smoke Alarm with Silence Button, SA303CN3'\nAdd product 'First Alert Battery Powered Smoke Alarm with Silence Button, SA303CN3' to cart\nPurchase product 'First Alert Battery Powered Smoke Alarm with Silence Button, SA303CN3'\nClick on product 'First Alert Battery Powered Smoke Alarm with Silence Button, SA303CN3'\nQuery keyword 'small gaming keyboard and mouse'\nClick on product 'CHONCHOW Gaming Keyboard and Mouse Combo Led Compact Teclado 87 Keys Wired Rainbow Backlit Tenkeyless Keyboard and Mouse Mousepad Compatible with Windows PC Mac Vista (Black)'\nQuery keyword 'dont turn off sign'\nClick on product 'Notice Do Not Turn Off Hazard Sign Notice Signs Vinyl Sticker Decal 8"'\nClick on product 'Notice Do Not Turn Off Hazard Sign Notice Signs Label Vinyl Decal Sticker Kit OSHA Safety Label Compliance Signs 8"'\n\nCandidate Query List:\n1. do not turn off sticker\n2. heel grass stoppers\n3. hideaway containers\n4. sit and spin\n5. stiletto heels\n6. smoke alarm\n7. dyson tp01\n8. fire extinguisher\n9. first aid kits\n10. small gaming keyboard and mouse\nOutput (answer in three comma-separated numbers): ","output_field":[1],"task_name":"task13","task_type":"retrieval","metric":"hit rate@3","is_multiple_choice":false,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"You are a user on an online shopping platform. You make queries and click on products to eventually find the product you want and make your purchase. \nSuppose you have just performed the following sequence of actions of queries, clicks, and purchases. What is most likely to be the keyword of your next query? \nYou are given a numbered list of ten candidate queries. Select three from the list that you think most likely. Output ONLY three numbers separated with comma. Do not give explanations. \n\nAction Sequence:\nAdd product 'Stand Mixer, Aicok Dough Mixer with 5 Qt Stainless Steel Bowl, 6 Speeds Tilt-Head Food Mixer, Kitchen Electric Mixer with Double Dough Hooks, Whisk, Beater, Pouring Shield, Black' to cart\nPurchase product 'Stand Mixer, Aicok Dough Mixer with 5 Qt Stainless Steel Bowl, 6 Speeds Tilt-Head Food Mixer, Kitchen Electric Mixer with Double Dough Hooks, Whisk, Beater, Pouring Shield, Black'\nQuery keyword 'pool vacuum for above ground pools'\nClick on product 'Hayward W900 Wanda the Whale Above-Ground Pool Vacuum (Automatic Pool Cleaner)'\nClick on product 'Hayward W900 Wanda the Whale Above-Ground Pool Vacuum (Automatic Pool Cleaner)'\nQuery keyword 'chlorine'\nClick on product 'CLOROX Pool&Spa XtraBlue 3-Inch Long Lasting Chlorinating Tablets, 5-Pound Chlorine'\nQuery keyword 'intex pool vacuum'\nClick on product 'Poolmaster 28300 Big Sucker Swimming Pool Leaf Vacuum'\nClick on product 'Intex 28620EP Rechagreable Handheld Vacuum, Grey'\nPurchase product 'ASURION 4 Year Kitchen Protection Plan $70-79.99'\nClick on product 'Intex Handheld Rechargeable Vacuum with Telescoping Aluminum Shaft and Two Interchangeable Brush Heads , Gray\/Black'\nClick on product 'Flowclear Deluxe Maintenance Kit'\nClick on product 'Flowclear Deluxe Maintenance Kit'\nClick on product 'WORX WA4054.2 LeafPro Universal Leaf Collection System for All Major Blower\/Vac Brands'\nQuery keyword 'intex pool volleyball net'\nClick on product 'Poolmaster Super Combo 
Water Volleyball and Badminton Swimming Pool Game'\nFollow up click on product 'Poolmaster Swimming Pool Basketball and Volleyball Game Combo, Above-Ground Pool'\nQuery keyword 'intex pool above ground volleyball net'\nFollow up click on product 'Poolmaster Swimming Pool Basketball and Volleyball Game Combo, Above-Ground Pool'\nQuery keyword 'intex pool clip on volleyball net'\nClick on product 'Intex Pool Bench, Foldable Seat for Above Ground Pools'\n\nCandidate Query List:\n1. intex pool volleyball net\n2. black and stainless cabinet hardware\n3. intex pool clip on volleyball net\n4. intex pool above ground volleyball net\n5. hanging spice rack\n6. intex pool vacuum\n7. womens pale blue tops\n8. sram 10 speed gx rear derailleur and shifter\n9. tie rod end\n10. baby girl stuff\nOutput (answer in three comma-separated numbers): ","output_field":[4],"task_name":"task13","task_type":"retrieval","metric":"hit rate@3","is_multiple_choice":false,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"A user on an online shopping website has just purchased a product 'Steven Harris Mathematics Math Equations Necktie - Red - One Size Neck Tie'. The following numbered list contains 15 products. Please select 3 products from the list that the user may also purchase.\nProduct List: \n1. Under Armour Men`s ColdGear Lite Cushion Boot Socks, 1 Pair\n2. Little Angel Tasha-685E Patent Bow Mary Jane Pump (Toddler\/Little Girl\/Big Girl) - Fuchsia\n3. Men's Solar System Planets Necktie-Black-One Size Neck Tie by\n4. Crocs Women's Malindi Flat\n5. Wrangler Men's Big & Tall Rugged Wear Unlined Denim Jacket\n6. NIKE Sunray Protect 2 (TD) Womens Fashion-Sneakers 943829\n7. Calvin Klein Women's Seductive Comfort Customized Lift Bra with Lace\n8. Steven Harris Mens Smiley Face Necktie - Yellow - One Size Neck Tie\n9. ComputerGear Math Formula Tie Engineer Silk Equations Geek Nerd Teacher Gift\n10. Harley-Davidson Boys Baby Twin Pack Creeper My Daddy Rides a Harley Orange\n11. Liverpool Football Club Official Soccer Gift Mens Crest T-Shirt\n12. SITKA Traverse Beanie Waterfowl One Size Fits All (90002-WL-OSFA)\n13. The Magic Zoo Sterling Silver Snake Chain with Lobster Clasp\n14. Napier\"Classics\" Silver-Tone Round Button Earrings\n15. Tru-Spec Men's Base Layers Series Gen-iii ECWCS Level-2 Bottom\nYou should output 3 numbers that correspond to the selected products. There should be a comma separating every two numbers. Only respond with the results. Do not say any word or explanations.\nOutput: ","output_field":[3,8,9],"task_name":"task14","task_type":"retrieval","metric":"hit rate@3","is_multiple_choice":false,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"You are a helpful shop assistant. A user would like to buy the product 'Sempio Soy Sauce for Soup 31.4 Fl Oz.'. Please select the products that the user may also buy from the following numbered list.\nProduct List: \n1. assi Dried Baekdudaegan Fernbraken, 8 Ounce\n2. Pete & Gerry's, Organic Free-Range Grade A Extra Large Brown Eggs, 12 ct, 1 dozen\n3. Punjana Fair Trade (80 Tea Bags)\n4. Ottogi 100% Korean Rice Syrup, 700 Grams\/24 Ounces (Jocheong, Yetnal Ssalyeot)\n5. 12 ct - Spongebob Squarepants and Patrick Birthday Party Cupcake Rings\n6. Sorghum (popping) 8 oz by OliveNation\n7. Medium Japanese Dried Scallops Dried Seafood Conpoy Yuanbei Worldwide Free AIR Mail (0.5LB)\n8. Pancake Mix, Korean Style (2.2 Lb) By Beksul\n9. ARCTIC ZERO Fit Frozen Desserts - 6 Pack - Cappuccino and Purely Chocolate Creamy Pints\n10. After Eight Thin Mints 7.05 ounce (3 packs)\n11. Walkers Fine Oatcake Crackers-10.6 oz\n12. Dean Jacobs Grinder Rosemary N Garlic, 1.5-Ounce\n13. wilton 703-222 pearl dust blue gum paste fondant M4530\n14. Necta Sweet NECTASWEET SUGAR SUB TB .25 GR 1000\n15. 3 Acme Nova-Lox Sliced Salmon packages 3lb Avg\nYou should output 3 numbers that correspond to the selected products. There should be a comma separating every two numbers. Only respond with the results. Do not say any word or explanations.\nOutput: ","output_field":[1,4,8],"task_name":"task14","task_type":"retrieval","metric":"hit rate@3","is_multiple_choice":false,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"You are a helpful shop assistant. A user would like to buy the product 'Empire Paintball Prophecy Z2 Gun Loader'. Please select the products that the user may also buy from the following numbered list.\nProduct List: \n1. St. Louis Blues Magnus Cap (One-Size \/)\n2. Emarth Lightweight Envelope Sleeping Bag with Ultra Compact Design for Outdoor Camping 6-19 Degree Weather Orange\n3. Invert Helix Thermal Paintball Goggles Mask - Olive\n4. Lookbook Store Womens Lace Crochet Sweetheart-Neck Swimsuit Bathing Suit US 2-16\n5. Nike Boys Elite Stripe Pants (Little Big Kids)\n6. ALPS Mountaineering Chip Table\n7. Fripp&Folly - Bourbon Barrel - Comfort Colors - T-Shirt - XL\n8. Real Madrid Soccer Structured Flex Fit Cap, Black\n9. ZUMWax Ski\/Snowboard RACING WAX - Universal - 100 gram - INCREDIBLY FAST in ALL Temperatures !!!\n10. West Biking Cycling Mudguard for Bicycle Mountain Bike Fender Front\/Rear Fenders MTB Road Bike Accessories Suit 20" 24" 26"\n11. GXG Lightning Empire Prophecy Z2 Electronic Loader Hopper Speed Feedgate Collar Feed Gate Lid Crown\n12. Walls Men's Big & Tall Cape Back Long Sleeve Hunting Button Shirt 100% Cotton Twill\n13. Planet Eclipse Paintball Gun Grease 20ml Tube of Lubricant Tech Gear\n14. Greenkeepers 4 Hybrid Golf Tee\n15. Cleto Reyes Traditional Lace Up Training Boxing Gloves - 14 oz - Red\nYou should output 3 numbers that correspond to the selected products. There should be a comma separating every two numbers. Only respond with the results. Do not say any word or explanations.\nOutput: ","output_field":[3,11,13],"task_name":"task14","task_type":"retrieval","metric":"hit rate@3","is_multiple_choice":false,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"A user on an online shopping website has just purchased a product 'Flanged End Cap for L-Track'. The following numbered list contains 15 products. Please select 3 products from the list that the user may also purchase.\nProduct List: \n1. Blacktop & Roof Patch,10.1 Oz (Pack of 6)\n2. DEWALT D25553K 1-9\/16-Inch Spline Combination Hammer Kit\n3. XtremepowerUS 10pc 1" Dr.Deep Impact Cr-V Socket Set - MM (Black)\n4. Forney 60224 Mini Rotary File Cutter Set, 1\/8-Inch Shaft, 3-Piece\n5. Voltec 08-00616 1400-Watt Halogen Pro Worklight, 7-Foot, Blue & Yellow\n6. 3.5" Acrylic Lens Rimless 2x Magnifying Glass w\/2 LEDs - Great for Basic Inspections, Perfect for Crafts & Hobbies!\n7. Icicle Solar Christmas String Lights, 15.7ft 8 Light Modes 20 LED Water Drop Fairy String Lighting for Outdoor & Indoor, Home, Patio, Lawn, Garden, Party, and Holiday Decorations (Warm White)\n8. Blue Sea Systems 187 Series, 285 Series & Klixon Circuit Breakers\n9. Ridgid 59832 Die Head Post\n10. Johnson Level & Tool 175-L Post Level\n11. Wire Loom Black 20' Feet 1" Split Tubing Hose Cover Auto Home Marine by Nippon America\n12. Brinks 7462-619 Hampton 3-Light Camille Bath Vanity Light, Satin Nickel\n13. Platform Stepladder, 7 ft. 9in, 330 lb.\n14. Self-Adhesive Stress Crack Tape Textured Roll\n15. SHURFLO (255-313) 1\/2" Twist-On Pipe Strainer\nYou should output 3 numbers that correspond to the selected products. There should be a comma separating every two numbers. Only respond with the results. Do not say any word or explanations.\nOutput: ","output_field":[8,11,15],"task_name":"task14","task_type":"retrieval","metric":"hit rate@3","is_multiple_choice":false,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"Instructions: Evaluate the following product review on a scale of 1 to 5, with 1 being very negative and 5 being very positive. \nInput: a review text for a(n) Leotard product\nRuns small and fabric doesn\u2019t stretch\nThis body shaper is way too small even though I bought 2X. It doesn\u2019t stretch at all to be form fitting. I don\u2019t like this product and will never purchase it again.\nOutput:\nAnswer: ","output_field":1,"task_name":"task15","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"Instructions: Evaluate the following product review on a scale of 1 to 5, with 1 being very negative and 5 being very positive. \nInput: a review text for a(n) Boxing Glove product\nGood choice for begginers\/intermediates\nGloves look exactly like on a picture, love them. Good gloves for good price, especially for beginners. I used to using leather gloves and that leather smell was really bad, I am glad that this time I decided to buy synthetic. Unfortunally this may be con too, because durabality is lower. Anyway I prefer this synthetic ones. Also part on palms is made with textile for ventilation, so hands don't sweat that much.\nOutput:\nAnswer: ","output_field":4,"task_name":"task15","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"Instructions: Evaluate the following product review on a scale of 1 to 5, with 1 being very negative and 5 being very positive. \nInput: a review text for a(n) Thermometer product\nHas that cheaply made feel\nWhile I believe the thermometer to be accurate, it has the "made in China" cheaply made feel. Additionally, it is larger than I expected. I was hoping it would fit in one of the thermometer and pen pockets on the arm of my chef jacket, but it is too large and does not fit. I have a number of the Taylor thermometers that are much better quality.\nOutput:\nAnswer: ","output_field":3,"task_name":"task15","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"Instructions: Evaluate the following product review on a scale of 1 to 5, with 1 being very negative and 5 being very positive. \nInput: a review text for a(n) optical frame product\nLove these!\nI really like these frames. I like the style and the size of the lenses. I would definitely buy them again in other colors. Happy with this purchase.\nOutput:\nAnswer: ","output_field":5,"task_name":"task15","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"Instructions: Evaluate the following product review on a scale of 1 to 5, with 1 being very negative and 5 being very positive. \nInput: a review text for a(n) Necklace product\nI bought this chain to go along with a alexander the great necklace and im gonna get right to the point if you want a 17 dollar necklace that looks like it came out of a gumball machine buy this one right here, its to short to shiny and to cheap to even consider wearing, go to mall try something nice on and buy it, stay away from this one.\nOutput:\nAnswer: ","output_field":1,"task_name":"task15","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"Instructions: Evaluate the following product review on a scale of 1 to 5, with 1 being very negative and 5 being very positive. \nInput: a review text for a(n) Boxing Glove product\nPerfect for kids. I purchased them thinking they were toy boxing gloves but they seem pretty realy. I have a small hand and it doesn't fit but they are perfect size for my 6 and 7 yr olds.\nOutput:\nAnswer: ","output_field":4,"task_name":"task15","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"Instructions: Evaluate the following product review on a scale of 1 to 5, with 1 being very negative and 5 being very positive. \nInput: a review text for a(n) Thermometer product\nThis unit works well, but the temp is approximately 4 degrees lower than actual temp. which is annoying. The other two thermometers in my house were put next to this unit and they read consistent with each other but ~4 degrees hotter. Not sure why the company can't make it so it reads a true temp. However, the unit does work and fits my needs.\nOutput:\nAnswer: ","output_field":3,"task_name":"task15","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"Instructions: Evaluate the following product review on a scale of 1 to 5, with 1 being very negative and 5 being very positive. \nInput: a review text for a(n) cereal product\nI love this Meusli. Its very simple. Im sure I could mix up my own, but its easier to just scoop it out for use. I haven't tried cooking it. I usually put it in a bowl with a little almond milk and let it sit for about 10 minutes, which I think is what the package directions say. Id like to try cooking it sometime. Its very good. Its currently my favorite cereal and, I think, much better for you than other packaged breakfast cereals.\nOutput:\nAnswer: ","output_field":5,"task_name":"task15","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-user-behavior-alignment"}
{"input_field":"Given the following product, which of the following keyword sets is most suitable for it?\nProduct Title: Alchemy Power Inc. Pi-EzConnect \u2013 Raspberry Pi GPIO Verbinder. Eine Kappe um GPIOs und Sensoren zu einem Raspberry Pi zu verbinden. Inkl. MwSt.\nProduct Description: Vor dem Pi-EzConnect werden ein Flachbandkabel, ein Steckbrett und andere Dr\u00e4hte verwendet, um den Temperatursensor mit dem Raspberry-Pi zu verbinden. Mit dem Pi-Ezconnect, wird die Verbindung wie gezeigt vereinfacht. Sehen Sie das Video auf You Tube \u2013 https:\/\/youtu.be\/oChXSE0etQw an. Sowohl der 3,3 V als auch der 5 V Strom ist auf Pi-EzConnect verf\u00fcgbar. Es gibt auch eine Stromerdung auf Ppi-EzConnect. Elektronische Ger\u00e4te, wie z.B. "Pull-up oder Pull-Down - Resistors, LEDs oder andere Komponenten k\u00f6nnen leicht in die Pi-EzConnect L\u00f6tstellen gelegt werden. Sensoren k\u00f6nnen auch auf L\u00f6tpunkte verl\u00f6tet werden. Verwenden Sie abwechselnd die mitgelieferten L\u00f6tverbindungen. Erweiterte Header Pins erlauben Verbindung anderen Kappen mit dem Pi-EzConnect ohne Verlust an Funktionalit\u00e4t. Zur Montage der Pi-EzConnect-Platine auf dem Raspberry-Pi verwenden Sie den 2,5 mm x 15 mm Messing-Abstandshalter f\u00fcr Raspberry Pi HATs. Funktioniert z.B. mit 40 Pin- Header Raspberry Pi Pi-2, Pi-3 sowie Orange-Pi, DIGI ConnectCore und anderen Computern, die das 40 Pin HAT- Format folgen.\n\n0. erweiterung, gpio, breakout board, raspberry pi\n1. arzt, raspberry pi, gpio, klemmleiste\n2. besteckkasten, raspberry pi, gpio, leicht\n3. t cobbler, bett, klemmleiste, learning resource\nAnswer: ","output_field":0,"task_name":"task16","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-multi-lingual-abilities"}
{"input_field":"Which of the following sets of phrases best summarizes the following product?\nProduct Title: Learning Resources- Juego para Practicar Palabras de Uso frecuente Pop for Sight Words, Color (LER8430)\nProduct Description: Juego para practicar palabras de uso frecuente pop for sight words de la gama pop games de learning resources s\u00edrvete de esta caja de palomitas para ayudar a los ni\u00f1os a que mejoren su uso de las palabras de uso frecuente; este juego de la gama pop games, que es uno de los m\u00e1s vendidos de learning resources, incluye tarjetas en forma de palomitas con 92 palabras de uso frecuente; los ni\u00f1os escogen una tarjeta de la caja; a continuaci\u00f3n dir\u00e1n en voz alta la palabra y la utilizar\u00e1n correctamente; si utilizan la palabra correctamente, podr\u00e1n quedarse con la tarjeta las tarjetas \u00abpop\u00bb adicionales mantendr\u00e1n a los ni\u00f1os alerta; este juego r\u00e1pido, ganador de un premio, fomenta la fluidez en el habla; juego de alfabetizaci\u00f3n con dos niveles de juego para ampliar el aprendizaje; id\u00f3neo para ni\u00f1os de 5+ a\u00f1os de edad; p\u00f3nselo un poco m\u00e1s dif\u00edcil con pop for sight words 2 de pop games (disponible por separado). not applicable Contribuye a que los ni\u00f1os reconozcan las palabras de uso frecuente\n\n0. mujer, bearing, english, juegos\n1. juegos, duke, pop, recursos educativos\n2. primaria, para, recursos educativos, pop\n3. ingles, neewer, juegos, rejillas\nAnswer: ","output_field":2,"task_name":"task16","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-multi-lingual-abilities"}
{"input_field":"Which of the following sets of phrases best summarizes the following product?\nProduct Title: C\u00e2bles s\u00e9parateur NANYI XLR \u00e0 3.5mm, c\u00e2ble de microphone audio d'interconnexion m\u00e2le TRS st\u00e9r\u00e9o vers deux XLR m\u00e2les, c\u00e2ble adaptateur de s\u00e9parateur Y \u00a0 (0.5 m\u00e8tres \/ 1.6 pieds)\nProduct Description: C\u00e2ble XLR de qualit\u00e9 professionnelle Diffusez de la musique de votre smartphone ou de votre ordinateur sur des haut-parleurs pour une f\u00eate ou pour une console de mixage num\u00e9rique pour une performance. Ce c\u00e2ble complet a un PVC souple veste pour une utilisation facile et le stockage. Les connecteurs m\u00e9talliques robustes sont plaqu\u00e9s or contacts afin de r\u00e9duire le moulage par oxydation et d\u00e9charge de traction sur le connecteur XLR maintenir l'int\u00e9grit\u00e9 du c\u00e2ble. Les conducteurs en cuivre sans oxyg\u00e8ne sont enferm\u00e9s \u00e0 nu blindage en tresse de cuivre pour fournir un son pur et sans bruit. Caract\u00e9ristiques: 1. Le c\u00e2ble XLR NANYI associe des canaux st\u00e9r\u00e9o TRS de 3,5 mm \u00e0 une alimentation monoRo XLR. 2. Adaptabilit\u00e9 \u00e9lev\u00e9e, fiches XLR sophistiqu\u00e9es de 3,5 mm et 3 broches pour plus d'\u00e9quipements et d'applications. 3. Le bo\u00eetier de prise XLR pour la peinture en a\u00e9rosol noire polie de moulage sous pression en alliage de zinc de haute r\u00e9sistance, attrayant et durable. 4. Fiche 3,5 mm plaqu\u00e9e or pour une meilleure conductivit\u00e9 et une meilleure clart\u00e9 du signal. 5. Cuivre sans oxyg\u00e8ne de qualit\u00e9 sup\u00e9rieure (OFC) pour un rejet et une flexibilit\u00e9 efficaces des EMI et des RFI. Paquet: C\u00e2ble micro m\u00e2le de 3,5 mm \u00e0 2 xlr de 0,5 m\u00e8tre * 1 Garantie: 12 mois 1. Le c\u00e2ble XLR NANYI associe des canaux st\u00e9r\u00e9o TRS de 3,5 mm \u00e0 une alimentation monoRo XLR.\n\n0. adaptateur, stereo, reparation, jeu\n1. 
xor, l&apos, xlr male, abreuvoir oiseau\n2. xlr male, mini, peinture, deux\n3. stereo, xor, xlr male, jack to xlr\nAnswer: ","output_field":3,"task_name":"task16","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-multi-lingual-abilities"}
{"input_field":"Given the following product, which of the following keyword sets is most suitable for it?\nProduct Title: bonmedico Rialzo per Sedia Portatile, Cuscino a Cuneo Ergonomico Ideale Come Cuscino da Divano e Cuscino per Sedia da Ufficio \u2013 Alzasedia Perfetto Come Cuscino per Sedia a Rotelle\nProduct Description: Alzasedia bonmedico \u2013 ergonomico, innovativo e incredibilmente versatile\u00a0 Il cuscino da seduta bonmedico, realizzato in una schiuma innovativa e con sporgenze antiscivolo. Sia a casa che a lavoro, potrai usare questo cuscino ergonomico per la tua sedia da ufficio o per la poltrone in salotto. La forma a cuneo e l\u2019altezza lo rendono perfetto per alzarsi pi\u00f9 facilmente.\u00a0 La forma ergonomica del nostro cuscino a cuneo lo rende perfetto per sedersi per lunghi periodi. La schiuma innovativa non si adatter\u00e0 solamente al peso del tuo corpo, ma ti garantir\u00e0 un appoggio per rialzarti comodamente. La federa \u00e8 realizzata in velluto di alta qualit\u00e0 e ha una piacevole sensazione al tatto. Con questo supporto universale potrai rendere pi\u00f9 comoda qualsiasi sedia e vivere meglio. \u00c8 ideale per sedie da tv, poltrone e divani bassi e pu\u00f2 essere persino utilizzato come comodissimo cuscino per sedia a rotelle. Sarai a tuo agio ovunque e in qualsiasi situazione.\u00a0 La forma ergonomica a cuneo e le sporgenze antiscivolo garantiscono una tenuta sicura anche sulle superfici lisce. Il manico integrato e le pratiche dimensioni lo rendono comodissimo da portare in giro. \u00a0 Potrai rimuovere facilmente la federa grazie alla cerniera nella parte posteriore. Il materiale in velluto pu\u00f2 essere lavato in lavatrice a 60\u00b0C e tutti i materiali utilizzati sono atossici. \u00a0 \u00a0 \u2714 DESIGN INTELLIGENTE: I nostri cuscini per sedie hanno una forma ergonomica e sporgenze antiscivolo che ne garantiranno la stabilit\u00e0 anche sulle superfici lisce. 
Ogni cuscino per sedia \u00e8 dotato di un comodo manico e le sue pratiche dimensioni di 40 cm x 40 cm x 13\/8 cm (larghezza x spessore x altezza) lo rendono facilissimo da trasportare. Avrai finalmente il cuscino da ufficio che sognavi.\n\n0. samsung, scheda, ergonomico, sedie\n1. camo, poltrona, sedia, custodia\n2. ergonomico, sedie, antidecubito, memory\n3. vasca, foam, sedia, seduta\nAnswer: ","output_field":2,"task_name":"task16","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-multi-lingual-abilities"}
{"input_field":"A product entitled 'Steadtler Fimo Soft Starter Pack 12 x 57 g Multicolour Blocks' exists on an online shopping website. Generate an adequate title for the product when it appears on a(n) German online shopping website.\nOutput: ","output_field":"Fimo Soft Starter Pack 12 x 56g Multicolour Blocks by Steadtler","task_name":"task17","task_type":"generation","metric":"bleu","is_multiple_choice":false,"track":"amazon-kdd-cup-24-multi-lingual-abilities"}
{"input_field":"A product entitled 'Electric Toothbrush, Sonic Toothbrushes with 8 Brush Heads 40000 VPM 5 Modes, Sonic Toothbrushes Fast Charge 4 Hours Last 30 Days, Rechargeable Electric Toothbrush for Adults, Black' exists on an online shopping website. Generate an adequate title for the product when it appears on a(n) Spanish online shopping website.\nOutput: ","output_field":"Cepillo de dientes el\u00e9ctrico, cepillo de dientes el\u00e9ctrico Sonic con 8 cabezales de cepillo 40000 VPM 5 modos, cepillo de dientes el\u00e9ctrico recargable para adultos, negro","task_name":"task17","task_type":"generation","metric":"bleu","is_multiple_choice":false,"track":"amazon-kdd-cup-24-multi-lingual-abilities"}
{"input_field":"A user found a product with title 'Corsair HS60 PRO Surround Gaming Headset (7.1 Surround Sound, Adjustable Memory Foam Ear Cups, Noise-Cancelling Detachable Microphone with PC, PS4, Xbox One, Switch and Mobile Compatibility) - Yellow' on an online shopping website. Please translate the product title into French.\nOutput: ","output_field":"Corsair HS60 PRO Surround Casque de Gaming Son surround 7.1, M\u00e9moire ajustables Oreillettes, Unidirectionnel Antibruit Microphone avec PC, PS4, Xbox One, Switch et mobiles Compatibilit\u00e9 - Jaune","task_name":"task17","task_type":"generation","metric":"bleu","is_multiple_choice":false,"track":"amazon-kdd-cup-24-multi-lingual-abilities"}
{"input_field":"Translate the product title 'Actesso Breathable Wrist Support Brace Splint - Ideal for Carpal Tunnel, Sprains, and Tendonitis (Black, Large Left)' into Italian. \nOutput: ","output_field":"Actesso Tutore Polso Traspirante - Ideale Polsiera per Sindrome del Tunnel Carpale, Slogature, RSI e Tendinite (Nero Sinistra, L)","task_name":"task17","task_type":"generation","metric":"bleu","is_multiple_choice":false,"track":"amazon-kdd-cup-24-multi-lingual-abilities"}
{"input_field":"A product entitled 'JETech Case for iPad (9.7-Inch, 2018\/2017 Model, 6th\/5th Generation), Smart Cover Auto Wake\/Sleep (Light Purple)' exists on an online shopping website. Generate an adequate title for the product when it appears on a(n) Japanese online shopping website.\nOutput: ","output_field":"JEDirect iPad 9.7\u30a4\u30f3\u30c1 (2018\/2017 \u7b2c6\/5\u4e16\u4ee3\u7528) \u30b1\u30fc\u30b9 PU\u30ec\u30b6\u30fc \u4e09\u3064\u6298\u30b9\u30bf\u30f3\u30c9 \u30aa\u30fc\u30c8\u30b9\u30ea\u30fc\u30d7\u6a5f\u80fd (\u30e9\u30a4\u30c8\u30d1\u30fc\u30d7\u30eb)","task_name":"task17","task_type":"generation","metric":"jp-bleu","is_multiple_choice":false,"track":"amazon-kdd-cup-24-multi-lingual-abilities"}
{"input_field":"A product with description 'Flexible and Durable: Withstand over 10,000 bend lifespans. No more worrying about the connector bending, coming out of the housing, or even being left in your TV's HDMI port causing any damage.' exists on an online shopping website. Which of the following descriptions may describe the same product in a different language?\n0. Compatibilit\u00e0: Per iPad Pro da 12,9 pollici (5a, 6a gen - 2021, 2022) - A2378, A2461, A2379, A2462\n1. Durchmesser (DIA): 14,00 mm Radius (BC): 8.60 Wassergehalt 45 %\n2. \u2b50\u2b50\u301021Pcs Scrapbook Album Set\u3011 1pc album fotos, con 12 bol\u00edgrafos met\u00e1licos, 2 pegatinas de \u00e1lbum, 2 pegatinas de esquina, 2 pegatinas doradas, 2 plantillas de dibujo. Totalmente 20 piezas de accesorios para \u00e1lbumes de recortes, para hacer tus \u00e1lbumes de fotos personalizados. Conjunto de herramientas perfectas para ideas hechas a mano para guardar sus recuerdos. M\u00e1s: si no ha recibido tantos accesorios, por favor cont\u00e1ctenos.\n3. HDMI\u30b1\u30fc\u30d6\u30eb: Twozoh\u9ad8\u901fHDMI\u30b1\u30fc\u30d6\u30eb\u306f\u30014K\u30d3\u30c7\u30aa@60Hz\u30011080P\u3001True HD 7.1\u3001\u30aa\u30fc\u30c7\u30a3\u30aa\u30ea\u30bf\u30fc\u30f3\u30c1\u30e3\u30f3\u30cd\u30eb(ARC)\u3068\u30a4\u30fc\u30b5\u30cd\u30c3\u30c8\u3092\u30b5\u30dd\u30fc\u30c8\u3057\u307e\u3059\u3002\nAnswer: ","output_field":3,"task_name":"task18","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-multi-lingual-abilities"}
{"input_field":"A product with description 'Charging time: 3 hours full charge' exists on an online shopping website. Which of the following descriptions may describe the same product in a different language?\n0. \u30dc\u30fc\u30ba\u72ec\u81ea\u306e\u6280\u8853\u304b\u3089\u518d\u751f\u3055\u308c\u308b\u3001\u6df1\u304f\u8c4a\u304b\u306a\u9ad8\u97f3\u8cea\u30b5\u30a6\u30f3\u30c9\u3002 \u98a8\u306e\u5f37\u3044\u5834\u6240\u3084\u9a12\u97f3\u4e0b\u3067\u3082\u30af\u30ea\u30a2\u306a\u901a\u8a71\u54c1\u8cea\u3092\u5b9f\u73fe\u3002 \u30b9\u30de\u30fc\u30c8\u30d5\u30a9\u30f3\u3068\u30bf\u30d6\u30ec\u30c3\u30c8\u306a\u3069\u30012\u53f0\u306eBluetooth\u6a5f\u5668\u3092\u540c\u6642\u306b\u63a5\u7d9a\u3067\u304d\u308b\u30de\u30eb\u30c1\u30dd\u30a4\u30f3\u30c8\u6a5f\u80fd\u3092\u642d\u8f09\u3002 \u5145\u96fb\u5f0f\u30ea\u30c1\u30a6\u30e0\u30a4\u30aa\u30f3\u30d0\u30c3\u30c6\u30ea\u30fc\u306b\u3088\u308a\u7d0415\u6642\u9593\u306e\u9023\u7d9a\u4f7f\u7528\u304c\u53ef\u80fd\u3002 SCMS-T\u306b\u5bfe\u5fdc\u3002\u30ef\u30f3\u30bb\u30b0\u653e\u9001\u3082\u30ef\u30a4\u30e4\u30ec\u30b9\u3067\u697d\u3057\u3081\u308b\u3002 \u8efd\u91cf\u5316\u3092\u8ffd\u6c42\u3057\u305f\u72ec\u81ea\u30c7\u30b6\u30a4\u30f3\u3002\n1. [ Active Noise Cancelling Earbuds] Equipped with strong noise cancelling technology, SoundPEATS Life wireless earbuds can effectively eliminate external noise up to 25dB, you can jam in your music and enjoy your time no matter you are in the subway or on the street. The ergonomic design creates a tight seal with your ear canal for better noise cancellation and more imersive beats.\n2. 80 PLUS BRONZE EFFIZIENZ - Extern zertifiziert (80 PLUS Bronze 230V EU), um einen typischen Wirkungsgrad von 88% unter Standardlastbedingungen zu gew\u00e4hrleisten\n3. 
\u2b50\u3010Tips\u3011When using, please tighten the drawstring and tie the extra rope into the laundry bag for easy access to clothes\nAnswer: ","output_field":0,"task_name":"task18","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-multi-lingual-abilities"}
{"input_field":"A product with description 'Available in both 3 and 6 packs' exists on an online shopping website. Which of the following descriptions may describe the same product in a different language?\n0. \u3010Auto Schlaf\/Aufwach\u3011- Man kann die Abdeckung zum Aufwecken \u00f6ffnen und zum Ruhezustand schlie\u00dfen. An der Innenseite der vorderen Abdeckung befindet sich eine praktische Handschlaufe, um das Lesen beim Halten des Tablets zu erleichtern. Mit einem Gummiband, damit sich der Deckel nicht leicht \u00f6ffnet.\n1. COMPATIBILITY \u2013 The TORRO Magnetic Leather Cardholder is compatible with any MagSafe device (iPhone 14 \/ 13 \/ 12 Series). The built-in magnets ensure it connects to your device with precision for a seamless and secure attach\/detach.\n2. Adoucit l'eau du robinet\n3. Des yeux plus charmants: Les cils magnetique naturel vous aideront \u00e0 cr\u00e9er le maquillage des yeux le plus glamour, faisant de vous la femme la plus attirante de la foule. Apr\u00e8s avoir s\u00e9lectionn\u00e9 parmi plus de 60 types diff\u00e9rents de faux cils magn\u00e9tiques, nous avons choisi ces 5 paires de cils les plus confortables et les plus naturels.\nAnswer: ","output_field":2,"task_name":"task18","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-multi-lingual-abilities"}
{"input_field":"A user found a product with description 'Medium acrylic pen for easy, spontaneous drawing on light or dark surfaces; ideal for materials such as canvases, acrylic paper, painting boards and terracotta' on an online shopping website. If the product appears on another shopping website in a different language, which of the following may be its descriptions?\n0. Das hautkl\u00e4rende Gesichtsserum f\u00fcr unreine Haut \u2013 Das Anti-Pickel+ T\u00e4gliches Serum von Neutrogena mit kl\u00e4render Salicyls\u00e4ure hilft, Pickel zu beseitigen, farbige Pickelmale verblassen zu lassen und die Strahlkraft der Haut und das Hautbild zu verbessern\n1. Ben\u00f6tigt Batterien: Nein\n2. Contiene 1 marcatore acrilico (nero)\n3. Caf\u00e9 en dosettes compatibles avec les machines Tassimo\nAnswer: ","output_field":2,"task_name":"task18","task_type":"multiple-choice","metric":"accuracy","is_multiple_choice":true,"track":"amazon-kdd-cup-24-multi-lingual-abilities"}
......@@ -15,6 +15,11 @@ IMAGE_NAME="aicrowd/amazon-kddcup24-submission:${LAST_COMMIT_HASH}"
# This means Docker will look for a Dockerfile in the current directory to build the image.
START_TIME=$(date +%s)
DOCKER_BUILDKIT=1 docker build -t $IMAGE_NAME .
BUILD_STATUS=$?
if [ $BUILD_STATUS -ne 0 ]; then
echo "Docker build failed. Exiting..."
exit $BUILD_STATUS
fi
END_TIME=$(date +%s)
BUILD_TIME=$((END_TIME - START_TIME))
echo "Total build time: $BUILD_TIME seconds"
......@@ -26,7 +31,15 @@ echo "Total build time: $BUILD_TIME seconds"
# 'python /submission/local_evaluation.py' is the command executed inside the container.
# The -w flag sets the working directory to /submission.
# It then runs local_evaluation.py using the software runtime set up in the Dockerfile.
docker run -v "$(pwd)":/submission -w /submission $IMAGE_NAME python local_evaluation.py
docker run \
--gpus all \
-v "$(pwd)":/submission \
-w /submission \
--shm-size=10.24gb \
$IMAGE_NAME python local_evaluation.py
# Note: We assume you have nvidia-container-toolkit installed and configured
# to use the --gpus all flag. If you are not using GPUs, you can remove this flag.
# Note 1: Please refer to the Dockerfile to understand how the software runtime is set up.
......
### Setting Up and Downloading Baseline Model Weights with Hugging Face
This guide outlines the steps to download (and check in) the model weights required for the baseline models.
We will focus on `Meta-Llama-3-8B-Instruct`,
but the steps should work equally well for any other model on Hugging Face.
#### Preliminary Steps:
1. **Install the Hugging Face Hub Package**:
Begin by installing the `huggingface_hub` package, which includes the `hf_transfer` utility, by running the following command in your terminal:
```bash
pip install "huggingface_hub[hf_transfer]"
```
2. **Accept the LLaMA Terms**:
You must accept the LLaMA model's terms of use by visiting: [meta-llama/Meta-Llama-3-8B-Instruct Terms](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct).
3. **Create a Hugging Face CLI Token**:
Generate a CLI token by navigating to: [Hugging Face Token Settings](https://huggingface.co/settings/tokens). You will need this token for authentication.
#### Hugging Face Authentication:
1. **Login via CLI**:
Authenticate yourself with the Hugging Face CLI using the token created in the previous step. Run:
```bash
huggingface-cli login
```
When prompted, enter the token.
#### Model Downloads:
1. **Download the Meta-Llama-3-8B-Instruct Model**:
Execute the following command to download the `Meta-Llama-3-8B-Instruct` model to a local subdirectory. This command excludes unnecessary files to save space:
```bash
HF_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download \
meta-llama/Meta-Llama-3-8B-Instruct \
--local-dir-use-symlinks False \
--local-dir models/meta-llama/Meta-Llama-3-8B-Instruct \
--exclude "*.pth" # These are alternates to the safetensors, hence not needed
```
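If you prefer to script the download, roughly the same step can be sketched in Python with `huggingface_hub.snapshot_download`. This is only a sketch under assumptions: the helper names `build_download_kwargs` and `download_baseline_weights` are hypothetical and not part of the starter kit, and it assumes you have already logged in and accepted the model terms as described above.

```python
import os
from typing import Any, Dict

REPO_ID = "meta-llama/Meta-Llama-3-8B-Instruct"


def build_download_kwargs(repo_id: str = REPO_ID, root: str = "models") -> Dict[str, Any]:
    """Collect arguments mirroring the CLI command above (hypothetical helper)."""
    return {
        "repo_id": repo_id,
        # Same target layout as --local-dir in the CLI command.
        "local_dir": os.path.join(root, *repo_id.split("/")),
        # Same intent as --exclude "*.pth": skip the alternate checkpoint format.
        "ignore_patterns": ["*.pth"],
    }


def download_baseline_weights() -> str:
    """Download the weights and return the local directory (needs network access)."""
    # Optional speed-up; assumes the hf_transfer extra is installed.
    os.environ.setdefault("HF_HUB_ENABLE_HF_TRANSFER", "1")
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**build_download_kwargs())
```

Calling `download_baseline_weights()` should leave the files under `models/meta-llama/Meta-Llama-3-8B-Instruct`, the path the Git LFS step expects.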
#### Version Control with Git LFS:
1. **Track Model Weights**:
Use Git Large File Storage (LFS) to track the model directories. This ensures efficient handling of large files:
```bash
git lfs track "models/meta-llama/*"
```
2. **Commit and Push**:
Add the models to your Git repository, commit the changes, and push them to your remote repository:
```bash
git add models/
git commit -am "add weights"
git push origin master
```
If you are struggling with Git LFS, we encourage you to check out [this post](https://discourse.aicrowd.com/t/how-to-upload-large-files-size-to-your-submission/2304).
......@@ -11,18 +11,19 @@ We apply a limit on the hardware available to each participant to run their solu
- `40` x vCPU (`20` physical CPU cores)
- `180GB` RAM
**Note**: When running in `gpu:false` mode, you will have access to `4` x vCPUs (`2` physical cores) and `8GB` RAM.
Please note that the NVIDIA T4 uses a somewhat outdated architecture and is thus not compatible with certain acceleration toolkits (e.g. Flash Attention), so please check compatibility carefully.
Besides, the following restrictions will also be imposed:
- Network connection will be disabled (except for HuggingFace to download open-source checkpoints).
- Each submission will be assigned a certain amount of time to run. Submissions that exceed the time limits will be killed and will not be evaluated. The tentative time limit is set as follows **[TO BE TESTED WITH AICROWD SUBMISSION SYSTEM]**.
- Network connection will be disabled.
- Each submission will be assigned a certain amount of time to run. Submissions that exceed the time limits will be killed and will not be evaluated. The tentative time limit is set as follows.
| Phase | Track 1 | Track 2 | Track 3 | Track 4 | Track 5 |
| ------ | ------- | ------- | ------- | ------- | ------- |
| **Phase 1**| 140 minutes | 40 minutes | 60 minutes | 60 minutes | 5 hours |
- Each team will be able to make up to **4 submissions per week**, with a maximum of **2 Track 5 all-around submissions** **[TO BE TESTED WITH AICROWD SUBMISSION SYSTEM]**.
- Each team will be able to make up to **2 submissions per week** per track for Tracks 1-4, and **1 submission per week** for track 5 all-around.
Based on the hardware and system configuration, we recommend that participants begin with 7B models. According to our experiments, 7B models like Vicuna-7B and Mistral can perform inference smoothly on 2 NVIDIA T4 GPUs, while 13B models will run out of memory (OOM).
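A rough back-of-the-envelope estimate helps explain this recommendation. The sketch below only counts the memory needed to hold the weights in half precision (2 bytes per parameter); it ignores the KV cache, activations, and framework overhead, and the 16 GiB per-T4 figure is an assumed nominal value rather than something measured on the evaluation machines.

```python
def fp16_weight_gib(num_params_billion: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return num_params_billion * 1e9 * bytes_per_param / 1024**3


T4_MEMORY_GIB = 16  # assumption: nominal memory of a single NVIDIA T4
NUM_GPUS = 2        # the GPU count mentioned in the recommendation above

for size in (7, 13):
    weights = fp16_weight_gib(size)
    headroom = NUM_GPUS * T4_MEMORY_GIB - weights
    print(f"{size}B: ~{weights:.1f} GiB of weights, ~{headroom:.1f} GiB left for everything else")
```

Under these assumptions a 13B model leaves well under 8 GiB across both GPUs for the KV cache and runtime overhead, which is in line with the OOM behaviour reported above.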
......@@ -17,11 +17,13 @@ Few of the most common ways are as follows:
[...]
```
We suggest keeping `requirements.txt` to a minimum, listing only the packages you actually need. The more unnecessary packages it contains, the more likely you are to hit an installation error on one of them.
* `apt.txt` -- The Debian packages (via aptitude) used by your inference code!
These files are used to construct your **AIcrowd submission docker containers** in which your code will run.
* `Dockerfile` -- **For advanced users only**. `Dockerfile` gives you more flexibility on defining the software runtime used during evaluations.
* `Dockerfile` -- `Dockerfile` gives you more flexibility in defining the software runtime used during evaluations. The `Dockerfile` under the root path of the starter kit will be used to build your solution. Feel free to modify anything in it, and test it locally.
----
......
......@@ -22,7 +22,7 @@ Our platform supports custom runtime environments. This means you have the flexi
- **`requirements.txt`**: List any PyPI packages your project needs. **Do specify versions, as we observe significant differences in inference time between different `transformers` versions.**
- **`apt.txt`**: Include any apt packages required.
- **`Dockerfile`**: The one located at the root will be used by default to build your submission.
- **`Dockerfile`**: The one located at the root will be used by default to build your submission. **You can specify the python version here if you need specific ones**.
For detailed setup instructions regarding runtime dependencies, refer to the documentation in the `docs/runtime.md` file.
......@@ -32,6 +32,7 @@ Your project should follow the structure outlined in the starter kit. Here’s a
```
.
├── .dockerignore # Please specify the paths to your model checkpoints so that the large files won't be built into the docker image.
├── README.md # Project documentation and setup instructions
├── aicrowd.json # Submission meta information - like your username, track name
├── data
......@@ -45,7 +46,7 @@ Your project should follow the structure outlined in the starter kit. Here’s a
├── models
│ ├── README.md # Documentation specific to the implementation of model interfaces
│ ├── base_model.py # Base model class
│ ├── dummy_model.py # A simple or placeholder model for demonstration or testing
│ ├── dummy_model.py # A simple or placeholder model for demonstration or testing. We also implement a simple Vicuna-7B baseline here.
│ └── user_config.py # IMPORTANT: Configuration file to specify your model
├── parsers.py # Model output parser
├── requirements.txt # Python packages to be installed for model development
......
......@@ -52,20 +52,36 @@ def generate_model_outputs(data_df, model):
- A list containing the model outputs for each entry in the data DataFrame.
"""
outputs = []
for _, row in tqdm(
data_df.iterrows(), total=len(data_df), desc="Generating Responses"
):
is_multiple_choice = row["task_type"] == "multiple-choice"
# the 'task_type' column won't be available during evaluation, so you should use something like
# ```is_multiple_choice = row['is_multiple_choice']``
prompt = row["input_field"]
model_output = model.predict(prompt, is_multiple_choice)
outputs.append(model_output)
return outputs
task_grouped_df = data_df.groupby(by=["task_type"])
for task_type, task_group_data_df in task_grouped_df:
task_group_data_df = task_group_data_df.reset_index(drop=True)
is_multiple_choice = task_type[0] == "multiple-choice"
batch_size = model.get_batch_size()
batches = [task_group_data_df[i:i+batch_size] for i in range(0,len(task_group_data_df),batch_size)]
for batch_df in batches:
batch = {
"prompt": batch_df["input_field"].tolist(),
}
model_output = model.batch_predict(
batch,
is_multiple_choice
)
outputs.append(
pd.DataFrame({
"input_field": batch["prompt"],
"model_output_str": model_output
}))
df_outputs = pd.concat(outputs)
return df_outputs
# Function to evaluate the generated model outputs
def evaluate_outputs(data_df, outputs, log_every_n_steps=1):
def evaluate_outputs(data_df, log_every_n_steps=1):
"""
Evaluate the model outputs against ground truth values using specified metrics.
......@@ -84,17 +100,18 @@ def evaluate_outputs(data_df, outputs, log_every_n_steps=1):
for row_idx, row in tqdm(
data_df.iterrows(), total=len(data_df), desc="Evaluating"
):
task_name, task_type, metric, ground_truth = (
task_name, task_type, metric, ground_truth, model_output_str = (
row["task_name"],
row["task_type"],
row["metric"],
row["output_field"],
row["model_output_str"],
)
if metric not in eval_methods:
raise NotImplementedError(f"No metric for {metric=}")
model_output = task_parsers[task_type].parse(outputs[row_idx])
model_output = task_parsers[task_type].parse(model_output_str)
eval_fn = eval_methods[metric]
metric_score = eval_fn(model_output, ground_truth)
......@@ -230,14 +247,15 @@ def main():
model = UserModel()
# Generate model outputs
outputs = generate_model_outputs(data_df, model)
data_df["outputs"] = (
outputs # Optional: Add outputs back to DataFrame for inspection
)
print(data_df.head())
df_outputs = generate_model_outputs(data_df, model)
# add outputs to the data_df
merged_data_df = pd.merge(data_df, df_outputs, on="input_field")
print(merged_data_df.head())
# Evaluate the generated outputs and calculate metrics
per_task_metrics = evaluate_outputs(data_df, outputs)
per_task_metrics = evaluate_outputs(merged_data_df)
# Aggregate and display the evaluation scores
overall_metrics = aggregate_scores(per_task_metrics)
......
from rouge_score import rouge_scorer
from sentence_transformers import SentenceTransformer
import numpy as np
import evaluate
import os
from typing import List, Tuple, Union
import evaluate
import numpy as np
import torch
from typing import List, Union, Tuple
from loguru import logger
from rouge_score import rouge_scorer
from sentence_transformers import SentenceTransformer
sacrebleu = None
sentence_transformer_model_cache = {}
......
......@@ -4,7 +4,7 @@
For a streamlined experience, we suggest placing the code for all your models within the `models` directory. This is a recommendation for organizational purposes, but it's not a strict requirement.
## Model Base Class
Your models should inherit from the `ShopBenchBaseModel` class found in [base_model.py](base_model.py). We provide an example model, `dummy_model.py`, to illustrate how you might structure your own model. Crucially, your model class must implement the `predict` method.
Your models should inherit from the `ShopBenchBaseModel` class found in [base_model.py](base_model.py). We provide an example model, `dummy_model.py`, to illustrate how you might structure your own model. Crucially, your model class must implement the `batch_predict` method.
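As a minimal sketch of what such a subclass might look like, the example below defines a toy model that implements both `get_batch_size` and `batch_predict`. The class name `AlwaysFirstOptionModel` is hypothetical, and the base class is re-declared as a stand-in only so that the snippet runs on its own; in the starter kit you would instead use `from .base_model import ShopBenchBaseModel`.

```python
from typing import Any, Dict, List


class ShopBenchBaseModel:
    """Stand-in for models/base_model.py so this sketch is self-contained."""

    def get_batch_size(self) -> int:
        raise NotImplementedError("get_batch_size method not implemented")

    def batch_predict(self, batch: Dict[str, Any], is_multiple_choice: bool) -> List[str]:
        raise NotImplementedError("batch_predict method not implemented")


class AlwaysFirstOptionModel(ShopBenchBaseModel):
    """Hypothetical toy model that returns a fixed answer for every prompt."""

    def get_batch_size(self) -> int:
        # The evaluator expects an integer between 1 and 16.
        return 8

    def batch_predict(self, batch: Dict[str, Any], is_multiple_choice: bool) -> List[str]:
        prompts = batch["prompt"]
        if is_multiple_choice:
            # One string per prompt, each holding a single option index.
            return ["0" for _ in prompts]
        # For other tasks, return a comma-separated list as a string.
        return ["0, 1, 2" for _ in prompts]
```

The key contract is that `batch_predict` returns exactly one string per prompt, in the same order as the prompts it received.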
## Configuring Your Model
To ensure your model is recognized and utilized correctly, please specify your model class name in the [`user_config.py`](user_config.py) file, by following the instructions in the inline comments.
......@@ -12,12 +12,14 @@ To ensure your model is recognized and utilized correctly, please specify your m
## Model Inputs and Outputs
### Inputs
Your model will receive two pieces of information for every task:
- `prompt` (`str`): This is the specific task's input prompt.
- `batch` (`Dict[str, Any]`): A batch of inputs as a dictionary, where the dictionary has the following key:
- `prompt` (`List[str]`): A list of prompts representing the tasks in a batch.
- `is_multiple_choice` (`bool`): This indicates whether the task is a multiple choice question.
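To make the shape of `batch` concrete, here is a small sketch of how prompts can be sliced into batches of the requested size. The helper name `iter_batches` is hypothetical; the local evaluator does something similar with pandas after grouping the data by task type, so treat this as an illustration rather than the evaluator's actual code.

```python
from typing import Any, Dict, Iterator, List


def iter_batches(prompts: List[str], batch_size: int) -> Iterator[Dict[str, Any]]:
    """Yield dictionaries shaped like the `batch` argument of `batch_predict`."""
    for start in range(0, len(prompts), batch_size):
        # The final batch may be smaller than batch_size.
        yield {"prompt": prompts[start:start + batch_size]}


example_prompts = [f"question {i}" for i in range(5)]
batches = list(iter_batches(example_prompts, batch_size=2))
for batch in batches:
    print(len(batch["prompt"]), batch["prompt"])
```

Note that the last batch can contain fewer prompts than the batch size, so a model should not assume a fixed batch length.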
### Outputs
The output from your model's `predict` function should always be a string. Depending on the task, this could be:
The output from your model's `batch_predict` function should be a list of string responses for all the prompts in the input batch.
Depending on the task, each response could be:
- A single integer (in the range [0, 3]) for multiple choice tasks.
- A comma-separated list of integers for ranking tasks.
- A comma-separated list of named entities for Named Entity Recognition (NER) tasks.
......
from typing import Any, Dict, List
class ShopBenchBaseModel:
def __init__(self):
pass
def predict(self, prompt: str, is_multiple_choice: bool) -> str:
def get_batch_size(self) -> int:
"""
Determines the batch size that is used by the evaluator when calling the `batch_predict` function.
Returns:
int: The batch size, an integer between 1 and 16. This value indicates how many
queries should be processed together in a single batch. It can be dynamic
across different batch_predict calls, or stay a static value.
"""
raise NotImplementedError("get_batch_size method not implemented")
def batch_predict(self, batch: Dict[str, Any], is_multiple_choice:bool) -> List[str]:
"""
Generates a prediction based on the input prompt and task type.
Generates a batch of predictions based on the associated prompts and task type.
For multiple choice tasks, it randomly selects a choice.
For other tasks, it returns a list of integers as a string,
representing the model's prediction in a format compatible with task-specific parsers.
Args:
prompt (str): The input prompt for the model.
is_multiple_choice (bool): Indicates whether the task is a multiple choice question.
Parameters:
- batch (Dict[str, Any]): A dictionary containing a batch of input prompts with the following keys
- prompt (List[str]): a list of input prompts for the model.
- is_multiple_choice (bool): A boolean flag indicating whether all the items in this batch belong to multiple choice tasks.
Returns:
str: The prediction as a string representing a single integer[0, 3] for multiple choice tasks,
List[str]: A list of predictions, one for each of the prompts received in the batch.
Each prediction is
a string representing a single integer[0, 3] for multiple choice tasks,
or a string representing a comma separated list of integers for Ranking, Retrieval tasks,
or a string representing a comma separated list of named entities for Named Entity Recognition tasks.
or a string representing the (unconstrained) generated response for the generation tasks
......
from typing import List, Union
import random
import os
import random
from typing import Any, Dict, List
from .base_model import ShopBenchBaseModel
......@@ -19,33 +19,55 @@ class DummyModel(ShopBenchBaseModel):
"""Initializes the model and sets the random seed for consistency."""
random.seed(AICROWD_RUN_SEED)
def predict(self, prompt: str, is_multiple_choice: bool) -> str:
def get_batch_size(self) -> int:
"""
Determines the batch size that is used by the evaluator when calling the `batch_predict` function.
Returns:
int: The batch size, an integer between 1 and 16. This value indicates how many
queries should be processed together in a single batch. It can be dynamic
across different batch_predict calls, or stay a static value.
"""
Generates a prediction based on the input prompt and task type.
self.batch_size = 4
return self.batch_size
def batch_predict(self, batch: Dict[str, Any], is_multiple_choice:bool) -> List[str]:
"""
Generates a batch of predictions based on the associated prompts and task type.
For multiple choice tasks, it randomly selects a choice.
For other tasks, it returns a list of integers as a string,
representing the model's prediction in a format compatible with task-specific parsers.
Args:
prompt (str): The input prompt for the model.
is_multiple_choice (bool): Indicates whether the task is a multiple choice question.
Parameters:
- batch (Dict[str, Any]): A dictionary containing a batch of input prompts with the following keys
- prompt (List[str]): a list of input prompts for the model.
- is_multiple_choice (bool): A boolean flag indicating whether all the items in this batch belong to multiple choice tasks.
Returns:
str: The prediction as a string representing a single integer[0, 3] for multiple choice tasks,
List[str]: A list of predictions, one for each of the prompts received in the batch.
Each prediction is
a string representing a single integer[0, 3] for multiple choice tasks,
or a string representing a comma separated list of integers for Ranking, Retrieval tasks,
or a string representing a comma separated list of named entities for Named Entity Recognition tasks.
or a string representing the (unconstrained) generated response for the generation tasks
Please refer to parsers.py for more details on how these responses will be parsed by the evaluator.
"""
prompts = batch["prompt"]
possible_responses = [1, 2, 3, 4]
if is_multiple_choice:
# Randomly select one of the possible responses for multiple choice tasks
return str(random.choice(possible_responses))
else:
# For other tasks, shuffle the possible responses and return as a string
random.shuffle(possible_responses)
return str(possible_responses)
# Note: As this is dummy model, we are returning random responses for non-multiple choice tasks.
# For generation tasks, this should ideally return an unconstrained string.
batch_response = []
for prompt in prompts:
if is_multiple_choice:
# Randomly select one of the possible responses for multiple choice tasks
batch_response.append(str(random.choice(possible_responses)))
else:
# For other tasks, shuffle the possible responses and return as a string
random.shuffle(possible_responses)
batch_response.append(str(possible_responses))
# Note: As this is dummy model, we are returning random responses for non-multiple choice tasks.
# For generation tasks, this should ideally return an unconstrained string.
return batch_response
......@@ -7,6 +7,7 @@ from models.dummy_model import DummyModel
# This approach allows for easier reference to your model class when evaluating your models,
UserModel = DummyModel
# When implementing your own model please follow this pattern:
#
# from models.your_model import YourModel
......@@ -17,3 +18,11 @@ UserModel = DummyModel
# Finally, assign YourModel to UserModel as shown below to use it throughout your script.
#
# UserModel = YourModel
# For example, to use the Llama3 8B Instruct baseline, you can uncomment the lines below.
# Please remember to download the model weights and check them into the repository
# before submitting
# from models.vanilla_llama3_baseline import Llama3_8B_ZeroShotModel
# UserModel = Llama3_8B_ZeroShotModel
import os
import random
from typing import Any, Dict, List
import vllm
from .base_model import ShopBenchBaseModel
#### CONFIG PARAMETERS ---
# Set a consistent seed for reproducibility
AICROWD_RUN_SEED = int(os.getenv("AICROWD_RUN_SEED", 773815))
# Batch size you wish the evaluators to use when calling the `batch_predict` function
AICROWD_SUBMISSION_BATCH_SIZE = 16 # TUNE THIS VARIABLE depending on the number of GPUs you are requesting and the size of your model.
# VLLM Parameters
VLLM_TENSOR_PARALLEL_SIZE = 4 # TUNE THIS VARIABLE depending on the number of GPUs you are requesting and the size of your model.
VLLM_GPU_MEMORY_UTILIZATION = 0.85 # TUNE THIS VARIABLE depending on the number of GPUs you are requesting and the size of your model.
class Llama3_8B_ZeroShotModel(ShopBenchBaseModel):
"""
A zero-shot Meta-Llama-3-8B-Instruct baseline for ShopBench, served with vLLM, illustrating how to handle both
multiple choice and other types of tasks like Ranking, Retrieval, and Named Entity Recognition.
This model uses a consistent random seed for reproducible results.
"""
def __init__(self):
"""Initializes the model and sets the random seed for consistency."""
random.seed(AICROWD_RUN_SEED)
self.initialize_models()
def initialize_models(self):
# Initialize Meta Llama 3 - 8B Instruct Model
self.model_name = "models/meta-llama/Meta-Llama-3-8B-Instruct"
if not os.path.exists(self.model_name):
raise Exception(
f"""
The evaluators expect the model weights to be checked into the repository,
but we could not find the model weights at {self.model_name}
Please follow the instructions in the docs below to download and check in the model weights.
https://gitlab.aicrowd.com/aicrowd/challenges/amazon-kdd-cup-2024/amazon-kdd-cup-2024-starter-kit/-/blob/master/docs/download-baseline-model-weights.md
"""
)
# initialize the model with vllm
self.llm = vllm.LLM(
self.model_name,
worker_use_ray=True,
tensor_parallel_size=VLLM_TENSOR_PARALLEL_SIZE,
gpu_memory_utilization=VLLM_GPU_MEMORY_UTILIZATION,
trust_remote_code=True,
dtype="half", # note: bfloat16 is not supported on nvidia-T4 GPUs
enforce_eager=True
)
self.tokenizer = self.llm.get_tokenizer()
def get_batch_size(self) -> int:
"""
Determines the batch size that is used by the evaluator when calling the `batch_predict` function.
Returns:
int: The batch size, an integer between 1 and 16. This value indicates how many
queries should be processed together in a single batch. It can be dynamic
across different batch_predict calls, or stay a static value.
"""
self.batch_size = AICROWD_SUBMISSION_BATCH_SIZE
return self.batch_size
def batch_predict(self, batch: Dict[str, Any], is_multiple_choice:bool) -> List[str]:
"""
Generates a batch of predictions based on the associated prompts and task type.
For multiple choice tasks, it generates a single token.
For other tasks, it generates up to 100 new tokens,
and returns the raw text for the task-specific parsers to interpret.
Parameters:
- batch (Dict[str, Any]): A dictionary containing a batch of input prompts with the following keys
- prompt (List[str]): a list of input prompts for the model.
- is_multiple_choice (bool): A boolean flag indicating whether all the items in this batch belong to multiple choice tasks.
Returns:
List[str]: A list of predictions, one for each of the prompts received in the batch.
Each prediction is
a string representing a single integer[0, 3] for multiple choice tasks,
or a string representing a comma separated list of integers for Ranking, Retrieval tasks,
or a string representing a comma separated list of named entities for Named Entity Recognition tasks.
or a string representing the (unconstrained) generated response for the generation tasks
Please refer to parsers.py for more details on how these responses will be parsed by the evaluator.
"""
prompts = batch["prompt"]
# format prompts using the chat template
formatted_prompts = self.format_prommpts(prompts)
# set max new tokens to be generated
max_new_tokens = 100
if is_multiple_choice:
max_new_tokens = 1 # For MCQ tasks, we only need to generate 1 token
# Generate responses via vllm
responses = self.llm.generate(
formatted_prompts,
vllm.SamplingParams(
n=1, # Number of output sequences to return for each prompt.
top_p=0.9, # Float that controls the cumulative probability of the top tokens to consider.
temperature=0, # randomness of the sampling
seed=AICROWD_RUN_SEED, # Seed for reproducibility
skip_special_tokens=True, # Whether to skip special tokens in the output.
max_tokens=max_new_tokens, # Maximum number of tokens to generate per output sequence.
),
use_tqdm = False
)
# Aggregate answers into List[str]
batch_response = []
for response in responses:
batch_response.append(response.outputs[0].text)
if is_multiple_choice:
print("MCQ: ", batch_response)
return batch_response
def format_prommpts(self, prompts):
"""
Formats prompts by prepending a fixed system prompt to each one.
Parameters:
- prompts (list of str): A list of prompts to be formatted.
"""
system_prompt = "You are a helpful online shopping assistant. Please answer the following question about online shopping and follow the given instructions.\n\n"
formatted_prompts = []
for prompt in prompts:
formatted_prompts.append(system_prompt + prompt)
return formatted_prompts
import ast
VERSION = "0.1.0"
from loguru import logger
VERSION = "0.1.1"
MAX_RESPONSE_CHARACTERS = 5000
......@@ -80,10 +82,15 @@ class ShoppingBenchTaskParsers:
An integer representing the selected option. Returns -1 if the parsing fails due to
an invalid response format.
"""
default_response = -1
try:
return int(response.strip()[0])
except ValueError:
return -1
response = response.strip()
return int(response[0])
except Exception as e:
logger.warning(
f"SHOPBENCH_PARSER_WARNING::: Error parsing multichoice response: {e}. Responding with default : {default_response}"
)
return default_response
def _parse_ranking(self, response: str) -> list:
"""
......@@ -98,6 +105,7 @@ class ShoppingBenchTaskParsers:
A list of integers representing the items in ranked order. Limits to the first 5 unique
elements. Returns an empty list if duplicates are found or parsing fails.
"""
default_respomse = []
# Keep only numeric characters and specific punctuation.
cleaned_response = "".join(
c for c in response if c.isnumeric() or c in [",", " "]
......@@ -108,7 +116,9 @@ class ShoppingBenchTaskParsers:
for item in cleaned_response.split(","):
try:
# Attempt to convert each item to an integer and add it to the list.
ranked_items.append(int(item))
int_item = int(item)
if int_item <= 5: # we know int_item can be at most 5
ranked_items.append(int_item)
except ValueError:
pass # Skip non-numeric items.
......@@ -117,7 +127,7 @@ class ShoppingBenchTaskParsers:
# If there are duplicates, empty the list
if len(ranked_items) != len(set(ranked_items)):
ranked_items = []
ranked_items = default_respomse
return ranked_items
def _parse_generation(self, response: str) -> str:
......@@ -146,24 +156,30 @@ class ShoppingBenchTaskParsers:
Returns:
A list of integers representing the first 3 unique retrieved item indices.
"""
# Similar to ranking parser, but only returns the first 3 elements.
cleaned_response = "".join(
c for c in response if c.isnumeric() or c in [",", " "]
)
# Convert to list of integers
response = []
for item in cleaned_response.split(","):
try:
# Attempt to convert each item to an integer and add it to the list.
response.append(int(item))
except ValueError:
pass # Skip non-numeric items.
# consider only the first 3 elements
retrieved_items = response[:3]
default_response = []
try:
# Similar to ranking parser, but only returns the first 3 elements.
cleaned_response = "".join(
c for c in response if c.isnumeric() or c in [",", " "]
)
return retrieved_items
# Convert to list of integers
response = []
for item in cleaned_response.split(","):
try:
# Attempt to convert each item to an integer and add it to the list.
response.append(int(item))
except ValueError:
pass # Skip non-numeric items.
# consider only the first 3 elements
retrieved_items = response[:3]
return retrieved_items
except Exception as e:
logger.warning(
f"SHOPBENCH_PARSER_WARNING::: Error parsing retrieval response: {e}. Responding with default : {default_response}"
)
return default_response
def _parse_named_entity_recognition(self, response: str) -> list:
"""
......@@ -189,78 +205,124 @@ class ShoppingBenchTaskParsers:
raise SyntaxError(
"Unexpected Syntax error - fall back to comma separated list."
)
except (SyntaxError, ValueError):
except Exception as e:
# Fallback: split the string by commas and strip whitespace.
return [entity.strip() for entity in response.split(",")]
# We remove empty entities. This is not a bug fix, just an implementation choice.
return [
entity.strip()
for entity in response.split(",")
if entity.strip() != ""
]
import unittest
class TestShoppingBenchTaskParsers(unittest.TestCase):
def test_multichoice(self):
parser = ShoppingBenchTaskParsers("multichoice")
# Check for a valid numeric response
self.assertEqual(parser.parse("2"), 2)
# Check for an invalid (alphabetic) response, expecting failure code -1
self.assertEqual(parser.parse("a"), -1)
# Check handling of newline-only input, expecting failure code -1
self.assertEqual(parser.parse("\n"), -1)
# Check handling of space-only input, expecting failure code -1
self.assertEqual(parser.parse(" "), -1)
# Check handling of leading space before a valid response
self.assertEqual(parser.parse(" 2"), 2)
# Check handling of newline before a valid response
self.assertEqual(parser.parse("\n1"), 1)
# Check for newline and space before a valid response
self.assertEqual(parser.parse("\n 3"), 3)
# Check for newline and space only, expecting failure code -1
self.assertEqual(parser.parse("\n "), -1)
def test_ranking(self):
parser = ShoppingBenchTaskParsers("ranking")
# Basic successful parse of a comma-separated list of numbers
self.assertEqual(parser.parse("1, 2, 3, 4, 5"), [1, 2, 3, 4, 5])
# Successfully parses even when wrapped in square brackets
self.assertEqual(parser.parse("[1, 2, 3, 4, 5]"), [1, 2, 3, 4, 5])
# Fails (empty list) when numbers are repeated
self.assertEqual(parser.parse("1, 2, 2, 3"), [])
# Filters out non-numeric values correctly, keeping the valid numbers
self.assertEqual(parser.parse("1, 2, 4, aicrowd, 5"), [1, 2, 4, 5])
# Check handling of newline-only input, expecting empty list
self.assertEqual(parser.parse("\n"), [])
# Check handling of space and newline input, expecting empty list
self.assertEqual(parser.parse(" \n"), [])
# Parses numbers correctly even when prefixed by non-numeric text
self.assertEqual(
parser.parse("The answer is: 1, 2, 3, 4, 5"), [1, 2, 3, 4, 5]
)
# Correctly handles a leading comma
self.assertEqual(parser.parse(",1,2,3,4,5"), [1, 2, 3, 4, 5])
# Fails (empty list) when numbers are not comma-separated
self.assertEqual(parser.parse("1 2"), [])
def test_generation(self):
parser = ShoppingBenchTaskParsers("generation")
# Verifies correct response without modification
self.assertEqual(
parser.parse("This is a generated response."),
"This is a generated response.",
)
# Handles and trims extraneous newlines and spaces correctly
self.assertEqual(
parser.parse("\nThe answer is \n\n good.\n\n\n\n\n\n\n"),
"The answer is \n\n good.",
)
# Correctly returns empty string for newline and space-only inputs
self.assertEqual(parser.parse("\n \n"), "")
def test_retrieval(self):
parser = ShoppingBenchTaskParsers("retrieval")
# Basic successful parse of a comma-separated list of numbers
self.assertEqual(parser.parse("100, 200, 300"), [100, 200, 300])
# Successfully handles shorter than expected input lists
self.assertEqual(parser.parse("100, 200"), [100, 200])
# Filters out non-numeric values correctly, keeping the valid numbers
self.assertEqual(parser.parse("100, 200, jjhg"), [100, 200])
# Correctly parses numbers despite excessive spacing and newlines
self.assertEqual(
parser.parse("100, 200, \n\n\n 300"), [100, 200, 300]
)
# Limits output to first three elements if more are provided
self.assertEqual(parser.parse("100, 200, 300, 400"), [100, 200, 300])
# Correctly handles newline before valid input
self.assertEqual(parser.parse("\n 100, 200, 300"), [100, 200, 300])
# Returns empty list for newline-only inputs
self.assertEqual(parser.parse("\n \n \n"), [])

    def test_named_entity_recognition(self):
        parser = ShoppingBenchTaskParsers("named_entity_recognition")
        # Parses a quoted, bracketed list of strings into separate entities
        self.assertEqual(
            parser.parse("['New York', 'ShopBench', 'Amazon']"),
            ["New York", "ShopBench", "Amazon"],
        )
        # Parses comma-separated entities without brackets or quotes
        self.assertEqual(
            parser.parse("New York, ShopBench, Amazon"),
            ["New York", "ShopBench", "Amazon"],
        )
        # Known limitation: when quotes are omitted, brackets are not stripped,
        # so the opening bracket stays on the first entity and the closing
        # bracket stays on the last one.
        self.assertEqual(
            parser.parse("[New York, ShopBench, Amazon]"),
            ["[New York", "ShopBench", "Amazon]"],
        )
        # Parses entities when the input starts with a newline and a comma,
        # dropping the empty leading element
        self.assertEqual(
            parser.parse("\n, New York, ShopBench"), ["New York", "ShopBench"]
        )
        # Returns an empty list for a single space (no entities found)
        self.assertEqual(parser.parse(" "), [])
        # Returns an empty list for inputs made only of newlines and spaces
        self.assertEqual(parser.parse("\n \n"), [])


if __name__ == "__main__":
    # Example usage of the ShoppingBenchTaskParsers class for various task types.
    # MULTICHOICE EXAMPLE
    multichoice_parser = ShoppingBenchTaskParsers("multichoice")
    print("Multichoice Example:")
    print(multichoice_parser.parse("2"))  # Expected output: 2
    print(
        multichoice_parser.parse("a")
    )  # Expected output (failure case): -1
    print()
    # RANKING EXAMPLE
    ranking_parser = ShoppingBenchTaskParsers("ranking")
    print("Ranking Example:")
    print(
        ranking_parser.parse("1, 2, 3, 4, 5")
    )  # Expected output: [1, 2, 3, 4, 5]
    print(
        ranking_parser.parse("[1, 2, 3, 4, 5]")
    )  # Expected output: [1, 2, 3, 4, 5] (tolerant to '[' and ']')
    print(
        ranking_parser.parse("1, 2, 2, 3")
    )  # Expected output (failure case): [] because of repeated numbers
    print(
        ranking_parser.parse("1, 4, 5, aicrowd, 6")
    )  # Expected output: [1, 4, 5, 6] (non-numeric entries are dropped)
    print()
    # GENERATION EXAMPLE
    generation_parser = ShoppingBenchTaskParsers("generation")
    print("Generation Example:")
    print(
        generation_parser.parse("This is a generated response")
    )  # Expected output: 'This is a generated response'
    print()
    # RETRIEVAL EXAMPLE
    retrieval_parser = ShoppingBenchTaskParsers("retrieval")
    print("Retrieval Example:")
    print(
        retrieval_parser.parse("100, 200, 300")
    )  # Expected output: [100, 200, 300]
    print(
        retrieval_parser.parse("100, 200")
    )  # Expected output (shorter than 3): [100, 200]
    print(
        retrieval_parser.parse("100, 200, jjhg")
    )  # Expected output (non-numeric entries removed): [100, 200]
    print(
        retrieval_parser.parse("100, 200, 300, 400")
    )  # Expected output (only the first 3 elements are kept): [100, 200, 300]
    print()
    # NAMED ENTITY RECOGNITION EXAMPLE
    ner_parser = ShoppingBenchTaskParsers("named_entity_recognition")
    print("Named Entity Recognition Example:")
    print(
        ner_parser.parse("['New York', 'ShopBench', 'Amazon']")
    )  # Expected output: ['New York', 'ShopBench', 'Amazon']
    print(
        ner_parser.parse("New York, ShopBench, Amazon")
    )  # Expected output: ['New York', 'ShopBench', 'Amazon']
    print(
        ner_parser.parse("[New York, ShopBench, Amazon]")
    )  # Failure case: brackets are not tolerated when quotes are omitted,
    # so '[' and ']' stay attached to the boundary elements.
    # Expected output: ['[New York', 'ShopBench', 'Amazon]']
    unittest.main()
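
The retrieval and named-entity cases exercised by the tests above can be reproduced with a short standalone sketch. This is not the `ShoppingBenchTaskParsers` implementation, which is not shown in this part of the diff. The function names `parse_retrieval_sketch` and `parse_ner_sketch` and the `max_items` parameter are hypothetical, chosen only to mirror the expected outputs listed in the tests.

```python
import ast
from typing import List


def parse_retrieval_sketch(response: str, max_items: int = 3) -> List[int]:
    """Hypothetical helper: keep comma-separated integers, at most max_items."""
    items = []
    for token in response.split(","):
        token = token.strip()  # drop surrounding spaces and newlines
        if token.isdecimal():  # skip non-numeric entries such as "jjhg"
            items.append(int(token))
    return items[:max_items]


def parse_ner_sketch(response: str) -> List[str]:
    """Hypothetical helper: try a Python list literal, else split on commas."""
    try:
        parsed = ast.literal_eval(response.strip())
        if isinstance(parsed, list):
            return [str(entity).strip() for entity in parsed if str(entity).strip()]
    except (ValueError, SyntaxError, TypeError):
        pass  # not a valid literal; fall back to comma splitting
    # Brackets are not stripped here, which reproduces the failure case
    # documented in the named-entity tests.
    return [part.strip() for part in response.split(",") if part.strip()]


if __name__ == "__main__":
    print(parse_retrieval_sketch("100, 200, jjhg"))
    print(parse_ner_sketch("[New York, ShopBench, Amazon]"))
```

Trying a list literal first and falling back to a plain comma split is one simple way to get the documented behavior, including the bracket limitation when quotes are omitted. The real parser may differ in details that these tests do not cover.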
torch
\ No newline at end of file
torch
vllm>=0.4.2
loguru
@@ -3,5 +3,5 @@ pandas
sentence-transformers
rouge_score
evaluate
sacrebleu
sacrebleu[ja]
\ No newline at end of file
sacrebleu==2.4.1
sacrebleu[ja]
python-3.8