Mar. 1st, 2024: Following the policy of Spider, we have decided to make the CSpider test set publicly available. You are encouraged to freely test it by checking the CSpider dataset link below. Please note that we will no longer accept submissions for CSpider.
CSpider is a large-scale, complex, cross-domain Chinese semantic parsing and text-to-SQL dataset, translated from Spider by two NLP researchers and one computer science student. The goal of the CSpider challenge is to develop natural language interfaces to cross-domain databases for Chinese, which is currently a low-resource language in this task area. It consists of 10,181 questions and 5,693 unique complex SQL queries on 200 multi-table databases covering 138 different domains. Following Spider 1.0, the train and test sets of CSpider contain different complex SQL queries and different databases. To do well, systems must generalize not only to new SQL queries but also to new database schemas.
CSpider is translated from Spider, but it poses additional challenges. First, the structure of a relational database, in particular its table and column names, is typically expressed in English, which complicates mapping Chinese questions to the database schema. Second, the basic semantic units denoting columns or cells are words, but Chinese word segmentation can be erroneous. It is also interesting to study the influence of other linguistic characteristics of Chinese, such as zero pronouns, on SQL parsing.
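The first challenge above can be sketched in a few lines. The snippet below is a hypothetical illustration (not part of any CSpider baseline): a naive lexical schema linker that matches question tokens against English column names finds a link for the English question but nothing for its Chinese counterpart, since Chinese words share no surface form with the English schema.

```python
# Hypothetical illustration: naive lexical schema linking breaks down when
# the question is Chinese but the schema vocabulary is English.
question_en = "How many singers are there?"
question_zh = "有多少名歌手？"  # same question in Chinese
schema_columns = {"singer", "name", "age"}


def lexical_links(question, columns):
    """Return schema items that appear inside some question token."""
    tokens = set(question.lower().rstrip("?？").split())
    return {c for c in columns if any(c in t for t in tokens)}


print(lexical_links(question_en, schema_columns))  # {'singer'} via "singers"
print(lexical_links(question_zh, schema_columns))  # set() -- no lexical overlap
```

Real systems therefore need cross-lingual representations (e.g. multilingual pretrained encoders, as in several leaderboard entries below) rather than surface matching.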
CSpider Paper (EMNLP'19)

The data is split into training, development, and unreleased test sets. Download a copy of the dataset (distributed under the CC BY-SA 4.0 license):
Details of the baseline models and the evaluation script can be found at the following GitHub site:
Once you have built a model that works to your expectations on the dev set, you can submit it to get official scores on the dev set and a hidden test set. To preserve the integrity of test results, we do not release the test set to the public. Instead, we require you to submit your model so that we can run it on the test set for you. Here is a tutorial walking you through the official evaluation of your model:
Submission Tutorial

Some examples look like the following:
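The examples rendered on the original page are not reproduced here. A hypothetical question–SQL pair in the CSpider format (illustrative only, not an actual dataset entry) would pair a Chinese question with an English-schema SQL query:

```
question: 有多少名歌手？        (How many singers are there?)
query:    SELECT count(*) FROM singer
```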
Ask us questions at our GitHub issues page, or contact minqingkai@westlake.edu.cn or shiyuefeng@westlake.edu.cn.
We expect the dataset to evolve. We would greatly appreciate it if you could donate your non-private databases or SQL queries to the project.
We thank Tao Yu for sharing the original Spider test set with us, and the anonymous reviewers for their valuable comments on this project. We also thank Pranav Rajpurkar and Tao Yu for giving us permission to build this website based on SQuAD and Spider.
Following Spider, we use exact matching for evaluation. Instead of simply comparing the predicted and gold SQL queries as strings, we decompose each SQL query into its clauses and perform a set comparison within each clause. Please refer to our GitHub page, or to the Spider paper and its GitHub page, for more details.
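The clause-level comparison can be sketched as follows. This is a simplified illustration, not the official evaluation script: the clause list and tokenization are assumptions, and real evaluation must also handle nested subqueries, aliases, and value normalization.

```python
import re

# Simplified clause set; the official script handles more constructs.
CLAUSE_KEYWORDS = ["SELECT", "FROM", "WHERE", "GROUP BY",
                   "HAVING", "ORDER BY", "LIMIT"]


def split_clauses(sql):
    """Split a flat SQL string into {clause keyword: set of tokens}."""
    sql = sql.strip().rstrip(";")
    pattern = "(" + "|".join(CLAUSE_KEYWORDS) + ")"
    parts = re.split(pattern, sql, flags=re.IGNORECASE)
    # parts = ['', KW1, body1, KW2, body2, ...]
    return {kw.upper(): set(body.upper().split())
            for kw, body in zip(parts[1::2], parts[2::2])}


def exact_match(pred, gold):
    """Match if both queries have the same clauses with equal token sets."""
    return split_clauses(pred) == split_clauses(gold)
```

Because each clause is compared as a set, a prediction like `SELECT age, name` matches a gold `SELECT name, age` even though the strings differ, while a missing or extra clause (e.g. an absent `WHERE`) makes the match fail.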
Rank | Date | Model | Team | Dev | Test
---|---|---|---|---|---
1 | January 16, 2024 | FastRAT + AST Ranking + GPT-4 | HUAWEI Poisson-ERC-KG Lab & HUAWEI Cloud | 66.2 | 62.1
2 | June 29, 2023 | Roberta + Seq2SQL | Beijing PERCENT Technology Group Co., Ltd. | 66.2 | 60.6
3 | May 24, 2022 | LGESQL + GTL + Electra + QT | SJTU X-LANCE Lab | 64.0 | 60.3
4 | December 31, 2021 | LGESQL + ELECTRA + QT | HUAWEI Poisson Lab & HUAWEI Cloud | 64.5 | 58.1
5 | May 24, 2022 | LGESQL + GTL + Infoxlm | SJTU X-LANCE Lab | 61.0 | 57.0
6 | November 27, 2020 | RAT-SQL + GraPPa + Adv | Alibaba | 59.7 | 56.2
7 | May 24, 2022 | LGESQL + GTL + Multilingual BERT | SJTU X-LANCE Lab | 58.6 | 52.7
8 | July 8, 2020 | XL-SQL | Anonymous | 54.9 | 47.8
9 | November 25, 2020 | DG-SQL + Multilingual BERT | University of Edinburgh (https://arxiv.org/abs/2010.11988) | 50.4 | 46.9
10 | October 10, 2020 | RAT-SQL (without schema linking) + Multilingual BERT | Anonymous | 41.4 | 37.3
11 | July 15, 2020 | RYANSQL + Multilingual BERT | Kakao Enterprise (https://arxiv.org/abs/2004.03125) | 41.3 | 34.7
12 | July 8, 2020 | DG-SQL | Anonymous | 35.5 | 26.8
13 | November 28, 2019 | CN-SQL | oneconnect | 22.9 | 18.8
14 | September 18, 2019 | SyntaxSQLNet (based on Yu et al. (2018a)) | Westlake University (https://arxiv.org/abs/1909.13293) | 16.4 | 13.3