Call for Papers
Recently, there have been attempts to address the problem of benchmarks and metrics that do not represent performance well. For example, in abusive language detection, there are both static datasets of hard-to-detect examples (Röttger et al. 2021) and dynamic approaches for generating such examples (Calabrese et al. 2021). On the platform DynaBench (Kiela et al. 2021), benchmarks are dynamic and constantly updated with hard-to-classify examples, which helps avoid overfitting to a predetermined dataset. However, these approaches capture only a tiny fraction of the issues with benchmarking. There is still much work to do.
For this edition of the workshop on Novel Evaluation Approaches for Text Classification Systems (NEATCLasS), we welcome submissions that discuss such new evaluation approaches, introduce new ones or refine existing ones, and promote the use of novel metrics for abuse detection, hate speech recognition, sentiment analysis and similar tasks within the community. Furthermore, the workshop will promote discussion on the importance, potential and danger of disagreement in tasks that require subjective judgements. This discussion will also focus on how to evaluate human annotations, and how to find the most suitable set of annotators (if any) for a given instance and task. The workshop will solicit, among others, research papers about:
- Issues with current evaluation metrics and benchmarking datasets
- New evaluation metrics
- User-centred (qualitative or quantitative) evaluation of social media text analysis tools
- Adaptations and translations of novel evaluation metrics for other languages
- New datasets for benchmarking
- Increasing data quality in benchmarking datasets, e.g., avoidance of selection bias, identification of suitable expert human annotators for tasks involving subjective judgements
- Systems that facilitate dynamic evaluation and benchmarking
- Models that perform better on hard-to-classify instances and on novel evaluation metrics such as AAA, DynaBench and HateCheck
- Bias, error analysis and model diagnostics
- Phenomena not captured by existing evaluation metrics (such as models making the right predictions for the wrong reason)
- Approaches to mitigating bias and common errors
- Alternative designs for NLP competitions that evaluate a wide range of model characteristics (such as bias, error analysis, cross-domain performance)
- Challenges of downstream applications (in industry, computational social science and elsewhere) and reflections on how these challenges can be captured in evaluation metrics.
Format and Submissions
The workshop will take place as a half-day meeting on 5 June. Participants will be invited to trial an innovative format for paper presentations: presenters will be given 5 minutes to describe their research questions and hypotheses, followed by a group discussion. Presenters will then be given 5 more minutes to describe their method and results, followed by a second group discussion about the interpretation and implications of these results. There will also be a group discussion to bring researchers together and collect ideas for new evaluation approaches and future work in the field.
We invite research papers (8 pages), position and short papers (4 pages), and demo papers (2 pages). References and appendices (if applicable) are excluded from these page counts, but the length of the entire paper, including references, must not exceed 11 pages for full papers, 5 pages for short papers, and 3 pages for demo papers. Submissions must be original and must not have been published previously or be under consideration for publication elsewhere while being evaluated for this workshop. Submissions will be evaluated by the program committee based on the quality of the work and its fit to the workshop themes. All submissions must be anonymized for double-blind review, and a high-resolution PDF of the paper should be uploaded to the EasyChair submission site (link below) before the paper submission deadline. The accepted papers will be published as Proceedings of the ICWSM Workshops. Please use the AAAI two-column, camera-ready style.
While we encourage you to attend in person, we are also planning to live stream the workshop on Zoom and record the talks so that as many people as possible can participate.
Submission Information
- Submission link: https://easychair.org/conferences/?conf=neatclass2023
- Paper submission deadline: 17 April 2023
- Paper acceptance notification: 30 April 2023
- Final camera-ready paper due: 6 May 2023
- Workshop day: 5 June 2023
All deadlines are 11:59 pm AoE (Anywhere on Earth).