TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing
TextFlint is a multilingual robustness evaluation platform for natural language processing. It unifies text transformation, subpopulation, adversarial attack, and their combinations to provide a comprehensive robustness analysis. So far, TextFlint supports 13 NLP tasks.
If you’re looking for robustness evaluation results of SOTA models, you might want the TextFlint IO page.
You can test most transformations directly on our online demo.
TextFlint requires Python >= 3.7; installing with pip is recommended:

```bash
pip install textflint
```
Once TextFlint is installed, you can run it via the command line (`textflint ...`) or integrate it into another NLP project.
The general workflow of TextFlint is displayed above. Evaluation of a target model can be divided into three steps:

1. For input preparation, the original dataset for testing, which is to be loaded by `Dataset`, should first be formatted as a series of JSON objects (see the example input below). You can use the built-in `Dataset` following this instruction. The TextFlint configuration is specified by `Config`, and the target model is loaded as a `FlintModel`.
2. In sample generation, transformations (i.e., `Transformation`, `Subpopulation`, and `AttackRecipe`) are applied to the `Dataset` to generate transformed samples. Besides, to ensure the semantic and grammatical correctness of the transformed samples, `Validator` calculates a confidence score for each sample to filter out unacceptable ones.
3. Finally, `Analyzer` collects the evaluation results and `ReportGenerator` automatically generates a comprehensive report of model robustness.
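As a concrete illustration of step 1, a JSON-formatted input file for a sentiment-analysis-style task might contain one object per sample. The field names `x`/`y` and the label strings below are illustrative assumptions only; the exact format for each task is described in the IO format document referenced below.

```json
{"x": "The film is a delight from start to finish.", "y": "positive"}
{"x": "Two hours of my life I will never get back.", "y": "negative"}
```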
For example, on the Sentiment Analysis (SA) task, this is a statistical chart of the performance of `XLNET` with different types of `Transformation`/`Subpopulation`/`AttackRecipe` on the `IMDB` dataset.
We release tutorials on running the whole TextFlint pipeline on various tasks.
Using TextFlint to verify the robustness of a specific model is as simple as running the following command:
```bash
$ textflint --dataset input_file --config config.json
```
where `input_file` is the input file in CSV or JSON format and `config.json` is a configuration file with generation and target-model options. Transformed datasets are saved to your output directory according to `config.json`.
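As a rough illustration (not an authoritative schema), a configuration file typically specifies the task, output directory, and which transformations and subpopulations to apply. The key names below (`task`, `out_dir`, `max_trans`, `trans_methods`, `sub_methods`) are assumptions for illustration only; consult the documentation of `Config` for the exact options.

```json
{
  "task": "SA",
  "out_dir": "./out/",
  "max_trans": 1,
  "trans_methods": ["Keyboard", "Ocr"],
  "sub_methods": []
}
```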
Because sample generation is decoupled from model verification, TextFlint can be used inside another NLP project with just a few lines of code:
```python
from textflint import Engine

data_path = 'input.json'   # dataset formatted as a series of JSON objects
config = 'config.json'     # generation and target-model options

# Apply the configured transformations and save the transformed datasets
# to the output directory specified in config.json.
engine = Engine()
engine.run(data_path, config)
```
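Because the transformed datasets are written as plain files, they can afterwards be scored with any model or framework. The snippet below is a minimal sketch, not part of the textflint API: it assumes the output files contain one JSON object per line with hypothetical `x`/`y` fields, takes a user-supplied prediction function, and the file names in the comments are placeholders.

```python
import json

def accuracy_on_file(predict, path):
    # Score a user-supplied `predict` function on a (possibly transformed)
    # dataset file, assuming one JSON object per line with `x` and `y` fields.
    correct = total = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            sample = json.loads(line)
            correct += int(predict(sample["x"]) == sample["y"])
            total += 1
    return correct / max(total, 1)

# Compare robustness by evaluating the same model on the original data and on
# a transformed version written by TextFlint (placeholder file names):
# acc_ori = accuracy_on_file(my_predict, "out/ori_data.json")
# acc_trans = accuracy_on_file(my_predict, "out/trans_data.json")
```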
For more details on TextFlint's input and output formats, please refer to the IO format document.
TextFlint consists of three layers:

- Input layer: receives textual datasets and models as input, represented as `Dataset` and `FlintModel` respectively.
  - `Dataset`: a container that provides efficient and handy operation interfaces for `Sample`. `Dataset` supports loading, verifying, and saving data in JSON or CSV format for various NLP tasks.
  - `FlintModel`: the target model used in an adversarial attack.
- Generation layer: consists of four main parts:
  - `Subpopulation`: generates a subset of a `Dataset`.
  - `Transformation`: transforms each sample of a `Dataset` if it can be transformed (see the toy sketch after this overview).
  - `AttackRecipe`: attacks the `FlintModel` and generates a `Dataset` of adversarial examples.
  - `Validator`: verifies the quality of samples generated by `Transformation` and `AttackRecipe`.

  TextFlint provides an interface to integrate the easy-to-use adversarial attack recipes implemented based on textattack. Users can refer to textattack for more information about the supported `AttackRecipe`.
- Report layer: analyzes model testing results and provides a robustness report for users.
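To make the role of a `Transformation` concrete, here is a from-scratch toy example; it is not textflint's implementation and uses none of its API, but it mimics a keyboard-typo style of perturbation: given one text, it returns several lightly corrupted variants, which is the kind of output a `Transformation` produces before `Validator` filters it.

```python
import random

# Toy keyboard-typo perturbation (illustration only, not textflint code).
NEIGHBORS = {"a": "qs", "e": "wr", "i": "uo", "o": "ip", "n": "bm", "t": "ry"}

def keyboard_typo_variants(text, n_variants=3, seed=0):
    # Return up to n_variants copies of `text`, each with one character
    # replaced by an adjacent key, as a stand-in for a real Transformation.
    rng = random.Random(seed)
    candidates = [i for i, c in enumerate(text) if c.lower() in NEIGHBORS]
    if not candidates:
        return [text]
    variants = []
    for _ in range(n_variants):
        chars = list(text)
        i = rng.choice(candidates)
        chars[i] = rng.choice(NEIGHBORS[chars[i].lower()])
        variants.append("".join(chars))
    return variants

print(keyboard_typo_variants("the acting is brilliant"))
```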
| Section | Description |
|---|---|
| Documentation | Full API documentation and tutorials |
| Tutorial | Tutorials on textflint components and the pipeline |
| Website | Evaluation results of SOTA models and transformed data downloads |
| Online Demo | Interactive demo for trying single text transformations |
| Paper | Our system paper, accepted at ACL 2021 |
We welcome community contributions to TextFlint in the form of bug fixes 🛠️ and new features 💡! If you want to contribute, please first read our contribution guideline.
If you are using TextFlint in your work, please kindly cite our ACL 2021 TextFlint demo paper:
```bibtex
@inproceedings{wang-etal-2021-textflint,
    title = {TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing},
    author = {Wang, Xiao and Liu, Qin and Gui, Tao and Zhang, Qi and others},
    booktitle = {Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations},
    month = {aug},
    year = {2021},
    address = {Online},
    publisher = {Association for Computational Linguistics},
    url = {https://aclanthology.org/2021.acl-demo.41},
    doi = {10.18653/v1/2021.acl-demo.41},
    pages = {347--355}
}
```