Faker-driven, configuration-based, platform-agnostic, locale-compatible data faker tool
This project has been abandoned. For a better, faster and more maintained alternative, see Smile’s gdpr-dump. We have created our own repository for config files for popular Magento 2 extensions, see elgentos/gdpr-dump-magento-2-extensions.
Point Masquerade to a database, give it a rule-set defined in YAML, and Masquerade will anonymize the data for you automatically!
You can add your own configuration files in a directory named config in the same directory as where you run masquerade. The configuration files will be merged with any already present configuration files for that platform, overriding any out-of-the-box values.
See the Magento 2 YAML files as examples for notation.
For example, to override the admin.yaml for Magento 2, place a file at config/magento2/admin.yaml. To completely disable/skip a group, just add this content:
admin:
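As a minimal sketch of the override described above, the following shell commands create the directory and a file that contains only the group key. The path and the admin group name come from the example; it is assumed that you run this from the directory where you invoke masquerade.

```shell
# Create the platform override directory next to where masquerade is run.
mkdir -p config/magento2
# A group key with nothing under it; per the text above this skips the group.
printf 'admin:\n' > config/magento2/admin.yaml
# Show the result.
cat config/magento2/admin.yaml
```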
You can add your own config files for custom tables or tables from 3rd party vendors.
To generate such files, you can run the masquerade identify command. This looks for columns whose names hint at personally identifiable data, such as name or address. It will interactively ask you to add them to a config file for the chosen platform.
You can affect only certain records by including a ‘where’ clause - for example to avoid anonymising certain admin accounts, or to preserve data used in unit tests, like this:
customers:
  customer_entity:
    provider: # this sets options specific to the type of table
      where: "`email` not like '%@mycompany.com'" # leave mycompany.com emails alone
You might want to fully or partially delete data, e.g. if your developers don’t need sales orders, or you want to keep the database much smaller than the production database. Specify the ‘delete’ option.
When deleting some Magento data, e.g. sales orders, add the command line option --with-integrity, which enforces foreign key checks, so for example sales_invoice records will be deleted automatically if their parent sales_order is deleted:
orders:
  sales_order:
    provider:
      delete: true
      where: "customer_id != 3" # delete all except customer 3's orders because we use that for testing
    # no need to specify columns if you're using 'delete'
If you use ‘delete’ without a ‘where’ and without --with-integrity, it will use ‘truncate’ to delete the entire table. It will not use truncate if --with-integrity is specified, since truncate bypasses key checks.
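The rule in the paragraph above can be sketched as a small decision function. The helper below is hypothetical and not part of masquerade; it only restates the documented behaviour for a table configured with delete: true.

```shell
# Hypothetical helper, not part of masquerade: prints the operation the
# paragraph above describes for a table configured with 'delete: true'.
#   $1 = "yes" if the table has a 'where' clause, otherwise "no"
#   $2 = "yes" if --with-integrity was passed, otherwise "no"
delete_strategy() {
  if [ "$1" = "no" ] && [ "$2" = "no" ]; then
    echo "truncate"   # whole table, key checks are bypassed
  else
    echo "delete"     # honours 'where' and foreign key checks
  fi
}

delete_strategy no no    # prints: truncate
delete_strategy yes no   # prints: delete
delete_strategy no yes   # prints: delete
```

In other words, truncate is only chosen when nothing needs to be filtered and no foreign key checks were requested.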
You can use the Magento2Eav table type to treat EAV attributes just like normal columns, e.g.:
products:
  catalog_product_entity: # specify the base table of the entity
    eav: true
    provider:
      where: "sku != 'TESTPRODUCT'" # you can still use 'where' and 'delete'
    columns:
      my_custom_attribute:
        formatter: sentence
      my_other_attribute:
        formatter: email
  catalog_category_entity:
    eav: true
    columns:
      description: # refer to EAV attributes like normal columns
        formatter: paragraph
For formatters, you can use all default Faker formatters.
You can also create your own custom providers with formatters. They need to extend Faker\Provider\Base and they need to live in either ~/.masquerade or .masquerade relative to where you run masquerade.
An example file .masquerade/Custom/WoopFormatter.php:
<?php
namespace Custom;

use Faker\Provider\Base;

class WoopFormatter extends Base {
    public function woopwoop() {
        $woops = ['woop', 'wop', 'wopwop', 'woopwoop'];
        return $woops[array_rand($woops)];
    }
}
And then use it in your YAML file. A provider needs to be set on the column name level, not on the formatter level.
customer:
  customer_entity:
    columns:
      firstname:
        provider: \Custom\WoopFormatter
        formatter:
          name: woopwoop
Some systems have linked tables containing related data, e.g. Magento’s EAV system, Drupal’s entity fields and WordPress’s post metadata tables. You can provide custom table types.
To do this you need to implement 2 interfaces:
- Elgentos\Masquerade\DataProcessorFactory instantiates your custom processor. It receives the table service factory, the output object and the whole array of YAML configuration specified for your table.
- Elgentos\Masquerade\DataProcessor processes the operations required by the run command:
  - truncate should truncate the table provided via configuration
  - delete should delete from the table provided via configuration
  - updateTable should update the table with values provided by the generator, based on the column definitions in the configuration. See Elgentos\Masquerade\DataProcessor\RegularTableProcessor::updateTable for a reference.
First you need to start with a factory that will instantiate an actual processor.
An example file .masquerade/Custom/WoopTableFactory.php:
<?php
namespace Custom;

use Elgentos\Masquerade\DataProcessor;
use Elgentos\Masquerade\DataProcessor\TableServiceFactory;
use Elgentos\Masquerade\DataProcessorFactory;
use Elgentos\Masquerade\Output;

class WoopTableFactory implements DataProcessorFactory
{
    public function create(
        Output $output,
        TableServiceFactory $tableServiceFactory,
        array $tableConfiguration
    ): DataProcessor {
        $tableService = $tableServiceFactory->create($tableConfiguration['name']);
        return new WoopTable($output, $tableService, $tableConfiguration);
    }
}
An example file .masquerade/Custom/WoopTable.php:
<?php
namespace Custom;

use Elgentos\Masquerade\DataProcessor;
use Elgentos\Masquerade\DataProcessor\TableService;
use Elgentos\Masquerade\Output;

class WoopTable implements DataProcessor
{
    /** @var Output */
    private $output;

    /** @var array */
    private $configuration;

    /** @var TableService */
    private $tableService;

    public function __construct(Output $output, TableService $tableService, array $configuration)
    {
        $this->output = $output;
        $this->tableService = $tableService;
        $this->configuration = $configuration;
    }

    public function truncate(): void
    {
        $this->tableService->truncate();
    }

    public function delete(): void
    {
        $this->tableService->delete($this->configuration['provider']['where'] ?? '');
    }

    public function updateTable(int $batchSize, callable $generator): void
    {
        $columns = $this->tableService->filterColumns($this->configuration['columns'] ?? []);
        $primaryKey = $this->configuration['pk'] ?? $this->tableService->getPrimaryKey();
        $this->tableService->updateTable(
            $columns,
            $this->configuration['provider']['where'] ?? '',
            $primaryKey,
            $this->output,
            $generator,
            $batchSize
        );
    }
}
And then use it in your YAML file. A processor factory needs to be set on the table level, and can be a simple class name, or a set of options which are available to your class.
customer:
  customer_entity:
    processor_factory: \Custom\WoopTableFactory
    some_custom_config:
      option1: "test"
      option2: false
    columns:
      firstname:
        formatter:
          name: firstName
Download the phar file:
curl -L -o masquerade.phar https://github.com/elgentos/masquerade/releases/latest/download/masquerade.phar
$ php masquerade.phar run --help
Description:
  List of tables (and columns) to be faked
Usage:
  run [options]
Options:
  --platform[=PLATFORM]
  --driver[=DRIVER]        Database driver [mysql]
  --database[=DATABASE]
  --username[=USERNAME]
  --password[=PASSWORD]
  --host[=HOST]            Database host [localhost]
  --port[=PORT]            Database port [3306]
  --prefix[=PREFIX]        Database prefix [empty]
  --locale[=LOCALE]        Locale for Faker data [en_US]
  --group[=GROUP]          Comma-separated groups to run masquerade on [all]
  --with-integrity         Run with foreign key checks enabled
  --batch-size=BATCH-SIZE  Batch size to use for anonymization [default: 500]
You can also set these variables in a config.yaml file in the same location as where you run masquerade from, for example:
platform: magento2
database: dbnamehere
username: userhere
password: passhere
host: localhost
port: porthere
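For comparison, the same settings can be passed as the command-line options listed in the help output. The sketch below is a dry run: the masquerade_dry_run wrapper is a hypothetical helper that only prints the command, the values are the placeholders from the config.yaml example, and 3306 is assumed for the port because it is the default shown by --help.

```shell
# Hypothetical dry-run wrapper: prints the command instead of executing it.
# Replace 'echo php' with 'php' to run masquerade for real.
masquerade_dry_run() {
  echo php masquerade.phar run "$@"
}

# Option names are taken from the 'run --help' output; values are placeholders.
masquerade_dry_run \
  --platform=magento2 \
  --database=dbnamehere \
  --username=userhere \
  --password=passhere \
  --host=localhost \
  --port=3306
```

Keeping the settings in config.yaml avoids retyping them on every run.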
Check out the wiki on how to run Masquerade nightly in CI/CD.
To build the phar from source you can use the build.sh script. Note that it depends on Box, which is included in this repository.
# git clone https://github.com/elgentos/masquerade
# cd masquerade
# composer install
# chmod +x build.sh
# ./build.sh
# bin/masquerade
To build a deb for this project run:
# apt-get install debhelper cowbuilder git-buildpackage
# export ARCH=amd64
# export DIST=buster
# cowbuilder --create --distribution buster --architecture amd64 --basepath /var/cache/pbuilder/base-$DIST-amd64.cow --mirror http://ftp.debian.org/debian/ --components=main
# echo "USENETWORK=yes" > ~/.pbuilderrc
# git clone https://github.com/elgentos/masquerade
# cd masquerade
# gbp buildpackage --git-pbuilder --git-dist=$DIST --git-arch=$ARCH --git-ignore-branch -us -uc -sa --git-ignore-new
To generate a new debian/changelog for a new release:
export BRANCH=master
export VERSION=$(date "+%Y%m%d.%H%M%S")
gbp dch --debian-tag="%(version)s" --new-version=$VERSION --debian-branch $BRANCH --release --commit