Article
Henry Pereira · Dec 1, 2021 5m read

Data anonymization, introducing iris-Disguise

freepik- freepik.com
First of all, what is data anonymization?

According to Wikipedia:

Data anonymization is a type of information sanitization whose intent is privacy protection. It is the process of removing personally identifiable information from data sets, so that the people whom the data describe remain anonymous.

In other words, the data anonymization is a process that retains the data but keeps the source anonymous.
Depending on the adopted anonymization technique the data is redacted, masked or substituted.

And that is the purpose of iris-Disguise, to provide a set of anonymization tools.

You can use in two different ways, by method execution or specify your anonymization strategy inside the persistent class definition itself.

The current version of iris-Disguise offers 6 strategies to anonymize data:

  • Destruction
  • Scramble
  • Shuffling
  • Partial Masking
  • Randomization
  • Faking

Let me explain each strategy, I will show a method execution with an example and as mentioned, I'll also show how to apply inside the persistent class definition.
To use iris-Disguise in this way you need to "wear a disguise glasses".
In the persistent class, you can extent the dc.Disguise.Glasses class and change any property with the data type with the strategy of your choice.
After that, at any moment, just call the DisguiseProcess method on the class. All the values will be replaced using the strategy of the data type.

So buckle up and let's go.

Destruction

This strategy will replace a entire column with a word ('CONFIDENTIAL' is the default).

Do ##class(dc.Disguise.Strategy).Destruction("classname", "propertyname", "Word to replace")

The third parameter is optional. If not provided, the word 'CONFIDENTIAL' will be used.

Class packageSample.FictionalCharacter Extends (%Persistent, dc.Disguise.Glasses)
{
Property Name As dc.Disguise.DataTypes.String(FieldStrategy = "DESTRUCTION");
}
Do ##class(packageSample.FictionalCharacter).DisguiseProcess()

1

Scramble

This strategy will scrambling all characters in a property.

Do ##class(dc.Disguise.Strategy).Scramble("classname", "propertyname")

Class packageSample.FictionalCharacter Extends (%Persistent, dc.Disguise.Glasses)
{
Property Name As dc.Disguise.DataTypes.String(FieldStrategy = "SCRAMBLE");
}
Do ##class(packageSample.FictionalCharacter).DisguiseProcess()

scramble

Shuffling

Shuffling will rearrange all values in a given property. Is not a masking strategy because it works "verticaly".
This strategy is useful for relatinship because referential integrity will be kept.
Until this version, this method only works on one-to-many relationships.

Do ##class(dc.Disguise.Strategy).Shuffling("classname", "propertyname")

Class packageSample.FictionalCharacter Extends (%Persistent, dc.Disguise.Glasses)
{
Property Name As %String;
Property Weapon As dc.Disguise.DataTypes.String(FieldStrategy = "SHUFFLING");
}
Do ##class(packageSample.FictionalCharacter).DisguiseProcess()

shuffling

Partial Masking

This strategy will obfuscate the part of data, a credit card number for example, can be replaced by 456X XXXX XXXX X783

Do ##class(dc.Disguise.Strategy).PartialMasking("classname", "propertyname", prefixLength, suffixLength, "mask")

PrefixLength, suffixLength and mask are optional. If not provided, the default values will be used.

Class packageSample.FictionalCharacter Extends (%Persistent, dc.Disguise.Glasses)
{
Property Name As %String;
Property SSN As dc.Disguise.DataTypes.PartialMaskString(prefixLength = 2, suffixLength = 2);
Property Weapon As %String;
}
Do ##class(packageSample.FictionalCharacter).DisguiseProcess()

partialmsk

Randomization

This strategy will generate purely random data. There are three types of randomization: integer, numeric and date.

Do ##class(dc.Disguise.Strategy).Randomization("classname", "propertyname", "type", from, to)

type: "integer", "numeric" or "date". "integer" is the default.

from and to are optional. Is to define the range of randomization.
For integer type the default range is 1 to 100. For numeric type the default range is 1.00 to 100.00.

Class packageSample.FictionalCharacter Extends (%Persistent, dc.Disguise.Glasses)
{
Property Name As %String;
Property Age As dc.Disguise.DataTypes.RandomInteger(MINVAL = 10, MAXVAL = 25);
Property SSN As %String;
Property Weapon As %String;
}
Do ##class(packageSample.FictionalCharacter).DisguiseProcess()

rand

Fake Data

The idea of Faking is to replace data with random but plausible values.
iris-Disguise provides a small set of methods to generate fake data.

Do ##class(dc.Disguise.Strategy).Fake("classname", "propertyname", "type")

type: "firstname", "lastname", "fullname", "company", "country", "city" and "email"

Class packageSample.FictionalCharacter Extends (%Persistent, dc.Disguise.Glasses)
{
Property Name As dc.Disguise.DataTypes.FakeString(FieldStrategy = "FIRSTNAME");
Property Age As %Integer;
Property SSN As %String;
Property Weapon As %String;
}
Do ##class(packageSample.FictionalCharacter).DisguiseProcess()

fake

 I want to hear from you!

Feedback and ideas are welcome!

Let me know what you think of this tool, how it fits your needs and what features are missing.

And I want to say a very special thanks to @Henrique Dias, @Oliver Wilms, @Robert Cemper, @YURI MARX GOMES and @Evgeny Shvarov that commented, reviewed, suggested and made rich discussions which inspired me to create and improve the iris-Disguise.

190
1 2 5 315
Log in or sign up to continue

Nice app, @Henry Pereira !

If I have the property Name that contains full names, and I need to Fake it, I should:

0. Install ZPM module iris-disquise

1. Change the datatype to dc.Disguise.DataTypes.FakeString("FULLNAME")

2.  Run the ##class(dc.Disguise.Strategy).Fake("myclass", "Name", "fullname")

Right?

I'm curious, why should I repeat fullname 2 times? Maybe I could omit the strategy at least in the Fake method assuming that strategy can be taken from the property?

If you change the property datatype and extend your class from dc.Disguise.Glasses you only need to run the method DisguiseProcess from your class.

The strategy will be taken from property datatype.


An alternative way is run ##class(dc.Disguise.Strategy).Fake only, in this way you don't need to change the datatype from porperty.
Thanks @Evgeny Shvarov 
 

Great article. thank you!!