The COS Faker

Hi Community,

This post introduces one of my first projects in COS. I created it when I started learning the language, and I keep improving it to this day.

cosFaker (here on GitHub) is a pure COS library for generating fake data.

cosFaker vs Populate Utils

So why use cosFaker if Caché already has the populate data utility?

OK, the populate utility has great things, like the SSN generator, for example. But what do you do when you have a field with a long description of a product? How do you check whether that table will list the emails, or whether that calculated property will count the days since the last user interaction?

For me, cosFaker is the populate utils on steroids! You can use it together with Populate to generate %Stream values, long strings, or random dates.

e.g.

Class Sample.Product Extends (%Persistent, %Populate, %XML.Adaptor)
{

  Property Type As %String;
  Property Notes As %String(MAXLEN = 250, MINLEN = 10);
  Property Name As %String;
  Property Origin As %String;
  Property LastInteraction As %TimeStamp;

Method OnPopulate() As %Status [ ServerOnly = 1 ]
{
   Set tSC = $$$OK
   Try {
      Set ..Type = "Coffee"
      Set ..Name = ##class(cosFaker.Coffee).BlendName()
      Set ..LastInteraction = ##class(cosFaker.Dates).Backward($Random(80))
      Set ..Notes = ##class(cosFaker.Coffee).Notes()
      Set ..Origin = ##class(cosFaker.Coffee).Origin()
   } Catch tException {
      Set:$$$ISOK(tSC) tSC = tException.AsStatus()
   }
   Quit tSC
}
}

Then, to generate ten products:

 Do ##class(Sample.Product).Populate(10)

It's also great for writing unit tests like this:

Method TestPersonLogin()
{
   Set person = ##class(Sample.DataModel.Person).%New()
   Set person.FirstName = ##class(cosFaker.Name).FirstName()
   Set person.LastName = ##class(cosFaker.Name).LastName()
   Set person.Email = ##class(cosFaker.Internet).Email(person.FirstName, person.LastName)
   Do $$$AssertStatusOK(person.%Save())

   Set matcher=##class(%Regex.Matcher).%New("\A([\w+\-].?)+@[a-z\d\-]+(\.[a-z]+)*\.[a-z]+\z")
   Set matcher.Text = person.Email
   Do $$$AssertTrue(matcher.Locate())
   Do $$$AssertEquals(person, ##class(Sample.DataModel.Person).%OpenId(person.%Id()))
}

cosFaker is FUN!!

 

Yes, cosFaker is fun... Instead of names like "Gibbs, Zoe K.", you get "Goku" or "Piccolo" using ##class(cosFaker.DragonBall).Character()

And there is a lot of other fun stuff, like a Pokemon name generator, Star Wars planets and droids, coffee, UFC fighter names, Lorem Ipsum, etc.

 

So, that's all, folks!

Cheers

 

Comments

Do you have any benchmark data comparing this to the populate utils?

Cheers,
Fab

Hi @Fabian Haupt

Unfortunately, I haven't done a benchmark... But comparing the performance is an awesome idea. I'll do it and post the results here.
Thanks ;)

What benchmark? Populating the data or retrieving the populated data? I don't think the speed of data population is that important when comparing the standard populate utility with cosFaker. What matters is the tool's ability to mimic real data as closely as possible (e.g. the distribution of values).

When I first saw this project, I thought it was based on the faker.js project (demo). Unfortunately, they built their own base instead. Faker.js, by the way, is quite a good project for populating data in JavaScript (frontend or backend, it doesn't matter); it supports many languages, even Russian and Czech, and lots of different data formats.

Part of testing with populated data is performance testing. If your data-populating utilities can't deliver high enough throughput, you can't really test your application under load.
Generating meaningfully large data sets also takes a lot of time. For example, with the Caché populate utils it takes 7.891 seconds on my machine to create 1M pairs of ,

The same takes 0.39s on my machine with a rudimentary Go implementation.
I very much disagree that performance doesn't matter.
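
For anyone who wants to repeat this kind of measurement on the COS side, here is a minimal sketch of a timing harness. It is not the code behind the numbers above; Sample.Benchmark and TimePopulate are hypothetical names made up for illustration, and it times the Sample.Product class from the post rather than the data set used in this comment.

```objectscript
/// Hypothetical helper class, not part of cosFaker or %Populate.
Class Sample.Benchmark Extends %RegisteredObject
{

/// Times a single Populate() run and returns the elapsed seconds.
ClassMethod TimePopulate(pCount As %Integer = 100000) As %Numeric
{
   // $ZHorolog holds the seconds (with fractions) elapsed since system startup
   Set tStart = $ZHorolog
   // Populate() is inherited from %Populate and returns the number of objects created
   Set tCreated = ##class(Sample.Product).Populate(pCount)
   Set tElapsed = $ZHorolog - tStart
   Write "Created ", tCreated, " objects in ", tElapsed, " s", !
   Quit tElapsed
}

}
```

Running Do ##class(Sample.Benchmark).TimePopulate(1000000) once with the OnPopulate() override and once without it would give a rough side-by-side. It measures wall-clock time, so results will vary with journaling, indices, and whatever else the machine is doing.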

In terms of an online service, you could do something like:

curl -H "Content-Type: application/json" -X POST --data '{"count":1000000,"headers":false,"fields":[{"name":"Name","type":"name"},{"name":"Age","type":"digits"}]}' http://data.panadadata.com -o data.json

(Disclaimer: I run that service.)