Storage Schema in VCS: to Store Or Not to Store?

Answers

You should definitely store your storage schema in VCS. When you first compile and install your solution into an environment, it doesn't really matter -- as you said, the storage will be generated automatically during compilation. However, leaving it out will cause major issues when you have to upgrade an environment to a newer version of your class and keep the existing data.

If you had a class definition in version 1 of your system with properties Address and Zipcode, your storage schema would look like

node=$LB($ClassName,Address,Zipcode)

If in version 2 you add another property, BusinessPhone, and you kept your schema, the new storage definition will look like

node=$LB($ClassName,Address,Zipcode,BusinessPhone)

And all old data will still be valid; its BusinessPhone property will just be empty.

However, if you didn't save your schema, the new storage will be sorted alphabetically, like this:

node=$LB($ClassName,Address,BusinessPhone,Zipcode)

And all old data will now have its Zipcode value showing up as BusinessPhone!
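The effect described above can be reproduced with a toy model. This is a minimal Python sketch, not Caché code: the `read` helper, the sample record, and the list-based schema are all assumptions made for illustration, and the alphabetical regeneration simply follows the description in this answer.

```python
# Toy model of positional storage: a record is a list of values and the
# schema is the ordered list of property names that labels each slot.

def read(record, schema):
    """Label each stored slot with the property name the schema assigns to it."""
    return {name: (record[i] if i < len(record) else "")
            for i, name in enumerate(schema)}

# Version 1 schema and a record written with it (sample values are made up).
schema_v1 = ["$ClassName", "Address", "Zipcode"]
old_record = ["Person", "1 Main St", "02142"]

# Schema kept in VCS: the new property is appended after the existing slots.
schema_kept = schema_v1 + ["BusinessPhone"]

# Schema lost: slots are regenerated in alphabetical order.
schema_regenerated = ["$ClassName"] + sorted(["Address", "Zipcode", "BusinessPhone"])

kept = read(old_record, schema_kept)
regenerated = read(old_record, schema_regenerated)

print(kept)         # Zipcode keeps its value, BusinessPhone is empty
print(regenerated)  # the old Zipcode value is now read as BusinessPhone
```

The stored record never changes in this sketch; only the labels applied to its slots do, which is why the damage shows up on read rather than on write.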

I have encountered this problem a couple of times, when a class definition was exported before it was compiled (so the storage schema was not updated), and it was not easy to fix: you need to iterate across the whole global and rewrite it, trying to guess whether each record is old data or new.

Hope this helps

Sergei

Evgeny

Yes, of course you should.  The storage schema is an essential part of the class definition.

Why would you not want to?

Regards
George

www.georgejames.com

Why not: if you import the class without the schema, it will be generated automatically according to the class description.  So why export it and keep it in VCS?

 

Evgeny

If you have customised the storage schema, or even added and then deleted properties, then a regenerated schema will not match the original schema. 

This can have serious consequences as data may get stored in a different place from data stored by previous versions of the class.

George
www.georgejames.com

"If you have customised the storage schema"

But how can I customize it? Do you mean manually?

When I export the class def to a file, I get the regenerated schema every time. So the only way to customize it is to change the schema manually in the exported file, and that would risk problems after importing the class.

To be clear, I do keep the schema with the class def in Git, and I have never (I hope) changed this part of the class definition. It is useful to me only for information purposes, to know where the data is stored.

So here is my question: why should I export and keep this part of the class def if I don't need to know where the data is and don't want to risk manually changing the storage definition?

"This can have serious consequences as data may get stored in a different place from data stored by previous versions of the class."

I believe that changing the properties and indexes of a persistent class will definitely have serious consequences for data from previous versions of the class (but happily we have recipes for data conversion and so on).

 

 

Evgeny

What method are you using that even gives you the choice to not export the storage schema?  I don't see anything that allows this, other than if you export the class before the schema has even been generated.  Is that what you do?

George

www.georgejames.com

 

Evgeny

Try this experiment:

  1. Create a class containing two properties a and b.
  2. Populate the class with a couple of rows of data.
  3. Delete property a.
  4. Export the class without its storage schema.
  5. Import the class, regenerate the storage.
  6. View the data created earlier.

I think you'll find that the regenerated class will show incorrect values for property b.  This would be a very nasty surprise if it happened in a real application.
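The experiment can be imitated outside Caché with a small Python sketch. Everything here is an assumption made for illustration: rows are plain lists, the `column` helper is hypothetical, the class-name slot is left out for brevity, and the "kept" case assumes the saved storage definition still reserves the slot of the deleted property.

```python
# Two rows written while the class still had properties a (slot 0) and b (slot 1).
rows = [["a1", "b1"], ["a2", "b2"]]

# Storage kept with the class: the slot of the deleted property a stays
# reserved, so b is still read from slot 1.
slots_kept = {"b": 1}

# Storage regenerated from the class body alone: b is now the only
# property, so it is assigned the first slot.
slots_regenerated = {"b": 0}

def column(rows, slots, name):
    """Read one property from every row using the given slot assignment."""
    return [row[slots[name]] for row in rows]

b_kept = column(rows, slots_kept, "b")
b_regen = column(rows, slots_regenerated, "b")

print(b_kept)   # values originally stored for b
print(b_regen)  # values originally stored for a, now reported as b
```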

Regards
George

www.georgejames.com

 

And what will happen if one decides to revert a class def to a previous version, with the previous state of the storage def, when some data has already been populated using the new schema?

It seems that there is no "good" choice between the topic starter's two options, only "bad" vs "worse", unless business logic is carefully separated from data and kept in different classes. In that case the probability of having to revert a data class def should be lower.

 

If one simply takes an older storage def from VCS, how can Cache's internal migration facility help? The history of schema changes will be lost. It seems that only some ad hoc global-level migration code can help in this case.

So this kind of "forced" schema migration (with populated data that you don't want to lose) would usually be a pain.

Alexy

Provided the storage schema is exported along with the class definition (it will be, automatically, unless you mess with the default export options), you will not have a problem reverting to a previous definition.  The schema evolution mechanism used by Studio ensures the consistency of the data locations from previous versions.

This works well unless you make some deliberate schema changes that include a manual data migration.  In this case if you reverted to a previous version then you would also need to undo the manual data migration.  So there is a danger here if you are not aware of it or prepared for it.

The documentation on this is quite sketchy but worth reading anyway: http://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY=...

Regards
George

www.georgejames.com

 

Right. But I would never delete or rename properties that already have data behind them. I prefer the "deprecated" approach to "deleting" unused properties.

I think it is a good candidate for another question. 

So, in the end, if I want to store only the class body in VCS (without the storage schema), it is safe to do so if:

I use only the default storage schema in this project;

I never delete properties and mark them as "deprecated" instead.

Right?

So, it's never safe to go without the storage schema (even with the default storage schema) if you want to either delete or add class properties.

Evgeny

I think you have come to the wrong conclusion.  It is never ok to export the class without the storage schema. 

The case of the deleted property is just one example of how things can go wrong.  A manual change to the schema of any kind (like changing the global name) would also get lost if you don't export it.  It's even possible that future versions of Cache might use a different algorithm that gives a different result.  You just can't second guess all the constraints that you need to watch for.

In short: never export a persistent class definition without the storage schema.
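One way to enforce this rule in a repository is a check that runs before commit. The following Python sketch is only an illustration under stated assumptions: it assumes classes are exported as text with a `Class ... Extends ...` header line and a `Storage <name>` block, the `missing_storage` function and the sample sources are hypothetical, and it only recognises classes that name `%Persistent` directly in their header (subclasses of other persistent classes would be missed).

```python
import re

# Assumed export layout: a "Class <name> Extends <superclasses>" header line
# and, for compiled persistent classes, a "Storage <name>" block.
CLASS_LINE = re.compile(r"^Class\s+\S+\s+Extends\s+(.+)$", re.MULTILINE)
STORAGE_BLOCK = re.compile(r"^Storage\s+\w+\s*$", re.MULTILINE)

def missing_storage(source: str) -> bool:
    """Return True if the source looks like a persistent class with no storage block."""
    header = CLASS_LINE.search(source)
    if header is None or "%Persistent" not in header.group(1):
        return False  # not recognisably persistent, nothing to check
    return STORAGE_BLOCK.search(source) is None

# Hypothetical sample exports used to exercise the check.
with_storage = (
    "Class Demo.Person Extends %Persistent\n{\n\n"
    "Property Address As %String;\n\n"
    "Storage Default\n{\n<Type>%Library.CacheStorage</Type>\n}\n\n}\n"
)
without_storage = (
    "Class Demo.Person Extends %Persistent\n{\n\n"
    "Property Address As %String;\n\n}\n"
)

print(missing_storage(with_storage))
print(missing_storage(without_storage))
```

A class exported before its first compilation would also be flagged by this check, which matches the failure case described earlier in the thread.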

Regards
George

www.georgejames.com

 

Yes, you are right and I agree. George, thank you very much for the thorough explanations.

I see this from an integrity viewpoint.

What if the user customized the storage? If you don't export it with the class definition, the class would map to another global.

This is even more worrying if you map a class to an existing global, since such mappings are usually based on user-made globals.
 

Removing the schema definition also means introducing a check to verify whether the class is %Persistent or not, since you can also have global-mapped classes. That could also reduce performance.

 

And the fundamental question is:

Why would you version something different from what you actually intended?
 

Yes, sure. If you customized the default storage on purpose (which is not really recommended, see @Kyle Baxter's answer), it becomes part of the solution and should be stored in VCS along with the class def.