The Perils of Modifying the Storage Definition


Vivian Lee · Dec 15, 2021 3m read

Setting the Scene

I was editing the properties of a persistent ObjectScript class the other day and noticed that the storage definition wasn't updating to reflect my latest changes.

In this case, I deleted a property that was no longer needed from the class definition, saved, recompiled, and still saw it in the storage definition.

No problem, I thought. If the storage definition gets autogenerated on compile, I can just delete it and recompile the class. Sure enough, after doing this, I no longer saw the deleted property in the storage definition.

There, problem solved... right?

(cue wrong answer buzzer)

I later learned that this approach would work if the class had no stored data yet. But if there is pre-existing data, it could cause major data reference issues.

What is the Storage Definition Anyway?

The storage definition acts as a map between class properties and their physical storage locations in the database.
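To make that map concrete, here is a minimal sketch of what a persistent class and its compiler-generated storage definition might look like. The class `Demo.Book` and its properties are hypothetical, the layout assumes the default `%Storage.Persistent` storage, and the exact slot order and global names are chosen by the compiler and may differ on your system. Each `<Value name="N">` is a slot number, and the inner `<Value>` names the property stored in that slot.

```objectscript
/// Hypothetical class, used only to illustrate the storage layout
Class Demo.Book Extends %Persistent
{

Property Title As %String;

Property Subtitle As %String;

Property Author As %String;

Storage Default
{
<Data name="BookDefaultData">
<Value name="1">
<Value>%%CLASSNAME</Value>
</Value>
<Value name="2">
<Value>Title</Value>
</Value>
<Value name="3">
<Value>Subtitle</Value>
</Value>
<Value name="4">
<Value>Author</Value>
</Value>
</Data>
<DataLocation>^Demo.BookD</DataLocation>
<DefaultData>BookDefaultData</DefaultData>
<IdLocation>^Demo.BookD</IdLocation>
<IndexLocation>^Demo.BookI</IndexLocation>
<StreamLocation>^Demo.BookS</StreamLocation>
<Type>%Storage.Persistent</Type>
}

}
```

In bookshelf terms, slots 2 to 4 are the shelf positions, and every saved object gets its own shelf laid out the same way.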

Imagine you have a bookshelf with multiple slots and a list to keep track of which slot each book is in. Slots are numbered in ascending order, and new books are always placed in the left-most slot that your list shows as vacant.

Scenario	Storage Definition	Status	Bookshelf Analog
1	Class has no stored data yet and you modify the storage definition.	OK	Shelf has no books and you modify your map of which slots future books should be stored in. No data = no data reference issues.
2	Class has stored data, you add a new property without saving any data for it yet, then delete that property and add another one. (aka renaming the last added property before any data was stored for it yet)	OK	Slot 3 is the next vacant slot on your shelf. You write down that Book C is supposed to be in slot 3 but don't actually place a book in slot 3 yet, then update your map to say that slot 3 will now be reserved for Book D.
3	Class has stored data and you delete a property that has data stored for it from the storage definition.	ERROR	There are books in slots 1-3 of your bookshelf. Your map shows (slot 1: Book A, slot 2: Book B, slot 3: Book C). You remove Book B from your map only. Book B still exists on your shelf but your map shows slot 2 as vacant. The next time you add a new book, your map shows slot 2 as available and you'll unintentionally double-book (cue rimshot) slot 2.

Scenario 3 Continued

You might be tempted to work around this by physically removing Book B from slot 2 when you delete it from the map. Don't do this. You may accidentally corrupt your data. Surprisingly, manually modifying your stored data is a lot riskier than removing a book from a shelf.

Option	Storage Definition	Bookshelf Analog
1	Do nothing. Leave the data and storage definition alone.	If you don't plan to read Book B ever again, its existence on your map shouldn't matter since you won't be looking for it anyway.
2	If having a deprecated property remain in your storage definition still bothers you: leave the data alone and update the name of the property in the storage definition to indicate that it is no longer in use (e.g. propertyA => propertyADeprecated).	Keep Book B on your shelf and update the title of slot 2 in your map from Book B => Book B (No longer relevant).
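As a sketch of the second option, suppose a hypothetical class `Demo.Book` once had a `Subtitle` property in slot 3 and that property has since been deleted from the class. The class name, property names, and the `SubtitleDeprecated` label are all assumptions made for illustration. Only the name inside slot 3 changes; the slot numbers and their order stay exactly as the compiler generated them, and the stored data is not touched.

```objectscript
/// Hypothetical class: the Subtitle property has been removed,
/// but its slot is kept and relabeled in the storage definition
Class Demo.Book Extends %Persistent
{

Property Title As %String;

Property Author As %String;

Storage Default
{
<Data name="BookDefaultData">
<Value name="1">
<Value>%%CLASSNAME</Value>
</Value>
<Value name="2">
<Value>Title</Value>
</Value>
<Value name="3">
<Value>SubtitleDeprecated</Value>
</Value>
<Value name="4">
<Value>Author</Value>
</Value>
</Data>
<DataLocation>^Demo.BookD</DataLocation>
<DefaultData>BookDefaultData</DefaultData>
<IdLocation>^Demo.BookD</IdLocation>
<IndexLocation>^Demo.BookI</IndexLocation>
<StreamLocation>^Demo.BookS</StreamLocation>
<Type>%Storage.Persistent</Type>
}

}
```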

Thanks for reading!

Joel Solon · Dec 21, 2021

Nice article about an important question: when should you touch the storage definition? Some comments on the scenarios (with sound effects!). The only modification I'd make to the bookshelf analogy is that there are gazillions of shelves (every person gets their own shelf), all using the same mapping of what type of book goes in each slot.

Scenario 1: You are creating a class, and adding, deleting, and renaming properties. If there's any data for the class, it is test/temporary data and can be deleted at will. As Vivian saw, deleting a property doesn't change the storage definition (the slot for the deleted property is left unused), and renaming a property is really a deletion (old slot unused) followed by an addition (new slot). Once you get the class the way you want it for release, you can safely delete the storage definition (since there's no real data) and the next compilation will create a fresh one, with no unused slots. No danger (cue happy music).
Scenario 2: You are editing an existing class with real stored data. If you rename any property (new or existing, with or without data) it's best practice to rename the storage definition slot also before your next compile. If it's a newly added property without data and you don't bother to rename the slot, it's OK, but you now have a wasted slot that will never be used, and a new slot for the data you'll eventually start saving. Why not rename the slot right away? On the other hand, if it's a property that has data, you must rename the storage slot, or the data for that property will no longer be accessible, and new data for the renamed property will go in a different slot. Any danger is prevented by renaming the property and storage definition slot at the same time (cue triumphant marching music).
Scenario 3: You delete a property with real stored data. As before, the old slot is left unused. Any new property added later will not use that slot (no double-booking; new data won't ever be added there). But the real question is: Why are you deleting the property? Do you not care about that data anymore? If so, why aren't you deleting the data first? And by deleting the data, I don't mean editing the global directly (which I think is what Vivian meant by "don't do this"). But it's not dangerous to the table to use SQL to delete Book B from every bookshelf before deleting the property: update schema.table set property = null (cue computer bleeps).
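To picture Scenario 3 as described in this comment, here is a sketch using a hypothetical class `Demo.Book`. It assumes a `Subtitle` property was deleted from the class and a new `Publisher` property was added afterwards; all names are invented for illustration, and the actual slot numbers are whatever the compiler assigns. Following the description above, the old slot 3 stays in the storage definition under its old name, and the new property is given a new slot rather than reusing slot 3.

```objectscript
/// Hypothetical class: Subtitle was deleted, Publisher was added later
Class Demo.Book Extends %Persistent
{

Property Title As %String;

Property Author As %String;

Property Publisher As %String;

Storage Default
{
<Data name="BookDefaultData">
<Value name="1">
<Value>%%CLASSNAME</Value>
</Value>
<Value name="2">
<Value>Title</Value>
</Value>
<Value name="3">
<Value>Subtitle</Value>
</Value>
<Value name="4">
<Value>Author</Value>
</Value>
<Value name="5">
<Value>Publisher</Value>
</Value>
</Data>
<DataLocation>^Demo.BookD</DataLocation>
<DefaultData>BookDefaultData</DefaultData>
<IdLocation>^Demo.BookD</IdLocation>
<IndexLocation>^Demo.BookI</IndexLocation>
<StreamLocation>^Demo.BookS</StreamLocation>
<Type>%Storage.Persistent</Type>
}

}
```

A slot named `Subtitle` with no matching property in the class is the signal that the slot is unused.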

Editing the name of any unused slot in the storage definition and adding "-unused" to make things clear could be OK, but it's definitely not intended for that use. I'm not sure it's necessary to do anything in this case; if I see a slot in the storage definition named XYZ and there's no property XYZ, I know it's an unused slot. Maybe preceding the storage definition with a comment section would be cleaner:

/* unused slots
   XYZ (as of 1/1/2020)
   ABC (as of 6/1/2021)
*/

Please let me know if it's not clear, or if I missed a scenario.


Michael Pine · Dec 15, 2021

So just to confirm: there is no good way to remove a book once it's been given a reserved slot. The best thing to do is either to ignore it if it becomes irrelevant, like many books do on people's bookshelves, or to mark it as out of use.


Evgeny Shvarov · Dec 16, 2021

Thanks @Vivian Lee!

This is a known issue, and it is why storing the storage strategy with the class definition in the source code repository is a MUST.

Also, maybe it's not a bad idea to never delete properties and just mark them as archived/deprecated.


Ben Spead · Dec 16, 2021

As Vivian explained, you can delete the property definition and then change the name in the storage definition to make it clear that the slot should be ignored. This of course should be done while keeping versions of everything in source control, so that the reason for the change is documented and discoverable in the future should someone need to understand why the property was removed.

Evgeny Shvarov · Dec 17, 2021

Thanks, Ben! Just curious: is it safe, and is it a best practice, to edit the storage definition manually?

Or is it safer to let the compiler do the thing?

Agree on having the storage definition in the source control along with the class definition.


Ben Spead · Dec 17, 2021

It should be safe to edit manually as long as you are not changing the piece numbers or changing the order around. Simply changing the property name within <Value>...</Value> of the storage definition should be fine if it is no longer used at all. In our case, we have had instances in the past where we needed to change a property name. Rather than creating a new property and migrating the data, it proved to be safe and much more efficient to change the property name in both the property definition and the storage definition (as well as in all of the places in the source code that refer to it). The nice thing about this approach is that we've been able to simply do a Find/Replace in all of the source code and carefully review the diffs before submitting.

If others are aware of gotchas when changing the storage definition Value, I would be interested in hearing them.
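A minimal sketch of that kind of rename, using a hypothetical class `Demo.Book` in which a property formerly called `Title` becomes `BookTitle`. The names are assumptions for illustration only. The property declaration and the inner `<Value>` of its slot are changed together, before the next compile, while the slot number (`name="2"`) and the order of the entries are left alone.

```objectscript
/// Hypothetical class: Title has been renamed to BookTitle
/// in both the property definition and the storage definition
Class Demo.Book Extends %Persistent
{

Property BookTitle As %String;

Property Author As %String;

Storage Default
{
<Data name="BookDefaultData">
<Value name="1">
<Value>%%CLASSNAME</Value>
</Value>
<Value name="2">
<Value>BookTitle</Value>
</Value>
<Value name="3">
<Value>Author</Value>
</Value>
</Data>
<DataLocation>^Demo.BookD</DataLocation>
<DefaultData>BookDefaultData</DefaultData>
<IdLocation>^Demo.BookD</IdLocation>
<IndexLocation>^Demo.BookI</IndexLocation>
<StreamLocation>^Demo.BookS</StreamLocation>
<Type>%Storage.Persistent</Type>
}

}
```

Because slot 2 keeps its number, data saved under the old name should remain readable through the renamed property.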

Great analogy @Vivian Lee!! Thank you for writing this up :)