Is Caché SQL essential to Caché Objects?
I would like to start a discussion regarding Caché Objects and Caché SQL.
It is my understanding that the creators of Caché Objects see Caché SQL as the reporting arm of Objects and as such SQL is essential to Caché Objects.
I once met a Caché Objects programmer who was writing code to $Order through the Globals because that person thought that Caché SQL was too slow and inefficient. I attempted to convince the person otherwise.
So, what say you? Is SQL essential to Caché Objects?
Or
Is Caché SQL a nice to have but we can do without? Is it too slow? Is it too inefficient?
I really would like to have an idea of what the InterSystems Caché Community, as well as the InterSystems Engineers and anyone else with skin in the game thinks.
I guess it depends on which context you're referring to essential.
If you are working on a pure Cache environment you can probably go without the SQL layer. However, you need SQL to expose data to the world (e.g. reporting tools that can only connect through xDBC layer).
In terms for performance you can always make the case that SQL is slower when compared to direct global access, however SQL provides more "readability" for new (and not so new) developers since SQL is "standard" (or wider spread) compared to ObjectScript.
My rule is to always chose SQL over direct global access unless the performance requirements can only be met by the latter.
I guess you must at least use Extent query to be able iterate the objects. Or can you manage without?
Since %PARALLEL was added to the mix, you can't be sure that $order beats SQL.
Curious about everyone's thoughts about dynamic vs embedded SQL in terms of preference and performance?
Yes. In fact, SQL access predates objects.
Check this "Using Caché Objects ▶ Introduction to Persistent Objects ▶ Extent Queries"
https://community.intersystems.com/product-documentation/using-cach%C3%A...
You can manage without. You have plenty of Object Oriented options to get to the items. Or you could use $O over the global to get the IDs (similar to what Extent would do)
I always thought Extent is the only way. If I have for example persons in the database, how can I iterate all of them?
Custom queries can be used. Check the examples there. Or you can just iterate over data global and call %OpenId.
Please check my comment in the post below
Let's say we must iterate all Persons. Sure, by knowing its storage scheme, we can do it with Order but it's like invoking virtual function by memory offset in c++ world. The only object-oriented (i.e. without knowing too much of object internals) way I know is to use Extent. I wonder why generated class does not provide methods like GetFirst/GetLast/GetNext, this way users of the class could iterate extent in object-oriented way.
What's the use case for iterating over all (or even a big slice of) objects?
Very same as with raw Order: user opens a list, you give him first 100 objects and when he presses Next page, you continue from index 101
Show user SQL result without opening objects?
I just mean that if I have simple task to show 1000 records split into 10 pages, with "raw" globals I can do it simple with $order but with objects there is nothing similar to $Order so I have to use SQL
Iterating over indexed value can be very helpful.
We have IndexOpen and IndexExists autogenerated methods (article on autogenerated methods) for persistent classes. I'm using these very handy methods a lot.
It would be great if somebody introduce Next/Prev autogenerated methods too e.g. via some NavigationAdapter class to derive from.
Then execute this statement:
select * from DQ()
or
select * from DQ() where lastName %STARTSWITH 'S' order by age DESC
2 row(s) affected
Cache doesn't provide this "out of the box" but you can always create a class that works as a Collection (a collection of IDs if you will) and provides an Iterator interface/functionality for to you iterate over the values. You can always expose and access data through Objects as you need to.
$ORDER would do. Not the cleanest (or preferred) but the point being you can do without SQL.
The idea is extremely simple: each class includes auto-generated query 'select ID from classname' that returns ids of all existing objects of this class
I wonder why when working with objects, we still must use SQL (Extent Query) to be able iterate all instances of a class. Do you know reasons for this design?
Arto,
Thank you for the pointer.
I have seen the documentation before and read through it more than once.
But, I still cannot seem to wrap my mind around it what they are saying.
Is it just that InterSystem's documentation tries to be difficult, or is it just me?
Maybe they should put out a version of documentation for "Dummies"
Obviously each project will have different requirements. But here are some generalizations from my experience:
Cache SQL: use when you want to expose your data to report writers who use SQL. Use Cache SQL write capability if you must but may be slower. Fairly easy to read code. But you do NOT need Cache SQL to create an app.
Cache Objects: used most often, especially if you are an object oriented programmer. Hard to not see the advantages of objects in most projects. Reasonable speed and code readability.
Cache Direct Global access: used for legacy programs (lots out there) and may be faster performance (depending on read/write code structure). Difficult to read after years of code changes.
Tom FItzgibbon | gototomAtG...l.com | 347-464-8531
There can be more to "speed" that just execution time of your code. If you need to add an index to help performance, then a Caché SQL query will be able to utilise it without any code changes. If you are using $Order, then you need to spend some time writing and testing your new code.
I typically use SQL to identify objects, then Object methods to interact with them, but for simple updates of large numbers of objects, I'll usually use a simple SQL Update. The beauty of Caché is the flexibility to use the best tool for the task at hand
Hi Mike,
In 2013, we did some performance test to figure out the same on Cache 2010 and we found the following:-
Accessing data for reporting etc
$Order was the fastest, next was SQL, objects were the least efficient, probably because it fetches a lot of information every time you open an object reference.
Insert /Update
SQL was better as it took care of the Indices and SQL computed fields etc consistently and efficiently.
Object save, had some inconsistency in terms of SQL computed fields.
Coding
Cache Studio, supported Objects the most :)
I am not sure if this still holds true with the current version.
Anil
One thought I had, I am not sure if it applies.
I assume that the Object processing code is more developed than the SQL processing code.
Assuming that holds, can we say that future SQL processing code has more to gain than future Object processing code?
If that is true, then going with SQL in the mix seems like the correct way to go.
One very important point is that SQL is easier to read and modify than Cache.
So maybe not in run time efficiency, but the advantage is to SQL for speed of changes.
Perhaps I missed the post, but has any engineers weighed-in on the planning of Objects and SQL?
Was SQL part of the mix from the start?
I started this post and I appreciate everyone who contributed.
However, I see that the phrase "Extent Query" used a number of times.
I must admit that I really do not know its meaning.
Will someone help me understand the meanings and ramifications of "Extent Query"?