Composite Primary key in Parent Child relationship table.

Hi. I have the following tables

User.Order [Parent][Primary Key : Year , Month, OrderNo]

 > User.OrderDetails [Child] [Primary Key : {%%PARENT}("ITM")(Running Number)]

 > User.OrderDetails2[Child] [Primary Key : {%%PARENT}("ITM2")(Running Number)]

  My question is there any problem if I use a composite key in User.Order?  I am using the composite key is because I have a lot of records each month(millions) and sometimes I will need to clear the records for a specific month for reloading, with those primary key removing the data would be easier (possibility faster too).

  Currently I am aware that I cannot use "||" in my Year/Month/OrderNo, not an issue. Is there any other problem? Worried I might not able to pull out the records from the data after I implement the table design.

Thank you.
Regards
Jimmy

  • 0
  • 0
  • 93
  • 5
  • 1

Answers

Hi Jimmy

Just to make sure we are talking about the same thing:  in Cache the Primary Key is not always the ID Key.  I think you are really talking about the ID Key, used in the subscripts of the D global.  If I am correct then you have an index define on Year , Month, OrderNo with IDKey True.

From a Parent Child point of view having a compound IDKey is fine.  The IDKey of the parent will be something like "2018||11||1" and then the IDKey of the children will just add one more subscript:  "2018||11||1||child 1"

Generally speaking we recommend that you leave the IDKey as a system generated integer and instead of using a Parent Child relationship you define a Foreign Key with Cascade on Delete  to get the same behavior.  You can still define a Primary Key index on Year, Month, and OrderNo so deleted will be fast.  By keeping the data in different globals and using Integers as IDKeys you will be able to define a Bitmap index in all of your classes and will keep disc IO to a minimum when queries the different tables.

Now maybe you say you will not be using Bitmaps and you will never query the children without also referencing the Parent data, so having the data in the same global makes sense.  You need to look at everything  case by case.

Will it work, Yes.

Will it perform the way you want it to, Maybe.

Sorry I can't give you a simple yes or no, but hopefully I have provided the needed info so you can make an educated decision.  As always InterSystems Support is happy to go over your specific details and help you come up with what will work best.

Brendan Bannon

InterSystems Support Manager

Hi Brendan

 Thank you for your reply. That solved my concern about problem might arise using the IDKey True with child table. 

  Speaking about performance, I am interested in fine tuning the table design to get the best of of Cache database performance. If I may have your advise on my table design. The following is the detail information.

 I have tables for monthly  data  (in Server A)

User.OrderYYYYMM [IDKey : OrderNo]

 > User.OrderDetailsYYYYMM [Child] [IDKey : {%%PARENT}("ITM")(Running Number)]

 > User.OrderDetails2YYYYMM[Child] [IDKey : {%%PARENT}("ITM2")(Running Number)]


 Then I have  a Yearly data (in Server B)

User.Order [IDKey : Year,Month,OrderNo]

 > User.OrderDetails [Child] [IDKey : {%%PARENT}("ITM")(Running Number)]

 > User.OrderDetails2[Child] [IDKey : {%%PARENT}("ITM2")(Running Number)]

   There is 2 server to serve 2 type of user group from different location. The monthly data (Server A) is where the user will manage the data. Then the monthly data (Server A) will push to Yearly data(Server B, user use the data).  I put all the table in the same global so that I will not have any problem pushing the data from Server A to Server B (through Shadowing with some scripting). Should there is any update in Server A, I could easily identify the record in Server B and update it.

  For my case I do use bitmap for indexing.  I have 2 set of tables (monthly and yearly) is due to bitmap as well. The user in Server A might flush out the entire month of data and reinsert it again (Is also the user wish as they do not want the risk of accidentally deleting the wrong data).  Hence to avoid re-index the whole table again, I break it to monthly table to reduce the indexing time. This will not occur in server B as I could make user the Shadow "Filter routine" option to pick up the deleted record and update the index in Server B (not tested it yet.)

  As mentioned in your reply "query the children without also referencing the Parent data ",  there are time where I would only access the parent/child (not both at the same time).

Thank you.
Regards
Jimmy

Hi Jimmy

If I understand correctly the only table that will support a bitmap is the Order Table on Server A.  All the other tables have compound keys and will not support bitmaps.

It is possible to keep the Parent Child relationship and split the children out into their own globals.  You just change the Global node in the Mapping to something else.  That will make your Journal filter code a little more complex but it could improve performance a good bit.

The fact that you have OrderDetails and OrderDetails2 makes me think there is a lot of details for every order.  If are storing a lot of data for each row and each order has a large number of children your global is going to be large and that will slow down query performance.  Just switching the children to different globals could help.

Building Indices talks about about the fastest ways for you to rebuild an index on different types of systems.  You might find this helpful for your shadow filter code.

I still recommend using default IDKeys and Foreign Keys but without know what the really important queries are and the Properties and TuneTable data it is hard to know how hard I should push you in that directions.

Bitmaps are great for properties that have fewer than 10,000 distinct values.  The fewer the values the better the performance.  For example Gender is a great property to use with a bitmap while PatientID is a very bad property to bitmap.  The other big win with bitmaps is using multiple bitmaps together to solve a query.  While we can do this multi index solution with standard indices as well the performance with bitmap is MUCH faster.  For example if you want to find all the men who live in a given state and have a given eye color your where clause would be:  WHERE Gender = ? and State = ? and EyeColor = ?  If you have a bitmap on each of these properties the query plan would contain something like:  Combine Bitmap GenderIndex INTERSECT StateIndex INTERSECT EyeColorIndex  

Now you are thinking to yourself that you could define a standard index on (Gender, State, EyeColor) and get great performance and you would be correct, but what happens when the query changes so it does not include Gender?  Now your compound index will not perform as well but the bitmaps will still be used and give great performance StateIndex INTERSECT EyeColorIndex.   Compound indices are great to solve a specific query, but defining multiple bitmap instead gives you much more flexibility to solve more queries.

Hope this helps.  If you want to get into more specific questions about a query I can help with that.

Brendan

Hi Brendan

 Thanks and sorry for the late reply.  After testing it out,  I'll create each table in a separate global and the smaller table (e.g.   Order Contact in Order table, one to many)  that link to each table as parent and child. 

  As for the bitmap, I will not be using it as the data distinct values is way over 10,000 and not sure is it worth the effort or not (will just use the normal Indexing without bitmap). As changing all the tables to running number IDKEY it involve quite a number of work later when I consolidate all the records from different servers with Order, OrderDetails and OrderDetails2. 

Thanks.
Regards
Jimmy

> instead of using a Parent Child relationship you define a Foreign Key with Cascade on Delete

Hi Brendan, I've tended to use relationships in the past and would be interested to know why using a Foreign Key would be recommended over relationships?

Hi Sean

2 reasons:  

    1:  In general you get better performance if the data is stored in 2 different global.  The only time having the data in 1 global is better is when you query both the parent and the child together.  If you write a query against just one you still end up loading the other table as well.  So over all 2 globals means less disk IO.

    2:  You can't use bitmaps in the child table because the ID of the child is a String and Bitmaps only work when the IDKey is a positive Integer.

The downside of using a Foreign Key is with Objects you don't have the methods to get at children like you do with a Relationship.

Brendan