Question
· 23 hr ago

Schema Design Best Practices for Cross-Departmental Data Lakes in IRIS

We’re building out a data lake in IRIS 2025.1 that aggregates data across multiple business systems and departments. I’m trying to establish best practices for schema design and separation.

Right now, I’m thinking of using a separate schema for each system of record feeding into the data lake (one schema per upstream source system), rather than splitting by lifecycle stage (e.g. staging, raw, curated). The idea is that this would make it easier to manage source ownership, auditing, and pipeline logic, especially when multiple domains are contributing data.
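
To make the idea concrete, here is a rough sketch of the kind of per-source naming convention this implies. Everything in it is an assumption for illustration: the source system names, the `src_` prefix, and the helper functions `schema_for` and `qualified_table` are hypothetical, and nothing here is IRIS-specific.

```python
import re

# Hypothetical upstream systems of record; names are illustrative only.
SOURCE_SYSTEMS = ["HR Payroll", "Finance-ERP", "CRM"]


def schema_for(source: str) -> str:
    """Derive a schema name from a source-system name.

    Assumed convention: 'src_' plus a lowercase slug of the system name.
    """
    slug = re.sub(r"[^a-z0-9]+", "_", source.lower()).strip("_")
    return f"src_{slug}"


def qualified_table(source: str, table: str) -> str:
    """Build a schema-qualified table name for a given source system."""
    return f"{schema_for(source)}.{table}"


if __name__ == "__main__":
    # List the schema each source system would own under this convention.
    for system in SOURCE_SYSTEMS:
        print(system, "->", schema_for(system))
```

With a convention like this, ownership and access rules could be attached per schema, so each upstream team is responsible for exactly one namespace of tables.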

But I’d love to hear what others have done. Specifically:

  • What’s the best way to structure schemas in a cross-departmental data lake?
  • Do you separate by source system, business function, or lifecycle stage?
  • Are there downsides to having many small schemas instead of one large shared schema?
  • How do you manage naming, discoverability, and access control across schemas?

Any examples, patterns, or anti-patterns would be helpful.

Thanks!

Product version: IRIS 2025.1