We’re building out a data lake in IRIS 2025.1 that aggregates data across multiple business systems and departments. I’m trying to establish best practices for schema design and separation.
Right now, I’m thinking of using a separate schema for each distinct system of record feeding into the data lake - for example, one schema per upstream source system, rather than splitting based on function (e.g. staging, raw, curated). The idea is that this would make it easier to manage source ownership, auditing, and pipeline logic, especially when multiple domains are contributing data.