Case Study: Enterprise Information Management for a Large Government Housing Agency
Our client is a large, not-for-profit government agency that provides housing and housing assistance to low-income families. The organization serves over 20,000 families, including working families with children, seniors, and disabled households.
The agency engaged our company to build an enterprise information management (EIM) system. The system's core is a data platform hosted in Microsoft Azure.
The agency's reporting and information sharing had become a problem. Reports were assembled manually in Excel and distributed via email or file shares, which undermined both the timeliness and the accuracy of the information being presented. Because the reports were static, users could not interact with the data.
Additionally, leadership had a strong desire to share insights with the public. The agency wanted to be as transparent as possible with community leaders and other stakeholders about the types of services provided and the number of households served. When we arrived, this was not possible.
Our client records sensitive data about the members it serves. This data includes personally identifiable information (PII) and, while useful for internal reporting, cannot be shared in a public forum.
The PII needed to be stored and secured in the data platform but only reported where required. This demanded diligence in creating an appropriate data model and choosing the proper method for sharing reports and dashboards via the web.
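One way to separate internal and public reporting layers is to pseudonymize direct identifiers before data reaches any publicly visible model. The sketch below is illustrative only: the column names are hypothetical, and the actual platform's data model is not described in detail here.

```python
import hashlib

# Hypothetical salt; in practice this would come from a secret store
# such as Azure Key Vault, never hard-coded.
SALT = "example-salt"

# Columns assumed to contain direct identifiers (illustrative names).
PII_COLUMNS = {"ssn", "first_name", "last_name", "street_address"}

def pseudonymize(record: dict) -> dict:
    """Return a copy of the record with PII columns replaced by salted hashes.

    Hashed values still join consistently across tables (same input ->
    same token) but cannot be read by public dashboard consumers.
    """
    out = {}
    for column, value in record.items():
        if column in PII_COLUMNS and value is not None:
            digest = hashlib.sha256((SALT + str(value)).encode()).hexdigest()
            out[column] = digest[:16]  # a short token is enough for joins
        else:
            out[column] = value
    return out

member = {"member_id": 1042, "first_name": "Ana", "household_size": 3}
public_row = pseudonymize(member)
```

Measures and non-identifying attributes pass through untouched, so aggregate counts for the public dashboard remain accurate while identifiers are unreadable.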
The government agency wanted to modernize its data operations but had no appetite for growing its on-premises footprint or adding staff to support the effort. The initial work for the engagement entailed determining the appropriate cloud platform. The two competing providers were Amazon Web Services (AWS) and Microsoft Azure. We provided our client with an in-depth comparison of services and pricing for each platform, and our client chose Microsoft. The solution was built on the following Azure services:
- Azure Data Lake Storage Gen. 2 (ADLS)
- Azure Data Factory (ADF)
- Azure Databricks
- Azure SQL
- Azure Logic Apps
- Power BI
The data for the platform was sourced from two on-premises databases fed by two disparate systems: a legacy system and a commercial-off-the-shelf (COTS) system. The data from those systems needed to be extracted, cleansed, and harmonized into a single reporting model.
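The harmonization step can be pictured as mapping each source system's fields onto one shared reporting schema. The sketch below uses made-up field names for both systems; in the actual platform this work ran in Azure Databricks, and plain Python is used here only for illustration.

```python
# Hypothetical field mappings from each source system to the shared schema.
LEGACY_FIELD_MAP = {"HH_ID": "household_id", "SVC_CD": "service_type", "DT": "service_date"}
COTS_FIELD_MAP = {"HouseholdKey": "household_id", "ServiceName": "service_type", "ServiceDate": "service_date"}

def harmonize(row: dict, field_map: dict, source: str) -> dict:
    """Rename source-specific fields to the shared schema and tag lineage."""
    unified = {target: row.get(src) for src, target in field_map.items()}
    unified["source_system"] = source  # keep lineage for auditing
    return unified

legacy_row = {"HH_ID": "A-17", "SVC_CD": "RENT_ASSIST", "DT": "2021-03-01"}
cots_row = {"HouseholdKey": "A-17", "ServiceName": "RENT_ASSIST", "ServiceDate": "2021-04-02"}

rows = [harmonize(legacy_row, LEGACY_FIELD_MAP, "legacy"),
        harmonize(cots_row, COTS_FIELD_MAP, "cots")]
```

Tagging each row with its source system preserves lineage, so discrepancies found in the unified model can be traced back to the originating system.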
The IT department was very concerned about the security of this architecture. Considerable effort went into ensuring that communication between cloud and on-premises resources was secure and that only individuals inside the agency could access data stored in Azure Data Lake Storage or Azure SQL. The architecture went before a review board and was determined to meet all of the agency's requirements.
The lakehouse model supported the initial deliverable along with other internal reporting needs. The ingestion process was based on source-aligned ADF template pipelines (e.g., SQL Server, CSV, Excel) driven by metadata configurations stored in a database repository. A new source table or file became an entry in a set of related tables, which expedited development.
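The metadata-driven approach can be sketched as follows: adding a new source becomes a row in a configuration table rather than a new pipeline. The table entries, template names, and paths below are hypothetical; in the real platform the configurations lived in a database and were read by parameterized ADF template pipelines.

```python
# Illustrative configuration rows: one entry per source object.
SOURCE_CONFIG = [
    {"source_name": "legacy_households", "source_type": "sqlserver",
     "object": "dbo.Households", "target_path": "raw/legacy/households"},
    {"source_name": "cots_services", "source_type": "csv",
     "object": "services_export.csv", "target_path": "raw/cots/services"},
]

# One template per source type, mirroring the source-aligned ADF templates.
TEMPLATES = {
    "sqlserver": "pl_ingest_sqlserver",
    "csv": "pl_ingest_flat_file",
    "excel": "pl_ingest_excel",
}

def plan_ingestion(config):
    """Resolve each configured source to the template pipeline that loads it."""
    runs = []
    for entry in config:
        template = TEMPLATES[entry["source_type"]]
        runs.append((template, entry["object"], entry["target_path"]))
    return runs

for template, obj, path in plan_ingestion(SOURCE_CONFIG):
    print(f"{template}: {obj} -> {path}")
```

Because the templates are generic, onboarding a new table or file touches only the configuration data, not the pipeline code, which is where the time savings come from.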
Leadership at the agency was most interested in developing a dashboard report that could be presented via their website. After delivering the initial reports, we quickly turned our attention to how to do this in a timely and secure fashion.
Power BI was the tool of choice for reporting, and reports developed in Power BI can be published to a website. Our client wanted the dashboard hosted externally and did not want users to have to log in to view the reports.
In a short amount of time, report pages were created from a lightweight set of requirements, and visuals such as maps were used to enhance the user experience. Throughout, we worked closely with IT to maintain the integrity and security of the dashboard. The dashboard is now hosted on the agency's website and has become a centerpiece of the EIM platform.
Community leaders have heaped praise on the agency and challenged other agencies to pursue similar efforts.
One key lesson from this engagement was the value of creating reusable, metadata-driven pipeline templates to speed up development. The initial work to build the templates pays for itself in time saved as new source objects are added.
Another key lesson was the importance of understanding the internal IT department's security requirements and communicating with full transparency about the solution's architecture. Failure to do so up front can add days or even weeks to delivering a secure and acceptable solution.