FinTech Solutions
Provider of Third-Party Financial Reports to Brokerages
A fintech company specializing in delivering high level daily reports and robust quarterly financial reports to brokerages, faced significant challenges due to antiquated back-end processes. Their system relied on a large database of all publicly traded stocks, updated daily, but the existing data handling methods were inefficient, leading to prolonged processing times and delayed report generation.
The project's goal was to overhaul the company's historical and incremental database loading processes, significantly reducing the time required for data updates and report generation. By implementing modern data processing techniques using Python and Apache Spark, the aim was to enhance system efficiency and reliability.
Team
Data Engineers, Database Administrators, Project Manager
Python for scripting and data manipulation.
Apache Spark for handling large-scale data processing.
SQL databases for data storage.
Data complexity. Handling real-time data feeds and performing calculations on over 2000 NASDAQ stocks simultaneously presented significant technical challenges.
Current state script complexity. Client lacked documentation and knowledge of the previously built load processes, causing a need to reverse engineer the current state stored procedures.
Data quality reporting. Data quality dashboards were developed to present potential issues in the current state load and compare with Optimlaize's development.
Process redesign. Redesigned the entire data ingestion and processing pipeline using Python to streamline tasks and improve automation.
Spark implementation. Utilized Apache Spark to manage large datasets efficiently, significantly reducing load times by parallelizing data processing tasks.
Incremental load optimization. Developed a new incremental loading strategy that reduced data update times from 2 hours to 2 minutes by optimizing data transformations and load operations.
Historical data handling. Overhauled the historical data load process to reduce processing time from 15 hours to 45 minutes, improving both speed and data integrity.
Results
Reduced processing times. Achieved a drastic reduction in both incremental and historical data loading times, enhancing overall system responsiveness and report timeliness.
Improved system efficiency. Enhanced the backend architecture, leading to better scalability and the ability to handle larger datasets without performance degradation.
Client satisfaction. Improved client satisfaction through faster and more reliable report delivery, reinforcing market position as a leading financial reporting service provider.
Lessons
The project underscored the value of modern data processing technologies like Apache Spark in transforming legacy systems.
Ongoing performance monitoring and iterative improvements are crucial for maintaining system efficiency in data-intensive applications.
Regular engagement with stakeholders during the redesign process ensured that the system met all user requirements and facilitated smooth transitions.
This transformation project not only streamlined the client's data processing capabilities but also positioned the company to better meet the demands of the dynamic financial services market. The improvements in processing speed and reliability are expected to support new business opportunities and client growth.