The second part of the Mortgage Data Processing series is now online

by Ad Min

The course (PYT37066) is now available on the Academy for all users.


[Figure: An illustration of the various stages of data processing]


Course Content:

This crash course illustrates how to process loan-level US Agency mortgage data using awk, pandas and django. The second part of the course focuses on the performing book and covers the following topics:

  • Concepts of the Credit Life Cycle and how changing states are captured in Loan Data as Dynamic (Variable) Fields with a focus on performing loans (excluding delinquent loans)
  • Selecting performing loan data using awk and pandas, manipulating and exporting derived data models using pandas
  • Data Quality Concepts for Performing Loans and Concepts from Bitemporal Databases
  • Importing performing book credit data models into a django-based web platform (openNPL) that enables further interactive work with such data
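
As a rough illustration of the selection step listed above, the following minimal pandas sketch filters a loan-level performance file down to performing records. It is not taken from the course material: the column names (loan_id, reporting_period, current_upb, delinquency_status), the pipe delimiter with a header row, and the convention that a delinquency status of "0" marks a current (performing) loan are all assumptions made for this example, and the sample records are made up.

```python
import io

import pandas as pd

# Hypothetical pipe-delimited sample. Real agency files have many more
# fields and may use a different layout; these column names are assumptions.
SAMPLE = """loan_id|reporting_period|current_upb|delinquency_status
A001|2020-01|200000.00|0
A001|2020-02|199500.00|0
A002|2020-01|150000.00|1
A002|2020-02|150000.00|2
A003|2020-01|310000.00|0
"""


def select_performing(raw):
    """Return only the monthly records flagged as performing."""
    df = pd.read_csv(
        raw,
        sep="|",
        dtype={"loan_id": str, "delinquency_status": str},
    )
    # Assumption: status "0" means the loan is current in that period.
    performing = df[df["delinquency_status"] == "0"].copy()
    # Parse the reporting period so records can be ordered in time.
    performing["reporting_period"] = pd.to_datetime(
        performing["reporting_period"], format="%Y-%m"
    )
    return performing.sort_values(["loan_id", "reporting_period"])


performing = select_performing(io.StringIO(SAMPLE))

# Derived data: latest unpaid balance per performing loan.
summary = performing.groupby("loan_id")["current_upb"].last()
print(summary)
```

For a real file, the StringIO object would be replaced by a file path, and the result could be exported with performing.to_csv(...) for later import into another tool.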

Nota Bene: Completing the course requires actual historical loan performance data. These data are not provided within the course. Students must source them themselves from the Data Dynamics website and comply with the applicable terms and conditions.

Who Is This Course For:

The course is useful to:

  • Data Engineers / Data Scientists across the financial industry and beyond that need to work with mortgage data
  • Credit Risk Management professionals and students
  • Credit Portfolio Management professionals

How Does The Course Help:

Mastering the course content provides background knowledge towards the following activities:

  • Processing large loan-level historical performance data sets
  • Pre-processing, categorizing, segmenting and improving such data sets in preparation for further analysis

What Will You Get From The Course:

  • You will be able to work confidently with loan-level historical performance data
  • You will be able to contribute to the specific use cases mentioned above

Course Level and Difficulty Level:

This course is part of the Risk Modeling using Python family.

  • This is a Core Level course in Risk Modelling. A good grounding at Introductory level in various Data Engineering and Data Science topics is a prerequisite for making the most of this course.
  • This is a Technical course, which means certain technology elements (Python, CLI) are needed to master the material.

If you have not taken an Open Risk Academy course before, the "CrashCourse Academy Demo" provides a quick overview of the Academy.

The following table places the course in the Open Risk Academy skills diagram:

Course Level & Type
Introductory Level Core Level Advanced Level
Non-technical
Technical CrashProgram
PYT37066

Course Material:

The course material comprises the following:

Time Requirements and Important Dates:

  • The course is self-paced and can be undertaken at any point. It requires a commitment of about five hours in total, depending on the student's familiarity with the tools and their existing development environment.
  • It is advisable to pursue this course after completing the first part of the series.

Where To Get Help:

If you get stuck on any issue with the course or the Academy:

  • If the issue is related to the course topics / material, check the Course Forum first
  • If the issue is related to the operation of the Open Risk Academy, check the Academy FAQ first. If the issue persists, contact us at info@openrisk.eu