FIP-X Discovery Performance Case Study
Created By | Sahamati |
Contributors | 1. FIP 2. Cookiejar Technologies Private Limited (Finvu AA) 3. CAMS Finserv (CAMS AA) 4. NESL Asset Data Limited (NADL AA) 5. Perfios Account Aggregation Services Private Limited (Anumati AA) 6. Finsec AA Solutions Private Limited (Onemoney AA) 7. DigiSahamati Foundation |
Introduction:
FIP-X, one of the largest public sector banks, is live on the Account Aggregator (AA) ecosystem. It has tied up with 10 AAs, with the majority of the traffic coming from Finvu, CAMS, Onemoney, NADL and Anumati.
FIP-X had not been meeting the SLAs expected of an FIP in the AA ecosystem. Over the period of this case study, all contributing AAs and the FIP-X team shared and reconciled logs of AA transactions to identify the key problem areas. The FIP-X MISD team (which manages the AA implementation) then worked with the other teams within FIP-X and with their FIP TSP to resolve these issues.
Goal:
The goal of the process was to work with FIP-X to improve its discovery performance. Improvement was measured by the p50 discovery success percentage and by the p50 and p95 response times, all as reported on SAANS. Each metric was tracked both in aggregate and per AA. More details about the definitions of metrics on SAANS can be found here.
Specifically, the goal was to minimise 5xx-series HTTP response codes (server-side errors, i.e. errors arising on the FIP side) for the discovery API. Other error codes were not caused by issues at the bank and were therefore excluded from the process.
The secondary goal was to identify isolated paths in the AA ecosystem that exhibit bad performance.
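The per-AA view described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields `aa`, `status` and `latency_ms` are hypothetical names, the percentile uses the simple nearest-rank method, and SAANS may define its metrics differently.

```python
from collections import defaultdict
from math import ceil


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples <= it."""
    ordered = sorted(values)
    rank = max(1, ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarise(records):
    """Group discovery calls by AA and report the 5xx share and p50/p95 latency.

    Each record is a dict with hypothetical keys: 'aa', 'status', 'latency_ms'.
    """
    by_aa = defaultdict(list)
    for rec in records:
        by_aa[rec["aa"]].append(rec)

    summary = {}
    for aa, recs in by_aa.items():
        latencies = [r["latency_ms"] for r in recs]
        # Only 5xx codes count as server-side (FIP-side) failures here.
        server_errors = sum(1 for r in recs if 500 <= r["status"] <= 599)
        summary[aa] = {
            "calls": len(recs),
            "pct_5xx": round(100 * server_errors / len(recs), 1),
            "p50_ms": percentile(latencies, 50),
            "p95_ms": percentile(latencies, 95),
        }
    return summary
```

Comparing the per-AA rows of such a summary is one way to spot a path that performs badly for a single AA while the aggregate looks healthy.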
Outcome:
Comparison between p50 success percentage as reported by AAs on SAANS for FIP-X on Aug 12th 2023 vs Aug 22nd 2023 vs Aug 27th 2023
| | AA1 | AA2 | AA3 | AA4 |
|---|---|---|---|---|
| Before new update | 33.3% | 65.5% | 76.9% | 0% |
| After new update | 100% | 100% | 78% | 0% |
| As of Sept 12th 2023 | 100% | 100% | 100% | 100% |
Performance on AA3 did not improve immediately after the update because of errors at the AA's end rather than a problem at the FIP. AA3 subsequently resolved the issue, which accounts for its later improvement in performance.
Methodology:
The goal was to examine the data on a recurring basis to identify the key problem areas in FIP-X's AA implementation. This process involved:
- Identifying sub-areas of discovery performance for the FIP-X team to work on.
- Sharing of transaction logs between AAs and FIP-X's team.
- Internal identification and tracking of specific transactions with specific error codes.
- Identification of problems and reasons for failure of specific transactions.
- Resolution of the problems.
- Identification of the next set of problems and repeating the process.
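The log-sharing and tracking steps above can be illustrated with a minimal reconciliation sketch. It assumes that both sides can export records keyed by a shared transaction identifier. The field names `txn_id` and `status` are hypothetical, and this is not the tooling actually used in the case study.

```python
def reconcile(aa_log, fip_log):
    """Match AA-side discovery requests against FIP-side records by transaction ID.

    Both logs are lists of dicts; 'txn_id' and 'status' are hypothetical field names.
    """
    fip_by_txn = {rec["txn_id"]: rec for rec in fip_log}

    never_reached_fip = []  # sent by the AA but absent from the FIP log
    failed_at_fip = []      # seen by the FIP and answered with a 5xx code

    for rec in aa_log:
        fip_rec = fip_by_txn.get(rec["txn_id"])
        if fip_rec is None:
            never_reached_fip.append(rec["txn_id"])
        elif 500 <= fip_rec["status"] <= 599:
            failed_at_fip.append((rec["txn_id"], fip_rec["status"]))

    return {"never_reached_fip": never_reached_fip, "failed_at_fip": failed_at_fip}
```

Requests that never appear in the FIP log point towards the network edge (gateway or firewall), while requests that were answered with a 5xx code point towards the application or middleware.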
Key Problems Identified:
- Issues with the bank's API gateway:
  The FIP-X API gateway had bandwidth issues that prevented API calls from reaching the AA middleware offered by the FIP TSP. Such calls failed with connection timeouts.
- Rejection of requests by the bank's firewall:
  The bank's firewall policy was causing discovery requests sent by the AAs to be rejected. To resolve this issue, the FIP-X team had to request an exemption from their internal team.
- Error in implementation of the ReBIT Specs:
  With the API gateway and firewall issues resolved, the key improvement in success percentage came from identifying and correcting an error in the implementation of the ReBIT Specs. When a mobile number is linked with multiple customer IDs, it is not possible to identify the customer, and hence the FIP must return the error code 404. This change required the FIP-X team to update their AA middleware.
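The ambiguous-customer rule can be sketched as a small decision function. This is a hedged illustration, not the FIP-X middleware: `customers_by_mobile` is a hypothetical lookup table, and returning 404 when no customer matches at all is an assumption that the case study does not state.

```python
def discovery_status(mobile_number, customers_by_mobile):
    """Return (http_status, customer_id) for a discovery request.

    customers_by_mobile is a hypothetical mapping: mobile number -> list of customer IDs.
    """
    matches = customers_by_mobile.get(mobile_number, [])
    if len(matches) == 1:
        # Exactly one customer: discovery can proceed for that customer.
        return 200, matches[0]
    # More than one match: the customer cannot be identified, so answer 404
    # rather than a 5xx. Zero matches are also mapped to 404 (assumption).
    return 404, None
```

For example, `discovery_status("9000000001", {"9000000001": ["C1", "C2"]})` yields a 404 because the mobile number maps to two customer IDs.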
Remarks and Recommendations for FIPs:
- For most FIPs that are not performing well, a significant proportion of the errors are timeouts, both connection timeouts and read timeouts. It is recommended that all FIPs re-examine their configurations at the API gateway level and work with their FIP TSPs to minimise timeout issues.
- Re-examine the WAF policies and identify blockers (if any) that are rejecting requests at the FIP's firewall level.
- Finally, all FIPs must re-examine their scenarios and HTTP response codes to ensure that they comply with the ReBIT Specs.
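To make the distinction between connection timeouts and read timeouts concrete, the sketch below probes an endpoint over raw TCP using Python's standard `socket` module and classifies the outcome. It is an assumption-laden illustration: `probe` and its labels are hypothetical, and a real check would send an actual discovery request over HTTPS rather than a `ping` payload.

```python
import socket


def probe(host, port, connect_timeout, read_timeout):
    """Classify one TCP probe as 'ok', 'closed', 'connect_timeout',
    'connect_error', 'read_timeout' or 'read_error'."""
    try:
        # Connection phase: fails if the endpoint is unreachable or too slow to accept.
        sock = socket.create_connection((host, port), timeout=connect_timeout)
    except socket.timeout:
        return "connect_timeout"
    except OSError:
        return "connect_error"

    with sock:
        # Read phase: the connection exists, but the reply may never arrive.
        sock.settimeout(read_timeout)
        try:
            sock.sendall(b"ping\n")
            data = sock.recv(1024)
        except socket.timeout:
            return "read_timeout"
        except OSError:
            return "read_error"
    return "ok" if data else "closed"
```

A connection timeout suggests the request never got past the network edge, whereas a read timeout suggests the request was accepted but the backend behind the gateway was too slow to answer.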