API3:2019 Excessive Data Exposure
Whether aiming for generic implementations or pressed by a short time-to-market, developers tend to expose all object properties (e.g. as JSON), relying on clients (e.g. a web front-end or mobile application) to filter the relevant data before rendering it. Quite often such data exposes system internals or personally identifiable information (PII) that bad actors can take advantage of.
What is the issue?
By design (deliberately), the API endpoint exposes more object properties than the client requires. The exposed data is not always “sensitive”, but it may reveal internal implementation or architecture details that bad actors can take advantage of to compromise the API.
What does it look like?
To identify whether an API exposes excessive data, you need to know what data the client requires. A good rule of thumb is to consider that only rendered data is required and anything else is excessive.
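The rule of thumb above can be sketched as a small check that compares the properties present in a response with the ones the client actually renders. This is a minimal illustration, not a complete audit tool: the function names, the sample payload and the list of rendered fields are all hypothetical.

```python
from typing import Any, Iterator


def leaf_paths(value: Any, prefix: str = "") -> Iterator[str]:
    """Yield a dotted path for every leaf property of a decoded JSON value.

    Empty objects and arrays contribute no paths.
    """
    if isinstance(value, dict):
        for key, child in value.items():
            yield from leaf_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        # All items of an array share one path, marked with "[]".
        for item in value:
            yield from leaf_paths(item, f"{prefix}[]")
    else:
        yield prefix


def excessive_fields(response: Any, rendered: set[str]) -> set[str]:
    """Return the properties present in the response but never rendered."""
    return set(leaf_paths(response)) - rendered


# Hypothetical response and rendered fields, for illustration only.
sample_response = {
    "profile": {
        "email": "jane@example.com",
        "location": {"lat": 38.7223, "lng": -9.1393},
    },
    "tasks": [{"id": 1, "title": "Buy milk", "done": False}],
}
rendered_by_app = {"tasks[].title", "tasks[].done"}

for path in sorted(excessive_fields(sample_response, rendered_by_app)):
    print(path)
```

Every path it prints is a candidate for review. Some of them, such as an identifier the client needs for follow-up requests, may turn out to be legitimately required even though they are never displayed.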
In the example above, to render the user’s todo list (on the left), the mobile app issues an API request to the /tasks endpoint (right-top). The API returns a JSON object (right-bottom) which includes a “profile” property with the user’s profile details, and the task list (truncated).
Since the user’s profile details are not rendered by the mobile app, there’s no apparent reason to include them in the /tasks endpoint response. If the app does need them, the user’s profile details should be fetched with a separate API request to another endpoint, e.g. /me.
Beyond being a bad design decision, this response gives bad actors some knowledge of the API internals: for example, that the user’s location is stored with GPS-level precision, and how to compute the location of users’ profile photos. This knowledge can be used to exploit the API if other vulnerabilities are found.
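One way to keep each endpoint’s response narrow is to build it from an explicit allow-list of properties instead of serializing the whole stored object. The sketch below assumes hypothetical `User` and `Task` models and handler names; the fields returned by /me are an assumption too, since that depends on what the client renders there.

```python
from dataclasses import asdict, dataclass


@dataclass
class Task:
    id: int
    title: str
    done: bool
    owner_id: int  # internal detail, not needed by the client


@dataclass
class User:
    id: int
    name: str
    email: str
    latitude: float
    longitude: float
    tasks: list[Task]


@dataclass(frozen=True)
class TaskView:
    """Only the properties the todo list screen is assumed to render."""
    title: str
    done: bool


def get_tasks(user: User) -> dict:
    """Hypothetical /tasks handler: task data only, no profile details."""
    views = [TaskView(title=t.title, done=t.done) for t in user.tasks]
    return {"tasks": [asdict(v) for v in views]}


def get_me(user: User) -> dict:
    """Hypothetical /me handler: profile fields are listed one by one."""
    return {"name": user.name, "email": user.email}
```

Because every returned property is named explicitly, a field added to the stored model later (for example a new internal flag) does not leak into responses unless someone deliberately adds it to a view.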
Where have we seen this issue lately?
This issue is common in APIs, especially those backing rich web front-ends and mobile applications. The nature of the exposed data varies. In the research disclosed by Checkmarx at Black Hat USA 2020, among other issues, excessive data exposure was found in Meetup’s API, exposing 38504 email addresses from WordPress network users alone; you can get the details by watching the "API (in)Security TOP 10: Guided tour - DEF CON 28 AppSec Village" YouTube video. Excessive data exposure was also found in the APIs behind the Bumble dating app and in the APIs behind the volunteer app of the Philippines’ opposition coalition 1Sambayan, this time leaking highly sensitive personal information.
Conclusion
It is very important to inventory the data managed by the API and classify it properly according to its sensitivity. API endpoints should check whether the client or authenticated user is authorized to access the requested resources and/or specific properties of those resources, and should return the minimum amount of data required by the client. Code review may allow earlier detection of excessive data exposure, preventing it from being deployed to production.
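As a rough sketch of how a data inventory and property-level authorization might fit together, the example below attaches a classification and a set of permitted reader roles to each property, and withholds anything not in the inventory. The property names, classification labels and roles are hypothetical.

```python
from typing import NamedTuple


class FieldPolicy(NamedTuple):
    classification: str        # label from the data inventory
    readers: frozenset[str]    # roles allowed to read this property


# Hypothetical inventory for a user profile resource.
PROFILE_POLICY = {
    "name": FieldPolicy("public", frozenset({"owner", "other_user"})),
    "photo_url": FieldPolicy("public", frozenset({"owner", "other_user"})),
    "email": FieldPolicy("pii", frozenset({"owner"})),
    "latitude": FieldPolicy("pii", frozenset({"owner"})),
    "longitude": FieldPolicy("pii", frozenset({"owner"})),
    "storage_bucket": FieldPolicy("internal", frozenset()),  # never returned
}


def visible_properties(record: dict, role: str) -> dict:
    """Keep only the properties the given role is allowed to read.

    Properties missing from the inventory are withheld (deny by default).
    """
    return {
        key: value
        for key, value in record.items()
        if key in PROFILE_POLICY and role in PROFILE_POLICY[key].readers
    }
```

Denying unclassified properties by default means that forgetting to classify a new field results in missing data, which is easy to notice, rather than in a silent leak.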
Excessive Data Exposure is something we often find associated with API1:2019 Broken Object Level Authorization: you can read more about the latter in our previous article.