APIs have become the crown jewels of organizations’ digital transformation initiatives, empowering employees, partners, customers, and other stakeholders to access applications, data, and business functionality across their digital ecosystem. So, it’s no wonder that hackers have increased their waves of attacks against these critical enterprise assets.
Unfortunately, it looks like the problem will only worsen. Gartner has predicted that, “By 2022, API abuses will be the most-frequent attack vector resulting in data breaches for enterprise web applications.”
Many enterprises have responded by implementing API management solutions that provide mechanisms, such as authentication, authorization, and throttling. These are must-have capabilities for controlling who accesses APIs across the API ecosystem—and how often. However, in building their internal and external API strategies, organizations also need to address the growth of more sophisticated attacks on APIs by implementing dynamic, artificial intelligence (AI) driven security.
This article examines API management and security tools that organizations should incorporate to ensure security, integrity, and availability across their API ecosystems.
Rule-based and policy-based security measures
Rule-based and policy-based security checks, which can be performed in a static or dynamic manner, are mandatory parts of any API management solution. API gateways serve as the main entry point for API access and therefore typically handle policy enforcement by inspecting incoming requests against policies and rules related to security, rate limits, throttling, etc. Let’s look closer at some static and dynamic security checks to see the additional value they bring.
Static security checks
Static security checks do not depend on the request volume or any previous request data, since they usually validate message data against a predefined set of rules or policies. Gateways perform different static security scans to block SQL injection, coercive parsing attacks, entity expansion attacks, and schema poisoning, among others.
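As a rough illustration of a static check, the sketch below validates a payload against a fixed rule set without consulting any earlier requests. The rule names, the regular expressions, and the size limit are assumptions made for this example. They are far too simple for production use and do not reflect any particular gateway's rules.

```python
import re

# Hypothetical rule set: rule name -> pattern that flags a suspicious payload.
# These patterns are illustrative only, not a complete defense.
STATIC_RULES = {
    "sql_injection": re.compile(
        r"('|\")\s*(or|and)\s+\d+\s*=\s*\d+|;\s*drop\s+table|union\s+select",
        re.IGNORECASE,
    ),
    # Inline entity declarations are the vehicle for entity expansion attacks.
    "entity_expansion": re.compile(r"<!ENTITY", re.IGNORECASE),
}

MAX_PAYLOAD_BYTES = 10_000  # assumed size limit for this sketch


def static_check(payload: str) -> list[str]:
    """Return the names of the static rules that the payload violates."""
    violations = []
    if len(payload.encode("utf-8")) > MAX_PAYLOAD_BYTES:
        violations.append("payload_too_large")
    for name, pattern in STATIC_RULES.items():
        if pattern.search(payload):
            violations.append(name)
    return violations
```

Because the result depends only on the message itself, the same payload always produces the same verdict, regardless of traffic volume.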
Dynamic security checks
Dynamic security checks, in contrast to static security scans, always check against something that varies over time. Usually this involves validating request data with decisions made using existing data. Examples of dynamic checks include access token validation, anomaly detection, and throttling. These dynamic checks depend heavily on the data volume being sent to the gateway. Sometimes these dynamic checks occur outside the API gateway, and then the decisions are communicated to the gateway. Let’s look at a couple of examples.
Throttling and rate limiting are important for reducing the impact of attacks, because whenever attackers get access to APIs, the first thing they do is read as much data as possible. Throttling API requests (i.e., limiting access to the data) requires that we keep a count of incoming requests within a specific time window. If the request count exceeds the allocated amount within that window, the gateway can block further API calls. With rate limiting, we can limit the concurrent access allowed for a given service.
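The counting described above can be sketched as a fixed-window throttle. This is a minimal, single-process illustration. The class name, the per-client keying, and the injectable clock are assumptions made for the example, and a real gateway would typically keep these counters in a shared store.

```python
import time


class FixedWindowThrottle:
    """Allow at most `limit` requests per client in each time window."""

    def __init__(self, limit: int, window_seconds: float, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self._state = {}    # client_id -> (window index, request count)

    def allow(self, client_id: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        start, count = self._state.get(client_id, (window, 0))
        if start != window:
            # A new window has begun, so the count starts over.
            start, count = window, 0
        if count >= self.limit:
            self._state[client_id] = (start, count)
            return False
        self._state[client_id] = (start, count + 1)
        return True
```

A fixed window is the simplest choice but allows short bursts around window boundaries. Sliding-window or token-bucket variants smooth that out at the cost of a little more bookkeeping.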
Authentication helps API gateways uniquely identify each user who invokes an API. Available API gateway solutions generally support basic authentication, OAuth 2.0, JWT (JSON Web Token) security, and certificate-based security. Some gateways also provide an authorization layer on top of that for additional fine-grained permission validation, which is usually based on XACML (eXtensible Access Control Markup Language) style policy definition languages. This is important when an API contains multiple resources that need different levels of access control for each resource.
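To make the idea of per-resource permissions concrete, here is a small sketch of a deny-by-default policy lookup. It is not XACML. The policy table, scope names, and resource paths are hypothetical, and the granted scopes are assumed to come from a token that has already been validated.

```python
# Hypothetical policy: (HTTP method, resource path) -> scope the caller needs.
RESOURCE_POLICY = {
    ("GET", "/orders"): "orders:read",
    ("POST", "/orders"): "orders:write",
    ("DELETE", "/orders"): "orders:admin",
}


def is_authorized(method: str, resource: str, granted_scopes: set[str]) -> bool:
    """Check whether an already-authenticated caller may use this resource."""
    required = RESOURCE_POLICY.get((method.upper(), resource))
    if required is None:
        # No policy defined for this resource: deny by default.
        return False
    return required in granted_scopes
```

With this table, a caller holding only `orders:read` can list orders but cannot create or delete them, even though all three operations belong to the same API.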
Limitations of traditional API security
Policy-based approaches around authentication, authorization, rate limiting, and throttling are effective tools, but they still leave cracks through which hackers can exploit APIs. Notably, API gateways front multiple web services, and the APIs they manage are frequently loaded with a high number of sessions. Even if we analyzed all those sessions using policies and processes, it would be difficult for a gateway to inspect every request without additional computation power.
Additionally, each API has its own access pattern. So, a legitimate access pattern for one API could indicate malicious activity for a different API. For example, when someone buys items through an online shopping application, they will conduct multiple searches before making the purchase. So, a single user sending 10 to 20 requests to a search API within a short period of time can be a legitimate access pattern for a search API. However, if the same user sends that many requests to the buying API, the access pattern could indicate malicious activity, such as a hacker trying to spend as much as possible using a stolen credit card. Therefore, each API access pattern needs to be analyzed separately to determine the correct response.
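The shopping example can be sketched with a separate baseline per API. The code below uses a simple mean-plus-k-standard-deviations threshold as a stand-in for a learned model. The API names, the sample history, and the value of k are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def build_baseline(history: dict[str, list[int]], k: float = 3.0) -> dict[str, float]:
    """Derive one threshold per API from per-user request counts seen in normal traffic."""
    return {api: mean(counts) + k * pstdev(counts) for api, counts in history.items()}


def is_anomalous(api: str, count: int, baseline: dict[str, float]) -> bool:
    """Flag a per-user request count that exceeds the threshold for that API."""
    return count > baseline[api]


# Made-up per-user request counts per time window for two APIs.
history = {
    "search": [10, 15, 20, 12, 18],
    "buy": [1, 1, 2, 1, 1],
}
baseline = build_baseline(history)
```

With these numbers, 15 requests in a window sits inside the search baseline but far outside the buy baseline, so the same volume is treated differently depending on the API.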
Yet another factor is that significant numbers of attacks happen internally. Here, users with valid credentials and access to systems utilize their ability to attack those systems. Policy-based authentication and authorization capabilities are not designed to prevent these kinds of attacks.
Even if we could apply more rules and policies to an API gateway to protect against the attacks described here, the additional overhead on the API gateway would be unacceptable. Enterprises cannot afford to frustrate genuine users by asking them to bear the processing delays of their API gateways. Instead, gateways need to process valid requests without blocking or slowing user API calls.
The case for adding an AI security layer
To fill the cracks left by policy-based API protections, modern security teams need artificial intelligence-based API security that can detect and respond to dynamic attacks and the unique vulnerabilities of each API. By applying AI models to continuously inspect and report on all API activity, enterprises could automatically discover anomalous API activity and threats across API infrastructures that traditional methods miss.
Even in cases where standard security measures are able to detect anomalies and risks, it can take months to make the discoveries. By contrast, using pre-built models based on user access patterns, an AI-driven security layer would make it possible to detect some attacks in near real time.
Importantly, AI engines usually run outside of API gateways and communicate their decisions to them. Because the API gateway does not have to expend resources on this analysis, the addition of AI security typically does not impact runtime performance.
Integrating policy-based and AI-driven API security
When adding AI-powered security to an API management implementation, there will be a security enforcement point and a decision point. Typically, these units run independently because of the high computational power the decision point requires, but the latency between them should not be allowed to affect their efficiency.
The API gateway intercepts API requests and applies various policies. Linked to it is the security enforcement point, which describes the attributes of each request (API call) to the decision point, requests a security decision, and then enforces that decision in the gateway. The decision point, powered by AI, continuously learns the behavior of each API access pattern, detects anomalous behaviors, and flags different attributes of the request.
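The division of labor between the two components might look like the following sketch. All class and field names are hypothetical, and the decision point here is a fixed per-API threshold standing in for a model that would learn those limits from traffic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestAttributes:
    """Attributes the enforcement point reports for each API call."""
    user: str
    api: str
    requests_last_minute: int


class DecisionPoint:
    """Stand-in for the AI-powered component: a per-API threshold lookup."""

    def __init__(self, thresholds: dict[str, int]):
        self.thresholds = thresholds

    def decide(self, attrs: RequestAttributes) -> str:
        limit = self.thresholds.get(attrs.api)
        if limit is not None and attrs.requests_last_minute > limit:
            return "block"
        return "allow"


class EnforcementPoint:
    """Lives next to the gateway; asks for a decision and enforces it."""

    def __init__(self, decision_point: DecisionPoint):
        self.decision_point = decision_point

    def handle(self, attrs: RequestAttributes) -> int:
        """Return the HTTP status the gateway would send for this call."""
        decision = self.decision_point.decide(attrs)
        return 403 if decision == "block" else 200
```

Keeping the decision logic behind a single `decide` call means the model can be retrained or replaced without touching the gateway-side code.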
There should be an option to add policies to the decision point as needed and invoke these policies—which may vary from API to API—during the learning period. Any policies should be defined by the security team once the potential vulnerabilities of each API they plan to expose are thoroughly understood. However, even without support from external policies, adaptive, AI-powered decision point and enforcement point technology will eventually learn and prevent some of the complex attacks that we cannot detect with policies.
Another advantage of having separate security enforcement point and decision point components is the ability to integrate with existing API management solutions. A simple user interface enhancement and customized extension could integrate the security enforcement point with the API publisher and gateway. From the UI, the API publisher could choose whether to enable AI security for the published API, along with any special policies that are needed. The extended security enforcement point would publish the request attributes to the decision point and restrict access to the API according to the decision point’s response.
However, publishing events to the decision point and restricting access based on its response will take time and depend heavily on the network. Therefore, it is best implemented asynchronously with the help of a caching mechanism. This will affect the accuracy a bit, but when considering the efficiency of the gateway, adding an AI security layer will minimally contribute to the overall latency.
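One way to picture the asynchronous approach is an enforcement point that answers from a local cache and publishes events to the decision point in the background. This is a sketch under stated assumptions: the class name, the in-memory queue, and the `decide` callable are hypothetical stand-ins for a real event stream and a remote decision service.

```python
import queue
import threading


class CachedEnforcementPoint:
    """Answer from a local cache; consult the decision point off the request path."""

    def __init__(self, decide):
        self._decide = decide        # slow call to the decision point (assumed)
        self._blocked = set()        # cache of users currently blocked
        self._lock = threading.Lock()
        self._events = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def handle(self, user: str, api: str) -> bool:
        """Fast path: check the cache, publish the event, and return at once."""
        with self._lock:
            allowed = user not in self._blocked
        self._events.put((user, api))
        return allowed

    def _worker(self) -> None:
        # Background loop: fetch decisions and update the cache.
        while True:
            user, api = self._events.get()
            verdict = self._decide(user, api)
            with self._lock:
                if verdict == "block":
                    self._blocked.add(user)
                else:
                    self._blocked.discard(user)
            self._events.task_done()

    def wait_until_processed(self) -> None:
        """Block until all published events have been decided (useful in tests)."""
        self._events.join()
```

The accuracy trade-off is visible in this design: the first malicious request is allowed because the cache is still empty, and the block only takes effect once the decision arrives.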
AI-driven security layer challenges
Of course, benefits don’t come without costs. While an AI-driven security layer offers an additional level of API protection, it presents some challenges that security teams will need to address.
- Additional overhead. The additional AI security layer adds some overhead to the message flow. So, mediation solutions should be smart enough to handle information gathering and publishing outside the main mediation flow.
- False positives. A high volume of false positives will require additional review by security professionals. However, with some advanced AI algorithms, we can reduce the number of false positives triggered.
- Lack of trust. People feel uncomfortable when they don’t understand how a decision was made. Dashboards and alerts can help users to visualize the factors behind a decision. For example, if an alert clearly states that a user was blocked for accessing the system at an abnormal rate of 1,000-plus times within a minute, people can understand and trust the system’s decision.
- Data vulnerability. Most AI and machine learning solutions rely on massive volumes of data, which is often sensitive and personal. As a result, these solutions could become prone to data breaches and identity theft. Complying with the European Union GDPR (General Data Protection Regulation) helps to mitigate this risk but doesn’t eliminate it entirely.
- Labeled data limitations. The most powerful AI systems are trained through supervised learning, which requires labeled data that is organized to make it understandable by machines. But labeled data is limited in supply, and the automated creation of increasingly complex algorithms will only exacerbate the problem.
- Faulty data. An AI system’s effectiveness depends on the data it is trained on. Too often, bad data is associated with ethnic, communal, gender, or racial biases, which can affect crucial decisions about individual users.
Given the critical role of APIs in enterprises today, they increasingly are becoming targets for hackers and malicious users. Policy-based mechanisms, such as authentication, authorization, payload scanning, schema validation, throttling, and rate limiting, are baseline requirements for implementing a successful API security strategy. However, only by adding AI models to continuously inspect and report on all API activity will enterprises be protected against the most sophisticated security attacks emerging today.
Sanjeewa Malalgoda is software architect and associate director of engineering at WSO2, where he leads the development of the WSO2 API Manager. Lakshitha Gunasekara is a software engineer on the WSO2 API Manager team.
New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to email@example.com.