Google ADK - OpenAPI tools, agents-as-tools, authentication, and long-running operations
Imagine this scenario. Your company has dozens of internal REST APIs, each with an OpenAPI specification already published. You need an agent that can query the inventory service, place orders through the fulfillment API, and check shipment tracking, all without writing individual tool functions for every single endpoint. Now layer on some more complexity. Some of those APIs require OAuth2 tokens. One task, say generating a compliance report, takes ten minutes to complete. And you want a coordinator agent that can delegate research tasks to a specialized sub-agent without losing control of the conversation.
These four problems, connecting to REST APIs at scale, composing agents hierarchically, authenticating with protected services, and handling slow operations gracefully, are exactly what this article covers. Each is a distinct Google ADK capability, but together they form the toolkit you need to build production-grade agents that interact with the real world.
Let us start with OpenAPI tools.
OpenAPI tools
When your organization already has REST APIs documented with OpenAPI specifications, writing individual FunctionTool wrappers for each endpoint is tedious and error-prone. The OpenAPIToolset class solves this by reading an OpenAPI v3.x specification and automatically generating callable tools for every operation it discovers. You hand ADK a spec file, and it returns a set of tools your agent can use immediately.
The two key classes
The OpenAPI integration revolves around two classes working together.
OpenAPIToolset is the entry point. You initialize it with your OpenAPI specification, provided as a JSON string, a YAML string, or a Python dictionary, and it handles parsing, reference resolution, and tool generation. Think of it as the factory that reads the blueprint and produces the individual tools.
RestApiTool is what the factory produces. Each API operation defined in your spec (every GET, POST, PUT, DELETE path) becomes one RestApiTool instance. This tool can construct the correct HTTP request, fill in path parameters, attach query strings and headers, serialize the request body, and return the response, all based on the information in the spec.
How the toolset generates tools
When you create an OpenAPIToolset, several things happen internally:
- The spec is parsed, and all internal $ref references are resolved to produce a complete API description.
- Every valid operation within the paths section is identified.
- For each operation, a RestApiTool is created with a name derived from the operationId field (converted to snake_case, capped at 60 characters). If no operationId exists, the name is generated from the HTTP method and path.
- The tool's description is pulled from the summary or description field in the spec, which helps the LLM understand when to use it.
- Each RestApiTool dynamically creates a FunctionDeclaration that maps the operation's parameters and request body into arguments the LLM can provide.
Basic usage
Here is how you would connect an agent to a task management API using its OpenAPI spec.
This example uses a mock OpenAPI server created in Python with FastAPI; the code for that server is in A mock OpenAPI server.
The task_toolset automatically generates tools like list_tasks, create_task, update_task, and delete_task based on the operations defined in the spec. The agent can call any of them as needed.
You can also provide the spec as a YAML string or a Python dictionary. When using YAML, set spec_str_type="yaml". When passing a dictionary, use the spec_dict parameter instead of spec_str.
Let us add the execution logic to interact with this agent.
When you run this, you will see the agent calling the list_tasks_api_tasks_get tool with the argument {'status': 'in-progress'}.
Working with individual RestApiTool instances
Sometimes you need finer control over specific tools. You can retrieve individual tools by name using get_tool() or get all of them with get_tools().
This is particularly useful when you want to apply tool-specific configurations, such as custom headers or different authentication credentials, to individual endpoints.
Agents-as-tools (AgentTool)
In the MCP tools article, we saw how to extend an agent’s capabilities by connecting it to external tool servers. But what if the capability you need is not a simple function but an entire reasoning process that, in turn, requires an LLM? What if you need a “research agent” that can search, synthesize, and summarize information, and you want a parent agent to be able to call this research agent the same way it would call any other tool?
This is where AgentTool comes in. It wraps an entire agent, with its own instruction, model, and tools, and exposes it as a callable tool for a parent agent. The parent agent sends a request to the wrapped agent, which processes it using its own LLM and tools, and the result flows back to the parent as a standard tool response.
How AgentTool differs from sub-agent transfer
ADK provides two ways for agents to work together, and understanding the difference is critical for designing the right architecture.
With sub-agent transfer (using sub_agents), the parent agent hands off control of the conversation entirely. The sub-agent takes over, directly responds to the user, and handles all follow-up questions. The parent steps out of the loop. This is like a customer service representative transferring your call to a specialist. You are now speaking with the specialist, not the original representative.
With AgentTool, the parent agent remains in control. It formulates a specific question, sends it to the tool agent, receives the result, and then decides what to do next, including how to present the answer to the user. The tool agent never talks to the user directly. This is like a manager asking a team member to research something and report back. The manager still owns the conversation with the client.
Here is a summary of the key differences:
| Aspect | Sub-Agent Transfer | AgentTool |
|---|---|---|
| Who talks to the user | The sub-agent | The parent agent |
| Who handles follow-ups | The sub-agent | The parent agent |
| Does control return? | No | Yes |
| What is sent to the child | Full conversation history | A summarized question from the parent |
| Typical LLM calls | 2 (parent decides + sub answers) | 3+ (parent + child + parent again) |
| Session context | Shared | Separate (encapsulated) |
Using AgentTool
To use an agent as a tool, wrap it with the AgentTool class and add it to the parent agent’s tools list.
When the user asks the DevAssistant to review code, the LLM invokes the CodeReviewer tool. ADK runs the CodeReviewer agent in its own execution context, collects its response, and returns it to the DevAssistant as a tool result. The DevAssistant then summarizes and presents the findings to the user.
The skip_summarization option
By default, when the tool agent returns its result, the parent agent’s LLM makes an additional call to summarize or interpret that result before presenting it to the user. This is useful when the tool agent’s output is verbose or technical and needs to be adapted for the user.
However, if the tool agent already produces clean, user-ready output, this extra LLM call is wasteful. Setting skip_summarization=True tells ADK to bypass the summarization step and pass the tool agent’s output directly to the parent.
Building a hierarchical agent system
AgentTool really shines when you build multi-level hierarchies. Consider a research coordinator who delegates to specialized agents.
The user interacts with the ProjectManager. When it needs research, it calls the ResearchCoordinator as a tool, which in turn calls WebResearcher or DataAnalyst as tools. Results flow back up the hierarchy, with each level adding its own interpretation and context.
Authenticated tools
Many real-world APIs require authentication. Your agent might need to read a user’s calendar events (requiring OAuth2 consent), query an internal service protected by API keys, or access enterprise resources behind OpenID Connect. ADK provides a comprehensive authentication system that integrates with OpenAPIToolset, RestApiTool, and custom FunctionTool implementations.
Authentication building blocks
ADK’s auth system is built on two core concepts.
AuthScheme defines how an API expects credentials. Is it an API key in a header? An OAuth2 bearer token? A service account? ADK supports the same authentication schemes as OpenAPI 3.0: APIKey, HTTPBearer, OAuth2, and OpenIdConnectWithConfig.
AuthCredential stores the initial information needed to start the authentication process. This includes your application’s OAuth client ID and secret, an API key, or a service account JSON key. It also includes an auth_type field that specifies the credential type.
The supported credential types are:
| Type | When to use |
|---|---|
| API_KEY | Simple key-based auth, no exchange needed |
| HTTP | Pre-obtained bearer tokens |
| OAUTH2 | Standard OAuth2 flows requiring user consent |
| OPEN_ID_CONNECT | Enterprise OIDC providers like Okta or Auth0 |
| SERVICE_ACCOUNT | Google Cloud service accounts for server-to-server auth |
API key authentication
The simplest form of authentication is an API key. You can use the token_to_scheme_credential helper to create the required objects.
The auth_scheme and auth_credential are applied to every RestApiTool generated by the toolset. Each HTTP request made by these tools will automatically include the API key.
OAuth2 authentication
OAuth2 is more complex because it requires user interaction. The user must log in and grant your application permission to access their data. ADK handles this through an interactive flow involving your agent and the client application.
Here is how you configure an OpenAPIToolset for OAuth2.
The interactive OAuth2 flow
When an agent attempts to use an OAuth2-protected tool and no valid token is available, ADK triggers an interactive authentication flow. Here is what happens step by step.
Step 1: The agent yields an auth request. Instead of a normal tool response, the runner emits a special event containing a function call named adk_request_credential. Your client application must detect this event.
Step 2: Redirect the user to the authorization URL. Extract the auth_uri from the auth_config and append your application’s redirect_uri.
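A sketch of the URL construction, assuming auth_config has already been reconstructed from the function call's arguments:

```python
import urllib.parse

def build_auth_request_uri(auth_config, redirect_uri: str) -> str:
    """Append the app's redirect_uri to the provider's authorization URL."""
    # ADK pre-builds the base authorization URL (client_id, scopes, state).
    base_auth_uri = auth_config.exchanged_auth_credential.oauth2.auth_uri
    return (base_auth_uri
            + "&redirect_uri=" + urllib.parse.quote(redirect_uri, safe=""))
```

Your application then sends the user to this URL (for example, by rendering it as a link).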
Step 3: Capture the callback. After the user logs in and grants permission, the OAuth provider redirects them to your redirect_uri with an authorization code in the URL. Your application captures this full callback URL.
Step 4: Send the auth response back to ADK. Update the auth_config with the callback URL and send it back to the runner as a FunctionResponse.
Step 5: ADK completes the flow. ADK exchanges the authorization code for tokens, stores them, and retries the original tool call, this time with a valid access token. The agent gets the real API response and generates its final answer.
Building custom authenticated tools
When building your own FunctionTool that requires OAuth2, implement the authentication logic within the tool function using ToolContext. The pattern follows three phases: check for cached credentials, check for an auth response from a previous redirect, or initiate a new auth request.
The three-phase pattern ensures your tool works correctly regardless of where it is in the authentication lifecycle. On the first call, the flow begins. After the user authenticates, it receives the tokens. On subsequent calls within the same session, it reuses the cached token.
Security considerations for token storage
Storing tokens in session state works well for InMemorySessionService during development, since the data is transient. For persistent session backends in production, consider encrypting token data before storing it. For the most sensitive environments, use a dedicated secret manager, such as Google Cloud Secret Manager, to store refresh tokens, and store only short-lived access tokens in session state.
Long-running function tools
Standard tool functions are synchronous: the agent calls the function, the function executes, and the result comes back in the same event loop iteration. But what about operations that take minutes or hours? Generating a compliance report, processing a large dataset, or waiting for a human to approve a request: none of these can block the agent's event loop.
LongRunningFunctionTool addresses this by splitting a tool invocation into two phases: initiation and completion. The tool function starts the operation and returns an initial status. ADK pauses the agent run. The client application monitors the external process and, when it receives an update, sends the result back to the agent. The agent then resumes with the new information.
How it differs from regular tools
With a regular FunctionTool, the sequence is: agent calls → function executes → result returns → agent continues. The entire cycle happens within a single agent run.
With a LongRunningFunctionTool, the sequence is:
- First agent run: Agent calls the tool → function starts the operation and returns an initial status (like a ticket ID or “pending”) → ADK pauses the run.
- Client waits: The client application monitors the external operation outside the agent runtime.
- Second agent run: Client sends the final (or intermediate) result back → agent resumes and generates a response.
This is fundamentally about decoupling starting a task from completing it.
Creating a long-running tool
You define a regular Python function and wrap it with LongRunningFunctionTool instead of FunctionTool.
The check_policy_compliance tool runs synchronously. It checks the policy and returns immediately. The request_expense_approval tool, wrapped in LongRunningFunctionTool, creates a ticket and returns a pending status. The agent pauses at this point.
Handling the client-side flow
After the agent run pauses, the client application is responsible for monitoring the external process and sending results back. Here is the full client-side flow.
When you run this, the flow looks like:
Sending intermediate progress updates
You are not limited to a single final result. The client can send multiple intermediate updates to keep the user informed about progress.
Each intermediate update triggers a new agent run, in which the LLM generates a user-friendly status message based on the progress data.
Human-in-the-loop patterns
One of the most practical applications for LongRunningFunctionTool is human-in-the-loop workflows. The agent gathers information, prepares a request, and submits it. A human reviewer makes a decision outside the agent’s runtime. The decision flows back, and the agent takes the next action.
This pattern works for approval workflows, content review pipelines, quality assurance checkpoints, and any scenario where a human decision gate sits between agent actions. The key insight is that LongRunningFunctionTool does not perform the long-running work itself. It manages the handoff between the agent and the external process. The actual work happens elsewhere, and the tool is just the communication channel.
Putting it all together
These four capabilities, OpenAPI tools, agents-as-tools, authentication, and long-running operations, rarely exist in isolation. A real-world agent might use OpenAPIToolset to integrate with a fleet of internal APIs, AgentTool to delegate specialized work to child agents, OAuth2 authentication to access user data on protected services, and LongRunningFunctionTool to handle approval workflows that depend on human reviewers.
The key insight is that all of these are composable. You can authenticate an OpenAPIToolset with OAuth2 credentials. You can wrap an agent that uses authenticated tools inside an AgentTool. You can have a long-running tool inside a child agent that is itself wrapped as an AgentTool. ADK’s tool system is designed so these patterns stack cleanly.
In the next article in this series, we will look at how to integrate tools from third-party agent frameworks, LangChain and CrewAI adapters, to bring an even wider range of capabilities to your ADK agents.