MCPFill: Auto-Filling Forms with Model Context Protocol
Web forms are everywhere – account signups, government applications, checkout flows, survey portals, insurance quotes. For years, automating them meant writing brittle scripts that hardcoded selectors and field values. MCPFill takes a fundamentally different approach. It bridges AI agents and web forms through the Model Context Protocol, letting a large language model understand a form’s structure, decide what to fill, and execute the filling – all through a standardized tool interface. Instead of writing a script per form, you describe what you want in natural language and let the AI handle the rest.
For a broader look at the traditional scripted approach, see our guide on how to automate web form filling. This post breaks down what MCPFill is, how it works, why form filling is harder than it looks for AI, and where this approach fits alongside traditional browser automation.
What MCPFill Is
MCPFill is an MCP-based tool that exposes form-filling capabilities to AI agents. At its core, it is a bridge between an LLM and the browser’s form layer. The tool inspects a web page, extracts the structure of every form it finds – field names, types, labels, validation rules, required flags – and presents that structure to the AI agent as a callable MCP tool. The agent then decides what values to fill based on context, user instructions, or available data, and MCPFill handles the actual DOM manipulation.
The key distinction from traditional automation is where the intelligence lives. In a Playwright or Selenium script, the developer encodes all the logic: which field gets which value, how to handle dropdowns, when to click submit. With MCPFill, the LLM makes those decisions. The developer provides the intent (“fill out this job application with my resume data”), and the AI figures out how to map that intent onto the specific form it encounters.
flowchart TD
A["AI Agent<br>(Claude, GPT, etc.)"] -->|"Natural language intent"| B["MCP Layer"]
B -->|"Tool call: extract_form"| C["MCPFill Server"]
C -->|"DOM inspection"| D["Browser Instance"]
D -->|"Form structure"| C
C -->|"Structured form schema"| B
B -->|"Form fields + metadata"| A
A -->|"Decided field values"| B
B -->|"Tool call: fill_form"| C
C -->|"Set values, trigger events"| D
D -->|"Form filled"| C
C -->|"Confirmation"| B
B -->|"Result"| A
style A fill:#e8eaf6
style B fill:#e3f2fd
style C fill:#e8f5e9
style D fill:#fff3e0
The Form-Filling Problem for AI
Forms look simple to a human. You see a label, you type a value. But for an AI agent driving a browser, forms present a stack of challenges that go well beyond locating an input element.
Labels and semantics. A field labeled “Phone” might accept (555) 123-4567 or 5551234567 but reject +1-555-123-4567. The label alone does not tell you the expected format. Some forms use placeholder text instead of labels. Others use aria-label attributes that are invisible on screen. Matching the right value to the right field requires semantic understanding, not just selector matching.
Validation rules. Client-side validation can be embedded in HTML attributes (required, pattern, minlength), in JavaScript event handlers, or in third-party validation libraries. An AI agent needs to know these constraints before it fills a field, or it will trigger error states that block submission.
Dynamic behavior. Selecting “Yes” in a radio button might reveal three additional fields. Choosing a country from a dropdown might change the available states or provinces. These conditional fields do not exist in the DOM until a previous field is filled, which means the form’s structure changes as you interact with it.
Non-standard controls. Date pickers, multi-select dropdowns, rich text editors, slider inputs, and custom-built components all require different interaction patterns. A standard page.fill() call will not work on a React Select component or a Flatpickr date widget.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
# The manual approach: every field needs explicit handling
from playwright.sync_api import sync_playwright
with sync_playwright() as p:
browser = p.chromium.launch(headless=False)
page = browser.new_page()
page.goto("https://example.com/application")
# Each field requires knowing the selector AND the expected format
page.fill("#first-name", "Jane")
page.fill("#last-name", "Doe")
page.fill("#email", "jane.doe@example.com")
page.fill("#phone", "(555) 123-4567") # Format matters
# Dropdowns need special handling
page.select_option("#country", value="US")
# Wait for dynamic fields to appear after country selection
page.wait_for_selector("#state")
page.select_option("#state", value="CA")
# Date pickers are their own adventure
page.click("#date-of-birth")
page.click(".datepicker .prev-month") # Navigate to correct month
page.click(".datepicker td[data-day='15']")
# Checkboxes
page.check("#terms-agreement")
page.click("#submit-button")
browser.close()
This script is tightly coupled to one specific form. Change the layout, rename a field, swap the date picker library, and the script breaks. That brittleness is exactly what MCPFill aims to eliminate.
How MCPFill Solves It
MCPFill operates in two phases: extraction and filling.
Phase 1: Form Structure Extraction
When the AI agent asks MCPFill to analyze a page, it traverses the DOM and builds a structured representation of every form element. This representation includes:
- Field type (text, email, select, checkbox, radio, textarea, date, file)
- Associated label text (from
<label>,aria-label,placeholder, or nearby text nodes) - Validation constraints (required, pattern, min/max length, min/max value)
- Current value if pre-filled
- Available options for select elements and radio groups
- Whether the field is visible, disabled, or hidden behind a conditional
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
{
"form_id": "application-form",
"action": "/api/submit-application",
"method": "POST",
"fields": [
{
"selector": "#first-name",
"type": "text",
"label": "First Name",
"required": true,
"maxlength": 50,
"value": ""
},
{
"selector": "#country",
"type": "select",
"label": "Country",
"required": true,
"options": [
{"value": "US", "text": "United States"},
{"value": "GB", "text": "United Kingdom"},
{"value": "CA", "text": "Canada"}
],
"value": ""
},
{
"selector": "#terms-agreement",
"type": "checkbox",
"label": "I agree to the Terms of Service",
"required": true,
"checked": false
}
]
}
This schema gives the LLM everything it needs to make decisions: what each field means, what constraints apply, and what options are available.
Phase 2: Intelligent Filling
The AI agent receives the form schema and decides what values to fill based on context. This is where the LLM’s reasoning shines. Given a user instruction like “fill this form with my profile data” and a data source containing the user’s information, the LLM can:
- Map “First Name” to the user’s first name from the data source
- Choose the correct country option from a dropdown
- Format a phone number to match the expected pattern
- Check required checkboxes like terms agreements
- Skip optional fields that are not relevant
The agent sends the decided values back to MCPFill, which handles the actual DOM manipulation – filling text fields, selecting dropdown options, checking boxes, and triggering the appropriate JavaScript events so that dynamic fields and validation respond correctly.
sequenceDiagram
participant User as User
participant Agent as AI Agent
participant MCP as MCPFill Server
participant Browser as Browser
User->>Agent: "Fill the application<br>with my resume data"
Agent->>MCP: extract_form(url)
MCP->>Browser: Navigate + inspect DOM
Browser->>MCP: Form structure (12 fields)
MCP->>Agent: Structured form schema
Note over Agent: Maps resume data<br>to form fields
Agent->>MCP: fill_form(field_values)
MCP->>Browser: Fill fields + trigger events
Browser->>MCP: Dynamic fields revealed
MCP->>Agent: Updated schema (15 fields now)
Note over Agent: Fills remaining<br>dynamic fields
Agent->>MCP: fill_form(additional_values)
MCP->>Browser: Fill remaining + submit
Browser->>MCP: Submission result
MCP->>Agent: Success confirmation
Agent->>User: "Application submitted"
Notice the iterative loop. MCPFill re-extracts the form after filling fields that trigger dynamic behavior, so the agent can handle conditional sections without the developer predicting them in advance.

Setup and Configuration
MCPFill follows the standard MCP server pattern. You install it, configure your AI agent to connect to it, and then interact through natural language.
1
2
3
4
5
# Install the MCPFill server
npm install -g mcpfill
# Or run it directly with npx
npx mcpfill --port 3100
For Claude Desktop, add MCPFill to your MCP server configuration:
1
2
3
4
5
6
7
8
9
10
11
12
{
"mcpServers": {
"mcpfill": {
"command": "npx",
"args": ["mcpfill"],
"env": {
"BROWSER": "chromium",
"HEADLESS": "false"
}
}
}
}
The server exposes several tools to the connected AI agent:
| Tool | Description |
|---|---|
extract_form | Navigates to a URL and returns the structured form schema |
fill_form | Fills specified fields with provided values |
submit_form | Clicks the submit button and returns the result |
get_form_state | Returns the current state of all fields without modifying them |
validate_form | Triggers client-side validation and reports any errors |
Each tool accepts and returns structured JSON, which is exactly what LLMs are good at reasoning about.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
# Example: Using MCPFill programmatically through an MCP client
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
server_params = StdioServerParameters(
command="npx",
args=["mcpfill"]
)
async def fill_application():
async with stdio_client(server_params) as (read, write):
async with ClientSession(read, write) as session:
await session.initialize()
# Step 1: Extract the form structure
form_schema = await session.call_tool(
"extract_form",
arguments={"url": "https://example.com/apply"}
)
# Step 2: Let your application logic decide values
# (or pass form_schema to an LLM for decision-making)
field_values = map_data_to_form(form_schema, user_data)
# Step 3: Fill the form
result = await session.call_tool(
"fill_form",
arguments={"values": field_values}
)
# Step 4: Submit
submission = await session.call_tool("submit_form", arguments={})
print(f"Submission result: {submission}")
Use Cases
MCPFill is most valuable when the forms are varied, numerous, or unpredictable. Here are the scenarios where it outperforms traditional script-based automation.
Automated Data Entry
Organizations that process the same data through multiple web portals – entering employee information into HR systems, submitting financial data to regulatory portals, or syncing records across platforms that lack APIs – benefit from MCPFill’s ability to handle different form layouts without per-form scripts.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
# Traditional approach: one script per form
def fill_hr_portal_a(page, employee):
page.fill("#emp-first", employee["first_name"])
page.fill("#emp-last", employee["last_name"])
page.select_option("#dept-select", value=employee["department"])
# ... 20 more fields, each hardcoded
def fill_hr_portal_b(page, employee):
page.fill("input[name='fname']", employee["first_name"])
page.fill("input[name='lname']", employee["last_name"])
page.click(f"label:has-text('{employee['department']}')")
# ... completely different selectors for the same data
# MCPFill approach: one instruction for any form
# "Fill this HR form with the employee record for Jane Doe"
Form Testing
QA teams can use MCPFill to test forms with varied inputs without writing test scripts for each combination. The AI agent can generate edge cases, test boundary values, and identify fields that accept invalid input – all driven by natural language instructions.
Government and Compliance Forms
Government portals are notorious for complex, multi-section forms with conditional logic, strict validation, and inconsistent UI patterns across agencies. MCPFill’s extraction phase handles the complexity of understanding each form, while the AI’s reasoning handles mapping data to unfamiliar layouts.
Lead Generation and Research
Sales teams that need to fill out contact forms, demo request forms, or registration forms across hundreds of target websites can use MCPFill to handle the variety without maintaining a script for each site.

Limitations
MCPFill is not a silver bullet. Several categories of forms remain challenging.
CAPTCHAs. Any form protected by reCAPTCHA, hCaptcha, or similar challenge systems will block automated submission. MCPFill does not attempt to solve CAPTCHAs. If you encounter them, you need a separate strategy – CAPTCHA solving services, browser profiles with established reputation, or manual intervention for the challenge step.
Complex multi-step wizards. Forms that span multiple pages with server-side state management (where each step requires a server round-trip and session validation) add complexity that a single extraction pass cannot handle. MCPFill’s iterative approach helps, but deeply nested multi-step flows with branching paths can still trip up the system.
File uploads. While MCPFill can identify file upload fields, the actual file selection involves OS-level file dialogs that sit outside the browser DOM. This requires separate handling through Playwright’s set_input_files() or equivalent methods.
Custom web components. Heavily customized components built with Web Components, Shadow DOM, or framework-specific rendering (like Material UI’s Autocomplete or Ant Design’s DatePicker) may not expose their structure through standard DOM inspection. MCPFill can fall back to accessibility tree inspection, but results vary.
Token cost. Every extraction and filling cycle consumes LLM tokens. For a form with 30 fields, the schema alone might be several hundred tokens, and the agent’s reasoning about what to fill adds more. At scale, this cost adds up compared to a hardcoded script that uses zero LLM tokens per execution.
flowchart TD
A["Form Encountered"] --> B{"CAPTCHA<br>Present?"}
B -->|Yes| C["Manual / Solving Service"]
B -->|No| D{"Multi-Step<br>Wizard?"}
D -->|Yes| E["Iterative MCPFill<br>(step by step)"]
D -->|No| F{"File Upload<br>Required?"}
F -->|Yes| G["MCPFill + Manual<br>File Handling"]
F -->|No| H["Full MCPFill<br>Automation"]
style C fill:#ffcdd2
style E fill:#fff9c4
style G fill:#fff9c4
style H fill:#c8e6c9
MCPFill vs. Manual Playwright Form Filling
The choice between MCPFill and traditional Playwright scripting depends on your use case.
| Factor | MCPFill | Manual Playwright |
|---|---|---|
| Setup time per form | Minutes (natural language) | Hours (inspect, code, test) |
| Reliability | Depends on LLM accuracy | Deterministic once working |
| Maintenance | Self-adapting to UI changes | Breaks when selectors change |
| Speed | Slower (LLM round-trips) | Fast (direct execution) |
| Cost per execution | Token costs per run | Zero marginal cost |
| Handling varied forms | Excellent | One script per form |
| Debugging | Harder (LLM reasoning is opaque) | Standard debugging tools |
For high-volume, single-form automation (like submitting the same form thousands of times with different data), manual Playwright wins on speed and cost. The script runs in milliseconds, never makes a wrong decision, and costs nothing per execution.
For low-volume, high-variety automation (like filling out a different form every time), MCPFill wins on development time and adaptability. You do not need to write and maintain a separate script for each form.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
# Manual Playwright: fast, predictable, but rigid
async def fill_checkout(page, order):
await page.fill("#card-number", order["card"])
await page.fill("#card-expiry", order["expiry"])
await page.fill("#card-cvc", order["cvc"])
await page.fill("#billing-zip", order["zip"])
await page.click("#place-order")
# This works perfectly for this one checkout form
# and breaks the moment the form changes
# MCPFill: flexible, intelligent, but slower
# Agent instruction: "Complete the checkout with the payment
# method on file for order #12345"
# MCPFill extracts the form, the LLM figures out the mapping,
# and the form gets filled regardless of layout changes
The hybrid approach is often the best strategy. Use MCPFill for initial exploration and prototyping – let it figure out the form structure and the right field mappings. Then, if you need to automate that same form at scale, convert the MCPFill output into a hardcoded Playwright script for production use.
The Broader MCP Ecosystem for Browser Automation
MCPFill does not exist in isolation. It is part of a growing ecosystem of MCP servers that expose browser capabilities to AI agents.
Playwright MCP Server (@playwright/mcp) provides general browser automation – navigation, clicking, typing, screenshots, and accessibility snapshots. MCPFill complements it by adding form-specific intelligence on top.
Puppeteer MCP Server offers similar browser control through Puppeteer instead of Playwright. If your infrastructure is built on Puppeteer, you can pair it with MCPFill for form-specific tasks.
Browser-Use and Stagehand are higher-level agent frameworks that already include form-filling capabilities as part of their broader automation toolkit. They take a different approach by embedding the browser interaction logic inside the agent framework rather than exposing it through MCP.
graph TD
A["AI Agent"] --> B["MCP Protocol"]
B --> C["Playwright MCP<br>General browser control"]
B --> D["MCPFill<br>Form-specific intelligence"]
B --> E["Puppeteer MCP<br>Alternative browser engine"]
B --> F["Other MCP Servers<br>File, database, API"]
C --> G["Browser Instance"]
D --> G
E --> G
style A fill:#e8eaf6
style B fill:#e3f2fd
style C fill:#e8f5e9
style D fill:#c8e6c9
style E fill:#e8f5e9
style F fill:#fff3e0
style G fill:#fff3e0
The MCP approach has a structural advantage: composability. An AI agent can use the Playwright MCP server for navigation, MCPFill for form filling, and a database MCP server for fetching the data to fill – all in the same conversation, all through the same protocol.
What Comes Next
The trajectory of tools like MCPFill points toward a future where AI agents – including those built on Playwright for browser automation – handle any form based on natural language instructions alone. Several developments are pushing in that direction.
Better form understanding. As LLMs improve at reading and reasoning about DOM structures, the extraction phase becomes more accurate. Models that can process accessibility trees, visual layouts, and HTML simultaneously will handle edge cases that trip up current systems.
Pre-trained form knowledge. Future versions of MCPFill could ship with knowledge of common form patterns – the standard Shopify checkout, the typical WordPress contact form, the common SaaS signup flow. This would reduce the extraction overhead for frequently encountered forms.
Autonomous multi-step flows. Handling a five-page insurance application that branches based on previous answers requires the agent to maintain state, anticipate future steps, and backtrack when it makes a wrong choice. Current MCPFill implementations handle this iteratively, but future versions could plan the entire flow before starting.
Validation-aware filling. Instead of filling a field and then discovering it failed validation, future tools could predict validation failures by analyzing the constraints in the schema and adjusting values before they are entered.
The combination of MCP as a protocol standard and LLMs as the reasoning engine creates a foundation where form filling is no longer about scripting specific interactions. It is about describing intent and letting the AI handle the mechanics. MCPFill is an early implementation of that vision, and as the MCP ecosystem matures, tools like it will become the default approach for anyone who needs to automate forms at any meaningful variety or scale.

