Nano Banana Pro
Agent skill for nano-banana-pro
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
MacMCP is a Model Context Protocol (MCP) server that exposes macOS accessibility APIs to Large Language Models (LLMs) over the stdio protocol. It allows LLMs like Claude to interact with macOS applications using the same accessibility APIs available to users, enabling them to perform user-level tasks such as browsing the web, creating presentations, working with spreadsheets, or using messaging applications.
This is the real application. We never use mocks, neither in the code nor in the tests. Our tests have access to a live macOS UI and can interact with it directly.
# Navigate to the MacMCP directory
cd MacMCP
# Build the project (debug build)
swift build
# Build the project (release build)
swift build -c release
# Direct execution (for debugging)
./.build/debug/MacMCP
# Run with arguments
./.build/debug/MacMCP --debug
# Use the wrapper script for Claude desktop integration
../mcp-macos-wrapper.sh
# Format all Swift code and run linting
./scripts/format.sh
# Check formatting and linting without making changes (for CI)
./scripts/lint.sh
# Run all tests (serialized execution)
cd MacMCP
swift test --no-parallel
# Run a specific test
swift test --filter MacMCPTests.BasicArithmeticE2ETests/testAddition --no-parallel
# Run tests with verbose output
swift test --verbose --no-parallel
# Run tests with a specific number of workers (when parallel execution is needed for non-UI tests)
# Note: --num-workers only takes effect together with --parallel
swift test --parallel --num-workers 1
# Run tests with code coverage
swift test --no-parallel --enable-code-coverage
The MCP server automatically checks all required permissions at startup and displays a native macOS dialog to guide users through granting any missing permissions. No manual checking is required.
# Navigate to the MacMCP directory
cd MacMCP
# Build the tool
swift build
# Run the tool to inspect an application by bundle ID
./.build/debug/ax-inspector --app-id com.apple.calculator
# Run the tool to inspect an application by process ID
./.build/debug/ax-inspector --pid 12345
# Filter output to show only specific UI component types
./.build/debug/ax-inspector --app-id com.apple.calculator --show-menus
./.build/debug/ax-inspector --app-id com.apple.calculator --show-window-controls
./.build/debug/ax-inspector --app-id com.apple.calculator --show-window-contents
# Hide invisible or disabled elements
./.build/debug/ax-inspector --app-id com.apple.calculator --hide-invisible --hide-disabled
# Save the output to a file
./.build/debug/ax-inspector --app-id com.apple.calculator --save output.txt
# Apply custom property filters
./.build/debug/ax-inspector --app-id com.apple.calculator --filter "role=button"
# Navigate to the MacMCP directory
cd MacMCP
# Build the tool
swift build
# Run the tool to inspect an application by bundle ID (shows what the MCP server "sees")
./.build/debug/mcp-ax-inspector --app-id com.apple.calculator --mcp-path ./.build/debug/MacMCP
# Filter output to show only buttons
./.build/debug/mcp-ax-inspector --app-id com.apple.calculator --mcp-path ./.build/debug/MacMCP --filter "role=AXButton"
# Find elements by description (useful for finding numeric buttons in Calculator)
./.build/debug/mcp-ax-inspector --app-id com.apple.calculator --mcp-path ./.build/debug/MacMCP --filter "description=1"
# Other filtering options work the same as the native inspector
./.build/debug/mcp-ax-inspector --app-id com.apple.calculator --mcp-path ./.build/debug/MacMCP --show-window-contents
./.build/debug/mcp-ax-inspector --app-id com.apple.calculator --mcp-path ./.build/debug/MacMCP --hide-invisible
# Save output to a file
./.build/debug/mcp-ax-inspector --app-id com.apple.calculator --mcp-path ./.build/debug/MacMCP --save output.txt
# Limit the tree depth for large applications
./.build/debug/mcp-ax-inspector --app-id com.apple.calculator --mcp-path ./.build/debug/MacMCP --max-depth 5
# NEW: Display detailed menu structure - shows all application menus
./.build/debug/mcp-ax-inspector --app-id com.apple.calculator --mcp-path ./.build/debug/MacMCP --menu-detail
# NEW: Get items for a specific menu
./.build/debug/mcp-ax-inspector --app-id com.apple.TextEdit --mcp-path ./.build/debug/MacMCP --menu-path "File"
# NEW: Get detailed window information
./.build/debug/mcp-ax-inspector --app-id com.apple.calculator --mcp-path ./.build/debug/MacMCP --window-detail
# NEW: Combine multiple options for comprehensive inspection
./.build/debug/mcp-ax-inspector --app-id com.apple.TextEdit --mcp-path ./.build/debug/MacMCP --menu-detail --window-detail --max-depth 3
While both accessibility inspectors provide valuable insights, the MCP-based inspector is now the preferred tool:
Native Inspector (ax-inspector)
MCP-Based Inspector (mcp-ax-inspector) - RECOMMENDED
When to use which:
MCPServer (MCPServer.swift): The central server class that initializes the MCP server, registers tools, and handles accessibility permissions.
Accessibility Services:
AccessibilityService: Core service for interacting with macOS accessibility APIs
ScreenshotService: Takes screenshots of the screen or specific UI elements
UIInteractionService: Performs UI interactions like clicking and typing
ApplicationService: Launches and manages macOS applications

MCP Tools:
InterfaceExplorerTool: Enhanced UI exploration with state and capabilities information (replaces UIStateTool)
ScreenshotTool: Takes screenshots of the screen or UI elements
UIInteractionTool: Interacts with UI elements (clicking, typing, scrolling)
ApplicationManagementTool: Manages application lifecycle (launch, terminate, query status)
WindowManagementTool: Manages application windows (move, resize, minimize, maximize)
MenuNavigationTool: Navigates application menus and activates menu items
KeyboardInteractionTool: Executes keyboard shortcuts and types text
ClipboardManagementTool: Manages clipboard content including text and images

Models:
UIElement: Represents a UI element with accessibility properties
ElementDescriptor: Describes a UI element for serialization

main.swift: The primary entry point that starts the server with stdio transport
mcp-macos-wrapper.sh: A wrapper script that runs the server with logging for debugging

The project includes end-to-end tests that use the macOS Calculator app to validate that the accessibility interactions work correctly. The test architecture includes:
CalculatorApp.swift: A wrapper for the Calculator app used in tests
CalculatorElementMap.swift: Maps of Calculator UI elements for testing
BasicArithmeticE2ETests.swift: Tests basic arithmetic operations
KeyboardInputE2ETests.swift: Tests keyboard input
ScreenshotE2ETests.swift: Tests screenshot functionality
UIStateInspectionE2ETests.swift: Tests UI state inspection

Since our tests interact with the UI, they must be run serially to avoid conflicts:
Use the @Suite(.serialized) annotation to ensure tests within a class run serially
Use the --no-parallel flag when running swift test to ensure all tests run serially

See docs/Testing.md for full details on our testing approach and practices.
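One way to make the serial-execution rule hard to forget is a small shell wrapper around `swift test`. The sketch below is an illustration only: `run_ui_tests` and the `UI_TESTS_DRY_RUN` variable are hypothetical names that do not exist in the repository. The function always passes `--no-parallel`, refuses an explicit `--parallel`, and forwards any other arguments unchanged.

```shell
# Hypothetical helper (not part of the repository): run the test suite serially.
# Set UI_TESTS_DRY_RUN=1 to print the command instead of executing it.
run_ui_tests() {
  for arg in "$@"; do
    if [ "$arg" = "--parallel" ]; then
      echo "run_ui_tests: UI tests must run serially; refusing --parallel" >&2
      return 1
    fi
  done
  # ${VAR:+echo} expands to "echo" only when VAR is set and non-empty,
  # so the same line either prints or executes the command.
  ${UI_TESTS_DRY_RUN:+echo} swift test --no-parallel "$@"
}
```

From the MacMCP directory, `run_ui_tests --filter MacMCPTests.BasicArithmeticE2ETests/testAddition` would then run a single test serially.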
MacMCP provides two different accessibility inspector tools:
Native Inspector (ax-inspector): Directly uses macOS accessibility APIs to provide comprehensive information about UI elements.
MCP-Based Inspector (mcp-ax-inspector): Uses the MCP server to inspect applications, showing exactly what the MCP tools can see and work with.
These inspector tools are invaluable for:
The MCP-based inspector (mcp-ax-inspector) is particularly useful because it shows exactly the same view of the UI that the MCP server provides to the tools.
The inspector provides detailed information about each UI element, organized into sections:
Example output for a button:
[24] AXButton: Untitled
  Identifier: Three
  Frame: (x:360, y:561, w:40, h:40)
  State: Enabled, Visible, Clickable, Unfocused, Unselected
  Role: AXButton (button)
  Description: 3
  Relationships: Children: 0, Has Parent, Has Window
  Actions: AXPress
  Additional Attributes:
    AXActivationPoint: (380, 581)
    AXAutoInteractable: 0
    AXPath: [Element reference]
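In the example output above, the AXActivationPoint (380, 581) coincides with the center of the Frame: 360 + 40/2 = 380 and 561 + 40/2 = 581. The sketch below reproduces that arithmetic for a Frame value copied from inspector output. `frame_center` is a hypothetical helper, and the input format is assumed from this single example. Other elements may report an activation point that differs from the geometric center, so treat the result as an approximation.

```shell
# Hypothetical helper: compute the center of an inspector "Frame:" value.
# Expected input format (assumed from the example output): (x:360, y:561, w:40, h:40)
frame_center() {
  # Keep only digits, commas, dots and minus signs, leaving "360,561,40,40",
  # then let awk add half the width and height to the origin.
  # %d truncates any fractional part of the result.
  printf '%s\n' "$1" \
    | sed 's/[^0-9,.-]//g' \
    | awk -F, '{ printf "(%d, %d)\n", $1 + $3 / 2, $2 + $4 / 2 }'
}

frame_center "(x:360, y:561, w:40, h:40)"   # prints (380, 581)
```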
To find the correct element selectors for interacting with UI elements:
Run the MCP-based inspector on your target application:
# Always use the MCP-based inspector for MCP tools and tests:
./.build/debug/mcp-ax-inspector --app-id com.apple.calculator --mcp-path ./.build/debug/MacMCP
Look for elements with the identifier, role, or description you need:
# Find all buttons
./.build/debug/mcp-ax-inspector --app-id com.apple.calculator --mcp-path ./.build/debug/MacMCP --filter "role=AXButton"
# Find elements with specific descriptions
./.build/debug/mcp-ax-inspector --app-id com.apple.calculator --mcp-path ./.build/debug/MacMCP --filter "description=1"
# Find elements with specific identifiers
./.build/debug/mcp-ax-inspector --app-id com.apple.calculator --mcp-path ./.build/debug/MacMCP --filter "id=Equals"
For menu-related testing, use the new menu inspection features:
# Get full menu structure of the application
./.build/debug/mcp-ax-inspector --app-id com.apple.TextEdit --mcp-path ./.build/debug/MacMCP --menu-detail
# Get items for a specific menu
./.build/debug/mcp-ax-inspector --app-id com.apple.TextEdit --mcp-path ./.build/debug/MacMCP --menu-path "File"
For window-related testing, use the window inspection features:
# Get all windows and their properties
./.build/debug/mcp-ax-inspector --app-id com.apple.TextEdit --mcp-path ./.build/debug/MacMCP --window-detail
Use the identified properties in your MCP tools:
Important: The MCP-based inspector with its menu and window inspection features shows you the exact element identifiers that the MCP tools use, making it essential when developing tests for MCP-based workflows.
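Every command in the workflow above repeats the same `--app-id ... --mcp-path ./.build/debug/MacMCP` prefix. A small shell function can cut down the typing. This is a sketch under assumptions: `mcp_inspect`, `MCP_INSPECTOR`, `MCP_SERVER`, and `MCP_INSPECT_DRY_RUN` are hypothetical names introduced here, and the default paths are the debug build locations used throughout this document. Only inspector flags already shown above are forwarded.

```shell
# Hypothetical wrapper (not part of the repository) around mcp-ax-inspector.
# Paths default to the debug build locations used in this document.
MCP_INSPECTOR="${MCP_INSPECTOR:-./.build/debug/mcp-ax-inspector}"
MCP_SERVER="${MCP_SERVER:-./.build/debug/MacMCP}"

# usage: mcp_inspect <bundle-id> [inspector options...]
mcp_inspect() {
  if [ "$#" -lt 1 ]; then
    echo "usage: mcp_inspect <bundle-id> [inspector options...]" >&2
    return 2
  fi
  app_id="$1"
  shift
  # With MCP_INSPECT_DRY_RUN set, print the command instead of running it.
  ${MCP_INSPECT_DRY_RUN:+echo} "$MCP_INSPECTOR" --app-id "$app_id" --mcp-path "$MCP_SERVER" "$@"
}
```

With this in place, `mcp_inspect com.apple.calculator --filter "role=AXButton"` and `mcp_inspect com.apple.TextEdit --menu-path "File"` stand in for the longer invocations, when run from the MacMCP directory after `swift build`.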
The MCP-based inspector provides a rich set of options for exploring UI hierarchies:
Use --show-window-contents to focus on the main UI elements and exclude menus and controls
Use --hide-invisible to reduce clutter in output
Use --filter "role=X" to find elements with specific roles
Use --filter "description=X" to find elements with specific descriptions
Use --save output.txt to save the output to a file
Use --max-depth (e.g., 3-5) for large applications to prevent overwhelming output
Use --menu-detail to see all application menus (essential for MenuNavigationTool)
Use --menu-path "MenuName" to explore specific menu contents
Use --window-detail to see all windows and their properties (essential for WindowManagementTool)
Combine multiple options for comprehensive inspection (e.g., --menu-detail --window-detail --max-depth 3)
Include the --mcp-path parameter pointing to your MacMCP executable
Use --menu-detail to verify menu structure
Use --window-detail to verify window IDs

NEVER Implement Mocks: NEVER implement mocks for system behavior inside the MCP server. It is better to have a hard failure than to serve up mocked data. The server must always interact with the real macOS accessibility APIs and never simulate or fake functionality.
Accessibility Permissions: The app requires accessibility permissions to function correctly. The MCP server automatically detects missing permissions at startup and displays a native macOS dialog to guide users through granting them. The permissions are granted to the host process (e.g., Terminal, Claude.app, or other LLM host applications).
Error Handling: Use the provided error handling utilities in ErrorHandling.swift for consistent error reporting.
Logging: Use the Swift Logging framework for consistent logging throughout the codebase.
Action Logging: The ActionLogger.swift utility provides structured logging of accessibility actions.
Testing Approach: When writing tests, prefer end-to-end tests with real applications over mock-based tests to ensure the accessibility features work correctly.
Synchronization: When working with UI elements, ensure proper synchronization and waiting for UI updates, especially in tests.
Proper Tool Implementation: When implementing new tools, follow the pattern in existing tools like UIStateTool.swift and register them in MCPServer.swift.