
Humanloop

Humanloop helps developers build and refine AI applications, offering user feedback loops, model training, and data annotation to iterate on language model performance.

Verdict

Humanloop manages prompt engineering and LLM evaluation workflows. When you @mention Humanloop in a Space, your team can create projects for new AI features, list experiments to compare prompt variants, review user sessions to spot issues, and delete old test projects. This is most useful for product teams iterating on AI features — you can pull experiment results into a standup discussion or audit sessions without leaving the conversation. You'll need a Humanloop API key with project and experiment read/write access. The MCP doesn't expose evaluation creation or dataset management, so you'll still handle those tasks in the Humanloop dashboard.

Common use cases

Review user sessions during incident triage
Compare experiment results at sprint review
Create test projects for new AI features
Audit prompt performance across deployments
Clean up old experiments after launch

Integration

Vendor: Humanloop
Category: other
Auth: API_KEY
Tools: 4
Composio slug: humanloop

Tools

Create Project
This tool creates a new project in Humanloop. It is an independent action that generates a project by accepting a project's name (required), an optional description, and an optional organization ID. Upon execution, it returns details of the
Delete Project
destructive
This tool allows you to delete a specific project from your Humanloop organization. The deletion is permanent and cannot be undone. All associated data, including sessions, datapoints, and evaluations linked to the project, will be permanen
List Experiments
This tool retrieves an array of experiments associated with a specific project in Humanloop. It requires a project ID (starting with 'pr_') and returns details including experiment ID, name, description, creation timestamp, status, configur
List Sessions
This tool retrieves a paginated list of sessions for a specific project in Humanloop. It requires a project ID (and optionally, page and size for pagination) and returns session details such as ID, reference ID, project information, datapoi
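
The page and size parameters described for List Sessions suggest a simple paging loop. The following is a minimal Python sketch of that loop, not Humanloop's or Switchy's actual client code. The `fetch_page` callable, the zero-based page numbering, and the dictionary shape of a session are all assumptions made for illustration; only the 'pr_' project ID prefix and the page/size parameters come from the tool descriptions above.

```python
from typing import Callable, Iterator, List

# Hypothetical fetcher signature: (project_id, page, size) -> list of session dicts.
# In practice this would wrap whatever call returns one page of sessions.
FetchPage = Callable[[str, int, int], List[dict]]


def iter_sessions(
    fetch_page: FetchPage,
    project_id: str,
    size: int = 20,
    max_pages: int = 50,
) -> Iterator[dict]:
    """Yield sessions page by page until a short or empty page is returned."""
    # Project IDs are described as starting with 'pr_'; reject anything else early.
    if not project_id.startswith("pr_"):
        raise ValueError(f"expected a project ID starting with 'pr_', got {project_id!r}")

    # Assumption: pages are numbered from 0. Adjust if the real API starts at 1.
    for page in range(max_pages):
        batch = fetch_page(project_id, page, size)
        if not batch:
            return
        yield from batch
        # A page shorter than `size` means there is nothing left to fetch.
        if len(batch) < size:
            return


if __name__ == "__main__":
    # Stand-in data source so the sketch runs without any network access.
    fake_sessions = [{"id": f"s{i}"} for i in range(45)]

    def fake_fetch(project_id: str, page: int, size: int) -> List[dict]:
        return fake_sessions[page * size:(page + 1) * size]

    sessions = list(iter_sessions(fake_fetch, "pr_abc123", size=20))
    print(len(sessions))
```

The `max_pages` cap is a design choice to keep a misbehaving data source from looping forever; it is not a limit documented by the tool.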

Setup

Setup guide

1. Open your Switchy workspace and navigate to Settings > Integrations > MCP Servers.
2. Click 'Add MCP Server' and select Humanloop from the list.
3. Log into your Humanloop account, go to Settings > API Keys, and generate a new key with project and experiment permissions.
4. Paste the API key into Switchy's connection form and click 'Connect'.
5. Switchy will verify the key and confirm the connection.
6. Open any Space and type '@Humanloop list experiments for project pr_abc123' (substituting one of your own project IDs) to test; you should see that project's experiments appear in the response.
7. To invoke a tool, @mention Humanloop followed by the action you want, like '@Humanloop create project name="Q1 Feature Test" description="Prompt variants for onboarding flow"'.
8. If the connection fails, double-check that the API key has the correct scopes and hasn't expired.

What teammates see: by default, memories from Humanloop are scoped to the Space (PROJECT visibility) - you can mark any memory PRIVATE or share it ORG-wide.

Works well with

Top models

Compatibility data appears once enough Spaces have used this MCP together with a given model.

How Switchy teams use it

Not enough Spaces yet to publish anonymised usage stats (we require ≥ 50 Spaces per week).

Starter prompts

List Recent Sessions

@Humanloop list sessions for project pr_abc123, show me the most recent 20 sessions with their status and creation dates


Create Experiment Project

@Humanloop create a new project called "Onboarding Assistant V2" with description "Testing conversational prompts for user signup flow"


Compare Experiment Results

@Humanloop list all experiments for project pr_abc123 and show me their names, descriptions, and current status


Delete Test Project

@Humanloop delete the project pr_test456 — we shipped that feature last week and don't need the test data anymore


Audit Session Activity

@Humanloop show me the first 50 sessions for project pr_abc123, I need to check if we're seeing errors in the last 24 hours


Example outputs

Illustrative - representative of the model's voice and quality, not literal recordings.

Prompt

@Humanloop list all active experiments in project pr_abc123 and summarize their current status

Output

Found 3 active experiments in project pr_abc123:

1. **Prompt Variant A/B Test** (exp_xyz789): Testing two system prompts for customer support responses. Running since Jan 15, 47 sessions logged.
2. **Temperature Optimization** (exp_def456): Comparing temperature settings 0.7 vs 1.0 for creative writing tasks. 23 sessions, early results favor 0.7.
3. **Model Comparison: GPT-4 vs Claude** (exp_ghi012): Head-to-head evaluation on summarization quality. 61 sessions, no clear winner yet.

All experiments are collecting data; none have reached statistical significance for a conclusive result.

Notes

This example shows Humanloop's experiment-tracking strength — useful when your team runs multiple prompt or model tests in parallel. Requires a valid project ID (format: pr_*). The MCP returns raw experiment metadata; the AI synthesizes it into a readable status report. Note that experiment results depend on your evaluation criteria being defined in Humanloop's UI first.

Prompt

@Humanloop create a new project called 'Q1 Marketing Copy Generator' with description 'A/B testing GPT-4 prompts for ad headlines'

Output

Project created successfully: **Q1 Marketing Copy Generator** (pr_new987)

Description: A/B testing GPT-4 prompts for ad headlines
Organization: org_default456
Created: 2025-01-28 14:32 UTC

You can now log sessions to this project or set up experiments in the Humanloop dashboard. The project ID is `pr_new987`; use this when calling other Humanloop tools or logging data via API.

Notes

Demonstrates project creation, the starting point for organizing prompt experiments in Humanloop. The MCP returns a project ID immediately, but you'll still need to configure prompts, models, and evaluation criteria in Humanloop's web UI before running experiments. This is a write action — it creates a permanent resource in your Humanloop account until explicitly deleted.

Prompt

@Humanloop pull the last 20 sessions from project pr_abc123 and identify any patterns in user feedback scores

Output

Retrieved 20 most recent sessions from pr_abc123 (Jan 26–28):

**Feedback score breakdown:**
- 5 stars: 12 sessions (60%)
- 4 stars: 5 sessions (25%)
- 3 stars or below: 3 sessions (15%)

**Emerging pattern:** Sessions with feedback ≤3 stars all occurred during evening hours (6–9 PM UTC) and involved multi-turn conversations exceeding 8 messages. Lower scores correlate with longer context windows, suggesting the model may lose coherence in extended dialogues. Consider testing a context-pruning strategy or switching to a model with larger context capacity for evening traffic.

Notes

Shows how Humanloop's session logs pair with AI analysis to surface insights your team might miss in raw data. The MCP fetches session metadata (timestamps, feedback, message counts); the AI connects the dots. Requires that your application is already logging sessions to Humanloop with feedback scores attached. Pagination defaults to 20 results — adjust the 'size' parameter for larger datasets.
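
The feedback breakdown in the illustrative output above comes down to bucket counting over the fetched sessions. Here is a minimal Python sketch of that arithmetic, assuming each session is a dictionary with a numeric `feedback_score` field. That field name and the record shape are hypothetical, chosen for illustration; the actual session schema may differ.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def feedback_breakdown(sessions: Iterable[dict]) -> Dict[str, Tuple[int, int]]:
    """Group sessions into star buckets and return (count, percent) per bucket."""
    buckets: Counter = Counter()
    for session in sessions:
        # `feedback_score` is an assumed field name; skip sessions without a score.
        score = session.get("feedback_score")
        if score is None:
            continue
        if score >= 5:
            label = "5 stars"
        elif score >= 4:
            label = "4 stars"
        else:
            label = "3 stars or below"
        buckets[label] += 1

    total = sum(buckets.values())
    if total == 0:
        return {}
    # Percentages are rounded to whole numbers, matching the report style.
    return {label: (count, round(100 * count / total)) for label, count in buckets.items()}


if __name__ == "__main__":
    # Synthetic data mirroring the 12 / 5 / 3 split in the illustrative output.
    sample = (
        [{"feedback_score": 5}] * 12
        + [{"feedback_score": 4}] * 5
        + [{"feedback_score": 3}, {"feedback_score": 2}, {"feedback_score": 1}]
    )
    for label, (count, percent) in feedback_breakdown(sample).items():
        print(f"{label}: {count} sessions ({percent}%)")
```

With 20 scored sessions split 12, 5, and 3, the percentages work out to 60%, 25%, and 15%. Sessions without a score are excluded from the total rather than counted as low scores, which is one reasonable choice among several.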

Use-case deep-dives

AI product team experiment tracking

When Humanloop fits a 3-person AI feature team

A 3-person team shipping an AI-powered search feature needs to track which prompt variants perform best across user cohorts. Humanloop is the right call here: the Create Project and List Experiments tools let the team spin up a project per feature, run A/B tests on prompts, and pull experiment results into Switchy for standup review. The List Sessions tool surfaces individual user interactions when something breaks. The trade-off: if your team isn't iterating on prompts weekly, you're paying for infrastructure you don't use. This works when prompt experimentation is a weekly ritual, not a one-time setup task.

Customer success prompt debugging

Using Humanloop to diagnose support chatbot failures

A 5-person customer success team runs a support chatbot and needs to understand why certain queries return unhelpful answers. The List Sessions tool in Humanloop pulls session logs filtered by project, so the team can review failed interactions in Switchy without leaving their workspace. The Delete Project tool cleans up test environments after debugging sprints. This scenario works if your chatbot logs to Humanloop already—if you're using a different LLM provider's logging, adding Humanloop just for session review is overkill. The buying call: if you're already paying Humanloop for prompt management, surfacing those sessions in Switchy saves 10 minutes per support escalation.

Solo founder LLM feature audit

When a solo builder should skip Humanloop in Switchy

A solo founder building an AI writing assistant wants to review how users interact with different prompt templates. Humanloop's 4-tool MCP gives access to projects and experiments, but the overhead of API key setup and project creation is high for one person who could just check the Humanloop dashboard directly. The List Experiments and List Sessions tools shine when a team needs shared visibility in Switchy, not when one person is the only stakeholder. The threshold: if you're coordinating with a co-founder or contractor on prompt performance, Humanloop in Switchy pays off. If you're solo, the native Humanloop UI is faster.

Frequently asked

What does the Humanloop MCP do in Switchy?

It connects your Humanloop workspace to Switchy so AI assistants can create projects, list experiments, retrieve session data, and delete projects without leaving the conversation. Useful when you're iterating on prompts or reviewing LLM outputs and want to manage Humanloop resources inline instead of switching tabs.

Do I need admin access to connect Humanloop MCP?

You need an API key from Humanloop with permissions to create and delete projects, plus read access to experiments and sessions. If your key is scoped to read-only, the create and delete tools will fail. Generate the key in your Humanloop organization settings before connecting it to Switchy.

Can the Humanloop MCP update existing prompts or evaluations?

No. The four tools focus on project lifecycle and data retrieval—creating projects, deleting them, listing experiments, and fetching sessions. If you need to edit prompt versions or trigger evaluations, do that in the Humanloop UI or use their full REST API outside of this MCP.

Why use this MCP instead of the Humanloop dashboard?

Use the MCP when you're already in a Switchy conversation and want to spin up a test project, check experiment results, or pull session logs without context-switching. The dashboard is still better for visual prompt editing, side-by-side comparisons, and bulk operations the MCP doesn't expose.

Who on the team should connect the Humanloop MCP?

Whoever owns your Humanloop organization or has API key creation rights. Once connected in Switchy, any workspace member can invoke the tools in their chats. If you're worried about accidental deletions, consider a read-only key or restrict the connection to a single user.