# GitHub Middleware Component Specification
This document specifies the architecture and logic of the GitHub Middleware, the "Control Plane" component responsible for bridging GitHub interactions with Temporal orchestration.
## Overview
The GitHub Middleware is a stateless FastAPI web service that sits between GitHub and Temporal. It acts as the "ears" (inbound webhooks) and "voice" (outbound escalations) of the Farmercode system.
## Infrastructure
- Type: Stateless FastAPI Service
- Deployment: Separated from the Target App. Part of the `farmercode` control plane.
- Public Access: Must be reachable by GitHub (public IP or proxy).
- Security: Shared Secret (HMAC) with GitHub App/Webhook settings.
## Logic Flow
## 1. Webhook Validation
Every incoming request must be validated to ensure it originates from GitHub.
- Header: X-Hub-Signature-256
- Algorithm: HMAC-SHA256 using the GITHUB_WEBHOOK_SECRET.
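A minimal sketch of this check, assuming the raw request body is available as bytes and that the header value has the `sha256=<hex digest>` form GitHub sends. The function name `is_valid_signature` is illustrative and not part of this spec.

```python
import hashlib
import hmac
from typing import Optional


def is_valid_signature(secret: str, body: bytes, signature_header: Optional[str]) -> bool:
    """Return True if the header matches the HMAC-SHA256 of the raw body."""
    if not signature_header or not signature_header.startswith("sha256="):
        return False
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # compare_digest takes constant time, so the comparison does not leak
    # how many leading characters matched.
    return hmac.compare_digest(expected.encode(), signature_header.encode())
```

The digest must be computed over the exact bytes received, before any JSON parsing, because re-serializing the payload can change whitespace and break the comparison.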
## 2. Event Filtering
The GitHub Middleware only processes specific events:
- issue_comment.created: Human adds a comment.
- issues.opened (Optional): Triggering new workflows (e.g., WRD intake).
- pull_request_review.submitted (Optional): PR approvals.
Ignore:
- Bot comments (check sender.type == "Bot").
- Comments made by the Agent itself (to prevent feedback loops).
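The filter can be sketched as a single predicate. This assumes the event name is read from the `X-GitHub-Event` header and the action from the payload's `action` field; GitHub delivers review submissions as event `pull_request_review` with action `submitted`. The names `HANDLED_EVENTS`, `should_process`, and `agent_logins` are illustrative.

```python
from typing import AbstractSet, Any, Mapping

# (event, action) pairs the middleware reacts to; the last two are optional.
HANDLED_EVENTS = {
    ("issue_comment", "created"),
    ("issues", "opened"),
    ("pull_request_review", "submitted"),
}


def should_process(
    event: str,
    payload: Mapping[str, Any],
    agent_logins: AbstractSet[str] = frozenset(),
) -> bool:
    """Return True only for handled events sent by a human."""
    if (event, payload.get("action")) not in HANDLED_EVENTS:
        return False
    sender = payload.get("sender") or {}
    if sender.get("type") == "Bot":
        return False
    # Guard against feedback loops if an agent posts through a user account.
    return sender.get("login") not in agent_logins
```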
## 3. Workflow Identification (The "Footer" Protocol)
The GitHub Middleware does not maintain a database of Issue-to-Workflow mappings. Instead, it relies on the stateless metadata footer stamped into the Issue Body by the Workflow.
Footer Format (in Issue Body):
```html
<!-- farmercode-metadata
workflow_id: wrd-123
run_id: 9f8487b0-c672-4f81-8123-109867123
group_id: blueprint
mode: live
-->
```
Note: The footer can be hidden using HTML comments or visible. HTML comments are preferred for cleanliness.
Resolution Steps:
1. Receive Comment: Payload contains issue.number.
2. Fetch Issue: (Optional) If the webhook payload doesn't contain the Issue Body, fetch it via API. Note: issue_comment payloads usually include the issue object.
3. Parse Footer: Extract workflow_id and run_id.
4. Target: The workflow_id is the primary business key. The run_id ensures we signal the specific execution instance (handling restarts/continues).
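A sketch of the parsing step, assuming the footer always uses the HTML-comment form with one `key: value` pair per line. `parse_footer` is a hypothetical helper name; it returns `None` when the footer is missing or lacks either required key.

```python
import re
from typing import Dict, Optional

# Captures everything between the marker line and the closing "-->".
FOOTER_RE = re.compile(r"<!--\s*farmercode-metadata\s+(.*?)-->", re.DOTALL)


def parse_footer(issue_body: Optional[str]) -> Optional[Dict[str, str]]:
    """Extract the metadata footer from an issue body, or None if unusable."""
    match = FOOTER_RE.search(issue_body or "")
    if match is None:
        return None
    metadata: Dict[str, str] = {}
    for line in match.group(1).splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            metadata[key.strip()] = value.strip()
    # Both identifiers are needed to target a specific execution.
    if "workflow_id" not in metadata or "run_id" not in metadata:
        return None
    return metadata
```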
## 4. Signaling Temporal
Once the target is identified, the GitHub Middleware signals the workflow.
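```python
import logging

from temporalio.client import Client
from temporalio.service import RPCError, RPCStatusCode

logger = logging.getLogger(__name__)


async def handle_comment(temporal_client: Client, payload: dict, metadata: dict) -> None:
    # ... validation & parsing happen before this point;
    # `metadata` is the parsed footer from the Issue Body ...
    workflow_id = metadata["workflow_id"]
    signal_name = "HumanInput"
    signal_data = {
        "author": payload["sender"]["login"],
        "text": payload["comment"]["body"],
        "issue_number": payload["issue"]["number"],
        "timestamp": payload["comment"]["created_at"],
    }

    # Pin the handle to the footer's run_id so the signal reaches the
    # specific execution instance.
    handle = temporal_client.get_workflow_handle(
        workflow_id, run_id=metadata.get("run_id")
    )
    try:
        await handle.signal(signal_name, signal_data)
    except RPCError as err:
        if err.status != RPCStatusCode.NOT_FOUND:
            raise
        # The workflow has already finished or does not exist.
        logger.warning("Orphaned comment on %s", workflow_id)
```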
## 5. Outbound Communications (Escalation)
The GitHub Middleware also acts as an **Outbound Gateway** for Agents to communicate with Humans. This allows the Workflow to post comments to GitHub Issues without the Agent (Worker) needing direct GitHub API write access.
### Endpoint: `POST /api/github/comment`
**Payload:**
```json
{
  "issue_number": 123,
  "body": "I am blocked on X. Please advise.",
  "agent_name": "architect",
  "workflow_id": "wrd-123"
}
```
`agent_name` selects the persona, e.g. `"architect"`, `"tech_lead"`, or `"qa"`.
### Agent Personas (GitHub Apps)
To maintain clear separation of concerns and traceability, the GitHub Middleware supports **Multi-App Posting**.
* It does not post as a generic "Farmercode Bot".
* It looks up the correct GitHub App credentials based on `agent_name`.
* Example:
    * `agent_name="architect"` -> Uses "Farmer Architect" GitHub App.
    * `agent_name="qa"` -> Uses "Farmer QA" GitHub App.

**Logic:**
1. Receive POST request from Temporal Activity.
2. Load App ID and Private Key for the requested `agent_name`.
3. Generate Installation Token.
4. Call GitHub API: `POST /repos/{owner}/{repo}/issues/{number}/comments`.
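The persona lookup and request assembly could look roughly like the sketch below. Everything named here is an assumption for illustration: the `PERSONAS` registry, its placeholder app IDs and key paths, and the `mint_token` callable that stands in for installation-token generation. The payload above carries no repository, so `owner` and `repo` are passed in separately.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class AppCredentials:
    app_id: str
    private_key_path: str


# Hypothetical registry; real values would come from config or a secret store.
PERSONAS: Dict[str, AppCredentials] = {
    "architect": AppCredentials("<architect-app-id>", "/secrets/architect.pem"),
    "qa": AppCredentials("<qa-app-id>", "/secrets/qa.pem"),
}


def build_comment_request(
    request: dict,
    owner: str,
    repo: str,
    mint_token: Callable[[AppCredentials], str],
) -> Tuple[str, Dict[str, str], Dict[str, str]]:
    """Return (url, headers, json_body) for the GitHub comment call."""
    creds = PERSONAS.get(request["agent_name"])
    if creds is None:
        raise ValueError(f"Unknown agent_name: {request['agent_name']!r}")
    token = mint_token(creds)  # step 3: generate an installation token
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/issues/{request['issue_number']}/comments"
    )
    headers = {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    }
    return url, headers, {"body": request["body"]}
```

Keeping token generation behind a callable lets the request assembly be tested without real GitHub App keys.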
## Race Conditions & Buffering
Scenario: The human comments before the workflow has reached the await_signal state.
* Temporal Behavior: Signals are buffered by default. If a signal arrives while the workflow is running (not yet waiting), it is recorded in the workflow's event history.
* Workflow Logic: When the workflow finally reaches workflow.wait_for_signal("HumanInput"), it immediately consumes the buffered signal and proceeds.
* Benefit: The GitHub Middleware does not need to check the workflow status before signaling. It just sends the signal (fire and forget); the only failure it handles is the not-found case described in section 4.
## Open Questions
- Metadata Location: Should the metadata be in the Issue Body (created once) or the Comment Chain?
    - Decision: Issue Body. It is the stable anchor.
- Multiple Workflows per Issue: Can an issue be reused for multiple workflows?
    - Constraint: No. One Issue = One Workflow Lifecycle.
- Authentication: How does the GitHub Middleware authenticate with the Temporal Cluster? (mTLS certs?)
## Robustness & Safety
- Missing/Corrupt Footer: If a user accidentally deletes or corrupts the metadata footer in the Issue Body, the GitHub Middleware cannot route the signal and must ignore subsequent comments on that issue.
    - Recovery: A "Watchdog" process (or the Workflow itself on a timer) could periodically check the Issue Body and restore the footer if it is missing, keeping the feedback loop intact.
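One way the recovery step might work is sketched below, assuming the workflow still knows its own metadata and can rewrite the Issue Body. `render_footer` and `restore_footer` are hypothetical helpers. A footer whose closing `-->` was deleted is not recognised by this pattern, so in that case a clean footer is appended and the damaged fragment stays in place.

```python
import re
from typing import Dict

# Matches a complete footer comment, intact or with altered contents.
FOOTER_RE = re.compile(r"<!--\s*farmercode-metadata\s.*?-->", re.DOTALL)


def render_footer(metadata: Dict[str, str]) -> str:
    """Render metadata in the footer format, one key per line."""
    lines = [f"{key}: {value}" for key, value in metadata.items()]
    return "<!-- farmercode-metadata\n" + "\n".join(lines) + "\n-->"


def restore_footer(issue_body: str, metadata: Dict[str, str]) -> str:
    """Return the body with a correct footer; unchanged if already intact."""
    footer = render_footer(metadata)
    if footer in issue_body:
        return issue_body
    # Drop any recognisable but altered footer, then append a clean one.
    stripped = FOOTER_RE.sub("", issue_body).rstrip()
    return f"{stripped}\n\n{footer}" if stripped else footer
```

The watchdog would only need to call the GitHub API when the returned body differs from the current one.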