Handle Multi-Line Diffs
When reviewing a GitHub pull request (PR), it's crucial to provide detailed feedback on specific code changes. This includes addressing multi-line diffs, where a block of code has been added, removed, or modified. To ensure comprehensive reviews, your AI-powered system must handle these multi-line changes effectively.
You:
Yes, I'm wondering if the solution includes returning a response for the multi-line diffs from AI. The executor should be responsible for running the git command to persist those comments, correct?
ChatGPT:
"Exactly, you nailed it. The executor should take charge of running the Git commands to persist AI-generated comments back to the PR. This keeps responsibilities clean: AI focuses on generating responses, and the executor ensures those responses are applied."
Here’s how we can refine the solution:
Key Responsibilities
- AI Response:
  - The AI generates a detailed review comment for each file or diff block, including:
    - Multi-line context.
    - Specific feedback (e.g., risks, improvements).
    - Actionable recommendations.
- Executor Role:
  - Handles:
    - Running git diff or similar commands to extract multi-line changes.
    - Parsing and mapping AI responses to the correct file and line ranges.
    - Persisting comments to GitHub using the octokit library.
How This Works
Step 1: Multi-Line Diff Parsing
- Use git diff with context options to identify:
  - Start Line: Beginning of the changed block.
  - End Line: Last modified line.
Example:
git diff origin/main -- src/app/file.ts
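One caveat worth sketching: with git's default of three context lines, the range in a hunk header also covers unchanged lines, so it is wider than the changed block itself. The snippet below is a minimal sketch, not part of the original solution, of how the executor might narrow one hunk down to runs of added lines. The helper name `addedRangesInHunk` and its input shape (the header line plus the body lines, joined into one string) are assumptions made for illustration.

```javascript
// Hypothetical helper: find runs of added lines inside one unified-diff hunk.
// `hunkText` is the hunk header line followed by the hunk's body lines.
function addedRangesInHunk(hunkText) {
  const [header, ...body] = hunkText.split('\n');
  const match = header.match(/^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@/);
  if (!match) return [];

  const ranges = [];
  let newLine = parseInt(match[1], 10); // current line number in the new file
  let open = null; // the run of '+' lines currently being extended

  for (const line of body) {
    if (line.startsWith('+')) {
      if (open) {
        open.endLine = newLine;
      } else {
        open = { startLine: newLine, endLine: newLine };
        ranges.push(open);
      }
      newLine += 1;
    } else {
      open = null; // a context or removed line ends the current run
      if (line.startsWith(' ')) newLine += 1; // only context lines advance the new file
    }
  }
  return ranges;
}

// Usage: a hunk with two separate additions.
const sampleHunk = ['@@ -10,4 +10,6 @@', ' a', '+b', '+c', ' d', '-e', '+f', ' g'].join('\n');
console.log(addedRangesInHunk(sampleHunk));
```

Removed lines are skipped here because a comment with side "RIGHT" refers to line numbers in the new version of the file.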
Step 2: AI Analysis
- The AI generates responses for each block, including the start and end lines in its output.
Example Prompt:
{
  "role": "system",
  "content": "You are an AI assistant reviewing code changes. Provide detailed comments for each changed block, including line numbers."
}
Example Response:
[
  {
    "file_name": "src/app/file.ts",
    "start_line": 10,
    "end_line": 15,
    "comment": "Consider simplifying the logic in this block to reduce cyclomatic complexity."
  },
  {
    "file_name": "src/app/file.ts",
    "start_line": 20,
    "end_line": 25,
    "comment": "Edge cases for null inputs are not handled. Add appropriate validation."
  }
]
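Since the model's reply is just text, the executor may want to validate it before posting anything. Here is a minimal sketch of that "parsing and mapping" step, assuming the reply is a JSON array shaped like the example response. `parseAiReview` and `hunksByFile` are hypothetical names introduced for illustration; the hunk objects are assumed to carry `startLine` and `endLine` for the new file.

```javascript
// Hypothetical helper: parse the model's JSON reply and keep only comments
// whose line range sits inside a hunk the executor actually extracted.
function parseAiReview(rawText, hunksByFile) {
  let parsed;
  try {
    parsed = JSON.parse(rawText);
  } catch (err) {
    return []; // the reply was not valid JSON; the caller can log and retry
  }
  if (!Array.isArray(parsed)) return [];

  return parsed.filter((item) => {
    if (!item || typeof item.file_name !== 'string' || typeof item.comment !== 'string') return false;
    if (!Number.isInteger(item.start_line) || !Number.isInteger(item.end_line)) return false;
    if (item.start_line > item.end_line) return false;
    const hunks = hunksByFile[item.file_name] || [];
    // Review comments attach to lines present in the diff, so require one enclosing hunk.
    return hunks.some((h) => item.start_line >= h.startLine && item.end_line <= h.endLine);
  });
}

// Usage with the two example comments and one known hunk covering lines 8-16.
const hunksByFile = { 'src/app/file.ts': [{ startLine: 8, endLine: 16 }] };
const reply = JSON.stringify([
  { file_name: 'src/app/file.ts', start_line: 10, end_line: 15, comment: 'Consider simplifying the logic in this block.' },
  { file_name: 'src/app/file.ts', start_line: 20, end_line: 25, comment: 'Edge cases for null inputs are not handled.' }
]);
console.log(parseAiReview(reply, hunksByFile).length); // prints 1 (lines 20-25 match no hunk)
```

Dropping out-of-range comments up front is one design choice; another would be to post them as general PR comments instead.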
Step 3: Executor Persists Comments
- The executor iterates through the AI responses and posts comments using GitHub's API.
Updated Executor Code
Here’s how the executor handles this workflow.
1. Parsing Multi-Line Diffs
Extract multi-line ranges with git diff.
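const { execSync } = require('child_process');

// Returns one entry per hunk: the new-file line range plus the hunk's diff text.
function getMultiLineDiffs(fileName, baseBranch) {
  const diffOutput = execSync(`git diff origin/${baseBranch} -- ${fileName}`).toString();
  const diffBlocks = [];
  let current = null;
  for (const line of diffOutput.split('\n')) {
    // Hunk header: @@ -a,b +c,d @@ (git omits ",b" or ",d" when that count is 1)
    const matches = line.match(/^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@/);
    if (matches) {
      const startLine = parseInt(matches[1], 10);
      const count = matches[2] === undefined ? 1 : parseInt(matches[2], 10);
      // A count of 0 means the hunk only deletes lines, so there is no new-file range to comment on.
      current = count > 0 ? { startLine, endLine: startLine + count - 1, diff: line } : null;
      if (current) diffBlocks.push(current);
    } else if (current) {
      // Collect the hunk body so it can be sent to the AI as diffBlock.diff.
      current.diff += `\n${line}`;
    }
  }
  return diffBlocks;
}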
2. AI Request for Block Analysis
Send each diff block to AI for analysis.
const axios = require('axios');

// Sends one diff block to the chat completions endpoint and returns the review text.
async function analyzeDiffBlock(diffBlock, fileName, apiKey) {
  const response = await axios.post(
    'https://api.openai.com/v1/chat/completions',
    {
      model: 'gpt-4',
      messages: [
        {
          role: 'system',
          content: "You are an AI reviewing code diffs. Provide detailed comments for each block."
        },
        {
          role: 'user',
          content: `Review this code block for ${fileName} (lines ${diffBlock.startLine}-${diffBlock.endLine}):\n\n${diffBlock.diff}`
        }
      ]
    },
    {
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json'
      }
    }
  );
  return response.data.choices[0].message.content;
}
3. Persist Comments to GitHub
Post the comments using the GitHub API.
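async function postMultiLineComment(octokit, repoOwner, repoName, prNumber, fileName, diffBlock, comment) {
  // The review-comment endpoint requires the SHA of the commit being commented on.
  // It is looked up from the PR head here; callers that post many comments may prefer to cache it.
  const { data: pr } = await octokit.pulls.get({ owner: repoOwner, repo: repoName, pull_number: prNumber });
  const params = {
    owner: repoOwner,
    repo: repoName,
    pull_number: prNumber,
    commit_id: pr.head.sha,
    path: fileName,
    side: "RIGHT",
    line: diffBlock.endLine,
    body: comment
  };
  // start_line only applies to a true multi-line range; omit it for single-line blocks.
  if (diffBlock.startLine < diffBlock.endLine) {
    params.start_line = diffBlock.startLine;
    params.start_side = "RIGHT";
  }
  await octokit.pulls.createReviewComment(params);
}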
4. Overall Executor Flow
Combine everything to process multi-line diffs and persist comments.
// Assumes the @actions/core and @octokit/rest packages are installed (CommonJS builds).
const core = require('@actions/core');
const { Octokit } = require('@octokit/rest');

async function run() {
  // Action inputs arrive as strings, so the PR number is converted for the API calls.
  const prNumber = Number(core.getInput('pr_number'));
  const repoOwner = core.getInput('repo_owner');
  const repoName = core.getInput('repo_name');
  const baseBranch = core.getInput('base_branch');
  const apiKey = process.env.OPENAI_API_KEY;
  const githubToken = process.env.GITHUB_TOKEN;
  const octokit = new Octokit({ auth: githubToken });

  // Get changed files (execSync is imported in step 1)
  const changedFiles = execSync(`git diff --name-only origin/${baseBranch}`).toString().split('\n').filter(Boolean);

  for (const fileName of changedFiles) {
    const diffBlocks = getMultiLineDiffs(fileName, baseBranch);
    for (const diffBlock of diffBlocks) {
      const aiComment = await analyzeDiffBlock(diffBlock, fileName, apiKey);
      await postMultiLineComment(octokit, repoOwner, repoName, prNumber, fileName, diffBlock, aiComment);
    }
  }
  console.log('AI-powered PR review completed.');
}

// Mark the workflow step as failed if any Git, AI, or GitHub call throws.
run().catch((error) => core.setFailed(error.message));
Next-Level Features
- Aggregate Feedback:
  - Collect all comments and generate an overview summary for the PR.
- Refine Block Context:
  - Preprocess blocks to highlight key code areas (e.g., imports, function definitions).
- Error Handling:
  - Skip blocks that fail AI processing and log errors for later review.
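Two of these features, error handling and aggregated feedback, can be sketched without touching Git or GitHub. The snippet below is a rough illustration under stated assumptions, not a finished implementation: `reviewBlocks` and `buildSummary` are hypothetical helpers, `reviewFn` stands in for whatever function calls the AI, and each block is assumed to carry `fileName`, `startLine`, and `endLine`.

```javascript
// Hypothetical helper: run an async reviewer over every block, skipping failures.
async function reviewBlocks(blocks, reviewFn) {
  const comments = [];
  const errors = [];
  for (const block of blocks) {
    try {
      comments.push({ block, comment: await reviewFn(block) });
    } catch (err) {
      // Skip the block, but keep the error so it can be reviewed later.
      errors.push({ block, message: err.message });
    }
  }
  return { comments, errors };
}

// Hypothetical helper: build a plain-text overview from the collected results.
function buildSummary({ comments, errors }) {
  const lines = [`Reviewed ${comments.length} block(s), skipped ${errors.length}.`];
  for (const { block, comment } of comments) {
    lines.push(`- ${block.fileName}:${block.startLine}-${block.endLine}: ${comment}`);
  }
  for (const { block, message } of errors) {
    lines.push(`- skipped ${block.fileName}:${block.startLine}-${block.endLine} (${message})`);
  }
  return lines.join('\n');
}
```

The summary string could then be posted once as a general PR comment, while the per-block comments are posted inline as before.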
Why This Approach Works
- Separation of Concerns: The executor handles Git and GitHub operations, while AI focuses on generating feedback.
- Multi-Line Precision: Each comment is anchored to the start and end lines of the block it discusses, so feedback appears next to the code it concerns.
- Scalability: Large PRs are processed one diff block at a time, so each AI request stays small.
Let me know if you’d like further refinements or additional functionality—this system is ready to evolve!