Handle Multi-Line Diffs

When reviewing a GitHub pull request (PR), it's crucial to provide detailed feedback on specific code changes. This includes addressing multi-line diffs, where a block of code has been added, removed, or modified. To ensure comprehensive reviews, your AI-powered system must handle these multi-line changes effectively.

You:

Yes, I'm wondering if the solution includes returning a response for the multi-line diffs from AI. The executor should be responsible for running the git command to persist those comments, correct?


ChatGPT:

Exactly, you nailed it. The executor should run the Git commands that extract the diffs and make the GitHub API calls that persist the AI-generated comments to the PR (review comments are stored through GitHub's API, not through Git itself). This keeps responsibilities clean: the AI generates the feedback, and the executor makes sure it is applied.

Here’s how we can refine the solution:


Key Responsibilities

  1. AI Response:

    • The AI generates a detailed review comment for each file or diff block, including:
      • Multi-line context.
      • Specific feedback (e.g., risks, improvements).
      • Actionable recommendations.
  2. Executor Role:

    • Handles:
      • Running git diff or similar commands to extract multi-line changes.
      • Parsing and mapping AI responses to the correct file and line ranges.
      • Persisting comments to GitHub using the octokit library.

How This Works

Step 1: Multi-Line Diff Parsing

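  • Use git diff (optionally with -U<n> to set how many context lines surround each change) and read each hunk header, @@ -a,b +c,d @@, to identify:
    • Start Line: c, the first line of the hunk in the new version of the file.
    • End Line: c + d - 1, the last line of the hunk in the new version.
  • A hunk includes its context lines, so this range can be slightly wider than the lines that actually changed.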

Example:

git diff origin/main -- src/app/file.ts
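
If you want ranges that cover only the lines that were actually added or modified, rather than the whole hunk with its context, here is a minimal sketch. It assumes the input is the text of a single-file unified diff like the one the command above prints; the function name addedLineRanges is just illustrative.

```javascript
// Sketch: find runs of consecutive added lines ("+") on the new side of a
// unified diff. Assumes a single-file diff as printed by `git diff`.
function addedLineRanges(diffText) {
  const ranges = [];
  let newLine = 0;    // current line number in the new version of the file
  let run = null;     // run of added lines currently being built
  let inHunk = false; // false while reading file headers

  const closeRun = () => {
    if (run) ranges.push(run);
    run = null;
  };

  for (const line of diffText.split('\n')) {
    // Hunk header: @@ -oldStart[,oldCount] +newStart[,newCount] @@
    const header = line.match(/^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@/);
    if (header) {
      closeRun();
      newLine = parseInt(header[1], 10);
      inHunk = true;
    } else if (line.startsWith('diff ')) {
      // A new file header begins; stop counting until the next hunk.
      closeRun();
      inHunk = false;
    } else if (!inHunk) {
      continue; // "---" and "+++" header lines
    } else if (line.startsWith('+')) {
      if (run) run.endLine = newLine;
      else run = { startLine: newLine, endLine: newLine };
      newLine += 1;
    } else if (line.startsWith(' ')) {
      closeRun(); // a context line ends the current run
      newLine += 1;
    }
    // Removed lines ("-") do not exist in the new file, so they neither
    // advance the counter nor break a run of added lines.
  }
  closeRun();
  return ranges;
}
```

Each returned range could then be used in place of the hunk-wide start and end lines when a tighter comment anchor is preferred.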

Step 2: AI Analysis

  • The AI generates responses for each block, including the start and end lines in its output.

Example Prompt:

{
  "role": "system",
  "content": "You are an AI assistant reviewing code changes. Provide detailed comments for each changed block, including line numbers."
}
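
The model can only report correct line numbers if it can see them, so one option is to number each line of the hunk before sending it. Here is a rough sketch under that assumption; annotateHunk and buildMessages are hypothetical helpers, hunkLines is the body of one hunk (without the @@ header), and newStart is the new-side start line from that header.

```javascript
// Sketch: prefix each hunk line with its line number in the new file so the
// model can cite real line numbers instead of guessing them.
function annotateHunk(hunkLines, newStart) {
  let newLine = newStart;
  return hunkLines
    .map((line) => {
      // Removed lines and "\ No newline at end of file" markers have no
      // line number in the new file, so they get a blank gutter.
      if (line.startsWith('-') || line.startsWith('\\')) {
        return `      ${line}`;
      }
      const numbered = `${String(newLine).padStart(5)} ${line}`;
      newLine += 1;
      return numbered;
    })
    .join('\n');
}

// Build the chat messages for one hunk, reusing the system prompt above.
function buildMessages(fileName, hunkLines, newStart) {
  return [
    {
      role: 'system',
      content:
        'You are an AI assistant reviewing code changes. Provide detailed comments for each changed block, including line numbers.',
    },
    {
      role: 'user',
      content:
        `File: ${fileName}\n` +
        'Each line is prefixed with its line number in the new file; removed lines have none.\n\n' +
        annotateHunk(hunkLines, newStart),
    },
  ];
}
```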

Example Response:

[
  {
    "file_name": "src/app/file.ts",
    "start_line": 10,
    "end_line": 15,
    "comment": "Consider simplifying the logic in this block to reduce cyclomatic complexity."
  },
  {
    "file_name": "src/app/file.ts",
    "start_line": 20,
    "end_line": 25,
    "comment": "Edge cases for null inputs are not handled. Add appropriate validation."
  }
]
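
Since the executor acts on whatever the model returns, it is worth checking the response before using it. Below is a small sketch of that check, assuming the model replies with a bare JSON array in the shape shown above; parseReviewResponse is an illustrative name, not part of any library.

```javascript
// Sketch: parse the model's reply and keep only well-formed entries.
function parseReviewResponse(raw) {
  let parsed;
  try {
    parsed = JSON.parse(raw);
  } catch (err) {
    throw new Error(`AI response is not valid JSON: ${err.message}`);
  }
  if (!Array.isArray(parsed)) {
    throw new Error('AI response must be a JSON array');
  }

  // Drop entries that cannot be mapped to a file and a forward line range.
  return parsed.filter(
    (entry) =>
      entry !== null &&
      typeof entry === 'object' &&
      typeof entry.file_name === 'string' &&
      entry.file_name.length > 0 &&
      Number.isInteger(entry.start_line) &&
      Number.isInteger(entry.end_line) &&
      entry.start_line >= 1 &&
      entry.end_line >= entry.start_line &&
      typeof entry.comment === 'string' &&
      entry.comment.trim() !== ''
  );
}
```

Filtering silently is a design choice for this sketch; an executor could just as well log the rejected entries so they can be inspected later.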

Step 3: Executor Persists Comments

  • The executor iterates through the AI responses and posts comments using GitHub's API.
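
To connect the response format from Step 2 to the API call, here is a sketch of the mapping. The parameter names follow GitHub's "create a review comment for a pull request" endpoint as I understand it; toReviewCommentParams and the target object are hypothetical, and target.commitId is assumed to be the SHA of the PR's head commit, which the executor would have to look up.

```javascript
// Sketch: turn one AI response entry into parameters for GitHub's
// "create a review comment" endpoint. `target` is a hypothetical object
// holding values the executor already knows.
function toReviewCommentParams(entry, target) {
  const params = {
    owner: target.owner,
    repo: target.repo,
    pull_number: target.pullNumber,
    commit_id: target.commitId, // SHA of the PR head commit (assumed known)
    path: entry.file_name,
    side: 'RIGHT',              // comment on the new version of the file
    line: entry.end_line,       // last line of the commented range
    body: entry.comment,
  };

  // Only a multi-line comment names a first line, and that line must come
  // before `line`, so one-line ranges leave start_line out.
  if (entry.start_line < entry.end_line) {
    params.start_side = 'RIGHT';
    params.start_line = entry.start_line;
  }
  return params;
}
```

The result could then be passed to the Octokit method that creates a review comment, for example `octokit.rest.pulls.createReviewComment(params)` in recent @octokit/rest releases.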

Updated Executor Code

Here’s how the executor handles this workflow.

1. Parsing Multi-Line Diffs

Extract multi-line ranges with git diff.

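// execSync is used by run() in step 4; execFileSync is used here.
const { execSync, execFileSync } = require('child_process');

function getMultiLineDiffs(fileName, baseBranch) {
  // Pass arguments as an array so file and branch names are never interpreted by a shell.
  const diffOutput = execFileSync('git', ['diff', `origin/${baseBranch}`, '--', fileName]).toString();

  const diffBlocks = [];
  let current = null; // hunk currently being collected

  for (const line of diffOutput.split('\n')) {
    // Hunk header: @@ -oldStart[,oldCount] +newStart[,newCount] @@
    // Git omits a count when it equals 1.
    const matches = line.match(/^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@/);
    if (matches) {
      const startLine = parseInt(matches[1], 10);
      const lineCount = matches[2] === undefined ? 1 : parseInt(matches[2], 10);
      // A count of 0 means the hunk only deletes lines, so there is nothing
      // on the new side of the diff to attach a comment to.
      current = lineCount > 0
        ? { startLine, endLine: startLine + lineCount - 1, diff: line }
        : null;
      if (current) diffBlocks.push(current);
    } else if (current && line !== '') {
      // Keep the hunk body so it can be sent to the AI as diffBlock.diff.
      current.diff += `\n${line}`;
    }
  }

  return diffBlocks;
}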

2. AI Request for Block Analysis

Send each diff block to AI for analysis.

const axios = require('axios');

async function analyzeDiffBlock(diffBlock, fileName, apiKey) {
  const response = await axios.post(
    'https://api.openai.com/v1/chat/completions',
    {
      model: 'gpt-4',
      messages: [
        {
          role: 'system',
          content: "You are an AI reviewing code diffs. Provide detailed comments for each block."
        },
        {
          role: 'user',
          // diffBlock.diff must hold the raw hunk text, so getMultiLineDiffs has to populate it.
          content: `Review this code block for ${fileName} (lines ${diffBlock.startLine}-${diffBlock.endLine}):\n\n${diffBlock.diff}`
        }
      ]
    },
    {
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json'
      }
    }
  );

  // The first choice holds the generated review text.
  return response.data.choices[0].message.content;
}

3. Persist Comments to GitHub

Post the comments using the GitHub API.

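async function postMultiLineComment(octokit, repoOwner, repoName, prNumber, fileName, diffBlock, comment) {
  const pull = { owner: repoOwner, repo: repoName, pull_number: Number(prNumber) };
  // GitHub requires the SHA of the commit being commented on; use the PR's head commit.
  // Recent @octokit/rest releases expose endpoints under octokit.rest.
  const { data: pr } = await octokit.rest.pulls.get(pull);

  const params = {
    ...pull,
    commit_id: pr.head.sha,
    path: fileName,
    side: 'RIGHT',
    line: diffBlock.endLine,
    body: comment
  };
  // start_line is only valid when it precedes line, so omit it for one-line blocks.
  if (diffBlock.startLine < diffBlock.endLine) {
    params.start_side = 'RIGHT';
    params.start_line = diffBlock.startLine;
  }
  await octokit.rest.pulls.createReviewComment(params);
}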

4. Overall Executor Flow

Combine everything to process multi-line diffs and persist comments.

const core = require('@actions/core');
const { Octokit } = require('@octokit/rest');

async function run() {
  // Action inputs arrive as strings; the API expects a numeric PR number.
  const prNumber = Number(core.getInput('pr_number'));
  const repoOwner = core.getInput('repo_owner');
  const repoName = core.getInput('repo_name');
  const baseBranch = core.getInput('base_branch');
  const apiKey = process.env.OPENAI_API_KEY;
  const githubToken = process.env.GITHUB_TOKEN;

  const octokit = new Octokit({ auth: githubToken });

  // Get changed files; --diff-filter=d leaves out deleted files, which have no new lines to comment on.
  const changedFiles = execSync(`git diff --name-only --diff-filter=d origin/${baseBranch}`).toString().split('\n').filter(Boolean);

  for (const fileName of changedFiles) {
    const diffBlocks = getMultiLineDiffs(fileName, baseBranch);

    for (const diffBlock of diffBlocks) {
      const aiComment = await analyzeDiffBlock(diffBlock, fileName, apiKey);
      await postMultiLineComment(octokit, repoOwner, repoName, prNumber, fileName, diffBlock, aiComment);
    }
  }

  console.log('AI-powered PR review completed.');
}

// Mark the workflow step as failed if anything in the run throws.
run().catch((error) => core.setFailed(error.message));

Next-Level Features

  1. Aggregate Feedback:

    • Collect all comments and generate an overview summary for the PR.
  2. Refine Block Context:

    • Preprocess blocks to highlight key code areas (e.g., imports, function definitions).
  3. Error Handling:

    • Skip blocks that fail AI processing and log errors for later review.
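
As a starting point for the first and third ideas, here is a minimal sketch that skips failing blocks and builds a plain-text overview. It assumes each block carries fileName, startLine, and endLine (the fileName field is an addition to the diffBlock shape used earlier), and that analyze is any async function returning a comment string, such as a wrapper around analyzeDiffBlock. Both function names are illustrative.

```javascript
// Sketch: review every block, skip the ones that fail, and keep the errors.
async function reviewBlocks(blocks, analyze) {
  const reviewed = [];
  const failed = [];
  for (const block of blocks) {
    try {
      const comment = await analyze(block);
      reviewed.push({ ...block, comment });
    } catch (err) {
      // Skip the block but keep enough detail to inspect or retry it later.
      failed.push({ ...block, error: err.message });
      console.error(
        `Skipped ${block.fileName}:${block.startLine}-${block.endLine}: ${err.message}`
      );
    }
  }
  return { reviewed, failed };
}

// Sketch: build a plain-text overview from the collected results.
function buildSummary({ reviewed, failed }) {
  const lines = [
    `Reviewed ${reviewed.length} block(s); skipped ${failed.length}.`,
    '',
  ];
  for (const r of reviewed) {
    lines.push(`- ${r.fileName} (lines ${r.startLine}-${r.endLine}): ${r.comment}`);
  }
  for (const f of failed) {
    lines.push(`- ${f.fileName} (lines ${f.startLine}-${f.endLine}): not reviewed (${f.error})`);
  }
  return lines.join('\n');
}
```

The summary text could then be posted as a general comment on the PR, separate from the line-level review comments.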

Why This Approach Works

  • Separation of Concerns: The executor handles Git and GitHub operations, while AI focuses on generating feedback.
  • Multi-Line Precision: Each comment maps directly to the affected code range, ensuring high relevance.
  • Scalability: Handles large PRs by chunking diffs into manageable blocks.

Let me know if you’d like further refinements or additional functionality—this system is ready to evolve!