Review

  • Checklist and Test Driven are more system-like in the general sense, but the rest are mostly just comprehension strategies for reading code. Some have use cases that they are better suited for. Many can be used in combination.
  • Bottom-Up is best for small, self-contained code changes
  • Top-Down is useful for large code changes
  • Control Flow is useful for debugging
  • Data Flow is useful for uncovering security flaws that stem from the misuse of data
  • Change-Impact is best suited for examining how the new code will impact the rest of the system.

Misc

  • Notes from 10 Best Code Review Techniques
  • Issues with an ad-hoc approach:
    • Files, methods and even statements developers look at, and in which order, is up to the individual developer. Which aspect or issues they look for in code is not well defined, and can change from developer to developer.
    • The way junior developers will learn to review code will neither be efficient nor effective nor consistent.
    • Review methods will be in continuous flux as some members leave and new members join.

Checklist-based

  • Uses a predefined list of items or criteria to assess the code. A checklist can focus on a variety of factors such as coding standards, design principles, security guidelines, and performance considerations.
  • Resources
  • Pros:
    • Provides a systematic and structured method for reviewing code, ensuring consistency and thoroughness.
    • Helps reviewers focus on important aspects that might be overlooked otherwise.
    • Assists in educating new team members about coding standards and best practices.
    • Functions as a memory aid, ensuring important aspects are checked.
  • Cons:
    • Correct usage of the checklist might be unclear or confusing.
    • Aspects not covered by the list might be overlooked.
    • May lack flexibility, as not all checklist items are applicable to every piece of code.
    • It might be not clear where to look, as no order for inspection was given.

Test Driven

  • The code reviewer starts by reviewing test code which will inform them about its purpose. It helps them create a mental model. Then, the reviewer reviews the development code using that mental model as a guide.
  • Pros:
    • Helps improve the test code quality and thus maintainability and quality of the code.
    • Ensures high test coverage and quality.
    • Encourages writing meaningful tests that truly validate the code.
    • Tests can help the developer understand the production code.
  • Cons:
    • Depends heavily on the quality of the tests themselves.
    • Can’t be done in the absence of tests.
    • Tests might replicate the misunderstandings of the code author about the code’s specification.

Bottom-Up and Top-Down

  • In practice, a combination of both approaches is often used. Starting with a top-down review helps understand the system’s architecture and major components, setting the stage for a more detailed, bottom-up review of specific areas of interest or concern.
  • Bottom-Up
    • The developer starts reading, understanding, and reviewing small fragments of the code first.
    • Based on those small fragments, like single code statements, or small methods the developer works their way up to understand and evaluate more and more of the functionality and quality of the software under review. This way, the developer gradually builds a picture of how code elements interact to form the larger system.
    • Pros:
      • Developers build a thorough understanding of the nitty-gritty details of the code.
      • Helps to effectively identify local issues, bugs, and inefficiencies.
      • The developer needs less overall context about the system, making it suitable for new team members or for reviewing isolated components.
      • Works well for unclear or unfamiliar legacy code, or code that isn’t documented.
    • Cons:
      • It is a time-consuming approach due to the focus on details.
      • Reviewers might miss the big picture, as well as architectural or higher-level design problems.
      • When confronted with a large code change, starting at the lowest level can be overwhelming and inefficient.
    • Abstract-Driven
      • Reviewers read small pieces of code, and write down their understanding of the code (aka create descriptions or specifications). Then, they compare the description they created with the existing specification or documentation of the code. If there was a mismatch, they found a problem.
      • To determine the order in which methods or functions are reviewed, this approach often uses dependency information. Commonly, the reviewers are instructed to start with the methods that have the least dependencies on the other parts of the systems
  • Top-Down
    • Starts by reading a user story or the documentation of the software first. These are often non-code artifacts that explain the purpose of the code under review and give the reviewer an overview of what to expect.
    • Then, the reviewer examines the overall structure and components of the systems.
    • In a later step, the reviewer tries to find the most significant method(s), that drive the functionality and starts reviewing in more detail from there.
    • Pros:
      • This approach helps build an understanding of the system’s architecture and high-level design.
      • It is effective in spotting problems with the overall design, structure, and integration points.
      • The approach is suited for larger code changes or even complete codebases.
      • It helps prioritize which parts of the code to focus on for a more detailed review.
    • Cons:
      • Reviewers using only a top-down approach might overlook specific low-level coding issues or bugs.
      • Requires the reviewers to have a good understanding of the overall system, which might not be feasible for newcomers.
      • By examining only high-level artifacts, the reviewer might make incorrect assumptions about how high-level designs are implemented at the lower levels.
    • Pattern Recognition
      • Uses mostly a top-down approach to identify common patterns in the code. Such patterns include design patterns, algorithmic patterns, or idiomatic expressions in the programming language.
      • Pros:
        • Helps create a mental model of the code change and the structure of the code.
        • Streamlines the review process and leads to more consistent review approaches even amongst different team members.
        • Well suited to find problems within the software architecture or design.
      • Cons:
        • Due to the focus on patterns, reviewers might overlook other problems.
        • Using known and common patterns to judge code, can bias the reviewers and they might dismiss good, new approaches, just because they are not established yet.
        • Reviewers need a broad knowledge of patterns and anti-patterns to look for.

Control-Flow and Data-Flow

  • Both methods are complementary to each other. Understanding how data is handled (data-flow) is crucial for evaluating the logic and sequence of operations (control-flow).
  • Control-Flow
    • Focuses on understanding how the program’s execution progresses. It examines the order in which statements, instructions, or function calls are executed and how the program moves from one part to another.
      • This includes looking at loops, conditional statements, function calls, and recursion.
    • Control-flow driven code reviews are great for finding logical errors.
    • Pros:
      • Highly effective in spotting logical problems with the flow of execution, such as loops, conditionals, and sequence of operations.
      • Essential for understanding the execution order and for debugging issues like infinite loops or unexpected branches in logic.
      • Helps in understanding the overall structure of the program, particularly the execution order and the interrelation of various components.
    • Cons:
      • Reviewers might miss issues related to data handling or data integrity.
      • It is less effective for data-intensive applications.
      • Complex to perform in event-driven systems due to the non-linear progression of execution.
  • Data-Flow
    • Focuses on how data moves through the program.
      • Includes tracking the source, use, and modification of variables and data structures, understanding how data is passed between functions, and examining how data state changes over time.
    • Pros:
      • Effective in identifying problems related to data handling, such as data corruption, improper use of data structures, and issues with data lifecycle.
      • Particularly useful for uncovering security flaws that stem from the misuse of data, like buffer overflows or injection attacks.
      • Helps in tracing how data moves through the system, which is crucial for understanding complex interactions and dependencies.
      • Can identify inefficiencies in how data is processed, leading to opportunities for performance optimization.
    • Cons:
      • Tracing data flow can be challenging, especially in large codebases with numerous data paths and interactions.
      • Requires a detailed and thorough examination of how data is handled, manipulated, and transformed, which can be time-intensive.
      • Focusing primarily on data flow can lead to overlooking issues related to control flow, such as logic errors or incorrect program sequencing.
    • Trace-based
      • Dynamic method of following the execution of code at runtime. The code reviewer can do this by stepping through the code in a debugger or by examining logs or output traces.
      • Can be used as part of Control-Flow
      • Allows us to understand the state of the program at various points of execution, to track the sequence of function calls, and to observe changes in variable states over time.
      • Pros:
        • Allows the review to see how the system behaves at runtime.
        • Effective during diagnosing or evaluating “potential” bugs and problems.
        • Helps build a mental model of the code, thus helpful if the reviewer is unfamiliar with the system.
        • Can help catch side effects or anomalies that cannot be found during a static inspection.
        • Suitable for detecting performance problems.
      • Cons:
        • Setting up the reviewer’s system to run the code might be cumbersome and time-intensive.
        • Might distract the reviewer from actually reviewing the source code.
        • The review is limited to the scenarios the reviewer traces.

Change-Impact Analysis

  • The goal is to gather information to be able to judge how the code change will impact the rest of the system.
  • Involves analyzing high-level artifacts, such as specifications, documentations to understand the impact of the code change.
  • Start by understanding the rationale behind the code change, and its description (top-down). Then, identify affected components, analyze the dependency graph and assess the impact of this change on the functionality of the overall system.
  • During such an analysis the reviewer should also look at the impact on the test system: does it need tests to change, or the be added?
  • Pros:
    • Discussions on the impact of code changes can improve the understanding a team has of the overall codebase.
    • Allows to direct reviewing efforts towards impacted and high-risk parts of the codebase.
    • Reduces the risk of introducing bugs or regressions.
  • Cons:
    • Change impact analysis can be very time-consuming.
    • The reviewer has to have already a good and deep understanding of the codebase to understand the impact.
    • It can lead to analysis paralysis, because the reviewer cannot completely envision all potential impacts of this code change, and is thus reluctant to approve it.
  • Cross-Referencing
    • Reading method that can be used as part of the impact analysis
    • Systematically review the dependencies and relationships between code elements, such as classes, functions or variables.
      • Most IDE’s provide built-in tools that help with cross-referencing, which is also often used during code refactoring. These tools build an index of all code elements and link them, which helps during navigation.
    • Pros:
      • Finds dependencies, and relationships and reveals how a change impacts the rest of the system.
      • Allows the reviewer to build a deep understanding of the codebase, and the code change under review.
      • Reviewers get a good understanding of the control flow and the data flow, thus allowing them to identify errors in execution and data manipulation.
    • Cons:
      • If done manually, cross-referencing is tedious and error-prone.
      • For larger code changes and systems might be overwhelming for the reviewer.
      • Focusing too much on relationships and dependencies between artifacts can lead to overlooking other issues.