How to find out which files have been changed between two pushes in a GitLab merge request?


Imagine this scenario:

  • I create a merge request where I modified a file in src/. This triggers a long-running test job in the CI/CD pipeline.
  • I forgot to update the readme, so I push another commit. I don't want to run the long-running test again, since the readme is not in src/.

My idea now is to check the files that have changed between pipeline 1 and 2. We've enabled merged results pipelines (so GitLab creates a commit with the result of the source and target branches merged together when running pipelines).

My merge request has these pipelines:

Pipeline ID   Pipeline Type    Commit SHA   Notes
#1            merged results   fca36db3     First push with changes to src/
#2            merged results   6cb35df4     Second push with changes to the readme

My CI/CD job looks like this:

job:
  script:
    - git diff fca36db3..6cb35df4 --

The output:

$ git diff fca36db3..6cb35df4 --
fatal: bad revision 'fca36db3..6cb35df4'

Obviously the merge commits created by GitLab aren't available here. I've tried to set GIT_STRATEGY to clone and GIT_DEPTH to 0, without success. Is it even possible to work with these commits?


Answer by halloei:

It's possible to fetch these commits from the remote repository by first creating a branch that points at each of them. The idea comes from the question "Can I view the reflog of a remote (not remote ref)?"

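job:
  script:
    # $TOKEN is an access token that is allowed to create branches through the API.
    # Each curl command is wrapped in single quotes because the ": " in the header
    # would otherwise be parsed as a YAML mapping.
    - 'curl --request POST --header "PRIVATE-TOKEN: $TOKEN" "$CI_API_V4_URL/projects/$CI_PROJECT_ID/repository/branches?branch=branch-for-fca36db3&ref=fca36db3"'
    - git fetch origin branch-for-fca36db3
    - 'curl --request POST --header "PRIVATE-TOKEN: $TOKEN" "$CI_API_V4_URL/projects/$CI_PROJECT_ID/repository/branches?branch=branch-for-6cb35df4&ref=6cb35df4"'
    - git fetch origin branch-for-6cb35df4
    # Both merged-results commits are now available locally.
    - git diff fca36db3..6cb35df4 --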
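
Once both commits can be fetched, the diff can be reduced to a yes/no decision about whether src/ was touched. The following is only a sketch: `touches_dir` is a hypothetical helper name, not part of git or GitLab, and it assumes the changed paths arrive one per line, as `git diff --name-only` prints them. It reads all of its input instead of stopping at the first match, so the upstream command is not cut off by a closed pipe.

```shell
#!/bin/sh
# Hypothetical helper (not part of git or GitLab): exits 0 if any path read
# from stdin lies under the given directory, and 1 otherwise.
touches_dir() {
  prefix="$1"
  found=1
  # Read every line, including a final line without a trailing newline.
  while IFS= read -r path || [ -n "$path" ]; do
    case "$path" in
      "$prefix"/*) found=0 ;;
    esac
  done
  return "$found"
}

# Possible use in the job, after both commits have been fetched:
#   if git diff --name-only fca36db3..6cb35df4 | touches_dir src; then
#     echo "src/ changed, run the long test"
#   else
#     echo "src/ untouched, skip the long test"
#   fi
```

In the scenario from the question, the second push only changes the readme, so the helper would report that src/ is untouched.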

Answer by bhito:

You could try a filtering approach: a dispatcher job triggers child pipelines depending on which files have changed between commits. We use this in a monorepo where each folder has its own .gitlab-ci.yml file, and it works quite well.

A dispatching/filtering job would look like this:

variables:
  DEFAULT_COMMIT: "0000000000000000000000000000000000000000" # Value of CI_COMMIT_BEFORE_SHA for the first commit on a newly created branch

.dispatcher:
  stage: Dispatch
  trigger:
    strategy: depend
    include: $PROJECT_DIR/.gitlab-ci.yml
    forward:
      yaml_variables: true
      pipeline_variables: true
  rules:
    # First commit on a branch: compare the changes against what is in main
    - if: $CI_COMMIT_BRANCH != $CI_DEFAULT_BRANCH && $CI_COMMIT_BEFORE_SHA == $DEFAULT_COMMIT
      changes:
        paths:
          - $PROJECT_DIR/**/* # This is the filter
        compare_to: main # On first push you just compare with `main`
      variables:
        SOURCE: "commit"
        FOLDER: $PROJECT_DIR
    # Not the first commit on the branch: compare the changes with the previous commit
    - if: $CI_COMMIT_BRANCH != $CI_DEFAULT_BRANCH && $CI_COMMIT_BEFORE_SHA != $DEFAULT_COMMIT
      changes:
        paths:
          - $PROJECT_DIR/**/*
      variables:
        SOURCE: "commit"
        FOLDER: $PROJECT_DIR
    # A commit on the default branch means that changes are being merged
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
      changes:
        paths:
          - $PROJECT_DIR/**/*
      variables:
        SOURCE: "merge"
        FOLDER: $PROJECT_DIR

The variables defined in each of those rules are entirely optional, but we use them in the downstream pipelines to know which event triggered the pipeline and to run a different action depending on whether it was a merge or a commit.
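
As a rough illustration of how a downstream job might branch on these variables, here is a minimal sketch. The function name `choose_action` and the "release" and "test" actions are placeholders made up for this example; only the SOURCE values "merge" and "commit" and the FOLDER variable come from the configuration above.

```shell
#!/bin/sh
# Hypothetical downstream helper: picks an action from the SOURCE and FOLDER
# values forwarded by the dispatcher. The action names are placeholders.
choose_action() {
  source_event="$1"
  folder="$2"
  case "$source_event" in
    merge)  echo "release $folder" ;;
    commit) echo "test $folder" ;;
    *)      echo "unknown SOURCE: $source_event" >&2; return 1 ;;
  esac
}

# In a downstream job this might be called as:
#   choose_action "$SOURCE" "$FOLDER"
```

Failing on an unknown value keeps a misconfigured rule from silently running the wrong action.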