A German engineer's trick for perfect Salesforce CI/CD

I didn’t invent this method. I first saw it in action in 2017, when I was a junior DevOps engineer who had just started in IT, and it was actually my first commercial project. So it came naturally to me that – aha, that’s how it works by default at the enterprise level. Later on I discovered it is not such a default way of merging and deploying. On the contrary, it was an “exotic” / “extraordinary” way. I give credit to Uwe, a very good software craftsman from Germany, from whom I learned this.
And that trick is to first deploy, then merge.
What is it all about?
In traditional software, where you have a compiled artifact, you build the solution from the version on the branch. So you first build (for example) locally; if that works, you open a Pull Request, have your changes merged to the target branch, and then it is built on the target branch.
Optionally, you can auto-deploy it after it is merged.
TL;DR: First I deploy, then I merge if the deployment was successful. See How it works below for the flow.
The Salesforce Problem
However, in Salesforce you cannot compile an artifact and make sure it will work when deployed.
(I am not talking about managed/unlocked packages here, just normal unmanaged metadata, a.k.a. a regular repository with code – which is most Salesforce implementations.)
All you can do before deployment is a validation, with or without running unit tests.
Now you might say: okay, but if I validate the code and it succeeds – or I even use the quick deploy option – that assures me I can deploy it.
Well, most of the time, yes. But there are cases when validation passes yet the deployment, or even the quick deployment, fails (see the sketch after this list):
- You validated some time ago, but other team members (or even you) have since deployed other components which now block the next deployment.
- Background jobs can affect the deployment. You can get an error like “background job currently in progress, try again later”.
- Manual changes in the meantime – after the package was validated, but before it was deployed.
- When you are not using quick deployment, you need to run Apex unit tests for a deployment to the Production org (depending on which metadata types the deployment changes). If your unit tests use @SeeAllData, they most probably rely on the org’s data, and a change in that data can leave you with a failing unit test.
- Validation does not change your org metadata. Whenever a deployment enables the deletion of other components (by removing the reference to them) in post-destructive changes, you cannot really test/validate whether that will succeed. The validation can fail, yet the deployment will eventually pass, as it removes the reference between the org metadata that stays and the metadata to be deleted.
- Changing the type of a field from Lookup to Master-Detail, or the other way around, cannot be validated; it can only be done via a deployment.
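For reference, the validate-and-quick-deploy path mentioned above looks roughly like this. This is a minimal sketch using the sf CLI; the manifest path and test level are assumptions, so adjust them to your project:

```bash
# Validation returns a deployment job id; if nothing relevant changes in the org
# in the meantime, that id can be reused for a quick deploy without re-running tests.
sf project deploy validate --manifest package/package.xml --test-level RunLocalTests

# ...later, once the change is approved (the job id comes from the validation output):
sf project deploy quick --job-id <validation-job-id>
```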
These things don’t happen often, BUT when they do, then – in my life experience – you will be in a rush and need to deploy your items ASAP.
The Standard Pipeline Problem
We come to a detail which matters, but which also complicates the pipeline – what happens if I am using the traditional deploy-after-merge?
Depending on the designed pipeline:
Validation pass > merge to target branch > deployment failed > retry deployment
A retry can eventually pass once the background job finishes, or after you manually tweak something in the org – the data or metadata the deployment was relying on – but changing metadata manually in an org is bad practice, and sometimes a violation of enterprise rules.
Nevertheless, if the deployment eventually passes, you are good. Maybe during the Root Cause Analysis call you will ask for forgiveness for manually modifying stuff to unblock the deployment, but at the end of the day you have deployed, and that’s what matters. (“It is easier to ask for forgiveness than to ask for approval” ~ some corporate senior employee)
It gets worse when the retry keeps failing. You are in a situation where the branch is no longer deployable, the deployment queue is piling up, and people are screaming that they are blocked and cannot work because of you. Well, aside from the fact that people tend to look for excuses not to work, you are under pressure to fix it as soon as possible. Fuckups happen, that’s okay; you add fixes to your target branch, either directly or via pull request – and it takes time for the validation to succeed and the unit tests to pass. After your fixes land on the target branch, you trigger the next deployment, hoping for the best. If it fails again, you keep adding fixes, or even open a case with Salesforce Support – because sometimes “it is not your fault, but it is still your problem” (I love that quote).
The Solution
What can we do to avoid this situation?
We can “complicate” things by doing the deployment first, and only when it succeeds do we merge our changes to the target branch.
The issue is that this is not so easy to implement, and it is not the default way of handling deployments for most tools on the market – so you will need to adapt them to your needs. But once you do, you have, in my opinion, a more bulletproof model of Salesforce CI/CD.
How it works
Open Pull Request > Validation Starts > Trigger deployment > Merge Pull Request
Technical Implementation
From a technical standpoint, the pipeline first does a git checkout of the target branch, then a git pull of the source branch with the no-fast-forward option. At this point we always have a local merge commit – we will use it to figure out what the differences in the pull request are. Most importantly, we now have a combined version of the target and source branches. Deploying from the pure source branch could overwrite items that were added recently, because my source branch was created some time ago and I may have forgotten to pull the target branch, or may not even be able to.
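A minimal sketch of that merge step, assuming main as the target branch and feature/my-change as the source branch (in a real pipeline these names come from your CI variables):

```bash
# Check out the target branch and make sure it is up to date.
git checkout main
git pull origin main

# Pull the source branch with --no-ff so a local merge commit is always created.
# This gives us the combined version of target + source to validate and deploy from.
git pull --no-ff origin feature/my-change
```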
Then we compute the delta and build the package.xml using an open-source plugin named – yes, you guessed it – sfdx-git-delta. It can create the package.xml, the destructiveChanges.xml and the source files. I like to store everything as an artifact for future troubleshooting and audit purposes, and use the package.xml and destructiveChanges.xml for validation and deployment.
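The delta step then looks roughly like this; the exact flags may differ between plugin versions, so check the sfdx-git-delta README for the release you install:

```bash
# Compute the delta between the target branch tip (first parent of the local
# merge commit) and the merge commit itself.
# --output "." writes package/package.xml, destructiveChanges/destructiveChanges.xml
# and (with --generate-delta) the changed source files into the current directory.
sfdx sgd:source:delta --from "HEAD^" --to "HEAD" --output "." --generate-delta
```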
At this point we have all we need, and we perform the validation or the deployment – the only difference is in the sf project deploy start command: we either add the “dry-run” flag for a validation or drop it for a deployment. This also keeps our pipeline code clean.
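A sketch of how that toggle can look in the pipeline; IS_VALIDATION is a hypothetical variable, and the test level and destructive-changes path are assumptions:

```bash
# One deploy step that serves both validation and deployment.
EXTRA_FLAGS=""
if [ "$IS_VALIDATION" = "true" ]; then
  EXTRA_FLAGS="--dry-run"   # validate only, nothing is saved to the org
fi

sf project deploy start \
  --manifest package/package.xml \
  --post-destructive-changes destructiveChanges/destructiveChanges.xml \
  --test-level RunLocalTests \
  $EXTRA_FLAGS
```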
I like to use sfdx-hardis here, as it helps a lot with the surrounding integrations – like emails, JIRA, etc. – and basically I don’t need to worry about anything. All it needs is a few flags set up, plus environment variables and secrets (like API tokens), and that’s it. It will parse the commits and the pull request looking for JIRA IDs, update them as I deploy, and – most importantly – parse the error message if there is one and post it into the Pull Request as a comment, sometimes with a proposed solution for how to fix it. This makes the life of a DevOps engineer a lot easier, especially when you don’t need to maintain all these integrations yourself because the community does that for you (it is an open-source plugin). So big kudos to Nicolas Vuillamy and all the sfdx-hardis contributors (I am one of them, by the way).
So once the validation passes, we pretty much just pass the validation feedback to the developers, and that’s all. When it comes to a deployment, though, we then push the merge commit (which so far exists only locally) to the target branch – after that, the Pull Request shows as merged, since all the changes from the source branch have landed in the target branch.
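The final step is then just a push of that local merge commit (the branch name is a placeholder):

```bash
# Publish the local merge commit; the Pull Request is now effectively merged.
git push origin HEAD:main
```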
GitLab Variation
In some systems, like GitLab, things are a little bit different, as we have something called the “merged results” option. It allows the pipeline to run from the very beginning on the merged result of the source branch and the target branch. In this case we don’t need to do the git pulls; we just compute the delta, perform the validation/deployment, and then, in the case of a deployment, merge the Merge Request (that is what a Pull Request is called in GitLab) via an API call.
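A sketch of that API call from within a GitLab merge request pipeline; GITLAB_TOKEN is a secret you provide yourself, while CI_API_V4_URL, CI_PROJECT_ID and CI_MERGE_REQUEST_IID are predefined GitLab CI variables:

```bash
# Merge the Merge Request once the deployment has succeeded.
curl --request PUT \
  --header "PRIVATE-TOKEN: ${GITLAB_TOKEN}" \
  "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/merge_requests/${CI_MERGE_REQUEST_IID}/merge"
```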
Final Thoughts
I have been using this method for a few years now, on small and medium projects with just a couple of developers, and on big ones with 30-40+ people. It works at either size. It may not be perfect, it can have edge cases, but that’s okay, as it has proven to work and it brings value. It’s your choice whether you want the simple setup or the slightly more complicated one; it’s just good to know you have the option, so you can choose the right one for your needs.
PS: And by the way, if you are having issues with your CI/CD, or you just want this kind of deployment automation implemented in your organization – send me a direct message on LinkedIn, and I will see if I can help you.