Working with many clients on Facebook’s Conversions API (aka CAPI), I often receive questions about how the Event Match Quality is calculated for these events.
To answer this, I usually take a step back and explain how I understand Facebook’s user identification mechanisms to work. The attributes used in the CAPI payload open a window into these mechanisms, which I’ll try to explain here.
It’s also important to note that Facebook isn’t different from other ad platforms. Their advantage (along with Google’s) is their wide reach across multiple devices. With their apps installed on those devices and integrated into websites, the amount of data they amass is beyond what any other ad network can compete with. But that’s for another post.
Let’s dive in.
The basics – Types of user matching
To fully understand how Facebook’s user matching works, we need to go back to Statistics 101 (or later, I don’t remember) and talk about two matching models – Deterministic and Probabilistic.
Deterministic matching is relatively simple – we have complete certainty of the user’s identity, so Facebook can easily attribute the ad view and conversion to a specific user. In the good old days, pre-iOS 14.5, Facebook had your iPhone’s IDFA paired with your Facebook account. This made it possible for them to tie every single ad viewed to a product purchased.
Probabilistic matching is slightly more complex. In a nutshell, it means making an educated guess based on statistical modeling that an anonymous visitor is a certain identified user. Facebook, or any other platform, collects multiple data points about the user’s activity to help feed the model when a new unidentified user is first seen.
Deterministic user attributes
Since the values below offer the best method of identification, Facebook will prioritize using these over anonymous data. Let’s break down these attributes:
Personal information
When setting up the CAPI on Facebook, you can add multiple fields to the events you send, each holding a different piece of user data.
These fields include:
- Email
- Phone number
- First name
- Last name
- ZIP Code
- City
- State
- Country
- Date of Birth
- Gender
- External ID (such as CRM ID)
These fields can help match the user to a specific Facebook user. For example, if you run an ecommerce business that offers phone transactions, you can report these as offline conversions back to Facebook with these attributes. Facebook will then look for that user in their database and attribute the ad activity accordingly.
In practice, only email and phone numbers are valid deterministic identifiers – only one person in the world owns a specific email address/phone number, but many can share the same name.
Because of this, personal information is also used for probabilistic matching. For example, if you run a B2B business and want to send a down-funnel conversion event back to Facebook Ads, you might use this type of user data.
Since the user data you have is most likely their work email and phone, it might not be matched with a specific user in Facebook’s database. However, the user’s full name, gender, and country might give a good enough estimate for Facebook to zero in on a specific user.
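To make the personal information fields more concrete, here is a minimal sketch of how they might be prepared before being sent. As I understand the Conversions API requirements, these values are normalized (trimmed, lowercased, phone reduced to digits with the country code) and then SHA-256 hashed. The short key names (em, ph, fn, ln) and the exact normalization rules are my reading of Facebook’s parameter reference, and the helper names are hypothetical, so verify both against the current documentation.

```python
import hashlib
import re


def sha256_hex(value: str) -> str:
    """Return the SHA-256 hex digest of a UTF-8 string."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def normalize_email(email: str) -> str:
    # Assumption: emails are trimmed and lowercased before hashing.
    return email.strip().lower()


def normalize_phone(phone: str) -> str:
    # Assumption: digits only, no leading zeros, country code already included.
    return re.sub(r"\D", "", phone).lstrip("0")


def hashed_user_data(email=None, phone=None, first_name=None, last_name=None) -> dict:
    """Build a dict of hashed match keys, skipping any field that is missing."""
    normalized = {
        "em": normalize_email(email) if email else None,
        "ph": normalize_phone(phone) if phone else None,
        "fn": first_name.strip().lower() if first_name else None,
        "ln": last_name.strip().lower() if last_name else None,
    }
    return {key: sha256_hex(value) for key, value in normalized.items() if value}
```

Calling hashed_user_data(email=" Jane.Doe@Example.com ") returns a single em key holding a 64-character hex string; the same address typed with different casing or spacing hashes to the same value, which is the whole point of normalizing first.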
Click ID
Another mechanism that Facebook uses is the Click ID (similar to Google’s solution). This ID is passed as a URL parameter that is appended to every outbound link from Facebook’s feed.
You can see that any link you click will be decorated in this manner:
https://www.example.com/cool-product?fbclid=123456789abcdefg
This Click ID enables Facebook to identify the user in a third-party browser (e.g. Safari or Chrome). The flip side of this method is that many users share these links onwards, which can send mixed signals to Facebook – i.e. a friend with a different Facebook account opens a link with a Click ID belonging to me.
Since this method allows deterministic identification, some browsers will actively strip this from the URL.
The Click ID, when available, is stored in a browser cookie, _fbc, that can be retrieved and sent with CAPI events to improve the Event Match Quality.
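When the _fbc cookie isn’t available (for example, the Pixel never ran on the landing page), the same value can be rebuilt from the fbclid parameter on the landing URL. The sketch below assumes the format fb.1.<creation time in milliseconds>.<fbclid>, which is how I understand Facebook documents the cookie; the function name fbc_from_url is hypothetical.

```python
import time
from typing import Optional
from urllib.parse import parse_qs, urlparse


def fbc_from_url(landing_url: str, now_ms: Optional[int] = None) -> Optional[str]:
    """Build an fbc value from a landing URL, or return None if no fbclid is present."""
    query = parse_qs(urlparse(landing_url).query)
    fbclid = query.get("fbclid", [None])[0]
    if not fbclid:
        return None
    if now_ms is None:
        # Ideally this is the time the click was first seen, in milliseconds.
        now_ms = int(time.time() * 1000)
    # Assumed format: version "fb", subdomain index 1, creation time, click ID.
    return f"fb.1.{now_ms}.{fbclid}"
```

For the example link above, fbc_from_url("https://www.example.com/cool-product?fbclid=123456789abcdefg", now_ms=1700000000000) gives fb.1.1700000000000.123456789abcdefg.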
Facebook Login ID
A relatively new addition to the identification options (introduced in 2021) is the Facebook Login ID.
Many websites and apps offer users the option to log in/sign up with their Facebook account, which practically identifies the user.
Using this login option returns several user parameters to the website or app:
Email, full name, and a unique Login ID (per user, per app/website).
Passing this value back to Facebook provides a deterministic identification of the user.
Lead ID
Facebook Lead Ads are a great way to drive direct response conversions. In many cases, these lead ads serve as an entry point for down-funnel conversions. For example, a car dealership might get people to sign up for a test drive and report an actual conversion on the car sale event.
For this reporting to be accurate, we can use the Lead ID provided by Facebook on the conversion event. This Lead ID is reported to your selected integration, e.g., HubSpot or Zapier, and is also a deterministic identifier, as Facebook can tell which user clicked the Lead Ad.
Probabilistic attributes
Unlike the previous attributes, these attributes don’t provide clear identification of the user. Facebook uses them alongside other signals it collects to make educated guesses about who a certain anonymous user is.
FBP Cookie
This cookie, _fbp, is set by the Facebook Pixel once activated on a certain page the user visits. Previously, this cookie was set in a third-party context, but it is now set in a first-party context. This limits Facebook’s ability to create consistent tracking across multiple domains.
Setting this anonymous cookie helps Facebook: once a user is identified, their full activity history can still be traced back to them.
If you inspect the event data sent by the Facebook Pixel, you can see it also sends the screen resolution of the current device. This also helps with the IP/User Agent matching described below.
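Both cookies mentioned so far, _fbp and _fbc, arrive at your server in the Cookie request header, so they can be picked up there and forwarded with the CAPI event. Here is a small sketch using Python’s standard library; the function name and the fbp/fbc output keys are my assumptions about how you would map them into the payload.

```python
from http.cookies import SimpleCookie


def facebook_cookies(cookie_header: str) -> dict:
    """Extract the _fbp and _fbc cookie values from a raw Cookie header."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    found = {}
    # Map cookie names to the (assumed) payload keys they are sent under.
    for cookie_name, payload_key in (("_fbp", "fbp"), ("_fbc", "fbc")):
        if cookie_name in jar:
            found[payload_key] = jar[cookie_name].value
    return found
```

A visitor who never clicked a Facebook link will typically have only _fbp, in which case the returned dict simply has no fbc key.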
IP & User Agent
These two attributes are available by default on your browser. The IP address is used for serving the requested content back to your device (on a certain network).
The User-Agent is a short string that identifies the type of browser and device you’re using so that the server you’re communicating with serves your content in the right format.
For example:
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36
Contrary to popular belief, these don’t serve as unique identifiers. For example, multiple devices on a single Wi-Fi network (e.g. an office space) will share the same IP. Devices might differ by their User Agent, but that too can become meaningless on a network with dozens of users, where multiple devices will share the same IP and User Agent.
Because these are weak identifiers, Facebook only accepts these attributes if both are sent in parallel.
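In code, that rule can be expressed as an all-or-nothing helper: include the pair only when both values are known. The key names client_ip_address and client_user_agent are the ones I understand the CAPI payload to use; the function name is hypothetical.

```python
from typing import Optional


def client_signals(ip_address: Optional[str], user_agent: Optional[str]) -> dict:
    """Return the IP and User Agent as a pair, or nothing if either one is missing."""
    if ip_address and user_agent:
        return {
            "client_ip_address": ip_address,
            "client_user_agent": user_agent,
        }
    # Sending only one of the two adds no matching value, so send neither.
    return {}
```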
So how can Facebook use these attributes?
Say you’re on a network with a few devices, such as your home IP. On this network, the User Agent attribute is more likely to be unique and can be matched back to a specific user with greater probability.
Try this – open up an incognito session on your phone’s browser. This way, no cookies exist previously, and no Facebook ID exists. Now visit a site that is heavily invested in remarketing, for example, BlendJet (as of August 2022). Browse through a couple of pages on the site and exit the browser. Now, open up Facebook. Before you know it, you will be swarmed with remarketing ads by BlendJet, all based on your IP & User Agent. Boom.
Using these attributes
When working with Facebook’s CAPI, you should use as many data points as possible. It’s that simple. I understand there are scenarios where this isn’t simple or possible, so try and prioritize these.
Deterministic values are preferred. Whenever you can send these, I recommend you do. The probabilistic values reside in the user’s browser and are usually easier to collect, but require you to store them actively. I recommend grabbing these as hidden fields on a form submission. Note that the IP address isn’t available directly in the browser; it is only visible to the server receiving the request.
In B2B cases, the personal information (email, phone number) is tricky to use for matching as it will likely differ from the data in the user’s Facebook profile (you don’t log into Facebook with your work email). In this case, the Click ID and Lead ID are essential.
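Put together, a server-side event that carries whatever identifiers happen to be available might be assembled like this. Treat it as a sketch under assumptions: the field names (event_name, event_time, action_source, user_data, and the keys inside user_data such as em, fbc, fbp and lead_id) reflect my understanding of the CAPI payload, build_event is a hypothetical helper, and actually posting the result to Facebook is left out.

```python
import time
from typing import Optional


def build_event(
    event_name: str,
    user_data: dict,
    event_time: Optional[int] = None,
    action_source: str = "website",
    event_source_url: Optional[str] = None,
) -> dict:
    """Wrap one event in the (assumed) CAPI request body, dropping empty match keys."""
    # Keep only the identifiers we actually have; empty keys add nothing.
    clean_user_data = {key: value for key, value in user_data.items() if value}
    event = {
        "event_name": event_name,
        "event_time": int(event_time if event_time is not None else time.time()),
        "action_source": action_source,
        "user_data": clean_user_data,
    }
    if event_source_url:
        event["event_source_url"] = event_source_url
    return {"data": [event]}
```

For a B2B down-funnel event coming out of a CRM, user_data would typically hold the fbc and lead_id values alongside the hashed work email, so that the event can still be matched even if the email isn’t.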
Facebook used to present a score in the Events Manager that ranked your Event Match Quality based on the number of attributes passed, regardless of their relevance/accuracy. A linear scale now replaces it, but I still tend to ignore it when doing CAPI implementations for clients.
On the significance of sending all keys
This case is an interesting example from a client of mine. They send offline events from their CRM for down-funnel user actions.
Data sent to Facebook was based only on users with the FBCLID present. We also sent the user’s email address as an additional match key. The events are sent using Zapier (yellow line) and received in the Facebook Events Manager (red line) which logs all events. Events reported in the Ads Manager (blue line) are attributed to actual campaigns.
On March 19th, the scheduled query sending the data to Facebook was changed in a way that corrupted the email value. Only the FBCLID was now present as a match key, and the match rate from events sent to events attributed plummeted.
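A cheap guard before sending could flag this kind of breakage early. The sketch below is not what the client ran; it is a hypothetical check that assumes hashed keys such as em should always be 64-character lowercase SHA-256 hex strings, and reports any required key that is missing or doesn’t look like one.

```python
import re

SHA256_HEX = re.compile(r"[0-9a-f]{64}")


def missing_or_invalid_keys(user_data: dict, required_hashed_keys=("em",)) -> list:
    """Return the required hashed keys that are absent or not valid SHA-256 hex."""
    problems = []
    for key in required_hashed_keys:
        value = user_data.get(key)
        # Accept either a single string or a list of strings for each key.
        values = value if isinstance(value, list) else [value]
        if not all(isinstance(v, str) and SHA256_HEX.fullmatch(v) for v in values):
            problems.append(key)
    return problems
```

Logging or alerting whenever this list is non-empty means a broken query shows up on the day it breaks, rather than as a slow drop in attributed events.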