Tried using a printed map lately?
Why on earth would you when you can enter a location on your phone and know exactly when and where to turn?
The same applies to manual lead scoring versus AI scoring. You can try to guess which data points correlate to revenue potential, or let AI tell you exactly what they are.
Relying on humans to score revenue opportunities may have worked a decade ago, but it’s not working for product-led growth.
There’s a new sheriff in town: artificial intelligence (AI). And no, not ChatGPT.
Using artificial intelligence to score leads helps your team find opportunities that humans can’t see, and deprioritize the ones wasting your team’s time.
Early results from implementing AI scoring models show a lift in net-new pipeline generated and a productivity boost for revenue teams.
This article showcases:
- How human scoring hurts revenue.
- Differences between human and AI scoring.
- How AI lead models work.
- Real examples from PLG companies that have implemented AI-based scoring models.
The old way of scoring leads
Most of us still score leads this way.
First, marketing comes up with a list of criteria for scoring leads and accounts. Criteria come from gut instincts, persona documentation, and subjective interpretation of customer behavior.
For sales-led companies, criteria might look like these (aka MQLs):
- Attended webinar
- Title is VP+
- Function is Sales
- Visited our website 6 times
- Downloaded a whitepaper
For PLG companies, the same may apply, but the focus shifts toward product data (aka PQLs):
- Logged into product 5x in past 2 weeks
- Invited 2 users
- Uploaded 5 documents (if you were Dropbox, for example)
- Tried a premium feature
Once the criteria are known, marketing (or sales leaders) assign a weight, in the form of points, to each of them.
Based on how many are true for any given lead, an arbitrary “lead score” (usually out of 100) is compiled.
When scores jump above a blindly agreed-upon threshold, leads get passed to sales through a custom CRM field or via alerts (email or Slack).
For example: “all leads that have a score above 75 need to be acted on by SDRs”.
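Boiled down, the manual approach is just a weighted checklist. Here's a minimal sketch in Python — the criteria, weights, and threshold below are hypothetical examples, not anyone's actual model:

```python
# Minimal sketch of a manual, rule-based lead score.
# All criteria, weights, and the threshold are hypothetical examples.
CRITERIA_WEIGHTS = {
    "attended_webinar": 15,
    "title_vp_plus": 25,
    "function_is_sales": 10,
    "visited_site_6_times": 20,
    "downloaded_whitepaper": 10,
    "tried_premium_feature": 20,
}

SDR_THRESHOLD = 75  # "all leads above 75 get acted on by SDRs"

def score_lead(lead: dict) -> int:
    """Sum the weights of every criterion that is true for this lead."""
    return sum(w for crit, w in CRITERIA_WEIGHTS.items() if lead.get(crit))

lead = {"title_vp_plus": True, "visited_site_6_times": True,
        "tried_premium_feature": True, "attended_webinar": True}

score = score_lead(lead)
print(score, score >= SDR_THRESHOLD)  # 80 True
```

Notice what the sketch makes obvious: the weights are hard-coded guesses, and nothing in the system ever checks them against actual revenue outcomes.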
The untold problem here is that scores are passed, but the context is not.
According to Ryan Milligan, Senior Director of Rev Ops at QuotaPath:
“Scores alone don’t explain what actions a human took and, more importantly, how your reps should reach out and engage with that person. In other words, what someone did as a human is far more important than whatever their actual score might be.”
Here’s why this needs to stop 👇
Problems with human scoring
PLG has too much data
Picture this: You’re playing follow the cup.
A ball is placed under one of 3 red cups. The gamemaster starts moving them around for a while and you need to keep track of the cup with the ball.
Now imagine the same game with 1000 cups and 5 balls. That’s what it's like to score leads in a product-led motion.
PLG exponentially increases the amount of data factored into the score for 2 reasons:
1. It adds 1 more layer of complexity to the weighting formula.
Scoring used to be a 2D world: ICP data + marketing data.
With PLG, scoring becomes 3D: ICP data + marketing data + product usage data.
Also, correlations between a signal and the lead’s intent to purchase aren’t as clear-cut as a demo request. There are more users and accounts to sift through, lots of noise, and relevant trends hiding out of sight.
2. More users = more data
When you open up your funnel to freemium or free trials, you inherently get more users, and thus more leads.
Consequently, all 3 layers of your scoring model will have to factor in more data overall.
Inability to analyze large data sets
Humans are great at many things.
Finding trends & correlations in a sea of ever-changing data points is not one of them.
At a certain scale, traditional scoring systems inherently start to fail because we humans don’t have the ability to find imperceptible trends.
We are unconsciously (or consciously) biased
Humans have biases. Growth and marketing folks likely have biases about which criteria are most useful for sales and how heavily to weight each signal in the model. Sales reps are biased toward the leads the model sends them and who they consider a good fit.
You may think, for example, that fintech accounts are great prospects. Perhaps because you have one flagship fintech company as a customer. But one success does not mean repeatable, scalable success.
These biases might not correlate to potential revenue yet weigh heavily in your scoring model.
Manual scores don’t adapt to new data
“But wait! I have a data science team!”
Fair enough. Until you ship a new product update, change your onboarding, update pricing plans, or rename tables in your warehouse.
Data teams CAN find trends…but in fixed data sets! Updating the model as the customer experience changes is almost impossible at scale.
Iterating manual scores takes way too long
Furthermore, teams responsible for making use of the score, like sales and marketing, don’t control what goes into it. Communication needs to be incredibly efficient between sales and growth/data for tactical feedback to be factored into scoring updates. Even then, the workload on your data team is massive.
Let’s assume that I haven’t yet convinced you of the inability of your data team to build relevant scores for sales. It still takes weeks, if not months, for data teams to build or iterate upon adequate scoring models.
Result: By the time your model is ready to go live, it’s outdated.
Human-generated lead scores have horrible consequences on pipeline health and sales efficiency. Here’s exactly how it’s chipping away at your revenue 👇
Impacts of human lead scoring
Good opportunities never surfaced
One of the unavoidable consequences of having thousands, if not millions of users is that some of them fall through the cracks.
In a recent case study, Beacons used Calixa’s AI lead scoring model to surface the top 15% of users and drive upsells. Before implementing the model, most upsell opportunities went unrealized because sales would blindly cherry-pick which accounts to focus on.
Reps focus on bad opportunities
On the surface, some leads might look good: Company X has hundreds of employees, lead has checked your pricing page, and used the product a lot.
But chances are, they want to stay on your free plan forever and have no intention to convert, or don’t have approval from their boss to buy your product.
Scores aren’t used at all
It happens more often than not.
Step 1: Data & growth teams spend countless hours building a model.
Step 2: The model doesn’t produce great results.
Step 3: Sales activate their “this is useless” switch.
Step 4: They start viewing scoring alerts as noise.
Step 5: The score becomes a custom attribute that no one checks on CRM records.
AI to the rescue! Introducing the new superpower behind fast-growing PLG companies 👇
How AI lead scoring works
For decades, marketing passed over an arbitrary score to sales, leading to reps losing trust in scoring altogether. Calixa opens that black box and gives sales what they need to find, engage and close the right opportunities.
TLDR? Watch this webinar on How to score PQLs & PQAs using AI 👇
Data to lead in seconds
Calixa’s AI-Powered Prospecting examines all your data sources (firmographic, product, intent, etc.) and trains the model to identify predictive data signals that correlate with your ideal customer.
Instead of painstakingly building a model with in-house resources, you can leverage Calixa’s decade of ML expertise.
No need for any overhead from ops and data teams.
Curious about AI for lead generation? Read this 👈
Finding correlations between data and revenue
AI finds hidden buying signals that would have been impossible to find manually. It then weighs the impact that each signal has on each account’s score.
Signal examples (out of thousands of possibilities):
- Your best-paying customers used feature A a total number of X times within their first 7 days of signing up.
- Opening and resolving 3 support tickets within the first 30 days correlates to an expansion opportunity.
- Account with over Y employees in the finance industry should be considered for a conversion opportunity if they surpass 10 users during their free trial.
The model takes out human error by taking an unbiased look at what REALLY drives sales potential, and improves upon itself as new conversion events happen.
The advantage of AI is that the model gets smarter over time. The scoring model will continually adapt to the attributes and trends of your existing customers as your product and business change.
Every action reps take trains the model to understand which deals lead to closed won, so it can constantly optimize scores to fine-tune your revenue engine.
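Calixa’s actual model is proprietary, but the general technique — learning signal weights from historical conversion outcomes instead of guessing them — can be illustrated with a simple logistic regression. Every feature name and data point below is made up:

```python
# Illustrative only: a logistic regression that learns how strongly each
# signal correlates with conversion. Calixa's actual model is proprietary;
# feature names and data here are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Columns: feature_a_uses_week1, support_tickets_30d, trial_users
X = np.array([
    [12, 3, 14],  # converted
    [10, 2, 11],  # converted
    [1,  0, 1],   # did not convert
    [0,  1, 2],   # did not convert
    [15, 4, 12],  # converted
    [2,  0, 3],   # did not convert
])
y = np.array([1, 1, 0, 0, 1, 0])  # historical conversion outcomes

model = LogisticRegression().fit(X, y)

# The learned coefficients ARE the signal weights -- no human guessing.
for name, coef in zip(["feature_a_uses_week1", "tickets_30d", "trial_users"],
                      model.coef_[0]):
    print(f"{name}: {coef:+.2f}")

# Score a new account as a probability of converting.
new_account = np.array([[11, 2, 10]])
print("conversion probability:", model.predict_proba(new_account)[0, 1])
```

The key contrast with the manual approach: retraining on fresh conversion data updates the weights automatically, which is what lets an AI model keep up as your product and customers change.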
Giving reps the context they need to act
Context is key to advancing a deal. Instead of typical generic outreach, reps can look at what data points led to a high PQL or PQA score and use that information to craft the perfect sales narrative.
Calixa provides an easily digestible overview of what the lead has done that makes it qualified for sales, so reps can quickly understand WHY and HOW they should reach out.
Instant adjustments to lead flow
Need more leads? Or want to test to see if a secondary product action actually is more meaningful than you first assumed?
There are 2 common scenarios with human-made scoring gone wrong:
1. Sales have too many leads, can’t focus, and discard them as noise.
2. Scoring doesn’t find enough leads and pipeline gen falls short.
With Calixa, GTM teams can compare how PQLs/PQAs with different product signals respond to sales, and tweak the model accordingly.
Revenue teams using Calixa can easily and self-sufficiently broaden and narrow the scope of the scoring model to adjust lead flow or test new assumptions.
Adjustments to the model can be done without being gated by ops or data. No more waiting for a quarter or 2 in between iterations.
Result? Reps get quality opportunities in front of them, understand how to approach leads in seconds, and have better quality conversations.
How Netlify drives revenue with Calixa’s AI scoring model
Using ML scores to prioritize outreach
Netlify’s sales teams use Calixa to find sales opportunities in their existing user base. With hundreds of thousands of signups a month, it’s easy for Netlify reps to get lost in a sea of data, not knowing which product data points matter.
Filtering account lists with ML (below) helps them go from insight to action.
Based on the reps' bandwidth, accounts can be filtered based on their PQA score, so the strongest accounts are prioritized first.
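In spirit, that prioritization amounts to sorting accounts by score and working from the top down to the reps’ capacity. A tiny sketch with hypothetical account data:

```python
# Hypothetical sketch: rank accounts by PQA score, then take only as many
# as reps have bandwidth for. Account names and scores are made up.
accounts = [
    {"name": "Acme", "pqa_score": 4},
    {"name": "Globex", "pqa_score": 5},
    {"name": "Initech", "pqa_score": 2},
]

rep_bandwidth = 2  # how many accounts reps can work this week

# Strongest accounts first, capped at available bandwidth.
queue = sorted(accounts, key=lambda a: a["pqa_score"], reverse=True)[:rep_bandwidth]
print([a["name"] for a in queue])  # ['Globex', 'Acme']
```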
AI finds sales-ready leads that humans would overlook
Below is an example of an account that has a PQL score of 5, meaning that its revenue potential is high.
If you look at the surface of this account (below), usage is trending down. Prospects seem to be disengaging with the product. An account such as this one may often be overlooked or deprioritized by sales reps.
In this case, overall usage is misleading.
First, a high proportion of the account’s total users are active. Second, key features are being utilized. As you can see below, docs created and storage used are both up.
The AI model has evaluated that they heavily impact revenue potential, so although overall usage is down, relevant usage is up!
Reps can use this context to craft their sales pitch in a way that resonates with the right decision-makers at the account. Without the help of AI, this account might have never been touched by sales and potential revenue would have been left on the table.
Avoiding false positives
Conversely, some accounts (like the one below) should be left alone, even though reps have a natural tendency to throw themselves at them.
From an initial sales analysis of this account, things seem great. Usage is up, the company has hundreds of employees, and they’ve even hit paywalls!
Don’t be fooled.
Digging into the PQL scoring context, we uncover that most users registered with personal email addresses, the account has been on the free plan for ages, and active users are declining.
For a sales rep, this translates into spending effort on an account that most likely intends to stay on the free plan forever.
Even if the company is of decent size, account executives should wait until there are more buying signals before engaging.
How to get your AI lead model
Hopefully, I’ve illustrated how manual scoring can hurt your sales pipeline, and how AI scoring can find the right leads for your reps and give them the context they need to drive more sales conversations.
Setting up Calixa’s AI scoring model is neither complicated nor expensive. Here’s what to do: