The Complexity Of The On Line Second Price Auction For Sponsored Search: Myerson Olkin Lemma

2024-02-11 · Leonardo Benicio

Summary

A comprehensive technical exploration of the complexity of the on line second price auction for sponsored search: myerson olkin lemma, covering key concepts, practical implementations, and real-world applications.

Contents

Sponsored Search Auctions: From GSP to the Myerson–Olkin Lemma and Online Revenue Optimization

When you type a query into Google, a lightning-fast auction takes place behind the scenes. In just a few hundred milliseconds, a system that processes hundreds of billions of dollars annually decides which advertisements you see, in what order, and how much the advertisers pay for each click. Sponsored search auctions are the economic engine of the modern internet, simultaneously funding search engines, e-commerce platforms, and social networks. At the heart of this engine lies a mechanism known as the generalized second price (GSP) auction, a clever adaptation of the classic second price auction to the multi-slot, per-click world of online advertising. But as scalable and profitably as GSP runs in practice, its theoretical underpinnings are surprisingly intricate. The auction is not strategy-proof; advertisers must reason about each other’s bids, and the equilibrium structure is complex. When we move from static textbook models to the messy reality of online, real-time bidding—where bidders arrive and depart, budgets refresh, click-through rates evolve, and distributions are unknown—the complexity grows exponentially. Understanding this complexity, and developing algorithms that can provably achieve near-optimal revenue in such dynamic environments, is one of the most important challenges in algorithmic economics today. In this post, we will explore a powerful analytical tool—the Myerson–Olkin lemma—that reveals the hidden convex structure of the revenue optimization problem in online second price auctions for sponsored search. This lemma not only simplifies the computation of optimal reserve prices but also underpins the design of efficient online learning algorithms. Along the way, we will dissect the complexity of the problem, from the combinatorial explosion of equilibrium analysis to the subtle statistical difficulties of learning in the dark.

Why Should You Care?

Sponsored search is not a niche topic. In 2024, search advertising accounted for nearly $300 billion in global revenue, representing roughly 40% of the entire digital advertising market. Google alone processes over 8.5 billion searches per day, each one triggering an auction that lasts only tens of milliseconds. The decisions made in those milliseconds affect not only Google’s bottom line but also the viability of millions of businesses that rely on paid search traffic. For every dollar an advertiser spends on search ads, they expect a return of several dollars in profit—yet even a small inefficiency in the auction design can translate into billions of dollars of lost economic value worldwide.

Beyond the sheer scale, sponsored search auctions are a fascinating intersection of economics, game theory, and computer science. They are one of the most successful real-world applications of mechanism design, but they also expose the limitations of classical theory when faced with practical constraints: high-dimensional state spaces, nonstationary bidder populations, budget constraints, and the need for low-latency, online learning. If you have ever wondered how Google decides which ads to show and how much to charge, or how online marketplaces like eBay and Amazon design their product placement auctions, this post will give you the analytical tools to understand those systems. And if you are a researcher or engineer working on advertising platforms, the Myerson–Olkin lemma is a cornerstone for building revenue-optimizing algorithms that are both provably near-optimal and computationally tractable.

The Mechanics of Sponsored Search Auctions

Before diving into the theoretical depths, let’s establish a concrete model of a sponsored search auction. The standard setup is as follows:

There are ( k ) ad slots (positions) on a search results page. Typically, slots are ordered from top to bottom, and higher slots receive more clicks. The click-through rate (CTR) of slot ( i ) is denoted ( \alpha_i ), with ( \alpha_1 \geq \alpha_2 \geq \dots \geq \alpha_k > 0 ).
There are ( n ) advertisers (bidders). Each advertiser ( j ) has a private value ( v_j ) for a click (i.e., the maximum they are willing to pay for a click).
Advertisers submit bids ( b_j ) (which may differ from their true values). The auctioneer allocates slots based on these bids: the highest bidder gets the top slot, the second highest gets the second slot, and so on.
Payments are determined by a pricing rule. In the generalized second price (GSP) auction, an advertiser who wins slot ( i ) pays per click an amount equal to the bid of the next highest bidder (the one who would have gotten slot ( i+1 )). If there are fewer advertisers than slots, the last advertiser pays a reserve price set by the auctioneer (or zero if no reserve is used).
The total payment for a winning advertiser is the per-click price multiplied by the CTR of their slot.

A Concrete Example

Suppose there are two slots with CTRs ( \alpha_1 = 0.9 ) (top) and ( \alpha_2 = 0.5 ) (bottom). Three advertisers with values: ( v_A = $10 ), ( v_B = $8 ), ( v_C = $5 ). They bid truthfully (not necessarily rational, but for illustration). The highest bid is $10 (A), second $8 (B), third $5 (C). So A gets top slot, B gets bottom slot, C gets nothing. A pays per click the bid of B: $8. So A’s total payment = ( 0.9 \times 8 = $7.20 ). B pays per click the bid of C: $5, total = ( 0.5 \times 5 = $2.50 ). Revenue = $9.70.

Now, what if A instead bids $9? Then ordering becomes B ($8) top, A ($9?) No, bids are sorted descending: B $8 top, A $9? Actually careful: highest bid gets top slot. So if A bids $9, B bids $8, then A is still highest (9 > 8). A still gets top slot but now pays B’s bid (8) per click—same as before! So revenue unchanged. This illustrates that in GSP, the payment for a given advertiser depends only on the bids of others, not their own bid, as long as they remain in the same position. This is analogous to the second price auction for a single item, but with multiple slots.

GSP vs. VCG

The Vickrey-Clarke-Groves (VCG) mechanism is the canonical strategy-proof auction for multiple items. In a VCG auction, each advertiser pays the externality they impose on others—the loss in social welfare that results from their presence. For sponsored search, the VCG payment for an advertiser in slot ( i ) is more complex than GSP’s simple “pay the next bid” rule. Interestingly, it is known that GSP is not strategy-proof; an advertiser can sometimes benefit by misreporting their bid. However, GSP has two practical advantages: it is simpler to explain to advertisers (pay the price set by the next highest bidder), and it often yields higher revenue than VCG in equilibrium (as shown by [Edelman, Ostrovsky, and Schwarz 2007] and [Varian 2007]). In fact, the “local envy-free” equilibria of GSP correspond to VCG outcomes, but there exist other equilibria that can yield more revenue.

The revenue equivalence theorem does not hold here because of the asymmetry in CTRs. This makes GSP particularly interesting for revenue optimization: the auctioneer can use reserve prices to increase revenue, but the equilibrium behavior of bidders changes in response.

The Complexity of Equilibrium

A fundamental challenge in sponsored search auctions is that GSP is not strategy-proof. Advertisers must form beliefs about each other’s values and bids, and the resulting equilibrium is not unique. The standard approach in the literature (e.g., [Edelman et al. 2007] and [Varian 2007]) focuses on symmetric Nash equilibria (SNE) or locally envy-free equilibria (LFE). In an LFE, no advertiser would benefit by swapping positions with the advertiser immediately above or below them, given their own current payment and CTR. This equilibrium concept is natural because it captures the idea that advertisers cannot envy their neighbors’ outcomes.

However, even with LFE, the equilibrium structure is rich. For a fixed set of bids and reserve prices, multiple equilibria can exist, and the auctioneer’s revenue can vary dramatically across them. The combinatorial explosion becomes apparent when we consider more slots and bidders. For ( k ) slots and ( n ) bidders, the number of possible orderings is ( n!/(n-k)! ), and each ordering corresponds to a different payment vector. In the worst case, analyzing all possible equilibria is exponential.

Moreover, the auctioneer’s problem—choosing reserve prices to maximize expected revenue—is complicated by the fact that the bidders’ equilibrium strategies depend on those reserve prices. The auctioneer must essentially solve a bilevel optimization problem: at the upper level, set reserves; at the lower level, bidders play a Bayesian game whose equilibrium determines revenue. This is computationally hard in general.

Revenue Optimization: The Role of Reserve Prices

Reserve prices are a crucial lever for the auctioneer. In a single-item second price auction, Myerson (1981) showed that the optimal auction (assuming independent private values) is a second price auction with a reserve price equal to the monopoly price of the value distribution. More precisely, if values are drawn from a distribution ( F ) with density ( f ), the optimal reserve ( r^* ) satisfies:

[ r^_ = \frac{1 - F(r^_)}{f(r^*)} ]

(or, equivalently, the inverse of the hazard rate). This is the price that maximizes the seller’s expected revenue when facing a single bidder (the “monopoly” price). For multiple bidders, the same reserve price is applied, and the auctioneer earns revenue equal to the second highest bid (or reserve if only one bidder meets it).

For sponsored search with multiple slots, the optimal reserve price is more involved. Myerson’s theory generalizes via virtual valuations. For each advertiser, define:

[ \phi(v) = v - \frac{1 - F(v)}{f(v)} ]

The optimal auction in a single-item setting allocates the item to the bidder with the highest virtual value (provided it is non-negative). The reserve price ensures that virtual values are non-negative; otherwise, the item is not allocated.

Now, for multiple slots, the picture changes. In a GSP auction with ( k ) slots, the optimal reserve price is not simply the same for all slots. Intuitively, the top slot receives many clicks, so the auctioneer might want a higher reserve to extract more surplus from the high-value bidder. But a higher reserve might discourage bidding. The joint optimization of ( k ) reserve prices (one per slot) is a multi-dimensional problem.

One approach is to use position-specific reserves. Let ( r1, r_2, \dots, r_k ) be per-slot reserves. The GSP auction works as before, but now the payment for the advertiser in slot ( i ) is the maximum of the next highest bid and ( r{i+1} ) (since the slot below has its own reserve). This complicates the analysis because the reserves affect both allocation and payments.

The key insight, which brings us to the Myerson–Olkin lemma, is that despite the complexity, the expected revenue as a function of reserve prices has a convex structure when viewed through the right lens. This convexity allows us to use gradient-based optimization and online learning algorithms.

The Myerson–Olkin Lemma: Convexity of Revenue

The Myerson–Olkin lemma is a lesser-known but powerful result in auction theory. It combines Myerson’s optimal auction framework with Olkin’s lemma (1970s) on the convexity of expectations under monotone transformations. To present it, we need a bit of notation.

Consider a second price auction with a single slot and ( n ) bidders with i.i.d. private values from distribution ( F ). The expected revenue when the auctioneer sets a reserve price ( r ) is:

[ R(r) = \mathbb{E}[ \max{ \text{second highest bid}, r } \cdot I(\text{at least one bid} \ge r) ]. ]

(This is for the case where the auctioneer keeps the item if no bid meets the reserve.)

It can be shown that ( R(r) ) is not necessarily concave in ( r ), but it is convex in the probability of sale ( q = 1 - F(r)^n ). This is a consequence of Olkin’s lemma: the expected revenue is a convex function of the cumulative distribution function evaluated at the reserve.

Formally, define ( q(r) = Pr(\max_i v_i \ge r) ). Then Myerson’s optimal reserve satisfies:

[ r^* = \text{argmax}_r R(r) = \text{argmax}_q \tilde{R}(q) ]

where ( \tilde{R}(q) ) is convex in ( q ). This transformation is what we call the Myerson–Olkin lemma.

Now, how does this help for sponsored search? In a multi-slot GSP auction, the revenue can be expressed as a function of the vector of reserves ( \mathbf{r} = (r_1,\dots,r_k) ). The key observation is that the expected revenue is convex in the vector of probability of filling each slot (i.e., the probability that at least ( i ) bidders bid above ( r_i ), etc.). More precisely, let ( q_i = Pr( \text{slot } i \text{ is filled} ) ). Under certain independence assumptions (bidders’ values are i.i.d. and bids are truthful? Actually, we need to consider equilibrium strategies, but for the purpose of learning reserves we often assume bids are equal to values—i.e., we consider the ex ante optimal reserves assuming truthful reporting. In practice, reserves are set before bidders adjust their bids, and then bidders best-respond. The Myerson–Olkin lemma applies to the expected revenue conditional on the distribution of values, ignoring strategic bid shading. This is a common approximation in the online learning literature: treat the observed bids as “values” (or transformed values) and learn reserves that would be optimal for a second price auction with those values.

The convexity property is crucial because it allows us to use stochastic gradient descent or online convex optimization to learn the optimal reserves. Without convexity, we would be stuck with non-convex optimization, which is NP-hard in many cases.

A Lemma Statement (Informal)

Myerson–Olkin Lemma (for single slot): Let ( F ) be the distribution of a single bidder’s value. The expected revenue ( R(r) ) from a second price auction with two bidders (or ( n ) bidders) is a convex function of ( q = 1 - F(r) ) (the probability that a given bidder exceeds the reserve). More generally, for ( n ) i.i.d. bidders, ( R ) is convex in the vector of order statistic probabilities.

For multi-slot GSP: Under the assumption that the allocation rule is monotone in bids and that the payment rule is linear in the next highest bid (which holds for GSP), the expected revenue is a convex function of the vector ( \mathbf{q} = (q1,\dots,q_k) ), where ( q_i = F{n-i+1}(r_i) )?

Actually, the precise statement involves the probability that the ( i )-th highest bid exceeds the reserve ( r_i ). Since the GSP payment for slot ( i ) is the ( (i+1) )-th highest bid (or reserve ( r_{i+1} )), the expected revenue is a sum of expectations of functions of order statistics. Applying Olkin’s lemma componentwise yields convexity in these probabilities.

Online Learning in Dynamic Environments

So far, we have assumed the auctioneer knows the value distribution ( F ). In practice, this distribution is unknown and may change over time due to seasonality, new advertisers, or changes in user behavior. The auctioneer must learn the optimal reserve prices online, using past observations to adjust future reserves.

This is a classic multi-armed bandit problem, but with a twist: the revenue function is not stationary (since bidders adapt their bids), and the action space is continuous (reserve prices). Moreover, the revenue is not directly observable per bidder; we only see the realized payments, which are noisy.

The Myerson–Olkin lemma comes to the rescue again. Because the expected revenue is convex in the probability-of-sale vector ( \mathbf{q} ), we can treat ( \mathbf{q} ) as the decision variables. The auctioneer can maintain estimates of the probabilities ( q_i ) (how often a slot is filled) and use gradient-based updates. For example, a simple EXP3 (Exponential-weight algorithm for Exploration and Exploitation) or UCB (Upper Confidence Bound) can be adapted to continuous actions by discretizing the reserve prices and applying the convex structure.

A more elegant approach is to use bandit convex optimization. In each round (e.g., each day), the auctioneer chooses a reserve price vector ( \mathbf{r}_t ), observes the total revenue ( R_t ) (a noisy estimate of the expected revenue), and then updates ( \mathbf{r}_t ) using a stochastic gradient step. Because the expected revenue is convex in ( \mathbf{q} ), the gradient of ( R ) with respect to ( \mathbf{q} ) can be estimated via a one-point (or two-point) gradient estimate. The algorithm Bandit Gradient Descent (BGD) achieves regret ( O(T^{3/4}) ) in the worst case, but with strong convexity (which we have from the lemma plus extra regularity), we can get ( O(\sqrt{T}) ) regret.

A Concrete Online Algorithm

Let’s design a simple online algorithm for learning a single optimal reserve price (one slot case). We assume the auctioneer uses a second price auction with reserve ( r_t ) each day, and observes the highest bid and second highest bid (or only the revenue). We want to minimize cumulative regret relative to the best fixed reserve ( r^* ).

Step 1: Discretize the reserve price space into ( M ) points ( r^1, \dots, r^M ). The number ( M ) can be chosen based on the desired accuracy.

Step 2: Use EXP3 or UCB. But we can do better by leveraging convexity. Instead, we can maintain a probability distribution over the ( q )-space (the probability of a bidder exceeding the reserve). Since ( q = 1 - F(r) ), and ( F ) is unknown, we have to estimate ( q ) for each ( r ). A direct approach is to use a kernel-based method or simply maintain a histogram of bids.

Alternatively, we can use the mirror descent algorithm on the simplex of probabilities. Specifically, let ( \mathbf{p}_t = ) distribution over reserves (or over ( q ) values). We receive a loss (negative revenue) ( -\tilde{R}_t ). Since the loss is convex in ( q ), we can update using ( \nabla ) estimated via finite differences.

For a practical implementation, many ad platforms use a simpler strategy: reserve price learning via gradient ascent on empirical revenue. Google, for instance, adjusts its reserve prices daily based on historical data, effectively performing a stochastic gradient descent. The theoretical underpinnings are precisely the Myerson–Olkin convexity.

Code Snippet: Approximate Gradient Descent for Reserve Price

Below is a Python-like pseudocode for learning a single reserve price using bandit feedback. We assume we can test a reserve price, observe the total revenue (sum of payments from all advertisers), and we want to find the optimal reserve in a discretized range.

import numpy as np

def run_auction(reserve, valuations):
    # valuations: array of bidder values (n)
    # second price with reserve: highest bid wins, pays max(second highest, reserve)
    sorted_vals = np.sort(valuations)[::-1]  # descending
    if len(sorted_vals) == 0:
        return 0
    if sorted_vals[0] < reserve:
        return 0
    if len(sorted_vals) >= 2:
        price = max(sorted_vals[1], reserve)
    else:
        price = reserve
    return price  # in per-click? Assume single click?

For simplicity, assume each auction has exactly one click (i.e., one impression). The revenue is the payment of the winner.

def gradient_estimation(current_r, epsilon=0.01, T_s=100):
    # Estimate gradient using two-point method
    # We evaluate R(r+delta) and R(r-delta) for small delta
    r_plus = current_r + epsilon
    r_minus = current_r - epsilon
    # Sample T_s auctions at each point (simulate from unknown distribution)
    rev_plus = np.mean([run_auction(r_plus, sample_values()) for _ in range(T_s)])
    rev_minus = np.mean([run_auction(r_minus, sample_values()) for _ in range(T_s)])
    grad = (rev_plus - rev_minus) / (2 * epsilon)
    return grad, rev_plus, rev_minus

Then update current_r += learning_rate * grad. This is essentially a stochastic gradient ascent. In the online setting, we perform one auction per time step, so we cannot average over many samples. We use a single sample, giving a noisy gradient estimate. The Myerson–Olkin convexity ensures that the expected gradient points towards the optimum.

Putting It All Together: An Example Implementation

Let’s simulate a simple sponsored search environment with two slots and two bidders, where the auctioneer learns two reserve prices (one per slot) using the convexity property. We’ll assume bidders report true values (i.e., we ignore strategic bidding for this simulation; in practice, bidders shade, but the convexity holds under certain equilibrium assumptions). The environment generates i.i.d. values from a known distribution (e.g., exponential). The auctioneer’s goal is to maximize expected revenue.

We will implement a stochastic gradient descent on the probability-of-sale vector. Define ( q_1 = ) probability that the highest bid exceeds ( r_1 ), and ( q_2 = ) probability that the second highest bid exceeds ( r_2 ). Under independence, these probabilities can be expressed in terms of the CDF. The revenue is:

[ R(r_1,r_2) = \mathbb{E}[ \text{max}( \text{2nd highest bid}, r_2 ) \cdot I(\text{1st bid} \ge r_1) \cdot \alpha_1 + \text{max}( \text{3rd bid? Actually no 3rd slot}) \dots ] ]

For two slots and two bidders, there are two cases: both bidders exceed r1? Actually, let’s derive.

We have two bidders with values v1, v2 i.i.d. The auction: slot 1 goes to higher bidder if that bid >= r1? GSP with reserves: The allocation is based on bids sorted descending. Payment: winner of slot 1 pays max(second highest bid, r2)?? Wait, careful: In GSP with per-slot reserves, the payment for slot i is the max of the bid for slot i+1 and the reserve for slot i+1. For the top slot, the payment is max(bid of slot 2, reserve_2). For the bottom slot, payment is reserve_2 (since no slot below) — but if there are only two bidders, the bottom slot winner pays max(third highest bid, reserve_3?) Typically, the lowest slot pays the reserve of that slot.

To keep it simple, we assume the auctioneer uses a single reserve price for all slots, which is a common practice. Then the expected revenue is a function of r.

Let’s implement a simulation where the auctioneer uses bandit gradient descent to learn the optimal reserve for a single-slot auction (i.e., only one ad slot). This is the classic Myerson setting. We’ll generate values from a uniform distribution [0,1] (so the optimal reserve is 0.5). The auctioneer starts with r=0 and updates each day based on observed revenue.

Here is a full Python simulation:

import numpy as np
import matplotlib.pyplot as plt

# Environment
def sample_values(n=2):
    return np.random.uniform(0, 1, n)

def revenue_from_auction(reserve, values):
    # second price with reserve
    sorted_vals = np.sort(values)[::-1]
    if sorted_vals[0] < reserve:
        return 0
    if len(sorted_vals) >= 2:
        price = max(sorted_vals[1], reserve)
    else:
        price = reserve
    return price

# Bandit gradient ascent (one-point gradient estimate)
class BanditGradient:
    def __init__(self, r_init=0.0, lr=0.01, delta=0.05):
        self.r = r_init
        self.lr = lr
        self.delta = delta  # perturbation for gradient estimate

    def update(self, auction_fn):
        # one-point gradient: use two perturbed points? Actually we can use single-point gradient
        # Simpler: use two-point gradient by querying two reserves in same round (not realistic)
        # Here we'll use a two-point estimate by assuming we can run two experiments (like A/B test)
        # For online, we can store previous reward and use same perturbation? Too complex.
        # Instead, we use a finite-difference gradient with single sample each point:
        r_plus = self.r + self.delta
        r_minus = self.r - self.delta
        # Simulate one auction for each (we can't do two in practice but for simulation we can)
        v = sample_values()
        rev_plus = revenue_from_auction(r_plus, v)
        rev_minus = revenue_from_auction(r_minus, v)
        grad = (rev_plus - rev_minus) / (2 * self.delta)
        self.r += self.lr * grad
        # Keep r in [0,1]
        self.r = np.clip(self.r, 0, 1)
        return rev_plus  # return revenue from the main action? We'll track r

# Simulate
bandit = BanditGradient(lr=0.02, delta=0.1)
T = 10000
r_history = []
rev_history = []
for t in range(T):
    bandit.update(sample_values)
    r_history.append(bandit.r)
    # evaluate actual revenue of current r (using many samples)
    if t % 100 == 0:
        v_test = [sample_values() for _ in range(1000)]
        mean_rev = np.mean([revenue_from_auction(bandit.r, v) for v in v_test])
        rev_history.append(mean_rev)

plt.plot(rev_history)
plt.xlabel('Time (x100)')
plt.ylabel('Expected Revenue')
plt.title('Learning Optimal Reserve via Bandit Gradient')
plt.show()

This simple algorithm converges to the optimal reserve around 0.5. The convexity ensures the gradient points uphill. In a real system, the auctioneer would use a smoothed estimate over many days.

Beyond Sponsored Search: Broader Implications

The Myerson–Olkin lemma is not limited to search ads. It applies to any mechanism where the revenue is an expectation over order statistics, which includes display advertising (real-time bidding using second price auctions), product placement on e-commerce sites, and even position auctions for influence in social media feeds. In all these cases, the convex structure simplifies the optimization of reserve prices.

Furthermore, the lemma is a tool for mechanism design without knowledge of the distribution. The convexity allows for sample complexity bounds—how many bids we need to observe to estimate the optimal reserve with high accuracy. This is crucial for platforms that serve millions of auctions per day and need to update their reserve prices automatically.

Challenges and Future Directions

While the Myerson–Olkin lemma provides a beautiful convex structure, several challenges remain:

Strategic bidding: Bidders do not bid truthfully in GSP. The convexity result assumes that the allocation and payment rules are applied to the true values, but in equilibrium, bidders shade their bids. The true revenue as a function of reserves (after bidders best-respond) may not be convex. There is a growing literature on learning under strategic behavior (e.g., “strategic learning in auctions”). Understanding the convexity of the resulting equilibrium revenue is an open problem.
Non-stationary distributions: Bidder values change over time. We need adaptive algorithms that can track a moving optimum. The convexity still holds pointwise, but the regret analysis becomes more difficult.
Budget constraints: Many advertisers have daily budgets, which affect their willingness to pay and their bids. This introduces a coupling across auctions. The Myerson–Olkin lemma does not directly cover budgets.
Multiple slots and correlation: The independence assumption across bidders is often violated; values may be correlated (e.g., all bidders value the same query similarly during peak hours). The convexity property may still hold under weak dependence, but the analysis is more complex.
Computational efficiency: For ( k ) slots, the action space is ( k )-dimensional. Gradient descent in high dimensions can be slow. UCB and Thompson sampling methods may be more sample efficient, but they require modeling the reward distribution. The convexity can be used to design optimistic algorithms that achieve near-optimal regret.

Conclusion

The sponsored search auction is a marvel of modern computing, but its theoretical foundations are deep and subtle. The generalized second price auction is not strategy-proof, its equilibrium analysis is combinatorial, and revenue optimization requires solving a bilevel problem. Yet, the Myerson–Olkin lemma reveals that the expected revenue, when viewed in the right coordinates (probabilities of sale), is a convex function. This convexity simplifies the computation of optimal reserve prices and enables the design of online learning algorithms that can adapt to unknown and changing environments.

Whether you are a researcher, an engineer building an ad platform, or simply curious about the mathematics behind the ads you see, the Myerson–Olkin lemma is a powerful tool to have in your algorithmic economics toolbox. It bridges the gap between classical auction theory and modern online learning, and it illuminates the path toward auction systems that are not only profitable but also provably optimal in the long run.

As the internet continues to grow and new advertising formats emerge—video, native, virtual reality—the principles behind the Myerson–Olkin lemma will remain relevant. They remind us that even in the most complex, dynamic systems, there is structure waiting to be discovered.

Further Reading:

Myerson, R. (1981). “Optimal Auction Design.” Mathematics of Operations Research.
Edelman, B., Ostrovsky, M., & Schwarz, M. (2007). “Internet Advertising and the Generalized Second-Price Auction.” American Economic Review.
Varian, H. (2007). “Position Auctions.” International Journal of Industrial Organization.
Olkin, I. (1970). “A Class of Inequality Measures for Probability Distributions.”
Duchi, J., Jordan, M., & Wainwright, M. (2012). “Privacy Aware Learning.” (for convex bandits)