A hotel in Canmore was running a midweek special — $159 per night for a king suite, breakfast included, valid through November. It was a genuine deal. The hotel posted it on their website, it got picked up by a couple of travel blogs, and it landed in the training data of several AI models.
It's January now. The special ended eight weeks ago. The rate is back to $249. But when you ask an AI assistant for affordable hotel options in Canmore, it still recommends this hotel at $159 with "breakfast included." The traveler calls to book, gets quoted $249, and loses trust in the AI that recommended it. The hotel gets an irritated caller who feels bait-and-switched. Everyone loses.
This is the freshness problem. And it's not a bug in any particular AI system — it's a structural failure in how business data moves from the real world to the tools that agents rely on.
The half-life of business data
Not all data goes stale at the same rate. A business's address, phone number, and name are stable — they change rarely. But most of the data that matters for matching and recommendations has a much shorter half-life.
Deals and promotions: hours to days. The restaurant's $65 prix fixe might run for one weekend. The car dealership's month-end incentive expires at midnight on the 31st. The salon's last-minute cancellation creates a 3 PM opening that's gone by 3:30 PM. This data is valuable precisely because it's time-limited — and it becomes harmful the moment it expires, because now it's a promise the business can't keep.
Availability and inventory: hours to days. The hotel's room inventory changes with every booking and cancellation. The mechanic's open appointment slots change every time someone calls. The daycare's group sizes shift daily. Availability data that's 24 hours old is unreliable. Availability data that's a week old is fiction.
Menus, pricing, and services: weeks to months. Restaurants change menus seasonally. Service businesses adjust pricing. New offerings launch, old ones are discontinued. The website might still say "Now offering hot stone massage!" six months after the therapist who did it left.
Hours and policies: months to years. COVID changed the hours for half the restaurants in North America, and three years later, many Google listings still show the old hours. Holiday hours are a recurring nightmare — the business is closed on Boxing Day, but every AI in the world thinks it's open because the Google listing only knows "regular hours."
The pattern is clear: the more useful the data, the faster it expires. The stuff that changes slowly (name, address) is the stuff agents don't need much help with. The stuff that changes quickly (deals, availability, pricing) is exactly the data agents need most — and it's the data most likely to be wrong.
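One way to make these half-lives concrete is to attach a time-to-live to each category and treat any record past its TTL as unreliable. A minimal sketch — the categories and TTL values here are illustrative, drawn from the ranges above, not a standard:

```python
from datetime import datetime, timedelta

# Illustrative TTLs per data category, loosely matching the
# half-lives described above. These values are assumptions.
TTL = {
    "deal":         timedelta(hours=24),
    "availability": timedelta(hours=12),
    "menu":         timedelta(weeks=4),
    "hours":        timedelta(days=180),
    "address":      timedelta(days=365 * 3),
}

def is_fresh(category: str, last_updated: datetime, now: datetime) -> bool:
    """A record is trustworthy only within its category's TTL."""
    return now - last_updated <= TTL[category]

now = datetime(2025, 1, 15)
# A deal last confirmed eight weeks ago has long expired...
print(is_fresh("deal", datetime(2024, 11, 20), now))     # False
# ...while an address confirmed the same day is still fine.
print(is_fresh("address", datetime(2024, 11, 20), now))  # True
```

The point of the sketch is the asymmetry: the same eight-week-old timestamp is fatal for a deal and irrelevant for an address.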
Where stale data comes from
The data supply chain for AI agents is long, slow, and full of lag.
A business changes something in the real world — a new menu, a new price, a new deal. If the owner is diligent, they update the website. Maybe the same week, maybe next month, maybe never. If they're very diligent, they also update the Google Business Profile, the Facebook page, and the three other listings that have different information.
Then Google's crawler visits the website. This might happen within hours for high-traffic sites, or weeks for a small business with low domain authority. The crawler captures a snapshot. That snapshot gets indexed.
Then an AI training pipeline — or a retrieval system — ingests the indexed content. For training data, this might happen months later. For a retrieval-augmented system, it might happen within days.
Then the AI agent surfaces this information to a user. By this point, the data might be days, weeks, or months old. The path from reality to recommendation is:
Reality → Website update (days-weeks)
→ Crawler visit (hours-weeks)
→ Index update (hours-days)
→ AI ingestion (days-months)
→ User query (anytime)
Total lag: days to months

For a business's address, this lag is fine. For a deal that expires Friday, this lag means the AI is recommending something that no longer exists.
The trust destruction cycle
Stale data doesn't just produce bad recommendations. It actively destroys trust in AI agents — and by extension, in the businesses those agents recommend.
When an AI agent confidently recommends a deal that doesn't exist, two things happen. First, the user learns that the agent's recommendations can't be trusted, so they stop using it or start double-checking everything — which defeats the purpose of having an agent. Second, the business gets a customer who shows up expecting something the business isn't offering, which creates a negative interaction that the business didn't cause.
A stale recommendation is worse than no recommendation. If the agent said "I don't have current pricing for this hotel," the user would just check the hotel's website. But when the agent says "$159 per night, breakfast included" with full confidence, it creates an expectation that the business can't meet. The hallucination of confidence on top of stale data is the worst possible combination.
This isn't hypothetical. It's happening right now, millions of times a day. AI assistants recommending restaurants that have closed. Quoting prices from last year. Citing hours that changed during COVID and never got updated. Every one of these failures makes the next person less likely to trust an AI recommendation — which makes the AI less useful, which slows adoption, which hurts everyone.
Why scraping faster doesn't solve it
The obvious answer is to scrape more frequently. Recrawl business websites every hour instead of every week. Monitor Google listings in real time. Build better change-detection algorithms.
This helps at the margins but doesn't solve the structural problem, for three reasons.
Most changes never hit the website. The restaurant's Tuesday prix fixe exists in the chef's head. The dealership's month-end flexibility exists in a morning meeting. The groomer's Tuesday 2 PM opening exists in the booking calendar but not on any public-facing page. You can crawl the website every five minutes and you'll never see data that was never published.
Change detection is unreliable. Did the menu page change because the prices went up, or because someone fixed a typo? Is the new rate on the hotel's website a permanent change or a promotional rate that expires in three days? Scrapers see that something changed but can't interpret why it changed or when the new information expires. Structured intent — "this deal runs through January 15th" — requires context that scrapers don't have.
Scale makes it worse, not better. There are tens of millions of local businesses in North America. Scraping all of them frequently enough to maintain freshness is computationally expensive and produces enormous amounts of data that's mostly unchanged. The signal-to-noise ratio is terrible: you're recrawling millions of pages to catch the handful of meaningful changes.
The push model
The freshness problem can't be solved by pulling data faster. It can only be solved by letting businesses push updates when things change.
This is the core design principle behind Pawlo's data model. Instead of scraping business websites on a schedule, we give businesses a direct channel to push updates the moment something changes. The channel is SMS — the lowest-friction method possible.
A hotel revenue manager texts: "14 rooms opened up Thu-Sat, 25% below rack." It's structured, time-stamped, tagged with an expiration, and available to every connected AI agent within minutes. Not hours. Not days. Minutes.
Compare the data paths:
Traditional path:
Reality → Website → Crawler → Index → AI → User
Lag: days to months
Pawlo push model:
Reality → SMS → Structured data → AI agent
Lag: minutes

The push model solves freshness because the business controls the timing. They update when something actually changes — not on a crawler's schedule, not when they remember to log into a CMS, not when the social media intern gets around to it. The update happens when the business reality changes, and the lag between reality and availability is measured in minutes.
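A push update like the revenue manager's text can be captured as a small structured record: who sent it, when it arrived, what it offers, and when it expires. A sketch of the shape such a record might take — this schema is hypothetical, not Pawlo's actual format:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class PushUpdate:
    business_id: str
    received_at: datetime    # time-stamped on receipt
    category: str            # e.g. "availability"
    detail: str              # the human-readable offer
    expires_at: datetime     # hard expiry: agents drop it after this

    def is_live(self, now: datetime) -> bool:
        """An update is only surfaced to agents before it expires."""
        return now < self.expires_at

update = PushUpdate(
    business_id="hotel-canmore-001",          # hypothetical identifier
    received_at=datetime(2025, 1, 14, 9, 2),
    category="availability",
    detail="14 rooms Thu-Sat, 25% below rack",
    expires_at=datetime(2025, 1, 18, 23, 59),  # gone after Saturday
)
print(update.is_live(datetime(2025, 1, 15)))  # True: agents can surface it
print(update.is_live(datetime(2025, 1, 20)))  # False: auto-expired
```

The expiry field is what makes the model safe: a stale push update removes itself, instead of lingering the way a crawled snapshot does.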
Freshness isn't a nice-to-have feature in business data. It's the difference between a recommendation that creates value and a recommendation that destroys trust. The data layer that solves freshness — that gives agents information measured in minutes, not months — is the one agents will rely on. Everything else is yesterday's news, and in the agent economy, yesterday's news is already wrong.
