Are You Ready for Tricky Wombat?
We’re not for everyone. If your procurement team needs a billion-dollar balance sheet to feel safe, you should buy the billion-dollar product. But if your team cares more about getting the right answer than buying the right logo, keep reading.
You probably think this is the part where we pretend to be humble. Where we say something about being a small company with a big heart, a scrappy underdog punching above its weight. We’ll pass on that.
The truth is simpler and less flattering to the industry. A multi-billion-dollar AI search platform and a small, focused engineering team are not playing the same game. They serve different customers, optimize for different outcomes, and make different trade-offs. Neither is wrong. But only one of them is right for you, and that difference matters more than most vendor comparisons will admit.
Some organizations need the billion-dollar platform. They need a vendor that will survive a three-year procurement cycle, pass a 200-question security review on the first attempt, and show up in a Gartner report their board already reads. The CIO who signs a seven-figure contract with an established leader is making a rational choice. The consequences of that choice being wrong are survivable precisely because nobody questions the decision to buy from the market leader.
That safety has a price, and the price is not measured in dollars per seat.
What you actually buy from a large platform
When you sign with a well-funded AI search vendor, you buy their average. Their platform is tuned for the median query across thousands of companies. Their roadmap is shaped by Fortune 500 accounts that generate the majority of their revenue. Their support model is a ticket queue designed to handle volume, not depth. Their contract is non-cancellable for one to three years because their financial model depends on predictable revenue, not on your satisfaction.
None of this is hidden. These are structural properties of any company serving thousands of enterprise accounts. The engineering team maintaining 100+ connectors does not also tune the pipeline for your specific data. Your account executive manages 40 other accounts. They aren’t spending three hours understanding your documentation architecture. And the product manager whose roadmap is driven by the ten largest customers cannot also prioritize the feature your 500-person team needs next quarter.
For a bank with 50,000 employees, a global compliance framework, and a procurement process that takes eight months, these trade-offs make sense. The platform does enough. “Enough” is the right standard when the cost of a wrong vendor choice is a career-ending headline.
For a 500-person engineering firm, a 1,200-person SaaS company, or a 300-person professional services team, “enough” is a different word for “mediocre.”
The question is not whether you can afford the large platform. The question is whether you can afford to be their 4,700th account when you need to be someone’s top five.
The project problem
Large platforms deploy across your entire organization at once. They have to. Their pricing model is per-seat. Their sales motion is optimized for maximum contract value on day one. Their implementation timeline assumes a company-wide rollout because that’s how the economics work for them.
This creates a problem the industry doesn’t talk about. A company-wide rollout delivers an average solution to everyone and an optimized solution to no one. Your engineering team has different data, different query patterns, and different definitions of a correct answer than your customer service team. Your legal department searches for precision in language. Your marketing team searches for competitive intelligence. A single search pipeline tuned for the average of these use cases serves none of them well.
PMI research on big-bang software implementations is clear: large-scope, simultaneous rollouts are more likely to stall in the definition phase than phased approaches. They expose organizations to higher risk. They create larger blast radii when something goes wrong. And they make it harder to measure success because everything changes at once and nothing can be attributed to a specific improvement.
Phased implementations, project by project, produce different outcomes. Each phase has a defined scope, measurable success criteria, and a small blast radius if something fails. The first project proves the approach works on your actual data. The second project extends to a different team with different needs. The third project incorporates what you learned from the first two. Each deployment is tuned for the specific users it serves.
The difference between these approaches is the difference between buying a suit off the rack and having one tailored. The off-the-rack suit fits. The tailored suit fits you.
Organization-wide rollout
- Average pipeline tuned for the median query across all departments.
- 6–12 month implementation before anyone sees value.
- Success measured by adoption percentage, not answer quality.
- Your team adapts to the tool.
Project-by-project deployment
- Pipeline tuned for each team’s specific data and query patterns.
- First results in days. Each project builds on the last.
- Success measured by whether the answers are right.
- The tool adapts to your team.
The DNA question
This page exists because of a pattern we’ve observed. Some companies evaluate AI tools the way they evaluate office furniture: check the boxes, compare the spec sheets, pick the one with the best brand recognition. That’s a procurement exercise, and it produces procurement-grade outcomes.
Other companies evaluate AI tools the way they evaluate engineering hires: by running them against real problems and measuring what they produce. These companies share a set of traits that have nothing to do with their size, industry, or budget.
Answer quality matters to them more than feature counts. A platform with 100+ connectors sounds impressive until you realize your team uses eight data sources and needs all eight to work flawlessly. These companies would rather have eight connectors that return correct answers than a hundred that return plausible ones. And they measure accordingly. Not queries per day. Not sessions per week. Did the engineer find the runbook before the incident escalated? Did the customer get the right answer on the first attempt? They instrument for outcomes because usage without accuracy is cost without value.
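What does instrumenting for outcomes look like in practice? Here is a minimal sketch, assuming a hypothetical event log; the schema and field names are illustrative, not taken from any particular product. The point is the contrast: the usage metric counts activity, while the outcome metric checks whether the first result a user opened actually resolved the question.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical search-event schema; field names are illustrative,
# not taken from any real product's telemetry.
@dataclass
class SearchEvent:
    query: str
    clicked_rank: Optional[int]  # rank of the result the user opened, if any
    resolved: bool               # did the user confirm the answer was right?

def usage_metric(events: list[SearchEvent]) -> int:
    """Queries per period: counts activity, says nothing about accuracy."""
    return len(events)

def outcome_metric(events: list[SearchEvent]) -> float:
    """First-attempt success rate: share of queries where the top result
    was opened and actually resolved the question."""
    if not events:
        return 0.0
    hits = sum(1 for e in events if e.clicked_rank == 1 and e.resolved)
    return hits / len(events)

log = [
    SearchEvent("rollback failed deploy", clicked_rank=1, resolved=True),
    SearchEvent("rollback failed deploy", clicked_rank=3, resolved=True),
    SearchEvent("rotate api keys", clicked_rank=None, resolved=False),
]
print(usage_metric(log))             # 3 -- adoption looks healthy
print(f"{outcome_metric(log):.0%}")  # 33% -- only one first-try success
```

Run both on the same log and you will often find adoption climbing while first-attempt success stays flat. That gap is the “cost without value” described above.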
They prefer iteration over commitment. A three-year contract with a vendor you’ve never tested in production is a bet, not a decision. A 30-day pilot with defined success criteria is a decision. If the system hits the bar, they convert. If not, they walk. The ability to leave is a feature, not a risk factor.
Access matters more to them than abstraction. When something breaks, they want to talk to the engineer who built it, not submit a ticket. They want roadmap influence based on their actual needs, not a vote alongside ten thousand other customers. And they’re comfortable with a product that ships weekly instead of quarterly, because they’re building a company that changes, and they need a vendor that keeps pace.
If you read that list and thought “that’s us,” you’re the customer Tricky Wombat is built for. If you read it and thought “that sounds risky,” the large platform probably is the better choice for you. The worst outcome is picking a vendor whose model doesn’t match your operating culture.
What small means and what it doesn’t
You’re going to wonder whether we might disappear. That’s a rational concern. It’s also the single most weaponized objection in enterprise software sales. Every incumbent’s rep deploys it, usually in the first meeting, sometimes subtly: “What happens if they go under? Who supports you then?”
We hear this often enough to have an opinion about it. The objection sounds like it’s about your risk. It’s actually about their leverage. A vendor with a non-cancellable three-year contract and no termination rights isn’t offering you safety. They’re offering you lock-in, dressed up as stability. The question is never asked in the other direction: “What happens if this large vendor underdelivers for two years and you can’t leave?”
Here is what happens when a small, focused vendor works with you. Your CEO has our CEO’s direct number. Your support request goes to a named engineer, not a queue. Your feature request is evaluated against a roadmap that serves dozens of customers, not thousands. When something breaks at 9pm, the person who answers built the system that broke.
A large platform works differently by structure, not by choice. Your account is one of 47,000. Your support ticket is triaged by a contractor and routed by severity algorithm. Your feature request competes with Fortune 500 priorities. Your implementation is managed by a systems integrator whose incentive is billable hours, not your success. When something breaks at 9pm, you file a ticket. The response arrives during business hours, in whatever time zone your support tier covers. Escalation requires a manager’s approval. The manager is in a meeting.
Size is not safety. Safety is contractual:
- Source code escrow with release triggers for insolvency.
- Data portability guarantees in writing.
- Termination rights if performance benchmarks aren’t met.
- Financial penalties for missed SLAs.
- Change-of-control provisions protecting you in an acquisition.
These are protections you can write into a contract with a small vendor. The irony is that most large vendors won’t offer them. Their contracts are non-cancellable. Their pricing is opaque. Their termination clauses are one-sided. The “safe” choice offers fewer contractual protections than the “risky” one.
Research on enterprise software vendor selection supports this. Buyers who represent 10% of a vendor’s revenue receive fundamentally different treatment than buyers who represent 0.01%. The smaller vendor doesn’t claim to care about your success. Your success is existential to their business. That alignment of incentives is structural, not aspirational.
How the winners chose differently
When Slack opened its beta in 2013 with roughly 45 employees, 8,000 teams signed up on the first day. Not because it won an RFP. Not because procurement approved it. Because individual teams clicked a link, used it for a week, and refused to go back. The enterprise messaging suites offered more features, deeper compliance, and bundled pricing. Slack offered one thing those suites couldn’t match: it worked the way teams actually worked, not the way procurement wanted them to work. Stewart Butterfield’s observation captures the dynamic: “Unlike almost any enterprise software ever, people would talk about it at the coffee shop.”
Figma and Zoom followed the same pattern. Figma replaced desktop design tools by making collaboration instant rather than a process. Zoom replaced legacy video conferencing by making a meeting start with a single click instead of an IT ticket. In every case, the winning product was closer to the user’s actual problem. The losing product was closer to the buyer’s evaluation criteria. These are not the same thing.
A Harvard Business School study found that even after one of the largest software companies in the world launched a competing product, bundled it free, and distributed it to 400 million seats, the focused challenger retained its customer base. NPS scores for the large platform’s alternative dropped 48 points once developers tried the purpose-built tool.
The readiness test
Five questions will tell you whether Tricky Wombat is the right fit. Answer them honestly. If you score below three, the large platform is the better choice for where your organization is today.
- Do you have a specific project where AI search needs to work, not a vague mandate to “add AI”? Tricky Wombat deploys project by project. If you don’t have a defined use case with a team ready to test it, you’re not ready for any AI search vendor, including us.
- Can you define what a correct answer looks like for that project? “Better search” is not a success criterion. “The system returns the right project specification within three results, with the correct version, 90% of the time” is a success criterion. We’ll help you define this, but you need to care about defining it. (A sketch of how a criterion like this can be scored follows the list.)
- Does someone in your organization have the authority and motivation to run a 30-day pilot? Not a committee. A person. Someone who will test the system against real queries, report honestly on the results, and make a recommendation based on what they observed.
- Are you comfortable working with a team that ships weekly instead of quarterly? Our pipeline improves continuously. That means your system changes. If your IT governance requires six months of change management before any software update, our pace will create friction.
- Do you value getting the right answer over having the right vendor name on a slide? If the primary buying criterion is “the board needs to see a name they recognize,” Tricky Wombat is the wrong choice. If the primary criterion is “the engineering team needs answers they can trust,” we’re the right one.
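To make the second question concrete, here is a minimal sketch of how a criterion like “the right document within three results, 90% of the time” can be scored. The labeled query set and the toy search function are assumptions for illustration; in a real pilot, the search function would call the system under test.

```python
# Scoring the example criterion from the second question: right document
# in the top three results, 90% of the time. The labeled queries and the
# toy search function below are illustrative assumptions.

def top3_accuracy(labeled_queries, search_fn):
    """labeled_queries: (query, expected_doc_id) pairs.
    search_fn: takes a query, returns a ranked list of doc ids."""
    hits = sum(
        1 for query, expected in labeled_queries
        if expected in search_fn(query)[:3]
    )
    return hits / len(labeled_queries)

def toy_search(query):
    # Stand-in for the system under test; a real pilot would query it here.
    return ["spec-v2", "spec-v1", "runbook-7", "memo-12"]

labeled = [
    ("bridge load spec", "spec-v2"),       # current version expected
    ("bridge load spec 2019", "spec-v1"),  # older version expected
    ("deploy rollback steps", "runbook-7"),
]
score = top3_accuracy(labeled, toy_search)
print(f"top-3 accuracy: {score:.0%}")  # passes if this clears the 90% bar
```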
Three or more honest “yes” answers means your organization has the culture, the specificity, and the decision-making velocity to get disproportionate value from a focused partner. You’ll deploy faster, get answers tuned to your actual data, and have direct access to the team building the system.
Fewer than three means the large platform’s strengths (broad coverage, brand recognition, procurement-friendly contracts) matter more to your organization right now than pipeline precision. That’s a valid position, and holding it honestly beats buying a solution that doesn’t match how you operate.
What happens if you don’t choose at all
This is the option nobody talks about in vendor comparisons: doing nothing.
Between 40% and 60% of qualified enterprise software deals end in no decision. Not a competitor win. Inaction. The team that started evaluating AI search six months ago is still evaluating. The RFP added three more vendors. A new stakeholder joined the committee. Legal has questions. Security has questions. The champion who started the process is losing energy.
Meanwhile, your team is still searching. Still asking colleagues. Still rebuilding work that exists somewhere in a Confluence page nobody can find. Still giving customers inconsistent answers because the knowledge base returns different documents depending on how you phrase the query.
The cost of that status quo compounds. Not dramatically, day by day. Quietly, hour by hour. An engineer who spends 30 minutes finding a runbook that should have appeared in three seconds. A customer who gets a wrong answer and doesn’t come back. A new hire who takes 90 days to become productive instead of 30 because the knowledge system is a maze.
The compound cost of bad search is invisible in your quarterly reports. It shows up in slower velocity, lower retention, higher ramp times, and a persistent, low-grade friction that everyone has learned to work around. The workaround becomes the process. The process becomes the culture. And the company that should be moving fast is moving at the speed of its worst information system.
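For a sense of scale, here is a back-of-envelope version of that compounding cost. Every number below is an assumption for illustration; substitute your own team size, time lost, and loaded rate.

```python
# Back-of-envelope estimate of the quiet, hourly cost described above.
# Every number here is an assumption for illustration; substitute your own.
engineers = 50              # team size
minutes_lost_per_day = 30   # per engineer, spent hunting for documents
loaded_rate = 100           # fully loaded cost per engineer-hour, in dollars
workdays_per_year = 230

annual_cost = (engineers * (minutes_lost_per_day / 60)
               * loaded_rate * workdays_per_year)
print(f"${annual_cost:,.0f} per year")  # $575,000 for this hypothetical team
```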
Would it be a terrible idea to test this on your actual data?
A 30-day pilot on one project with defined success criteria. If the answers aren’t better, you walk away. No multi-year commitment. No procurement odyssey.
Schedule a call

Tricky Wombat is not for every company. We’ve been direct about that on this page because the alternative (pretending to be all things to all buyers) is how enterprise software vendors end up with 100+ features and answers that come back muddy, confusing, and wrong.
We’d rather be the right choice for the right customer than the safe choice for the wrong one.