Congratulations! You've been appointed to modernize operations. Manually retyping information is a thing of the past, and we'd rather not write our own emails anymore either. As an experienced manager, you've naturally scrutinized the various processes first and can confirm they still have a reason to exist in the first place. It's then been decided that we'd like to keep the team the same size, and outsourcing to low-wage countries is also off the table.
That means we get to go shopping for software. Preferably something with Artificial Intelligence in it, because it's 2025 and the management team wants us to keep up with the times. Full of optimism, you open Google and search for "automation," "AI," and "logistics." More than 500 results jump onto your screen. All software, all AI, and all logistics. How on earth do we find our way through this forest of smart solutions?
Skip straight to Chapter 5 if you're already well-versed in AI.
1. Which problems are we trying to solve?
Let's start with the problem. For this Shopping Guide, we're focusing on the administrative and communication burdens that come with transport. Overflowing mailboxes with orders, questions, and claims, invoices that don't match, stacks of unarchived CMRs, and a hefty pile of customs forms.
The different types of documents all look very similar at first glance, but on closer inspection they turn out to vary in layout, language, and content. On top of that, they often contain errors or other irregularities. The customer support team handles this perfectly well, by the way. Some of them have been working there for decades and have become an irreplaceable part of the organization. They know your customers inside out and fill everything into the TMS, WMS, CRM, or ERP in no time. Good thing they're there.
However, the customer support team isn't just there to answer messages and retype documents: they're also crucial for creatively solving emergencies, being available to speak with customers at all times, and planning transports. They also go on holiday sometimes, deservedly so, which means specialist knowledge is temporarily unavailable. During peak periods, scaling up is difficult, and as workload increases, costly typos start creeping in.
Customer Service team in the wild - Source: Unsplash
In short, we're looking for a solution to improve the lives of customer service and ensure continuity of the various processes. The result should be: smoother communication, less retyping, fewer errors, more time for the customer, and preferably all of that outside working hours too. All of this without losing control and while maintaining the personal touch.
2. What is AI and why do we need it?
Everything carries the label of Artificial Intelligence these days. My washing machine at home even has an "AI Wash" program (which I've never actually used, by the way). But what is it exactly, and when does it add value?
Taking a shortcut here: AI is a collective term for more than a hundred different techniques used to recognize patterns in a large pile of data, in order to then make a statement about a new, unseen data point. The underlying mathematical formulas are called an "algorithm" or "model."
Since 2017, we've had language models that are very good at understanding written text, regardless of layout, spelling mistakes, or language. This makes it possible to create a single setup that works for all types of documents. Before that, you had to build a separate template for each document type. A template is a stable solution once it's up and running, but not flexible enough to move along with your fickle customers. Every layout change meant going back to the drawing board.
Over the past three years, the world of language models has exploded. Not least because the company OpenAI launched ChatGPT at the end of 2022. They were ingenious enough to train the underlying model for months on pretty much the entire internet, turning it into the ultimate generalist. They then made it available as a chatbot, and you could ask it the most diverse questions, which it always answered neatly. This suddenly made interaction with AI very accessible. Certainly in the beginning, the answers weren't always correct, but still, it was impressive. It became a huge hit, and OpenAI is still one of the fastest-growing companies in the world.
News article from BBC, December 2022 where ChatGPT was announced
Other tech giants then also launched a chat model. Google has Gemini, Microsoft has Copilot, Anthropic has Claude, Facebook/Meta has Llama. Since then, it's easier than ever before to build applications with AI. Previously, it was necessary to label large amounts of data and then train for weeks on end. Now you can connect a large language model (LLM) to your application and have immediate results. The consequence: an explosion of AI startups and an unprecedented shift among existing software vendors, who suddenly all have "AI-powered" features.
3. But what is OCR then?
OCR stands for "Optical Character Recognition" and has been around since the 1970s (or for the purists, since the early 20th century). This fundamental technology is capable of converting scanned and handwritten texts into machine-readable text. If you open a scan without OCR on your computer, you only see an image and can't select the text. With OCR, the text becomes selectable and searchable.
For a long time, the word OCR was used for everything related to document automation, perhaps because it's such an important part of the chain. However, OCR only becomes usable in combination with other technology. The combination of OCR and templates was until recently the most successful duo. Today, that's OCR in combination with AI language models.
4. And what are Agents?
The term Agent has been used more and more over the past year to describe a system that can, so to speak, independently execute tasks from A to Z. However, there's still a lot of ambiguity about the exact definition.
First, there were the Agentic LLMs. These LLMs are capable of answering complex questions that require multiple thinking steps. For example, a model can consult the internet and, based on the information found, perform a reasoning process needed to draw up a work schedule. The elegant thing about Agentic LLMs is that they perform a lot of work based on a single question. No complex configuration is involved; it comes up with the plan itself and executes it. And you can always provide feedback to adjust the results. Magical.
So magical, in fact, that many software entrepreneurs wanted to take that same principle to a higher level. The promise quickly became that a so-called Agent could run through a business process autonomously from start to finish. The reality, however, is more complicated. Complex or sometimes even impossible integrations with business software, endless exceptions (Business Rules), and the occasional hallucination from the model keep the magical effect at bay for now.
To still achieve results, we're seeing many software startups now offer a workflow canvas. As a user, you can use arrows and conditions to connect different actions together to work out a process. Naturally with a hidden LLM where needed, which can solve a complex task in the chain. Although a lot less flexible and elegant than the promise suggested, it is a step forward, and we'll see a lot of innovation here.
In some cases an applicable meme. Source: X
Anyway, enough about the technology. Let's make our search for a suitable solution more concrete.
5. What to look for in a solution?
Before we dive into the market, we first need to establish the criteria. The processes we're automating are crucial to the business and must therefore meet high quality standards. What do we find important in a solution?
We assume that every serious candidate has the basics in order: it will be able to automatically extract information from a wide variety of documents. It can validate this information and then present it in the correct format to the TMS, WMS, and so on. It works with all common document formats, including PDF, scans, photos, Word, Excel, and email. Because an AI system needs instructions before use and has to learn from your specific data, there's a steep learning curve in getting started. So far, the solutions don't appear to differ much from each other at first glance. What else should we pay attention to?
- Stability – Whether a system's accuracy is excellent or merely moderate, stability comes first. You want to know where you stand and don't want to have to scale up your team's capacity in a panic because the system is often delayed or even comes to a complete halt. It's difficult to get guarantees on this, so the reputation of the solution is an important indicator. Speaking with references always helps.
- Validation against source data – Purely relying on the information from an email or document is often not enough. The way things are written varies per customer, while your system demands a standardized format, possibly including an ID. By connecting your source data, you can map to the desired standard form. This not only makes export possible but also increases your accuracy score.
- Working with exceptions – Information comes in fragmented, full of gaps, and even errors. Sometimes it says A, but the customer really means B. Where a seasoned customer service employee handles this without a second thought, for an outsider it's sometimes impossible to follow. Either way, an AI solution will need to be able to deal with this.
- Human interaction – There are two flavors here: either remove the process entirely from sight, or include the human closely in the process. In the first case, you don't bother the team, but scaling is difficult because feedback is scarce. The second case will lead to good results faster but does require cooperation from the team.
- Audit and management – Given the critical nature of the processes, it's nice to know what's passing through and who did what and when. Being able to assign user-specific permissions also helps to safeguard the integrity of the solution. As a bonus, perhaps single sign-on (SSO), but that's not always a must.
Footnote – Naturally, we'd love to see the often-promised 99.7%+ accuracy. Such promises, however, are far removed from reality. If such a number is even defensible, it usually describes a subset of the data that fits perfectly into the solution's sweet spot. Healthy skepticism is warranted here.
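To make the "validation against source data" criterion a bit more tangible, here is a minimal sketch in Python. It assumes you can export a list of customers with their IDs from your TMS or ERP; the customer names, IDs, and the 0.6 similarity threshold are all made up for illustration. Real solutions will typically use more advanced matching, but the principle is the same: map whatever the customer wrote onto the one form your system accepts, and hand it to a human when no confident match exists.

```python
from difflib import get_close_matches

# Hypothetical master data: customer names as stored in the TMS, keyed to IDs.
CUSTOMERS = {
    "Jansen Transport B.V.": "C-1001",
    "De Vries Logistics": "C-1002",
    "Bakker Freight Services": "C-1003",
}


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(name.lower().split())


def match_customer(extracted: str, cutoff: float = 0.6):
    """Return (canonical_name, customer_id), or None when nothing is close enough."""
    lookup = {normalize(name): name for name in CUSTOMERS}
    hits = get_close_matches(normalize(extracted), list(lookup), n=1, cutoff=cutoff)
    if not hits:
        return None  # no confident match: leave it for a human to review
    canonical = lookup[hits[0]]
    return canonical, CUSTOMERS[canonical]


if __name__ == "__main__":
    # The spelling in the email differs from the spelling in the system.
    print(match_customer("jansen transport bv"))
    # A name that does not resemble any known customer.
    print(match_customer("XYZ"))
```

The first call resolves to the known customer and its ID despite the different spelling; the second returns None, which is the signal to involve the team rather than guess.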
6. The decision landscape
Now let's get down to business. What kind of solution do we choose? As you might have expected, there are many different flavors. We start at the bottom of the chain, with the most minimal bare-bones solutions, and then work our way toward complete end-to-end solutions.
Choose your own adventure. Source: ChatGPT
Category 1: The Do-It-Yourself solutions
Are you going for maximum freedom and minimal usage costs, and do you have at least 3 developers on the bench? Then this solution might be for you. Essentially, you're starting a startup within your own organization. It's easier than ever before to cobble together a platform yourself, because AI offers a solution there too. You will, however, have to develop all the above-mentioned components yourself to hold your own in a medium-sized organization.
Advantages
- Lowest costs in usage
- Fully customizable to the needs of the business
- Complete control over data and processes
- No vendor lock-in
Disadvantages
- High development costs and payroll
- Continuous further development to stay up to date
- Most components to build and maintain yourself
Success factors
- Minimum 3 to 6 months lead time
- 12+ months before everything works as planned
- 3 people available to carry the project (fewer is possible, but puts continuity under pressure)
Costs
- $200 per month for hosting and data storage
- $500 per month for processing 10,000 documents with the best LLMs
- $200,000+ in salaries for a full-time team of 3 developers
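As a rough back-of-the-envelope check, the figures above can be added up per year and per document. The sketch below assumes that the $200,000 in salaries is an annual amount and that volume stays at 10,000 documents per month; both are assumptions for the sake of the arithmetic, not quotes.

```python
# Figures taken from the cost list above, treated as rough lower bounds.
HOSTING_PER_MONTH = 200      # USD, hosting and data storage
LLM_PER_MONTH = 500          # USD, processing 10,000 documents per month
DOCS_PER_MONTH = 10_000
SALARIES_PER_YEAR = 200_000  # USD, team of 3 developers (assumed to be per year)


def yearly_cost() -> int:
    """Usage costs for twelve months plus one year of salaries."""
    return 12 * (HOSTING_PER_MONTH + LLM_PER_MONTH) + SALARIES_PER_YEAR


def cost_per_document() -> float:
    """Total yearly cost divided by the number of documents processed in a year."""
    return yearly_cost() / (12 * DOCS_PER_MONTH)


if __name__ == "__main__":
    print(f"Yearly cost: ${yearly_cost():,}")                  # Yearly cost: $208,400
    print(f"Cost per document: ${cost_per_document():.2f}")    # Cost per document: $1.74
```

Under these assumptions, usage costs come to only $8,400 of the $208,400 per year. The "lowest costs in usage" are real, but payroll dominates the total.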
Providers
- OpenAI – The most well-known provider of advanced models
- Anthropic – Currently the most popular solution among developers
- Google Vertex – Complete, robust, but complex
- Orq.ai – Dutch party that brings together different LLM providers in one place, with additional features around monitoring and version control
Category 2: The Workflow Builders
This category is best described as a box of Lego. They possess all the necessary components, which connect seamlessly to each other. The focus is on creating background processes, with in some cases dashboards for standard monitoring. Perfect for organizations that already have technical knowledge in-house but don't want to start from scratch.
Advantages
- Quick start possible, first flows within days
- Visual interface makes it accessible for non-programmers
- Many pre-built integrations available
- Community support and templates
Disadvantages
- Limited in complexity and customization
- Can quickly become cluttered with larger setups
- Performance limitations at high volumes
- Dependent on platform updates and changes
Success factors
- 1-2 months for basic implementation
- 3-6 months for full workflow suite
- 1-2 people needed for implementation and maintenance
Costs
- $100 - $1,000 per month in platform licenses
- $65,000 - $125,000 in salaries for development and management roles
- Additional costs for premium integrations
Providers
- n8n – The most popular and most complete solution, originating from Berlin
- Zapier – Originally a data integration platform, now with AI capabilities
- HappyRobot – Strong focus on customer support and automated phone calls
- Lleverage – A Dutch player that focuses on logistics among other things
- UiPath – Pioneer in automation that has conquered the enterprise world
- Microsoft Power Automate – Widely known, integrates seamlessly with Office 365
Category 3: The Specialists
These are the platforms that have fully dedicated themselves to intelligent document processing. These parties have years of experience in the specific domain and have developed their product further based on thousands of use cases. You get a complete solution that, although configuration always remains an important aspect, works right away. You are, however, bound to the capabilities of the platform.
Advantages
- Fastest time-to-value
- Proven technology with existing use cases
- Continuous innovation without own effort
- Support and training included
Disadvantages
- Higher monthly costs
- Limited to the features of the platform
- Dependent on vendor roadmap
Success factors
- 4-6 weeks to first go-live
- Production-ready immediately
- 0.5 FTE for configuration and management
Costs
- $1,000 - $5,000 per month, depending on volume and features
- Often including support and updates
- Scalable pricing models
Providers
- Send AI – Strong focus on the end user and best-in-class validation
- Raft – Specialized in invoice processing and customs clearance
- Rossum – Focus on invoice processing with strong European presence
- Instabase – Broadly oriented, but still requires a lot of do-it-yourself work
- Adabt – Known as an EDI integration party, now also focusing specifically on order entry
Category 4: The Consultants
This category concerns consultancy firms that implement a combination of the above solutions, often supplemented with custom work. They are uniquely positioned to combine the best of different worlds and can take care of everything. You rely entirely on external expertise without people on your own payroll.
Advantages
- Fully taken care of, no own team needed
- Best practices from the industry
- Combination of different technologies possible
- Scalable resources during implementation
- Often they implement proven platforms with customization on top
Disadvantages
- Highest initial investment
- Dependency on external party
- Knowledge retention is a challenge
- Change requests can be expensive
Success factors
- 3 - 9 months implementation time
- Dedicated project team from consultant
- Good handover to own organization crucial
Costs
- $25,000 - $250,000 for a project from intake to delivery
- License costs often come on top of this
- Hourly rates $95 - $200 for adjustments
Providers
- FutureWorkforce – Often combines platforms like UiPath with their own expertise
- Accenture – Positions itself as AI specialist with proprietary accelerators
- MVR Digital Workforce – Besides logistics, especially expertise in regulated markets
- Nokavision – Works with Instabase among others
- Xebia – Specialist firm with deep technical knowledge
Conclusion
Well, nobody said it would be easy. The choice is vast, and it's probably only going to get bigger. The ideal solution is therefore highly dependent on the composition and wishes of your organization. So we'll have to start testing.
Engaging in conversation with different parties is a first step, but it only really comes to life when you start experimenting. Don't be disappointed either if a pilot doesn't succeed right away. Many of the parties mentioned above were in a very different place a year or two ago, including us. Development is moving fast and it's never too early to learn and gain experience.
A perhaps underexposed aspect is the team from whom you're getting the solution. You can be sure that you'll work closely together to achieve the desired results. That's sometimes hard work for both parties, and it's therefore important that the contact feels natural and trusted. Together we'll get there. Good luck!
About Thom and Send AI
Thom Trentelman is founder and CEO of Send AI. Since 2020, they've been working on the mission to replace chaotic, overflowing Outlook inboxes with streamlined, AI-driven processes. With success in various sectors and an investment from Google's AI fund in their pocket, they're well on their way to making that vision a reality.