Business cards, those little cards with your contact information. I remember when I got my first box, I felt like a complete office professional! Giving and receiving them was a stringent ritual. I even had business card holders and organizers. But that was a long time ago. Do people still use those?!?
Well, apparently they still do. Yesterday, I got a big stack from my wife from her Singapore trip, and she asked if I could encode the data into a spreadsheet. So, me being me, instead of spending some time encoding, I spent a bit more time coding.
The first thing I tried was Tesseract. It is an open-source Optical Character Recognition (OCR) engine that recognizes text within images, including scanned documents and photographs. Originally developed at Hewlett-Packard (HP) between 1985 and 1994, it was released as open-source software in 2005, and Google sponsored its development for many years afterward.
Step 1. Create a new directory
$ mkdir business-card-scanner
$ cd business-card-scanner
Step 2. Initialize a new Node.js project
$ npm init -y
Step 3. Install required packages
$ npm install tesseract.js
Step 4. Write the code. I called mine processCard.js. The code is simple: you import the tesseract.js module, then call its recognize method with the path of the image file, the language ('eng' for English), and an optional options object.
const Tesseract = require('tesseract.js');

// Run OCR on one image file and print the recognized text.
function processImageFile(jpgFile) {
  // Arguments: image path, language code, options (left empty here).
  Tesseract.recognize(jpgFile, 'eng', {})
    .then(({ data: { text } }) => {
      console.log('OCR Result:', text);
    })
    .catch(err => {
      console.error(err);
    });
}

processImageFile('IMG_8450.jpg');
So I ran it and I got this (data is fictionalized):
OCR Result:
RADHIKA BANERJEE aero
LEAD PROJECT DIRECTOR . OL afeRl]
Tg ek
© [email protected] : eRe: r] a 1
63 9876543210 : TET Br oz
© ©F Solaris Tower, 1234 Green St., Emerald City 1047-5160 | Tg wide
® www.quantumtech.com . rg ei tigheer Tr
QuantumTech | SCAN TO SAVE DETAILS
Not good, obviously. The reason for the garbage text is that there are symbols and graphics on the business card. The solution would be to segment the image and give only the text to Tesseract. I did this by simply erasing the symbols and graphics. After doing that, I got this:
OCR Result:
RADHIKA BANERJEE
LEAD PROJECT DIRECTOR
[email protected]
+63 9876543210
6F Solaris Tower, 1234 Green St., Emerald City 1047-5160
www.quantumtech.com
Much better, but I’d have to segment every image by hand first. After that, I’d still have to identify the fields. I have a big stack to go through, so this won’t do.
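To give an idea of what "identifying the fields" would involve, here is a minimal sketch using a few regular expressions. Everything in it is an assumption for illustration: the helper names parseCardText and toCsvRow are made up, the email address is a placeholder (the one on the card is not reproduced here), and the rule "first leftover line is the name, second is the title" only holds for cards laid out like the sample above.

```javascript
// Hypothetical helper: guess the fields in clean OCR text with simple heuristics.
function parseCardText(text) {
  const lines = text
    .split('\n')
    .map(line => line.trim())
    .filter(line => line.length > 0);

  const card = { name: '', title: '', email: '', phone: '', address: '', website: '' };
  const rest = [];

  for (const line of lines) {
    if (!card.email && /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(line)) {
      card.email = line;
    } else if (!card.phone && /^\+?[\d\s().-]{7,}$/.test(line)) {
      card.phone = line;
    } else if (!card.website && /^(https?:\/\/|www\.)\S+$/i.test(line)) {
      card.website = line;
    } else {
      rest.push(line); // not recognizable by pattern, decide by position below
    }
  }

  // Assumption: leftover lines appear in the order name, title, address.
  card.name = rest[0] || '';
  card.title = rest[1] || '';
  card.address = rest.slice(2).join(', ');
  return card;
}

// Hypothetical helper: turn one card into a quoted CSV row for a spreadsheet.
function toCsvRow(card) {
  return [card.name, card.title, card.email, card.phone, card.address, card.website]
    .map(value => `"${String(value).replace(/"/g, '""')}"`)
    .join(',');
}

// Placeholder input modeled on the cleaned-up OCR result above.
const sampleText = [
  'RADHIKA BANERJEE',
  'LEAD PROJECT DIRECTOR',
  'radhika@example.com', // placeholder email, not the real one
  '+63 9876543210',
  '6F Solaris Tower, 1234 Green St., Emerald City 1047-5160',
  'www.quantumtech.com',
].join('\n');

console.log(toCsvRow(parseCardText(sampleText)));
```

Heuristics like these are brittle: a card that puts the company name first, or splits the address over two lines, would already need extra rules.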
So I gave the image to ChatGPT and I got this:
Here are the details from the business card image you provided:
Name: RADHIKA BANERJEE
Title: LEAD PROJECT DIRECTOR
Email: [email protected]
Phone: +63 9876543210
Address: 6F Solaris Tower, 1234 Green St., Emerald City 1047-5160
Website: www.quantumtech.com
The card also contains a QR code with the text "SCAN TO SAVE DETAILS" next to it.
This is impressive! Not only did it segment the image, it also did the OCR and even identified the fields. I guess I’ll be using ChatGPT, or rather the OpenAI API, to scan these business cards!
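As a rough sketch of what that could look like, the snippet below builds a Chat Completions request that carries the card image as a base64 data URL and posts it to the API. This is not tested against the whole stack of cards. The function names buildCardRequest and scanCard, the model name, and the prompt wording are all my own assumptions; it also assumes Node 18 or later (for the built-in fetch) and an API key in the OPENAI_API_KEY environment variable.

```javascript
const fs = require('fs');

// Hypothetical helper: build the request body for one business card image.
// The default model name and the prompt text are assumptions.
function buildCardRequest(imageBuffer, model = 'gpt-4o-mini') {
  const dataUrl = `data:image/jpeg;base64,${imageBuffer.toString('base64')}`;
  return {
    model,
    messages: [
      {
        role: 'user',
        content: [
          {
            type: 'text',
            text: 'Extract the name, title, email, phone, address and website ' +
              'from this business card. Reply with JSON only.',
          },
          { type: 'image_url', image_url: { url: dataUrl } },
        ],
      },
    ],
  };
}

// Hypothetical helper: send one card to the API and return the model's reply.
async function scanCard(jpgFile) {
  const body = buildCardRequest(fs.readFileSync(jpgFile));
  const response = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify(body),
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const json = await response.json();
  return json.choices[0].message.content;
}
```

Calling scanCard('IMG_8450.jpg').then(console.log).catch(console.error) would then print whatever the model returns for that card. Keeping the request building separate from the network call makes it easy to check the payload without spending API credits.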