Baking a batch: Processing multiple requests at once

The "fast lane" answer

Computers process requests. It's kinda their thing; they receive commands and obediently carry them out to the full extent (except when they don't). Typically these requests can be processed so fast humans don't even notice the time passing. But if it's more than one request, we as users start to notice the most time-consuming part of the process: actually giving the requests to the computer.

That time adds up quickly with additional request entries, and sometimes that means it's no longer faster to have the computer do it. You might compare it to sitting on the freeway during rush hour, and realizing you could Flintstone the car faster than its current speed.

So what's to be done if you have a lot of requests, but don't want to enter them all manually? The answer is batch processing—a term that refers to any tool or function that allows a user to dump a whole mess of requests on a computer all at once, leaving the machine to sort it out and process the batch (don't feel bad, the machine can handle it). That way, the only time lost is in the computer's process time, which is largely determined by its hardware. As long as the hardware is robust, the computations happen in record time.
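
To put rough numbers on that idea, here is a minimal sketch of the trade-off. The two cost constants are made up for illustration (they are assumptions, not measurements): every submission pays a fixed hand-off cost, and every item pays a small processing cost.

```python
# Hypothetical cost model. Both constants are invented for illustration.
HANDOFF_SECONDS = 2.0    # assumed time to hand one submission to the computer
PROCESS_SECONDS = 0.001  # assumed time for the computer to process one item


def one_at_a_time(item_count: int) -> float:
    """Total seconds when each item is submitted as its own request."""
    return item_count * (HANDOFF_SECONDS + PROCESS_SECONDS)


def as_one_batch(item_count: int) -> float:
    """Total seconds when all items are handed over in a single batch."""
    return HANDOFF_SECONDS + item_count * PROCESS_SECONDS


if __name__ == "__main__":
    for n in (1, 10, 1000):
        print(n, round(one_at_a_time(n), 3), round(as_one_batch(n), 3))
```

With these made-up numbers, 1,000 items cost about 2,001 seconds one at a time and about 3 seconds as a batch. The hand-off cost is paid once instead of a thousand times, and only the machine's processing time scales with the size of the pile.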

The "scenic route" answer

When getting what you want out of a computer takes longer than you want, there are two possible sources for the slow: the man and the machine. It's the human's fault when the user is typing up a document at 2 WPM. It's the machine's fault when your favorite video game glitches and lags because the machine can't handle the graphics.

These are rough estimations, but the principle is largely the same. The computer's "slow" happens because it's not properly prepared to handle the size or quantity of requests. The human "slow" happens not because the human brain is slower than a computer's processing speed, but rather because it takes him so long to turn his thoughts into computer input. In other words, your computer is browsing the internet at a reduced speed because you have 30 tabs open; your Word document is filling slowly because your fingers don't move as fast as your lips or brain.

Let's look briefly at each problem, then we'll look at the solution to both.

The capacity problem, A.K.A. Overcooking your CPU

Computers are fast. It has been noted and observed that they experience whole lifetimes' worth of productivity in just a single second of human reality. So when you ask one to do something like put the letter "A" on the text document, it's no surprise that it responds in less time than either hitting the key with your finger or writing it yourself by hand.

Computers can even be pushed in some ways, with things like "overclocking," which is essentially the electronic equivalent of dumping nitrous oxide into a car engine. Overclocking a Central Processing Unit (CPU) pushes it beyond its "100%" speed and gets it processing faster. The problem is it produces more heat, which risks frying the chip. Which leads us to the overall problem.

Computer hardware and computer software have limits to how fast they can work. And unlike human beings, they can't get better with practice.

So when you're using the computer and making your requests, if you can throw things at it fast enough, eventually it will start to gag on the input. It can still process it, but you'll see a significant slowdown effect (and the computer will likely start to warm up; that's why your laptop burns your legs when you're working it too hard). It just has to put its foot down somewhere, or risk frying itself by overheating.

This is the issue at the core of increasing the "bit" base of a CPU or operating system. Changing the processor from a 32-bit processor to a 64-bit processor means you can handle numbers and requests up to 64 bits in size (it's a big number) before having to start crumbling things into more byte-sized pieces (which is where the slowdown starts).
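
As a rough illustration of that "crumbling," here is a sketch of adding two 64-bit numbers using only 32-bit-sized pieces. It is a simplified model written for this article (the function names are hypothetical), not a description of how any particular processor is wired.

```python
MASK32 = 0xFFFFFFFF  # largest value that fits in one 32-bit word


def split64(value: int) -> tuple:
    """Split a 64-bit value into (high, low) 32-bit words."""
    return (value >> 32) & MASK32, value & MASK32


def add64_with_32bit_words(a: int, b: int) -> int:
    """Add two 64-bit values using only 32-bit-sized pieces.

    This takes two additions plus carry handling, where a 64-bit
    processor could do the same job with a single add.
    """
    a_hi, a_lo = split64(a)
    b_hi, b_lo = split64(b)
    low_sum = a_lo + b_lo
    carry = low_sum >> 32                      # 1 if the low words overflowed
    high_sum = (a_hi + b_hi + carry) & MASK32  # wrap around like 64-bit hardware
    return (high_sum << 32) | (low_sum & MASK32)
```

The answer comes out the same either way; the smaller word size just means more steps to get there.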

Now there is a way around this. When requests are made individually, they are treated individually, which means they take up more space. Think of it like trying to write a book by only putting a single word on each page. Sure, the words are small, but they're using just as much paper as if the sheet were filled. But if you make the font small, drop it to single space, and really cram 'em in there, you can get more bang for your buck.

There's something similar going on with the bits and bytes you give the computer as input. Separate Word files will take up more space than if they were compiled into the same document. Likewise, requests tend to be bigger when they stand on their own.

Which leaves us with a fairly simple workaround: grouping requests together in a batch.

Manual Entry, A.K.A. "I Can't Feel My Fingers Anymore"

Where computers suffer from issues of capacity, humans suffer from an inefficient interface. We do the best we can with what we've got, but we still can't give commands to a computer as fast as we can think them. So until we invent The Matrix, computer output will continue to be restricted by something other than the speed of human thought.

We might call this the problem of manual input. Its effects are minor on the small scale, but they add up quickly as the size of the project grows. So telling the computer "Open this program," or "Save this file," may happen in the blink of an eye. But correcting by hand the titles of all 5,000 of the songs you just downloaded might take some "better pull up a seat and grab a snack" extra time.

A good example of this is, well, the kind of writing that generated this article. Each character had to be requested individually from the computer, which is done by pressing the buttons. This is a system that makes sense if the one hunched over the keyboard is the origin point for the text that is being entered into the computer. But if you were, say, copying text from another website/document/file, the system is a little less efficient.

One keystroke at a time, you might be averaging 60 WPM. That's not bad if there are only three words. If there are three thousand, then you may have a problem. Eventually you reach a point where the convenience of the computer is all but negated by the burden of manual input.
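
The arithmetic behind that is simple enough to sketch. The helper below is hypothetical, and it assumes a steady typing speed with no breaks or typos.

```python
def typing_minutes(word_count: int, words_per_minute: float = 60) -> float:
    """Minutes needed to type word_count words at a steady speed."""
    return word_count / words_per_minute


if __name__ == "__main__":
    print(typing_minutes(3) * 60, "seconds for three words")
    print(typing_minutes(3000), "minutes for three thousand words")
```

At 60 WPM, three words take about three seconds, while three thousand words take 50 minutes of nonstop typing.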

Now if only there were some way to tell the computer "Hey, you know all that text over there? I want it over here too."

The shortcut, A.K.A. The warp pipe

There is, as a matter of fact, a way to tell the computer "That. Just all of it. I want all of that." It's called "copy and paste." This, ladies and gentlemen, is batching at its simplest and most elegant. It takes all the things you want the computer to do as one group, then hands it to the computer as one group, so it can be processed by the computer as one group.

When you copy and paste a wall of text onto a document, you are sending in two requests. The first is "Hey, could you please remember all of this stuff just the way it's written? Thanks, you're a lifesaver." The second request is "Do you still remember all that stuff I told you to remember? Good. Could you scribble it all on a notepad for me?"

That's way fewer than the "Can you give me this letter? Now can you give me this letter? Now can you give me that letter?" requests that have to happen when you do it the old-fashioned way. And that means two things. One: you as a user had to make fewer requests. Two: the computer has to process fewer requests. It's cleaner; it takes less effort to begin the process, less effort to process the process, and less effort to return the result to you.
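
Here is a toy model of that difference. The Notepad class and its methods are invented for this sketch (no real editor or clipboard works exactly this way); all it does is count how many requests each approach makes.

```python
class Notepad:
    """Toy text buffer that counts how many requests it receives."""

    def __init__(self) -> None:
        self.text = ""
        self.requests = 0
        self._clipboard = ""

    def type_char(self, char: str) -> None:
        """One request per character: 'Can you give me this letter?'"""
        self.requests += 1
        self.text += char

    def copy(self, source: str) -> None:
        """One request: 'Remember all of this stuff.'"""
        self.requests += 1
        self._clipboard = source

    def paste(self) -> None:
        """One request: 'Scribble it all down for me.'"""
        self.requests += 1
        self.text += self._clipboard


def type_it_out(source: str) -> Notepad:
    pad = Notepad()
    for char in source:
        pad.type_char(char)
    return pad


def copy_and_paste(source: str) -> Notepad:
    pad = Notepad()
    pad.copy(source)
    pad.paste()
    return pad
```

For a 3,000-character source, type_it_out makes 3,000 requests and copy_and_paste makes 2, and both end up holding the same text.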

Now in actual practice you will experience a brief delay when pasting a large section of text. That's unavoidable. But it still happens faster than typing each character one by one. That's because even though it has to take a second or two to process, the computer can do it on its own time. Which means it can rearrange stuff on its own to-do list, putting other things aside so it can get the job done for you.

It's kind of like if you had over 9000 M&Ms you needed to load into a truck. You've got a spiffy conveyor belt and someone on the other end to move them from the belt to the truck. It should be easy. Except your M&Ms are unpackaged, and all over the floor. So you spend all your time trying to gather them together and throw them on the belt. It takes forever (and we would know, we've totally done this before).

Even if you found some magical way to start telepathically tossing hundreds of the delicious morsels on the belt at a time, they're still unpackaged, so you still have a similar problem at the other end. The helper has to pick up the pieces individually (just like you), and while he can do it faster than you can, he's still slowed down by how he has to move through the candy-coated nightmare one at a time. What's more, he may be managing another belt where you've put Skittles, and he's trying desperately to keep the two from mixing.

You ever grabbed a handful of M&Ms mixed with Skittles and eaten it? It doesn't end well.

As a computer user, you struggle putting your metaphorical M&Ms on the computer's conveyor belt because everything's in pieces. It takes forever to pick up all the little bits (and bytes) and put them on the belt; so the solution for you is some sort of box you can use to put all your M&Ms in. That way, all of your candy can go on the belt at once.

The computer struggles because everything's in pieces, so nothing can be processed as a cohesive whole. Your requests are thrown into the mix with all the other processes at hand, adding to all the juggling balls already in the air. What your helper (the computer) needs is a way to grab all the M&Ms at once so he can put them on the truck. You know, like having them packaged in a box.

Even if the computer has to halt all other conveyor belts for a few seconds, as long as your requests are in a box it's okay. It's still more efficient to stop everything else and grab the box than it is to try to pick up each piece separately and keep them separate from the stuff from other conveyor belts long enough to throw it on the truck.

The point is, at its core, you and your computer are really experiencing the same problem. So since the problem is the same, the solution should be the same as well. It's all about that packaging thing, the miraculous "box," grouping stuff together so that it can be dealt with all at once. And this is what we call batching.

Pulling the batch out of the oven

In brief, batch processing is what's happening any time you can put all of your M&Ms in a box, hand them to the computer and say, "Process these for me please." The computer can then deal with the whole box without having to wait—stopping and starting each time—as they all flow in as individual pieces.

Batch processing is available in a lot of different fields. Anytime a program or website you're using lets you upload an Excel or CSV file, you're using batch processing. A good example is the kind of stuff we do here at Smarty. We do address validation and geocoding, and we offer functionality to our users that allows them to upload stupefyingly large numbers of requests all at once. The speed at which we process and return the results is just as stupefying.
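
For the curious, here is a minimal sketch of what "boxing up" a CSV file might look like on the sending side. Everything in it is an assumption made for illustration: the column names, the sample rows, the batch size, and the submit_batch stand-in, which is not a real Smarty API call.

```python
import csv
import io
from itertools import islice
from typing import Dict, Iterator, List


def read_batches(csv_file, batch_size: int = 100) -> Iterator[List[Dict[str, str]]]:
    """Yield rows from a CSV file in boxes of up to batch_size rows."""
    reader = csv.DictReader(csv_file)
    while True:
        batch = list(islice(reader, batch_size))
        if not batch:
            return
        yield batch


def submit_batch(batch: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Hypothetical stand-in for sending one box to a validation service."""
    return [{**row, "status": "received"} for row in batch]


# Made-up sample rows, not real customer data.
SAMPLE_CSV = """street,city,state
123 Main St,Springfield,IL
456 Oak Ave,Portland,OR
789 Pine Rd,Austin,TX
"""

if __name__ == "__main__":
    for box in read_batches(io.StringIO(SAMPLE_CSV), batch_size=2):
        print(len(box), "addresses handed over in one request")
        submit_batch(box)
```

The person uploading only makes one request (the file), and the rows travel in a few boxes instead of one trip per address.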

So keep an eye out for batching functions; they make life a whole lot easier. And if you have any questions about how we bake our batches, or whether it's chocolate chip or oatmeal raisin*, give us a call; we'd be happy to talk about our recipe for awesome.