When Scott Thompson left Visa to take the CTO role at PayPal in 2005, the Web company's data centre surprised him. "Wait a minute," he recalls saying, "they run a payment system on Linux?"
"I was pretty familiar with payment systems and global trading systems, but I just scratched my head when I came here," Thompson says. With his history of working on IBM mainframes and large Sun Solaris systems, the PayPal approach to computing seemed alien, especially for a company whose core mission was dealing with money.
PayPal runs thousands of Linux-based, single-rack-unit servers, which host the company's Web-presentation layer, middleware and user interface. Thompson says he quickly saw the economic, operational and development advantages of open source and Linux technology. He now sees no other way to do it.
"When you're buying lots of big iron, as I did in other places I've worked, your upgrade path is $2 million, $3 million at a clip. You just had to buy big chunks of stuff to scale," he says. "Here at PayPal, our upgrade path is 10 $1,000 no-name servers, slapped into the mid-tier of the platform. And we just keep scaling it that way. It's unbelievably cost-effective."
This model also leads to a highly reliable site, says Matthew Mengerink, vice president of core technologies for PayPal, who helped build this architecture from scratch.
"Rather than have a monolithic box, or an impenetrable fortress that never breaks, we just have so many [nodes] that the breakages are irrelevant," Mengerink says. Using a proprietary operating system to build out a system with a thousand points of failure would not be an option, he says. "This distributed, highly redundant system we have is predicated on the cost model of Linux and Intel," he adds.
The distributed model also lets the company make massive shifts in resource allocation when needed. The generic, Lego-block-style Linux servers that make up the company's Web tier can easily be reassigned to a variety of tasks.
For example, every day at 0100 PST, PayPal runs its batch processing for reconciling payments. Thompson says this kind of work, typically done on mainframes or large symmetric multiprocessing boxes in other payment organisations, is spread across the middle-tier Linux servers in the data centre.
"We don't bring the site down," he says. "We just allocate a higher portion of the [servers] to running batch processes, and we crunch through all that data in three hours every night."
On the back end, these thousands of systems communicate with just a few large Sun Solaris boxes, which run an Oracle database that stores all customer data. A custom-made database connection-management system links Web processes from the Linux-based Web and middleware tiers of the PayPal site to the Sun/Oracle back end.
"The speed with which the processes come and go is blindingly fast," Mengerink says. "So there is a tier that buffers between those Web and database layers. As far as the application is concerned, it just thinks it is making calls out to a database. The application just doesn't care there is this middle layer. Then the database on its side sees a nice, old-fashioned durable connection, and doesn't feel like it's being melted down by a connection storm."
Open source pays off in app development lab
Using open source Red Hat Linux also provides several advantages on the development side. Because of the low-cost hardware and software PayPal uses on its production system, it can almost replicate the state of its entire live site in the application development lab. This lets PayPal coders write new versions of the PayPal production applications, which can then be switched on live with minimal disruptions.
"When you have developers testing a system on the exact same environment that you have in production, the probability for weird things happening is a lot lower. That's really key," Mengerink says. "Open source clearly makes this a lot more cost effective as well, since you don't have the same licensing costs that would be associated" with duplicating a live site in the lab.
This model also helps PayPal developers churn out new versions of the Web site's main applications frequently, which can be both a positive and a negative.
"The one struggle we have is a classic struggle, is our decision to align development and the live site," Mengerink says. "Developers are radical. They would be on the beta version of the newest latest and greatest all the time, with some kernel patch they found from some college Web site," he says. "The people in operations are a little more conservative than that. Their take is, no feelings would be hurt if we just used the most stable, known versions of things."
The way PayPal developers mould the Linux kernel and the other open source code they use also helps to make the overall system more secure, Mengerink says.
Linux servers in PayPal's data centre run Red Hat kernels with custom tweaks that add extra layers of security to the systems. As a basic step, superfluous services, packages and other software are stripped out.
"The combination of Linux and open source allows us to do the modifications we need to scale and have that extreme rigidity of security," Mengerink says.
Security policies and code are also added to the machines -- Mengerink would not give specifics -- creating a built-in layer of mistrust among machines on the data centre network. Each box is configured as if it were operating in an untrusted network. "So there is no such thing as a site-wide compromise," he says. "You'd have to go box by box and fight your way."
So far, the combination of distributed Linux and open source software with rapid application development of open source code has been a success, Mengerink says. And it certainly keeps work interesting.
"Sometimes we feel a little schizophrenic," he says. "We're a Web company; we're a real-time payment system -- oh, dear. So doing both is very hard."