Wednesday, 12 December 2007

Why doesn't the Oracle RDBMS feature in the web space?

There was a recent post by Nati Shalom analyzing "Why most large-scale web sites not written in Java", based on statistics from Pingdom on "What nine of the world’s largest websites are running on".

Assuming the Pingdom analysis is correct, there is a glaring hole in the database technologies category used by major websites: where is Oracle?

Geva Perry picked up on this in his post Where is Oracle?. There are a number of other follow-up posts within the blogosphere, but unfortunately most have degenerated into personality clashes, the protection of personal agendas and the usual large-corporation bashing; you can follow the trail if you so desire.

But the question is still valid, and it seems to have passed the Oracle blogosphere by in the last few months, possibly thanks to OOW (unless I missed it, of course). Where is Oracle in the web space? Given that there is so much publicity about the Web 2.0 world and innovation, making it an important market at the moment, why doesn't the Oracle RDBMS feature in that arena? Many of us claim that the Oracle RDBMS is a very sophisticated product, so why is the apparent market leader in RDBMS technology not adopted as the RDBMS of choice by web companies?

Maybe it's a question of ongoing cost. MySQL and the LAMP stack are free.

Maybe these web companies came from the "start-up" mold and were experimental to start with, so relying on free products from day 1 was imperative to get their products and services out the door?

Maybe it's a question of open source, and the ability to rewrite and debug the entire stack?

Maybe the Oracle RDBMS is overkill for the database requirements of most websites?

Or maybe the analysis is just plain wrong.

What's your opinion of why the Oracle RDBMS is missing?

Footnote: You'll note that there are some obvious missing web companies in the Pingdom analysis, including Google, Yahoo etc.


andrejk said...

I think it's a combination of 2 factors:
1) As a startup you don't have a lot of money, so why spend it when you can get a free alternative?
2) The free alternatives are good enough. Why spend money on features you don't need? Many large websites can run on MySQL or PostgreSQL, so why wouldn't it be good enough for a startup?

andrejk said...

One more reason, maybe even more important:

Oracle is mainly chosen by IT people in large organisations where taking a risk by choosing open source is hard. Maybe they would like to choose a free alternative, because it's cheap and good enough, but they're not in a position to do so, because it's perceived as risky by their colleagues.

Startups are run by people who take risks all the time.

Another case of "nobody ever got fired for buying [ibm|microsoft|oracle|...]" you might say.

Patrick Wolf said...

To my knowledge eBay and Amazon are using Oracle.

But to your question. A reason why so many startups are using MySQL could be that the programmers/students who are coming from, or are still at, university have never used a real database like Oracle and are not aware of the features it offers. Especially if they treat the database just as a dumb data store, as most middleware programmers do.


Gary Myers said...

Could be a barrage of reasons.
1. What DB is used/taught at college. If they are hiring grads who have just used MySQL/Postgres, then they stick with it.

I wouldn't be surprised if script coders started with a lot of unbound SQL, and it just wouldn't perform in Oracle, so they write off the database platform.

2. "Maybe the Oracle RDBMS is overkill for the database requirements of most websites?"
I'm not sure about 'overkill', but 'different kill'. Oracle DBMS comes from the days of client/server (and before) when a user would be logged in for minutes/hours with a stateful connection. MySQL especially is geared towards web transactions with low overhead (even at the expense of consistency).
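
To illustrate the "unbound SQL" point in item 1: the sketch below contrasts statements built by pasting values into the SQL text with statements that use placeholders. In Oracle, each distinct statement text generally needs its own parse, which is the usual reason literal-heavy SQL scales poorly there. The example uses Python's built-in sqlite3 module purely as a stand-in (it does not model Oracle's parsing behaviour), and the `users` table and its rows are hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(1, "ann"), (2, "bob"), (3, "cho")],
)


def literal_statements(ids):
    # "Unbound" style: the value is pasted into the text,
    # so every lookup produces a different statement.
    return [f"SELECT name FROM users WHERE id = {int(i)}" for i in ids]


def bound_statements(ids):
    # Bound style: one statement text, with the value passed separately.
    return [("SELECT name FROM users WHERE id = ?", (i,)) for i in ids]


ids = [1, 2, 3]
literal = literal_statements(ids)
bound = bound_statements(ids)

# Count how many distinct statement texts the database would see.
distinct_literal = len(set(literal))
distinct_bound = len({sql for sql, _ in bound})

# Both styles return the same rows; only the statement text differs.
names_literal = [conn.execute(sql).fetchone()[0] for sql in literal]
names_bound = [conn.execute(sql, params).fetchone()[0] for sql, params in bound]

print(distinct_literal, distinct_bound)
```

With three lookups the literal style yields three different statement texts and the bound style yields one; at web-scale request rates that difference multiplies accordingly.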

Jake said...

Justin said over Twitter today that Bebo runs Oracle RAC.

Noons said...

Nati's post related to *large* scale web sites. Let's not confuse that with the typical "weekender" small business web site.

For the big folks, database niceties are the last thing on their radar. What they want is the ability to not lose traffic, period.

When I worked at a major SEM, that was the overwhelming priority: any lost click, every single one of them, would cost us real money. We had to provide absolute continuous service. Not .99, not .99999: absolute 0 failure, no excuses.

What they need is absolutely predictable and reliable performance. That means fast code - forget Java and its code-bloat - and cached lookup data: no need for a db there.

Now as much as Oracle might want to convince the world they can provide that, fact is: they can't and neither can any other database, BTW. Neither can Java. Not at realistic cost levels, anyway.

Look at the details of TPC benchmarks and the facts are there: results are *averaged* over a period. You may have relatively long periods of fast response, followed by short periods of worse performance while the db catches up, flushes caches, writes redo, whatever.
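
A small arithmetic sketch of why an averaged figure can hide those short bad periods. All numbers here are made up for illustration (they are not taken from any TPC result), and the 500 ms cut-off for a "lost" request is an assumption.

```python
import statistics

# Hypothetical response times in milliseconds:
# 98 fast requests and 2 slow ones during a brief stall.
latencies_ms = [10] * 98 + [2000] * 2

# Assumed threshold beyond which a request counts as a lost click.
LOST_CLICK_THRESHOLD_MS = 500

mean_ms = statistics.mean(latencies_ms)      # what an averaged result reports
median_ms = statistics.median(latencies_ms)  # the typical request
worst_ms = max(latencies_ms)                 # what the unlucky user sees
slow_requests = sum(1 for t in latencies_ms if t > LOST_CLICK_THRESHOLD_MS)

print(mean_ms, median_ms, worst_ms, slow_requests)
```

The average looks healthy even though two requests in a hundred took two seconds, which is exactly the kind of outlier that matters when every click has a price.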

Try to explain that to someone who loses $$$ when they lose "clicks"?

Result? Large-scale sites use load-balanced, fast, cheap and mean web servers - usually Apache - complemented with custom code in C or another 3GL language, to capture time-critical data: the only way to do it with absolute 100% reliability.

The equation is very simple and has never changed: code-bloat never performed well. You want speed at reasonable cost, you reduce the number of instructions you have to execute. Simple as that.

They then aggregate to data stores, as a back-end task. Usually a db, but it's not really important which one at that stage: anything will do as non-fail immediate response is not important by then. And they may use a db as well to conveniently manage and feed the lookup caches of the front-end.
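
The split described above might be sketched roughly as follows: a front-end path that touches only an in-memory lookup cache and an append-only log, and a separate back-end step that aggregates the log for loading into a database later. This is only an illustration of the idea, not how any particular site does it; the cache contents and the function names `record_click` and `aggregate_clicks` are hypothetical, and a real front end would write to a durable log rather than an in-process queue.

```python
from collections import Counter, deque

# Hypothetical lookup data, which a back-end database would refresh periodically.
LOOKUP_CACHE = {"ad-1": "campaign-a", "ad-2": "campaign-b"}

# Stand-in for an append-only click log (in-memory here for simplicity).
click_log = deque()


def record_click(ad_id):
    """Front-end path: a dictionary lookup plus an append, no database call."""
    campaign = LOOKUP_CACHE.get(ad_id, "unknown")
    click_log.append(campaign)
    return campaign


def aggregate_clicks():
    """Back-end path: drain the log and summarise it for later loading into a db."""
    totals = Counter()
    while click_log:
        totals[click_log.popleft()] += 1
    return dict(totals)


for ad in ["ad-1", "ad-2", "ad-1", "ad-9"]:
    record_click(ad)

totals = aggregate_clicks()
print(totals)
```

Because the aggregation runs off the request path, its latency (and the choice of database behind it) has no effect on how quickly a click is captured.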

No need for "rocket-science" db here. So they go with what makes sense in terms of $$$: MySQL, Postgres.

Hence why you don't see much Oracle on these sites, at least not in the front end.

You'll see it used in the back-end: financial side of the business, dw, large offline number-crunching traffic analysis and so on. Where Oracle can really deliver the goods and make sense in economic terms.

It's all to do with the nature of the business. Nothing to do with the merits of a db over another, really.

Alex Gorbachev said...

My comment would be somewhat in line with Noons' observation, but from a slightly different angle.

I agree with him that to achieve extremely high availability and performance, we have to write damn good applications - designed and coded very, very well. They can't afford complex environments either - "complexity is the enemy of availability". I just don't agree that Java itself is the absolute devil. It's the heavy Java frameworks and Java developers' coding and design style (well, not all of them, again).

Having said that, MySQL does NOT give many possibilities for tweaking, tuning and working around badly designed applications. Practically any MySQL problem is fixed on the application side, and the MySQL DBA's role is to provide the right recommendations for that. At least, that's what I concluded from my limited MySQL DBA experience.

I.e. well-designed applications don't need a feature-rich database tier.

On the other hand, Web applications are heavy on read traffic, so caching on the application tier and/or MySQL performs very well there. Now, try to use a middle-tier cache and MySQL in an application with heavy concurrent updates.

Moving further to web 2.0 (I hate that term to be honest)... Modern web applications are quite different. We are only at the *beginning*, but web traffic is no longer read-only, and I don't mean simple logging of page-clicks. Thus, I think Oracle's time is just coming in the web 2.0 space.

Chris Muir said...

Thanks for the comments Alex.

I see your point about most web-apps avoiding complexity in the middle tier. It's my understanding that a fair few of the most successful have actually written their own frameworks, mostly optimised for speed, and fully proprietary.

To bounce a question back at you, you mention MySQL doesn't give many possibilities for tuning and working around badly designed applications. Have you any experience of, or comments on, SQL Server and its capabilities?