What's Next: Data Disasters
Finding what you want in this pile of data would seem to be an easy problem for computers to solve, given that they are so good at fetching and carrying. Servers generate log files indicating what happened to every file or message, log files go into a huge database, and you run queries against this database, right? Unfortunately, these log files are bigger than any database ever. They are bigger than database designers ever expected files to be. They are almost too big to even function in a database application. That's because when data is inserted into an Oracle or IBM DB2 database application, the data gets bigger. It grows by about 30% as metadata (data describing the data) is added. The result is a pile of data petabytes in size (a petabyte is one thousand terabytes). That's not too much data to store, but it's too much data to search. It could take days, weeks, even months to find what you need.
Until very recently the only searchable logging databases of such size I had heard of were at Amazon.com, eBay, and Google--each developed privately over a period of years and costing, in the case of Amazon at least, hundreds of millions of dollars. Amazon.com says it has so far spent more than $900 million on computer technology for its business and continues to invest at a rate of $200 million per year, a lot of it going to massaging log data. Faced with spending $200 million to avoid a $25,000 fine from the SEC, most companies would pay the fine--except for that little part about the CEO going to jail.
It could have been argued that these legal requirements are unreasonable, even unenforceable. But then along came Addamark Technologies, which changed everything. Addamark makes the storage and searching of petabyte logging databases not simple but easy, and easy is what counts. What couldn't be done at all can now be done in seconds and for around 1% of what Amazon.com paid for the same capability.
Addamark began as an idea in the mind of Adam Sah, who was at that time head techie at Internet Pictures, or iPix, which owns the servers that hold all those pictures of goods for sale on eBay and throw them onto your screen. With an average of 16 million items for sale each day on eBay, most of them having one or more pictures, that's a lot of images. It is also a lot of surfing, since iPix had to transmit those pictures over and over again as required by 50 million potential bidders. Because iPix was paid every time a picture was transmitted, its log files were essentially its billing system, and Sah wanted to find a way to generate a detailed bill every day.
Rather than just throw the log data into Oracle or DB2, Sah thought about log data and how it is different from other kinds of database entries. It doesn't change, for one thing, since logs are entirely retrospective and are supposed to tell the truth. Sah found that you can strip log data down to its barest form, then compress it at least 10-to-1 (something you can't do in a regular database), then actually search the compressed data for what you need.
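To get a feel for why append-only log data lends itself to this treatment, here is a minimal Python sketch. It is not Addamark's patented method, which the column does not describe in detail. It simply gzips a batch of made-up, repetitive log lines and then scans them by decompressing on the fly, so the full uncompressed file never has to be stored. The log format and the function names `compress_log` and `search_compressed` are assumptions for illustration only.

```python
import gzip
import io


def compress_log(lines):
    """Join log lines with newlines and gzip them; returns compressed bytes."""
    raw = "\n".join(lines).encode("utf-8")
    return gzip.compress(raw)


def search_compressed(blob, needle):
    """Decompress the blob as a stream and yield each line containing needle.

    Only a small window of data is held in memory at a time; the
    uncompressed log is never written back to disk.
    """
    with gzip.open(io.BytesIO(blob), mode="rt", encoding="utf-8") as stream:
        for line in stream:
            if needle in line:
                yield line.rstrip("\n")


# Hypothetical log lines: image requests that repeat heavily, as real
# web-server logs tend to do.
sample = [
    f"2003-01-01 00:00:{i % 60:02d} GET /images/item{i % 50}.jpg 200"
    for i in range(10_000)
]

blob = compress_log(sample)
raw_size = len("\n".join(sample).encode("utf-8"))
ratio = raw_size / len(blob)  # how many times smaller the compressed log is

# Count how often one particular picture was served.
hits = list(search_compressed(blob, "item7.jpg"))
print(f"compression ratio: {ratio:.1f}x, matching lines: {len(hits)}")
```

Because log entries never change once written, nothing in this scheme has to support updates in place, which is what lets the data stay compressed. The ratio achieved on real logs depends entirely on how repetitive they are.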
The result is a new type of specialized database that can be of almost limitless size yet can be searched in seconds. Addamark can be filled with any kind of log data from any logging application, and if you want to see every e-mail that mentions Microsoft, or when and by whom a confidential document was transmitted, Addamark produces the goods almost instantly. All this and it runs not on mainframes or even big servers but on clusters of commodity PCs. Expanding your Addamark system can mean a trip to Best Buy.
Addamark is shipping today, to customers that include Agilent Technologies, Blue Cross-Blue Shield of North Dakota, Lehman Brothers, and Yahoo. In a high-tech depression this is a company that turned away venture capitalists. It is a 30-person firm at which 12 of those 30 are former CEOs or founders. Addamark, with its patented technology, could be the next Oracle. Remember the name; you might need it.
Robert X. Cringely is a writer, broadcaster, and entrepreneur specializing in technology. Contact him at cringely@inc.com.
Copyright © 2009 Mansueto Ventures LLC. All rights reserved.




