Ahoy Matey! All Clear on the Data Lake
On Thursday, December 29, 1865, a ship named “London” sailed from the East India docks in England to Australia with 220 passengers and 69 crew members. It also carried cargo—too much cargo. By one account it had over 1,200 tons of iron and 500 tons of coal, including 50 tons on deck. When the “London” sailed out of the Thames, one seaman who watched it pass remarked to his friend: “It will be her last voyage. She is too low down in the water; she will never rise to a stiff sea.”
He was unfortunately right.
On January 11, 1866, the “London” capsized under its own weight in heavy seas and 220 people died. The shock and furor over this incident led the British Parliament to create “waterline” regulations that dictated how much weight a ship can carry in certain conditions. These limits were marked with a line on the ship’s hull: once loaded, the ship must not sit so low in the water that the line is submerged. The waterline, or Plimsoll line, has saved thousands of lives and changed shipping as we know it.
Today, Menlo is announcing its investment in Waterline Data, an exciting new big data company that is transforming Hadoop data swamps into navigable, safe Hadoop data lakes.
The proliferation of Hadoop and Hadoop data lakes is unquestionable. By dramatically lowering the cost of storage, Hadoop has enabled enterprises to store incredible amounts of data. However, it has also created a myriad of problems. Nobody knows what’s in these data lakes. Data analysts cannot find or inventory their data. The IT department is struggling with duplicate files and data genealogy. The data governance folks are worried about sensitive data and unfettered access.
These data lakes have become data swamps.
Waterline Data is announcing the release of its flagship product, which addresses all of these problems, at Strata Hadoop World in New York. A ship is only as good as its crew, and the team behind Waterline Data (Alex Gorelik, Jason Chen, Oliver Claude, Ling Ling, and others) is exceptional. They live, eat, and breathe big data. They built the marquee products in this space at Acta (Business Objects), Informatica, IBM, and Teradata, and have come together to be the waterline of this era. Menlo incubated the company in our offices, and we’ve been proud to be their financial partners from day one.
Thankfully, no humans are dying in Hadoop data lakes, but these data swamps are drowning insights and sucking the life out of key business initiatives. Just as the waterline changed shipping, Waterline Data will change Hadoop data lakes forever.
Happy sailing.