In Pictures: 10 Big Data startups to watch

The Big Data space is heating up

  • IDC predicts that the market for Big Data technologies will reach $32.4 billion by 2017, growing at about six times the rate of the overall information and communication technology market. These 10 startups to watch were chosen based on third-party validation, experience, and market potential. We also mixed slightly older startups on the brink of making it big with early-stage startups exhibiting raw potential.

  • Why they're on this list: Sumo Logic claims to address the “unknown unknown” problem of machine data: how do you get insights about data that you don’t know anything about, or, worse, when you don’t even know what you should be looking for? Many IT departments study machine logs, but traditional log management tools rely on pre-determined rules and thus fail to help users proactively discover events they don’t anticipate. Sumo Logic’s Anomaly Detection enables enterprises to automatically detect events in streams of machine data, generating previously undiscoverable insights within a company’s IT and security infrastructure.

  • Why they're on this list: Since the creation of SQL, data analysts have tried to find insights by asking questions and writing queries. The query-based approach has two fundamental flaws. First, all queries are based on human assumptions and biases. Second, query results only reveal slices of data and do not show relationships between similar groups of data. Ayasdi believes a better approach is to look at the “shape” of the data. Ayasdi argues that large data sets have a distinct shape, or topology, and that shape has significant meaning. Ayasdi claims to help companies determine that shape in minutes so they can automatically discover insights without ever having to ask questions, formulate queries, or write code.

  • Why they're on this list: Feedzai claims that it can detect fraud in any commerce transaction, whether the credit card is present or not, in real time. Feedzai uses artificial intelligence (AI) to build more robust predictive models and analyze consumer behavior in a way that mitigates risk, protects consumers and companies from fraud, and preserves consumer trust. Feedzai says that its fraud detection system aggregates both online and offline purchases for each consumer over a longer time frame, which results in earlier, more reliable detection. The software creates profiles for each customer, merchant, location, and POS device, with up to a three-year history of data behind each one.

  • Why they're on this list: Virtualization and cloud management platforms lack actionable information that admins can use to better design, configure, operate, and troubleshoot their systems. CloudPhysics’ goal is to analyze the world’s IT data knowledge to transform computing, driving out machine and human costs in ways never before possible. Today, its servers receive a daily stream of 100+ billion samples of configuration, performance, failure, and event data from its global user base. CloudPhysics’ service combines Big Data analytics with data center simulation and resource management techniques. This approach uncovers hidden complexities in the infrastructure, discovers inefficiencies, and identifies risks.

  • Why they're on this list: Connecting consumers with the products and content that they want and need means that smart businesses end up capturing an ever larger slice of that market. Companies like Amazon, Blue Nile, and even Walmart already leverage large-scale data and tech advantages. To compete with these companies, smaller retailers need to reach their audiences with increasing precision and accuracy. BloomReach’s Organic Search combines web-wide intelligence and site-level content knowledge with machine learning and natural language processing to predict demand and dynamically adapt pages to match consumer behavior and intent.

  • Why they're on this list: Hadoop is quickly becoming a key underlying technology for Big Data, but Hadoop is relatively new and rather complicated. Altiscale’s service is intended to abstract the complexity of Hadoop. Altiscale’s engineers set up, run, and manage Hadoop environments for their customers, allowing customers to focus on their data and applications. When customers’ needs change, services are scaled to fit – one of the core advantages of a cloud-based service.

  • Why they're on this list: Most consumer behavior is influenced by the opinions of people we know and trust. While marketers have known this for quite a while, they have trouble acting on it. Pursway’s software is intended to improve customer acquisition, cross-selling opportunities, and retention. By imprinting a social graph onto existing customer and prospect data, identifying actual relationships between buyers, and identifying target customers who have a demonstrated influence over others’ purchasing decisions, Pursway argues that it can help consumer-facing organizations close the gap between how businesses market and how people actually buy.

  • Why they're on this list: The typical way companies try to understand consumer behavior online is through cookies. On smartphones and tablets, cookies don’t have as much traction. Even if cookies are enabled in mobile browsers, they aren’t terribly useful, since browsers are giving way to apps. A potentially better replacement is location. PlaceIQ says that it “provides a multidimensional depiction of consumers across location and time.” PlaceIQ’s product, Audiences Now, focuses on targeting customers where they are, in real time, creating an immediacy to a brand’s marketing strategy.

  • Why they're on this list: There are challenges that prevent companies from fully extracting value from their data. Legacy database technologies are prone to latency. MemSQL says that it solves this performance bottleneck with a distributed in-memory computing model that runs on commodity servers. MemSQL’s in-memory SQL database accelerates applications, powers real-time analytics, and combines structured and semi-structured data into a consolidated Big Data solution. MemSQL says that it empowers organizations to make data-driven decisions, which helps them to better engage customers, discover competitive advantages, and reduce costs.

  • Why they're on this list: The landscape for Big Data database technology is in flux. Hadoop and NoSQL seem to be the most favored platforms, although plenty of organizations are still betting on SQL. Couchbase is placing its bet on NoSQL. The startup argues that its NoSQL document-oriented database technology provides the scalability and flexible data modeling needed for Big Data-scale projects. Couchbase also claims to offer the first NoSQL database for mobile devices.
