Hadoop – O'Reilly Radar

The chicken and egg of big data solutions

May 16, 2012, 7:00 am

Before I came to O’Reilly I was building the “big data and disruptive analytics practice” at a major systems integrator. It was a blast to spend every week talking to customers in different industries...

Strata Week: Google unveils its Knowledge Graph

May 17, 2012, 6:45 am

Here’s what caught my attention in the data space this week. Google’s Knowledge Graph: “Google does the semantic Web,” says O’Reilly’s Edd Dumbill, “except they call it the Knowledge Graph.” That...

Strata Week: Visualizing a better life

May 24, 2012, 8:30 am

Here are a few of the data stories that caught my attention this week. Visualizing a better life: How do you compare the quality of life in different countries? As The Guardian’s Simon Rogers points...

Strata Week: Data prospecting with Kaggle

June 7, 2012, 9:30 am

Here are a few of the data stories that caught my attention this week. Prospecting for data: The data science competition site Kaggle is extending its features with a new service called Prospect....

Four short links: 9 July 2012

July 9, 2012, 3:00 am

Personalized Leukemia Treatment (NY Times) — sequenced the tumor’s DNA, found the misbehaving gene, realized there was an existing experimental treatment to tackle that gene, and it worked. Reminds me...

Heavy data and architectural convergence

July 9, 2012, 9:45 am

Recently I spent a day at the Hadoop Summit in San Jose. One session in particular caught my attention because it hints at a continued merging of the RDBMS and Hadoop worlds. EMC’s Lei Chang gave a...

Seven reasons why I like Spark

August 21, 2012, 11:45 am

A large portion of this week’s AMP Camp at UC Berkeley is devoted to an introduction to Spark, an open source, in-memory cluster computing framework. After playing with Spark over the last month,...

Four short links: 25 October 2012

October 25, 2012, 3:00 am

Big Data: the Big Picture (Vimeo) — Jim Stogdill’s excellent talk: although Big Data is presented as part of the Gartner Hype Cycle, it’s an epoch of the Information Age which will have significant...

Four short links: 12 November 2012

November 12, 2012, 3:00 am

Teaching Programming to a Highly Motivated Beginner (CACM) — I don’t think there is any better way to internalize knowledge than first spending hours upon hours growing emotionally distraught over...

Predicting the future: Strata 2014 hot topics

August 7, 2013, 8:00 am

Conferences like Strata are planned a year in advance. The logistics and coordination required for an event of this magnitude take a lot of planning, but they also take a decent amount of prediction:...

How to analyze 100 million images for $624

December 10, 2013, 6:00 am

Jetpac is building a modern version of Yelp, using big data rather than user reviews. People are taking more than a billion photos every single day, and many of these are shared publicly on social...

Four short links: 23 May 2014

May 23, 2014, 3:00 am

How to Educate Users (Luke Wroblewski) — help new users in your app, not in a video. Hardware By The Numbers (Renee DiResta) — slides from her keynote at the Solid conference. The mean success rate...

Interactive Big Data analysis using approximate answers

August 18, 2013, 9:00 am

Interactive query analysis for Hadoop-scale data has recently attracted the attention of many companies and open source developers; some examples include Cloudera’s Impala, Shark, Pivotal’s HAWQ,...

Running batch and long-running, highly available service jobs on the same...

September 1, 2013, 9:00 am

As organizations increasingly rely on large computing clusters, tools for leveraging and efficiently managing compute resources become critical. Specifically, tools that allow multiple services and...

Working in the Hadoop Ecosystem

September 5, 2013, 4:30 pm

I recently sat down with Mark Grover (@mark_grover), a Software Engineer at Cloudera, to talk about the Hadoop ecosystem. He is a committer on Apache Bigtop and a contributor to Apache Hadoop, Hive,...

Stream Processing and Mining just got more interesting

September 22, 2013, 9:00 am

Largely unknown outside data engineering circles, Apache Kafka is one of the more popular open source, distributed computing projects. Many data engineers I speak with either already use it or are...

Databricks aims to build next-generation analytic tools for Big Data

September 25, 2013, 6:40 pm

Key technologists behind the Berkeley Data Analytics Stack (BDAS) have launched a company that will build software – centered around Apache Spark and Shark – for analyzing big data. Details of their...

Dealing with Data in the Hadoop Ecosystem

October 24, 2013, 3:01 am

Kathleen Ting (@kate_ting), Technical Account Manager at Cloudera, and our own Andy Oram (@praxagora) sat down to discuss how to work with structured and unstructured data as well as how to keep a...

An Introduction to Hadoop 2.0: Understanding the New Data Operating System

January 3, 2014, 9:00 am

By Rich Raposa. Apache Hadoop 2.0 represents a generational shift in the architecture of Apache Hadoop. With YARN, Apache Hadoop is recast as a significantly more powerful platform – one that takes...
