"Beautiful Data" is a collection of essays on data: how people have transformed it, how they have worked within its confines, and where we might go with it. Some of the essays are wonderful glimpses into how people perceive data, while others fall flat. Overall it's a mostly enjoyable read that helps open your mind to new possibilities.
First, a disclaimer: I am not a data person. However, I've been involved, fairly heavily, in the data field. In industry parlance, I'm a back end person, but I'm always trying to think about the front end: how will things be used, and what information can we glean from the system (or systems)? With that in mind, this is a book that speaks to me - it's all about the front end.
Some of the best essays in the book would be:
In the first essay, Nathan Yau talks at length about user-created data and personal databases (knowledge bases). What's exciting here is how he takes data that is already out there, data you have provided, and creates something useful and, yes, beautiful out of it.
The second essay, by Follett and Holm, gets down to a simple point: if you want the data, you need to present it in a way that brings people into the process. As someone who has a slight crush on the statistics and practices of polling (and on designing poll questions), I found this essay a fascinating read.
The third essay, by Hughes, details how he handled images on the Mars mission. There wasn't anything here that wasn't being done in embedded systems 15 years ago; still, it was a great walk down memory lane, since I used to program embedded imaging systems.
Chapter 4 really hit home: PNUTShell is cloud storage and data processing in real time. This really is the stuff of the future.
Chapter 5, by Jeff Hammerbacher, didn't offer many insights, but his writing style is fluid and fun, and he offers a glimpse into how Facebook grew.
We then have the slow section of the book. Chapter 8, on distributed social data, had promise but read more like a company white paper than an interesting article. The same goes for Chapter 12 and sense.us.
Thankfully, Chapter 10, on Radiohead's "House of Cards" video, was there. Here we are presented with true beauty in data - beautiful enough to create a music video out of!
I'm still on the fence about Chapter 13, "What Data Doesn't Do". It was an interesting chapter, but it felt both too long and too short at the same time. I felt that if the author, Coco Krumme, were to write a book on this topic, I'd want to read it. However, her essay was not the right vehicle.
Finally, the last chapter, "Connecting Data", was a truly inspiring piece, one that offers up paths for the future. I am sure a few startups will form over the questions posed by Segaran (or maybe the questions to the questions).
Overall, there were enough strengths to overcome the weak chapters. My main complaints are trivial: poor binding of the book, too many PhD candidate papers, and not enough from out in the trenches. I'd love to see something from Stonebraker here; it's hard to talk about beautiful data and not have him in it. Or forget sense.us and talk about Many Eyes. Or MapReduce. Still, "Beautiful Data" succeeds. It opened up my mind to different possibilities for data representation and usage.
Mr Reese has taken on a loaded topic, and in fewer than 200 pages he succinctly gets his major points across on that most nebulous of terms: Cloud Computing.
Starting in the first chapter, Mr Reese begins with his definition of cloud: 1) it must be accessible from a web browser or a (non-proprietary) web service API; 2) it requires zero capital expenditure to start; 3) you pay only for what you use.
These simple statements provide the baseline for the rest of the book.
From here he dives right into the meat of the matter. The majority of the book details the things you and your organization will need to keep in mind as you move to, or contemplate moving to, the cloud. Some of these are obvious (cost of ownership, security, disaster recovery, hardware costs, backup, scaling, etc.), but Mr Reese pulls out the threads that make the cloud different, in both good ways and bad.
For example, a new wrinkle for the cloud: what happens when your cloud provider goes out of business, or is hit with a poorly worded injunction that exposes all of its data (including yours) to the federal government? This is not something you worry about when you own the servers. Mr Reese elegantly explains how you can make this something you don't need to worry about even in the cloud, as long as you use some type of encryption.
Another example, and one where the cloud provides a potentially huge win, is disaster recovery. Here a cloud provider offers redundancy of location, and with virtual machines you should be able to get your system up and running again fairly quickly, as long as you've taken the proper precautions (snapshots and a sane backup strategy).
Throughout the book, he really drills in security in the cloud. In several chapters besides the security chapter itself, he keeps coming back to how the little things you do in your design can have a huge impact on your overall security. This is a major worry, and a barrier to entry, for many, and Mr Reese spends just the right amount of time explaining how you can mitigate the security risks.
Another thread that runs throughout the book is scaling your application. This, to me, is one of the bread-and-butter wins of cloud computing. Mr Reese talks through some designs that work, and some that don't, when it comes to scaling. While all the scaling talk is high level, I believe he succeeds in getting you, the reader, to know what questions to ask in your next architecture meeting.
The book is a great overview, and it focuses you on asking the right questions when you are dealing with cloud computing, especially on the Amazon system. Mr Reese takes great pains to point out that yes, he is biased in talking about Amazon, since that's what he knows. Two appendices do talk about GoGrid and Rackspace, but those read more like slick marketing glossies. That's one of the two failings of the book. The other is that a few times Mr Reese tries to go into detail about how something is done on the Amazon cloud (especially EC2 and S3). This is a mistake given how high level the book is. The appendix of EC2 instructions also seems a little out of place. However, these are minor quibbles.
If you are looking for a great introduction to the cloud, what it is and how to think about it, then this is the book for you. If you are looking for something to help you program against, interact with, and learn the API for, say, Amazon, this is not the book for you.