The term data engineer is relatively new, and the role appears only sporadically in technical literature. Yet when you add up all the tasks that fit under this role (obtaining the data, cleaning it, creating enhanced versions), they account for 80 to 90% of the work organizations do with data, by the estimates observers often cite.
Today, many organizations have adopted data lakes for their inexpensive storage and for their flexibility, which supports a great diversity of data and the advanced analytics that yield meaningful insights. The data engineering role has evolved to take advantage of the facilities that data lakes offer.
If you’re pursuing a career in data engineering or looking for ways to adapt your enterprise to the world of big data, this book is our way of sharing some of the background knowledge you need to find your way forward.