“Big Data Glossary” could probably have been titled something like “Big Data Cheat Sheets” because it’s both more and less than a glossary. Rather than a list of terms with definitions, the book is an excellent summary of tools in the “big data” space.
Warden tackles eleven topics, including:
- Some background on fundamental techniques (e.g., key-value stores)
- NoSQL databases
- Storage techniques
- “Cloud” servers
- Data processing technologies (e.g., R and Lucene)
- Natural Language Processing
- Machine Learning
He covers none of these topics in great detail, which will no doubt cause carping among some folks. However, I really like his approach of sketching broad themes, identifying key projects (or products) in each space, and pointing the reader to further research. Because the field of “big data” is so large, this short book (only 50 pages) serves the extremely useful purpose of tying the field together with an overview.
Highly recommended for folks looking to get their feet wet in the great lake of big data.