
Hadoop Lab Deployment and Support for a Leading US Educational Institution
Industry
Education
Technologies
Big data, Hadoop
About
The Client is one of the largest educational institutions in the United States.
Challenge
The Client offers a computer science course for future data analysts and big data professionals. To give students comprehensive training in both theory and practice, the Client had deployed a Hadoop lab in the cloud. However, the Client didn't find this solution cost-effective. To cut expenses, they decided to deploy an on-premises Hadoop lab and commissioned VolgoTechnologies to install and configure the Hadoop cluster, as well as to provide support to ensure the lab's fast adoption.
Solution
Based on the projected data volume to be processed by students and the tasks to be performed, VolgoTechnologies' Hadoop consulting team estimated the minimum and optimal hardware requirements. In addition, our consultants advised the Client on which operating system to choose and which big data technologies and frameworks to deploy so that the Hadoop lab would function as intended. Our team also analyzed which versions of the suggested technologies would work best together in the lab.
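As an illustration only, the kind of storage sizing arithmetic involved in such an estimate can be sketched as below. The case study does not state the figures or the method our consultants used, so every function name, parameter and default here is an assumption, apart from the replication factor of 3, which is the HDFS default.

```python
import math


def estimate_raw_storage_tb(dataset_tb, replication_factor=3,
                            temp_overhead=0.25, max_disk_utilization=0.75):
    """Estimate raw disk capacity (TB) needed to hold a logical dataset in HDFS.

    dataset_tb           -- projected logical data volume (hypothetical input)
    replication_factor   -- HDFS block replication (3 is the HDFS default)
    temp_overhead        -- assumed extra share for intermediate job output
    max_disk_utilization -- assumed ceiling on how full the disks may get
    """
    if dataset_tb < 0:
        raise ValueError("dataset_tb must be non-negative")
    if not 0 < max_disk_utilization <= 1:
        raise ValueError("max_disk_utilization must be in (0, 1]")
    replicated = dataset_tb * replication_factor
    with_temp = replicated * (1 + temp_overhead)
    return with_temp / max_disk_utilization


def estimate_data_nodes(raw_storage_tb, disk_per_node_tb, replication_factor=3):
    """Estimate the number of DataNodes for the given raw capacity.

    HDFS places each replica of a block on a different DataNode, so the
    count never drops below the replication factor.
    """
    if disk_per_node_tb <= 0:
        raise ValueError("disk_per_node_tb must be positive")
    by_capacity = math.ceil(raw_storage_tb / disk_per_node_tb)
    return max(replication_factor, by_capacity)


if __name__ == "__main__":
    # Hypothetical lab: 2 TB of student data, 4 TB of usable disk per node.
    raw = estimate_raw_storage_tb(2.0)
    print(raw, estimate_data_nodes(raw, 4.0))
```

A real estimate would also weigh memory and CPU per node against the workloads students are expected to run, which this storage-only sketch leaves out.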
To keep travel costs down, our team did all the preliminary work offsite. For example, we remotely installed the operating system and configured it. Only the final step – Hadoop deployment itself – required the presence of our consultant onsite.
Results
The Client got a smoothly functioning on-premises Hadoop lab that serves as a valuable source of practical knowledge for their students. Thanks to the training organized by VolgoTechnologies, the Client quickly understood the role of every technology that is part of the Hadoop lab and is ready to use them accordingly.
Technologies and Tools
Hadoop Distributed File System, Hadoop MapReduce, Apache Hive, Apache Hadoop YARN, Apache Ambari, Apache Oozie, Apache Spark, Apache Pig, Apache Zeppelin, Apache Ranger, Anaconda, Apache ZooKeeper, Jupyter Notebook.