A comparative analysis of Big Data technologies and service providers is available. Please get in touch for a copy.
Product setup starts with defining the role or key used to access the cloud and granting the product permission to use that cloud. Setup is completed by installing and configuring the control plane and the data plane. The control plane resides with the product provider, while the data plane belongs to the user; a tunnel connects the two. Various license types give users access at the software, platform, or infrastructure layer, or a combination of these, when adopting this kind of Big Data product. These layers are in turn grouped into compute engines and storage, so a license type essentially grants access to compute engines, storage, or both. Further customized features and functionality are also available, and they are accessible through ODBC, JDBC, SDK, API, or UI. Auto-scaling happens in the cloud on behalf of the product as load increases or decreases.
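The auto-scaling behavior described above can be pictured with a small sketch. It assumes a simple proportional rule: size the worker pool so that the observed load would land at a target utilization, then clamp to fixed bounds. The function name, the 60 percent target, and the bounds are all hypothetical and are not taken from any particular product.

```python
def target_workers(current_workers: int,
                   load_percent: int,
                   target_percent: int = 60,
                   min_workers: int = 1,
                   max_workers: int = 20) -> int:
    """Return a worker count that would bring utilization near the target.

    Hypothetical rule for illustration: scale the current worker count by
    load / target, round up, and clamp to [min_workers, max_workers].
    Integer arithmetic is used so that rounding is exact.
    """
    if current_workers < 1:
        raise ValueError("current_workers must be at least 1")
    if load_percent < 0 or target_percent <= 0:
        raise ValueError("load must be >= 0 and target must be > 0")
    # Ceiling division: ceil(current_workers * load_percent / target_percent)
    desired = (current_workers * load_percent + target_percent - 1) // target_percent
    return max(min_workers, min(max_workers, desired))


# Load above the target adds workers; load below the target removes them.
scale_up = target_workers(current_workers=4, load_percent=90)
scale_down = target_workers(current_workers=4, load_percent=30)
print(scale_up, scale_down)
```

A real service would typically also smooth the load signal and wait between scaling actions so the cluster does not oscillate; those details are left out here.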
For this category of product, setup starts with creating and configuring the environment and cluster metadata, and provisioning the technical stack. Three parameter values are set for the Big Data cluster: the number of master nodes, the number of worker nodes, and the number of scheduler nodes. A machine type must also be selected for these nodes. Once the cluster or technical stack is ready, the cluster can be started. Various out-of-the-box or third-party tools are used to interact with the cluster and hence with the Big Data engines. Such a setup comes with out-of-the-box features and functionality, accessible through ODBC, JDBC, SDK, API, or UI. Auto-scaling happens in the cloud on behalf of the product as load increases or decreases.
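As a rough illustration, the cluster parameters named above (master, worker, and scheduler node counts plus a machine type) could be captured in a small validated structure before provisioning. The class name, field names, the machine type string, and the rule that every count must be at least 1 are assumptions made for this sketch, not any provider's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    """Hypothetical cluster definition; names are illustrative only."""
    master_nodes: int
    worker_nodes: int
    scheduler_nodes: int
    machine_type: str

    def __post_init__(self) -> None:
        # Assumed rule for this sketch: each node role needs at least one node.
        for name in ("master_nodes", "worker_nodes", "scheduler_nodes"):
            value = getattr(self, name)
            if value < 1:
                raise ValueError(f"{name} must be at least 1, got {value}")
        if not self.machine_type:
            raise ValueError("machine_type must be set")

    def total_nodes(self) -> int:
        """Total number of machines the cluster would request."""
        return self.master_nodes + self.worker_nodes + self.scheduler_nodes


# "example-standard-4" is a made-up machine type name.
spec = ClusterSpec(master_nodes=1, worker_nodes=4, scheduler_nodes=1,
                   machine_type="example-standard-4")
print(spec.total_nodes())
```

Validating the specification up front means a bad value is reported before any machines are requested.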
A data lake is a centralized repository that allows you to store all your structured and unstructured data at any scale. You can store your data as-is, without having to structure it first, and run different types of analytics on it, from dashboards and visualizations to big data processing, real-time analytics, and machine learning, to guide better decisions. There are cloud provider data lakes as well as open source data lakes.
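The idea of storing data as-is and applying structure only when it is read can be sketched with a local folder standing in for the lake. This is a minimal sketch using only the Python standard library; the `raw/` layout and the function names are hypothetical, and a real data lake would sit on object storage rather than a temporary directory.

```python
import csv
import json
import tempfile
from pathlib import Path


def land_raw(lake: Path, name: str, payload: bytes) -> Path:
    """Store the payload unchanged under raw/, whatever its format."""
    path = lake / "raw" / name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(payload)
    return path


def read_records(path: Path) -> list[dict]:
    """Apply structure at read time, based on the file extension."""
    text = path.read_text(encoding="utf-8")
    if path.suffix == ".json":
        # One JSON object per line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    if path.suffix == ".csv":
        # The first line is treated as the header row.
        return list(csv.DictReader(text.splitlines()))
    raise ValueError(f"no reader registered for {path.suffix!r}")


with tempfile.TemporaryDirectory() as tmp:
    lake = Path(tmp)
    events = land_raw(lake, "events.json", b'{"user": "a", "clicks": 3}\n')
    sales = land_raw(lake, "sales.csv", b"region,amount\nnorth,10\nsouth,7\n")
    event_rows = read_records(events)
    sale_rows = read_records(sales)
    print(event_rows, sale_rows)
```

Note that the CSV reader returns every value as a string; converting `amount` to a number is a further read-time decision, which is part of what storing data as-is implies.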