Big Data glossary

Deciphering the basics of Big Data

What is Big Data?

Variety    Velocity    Volume

Big Data is a term used to describe data sets, both structured (e.g. databases) and unstructured (e.g. patient records), that are too large or complex for traditional applications to process. Today's Big Data solutions allow companies to access, store, process and analyse this multitude of data to reveal patterns, trends, and associations.

Variety: Refers to the different types and formats of data (text, video, images, etc.). Big Data is characterised by the many different formats of the data.

Velocity: Refers to the speed with which the data flows. Big Data is characterised by the high speed with which data flows.

Volume: Refers to the amount of data. Big Data is characterised by very high volumes of data.

What is a Data Lake?

Extract    Load    Transform

A data lake is a single dumping ground for data in its native format. In contrast to the 'traditional' Data Warehouse approach, the structure and requirements of the data stored in a data lake are not defined until the data is needed. This keeps the data usable for more than one purpose and drives down costs, as storage is no longer tied to a specific use case.

Extract Transform Load (ETL): Data Warehouses employ ETL techniques that transform the data before storing it.

Extract Load Transform (ELT): Data Lakes store the data in raw format and transform it only when it is needed.
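
To make the difference concrete, here is a minimal Python sketch of the two approaches. It is only an illustration: the sample records, the field names and the function names (transform, etl, elt, query_lake) are hypothetical, and plain Python lists stand in for the warehouse and the lake.

```python
import json

# Hypothetical raw events; the field names are illustrative only.
RAW_EVENTS = [
    '{"user": "a", "amount": "10.50"}',
    '{"user": "b", "amount": "3.25"}',
]


def transform(record: str) -> dict:
    """Parse a raw JSON string and convert the amount to a float."""
    parsed = json.loads(record)
    return {"user": parsed["user"], "amount": float(parsed["amount"])}


def etl(raw_records: list) -> list:
    """Warehouse style: transform first, then store the structured rows."""
    return [transform(r) for r in raw_records]


def elt(raw_records: list) -> list:
    """Lake style: store the records untouched, in their native format."""
    return list(raw_records)


def query_lake(lake: list) -> float:
    """Transform only at read time, when the data is actually needed."""
    return sum(transform(r)["amount"] for r in lake)
```

In the ETL path the structure is fixed at load time, so every later use has to fit it. In the ELT path the lake keeps the original strings, and each consumer applies its own transformation when it reads them.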

What is Data Fabric?

Cold Data   Warm Data   Hot Data

Data Fabric is not an application or a piece of software. It is a strategic approach towards data and storage, focused on how to store, manage, transfer and maintain data. It covers a wide spectrum, including but not limited to on-premises systems, offsite cloud-hosted systems, data backups and archives, and other silos.

Organisations can better plan and manage their data by not being limited to a single cluster view. One of the better ways to manage it is to classify the data into:

Cold Data: old archived data

Warm Data: data that is a few days/weeks old

Hot Data: newly arrived data
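
As a rough sketch, this classification can be expressed as a simple rule on the age of the data. The thresholds below (one day for hot, four weeks for warm) and the function name classify are assumptions made for illustration; the right cut-offs depend on each organisation's own usage patterns.

```python
from datetime import datetime, timedelta

# Assumed thresholds: the glossary only says "newly arrived",
# "a few days/weeks old" and "old archived", so these values are illustrative.
HOT_MAX_AGE = timedelta(days=1)
WARM_MAX_AGE = timedelta(weeks=4)


def classify(arrived_at: datetime, now: datetime) -> str:
    """Return 'hot', 'warm' or 'cold' based on how old the data is."""
    age = now - arrived_at
    if age <= HOT_MAX_AGE:
        return "hot"
    if age <= WARM_MAX_AGE:
        return "warm"
    return "cold"
```

A storage policy could then use the label to decide where the data lives, for example fast storage for hot data and cheaper archival storage for cold data.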

How can WHISHWORKS help your business?

We work hard to be the best, and we are proud to be the data specialists of choice for a wide variety of brands. If you want a deeper understanding of your own data's potential, we are the right people to talk to.
