Characteristics
Big data can be described by the following characteristics:
- Volume
- The quantity of generated and stored data. The size of the data determines its value and potential insight, and whether it can be considered big data at all. The size of big data is usually on the order of terabytes or petabytes, or larger.
- Variety
- The type and nature of the data. Earlier technologies such as RDBMSs could handle structured data efficiently and effectively. However, the shift from structured to semi-structured or unstructured data challenged the existing tools and technologies. Big data technologies evolved primarily to capture, store, and process semi-structured and unstructured data (variety) that is generated at high speed (velocity) and is huge in size (volume). Later, these tools and technologies were also explored and used for handling structured data, though mainly for storage. The processing of structured data remained optional, using either big data tools or traditional RDBMSs. This helps in analyzing data to make effective use of the hidden insights in data collected from social media, log files, sensors, and similar sources. Big data draws from text, images, audio, and video, and it completes missing pieces through data fusion.
- Velocity
- The speed at which the data is generated and processed to meet the demands and challenges that lie in the path of growth and development. Big data is often available in real time. Compared to small data, big data is produced more continually. Two kinds of velocity related to big data are the frequency of generation and the frequency of handling, recording, and publishing.
- Veracity
- An extension of the original definition of big data that refers to data quality and data value. The quality of captured data can vary greatly, which affects the accuracy of analysis.
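As a rough illustration of velocity and veracity, the following minimal Python sketch measures the frequency of generation and the share of usable records in a small batch. The sensor readings, the `-999.0` failure sentinel, the plausible value range, and both function names are assumptions made up for this example, not part of any standard definition.

```python
from datetime import datetime, timedelta

# Hypothetical sensor readings as (timestamp, value); None marks a missing value.
start = datetime(2024, 1, 1, 0, 0, 0)
readings = [
    (start + timedelta(seconds=0), 21.5),
    (start + timedelta(seconds=1), None),
    (start + timedelta(seconds=2), 21.7),
    (start + timedelta(seconds=3), -999.0),  # assumed sentinel for a failed read
    (start + timedelta(seconds=4), 21.6),
]

def generation_rate(records):
    """Velocity: intervals between records per second over the observed span."""
    if len(records) < 2:
        return 0.0
    span = (records[-1][0] - records[0][0]).total_seconds()
    return (len(records) - 1) / span if span > 0 else float("inf")

def valid_fraction(records, low=-50.0, high=60.0):
    """Veracity: share of records whose value is present and within range."""
    if not records:
        return 0.0
    valid = [v for _, v in records if v is not None and low <= v <= high]
    return len(valid) / len(records)

rate = generation_rate(readings)
quality = valid_fraction(readings)
print(f"generation rate: {rate} records/s, valid fraction: {quality}")
```

In practice both figures would be computed over a stream rather than a fixed list, but the same two measurements apply.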
Other important characteristics of big data are:
- Exhaustive
- Whether or not the entire system (i.e., n = all) is captured or recorded.
- Fine-grained and uniquely lexical
- Respectively, the proportion of specific data collected per element, and whether each element and its characteristics are properly indexed or identified.
- Relational
- Whether the data collected contains common fields that would enable different data sets to be conjoined or meta-analyzed.
- Extensional
- Whether new fields can be added to, or changed in, each element of the data collected easily.
- Scalability
- Whether the size of the data can expand rapidly.
- Value
- The utility that can be extracted from the data.
- Variability
- Data whose value or other characteristics shift in relation to the context in which it is generated.
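The relational and extensional properties can be made concrete with a small Python sketch. It assumes two made-up data sets that share a `user_id` field; the `join_on` helper and all field names are hypothetical and chosen only for illustration.

```python
# Two hypothetical data sets that share the common field "user_id".
posts = [
    {"user_id": 1, "text": "hello"},
    {"user_id": 2, "text": "big data"},
]
logins = [
    {"user_id": 1, "device": "phone"},
    {"user_id": 3, "device": "laptop"},
]

def join_on(left, right, key):
    """Inner join of two lists of dicts on a shared field."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

# Relational: the shared field lets the two data sets be conjoined.
joined = join_on(posts, logins, "user_id")

# Extensional: a new field is added to every element without a schema change.
for row in posts:
    row["language"] = "en"

print(joined)
```

Only the user present in both data sets survives the join, which is why common fields matter for combining sources.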