Database
A data lake is a system or repository of data stored in its natural/raw format, usually object blobs or files
It is a single store of data, including raw copies of source system data, sensor data, social data, etc.
A data lake can include structured data from relational databases (rows and columns), semi-structured data (CSV, logs, XML, JSON), unstructured data (emails, documents, PDFs) and binary data (images, audio, video)
A data lake can be established "on premises" (within an organisation's data centres) or "in the cloud" (using cloud services from vendors such as Amazon, Microsoft, or Google)
Transformed data is used for tasks such as reporting, visualisation, advanced analytics and ML
BLOB stands for Binary Large OBject. A blob is a data type that can store binary data
This is different from most other data types used in databases, such as integers, floating point numbers, characters, and strings, which store letters and numbers
A BLOB is a large, complex collection of binary data stored in a database
BLOBs are typically used to store media files such as images, video, and audio
Because BLOBs hold multimedia files, they can take up a large amount of disk space
In some database systems the length of a BLOB may go up to 2,147,483,647 bytes (2^31 - 1, the largest signed 32-bit integer)
BLOB provides fast multimedia transfer
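As a minimal sketch of storing and reading back a BLOB, the example below uses Python's built-in sqlite3 module with an in-memory database. The table name media, its columns, and the payload bytes are hypothetical and chosen only for illustration; other database systems have their own BLOB types and size limits.

```python
import sqlite3

# Hypothetical payload standing in for an image or audio file.
payload = bytes(range(256)) * 4  # 1024 bytes of arbitrary binary data

# In-memory database; the "media" table and its columns are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE media (id INTEGER PRIMARY KEY, name TEXT, content BLOB)"
)

# Python bytes objects are stored as BLOB values by sqlite3.
conn.execute(
    "INSERT INTO media (name, content) VALUES (?, ?)",
    ("sample.bin", payload),
)
conn.commit()

# length() on a BLOB returns its size in bytes.
stored, size = conn.execute(
    "SELECT content, length(content) FROM media WHERE name = ?",
    ("sample.bin",),
).fetchone()
conn.close()

print(size, stored == payload)
```

The value comes back as a bytes object identical to what was written, which is the property that makes BLOB columns usable for media files.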
Extract, load, transform (ELT) is an alternative to extract, transform, load (ETL) used with data lake implementations
The data is not transformed on entry to the data lake but stored in its original raw format, which enables faster loading times
ELT requires sufficient processing power within the data processing engine to carry out the transformation on demand and return the results in a timely manner
Since the data is not processed on entry to the data lake, the query and schema do not need to be defined a priori
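The ELT idea described above can be sketched in a few lines, assuming a small in-memory SQLite table stands in for the data lake. The table name raw_events, the sample sensor records, and the function average_temp_by_sensor are all hypothetical. The raw JSON strings are loaded unchanged, and the schema is only interpreted when a query runs.

```python
import json
import sqlite3

# Hypothetical raw records as they might arrive from a source system.
raw_events = [
    '{"sensor": "a", "temp_c": 21.5}',
    '{"sensor": "b", "temp_c": 19.0}',
    '{"sensor": "a", "temp_c": 22.5}',
]

# Extract + Load: store the records as-is, with no parsing or schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_events (body TEXT)")
conn.executemany(
    "INSERT INTO raw_events (body) VALUES (?)",
    [(event,) for event in raw_events],
)


def average_temp_by_sensor(connection):
    """Transform on demand: parse the raw JSON only when the question is asked."""
    readings = {}
    for (body,) in connection.execute("SELECT body FROM raw_events"):
        record = json.loads(body)
        readings.setdefault(record["sensor"], []).append(record["temp_c"])
    return {sensor: sum(vals) / len(vals) for sensor, vals in readings.items()}


averages = average_temp_by_sensor(conn)
conn.close()
print(averages)
```

Loading is cheap because nothing is parsed up front; the cost moves to query time, which is why the processing engine needs enough capacity to transform on demand.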
Latency: Response time for a single request
Throughput: Number of transactions per second
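To make the two metrics concrete, here is a rough sketch that times a stand-in workload. The function handle_request and the parameters num_requests and work are hypothetical placeholders for a real transaction; the numbers it prints depend entirely on the machine running it.

```python
import time


def handle_request(work):
    """Hypothetical stand-in for one transaction: a small CPU-bound task."""
    return sum(i * i for i in range(work))


def measure(num_requests=200, work=1000):
    """Run requests one after another and report latency and throughput."""
    latencies = []
    start = time.perf_counter()
    for _ in range(num_requests):
        t0 = time.perf_counter()
        handle_request(work)
        latencies.append(time.perf_counter() - t0)  # time for this one request
    elapsed = time.perf_counter() - start
    return {
        "mean_latency_s": sum(latencies) / len(latencies),
        "throughput_per_s": num_requests / elapsed,  # requests completed per second
    }


stats = measure()
print(stats)
```

With requests handled strictly one at a time, throughput is roughly the inverse of mean latency. Systems that process requests concurrently can raise throughput without lowering the latency of any single request.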
Extract, transform, load (ETL) and extract, load, transform (ELT) are the two main approaches used to build a data warehouse system