Storing Archive Data in the Cloud
Data archiving is the misunderstood cousin of backup. Implemented properly, it could save a business or organisation thousands, or even hundreds of thousands, in time and expenditure every year. A data archive isn’t somewhere to put legacy data and let it gather dust just because it might be needed sometime in the future. The software we provide is an active data archive: active and legacy data are constantly monitored and, based on policies, data can be moved to an on-premise NAS and ultimately to a cloud archive.
Unlike past HSM (hierarchical storage management) solutions that relied on static stubs and agents, our data archiving solution uses symbolic links. Archived files continue to be accessed as files from their original location, while the data actually resides as objects in the cloud. This file-to-object translation does not require rehydration back to the source. Archived files are also accessible as native objects in the cloud without going through the application or the original file storage, so there is no lock-in. Finally, the software does not sit in the path of hot data access.
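To make the symbolic link idea concrete, here is a minimal sketch in Python. It is not the product's implementation: the function name `archive_with_symlink` and the use of a local directory as the archive target are assumptions for illustration, and a real archive target would be cloud object storage rather than another folder. It assumes a POSIX-style system where creating symbolic links is permitted.

```python
import os
import shutil
import tempfile


def archive_with_symlink(source_path: str, archive_dir: str) -> str:
    """Move a file into archive_dir and leave a symbolic link at its old path.

    Hypothetical helper: the archive here is just a local directory.
    """
    os.makedirs(archive_dir, exist_ok=True)
    target_path = os.path.abspath(
        os.path.join(archive_dir, os.path.basename(source_path))
    )
    if os.path.exists(target_path):
        # Refuse to overwrite something that is already archived.
        raise FileExistsError(target_path)
    shutil.move(source_path, target_path)
    # The link keeps the original location usable for users and applications.
    os.symlink(target_path, source_path)
    return target_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        original = os.path.join(root, "nas", "report.txt")
        os.makedirs(os.path.dirname(original))
        with open(original, "w") as handle:
            handle.write("legacy data")
        archive_with_symlink(original, os.path.join(root, "archive"))
        # Reading through the original path still works after archiving.
        with open(original) as handle:
            print(handle.read())
```

The point of the sketch is that the original path keeps working after the move, which is why users do not need to change how they open archived files.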
The data archiving software leverages the sources and targets themselves to create a redundant, overarching namespace. When data is moved, it keeps the data object in native form at the destination and also preserves all the file metadata and access controls. When an archived file is accessed from the source, the Cloud File System receives the access request and translates the cloud object into a file without having to rehydrate it back to the original storage. This provides a highly efficient way to conserve space on the original Network Attached Storage (NAS) by leveraging cheaper cloud object storage classes, such as Amazon S3. It also ensures the archived data in the cloud can be natively accessed as objects in the cloud without using the software application or the original storage, so there is no lock-in. This is very important because it allows you to use native cloud tools to access, process and extract value from your archived data. Finally, the Cloud File System ensures a file view of the archived objects, independent from the source, so data can be directly used in the cloud either as files or as objects.
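As a rough illustration of keeping file metadata alongside a native object, the sketch below builds an object key and a metadata map from a file on disk. The function name `describe_for_object_store`, the key layout and the metadata field names are all assumptions made for this example; the product's actual object layout is not described here, and nothing is uploaded.

```python
import os
import stat
import tempfile


def describe_for_object_store(path: str, root: str) -> dict:
    """Return a hypothetical object key and metadata map for a file under root."""
    info = os.stat(path)
    # Use the path relative to the share as the object key, with "/" separators.
    key = os.path.relpath(path, root).replace(os.sep, "/")
    return {
        "key": key,
        "metadata": {
            # Values are strings, as object stores typically expect for metadata.
            "mode": stat.filemode(info.st_mode),
            "uid": str(info.st_uid),
            "gid": str(info.st_gid),
            "mtime": str(int(info.st_mtime)),
            "size": str(info.st_size),
        },
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, "projects", "plan.txt")
        os.makedirs(os.path.dirname(path))
        with open(path, "w") as handle:
            handle.write("hello")
        print(describe_for_object_store(path, root))
```

Because the object stays in its native form and the file attributes travel with it as plain key-value pairs, native cloud tools could read both without the archiving software.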
The Unstructured Data Problem
Unstructured data is the biggest headache today for any organisation trying to control and manage its data. It accounts for over 70% of all information stored and is growing at 61% per annum!
Reduce Backup Times by 80%, by only backing up hot data!
Firstly, let us understand what we are dealing with. Unstructured data is information that is typically not stored in a database.
Unstructured data manifests itself in two ways:
- “TEXT” can be e-mail, texts, Word documents, presentations, messaging systems, Twitter, Facebook etc.
- “RICH MEDIA” can be images, sound files, movie files etc.
As we have explained, unstructured data consumes vast amounts of storage, but another consideration is legislation. Where this data resides matters if you need to retrieve information for a compliance audit or lawsuit, and this is where a data archiving strategy can help.
Unstructured data:
- can be of any type
- does not necessarily follow any format or sequence
- does not follow any rules
- is not predictable
Discovery of Unstructured Data
Identifying this data is vitally important: organisations need to find out whether it has intrinsic value to the business or is the next lawsuit waiting to happen. Unstructured data resides in many places (desktops, laptops, servers, NAS, SAN and cloud) and it is growing fast, very fast!
IDC estimates that by 2025 we will be creating 463 EB (exabytes) of data daily, or roughly 169 ZB annually, which is four to five times the 2020 estimates.
Firstly, we need to identify the types of unstructured data we hold and where they currently reside. From this we can plan how to answer the following questions:
- How much unstructured data do we have?
- How many copies of the same file do we have?
- On which systems and data storage platforms does the information reside?
- When was it created?
- When was it last accessed?
- What size is the file data?
- Who owns the files?
- When was it last modified?
- Is the data relevant to the business?
- How many copies do we have?
- Do the files need to be archived?
- Should the data be restricted?
- Who is generating this data?
- Is the data ours?
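Several of the questions above (how much data, how many copies, when it was last accessed or modified, how large it is, who owns it) can be answered by scanning a file tree. The following is a minimal sketch using only the Python standard library. The function names `scan_tree` and `find_duplicates` are illustrative assumptions, not part of any product, and the sketch reports last-modified and last-accessed times only, because file creation time is not available portably across operating systems.

```python
import hashlib
import os
import tempfile
from collections import defaultdict


def scan_tree(root: str) -> list[dict]:
    """Collect size, owner, timestamps and a content hash for each file."""
    records = []
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            if os.path.islink(path):
                continue  # skip links so archived files are not counted twice
            info = os.stat(path)
            digest = hashlib.sha256()
            with open(path, "rb") as handle:
                # Hash in 1 MiB chunks so large files do not fill memory.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            records.append({
                "path": path,
                "size_bytes": info.st_size,
                "owner_uid": info.st_uid,
                "last_modified": info.st_mtime,
                "last_accessed": info.st_atime,
                "sha256": digest.hexdigest(),
            })
    return records


def find_duplicates(records: list[dict]) -> dict[str, list[str]]:
    """Group paths by content hash and keep only hashes seen more than once."""
    groups = defaultdict(list)
    for record in records:
        groups[record["sha256"]].append(record["path"])
    return {h: sorted(paths) for h, paths in groups.items() if len(paths) > 1}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for name, text in [("a.txt", "same"), ("b.txt", "same"), ("c.txt", "other")]:
            with open(os.path.join(root, name), "w") as handle:
                handle.write(text)
        found = scan_tree(root)
        print("files:", len(found))
        print("total bytes:", sum(r["size_bytes"] for r in found))
        print("duplicate groups:", len(find_duplicates(found)))
```

Questions such as whether the data is relevant to the business, or whether it is yours at all, cannot be answered from file attributes alone and still need human review.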
Existing IT Investment
Companies spend a huge amount of money on storage and servers, and that investment is growing year on year. Recent reports indicate that by 2025 we will be purchasing two to three times as much storage capacity as we do today. Whether this is cloud storage or on-premise, the data management issues aren’t going away.
Implementing a tiered data archive for unstructured data, and moving that data through the different storage tiers, frees up valuable disk space on the most expensive, highest-performing storage.
Moving this data slows the ongoing investment in tier 1 storage, giving a significant return on investment. An additional benefit of active data archiving is that you may be able to reuse your existing, older storage systems to archive data.
Where unstructured data is stored is an important consideration. As data growth accelerates, managing unstructured data will consume an increasing share of the IT budget and available resources.
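A tiering policy of the kind described in this section can be sketched as a simple rule based on how long ago a file was last accessed. The thresholds (90 and 365 days), the tier names and the `TierPolicy` and `choose_tier` names below are assumptions chosen for illustration only; real policies would be set per organisation and may use other criteria such as file type or owner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierPolicy:
    """Hypothetical age thresholds, in days since last access."""
    nas_after_days: int = 90
    cloud_after_days: int = 365


def choose_tier(days_since_access: float, policy: TierPolicy = TierPolicy()) -> str:
    """Return the storage tier a file should live on under the given policy."""
    if days_since_access >= policy.cloud_after_days:
        return "cloud-archive"
    if days_since_access >= policy.nas_after_days:
        return "on-premise-nas"
    return "tier-1"


if __name__ == "__main__":
    for age in (10, 120, 800):
        print(age, "days ->", choose_tier(age))
```

Only files that fall into the first tier stay on the most expensive storage, which is where the saving on tier 1 capacity comes from.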
Data Archiving Benefits
- Cost savings
- Energy savings
- Improved ROI
- Decreased backup times
- Free up valuable Tier 1 disk space
- Non-disruptive to users
- Enable identification of data for business governance
Download our Infographic on Unstructured Data https://bit.ly/35OlKaq