Datastor Questions & Answers

  • Software-Based Data Deduplication Technology

Q. What is data deduplication?

A. Data deduplication is a technology that reduces storage requirements by identifying and removing redundant data.

Q. What data deduplication techniques does DataStor Shield use?

A. All DataStor Shield products are built around the same enterprise-class, patent-pending Adaptive Content Factoring™ engine. Whichever product best suits your environment, you can be confident that you have the same deduplication technology that currently protects multinational data centre sites.

DataStor Shield uses three data deduplication techniques to identify and remove redundant data. First, all data is compressed using advanced data compression. Second, global single-instance storage (G-SIS) removes redundant data at the file level, regardless of file name, path or even server. Third, active files are analysed at a sub-file level to remove redundancy within them, which helps with problem files such as PSTs, Exchange EDBs and SQL databases.
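The three techniques can be illustrated with a minimal sketch. This is not the Adaptive Content Factoring engine, whose internals are not described here. The fixed 4 KB block size, the SHA-256 digests, the zlib compression and the `DedupStore` class are all assumptions chosen for illustration.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # hypothetical fixed block size; real products may use variable blocks


class DedupStore:
    """Toy store combining compression, file-level and sub-file deduplication."""

    def __init__(self):
        self.chunks = {}   # chunk digest -> compressed block (each unique block stored once)
        self.files = {}    # file digest -> ordered list of chunk digests
        self.catalog = {}  # (server, path) -> file digest

    def put(self, server, path, data):
        file_id = hashlib.sha256(data).hexdigest()
        # File-level single instancing: identical content is stored once,
        # whatever its name, path or server.
        if file_id not in self.files:
            recipe = []
            for i in range(0, len(data), CHUNK_SIZE):
                block = data[i:i + CHUNK_SIZE]
                chunk_id = hashlib.sha256(block).hexdigest()
                # Sub-file deduplication: only blocks not seen before are kept,
                # and each is compressed before it is stored.
                if chunk_id not in self.chunks:
                    self.chunks[chunk_id] = zlib.compress(block)
                recipe.append(chunk_id)
            self.files[file_id] = recipe
        self.catalog[(server, path)] = file_id
        return file_id

    def get(self, server, path):
        # Every catalogued file can be rebuilt in full from its block list.
        recipe = self.files[self.catalog[(server, path)]]
        return b"".join(zlib.decompress(self.chunks[c]) for c in recipe)

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())


if __name__ == "__main__":
    v1 = b"".join(bytes([i]) * CHUNK_SIZE for i in range(4))
    v2 = v1[:3 * CHUNK_SIZE] + bytes([9]) * CHUNK_SIZE  # last block changed

    store = DedupStore()
    store.put("srv1", "mail.pst", v1)
    store.put("srv2", "copy-of-mail.pst", v1)  # same content, different server
    store.put("srv1", "mail-next-day.pst", v2)
    print(len(store.chunks))  # 5 unique blocks held for three 4-block files
```

In this sketch the second copy adds nothing to the store and the changed file adds only one new block, yet each of the three files can still be restored in full.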

Q. Does DataStor Shield replace my existing backup application?

A. Yes, DataStor Shield can protect data without a third-party backup application. This is just one more cost saving offered by the solution.

Q. Is data deduplicated before it touches the network?

A. Yes. DataStor Shield removes redundant data on the protected server before any data is transferred across LAN or WAN connections, an approach often called “source-based” data deduplication.

Q. Does DataStor Shield require proprietary hardware?

A. No, DataStor Shield is a software-only solution that runs on Microsoft Windows. The administrator is free to choose the hardware that best fits their needs and budget.

Q. Can DataStor Shield protect data stored on non-Windows-based servers (Linux, Solaris, HP-UX, etc.)?

A. Yes, DataStor Shield can “post-process” data that has been transferred from non-Windows-based servers. For example, a database backup on HP-UX is transferred to the DataStor Shield server, where a local protection plan deduplicates it and stores it efficiently.
Currently, DataStor Shield supports “source-based” data deduplication only on Microsoft Windows.

Q. What is the difference between “source-based”, “post-process” and “in-line” data deduplication?

A. One important aspect of data deduplication is WHERE the redundant data is processed and removed. “Source-based” products process and remove redundant data on the protected server, before it is transferred across the network. “Post-process” and “in-line” products process data in a central location and only store unique data. “Post-process” products also require extra disk space to cache the data before redundant data is removed.
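The difference in network traffic can be sketched as follows. This is an illustration only, not DataStor Shield's protocol: the `BackupServer` class, the 4 KB blocks and the SHA-256 digests are assumptions, and the small cost of sending the digests themselves is ignored.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size for this sketch


def digest(block):
    return hashlib.sha256(block).hexdigest()


def split_blocks(data, size=BLOCK_SIZE):
    return [data[i:i + size] for i in range(0, len(data), size)]


class BackupServer:
    """Toy central store that keeps one copy of each unique block."""

    def __init__(self):
        self.blocks = {}

    def missing(self, digests):
        # Tell the client which blocks the server does not hold yet.
        return {d for d in digests if d not in self.blocks}

    def receive(self, blocks_by_digest):
        self.blocks.update(blocks_by_digest)


def source_based_backup(server, data):
    """Deduplicate on the protected server; return payload bytes sent."""
    blocks = {digest(b): b for b in split_blocks(data)}
    wanted = server.missing(blocks.keys())
    payload = {d: blocks[d] for d in wanted}
    server.receive(payload)
    return sum(len(b) for b in payload.values())


def post_process_backup(server, data):
    """Send everything, stage it centrally, then deduplicate; return bytes sent."""
    staged = bytes(data)  # the full copy occupies cache space on the server
    blocks = {digest(b): b for b in split_blocks(staged)}
    server.receive({d: b for d, b in blocks.items() if d not in server.blocks})
    return len(staged)


if __name__ == "__main__":
    day1 = b"".join(bytes([i]) * BLOCK_SIZE for i in range(4))
    day2 = day1[:3 * BLOCK_SIZE] + bytes([9]) * BLOCK_SIZE  # one block changed

    src, post = BackupServer(), BackupServer()
    print(source_based_backup(src, day1), source_based_backup(src, day2))    # 16384 4096
    print(post_process_backup(post, day1), post_process_backup(post, day2))  # 16384 16384
```

Both toy servers end up holding the same five unique blocks. The source-based run sends only the changed block on the second day, while the post-process run sends the whole file again.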

Q. Can iSCSI connected storage be used by DataStor Shield to store deduplicated data?

A. Yes, DataStor Shield uses standard NTFS volumes to store deduplicated data. These NTFS volumes can be internal, iSCSI-connected or Fibre Channel-connected.

Q. Can NAS connected storage be used by DataStor Shield to store deduplicated data?

A. Yes, DataStor Shield 3.0 and later versions support NAS-connected shares (CIFS/SMB, NFS) to store deduplicated data. These shares do not require NTFS.

Q. Does DataStor Shield install agent software on all the protected servers?

A. No, the only thing that DataStor Shield puts on the protected server is a scheduled task. This scheduled task remotely executes the deduplication process directly from the DataStor Shield server. This configuration simplifies future software upgrades because only the DataStor Shield server must be upgraded. Every scheduled task is still managed centrally through the DataStor Shield™ management interface.

Q. What is the overhead (CPU and memory) of the deduplication process running on the protected server?

A. The deduplication process running on the protected server uses less than 20 MB of memory. CPU utilisation varies with the number and speed of the CPUs. On most modern servers it ranges between 25% and 35% while the plan is running.

Q. Can the DataStor Shield server be a virtual machine?

A. Yes. Because DataStor Shield fully distributes the data deduplication process across the protected servers, the overhead on the DataStor Shield server is much lower. Note, however, that backend processes such as data expiration and data verification require more CPU and memory. These backend processes will take longer if the DataStor Shield server is running in a virtual machine.
Storage scalability should also be considered when DataStor Shield is running in a virtual machine. Determine how much storage capacity can be connected to the virtual machine and verify this meets the needs of your environment.

Q. Can DataStor Shield deduplicate and store virtual machine images (VMDK, VHD, XVA)?

A. Our best practice is to run a protection plan within the VM as if it were a physical server. This has several advantages. If the VM is an application server, our Exchange and SQL support will quiesce the system during the protection plan run. G-SIS will store the files more efficiently than VM image processing would. You will also see a shorter backup window, allowing additional plan runs per day. However, if you need to protect VM image files, DataStor Shield™ will process the large image files themselves. Simply store a copy of the virtual machine images directly on the DataStor Shield server and schedule a local protection plan to store these images efficiently. The original images can be overwritten every day with new images while DataStor Shield keeps a deduplicated backup history.

Q. Can DataStor Shield deduplicate and store Microsoft Exchange storage groups?

A. Yes, DataStor Shield supports plans for Exchange 2003 and Exchange 2007, integrating with the Exchange VSS Writer found in Windows 2003 or later to capture a consistent image of Exchange storage groups while they are mounted. Once DataStor Shield has a consistent image, it uses sub-file data deduplication to remove redundant data found in the large EDB files. Every recovery point is a FULL backup, but the disk space used is far less.
Exchange plans automatically discover storage group file locations, perform integrity checks on all EDB databases, and truncate logs after a successful backup.

Q. Can DataStor Shield deduplicate and store Microsoft SQL databases?

A. Yes, DataStor Shield supports plans for SQL in both Simple Recovery mode and Full Recovery mode, integrating with the SQL VSS Writer found in Windows 2003 or later to capture a consistent image of SQL databases while they are mounted. Once DataStor Shield has a consistent image, it uses sub-file data deduplication to remove redundant data found in the large MDF files. Every recovery point is a FULL backup, but the disk space used is far less.

Call for more information on Datastor, phone: 01953 886544