Take it from those who know firsthand: The best way to cover your storage needs is to plan ahead, know your data and users, use generous estimates, and review your needs often.
From the very beginning of its enterprise resource planning system implementation, Chevron Canada was clearly going to need a way to store older data so that it could be accessed, if only occasionally, in an organized manner. But company executives had no idea how to go about finding the right solutions or determining how much storage they would need for their R/3 system from SAP AG, of Walldorf, Germany.
Soon enough it came down to a choice between archive storage systems from FileNET Corp., of Costa Mesa, Calif., and iXOS Software AG, of Munich, Germany, because at the time, those systems worked the most seamlessly with R/3. In the end, Chevron analysts chose iXOS, a software-based archiving solution tightly integrated with (and now owned by) SAP.
Top determinants of storage burden
- Number of concurrent users.
- Number of modules.
- Volume of transactions.
- Length of time for which you must keep your transactions live in the database versus archiving them.
To address the thornier problem of how much storage the system would need over the long term, Edmund Yee, manager of network operations for Vancouver, British Columbia-based Chevron Canada, went with SAP's recommendation: letting Houston-based hardware vendor Compaq Computer Corp.'s SAP Competency Center estimate the company's needs.
But the Competency Center's estimates proved woefully inadequate, and the entire scene ended with Yee and his team throwing up their hands in disgust. "We went with the certified competency center, which got the numbers by using our data on how many users were on the system, what modules we planned to run, and other factors," Yee says. "We even took a conservative approach by doubling the recommendation, but within three months, we were out of space. Whatever metrics they were using just weren't working."
Disk storage capacity
The ratio of disk gigabytes to processor capacity has been found to be an effective barometer of disk space management. As a rule of thumb, the ratio of addressable storage capacity (in gigabytes) to MIPS is about 4:1, regardless of datacenter size. However, ERP systems create additional storage demands above the rule of thumb, though the impact has not yet been quantified. This chart shows the amount of storage used by IT organizations with less than 300 MIPS, between 300 and 799 MIPS, and more than 800 MIPS of data center processing power.
Source: Compass America Inc., www.compassamerica.com
Compaq spokesperson Keith Billow says that at the time Chevron Canada began going live with R/3, about six years ago, the ERP suite itself was fairly new in North America. Furthermore, Chevron was one of the first to use the Microsoft Corp. Windows NT platform for its R/3 implementation. So without much past data to work with, the Compaq SAP Competency Center was unable to come up with an accurate estimate. These days, of course, that's not the case, Billow notes.
At that point, though, Yee figured he could do better on his own. "We decided to forget about this estimating stuff and use real live data, so we took a month or two to see how much we would rack up. We tracked and monitored the usage of our system with the same methodology we use for all of our sizing and capacity planning," he says.
After monitoring the system for several months, Yee's team determined that the database was growing at a rate of nearly 3 gigabytes per month. "We noticed that as the database grew, it started to slow down. So we went back to our people and asked them how much data they needed to have available. They told us they needed data available for one year, so we started looking for archiving solutions," he says.
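The arithmetic behind this kind of measured sizing is simple enough to sketch. The Python below is a minimal illustration, assuming the database grows linearly at the observed rate. The function names and the capacity figures are hypothetical; only the 3-gigabyte monthly growth rate and the one-year availability requirement come from Chevron Canada's account.

```python
import math

def months_until_full(capacity_gb, used_gb, growth_gb_per_month):
    """Whole months before a linearly growing database exceeds its disk capacity."""
    if growth_gb_per_month <= 0:
        raise ValueError("growth rate must be positive")
    free_gb = capacity_gb - used_gb
    if free_gb <= 0:
        return 0  # already out of space
    return math.floor(free_gb / growth_gb_per_month)

def online_window_gb(growth_gb_per_month, retention_months):
    """Data accumulated during the period users need kept online."""
    return growth_gb_per_month * retention_months

# Hypothetical capacity and usage; 3 GB/month is the rate Yee's team measured.
print(months_until_full(capacity_gb=40, used_gb=22, growth_gb_per_month=3))  # 6
print(online_window_gb(growth_gb_per_month=3, retention_months=12))          # 36
```

The second figure is the more useful one for archiving decisions: at this rate, a one-year online window holds roughly 36 gigabytes of new data, and anything older is a candidate for the archive.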
Stemming the growth of that database was key, and to do that Chevron Canada now uses iXOS-ARCHIVE. Here's how it works: The SAP system sends the sales data to the iXOS system, where the data is processed and put into temporary disk storage. The iXOS system then controls how much data is left in that buffer and how much is moved to long-term storage media. Chevron Canada chose to store its long-term data on CD-ROM disks and short-term data on Compaq disk drives.
Clearly, Chevron Canada's situation points out what a growing number of companies are starting to realize: Unless you find a way to archive older data, your ERP system will grow more unwieldy with time. Analysts report ERP implementations generating up to 60 gigabytes of storage growth per month, which makes Chevron Canada's problem seem relatively tame.
To archive or not to archive
Why, some ask, can't I just use the archiving facility that comes with my ERP system instead of shelling out more money for an additional piece of software and a host of storage media? Steve Tirone, senior analyst for the SAP Advisory Program at AMR Research Inc., of Boston, explains that there are two sides to archiving:
"There is data you have to be able to access all the time, and then there is data you have to keep forever in a readable format, but perhaps not on the main system," he says. "ERP systems all have some way of allowing you to take transactions out of your production tables and put them into a compressed archiving table, but those files are still within your system."
That's where third-party solutions such as iXOS-ARCHIVE and CommonStore from IBM Corp., of Armonk, N.Y., come in (see box at right). These solutions take the archive files generated by ERP archiving and manage them in an accessible format. They move the data to tertiary storage on magneto-optical hard drives, CD-ROMs, disk cache, or tape array.
Depending on whether or not you need to access the archived data frequently and what formats you need the data in, you may not need a third-party system. "But if you don't buy one, you have to keep archived files within your ERP storage systems," Tirone says. "So you are caught between a rock and a hard place." In the end, he says, most companies with large ERP systems will end up buying a third-party archive storage system.
Understanding your data
Before you engage in archiving, make sure you understand your company's data and its relationship to other data and other applications within the organization.
Popular archive storage solutions for ERP
Here, from ERP-specific to more general, are five of the most popular after-market archiving solutions available today:
IBM CommonStore
IBM Corp., Armonk, N.Y.
This product is specifically for SAP archiving and document management. It removes inactive data from the operational SAP system to an external archive. Using the SAP ArchiveLink interface, IBM CommonStore integrates with R/3's financial, logistics, and human resources applications. The product also can function as a universal document management system. It supports storage devices, including optical jukeboxes and tape libraries.
iXOS-ARCHIVE
iXOS Software AG, Munich, Germany
Since iXOS is an SAP business partner, it's only natural that iXOS is optimized for the SAP environment. This software tool allows R/3 users to view an entire business process and all relevant documents from inside the SAP ERP system. The product can support several thousand users and incorporate tens of thousands of new documents every day, according to a company spokesperson. The system works with all of the major archiving hardware.
Panagon Document Warehouse for SAP
FileNET Corp., Costa Mesa, Calif.
This suite of software allows SAP customers to capture, display, store, retrieve, and manage R/3-linked images, documents, and reports. It archives R/3 objects, including invoices and reports, and links to SAP Business Workflow. The modular, component-based architecture integrates to the SAP ArchiveLink interface and supports all SAP archiving scenarios.
ViewDirect
Mobius Management Systems Inc., Rye, N.Y.
This general archiving engine and server supports both host-based and client/server implementations, scaling from the desktop to the department to the enterprise. Supported platforms include OS/390, Windows NT, UNIX, OS/2, OS/400, and NetWare. The system integrates all documents, making them available for viewing through a single point of access. ViewDirect's Universal Archive is a device-independent storage format that supports any mix of storage media, including disk, tape, and optical. It automatically manages the migration of long-term retention documents, moving the archive to less expensive storage as documents age. Through ViewDirect's EnterpriseIndex users can access multilevel documents.
"A company like Coca-Cola is going to use more features, and there will be many more cross-referenced tables than exist in a smaller company," says John Klaren, manager of presales consultants at iXOS. "If you turn on switches A, C, and D versus only A or A, B, C, and D, it will do different things. You have to understand your data before you can archive it."
Make sure the data you want to archive is "business complete"--that all documents associated with the document being archived are no longer in use and that they can all be referenced together in their archived state--before archiving, recommends Charles Farren, program manager for data archiving at SAP America Inc., of Newtown Square, Pa. "You don't want to remove delivery of associated billing documents, for example. It's not just a process of removing one table. You have to get across different tables and ensure that the data will be moving onto the storage system in an orderly fashion," he says.
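One way to picture a "business complete" test is as a walk across linked documents. The Python below is only a sketch of that idea, not SAP's actual archiving check: the `Document` class, the "open" and "closed" status values, and the sample order, delivery, and invoice are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    doc_id: str
    status: str                      # "open" or "closed" (hypothetical values)
    linked_ids: list = field(default_factory=list)

def business_complete(doc_id, documents):
    """True if the document and everything reachable from it is closed."""
    seen = set()
    stack = [doc_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # links can point back at each other
        seen.add(current)
        doc = documents.get(current)
        if doc is None or doc.status != "closed":
            return False  # a missing or still-active link blocks archiving
        stack.extend(doc.linked_ids)
    return True

docs = {
    "order-1": Document("order-1", "closed", ["delivery-1", "invoice-1"]),
    "delivery-1": Document("delivery-1", "closed", ["order-1"]),
    "invoice-1": Document("invoice-1", "open", ["order-1"]),
}
print(business_complete("order-1", docs))  # False: the invoice is still open
docs["invoice-1"].status = "closed"
print(business_complete("order-1", docs))  # True
```

The point of the walk is the one Farren makes: the decision cannot be taken one table at a time, because a single open document anywhere in the chain holds back the whole set.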
After identifying relationships within the data, it's important to determine how that data will be used in the future.
"Do you need to read the data or be able to look at it in exactly the same format it was in when it was in the database, or do you just need to have a report from across the data? Do you need to reload the data into the database at a later stage?" Farren asks.
And the way you archive data will depend quite a bit on the type of industry you are in.
"Typically, the television [broadcasting] industry isn't perceived as being heavily transaction oriented, but if you realize that the industry treats every episode of a weekly production as an enterprise, you quickly run into some very high volumes of financial transactions. Did I make money on this particular episode? What are the costs associated with production of that particular episode?" says John Schiff, senior technologist at J.D. Edwards & Co., an ERP vendor based in Denver. "By gearing your archiving strategy to your industry and your needs, you'll achieve greater success."
Schiff notes that data archiving will be much easier if your company has an adequate data warehousing strategy: "If you are using your current transaction files as your data warehouse, that's where you get into trouble. But if you have a data warehousing strategy in place that extracts your decision data, then the archiving becomes much simpler," he says.
How much storage is enough?
As Chevron Canada's Edmund Yee quickly found out, there is almost no such thing as too much storage. And the experts agree.
It seems the best advice is to analyze your needs as thoroughly as you can and then double or triple that number to determine your present storage requirements. And revisit the issue often, because databases tend to grow exponentially over time.
Analysts at Fujitsu Microelectronics Inc., of San Jose, Calif., a wholly owned subsidiary of Fujitsu Ltd., underestimated their system's needs for a variety of reasons.
Disk storage costs plummeting
Steadily falling prices for disk hardware are a major contributor to declining datacenter costs (annual acquisition and maintenance cost per gigabyte). Annual acquisition and maintenance costs for disk hardware in large datacenters (>800 MIPS) fell from $1,150 per gigabyte in 1996 to $910 per gigabyte in 1997. Smaller datacenters (<300 MIPS) followed a similar pattern. Robert Gold, practice leader for integrated services at the IT management consulting firm Compass America Inc., of Reston, Va., says that, based on the firm's surveys of hundreds of datacenters around the world, current prices have dropped below $300 per gigabyte.
Source: Compass America Inc., www.compassamerica.com
"We looked at the total data storage required by all of our existing systems and then determined what the initial load was going to be for our new SAP system. Then we anticipated a growth rate in documents of about 25% per year, but we should have been closer to 50%," says Jeff Jones, the former manager, planning and operations, information technology with Fujitsu Microelectronics. Jones is now the president of IT consultancy Cross-Jones and Jones Consulting, of Milpitas, Calif., as well as a board member of the Chicago-based American SAP User Group (ASUG). "[At Fujitsu,] we didn't anticipate that by going to an integrated ERP system, which was totally online, certain user groups [would need] to archive all of their reports. So by the end of the first year, we had to double our disk space," he says.
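The gap between a 25% and a 50% annual growth assumption widens quickly once it compounds. The short Python sketch below illustrates this, assuming growth compounds once a year. The function name and the 100-gigabyte starting point are hypothetical; only the two growth rates come from Jones's account.

```python
def projected_size_gb(initial_gb, annual_growth_rate, years):
    """Storage after a number of years of compound annual growth."""
    return initial_gb * (1 + annual_growth_rate) ** years

initial_gb = 100  # hypothetical starting volume, not a Fujitsu figure

for rate in (0.25, 0.50):
    sizes = [round(projected_size_gb(initial_gb, rate, year), 1) for year in range(4)]
    print(f"{rate:.0%} annual growth, years 0-3: {sizes}")
```

From that hypothetical starting point, three years at 25% ends at roughly 195 gigabytes, while three years at 50% ends at about 338 gigabytes, which is why a modest-looking error in the growth rate can exhaust disk space well ahead of plan.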
How much storage you need is largely dependent on understanding what your users need to retain and how long they need it retained in a usable format.
"You have to really understand your users' requirements, how long they need to retain certain data and what type of retrieval they need before you can determine how much storage you need," iXOS' Klaren says. "It depends more on a company's thought process and methodology than on how much data it has, because some companies may be so risk-averse that they want to keep every single thing in the database for 20 years. Other companies are willing to remove certain types of transactions after one year. There is always a balance between the technical people, who would like to get rid of the data yesterday, and the users, who want to keep it around forever."
The amount of storage needed also depends on the type of data that must be archived and how fast that data is growing.
"I talked to a semiconductor company recently that does 70 sales orders a year, but two of its sales contracts are with IBM to crank out chips and account for one-third of the revenue. So they don't worry too much about archiving financial information. On the other hand, they generate huge manufacturing-related transaction tables. That's where the growth of their infrastructure is," AMR's Tirone explains. "But retail companies with tens or hundreds of thousands of transactions per day generate different table growths."
The type of company and how fast the database is growing also may enter into the equation.
Automated tape library costs
Automated tape technology is changing rapidly, with capacity doubling every four years (annual hardware acquisition and maintenance cost per slot). Small (<50K slots) and medium-sized (50K to 100K slots) datacenters are migrating from 18-track to 36-track tape technology to increase per-slot data capacity. In large (>100K slots) datacenters, chiefly as a result of this migration, annual hardware acquisition and maintenance costs per slot have shown an upward spike, from $22 in 1996 to $27 in 1997.
Source: Compass America Inc., www.compassamerica.com
A specialty-devices manufacturer with a current database of 1.2 terabytes and about 1,000 users is growing at 60 gigabytes a month, and employees there have been archiving historical data fast and furiously, recalls Tirone. The company managed to cut its monthly growth by half and hopes to stabilize the database at 1.5 to 2 terabytes. In contrast, a discrete manufacturer with about the same number of users but a simpler process has a database that is growing at 11 to 15 gigabytes per month because it doesn't have the same business need for transactions. This database topped out at 270 gigabytes, but the company has managed to archive the database down to 210 gigabytes.
Storage doesn't come cheaply
In the intangible sense, archived information takes longer to access than information residing on the primary system. If the CD with the information you need isn't mounted, for example, it can take several seconds or even minutes to gain access. Multiplied by dozens of disk loadings each day, the wasted time and money start to add up.
There are solutions to even this minor problem, however. Some companies keep archived data in a disk cache for 10 days before moving it to optical storage, on the theory that the data is more likely to be accessed during that first 10 days.
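A cache rule of this kind comes down to a single date comparison. The Python below is a minimal sketch, not any vendor's implementation: the function name, the tier labels, and the sample dates are hypothetical, and treating day 10 itself as the day the data moves is an assumption.

```python
from datetime import date, timedelta

CACHE_DAYS = 10  # the holding period cited in the text

def storage_tier(archived_on, today, cache_days=CACHE_DAYS):
    """Return where an archived item should live, based on its age in days."""
    age_days = (today - archived_on).days
    # Assumption: items move to optical storage once they reach cache_days.
    return "disk_cache" if age_days < cache_days else "optical"

today = date(1999, 6, 30)  # arbitrary date for illustration
print(storage_tier(today - timedelta(days=3), today))   # disk_cache
print(storage_tier(today - timedelta(days=10), today))  # optical
```

Lengthening `cache_days` trades disk space for fewer slow optical mounts, so the right value depends on how quickly requests for archived data actually taper off.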
Archiving also costs real dollars. The archiving hardware represents about 10% of the total storage costs, while the archiving software adds another 20% to 25%. The rest of the cost resides in labor, support, implementation, and administration.
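Those percentages can be turned into a rough budget split. The Python sketch below assumes the shares apply to whatever total you plug in; the function name and the $100,000 total are hypothetical and are not figures from any company quoted here.

```python
def archiving_cost_split(total_cost, hardware_pct=10, software_pct_range=(20, 25)):
    """Split a total into hardware, software, and everything else.

    "Everything else" covers labor, support, implementation, and administration.
    Software and other are returned as (low, high) ranges.
    """
    low_pct, high_pct = software_pct_range
    hardware = total_cost * hardware_pct / 100
    software = (total_cost * low_pct / 100, total_cost * high_pct / 100)
    other = (total_cost - hardware - software[1], total_cost - hardware - software[0])
    return {"hardware": hardware, "software": software, "other": other}

split = archiving_cost_split(100_000)  # hypothetical total
for item, amount in split.items():
    print(item, amount)
```

On these shares, 65% to 70% of the total lands in the "other" bucket, so the people and process costs deserve at least as much scrutiny as the price of the software and media.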
Jones says the first-year archiving costs for Fujitsu Microelectronics, which included hardware, software, and support, ran about $200,000. But the software is a one-time cost, and hardware storage media is relatively inexpensive. Yee estimates Chevron Canada's annual incremental storage costs at $20,000 to $25,000.
But none of this matters if your company hasn't put enough priority on the concept of archiving in the first place. "Typically, customers won't address the archiving issues during the first two years, but after they see two years of growth, they start to address it," SAP America's Farren notes.
"At Fujitsu, we tried our best to ignore archiving," Jones says. "There were more important things to take care of, from management's point of view. That attitude really burned us when we got to the point where we couldn't back up our systems. And that's what finally moved Fujitsu to look at archiving."
Karen D. Schwartz is a freelance writer specializing in business and technology. Based in the Washington, D.C. area, she can be reached at firstname.lastname@example.org.