The Impact of Make vs Buy Decisions for Memory Interface Solutions

By Raj Mahajan, MemCore Inc.

When planning a complex product development project using an ASIC or SoC, it is critical to analyze the various risks, project costs, resources and expertise required in order to allocate resources (money, equipment and people) to maximize the profit potential of the product. At MemCore we believe that this analysis often omits many of the key ‘hidden’ costs and risks associated with the implementation of a complex memory interface solution (memory controller). If these risks and costs were more clearly understood, we believe the choice between making a memory controller internally and buying an IP Core would be much easier.

The complexities associated with designing a memory controller are significant. They include the following key points:

• Memory Standards are complex and evolving
• Memory Standards have many features
    • Priority access, ECC, access re-ordering, read-modify-write, etc.
• The variety of optional features in memory devices makes it difficult to work with multiple device manufacturers
• Memory Controller Design is complex
    • Trade-offs for latency, functionality, cost and performance are difficult
    • Optimization for specific application/system characteristics is difficult
        • Requires trial and error if the proper test infrastructure isn’t in place
• Memory Controller verification is time consuming and difficult
    • Deep knowledge is needed to understand where the corner cases are and what the likely failure modes are
• Design, verification and test of a PHY are very complex
    • PLLs, Delay Lines, Jitter, Drift, PTV compensation, building test boards, testing various memories, test wafers, etc.
• Controller functionality requirements will grow over time, making it difficult to predict the development schedule with accuracy
• Timing closure can be difficult to manage with many time critical functional blocks
    • There is no single critical path in memory controllers, so timing closure is very iterative
• Verification is complex and requires a significant investment in infrastructure
    • Very complex state machines and addressing tables make it impossible to just use random test vectors to identify bugs
    • Memory access characteristics need to be modeled carefully to predict system bandwidth
    • A variety of memory devices need to be included to allow for multiple sourcing
• Test coverage improvements typically extend after the project end date so that yield can be improved; this needs to be planned for in project resources
• System testing, integration and other similar engineering support tasks need to be resourced
• Process migration needs to be planned for as designs migrate to lower cost and higher performance processes
• Feature upgrades typically need to be supported during process migration

Once these points are considered and included in an analysis of overall project risk and cost, it is easier to estimate their impact on project profitability. If doing an in-house memory controller design increases project cost and pushes out the project schedule by 3 months, the loss in profit can be considerable. Using some recent data on a cell phone project, we can compare profitability between Company X (on time) and Company Y (3 month schedule slip and higher project costs). The comparison is given in this chart; notice the significantly higher profit for Company X. Perhaps even more importantly, Company Y is now 3 months behind Company X for the next project. If another 3 month schedule slip occurs, there will be a 6 month gap in the new product introduction. This could make it difficult for Company Y to generate ANY profit on its next product, clearly a potential disaster.

The rest of the white paper expands on these points and on the main conclusion: that buying a Memory Interface solution instead of making one internally is a much lower risk and lower cost approach than typically estimated.

Introduction

Memory Controllers have become more important subsystems in a variety of SoC/ASIC designs. Whether in low-power consumer electronics, where MobileDDR memories require power-efficient but also high-bandwidth access, or in high-performance, low-latency data communication applications, where cutting edge DDR3 memories are required, the design of an efficient memory controller is now a very complex and time consuming task.

The analysis of whether to design a portion of an ASIC or SoC product or purchase an IP Core from a third party can be a complex task. The cost of purchasing the IP Core is usually easy to determine: you just need a quote from the vendor. The cost to develop the IP Core internally is much harder to estimate. Direct and indirect costs, risks, support, staffing, and a variety of other, sometimes hidden, costs can dominate the project.

This paper will review many of the complexities associated with Memory Controller designs and will identify the ways in which a third party IP Core can lower the cost, improve profitability and reduce risk in a complex SoC/ASIC design.

Overview of Memory Controller Requirements

These days memory controllers are key components of just about every SoC/ASIC design. Whether the design goal is to reduce cost, improve performance, provide advanced features, lower power, reduce board space or a combination of these, the implementation of the memory controller will be critical to accomplishing these goals. The right memory controller can reduce the amount of memory required, lowering cost, power and board space, or it can improve the performance of the system by ensuring that critical data is available at the right time for the application to process data at the highest possible performance level. In high-performance applications the right memory controller can provide improvements in bandwidth without the need for additional memory banks, thus freeing up cost, board space and power to be used in other ways to further increase performance. As you can see, the memory controller can impact system goals in a variety of ways.

Developing an optimal memory controller is a complicated design and verification task. There are a variety of standards that need to be considered. For example, DDR, DDR2, DDR3, MobileDDR and GraphicsDDR are key memory standards, and each contains a variety of optional features and capabilities that need to be considered. Access priority, Error Checking and Correcting (ECC), read-modify-write support, byte-write implementations, out of order access support, FIFO options, and latency and bandwidth trade-offs are just a few of them. Typically there are over 100 such features and trade-offs that should be studied to determine their impact on a specific system. Added to the complexity of the standards are the specific implementations and capabilities various vendors provide in their memory devices. The memory controller should be designed to support a variety of vendors so that it is easy to obtain parts from more than just a couple of vendors. This reduces the risk in obtaining parts and keeps costs down, since the least expensive product can be used without changes to the design.

The above issues, just a few of the ones that would come up when designing a memory controller in-house, show that a significant amount of experience and knowledge would be required of the designer. An in-house expert would need to be dedicated to the design effort and also be available to support system integration and help with performance and feature tuning.

Feature and System Optimization

In addition to designing the memory controller, it will be critical to make modifications to the design (optimization and tuning) during the system integration process. Because it is typically difficult to model the complete behavior of the system (unless a platform for system integration has already been created that can easily support this task), what-if scenarios need to be explored during the design, and these require design changes and a variety of experiments. Trade-offs for latency, performance, and special features that impact system bandwidth and cost must be explored to identify the mix that hits the design requirements.

These capabilities need to be explored not only during the design process but also during verification. A variety of test benches and regression suites may need to be created to ensure that the system design space being explored is free of bugs and incorrect assumptions. The tests also need to mimic the access patterns of the final system if benchmarks and other goal measuring metrics are to produce valid results.

Verification of the controller itself can be very complex as well. Deep knowledge of the corner cases for each standard and feature must be used if the verification process is to obtain the coverage required to create a robust solution. Memory vendor specific tests must also be included to ensure that multiple vendors’ products can be interchanged without the need to modify the controller design.

All the above points cover just the controller portion of the design; the physical layer (PHY) interface needs to be considered as well. If an analog approach to the design is taken, a completely new skill set is required, as well as new tools and test methodology. Cell libraries and processes all need to be checked and verified too. A digital approach simplifies some of the tool, process and design experience issues, but the details of the digital phase locked loop and delay elements are still of critical importance. They must ensure that data recovery is robust, that jitter is minimized and that process, temperature and voltage variations are accounted for. A third party PHY could be used instead of designing one internally, and this can be a good alternative as long as the PHY vendor has a proven track record and will support the engineering team during development, verification, test development, system integration and bring-up. Not all IP vendors support their products to this extent (MemCore does), so you need to be careful if you go this route.

One additional task that is usually overlooked is the need for built-in testing and error tracking features. Many times a controller design is completed and testing is well under way when it is discovered that it isn’t possible to test system memory within the time allowed or to the quality level required by the test plan. If these test features have not been included in the controller design, it will be necessary to go back and add them. This could result in a schedule slip late in the development cycle and can have a significant impact on profitability and product success.

All of the risks we have discussed so far can impact schedules and development costs. The overall impact on profitability can be dramatic. The next section will give an example of how profitability was impacted by a schedule slip in a project described in a recent publication.

Impact of Schedule Slips on Profitability and Cost

There are a variety of reasons why a project schedule might slip; several of them were discussed in the above section. The effect of a schedule slip on product profitability is dramatic, and several recent studies have published profitability data to show the impact a schedule slip can have. Let’s take a recent example from the cell phone market, where two companies, Company X and Company Y, introduced similar products, but Company Y had a three month schedule slip. The profit and Gross Margin for Company X and Company Y are shown in Figure 1 below.

Since Company X introduces its product a full 3 months earlier than Company Y, its profit starts sooner and grows to a higher level than Company Y’s. Notice that the Gross Margin is higher during the early portion of the sales cycle, and this provides the profit boost to Company X over Company Y. Late in the sales cycle Company Y shows slightly higher profit than Company X. This can be deceiving, however, since Company X is by then introducing a new product with higher margin, which accounts for a significant amount of additional profit not shown on the single product comparison chart. If Company X leverages its 3 month advantage to create a new product earlier than Company Y, it will again obtain higher profit. Additionally, if Company Y is again late, the delay can compound (in the case of a single design team working on both projects) and Company Y can end up a full 6 months behind. Over time this competitive edge can create such a difference that Company Y may miss out on ANY profit, and that would be a disaster.

Figure 1: Comparison of Profit- Company Y with a 3 Month Schedule Slip

The impact on profit from sales is clearly one major component of the economics of a schedule slip. Let’s not forget that development costs are also higher when a schedule slips. This results in larger project costs and further reduces overall profitability. If a schedule slips three months, then the engineering costs will be that much higher. If the project is a year long, then 3 months adds about 25% to the overall engineering cost. In some cases engineering loading is higher toward the end of the project, and thus the added cost may actually be greater than 25%. If the SoC or ASIC requires a re-spin, the manufacturing costs need to be paid again as well. This could be a multi-million dollar expense and will typically be a large percentage of the project cost. Other costs (material costs for scrapped wafers, testing expenses, EDA license costs, computer time or equipment rental) all add up. A three month schedule slip with a re-spin could easily cost $5M to $10M. If this amount is amortized over the project sales cycle and subtracted from profitability for Company Y, the result is shown in Figure 2 below.

Figure 2: Comparison of Profit- Company Y with a Schedule Slip and Increased Costs
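
The cost arithmetic behind Figure 2 can be sketched in a few lines of Python. This is an illustration only: the 12-month project and 3-month slip come from the text, while the monthly engineering cost, re-spin cost, other costs and sales-cycle length are hypothetical placeholders chosen to land in the $5M to $10M range quoted above. They are not figures from the published study.

```python
# Illustrative sketch of the schedule-slip cost arithmetic.
# All dollar figures and the 24-month sales cycle are hypothetical
# placeholders, not data from the study referenced in the text.

def slip_fraction(planned_months, slip_months):
    """Added engineering cost as a fraction of the planned engineering
    cost, assuming flat staffing over the whole project."""
    return slip_months / planned_months


def slip_cost(monthly_eng_cost, slip_months, respin_cost=0.0, other_costs=0.0):
    """Total added cost of a slip: extra engineering months plus any
    re-spin and other one-time costs (wafers, test, EDA licenses)."""
    return monthly_eng_cost * slip_months + respin_cost + other_costs


def amortized_monthly(total_cost, sales_cycle_months):
    """Spread a one-time cost evenly over the product sales cycle."""
    return total_cost / sales_cycle_months


if __name__ == "__main__":
    frac = slip_fraction(planned_months=12, slip_months=3)
    total = slip_cost(monthly_eng_cost=1.0e6, slip_months=3,
                      respin_cost=3.0e6, other_costs=0.5e6)
    per_month = amortized_monthly(total, sales_cycle_months=24)
    print(f"engineering overrun: {frac:.0%}")
    print(f"total added cost: ${total / 1e6:.1f}M")
    print(f"profit reduction per month of sales: ${per_month / 1e6:.2f}M")
```

If engineering loading rises toward the end of the project, the flat-staffing assumption in `slip_fraction` understates the overrun, which is the point made in the paragraph above.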

Key Sources of Risk in Memory Controller Design

In the previous section the impact of schedule slips on profitability was described in detail using recently published data on an actual design. The potential sources of slips (design risks that can turn into schedule delays and requirements to re-spin a SoC/ASIC design) are numerous, but for a memory controller design there are three main areas of risk that should be described in additional detail.

The first area of schedule risk is that functionality requirements for the memory controller can grow over the course of the project. Because memory bandwidth is so key to the performance, functionality, power and cost metrics of a typical system, it is not unusual for the memory controller to be used as the ‘go-to’ portion of the design when these metrics seem in danger of not being met. More memory controller functionality can improve the overall system bandwidth and reduce processing requirements and the number of memory banks, thus reducing cost, power and board space. ECC and redundancy can improve system reliability and help meet testing requirements, both of which can impact deployment and support costs. The schedule impact of making these types of feature changes can be difficult to estimate and resource when done late in the development cycle. An IP Core with a robust set of features and expert support from the vendor provides an experience-based schedule and feature set, so schedule risks can be dramatically reduced.

Once the feature set is finalized and verified, timing closure, the second key source of risk, can be difficult to estimate when doing an in-house development. The memory controller, on its own, has a variety of critical design areas, primarily due to the importance of the physical layer interface timing between the controller and the external memory. The memory controller scheduling logic can be very complex as well. For example, maximizing memory bandwidth requires a significant amount of state information (address access sequences, memory bank state, refresh and ECC), all of which needs to be considered when determining what action must be taken on a new request. Logic levels can become too deep and timing difficult to achieve unless great care is taken in the design of the scheduling logic. Register retiming can help, but at the risk of increasing latency, which can then reduce overall bandwidth. As you can see, timing closure, even for a moderately experienced designer, can be a challenge and will be a major risk to development schedules. Because the memory controller is so central to the typical system design, there can also be numerous critical paths between various functional blocks in the design and the controller. The optimization of critical timing within the memory controller block and between other key system blocks can turn into a tug-of-war and easily make finding a workable solution almost impossible. In contrast, an IP Core has a proven set of constraints that have been used in a variety of applications and, with a small amount of support from the vendor, can be much more easily integrated into the design.
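
To make the state dependence concrete, the following Python sketch models the decision a scheduler makes for a single new request. It is a deliberately simplified, hypothetical model, not the logic of any particular controller: the class and function names are ours, and real designs must also honor timing parameters (tRCD, tRP, tRAS and so on), ECC handling and request re-ordering, all of which are omitted here.

```python
# Simplified, hypothetical model of the per-request decision made by a
# DRAM controller's scheduler. Timing constraints, ECC and re-ordering
# are intentionally left out of this sketch.
from dataclasses import dataclass
from typing import Optional


@dataclass
class BankState:
    open_row: Optional[int] = None   # None means the bank is precharged


@dataclass
class Request:
    bank: int
    row: int
    is_write: bool


def next_command(banks, req, refresh_due):
    """Return the next DRAM command needed to make progress on `req`."""
    if refresh_due:
        # An all-bank refresh requires every bank to be precharged first.
        if any(b.open_row is not None for b in banks):
            return "PRECHARGE_ALL"
        return "REFRESH"
    bank = banks[req.bank]
    if bank.open_row is None:
        return "ACTIVATE"            # bank closed: open the requested row
    if bank.open_row != req.row:
        return "PRECHARGE"           # row conflict: close the wrong row
    return "WRITE" if req.is_write else "READ"   # row hit


if __name__ == "__main__":
    banks = [BankState() for _ in range(4)]
    banks[1].open_row = 7
    print(next_command(banks, Request(bank=1, row=7, is_write=False),
                       refresh_due=False))   # READ
```

Even in this reduced form the result depends on the refresh status, the state of every bank and the address of the request. Each additional feature adds another term to that decision, which is one way the logic depth described above grows.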

The third significant area of risk in an in-house development is verification. A memory controller must operate under a variety of different conditions, and verifying all combinations and sequences of access patterns is clearly impossible. The verification approach must use intelligent, knowledge-based traffic patterns and access sequences that are known to exercise the key elements of the design. Pure pseudo-random approaches will not be able to converge on a useful coverage metric. Without detailed knowledge of the controller boundary conditions and the enabling command and state conditions, it is difficult to predict with any certainty how long it will take to reach a coverage metric that will reduce risk to a tolerable level. An IP Core that is part of a complete solution will have been verified using a wide variety of corner cases, conditions and target applications. Furthermore, test suites included with the IP Core would allow bandwidth estimates to be made using example application specific access sequences. This makes it easier to develop early bandwidth estimates and avoid last minute surprises that can also impact schedules. A complete test suite would also cover a variety of memory vendors’ products, allowing multiple sourcing to reduce cost and procurement delays.
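
A small Python sketch illustrates why directed traffic matters. The corner case used here (a run of back-to-back accesses to one bank, each to a different row than the one before) and all function names are illustrative assumptions of ours, as is the uniform-random traffic model. The sketch does not prove that random testing cannot converge; it only shows how quickly the odds of hitting one specific sequence fall off.

```python
# Hypothetical sketch: how rarely uniformly random traffic produces one
# targeted corner case, compared with a directed sequence that hits it
# immediately. The corner case chosen is an illustrative assumption.

def random_hit_probability(num_banks, num_rows, run_length):
    """Probability that `run_length` consecutive uniformly random accesses
    all land in the same bank, each on a different row than the previous
    access."""
    same_bank = (1.0 / num_banks) ** (run_length - 1)
    new_row = (1.0 - 1.0 / num_rows) ** (run_length - 1)
    return same_bank * new_row


def directed_row_conflicts(bank, run_length):
    """Directed sequence of (bank, row) pairs that forces a row conflict
    on every access after the first."""
    return [(bank, row) for row in range(run_length)]


def is_conflict_run(seq):
    """True if every access stays in the same bank and changes row."""
    return all(
        b1 == b0 and r1 != r0
        for (b0, r0), (b1, r1) in zip(seq, seq[1:])
    )


if __name__ == "__main__":
    p = random_hit_probability(num_banks=8, num_rows=16384, run_length=8)
    print(f"chance per 8-access window of random traffic: {p:.2e}")
    print(is_conflict_run(directed_row_conflicts(bank=3, run_length=8)))
```

The probability shrinks geometrically with the length of the sequence, so a case that needs several specific steps in a row is reached far sooner by a test written with knowledge of the controller than by random vectors.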

These areas are just a few of the key risks that a development team needs to consider when comparing an IP Core with an in-house approach to implementing a memory controller. There are already so many risks in any complex SoC/ASIC development schedule that the purchase of a robust and well supported IP Core for functions like memory control should be an important consideration when looking at ways to mitigate the overall project risk.

Long-term Costs for Memory Controller Support

Once the design is completed and the new product rolled out, the work on the memory controller is not finished; far from it. The memory controller will need to be supported during code development, feature enhancements, customer specific enhancements, test coverage enhancement and the wide range of other engineering support required by a complex system. Test coverage enhancement is an excellent example of the kind of support that is usually required from the memory controller designer. As production ramps up, more becomes known about the bottlenecks of test time, yield loss and reliability issues. Many times these issues can be linked to the memory sub-system. Memory products can vary from vendor to vendor or even between wafer lots. Vendors may migrate their processes, and timing characteristics, reliability and sensitivity to noise and voltage variation can vary enough to impact system reliability. The memory controller may need to be used in modes not completely tested or debugged, and sometimes a work-around will be required. Support from the designer can be critical when yield crashes halt production or key customers find bugs that were previously masked.

If a product is successful, then a follow-on product will be required. If the follow-on is in a new process, then the memory controller will need to be ported to the new process and another set of complexities will present itself. New library elements need to be tested, and optimizations for power, performance and die size may need to be done all over again. New memory technology (like the migration from DDR2 to DDR3) needs to be taken into account, as well as potential new vendors, memory features and characteristics. If a PHY needs to be ported as well, it is usually necessary to start the design with almost a clean slate. Since the PHY design is typically very process sensitive and pushes the capabilities of the library cells to the limit, trying to adapt a previous design to a new technology can take longer than a new design from scratch.

Opportunity Cost

While engineering resources are being used to design a memory controller, they are not being used to differentiate the key core technology of the system. The ‘opportunity cost’ of this use of resources is sometimes overlooked. It can allow a competitor to do a better job of adding compelling features, pushing performance or lowering cost, making it difficult to compete in the marketplace. If the memory controller were done using an IP Core instead of an in-house effort (and purchased from the right company), engineering resources could be applied to counter likely moves by the competition and enhance the product line’s core technology. Core technology investments can also carry forward into future designs and enhance profitability on multiple generations of products. Opportunity costs are difficult to estimate accurately, but experience has shown that investments in core technology can return from 4x to 9x, because they pay off over multiple product developments (a 2x to 3x advantage over non-core investments) and can provide compelling advantages over the competition (another 2x to 3x advantage). Contrast this with an investment in non-core technology, where the return is at best 2x, and it is clear that opportunity costs should be included in any detailed cost estimate.
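
The 4x to 9x range is simply the product of the two 2x to 3x factors quoted above. A minimal Python sketch of that arithmetic follows; the function and variable names are ours and are used only for illustration.

```python
# Illustrative arithmetic for the return-on-investment range in the text.
# The 2x-3x factors come from the paragraph above; names are hypothetical.

def combined_range(first, second):
    """Multiply two (low, high) multiplier ranges end to end."""
    return (first[0] * second[0], first[1] * second[1])


reuse_advantage = (2, 3)        # payoff across multiple product generations
competitive_advantage = (2, 3)  # payoff from differentiation
core_roi = combined_range(reuse_advantage, competitive_advantage)
print(core_roi)  # (4, 9)
```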

Overall Profitability Result

We have seen that there is a significant cost associated with designing a Memory Controller in-house instead of sourcing the controller from a third party. If these additional costs are factored into the project profit projection we saw in Figure 2 and amortized over the project sales cycle, the chart in Figure 3 is the result. Company Y is now burdened with a 3 month schedule slip, the additional engineering costs associated with the slip, the costs for re-spinning the SoC/ASIC, the additional engineering costs (over the purchase price) for designing, verifying, supporting and testing the memory controller, and the opportunity cost of missing features and capabilities in the current project. As a result it has a dramatically lower profit curve. In fact, this range of profit may not be sufficient to keep the company viable, depending on the other costs (sales, administration, marketing, etc.) associated with selling and supporting the product line. For products with smaller sales and profit potential (this was a cell phone example, remember), you can see that even small impacts to overall profitability can make the difference between a company staying in business and closing up shop.

Figure 3: Comparison of Profit- Company Y with a Schedule Slip, Increased Development Costs and Opportunity Costs

Conclusion

The design, verification and support of a memory controller are complex tasks. They are also costly, risky and in most cases unnecessary to do in-house. Purchasing a memory controller as an IP Core avoids a host of problems and allows the company to focus on its core competency and differentiate its offering from its competition. The resulting increase in profitability can make a key difference in a product line’s impact on the company bottom line.