Conquering Network Deployment Complexity – Part 2

In my last post, I looked at the problem of deploying frequent configuration updates to a large array of routers, deployed over a similarly large number of sites. In some cases, there will be a need to retrieve data off the routers and populate it into formatted output. I considered Puppet, but concluded that for this specific use case, Puppet might not (yet!) be the best tool. So in this post I’ll examine some other existing technologies and see how well each of them might fit. I’ll briefly review RANCID, before assessing SNMP and NETCONF together.

RANCID: Really Awesome New Cisco confIg Differ

In a nutshell, RANCID logs into a bunch of routers, runs some commands to collect output, cleans up the data a bit, calculates differences from a previous run and emails these to a mailing list, and stores the output into a revision control system as a baseline for future runs. Essentially RANCID notifies administrators when changes are made, so any misconfigurations can be caught early, and also establishes an audit trail if problems are found at a later stage.
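The heart of that workflow, comparing a freshly collected configuration against the stored baseline, can be sketched in a few lines of Python. This only illustrates the idea and is not RANCID's own code; the device name and the configuration snippets are made up.

```python
import difflib


def config_diff(baseline: str, current: str, device: str) -> str:
    """Return a unified diff between two configuration snapshots."""
    diff = difflib.unified_diff(
        baseline.splitlines(),
        current.splitlines(),
        fromfile=f"{device} (baseline)",
        tofile=f"{device} (current)",
        lineterm="",
    )
    return "\n".join(diff)


# Hypothetical snapshots from two consecutive collection runs.
baseline = "hostname edge-1\ninterface Gi0/1\n description uplink\n"
current = "hostname edge-1\ninterface Gi0/1\n description uplink to core\n"

report = config_diff(baseline, current, "edge-1")
if report:
    # A real deployment would email the report and commit `current`
    # to revision control as the new baseline.
    print(report)
```

An empty report means nothing changed since the last run, so no notification is needed.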

One interesting observation from the RANCID website is that it changes administrator behaviour. Before they started using RANCID, some administrators made changes without informing colleagues, and were reluctant to own up to responsibility when something broke. With RANCID installed, all administrators get an email every time a change is made, so everyone is that much more careful when modifying configuration and will usually flag what they’re doing in advance.

RANCID really isn’t suited to making configuration changes, however. It doesn’t initiate changes, and making changes based on RANCID data requires additional tools; it’s primarily a read-only tool. RANCID can also act as a Looking Glass server, a standard facility on the internet for providing strictly regulated read-only access to routing configurations. So again, read-only access.

Our customer had in fact already implemented RANCID support within their network. Like Puppet, it’s an excellent tool, but solves a different problem than the one I was trying to solve.

SNMP and NETCONF

NETCONF is a network management protocol intended to provide configuration management capability. Great! This sounds exactly like what I’m looking for.

But first, an interlude on SNMP. The Simple Network Management Protocol is one of the older internet protocols, preceding HTTP for example. In fact, earlier textbooks often presented it as one of the four ‘core’ application protocols along with Telnet, FTP and SMTP. The protocol provides network monitoring, alarming and configuration. So why not use SNMP for router configuration? Because although SNMP is widely used for network alarming (mainly the ‘TRAP’ method) and for network monitoring (the ‘GET’ method), it’s much less commonly used for network configuration (the ‘SET’ method). Given its first-mover advantage, this lack of success proved intriguing enough for the IAB (Internet Architecture Board, which has oversight of the technical aspects of the Internet) to call a special meeting in 2002 to determine what was wrong. From a network configuration perspective, they concluded that SNMP wasn’t fit for purpose in a number of respects:

  • MIBs (Management Information Bases, the device database accessed by SNMP) were often read-only.
  • SNMP’s data representations mean that processing large amounts of configuration data is relatively slow.
  • It’s difficult to carry out an atomic network configuration ‘transaction’, in which either all or none of a set of co-ordinated changes are installed. This is because SNMP breaks these into more granular transactions, so there are intermediate configuration states which may be inconsistent.
  • It isn’t easy to pull a configuration and then restore (‘replay’) it later.
  • Operators think of configuration in terms of tasks; SNMP is modelled on data.
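The atomicity problem in the third point can be illustrated with a toy model. This is plain Python rather than an SNMP library: the `Device` class, its consistency rule and the setting names are all invented for the illustration.

```python
import copy


class Device:
    """Toy device: a tunnel is only valid if both endpoints are configured."""

    def __init__(self):
        self.config = {"tunnel_src": None, "tunnel_dst": None}

    def is_consistent(self) -> bool:
        values = (self.config["tunnel_src"], self.config["tunnel_dst"])
        return all(v is None for v in values) or all(v is not None for v in values)

    def set_one(self, key, value):
        """One change per request, roughly how a series of separate SETs behaves."""
        if key not in self.config:
            raise KeyError(key)
        self.config[key] = value

    def set_atomic(self, changes: dict):
        """Apply every change or none: work on a copy, then swap it in."""
        candidate = copy.deepcopy(self.config)
        for key, value in changes.items():
            if key not in candidate:
                raise KeyError(key)
            candidate[key] = value
        self.config = candidate


# The second key is deliberately misspelled, so the second change fails.
changes = {"tunnel_src": "192.0.2.1", "tunel_dst": "192.0.2.2"}

piecemeal = Device()
try:
    for key, value in changes.items():
        piecemeal.set_one(key, value)
except KeyError:
    pass
print(piecemeal.is_consistent())  # False: only one endpoint was set

atomic = Device()
try:
    atomic.set_atomic(changes)
except KeyError:
    pass
print(atomic.is_consistent())  # True: the failed transaction changed nothing
```

The piecemeal device is left half-configured when the second change fails, which is exactly the kind of inconsistent intermediate state described above.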

This was not the first time alternatives were considered. In the late 1990s, the COPS protocol was considered as an alternative to SNMP. However, by 2002, COPS hadn’t been widely adopted, and still shared some of the limitations of SNMP. It was reviewed again in RFC 3535, the report of the 2002 meeting, along with CIM, HTTP/HTML, CLI and XML-based approaches.

The meeting published a list of user requirements. At the expense of some detail, I have grouped and condensed these into five items:

  1. It must be easy to handle configuration, operational and statistical device data separately.
  2. It must be possible to manage the configuration of the network as a whole rather than of individual devices, and to distinguish between the distribution of a configuration and its activation.
  3. It should be possible to generate the operations that convert one configuration into another, and to check that two configurations are consistent. It must be possible to dump and restore configurations.
  4. It is desirable for configuration data to be amenable to text processing tools such as diff and to version control systems. More generally, the system must be easy to use.
  5. Role-based or similarly rich access controls are needed, with supporting admin tools. Controls should work at both a data and task level.
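Requirement 3 is easy to picture when configuration is held as simple key/value data. Here is a minimal sketch, assuming a made-up representation (a Python dict mapping setting names to values) and made-up operation names; real device configuration is hierarchical and has ordering constraints that this ignores.

```python
def plan_operations(running: dict, target: dict) -> list:
    """List the operations that turn `running` into `target`."""
    ops = []
    for key in sorted(running.keys() - target.keys()):
        ops.append(("delete", key, None))
    for key in sorted(target.keys()):
        if key not in running:
            ops.append(("create", key, target[key]))
        elif running[key] != target[key]:
            ops.append(("replace", key, target[key]))
    return ops


def apply_operations(config: dict, ops: list) -> dict:
    """Return a new configuration with `ops` applied; the input is untouched."""
    result = dict(config)
    for action, key, value in ops:
        if action == "delete":
            del result[key]
        else:
            result[key] = value
    return result


# Hypothetical settings for one device.
running = {"hostname": "edge-1", "ntp": "192.0.2.10", "snmp-community": "public"}
target = {"hostname": "edge-1", "ntp": "192.0.2.20", "syslog": "192.0.2.30"}

ops = plan_operations(running, target)
for op in ops:
    print(op)

# Consistency check: replaying the plan on `running` should give `target`.
print(apply_operations(running, ops) == target)  # True
```

Keeping the plan separate from its application also mirrors requirement 2: the list of operations can be distributed first and activated later.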

As a result of this, a decision was made to create a new XML-based protocol to fulfil these requirements. This protocol became NETCONF.

NETCONF Protocol Layers

NETCONF is based on four layers. From bottom to top, they are:

  • Transport layer: normally SSH
  • Message layer: <rpc> and <rpc-reply>, and asynchronous <notification>
  • Operations layer: These are the actual operations of NETCONF
  • Content layer: This carries the configuration data consistent with a data model

One distinguishing aspect of NETCONF is the richer set of operations compared to SNMP. You can get and set all or part of a datastore (of configuration data). Multiple datastores are supported.
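
To make the layering concrete, here is a sketch that builds a `<get-config>` request using only Python's standard library. The namespace and element names come from the base NETCONF specification; the `message-id` value is arbitrary, the function name is my own, and actually sending the message over SSH (the transport layer) is left out.

```python
import xml.etree.ElementTree as ET

NETCONF_NS = "urn:ietf:params:xml:ns:netconf:base:1.0"

# Serialise the NETCONF namespace as the default one (no prefix).
ET.register_namespace("", NETCONF_NS)


def build_get_config(message_id: str, datastore: str = "running") -> str:
    """Build a <get-config> request for the named datastore."""
    # Message layer: the <rpc> envelope with its message-id.
    rpc = ET.Element(f"{{{NETCONF_NS}}}rpc", {"message-id": message_id})
    # Operations layer: the operation being requested.
    get_config = ET.SubElement(rpc, f"{{{NETCONF_NS}}}get-config")
    # The operation names the datastore it should read from.
    source = ET.SubElement(get_config, f"{{{NETCONF_NS}}}source")
    ET.SubElement(source, f"{{{NETCONF_NS}}}{datastore}")
    return ET.tostring(rpc, encoding="unicode")


request = build_get_config("101")
print(request)
```

The content layer shows up in the reply: the device answers with an `<rpc-reply>` whose `<data>` element carries configuration structured according to the device's data model.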

Data models are represented in either YANG or YIN encoding. YANG is meant to be human-friendly, easy to read and write, whereas YIN expresses the same model in XML. They’re semantically equivalent: one can be converted into the other losslessly.
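
The correspondence between the two encodings can be shown with a toy renderer. This is a sketch under heavy assumptions, not a YANG tool: it handles only the `container`, `leaf` and `type` statements (whose argument YIN carries in a `name` attribute), it emits a fragment rather than a complete module, and the model itself is invented.

```python
from xml.sax.saxutils import quoteattr

# A statement is (keyword, argument, list of substatements).
# Hypothetical model fragment: a container holding one string leaf.
MODEL = ("container", "system", [
    ("leaf", "hostname", [
        ("type", "string", []),
    ]),
])


def to_yang(stmt, indent=0):
    """Render a statement tree in YANG's compact, brace-based syntax."""
    keyword, arg, subs = stmt
    pad = "  " * indent
    if not subs:
        return f"{pad}{keyword} {arg};"
    body = "\n".join(to_yang(s, indent + 1) for s in subs)
    return f"{pad}{keyword} {arg} {{\n{body}\n{pad}}}"


def to_yin(stmt, indent=0):
    """Render the same tree as YIN-style XML elements."""
    keyword, arg, subs = stmt
    pad = "  " * indent
    if not subs:
        return f"{pad}<{keyword} name={quoteattr(arg)}/>"
    body = "\n".join(to_yin(s, indent + 1) for s in subs)
    return f"{pad}<{keyword} name={quoteattr(arg)}>\n{body}\n{pad}</{keyword}>"


print(to_yang(MODEL))
print(to_yin(MODEL))
```

Both outputs are generated from the same statement tree, which is the sense in which the two encodings carry identical information.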

So, why not NETCONF? In practice, NETCONF is relatively new. The meeting in 2002 led to the creation of the NETCONF working group (in the IETF, the Internet Engineering Task Force) in 2003. It published the base NETCONF protocol in December 2006, as RFC 4741. Perhaps it’s disappointing that eight years on NETCONF is still in transition, but it takes time for older hardware to be replaced, and operators are cautious about adopting new technologies: they wait until they can fully commit to them, and in some cases until the technology has been fully proven.

In my next post, I’ll take a look at OpenFlow, which is receiving a lot of attention at the moment. As we’ll see, this isn’t really a configuration management tool as such, but it has enough parallels to make it interesting to compare with the alternatives so far.
