How do I have a better relationship with Cisco?
April 22, 2009 10:45 PM

At work I have to face the Final Boss of the Internet for the next several months, and my trusty Cisco sidekick is starting to tard out on me. How do I level up enough to be able to use him effectively?

I recently had a ton of generic networking jobs dumped on me at work. The company is a government contractor, and they suddenly need me to do dozens of small to medium sized networking projects for their customers all at once. Most of the jobs are of the form "design and implement this upgrade to the network," with a good helping of "find out what is causing this behavior and make it stop."

Before anyone suggests getting a Cisco certificate, I need to point out that this is going to blow over by the end of the year and I can't afford the time to get one. I don't think it would help anyway, and I'm saying that even though one of my good ex-co-workers was a CCIE. The problems I run into are really hard because each site is different, and they combine technologies almost at random because of organic network growth due to the sites swapping other contractors in and out. Although there is a lot of IP, nearly every site has weird internal protocols going on as well, and there are many unusual layer 1 technologies.

My question is this: how do people deal with Cisco on a regular basis? I've had a low-level relationship with their products throughout my career and have been mostly happy. Now that I have to really get my hands dirty with their products, I'm beginning to think that their engineering quality isn't as good as I've always thought.

To start with, is there a better way of navigating the documentation? I have the overall document structure straight in my head and can usually find what I want, but answering certain questions can easily involve a half dozen browser tabs. For instance, let's say I want a list of all the caveats and release notes for a given IOS/hardware/feature set/WIC combination. Is there some magic tool I don't know about that will allow me to paste my "sh inv" and "sh ver" output to get that, instead of looking up everything by hand?

The docs need some work, too. I'm constantly finding myself faced with configuration guides or feature notes which say things like "the route-map argument selects a route-map to use for NAT" without describing in detail how one feature's memory structures are interpreted by another feature. Is this normal, or is there a publisher that creates books that will fill in the blanks for me?

I also wish my management tools were better. I hand-built a bunch of scripts to do some rudimentary analysis of customer networks, but I constantly find myself wishing I had some kind of automatic network diagramming/enumeration tool. E-DI won't work because it doesn't support non-standard ports for telnet, and the latest IOS it supports is 12.3. Ciscoworks is way, way too expensive given that most of these jobs involve just a few devices, tops. CNA is limited to the low end gear, and ipswitch is too expensive to use at these sites as well. My dream tool is something that knows all about CDP, LLDP, SNMP, and WS-Management and can integrate a dynamic network map with an interactive configuration editor that does automatic hyperlinks between the commands and the online docs.
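
As a rough illustration of the kind of hand-built enumeration script mentioned above, here is a minimal Python sketch that turns `show cdp neighbors detail` output into a Graphviz graph. The sample text, the hostnames, and the local name `core-rtr1` are made up for illustration, and the regexes assume the common IOS layout of that command; other platforms and versions may format it differently.

```python
import re

# Illustrative sample only: real "show cdp neighbors detail" output varies
# by platform and IOS version, so the patterns below are assumptions.
SAMPLE = """\
-------------------------
Device ID: dist-sw1.example.net
Entry address(es):
  IP address: 10.1.1.2
Platform: cisco WS-C3560-24TS,  Capabilities: Switch IGMP
Interface: GigabitEthernet0/1,  Port ID (outgoing port): GigabitEthernet0/24
Holdtime : 141 sec
-------------------------
Device ID: edge-rtr2.example.net
Entry address(es):
  IP address: 10.1.1.6
Platform: Cisco 2811,  Capabilities: Router Switch IGMP
Interface: FastEthernet0/0,  Port ID (outgoing port): FastEthernet0/1
Holdtime : 122 sec
"""

DEVICE_RE = re.compile(r"^Device ID:\s*(\S+)", re.MULTILINE)
LINK_RE = re.compile(
    r"^Interface:\s*(\S+),\s*Port ID \(outgoing port\):\s*(\S+)", re.MULTILINE
)


def parse_cdp_neighbors(local_name, text):
    """Return (local, local_port, neighbor, neighbor_port) tuples."""
    links = []
    # Each neighbor entry is preceded by a line of dashes.
    for entry in re.split(r"^-{5,}$", text, flags=re.MULTILINE):
        device = DEVICE_RE.search(entry)
        link = LINK_RE.search(entry)
        if device and link:
            links.append(
                (local_name, link.group(1), device.group(1), link.group(2))
            )
    return links


def to_dot(links):
    """Render the links as an undirected Graphviz graph."""
    lines = ["graph network {"]
    for local, local_port, neighbor, neighbor_port in links:
        lines.append(
            f'  "{local}" -- "{neighbor}" '
            f'[label="{local_port} <> {neighbor_port}"];'
        )
    lines.append("}")
    return "\n".join(lines)


links = parse_cdp_neighbors("core-rtr1", SAMPLE)
print(to_dot(links))
```

Feeding the printed text to Graphviz gives a basic map for one device; running it per device and merging the link lists would be the obvious next step.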

I know about Kiwi CatTools and it helps a bit, but what I really need to know is what people under major time pressure use for device management and troubleshooting. Like, "I'm running a large network with half the budget I should be and this is my secret weapon" sort of tools.

Another thing that would really, really help would be an integrated TAC case / bug toolkit search engine that didn't take several minutes to run queries. Does such a thing exist, or does everyone -- including TAC -- just put up with it?

Thanks in advance.
posted by thalakan to Computers & Internet (6 answers total) 9 users marked this as a favorite
This is sort of a litany of things, so:
-Everything set to the right logging level.
-Everything logging to a loghost.
-Everything synched to NTP.
-Universal AAA setups pointing at a RADIUS or TACACS server.
-Standards Standards Standards Standards.
-Use as few IGPs as possible; if you can, choose just one and use that.
-KEEP IT SIMPLE. Static route wherever you have no option or need for redundant paths; static routing is stable and deterministic!
-Layer 2 can be your worst enemy, keep your layer 2 domains small and as isolated as you possibly can.
-Build logical diagrams for every environment. Each one should have everything you need to know about how traffic should flow without having to look at anything else: subnets, gateways, circuit IDs, interfaces, provider contact numbers. It should be functional and use block diagrams; stencils are a waste of time.
-Build rack diagrams for everything, they should contain all the details on physical device location, cable plant endpoints and power, as well as the facilities contacts. Make it as basic as you can.
-Anything that has a hostname is labeled with it both physically and logically in the device. If an interface in a device is not shut down it is labeled. If it is not labeled it is to be treated with extreme hostility until it is labeled. Unlabeled interfaces are a sin against the old gods and the new.
-Get all the SmartNet contracts locked in under one CCO ID. Before you ever call TAC, make sure you have the serial numbers for every single device captured somewhere, along with which SmartNet contract covers each one.
-Cisco gear is both great and horrible all at once; you learn to drop the horrible end of things by the wayside over time. In general you want to run screaming from: code that has more features than you actually need, non-Safe Harbor code for 65xx, service modules in 65xx, and anything that uses StackWise technology. The latest and greatest software from Cisco is NEVER the greatest; it's greatest only in the sense that it contains the greatest number of undiagnosed software defects.
-Get used to doing bugscrubs. It's scut work but required. Become very familiar with the Bug Toolkit, which is something you're likely already spending a lot of time on; the rest is straight-up Google action.
-You can paste your output into Cisco Output Interpreter; you'll need a valid SmartNet contract attached to your CCO ID to access it.
-The magic words for TAC are "Service Impacting Outage".
-Reading the syntax reference is pretty horrid if you're looking for context, you'll want to be searching for specific config references and guidelines, for example DMVPN implementation, route-maps with BGP with multiple peers, etc etc.
-Telnet BAD. SSH Good
-"Organically grown network" is code for "the bosses don't want to spend the time or money to do things right." Networks are utilities these days: you don't notice them when they're working, but when they stop working everyone wants to know why right the fuck now. Make sure you document the hell out of problems you find and make the lack of investment someone else's problem.
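
The labeling rule in the list above (any interface that is not shut down gets a description) is easy to check mechanically. What follows is a minimal Python sketch, not a definitive tool: the sample config and the function names `interface_blocks` and `unlabeled_active_interfaces` are hypothetical, and it assumes a saved IOS-style running-config where interface sub-commands are indented by a space.

```python
# Illustrative config only; the hostnames and addresses are made up.
SAMPLE_CONFIG = """\
hostname edge-rtr1
!
interface FastEthernet0/0
 description Uplink to dist-sw1 Gi0/24
 ip address 10.1.1.1 255.255.255.252
!
interface FastEthernet0/1
 ip address 10.1.2.1 255.255.255.0
!
interface Serial0/0/0
 no ip address
 shutdown
!
line vty 0 4
 transport input ssh
"""


def interface_blocks(config_text):
    """Yield (interface_name, [sub-commands]) for each interface stanza."""
    name, body = None, []
    for line in config_text.splitlines():
        if line.startswith("interface "):
            if name is not None:
                yield name, body
            name, body = line.split(None, 1)[1].strip(), []
        elif name is not None and line.startswith(" "):
            body.append(line.strip())
        elif name is not None:
            # Any non-indented line ("!" or a new top-level command)
            # ends the current stanza.
            yield name, body
            name, body = None, []
    if name is not None:
        yield name, body


def unlabeled_active_interfaces(config_text):
    """Names of interfaces that are not shut down and have no description."""
    return [
        name
        for name, body in interface_blocks(config_text)
        if "shutdown" not in body
        and not any(cmd.startswith("description ") for cmd in body)
    ]


for name in unlabeled_active_interfaces(SAMPLE_CONFIG):
    print(f"UNLABELED: {name}")
```

Run against configs you have already pulled to a loghost or backup directory, this gives a quick list of ports to treat with hostility.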

Management tools:
-Free version of splunk.
-Opsware NAS - My choice for doing more when I'm hamstrung with too few people. It's worth it.
-NetQos - My other choice for doing more when I'm hamstrung with too few people. It is also worth it.
-Solsoft/Exaprotect - Firewall Policy Manager, annoying as hell but lowers the bar for the skill set required to manage infrastructure with multiple firewall tiers.

If you can't afford NetQos or NetVoyant, get some sort of NetFlow data analyzer. Even if you roll an OSS one, it's worth it.

If you aren't managing this infrastructure and are just a drop-in consultant, it might be worth it to you to build a Linux VM image and install Splunk, Cacti, Nagios and some basic loghost functions (NTP, etc.) so you can spin something up immediately when you land and get going. It's what I do when I'm dropped in to a new client.

Other tools:
-Solarwinds makes a toolset that frankly sucks ass compared to the above, but you may find it useful.

The tool you're looking for is basically Opnet. Opnet is so ridiculously expensive that we don't even buy it, but it does about 90% of what you're asking about.

At the end of the day the only thing that will save you from budget constraints while running the network(s) is standards and documentation, so that at least you are dealing with known quantities and can work on the real problems, not the leftover shit that you don't already have all the info on because you were too busy working on another issue.

Good luck!
(everyone starts out like this, you'll get the hang of it)
posted by iamabot at 11:26 PM on April 22, 2009 [13 favorites]

iamabot's answer is awesome and comprehensive - I am just posting to second taking the time to make rack diagrams. I have spent a goodly chunk of my worklife working with as-built diagrams for networks halfway around the world (hello government contracting), and it's amazing how much the visual helps. [Note: the diagram actually helps more than a photograph because the diagram removes the ability to get distracted by omg who cabled that thing.]
posted by catlet at 7:53 AM on April 23, 2009

This is more server room advice, but:


I cannot stress this enough. I know your current infrastructure is probably unlabeled, but whenever you put new cables in, label each one with something like device:port<>device:port, and include patch panels if you use them. Make management invest in a good label printer, preferably per-site (cite the amount of time it takes to trace a cable and how much that would hurt in a disaster scenario). This will save you immeasurably in the long run.
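
To keep the label text consistent across sites, something like the following Python sketch could generate it. The function name `cable_label`, the example device names, and the way patch-panel hops are chained into the middle of the label are all assumptions for illustration; only the device:port<>device:port shape comes from the advice above.

```python
def cable_label(a_device, a_port, b_device, b_port, via=()):
    """Build a label such as 'sw1:Gi0/1<>rtr1:Fa0/0'.

    `via` is an optional sequence of (panel, port) hops in A-to-B order.
    Chaining them between the two ends is an assumed convention.
    """
    hops = [(a_device, a_port), *via, (b_device, b_port)]
    return "<>".join(f"{device}:{port}" for device, port in hops)


# Direct run between two devices.
print(cable_label("sw1", "Gi0/1", "rtr1", "Fa0/0"))

# Same run, but through one patch panel port.
print(cable_label("sw1", "Gi0/1", "rtr1", "Fa0/0", via=[("pp-a", "12")]))
```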
posted by kdar at 8:47 AM on April 23, 2009

On the label front, Panduit makes fantastic labels for cables. Skip everything else and use these. They are superior in every conceivable way to Brother P-touch type devices.

To some extent you're going to become a BOFH, but that's not a bad thing sometimes.
posted by iamabot at 9:00 AM on April 23, 2009

I'm actually good on all the physical stuff like racks and cabling. Most of these sites are remote anyway, and I'll never see any of the gear I'm working with. The on-site people have their own ways of keeping track of it. That's fine with me, because the API (the phone) is always the same from my point of view. We also have AutoCAD diagrams for almost everything, since most of these sites were built by contractors -- sometimes us -- who had to do genuine engineering to get paid for them.

"Opnet is so ridiculously expensive that we don't even buy it, but it does about 90% of what you're asking about."

I talked to a rep there about Netmapper and got quoted $95k list, but he said no one pays that much. I suspect that's just, what's the term... "moron in a hurry pricing". I'm going to discuss some kind of service model payment plan with him.
posted by thalakan at 11:38 AM on April 23, 2009

If you're going to buy tools you're really better off with NetQos and Opsware NAS.

Opsware NAS is like Ciscoworks, only it's a product that actually works and it has way way more capabilities.

Netqos and Netvoyant are where the best bang for your buck is as far as traffic/flow analysis and they will tell you where to focus your energies when there is a problem with performance.

Splunk is great for event correlation, helping you figure out not what the symptoms are, but what the problem actually is. It's how I deal with not being able to train or teach people how to troubleshoot instinctively: I just give them a tool that helps them get to the problem rather than running around trying to bandaid symptoms.

You really do not need most of what opnet has based on your description, but as I said the components in it will do what you were talking about needing. I'd buy the tools above way way before I considered buying Opnet and especially Opnet modeler.
posted by iamabot at 11:47 AM on April 23, 2009
