Archive for December, 2006

Traditional File System + overloaded sys volume = downtime

December 30th, 2006

When I came into work this morning, I noticed a glut of spam. I added a couple of static rules to my GWAVA config. Then I noticed my MTA wasn’t routing messages again. Apparently, I didn’t learn from yesterday. I tried to bounce the MTA and it locked the server into a high CPU state. I had to hard reboot the server, taking two post offices offline for some time.

When the server came back up, I was stuck on a TTS backup file screen. It took 30 minutes or so to process the file, and when it finished, it created swap space on the sys volume. It didn’t take long after that for the server to quit loading modules. I’m assuming the sys volume ran out of space. I ended up running a vrepair with the delete files option set (option 4). Almost 2 hours and over a million errors later, the server came back to life. This was a textbook example of why you never put data on your sys volume. The fact that it was a traditional file system just made things worse.

Once that server came back to life and messages still didn’t flow correctly, I removed the rules I had added to GWAVA. I’m thinking an address block I added didn’t like the wildcard I used to block a subdomain. The glut of spam turned out to be a lack of contact between the problem MTA box running GWAVA and all the RBL and SURBL lists I use. I added a public IP to an extra NIC in the machine and spam blocking went back to normal levels.
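For anyone wondering why losing contact with the RBL servers lets spam through, here is a minimal Python sketch of how a DNS-based blocklist lookup generally works. It is not GWAVA’s code, and the zone name `dnsbl.example.org` is a placeholder rather than a real list. The function names are made up for illustration.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up when testing an IPv4 address against a blocklist zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    # Blocklists are queried by reversing the octets and appending the zone.
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the lookup resolves, which is how a listing is signalled."""
    try:
        resolve(dnsbl_query_name(ip, zone))
    except OSError:
        # "Not listed" and "could not reach DNS at all" both end up here,
        # so a filter box with no outside contact sees every sender as clean.
        return False
    return True
```

Calling `is_listed("192.0.2.99", "dnsbl.example.org")` would query `99.2.0.192.dnsbl.example.org`. The `resolve` parameter exists only so the logic can be exercised without a network; a real filter would also distinguish lookup failures from clean results instead of treating them the same.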

At least there weren’t many people at work today, I had that going for me…

GroupWise 7

December 29th, 2006

We upgraded from GroupWise 6.5 to 7.0 SP1 yesterday. We added two domains, one to house a separated primary domain and one to host the GWIA. We ran into a problem with the primary domain not upgrading its database during the recovery. It turns out some of the dc files were not copied correctly. Once we replaced those, the recovery went as planned and the database upgraded to version 7. After that, the rest of the domains upgraded with ease. Then we ran into an issue with a few of the post offices not talking to their parent domain correctly. We ended up band-aiding them by using UNC paths instead of TCP/IP. I’m guessing there is an issue with MTP not working correctly, but it will have to work as is for right now.

Once all the guts were upgraded, we went ahead and did WebAccess and the GWIA. I’ll say this much: WebAccess 7 is a great leap from version 6.5. I’m very impressed with it so far. And the speed issues I’ve read about haven’t cropped up yet. We’ll see if that continues when I get 300+ people on it at the same time. The GWIA upgrade went fairly painlessly as well. I was pretty burnt out by the time we finished up yesterday, so I didn’t test the system much.

I came in today and something just didn’t seem right. After testing a bit, I realized that messages were taking 20 minutes or so to be routed from inside out and from outside in. I figured the GWIA was having issues. I checked the normal stuff like the threads, logs, and in/out counts, and everything seemed OK. After watching active logs on the POAs and MTAs and doing some more testing, I realized that it didn’t have anything to do with the GWIA but rather with a specific MTA. The only variable was that it is the only MTA running GWAVA for spam filtering. I brought down that MTA, removed the GWAVA hooks from the startup file, fired the MTA back up, and messages flowed correctly. One call to GWAVA tech support, where I was told to use an updated NLM, and life was good. As an aside, it was one of the best support calls I’ve made. Big ups to Beginfinite.

There is one remaining issue. I brought the GWIA behind BorderManager through NAT. The problem I have run into is that when a destination email server does a reverse DNS lookup, it won’t route through to the internal network and that email server rejects the message. I’ll sort that out tomorrow.
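As a rough illustration of the kind of reverse DNS check a receiving mail server might run against the connecting address, here is a Python sketch of a forward-confirmed reverse lookup. Receiving servers differ in how strict they are (some only require that a PTR record exists), so treat this as one common variant under that assumption and not as what any particular server does. The function names are hypothetical.

```python
import socket


def reverse_lookup(ip):
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(hostname):
    """Return the set of IPv4 addresses hostname resolves to (empty on failure)."""
    try:
        return set(socket.gethostbyname_ex(hostname)[2])
    except OSError:
        return set()


def passes_reverse_dns_check(ip, reverse=reverse_lookup, forward=forward_lookup):
    """True if ip has a PTR name and that name resolves back to the same ip."""
    hostname = reverse(ip)
    if hostname is None:
        return False  # no PTR record at all
    return ip in forward(hostname)
```

Running `passes_reverse_dns_check` from an outside machine against the public address that outbound mail appears to come from after NAT gives a quick read on whether strict receivers are likely to accept it. The `reverse` and `forward` parameters are there so the logic can be tested without live DNS.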

Once I get the bugs all worked out, I’ll start in on signing up for the OES2 beta.

Thoughts about Novell in general

December 22nd, 2006

Any time a piece of bad PR comes out about Novell, such as Jeremy Allison resigning, I feel conflicted. I am not a Novell employee, so I don’t know what goes on behind those walls. I’m a Novell customer, or rather, my organization is. My day-to-day job is tied to the long-term operation of Novell.

I maintain the network, and the backbone of the logical network is Novell. File and print with OES NetWare, email with GroupWise, workstation management with ZENworks. Clearly, I identify with the products and specifically, the technical aspects of the products. So in turn, I identify with certain Novell employees. By extension of how I deal with Novell products, those are the engineers and support staff. Slightly beyond that, as a Novell product advocate, I identify with the people who reach out to the community, such as the Cool Blogs bloggers and the like.

Who I do not identify with, and who I don’t fully understand, are management, public relations, and marketing. Looking in from the outside, I’m not exactly sure what they do. I know day-to-day operations eat up much of the time for management. But I’m unsure of the overall strategic plan. Surely, this Microsoft deal is part of that plan. Despite what the open source community’s more vocal members think, I do not think this is some underhanded attack on Linux or GPL-based open source software. I truly believe Novell is trying to get back into the data center through Microsoft. However, I believe they erred badly when working the patent provisions into the deal. They further compounded that error by going silent about it. A few blog postings by executives didn’t seem to do a whole lot to stem the tide of discontent and uncertainty. The kicker of it all is that they didn’t bother to consult important open source community members within their own walls. They ignored a very valuable resource. That reflects very poorly on management and its ability to identify and use its own resources.

I’m currently reading a book on Steve Jobs, The Second Coming of Steve Jobs, and I’m in the section where he has returned to Apple after his company, NeXT, was acquired by Apple. Steve Jobs is a marketing guy with an affinity for hardware. That hardware is the extension of his marketing aura. People tend to identify Apple’s turnaround with the iPod and Mac OS X. However, that turnaround started before those products were available. It might surprise people to learn that Apple swung from hundreds of millions of dollars in losses per quarter to profits using existing technology and software. They took existing hardware running the classic Mac OS, stuffed it in a strange looking bubble of a case, called it the iMac, and then proceeded to market the hell out of it.

My point with all of that is that Novell has the technology. While my example above does not directly apply to Novell’s position, seeing as it’s only a software company and not a hardware company, my point remains the same. Novell keeps trying to gain market movement through their technology. That’s great, but it’s not nearly enough. They need to market their brand name. They need to get out there and actually get in contact with customers. They should try playing their own game instead of Microsoft’s. How about trying new things such as free licenses for existing customers on certain products like SLED? I don’t know about anyone else, but I stopped installing time-bombed software half a decade ago.

I’ll admit that I’m not a marketing person. But out here in the trenches, when someone says Novell marketing, a Novell admin will always pipe up and say “what marketing” or something along those lines. Clearly, Novell’s marketing has been hurting for a long time and has not done anything to change that perception lately. Their sales division does not go above and beyond the call of duty by any stretch. Their public relations seems to consist of nothing more than blog postings. And finally, the upper management who deal with all these aspects do not seem to be making a whole lot of positive change.

This post is painting a very dark picture of Novell. Not everything is dark. Their product lines continue to fill many needs that I and many other customers and potential customers have, even if those customers don’t know about the products. The decision to create and keep supporting Cool Solutions, Cool Blogs, Open Audio, and the Novell forums is very much appreciated and the right thing to do. The unpopular decision to deprecate NetWare in favor of OES Linux was a very difficult but necessary one for the long-term viability of Novell in the file and print realm. Finally, Novell offers products that have one piece of built-in marketing that can be leveraged. Simply said, they aren’t Microsoft.

What Novell needs is leadership that can use all these pluses to change perception and put the company on the radar where it otherwise isn’t at this time. The part of the Microsoft deal where Microsoft offers SLES certificates is very indicative of the sales and marketing issues Novell faces from within and how they seem wholly unable to do it themselves.

And finally, they need to do this for customers and potential customers and not for the investors. Investors are bloodsucking leeches, a necessary evil in the corporate world, and should be treated as such. They bring money, and should not be listened to otherwise. Just because you have money doesn’t mean you have any clue how a company should be run. I understand that this is an overly simplistic view of the situation and probably isn’t well grounded. But I point to the Steve Jobs example from above. Apple’s board continued to put useless executives into positions at Apple as the losses mounted. When Steve Jobs came back, the board was gutted and replaced with “Steve’s people” who basically let him do what he needs to do.

This post is a simplified look at Novell from the outside in. I like Novell’s products, but I think management is failing the products, not the other way around. I want Novell to last long term and continue to crank out quality software. I just don’t think the software itself is enough.

GroupWise 7 sp1

December 15th, 2006

During the upcoming break, we plan on upgrading our GroupWise system. This will also give us the downtime needed to clean some of the system up. Right now the primary domain and a secondary domain live on system volumes. The secondary domain lives on an NSS volume, which helps. But our primary lives on the traditional sys volume. That volume also houses all the junk mail archives and the GWAVA catches and holds. Add all that up, and it’s a problem. So, we’ll have to move some stuff around. I’m also planning on adding a couple of domains to the mix. I’m going to separate out the GWIA into its own domain. I’m also going to add a domain for the primary to live on by itself. I’m trying to follow the “best practice” suggestions in the following Cool Solutions article.

http://www.novell.com/coolsolutions/feature/15045.html

Webmail will live on a SLES box. I’m not sure if I’m going to keep the existing SLES9 box running WebAccess or flatten it and reinstall with SLES10. That might be a bit much for one day, but we’ll see how the upgrade goes. WebAccess will be the last upgrade anyway, so I can decide then.