#1 (permalink)
05-11-2008, 05:05 AM
LMStone (Elite Member)
Join Date: Sep 2006
Location: 477 Congress Street | Portland, ME 04101-3431
ZCS Version: Release 5.0.6_GA_2313.SLES10_64_20080522095725 SLES10_64 NETWORK edition
Posts: 295
Backups to NAS via CIFS Failing Intermittently

Good Morning,

We are seeing intermittent errors with backups (we are testing using a Buffalo NAS device).

The two types of errors affecting perhaps 10% of the accounts in a system wide backup are "bad file descriptor" and "unable to remove /opt/zimbra/backup/tmp/..." (though there is nothing in ~/backup/tmp).

Accounts that display these errors in the Admin UI are not restorable. Other accounts restore fine.

Not sure where to start debugging, so I thought I'd post here!

Our setup is that we have a Buffalo NAS device on the same subnet exposing a Samba share. We mount the share on the Zimbra box and then create a symlink from /opt/zimbra/backup to the mount point, after mv-ing the existing /opt/zimbra/backup somewhere else.
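
For anyone reproducing this layout, a quick sanity check is to confirm that the backup path really is a symlink and that it resolves to a mounted filesystem. If the share is not mounted, the link typically resolves to a plain local directory and writes land on the local disk instead. A minimal Python sketch, assuming nothing beyond the /opt/zimbra/backup path mentioned above (the function name is made up for illustration):

```python
import os


def describe_backup_path(path):
    """Report whether `path` is a symlink and whether its target is a mount point."""
    resolved = os.path.realpath(path)
    return {
        "path": path,
        "is_symlink": os.path.islink(path),
        "resolved": resolved,
        "target_exists": os.path.isdir(resolved),
        # False is also expected when the link points at a subdirectory
        # of the share rather than at the mount point itself.
        "target_is_mount": os.path.ismount(resolved),
    }


if __name__ == "__main__":
    # /opt/zimbra/backup is the path from the post above.
    for key, value in describe_backup_path("/opt/zimbra/backup").items():
        print(f"{key}: {value}")
```

Running it before each backup window (for example from cron) would at least rule out the "share silently unmounted" case.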

I looked through previous posts here with "bad file descriptor" content, and didn't see anything that applied directly.

I grepped zimbra.log for "btpool" and nothing looked out of sorts there either.

We have fsck'd the Buffalo device and still get the same problem.

We did find this article:
Java Bad File Descriptor Close Bug

but I wouldn't know where to begin looking at Zimbra's code, nor even if the backups are being done via Java somehow.

The network connection between the two devices is about three feet, with known good Cat6 cables and an HP switch whose log is showing no errors.

Other servers using the NAS have no such problems. No other servers are accessing the NAS device during the Zimbra backup period.

Any ideas would be appreciated; NAS storage is much cheaper than DASD and makes off-site replication easier as well.

Thanks!
Mark
__________________
L. Mark Stone


Uptime. All the time. HIPAA-compliant Zimbra hosting nationwide, from Maine.

#2 (permalink)
05-11-2008, 05:55 AM
uxbod (Moderator)
Join Date: Nov 2006
Location: Northampton, UK
ZCS Version: Release 5.0.7_GA_2450.RHEL5_20080630192737 CentOS5 NETWORK edition (Unsupported OS)
Posts: 1,335

Hi Mark,

are you using the same NIC for both servicing ZCS requests and the NAS backup share ?

Do you get the same problem if you use NFS ?
__________________
Server | CentOS 5.1 | Dual Opteron 250 | Tyan K8W Mobo | 6GB RAM | 3WARE 9550-SX4 | 4 x Samsung 200GB SATA II |
Zimbra | Release Release 5.0.7_GA_2450.RHEL5_20080630192737 NETWORK edition running on Xen 3.2 CentOS 5.2 i386 VM |
Network | Cisco 877 Router - Cisco ASA 5505 FW - Cisco 1131AP |

#3 (permalink)
05-11-2008, 06:20 AM
LMStone (Elite Member)

Thanks for the fast reply!

Quote:
Originally Posted by uxbod View Post
Hi Mark,

are you using the same NIC for both servicing ZCS requests and the NAS backup share ?
Yes. On-board gigabit NIC on an HP DL360 G4p. We've run these servers as web servers with more than twenty IPs on a NIC, no problem. What are you thinking?


Quote:
Do you get the same problem if you use NFS ?
This NAS does not support NFS; Samba, AppleTalk, HTTP and FTP only.

Zimbra docs discourage NFS use, and we've found it to be pretty slow in other use cases, so we tend to avoid NFS.

Ideas?

All the best,
Mark

#4 (permalink)
05-11-2008, 06:42 AM
uxbod (Moderator)

Quote:
Originally Posted by LMStone View Post
What are you thinking?
Utilisation of the NIC with respect to load. Also, how is the server for memory ? Does anything show up in dmesg ? What about in /var/log/messages ?
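
As a starting point for the dmesg and /var/log/messages check, the sketch below filters log lines for the two error strings quoted in the first post plus any mention of cifs or smb. The keyword list and function names are assumptions for illustration, not anything Zimbra or the kernel defines, and the actual log wording on a given system may differ.

```python
import re

# Assumed keywords: the two errors quoted in the first post, plus
# "cifs"/"smb" to catch messages from the kernel client.
PATTERN = re.compile(
    r"bad file descriptor|unable to remove|cifs|smb", re.IGNORECASE
)


def matching_lines(lines):
    """Return the lines (without trailing newline) that mention any keyword."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]


def scan_file(path):
    """Scan one log file, tolerating undecodable bytes."""
    with open(path, errors="replace") as handle:
        return matching_lines(handle)
```

Something like scan_file("/var/log/messages") run as root, compared against the backup start and end times, should show whether the client logged anything during the failing window.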

#5 (permalink)
05-11-2008, 04:32 PM
langs (Senior Member)
Join Date: Sep 2006
Location: Brisbane
ZCS Version: 5.0.5_GA_2201.RHEL4_64.NETWORK
Posts: 79

Quote:
Originally Posted by LMStone View Post
Zimbra docs discourage NFS use, and we've found it to be pretty slow in other use cases, so we tend to avoid NFS.
Out of curiosity, where do they say that, and why? It doesn't make a lot of sense to me; NFS is well proven and a hell of a lot better than Samba.

I have all my backups going to an NFS-mounted target on my SAN. It does 300GB nightly without issue, and fast. I don't know where the slow comes into it.

My bet is your SMB setup is having issues. Are you using the NE version or the open source one? If you are using the open source version with rsync, you will see "bad file descriptor" if you use -a, as it will want full *nix permissions that an SMB mount doesn't handle.
__________________
Vote to Make CentOS Official;
http://bugzilla.zimbra.com/show_bug.cgi?id=23487

Last edited by langs : 05-11-2008 at 04:42 PM.

#6 (permalink)
05-11-2008, 07:02 PM
Rich Graves (Elite Member)
Join Date: Jan 2007
Location: Minnesota
ZCS Version: 5.0.6_GA_2313.RHEL4_64_20080522093238 RHEL4_64 NETWORK
Posts: 363

There are two major branches of code for smb/cifs mounts on Linux. Are you using smbfs or cifs? cifs is almost certainly better, but whichever it is, try the other.
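
To see which of the two clients a share is actually mounted with, the filesystem type is the third field of the matching line in /proc/mounts. A small Python sketch; /mnt/nas is a hypothetical mount point and the function name is illustrative:

```python
def filesystem_type(mount_point, mounts_text):
    """Return the filesystem type for mount_point from /proc/mounts-style text.

    Each line is: device mount_point fstype options dump pass.
    Returns None when the mount point is not listed.
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            # Keep the last match: a later mount covers an earlier one.
            found = fields[2]
    return found
```

On the Zimbra box this would be called as filesystem_type("/mnt/nas", open("/proc/mounts").read()), which returns "cifs" or "smbfs" depending on how the share was mounted.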

There are several cifs tuning options... try turning off oplocks and directio, since they have historically been buggy and likely won't gain you anything with a single client doing sequential I/O.

You probably need to live with the hardware you've got, but for your next purchase, consider iSCSI. High-end iSCSI is expensive, but low-end ought to be fine for backup applications.

#7 (permalink)
05-11-2008, 07:03 PM
LMStone (Elite Member)

Quote:
Originally Posted by langs View Post
Out of curiosity, where do they say that, and why? It doesn't make a lot of sense to me; NFS is well proven and a hell of a lot better than Samba.
The install docs list services to turn off/avoid; NFS is one of them.

Quote:
Originally Posted by langs View Post
I have all my backups going to an NFS-mounted target on my SAN. It does 300GB nightly without issue, and fast. I don't know where the slow comes into it.
We have used and continue to use NFS in a variety of deployments just fine. We just find cifs to be able to transfer larger files (>50MB) faster than NFS.

Quote:
Originally Posted by langs View Post
My bet is your SMB setup is having issues. Are you using the NE version or the open source one? If you are using the open source version with rsync, you will see "bad file descriptor" if you use -a, as it will want full *nix permissions that an SMB mount doesn't handle.
I suspect you are right. If so, either we'll need to tweak the mount options or we went a little too low end on this particular NAS server.

But, the java article in my original post led me to believe we ourselves may not be entirely to blame, hence my post to see if anyone has had this issue with Zimbra. Not casting a stone, just asking.

Thanks for your post!
Mark

P.S. We are on NE. Sorry, I don't post the version we are running anymore after we got into the habit of keeping our profile up to date. :-)

#8 (permalink)
05-12-2008, 01:38 AM
uxbod (Moderator)

Quote:
Originally Posted by LMStone View Post
But, the java article in my original post led me to believe we ourselves may not be entirely to blame, hence my post to see if anyone has had this issue with Zimbra. Not casting a stone, just asking
From a quick Google search, it would appear that a lot of people have this issue with CIFS and Samba. I did see the Java one, but it can also happen when the machine is under high load. Certainly an interesting problem.
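
One way to separate a share problem from a Zimbra problem is to exercise the mount directly: write a batch of files, read them back, and remove them, recording any OSError (EBADF is the errno behind "bad file descriptor"). A rough sketch under stated assumptions: the file count and file size are arbitrary choices, the function name is made up, and this does not imitate how Zimbra itself writes backups.

```python
import errno
import os


def exercise_share(directory, count=200, size=1024 * 1024):
    """Write, fsync, re-read and remove `count` files in `directory`.

    Returns a list of (index, error name, message) tuples, empty if
    every file survived the round trip.
    """
    payload = os.urandom(size)
    failures = []
    for index in range(count):
        path = os.path.join(directory, f"probe-{index:05d}.tmp")
        try:
            with open(path, "wb") as handle:
                handle.write(payload)
                handle.flush()
                os.fsync(handle.fileno())  # push the data to the server
            with open(path, "rb") as handle:
                if handle.read() != payload:
                    failures.append((index, "MISMATCH", "read-back differs"))
            os.remove(path)
        except OSError as exc:
            name = errno.errorcode.get(exc.errno, str(exc.errno))
            failures.append((index, name, exc.strerror))
    return failures
```

Pointing it at a scratch directory on the share while the server is busy, and again while it is idle, would show whether the failures follow load. Any EBADF entries in the result would implicate the mount rather than the backup code.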

#9 (permalink)
05-12-2008, 02:17 AM
uxbod (Moderator)

Out of interest which model are you trying ? I have been looking at the iSCSI one.

#10 (permalink)
05-12-2008, 05:10 AM
LMStone (Elite Member)

Quote:
Originally Posted by uxbod View Post
Out of interest which model are you trying ? I have been looking at the iSCSI one.
Low-end: LinkStation Pro Duo. Ethernet, not iSCSI. For about $310 you get a RAID1 500GB device that does Samba, HTTP, FTP and AppleTalk (heretofore) quite reliably. We've put a number of these out at smaller clients' sites.

We also like these devices for a simple disaster recovery plan: one of these devices can be set as the backup target of another one. So, for companies with branch offices, you can locate the primary one in the server rack at the main office, and put a second one in the telco closet at a branch office.

They keep themselves in sync, so if the main office burns down, all the data is in the branch office OK. The sync works well over WAN connections too.

As the units are not much bigger than the two disk drives inside them, we train clients to try to grab them (if it is safe to do so) when the office fire alarm goes off, even if only for a fire drill.

Plus, if you need more than 500GB storage, you can plug in external USB hard drives to these devices.

Where else can you do disk-to-disk-to-(WAN)-disk backups for about $1.25 per GB all in?
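
The $1.25 figure is consistent with two of these units (a primary and its replica) at about $310 each protecting 500GB of usable data. That reading of "all in" is an assumption; the arithmetic as a quick Python check:

```python
def cost_per_usable_gb(unit_price, units, usable_gb):
    """Total hardware cost divided by the usable (not raw) capacity."""
    return unit_price * units / usable_gb


# Assumed: two units at $310 each, 500GB usable after RAID1 and replication.
print(round(cost_per_usable_gb(310, 2, 500), 2))  # prints 1.24
```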

Much cheaper than a pair of CLARiiONs doing SAN replication!

All the best,
Mark