Discussion:
ZFS hard locks (zfs diff, zfs send/recv, and large file copy)
g***@public.gmane.org
2014-08-10 17:12:20 UTC
Over the past year or more, I've experienced hard lockups on my machine,
usually separated by a few weeks (sometimes a few in the same day). I've
encountered them in the following situations:

* Running a zfs send/recv on the same machine to backup one FS to another
* Ripping a BD disk to ZFS
* Running an MKV merge to remux a large (~30GB) file (both source and
destination on the same FS).
* Running a zfs diff between two snapshots

Of the above, all but the zfs diff cause a problem only sometimes; the
zfs diff will always lock up the machine. Oddly enough, I've run a scrub
at least 3 times and it has always completed without issue. I do not know
if the zfs diff lockup is related to my other lockups.

My setup consists of an Intel DQ670W motherboard with 12G of RAM and a
4-port SATA card connecting 6 WD 2TB Black drives in raidz2, plus 1 Intel
520 60G SSD (boot drive). My FS was created under FreeBSD and has a
"shares ZAP object", which prompted this change:
https://github.com/zfsonlinux/zfs/pull/1927. The case is very well cooled
for the motherboard, processor, and drives, so I doubt that is an issue.

In all of this, I have only been able to capture two reports of anything
going wrong. Usually the system is completely locked up and there is
nothing I can do except hit the reset button. I do have the crashkernel
options set, but it seems to have failed to capture anything.

The crash log that appeared in /var/crash at one point has since
disappeared, so I no longer have it. The dmesg output below was from a zfs
diff. Every single time I run a zfs diff, I get a hard lock, though only
once was I able to get the output of dmesg or anything on the console. In
that case, the console did not contain anything that was not in the dmesg
output. Also, in the output below, sdd is the boot drive.

I am asking if anyone has any ideas as to how I could diagnose this
problem. At this point I do not know if the problem is with the pool/FS or
if it is hardware. I am worried that this is an issue with the pool and I
will have to recreate it.

g***@public.gmane.org
2014-08-10 17:14:38 UTC
Sorry, I missed the attachments.
Fajar A. Nugraha
2014-08-10 17:33:14 UTC
Your dmesg basically says ata8.00/sdd is busted. Even if that disk is
not zfs (i.e. the boot disk), I don't think zfs (or any userland
program) appreciates having the rootfs suddenly missing.

My suggestions:
- don't use that disk. If it's root, temporarily move your root
somewhere else. It can even be in the zfs pool (pretty easy if you use
ubuntu). If it's used for swap, run with no swap (at least
temporarily)
- limit the size of your arc, at least for testing purposes. 2GB
should be enough (e.g. "echo 2147483648 >
/sys/module/zfs/parameters/zfs_arc_max"); see the note after this list
for making that stick across reboots
- with the above changes, rerun the "zfs diff" (or whatever commands
most often give you problems)

My GUESS is that you have an unlimited ARC, which in turn causes your
system to swap, which hits some bad blocks on your boot disk, which causes
Linux to have all sorts of problems with it.
--
Fajar

Michael Kjörling
2014-08-10 17:54:35 UTC
Post by Fajar A. Nugraha
Your dmesg basically says ata8.00/sdd is busted. Even if that disk is
not zfs (i.e. the boot disk), I don't think zfs (or any userland
program) appreciates having the rootfs suddenly missing.
I concur, all those ata8 and ata8.00 errors really stood out to me.
I'd definitely fix that issue first, and ASAP, especially since the
root file system is on that one and even more so since it does not
have any redundancy. See if the problems go away with that, and if
they don't, I'd investigate further _at that point_.

I'm curious: What does the SSD say in its SMART data? Can you post the
output of running "smartctl --attributes" pointed at the SSD device?
--
Michael Kjörling • https://michael.kjorling.se • michael-/***@public.gmane.org
OpenPGP B501AC6429EF4514 https://michael.kjorling.se/public-keys/pgp
“People who think they know everything really annoy
those of us who know we don’t.” (Bjarne Stroustrup)

g***@public.gmane.org
2014-08-10 18:06:52 UTC
Post by Michael Kjörling
Post by Fajar A. Nugraha
Your dmesg basically says ata8.00/sdd is busted. Even if that disk is
not zfs (i.e. the boot disk), I don't think zfs (or any userland
program) appreciates having the rootfs suddenly missing.
I concur, all those ata8 and ata8.00 errors really stood out to me.
I'd definitely fix that issue first, and ASAP, especially since the
root file system is on that one and even more so since it does not
have any redundancy. See if the problems go away with that, and if
they don't, I'd investigate further _at that point_.
Thank you for your suggestions. I'm currently building an Xubuntu USB boot
disk to test with. I figured the ata8 errors were a result of the prior
error, not the cause.
Post by Michael Kjörling
I'm curious: What does the SSD say in its SMART data? Can you post the
output of running "smartctl --attributes" pointed at the SSD device?
I had previously been checking the SMART status of all the drives in the
GUI, and it reported they were OK. I ran the command you suggested:

sudo smartctl --attributes /dev/sdd
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-32-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   000    Old_age   Always       -       0
  9 Power_On_Hours_and_Msec 0x0032   000   000   000    Old_age   Always       -       906091h+14m+07.340s
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       38
170 Available_Reservd_Space 0x0033   100   100   010    Pre-fail  Always       -       0
171 Program_Fail_Count      0x0032   100   100   000    Old_age   Always       -       0
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
174 Unexpect_Power_Loss_Ct  0x0032   100   100   000    Old_age   Always       -       37
184 End-to-End_Error        0x0033   100   100   090    Pre-fail  Always       -       0
187 Uncorrectable_Error_Cnt 0x000f   099   099   050    Pre-fail  Always       -       2502420
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       37
225 Host_Writes_32MiB       0x0032   100   100   000    Old_age   Always       -       25016
226 Workld_Media_Wear_Indic 0x0032   100   100   000    Old_age   Always       -       65535
227 Workld_Host_Reads_Perc  0x0032   100   100   000    Old_age   Always       -       11
228 Workload_Minutes        0x0032   100   100   000    Old_age   Always       -       65535
232 Available_Reservd_Space 0x0033   100   100   010    Pre-fail  Always       -       0
233 Media_Wearout_Indicator 0x0032   100   100   000    Old_age   Always       -       0
241 Host_Writes_32MiB       0x0032   100   100   000    Old_age   Always       -       25016
242 Host_Reads_32MiB        0x0032   100   100   000    Old_age   Always       -       3308
249 NAND_Writes_1GiB        0x0013   100   100   000    Pre-fail  Always       -       1566

The Uncorrectable_Error_Cnt looks worrisome, unless I'm misinterpreting it.

I will post again with my results after testing with the USB boot drive.
Fajar A. Nugraha
2014-08-10 19:00:59 UTC
Post by g***@public.gmane.org
The Uncorrectable_Error_Cnt looks worrisome, unless I'm misinterpreting it.
I will post again with my results after testing with the USB boot drive.
There's also a simple read test to make sure it's not something as
obvious as an unreadable sector. I like to use ddrescue; badblocks -vv
should also do the job.
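
For example (sdd taken from your mail; adjust the device name), a
non-destructive read pass could look something like this:

# read-only surface scan; -s shows progress, -v reports bad blocks found
sudo badblocks -sv /dev/sdd

# or let the drive test itself, then read back the result
sudo smartctl -t long /dev/sdd
sudo smartctl -l selftest /dev/sdd

Both only read from the disk, so they're safe on a live system, just slow.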
--
Fajar

g***@public.gmane.org
2014-08-10 23:07:30 UTC
Thank you for the suggestions.

I promised to post the results after trying the thumb drive, but before
that, I should mention something I forgot in my first post: I had already
reduced the ARC to 3G. The machine still locked up, with top showing 10G
free at the time of the lock. This was in the normal configuration, using
the SSD as the boot drive. Also, reading the comments, I realized I forgot
to mention that the machine is running Ubuntu Server. It was originally
12.04 and was updated to 14.04 about a month ago; the locks did occur
under 12.04. ZFS is from the zfs-native/stable ppa, and zfs-dkms is at
0.6.3-2~trusty (spl-dkms is at 0.6.3-1~trusty).
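
(For anyone checking my numbers: the way I've been confirming the ARC cap
and its current size is roughly the following; the paths are the standard
ZoL 0.6.x ones.)

# configured cap in bytes (0 means the built-in default)
cat /sys/module/zfs/parameters/zfs_arc_max

# actual ARC size right now
grep '^size' /proc/spl/kstat/zfs/arcstats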

I installed Xubuntu on a thumb drive and booted it (using swap on the
thumb drive). I imported the pool and ran zfs diff across the same two
snapshots. It locked up again, only this time it was complaining about
ata5, which is the Blu-ray burner. That drive had no media in it at the
time, nor had it been used since boot. I had the previous boot drive
disconnected at the time, so it could not have been a contributing factor.

I ran ddrescue to copy the root to the raidz2 array. It ran at around
300MB/s, never dropping, with no errors. I also copied the swap for good
measure, with the same results. The root is 14G and the swap is 2G. I do
use LVM on the SSD, and the SSD is not used by ZFS. Strangely enough, my
SSD is now reporting:
187 Uncorrectable_Error_Cnt 0x000f   120   120   050    Pre-fail  Always       -       0

It seems the SSD is not the issue. I sorta wish it was, because that's not
too hard to fix. Any other suggestions or tests?

g***@public.gmane.org
2014-08-11 20:21:58 UTC
I had a thought on my situation. Since I previously used FreeBSD with
this pool, the pool is not built on raw disks but on partitions (the
first partition on each drive was a boot partition). Could this be a
contributing factor in my case, or does the code have no issues with
using a pool inside a partition? Should I use zpool replace to swap the
partitions for the raw disks, or is the effort not worth it?

Turbo Fredriksson
2014-08-11 20:30:52 UTC
does the code have no issues with using a pool inside a partition?
I've never heard of or seen any indication of this! We even specifically
recommend doing this in our 'root on Debian GNU/Linux and Ubuntu' HOWTOs!

That's because having /boot on ZFS has proven ... 'problematic'. It works
for some (me included, twice, but not the third time or the tries after
that), but not all...

This makes it almost impossible to figure out exactly WHAT the problem is...
--
Try not. Do. Or do not. There is no try!
- Yoda

g***@public.gmane.org
2014-08-11 20:37:23 UTC
OK, I'm just making sure. I figured it had no issues in a partition but
I'm starting to grasp at straws here as to what I can do. I know zpool
prefers raw disks; just wanted to make sure it wasn't problematic if one
doesn't use a raw disk.

I too tried to boot Ubuntu from a ZFS partition and found it problematic.
I was intending to make my SSD a zpool and back up to the raid periodically,
but decided that booting from ZFS was too fragile. That was also over a
year ago, and I know progress has been made since then.
Michael Kjörling
2014-08-11 21:16:34 UTC
I know zpool prefers raw disks; just wanted to make sure it wasn't
problematic if one doesn't use a raw disk.
If you give ZFS a raw disk, it'll partition it itself, but it'll know
that it is using the whole disk so can make optimizations with regards
to e.g. caching that it otherwise cannot (since those apply per
device, not per partition).

So using partitions _should_ only cause some performance degradation.
It certainly shouldn't be a factor in hard lockups.
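
To illustrate (pool and device names below are made up): if you hand ZFS
a bare disk,

sudo zpool create tank /dev/disk/by-id/ata-EXAMPLE_DISK
lsblk /dev/disk/by-id/ata-EXAMPLE_DISK

lsblk should show that ZFS created a large data partition (...-part1) and
a small reserved partition (...-part9) on its own, and the pool label
records that it owns the whole device.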
--
Michael Kjörling • https://michael.kjorling.se • michael-/***@public.gmane.org
OpenPGP B501AC6429EF4514 https://michael.kjorling.se/public-keys/pgp
“People who think they know everything really annoy
those of us who know we don’t.” (Bjarne Stroustrup)

Neal H. Walfield
2014-08-11 21:39:29 UTC
At Mon, 11 Aug 2014 21:16:34 +0000,
Post by Michael Kjörling
I know zpool prefers raw disks; just wanted to make sure it wasn't
problematic if one doesn't use a raw disk.
If you give ZFS a raw disk, it'll partition it itself, but it'll know
that it is using the whole disk so can make optimizations with regards
to e.g. caching that it otherwise cannot (since those apply per
device, not per partition).
I've come across this before, but I haven't figured out what those
optimizations could be. Do you know?

Thanks,

Neal

Schlacta, Christ
2014-08-11 21:52:20 UTC
ZFS instructs Linux to disable the OS-level page cache on a per-device
basis if it has whole disks. You can work around the lack of this feature
using third-party hacks on Linux. Other platforms may or may not provide
this interface to userspace administrators, and may provide additional
optimizations of their own that may or may not be configurable from
userspace. Linux may in the future provide optimizations beyond this one
that may or may not be configurable from userspace. The inability to
configure whether a vdev or pool is treated as a whole-disk pool is a bug.
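
One concrete per-device tweak I can point at (it concerns the I/O
elevator rather than the page cache, so treat it as a related example):
on whole-disk vdevs ZoL switches the backing device's scheduler to noop,
and for partition-backed vdevs you can do the equivalent by hand:

# check the current elevator for the disk backing a vdev (sda is just an
# example device name)
cat /sys/block/sda/queue/scheduler

# switch it to noop; repeat for each disk in the pool
echo noop | sudo tee /sys/block/sda/queue/scheduler
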
g***@public.gmane.org
2014-08-14 01:45:02 UTC
I've done some more tests, and concluded that this is a bug in ZoL, which I
filed here: https://github.com/zfsonlinux/zfs/issues/2597

Basically, the zfs diff succeeds on FreeBSD. It reports one object for
which it cannot determine the path. zdb on Linux is unable to provide
information on this object; the process hangs, but is still killable
(unlike zfs diff). I was able to run zdb on FreeBSD, which yielded some
interesting results. The object is a file, but its parent object ID
corresponds to a different file. Apparently, according to an OpenSolaris
thread [1], this can occur when one uses hard links and the last link's
directory is removed (but the file is still linked elsewhere). I suspect
this is the cause of the hard locks I'm experiencing with zfs diff.
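
For anyone who wants to poke at something similar, the zdb invocation I
used was along these lines (dataset name and object number are
placeholders):

# dump details for a single object, including its path and parent dnode
sudo zdb -dddd <pool>/<dataset> <object-number>

With enough -d's, zdb prints the znode info (path, parent, links, and so
on), which is where the mismatched parent object ID showed up.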

Thank you everyone for providing ideas for things to test. Hopefully this
will help resolve my lockups with ZFS.

[1] https://www.mail-archive.com/zfs-discuss-***@public.gmane.org/msg50101.html

Turbo Fredriksson
2014-08-11 21:41:26 UTC
Post by Michael Kjörling
If you give ZFS a raw disk, it'll partition it itself, but it'll know
that it is using the whole disk so can make optimizations with regards
to e.g. caching that it otherwise cannot (since those apply per
device, not per partition).
That would be great to have in the FAQ...

http://zfsonlinux.org/faq.html

Turbo Fredriksson
2014-08-11 21:09:24 UTC
[missed the list]
I suspect it is the same for everyone who sees this thread.
Probably.

The "Guys That Know" aren't on the list (they're very busy fixing issues and
improving ZFS :). My last recommendation is simply to open an issue on the
tracker...

Give a short description of your problem and what you've tried, plus maybe a
screenshot (an actual screenshot, i.e. a photo :) if possible, and then link
to this thread 'for more information'.

That way, you/they don't have to go through all the tests and verifications
all over again...
