Discussion:
[zfs-discuss] Running out of memory
a***@whisperpc.com
2014-12-08 18:06:30 UTC
I'm running ZFS on top of CentOS 6 (recently patched) on a small file
server (8TB after RAID-Z2). After about a month, the system runs out of
memory. I believe the memory leak is in ZFS.

I collected hourly copies of /proc/spl/kstat/zfs/arcstats, starting at
20141112.101839 and ending at 20141208.161002 (the last copy before the
system hung). I combined the results into a CSV file (attached - I hope
it goes through). I've also added slab information (/proc/spl/kmem/slab)
to the data being gathered, but only since this boot, so there's no
useful data from it yet.
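
In case it helps anyone check my numbers, here's a minimal sketch of the
hourly collector (Python; the two header lines and the name/type/data
columns are the stock arcstats layout, and the output filename is just
an example):

# collect_arcstats.py - append one CSV row per run; cron it hourly.
import csv
import os
import time

ARCSTATS = "/proc/spl/kstat/zfs/arcstats"
OUT = "arcstats.csv"  # example output path

def read_arcstats(path=ARCSTATS):
    """Parse arcstats into a dict of counter name -> integer value."""
    with open(path) as f:
        lines = f.read().splitlines()[2:]  # skip the two header lines
    stats = {}
    for line in lines:
        name, _type, value = line.split()
        stats[name] = int(value)
    return stats

def append_row(stats, out=OUT):
    """Append a timestamped row, writing the CSV header on first run."""
    first = not os.path.exists(out)
    fields = ["timestamp"] + sorted(stats)
    with open(out, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        if first:
            writer.writeheader()
        writer.writerow(dict(stats, timestamp=time.strftime("%Y%m%d.%H%M%S")))

if __name__ == "__main__":
    append_row(read_arcstats())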

Does anyone have any ideas about solving this problem?

The configuration of the pool (zpool status) is as follows:

pool: data
state: ONLINE
scan: scrub repaired 0 in 5h45m with 0 errors on Fri Oct 3 03:05:30 2014
config:

NAME                                            STATE     READ WRITE CKSUM
data                                            ONLINE       0     0     0
  raidz2-0                                      ONLINE       0     0     0
    ata-ST91000640NS_9XG6JS7B                   ONLINE       0     0     0
    ata-ST91000640NS_9XG6JTCF                   ONLINE       0     0     0
    ata-ST91000640NS_9XG6JSX9                   ONLINE       0     0     0
    ata-ST91000640NS_9XG6JSMC                   ONLINE       0     0     0
    ata-ST91000640NS_9XG6K3AK                   ONLINE       0     0     0
    ata-ST91000640NS_9XG6LGZ8                   ONLINE       0     0     0
    ata-ST91000640NS_9XG6JT27                   ONLINE       0     0     0
    ata-ST91000640NS_9XG6JSFW                   ONLINE       0     0     0
    ata-ST91000640NS_9XG6JS7N                   ONLINE       0     0     0
    ata-ST91000640NS_9XG6JSWG                   ONLINE       0     0     0
logs
  mirror-1                                      ONLINE       0     0     0
    ata-INTEL_SSDSC2BA100G3_BTTV42130229100FGN  ONLINE       0     0     0
    ata-INTEL_SSDSC2BA100G3_BTTV4213020K100FGN  ONLINE       0     0     0
cache
  ata-INTEL_SSDSC2BA100G3_BTTV42130229100FGN    ONLINE       0     0     0
  ata-INTEL_SSDSC2BA100G3_BTTV4213020K100FGN    ONLINE       0     0     0

errors: No known data errors

The data disks are 1TB 7200RPM Nearline SATA drives. The log slices are
16GiB, and the cache slices are what's left of the 100GB SSDs (~80GiB).

The /etc/modprobe.d/zfs.conf file is as follows:

#
# Set ZFS tuning parameters.
#
# System memory = 64GB
# L2ARC - two fast SSDs

# Minimum - 24GB - Don't allow the system to shrink the ARC to less
# than 24GB
#
# The ARC can choose to be smaller, but it can't be forced to be
# smaller by memory pressure.
options zfs zfs_arc_min=25769803776
# 32GB
# options zfs zfs_arc_min=34359738368

# Maximum - 40GB - Don't allow the ARC to grow to be larger than 40GB
options zfs zfs_arc_max=42949672960
# 48GB
# options zfs zfs_arc_max=51539607552

# Set the ARC shrink step to 1/256 of the ARC size at a time
#
# The process of shrinking the ARC is very time consuming. Freeing
# large amounts at a time can cause a huge latency spike, which is
# bad for interactive response.
options zfs zfs_arc_shrink_shift=8

# Set the L2ARC write buffer size to 24MB
options zfs l2arc_write_max=25165824

# Set the write buffer size to 48MB while the L2ARC does its initial fill
options zfs l2arc_write_boost=50331648

# Scan ahead 4 x l2arc_write_max worth of buffers when filling the L2ARC
options zfs l2arc_headroom=4

# Sync every second. This will keep the amount of data per sync down,
# delivering smoother operation.
options zfs zfs_txg_timeout=1

Fajar A. Nugraha
2015-02-23 13:14:30 UTC
ZoL still somewhat sucks at memory usage: in practice total usage can
end up well above zfs_arc_max. I recommend taking the maximum memory
you're willing to give ZFS (40G?), dividing it by 2 or 2.5, and setting
that as zfs_arc_max.
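
Worked through with your numbers (illustration only; the 2.5 is the
rule-of-thumb factor above, not anything ZFS enforces):

# Size zfs_arc_max at the memory budget divided by a 2-2.5 safety
# factor, per the rule of thumb above. Numbers are illustrative.
budget_gib = 40          # memory you're willing to give ZFS
factor = 2.5             # observed overshoot factor (2 to 2.5)
arc_max = int(budget_gib * 2**30 / factor)
print("options zfs zfs_arc_max=%d" % arc_max)   # 17179869184 (16 GiB)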

There's a workaround for that involving zram as L2ARC, but it adds
complexity, and (I just learned) it doesn't play nice with zed from the
latest git. So I recommend just living with the memory hog for now.
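
For completeness, the zram idea is roughly the following (a sketch only -
it assumes the zram module is already loaded via "modprobe zram", root
privileges, and an illustrative 8GiB size; the pool name is the one from
this thread):

# zram_l2arc.py - sketch of the zram-as-L2ARC workaround (not recommended).
import subprocess

SIZE = 8 * 2**30  # 8 GiB compressed RAM disk; illustrative only

# Size the zram device through sysfs, then hand it to ZFS as cache.
with open("/sys/block/zram0/disksize", "w") as f:
    f.write(str(SIZE))
subprocess.check_call(["zpool", "add", "data", "cache", "/dev/zram0"])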
--
Fajar
Post by a***@whisperpc.com
I'm running ZFS on top of CentOS 6 (recently patched) on a small file
server (8TB after RAID-Z2). After about a month, the system runs out of
memory. I believe the memory leak is in ZFS.
[...]
Douglas J Hunley
2015-02-25 16:37:58 UTC
Post by Fajar A. Nugraha
There's a workaround for that involving zram as L2ARC, but it adds
complexity, and (I just learned) it doesn't play nice with zed from the
latest git.
Pointer to an issue?
--
Douglas J Hunley (***@gmail.com)
Twitter: @hunleyd
Web: about.me/douglas_hunley
G+: http://google.com/+DouglasHunley
