自从2003-2006年,Google发表了三篇著名的大数据相关论文(Google FS,MapReduce,Big Table)后,内存问题一直困扰大数据工程师们。
这一问题从MR1.0一直延续到Spark时代,从Spark晚期版本试图由应用程序自行管理内存后,人们才初步解决了内存问题。
使用原生的JVM内存管理会带来如下的致命问题:
在Apache Flink中,taskManager自行管理的内存,避免了JVM原生内存管理的缺陷,本文将详细介绍相关逻辑。
Task manager管理的JVM内存主要分为Network Buffers、MemoryManager 和 Free 三个区域。
内存管理的相关配置
静态抽象类MemoryPool定义了内存池的方法,它有两个实现类HybridHeapMemoryPool和HybridOffHeapMemoryPool,堆内内存池和堆外内存池。
abstract static class MemoryPool {
abstract int getNumberOfAvailableMemorySegments();
abstract MemorySegment allocateNewSegment(Object owner);
abstract MemorySegment requestSegmentFromPool(Object owner);
abstract void returnSegmentToPool(MemorySegment segment);
abstract void clear();
}
MemoryManager 类负责管理sorting,、hashing、caching使用的内存,主要方法有allocatePages(申请内存段)和release(释放内存段)
public void allocatePages(Object owner, List<MemorySegment> target, int numPages)
throws MemoryAllocationException {
... 入参校验
// -------------------- BEGIN CRITICAL SECTION -------------------
synchronized (lock) {
if (isShutDown) {
throw new IllegalStateException("Memory manager has been shut down.");
}
// 可用内存校验
if (numPages > (memoryPool.getNumberOfAvailableMemorySegments() + numNonAllocatedPages)) {
throw new MemoryAllocationException("Could not allocate " + numPages + " pages. Only " +
(memoryPool.getNumberOfAvailableMemorySegments() + numNonAllocatedPages)
+ " pages are remaining.");
}
// allocatedSegments是个HashMap<Object, Set<MemorySegment>>,key是owner,值是此owner占用的segment列表
Set<MemorySegment> segmentsForOwner = allocatedSegments.get(owner);
if (segmentsForOwner == null) {
segmentsForOwner = new HashSet<MemorySegment>(numPages);
allocatedSegments.put(owner, segmentsForOwner);
}
if (isPreAllocated) {
for (int i = numPages; i > 0; i--) {
MemorySegment segment = memoryPool.requestSegmentFromPool(owner);
target.add(segment);
segmentsForOwner.add(segment);
}
}
else {
for (int i = numPages; i > 0; i--) {
MemorySegment segment = memoryPool.allocateNewSegment(owner);
target.add(segment);
segmentsForOwner.add(segment);
}
numNonAllocatedPages -= numPages;
}
}
// -------------------- END CRITICAL SECTION -------------------
}
MemorySegment类管理Flink中的一个内存页,MemorySegment是抽象类有两个实现类HeapMemorySegment和HybridMemorySegment。
MemorySegment定义了Segment的基本操作:
// 返回segment字节数
public int size();
// segment是否已释放
public boolean isFreed();
// 释放segment
public void free();
// 是否使用堆外内存
public boolean isOffHeap();
// 返回堆内内存的数组
public byte[] getArray();
// 返回堆外内存的地址
public long getAddress();
// 返回指定区域的数据,并封装成ByteBuffer
public abstract ByteBuffer wrap(int offset, int length);
// 返回segment owner
public Object getOwner();
// 随机读写API
public abstract byte get(int index);
public abstract void put(int index, byte b);
public abstract void get(int index, byte[] dst);
public abstract void put(int index, byte[] src);
public abstract void get(int index, byte[] dst, int offset, int length);
public abstract void put(int index, byte[] src, int offset, int length);
public abstract boolean getBoolean(int index);
public abstract void putBoolean(int index, boolean value);
public final char getChar(int index);
public final char getCharLittleEndian(int index);
public final char getCharBigEndian(int index);
public final void putChar(int index, char value);
public final void putCharLittleEndian(int index, char value);
public final void putCharBigEndian(int index, char value);
public final short getShort(int index);
public final short getShortLittleEndian(int index);
public final short getShortBigEndian(int index);
public final void putShort(int index, short value);
public final void putShortLittleEndian(int index, short value);
public final void putShortBigEndian(int index, short value);
public final int getInt(int index);
public final int getIntLittleEndian(int index);
public final int getIntBigEndian(int index);
public final void putInt(int index, int value);
public final void putIntLittleEndian(int index, int value);
public final void putIntBigEndian(int index, int value);
public final long getLong(int index);
public final long getLongLittleEndian(int index);
public final long getLongBigEndian(int index);
public final void putLong(int index, long value);
public final void putLongLittleEndian(int index, long value);
public final void putLongBigEndian(int index, long value);
public final float getFloat(int index);
public final float getFloatLittleEndian(int index);
public final float getFloatBigEndian(int index);
public final void putFloat(int index, float value);
public final void putFloatLittleEndian(int index, float value);
public final void putFloatBigEndian(int index, float value);
public final double getDouble(int index);
public final double getDoubleLittleEndian(int index);
public final double getDoubleBigEndian(int index);
public final void putDouble(int index, double value);
public final void putDoubleLittleEndian(int index, double value);
public final void putDoubleBigEndian(int index, double value);
public abstract void get(DataOutput out, int offset, int length) throws IOException;
public abstract void put(DataInput in, int offset, int length) throws IOException;
public abstract void get(int offset, ByteBuffer target, int numBytes);
public abstract void put(int offset, ByteBuffer source, int numBytes);
// 拷贝数据到目标segment
public final void copyTo(int offset, MemorySegment target, int targetOffset, int numBytes);
// 比较两个segment的数据
public final int compare(MemorySegment seg2, int offset1, int offset2, int len);
// 交互两个segment的数据
public final void swapBytes(byte[] tempBuffer, MemorySegment seg2, int offset1, int offset2, int len);
通过MemoryManager、MemoryPool、MemorySegment等类,Flink实现了应用层级对于内存的管理,规避了JVM原生内存管理带来的诸多问题,有效的提升了Flink的内存效率和性能。